Why:
JSON normalization often needs a small preprocessing step before
flattening or unflattening so deployments can drop noisy fields and
rename source keys without extra ruleset glue.
Impact: mmjsontransform can now load a YAML policy, reload it on HUP,
and apply rename/drop rules before transformation.
Before/After:
Before: mmjsontransform only transformed the structure that it was given.
After: it can preprocess input keys via a YAML policy and then run the
normal flatten or unflatten logic on the adjusted JSON.
Technical Overview:
This adds an optional action parameter, policy, which loads a YAML file
with map.rename and map.drop directives.
Policy state is shared per action instance, guarded by a mutex, loaded
at startup, and rechecked on HUP so mtime changes trigger reload while a
failed reload keeps the previous in-memory policy.
Drop rules are compiled into a lookup object, rename rules are compiled
into a mapping object, and both are applied before the main transform.
Nested rename handling keeps targets relative to the current recursion
context, and the loader cleans up partial allocations correctly on error
paths.
The patch also documents the new parameter, registers its reference page
for distribution, and adds a focused test that covers initial policy
application plus HUP-driven reload.
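Illustration (not the module's C code): a minimal Python sketch of the preprocessing order described above, assuming drop is checked on the source key, rename is applied next, and the normal flatten runs last. The function names, the sample event, and the dotted-key flatten convention are hypothetical, and only top-level keys are handled here.

```python
def apply_policy(doc, rename, drop):
    """Drop unwanted keys, then rename the survivors (top level only)."""
    out = {}
    for key, value in doc.items():
        if key in drop:
            continue  # noisy field: removed before the transform
        out[rename.get(key, key)] = value
    return out


def flatten(doc, prefix=""):
    """Flatten nested dicts into dotted keys (assumed convention)."""
    flat = {}
    for key, value in doc.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


event = {"usr": "alice", "debug": {"trace": 1}, "net": {"src": "10.0.0.1"}}
result = flatten(apply_policy(event, rename={"usr": "user"}, drop={"debug"}))
```

Running the policy first means the flatten step never sees the dropped subtree, so no dotted keys are produced for it.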
Validation:
- ./tests/mmjsontransform-policy-basic.sh
With the help of AI-Agents: Codex
Why:
Tests that exercise the YAML configuration loader need to run without
any RainerScript preamble. The existing generate_conf function always
writes a .conf file using legacy and RainerScript directives, making it
impossible to test pure-YAML startup paths.
Technical Overview:
- imdiag: replace EMPTY_STRUCT with real modConfData_s fields; add
BEGINsetModCnf (listenportfilename, aborttimeout), BEGINendCnfLoad,
BEGINnewInpInst (port), and the three STD_CONF2 queryEtryPt macros.
Legacy $IMDiag* directives are preserved unchanged. Fix a NULL
deref in addTCPListener when pszLstnPortFileName is not set.
- diag.sh: generate_conf gains a --yaml-only flag that writes a pure
YAML preamble (version/global/mainqueue/modules/inputs) instead of a
.conf file. net.ipprotocol is resolved before the preamble is written
to avoid a duplicate global: key. add_yaml_conf() mirrors add_conf()
for the yaml path. startup_common selects .yaml when
RSYSLOG_YAML_ONLY=1. wait_startup comment documents that .started is
absent in yaml-only mode and that the OR logic handles it.
- tests/yaml-basic-yamlonly.sh: new test exercising the yaml-only path
end-to-end (imtcp, 100 messages, seq_check).
- tests/Makefile.am: register yaml-basic-yamlonly.sh under TESTS_LIBYAML.
- tests/AGENTS.md: document the yaml-only mode, its limitations, and
the expected naming/registration conventions.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
rsyslog now accepts .yaml / .yml files as a full alternative to
RainerScript configuration. Whenever a .yaml or .yml file is
referenced (e.g. via $IncludeConfig or as the main config), it is
routed through the new yamlconf loader instead of the bison/flex
parser. RainerScript and YAML configurations may coexist.
Supported top-level sections
global: global settings
main_queue: main queue settings
modules: module loading (load:, params:)
inputs: input objects
templates: string, subtree, plugin, and list templates
(including all property/constant modifiers)
rulesets: rulesets with optional script: block (raw
RainerScript), filter/action shortcut (if:/then:),
and the full statements: block:
action:, set:, unset:, call:, call_indirect:,
foreach:, if:, stop:, continue:,
reload_lookup_table:
outputs: output objects (top-level convenience alias)
include: recursive includes (.yaml processed immediately;
.conf files pushed to LIFO stack in reverse order
for correct document-order processing)
Implementation notes
- Thin front-end over existing cnfobj/nvlst/cnfDoObj() machinery;
all parameter validation happens downstream unchanged.
- Script blocks synthesised as RainerScript strings and injected via
cnfAddConfigBuffer() (requires two trailing NUL bytes).
- YAML -> YAML includes are synchronous recursive calls;
YAML -> .conf includes use the flex buffer stack.
- Mixed include lists (.yaml + .conf) emit a warning because strict
document order cannot be preserved across the two mechanisms.
- Built conditionally: --with-libyaml / HAVE_LIBYAML.
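Illustration: the reverse-order push mentioned for .conf includes can be sketched in Python as below. This only models the ordering argument (a LIFO stack pops the most recent push first); the function name and file names are hypothetical and the real code works on flex buffers, not lists.

```python
def process_conf_includes(files):
    """Push includes in reverse so a LIFO stack yields document order."""
    stack = []
    for name in reversed(files):
        stack.append(name)  # the last file is pushed first
    processed = []
    while stack:
        processed.append(stack.pop())  # pop returns the most recent push
    return processed


order = process_conf_includes(["a.conf", "b.conf", "c.conf"])
```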
Tests
19 new shell tests cover: basic config, modules, inputs, templates
(string/subtree/list with modifiers), rulesets with script/filter/
statements blocks, stop/continue, set/unset, foreach, call,
call_indirect, reload_lookup_table, includes, error handling.
Documentation
doc/source/configuration/yaml_config.rst (new reference page)
Inline comments in yamlconf.c explain all invariants and non-obvious
design choices (es_getBufAddr non-null-termination, LIFO stack
ordering, cnfAddConfigBuffer double-NUL requirement, etc.)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Allow YAML rulesets to use structured filter: + actions: keys as
an alternative to inline script: blocks. This covers the most
common use-case (route by syslog priority or property filter to
one or more outputs) without any RainerScript in the file.
- filter: accepts syslog PRI selectors (e.g. *.info) or property
filters (starting with ':'). Omitting filter: routes all messages.
- actions: is a YAML sequence; each item maps to a module action().
- filter:/actions: and script: are mutually exclusive per ruleset.
Backed by cnfstmtNewPRIFILT / cnfstmtNewPROPFILT / cnfstmtNewAct;
no new grammar rules required.
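Illustration: the filter rules listed above can be sketched as a small Python classifier. This is an assumption-laden model, not the yamlconf code: the function name, the return labels, and the dict representation of a ruleset are hypothetical.

```python
def classify_ruleset(ruleset):
    """Decide how a YAML ruleset (as a dict) would be routed."""
    has_script = "script" in ruleset
    has_shortcut = "filter" in ruleset or "actions" in ruleset
    if has_script and has_shortcut:
        raise ValueError("filter:/actions: and script: are mutually exclusive")
    if has_script:
        return "script"
    flt = ruleset.get("filter")
    if flt is None:
        return "all"       # omitted filter: route every message
    if flt.startswith(":"):
        return "property"  # property filter
    return "pri"           # syslog PRI selector such as *.info


kind = classify_ruleset({"filter": "*.info", "actions": [{"type": "omfile"}]})
```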
Add yaml-filter-actions.sh testbench test (all 5 YAML tests pass).
Update yaml_config.rst with Structured Filter Shortcut section and
revised complete example.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
rsyslog now accepts .yaml/.yml files as configuration files
in addition to RainerScript (.conf). The format is triggered
automatically based on file extension, both for the main config
file (-f flag) and for include() directives.
Architecture:
- New runtime/yamlconf.c (libyaml event-based parser) builds the
same cnfobj+nvlst structures that the bison parser produces and
calls cnfDoObj() for each block.
- Ruleset script: keys are synthesised into a complete RainerScript
ruleset(name=...) { ... } block and pushed onto the flex buffer
stack so the existing lex/bison pipeline handles all filter
expressions and statements unchanged.
- Routing in rsconf.c::load() and cnfDoInclude() detects .yaml/.yml
by extension and delegates to yamlconf_load().
- cnfHasPendingBuffers() (new) lets rsconf.c flush queued script
buffers after YAML-as-main-config loading.
- Guarded by #ifdef HAVE_LIBYAML throughout; graceful error when
libyaml is absent.
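Illustration: extension-based routing as described above, sketched in Python. The function name and return labels are hypothetical, and case-sensitive suffix matching is an assumption; the commit only states that .yaml/.yml is detected by extension.

```python
def pick_loader(path):
    """Choose a config loader from the file extension alone."""
    if path.endswith((".yaml", ".yml")):
        return "yaml"
    return "rainerscript"


loader = pick_loader("/etc/rsyslog.d/50-app.yaml")
```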
Schema (top-level YAML keys):
global, modules, inputs, templates, rulesets, mainqueue,
include, parser, lookup_table, dyn_stats, ratelimit, timezone
Parameter names are identical to RainerScript; all type coercion
and validation is reused via nvlstGetParams() unchanged.
Tests: yaml-basic, yaml-include, yaml-ruleset-script, yaml-error
Documentation: doc/source/configuration/yaml_config.rst
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
GnuTLS silently ignores expired CRLs -- it loads them into credentials
without checking the validity period. This means rsyslog kept accepting
TLS connections even when the configured CRL file had expired, unlike
the OpenSSL driver which rejects them via X509_V_ERR_CRL_HAS_EXPIRED
in the verify callback.
Fix this by manually checking the CRL thisUpdate/nextUpdate timestamps
when loading the CRL file. The file is loaded once, validated, then
passed to credentials via gnutls_certificate_set_x509_crl_mem(). The
nextUpdate timestamp is also cached on the connection object so that
each TLS handshake can cheaply detect if the CRL has expired since
startup, matching OpenSSL's per-connection behavior.
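Illustration: the two checks described above (validity window at load time, cached nextUpdate at handshake time) can be sketched in Python. The function names are hypothetical and the inclusive window bounds are an assumption; the driver itself works on GnuTLS time values in C.

```python
from datetime import datetime, timezone


def crl_is_current(this_update, next_update, now):
    """Load-time check: now must lie inside the CRL validity window."""
    return this_update <= now <= next_update


def crl_expired_since_load(cached_next_update, now):
    """Handshake-time check against the cached nextUpdate timestamp."""
    return now > cached_next_update


this_update = datetime(2025, 1, 1, tzinfo=timezone.utc)
next_update = datetime(2025, 2, 1, tzinfo=timezone.utc)
```

Caching only nextUpdate keeps the per-handshake cost to one timestamp comparison instead of re-parsing the CRL file.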
Add CRL tests for both GnuTLS and OpenSSL drivers, each with two
phases: valid CRL (communication succeeds) and expired CRL (connection
rejected, no messages received). Includes a pre-generated expired CRL
test fixture.
add initial tests
omazuredce: add Azure Monitor output module
This adds a documented Azure Monitor output path so rsyslog can
ship events to Azure-hosted observability workflows. It also makes
the module easier to review and iterate on as a project feature.
Impact: adds a new output module, config parameters, docs, and tests.
Before: rsyslog had no native Azure Monitor DCE output module.
After: rsyslog can batch and send records to Azure Monitor via
omazuredce.
The change adds a new omazuredce plugin and wires it into the
build system and module map. The module obtains Azure tokens,
renders payloads from templates, batches records by size, and
flushes them through the existing action queue model.
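Illustration: size-based batching as mentioned above, sketched in Python. The function name and the byte limit are hypothetical, and the sketch ignores any separators or envelope bytes the real payload format may add.

```python
def batch_by_size(records, max_bytes):
    """Group rendered records so each batch stays within max_bytes."""
    batch, size = [], 0
    for rec in records:
        rec_len = len(rec.encode("utf-8"))
        if batch and size + rec_len > max_bytes:
            yield batch  # flush before the limit would be exceeded
            batch, size = [], 0
        batch.append(rec)
        size += rec_len
    if batch:
        yield batch  # final partial batch


batches = list(batch_by_size(["aaaa", "bbbb", "cc"], max_bytes=8))
```

In this sketch a single record larger than the limit is still sent alone rather than dropped; whether the module behaves that way is not stated in the commit.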
The patch also documents the module and its parameters so the
feature is usable without external notes. Supporting config and
build integration are included so the module can be compiled and
packaged consistently with the rest of the tree.
AI-Agent: Codex 2025-06
With the help of AI-Agents: Codex
add tests and address issues
redo the doc
Why
Native post-quantum TLS support should be usable and testable on newer
distro baselines without adding provider-mode compatibility work for
older platforms.
Impact
Rsyslog now has native-PQ smoke tests, clearer TLS diagnostics, updated
CI baselines and helper images, and a new post-quantum tutorial for
supported distros.
Before/After
Before: Fedora CI still targeted Fedora 41, PQ-capable TLS settings had
no dedicated rsyslog tests or user-facing tutorial, and stricter clang
builds could fail on warning-group handling.
After: CI targets Fedora 43, native PQ usage is documented and smoke-
tested, helper images include the required tools, and the branch builds
and tests cleanly with the newer compiler/container combinations.
Technical Overview
The CI matrix now replaces the Fedora 41 lane with Fedora 43 and adds a
matching Fedora 43 development image.
The Debian 13 and Fedora 43 development containers now install the
GnuTLS CLI utilities needed for native PQ capability checks.
The OpenSSL TLS config path logs clearer messages when a command or
value is unavailable on the native OpenSSL build.
The GnuTLS TLS config path reports unsupported priority-string options
more explicitly.
Two new shell tests add native PQ smoke coverage for OpenSSL and GnuTLS
using the existing gnutlsPriorityString control surface.
Those tests self-skip unless the local native TLS libraries expose the
required hybrid group support.
The imtcp parameter docs and omfwd docs now explain the native-only PQ
support policy and include example configurations.
A new tutorial documents native PQ usage for OpenSSL and GnuTLS on
supported newer distro versions.
The shared runtime warning policy in rsyslog.h now tolerates clang
handling of unknown warning groups so older and newer clang lanes remain
warning-free under the existing finalize_it error-handling pattern.
Testbench follow-ups harden omfwd-lb-susp with isolated retry attempts,
skip rcvr_fail_restore on ARM where it is timing-flaky, and keep local
SC2181 suppressions where if-exec rewrites would reduce shell-script
usability.
The Fedora 43 Dockerfile now cleans the dnf cache after install and
locally suppresses the non-useful DL3041 package-version pinning warning.
Older distro versions remain intentionally unsupported for PQ in this
phase because we expect users to move to newer baselines first.
If there is demand later, older-version support can be considered in a
separate effort.
With the help of AI-Agents: Codex
Add ratelimit.interval, ratelimit.burst, and ratelimit.name parameters
to omusrmsg. This prevents DoS scenarios where emergency message floods
can overwhelm user terminals.
The implementation follows the established output module pattern:
- Sentinel-based (-1) defaults for mutual-exclusivity detection
- ratelimit.name references a shared ratelimit() configuration object
- Per-action ratelimiters via ratelimitNew()/ratelimitSetLinuxLike()
- ratelimitMsgCount() gate in doAction
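Illustration: an interval/burst gate of the kind described above, sketched in Python. This is not the rsyslog ratelimit code; the class and method names are hypothetical, and treating interval 0 as "no limiting" is an assumption of the sketch.

```python
class RateLimiter:
    """Allow at most `burst` messages per `interval` seconds."""

    def __init__(self, interval, burst):
        self.interval = interval
        self.burst = burst
        self.window_start = None
        self.count = 0

    def allow(self, now):
        if self.interval == 0:
            return True  # assumption: interval 0 disables limiting
        if self.window_start is None or now - self.window_start >= self.interval:
            self.window_start = now  # open a new window
            self.count = 0
        self.count += 1
        return self.count <= self.burst


limiter = RateLimiter(interval=5, burst=2)
decisions = [limiter.allow(t) for t in (0, 1, 2, 6)]
```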
Also adds module-level config (modConfData_s) with beginCnfLoad,
endCnfLoad, checkCnf, activateCnf, freeCnf entry points, needed to
stash rsconf_t* for ratelimitNewFromConfig().
Includes documentation updates and a config-validation test.
Closes: https://github.com/rsyslog/rsyslog/issues/4547
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Wire ratelimit.interval, ratelimit.burst, and ratelimit.name into imrelp,
following the established pattern from imtcp/imptcp/imudp. When
ratelimit.name is set, the listener uses a shared named rate limiter
(with per-source support). Local interval/burst and named rate limiters
are mutually exclusive.
Includes parameter documentation, module doc update, and integration test.
Closes: #6597
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Why: Commit 3ee31a8d0 added shared named ratelimit support
(ratelimit.name + ratelimitNewFromConfig) but only wired it
into imtcp, imptcp, and imudp. Seven modules with existing
ratelimit.interval/ratelimit.burst support were left without
the ability to reference shared ratelimit() objects.
Impact: Adds ratelimit.name parameter to omfwd,
omelasticsearch, omhttp, imhttp, imjournal, imklog, and
imuxsock. Existing configurations remain unchanged.
Before: These seven modules could only use inline
ratelimit.interval/ratelimit.burst parameters.
After: All modules can now reference shared ratelimit()
configuration objects via ratelimit.name.
Technical Overview:
For each module, the same pattern from the imudp reference
implementation is applied:
- Add uchar *pszRatelimitName to config struct
- Add ratelimit.name to cnfparamdescr
- Change interval/burst defaults to -1 (sentinel)
- Parse ratelimit.name in config handler
- Add mutual-exclusivity check (name vs interval/burst)
- Branch ratelimiter creation on pszRatelimitName:
if set, call ratelimitNewFromConfig(); else legacy path
- Free pszRatelimitName in destructor
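Illustration: the sentinel and mutual-exclusivity steps above, sketched in Python. The function name and the default values are hypothetical; the point is only that a -1 sentinel lets the config handler tell "not set" apart from an explicit value.

```python
UNSET = -1             # sentinel meaning "parameter not given"
DEFAULT_INTERVAL = 0   # hypothetical module defaults
DEFAULT_BURST = 10000


def resolve_ratelimit(name=None, interval=UNSET, burst=UNSET):
    """Pick the named or the inline ratelimit path for one instance."""
    if name is not None:
        if interval != UNSET or burst != UNSET:
            raise ValueError("ratelimit.name excludes interval/burst")
        return ("named", name)
    return (
        "inline",
        DEFAULT_INTERVAL if interval == UNSET else interval,
        DEFAULT_BURST if burst == UNSET else burst,
    )


choice = resolve_ratelimit(name="shared_policy")
```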
imuxsock is the most complex: it supports both per-socket
ratelimit.name and syssock.ratelimit.name, with the
per-source hashtable ratelimiter path also updated.
imklog uses legacy parameter names (ratelimitinterval,
ratelimitburst) but the new parameter uses the dotted
form (ratelimit.name) for consistency.
Each module receives a parameter doc page, module doc
update, and a config-validation test.
With the help of AI-Agents: GitHub Copilot
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Why: The omotel module only supported JSON encoding for OTLP/HTTP
exports. Native protobuf encoding produces smaller payloads with
lower serialization overhead, which is the preferred transport for
high-volume OpenTelemetry deployments.
Impact: Users can now set protocol="http/protobuf" to use binary
protobuf encoding. The default remains "http/json". A pre-existing
bug in JSON timestamp encoding (timeUnixNano/observedTimeUnixNano
emitted as integers instead of strings) is also fixed.
Before: omotel only supported http/json. The timeUnixNano fields
were encoded as JSON integers, violating the proto3 JSON mapping
for fixed64 which requires string representation.
After: omotel supports both http/json and http/protobuf. The JSON
encoder emits timestamps as strings per the OTLP specification.
protobuf-c is a required build dependency when --enable-omotel is
set.
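Illustration: the timestamp fix above in Python terms. The proto3 JSON mapping represents 64-bit integer types as strings, so the nanosecond fields are converted with str() before encoding. The helper name and the reduced record layout are hypothetical, not the omotel encoder.

```python
import json


def log_record(time_ns, observed_ns, body):
    """Build a reduced OTLP-style log record with string timestamps."""
    return {
        "timeUnixNano": str(time_ns),            # fixed64 -> JSON string
        "observedTimeUnixNano": str(observed_ns),
        "body": {"stringValue": body},
    }


encoded = json.dumps(log_record(1700000000123456789, 1700000000123456999, "hello"))
```

Emitting strings also avoids precision loss in JSON consumers that parse numbers as double-precision floats.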
Technical Overview:
Vendors three upstream OTLP .proto files (common, resource, logs)
in plugins/omotel/proto/ and adds protoc-c build rules to
Makefile.am that generate C bindings at build time. The generated
files are handled as nodist sources with BUILT_SOURCES and
CLEANFILES entries.
Implements otlp_protobuf.c (~650 lines) with an
omotel_protobuf_build_export() entry point that constructs a full
ExportLogsServiceRequest protobuf tree from rsyslog log records.
The tree is assembled incrementally: container nodes are linked
to the request immediately upon creation so that
free_export_request() can walk and free a partially-built tree on
any error path. KV attribute arrays use NULL-after-transfer
ownership tracking with kv_array_free() cleanup.
OOM from make_*_kv() helpers is detected and propagated
(kv_array_add treats NULL as OOM error; callers guard empty values
explicitly). TraceId/SpanId are decoded from hex strings to raw
bytes as required by the binary protobuf wire format.
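Illustration: hex-to-bytes decoding of the IDs, sketched in Python. OTLP trace IDs are 16 bytes and span IDs are 8 bytes; the helper name, the sample values, and the choice to raise on a wrong length are assumptions of this sketch, not the module's error handling.

```python
def decode_id(hex_text, expected_len):
    """Decode a hex string into raw bytes of the expected length."""
    raw = bytes.fromhex(hex_text)
    if len(raw) != expected_len:
        raise ValueError(f"expected {expected_len} bytes, got {len(raw)}")
    return raw


trace_id = decode_id("5b8efff798038103d269b633813fc60c", 16)
span_id = decode_id("eee19b7ec3c1b174", 8)
```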
The flush path in omotel.c selects between JSON and protobuf
based on the protocol config parameter. omotel_http_client_config
gains a content_type field so the HTTP layer sends the correct
Content-Type header for each encoding. Gzip compression operates
on the selected buffer regardless of encoding.
configure.ac requires protobuf-c >= 1.0.0 and protoc-c when
--enable-omotel is set. Documentation, MODULE_METADATA.yaml, and
an integration test (omotel-protobuf-basic.sh) are included.
With the help of AI-Agents: GitHub Copilot
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Rainer Gerhards <rgerhards@adiscon.com>
Why:
This branch combines two related hardening steps for disk queue reliability:
- robust corruption detection/recovery handling in disk queue state/file validation
- worker startup cancellation-race closure that could lead to shutdown wait loops
Impact:
- disk queue scan now rejects out-of-range segment sequence numbers early and
reports corruption deterministically.
- worker startup no longer exposes a cancellation window before cleanup
registration.
- test/CI diagnostics preserve timeout backtraces (gdb) in ARM jobs and print
them to stdout for post-mortem debugging.
- test script cleanup removes redundant operations and uses a macOS-friendlier
segment enumeration path.
Technical Overview:
- runtime/queue.c:
- add out-of-range sequence-number rejection during spool scan
- keep orphan-loop range check as defensive fallback
- runtime/wtp.c:
- disable cancellation and register cleanup before publishing RUNNING
- document startup/cancellation invariant inline
- runtime/wti.c:
- add concise cancellation-contract comment
- devtools/ci/Dockerfile.arm:
- install gdb for CI timeout diagnostics
- tests/diskqueue-oncorruption-missing-segment.sh:
- emit timeout gdb backtraces to stdout
- drop redundant STARTED_LOG truncate
- avoid GNU find -printf/mapfile dependency in segment listing
build: default-enable impstats-push and align CI containers
Enable impstats-push by default and keep configure strict when
dependencies are missing.
Update CI/container definitions for distro differences
(CentOS/OpenEuler/Ubuntu and workflow overrides), add explicit
--disable-impstats-push where impstats is disabled, and fix impstats
protobuf generation for distcheck/VPATH builds.
Why
VictoriaLogs jsonline is a target deployment path for omhttp users and
we need a direct integration signal in PR CI.
Impact
Adds a real-container omhttp->VictoriaLogs validation path and a scoped
CI job for relevant PRs.
Before/After
Before: no CI test validated omhttp against VictoriaLogs jsonline.
After: PRs touching omhttp or this test run a minimal live integration
check.
Technical Overview
Add tests/omhttp-victorialogs-jsonline.sh to send batched newline JSONL
payloads with omhttp to /insert/jsonline and verify indexed results via
/select/logsql/query.
Use jsonf list templating and a per-run marker to isolate records during
query validation. Keep transport on plain HTTP for CI simplicity.
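Illustration: a newline-delimited JSON batch carrying a per-run marker, sketched in Python. The field names (_msg, run_marker) and the marker value are assumptions for illustration; the test script itself builds the payload with an rsyslog jsonf list template.

```python
import json


def build_jsonline_batch(messages, marker):
    """Render one JSON object per line, each tagged with the run marker."""
    lines = [json.dumps({"_msg": msg, "run_marker": marker}) for msg in messages]
    return "\n".join(lines) + "\n"


payload = build_jsonline_batch(["first event", "second event"], "run-123")
```

Querying for the marker afterwards isolates the records of one test run from anything else already stored.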
Register the test in tests/Makefile.am under TESTS_OMHTTP so it is part
of testbench distribution and invocable as a single .log target.
Add a new run_checks.yml job named victorialogs_CI that starts a
VictoriaLogs service container, runs only
omhttp-victorialogs-jsonline.log, and gates execution with changed-files
filters for the test, omhttp components, and the workflow itself.
With the help of AI-Agents: Codex (GPT-5)
Adds first-class integration with VictoriaMetrics to simplify ops
dashboards and move toward project-supported telemetry without
sidecar collectors.
Impact: New optional feature (off by default). No behavior change
unless configured via push.* parameters.
Before: impstats could only log locally or emit text formats.
After: impstats can push counters to Prometheus-compatible endpoints.
Technical: implement a native Prometheus Remote Write path in
impstats, encoding counters to protobuf and compressing with snappy
over HTTP via libcurl. Replace interim text parsing with a new
statsobj v14 API (GetAllCounters) that iterates raw uint64 counters,
keeps atomic reads for IntCtr and best-effort reads for Int. Add
metric builder with Prometheus-compliant sanitization and the naming
pattern <origin>_<name>_<counter>_total. Provide TLS knobs (CA, mTLS,
insecureSkipVerify), static/dynamic labels, timeout, and optional
batching by bytes/series. Build is gated behind
--enable-impstats-push with protobuf-c/snappy/curl checks. Ship docs,
basic/VM integration tests, and a GitHub Actions workflow using a
VictoriaMetrics service; TSAN jobs disable impstats-push.
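Illustration: the naming pattern <origin>_<name>_<counter>_total with sanitization, sketched in Python. Prometheus metric names must match [a-zA-Z_:][a-zA-Z0-9_:]*; mapping every other character to an underscore and prefixing a leading digit are assumptions of this sketch, as are the function name and sample inputs.

```python
import re


def metric_name(origin, name, counter):
    """Build a Prometheus-style metric name from statsobj identifiers."""
    raw = f"{origin}_{name}_{counter}_total"
    cleaned = re.sub(r"[^a-zA-Z0-9_]", "_", raw)
    if cleaned[0].isdigit():
        cleaned = "_" + cleaned  # names must not start with a digit
    return cleaned


example = metric_name("core.action", "action-0", "processed")
```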
Configuration: push.url, push.labels, push.timeout.ms,
push.label.{instance,job,origin,name}, push.tls.{cafile,certfile,
keyfile,insecureSkipVerify}, push.batch.{maxBytes,maxSeries}.
With the Help of AI Agents: ChatGPT codex 5.2
When `re_extract` executes a regex that matches an empty string (length 0), `regexec` returns `rm_eo = 0`. The loop logic `iOffs += pmatch[0].rm_eo` results in `iOffs` not advancing, causing the next iteration to find the same empty match at the same position. This leads to finding the same match repeatedly (up to `matchnbr`), which is incorrect behavior (subsequent matches should be distinct).
This commit fixes the issue by detecting zero-length matches and forcing an advance of `iOffs` by 1, while ensuring we do not overrun the string buffer.
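Illustration: the offset-advance fix modelled in Python. This uses Python's re module rather than POSIX regexec, so offsets are absolute instead of relative to the search start; the function name is hypothetical and the loop only mirrors the idea of forcing progress after a zero-length match.

```python
import re


def nth_match(pattern, text, match_nbr):
    """Return (start, end) of the match with index match_nbr, or None."""
    rx = re.compile(pattern)
    offs = 0
    found = None
    for _ in range(match_nbr + 1):
        if offs > len(text):
            return None  # never search past the end of the buffer
        m = rx.search(text, offs)
        if m is None:
            return None
        found = (m.start(), m.end())
        if m.end() == m.start():
            offs = m.end() + 1  # zero-length match: force an advance
        else:
            offs = m.end()
    return found


second = nth_match("a*", "baa", 1)
```

Without the forced advance, the second lookup would return the same empty match at offset 0 again.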
Fixes issue #230 (potential infinite loop scenario).
Signed-off-by: Rainer Gerhards <rgerhards@adiscon.com>
Co-authored-by: Jules Agent
Co-authored-by: rgerhards <1482123+rgerhards@users.noreply.github.com>
Why: prevent regressions around preserveFQDN/localHostname by
asserting internal rsyslogd messages honor the configured FQDN.
Impact: adds one test; no runtime behavior changes.
Before/After: before, there was no explicit coverage; after, the
internal hostname is asserted to equal the configured FQDN.
Technical Overview:
- Add a testbench script that sets preserveFQDN and localHostname.
- Capture internal rsyslogd messages via syslogtag and write %hostname%.
- Validate every emitted hostname equals host.example.com.
- Register the test in TESTS_DEFAULT.
Closes https://github.com/rsyslog/rsyslog/issues/195
With the help of AI-Agents: Codex
Non-technical: centralize and reuse rate-limit definitions so admins
can apply consistent policies across listeners. This is part of an
ongoing series to improve rate limiting and its manageability.
Before: inputs set per-listener interval/burst ad hoc.
After: inputs can reference a named ratelimit() policy shared across
listeners; per-listener values remain as fallback.
Impact: New ratelimit() object and RateLimit.Name param for imtcp/imptcp.
If a policy file is configured but libyaml is unavailable, config fails.
Technical details:
- Add top-level ratelimit() RainerScript object. Parsed in rsconf and
stored in a central registry (hashtable + rwlock) on rsconf.
- New runtime API: ratelimitAddConfig(), ratelimitNewFromConfig(), plus
cfgs init/destruct on rsconf lifecycle.
- imtcp/imptcp accept RateLimit.Name; when set, tcpsrv/imptcp build the
ratelimiter from the named policy; otherwise legacy interval/burst is
used. Thread-safety retained via ratelimitSetThreadSafe().
- tcpsrv gains ownership helpers for listener params and frees them on
errors; imtcp explicitly transfers ownership and nulls the pointer.
- Optional libyaml: detected at configure; runtime parser loads simple
key/value policy files (interval, burst, severity).
- Docs: new ratelimit object page; imtcp/imptcp parameter references and
module docs updated; design-decisions note added for libyaml.
- Tests: add ratelimit_name.sh (guarded for imtcp+imptcp) to validate
named policy application and observable throttling.
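Illustration: a named-policy registry with per-listener fallback, sketched in Python. A plain threading.Lock stands in for the rwlock, and the class, method names, and duplicate-name handling are hypothetical; they are not the ratelimitAddConfig()/ratelimitNewFromConfig() signatures.

```python
import threading


class RatelimitRegistry:
    """Central store of named (interval, burst) policies."""

    def __init__(self):
        self._lock = threading.Lock()
        self._cfgs = {}

    def add_config(self, name, interval, burst):
        with self._lock:
            if name in self._cfgs:
                raise ValueError(f"duplicate ratelimit name: {name}")
            self._cfgs[name] = (interval, burst)

    def lookup(self, name):
        with self._lock:
            return self._cfgs.get(name)


def listener_limits(registry, name, interval, burst):
    """Use the named policy when given, else the per-listener values."""
    if name is None:
        return (interval, burst)
    cfg = registry.lookup(name)
    if cfg is None:
        raise KeyError(name)
    return cfg


registry = RatelimitRegistry()
registry.add_config("edge", interval=10, burst=500)
shared = listener_limits(registry, "edge", interval=0, burst=0)
```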
Refs: https://github.com/rsyslog/rsyslog/issues/6201
With the help of AI-Agents: Antigravity
imudp: add ratelimit.name support
This commit adds the `ratelimit.name` parameter to imudp, allowing listeners
to utilize the global rate limit registry (shared state).
Features:
- New `ratelimit.name` string parameter.
- Integration with `ratelimitNewFromConfig`.
- Strict mutual exclusivity: specifying `ratelimit.name` prohibits the use
of legacy per-listener parameters (`ratelimit.burst`, `ratelimit.interval`).
If a conflict occurs, an error is logged and the named rate limit takes precedence.
- Updated documentation.
- New regression test `tests/imudp_ratelimit_name.sh`.
With the help of AI Agent: Google Antigravity
Non-technical: improves operator ergonomics and closes a feature gap
with imptcp. Enables regex-based start-of-frame detection and optional
multi-line message handling on TCP inputs.
Impact: New config params; defaults keep existing behavior unchanged.
Before: imtcp framed messages via octet-counting or LF delimiter only.
After: imtcp can treat lines not starting a new frame as continuations
(MultiLine) and can split frames on a regex start pattern.
Technical:
- Adds imtcp params: MultiLine (bool) and framing.delimiter.regex (string).
Regex compilation happens in tcpsrv on listener creation; errors if
regex is set without FEATURE_REGEXP.
- tcps_sess adds a regex-aware path that tracks current-line offset,
runs the compiled regex on line starts, and uses a second buffer to
handle split packets cleanly. On >2x max-line without a match, we
submit and reset to avoid unbounded growth.
- Introduces input state eInMsgCheckMultiLine and LF lookahead to decide
continuation vs new frame; when at buffer end, defers the decision to
the next packet.
- Updates processDataRcvd signature to accept a movable cursor and
buffer length for lookahead; DataRcvd passes these and advances the
pointer accordingly.
- Wires regexp object usage in tcpsrv/tcps_sess init/exit; frees compiled
patterns on listener teardown and error paths. Tests cover both new
code paths (regex framing and multi-line).
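Illustration: regex start-of-frame splitting, sketched in Python on already-separated lines. The function name, the sample pattern, and the sample input are hypothetical; the real session code works on a byte stream with lookahead and a size cap, which this sketch leaves out.

```python
import re


def split_frames(lines, start_regex):
    """Group lines into frames; a line matching start_regex opens a frame."""
    rx = re.compile(start_regex)
    frames = []
    for line in lines:
        if rx.match(line) or not frames:
            frames.append(line)         # new frame starts here
        else:
            frames[-1] += "\n" + line   # continuation of the current frame
    return frames


frames = split_frames(
    ["<13>Jan 1 start", " at foo()", " at bar()", "<13>Jan 1 next"],
    r"<\d+>",
)
```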
Closes https://github.com/rsyslog/rsyslog/issues/5637
Revert the problematic condition added in commit 4748c5746 that
activated the DA worker pool when disk queue (pqDA) has data.
Root Cause:
The DA worker pool (pWtpDA, ConsumerDA function) moves data FROM
the in-memory parent queue TO the disk queue. When activated with
an empty parent queue, it immediately terminates (parent below low
watermark), but the condition remains true, causing an infinite
start/stop loop.
Why the original logic was incorrect:
The commit misunderstood the queue architecture. It tried to solve
slow disk queue draining by activating the DA worker pool, but:
- DA worker pool: Moves memory → disk (for spillover)
- Disk queue workers: Process disk → actions (automatic on load)
When rsyslog restarts with persisted disk queue data:
1. pqDA (disk queue) is loaded from files
2. pqDA's own regular workers start automatically via qqueueStart()
3. Those workers process messages from disk
4. No DA worker pool activation needed!
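Illustration: the start/stop loop from the root-cause description, as a toy Python model. It assumes the in-memory queue stays below the low watermark for the whole run, so every started DA worker terminates at once; the function name, parameters, and numbers are hypothetical.

```python
def count_da_starts(mem_size, high_wm, disk_size, advise_calls, wake_on_disk_data):
    """Count DA worker starts over repeated advise calls."""
    starts = 0
    for _ in range(advise_calls):
        wanted = mem_size >= high_wm or (wake_on_disk_data and disk_size > 0)
        if wanted:
            starts += 1
            # The DA worker only moves memory -> disk. With an empty memory
            # queue it exits immediately and disk_size stays unchanged, so
            # the condition is still true on the next call.
    return starts


with_condition = count_da_starts(0, 8000, 100, advise_calls=5, wake_on_disk_data=True)
after_revert = count_da_starts(0, 8000, 100, advise_calls=5, wake_on_disk_data=False)
```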
Test Results:
- With buggy code: 372 DA worker starts, test unstable
- With revert: 2 DA worker starts (normal), 19/20 test runs pass
- The 1/20 failure is pre-existing test flakiness
The original issue #2646 likely had a different root cause that
needs separate investigation. This revert prevents the regression
while restoring system stability.
Fixes regression in test: daqueue-drain-without-traffic.sh
Relates to: issue #2646, commit 4748c5746
Why:
Disk-assisted queues were taking days to drain after recovery
because the DA worker only activated when the in-memory queue
reached the high watermark, creating a catch-22 when starting
with an empty memory queue but full disk queue.
Impact:
This fix enables proper recovery from backlogs and prevents data
loss from queues that cannot drain. Existing behavior for normal
operations is preserved.
Before:
DA worker only started when: memQueueSize >= highWatermark
After:
DA worker starts when: memQueueSize >= highWatermark OR
diskQueueSize > 0
Technical Overview:
Modified qqueueAdviseMaxWorkers() in runtime/queue.c to check
both the memory queue size against the high watermark (original
condition) and whether the disk queue (pqDA) has pending messages.
This ensures the DA worker activates whenever there is data on
disk to process, not just when new incoming traffic fills the
memory queue. The NULL check for pqDA prevents dereferencing
before the DA queue is initialized. This change maintains the
original high-watermark behavior while adding the recovery path.
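Illustration: the before/after activation condition from this commit, written as a Python predicate. The function name and parameters are hypothetical; None models a DA queue that has not been initialized yet.

```python
def da_worker_wanted(mem_queue_size, high_watermark, disk_queue_size=None):
    """Return (before, after): the old and the new activation decision."""
    before = mem_queue_size >= high_watermark
    has_disk_data = disk_queue_size is not None and disk_queue_size > 0
    after = before or has_disk_data
    return before, after


recovery_case = da_worker_wanted(0, 8000, disk_queue_size=500)
```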
closes https://github.com/rsyslog/rsyslog/issues/2646
With the help of AI-Agents: GitHub Copilot
Why:
Ensures high-performance JSON emission can comply with ECS (Elastic Common
Schema) requirements where numerical zero values should often be omitted
rather than emitted as '0'.
Impact:
Adds a new property() parameter 'omitIfZero' that affects templates using
format='jsonf' and dataType='number'. No change to existing templates.
Before/After:
Previously, numerical properties in jsonf mode always emitted their value
(e.g., '"field":0'); with this change, they can be completely omitted.
Technical Overview:
- Extended templateEntry options in template.h with bOmitIfZero bitfield.
- Updated template.c to parse the 'omitifzero' binary parameter.
- Implemented omission logic in tplJsonRenderValue (template.c) and
jsonField (runtime/msg.c).
- Standardized memory safety by using the project-standard CHKmalloc()
macro for all es_str2cstr() allocations and other memory checks.
- Standardized error handling by replacing explicit gotos with the
FINALIZE; macro across affected areas.
- Formatted modified files using devtools/format-code.sh for full
compliance with project style rules.
- Registered tests/json-omitifzero.sh following the "Define at Top,
Distribute Unconditionally, Register Conditionally" pattern.
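Illustration: the omission logic sketched in Python. The function name, the tuple layout, and the exact output spacing are assumptions; the sketch only shows that a numeric field flagged with omitIfZero is skipped entirely when its value is 0.

```python
def render_jsonf(fields):
    """Render (name, value, omit_if_zero) triples as a JSON object string."""
    parts = []
    for name, value, omit_if_zero in fields:
        if omit_if_zero and value == 0:
            continue  # field is left out, not emitted as 0
        parts.append(f'"{name}":{value}')
    return "{" + ",".join(parts) + "}"


rendered = render_jsonf([("bytes", 0, True), ("port", 514, True), ("retries", 0, False)])
```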
Issue: https://github.com/rsyslog/rsyslog/issues/6176
With the help of AI-Agents: Antigravity
Reduce Makefile clutter and make the test harness easier to reason
about for humans and machines. This also aims to lower CI flakes by
making the dist artifact complete and predictable.
Impact: build/test harness only; no runtime change. New
TEST_RUN_TYPE=MOCK-OK opt-in fast path in diag.sh.
Before: many scattered conditional TESTS entries; some scripts were
only in EXTRA_DIST when corresponding features were enabled, leading
to missing files in "make dist*" tarballs. Duplicates existed.
After: conditional test lists are grouped into variables (e.g.,
TESTS_ELASTICSEARCH_MINIMAL) and appended to TESTS under the same
conditionals; all lists are always added to EXTRA_DIST. Duplicate
entries removed. diag.sh recognizes TEST_RUN_TYPE=MOCK-OK for mock
distchecks and exits success without executing.
Technically, this extracts per-feature test groups into variables,
reuses them in both TESTS (within feature guards) and EXTRA_DIST
(unconditionally), and keeps existing .log chaining to serialize suites.
The change also keeps check_PROGRAMS and environment wiring within the
ENABLE_TESTBENCH guard. The new MOCK-OK path in diag.sh is isolated to
special runs and does not affect normal testing.
With the help of AI Agents: Google Antigravity
Operators want dynstats to survive restarts for consistent metrics and
smoother observability in containers and rolling deploys.
Before: dynstats buckets were ephemeral; restarts reset counters.
After: optional on-disk persistence restores counters; worker thread is
started on demand and torn down with the owning rsconf.
Impact: New state files under WorkDirectory (or statefile.directory)
when enabled; slight I/O overhead whenever a configured threshold
triggers a write. Defaults preserve previous behavior (persistence off).
This adds two thresholds to trigger persistence, both defaulting to 0
(disabled):
- persistStateInterval (count-based)
- persistStateTimeInterval (time-based)
A new statefile.directory can override WorkDirectory for dynstats files.
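The interplay of the two thresholds can be sketched as below. This is an
illustration in Python, not the dynstats C code: the class name
PersistTriggerSketch is hypothetical, and it assumes that the count
threshold counts updates since the last write and that the time
threshold is measured in seconds, neither of which is stated above.

```python
import time


class PersistTriggerSketch:
    """Decide when to persist, given a count and a time threshold (0 = off)."""

    def __init__(self, count_interval=0, time_interval=0, clock=time.monotonic):
        self.count_interval = count_interval  # assumed: updates between writes
        self.time_interval = time_interval    # assumed: seconds between writes
        self.clock = clock
        self.updates = 0
        self.last_persist = clock()

    def on_update(self):
        """Record one counter update; return True if state should be written."""
        self.updates += 1
        now = self.clock()
        by_count = self.count_interval > 0 and self.updates >= self.count_interval
        by_time = (self.time_interval > 0
                   and now - self.last_persist >= self.time_interval)
        if by_count or by_time:
            # Either threshold resets both measurements.
            self.updates = 0
            self.last_persist = now
            return True
        return False
```

With both thresholds left at 0, on_update() never requests a write,
which matches the "persistence off" default.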
On bucket creation, existing JSON state ("dynstats-state:<bucket>")
is loaded to rehydrate counters. Updates may enqueue async writes to a
lazily-started file-write worker; teardown performs a final sync flush
without holding the bucket lock to avoid I/O-induced deadlocks.
Worker lifecycle is tied to rsconf: init in dynstats_initCnf(),
start on first persistent bucket, stop in dynstats_destroyAllBuckets().
The latter now takes rsconf_t* and is invoked from rsconf destruct,
avoiding prior hangs when loadConf/runConf differed. Per-bucket stats
track flushed bytes/counts/errors; a "file-write-worker" group reports
queue size/enqueues. Docs updated; tests add dynstats-persist(+vg) to
verify restore-after-restart and clean shutdown.
With the help of AI Agents: GitHub Copilot, cubic-dev-ai, ChatGPT codex
Co-authored-by: Rainer Gerhards <rgerhards@adiscon.com>
This implements certificate revocation checking using OCSP (RFC 6960)
for the OpenSSL network stream driver. The feature is disabled by
default and can be enabled via the new StreamDriver.TlsRevocationCheck
configuration parameter.
This is a rebased and refactored version of the original implementation
by Daniel Gollub from June 2020, updated to work with the current main
branch and enhanced with proper plumbing, security hardening, tests,
and documentation.
OCSP Implementation:
- Implements OCSP (RFC 6960) for certificate revocation checking
- Supports OCSP over HTTP transport protocol (HTTPS not implemented)
- Supports Nonce extension for replay protection
- Uses "strict" revocation policy (any OCSP error fails verification)
- Does not support TLS OCSP stapling
- CRL-only certificates are not supported
Configuration Parameter:
- New parameter: StreamDriver.TlsRevocationCheck (binary, default: off)
- Can be set at module or input level
- Disabled by default for backward compatibility and to avoid
unexpected blocking I/O in existing configurations
- Only applies to OpenSSL driver (not available for GnuTLS/mbedTLS)
Usage:
module(load="imtcp" StreamDriver.Name="ossl"
StreamDriver.Mode="1"
StreamDriver.AuthMode="x509/name"
StreamDriver.TlsRevocationCheck="on")
Full Plumbing Through Network Stack:
- imtcp: Added iStrmTlsRevocationCheck parameter parsing and config
- tcpsrv: Added DrvrTlsRevocationCheck field and SetDrvrTlsRevocationCheck()
- netstrms: Added Set/Get functions for revocation check configuration
- netstrm: Added SetDrvrTlsRevocationCheck() pass-through
- nsd interface: Bumped version 18 -> 19, added SetTlsRevocationCheck()
- nsd_ossl: Implemented SetTlsRevocationCheck(), stores flag in SSL ex_data
- nsd_gtls: Added stub returning RS_RET_VALUE_NOT_SUPPORTED
- nsd_mbedtls: Added stub returning RS_RET_VALUE_NOT_SUPPORTED
- nsd_ptcp: Added stub returning RS_RET_VALUE_NOT_SUPPORTED
Security Hardening:
- Fixed OCSP_basic_verify() to not use OCSP_TRUSTOTHER flag (prevents
forged OCSP responses from rogue responder certificates)
- Added Content-Length validation (1MB limit) to prevent memory
exhaustion attacks from malicious OCSP responders
- Changed SSL ex_data index from 2 to 3 to avoid collision with imdtls
- Added proper struct field initialization and copying in AcceptConnReq
- Added socket read/write timeouts (SO_RCVTIMEO/SO_SNDTIMEO) to prevent
indefinite blocking during OCSP response I/O (BIO_gets, BIO_write,
d2i_OCSP_RESPONSE_bio operations now bound by OCSP_TIMEOUT)
Compatibility:
- Added OpenSSL 1.0.2 compatibility (CentOS 7 support)
- Disabled OCSP for WolfSSL builds (API not available)
- Fixed variable shadowing warnings
Known Limitations (documented in code and user documentation):
- OCSP checks perform blocking network I/O (DNS + socket operations)
during TLS handshake, which can cause latency of up to 5 seconds
per OCSP responder
- Potential DoS vector: malicious certificates with multiple slow/
unresponsive OCSP responder URLs can block worker threads
- No async OCSP support or response caching (future enhancement)
Tests:
- imtcp-tls-ossl-revocationcheck-off.sh: Verifies parameter can be
set to "off" and normal TLS operation works
- imtcp-tls-gtls-revocationcheck-error.sh: Verifies error message
when attempting to enable OCSP with unsupported GnuTLS driver
Documentation:
- Created comprehensive parameter reference page
- Added EXPERIMENTAL FEATURE warning about blocking I/O and DoS risks
- Integrated into imtcp module documentation
- Documented usage examples and important considerations
Changes from original implementation by Daniel Gollub:
- Moved OCSP functions from nsd_ossl.c to net_ossl.c (new location
for SSL helper functions in current codebase)
- Updated to use SSL_CTX directly instead of separate trusted_issuers
- Added full parameter plumbing through all network stack layers
- Added StreamDriver.TlsRevocationCheck configuration parameter
- Added security hardening (OCSP_TRUSTOTHER fix, Content-Length
validation, ex_data index collision fix, socket timeout fix)
- Added OpenSSL 1.0.2 and WolfSSL compatibility
- Added support for all NSD drivers (stub implementations)
- Added comprehensive tests and documentation
- Fixed variable shadowing and compiler warnings
- Adapted to current code structure and formatting standards
Original-Author: Daniel Gollub <dgollub@att.com>
Co-authored-by: Daniel Gollub <dgollub@att.com>
With the help of AI-Agents: GitHub Copilot CLI
Better observability: expose per-file ingestion metrics so operators can
see if a specific file is active and how much data it contributes over
time.
BEFORE: impstats had no per-file imfile metrics.
AFTER: impstats reports per-file bytes.processed and lines.processed.
Impact: New impstats objects per watched file; minor per-line overhead.
This change introduces a stats object per active imfile file. The object
is named with the file path and marked with origin "imfile". Two new
resettable counters are registered: bytes.processed (offset delta per
read) and lines.processed (incremented on each submitted line). Counters
use atomic helpers to remain thread-safe. Objects are constructed when a
file is opened and destructed when it is closed; associated counter
mutexes are released to avoid leaks. The module now acquires/releases
the statsobj interface during init/exit. A new test
(imfile-statistics.sh) validates single- and multi-file cases and checks
that impstats outputs the expected counters. Build glue is updated to
include and run the new test.
Non-technical: users want SNI support so outbound TLS can target
virtual hosts and interoperate with common TLS gateways and CDNs.
Impact: user-visible behavior change and new config knob; ABI of
internal netstream interfaces incremented (modules must rebuild).
Before/After: previously SNI was never set; now SNI is set to the
target hostname (not for literal IPs), or to a configured value.
This change plumbs a new "remote SNI" through the netstream stack and
omfwd. New API hooks SetRemoteSNI are added to nsd, netstrm, and
netstrms, with IF versions bumped. nsd_ossl and nsd_gtls honor an
explicit remoteSNI first; otherwise they auto-set SNI when the target
is a hostname (skip for IPv4/IPv6 literals). nsd_ptcp rejects SNI with
RS_RET_VALUE_NOT_SUPPORTED. omfwd gains the StreamDriverRemoteSNI and
StreamDriver.RemoteSNI parameters (aliases) and passes the configured
value during TCPSendInitTarget. Destructors in gtls/ossl and netstrms
free the new remoteSNI field.
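The selection order described above (explicit value first, then
hostname, never an IP literal) can be sketched as follows. This is a
Python illustration only; choose_sni_sketch is a hypothetical name, and
the drivers themselves are C code that may detect IP literals
differently.

```python
import ipaddress


def choose_sni_sketch(target, remote_sni=None):
    """Return the SNI value to send, or None when no SNI should be set."""
    if remote_sni:
        # An explicitly configured remote SNI always wins.
        return remote_sni
    try:
        ipaddress.ip_address(target)
    except ValueError:
        # Not an IPv4/IPv6 literal, so treat the target as a hostname.
        return target
    # IP literal: skip SNI.
    return None
```

For example, choose_sni_sketch("192.0.2.1") yields None, while
choose_sni_sketch("192.0.2.1", "logs.example.com") yields the
configured override.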
Tests add helper SNI servers (OpenSSL and GnuTLS) and cover three
cases per TLS lib: no SNI for IP targets, auto SNI for hostnames, and
configured SNI override. Build glue and diag helpers are included.
Rebase in 2026 and some fixups done with the help of AI Agents:
ChatGPT Codex
Co-authored-by: Rainer Gerhards <rgerhards@adiscon.com>
Signed-off-by: Rainer Gerhards <rgerhards@adiscon.com>
Users need to parse delimited strings (CSV, tags, paths) into arrays
for iteration or JSON output without external processing.
Impact: New RainerScript function available to all users.
Before: No native way to split strings into arrays in RainerScript.
After: split(string, separator) returns a JSON array of substrings.
Technical overview:
- Implements doFunct_split() in grammar/rainerscript.c
- Registers "split" in scriptFunct table with 2 required args
- Adds CNFFUNC_SPLIT enum in rainerscript.h
- Uses unified strstr-based iteration for all separator lengths
- Handles edge cases: empty input, leading/trailing/consecutive
  delimiters
- Includes error handling for json-c memory allocation failures
- Returns empty JSON array on null/empty input or separator
- Includes documentation (rs-split.rst) and test scripts
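The strstr-based iteration can be sketched as follows. This is a
Python illustration, not the C code in grammar/rainerscript.c; the name
split_sketch is hypothetical, and it assumes that leading, trailing,
and consecutive delimiters produce empty substrings, which the overview
above does not spell out.

```python
def split_sketch(text, sep):
    """Split text on sep using a find-and-advance loop (strstr style)."""
    # Null/empty input or separator yields an empty array, as stated above.
    if not text or not sep:
        return []
    parts = []
    start = 0
    while True:
        # str.find plays the role of strstr(): next match at or after start.
        hit = text.find(sep, start)
        if hit == -1:
            parts.append(text[start:])  # remainder after the last separator
            break
        parts.append(text[start:hit])
        start = hit + len(sep)  # same step for separators of any length
    return parts
```

Because the loop advances by len(sep), single-character and
multi-character separators share one code path.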
Improve usability by providing a simple way to check if an IP is inside
a CIDR subnet directly in RainerScript. This reduces awkward workarounds
and makes common filtering and routing tasks easier to express.
Impact: New function; existing configurations are unaffected.
Before: No built-in to test membership of an IP in a CIDR subnet.
After: is_in_subnet(ip, cidr) returns 1 if ip is in cidr, else 0.
Add is_in_subnet() as a built-in taking two args (IP string and CIDR).
Both IPv4 and IPv6 are supported. Inputs are parsed with inet_pton; the
CIDR mask is validated for range (0..32 / 0..128). Matching is done by
masking both the address and the network and comparing results. Invalid
inputs and family mismatches yield 0. The function returns a numeric
value. It is registered in the functions[] table and documented. Tests
cover IPv4/IPv6 basics, /0 and host masks, mismatches, and invalid
inputs. No HUP/state or OMODTX semantics are involved.
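The mask-and-compare step can be sketched as below. This is a Python
illustration of the described approach, not the C implementation; the
name is_in_subnet_sketch is hypothetical, and details such as how the
real function parses the prefix length are assumptions.

```python
import socket


def is_in_subnet_sketch(ip, cidr):
    """Return 1 if ip lies inside cidr, else 0 (also 0 for invalid input)."""
    net, slash, bits = cidr.partition("/")
    if not slash or not (bits.isascii() and bits.isdigit()):
        return 0
    prefix = int(bits)
    for family, width in ((socket.AF_INET, 32), (socket.AF_INET6, 128)):
        try:
            net_raw = socket.inet_pton(family, net)
        except (OSError, ValueError):
            continue  # network part is not of this address family
        try:
            ip_raw = socket.inet_pton(family, ip)
        except (OSError, ValueError):
            return 0  # invalid address or family mismatch
        if prefix > width:
            return 0  # mask out of range (0..32 / 0..128)
        # Build a mask with `prefix` leading one-bits, then compare the
        # masked address with the masked network.
        mask = ((1 << prefix) - 1) << (width - prefix)
        ip_int = int.from_bytes(ip_raw, "big")
        net_int = int.from_bytes(net_raw, "big")
        return 1 if (ip_int & mask) == (net_int & mask) else 0
    return 0
```

Masking both sides means a CIDR whose host bits are set, such as
192.168.1.77/24, still matches the whole /24 in this sketch.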
closes: https://github.com/rsyslog/rsyslog/issues/1391
With the help of AI Agents: Google Jules, Gemini (CLI),
ChatGPT Codex (CLI)
Among other things, this patch includes a test for a hypothetical data
pipeline that ingests QRadar JSON, transforms it, and ships it to the
final destination.
The mmsnareparse tests fail on Launchpad builds because three
required test data files are not included in the distribution
tarball when make dist is run.
The files sample-windows2022-security.data,
sample-windows2025-security.data, and sample-events.data were
missing from EXTRA_DIST in tests/Makefile.am, causing test
failures with "No such file or directory" errors.
This patch adds all three missing files to EXTRA_DIST so they
are properly included in distribution packages.
Fixes: https://github.com/rsyslog/rsyslog/issues/6360
Document how to use sparseArray with ipv42num() for efficient
IPv4 subnet matching.
Add a regression test to verify this functionality.
see also: https://github.com/rsyslog/rsyslog/issues/4906
Simplify large-scale configs by auto-discovering receivers via DNS SRV
records. This reduces per-host configuration and helps enterprise and
container setups where target pools change over time.
Impact: new param `targetSrv`; config now errors on conflicts or empty
SRV answers; feature depends on resolver support.
Before: omfwd required a static host/port list via `target`/`port`.
After: `targetSrv` resolves `_syslog._udp|_tcp.<domain>` to build the
target pool, honoring RFC 2782 priority/weight and reusing existing
pool/load-balance logic.
Technically, add action param `targetSrv` (mutually exclusive with
`target`). During action init, perform SRV query via resolver
(`res_nquery`, `ns_initparse`) and translate answers into host/port
pairs. Preserve priority; randomly order same-priority entries using
weights. If explicit ports were set, warn and ignore when `targetSrv`
is used. Link rsyslogd with libresolv when available; configure checks
for headers and `ns_initparse`. Provide clear error paths (config check
fails) for missing support or empty SRV response. Docs cover usage and
env overrides `RSYSLOG_DNS_SERVER`/`RSYSLOG_DNS_PORT`. Tests add a
minimal UDP DNS server and cases for TCP/UDP success and error paths.
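The priority/weight ordering can be sketched as follows, based on the
selection procedure suggested in RFC 2782. This is a Python
illustration, not the omfwd C code; order_srv_sketch and the tuple
layout are hypothetical, and the actual implementation may pick
same-priority entries differently.

```python
import random


def order_srv_sketch(records, rng=None):
    """Order (priority, weight, host, port) tuples into (host, port) pairs."""
    records = list(records)
    rng = rng or random.Random()
    ordered = []
    # Lower priority values are tried first.
    for prio in sorted({rec[0] for rec in records}):
        # RFC 2782 suggests placing weight-0 entries first within a group.
        group = sorted((rec for rec in records if rec[0] == prio),
                       key=lambda rec: rec[1] != 0)
        while group:
            total = sum(rec[1] for rec in group)
            pick = rng.randint(0, total)
            running = 0
            for idx, rec in enumerate(group):
                running += rec[1]
                # First entry whose running weight sum reaches the pick wins.
                if running >= pick:
                    ordered.append((rec[2], rec[3]))
                    del group[idx]
                    break
    return ordered
```

Higher weights make an entry more likely to come early within its
priority group, while entries from a lower-priority group never
overtake a higher-priority one.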
Fixes: https://github.com/rsyslog/rsyslog/issues/6314
With the help of AI Agent: ChatGPT Codex
This fix ensures that parse_json() only succeeds if the entire input
string is a valid JSON value. This prevents false positives when a
non-JSON string happens to start with a valid JSON value, like a number.
Documentation is updated to reflect this stricter validation.
Impact: Corrects false-success in parse_json() for malformed input.
Modified doFunc_parse_json in grammar/rainerscript.c to check if the
json-c tokener consumed the entire provided string. After parsing, the
remainder of the string is scanned for any non-whitespace characters.
If trailing garbage is found, the function now returns RS_SCRIPT_EINVAL
instead of RS_SCRIPT_EOK. Updated rs-parse_json.rst to document the
requirement for a complete JSON object/value. Added a regression test and
updated the testbench Makefile.am to include the new validation scenario.
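The "whole input must be consumed" check can be sketched as follows.
This is a Python illustration using the standard json module rather
than json-c; parse_json_strict_sketch is a hypothetical name, and the
boolean result stands in for RS_SCRIPT_EOK / RS_SCRIPT_EINVAL.

```python
import json

JSON_WS = " \t\r\n"  # the whitespace characters JSON permits


def parse_json_strict_sketch(text):
    """Return (ok, value); ok is False on parse errors or trailing garbage."""
    body = text.lstrip(JSON_WS)
    try:
        # raw_decode parses one value and reports where it stopped.
        value, end = json.JSONDecoder().raw_decode(body)
    except json.JSONDecodeError:
        return False, None
    # Scan the remainder: only whitespace may follow the parsed value.
    if body[end:].strip(JSON_WS):
        return False, None
    return True, value
```

An input such as "123abc" parses a leading number but is rejected by
the remainder scan, which is the false-success case this fix targets.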
Fixes: https://github.com/rsyslog/rsyslog/issues/4970
AI-Agent: Antigravity
This change adds the capability to overwrite the statistics log file
instead of appending to it. This is particularly useful for
observability tools like Prometheus scraping sidecars or node exporter,
which expect a consistent and complete set of metrics in a single file.
The implementation ensures atomicity by writing the statistics to a
temporary file and then renaming it to the final destination. This
prevents reader processes from seeing partial or inconsistent data
during the emission process.
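The write-then-rename pattern can be sketched as follows. This is a
Python illustration, not the code in impstats.c; the function name and
the temporary file naming are hypothetical. It relies on the rename
being atomic, which holds on POSIX when both paths are on the same
filesystem.

```python
import os
import tempfile


def write_stats_atomically(path, payload):
    """Replace the file at path with payload without exposing partial data."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file next to the target so the rename stays on one
    # filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".stats-",
                                    suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as tmp:
            tmp.write(payload)
            tmp.flush()
            os.fsync(tmp.fileno())
        # Readers see either the old file or the complete new one.
        os.replace(tmp_path, path)
    except BaseException:
        # Best-effort cleanup of the temp file before re-raising.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

A scraper that opens the destination at any moment therefore reads a
complete set of metrics from one emission cycle.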
This commit includes:
- The implementation in impstats.c.
- New test cases in the testbench.
- User-facing documentation for the new parameter.
Impact: Users can now enable atomic overwrites using
log.file.overwrite="on". Default behavior remains append.
Refs: no issue
AI-Agent: Antigravity