21035 Commits

Author SHA1 Message Date
Rainer Gerhards
265f0cb2c2
Merge pull request #7064 from rsyslog/ai-fix-imptcp-nonprocessing-poller
imptcp: serialize helper work per session
2026-05-27 15:31:12 +02:00
Rainer Gerhards
3fddf8db8f doc: fix mmrfc5424addhmac sd_id parameter name 2026-05-27 15:30:04 +02:00
Rainer Gerhards
bb9dc79120
Merge pull request #7068 from rgerhards/ai-ci-relevance-gates
ci: gate expensive PR test families
2026-05-27 14:49:14 +02:00
Rainer Gerhards
fea61cc350 docker: fix clickhouse configure flag in Debian 13 dev image 2026-05-27 14:02:19 +02:00
Rainer Gerhards
5164242c94
Merge pull request #7069 from rgerhards/codex/i4199-ommysql-disconnect
ommysql: guard transaction commit after disconnect
2026-05-27 13:58:12 +02:00
Rainer Gerhards
e02fd0853d runtime: guard per-thread regexp iterator allocation 2026-05-27 13:35:13 +02:00
Rainer Gerhards
f1f34f0460
Merge pull request #7065 from rsyslog/ai-otel-port0-probe
testbench: bind OTEL collector on dynamic port
2026-05-27 13:27:45 +02:00
Rainer Gerhards
ffafda3b55 configure: detect relpCltSetKeepAlive support
Why:
omrelp keepalive settings were accepted but could be compiled out
because configure never defined HAVE_RELPCLTSETKEEPALIVE.

Impact: keepalive settings are now applied when librelp exports
the keepalive API.

Before/After: before keepalive options could be ignored silently;
after they are compiled in when supported by librelp.

Technical Overview:
- Add an AC_CHECK_FUNC probe for relpCltSetKeepAlive in the RELP
  configure block.
- Define HAVE_RELPCLTSETKEEPALIVE when the symbol is available.
- This aligns configure-time feature detection with the existing
  omrelp compile-time guard around relpCltSetKeepAlive().

With the help of AI-Agents: GPT-5.3-Codex
2026-05-27 13:24:10 +02:00
Rainer Gerhards
acea40e0c4 imptcp: serialize helper work per session
Why:
The de-flake campaign exposed a real imptcp race in the
processOnPoller="off" path under Ubuntu 26 TSAN. Multiple helper
workers could process one session concurrently and race on parser state.

Impact:
Fixes imptcp helper-worker session handling without reducing test scope.

Before/After:
Before, helper workers could race on one session; after, one worker owns
session processing, close, and rearm at a time.

Technical Overview:
Add a per-session queued-work flag protected by rsyslog's atomic helper.
Claim session epoll work before queueing it to helper workers.
Serialize receive parsing, zlib finish, session close, and epoll rearm.
Drop duplicate same-session events while already queued or processing.
Release the work claim before rearming the EPOLLONESHOT descriptor so a
fresh event cannot be lost behind the processing guard.
Avoid holding a pthread mutex across recv(), which would both hurt the
hot path and trip the clang static analyzer's blocking-in-critical-section
check.
Keep listener work concurrent and preserve helper parallelism across
independent sessions.
Document the non-processing-poller test intent and oracle.
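
The claim-before-queue idea above can be modeled in a few lines of Python. This is only a sketch under assumptions: the real change is C code using rsyslog's atomic helper and epoll, a `threading.Lock` stands in for the atomic flag update here, and the names `Session`, `try_claim`, `release`, `handle_event`, and `helper_worker` are hypothetical.

```python
import threading


class Session:
    """Hypothetical stand-in for an imptcp session with a queued-work flag."""

    def __init__(self):
        self._flag_lock = threading.Lock()  # models the atomic helper
        self._queued = False
        self.processed = 0

    def try_claim(self):
        """Set the flag atomically; return True only for the first claimant."""
        with self._flag_lock:
            if self._queued:
                return False  # duplicate event: already queued or processing
            self._queued = True
            return True

    def release(self):
        """Clear the flag; per the commit, this precedes the epoll rearm."""
        with self._flag_lock:
            self._queued = False


def handle_event(session, work_queue):
    """Queue session work only if no helper already owns the session."""
    if session.try_claim():
        work_queue.append(session)
        return True
    return False


def helper_worker(session):
    """Process one claimed session, then release the claim."""
    session.processed += 1  # stands in for recv() and parsing, done unlocked
    session.release()
    # an EPOLLONESHOT rearm would follow here, after the release
```

The flag lock is held only while the flag is read or written, never across the simulated receive, which mirrors the stated goal of not holding a mutex across recv().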

With the help of AI-Agents: Codex
2026-05-27 13:20:25 +02:00
Rainer Gerhards
83807078ce
Merge pull request #7037 from rgerhards/antigravity-i-6017
parser: verify and fix offAfterPRI calculation
2026-05-27 13:18:11 +02:00
Rainer Gerhards
9c83614ce4
Merge pull request #7057 from rsyslog/codex/propose-fix-for-addlf-signature-vulnerability
runtime: bound KSI debug record logging by length
2026-05-27 13:17:19 +02:00
Rainer Gerhards
1e80f70356 ci: gate expensive PR test families
Why:
Regular PR CI should avoid waking long-running service-backed tests when a
change only touches unrelated helper code. Kafka, imfile, and Elasticsearch
are frequent long-tail costs, so they need focused relevance gates without
weakening full CI and flake-testing workflows.

Impact:
PR CI omits Kafka, imfile, and Elasticsearch tests for unrelated helper-only
changes, while direct module/test changes and plausible shared runtime paths
still run those families. Local CI-container runs can apply the same
relevance policy before devtools/run-ci.sh.

Before/After:
Before, broad runtime patterns made these expensive families run too often;
after, they use explicit focused dependency rules with full-run overrides.

Technical Overview:
Move the remaining root-level runtime C/H files under runtime/ so path-based
rules can reason about core code consistently. Keep conservative broad
relevance for service families that do not yet have focused dependency
rules. Add focused relevance for Kafka, imfile, and Elasticsearch covering
module paths, tests, build/testbench plumbing, config/message/action/queue,
worker, template, ruleset, parser, stats, and selected family-specific
runtime helpers. Keep isolated helpers such as lookup tables, dynstats, DNS
cache, crypto/KSI, GSSAPI, and unrelated protocol helpers from waking those
families. Add devtools/apply-service-relevance.sh so GitHub Actions and local
container testing share the same relevance-to-configure suppression logic.
Centralize Elasticsearch and Kafka job decisions on the top-level
change-scope outputs so scheduled jobs always run their test body. Preserve
RSYSLOG_TESTBENCH_FORCE_SERVICE_TESTS,
RSYSLOG_TESTBENCH_FORCE_<MODULE>_TESTS, and
RSYSLOG_TESTBENCH_SKIP_SERVICE_RELEVANCE so daily, weekly, and flake runs
can still force all tests even when there are no relevant changes. Document
that AI agents must validate both the relevance decision layer and the
resulting configured test list when changing these gates.

Validation:
bash -n tests/diag.sh devtools/apply-service-relevance.sh
git diff --check
actionlint .github/workflows/run_checks.yml
shellcheck -S warning devtools/apply-service-relevance.sh
module-needs-testing rule matrix for kafka, imfile, elasticsearch, mysql
Temporary git-diff probes for runtime/lookup.c and runtime/action.c
Source helper checks for runtime/lookup.c and runtime/action.c
Ubuntu 26.04 container make distclean plus MOCK-OK run-ci for runtime/lookup.c

With the help of AI-Agents: Codex
2026-05-27 12:46:58 +02:00
Rainer Gerhards
9ccb1dad14 testbench: bind OTEL collector on dynamic port
Why:
The de-flake campaign exposed a get_free_port race in the OTEL
collector test helper. A parallel test could claim the selected port
before otelcol bound it, while readiness checks still connected to the
wrong service.

Impact:
Makes OTEL-backed tests publish only collector-owned listener ports.

Before/After:
Before, OTEL tests preselected a racy port; after, otelcol binds
localhost port 0 and the testbench discovers the owned OTLP listener.

Technical Overview:
Configure the OTEL collector receiver and metrics endpoint with
localhost dynamic ports by default.
Start otelcol with exec so the stored PID owns the listener sockets.
Discover the actual OTLP HTTP port from /proc socket ownership and a
/v1/logs probe.
Write the test port file only after discovery and readiness succeed.
Keep explicit nonzero OTEL_COLLECTOR_ENDPOINT overrides working.
Move the discovery logic into an in-tree Python helper so normal Python
linting can inspect it.
Register the helper in EXTRA_DIST.
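
The difference between preselecting a port and letting the listener own it can be illustrated with a small in-process Python sketch. It is not the in-tree helper: that helper has to discover the port of a separate otelcol process via /proc and a /v1/logs probe, whereas this example binds and probes inside one process. The function names are hypothetical.

```python
import socket


def bind_dynamic_listener(host="127.0.0.1"):
    """Bind to port 0 so the kernel picks a free port this socket owns."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind((host, 0))
    sock.listen(1)
    # getsockname() reports the port the kernel actually assigned
    return sock, sock.getsockname()[1]


def port_is_ready(host, port, timeout=1.0):
    """Readiness probe: succeed only if something accepts on that port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Because the port number is read back from the bound socket, there is no window in which another process can claim it between selection and bind.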

With the help of AI-Agents: Codex
2026-05-27 12:46:58 +02:00
Rainer Gerhards
107446947a ommysql: handle closed connection before commit
closes https://github.com/rsyslog/rsyslog/issues/4199
2026-05-27 12:43:45 +02:00
Rainer Gerhards
10fe8ba585 parser: verify and fix offAfterPRI calculation
closes https://github.com/rsyslog/rsyslog/issues/6017

Why:
The internal offAfterPRI field tracks the offset in raw messages
immediately after the PRI. This was inconsistently calculated across
modules (e.g. imuxsock omitted the closing '>') and was prone to
parsing invalid strings (e.g. '<>') as valid PRI offsets. This
caused misalignments and potential out-of-bounds risks in downstream
parser modules.

Impact:
Stabilizes syslog parsing; downstream modules consistently receive
accurate raw message text.

Before/After:
offAfterPRI was inconsistently calculated or misaligned on
malformed/special inputs; now it is centrally validated and correct.

Technical Overview:
Extracted the PRI offset logic into a strict static helper
compute_off_after_pri in runtime/parser.c to parse 1..3 digits
between '<' and '>'. Refactored ParsePRI to use this helper. Enhanced
MsgSetAfterPRIOffs in runtime/msg.c with defensive assertions to
validate offsets and enclosing brackets. Updated the legacy imuxsock
parser to set the correct offs + 1 offset when the closing '>' is
present. Created a pure C unit test checking 10 distinct
RFC3164/RFC5424 corner cases.
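
As an illustration of the strict rule described above (1..3 digits between '<' and '>'), here is a Python rendering of the idea. It is a sketch, not the C helper in runtime/parser.c: the return convention (0 for "no valid PRI") is an assumption made for this example.

```python
def compute_off_after_pri(raw: bytes) -> int:
    """Return the offset just past '>' for a valid PRI, else 0.

    Assumed convention for this sketch: 0 means no valid PRI was found.
    A valid PRI is '<', then 1..3 ASCII digits, then '>'.
    """
    if not raw.startswith(b"<"):
        return 0
    i = 1
    # consume at most three digits
    while i < len(raw) and i <= 3 and 0x30 <= raw[i] <= 0x39:
        i += 1
    if i == 1:
        return 0  # no digits at all, e.g. b"<>"
    if i >= len(raw) or raw[i] != ord(">"):
        return 0  # missing closing '>' or more than three digits
    return i + 1  # offset immediately after the closing '>'
```

For a message such as `b"<13>msg"`, slicing the input at the returned offset leaves the text after the PRI, which is the alignment the commit says downstream modules rely on.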

With the help of AI-Agents: Antigravity
2026-05-27 12:16:25 +02:00
Rainer Gerhards
26627f5a35
imfifo: implement named pipe input module (#7029)
* imfifo: implement named pipe input module

Why:
Allows rsyslog to read logs line-by-line from local POSIX named pipes
(FIFOs) without blocking the startup sequence or spinning on EOF
disconnect loops.

Impact:
Adds the 'imfifo' input module and registers its test suite.

Before/After:
Rsyslog had no native named pipe input capability; now imfifo
provides dynamic, non-blocking FIFO input instances.

Technical Overview:
- Integrated imfifo into the autotools build system with
  --enable-imfifo.
- Implemented plugins/imfifo/imfifo.c using the modern v6 config
  syntax.
- Used open(path, O_RDWR) to keep a dummy writer, avoiding
  startup hangs and EOF reopen loops.
- Implemented a select()-based polling loop with a 100 ms timeout
  for clean, quick shutdown.
- Split incoming chunks by newline, submitting complete
  messages using submitMsg2.
- Created tests/imfifo.sh and tests/imfifo-vg.sh to verify
  correct function and Valgrind compatibility.
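
The polling and newline-splitting steps listed above can be sketched in Python. This is an illustration under assumptions, not the module's C code: `LineSplitter` and `poll_fifo_once` are hypothetical names, and whether imfifo submits empty lines is not stated in the commit message (this sketch returns them as empty byte strings).

```python
import os
import select


class LineSplitter:
    """Accumulate raw chunks and emit only complete newline-terminated lines."""

    def __init__(self):
        self._partial = b""

    def feed(self, chunk: bytes):
        data = self._partial + chunk
        # everything before the last newline is complete; the tail is kept
        *complete, self._partial = data.split(b"\n")
        return complete


def poll_fifo_once(fd, splitter, timeout=0.1):
    """Wait up to `timeout` seconds for data; return any complete lines."""
    readable, _, _ = select.select([fd], [], [], timeout)
    if not readable:
        return []  # timeout: gives the caller a chance to check for shutdown
    return splitter.feed(os.read(fd, 4096))
```

On Linux, a descriptor obtained with `os.open(path, os.O_RDWR)` on a FIFO could be passed as `fd`; opening read-write keeps a writer attached, which is the trick the commit describes for avoiding startup hangs and EOF reopen loops.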

closes https://github.com/rsyslog/rsyslog/issues/440

With the help of AI-Agents: Antigravity
2026-05-27 11:47:21 +02:00
Rainer Gerhards
c568975546
tests: widen service relevance defaults (#7055)
* tests: widen service relevance defaults

Why: Service-backed tests were skipped for broad, non-module edits that
can still affect service integrations.

Impact: Elasticsearch, MySQL/libdbi, and Kafka setup paths run for
shared core, build, workflow, and testbench changes.

Before/After: Before, only runtime and a narrow allow-list triggered
service tests; after, common cross-cutting edits also trigger them.

Technical Overview: Extend the generic module_needs_testing()
changed-file gate in tests/diag.sh.
Treat top-level C/H changes as globally relevant because they include
shared engine files such as action.c/template.c.
Treat build and CI metadata updates (.mk, m4, workflows) as relevant
so service jobs selected by CI do not self-skip prematurely.
Treat testbench shell/testsuites edits as relevant because service
orchestration and service-specific assertions live under tests/.
Keep module-specific path matching unchanged for targeted triggering.

With the help of AI-Agents: GPT-5.3-Codex
2026-05-27 11:45:05 +02:00
cursor[bot]
a1611a71ab
tests: cover mmjsonparse find-json conflict path
Why:
The mmjsonparse find-json ownership fix is already present via PR #7016,
but the conflict-container path still needs explicit regression coverage.

Impact:
Adds focused normal and Valgrind testbench coverage for msgAddJSON failure
after mmjsonparse hands off a parsed JSON object.

Before/After:
Before, the negative path relied on manual reasoning and broad coverage.
After, the testbench asserts rsyslog continues processing the trigger
message, and the Valgrind wrapper checks that the parsed object is not
released twice.

Technical Overview:
1. Add mmjsonparse-find-json-conflict.sh for the conflicting-container path.
2. Add a Valgrind wrapper for the same scenario.
3. Register both tests in tests/Makefile.am.

With the help of AI-Agents: Codex
2026-05-27 11:24:33 +02:00
Rainer Gerhards
b4e306ca1b
Merge pull request #7062 from rsyslog/codex/propose-fix-for-dynstats-persistence-vulnerability
dynstats: coalesce pending persistence writes per bucket
2026-05-27 10:56:34 +02:00
Rainer Gerhards
3bb845b37f
Merge pull request #7063 from rsyslog/codex/fix-nul-byte-certificate-impersonation-vulnerability
runtime: reject embedded NULs in mbedtls certificate names
2026-05-27 10:55:15 +02:00
Rainer Gerhards
e594341051
Merge pull request #7067 from rgerhards/ai-fix-da-mainmsg-q-flowctl
testbench: fix da-mainmsg-q flow control
2026-05-27 10:50:38 +02:00
Rainer Gerhards
de69e1859b
Merge pull request #7066 from rsyslog/codex/fix-man-page-sigint-shutdown-option
doc: fix invalid SIGINT config parameter spelling in rsyslogd.8
2026-05-27 10:47:51 +02:00
Rainer Gerhards
7969e3ec95
testbench: fix da-mainmsg-q flow control
Why:
da-mainmsg-q is meant to exercise disk-assisted main queue draining,
but its diagnostic injector could overrun the deliberately tiny queue
under CI stress. That made the test report message loss before it had
actually isolated the DA queue behavior it intends to verify.

Impact:
Reduces da-mainmsg-q flakes without weakening the tested DA queue oracle.

Before/After:
Before, imdiag injected a 2000-message burst as non-delayable traffic;
after, the burst participates in queue flow control and the final output
count is observed before shutdown.

Technical Overview:
Set RSTB_IMDIAG_INJECT_DELAY_MODE=full before generate_conf so imdiag
marks generated messages as fully delayable. This keeps the test's small
queue configuration intact while avoiding diagnostic-input loss as a side
effect of the stress setup.

The test still verifies the complete sequence 0..2099 after forcing DA
mode. It now also waits for the final 2100 output lines after the post-DA
recovery burst, so shutdown is not used as a substitute for the omfile
output oracle.

The header comment was updated to document the setup, stimulus, oracle,
and why the injection mode is part of the test plumbing rather than the
behavior under test.

With the help of AI-Agents: OpenAI Codex
2026-05-27 10:31:02 +02:00
Rainer Gerhards
723d50dd48 doc: fix shutdown.enable.ctlc spelling in rsyslogd man page 2026-05-27 09:51:32 +02:00
Rainer Gerhards
1662279a00
Merge pull request #7048 from rsyslog/codex/fix-ocsp-cache-stale-response-vulnerability
ossl: avoid caching stale OCSP responses when nextUpdate is not future
2026-05-27 09:39:14 +02:00
Rainer Gerhards
fe0834698d
Merge pull request #7046 from rsyslog/codex/fix-double-free-in-impstats-batching
impstats: fix double-free in remote write batching cleanup
2026-05-27 09:37:46 +02:00
Rainer Gerhards
eea4307d4e
Merge pull request #7047 from rsyslog/codex/fix-imuxsock-ratelimit.name-vulnerability
imuxsock: prevent ratelimit.name bypass in credentialed sender path
2026-05-27 09:33:47 +02:00
Rainer Gerhards
a6490d80cf
Merge pull request #7044 from rsyslog/codex/fix-unbounded-http-response-buffering
omazuredce: bound HTTP response buffering
2026-05-27 09:29:15 +02:00
Rainer Gerhards
b18f0feb93
Merge pull request #7042 from rsyslog/codex/fix-gzip-buffer-underallocation-issue
omazuredce: avoid gzip buffer underallocation
2026-05-27 09:25:08 +02:00
Rainer Gerhards
649f62da74
Merge pull request #7059 from rsyslog/codex/fix-dependency-on-disabled-regexp-module
runtime: guard optional regexp object lifecycle in tcp server
2026-05-27 09:22:11 +02:00
Rainer Gerhards
8edc4ba40a
Merge pull request #7049 from rsyslog/codex/fix-unbounded-json-batch-fan-out-in-imkafka
imkafka: cap split.json.records fan-out
2026-05-27 09:16:54 +02:00
Rainer Gerhards
185d878a7f
Merge pull request #7056 from rsyslog/codex/fix-libgcrypt-configure-check-bug
configure: avoid leaking -lgcrypt into global LIBS
2026-05-27 09:15:49 +02:00
Rainer Gerhards
afa1f15419 mbedtls-refine-nul-name-checks 2026-05-27 08:34:29 +02:00
Rainer Gerhards
540d52f827 dynstats-unlock-on-task-alloc-failure 2026-05-27 08:33:21 +02:00
Rainer Gerhards
6311a03eb5 mbedtls-apply-cert-name-style 2026-05-27 08:33:21 +02:00
Rainer Gerhards
6b092be0ab regexp-clean-up-per-thread-entries 2026-05-27 08:29:24 +02:00
Rainer Gerhards
ca6466f9b1 lmsig-ksi-bound-debug-precision 2026-05-27 08:29:24 +02:00
Rainer Gerhards
96e96aa9a1 runtime: reject embedded NULs in mbedtls cert names 2026-05-27 08:26:41 +02:00
Rainer Gerhards
bf1420ba62 dynstats: coalesce pending persistence writes per bucket 2026-05-27 08:26:29 +02:00
Rainer Gerhards
50487ffc51 runtime-format-regexp-lifecycle-guards 2026-05-27 08:22:27 +02:00
Rainer Gerhards
3af318be1c
Merge pull request #7061 from rsyslog/dependabot/github_actions/github-actions-441257e92f
ci: bump the github-actions group with 3 updates
2026-05-27 08:18:35 +02:00
dependabot[bot]
63e0d9f601
ci: bump the github-actions group with 3 updates
Bumps the github-actions group with 3 updates: [github/codeql-action](https://github.com/github/codeql-action), [docker/setup-buildx-action](https://github.com/docker/setup-buildx-action) and [docker/build-push-action](https://github.com/docker/build-push-action).


Updates `github/codeql-action` from 4.35.5 to 4.36.0
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](9e0d7b8d25...7211b7c807)

Updates `docker/setup-buildx-action` from 4.0.0 to 4.1.0
- [Release notes](https://github.com/docker/setup-buildx-action/releases)
- [Commits](4d04d5d948...d7f5e7f509)

Updates `docker/build-push-action` from 7.1.0 to 7.2.0
- [Release notes](https://github.com/docker/build-push-action/releases)
- [Commits](bcafcacb16...f9f3042f7e)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-version: 4.36.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: github-actions
- dependency-name: docker/setup-buildx-action
  dependency-version: 4.1.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: github-actions
- dependency-name: docker/build-push-action
  dependency-version: 7.2.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: github-actions
...

Signed-off-by: dependabot[bot] <support@github.com>
2026-05-26 22:26:52 +00:00
Cursor Agent
10f0aaf211
perctile: link buckets after initialization
Why: percentile stats buckets should become visible only after they are fully initialized.

Impact: avoids publishing partially initialized buckets and removes redundant teardown unlinking.

Before/After: before, bucket construction linked into the list before stats registration; after, list insertion is the final locked step.

Technical Overview:
Move perctile bucket insertion after stats object and metric setup complete.
Protect the list mutation with the existing perctile bucket-list rwlock.
Remove the helper that unlinked buckets during destruction, as failed setup never publishes buckets and normal teardown already pops buckets before destruction.

With the help of AI-Agents: GPT-5.5

Co-authored-by: Andre Lorbach <alorbach@adiscon.com>
2026-05-26 21:33:08 +00:00
Cursor Agent
eee51f9100
perctile: reuse counter cleanup helper
Why: keep the percentile stats teardown cleanup path consistent and easier to audit.

Impact: no behavior change; cleanup code now uses the existing helper.

Before/After: before, one error path manually destroyed counters; after, it uses the shared helper.

Technical Overview:
Use perctileDestroyCounter() in the perctileAddBucketMetrics() finalize path.
This keeps reference clearing in one helper and mirrors dynstats cleanup style.
The helper still handles NULL references and clears each counter handle after destruction.

With the help of AI-Agents: GPT-5.5

Co-authored-by: Andre Lorbach <alorbach@adiscon.com>
2026-05-26 21:06:25 +00:00
Cursor Agent
c1109c96ab
stats: avoid notifier teardown deadlocks
Why: stats-enabled reloads and shutdowns must not hang while tearing down dynamic statistics buckets.

Impact: percentile and dynstats teardown no longer inverts locks with impstats scrapes.

Before/After: before, teardown could hold bucket locks while waiting for the global stats list mutex; after, stats objects are detached before bucket storage is freed.

Technical Overview:
Detach per-bucket stats objects from the global stats registry before taking bucket teardown locks.
Unlink counter lists before stats object destruction so backing counter handles can be released after registry removal.
Pop buckets from the configured bucket list one at a time, releasing the list lock before object teardown.
Track percentile bucket parent state so global counter handles can be released correctly after the global stats object is detached.
Reuse helper cleanup paths to clear counter references and avoid double destruction after partial setup failures.

With the help of AI-Agents: GPT-5.5
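
The lock ordering this commit describes (pop one bucket under the list lock, release it, then detach and destroy) can be modeled roughly as follows. This is a Python sketch under assumptions, not the C implementation: `BucketList`, `teardown_all`, and the set-based registry are hypothetical stand-ins for the bucket list, its lock, and the global stats list.

```python
import threading


class BucketList:
    """Hypothetical model of a bucket list guarded by its own lock."""

    def __init__(self, buckets):
        self._lock = threading.Lock()
        self._buckets = list(buckets)

    def pop_one(self):
        """Hold the list lock only long enough to unlink one bucket."""
        with self._lock:
            return self._buckets.pop() if self._buckets else None

    def is_locked(self):
        return self._lock.locked()


def teardown_all(bucket_list, registry, registry_lock, destroy):
    """Detach each bucket from the registry, then destroy it unlocked."""
    while True:
        bucket = bucket_list.pop_one()
        if bucket is None:
            break
        with registry_lock:  # stand-in for the global stats list mutex
            registry.discard(bucket)
        destroy(bucket)  # neither the list lock nor the registry lock is held
```

Since the list lock is never held while the registry lock is acquired, a concurrent scrape that takes the two locks in the opposite order cannot deadlock against this teardown path.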

Co-authored-by: Andre Lorbach <alorbach@adiscon.com>
2026-05-26 20:47:11 +00:00
Rainer Gerhards
a69c2a24a4 runtime: guard optional regexp object lifecycle in tcp server
Wrap tcpsrv/tcps_sess objUse/objRelease of lmregexp in FEATURE_REGEXP guards.

This prevents unconditional lmregexp dependency in regexp-disabled builds while preserving existing regex framing behavior when FEATURE_REGEXP is enabled.
2026-05-26 22:10:11 +02:00
Rainer Gerhards
be4c2cf57d runtime: avoid regexp unload double free
Why: Fix reachable unload cleanup corruption in regex cache teardown.
Impact: Prevents module-unload crashes when per-thread cache remains.
Before/After: Before, class-exit could free perthread entries twice; after, each entry is freed once.
Technical Overview:
- perthread_regexs stores the same pointer as key and value.
- hashtable_destroy(..., 1) frees both key and value.
- That pattern double-frees perthread_regex_t entries on class exit.
- Destroy perthread_regexs with free_values=0 so only keys are freed.
- Keep regex_to_uncomp destruction unchanged because key/value differ.
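
The double-free pattern in the bullets above can be shown with a toy Python model that merely counts releases. It does not reproduce the C hashtable: `hashtable_destroy_model` is a hypothetical function that, as the commit describes for hashtable_destroy, always frees keys and frees values only when asked.

```python
from collections import Counter


def hashtable_destroy_model(table, free_values, freed):
    """Count releases: keys are always freed, values only if free_values."""
    for key, value in table.items():
        freed[id(key)] += 1
        if free_values:
            freed[id(value)] += 1


entry = object()           # one allocation used as both key and value
table = {entry: entry}

buggy = Counter()
hashtable_destroy_model(table, free_values=1, freed=buggy)

fixed = Counter()
hashtable_destroy_model(table, free_values=0, freed=fixed)
```

In this model `buggy[id(entry)]` counts two releases of the same object while `fixed[id(entry)]` counts one, which is the difference the commit attributes to passing free_values=0.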
With the help of AI-Agents: Codex
2026-05-26 22:04:10 +02:00
Rainer Gerhards
3fd8e38b47 runtime: bound KSI debug record logging by length 2026-05-26 22:03:49 +02:00
Rainer Gerhards
c0b128f96f configure: avoid leaking -lgcrypt into global LIBS 2026-05-26 21:54:36 +02:00
Rainer Gerhards
036024b393
Merge pull request #6894 from jjourdin/impstats-enqueue-size
queue: add per-queue size.enqueued counter (cumulative bytes)
2026-05-26 19:26:51 +02:00