* tests: improve kafka startup readiness handling
* add generic TCP wait helper and kafka readiness check that replaces fixed sleeps in start_kafka
* verify zookeeper client port availability after startup
* replace ad-hoc sleeps in kafka-focused tests with wait_for_kafka_startup
* rely on kafka-topics probe instead of port checks
* share kafka layout helper to avoid duplicated logic
* align kafkapid emptiness checks for consistent style
With the help of AI-Agent: ChatGPT
Motivation: code coverage reports were incomplete. This lays a better
base for consistent reporting via GitHub Actions, with room for follow-ups.
It also removes a test flake source in Kafka jobs.
Impact: CI/tests only; no runtime behavior or ABI changes expected.
Before: Coverage uploads were inconsistent; Kafka tests could hang while
reading from /dev/urandom to generate topic names.
After: Coverage is collected with lcov and uploaded via a dedicated GH
Action; Kafka topics use fast $RANDOM-based hex, avoiding early-boot
entropy stalls.
Technical details:
- Add two workflows: "codecov base" and "codecov kafka" on Ubuntu 24.04.
Use lcov capture with unexecuted blocks and prune common noise; upload
with token for same-repo PRs and tokenless for forks.
- Update .codecov.yml: add path fixes for container (/rsyslog) and
runner layouts; explicitly set comment: false and patch: false.
- Bump actions/checkout to v4 in existing workflows; add an actionlint
job to catch YAML problems early.
- Switch codecov jobs in container matrix to 24.04 images.
- Improve run-ci.sh lcov invocation to be more tolerant of line/macro
mismatches.
- Testbench: replace /dev/urandom topic generation with 8-char hex from
$RANDOM; adjust diag.sh path/quoting for zookeeper helper.
In some client-server test cases, messages are supposed to be injected into
the instance 2(client), but they are actually injected into instance 1(server),
which may lead to false negative results. This patch fixed it by replacing
'injectmsg' with 'injectmsg2', and dealt with some minor issues.
Also now permit interactivly running tests without explicitly setting
$srcdir. This now works if we are inside ./tests and fails, as before,
when we are in a different directory.
Detected by shellcheck via CodeFactor.io
- Removed all sleeps where possible.
- Moved all kafka start/stop/download logic into functions.
- Moved kafka/zookeeper stop into error_exit and exit_test.
- Kafka/Zookeeper cleanup only done on success now.
- Kafka/Zookeeper logfiles automatically dumped on error_exit only now.
- Added cleanup for Kafka/Zookeeper instances into CI/buildbot_cleanup.sh
This method currently has some race associated with it, also it is
really not required and the injectmsg method is superior.
Also fix imdiag's new $imdiagInjectDelayMode config directive. While
it was added, it was forgotten to actually apply the value.
We are not doing a separate commit as this is thightly coupled into
our testing here (which also as a side-effect tests the new imdiag
functionality).
Setting timestamp to 0 now lets kafka handle this.
Also added "RD_KAFKA_V_KEY(NULL,0) if no key is configured.
testbench: Changed kafka server configuration "log.retention.hours"
property to 5000 which avoids that the log cleaner is deleting our
records before they can be processed.
Fixed some issues with sndrcv kafka tests.
Generating kafka topics dynamically now it kafka tests.
Limited messagecount in some tests to 50000 for now.
Fixed test configurations (socket timeouts with newer kafka).
Reactivated some kafka tests that work now.
omkafka: Fixed "closeTimeout" setting in action shutdown
Support tools (like tcpflood) are also upgraded to support the
necessary dynamic port.s
This is part of the effort to make parallel testing possible.
We move parts of the cleanup to the buildbot cleanup, as we cannot
clean out instances on each test when we run parallel tests.
This is work torward the ability to run the testbench in parallel.
Some small issues in tests have also been detected, which were not
previously seen. They have been fixed. We did not do separate
commits for this as it would clutter the commit log with not
really relevant info.
Also move some cleanup to "make clean" target
To support parallel testbench runs, we need to have dynamic work files.
So deleting work files on each test does not work, especially as we can
no longer assume they are "left-overs" from a failed test - they may
actually (and quite likely) belong to tests that run in parallel.
So the proper solution is to do via "make clean".
closes https://github.com/rsyslog/rsyslog/pull/2916
changes some of the test commands to use bash functions
includes some small bug fixes to tests where bugs were
previously not seen due to different plumbing.