rsyslog/tests/omprog-feedback-mt.sh
Andre Lorbach 19ae11b869 Fix transaction suspension handling for issue #2420
- action.c: Add iSuspended flag to prevent infinite loops when transactions
  are suspended multiple times. Retry on first suspension
  and abort with RS_RET_SUSPENDED on subsequent suspensions.

- tests/omprog-transactions-failed-messages.sh: Remove TODO comment and
  workaround code related to issue #2420 (deferred messages within
  transactions not being retried), as the underlying issue appears to
  be resolved.

- tests/omprog-feedback-timeout.sh: Update expected output to reflect
  improved transaction handling behavior. The test now expects additional
  message processing cycles and proper timeout handling when the omprog
  action is suspended and restarted.

- tests/omprog-feedback.sh: Make robust against timing variations from
  new action.c retry logic by replacing exact sequence matching with
  pattern-based validation to ensure cross-system compatibility.

- tests/omhttp-retry-timeout.sh: Optimize test parameters for better
  reliability by reducing message count from 10000 to 5000, adding
  sequence check options, and reducing queue batch size from 2048 to 500
  to prevent test timeouts and improve stability.

- tests/omhttp-batch-fail-with-400.sh: Resolve queue growth on HTTP 400
  errors. The queue kept growing because the omhttp module incorrectly
  treated HTTP 400 errors as retriable when they should be treated as
  permanent failures.
  FIX: Added httpretrycodes=["500", "502", "503", "504"] configuration.
  This explicitly specifies that only the listed 5xx server errors are
  retried, so HTTP 400 errors are now treated as permanent failures.
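
The retry classification described in the last item can be illustrated
with a small shell sketch. It is not omhttp's implementation; the
variable RETRY_CODES and the helper is_retriable are hypothetical names
used only to show the idea that a status code is retried only when it
appears in the configured list.

```shell
#!/bin/bash
# Codes taken from the httpretrycodes list quoted in the commit message.
# RETRY_CODES and is_retriable are hypothetical, for illustration only.
RETRY_CODES="500 502 503 504"

# is_retriable STATUS
# Returns 0 if STATUS is in RETRY_CODES (retry later),
# returns 1 otherwise (treat as a permanent failure).
is_retriable() {
	local status=$1 code
	for code in $RETRY_CODES; do
		if [ "$status" = "$code" ]; then
			return 0
		fi
	done
	return 1
}
```

With this list, a 503 response would be retried, while a 400 response
would be dropped as a permanent failure instead of being requeued.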

Some tests needed to be adapted, because they expected an "exactly once"
paradigm, which the fixed bug seemed to provide in some cases (but not
reliably). Actually, rsyslog guarantees "at least once", so duplicates
can occur and are typical if transaction-like logic is used with
non-transactional outputs.
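
A test that accepts "at least once" delivery can validate its output
roughly as sketched below. This is a minimal illustration, not the
testbench's own sequence check: check_at_least_once is a hypothetical
helper, and it assumes the output file holds one plain integer per line
(no leading zeros).

```shell
#!/bin/bash
# check_at_least_once FILE COUNT   (hypothetical helper)
# Succeeds when every number 0..COUNT-1 appears at least once in FILE.
# Duplicates are tolerated; missing or unexpected numbers cause failure.
check_at_least_once() {
	local file=$1 count=$2
	# De-duplicate the received numbers and compare them with the
	# expected sequence; diff returns non-zero on any difference.
	diff <(sort -n -u "$file") <(seq 0 $((count - 1))) >/dev/null
}
```

An "exactly once" check would additionally compare the raw line count
with COUNT, which is the kind of expectation the adapted tests no
longer rely on.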

This addresses the transaction suspension edge case and cleans up
temporary workaround code that is no longer needed. The test updates
ensure that the improved transaction handling behavior is properly
validated across different scenarios and that tests correctly reflect
rsyslog's actual delivery semantics.
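
The "retry once, then abort" rule from the action.c item can be
sketched in shell as follows. This only mirrors the behavior described
in this message; it is not the C code. SUSPENDED is an arbitrary exit
status standing in for RS_RET_SUSPENDED, and try_commit is a
hypothetical name.

```shell
#!/bin/bash
# Arbitrary exit status standing in for RS_RET_SUSPENDED (assumption,
# not rsyslog's real value).
SUSPENDED=75

# try_commit CMD [ARGS...]   (hypothetical helper)
# Runs CMD as the "commit" step. The first suspension is retried;
# a second suspension aborts with SUSPENDED, so the loop cannot spin
# forever. Any other status is returned unchanged.
try_commit() {
	local suspended=0	# plays the role of the iSuspended flag
	local rc
	while true; do
		rc=0
		"$@" || rc=$?
		if [ "$rc" -ne "$SUSPENDED" ]; then
			return "$rc"
		fi
		if [ "$suspended" -eq 1 ]; then
			return "$SUSPENDED"
		fi
		suspended=1
	done
}
```

A command that always reports suspension is therefore attempted exactly
twice before the caller sees the suspended status.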

closes https://github.com/rsyslog/rsyslog/issues/2420
2025-08-15 14:28:34 +02:00

#!/bin/bash
# This file is part of the rsyslog project, released under ASL 2.0
# Similar to the 'omprog-feedback.sh' test, but with multiple worker threads
# under high load and a given error rate (percentage of failed messages, i.e.
# messages the program confirms as failed). Note: the action retry interval
# (1 second) causes very low throughput, so the error rate must be kept very
# low to prevent the test from running too long.
. ${srcdir:=.}/diag.sh init
skip_platform "SunOS" "On Solaris, this test causes rsyslog to hang for unknown reasons"
if [ "$CC" = "gcc" ] && [[ "$CFLAGS" == *"-coverage"* ]]; then
	printf 'This test does not work with gcc coverage instrumentation\n'
	printf 'It will hang, but we do not know why. See\n'
	printf 'https://github.com/rsyslog/rsyslog/issues/3361\n'
	exit 77
fi
export NUMMESSAGES=10000 # number of logs to send
export ERROR_RATE_PERCENT=1 # percentage of logs to be retried
export command_line="$srcdir/testsuites/omprog-feedback-mt-bin.sh $ERROR_RATE_PERCENT"
generate_conf
add_conf '
module(load="../plugins/omprog/.libs/omprog")

template(name="outfmt" type="string" string="%msg%\n")

main_queue(
	queue.timeoutShutdown="30000"  # long shutdown timeout for the main queue
)

:msg, contains, "msgnum:" {
	action(
		type="omprog"
		binary=`echo $command_line`
		template="outfmt"
		name="omprog_action"
		confirmMessages="on"
		queue.type="LinkedList"  # use a dedicated queue
		queue.workerThreads="10"  # ...with multiple workers
		queue.size="10000"  # ...high capacity (default is 1000)
		queue.timeoutShutdown="60000"  # ...and a long shutdown timeout
		action.resumeInterval="1"  # retry interval: 1 second
	)
}
'
startup
injectmsg 0 $NUMMESSAGES
wait_file_lines "$RSYSLOG_OUT_LOG" $NUMMESSAGES
shutdown_when_empty
wait_shutdown
exit_test