We have a long-standing issue with the mysql tests. We can't find the
root cause with special test runs. This commit now adds some debug
info. The idea is to merge this and run it in all upcoming tests, so
that we hopefully get a sample large enough to track down and
address the problem.
The spec for the omprog interaction with the program it calls specifies
that the program receives one message via one line. In other words:
it must be a string terminated by LF.
However, omprog currently relies on a proper template to fulfill this
requirement. If the template does not provide the LF, it is never
written. For the called program, this looks like it does not receive any
input at all. Even if it finally reads data (e.g. due to full buffer),
it will not properly be able to discern the messages.
This handling is improved with this commit.
We cannot just check the template, because at the end of the template
may be a non-constant value. As such, we do not know at config load
time if there is this problem or not.
So the correct approach is to check at runtime whether each message
is properly terminated. For those that are not, we
* append an LF, because anything else makes matters worse
* log a warning message, at least for a sample of the messages
The warning is useful in the (expected to be most common) case that the template
is simply missing the LF. While appending works, it slows down processing.
As such the user should be given a chance to correct the config bug.
To avoid clutter, the warning is emitted at most once every 30 seconds.
This value is hardcoded as we do not envision a need to adjust it. Usually
users should quickly fix the template.
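
The runtime check described above can be sketched roughly as follows. This
is an illustrative Python sketch, not the actual omprog code (which is C);
the class name `LineTerminator`, the `warnings` list and the injectable
`clock` are assumptions made for the example. Only the 30-second interval
is taken from the description above.

```python
import time

WARN_INTERVAL_SECS = 30  # interval named in the commit message


class LineTerminator:
    """Hypothetical helper: ensures each message ends with LF."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock          # injectable so the logic can be tested
        self._last_warning = None    # time of the last emitted warning
        self.warnings = []           # stand-in for the real error log

    def prepare(self, msg: bytes) -> bytes:
        """Return msg terminated by LF; warn at most once per interval."""
        if msg.endswith(b"\n"):
            return msg               # template already provides the LF
        now = self._clock()
        if (self._last_warning is None
                or now - self._last_warning >= WARN_INTERVAL_SECS):
            self._last_warning = now
            self.warnings.append(
                "message not LF-terminated; check the template")
        return msg + b"\n"           # the extra copy is what costs time
```

Messages that are already terminated pass through untouched, so a corrected
template pays no extra cost.
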
closes https://github.com/rsyslog/rsyslog/issues/3975
1. Invalid UTF-8 detection didn't handle 3- and 4-byte overlong encodings
(2-byte overlong encodings were handled explicitly by rejecting the 0xC0 and
0xC1 start bytes). Unified the checks for overlong encodings.
2. Surrogates U+D800..U+DFFF are not valid codepoints (Unicode Standard, D92)
3. Replacement of characters in invalid 3- or 4-byte encodings was too
eager. It must not replace bytes which are valid UTF-8 sequences. For
example, in the sequence [0xE0 0xC2 0xA7] the 0xC2 is invalid as a
continuation byte, but it starts a valid UTF-8 symbol [0xC2 0xA7]. That is,
with the current code, processing the sequence results in "???", but the
correct result is "?§" (provided that the replacement character is "?").
4. Various tests for UTF-8 invalid/valid sequences.
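
The rules in points 1 to 3 can be illustrated with a small sketch. This is
Python written for illustration under stated assumptions, not the rsyslog
implementation; the function name `sanitize_utf8` is hypothetical. Each
invalid byte is replaced individually, and a byte that fails as a
continuation byte is re-examined as a possible start byte.

```python
def sanitize_utf8(data: bytes, repl: bytes = b"?") -> bytes:
    """Replace invalid UTF-8 bytes with repl, keeping valid sequences."""
    out = bytearray()
    i, n = 0, len(data)
    while i < n:
        lead = data[i]
        if lead < 0x80:                      # plain ASCII
            out.append(lead)
            i += 1
            continue
        # Expected length, payload bits of the lead byte, and the smallest
        # code point that legitimately needs this length (overlong check).
        if 0xC0 <= lead <= 0xDF:
            length, cp, min_cp = 2, lead & 0x1F, 0x80
        elif 0xE0 <= lead <= 0xEF:
            length, cp, min_cp = 3, lead & 0x0F, 0x800
        elif 0xF0 <= lead <= 0xF7:
            length, cp, min_cp = 4, lead & 0x07, 0x10000
        else:                                # stray continuation or 0xF8+
            out += repl
            i += 1
            continue
        used = 1
        # Consume only real continuation bytes (10xxxxxx); stop at the
        # first byte that is not one, so it is examined again as a lead.
        while (used < length and i + used < n
               and (data[i + used] & 0xC0) == 0x80):
            cp = (cp << 6) | (data[i + used] & 0x3F)
            used += 1
        valid = (used == length
                 and min_cp <= cp <= 0x10FFFF        # unified overlong check
                 and not 0xD800 <= cp <= 0xDFFF)     # surrogates are invalid
        out += data[i:i + used] if valid else repl * used
        i += used
    return bytes(out)
```

With this sketch, `sanitize_utf8(bytes([0xE0, 0xC2, 0xA7]))` yields
`b"?\xc2\xa7"`, which is "?§" when decoded as UTF-8.
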
Direct queues do not apply queue parameters because they are not actually
physical queues. As such, any parameter that is set is ignored. This can
lead to unintended results.
The new code detects this case and warns the user.
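
As a rough illustration of the check (a Python sketch, not the actual C
code; the function name and the warning text are assumptions made for the
example):

```python
def direct_queue_warnings(queue_type: str, params: dict) -> list:
    """Return one warning per queue parameter set on a direct queue."""
    if queue_type.lower() != "direct":
        return []                    # real queues honor their parameters
    return [
        f"queue is in direct mode, parameter '{name}' is ignored"
        for name in sorted(params)   # sorted for stable output
    ]
```
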
closes https://github.com/rsyslog/rsyslog/issues/77
The new parameter permits specifying the replacement to be used
when "escapeLF" is set to "on". Previously, a fixed replacement string
was used ("#012"/"\n") depending on circumstances. If the parameter is
set to an empty string, the LF is simply discarded.
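
The effect of the parameter can be sketched as follows. This is an
illustrative Python sketch only; the function name `escape_lf` is
hypothetical and the real implementation is in C.

```python
def escape_lf(text: str, replacement: str) -> str:
    """Replace every embedded LF with the configured replacement."""
    # An empty replacement simply drops the LF.
    return text.replace("\n", replacement)
```

For example, `escape_lf("a\nb", "#012")` gives `"a#012b"`, while
`escape_lf("a\nb", "")` gives `"ab"`.
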
closes https://github.com/rsyslog/rsyslog/issues/3889
Rsyslog may leave some dangling disk queue files under the following
conditions:
- batch sizes and/or messages are large
- queue files are comparatively small
- a batch spans more than two queue files (from n to n+m with m>1)
In this case, queue files n+1 to (n+m-1) are not deleted. This can
lead to problems when the queue is re-opened. In extreme cases
this can also lead to stalled processing when the max disk space is
used up by such left-over queue files.
Using defaults this scenario is very unlikely, but it can happen,
especially when large messages are being processed.
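
The file arithmetic can be made concrete with a small sketch. It is
illustrative Python, not the queue code itself, and it assumes (as the
description above implies) that file n was already being removed and that
file n+m is still in use after the batch. Function names are hypothetical.

```python
def files_to_delete(first: int, last: int) -> list:
    """Files fully consumed by a batch spanning files first..last."""
    # The last file is still being read, so it must be kept.
    return list(range(first, last))


def dangling_files(first: int, last: int) -> list:
    """Files left behind if only the first file of the span is removed."""
    return list(range(first + 1, last))
```

For a batch spanning files 3 to 6 (n=3, m=3), files 3, 4 and 5 must be
deleted; removing only file 3 leaves files 4 and 5 dangling. With m=1
nothing is left behind, which is why the problem does not show up when
batches are small relative to the queue file size.
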
The test was timing-sensitive as we did not properly check that all data
had been written to the output file - we just relied on sleep periods.
This has been changed. Also, we made some changes to the testing
framework to fully support sequence checking of multiple ZIP files.
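
The general idea of replacing fixed sleeps with a check on the output can
be sketched like this. It is a Python illustration only; the rsyslog
testbench is shell-based, and the function name `wait_for_lines` and its
parameters are assumptions made for the example.

```python
import time
from pathlib import Path


def wait_for_lines(path, expected, timeout=10.0, poll=0.1):
    """Poll until the file holds at least `expected` lines or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        p = Path(path)
        if p.exists():
            with p.open("rb") as f:
                if sum(1 for _ in f) >= expected:
                    return True      # all expected data has arrived
        if time.monotonic() >= deadline:
            return False             # give up; caller reports the failure
        time.sleep(poll)
```

A test built this way finishes as soon as the data is present and only
waits for the full timeout when something is actually wrong.
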
Miscellaneous bug fixes in improg:
- properly truncate string after an input event is submitted
- set msgoffset to 0
- tests added to check the above fixes
The test frequently failed in a way that suggests rsyslog terminated too
early. We try to improve the situation by prolonging the empty
check. While there are better solutions for this special case, there are
some cases where we do not have them. So it is good to know if this
method works.
Remove trailing whitespace before checking the status string. This is
most important as a line usually ends with \n, which is considered
trailing whitespace. Accepting this increases usability.
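
A minimal sketch of the more lenient check, in Python for illustration. The
expected status value "OK" and the function name are assumptions; the text
above does not name them.

```python
def status_is_ok(line: str, expected: str = "OK") -> bool:
    """Compare a status line after dropping trailing whitespace."""
    # rstrip() removes the trailing "\n" (and "\r", spaces, tabs)
    # but leaves leading whitespace alone.
    return line.rstrip() == expected
```
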