by implementing some code that was missing so far ;) as well as
finding some real bugs. I also did some general cleanup, removing
debug strings and such. This code should be fairly OK to use, except
when "exec only when previous action was suspended" is used -- this is
NOT yet re-implemented in the tuned engine.
at least in important cases (not for non-direct action queues and some
other minor things). This version is definitely buggy, but may be tried
with some success on a non-production system. I will continue to work
on correctness, but needed to commit now to get a baseline.
Now, the full batch is passed down to the rule, which then enqueues
the elements as single messages. Note that this code has some known
defects and needs more changes until it is correct again. This is
primarily a commit to be able to return to a known-(somewhat)-good
state.
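To illustrate the new flow, here is a rough C sketch with hypothetical
names (batch_t, msg_t, ruleEnqueueMsg and so on are placeholders, not
rsyslog's real API): the rule receives the whole batch and enqueues each
element as a single message.

    #include <stdio.h>
    #include <stddef.h>

    typedef struct msg { const char *text; } msg_t;

    typedef struct batch {
        msg_t  *elem;    /* messages belonging to this batch */
        size_t  nElem;   /* number of elements in the batch */
    } batch_t;

    /* stand-in for handing one message to an action queue */
    static int ruleEnqueueMsg(const msg_t *pMsg)
    {
        printf("enqueue: %s\n", pMsg->text);
        return 0;
    }

    /* the rule gets the full batch and splits it into single messages */
    static int ruleProcessBatch(const batch_t *pBatch)
    {
        for (size_t i = 0; i < pBatch->nElem; ++i) {
            int ret = ruleEnqueueMsg(&pBatch->elem[i]);
            if (ret != 0)
                return ret;   /* error handling still needs more work */
        }
        return 0;
    }

    int main(void)
    {
        msg_t msgs[] = { { "first" }, { "second" }, { "third" } };
        batch_t b = { msgs, 3 };
        return ruleProcessBatch(&b);
    }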
messages could get lost or be duplicated due to improper syncing
of transactions. This is a notable slowdown again, but we know
how to regain concurrency, it just takes "some" more programming.
It is important now to get back to correct code, so that we can
base further improvements on it.
things like ACL checks and message parsing. This leads to a greater level
of concurrent processing. Beware, though, that this commit duplicates
some messages. This may be a regression from this or an earlier commit;
I will sort it out soon.
This, in default mode, caused buffered writing to be used, which meant
that it looked as if no output was written, or only partial lines
appeared. Thanks to Michael Biebl for pointing out this bug.
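As a minimal sketch of the buffering pitfall (assuming the output went
through stdio, which is fully buffered by default when not attached to a
terminal), switching to unbuffered or line-buffered mode makes lines show
up immediately instead of looking lost or truncated:

    #include <stdio.h>

    int main(void)
    {
        /* when stdout is redirected to a file or pipe, stdio buffers
         * fully, so short runs may show no output or partial lines */
        setvbuf(stdout, NULL, _IONBF, 0);   /* unbuffered: flush every write */
        /* alternatively: setvbuf(stdout, NULL, _IOLBF, 0) for line buffering */

        printf("diagnostic line appears immediately\n");
        return 0;
    }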
this enables us to work with the "usual" environment tweaks (for
debugging and other purposes), without the need for any special
handling in nettester itself.
Further testing showed that the rsyslog core works correctly and
this fix is not needed. The concurrency issue we saw was actually caused
by other actions (even other processes) during directory creation. See
commit 9e5b31fc44136dbcc1e443cfe7714e9daf97d844 for further details.
This bug was triggered by an open failure. When the cache was full and
a new entry needed to be placed inside it, a victim for eviction was
selected. That victim was freed, then the open of the new file was
attempted. If the open failed, the victim entry remained freed, and the
function exited. However, on the next invocation and cache search, the
victim entry was used as if it were populated, most probably resulting
in a segfault.
when dynaCache is enabled, the cache is full, a new entry needs to
be allocated, thus the LRU entry is discarded, then the new entry is
opened and that open fails. In that case, it looks like the discarded
stream may be reused improperly (based on code analysis; test case and
confirmation pending).
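The following is an illustrative sketch of that eviction scenario with
hypothetical names (cacheEntry_t, replaceVictim and so on), not the
actual dynaCache code. The crucial point: after the victim is freed, its
cache slot must be cleared, otherwise a failed open leaves a dangling
pointer that the next cache lookup treats as a valid, populated entry.

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    #define CACHE_SIZE 4

    typedef struct cacheEntry {
        char *fileName;
        FILE *fp;
    } cacheEntry_t;

    static cacheEntry_t *cache[CACHE_SIZE];

    static void freeEntry(cacheEntry_t *e)
    {
        if (e == NULL)
            return;
        if (e->fp != NULL)
            fclose(e->fp);
        free(e->fileName);
        free(e);
    }

    /* evict slot iVictim and try to open a new file in its place */
    static int replaceVictim(int iVictim, const char *newName)
    {
        freeEntry(cache[iVictim]);
        cache[iVictim] = NULL;      /* the fix: never leave a stale pointer */

        FILE *fp = fopen(newName, "a");
        if (fp == NULL)
            return -1;              /* open failed; slot stays empty */

        cacheEntry_t *e = calloc(1, sizeof(*e));
        if (e == NULL) { fclose(fp); return -1; }
        e->fileName = strdup(newName);
        e->fp = fp;
        cache[iVictim] = e;
        return 0;
    }

    int main(void)
    {
        replaceVictim(0, "/tmp/dynacache-sketch.log");
        if (replaceVictim(0, "/nonexistent-dir/other.log") != 0)
            puts("open failed, slot left empty instead of dangling");
        return 0;
    }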
In theory, the rsyslog core should never call in parallel into an output
module for the same instance. However, this seems to happen
under (strange?) circumstances. I have now enhanced omfile so that it guards
itself against being called in parallel on the same instance data. This is
done to help troubleshooting and may stay as an interim solution if it
proves to solve an anomaly we see in at least one installation (to trigger
this problem, an extremely large traffic volume is needed).
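A hedged sketch of that guard (instanceData and doAction here are just
placeholders, not omfile's real entry points): a per-instance mutex
serializes entry, and a trylock failure is logged to help troubleshooting
when an unexpected parallel call does occur.

    #include <pthread.h>
    #include <stdio.h>

    typedef struct instanceData {
        pthread_mutex_t mut;   /* protects this instance's state */
        /* ... per-instance file state would live here ... */
    } instanceData;

    static int doAction(instanceData *pData, const char *msg)
    {
        /* if the core ever calls in parallel for the same instance,
         * we note it and wait instead of corrupting shared state */
        if (pthread_mutex_trylock(&pData->mut) != 0) {
            fprintf(stderr, "warning: parallel call on same instance\n");
            pthread_mutex_lock(&pData->mut);
        }
        printf("writing: %s\n", msg);   /* stand-in for the real write */
        pthread_mutex_unlock(&pData->mut);
        return 0;
    }

    int main(void)
    {
        instanceData d;
        pthread_mutex_init(&d.mut, NULL);
        doAction(&d, "test message");
        pthread_mutex_destroy(&d.mut);
        return 0;
    }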
The previous fix addressed an issue with on/off being used with exactly
the wrong semantics. It corrected the situation, but failed to fix one
spot where the wrong semantics were still used. This is done with this
commit.
Note that this is NOT a bug seen in any released version.
When a write error occurred in stream.c, the variable iWritten held the
error code, but this was handled as if it were the actual number of bytes
written. That was used in pointer arithmetic later on, and thus could
lead to all sorts of problems. However, this could only happen if the
error was EINTR or the file in question was a tty. All other cases were
handled properly. Now, iWritten is reset to zero in such cases, resulting
in proper retries.
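A minimal sketch of the corrected retry logic (writeAll is a hypothetical
helper, not the actual stream.c function): the bytes-written counter is
never allowed to hold an error code, and EINTR simply leads to a retry,
so later pointer arithmetic stays sane.

    #include <errno.h>
    #include <unistd.h>

    static ssize_t writeAll(int fd, const char *buf, size_t len)
    {
        size_t done = 0;
        while (done < len) {
            ssize_t iWritten = write(fd, buf + done, len - done);
            if (iWritten < 0) {
                if (errno == EINTR)
                    iWritten = 0;   /* treat as "nothing written", retry */
                else
                    return -1;      /* real error, report to caller */
            }
            done += (size_t) iWritten;  /* never an error code here */
        }
        return (ssize_t) done;
    }

    int main(void)
    {
        const char msg[] = "one full line, written even across EINTR\n";
        return writeAll(STDOUT_FILENO, msg, sizeof(msg) - 1) < 0 ? 1 : 0;
    }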