mirror of
https://github.com/rsyslog/rsyslog.git
synced 2025-12-19 22:00:42 +01:00
260 lines
20 KiB
HTML
260 lines
20 KiB
HTML
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
|
|
<html><head>
|
|
<title>turning lanes and rsyslog queues - an analogy</title></head>
|
|
<body>
|
|
<a href="rsyslog_conf_global.html">back</a>
|
|
|
|
<h1>Turning Lanes and Rsyslog Queues - an Analogy</h1>
|
|
<p>If there is a single object absolutely vital to understanding the way
|
|
rsyslog works, this object is queues. Queues offer a variety of services,
|
|
including support for multithreading. While there is elaborate in-depth
|
|
documentation on the ins and outs of <a href="queues.html">rsyslog queues</a>,
|
|
some of the concepts are hard to grasp even for experienced people. I think this
|
|
is because rsyslog uses a very high layer of abstraction which includes things
|
|
that look quite unnatural, like queues that do <b>not</b> actually queue...
|
|
<p>With this document, I take a different approach: I will not describe every specific
|
|
detail of queue operation but hope to be able to provide the core idea of how
|
|
queues are used in rsyslog by using an analogy. I will compare the rsyslog data flow
|
|
with real-life traffic flowing at an intersection.
|
|
<p>But first let's set the stage for the rsyslog part. The graphic below describes
|
|
the data flow inside rsyslog:
|
|
<p align="center"><img src="dataflow.png" alt="rsyslog data flow">
|
|
<p>Note that there is a <a href="http://www.rsyslog.com/Article350.phtml">video tutorial</a>
|
|
available on the data flow. It is not perfect, but may aid in understanding this picture.
|
|
<p>For our needs, the important fact to know is that messages enter rsyslog on "the
|
|
left side" (for example, via UDP), are being preprocessed, put into the
|
|
so-called main queue, taken off that queue, be filtered and be placed into one or
|
|
several action queues (depending on filter results). They leave rsyslog on "the
|
|
right side" where output modules (like the file or database writer) consume them.
|
|
<p>So there are always <b>two</b> stages where a message (conceptually) is queued - first
|
|
in the main queue and later on in <i>n</i> action specific queues (with <i>n</i> being the number of
|
|
actions that the message in question needs to be processed by, what is being decided
|
|
by the "Filter Engine"). As such, a message will be in at least two queues
|
|
during its lifetime (with the exception of messages being discarded by the queue itself,
|
|
but for the purpose of this document, we will ignore that possibility).
|
|
<p>Also, it is vitally
|
|
important to understand that <b>each</b> action has a queue sitting in front of it.
|
|
If you have dug into the details of rsyslog configuration, you have probably seen that
|
|
a queue mode can be set for each action. And the default queue mode is the so-called
|
|
"direct mode", in which "the queue does not actually enqueue data".
|
|
That sounds silly, but is not. It is an important abstraction that helps keep the code clean.
|
|
<p>To understand this, we first need to look at who is the active component. In our data flow,
|
|
the active part always sits to the left of the object. For example, the "Preprocessor"
|
|
is being called by the inputs and calls itself into the main message queue. That is, the queue
|
|
receiver is called, it is passive. One might think that the "Parser & Filter Engine"
|
|
is an active component that actively pulls messages from the queue. This is wrong! Actually,
|
|
it is the queue that has a pool of worker threads, and these workers pull data from the queue
|
|
and then call the passively waiting Parser and Filter Engine with those messages. So the
|
|
main message queue is the active part, the Parser and Filter Engine is passive.
|
|
<p>Let's now try an analogy analogy for this part: Think about a TV show. The show is produced
|
|
in some TV studio, from there sent (actively) to a radio tower. The radio tower passively
|
|
receives from the studio and then actively sends out a signal, which is passively received
|
|
by your TV set. In our simplified view, we have the following picture:
|
|
<p align="center"><img src="queue_analogy_tv.png" alt="rsyslog queues and TV analogy">
|
|
<p>The lower part of the picture lists the equivalent rsyslog entities, in an abstracted way.
|
|
Every queue has a producer (in the above sample the input) and a consumer (in the above sample the Parser
|
|
and Filter Engine). Their active and passive functions are equivalent to the TV entities
|
|
that are listed on top of the rsyslog entity. For example, a rsyslog consumer can never
|
|
actively initiate reception of a message in the same way a TV set cannot actively
|
|
"initiate" a TV show - both can only "handle" (display or process)
|
|
what is sent to them.
|
|
<p>Now let's look at the action queues: here, the active part, the producer, is the
|
|
Parser and Filter Engine. The passive part is the Action Processor. The later does any
|
|
processing that is necessary to call the output plugin, in particular it processes the template
|
|
to create the plugin calling parameters (either a string or vector of arguments). From the
|
|
action queue's point of view, Action Processor and Output form a single entity. Again, the
|
|
TV set analogy holds. The Output <b>does not</b> actively ask the queue for data, but
|
|
rather passively waits until the queue itself pushes some data to it.
|
|
|
|
<p>Armed with this knowledge, we can now look at the way action queue modes work. My analogy here
|
|
is a junction, as shown below (note that the colors in the pictures below are <b>not</b> related to
|
|
the colors in the pictures above!):
|
|
<p align="center"><img src="direct_queue0.png">
|
|
<p>This is a very simple real-life traffic case: one road joins another. We look at
|
|
traffic on the straight road, here shown by blue and green arrows. Traffic in the
|
|
opposing direction is shown in blue. Traffic flows without
|
|
any delays as long as nobody takes turns. To be more precise, if the opposing traffic takes
|
|
a (right) turn, traffic still continues to flow without delay. However, if a car in the red traffic
|
|
flow intends to do a (left, then) turn, the situation changes:
|
|
<p align="center"><img src="direct_queue1.png">
|
|
<p>The turning car is represented by the green arrow. It cannot turn unless there is a gap
|
|
in the "blue traffic stream". And as this car blocks the roadway, the remaining
|
|
traffic (now shown in red, which should indicate the block condition),
|
|
must wait until the "green" car has made its turn. So
|
|
a queue will build up on that lane, waiting for the turn to be completed.
|
|
Note that in the examples below I do not care that much about the properties of the
|
|
opposing traffic. That is, because its structure is not really important for what I intend to
|
|
show. Think about the blue arrow as being a traffic stream that most of the time blocks
|
|
left-turners, but from time to time has a gap that is sufficiently large for a left-turn
|
|
to complete.
|
|
<p>Our road network designers know that this may be unfortunate, and for more important roads
|
|
and junctions, they came up with the concept of turning lanes:
|
|
<p align="center"><img src="direct_queue2.png">
|
|
<p>Now, the car taking the turn can wait in a special area, the turning lane. As such,
|
|
the "straight" traffic is no longer blocked and can flow in parallel to the
|
|
turning lane (indicated by a now-green-again arrow).
|
|
|
|
<p>However, the turning lane offers only finite space. So if too many cars intend to
|
|
take a left turn, and there is no gap in the "blue" traffic, we end up with
|
|
this well-known situation:
|
|
<p align="center"><img src="direct_queue3.png">
|
|
<p>The turning lane is now filled up, resulting in a tailback of cars intending to
|
|
left turn on the main driving lane. The end result is that "straight" traffic
|
|
is again being blocked, just as in our initial problem case without the turning lane.
|
|
In essence, the turning lane has provided some relief, but only for a limited amount of
|
|
cars. Street system designers now try to weight cost vs. benefit and create (costly)
|
|
turning lanes that are sufficiently large to prevent traffic jams in most, but not all
|
|
cases.
|
|
<p><b>Now let's dig a bit into the mathematical properties of turning lanes.</b> We assume that
|
|
cars all have the same length. So, units of cars, the length is alsways one (which is nice,
|
|
as we don't need to care about that factor any longer ;)). A turning lane has finite capacity of
|
|
<i>n</i> cars. As long as the number of cars wanting to take a turn is less than or eqal
|
|
to <i>n</i>, "straigth traffic" is not blocked (or the other way round, traffic
|
|
is blocked if at least <i>n + 1</i> cars want to take a turn!). We can now find an optimal
|
|
value for <i>n</i>: it is a function of the probability that a car wants to turn
|
|
and the cost of the turning lane
|
|
(as well as the probability there is a gap in the "blue" traffic, but we ignore this
|
|
in our simple sample).
|
|
If we start from some finite upper bound of <i>n</i>, we can decrease
|
|
<i>n</i> to a point where it reaches zero. But let's first look at <i>n = 1</i>, in which case exactly
|
|
one car can wait on the turning lane. More than one car, and the rest of the traffic is blocked.
|
|
Our everyday logic indicates that this is actually the lowest boundary for <i>n</i>.
|
|
<p>In an abstract view, however, <i>n</i> can be zero and that works nicely. There still can be
|
|
<i>n</i> cars at any given time on the turning lane, it just happens that this means there can
|
|
be no car at all on it. And, as usual, if we have at least <i>n + 1</i> cars wanting to turn,
|
|
the main traffic flow is blocked. True, but <i>n + 1 = 0 + 1 = 1</i> so as soon as there is any
|
|
car wanting to take a turn, the main traffic flow is blocked (remember, in all cases, I assume
|
|
no sufficiently large gaps in the opposing traffic).
|
|
<p>This is the situation our everyday perception calls "road without turning lane".
|
|
In my math model, it is a "road with turning lane of size 0". The subtle difference
|
|
is important: my math model guarantees that, in an abstract sense, there always is a turning
|
|
lane, it may just be too short. But it exists, even though we don't see it. And now I can
|
|
claim that even in my small home village, all roads have turning lanes, which is rather
|
|
impressive, isn't it? ;)
|
|
<p><b>And now we finally have arrived at rsyslog's queues!</b> Rsyslog action queues exists for
|
|
all actions just like all roads in my village have turning lanes! And as in this real-life sample,
|
|
it may be hard to see the action queues for that reason. In rsyslog, the "direct" queue
|
|
mode is the equivalent to the 0-sized turning lane. And actions queues are the equivalent to turning
|
|
lanes in general, with our real-life <i>n</i> being the maximum queue size.
|
|
The main traffic line (which sometimes is blocked) is the equivalent to the
|
|
main message queue. And the periods without gaps in the opposing traffic are equivalent to
|
|
execution time of an action. In a rough sketch, the rsyslog main and action queues look like in the
|
|
following picture.
|
|
<p align="center"><img src="direct_queue_rsyslog.png">
|
|
<p>We need to read this picture from right to left (otherwise I would need to redo all
|
|
the graphics ;)). In action 3, you see a 0-sized turning lane, aka an action queue in "direct"
|
|
mode. All other queues are run in non-direct modes, but with different sizes greater than 0.
|
|
<p>Let us first use our car analogy:
|
|
Assume we are in a car on the main lane that wants to take turn into the "action 4"
|
|
road. We pass action 1, where a number of cars wait in the turning lane and we pass
|
|
action 2, which has a slightly smaller, but still not filled up turning lane. So we pass that
|
|
without delay, too. Then we come to "action 3", which has no turning lane. Unfortunately,
|
|
the car in front of us wants to turn left into that road, so it blocks the main lane. So, this time
|
|
we need to wait. An observer standing on the sidewalk may see that while we need to wait, there are
|
|
still some cars in the "action 4" turning lane. As such, even though no new cars can
|
|
arrive on the main lane, cars still turn into the "action 4" lane. In other words,
|
|
an observer standing in "action 4" road is unable to see that traffic on the main lane
|
|
is blocked.
|
|
<p>Now on to rsyslog: Other than in the real-world traffic example, messages in rsyslog
|
|
can - at more or less the
|
|
same time - "take turns" into several roads at once. This is done by duplicating the message
|
|
if the road has a non-zero-sized "turning lane" - or in rsyslog terms a queue that is
|
|
running in any non-direct mode. If so, a deep copy of the message object is made, that placed into
|
|
the action queue and then the initial message proceeds on the "main lane". The action
|
|
queue then pushes the duplicates through action processing. This is also the reason why a
|
|
discard action inside a non-direct queue does not seem to have an effect. Actually, it discards the
|
|
copy that was just created, but the original message object continues to flow.
|
|
<p>
|
|
In action 1, we have some entries in the action queue, as we have in action 2 (where the queue is
|
|
slightly shorter). As we have seen, new messages pass action one and two almost instantaneously.
|
|
However, when a messages reaches action 3, its flow is blocked. Now, message processing must wait
|
|
for the action to complete. Processing flow in a direct mode queue is something like a U-turn:
|
|
|
|
<p align="center"><img src="direct_queue_directq.png" alt="message processing in an rsyslog action queue in direct mode">
|
|
<p>The message starts to execute the action and once this is done, processing flow continues.
|
|
In a real-life analogy, this may be the route of a delivery man who needs to drop a parcel
|
|
in a side street before he continues driving on the main route. As a side-note, think of what happens
|
|
with the rest of the delivery route, at least for today, if the delivery truck has a serious accident
|
|
in the side street. The rest of the parcels won't be delivered today, will they? This is exactly how the
|
|
discard action works. It drops the message object inside the action and thus the message will no
|
|
longer be available for further delivery - but as I said, only if the discard is done in a
|
|
direct mode queue (I am stressing this example because it often causes a lot of confusion).
|
|
<p>Back to the overall scenario. We have seen that messages need to wait for action 3 to
|
|
complete. Does this necessarily mean that at the same time no messages can be processed
|
|
in action 4? Well, it depends. As in the real-life scenario, action 4 will continue to
|
|
receive traffic as long as its action queue ("turn lane") is not drained. In
|
|
our drawing, it is not. So action 4 will be executed while messages still wait for action 3
|
|
to be completed.
|
|
<p>Now look at the overall picture from a slightly different angle:
|
|
<p align="center"><img src="direct_queue_rsyslog2.png" alt="message processing in an rsyslog action queue in direct mode">
|
|
<p>The number of all connected green and red arrows is four - one each for action 1, 2 and 3
|
|
(this one is dotted as action 4 was a special case) and one for the "main lane" as
|
|
well as action 3 (this one contains the sole red arrow). <b>This number is the lower bound for
|
|
the number of threads in rsyslog's output system ("right-hand part" of the main message
|
|
queue)!</b> Each of the connected arrows is a continuous thread and each "turn lane" is
|
|
a place where processing is forked onto a new thread. Also, note that in action 3 the processing
|
|
is carried out on the main thread, but not in the non-direct queue modes.
|
|
<p>I have said this is "the lower bound for the number of threads...". This is with
|
|
good reason: the main queue may have more than one worker thread (individual action queues
|
|
currently do not support this, but could do in the future - there are good reasons for that, too
|
|
but exploring why would finally take us away from what we intend to see). Note that you
|
|
configure an upper bound for the number of main message queue worker threads. The actual number
|
|
varies depending on a lot of operational variables, most importantly the number of messages
|
|
inside the queue. The number <i>t_m</i> of actually running threads is within the integer-interval
|
|
[0,confLimit] (with confLimit being the operator configured limit, which defaults to 5).
|
|
Output plugins may have more than one thread created by themselves. It is quite unusual for an
|
|
output plugin to create such threads and so I assume we do not have any of these.
|
|
Then, the overall number of threads in rsyslog's filtering and output system is
|
|
<i>t_total = t_m + number of actions in non-direct modes</i>. Add the number of
|
|
inputs configured to that and you have the total number of threads running in rsyslog at
|
|
a given time (assuming again that inputs utilize only one thread per plugin, a not-so-safe
|
|
assumption).
|
|
<p>A quick side-note: I gave the lower bound for <i>t_m</i> as zero, which is somewhat in contrast
|
|
to what I wrote at the begin of the last paragraph. Zero is actually correct, because rsyslog
|
|
stops all worker threads when there is no work to do. This is also true for the action queues.
|
|
So the ultimate lower bound for a rsyslog output system without any work to carry out actually is zero.
|
|
But this bound will never be reached when there is continuous flow of activity. And, if you are
|
|
curios: if the number of workers is zero, the worker wakeup process is actually handled within the
|
|
threading context of the "left-hand-side" (or producer) of the queue. After being
|
|
started, the worker begins to play the active queue component again. All of this, of course,
|
|
can be overridden with configuration directives.
|
|
<p>When looking at the threading model, one can simply add n lanes to the main lane but otherwise
|
|
retain the traffic analogy. This is a very good description of the actual process (think what this
|
|
means to the "turning lanes"; hint: there still is only one per action!).
|
|
<p><b>Let's try to do a warp-up:</b> I have hopefully been able to show that in rsyslog, an action
|
|
queue "sits in front of" each output plugin. Messages are received and flow, from input
|
|
to output, over various stages and two level of queues to the outputs. Actions queues are always
|
|
present, but may not easily be visible when in direct mode (where no actual queuing takes place).
|
|
The "road junction with turning lane" analogy well describes the way - and intent - of the various
|
|
queue levels in rsyslog.
|
|
<p>On the output side, the queue is the active component, <b>not</b> the consumer. As such, the consumer
|
|
cannot ask the queue for anything (like n number of messages) but rather is activated by the queue
|
|
itself. As such, a queue somewhat resembles a "living thing" whereas the outputs are
|
|
just tools that this "living thing" uses.
|
|
<p><b>Note that I left out a couple of subtleties</b>, especially when it comes
|
|
to error handling and terminating
|
|
a queue (you hopefully have now at least a rough idea why I say "terminating <b>a queue</b>"
|
|
and not "terminating an action" - <i>who is the "living thing"?</i>). An action returns
|
|
a status to the queue, but it is the queue that ultimately decides which messages can finally be
|
|
considered processed and which not. Please note that the queue may even cancel an output right in
|
|
the middle of its action. This happens, if configured, if an output needs more than a configured
|
|
maximum processing time and is a guard condition to prevent slow outputs from deferring a rsyslog
|
|
restart for too long. Especially in this case re-queuing and cleanup is not trivial. Also, note that
|
|
I did not discuss disk-assisted queue modes. The basic rules apply, but there are some additional
|
|
constraints, especially in regard to the threading model. Transitioning between actual
|
|
disk-assisted mode and pure-in-memory-mode (which is done automatically when needed) is also far from
|
|
trivial and a real joy for an implementer to work on ;).
|
|
<p>If you have not done so before, it may be worth reading the
|
|
<a href="queues.html">rsyslog queue user's guide,</a> which most importantly lists all
|
|
the knobs you can turn to tweak queue operation.
|
|
<p>[<a href="manual.html">manual index</a>]
|
|
[<a href="rsyslog_conf.html">rsyslog.conf</a>]
|
|
[<a href="http://www.rsyslog.com/">rsyslog site</a>]</p>
|
|
<p><font size="2">This documentation is part of the
|
|
<a href="http://www.rsyslog.com/">rsyslog</a> project.<br>
|
|
Copyright © 2009 by <a href="http://www.gerhards.net/rainer">Rainer Gerhards</a> and
|
|
<a href="http://www.adiscon.com/">Adiscon</a>. Released under the GNU GPL
|
|
version 3 or higher.</font></p>
|
|
</body>
|
|
</html>
|