mirror of
https://github.com/rsyslog/rsyslog.git
synced 2025-12-17 15:10:42 +01:00
This commit greatly refactors imfile internal workings. It changes the handling of inotify, FEN, and polling modes. Mostly unchanged is the processing of the way a file is read and state files are kept. This is about a 50% rewrite of the module. Polling, inotify, and FEN modes now use greatly unified code. Some differences still exists and may be changed with further commits. The internal handling of wildcards and file detection has been completely re-written from scratch. For example, previously when multi-level wildcards were used these were not reliably detected. The code also now provides much of the same functionality in all modes, most importantly wildcards are now also supported in polling mode. The refactoring sets ground for further enhancements and smaller refactorings. This commit provides the same feature set that imfile had previously and all existing CI tests pass, as do some newly created tests. Some specific changes: - bugfix: module parameter "sortfiles" ignored This parameter only works in Solaris FEN mode, but is otherwise ignored. Most importantly it is ignored under Linux. fixes https://github.com/rsyslog/rsyslog/issues/2528 - bugfix: imfile did not pick up all files when not present at startup fixes https://github.com/rsyslog/rsyslog/issues/2241 fixes https://github.com/rsyslog/rsyslog/issues/2230 fixes https://github.com/rsyslog/rsyslog/issues/2354 - bugfix: directories only support "*" wildcard, no others fixes https://github.com/rsyslog/rsyslog/issues/2303 - bugfix: parameter "sortfiles" did only work in FEN mode fixes https://github.com/rsyslog/rsyslog/issues/2528 - provides the ability to dynamically add and remove files via multi-level wildcards see also https://github.com/rsyslog/rsyslog/issues/1280 - the state file name currently has been changed to inode number This will further be worked on in upcoming PRs see also https://github.com/rsyslog/rsyslog/issues/2231 - some enhancements were also done to CI tests, most importantly they were made more compatibile with BSD Note that most of the mentioned bug fixes cannot be applied to older versions, as they fix design issues which are solved by the refactoring. Thus there are not separate commits for them. Distro maintainers: you need to decide to apply this patch as whole or not. Believe me, it is not worth the effort to try to extract specific patches from this commit. There is a good reason we do not have multiple commits. closes https://github.com/rsyslog/rsyslog/issues/2359
132 lines
4.5 KiB
Plaintext
132 lines
4.5 KiB
Plaintext
Use Cases
|
|
---------
|
|
|
|
Monitor:
|
|
[M1] /dir/*/*.log ; tag="monitor1"
|
|
[M2] /dir/abc/a.* ; tag="monitor2"
|
|
[M3] /dir/*/a/*.txt ; tag="monitor3"
|
|
[M4] /di?/*/b/*.txt ; tag="monitor4"
|
|
|
|
- if a.log is created, we have kind of a conflict: create two monitors then?
|
|
(sounds best so far)
|
|
- /dir/abc may not initially exist, but later on created
|
|
-> same problem for any path depth (polling? multiple inotify? fail?)
|
|
|
|
Let's assume /dir/abc exist, but nothing else under /dir
|
|
|
|
Setup initial inotify:
|
|
/dir [M1]
|
|
/dir [M3]
|
|
/dir/abc [M1]
|
|
/dir/abc [M2]
|
|
/dir/abc [M3]
|
|
/dir/abc [M4]
|
|
Rule? The parent of each component must be monitored
|
|
/ [M4] -- really??? Means we need to monitor whole FS tree!
|
|
/dir [M2] - /dir/abc may not yet exists but may be created later on
|
|
|
|
Q: can we combine /dir/abc?
|
|
A: well, we need to know what shall be monitored. e.g.
|
|
|
|
/dir/abc/a.logfile is created
|
|
watch /dir/abc must be evaluated
|
|
does not match [M1], but [M2]!
|
|
Event for /dir/abc is generated
|
|
need to check [M1], [M2]
|
|
|
|
/dir/def is created
|
|
watch /dir must be evaluated --> [M1], [M2], [M3], [M4] must be checked
|
|
matches [M1],[M3],[M4]
|
|
create new watch
|
|
/dir/def [M1, M3, M4] - this watch should not exist if all is right [-> assert()]
|
|
|
|
/dir/def/new.log is created
|
|
watch /dir/def must be evaluated --> [M1, M3, M4]
|
|
create new file watch
|
|
/dir/def/new.log [M1]
|
|
|
|
/dir/abc/a.log is created
|
|
watch /dir/abc must be evaluated --> [M1, M2, M3, M4]
|
|
create new file watch
|
|
/dir/abc/a.log [M1, M2]
|
|
|
|
At each level, I must know which child components to watch out for, and which
|
|
monitor/listener they belong to. Comparison must be done on immediate-child level.
|
|
|
|
Data Structures
|
|
--------------
|
|
|
|
Monitors
|
|
........
|
|
- path needs to be split into individial components --> array
|
|
|
|
File Watches
|
|
............
|
|
- contains active file state
|
|
- one file watch per file per monitor (/dir/abc/a.log sample with two active file entries)
|
|
- multiple entries are not properly supported by inotify (and possibly FEN)
|
|
--> need to multiplex ourselfes based on single notifcation
|
|
|
|
Dir Watches
|
|
...........
|
|
- contains active directory state
|
|
- used to discover new to-be-monitored files
|
|
|
|
Problems
|
|
--------
|
|
- we need to find, for each level, the proper "next component" that we need to check
|
|
against (fnmatch on component?)
|
|
- this must be done om a per-monitor basis
|
|
- one entry per monitored file, but a list of monitors (and component names to check against)
|
|
--> how do we efficiently describe the monitor so that we can do an O(1) name comparison?
|
|
--> we can NOT assume that the full path matches, just component-by-component!
|
|
--> we may NOT get notifications if path level are rapidly created, as we may not
|
|
have setup the watch sufficiently fast - so we need to poll the full tree
|
|
in any case
|
|
|
|
- state files: different state if the same file is monitored by more than one monitor
|
|
--> or should we flag this as an error case?
|
|
|
|
Poll vs. Watch
|
|
--------------
|
|
Polling is always needed (due to watch racieness). So it probably is best to have some generic
|
|
poll code, and use watches just to reduce the amount of polling. That also means that
|
|
poll mode can do exactly the same the notification mode can do!
|
|
|
|
|
|
Procedures
|
|
----------
|
|
|
|
e.g. /dir/aaa is created
|
|
this will be reported on /dir watch, which is watched by all 4 monitors
|
|
foreach monitor:
|
|
check if next-level component matches - this is the case for [M1, M2, M3]
|
|
if so:
|
|
create new/update existing entry (existing for all but first!)
|
|
arm watch for this entry (if newly created)
|
|
do a full poll of that entry and its subentries
|
|
AS FAR AS THE MONITOR IN QUESTION is affected:
|
|
M1 will scan subdir for "*.log"
|
|
M2 will scan for "abc", if it exists scan for "a.*"
|
|
M3 will scan for "a", if it exists scan for "*.txt"
|
|
--> rest of the path components need to be scanned, files
|
|
processed for each monitor
|
|
this is a recursive function
|
|
|
|
The above-mentioned poll process is fundamental, and can occur on each level
|
|
of the file system/monitor definitions. It needs to work both on active monitors as well
|
|
as the configured path components. Does ist make sense to have them in a single structure?
|
|
Is a tree more useful than an dirs table?
|
|
|
|
now the opposite:
|
|
e.g. /dir/aaa is deleted
|
|
this will be reported on /dir watch, which is watched by all 4 monitors
|
|
check if "aaa" exists as active(dynamic) entry
|
|
if so:
|
|
check for sub-entries, initiate delete events if any exists
|
|
(TODO: is MOVE here a problem?)
|
|
delete "aaa" entry
|
|
Note: we MUST NOT remove entries that are statically configured
|
|
(e.g. /dir in our samples, even if it is deleted - it may re-appear)
|
|
|