4 Commits

Author SHA1 Message Date
Sergei Turchanov
30bd511954 Enhance mmutf8fix handling of incorrect UTF-8 sequences
1. Invalid utf8 detection didn't handle 3 and 4-byte overlong encodings (2
   byte overlong econdings were handled explicitly by rejection E0 and E1
   start bytes). Unified checks for overlong encodings.
2. Surrogates U+D800..U+DFFF are not valid codepoints (Unicode Standard, D92)
3. Replacement of characters in invalid 3 or 4-bytes encodings was too
   eager. It must not replace bytes which are valid UTF-8 sequences. For
   example, in [0xE0 0xC2 0xA7] sequence the 0xC2 is invalid as a continuation
   byte, but it starts a valid UTF8 symbol [0xC2 0xA7]. That is, with current
   code processing the sequence will result in "???" but the correct result is "?§"
   (provided that the replacement character is "?").
4. Various tests for UTF-8 invalid/valid sequences.
2019-11-15 17:03:38 +10:00
Rainer Gerhards
42a8051ad9
testbench: make most tests use a port file and assign listen port 0
This makes the test much more robust against heavily loaded test
systems.
2019-08-16 17:31:52 +02:00
Rainer Gerhards
69ef6e329b fix bad bash coding style and disable shellcheck false positives
Also now permit interactivly running tests without explicitly setting
$srcdir. This now works if we are inside ./tests and fails, as before,
when we are in a different directory.

Detected by shellcheck via CodeFactor.io
2018-10-23 13:27:37 +02:00
Jan Gerhards
3479ea76ff mmutf8fix: add test for no error
closes https://github.com/rsyslog/rsyslog/issues/3070
2018-10-10 19:06:45 +02:00