rsyslog/tests/mmjsonparse-find-json-parser-validation.sh
Rainer Gerhards 7960b7f03e
mmjsonparse: add find-json mode for embedded JSON
Plain JSON embedded in text is common in production logs. This change
lets users parse such logs without cookies, improving ease of use and
lowering onboarding friction while keeping legacy behavior intact.

Before/After: cookie-only JSON -> find-json parses first top-level {}.

Impact: Default behavior unchanged. New mode and counters are opt-in.

Technical details:
- Add action parameter `mode` with `cookie` (default) and `find-json`.
  The new mode scans for the first `{` and uses json_tokener to validate
  a complete top-level object; quotes/escapes are respected.
- Add `max_scan_bytes` (default 65536) to bound scanning work and
  `allow_trailing` (default on) to accept or reject non-whitespace data
  after the parsed object. On reject/fail we return RS_RET_NO_CEE_MSG and
  fall back to {"msg":"..."} while preserving parsesuccess semantics.
- Expose per-worker scan counters via statsobj/impstats and rsyslogctl:
  scan.attempted, scan.found, scan.failed, scan.truncated. Counters are
  active only in find-json mode and are resettable.
- Use length-aware cookie parsing (getMSG/getMSGLen) and keep legacy
  RS_RET codes. Cookie mode behavior remains unchanged.
- Update docs: module overview, parameter references, statistics section
  (impstats usage), and examples incl. mixed-mode routing. Add developer
  engine overview page.
- Add tests for basic scanning, trailing control, scan limit, invalid
  JSON, invalid mode, and parser validation edge cases.
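
The parser validation test targets inputs that plain brace counting would misjudge. As a rough illustration only, the sketch below shows a naive brace counter in Bash. The function `naive_brace_scan` is hypothetical and is not part of rsyslog or its testbench; it ignores JSON string and escape rules, which is the weakness that find-json mode avoids by validating with json_tokener.

```shell
#!/bin/bash
# Hypothetical helper for illustration; NOT part of rsyslog or diag.sh.
# It counts braces only and knows nothing about JSON strings or escapes.
naive_brace_scan() {
    local s=$1 depth=0 started=0 i c
    for (( i = 0; i < ${#s}; i++ )); do
        c=${s:i:1}
        if [[ $c == "{" ]]; then
            depth=$((depth + 1))
            started=1
        elif [[ $c == "}" && $started -eq 1 ]]; then
            depth=$((depth - 1))
            if [[ $depth -eq 0 ]]; then
                # Braces balance, so the counter believes it found an object.
                echo "balanced"
                return 0
            fi
        fi
    done
    # The first opening brace (if any) was never closed.
    echo "unbalanced"
    return 0
}

# Invalid JSON (bare word as value), yet the braces balance.
naive_brace_scan 'TEST4 prefix {"invalid": json}'   # prints "balanced"

# Valid JSON, but the "{" inside the string value throws the count off.
naive_brace_scan 'prefix {"a": "{"}'                # prints "unbalanced"
```

Both verdicts are wrong from a JSON point of view: the first input is not valid JSON, and the second one is. A real parser that respects quotes and escapes gets both cases right.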

With the help of AI Agent: Copilot
2025-10-05 14:42:23 +02:00

#!/bin/bash
# Test mmjsonparse find-json mode with improved JSON parser validation
# This file is part of the rsyslog project, released under ASL 2.0
. ${srcdir:=.}/diag.sh init
generate_conf
add_conf '
module(load="../plugins/mmjsonparse/.libs/mmjsonparse")
template(name="outfmt" type="string" string="%msg% parsesuccess=%parsesuccess% json=%$!%\n")
# Test various JSON edge cases that manual brace counting might miss
if $msg contains "TEST" then {
	action(type="mmjsonparse" mode="find-json")
	action(type="omfile" file=`echo $RSYSLOG_OUT_LOG` template="outfmt")
	stop
}
'
startup
# Valid JSON with nested objects and arrays
injectmsg_literal '<167>Jan 16 16:57:54 host.example.net TAG: TEST1 prefix {"key": {"nested": ["a", "b"]}, "num": 123}'
# JSON with escaped quotes in strings
injectmsg_literal '<167>Jan 16 16:57:54 host.example.net TAG: TEST2 prefix {"message": "He said \"hello\" to me"}'
# JSON with braces in string values (should not confuse parser)
injectmsg_literal '<167>Jan 16 16:57:54 host.example.net TAG: TEST3 prefix {"code": "if (x > 0) { return true; }"}'
# Invalid JSON that looks like it might work with brace counting
injectmsg_literal '<167>Jan 16 16:57:54 host.example.net TAG: TEST4 prefix {"invalid": json}'
shutdown_when_empty
wait_shutdown
export EXPECTED=' TEST1 prefix {"key": {"nested": ["a", "b"]}, "num": 123} parsesuccess=OK json={ "key": { "nested": [ "a", "b" ] }, "num": 123 }
 TEST2 prefix {"message": "He said \"hello\" to me"} parsesuccess=OK json={ "message": "He said \"hello\" to me" }
 TEST3 prefix {"code": "if (x > 0) { return true; }"} parsesuccess=OK json={ "code": "if (x > 0) { return true; }" }
 TEST4 prefix {"invalid": json} parsesuccess=FAIL json={ "msg": " TEST4 prefix {\"invalid\": json}" }'
cmp_exact
exit_test