1 Commit

Author SHA1 Message Date
Rainer Gerhards
7960b7f03e
mmjsonparse: add find-json mode for embedded JSON
Plain JSON embedded in text is common in production logs. This change
lets users parse such logs without cookies, improving ease of use and
lowering onboarding friction while keeping legacy behavior intact.

Before/After: only cookie-prefixed JSON was parsed -> find-json mode
parses the first top-level {} object found in the message.

Impact: Default behavior unchanged. New mode and counters are opt-in.

Technical details:
- Add action parameter `mode` with `cookie` (default) and `find-json`.
  The new mode scans for the first `{` and uses json_tokener to validate
  a complete top-level object; quotes/escapes are respected.
- Add `max_scan_bytes` (default 65536) to bound scanning work and
  `allow_trailing` (default on) to accept or reject non-whitespace data
  after the parsed object. On reject/fail we return RS_RET_NO_CEE_MSG and
  fall back to {"msg":"..."} while preserving parsesuccess semantics.
- Expose per-worker scan counters via statsobj/impstats and rsyslogctl:
  scan.attempted, scan.found, scan.failed, scan.truncated. Counters are
  active only in find-json mode and are resettable.
- Use length-aware cookie parsing (getMSG/getMSGLen) and keep legacy
  RS_RET codes. Cookie mode behavior remains unchanged.
- Update docs: module overview, parameter references, statistics section
  (impstats usage), and examples including mixed-mode routing. Add a
  developer engine overview page.
- Add tests for basic scanning, trailing control, scan limit, invalid
  JSON, invalid mode, and parser validation edge cases.
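
The find-json behavior described above can be illustrated with a small
Python sketch. This is not the module's C code: it substitutes Python's
json.JSONDecoder.raw_decode for json_tokener, counts characters rather
than bytes, and returns the RS_RET names as plain strings. The function
name find_json, the counters dict, and the rules for when scan.failed or
scan.truncated is incremented are assumptions made for illustration only.

```python
import json

# Placeholder status strings; the real module returns rsRetVal codes.
RS_RET_OK = "RS_RET_OK"
RS_RET_NO_CEE_MSG = "RS_RET_NO_CEE_MSG"


def find_json(msg, max_scan_bytes=65536, allow_trailing=True, counters=None):
    """Hypothetical sketch of find-json mode: locate and parse the first
    top-level JSON object in msg, else fall back to {"msg": msg}."""
    if counters is None:
        counters = {}

    def bump(name):
        counters[name] = counters.get(name, 0) + 1

    def fallback(counter_name):
        bump(counter_name)
        return RS_RET_NO_CEE_MSG, {"msg": msg}

    bump("scan.attempted")

    # Bound the scanning work. Assumption: a failure on a message longer
    # than the limit is counted as truncated rather than failed.
    window = msg[:max_scan_bytes]
    miss = "scan.truncated" if len(msg) > max_scan_bytes else "scan.failed"

    start = window.find("{")
    if start < 0:
        return fallback(miss)

    try:
        # raw_decode parses one complete value starting at `start` and
        # returns the index just past it; braces inside quoted strings
        # and escape sequences are handled by the parser.
        obj, end = json.JSONDecoder().raw_decode(window, start)
    except json.JSONDecodeError:
        return fallback(miss)

    # Optionally reject non-whitespace data after the parsed object.
    if not allow_trailing and msg[end:].strip():
        return fallback("scan.failed")

    bump("scan.found")
    return RS_RET_OK, obj
```

With these assumptions, find_json('prefix {"a": 1} tail') yields the
parsed object, while the same call with allow_trailing=False takes the
{"msg":"..."} fallback path.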

With the help of AI Agent: Copilot
2025-10-05 14:42:23 +02:00