Operators want dynstats to survive restarts for consistent metrics and
smoother observability in containers and rolling deploys.
Before: dynstats buckets were ephemeral; restarts reset counters.
After: optional on-disk persistence restores counters; worker thread is
started on demand and torn down with the owning rsconf.
Impact: New state files under WorkDirectory (or statefile.directory)
when enabled; slight I/O overhead each time a configured threshold
triggers a write. Defaults preserve previous behavior (persistence
off).
This adds two thresholds to trigger persistence, both defaulting to 0
(disabled):
- persistStateInterval (count-based)
- persistStateTimeInterval (time-based)
A new statefile.directory setting can override WorkDirectory for
dynstats files.
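
The interplay of the two thresholds can be sketched as below. This is
an illustration only, not the code in this commit: the struct, its
field names, and persist_due() are hypothetical, and treating the time
interval as seconds is an assumption.

```c
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical per-bucket persistence state; names are illustrative. */
typedef struct {
    uint64_t persist_count_interval; /* persistStateInterval, 0 = disabled */
    uint64_t persist_time_interval;  /* persistStateTimeInterval, 0 = disabled
                                        (assumed to be in seconds) */
    uint64_t updates_since_persist;  /* updates seen since the last write */
    time_t last_persist;             /* time of the last write */
} persist_state_t;

/* Record one counter update and report whether a state write is due.
 * Either threshold alone is enough to trigger; with both at 0 the
 * function never reports a write, matching "persistence off". */
static bool persist_due(persist_state_t *st, time_t now) {
    bool due = false;

    st->updates_since_persist++;
    if (st->persist_count_interval != 0 &&
        st->updates_since_persist >= st->persist_count_interval)
        due = true;
    if (st->persist_time_interval != 0 &&
        difftime(now, st->last_persist) >= (double)st->persist_time_interval)
        due = true;

    if (due) {
        /* Restart both windows after a triggered write. */
        st->updates_since_persist = 0;
        st->last_persist = now;
    }
    return due;
}
```
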
On bucket creation, existing JSON state ("dynstats-state:<bucket>")
is loaded to rehydrate counters. Updates may enqueue async writes to a
lazily-started file-write worker; teardown performs a final sync flush
without holding the bucket lock to avoid I/O-induced deadlocks.
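
The "flush without holding the bucket lock" pattern can be sketched as
copy-under-lock, then write. This is a minimal sketch under stated
assumptions, not the committed implementation: bucket_t, MAX_METRICS,
bucket_final_flush(), and the flat JSON layout are all hypothetical,
and metric names are assumed to need no JSON escaping.

```c
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>

#define MAX_METRICS 8 /* illustrative fixed capacity */

/* Hypothetical bucket: a lock plus a small table of named counters. */
typedef struct {
    pthread_mutex_t lock;
    size_t n;
    const char *names[MAX_METRICS];
    uint64_t values[MAX_METRICS];
} bucket_t;

/* Snapshot the counters under the lock, then do all file I/O after the
 * lock is released, so slow or blocked I/O cannot stall other threads
 * that need the bucket lock. Returns 0 on success, -1 on error.
 * Assumes the name strings stay valid until the function returns. */
static int bucket_final_flush(bucket_t *b, const char *path) {
    const char *names[MAX_METRICS];
    uint64_t values[MAX_METRICS];
    size_t n, i;
    int ok = 1;

    pthread_mutex_lock(&b->lock);
    n = b->n < MAX_METRICS ? b->n : MAX_METRICS;
    for (i = 0; i < n; i++) {
        names[i] = b->names[i];
        values[i] = b->values[i];
    }
    pthread_mutex_unlock(&b->lock);

    /* From here on only the private snapshot is touched. */
    FILE *fp = fopen(path, "w");
    if (fp == NULL)
        return -1;
    if (fputc('{', fp) == EOF)
        ok = 0;
    for (i = 0; i < n; i++) {
        if (fprintf(fp, "%s\"%s\":%llu", i ? "," : "", names[i],
                    (unsigned long long)values[i]) < 0)
            ok = 0;
    }
    if (fputs("}\n", fp) == EOF)
        ok = 0;
    if (fclose(fp) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

The trade-off of this approach is that the file reflects the counters
as of the snapshot; updates arriving during the write are not included,
which is acceptable at teardown when no further updates are expected.
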
Worker lifecycle is tied to rsconf: init in dynstats_initCnf(),
start on first persistent bucket, stop in dynstats_destroyAllBuckets().
The latter now takes rsconf_t* and is invoked from rsconf destruct,
avoiding prior hangs when loadConf/runConf differed. Per-bucket stats
track flushed bytes/counts/errors; a "file-write-worker" group reports
queue size/enqueues. Docs updated; tests add dynstats-persist(+vg) to
verify restore-after-restart and clean shutdown.
With the help of AI Agents: GitHub Copilot, cubic-dev-ai, ChatGPT codex
Co-authored-by: Rainer Gerhards <rgerhards@adiscon.com>