Clear _tagsByName, _lastPublishedByRef, and _lastWrittenByRef in ShutdownAsync
(via the new shared TeardownAsync helper) so a ReinitializeAsync cycle starts
from a clean state, consistent with the existing _autoProhibited.Clear().
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
_lastPublishedByRef was a plain Dictionary<string, object> mutated inside
ShouldPublish, which runs on the PollGroupEngine onChange callback. The engine
runs one background Task per subscription, so a driver with two or more
subscriptions invokes ShouldPublish concurrently on separate threads. Concurrent
TryGetValue/indexer writes on a non-thread-safe Dictionary can corrupt internal
state, drop entries, or throw, crashing the poll loop.
Switch _lastPublishedByRef to ConcurrentDictionary<string, object>; its
TryGetValue and indexer-set operations are individually thread-safe, so the
deadband cache is now correct under concurrent multi-subscription publishing,
consistent with the lock-guarded sibling cache _lastWrittenByRef.
Add an xUnit + Shouldly regression test that runs 24 deadband-configured
single-tag subscriptions concurrently and asserts the poll loop survives without
faulting.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Reviewed all 31 src/ production projects against the 10-category
checklist in REVIEW-PROCESS.md. Each module gets its own findings.md;
code-reviews/README.md is regenerated from them.
334 findings: 6 Critical, 46 High, 126 Medium, 156 Low.
Critical findings:
- Server-001: WriteNodeIdUnknown recurses unconditionally — a HistoryRead
on an unresolvable node crashes the process (remote DoS).
- Admin-001/002: app-wide auth bypass (RouteView not AuthorizeRouteView)
plus unauthenticated mutating routes.
- Core.Scripting-001: System.Environment reachable from operator scripts;
Environment.Exit() terminates the server.
- Core.AlarmHistorian-001: rowIds/events parallel-list desync on a corrupt
payload misapplies outcomes — silent alarm-event data loss.
- Driver.Galaxy-001: ReconnectSupervisor is built but never triggered, so
a transient gateway drop permanently kills the event stream.
All findings are Status=Open; resolution is tracked per REVIEW-PROCESS.md
section 4. Review only — no source code changed.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>