docs(code-review): re-review 17 changed modules at 1f9de8a2 — 8 new findings

Re-reviewed the modules whose source changed since the last review baseline
(full-review remediation fd618cf1 + InboundAPI Database-helper fixes b3c90143),
focused on whether the fixes are sound and regression-free. 9 of 17 modules
clean; 8 new findings (0 Critical, 0 High, 4 Medium, 4 Low), all code-verified
by the orchestrator before recording:

- DataConnectionLayer-029 (Med): DCL-023's unsubscribe-clears-in-flight reopens a
  double-subscribe window that leaks an orphaned alarm feed; the alarm completion
  handler overwrites the subscription id without the tag-path guard at line 908.
- InboundAPI-031 (Med): WaitForAttribute's 5s grace backstop is tighter than the
  CommunicationService Ask's timeout+IntegrationTimeout (30s) round-trip slack, so
  a slow-but-valid timed-out 'false' arriving in the 5-30s window is cancelled into
  an unhandled OperationCanceledException/500 (contradicts spec 6 + its own comment).
- SiteRuntime-032 (Med): SiteRuntime-029's wasPresent guard skips the deployed-count
  decrement when deleting a DISABLED instance (absent from both maps), drifting the
  health-dashboard tally; self-heals on singleton restart (observational, hence Med).
- StoreAndForward-028 (Med): StoreAndForward-025 resets the register-guard but not
  _bufferedCount, so a same-instance Stop->Start re-seeds the depth gauge to ~2N.
- AuditLog-017, CentralUI-037, ScriptAnalysis-009, SiteRuntime-033 (Low): a
  test-coverage gap plus stale doc-comments/spec following the remediation.

Header commit/date bumped to 1f9de8a2 / 2026-06-24 on all 17 modules; README
regenerated (8 pending / 576 total).
Joseph Doherty
2026-06-24 09:20:03 -04:00
parent 1f9de8a2b5
commit c42bb48585
18 changed files with 635 additions and 66 deletions
+47 -3
@@ -5,10 +5,10 @@
| Module | `src/ZB.MOM.WW.ScadaBridge.AuditLog` |
| Design doc | `docs/requirements/Component-AuditLog.md` |
| Status | Reviewed |
| Last reviewed | 2026-06-20 |
| Last reviewed | 2026-06-24 |
| Reviewer | claude-agent |
| Commit reviewed | `4307c381` |
| Open findings | 0 |
| Commit reviewed | `1f9de8a2` |
| Open findings | 1 |
## Summary
@@ -875,3 +875,47 @@ test in `AddAuditLogTests` calls the helper twice and asserts a single
`AddAuditLogCentralMaintenance` helper is left for a follow-up — it is only
ever called from the central composition root and the unit/integration
fixtures use disposable IServiceCollections, so the foot-gun is narrower.
## Re-review — 2026-06-24 (commit `1f9de8a2`)
Focused re-review of the changes since the prior review, verifying that the code-review remediation and feature fixes are sound and regression-free. Reviewed by a per-module workflow agent; findings code-verified by the orchestrator.
**Changes reviewed:** Two files changed since 4307c381. AuditLogPurgeOptions.cs (AuditLog-013) adds [ConfigurationKeyName("ChannelPurgeBatchSize")] so the documented operator config key binds onto the ChannelPurgeBatchSizeConfigured backing property instead of being silently ignored. AuditLogIngestActor.cs (AuditLog-014) wraps OnIngestAsync's scope-creation + repository resolution in an outer try/catch (mirroring the pre-existing OnCachedTelemetryAsync guard) so a transient DI/DbContext fault cannot escape the handler and restart the central singleton; plus doc-comment cleanup removing stale Bundle A/C/D/E jargon.
**Verdict:** The two changes are sound, targeted remediations and are regression-free. AuditLog-013 correctly aligns the binder with the documented AuditLog:Purge:ChannelPurgeBatchSize key (verified against Component-AuditLog.md lines 407/508); the project builds clean and the new binding test pins the exact documented section/key shape. AuditLog-014 correctly hardens OnIngestAsync against scope/resolution faults restarting the singleton, preserving the captured-Sender-before-await pattern and the unconditional reply so the site's Ask never times out; per-row failures stay contained in the inner loop and the failure counter is defensively guarded. All 28 ingest/purge/binding unit tests pass. The only gap is that the new AuditLog-014 guard path itself is not exercised by a dedicated test, but it is correct by inspection and matches an already-covered sibling path.
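The AuditLog-013 binding behavior can be illustrated with a minimal sketch. `PurgeOptionsSketch`, the nullable `int` property type, and the value `500` are hypothetical stand-ins chosen for illustration; only the attribute name, the `ChannelPurgeBatchSizeConfigured` property name, and the `AuditLog:Purge:ChannelPurgeBatchSize` key come from the review above. It assumes the `Microsoft.Extensions.Configuration` and `Microsoft.Extensions.Configuration.Binder` packages are referenced.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.Extensions.Configuration;

// Hypothetical options type (not the real AuditLogPurgeOptions): the property
// name differs from the documented config key, so the attribute supplies the
// key name the binder should look for.
public sealed class PurgeOptionsSketch
{
    [ConfigurationKeyName("ChannelPurgeBatchSize")]
    public int? ChannelPurgeBatchSizeConfigured { get; set; }
}

public static class BindingSketch
{
    public static void Main()
    {
        // In-memory configuration shaped like the documented section and key.
        // The value 500 is an arbitrary illustration.
        IConfiguration config = new ConfigurationBuilder()
            .AddInMemoryCollection(new Dictionary<string, string?>
            {
                ["AuditLog:Purge:ChannelPurgeBatchSize"] = "500",
            })
            .Build();

        // Bind the section onto the options type. Without the attribute the
        // binder would look for a key named after the property instead.
        var options = config.GetSection("AuditLog:Purge").Get<PurgeOptionsSketch>();

        Console.WriteLine(options?.ChannelPurgeBatchSizeConfigured);
    }
}
```

Removing the attribute from the sketch leaves the property at its default, which is the "silently ignored" failure mode that the real binding test is described as pinning.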
| # | Category | Examined | Notes |
|---|----------|----------|-------|
| 1 | Correctness & logic bugs | ☑ | AuditLog-014 outer catch only fires on scope-create/resolve/dispose faults (per-row Exception catch contains row failures); replyTo.Tell runs unconditionally after if-else. AuditLog-013 ConfigurationKeyName maps documented key onto backing prop. No logic defects. |
| 2 | Akka.NET conventions | ☑ | Sender captured before first await (line 116); singleton no longer restarts on transient DI fault (the fix's intent); SupervisorStrategy override unchanged. Ask-at-boundary reply preserved. No issues found. |
| 3 | Concurrency & thread safety | ☑ | All mutation (accepted list, nowUtc) is actor-thread local; fresh DI scope per message; no captured this/Sender in post-await closures. No shared mutable state introduced. No issues found. |
| 4 | Error handling & resilience | ☑ | New outer try/catch makes best-effort audit truly non-fatal; counter.Increment wrapped in defensive try/catch; logs with accepted count. Brings OnIngestAsync into parity with OnCachedTelemetryAsync. Sound. |
| 5 | Security | ☑ | No secret/auth exposure in new log messages; redaction path (SafeDefault fallback) unchanged. Config key change is non-sensitive. No issues found. |
| 6 | Performance & resource management | ☑ | await using on the async scope unchanged; at most one extra counter increment on a scope-dispose fault (best-effort gauge). No new allocation or leak. No issues found. |
| 7 | Design-document adherence | ☑ | ChannelPurgeBatchSize key now matches Component-AuditLog.md (lines 407/508) exactly; audit-write-failure-never-aborts invariant reinforced. No drift in either direction. |
| 8 | Code organization & conventions | ☑ | Stale Bundle A/C/D/E jargon removed from doc comments; all <see cref> targets (AuditRowProjection.WithIngestedAtUtc, ConfigurationKeyNameAttribute, SafeDefaultAuditRedactor) resolve. This is the only use of ConfigurationKeyName in the repo, but it compiles cleanly. Clean. |
| 9 | Testing coverage | ☑ | New PurgeOptions_Bind_FromDocumentedSectionAndKeys pins AuditLog-013; 28 ingest/purge/binding tests pass. AuditLog-014's scope-throw guard path has no dedicated test (Low finding). |
| 10 | Documentation & comments | ☑ | Class/remarks doc comments updated accurately; AuditLog-014 inline comment precisely explains the guard's rationale (singleton restart / dropped reply / site retry). No stale references remain. |
**New findings from this re-review (1):**
### AuditLog-017 — AuditLog-014 guard path not exercised by a test
| | |
|--|--|
| Severity | Low |
| Category | Testing coverage |
| Status | Open |
| Location | `src/ZB.MOM.WW.ScadaBridge.AuditLog/Central/AuditLogIngestActor.cs:149` |
**Description**
The new outer try/catch in OnIngestAsync (the AuditLog-014 fix) is the regression guard that keeps the central singleton alive and still replies (with whatever was accepted) when CreateAsyncScope() or GetRequiredService<IAuditLogRepository>() throws. No unit test injects a throwing IServiceProvider/scope to assert that the actor survives, increments the failure counter, and returns an IngestAuditEventsReply. The fix is correct by inspection and mirrors the already-present OnCachedTelemetryAsync guard, but the specific behavior the fix exists to protect is not pinned, so a future refactor could silently reintroduce the restart-on-transient-fault regression.
**Recommendation**
Add a TestKit test using a service provider whose IAuditLogRepository resolution throws (or whose scope-create throws), assert the actor does not restart and replies with an empty accepted list, and assert the ICentralAuditWriteFailureCounter is incremented once.
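One possible shape for such a test is sketched below. It does not use the real `AuditLogIngestActor`, whose constructor and message types are not shown in this review; `GuardedIngestActor`, `IngestRequest`, `IngestReply`, and `FailureCounterSketch` are hypothetical stand-ins that reproduce only the guard pattern described above (capture `Sender`, guard scope creation, reply unconditionally). It assumes the `Akka.TestKit.Xunit2` package.

```csharp
using System;
using System.Threading;
using Akka.Actor;
using Akka.TestKit.Xunit2;
using Xunit;

// Hypothetical stand-ins for the real request, reply, and counter types.
public sealed record IngestRequest(int RowCount);
public sealed record IngestReply(int AcceptedCount);

public sealed class FailureCounterSketch
{
    private int _count;
    public int Count => Volatile.Read(ref _count);
    public void Increment() => Interlocked.Increment(ref _count);
}

// Hypothetical actor reproducing only the guard pattern under review.
public sealed class GuardedIngestActor : ReceiveActor
{
    public GuardedIngestActor(Func<IAsyncDisposable> createScope, FailureCounterSketch counter)
    {
        ReceiveAsync<IngestRequest>(async msg =>
        {
            var replyTo = Sender; // captured before the first await
            var accepted = 0;
            try
            {
                // Scope creation may throw on a transient DI fault.
                await using var scope = createScope();
                accepted = msg.RowCount;
            }
            catch (Exception)
            {
                // Contain the fault so it cannot escape and restart the actor.
                counter.Increment();
            }

            // Unconditional reply, so the caller's Ask does not time out.
            replyTo.Tell(new IngestReply(accepted));
        });
    }
}

public class GuardedIngestActorTests : TestKit
{
    [Fact]
    public void Scope_fault_is_contained_and_a_reply_is_still_sent()
    {
        var counter = new FailureCounterSketch();

        // Declared outside Props.Create because expression trees cannot
        // contain throw-expressions.
        Func<IAsyncDisposable> throwingScope =
            () => throw new InvalidOperationException("scope unavailable");

        var actor = Sys.ActorOf(
            Props.Create(() => new GuardedIngestActor(throwingScope, counter)));

        actor.Tell(new IngestRequest(3), TestActor);

        // If the exception escaped the handler, no reply would arrive and
        // ExpectMsg would time out, failing the test.
        var reply = ExpectMsg<IngestReply>();
        Assert.Equal(0, reply.AcceptedCount);
        Assert.Equal(1, counter.Count);
    }
}
```

The reply's arrival serves as a proxy for "the fault did not escape". A test against the real actor could additionally assert the absence of a restart directly, for example by observing a restart hook, which this sketch leaves out.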
**Resolution**
_Unresolved._