docs(code-review): re-review 17 changed modules at 1f9de8a2 — 8 new findings
Re-reviewed the modules whose source changed since the last review baseline (full-review remediation fd618cf1 + InboundAPI Database-helper fixes b3c90143), focused on whether the fixes are sound and regression-free.

9 of 17 modules clean; 8 new findings (0 Critical, 0 High, 4 Medium, 4 Low), all code-verified by the orchestrator before recording:

- DataConnectionLayer-029 (Med): DCL-023's unsubscribe-clears-in-flight reopens a double-subscribe window that leaks an orphaned alarm feed; the alarm completion handler overwrites the subscription id without the tag-path guard at line 908.
- InboundAPI-031 (Med): WaitForAttribute's 5s grace backstop is tighter than the CommunicationService Ask's timeout+IntegrationTimeout (30s) round-trip slack, so a slow-but-valid timed-out 'false' arriving in the 5-30s window is cancelled into an unhandled OperationCanceledException/500 (contradicts spec §6 + its own comment).
- SiteRuntime-032 (Med): SiteRuntime-029's wasPresent guard skips the deployed-count decrement when deleting a DISABLED instance (absent from both maps), drifting the health-dashboard tally; self-heals on singleton restart (observational, hence Med).
- StoreAndForward-028 (Med): StoreAndForward-025 resets the register-guard but not _bufferedCount, so a same-instance Stop->Start re-seeds the depth gauge to ~2N.
- AuditLog-017, CentralUI-037, ScriptAnalysis-009, SiteRuntime-033 (Low): a test-coverage gap plus stale doc-comments/spec following the remediation.

Header commit/date bumped to 1f9de8a2 / 2026-06-24 on all 17 modules; README regenerated (8 pending / 576 total).
@@ -5,10 +5,10 @@
 | Module | `src/ZB.MOM.WW.ScadaBridge.InboundAPI` |
 | Design doc | `docs/requirements/Component-InboundAPI.md` |
 | Status | Reviewed |
-| Last reviewed | 2026-06-20 |
+| Last reviewed | 2026-06-24 |
 | Reviewer | claude-agent |
-| Commit reviewed | `4307c381` |
-| Open findings | 0 |
+| Commit reviewed | `1f9de8a2` |
+| Open findings | 1 |

 ## Summary
@@ -1633,3 +1633,47 @@ databases."), mirroring how `RouteAccessor` throws on cross-site routing. Regres
`InboundScript_Database_DiagnosesClean` (CentralUI.Tests) drives a script using all three
`Database` methods through `ScriptAnalysisService.Diagnose(... ScriptKind.InboundApi)` and
asserts no `CS`/`SCADA` markers — alongside the existing `InboundScript_WaitForAttribute_DiagnosesClean`.

## Re-review — 2026-06-24 (commit `1f9de8a2`)

Focused re-review of the changes since the prior review, verifying that the code-review remediation and feature fixes are sound and regression-free. Reviewed by a per-module workflow agent; findings code-verified by the orchestrator.

**Changes reviewed:** `InboundDatabaseHelper` was migrated from blocking ADO.NET (`.GetAwaiter().GetResult()` plus sync `ExecuteScalar`/`ExecuteReader`) to a fully async path (`await using` for the connection, command, and reader; the token forwarded to `ExecuteScalarAsync`/`ExecuteReaderAsync`/`ReadAsync`). It also gained an `ExecuteAsync` write method (InboundAPI-026) and a `CommandTimeout` backstop derived from the method timeout (InboundAPI-027). `RouteHelper`/`RouteTarget` gained a separate request-abort token (`WithRequestAborted`), and `WaitForAttribute` was re-bounded by its own wait timeout (plus a 5s `WaitResponseGrace` backstop) and the abort/explicit tokens, deliberately excluding the method deadline (InboundAPI-029, spec §6). `InboundScriptExecutor` wires the method timeout into the DB helper and threads the raw request-abort token into the route helper.

**Verdict:** The remediation is largely sound and well-tested. The DB-helper async migration is correct end-to-end: `GetConnectionAsync` returns an already-opened connection, `await using` disposes the connection, command, and reader properly, the deadline token reaches every async DB call, the `CommandTimeout` backstop is derived safely (`Math.Ceiling`, only set when positive), writes are added per spec, and SQL-injection protection via parameter binding is preserved and regression-tested. `RouteHelper` builder composition is intact, and spec §6 (wait bounded by its own timeout, not the method deadline, still cancellable by client disconnect) is honoured and well covered. The one substantive concern is in `WaitForAttribute`: the new local 5-second grace backstop is tighter than the 30s `IntegrationTimeout` round-trip slack that the CommunicationService layer already budgets for the same Ask, so a slow-but-legitimate site response can be cancelled locally and surface to the script as a thrown exception instead of the spec-mandated clean `false`. The InboundAPI project builds clean and all 40 changed-area tests pass.
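
The `CommandTimeout` derivation mentioned above ("`Math.Ceiling`, only set when positive") can be pictured with a minimal sketch. The helper name `ToCommandTimeoutSeconds` and its nullable return are assumptions made for illustration, not the project's actual code. The reasoning it shows: `CommandTimeout` is expressed in whole seconds, and on providers such as SqlClient a value of 0 means "no limit", so truncating a sub-second timeout would silently remove the backstop.

```csharp
using System;

static class CommandTimeoutSketch
{
    // Hypothetical helper (not the project's code): maps a method timeout to
    // whole seconds for DbCommand.CommandTimeout, or null to keep the provider default.
    public static int? ToCommandTimeoutSeconds(TimeSpan methodTimeout)
    {
        // Non-positive timeouts are skipped so the command is never given 0 ("no limit").
        if (methodTimeout <= TimeSpan.Zero)
            return null;

        // Round up: truncation would turn 0.4s into 0 and 1.2s into 1.
        return (int)Math.Ceiling(methodTimeout.TotalSeconds);
    }

    static void Main()
    {
        Console.WriteLine(ToCommandTimeoutSeconds(TimeSpan.FromMilliseconds(400))); // 1
        Console.WriteLine(ToCommandTimeoutSeconds(TimeSpan.FromSeconds(1.2)));      // 2
        Console.WriteLine(ToCommandTimeoutSeconds(TimeSpan.Zero).HasValue);         // False
    }
}
```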

| # | Category | Examined | Notes |
|---|----------|----------|-------|
| 1 | Correctness & logic bugs | ☑ | WaitForAttribute local 5s grace backstop is shorter than the 30s IntegrationTimeout round-trip slack the CommunicationService Ask budgets; a slow site response gets cancelled locally and thrown instead of returning false. See finding. |
| 2 | Akka.NET conventions | ☑ | No actor code changed; RouteTarget delegates to IInstanceRouter/CommunicationService which Asks at the cluster boundary. No captured sender/this, no shared mutable actor state. No issues. |
| 3 | Concurrency & thread safety | ☑ | RouteHelper/RouteTarget are immutable (readonly fields, builder returns new instances); InboundDatabaseHelper is per-execution and not shared. Per-wait CTS created and disposed locally. No issues. |
| 4 | Error handling & resilience | ☑ | Null-gateway throws InvalidOperationException on first use (tested); GetConnectionAsync disposes on open failure. The grace-backstop converting a clean false into a thrown OperationCanceledException is the resilience gap noted in the finding. |
| 5 | Security | ☑ | SQL injection closed via parameter binding (regression-tested with an injection payload); named connections only, no arbitrary connection string; scripts never touch System.Data. No secret logging introduced. No issues. |
| 6 | Performance & resource management | ☑ | Blocking .GetAwaiter().GetResult() removed (no pool-thread blocking); await using disposes connection/command/reader; per-wait CTS and linked CTS both disposed via using. No issues. |
| 7 | Design-document adherence | ☑ | Component-InboundAPI.md Database Access (async, reads+writes, named connections, parameter binding, CommandTimeout backstop) and Routing Behavior §6 (wait bounded by its own timeout, abort still cancels) match the code. Doc is in sync. |
| 8 | Code organization & conventions | ☑ | CreateCommand factory extracted to dedupe; builder pattern consistent across With* methods; UTC timestamps (DateTimeOffset.UtcNow) preserved on routed requests. No issues. |
| 9 | Testing coverage | ☑ | Strong: reads/writes/injection/null-gateway/execute-path cancellation for the DB helper; wait-not-bounded-by-deadline, explicit-token, client-disconnect for the route. Gap: no test asserts behaviour when the local grace backstop fires before a slow site response (the finding scenario). |
| 10 | Documentation & comments | ☑ | XML docs and inline rationale (InboundAPI-026/027/029) are thorough and accurate. The grace-backstop comment overstates safety relative to the 30s IntegrationTimeout slack it sits inside, but that is documentation of the finding, not a separate issue. |

**New findings from this re-review (1):**

### InboundAPI-031 — Wait grace backstop tighter than the layer's round-trip slack

| | |
|--|--|
| Severity | Medium |
| Category | Correctness & logic bugs |
| Status | Open |
| Location | `src/ZB.MOM.WW.ScadaBridge.InboundAPI/RouteHelper.cs:266` |

**Description**

`RouteTarget.WaitForAttribute` builds a local backstop CTS of `timeout + WaitResponseGrace` (5s) and links it into the token passed to `RouteToWaitForAttributeAsync` (RouteHelper.cs:112, 266-269). The layer beneath, `CommunicationService.RouteToWaitForAttributeAsync` (CommunicationService.cs:601), already bounds the same cluster Ask by `request.Timeout + IntegrationTimeout`, and `IntegrationTimeout` defaults to 30s (CommunicationOptions.cs:22). So the local backstop fires 25s earlier than the round-trip budget the lower layer deliberately allows.

The site enforces the wait timeout and returns `Matched=false`. If that response is delivered between `timeout+5s` and `timeout+30s` (e.g. under cluster load, GC pause, or network latency, all well within the slack the lower layer budgets), the local linked CTS cancels the in-flight Ask. The Ask then throws `OperationCanceledException`/`TaskCanceledException` straight through `CommunicationServiceInstanceRouter` (no try/catch) to the script, converting a clean, spec-mandated `false` (spec §6) into an unhandled exception/500.

The code comment at RouteHelper.cs:107-111 claims the grace 'must sit slightly LATER than the wait timeout … must not pre-empt the site's own timed-out response', but 5s is shorter than the 30s the layer itself permits for that very response, so the backstop does exactly what the comment says it must not.
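
The timing window described above can be reproduced in isolation with a rough sketch. Everything here is a stand-in rather than the project's code: `SimulatedAskAsync` plays the role of the cluster Ask, and all durations are scaled down 100x (5s becomes 50ms, 30s becomes 300ms) so the example runs quickly.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

static class BackstopRaceSketch
{
    // Hypothetical stand-in for the cluster Ask: replies Matched=false after
    // siteDelay unless the token is cancelled first.
    static async Task<bool> SimulatedAskAsync(TimeSpan siteDelay, CancellationToken token)
    {
        await Task.Delay(siteDelay, token); // throws TaskCanceledException when cancelled
        return false;
    }

    // Returns "false" when the site's reply arrives, "cancelled" when the local backstop wins.
    public static async Task<string> RunAsync(TimeSpan waitTimeout, TimeSpan grace, TimeSpan siteDelay)
    {
        using var backstop = new CancellationTokenSource(waitTimeout + grace);
        try
        {
            bool matched = await SimulatedAskAsync(siteDelay, backstop.Token);
            return matched ? "true" : "false";
        }
        catch (OperationCanceledException)
        {
            return "cancelled";
        }
    }

    static async Task Main()
    {
        var waitTimeout = TimeSpan.FromMilliseconds(100);
        var siteDelay = TimeSpan.FromMilliseconds(250); // reply lands inside the 5s-30s window (scaled)

        // 5s-style grace: the backstop fires at 150ms, before the reply at 250ms.
        Console.WriteLine(await RunAsync(waitTimeout, TimeSpan.FromMilliseconds(50), siteDelay));  // cancelled
        // 30s-style grace: the backstop would fire at 400ms, so the reply wins.
        Console.WriteLine(await RunAsync(waitTimeout, TimeSpan.FromMilliseconds(300), siteDelay)); // false
    }
}
```

In the real call path there is no catch around the Ask, so the first case is the one that would surface to the script as an exception.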

**Recommendation**

Either align the local backstop with the round-trip slack the CommunicationService Ask uses (i.e. derive the grace from `CommunicationOptions.IntegrationTimeout` rather than a hard-coded 5s), or drop the local backstop entirely and rely on the CommunicationService Ask's own `timeout + IntegrationTimeout` bound (the request-abort and explicit caller tokens can still be linked in). In both cases, ensure the local cancellation cannot fire before the lower layer would have delivered the site's timed-out response. Add a test that delays the substitute router's response past the 5s grace but within the round-trip budget, and assert that the call still returns `false` rather than throwing.
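
One possible shape for the first option is sketched below, under stated assumptions: `WaitAsync`, its parameter list, and the `ask` delegate are hypothetical and only illustrate how the tokens could be composed; the real `RouteTarget` signature may differ. Durations are again scaled down 100x. Note that a backstop exactly equal to the lower layer's bound still races at the boundary, so a small extra margin, or dropping the local backstop, may be preferable.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

static class AlignedBackstopSketch
{
    // Hypothetical wrapper: the backstop uses the same slack as the lower layer,
    // and the request-abort and explicit caller tokens stay linked in.
    public static async Task<bool> WaitAsync(
        Func<CancellationToken, Task<bool>> ask,
        TimeSpan waitTimeout,
        TimeSpan integrationTimeout,
        CancellationToken requestAborted,
        CancellationToken explicitToken)
    {
        using var backstop = new CancellationTokenSource(waitTimeout + integrationTimeout);
        using var linked = CancellationTokenSource.CreateLinkedTokenSource(
            requestAborted, explicitToken, backstop.Token);
        return await ask(linked.Token);
    }

    static async Task Main()
    {
        // Simulated slow site: replies false 150ms after the 100ms wait timeout.
        Func<CancellationToken, Task<bool>> slowAsk = async token =>
        {
            await Task.Delay(TimeSpan.FromMilliseconds(250), token);
            return false;
        };

        bool matched = await WaitAsync(slowAsk, TimeSpan.FromMilliseconds(100),
            TimeSpan.FromMilliseconds(300), CancellationToken.None, CancellationToken.None);
        Console.WriteLine(matched); // False: the slow reply is returned, not thrown

        // A client disconnect still cancels the wait.
        using var aborted = new CancellationTokenSource();
        aborted.Cancel();
        try
        {
            await WaitAsync(slowAsk, TimeSpan.FromMilliseconds(100),
                TimeSpan.FromMilliseconds(300), aborted.Token, CancellationToken.None);
        }
        catch (OperationCanceledException)
        {
            Console.WriteLine("cancelled by request abort");
        }
    }
}
```

The first call in `Main` is essentially the regression test the recommendation asks for: a response delayed past the old 5s grace, but inside the round-trip budget, must come back as `false`.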

**Resolution**

_Unresolved._