review(Core.VirtualTags): fix Good-null upstream blocking downstream (Medium)

Re-review at 7286d320. -014 (Medium): AreInputsReady gated on value!=null, so a script
returning null (Good quality) permanently blocked change-triggered dependents at
BadWaitingForInitialData; now gates on the StatusCode Bad bit (bit 31) only + test. -015:
TimerTriggerScheduler.Start throws on double-call. -016: fix wrong status-code comment.
Joseph Doherty
2026-06-19 11:21:35 -04:00
parent 272a9da61e
commit 48af117bff
5 changed files with 172 additions and 10 deletions
+72 -2
@@ -4,8 +4,8 @@
|---|---|
| Module | `src/Core/ZB.MOM.WW.OtOpcUa.Core.VirtualTags` |
| Reviewer | Claude Code |
| Review date | 2026-05-22 |
| Commit reviewed | `76d35d1` |
| Review date | 2026-06-19 |
| Commit reviewed | `7286d320` |
| Status | Reviewed |
| Open findings | 0 |
@@ -357,3 +357,73 @@ path) rather than rendering arrows, or reconstruct an actual cycle path within t
(a single DFS back-edge walk) before formatting.
**Resolution:** Resolved 2026-05-23 — `DependencyCycleException.BuildMessage` now formats each cycle as `cycle members: A, B, C` (comma-separated set) rather than the misleading `A -> B -> C -> A` arrow form. Added a regression test asserting the message contains the word "member" and does not fabricate an edge sequence.
---
## Re-review 2026-06-19 (commit 7286d320)
All 13 prior findings remain Resolved (no regressions found). Three new findings were added.
### Re-review checklist
| # | Category | Result |
|---|---|---|
| 1 | Correctness & logic bugs | Core.VirtualTags-014, Core.VirtualTags-015 |
| 2 | OtOpcUa conventions | No issues found |
| 3 | Concurrency & thread safety | No issues found |
| 4 | Error handling & resilience | No issues found |
| 5 | Security | No issues found |
| 6 | Performance & resource management | No issues found |
| 7 | Design-document adherence | No issues found |
| 8 | Code organization & conventions | No issues found |
| 9 | Testing coverage | Core.VirtualTags-015 (test added as part of fix) |
| 10 | Documentation & comments | Core.VirtualTags-016 |
### Core.VirtualTags-014
| Field | Value |
|---|---|
| Severity | Medium |
| Category | Correctness & logic bugs |
| Location | `src/Core/ZB.MOM.WW.OtOpcUa.Core.VirtualTags/VirtualTagEngine.cs:400` |
| Status | Resolved |
**Description:** `AreInputsReady` checked `kv.Value.Value is null` in addition to the `StatusCode` bad-bit test. A `DataValueSnapshot(null, 0u, ...)` has Good quality (StatusCode = 0) but a null Value; it is produced when a script returns `null` (legal for String and any nullable path) or when `ctx.SetVirtualTag(path, null)` is called. The null-value check made `AreInputsReady` return `false` for this snapshot, so every downstream change-triggered dependent of that tag stayed permanently at `BadWaitingForInitialData`, and nothing was logged to explain why the tag never advanced.
Readiness should be determined by `StatusCode` (OPC UA quality) alone. If the upstream is Good-quality with a null value, the downstream script should run; if the script then dereferences the null unconditionally it will throw, which the outer `catch` correctly maps to `BadInternalError` — the proper sentinel for "script ran but faulted," not "inputs not yet available."
**Recommendation:** Remove the `kv.Value.Value is null` check from `AreInputsReady`; treat an input as not ready only when `(kv.Value.StatusCode & 0x80000000u) != 0`.
**Resolution:** Resolved 2026-06-19 — removed the `kv.Value.Value is null` guard from `AreInputsReady`; readiness now gates on StatusCode bit 31 only. Updated the method XML doc to explain the reasoning. Added regression test `AreInputsReady_Good_quality_null_upstream_does_not_block_downstream` which confirmed the bug (Consumer stuck at 0x80320000) before the fix and passes after.
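The readiness rule described in this finding can be sketched as follows. This is a minimal illustration, not the engine's actual code: `Snapshot` is a hypothetical stand-in for `DataValueSnapshot` that keeps only the two fields discussed here, and `ReadinessSketch` and `BadSeverityBit` are names invented for the example.

```csharp
#nullable enable
using System.Collections.Generic;

// Hypothetical stand-in for DataValueSnapshot: only Value and StatusCode are modelled.
public readonly record struct Snapshot(object? Value, uint StatusCode);

public static class ReadinessSketch
{
    // Bit 31 of an OPC UA StatusCode is set for every Bad-severity code.
    private const uint BadSeverityBit = 0x80000000u;

    // Inputs are ready when no input carries the Bad bit.
    // A null Value is deliberately ignored: quality alone decides readiness.
    public static bool AreInputsReady(IReadOnlyDictionary<string, Snapshot> inputs)
    {
        foreach (var kv in inputs)
        {
            if ((kv.Value.StatusCode & BadSeverityBit) != 0)
                return false;
        }
        return true;
    }
}
```

Under this rule a snapshot with a null value and StatusCode 0 is ready, while one carrying `BadWaitingForInitialData` (0x80320000) is not. Note that a bit-31 test also treats Uncertain-severity codes as ready, since those set bit 30 rather than bit 31.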
### Core.VirtualTags-015
| Field | Value |
|---|---|
| Severity | Low |
| Category | Correctness & logic bugs |
| Location | `src/Core/ZB.MOM.WW.OtOpcUa.Core.VirtualTags/TimerTriggerScheduler.cs:55-74` |
| Status | Resolved |
**Description:** `TimerTriggerScheduler.Start` had no guard against being called more than once on the same (non-disposed) instance. A second call silently appended a duplicate set of `Timer` objects and `TickGroup` entries to `_timers` and `_groups`, doubling the evaluation frequency for every timer group without any error, log, or diagnostic. On dispose, the duplicate timers were correctly cleaned up (both instances disposed), but between Start and Dispose every group fired twice per interval — each with its own independent `InFlight` flag, so the skip-tick guard in `OnTimer` was also defeated.
In the current call sites, `Start` is called exactly once per Load cycle, so the bug is latent rather than triggered by production paths. However, the test suite only verified the `ObjectDisposedException` path (`Start` after `Dispose`), not a second `Start` on a live instance.
**Recommendation:** Track a `_started` bool and throw `InvalidOperationException` on a second call.
**Resolution:** Resolved 2026-06-19 — added `_started` field; `Start` now throws `InvalidOperationException("TimerTriggerScheduler.Start has already been called. Create a new instance for each Load cycle.")` on a second call. Added regression test `Start_called_twice_throws_InvalidOperationException`.
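A minimal sketch of the double-call guard described above, assuming a plain `bool` flag as the recommendation suggests. `SchedulerSketch` is a hypothetical class written for illustration; the real `TimerTriggerScheduler` also creates timers and tick groups, which are omitted here.

```csharp
using System;

// Hypothetical, reduced model of a scheduler that may be started only once.
public sealed class SchedulerSketch : IDisposable
{
    private bool _started;
    private bool _disposed;

    public void Start()
    {
        // Starting a disposed instance is rejected first.
        if (_disposed)
            throw new ObjectDisposedException(nameof(SchedulerSketch));

        // A second Start on a live instance would otherwise register duplicate timers.
        if (_started)
            throw new InvalidOperationException(
                "Start has already been called. Create a new instance for each Load cycle.");

        _started = true;
        // Timer and tick-group creation would happen here in a real scheduler.
    }

    public void Dispose() => _disposed = true;
}
```

The flag in this sketch is not synchronized, so it only protects against sequential double calls. If `Start` could race with itself, an atomic check such as `Interlocked.Exchange` on an `int` flag would be one option.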
### Core.VirtualTags-016
| Field | Value |
|---|---|
| Severity | Low |
| Category | Documentation & comments |
| Location | `src/Core/ZB.MOM.WW.OtOpcUa.Core.VirtualTags/VirtualTagEngine.cs:472-475` |
| Status | Resolved |
**Description:** The `catch` block inside `CoerceResult` contained the comment "Caller logs + maps to BadTypeMismatch — we let null propagate so the outer evaluation path sets the Bad quality." The outer evaluation path at lines 348-350 maps a coercion failure (`raw is not null && coerced is null`) to `BadInternalError` (0x80020000), not `BadTypeMismatch` (0x80740000). The code was correct; only the comment named the wrong status code, which could mislead a maintainer reading the OPC UA status code tables.
**Recommendation:** Correct the comment to name `BadInternalError`.
**Resolution:** Resolved 2026-06-19 — comment updated to read "Caller maps null to BadInternalError (0x80020000) — we let null propagate so the outer evaluation path sets the Bad quality on the snapshot."
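The mapping that the corrected comment refers to can be illustrated with a short sketch. The names `CoercionSketch`, `CoerceToInt32`, and `StatusFor` are hypothetical and the coercion targets `Int32` only for the sake of the example; the status-code constants are the values quoted in this finding.

```csharp
#nullable enable
using System;
using System.Globalization;

public static class CoercionSketch
{
    public const uint Good = 0x00000000u;
    public const uint BadInternalError = 0x80020000u;

    // Returns null when coercion fails; the caller decides the status code.
    public static object? CoerceToInt32(object? raw)
    {
        if (raw is null) return null;
        try
        {
            return Convert.ToInt32(raw, CultureInfo.InvariantCulture);
        }
        catch (Exception)
        {
            // Caller maps null to BadInternalError (0x80020000).
            return null;
        }
    }

    // A non-null raw value that coerces to null is a coercion failure.
    public static uint StatusFor(object? raw, object? coerced) =>
        raw is not null && coerced is null ? BadInternalError : Good;
}
```

In this sketch a raw `null` is not a coercion failure, so it keeps Good status, which matches the `raw is not null && coerced is null` condition quoted in the description.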