6 Commits

Author SHA1 Message Date
Joseph Doherty 2580b5026f docs(code-reviews): regenerate index — 27 Low findings resolved
Batch 2 cleared Open findings in Core.ScriptedAlarms, Core.Scripting,
Core.VirtualTags, Admin, and Server (Core.ScriptedAlarms-009 documented
under Won't Fix per the recommendation).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 07:24:27 -04:00
Joseph Doherty 6134050ceb fix(server): resolve Low code-review findings (Server-004,006,008,012,014,015)
- Server-004: pass the role-derived display name to UserIdentity's base
  ctor (the SDK's DisplayName has no public setter) and drop the dead
  Display property; make RoleBasedIdentity internal sealed.
- Server-006: derive a bounded CancellationToken from the SDK's
  OperationContext.OperationDeadline in OnReadValue / OnWriteValue so a
  stalled driver call can no longer pin the request thread.
- Server-008: mark handled slots via CallMethodRequest.Processed = true
  in RouteScriptedAlarmMethodCalls (the SDK skips on Processed, not on a
  Good error slot).
- Server-012: PeerHttpProbeLoop.ProbeAsync stops mutating client.Timeout
  per call; uses a per-request CancellationTokenSource linked to the
  shutdown token instead.
- Server-014: wire SealedBootstrap into Program.cs via AddSealedBootstrap
  + OpcUaServerService so the generation-sealed cache + stale-config flag
  + resilient reader actually run; /healthz now reflects cache-fallback
  state.
- Server-015: replace the stale 'PR 16 / PR 17 minimum-viable scope'
  class summaries on OtOpcUaServer and OpcUaServerOptions with the
  shipped LDAP + anonymous-role + configurable security-profile prose.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 07:24:20 -04:00
Joseph Doherty 2b33b64a58 fix(admin): resolve Low code-review findings (Admin-010,011,012)
- Admin-010: vendor Bootstrap 5.3.3 (CSS + JS bundle + maps + provenance
  README) under wwwroot/lib/bootstrap and reference local paths from
  App.razor — Admin no longer pulls Bootstrap from jsDelivr.
- Admin-011: swap FleetStatusPoller's three plain dictionaries for
  ConcurrentDictionary so ResetCache can't race a poll tick.
- Admin-012: drop the EquipmentId column from EquipmentCsvImporter (per
  admin-ui.md — equipment id is system-derived from EquipmentUuid);
  EquipmentImportBatchService and the textarea placeholder updated to
  match.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 07:24:07 -04:00
Joseph Doherty 3f01a24b45 fix(core-virtual-tags): resolve Low code-review findings (Core.VirtualTags-004,006,007,009,010,011,013)
- Core.VirtualTags-004: CoerceResult now covers every scalar
  DriverDataType and throws on the default arm; Load rejects unsupported
  declared types.
- Core.VirtualTags-006: Subscribe/Unsub prune empty observer-list
  entries from _observers under the same lock with a reconfirm-on-add
  race guard.
- Core.VirtualTags-007: rewrote TimerTriggerScheduler so each TickGroup
  tracks an InFlight flag (Interlocked CAS); ticks that overlap a still-
  running tick for the same group are skipped + counted.
- Core.VirtualTags-009: DirectDependencies / DirectDependents return a
  shared static empty set on miss instead of allocating per call.
- Core.VirtualTags-010: corrected XML docs to reference the real engine
  symbols (OnUpstreamChange, CascadeAsync, etc.) instead of phantom types.
- Core.VirtualTags-011: Load now rejects scripts whose declared Writes
  target a non-registered virtual-tag path.
- Core.VirtualTags-013: DependencyCycleException renders SCC members as
  a set rather than a fabricated arrow-traversal edge path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 07:23:53 -04:00
Joseph Doherty 0a20de728d fix(core-scripting): resolve Low code-review findings (Core.Scripting-005,006,008,009,011)
- Core.Scripting-005: DependencyExtractor.HandleTagCall now recognises
  raw-string literal paths by checking the StringLiteralExpression node
  kind instead of the legacy StringLiteralToken kind.
- Core.Scripting-006: scope CompiledScriptCache failed-compile eviction
  with TryRemove(KeyValuePair) so a racing retry entry is not evicted.
- Core.Scripting-008: document the per-publish assembly accretion as an
  accepted limitation in docs/VirtualTags.md.
- Core.Scripting-009: enumerate the authoritative deny-list (namespace
  prefixes + type-granular denies) in the Phase 7 decision-#6 entry to
  match ForbiddenTypeAnalyzer.
- Core.Scripting-011: pin ScriptSandbox.Build, ScriptContext.Deadband
  boundary semantics, and end-to-end factory + companion-sink
  integration.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 07:23:42 -04:00
Joseph Doherty 99354bfaf2 fix(core-scripted-alarms): resolve Low code-review findings (Core.ScriptedAlarms-003,006,008,010,011; -009 documented)
- Core.ScriptedAlarms-003: emit OnEvent OUTSIDE _evalGate by collecting
  pending emissions during the gate-held section and flushing them after
  release; eliminates re-entrancy deadlock the docs already promised.
- Core.ScriptedAlarms-006: track every fire-and-forget Reevaluate /
  ShelvingCheck task in _inFlight; Dispose drains the set so the engine
  no longer races store writes against teardown.
- Core.ScriptedAlarms-008: store comments as ImmutableList<AlarmComment>
  so AppendComment is O(log n) instead of O(n).
- Core.ScriptedAlarms-010: document the deliberate input-quality
  asymmetry (Uncertain drives the predicate, renders {?} in the message)
  in docs/ScriptedAlarms.md and on MessageTemplate.Resolve remarks.
- Core.ScriptedAlarms-011: propagate the no-op reason through
  TransitionResult.NoOp(state, reason) and log it from
  ScriptedAlarmEngine.ApplyAsync.
- Core.ScriptedAlarms-009 (Won't Fix per recommendation): documented the
  per-evaluation dictionary allocation in docs/v2/Galaxy.Performance.md
  with a mitigation path if a future soak surfaces pressure.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 07:23:31 -04:00
59 changed files with 2339 additions and 234 deletions
+7 -7
View File
@@ -7,7 +7,7 @@
| Review date | 2026-05-22 |
| Commit reviewed | `76d35d1` |
| Status | Reviewed |
| Open findings | 3 |
| Open findings | 0 |
## Checklist coverage
@@ -168,13 +168,13 @@
| Severity | Low |
| Category | OtOpcUa conventions |
| Location | `Components/App.razor:9,16` |
| Status | Open |
| Status | Resolved |
**Description:** `App.razor` loads Bootstrap CSS and JS from the `cdn.jsdelivr.net` CDN. `admin-ui.md` section "Tech Stack" specifies "Bootstrap 5 vendored under `wwwroot/lib/bootstrap/`" precisely so the Admin app has no third-party runtime dependency. A CDN reference makes the UI fail in air-gapped / locked-down fleet deployments (a stated deployment target), introduces an uncontrolled third-party origin, and is not covered by a Subresource Integrity hash.
**Recommendation:** Vendor Bootstrap under `wwwroot/lib/bootstrap/` and reference the local copies, as the design doc requires. If a CDN is retained for any asset, add `integrity` + `crossorigin` SRI attributes.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — Bootstrap 5.3.3 (CSS + JS bundle, plus their source maps) vendored under `src/Server/ZB.MOM.WW.OtOpcUa.Admin/wwwroot/lib/bootstrap/{css,js}/`; `App.razor` now references the local copies (`lib/bootstrap/css/bootstrap.min.css`, `lib/bootstrap/js/bootstrap.bundle.min.js`); a README under the vendor directory records provenance + upgrade steps. Covered by `BootstrapVendoringTests` (asserts no `cdn.jsdelivr.net`/`cdnjs`/`unpkg` references in `App.razor`, that the vendored files exist with non-trivial sizes, and that `App.razor` references the vendored paths) — verified failing pre-fix, passing post-fix.
### Admin-011
@@ -183,13 +183,13 @@
| Severity | Low |
| Category | Concurrency & thread safety |
| Location | `Hubs/FleetStatusPoller.cs:24-26,98-103` |
| Status | Open |
| Status | Resolved |
**Description:** `FleetStatusPoller` keeps three plain `Dictionary<>` fields (`_last`, `_lastRole`, `_lastResilience`) mutated from `PollOnceAsync`. The poller `ExecuteAsync` loop is single-threaded so the steady-state poll path is safe, but `ResetCache()` (exposed `internal` for tests) clears those same dictionaries with no synchronization. If a test (or any caller) invokes `ResetCache()` while a poll tick is mid-iteration, the `Dictionary` enumeration/mutation race can throw `InvalidOperationException` or corrupt state.
**Recommendation:** Either document `ResetCache()` as "only safe when the poller is stopped" and have tests stop the service first, or guard the three dictionaries with a lock / swap them atomically. Using `ConcurrentDictionary` (as the sibling `ResilientLdapGroupRoleMappingService` does) would make the intent explicit.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — `_last`, `_lastRole`, and `_lastResilience` swapped from plain `Dictionary<,>` to `ConcurrentDictionary<,>` so concurrent `ResetCache()` / poll-tick mutations are safe by construction (the recommendation's "explicit intent" form). Covered by `FleetStatusPollerConcurrencyTests` — one test guards the structural choice via reflection so a future refactor cannot silently revert; the other stress-runs concurrent mutate + `ResetCache()` via reflection, verifying the race throws no exception (verified failing pre-fix with `Dictionary<,>`).
### Admin-012
@@ -198,13 +198,13 @@
| Severity | Low |
| Category | Design-document adherence |
| Location | `Services/EquipmentCsvImporter.cs:18-19,33-37,229,232` |
| Status | Open |
| Status | Resolved |
**Description:** `EquipmentCsvImporter` declares `EquipmentId` as a required CSV column and parses it into a `required` field. `admin-ui.md` section "Equipment CSV import" (revised after adversarial review finding #4) is explicit: "No `EquipmentId` column — operator-supplied EquipmentId would mint duplicate equipment identity on typos ... never accepted from CSV imports." `EquipmentId` is system-derived (`EQ-` plus first 12 hex chars of `EquipmentUuid`). Accepting it from CSV either contradicts the design or silently lets an import set an identity field the doc says is un-settable. The XML doc on the class also cites the column as required per "decision #117", so either the code or the design doc is stale. `EquipmentImportBatchService.StageRowsAsync` propagates `row.EquipmentId` into the staging row, so any change must cover the finalize path.
**Recommendation:** Reconcile with the design: drop `EquipmentId` from `RequiredColumns` and the `EquipmentCsvRow` shape (deriving it from `EquipmentUuid` at finalize time), or — if accepting it is a deliberate reversal — update `admin-ui.md` and the decision log so the two agree.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — code reconciled with the design: `EquipmentId` dropped from `EquipmentCsvImporter.RequiredColumns`, `BuildRow`, `GetCell`, and the `EquipmentCsvRow` shape; the class XML doc now records the admin-ui.md "No EquipmentId column" rule. The finalize path is covered: `EquipmentImportBatchService.StageRowsAsync` now derives the staging-row's `EquipmentId` via `DraftValidator.DeriveEquipmentId(equipmentUuid)`, and `FinaliseBatchAsync` re-derives it from the UUID that actually lands in the `Equipment` row (so a blank/invalid staged UUID that gets replaced by `Guid.NewGuid()` no longer leaves `EquipmentId` and `EquipmentUuid` out of sync). `ImportEquipment.razor`'s textarea placeholder updated to the new header shape. Covered by `EquipmentCsvNoEquipmentIdColumnTests` (five tests guarding `RequiredColumns`/`OptionalColumns`/`EquipmentCsvRow` shape and asserting CSVs with an `EquipmentId` column are rejected as unknown while CSVs without are accepted) — verified failing pre-fix, passing post-fix. The existing `EquipmentCsvImporterTests` + `EquipmentImportBatchServiceTests` were updated to the new header shape and pass green (DB-backed suite ran against `10.100.0.35,14330`).
### Admin-013
+13 -13
View File
@@ -7,7 +7,7 @@
| Review date | 2026-05-22 |
| Commit reviewed | `76d35d1` |
| Status | Reviewed |
| Open findings | 6 |
| Open findings | 0 |
## Checklist coverage
@@ -66,13 +66,13 @@ a category produced nothing rather than leaving it blank.
| Severity | Low |
| Category | Documentation & comments |
| Location | `ScriptedAlarmEngine.cs:343`, `docs/ScriptedAlarms.md:107` |
| Status | Open |
| Status | Resolved |
**Description:** `docs/ScriptedAlarms.md` (Composition step 3) and the `OnUpstreamChange` comment ("Fire-and-forget so driver-side dispatch isn't blocked", line 225-226) describe the `OnEvent` emission path as non-blocking / fire-and-forget. In the code, `EmitEvent` invokes `OnEvent?.Invoke(this, evt)` **synchronously while `_evalGate` is held** (called from `EvaluatePredicateToStateAsync` line 305 and `ApplyAsync` line 217, both inside the gate). A slow subscriber blocks the single evaluation gate for all alarms; a subscriber that re-enters the engine (e.g. calls `AcknowledgeAsync`) deadlocks because `_evalGate` is a non-reentrant `SemaphoreSlim(1,1)`. The behaviour is defensible (the historian sink is non-blocking, per the doc), but the comments/doc are misleading about where the work happens and the re-entrancy hazard is undocumented.
**Recommendation:** Either move `EmitEvent` outside the `_evalGate` critical section (collect emissions during the locked section and raise them after `Release()`), or document explicitly on `OnEvent` that handlers run under the engine lock, must be fast, and must never call back into the engine.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — split `EmitEvent` into `BuildEmission` (called under the gate to capture a coherent value-cache snapshot for message-template resolution) and `FireEvent` (called after `_evalGate.Release()` so subscribers can re-enter the engine without deadlocking and a slow subscriber no longer blocks concurrent engine operations). Updated `ApplyAsync`, `ReevaluateAsync`, `ShelvingCheckAsync`, and `LoadAsync` (startup-recovery path) to collect emissions in a pending list and flush after the gate is released; added regression tests for both the re-entry path and a white-box gate-acquirable-from-subscriber check.
### Core.ScriptedAlarms-004
@@ -111,13 +111,13 @@ a category produced nothing rather than leaving it blank.
| Severity | Low |
| Category | Concurrency & thread safety |
| Location | `ScriptedAlarmEngine.cs:232`, `ScriptedAlarmEngine.cs:369` |
| Status | Open |
| Status | Resolved |
**Description:** `OnUpstreamChange` and `RunShelvingCheck` both launch fire-and-forget tasks (`_ = ReevaluateAsync(...)`, `_ = ShelvingCheckAsync(...)`) with `CancellationToken.None`. There is no tracking of these in-flight tasks, so `Dispose` cannot await them and a server shutdown can race a still-running re-evaluation that writes to the (possibly disposed) store. Combined with finding 005, an upstream push arriving during shutdown produces an unobserved background task touching torn state.
**Recommendation:** Track outstanding background tasks (or use a single serialised worker / `Channel`), and link them to a `CancellationTokenSource` that `Dispose` cancels and drains. At minimum, await the in-flight work in `Dispose`.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — added `_inFlight` HashSet + `TrackBackgroundTask(...)` helper to register every fire-and-forget `ReevaluateAsync`/`ShelvingCheckAsync` task, with a sync `ContinueWith` continuation that auto-removes on completion. `Dispose` snapshots the set under its own lock and `Task.WhenAll(...).GetAwaiter().GetResult()` drains them before returning; `OnUpstreamChange` also short-circuits when `_disposed` is set so no new work is queued during shutdown. Regression test exercises the slow-store path: Dispose blocks until the in-flight `SaveAsync` completes.
### Core.ScriptedAlarms-007
@@ -141,13 +141,13 @@ a category produced nothing rather than leaving it blank.
| Severity | Low |
| Category | Performance & resource management |
| Location | `Part9StateMachine.cs:261-268` |
| Status | Open |
| Status | Resolved |
**Description:** `AppendComment` copies the entire existing comment list into a new `List` on every audit-producing transition (ack, confirm, shelve, unshelve, enable, disable, add-comment, auto-unshelve). The `Comments` list is append-only and unbounded — for a long-lived alarm that is acknowledged/commented hundreds of times, every transition is an O(n) copy and the full history is also re-serialised to the store on every `SaveAsync`. Over a multi-month uptime this is a slowly growing per-transition cost.
**Recommendation:** Acceptable for now given audit requirements, but consider an immutable persistent list / `ImmutableList<AlarmComment>` to make append O(log n), or have the store persist comments incrementally (append-only audit table) rather than rewriting the whole collection each save. At minimum, note the unbounded-growth characteristic in the design doc.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — switched `AlarmConditionState.Comments` from `IReadOnlyList<AlarmComment>` to `ImmutableList<AlarmComment>` and rewrote `AppendComment` as `existing.Add(...)` so each append is O(log n) instead of the prior O(n) copy. `ImmutableList<T>` still implements `IReadOnlyList<T>` so existing consumers compile unchanged; the persistence layer continues to store comments as JSON so wire-format is unaffected. Regression test asserts the runtime type is `ImmutableList<AlarmComment>`.
### Core.ScriptedAlarms-009
@@ -156,13 +156,13 @@ a category produced nothing rather than leaving it blank.
| Severity | Low |
| Category | Performance & resource management |
| Location | `ScriptedAlarmEngine.cs:309-315`, `ScriptedAlarmEngine.cs:271` |
| Status | Open |
| Status | Won't Fix |
**Description:** `BuildReadCache` allocates a fresh `Dictionary<string, DataValueSnapshot>` on every predicate evaluation, i.e. on every upstream tag change for every referencing alarm. On a busy line where many tags feeding many alarms change frequently, this is a steady stream of short-lived dictionary allocations on the hot path. `AlarmPredicateContext` is also newly constructed each evaluation (line 281).
**Recommendation:** Minor. If the evaluation path shows up in allocation profiling, the read cache could be a reused per-alarm buffer cleared between evaluations (evaluations are already serialised under `_evalGate`, so a single shared scratch dictionary is safe). Not worth doing speculatively — flag for the perf surface in `docs/v2/Galaxy.Performance.md` if alarm evaluation is ever soak-tested.
**Resolution:** _(open)_
**Resolution:** Won't Fix 2026-05-23 — per the recommendation, no code change. Documented the known allocation characteristic in `docs/v2/Galaxy.Performance.md` (new "Scripted-alarm engine — known hot-path allocations" section) so a future soak that surfaces pressure has a noted mitigation (reused per-alarm scratch buffer) and we don't re-find this in a later review.
### Core.ScriptedAlarms-010
@@ -171,13 +171,13 @@ a category produced nothing rather than leaving it blank.
| Severity | Low |
| Category | Design-document adherence |
| Location | `ScriptedAlarmEngine.cs:325-336`, `AlarmPredicateContext.cs:33-40`, `MessageTemplate.cs:47` |
| Status | Open |
| Status | Resolved |
**Description:** Quality handling is inconsistent across the three places that inspect a `DataValueSnapshot.StatusCode`. `AreInputsReady` (engine, line 333) treats only outright Bad (bit 31) as not-ready, so an Uncertain-quality input is fed to the predicate. `MessageTemplate.Resolve` (line 47) rejects *any* non-zero status code — including Uncertain — and renders `{?}`. `AlarmPredicateContext.GetTag` returns `BadNodeIdUnknown` (`0x80340000`) for a missing path. The net effect: an Uncertain-quality tag is considered good enough to drive an alarm *activation* decision but not good enough to print in the alarm *message*. `docs/ScriptedAlarms.md` ("Fallback rules") only documents the message-template behaviour and does not mention that predicate evaluation accepts Uncertain. The two policies should be reconciled and documented.
**Recommendation:** Decide one quality policy for "is this input usable" and apply it in both `AreInputsReady` and the message resolver, or explicitly document why predicate evaluation and message rendering treat Uncertain differently. Add the predicate-side Uncertain rule to `docs/ScriptedAlarms.md`.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — documented the deliberate asymmetry. Added an "Input-quality policy" section to `docs/ScriptedAlarms.md` (table contrasting `AreInputsReady`'s Bad-only rejection with `MessageTemplate.Resolve`'s Good-only acceptance, plus the rationale) and a cross-referencing remarks block on `MessageTemplate.Resolve`. The two policies are kept distinct on purpose: predicate evaluation accepts Uncertain because the value is still inspectable, while the operator-facing message must render `{?}` to make the qualifier visible. Regression test locks in both behaviours with a single Uncertain-quality input that activates the alarm and surfaces `{?}` in the emission message.
### Core.ScriptedAlarms-011
@@ -186,13 +186,13 @@ a category produced nothing rather than leaving it blank.
| Severity | Low |
| Category | Code organization & conventions |
| Location | `Part9StateMachine.cs:275` |
| Status | Open |
| Status | Resolved |
**Description:** `TransitionResult.NoOp(state, reason)` takes a `reason` string parameter that is documented in the calling code as a diagnostic ("disabled — predicate result ignored", "already acknowledged", etc.) but the factory method silently discards it — it just returns `new(state, EmissionKind.None)`, identical to `None(state)`. Every call site that passes a carefully-worded reason string is doing dead work, and the comments in `Part9StateMachine` and the class-level remarks claim disabled/no-op transitions "produce ... a diagnostic log line", which they do not.
**Recommendation:** Either propagate the reason (add it to `TransitionResult` and have the engine log it at debug level when emission is `None` for a no-op), or remove the unused `reason` parameter and collapse `NoOp` into `None`. Update the `Part9StateMachine` remarks that promise a diagnostic log line.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — added a nullable `NoOpReason` property to `TransitionResult` (defaulted on the primary constructor so existing positional `new TransitionResult(state, kind)` call sites remain valid) and propagated it from `TransitionResult.NoOp(state, reason)`. `ScriptedAlarmEngine.ApplyAsync` now logs the reason at debug level via the alarm's script logger when the transition is a no-op, fulfilling the class-level remarks. Two regression tests assert that `NoOp` carries the reason and `None` does not.
### Core.ScriptedAlarms-012
+10 -10
View File
@@ -7,7 +7,7 @@
| Review date | 2026-05-22 |
| Commit reviewed | `76d35d1` |
| Status | Reviewed |
| Open findings | 5 |
| Open findings | 0 |
## Checklist coverage
@@ -169,7 +169,7 @@ member-access call to a non-ctx `GetTag` is untested and would be misattributed.
| Severity | Low |
| Category | Correctness & logic bugs |
| Location | `DependencyExtractor.cs:97` |
| Status | Open |
| Status | Resolved |
**Description:** A raw string literal token passed as the tag path (a raw triple-quote
literal) tokenizes as `SingleLineRawStringLiteralToken` /
@@ -183,7 +183,7 @@ paths) but the error text would confuse anyone who does.
`literal.IsKind(SyntaxKind.StringLiteralExpression)` on the expression node, or include
the raw-string token kinds, so a static raw string is harvested rather than rejected.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — `HandleTagCall` now checks `literal.IsKind(SyntaxKind.StringLiteralExpression)` on the expression node, which covers regular string literals, single-line raw strings, and multi-line raw strings uniformly. Regression tests `Accepts_single_line_raw_string_literal_path` and `Accepts_multi_line_raw_string_literal_path` added to `DependencyExtractorTests`.
### Core.Scripting-006
@@ -192,7 +192,7 @@ the raw-string token kinds, so a static raw string is harvested rather than reje
| Severity | Low |
| Category | Concurrency & thread safety |
| Location | `CompiledScriptCache.cs:55` |
| Status | Open |
| Status | Resolved |
**Description:** On a failed compile the `catch` block calls
`_cache.TryRemove(key, out _)` without a value comparison. If two threads race a miss for
@@ -206,7 +206,7 @@ but the removal should be key+value scoped for correctness.
remove only the specific faulted `Lazy` instance, so a concurrently re-added entry is not
evicted.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — `GetOrCompile`'s catch block now evicts via `_cache.TryRemove(new KeyValuePair<string, Lazy<…>>(key, lazy))`, comparing the value reference so only the faulted Lazy is removed; a concurrent retry that re-inserted a fresh Lazy under the same key is preserved. Regression test `Failed_compile_eviction_does_not_remove_a_concurrent_retry_entry` added to `CompiledScriptCacheTests` (reflection-driven deterministic race: the faulted Lazy's factory swaps the dictionary entry to a fresh Lazy as a side effect of its throw, modelling the precise race window).
### Core.Scripting-007
@@ -257,7 +257,7 @@ compile scripts into a collectible `AssemblyLoadContext` so `Clear()` can unload
generations. At minimum add a note to `docs/ScriptedAlarms.md` so operators with
high-publish-frequency deployments are aware.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — accepted as a documented known limitation rather than fixing in code (collectible `AssemblyLoadContext` for Roslyn-emitted assemblies is a v3 concern). The "Compile cache" section of `docs/VirtualTags.md` now carries a "Per-publish assembly accretion (accepted limitation, Core.Scripting-008)" note that operators with high-publish-frequency deployments can scan, and `docs/ScriptedAlarms.md` cross-references it. The accretion is benign at the expected "low thousands" of scripts scale; recommended mitigation is a scheduled server restart for deployments that publish very frequently.
### Core.Scripting-009
@@ -266,7 +266,7 @@ high-publish-frequency deployments are aware.
| Severity | Low |
| Category | Design-document adherence |
| Location | `ForbiddenTypeAnalyzer.cs:45` |
| Status | Open |
| Status | Resolved |
**Description:** The Phase 7 plan decision #6
(`docs/v2/implementation/phase-7-scripting-and-alarming.md`) enumerates the forbidden
@@ -283,7 +283,7 @@ authoritative deny-list exactly as `ForbiddenTypeAnalyzer.ForbiddenNamespacePref
defines it, including the `System.Environment` allowed-compromise, so the docs match the
code.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — `docs/v2/implementation/phase-7-scripting-and-alarming.md` decision #6 row + the "Sandbox escape" compliance-check row now enumerate the authoritative deny-list exactly as `ForbiddenTypeAnalyzer` defines it (namespace-prefix denies for `System.IO` / `System.Net` / `System.Diagnostics` / `System.Reflection` / `System.Threading.Tasks` / `System.Runtime.InteropServices` / `Microsoft.Win32`; type-granular denies for `System.Environment` / `System.AppDomain` / `System.GC` / `System.Activator` / `System.Threading.Thread`), and the compliance-check row lists the syntactic vectors (`typeof` / generic arg / cast / `is`/`as` / `default(T)` / array element / declared local) the broadened analyzer covers. `docs/VirtualTags.md` already documents the same list and is unchanged.
### Core.Scripting-010
@@ -318,7 +318,7 @@ assert a `ScriptSandboxViolationException` (or `CompilationErrorException`) at c
| Severity | Low |
| Category | Testing coverage |
| Location | `tests/Core/ZB.MOM.WW.OtOpcUa.Core.Scripting.Tests/` |
| Status | Open |
| Status | Resolved |
**Description:** Two source files have no direct test coverage: `ScriptContext`
(`Deadband` static helper is exercised only indirectly through `ScriptSandboxTests`, and
@@ -335,4 +335,4 @@ unverified.
a script logging at Error level produces both a `scripts-*.log` event and a companion
Warning event.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — added three new test files: `ScriptSandboxBuildTests` covers the `Build` null / non-`ScriptContext` / base-class / concrete-subclass paths; `ScriptContextTests` locks `Deadband` boundary semantics (equal-to-tolerance returns false; just-over returns true; symmetric in direction; zero-tolerance returns true only on non-equal; negative tolerance trips on any non-equal); the new `Factory_plus_companion_sink_integration_surfaces_script_error_in_both_logs` test in `ScriptLogCompanionSinkTests` wires `ScriptLoggerFactory` + the companion sink together end-to-end and asserts an Error emission lands in both the scripts sink (at Error) and the main sink (at Warning), each tagged with `ScriptName`. Suite now 101 green (was 85 before).
+15 -15
View File
@@ -7,7 +7,7 @@
| Review date | 2026-05-22 |
| Commit reviewed | `76d35d1` |
| Status | Reviewed |
| Open findings | 7 |
| Open findings | 0 |
## Checklist coverage
@@ -124,7 +124,7 @@ collection is keyed off the registered set, not the raw input list.
| Severity | Low |
| Category | Correctness & logic bugs |
| Location | `src/Core/ZB.MOM.WW.OtOpcUa.Core.VirtualTags/VirtualTagEngine.cs:349` |
| Status | Open |
| Status | Resolved |
**Description:** `CoerceResult`'s switch has a default arm (`_ => raw`) that returns the
script's raw return value uncoerced for any `DriverDataType` not in the explicit list
@@ -139,7 +139,7 @@ the outer pipeline maps to BadInternalError) for an unsupported `DriverDataType`
document precisely which `DriverDataType` values `CoerceResult` supports and validate at
`Load` time that no definition declares an unsupported type.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — extended `CoerceResult` to cover every scalar `DriverDataType` (`Int16`, `UInt16`, `UInt32`, `UInt64` added); the default arm now throws (mapped to `BadInternalError`) instead of returning the uncoerced raw value, and a new `IsSupportedDataType` validation in `Load` rejects definitions declaring an unsupported type (currently `Reference`) so the typo is caught at publish time. Added regression tests for both Int16/UInt16/UInt32/UInt64 round-trip and the publish-time rejection.
### Core.VirtualTags-005
@@ -172,7 +172,7 @@ delivered before any subsequent change for that path.
| Severity | Low |
| Category | Concurrency & thread safety |
| Location | `src/Core/ZB.MOM.WW.OtOpcUa.Core.VirtualTags/VirtualTagEngine.cs:177-182`, `:395-401` |
| Status | Open |
| Status | Resolved |
**Description:** `Subscribe` does `_observers.GetOrAdd(path, _ => [])` then
`lock (list) { list.Add(observer); }`. When `Unsub.Dispose` removes the last observer,
@@ -188,7 +188,7 @@ but it makes any future "prune empty entries" logic racy.
lock, re-checking emptiness inside the lock to avoid dropping a concurrently-added
observer.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — `Unsub.Dispose` now removes the dictionary entry under the same lock when the observer list becomes empty, using the `ICollection<KeyValuePair>.Remove(key,value)` overload so a racing Subscribe's brand-new list is not collateral damage. `Subscribe` retries via the GetOrAdd / lock-and-reconfirm pattern so it cannot deposit an observer into a list that has already been pruned. Added a regression test that subscribes twice + disposes both and asserts the dictionary entry is gone.
### Core.VirtualTags-007
@@ -197,7 +197,7 @@ observer.
| Severity | Low |
| Category | Error handling & resilience |
| Location | `src/Core/ZB.MOM.WW.OtOpcUa.Core.VirtualTags/TimerTriggerScheduler.cs:58` |
| Status | Open |
| Status | Resolved |
**Description:** `Tick` calls
`_engine.EvaluateOneAsync(p, _cts.Token).GetAwaiter().GetResult()`, blocking the
@@ -214,7 +214,7 @@ if the previous one for that group is still running (a per-group "in flight" fla
rather than blocking synchronously. At minimum, document the blocking behaviour and the
expected upper bound on group evaluation time relative to the interval.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — rewrote `TimerTriggerScheduler` to use a per-`TickGroup` `InFlight` flag (`Interlocked.CompareExchange`-guarded). The timer callback no longer blocks on `GetAwaiter().GetResult()`; instead it kicks off an async `RunTickAsync` and skips the tick (incrementing the new `SkippedTickCount` diagnostic counter) when the prior tick for that group is still running. Added a regression test that runs a 250ms evaluation against a 50ms cadence and asserts `SkippedTickCount > 2`.
### Core.VirtualTags-008
@@ -246,7 +246,7 @@ O(V+E) cost into an O(closure) cost.
| Severity | Low |
| Category | Performance & resource management |
| Location | `src/Core/ZB.MOM.WW.OtOpcUa.Core.VirtualTags/DependencyGraph.cs:64-65`, `:72-73` |
| Status | Open |
| Status | Resolved |
**Description:** `DirectDependencies` and `DirectDependents` allocate a fresh empty
`HashSet<string>` on every call for an unregistered node. `DirectDependents` is called
@@ -257,7 +257,7 @@ on the change-cascade path.
**Recommendation:** Return a shared static empty set for the miss case instead of
allocating each time.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — `DependencyGraph` now exposes a shared static `EmptySet` instance and `DirectDependencies` / `DirectDependents` return it on a miss instead of allocating a fresh `HashSet<string>` every call. Added regression tests asserting `ReferenceEquals` across two miss calls.
### Core.VirtualTags-010
@@ -266,7 +266,7 @@ allocating each time.
| Severity | Low |
| Category | Documentation & comments |
| Location | `src/Core/ZB.MOM.WW.OtOpcUa.Core.VirtualTags/ITagUpstreamSource.cs:18`, `VirtualTagContext.cs:30`, `VirtualTagDefinition.cs:28` |
| Status | Open |
| Status | Resolved |
**Description:** Several XML docs reference component names that do not exist in the
codebase. `ITagUpstreamSource` XML doc says the subscription path "feeds the engine's
@@ -280,7 +280,7 @@ XML docs mislead maintainers searching for the named component.
`CascadeAsync`, `EvaluateInternalAsync`) or drop the specific name in favour of a
behavioural description.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — replaced the stale type names: `ITagUpstreamSource` now references `VirtualTagEngine.OnUpstreamChange` + `CascadeAsync`; `VirtualTagContext` references `VirtualTagEngine.OnScriptSetVirtualTag` + `CascadeAsync`; `VirtualTagDefinition.TimerInterval` references `VirtualTagEngine.EvaluateInternalAsync`.
### Core.VirtualTags-011
@@ -289,7 +289,7 @@ behavioural description.
| Severity | Low |
| Category | Code organization & conventions |
| Location | `src/Core/ZB.MOM.WW.OtOpcUa.Core.VirtualTags/VirtualTagEngine.cs:404-409` |
| Status | Open |
| Status | Resolved |
**Description:** `VirtualTagState` records a Writes set (the `ctx.SetVirtualTag` targets
extracted by `DependencyExtractor`), but nothing in the engine reads it -- it is captured
@@ -305,7 +305,7 @@ miss), so an operator typo is caught at publish rather than silently dropped at
If validation is deliberately deferred, remove the unused field or comment why it is
retained.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — `Load` now iterates every registered tag's `Writes` set and adds a `compileFailures` entry for any write target that does not resolve to a registered virtual tag. Updated the pre-existing Core.VirtualTags-012 "warning on non-registered path" test to assert publish-time rejection (the runtime warning branch remains as a defensive guard but the static `DependencyExtractor` enforces literal-string paths, so it is unreachable for any operator-authored script). Added a positive companion test confirming a write to a registered path still loads cleanly.
### Core.VirtualTags-012
@@ -342,7 +342,7 @@ correspond to open correctness findings and would have caught them.
| Severity | Low |
| Category | Documentation & comments |
| Location | `src/Core/ZB.MOM.WW.OtOpcUa.Core.VirtualTags/DependencyGraph.cs:266-270` |
| Status | Open |
| Status | Resolved |
**Description:** `DependencyCycleException.BuildMessage` renders each cycle as
`string.Join(" -> ", c) + " -> " + c[0]`, presenting the SCC member list as a traversable
@@ -356,4 +356,4 @@ into looking for an edge that is not in their config.
path) rather than rendering arrows, or reconstruct an actual cycle path within the SCC
(a single DFS back-edge walk) before formatting.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — `DependencyCycleException.BuildMessage` now formats each cycle as `cycle members: A, B, C` (comma-separated set) rather than the misleading `A -> B -> C -> A` arrow form. Added a regression test asserting the message contains the word "member" and does not fabricate an edge sequence.
+31 -31
View File
@@ -10,7 +10,7 @@ Each module's `findings.md` is the source of truth; this file is generated from
| Module | Reviewer | Date | Commit | Status | Open | Total |
|---|---|---|---|---|---|---|
| [Admin](Admin/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 3 | 13 |
| [Admin](Admin/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 0 | 13 |
| [Analyzers](Analyzers/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 0 | 7 |
| [Client.CLI](Client.CLI/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 8 | 10 |
| [Client.Shared](Client.Shared/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 5 | 11 |
@@ -19,9 +19,9 @@ Each module's `findings.md` is the source of truth; this file is generated from
| [Core](Core/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 0 | 12 |
| [Core.Abstractions](Core.Abstractions/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 0 | 8 |
| [Core.AlarmHistorian](Core.AlarmHistorian/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 0 | 11 |
| [Core.ScriptedAlarms](Core.ScriptedAlarms/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 6 | 12 |
| [Core.Scripting](Core.Scripting/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 5 | 11 |
| [Core.VirtualTags](Core.VirtualTags/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 7 | 13 |
| [Core.ScriptedAlarms](Core.ScriptedAlarms/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 0 | 12 |
| [Core.Scripting](Core.Scripting/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 1 | 11 |
| [Core.VirtualTags](Core.VirtualTags/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 0 | 13 |
| [Driver.AbCip](Driver.AbCip/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 5 | 15 |
| [Driver.AbCip.Cli](Driver.AbCip.Cli/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 6 | 8 |
| [Driver.AbLegacy](Driver.AbLegacy/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 3 | 13 |
@@ -40,7 +40,7 @@ Each module's `findings.md` is the source of truth; this file is generated from
| [Driver.S7.Cli](Driver.S7.Cli/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 4 | 7 |
| [Driver.TwinCAT](Driver.TwinCAT/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 5 | 16 |
| [Driver.TwinCAT.Cli](Driver.TwinCAT.Cli/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 7 | 7 |
| [Server](Server/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 6 | 15 |
| [Server](Server/findings.md) | Claude Code | 2026-05-22 | `76d35d1` | Reviewed | 0 | 15 |
## Pending findings
@@ -48,9 +48,6 @@ Findings with status `Open` or `In Progress`, ordered by severity.
| ID | Severity | Category | Location | Description |
|---|---|---|---|---|
| Admin-010 | Low | OtOpcUa conventions | `Components/App.razor:9,16` | `App.razor` loads Bootstrap CSS and JS from the `cdn.jsdelivr.net` CDN. `admin-ui.md` section "Tech Stack" specifies "Bootstrap 5 vendored under `wwwroot/lib/bootstrap/`" precisely so the Admin app has no third-party runtime dependency. A… |
| Admin-011 | Low | Concurrency & thread safety | `Hubs/FleetStatusPoller.cs:24-26,98-103` | `FleetStatusPoller` keeps three plain `Dictionary<>` fields (`_last`, `_lastRole`, `_lastResilience`) mutated from `PollOnceAsync`. The poller `ExecuteAsync` loop is single-threaded so the steady-state poll path is safe, but `ResetCache()`… |
| Admin-012 | Low | Design-document adherence | `Services/EquipmentCsvImporter.cs:18-19,33-37,229,232` | `EquipmentCsvImporter` declares `EquipmentId` as a required CSV column and parses it into a `required` field. `admin-ui.md` section "Equipment CSV import" (revised after adversarial review finding #4) is explicit: "No `EquipmentId` column… |
| Client.CLI-002 | Low | Correctness & logic bugs | `Commands/SubscribeCommand.cs:129-137` | The summary computes `neverWentBad` as every target whose node-id key is absent from the `everBad` dictionary. A node that received no update at all is also absent from `everBad`, so it is counted in `neverWentBad` and printed under the he… |
| Client.CLI-003 | Low | Correctness & logic bugs | `Commands/BrowseCommand.cs:29-30`, `Commands/SubscribeCommand.cs:20-27`, `Commands/AlarmsCommand.cs:28-29`, `Commands/HistoryReadCommand.cs:42-43` | Numeric command options accept any value with no range validation. `--depth`, `--interval`, `--max-depth`, `--max`, and the history `--interval` can all be supplied as `0` or a negative number. A negative `--depth`/`--max-depth` silently d… |
| Client.CLI-004 | Low | OtOpcUa conventions | `Commands/SubscribeCommand.cs:13-37` | `SubscribeCommand` is the only command in the module whose constructor and all `[CommandOption]` properties have no XML doc comments. Every other command (`ConnectCommand`, `ReadCommand`, `WriteCommand`, `BrowseCommand`, `AlarmsCommand`, `… |
@@ -70,24 +67,7 @@ Findings with status `Open` or `In Progress`, ordered by severity.
| Client.UI-009 | Low | Design-document adherence | `ViewModels/HistoryViewModel.cs:44-54` | `HistoryViewModel.AggregateTypes` exposes eight entries: `null` (Raw) plus Average, Minimum, Maximum, Count, Start, End, and `StandardDeviation`. `docs/Client.UI.md` ("Query Options" table) lists only "Raw (default), Average, Minimum, Maxi… |
| Client.UI-010 | Low | Code organization & conventions | `Controls/DateTimeRangePicker.axaml.cs:33-37`, `Controls/DateTimeRangePicker.axaml.cs:70-80` | `DateTimeRangePicker` declares `MinDateTimeProperty` / `MaxDateTimeProperty` styled properties with public CLR accessors, but neither is read anywhere in the control. `TryParseDateTime`, `OnStartLostFocus`, and `OnEndLostFocus` never clamp… |
| Client.UI-011 | Low | Documentation & comments | `Views/MainWindow.axaml:81`, `Services/JsonSettingsService.cs:11-15` | The certificate-store-path `TextBox` watermark reads `(default: AppData/LmxOpcUaClient/pki)`, referencing the legacy pre-task-#208 folder name. Per `CLAUDE.md` / `docs/Client.UI.md` the canonical path is now `{LocalAppData}/OtOpcUaClient/`… |
| Core.ScriptedAlarms-003 | Low | Documentation & comments | `ScriptedAlarmEngine.cs:343`, `docs/ScriptedAlarms.md:107` | `docs/ScriptedAlarms.md` (Composition step 3) and the `OnUpstreamChange` comment ("Fire-and-forget so driver-side dispatch isn't blocked", line 225-226) describe the `OnEvent` emission path as non-blocking / fire-and-forget. In the code, `… |
| Core.ScriptedAlarms-006 | Low | Concurrency & thread safety | `ScriptedAlarmEngine.cs:232`, `ScriptedAlarmEngine.cs:369` | `OnUpstreamChange` and `RunShelvingCheck` both launch fire-and-forget tasks (`_ = ReevaluateAsync(...)`, `_ = ShelvingCheckAsync(...)`) with `CancellationToken.None`. There is no tracking of these in-flight tasks, so `Dispose` cannot await… |
| Core.ScriptedAlarms-008 | Low | Performance & resource management | `Part9StateMachine.cs:261-268` | `AppendComment` copies the entire existing comment list into a new `List` on every audit-producing transition (ack, confirm, shelve, unshelve, enable, disable, add-comment, auto-unshelve). The `Comments` list is append-only and unbounded —… |
| Core.ScriptedAlarms-009 | Low | Performance & resource management | `ScriptedAlarmEngine.cs:309-315`, `ScriptedAlarmEngine.cs:271` | `BuildReadCache` allocates a fresh `Dictionary<string, DataValueSnapshot>` on every predicate evaluation, i.e. on every upstream tag change for every referencing alarm. On a busy line where many tags feeding many alarms change frequently,… |
| Core.ScriptedAlarms-010 | Low | Design-document adherence | `ScriptedAlarmEngine.cs:325-336`, `AlarmPredicateContext.cs:33-40`, `MessageTemplate.cs:47` | Quality handling is inconsistent across the three places that inspect a `DataValueSnapshot.StatusCode`. `AreInputsReady` (engine, line 333) treats only outright Bad (bit 31) as not-ready, so an Uncertain-quality input is fed to the predica… |
| Core.ScriptedAlarms-011 | Low | Code organization & conventions | `Part9StateMachine.cs:275` | `TransitionResult.NoOp(state, reason)` takes a `reason` string parameter that is documented in the calling code as a diagnostic ("disabled — predicate result ignored", "already acknowledged", etc.) but the factory method silently discards… |
| Core.Scripting-005 | Low | Correctness & logic bugs | `DependencyExtractor.cs:97` | A raw string literal token passed as the tag path (a raw triple-quote literal) tokenizes as `SingleLineRawStringLiteralToken` / `MultiLineRawStringLiteralToken`, not `StringLiteralToken`. The check `literal.Token.IsKind(SyntaxKind.StringLi… |
| Core.Scripting-006 | Low | Concurrency & thread safety | `CompiledScriptCache.cs:55` | On a failed compile the `catch` block calls `_cache.TryRemove(key, out _)` without a value comparison. If two threads race a miss for the same bad source, both observe the same faulted `Lazy` and throw, and both call `TryRemove(key)`. If a… |
| Core.Scripting-008 | Low | Performance & resource management | `CompiledScriptCache.cs:34`, `ScriptEvaluator.cs:34` | `CompiledScriptCache` has no capacity bound (acknowledged in the class remarks) and no eviction. Each cached `ScriptEvaluator` holds a Roslyn `ScriptRunner<T>` delegate, which keeps the dynamically emitted script assembly loaded for the pr… |
| Core.Scripting-009 | Low | Design-document adherence | `ForbiddenTypeAnalyzer.cs:45` | The Phase 7 plan decision #6 (`docs/v2/implementation/phase-7-scripting-and-alarming.md`) enumerates the forbidden surface as "No HttpClient / File / Process / reflection". `ForbiddenTypeAnalyzer` actually denies a broader set — `System.Th… |
| Core.Scripting-011 | Low | Testing coverage | `tests/Core/ZB.MOM.WW.OtOpcUa.Core.Scripting.Tests/` | Two source files have no direct test coverage: `ScriptContext` (`Deadband` static helper is exercised only indirectly through `ScriptSandboxTests`, and not for its boundary `tolerance` behaviour) and `ScriptSandbox.Build` itself (the `Argu… |
| Core.VirtualTags-004 | Low | Correctness & logic bugs | `src/Core/ZB.MOM.WW.OtOpcUa.Core.VirtualTags/VirtualTagEngine.cs:349` | `CoerceResult`'s switch has a default arm (`_ => raw`) that returns the script's raw return value uncoerced for any `DriverDataType` not in the explicit list (e.g. an array type, Byte, or a future enum member). The resulting `DataValueSnap… |
| Core.VirtualTags-006 | Low | Concurrency & thread safety | `src/Core/ZB.MOM.WW.OtOpcUa.Core.VirtualTags/VirtualTagEngine.cs:177-182`, `:395-401` | `Subscribe` does `_observers.GetOrAdd(path, _ => [])` then `lock (list) { list.Add(observer); }`. When `Unsub.Dispose` removes the last observer, the now-empty List is left in `_observers` and the dictionary entry is never removed. For a l… |
| Core.VirtualTags-007 | Low | Error handling & resilience | `src/Core/ZB.MOM.WW.OtOpcUa.Core.VirtualTags/TimerTriggerScheduler.cs:58` | `Tick` calls `_engine.EvaluateOneAsync(p, _cts.Token).GetAwaiter().GetResult()`, blocking the `System.Threading.Timer` callback thread (a thread-pool thread) for the full duration of the evaluation. Because `EvaluateInternalAsync` serialis… |
| Core.VirtualTags-009 | Low | Performance & resource management | `src/Core/ZB.MOM.WW.OtOpcUa.Core.VirtualTags/DependencyGraph.cs:64-65`, `:72-73` | `DirectDependencies` and `DirectDependents` allocate a fresh empty `HashSet<string>` on every call for an unregistered node. `DirectDependents` is called inside the `TopologicalSort` Kahn loop and the `CascadeAsync` DFS, so for a graph wit… |
| Core.VirtualTags-010 | Low | Documentation & comments | `src/Core/ZB.MOM.WW.OtOpcUa.Core.VirtualTags/ITagUpstreamSource.cs:18`, `VirtualTagContext.cs:30`, `VirtualTagDefinition.cs:28` | Several XML docs reference component names that do not exist in the codebase. `ITagUpstreamSource` XML doc says the subscription path "feeds the engine's ChangeTriggerDispatcher" -- there is no ChangeTriggerDispatcher; the actual path is `… |
| Core.VirtualTags-011 | Low | Code organization & conventions | `src/Core/ZB.MOM.WW.OtOpcUa.Core.VirtualTags/VirtualTagEngine.cs:404-409` | `VirtualTagState` records a Writes set (the `ctx.SetVirtualTag` targets extracted by `DependencyExtractor`), but nothing in the engine reads it -- it is captured at `Load` and never used. Declared write targets are not validated against th… |
| Core.VirtualTags-013 | Low | Documentation & comments | `src/Core/ZB.MOM.WW.OtOpcUa.Core.VirtualTags/DependencyGraph.cs:266-270` | `DependencyCycleException.BuildMessage` renders each cycle as `string.Join(" -> ", c) + " -> " + c[0]`, presenting the SCC member list as a traversable edge path that loops back to its first element. Tarjan's algorithm returns the members… |
| Driver.AbCip-007 | Low | OtOpcUa conventions | `AbCipDriver.cs` (whole file), `AbCipAlarmProjection.cs`, `LibplctagTagRuntime.cs` | `CLAUDE.md` Library Preferences mandate Serilog with a rolling daily file sink. The driver has no logging at all: no `ILogger`/Serilog dependency is injected or used. Failure paths instead swallow exceptions into the `_health` string (`Rea… |
| Driver.AbCip-011 | Low | Error handling & resilience | `AbCipDriver.cs:144-152`, `AbCipDriverOptions.cs:131-143` | `InitializeAsync` only starts probe loops when `_options.Probe.Enabled` is true AND `Probe.ProbeTagPath` is non-blank. When `Probe.Enabled` is true (the default) but `ProbeTagPath` is null (also the default; the doc comment says "PR 8 wire… |
| Driver.AbCip-012 | Low | Performance & resource management | `LibplctagTemplateReader.cs:15-35`, `AbCipDriver.cs:88-92` | `LibplctagTemplateReader` is created per `FetchUdtShapeAsync` call, and each call constructs a fresh libplctag `Tag` for the @udt pseudo-tag, initializes it (a CIP connection handshake), reads, and disposes it. There is no reuse of the `Ta… |
@@ -175,12 +155,6 @@ Findings with status `Open` or `In Progress`, ordered by severity.
| Driver.TwinCAT.Cli-005 | Low | Code organization & conventions | `Commands/ProbeCommand.cs:23`, `Commands/ReadCommand.cs:20`, `Commands/WriteCommand.cs:20`, `Commands/SubscribeCommand.cs:18` | The `--type` option is declared with the short alias `-t` on `read`, `write`, and `subscribe`, but `ProbeCommand` declares `[CommandOption("type", ...)]` with no short alias. An operator who has internalised `-t` from the other three verbs… |
| Driver.TwinCAT.Cli-006 | Low | Testing coverage | `tests/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT.Cli.Tests/WriteCommandParseValueTests.cs` | The only test file covers `WriteCommand.ParseValue` and `ReadCommand.SynthesiseTagName`. Other deterministic, router-independent logic is untested: `TwinCATCommandBase.Gateway` (the `ads://{netId}:{port}` string the driver's `TwinCATAmsAdd… |
| Driver.TwinCAT.Cli-007 | Low | Documentation & comments | `TwinCATCommandBase.cs:31-36` | The `Timeout` override has an empty `init` accessor with the comment `/* driven by TimeoutMs */`. Because the base `DriverCommandBase.Timeout` is declared `abstract { get; init; }`, the override must supply an `init`, but here it silently… |
| Server-004 | Low | OtOpcUa conventions | `src/Server/ZB.MOM.WW.OtOpcUa.Server/OpcUa/OtOpcUaServer.cs:187-200` | `RoleBasedIdentity` declares its own `Display` property, but the base `UserIdentity` already has a settable `DisplayName`. `DriverNodeManager.ResolveCallUser`/`RouteScriptedAlarmMethodCalls` read the base `DisplayName`, never `Display`. Si… |
| Server-006 | Low | Concurrency & thread safety | `src/Server/ZB.MOM.WW.OtOpcUa.Server/OpcUa/DriverNodeManager.cs:478-482, 1342-1348` | `OnReadValue`/`OnWriteValue` are synchronous stack hooks that block on async driver calls via `.GetAwaiter().GetResult()` with `CancellationToken.None`. With `MaxRequestThreadCount = 100`, a burst of reads/writes into a stalled driver pins… |
| Server-008 | Low | Error handling & resilience | `src/Server/ZB.MOM.WW.OtOpcUa.Server/OpcUa/DriverNodeManager.cs:736` | `RouteScriptedAlarmMethodCalls` marks a handled slot by setting `errors[i] = ServiceResult.Good`, assuming `base.Call` skips non-null *Good* error slots. The stack and `GateCallMethodRequests` only ever pre-populate *Bad* slots; the skip-o… |
| Server-012 | Low | Performance & resource management | `src/Server/ZB.MOM.WW.OtOpcUa.Server/Hosting/PeerHttpProbeLoop.cs:78-79` | `ProbeAsync` creates an `IHttpClientFactory` client and mutates `client.Timeout` on every 2-second probe tick. The timeout belongs on the request or on the named-client registration, not set per call on a factory-vended instance. |
| Server-014 | Low | Code organization & conventions | `src/Server/ZB.MOM.WW.OtOpcUa.Server/SealedBootstrap.cs` | `SealedBootstrap` claims in its xml-doc to "close release blocker #2" by consuming the generation-sealed cache + resilient reader + stale-config flag, but `Program.cs` registers and uses `NodeBootstrap` instead. `SealedBootstrap` is never… |
| Server-015 | Low | Documentation & comments | `src/Server/ZB.MOM.WW.OtOpcUa.Server/OpcUa/OtOpcUaServer.cs:16-21`, `src/Server/ZB.MOM.WW.OtOpcUa.Server/OpcUa/OpcUaServerOptions.cs:21-26` | `OtOpcUaServer`'s class doc still says "PR 16 minimum-viable scope ... no security ... LDAP + security profiles are deferred." `OpcUaServerOptions`'s says "PR 17 minimum-viable scope: no LDAP, no security profiles beyond None." Both are st… |
## Closed findings
@@ -367,6 +341,9 @@ Findings with status `Resolved`, `Won't Fix`, or `Deferred`.
| Server-010 | Medium | Resolved | Security | `src/Server/ZB.MOM.WW.OtOpcUa.Server/OpcUa/OpcUaServerOptions.cs:59`, `src/Server/ZB.MOM.WW.OtOpcUa.Server/OpcUa/OpcUaApplicationHost.cs:284-291` |
| Server-011 | Medium | Resolved | Security | `src/Server/ZB.MOM.WW.OtOpcUa.Server/OpcUa/OpcUaApplicationHost.cs:322-346` |
| Server-013 | Medium | Resolved | Design-document adherence | `src/Server/ZB.MOM.WW.OtOpcUa.Server/OpcUa/OpcUaServerOptions.cs:9-19`, `src/Server/ZB.MOM.WW.OtOpcUa.Server/OpcUa/OpcUaApplicationHost.cs:296-346`, `src/Server/ZB.MOM.WW.OtOpcUa.Server/Program.cs:89` |
| Admin-010 | Low | Resolved | OtOpcUa conventions | `Components/App.razor:9,16` |
| Admin-011 | Low | Resolved | Concurrency & thread safety | `Hubs/FleetStatusPoller.cs:24-26,98-103` |
| Admin-012 | Low | Resolved | Design-document adherence | `Services/EquipmentCsvImporter.cs:18-19,33-37,229,232` |
| Analyzers-002 | Low | Resolved | Correctness & logic bugs | `src/Tooling/ZB.MOM.WW.OtOpcUa.Analyzers/UnwrappedCapabilityCallAnalyzer.cs:46-50,130` |
| Analyzers-003 | Low | Resolved | Error handling & resilience | `src/Tooling/ZB.MOM.WW.OtOpcUa.Analyzers/UnwrappedCapabilityCallAnalyzer.cs:80,114-116` |
| Analyzers-004 | Low | Resolved | Performance & resource management | `src/Tooling/ZB.MOM.WW.OtOpcUa.Analyzers/UnwrappedCapabilityCallAnalyzer.cs:95-112` |
@@ -390,3 +367,26 @@ Findings with status `Resolved`, `Won't Fix`, or `Deferred`.
| Core.Abstractions-008 | Low | Resolved | Documentation & comments | `src/Core/ZB.MOM.WW.OtOpcUa.Core.Abstractions/DriverHealth.cs:9`, `src/Core/ZB.MOM.WW.OtOpcUa.Core.Abstractions/IHistoryProvider.cs:39-43,65-69` |
| Core.AlarmHistorian-008 | Low | Resolved | Performance & resource management | `src/Core/ZB.MOM.WW.OtOpcUa.Core.AlarmHistorian/SqliteStoreAndForwardSink.cs:107-127,255-278` |
| Core.AlarmHistorian-011 | Low | Resolved | Documentation & comments | `src/Core/ZB.MOM.WW.OtOpcUa.Core.AlarmHistorian/IAlarmHistorianSink.cs:5-9,76`, `AlarmHistorianEvent.cs:20` |
| Core.ScriptedAlarms-003 | Low | Resolved | Documentation & comments | `ScriptedAlarmEngine.cs:343`, `docs/ScriptedAlarms.md:107` |
| Core.ScriptedAlarms-006 | Low | Resolved | Concurrency & thread safety | `ScriptedAlarmEngine.cs:232`, `ScriptedAlarmEngine.cs:369` |
| Core.ScriptedAlarms-008 | Low | Resolved | Performance & resource management | `Part9StateMachine.cs:261-268` |
| Core.ScriptedAlarms-009 | Low | Won't Fix | Performance & resource management | `ScriptedAlarmEngine.cs:309-315`, `ScriptedAlarmEngine.cs:271` |
| Core.ScriptedAlarms-010 | Low | Resolved | Design-document adherence | `ScriptedAlarmEngine.cs:325-336`, `AlarmPredicateContext.cs:33-40`, `MessageTemplate.cs:47` |
| Core.ScriptedAlarms-011 | Low | Resolved | Code organization & conventions | `Part9StateMachine.cs:275` |
| Core.Scripting-005 | Low | Resolved | Correctness & logic bugs | `DependencyExtractor.cs:97` |
| Core.Scripting-006 | Low | Resolved | Concurrency & thread safety | `CompiledScriptCache.cs:55` |
| Core.Scripting-009 | Low | Resolved | Design-document adherence | `ForbiddenTypeAnalyzer.cs:45` |
| Core.Scripting-011 | Low | Resolved | Testing coverage | `tests/Core/ZB.MOM.WW.OtOpcUa.Core.Scripting.Tests/` |
| Core.VirtualTags-004 | Low | Resolved | Correctness & logic bugs | `src/Core/ZB.MOM.WW.OtOpcUa.Core.VirtualTags/VirtualTagEngine.cs:349` |
| Core.VirtualTags-006 | Low | Resolved | Concurrency & thread safety | `src/Core/ZB.MOM.WW.OtOpcUa.Core.VirtualTags/VirtualTagEngine.cs:177-182`, `:395-401` |
| Core.VirtualTags-007 | Low | Resolved | Error handling & resilience | `src/Core/ZB.MOM.WW.OtOpcUa.Core.VirtualTags/TimerTriggerScheduler.cs:58` |
| Core.VirtualTags-009 | Low | Resolved | Performance & resource management | `src/Core/ZB.MOM.WW.OtOpcUa.Core.VirtualTags/DependencyGraph.cs:64-65`, `:72-73` |
| Core.VirtualTags-010 | Low | Resolved | Documentation & comments | `src/Core/ZB.MOM.WW.OtOpcUa.Core.VirtualTags/ITagUpstreamSource.cs:18`, `VirtualTagContext.cs:30`, `VirtualTagDefinition.cs:28` |
| Core.VirtualTags-011 | Low | Resolved | Code organization & conventions | `src/Core/ZB.MOM.WW.OtOpcUa.Core.VirtualTags/VirtualTagEngine.cs:404-409` |
| Core.VirtualTags-013 | Low | Resolved | Documentation & comments | `src/Core/ZB.MOM.WW.OtOpcUa.Core.VirtualTags/DependencyGraph.cs:266-270` |
| Server-004 | Low | Resolved | OtOpcUa conventions | `src/Server/ZB.MOM.WW.OtOpcUa.Server/OpcUa/OtOpcUaServer.cs:187-200` |
| Server-006 | Low | Resolved | Concurrency & thread safety | `src/Server/ZB.MOM.WW.OtOpcUa.Server/OpcUa/DriverNodeManager.cs:478-482, 1342-1348` |
| Server-008 | Low | Resolved | Error handling & resilience | `src/Server/ZB.MOM.WW.OtOpcUa.Server/OpcUa/DriverNodeManager.cs:736` |
| Server-012 | Low | Resolved | Performance & resource management | `src/Server/ZB.MOM.WW.OtOpcUa.Server/Hosting/PeerHttpProbeLoop.cs:78-79` |
| Server-014 | Low | Resolved | Code organization & conventions | `src/Server/ZB.MOM.WW.OtOpcUa.Server/SealedBootstrap.cs` |
| Server-015 | Low | Resolved | Documentation & comments | `src/Server/ZB.MOM.WW.OtOpcUa.Server/OpcUa/OtOpcUaServer.cs:16-21`, `src/Server/ZB.MOM.WW.OtOpcUa.Server/OpcUa/OpcUaServerOptions.cs:21-26` |
+13 -13
View File
@@ -7,7 +7,7 @@
| Review date | 2026-05-22 |
| Commit reviewed | `76d35d1` |
| Status | Reviewed |
| Open findings | 6 |
| Open findings | 0 |
## Checklist coverage
@@ -74,13 +74,13 @@
| Severity | Low |
| Category | OtOpcUa conventions |
| Location | `src/Server/ZB.MOM.WW.OtOpcUa.Server/OpcUa/OtOpcUaServer.cs:187-200` |
| Status | Open |
| Status | Resolved |
**Description:** `RoleBasedIdentity` declares its own `Display` property, but the base `UserIdentity` already has a settable `DisplayName`. `DriverNodeManager.ResolveCallUser`/`RouteScriptedAlarmMethodCalls` read the base `DisplayName`, never `Display`. Since the ctor passes only `userName` to base, `DisplayName` resolves to the username — so scripted-alarm Ack/Confirm/Shelve audit entries record the raw username, not the LDAP-resolved display name the comment promises. `Display` is dead code.
**Recommendation:** Drop `Display`; set the base `DisplayName = displayName ?? userName;`. Verify `ResolveCallUser` yields the resolved display name.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — re-triaged: in the pinned SDK version (1.5.374.126) `UserIdentity.DisplayName` is a sealed-virtual auto-property with no public setter, so the base `DisplayName = …` assignment the original recommendation suggested won't compile. Instead the fix passes `displayName ?? userName` as the first arg to the base `UserIdentity(string, string)` ctor — the SDK seeds `DisplayName` from that arg internally — and removes the dead `Display` property. `RoleBasedIdentity` is now `internal sealed` so `DriverNodeManager.ResolveCallUser` can be unit-tested against the production identity type. Regression tests `RoleBasedIdentityTests.DisplayName_returns_LDAP_resolved_display_name_when_present`, `DisplayName_falls_back_to_userName_when_LDAP_display_name_is_null`, and `ResolveCallUser_yields_LDAP_resolved_display_name` cover the behaviour.
### Server-005
| Field | Value |
@@ -102,13 +102,13 @@
| Severity | Low |
| Category | Concurrency & thread safety |
| Location | `src/Server/ZB.MOM.WW.OtOpcUa.Server/OpcUa/DriverNodeManager.cs:478-482, 1342-1348` |
| Status | Open |
| Status | Resolved |
**Description:** `OnReadValue`/`OnWriteValue` are synchronous stack hooks that block on async driver calls via `.GetAwaiter().GetResult()` with `CancellationToken.None`. With `MaxRequestThreadCount = 100`, a burst of reads/writes into a stalled driver pins request threads for the full pipeline timeout, exhausting the pool and stalling unrelated sessions. The call cannot be cancelled by a client timeout.
**Recommendation:** Derive a `CancellationToken` from the `OperationContext` / `TransportQuotas.OperationTimeout` so a stuck driver call is abandoned. Longer term, use the stack's async service overrides if available.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — added `DriverNodeManager.DeriveOperationCancellation(ISystemContext, TimeSpan fallback)` helper that reads `SystemContext.OperationContext.OperationDeadline` (which the stack sets from the client's `RequestHeader.TimeoutHint`). `OnReadValue` and `OnWriteValue` now pass `cts.Token` to `_invoker.ExecuteAsync` / `ExecuteWriteAsync` instead of `CancellationToken.None`, and surface `BadTimeout` (instead of `BadInternalError`) when the deadline fires. Handles both the SDK's sentinel deadlines: `DateTime.MinValue` (no deadline plumbed) and `DateTime.MaxValue` (TimeoutHint=0, the SDK default) collapse to a 30-s fallback. A deadline > Int32.MaxValue ms in the future also clamps to the fallback so the read path never throws `ArgumentOutOfRangeException` from inside `CancellationTokenSource(TimeSpan)`. Regression tests in `DriverNodeManagerCancellationTests` cover all five paths (future / past / missing / MinValue / MaxValue).
### Server-007
| Field | Value |
@@ -130,13 +130,13 @@
| Severity | Low |
| Category | Error handling & resilience |
| Location | `src/Server/ZB.MOM.WW.OtOpcUa.Server/OpcUa/DriverNodeManager.cs:736` |
| Status | Open |
| Status | Resolved |
**Description:** `RouteScriptedAlarmMethodCalls` marks a handled slot by setting `errors[i] = ServiceResult.Good`, assuming `base.Call` skips non-null *Good* error slots. The stack and `GateCallMethodRequests` only ever pre-populate *Bad* slots; the skip-on-Good assumption is not a guaranteed SDK contract. If `base.Call` re-dispatches, the engine method and the stack's built-in Part 9 handler both fire — double transition.
**Recommendation:** Verify against the pinned SDK whether `base.Call` skips Good-pre-populated slots. If not, exclude routed slots from `methodsToCall` before `base.Call`. Add a test asserting exactly-once engine transition for a routed Acknowledge.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — verified against the pinned SDK (DeepWiki query against OPCFoundation/UA-.NETStandard): `CustomNodeManager2.Call` / `CallInternalAsync` skip slots whose `CallMethodRequest.Processed` flag is `true`, not slots whose `errors[i]` is a non-Bad `ServiceResult`. `RouteScriptedAlarmMethodCalls` now sets `request.Processed = true` on every handled slot — success, `ArgumentException`, and generic exception paths — so `base.Call` never re-dispatches a routed Acknowledge / Confirm / AddComment to the stack's built-in Part 9 handler. Regression tests in `ScriptedAlarmMethodRoutingProcessedFlagTests` assert `Processed` is `true` after each engine path and `false` for slots the helper passes through to `base.Call`.
### Server-009
| Field | Value |
@@ -186,13 +186,13 @@
| Severity | Low |
| Category | Performance & resource management |
| Location | `src/Server/ZB.MOM.WW.OtOpcUa.Server/Hosting/PeerHttpProbeLoop.cs:78-79` |
| Status | Open |
| Status | Resolved |
**Description:** `ProbeAsync` creates an `IHttpClientFactory` client and mutates `client.Timeout` on every 2-second probe tick. The timeout belongs on the request or on the named-client registration, not set per call on a factory-vended instance.
**Recommendation:** Configure the timeout once via `AddHttpClient(HttpClientName).ConfigureHttpClient(...)`, or use a per-request linked `CancellationTokenSource(_options.HttpProbeTimeout)`; drop the per-call `client.Timeout` mutation.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — `ProbeAsync` no longer mutates `client.Timeout`. Replaced with a per-call `CancellationTokenSource(_options.HttpProbeTimeout)` linked to the loop's shutdown token; `GetAsync` consumes the linked token so the per-request deadline is enforced via cancellation instead of via the factory-vended `HttpClient` instance. Regression test `PeerHttpProbeLoopTests.Tick_does_not_mutate_factory_vended_client_Timeout` asserts the timeout-on-client mutation is gone.
### Server-013
| Field | Value |
@@ -214,13 +214,13 @@
| Severity | Low |
| Category | Code organization & conventions |
| Location | `src/Server/ZB.MOM.WW.OtOpcUa.Server/SealedBootstrap.cs` |
| Status | Open |
| Status | Resolved |
**Description:** `SealedBootstrap` claims in its xml-doc to "close release blocker #2" by consuming the generation-sealed cache + resilient reader + stale-config flag, but `Program.cs` registers and uses `NodeBootstrap` instead. `SealedBootstrap` is never registered in DI nor referenced by `OpcUaServerService` — it and its `StaleConfigFlag` plumbing are dead in the production wire-up; the release blocker remains open in practice.
**Recommendation:** Either register `SealedBootstrap` (with `GenerationSealedCache`/`ResilientConfigReader`/`StaleConfigFlag`) and wire `StaleConfigFlag` into the health host, or delete `SealedBootstrap` and correct the release-readiness doc.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — added `ServerWiring.AddSealedBootstrap` DI helper that registers `GenerationSealedCache` (rooted at a `.sealed` sibling of `NodeOptions.LocalCachePath`), `StaleConfigFlag`, `ResilientConfigReader`, and `SealedBootstrap`. `Program.cs` calls it after `AddSingleton<NodeBootstrap>()`; `OpcUaServerService` now consumes `SealedBootstrap` instead of `NodeBootstrap`; `OpcUaApplicationHost` is constructed with `staleConfigFlag` resolved from DI so `/healthz`'s `usingStaleConfig` reflects the cache-fallback state. The legacy `NodeBootstrap` registration stays for back-compat with the integration tests that construct it directly. Regression test `SealedBootstrapWiringTests.SealedBootstrap_and_its_dependencies_are_registered_in_DI` asserts the registrations compose without missing-service exceptions; `SealedBootstrap.cs`'s xml-doc updated to describe the live wire-up rather than the deferred plan.
### Server-015
| Field | Value |
@@ -228,10 +228,10 @@
| Severity | Low |
| Category | Documentation & comments |
| Location | `src/Server/ZB.MOM.WW.OtOpcUa.Server/OpcUa/OtOpcUaServer.cs:16-21`, `src/Server/ZB.MOM.WW.OtOpcUa.Server/OpcUa/OpcUaServerOptions.cs:21-26` |
| Status | Open |
| Status | Resolved |
**Description:** `OtOpcUaServer`'s class doc still says "PR 16 minimum-viable scope ... no security ... LDAP + security profiles are deferred." `OpcUaServerOptions`'s says "PR 17 minimum-viable scope: no LDAP, no security profiles beyond None." Both are stale — the class now does LDAP UserName auth, anonymous-role mapping, and a configurable security profile. A reader would wrongly conclude the server has no authentication.
**Recommendation:** Update both class summaries to describe current behaviour and drop the "deferred to a future PR" language.
**Resolution:** _(open)_
**Resolution:** Resolved 2026-05-23 — rewrote both class summaries. `OtOpcUaServer` now describes the live LDAP UserName / Anonymous identity-token flow, the `RoleBasedIdentity` wrapper, and the configurable `SecurityProfile` driven by `OpcUaServerOptions`. `OpcUaServerOptions` now describes endpoint + identity + PKI + health + LDAP + anonymous-role surfaces and points at `docs/security.md`. The stale "PR 16 / PR 17 minimum-viable scope" and "deferred to their own PR" language is gone.
+12 -1
View File
@@ -35,7 +35,7 @@ new ScriptedAlarmDefinition(
## Predicate evaluation
Alarm predicates reuse the same Roslyn sandbox as virtual tags — `ScriptEvaluator<AlarmPredicateContext, bool>` compiles the source, `TimedScriptEvaluator` wraps it with the configured timeout (default from `TimedScriptEvaluator.DefaultTimeout`), and `DependencyExtractor` statically harvests the tag paths the script reads. The sandbox rules (forbidden types, cancellation, logging sinks) are documented in [VirtualTags.md](VirtualTags.md); ScriptedAlarms does not redefine them. The known memory / CPU resource limits are documented there as well.
Alarm predicates reuse the same Roslyn sandbox as virtual tags — `ScriptEvaluator<AlarmPredicateContext, bool>` compiles the source, `TimedScriptEvaluator` wraps it with the configured timeout (default from `TimedScriptEvaluator.DefaultTimeout`), and `DependencyExtractor` statically harvests the tag paths the script reads. The sandbox rules (forbidden types, cancellation, logging sinks) are documented in [VirtualTags.md](VirtualTags.md); ScriptedAlarms does not redefine them. The known resource limits — unbounded script-side memory, the per-publish accretion of dynamically-emitted script assemblies (Core.Scripting-008), and the orphan-thread CPU-budget caveat — are documented in that file as well.
`AlarmPredicateContext` (`AlarmPredicateContext.cs`) is the script's `ScriptContext` subclass:
@@ -79,6 +79,17 @@ Two invariants the machine enforces:
Fallback rules: a resolved `DataValueSnapshot` with a non-zero `StatusCode`, a `null` `Value`, or an unknown path becomes `{?}`. The event still fires — the operator sees where the reference broke rather than having the alarm swallowed.
## Input-quality policy
Predicate evaluation and message-template resolution deliberately treat tag-input quality differently:
| Surface | Quality bar | Rationale |
|---|---|---|
| `ScriptedAlarmEngine.AreInputsReady` (predicate gate) | **Bad rejected** (`StatusCode` bit 31 set). `Good` and `Uncertain` are both accepted. | Uncertain quality still carries a value the predicate can inspect; rejecting it would mask a transitional alarm condition. Predicate evaluation is a state-machine input — operators want it to track reality as closely as the quality allows. |
| `MessageTemplate.Resolve` (operator-facing message) | **Any non-zero `StatusCode` rejected** — only `Good` substitutes; `Uncertain` / Bad / unknown all render as `{?}`. | The message is a human-readable signal; substituting an Uncertain value would let operators act on a questionable reading without seeing the qualifier. Rendering `{?}` makes the doubt explicit. |
`AlarmPredicateContext.GetTag` returns a `BadNodeIdUnknown` (`0x80340000`) snapshot for missing or empty paths, so a typo in the predicate flows through `AreInputsReady` (Bad → predicate skipped, prior state held) and `MessageTemplate.Resolve` (non-Good → `{?}`) without crashing the engine. (Core.ScriptedAlarms-010)
## State persistence
`IAlarmStateStore` (`IAlarmStateStore.cs`) is the persistence contract: `LoadAsync(alarmId)`, `LoadAllAsync`, `SaveAsync(state)`, `RemoveAsync(alarmId)`. `InMemoryAlarmStateStore` in the same file is the default for tests and dev deployments without a SQL backend. Stream E wires the production implementation against the `ScriptedAlarmState` config-DB table with audit logging through `Core.Abstractions.IAuditLogger`.
+3 -1
View File
@@ -28,7 +28,9 @@ Similarly, **`System.Threading.Tasks` is now denied** (Core.Scripting-003), whic
### Compile cache (`CompiledScriptCache<TContext, TResult>`)
`ConcurrentDictionary<string, Lazy<ScriptEvaluator<...>>>` keyed on `SHA-256(UTF8(source))` rendered to hex. `Lazy<T>` with `ExecutionAndPublication` mode means two threads racing a miss compile exactly once. Failed compiles evict the entry so a corrected retry can succeed (used during Admin UI authoring). No capacity bound — scripts are operator-authored and bounded by the config DB. Whitespace changes miss the cache on purpose. `Clear()` is called on config-publish.
`ConcurrentDictionary<string, Lazy<ScriptEvaluator<...>>>` keyed on `SHA-256(UTF8(source))` rendered to hex. `Lazy<T>` with `ExecutionAndPublication` mode means two threads racing a miss compile exactly once. Failed compiles evict the entry (via the `TryRemove(KeyValuePair<,>)` overload so a concurrently re-added retry entry is not collateral damage — Core.Scripting-006) so a corrected retry can succeed (used during Admin UI authoring). No capacity bound — scripts are operator-authored and bounded by the config DB. Whitespace changes miss the cache on purpose. `Clear()` is called on config-publish.
**Per-publish assembly accretion (accepted limitation, Core.Scripting-008).** Each compiled `ScriptEvaluator` holds a Roslyn `ScriptRunner<T>` delegate, which keeps the dynamically-emitted script assembly loaded for the process lifetime. Emitted assemblies in the default `AssemblyLoadContext` cannot be unloaded; `CompiledScriptCache.Clear()` drops the dictionary entries but does **not** unload the underlying assemblies. Across many config-publish generations (each `Clear()` followed by recompiling every script), the process accumulates dead script assemblies. For the expected "low thousands" of scripts this is benign, but a long-running server with very frequent publishes will see steady managed-memory growth that does not return until the process restarts. Out-of-process script evaluation or a collectible `AssemblyLoadContext` is a v3 concern; deployments with high-publish-frequency requirements should schedule a periodic server restart to reclaim the accrued assemblies.
### Per-evaluation timeout (`TimedScriptEvaluator<TContext, TResult>`)
+6
View File
@@ -150,3 +150,9 @@ substantive driver change, and revise this table when the data does.
leak guard. Likely culprits: lingering subscription handles in
`SubscriptionRegistry`, or a downstream consumer retaining
`DataValueSnapshot` references past their useful life.
## Scripted-alarm engine — known hot-path allocations
`ScriptedAlarmEngine.BuildReadCache` allocates a fresh `Dictionary<string, DataValueSnapshot>` and `AlarmPredicateContext` on every predicate evaluation — i.e. once per upstream tag change per referencing alarm. On a busy line where many tags feeding many alarms change frequently, this is a steady stream of short-lived dictionary allocations on the hot path. (Core.ScriptedAlarms-009)
The allocations are deliberate for now: predicate evaluation is already serialised under `_evalGate`, so a single reused scratch dictionary would be safe, but the per-call dictionary keeps the evaluation surface immutable and trivially safe against future refactors. If a future scripted-alarm soak surfaces allocation pressure on this path, the mitigation is a per-alarm scratch buffer cleared between evaluations — note here before changing the engine.
@@ -29,7 +29,7 @@ Tie-in capability — **historian alarm sink**:
| 3 | Evaluation trigger = **change-driven + timer-driven**; operator chooses per-tag | Change-driven is cheap at steady state; timer is the escape hatch for polling derivations that don't have a discrete "input changed" signal. |
| 4 | Script shape = **Shape A — one script per virtual tag/alarm**; `return` produces the value (or `bool` for alarm condition) | Minimal surface; no predicate/action split. Alarm side-effects (severity, message) configured out-of-band, not in the script. |
| 5 | Alarm fidelity = **full OPC UA Part 9** | Uniform with Galaxy + ALMD on the wire; client-side tooling (HMIs, historians, event pipelines) gets one shape. |
| 6 | Sandbox = **read-only context**; scripts can only read any tag + write to virtual tags | Strict Roslyn `ScriptOptions` allow-list. No HttpClient / File / Process / reflection. |
| 6 | Sandbox = **read-only context**; scripts can only read any tag + write to virtual tags | Strict Roslyn `ScriptOptions` allow-list. Authoritative deny-list (`ForbiddenTypeAnalyzer`): namespace-prefix deny `System.IO`, `System.Net`, `System.Diagnostics`, `System.Reflection`, `System.Threading.Tasks` (Task / Parallel fan-out — Core.Scripting-003), `System.Runtime.InteropServices`, `Microsoft.Win32`; type-granular deny `System.Environment`, `System.AppDomain`, `System.GC`, `System.Activator`, `System.Threading.Thread` (these live directly in the allow-listed `System` / `System.Threading` namespaces, so a prefix rule cannot reach them without blocking primitives — Core.Scripting-001 / -009). |
| 7 | Dependency declaration = **AST inference**; operator doesn't maintain a separate dependency list | `CSharpSyntaxWalker` extracts `ctx.GetTag("path")` string-literal calls at compile time; dynamic paths rejected at publish. |
| 8 | Config storage = **config DB with generation-sealed cache** (same as driver instances) | Virtual tags + alarms publish atomically in the same generation as the driver instance config they may depend on. |
| 9 | Script return value shape (`ctx.GetTag`) = **`DataValue { Value, StatusCode, Timestamp }`** | Scripts branch on quality naturally without separate `ctx.GetQuality(...)` calls. |
@@ -162,7 +162,7 @@ Tie-in capability — **historian alarm sink**:
## Compliance Checks (run at exit gate)
- [ ] **Sandbox escape**: attempts to reference `System.IO.File`, `System.Net.Http.HttpClient`, `System.Diagnostics.Process`, or `typeof(X).Assembly.Load` fail at script compile with an actionable error.
- [ ] **Sandbox escape**: attempts to reference any deny-listed namespace prefix (`System.IO`, `System.Net`, `System.Diagnostics`, `System.Reflection`, `System.Threading.Tasks`, `System.Runtime.InteropServices`, `Microsoft.Win32`) or any of the type-granular forbidden types (`System.Environment`, `System.AppDomain`, `System.GC`, `System.Activator`, `System.Threading.Thread`) fail at script compile with an actionable error. Vectors include direct calls, `typeof(T)`, generic type arguments, casts, `is`/`as` patterns, `default(T)`, array element types, and explicitly-typed local declarations.
- [ ] **Dependency inference**: `ctx.GetTag(myStringVar)` (non-literal path) is rejected at publish with a span-pointed error; `ctx.GetTag("Line1/Speed")` is accepted + appears in the inferred input set.
- [ ] **Change cascade**: tag A → virtual tag B → virtual tag C. When A changes, B recomputes, then C recomputes. Single change event triggers the full cascade in topological order within one evaluation pass.
- [ ] **Cycle rejection**: publish a config where virtual tag B depends on A and A depends on B. Publish fails pre-commit with a clear cycle message.
@@ -1,3 +1,5 @@
using System.Collections.Immutable;
namespace ZB.MOM.WW.OtOpcUa.Core.ScriptedAlarms;
/// <summary>
@@ -17,7 +19,10 @@ namespace ZB.MOM.WW.OtOpcUa.Core.ScriptedAlarms;
/// <para>
/// <see cref="Comments"/> is append-only; comments + ack/confirm user identities
/// are the audit surface regulators consume. The engine never rewrites past
/// entries.
/// entries. The runtime type is <see cref="ImmutableList{AlarmComment}"/> so
/// each append is O(log n) rather than the O(n) copy a plain
/// <c>IReadOnlyList&lt;AlarmComment&gt;</c> would force on every audit-producing
/// transition. (Core.ScriptedAlarms-008)
/// </para>
/// </remarks>
public sealed record AlarmConditionState(
@@ -36,7 +41,7 @@ public sealed record AlarmConditionState(
DateTime? LastConfirmUtc,
string? LastConfirmUser,
string? LastConfirmComment,
IReadOnlyList<AlarmComment> Comments)
ImmutableList<AlarmComment> Comments)
{
/// <summary>Initial-load state for a newly registered alarm — everything in the "no-event" position.</summary>
public static AlarmConditionState Fresh(string alarmId, DateTime nowUtc) => new(
@@ -55,7 +60,7 @@ public sealed record AlarmConditionState(
LastConfirmUtc: null,
LastConfirmUser: null,
LastConfirmComment: null,
Comments: []);
Comments: ImmutableList<AlarmComment>.Empty);
}
/// <summary>
@@ -33,6 +33,16 @@ public static class MessageTemplate
/// has a non-Good <see cref="DataValueSnapshot.StatusCode"/> or a null
/// <see cref="DataValueSnapshot.Value"/> resolve to <c>{?}</c>.
/// </summary>
/// <remarks>
/// Quality bar is intentionally <em>stricter</em> than predicate evaluation:
/// only Good (StatusCode == 0) is substituted; Uncertain renders as
/// <c>{?}</c>. The predicate gate (<c>ScriptedAlarmEngine.AreInputsReady</c>)
/// accepts Uncertain because it still carries a value the predicate can
/// inspect, but the operator-facing message must make doubt explicit rather
/// than substituting a value an operator might act on. See the
/// "Input-quality policy" section in <c>docs/ScriptedAlarms.md</c>.
/// (Core.ScriptedAlarms-010)
/// </remarks>
public static string Resolve(string template, Func<string, DataValueSnapshot?> resolveTag)
{
if (string.IsNullOrEmpty(template)) return template ?? string.Empty;
@@ -1,3 +1,5 @@
using System.Collections.Immutable;
namespace ZB.MOM.WW.OtOpcUa.Core.ScriptedAlarms;
/// <summary>
@@ -258,21 +260,33 @@ public static class Part9StateMachine
return s.UnshelveAtUtc is DateTime t && nowUtc >= t ? ShelvingState.Unshelved : s;
}
private static IReadOnlyList<AlarmComment> AppendComment(
IReadOnlyList<AlarmComment> existing, DateTime ts, string user, string kind, string? text)
{
var list = new List<AlarmComment>(existing.Count + 1);
list.AddRange(existing);
list.Add(new AlarmComment(ts, user, kind, text ?? string.Empty));
return list;
}
private static ImmutableList<AlarmComment> AppendComment(
ImmutableList<AlarmComment> existing, DateTime ts, string user, string kind, string? text)
=> existing.Add(new AlarmComment(ts, user, kind, text ?? string.Empty));
}
/// <summary>Result of a state-machine operation — new state + what to emit (if anything).</summary>
public sealed record TransitionResult(AlarmConditionState State, EmissionKind Emission)
/// <remarks>
/// <para>
/// <see cref="NoOpReason"/> carries a short diagnostic string for the
/// <see cref="NoOp(AlarmConditionState, string)"/> case (e.g.
/// "disabled — predicate result ignored", "already acknowledged"). The
/// engine logs this at debug level when a no-op result is observed, so
/// the class-level remarks on <see cref="Part9StateMachine"/> hold:
/// disabled-alarm and idempotent ack/confirm/shelve/unshelve
/// transitions do produce a diagnostic log line. Plain
/// <see cref="None(AlarmConditionState)"/> results (state unchanged,
/// no operator intent recorded — e.g. a predicate re-evaluation that
/// confirms the existing active state) leave <see cref="NoOpReason"/>
/// null because there is nothing to surface to an operator.
/// (Core.ScriptedAlarms-011)
/// </para>
/// </remarks>
public sealed record TransitionResult(AlarmConditionState State, EmissionKind Emission, string? NoOpReason = null)
{
public static TransitionResult None(AlarmConditionState state) => new(state, EmissionKind.None);
public static TransitionResult NoOp(AlarmConditionState state, string reason) => new(state, EmissionKind.None);
public static TransitionResult NoOp(AlarmConditionState state, string reason)
=> new(state, EmissionKind.None, reason);
}
/// <summary>What kind of event, if any, the engine should emit after a transition.</summary>
@@ -59,6 +59,15 @@ public sealed class ScriptedAlarmEngine : IDisposable
private bool _loaded;
private bool _disposed;
// Tracks fire-and-forget background work launched by OnUpstreamChange
// (ReevaluateAsync) and RunShelvingCheck (ShelvingCheckAsync). Dispose drains
// these so a re-evaluation in flight when shutdown begins finishes its
// SaveAsync before the engine returns control to the caller. The HashSet is
// accessed under its own lock — never under _evalGate — so registration /
// unregistration cannot deadlock against the gate. (Core.ScriptedAlarms-006)
private readonly HashSet<Task> _inFlight = [];
private readonly object _inFlightLock = new();
public ScriptedAlarmEngine(
ITagUpstreamSource upstream,
IAlarmStateStore store,
@@ -92,6 +101,7 @@ public sealed class ScriptedAlarmEngine : IDisposable
if (_disposed) throw new ObjectDisposedException(nameof(ScriptedAlarmEngine));
if (definitions is null) throw new ArgumentNullException(nameof(definitions));
var pending = new List<ScriptedAlarmEvent>(0);
await _evalGate.WaitAsync(ct).ConfigureAwait(false);
try
{
@@ -157,11 +167,14 @@ public sealed class ScriptedAlarmEngine : IDisposable
// Restore persisted state, falling back to Fresh where nothing was saved,
// then re-derive ActiveState from the current predicate per decision #14.
// Any predicate emissions queue into `pending` and fire after the gate
// is released — so a startup-recovery activation event can call back into
// the engine without deadlocking. (Core.ScriptedAlarms-003)
foreach (var (alarmId, state) in _alarms)
{
var persisted = await _store.LoadAsync(alarmId, ct).ConfigureAwait(false);
var seed = persisted ?? state.Condition;
var afterPredicate = await EvaluatePredicateToStateAsync(state, seed, nowUtc: _clock(), ct)
var afterPredicate = await EvaluatePredicateToStateAsync(state, seed, nowUtc: _clock(), ct, pending)
.ConfigureAwait(false);
_alarms[alarmId] = state with { Condition = afterPredicate };
await _store.SaveAsync(afterPredicate, ct).ConfigureAwait(false);
@@ -192,6 +205,10 @@ public sealed class ScriptedAlarmEngine : IDisposable
{
_evalGate.Release();
}
// Fire any emissions collected during startup recovery OUTSIDE the gate so
// subscribers can re-enter the engine safely. (Core.ScriptedAlarms-003)
foreach (var evt in pending) FireEvent(evt);
}
/// <summary>
@@ -234,6 +251,7 @@ public sealed class ScriptedAlarmEngine : IDisposable
if (!_alarms.TryGetValue(alarmId, out var state))
throw new ArgumentException($"Unknown alarm {alarmId}", nameof(alarmId));
ScriptedAlarmEvent? pending = null;
await _evalGate.WaitAsync(ct).ConfigureAwait(false);
try
{
@@ -244,27 +262,50 @@ public sealed class ScriptedAlarmEngine : IDisposable
// the exception propagates to the caller. (Core.ScriptedAlarms-007)
await _store.SaveAsync(result.State, ct).ConfigureAwait(false);
_alarms[alarmId] = state with { Condition = result.State };
if (result.Emission != EmissionKind.None) EmitEvent(state, result.State, result.Emission);
// Build the emission event under the gate (it captures a coherent
// snapshot of state + message-template values) but defer the actual
// OnEvent dispatch until after Release() so a slow subscriber or a
// subscriber that re-enters the engine doesn't block / deadlock.
// (Core.ScriptedAlarms-003)
if (result.Emission != EmissionKind.None)
pending = BuildEmission(state, result.State, result.Emission);
else if (result.NoOpReason is { } reason)
{
// The Part9StateMachine remarks promise a diagnostic log line for
// disabled-alarm no-ops + idempotent ack/confirm/shelve/unshelve
// calls. We surface them at debug so they're available when
// investigating "why didn't my ack take effect?" without spamming
// the main info log. (Core.ScriptedAlarms-011)
state.Logger.Debug("Alarm {AlarmId} no-op transition: {Reason}", alarmId, reason);
}
}
finally { _evalGate.Release(); }
// OnEvent dispatch happens OUTSIDE _evalGate so subscribers can call back
// into the engine (e.g. AcknowledgeAsync from inside an Activated handler)
// without deadlocking against the non-reentrant SemaphoreSlim.
if (pending is not null) FireEvent(pending);
}
/// <summary>
/// Upstream-change callback. Updates the value cache + enqueues predicate
/// re-evaluation for every alarm referencing the changed path. Fire-and-forget
/// so driver-side dispatch isn't blocked.
/// so driver-side dispatch isn't blocked; the background task is tracked so
/// <see cref="Dispose"/> can drain it. (Core.ScriptedAlarms-006)
/// </summary>
internal void OnUpstreamChange(string path, DataValueSnapshot value)
{
_valueCache[path] = value;
if (_disposed) return; // don't queue new work against a disposing engine
if (_alarmsReferencing.TryGetValue(path, out var alarmIds))
{
_ = ReevaluateAsync(alarmIds.ToArray(), CancellationToken.None);
TrackBackgroundTask(ReevaluateAsync(alarmIds.ToArray(), CancellationToken.None));
}
}
private async Task ReevaluateAsync(IReadOnlyList<string> alarmIds, CancellationToken ct)
{
var pending = new List<ScriptedAlarmEvent>(0);
try
{
await _evalGate.WaitAsync(ct).ConfigureAwait(false);
@@ -280,7 +321,7 @@ public sealed class ScriptedAlarmEngine : IDisposable
{
if (!_alarms.TryGetValue(id, out var state)) continue;
var newState = await EvaluatePredicateToStateAsync(
state, state.Condition, _clock(), ct).ConfigureAwait(false);
state, state.Condition, _clock(), ct, pending).ConfigureAwait(false);
if (!ReferenceEquals(newState, state.Condition))
{
// Persist before updating in-memory so a store failure leaves
@@ -295,16 +336,23 @@ public sealed class ScriptedAlarmEngine : IDisposable
catch (Exception ex)
{
_engineLogger.Error(ex, "ScriptedAlarmEngine reevaluate failed");
return;
}
// Fire emissions OUTSIDE _evalGate so subscriber callbacks can re-enter
// the engine without deadlocking. (Core.ScriptedAlarms-003)
foreach (var evt in pending) FireEvent(evt);
}
/// <summary>
/// Evaluate the predicate + apply the resulting state-machine transition.
/// Returns the new condition state. Emits the appropriate event if the
/// transition produces one.
/// Returns the new condition state. If the transition produces an emission,
/// appends it to <paramref name="pendingEmissions"/> so the caller can fire
/// them after releasing <c>_evalGate</c> — keeping subscriber callbacks
/// outside the gate. (Core.ScriptedAlarms-003)
/// </summary>
private async Task<AlarmConditionState> EvaluatePredicateToStateAsync(
AlarmState state, AlarmConditionState seed, DateTime nowUtc, CancellationToken ct)
AlarmState state, AlarmConditionState seed, DateTime nowUtc, CancellationToken ct,
List<ScriptedAlarmEvent>? pendingEmissions = null)
{
var inputs = BuildReadCache(state.Inputs);
@@ -340,7 +388,14 @@ public sealed class ScriptedAlarmEngine : IDisposable
var result = Part9StateMachine.ApplyPredicate(seed, predicateTrue, nowUtc);
if (result.Emission != EmissionKind.None)
EmitEvent(state, result.State, result.Emission);
{
var evt = BuildEmission(state, result.State, result.Emission);
if (evt is not null)
{
if (pendingEmissions is not null) pendingEmissions.Add(evt);
else FireEvent(evt); // LoadAsync path: no caller-supplied list, fire here.
}
}
return result.State;
}
@@ -373,14 +428,24 @@ public sealed class ScriptedAlarmEngine : IDisposable
return true;
}
private void EmitEvent(AlarmState state, AlarmConditionState condition, EmissionKind kind)
/// <summary>
/// Build (but do not fire) the <see cref="ScriptedAlarmEvent"/> for a
/// transition. Returns null for kinds that should not be published
/// (<see cref="EmissionKind.Suppressed"/> and
/// <see cref="EmissionKind.None"/>). Pure construction — called under
/// <c>_evalGate</c> so the message-template resolution uses a coherent
/// value-cache snapshot. The actual <see cref="OnEvent"/> dispatch is
/// done by <see cref="FireEvent(ScriptedAlarmEvent)"/> AFTER the gate is
/// released. (Core.ScriptedAlarms-003)
/// </summary>
private ScriptedAlarmEvent? BuildEmission(AlarmState state, AlarmConditionState condition, EmissionKind kind)
{
// Suppressed kind means shelving ate the emission — we don't fire for subscribers
// but the state record still advanced so startup recovery reflects reality.
if (kind == EmissionKind.Suppressed || kind == EmissionKind.None) return;
if (kind == EmissionKind.Suppressed || kind == EmissionKind.None) return null;
var message = MessageTemplate.Resolve(state.Definition.MessageTemplate, TryLookup);
var evt = new ScriptedAlarmEvent(
return new ScriptedAlarmEvent(
AlarmId: state.Definition.AlarmId,
EquipmentPath: state.Definition.EquipmentPath,
AlarmName: state.Definition.AlarmName,
@@ -390,10 +455,22 @@ public sealed class ScriptedAlarmEngine : IDisposable
Condition: condition,
Emission: kind,
TimestampUtc: _clock());
}
/// <summary>
/// Invoke the <see cref="OnEvent"/> handler for a built emission. Must be
/// called OUTSIDE <c>_evalGate</c>: a slow subscriber would otherwise
/// block the gate for every other engine operation, and a subscriber
/// that re-enters the engine (e.g. calls AcknowledgeAsync) would
/// deadlock against the non-reentrant SemaphoreSlim.
/// (Core.ScriptedAlarms-003)
/// </summary>
private void FireEvent(ScriptedAlarmEvent evt)
{
try { OnEvent?.Invoke(this, evt); }
catch (Exception ex)
{
_engineLogger.Warning(ex, "ScriptedAlarmEngine OnEvent subscriber threw for {AlarmId}", state.Definition.AlarmId);
_engineLogger.Warning(ex, "ScriptedAlarmEngine OnEvent subscriber threw for {AlarmId}", evt.AlarmId);
}
}
@@ -404,7 +481,24 @@ public sealed class ScriptedAlarmEngine : IDisposable
{
if (_disposed) return;
var ids = _alarms.Keys.ToArray();
_ = ShelvingCheckAsync(ids, CancellationToken.None);
TrackBackgroundTask(ShelvingCheckAsync(ids, CancellationToken.None));
}
/// <summary>
/// Register a fire-and-forget task so <see cref="Dispose"/> can await it.
/// The task removes itself from the set on completion via a continuation.
/// (Core.ScriptedAlarms-006)
/// </summary>
private void TrackBackgroundTask(Task task)
{
lock (_inFlightLock) { _inFlight.Add(task); }
// Use ContinueWith with ExecuteSynchronously so the removal runs on the
// completing thread — avoids scheduler delay between completion and
// unregistration that would otherwise let Dispose see a stale set.
task.ContinueWith(t =>
{
lock (_inFlightLock) { _inFlight.Remove(t); }
}, CancellationToken.None, TaskContinuationOptions.ExecuteSynchronously, TaskScheduler.Default);
}
/// <summary>
@@ -416,6 +510,7 @@ public sealed class ScriptedAlarmEngine : IDisposable
private async Task ShelvingCheckAsync(IReadOnlyList<string> alarmIds, CancellationToken ct)
{
var pending = new List<ScriptedAlarmEvent>(0);
try
{
await _evalGate.WaitAsync(ct).ConfigureAwait(false);
@@ -440,7 +535,10 @@ public sealed class ScriptedAlarmEngine : IDisposable
await _store.SaveAsync(result.State, ct).ConfigureAwait(false);
_alarms[id] = state with { Condition = result.State };
if (result.Emission != EmissionKind.None)
EmitEvent(state, result.State, result.Emission);
{
var evt = BuildEmission(state, result.State, result.Emission);
if (evt is not null) pending.Add(evt);
}
}
}
}
@@ -449,7 +547,10 @@ public sealed class ScriptedAlarmEngine : IDisposable
catch (Exception ex)
{
_engineLogger.Warning(ex, "ScriptedAlarmEngine shelving-check failed");
return;
}
// Fire emissions OUTSIDE _evalGate. (Core.ScriptedAlarms-003)
foreach (var evt in pending) FireEvent(evt);
}
private void UnsubscribeFromUpstream()
@@ -473,6 +574,28 @@ public sealed class ScriptedAlarmEngine : IDisposable
_disposed = true;
_shelvingTimer?.Dispose();
UnsubscribeFromUpstream();
// Drain any fire-and-forget background work (ReevaluateAsync from
// OnUpstreamChange + ShelvingCheckAsync from the 5s timer) that started
// before _disposed = true was visible. Without this, a SaveAsync in
// flight can outlive the engine and write to a (possibly disposed) store
// after Dispose() has returned. The tasks re-check _disposed after
// acquiring the gate and bail out, but the await still has to complete.
// (Core.ScriptedAlarms-006)
Task[] toAwait;
lock (_inFlightLock) { toAwait = [.. _inFlight]; }
if (toAwait.Length > 0)
{
try { Task.WhenAll(toAwait).GetAwaiter().GetResult(); }
catch (Exception ex)
{
// Background task failures already logged inside ReevaluateAsync /
// ShelvingCheckAsync; surface here at debug so a parent shutdown is
// not noisy. The key invariant is that the tasks have COMPLETED.
_engineLogger.Debug(ex, "ScriptedAlarmEngine background task threw during shutdown drain");
}
}
// Do NOT clear _alarms here: Timer.Dispose() does not wait for in-flight callbacks,
// so a ShelvingCheckAsync or ReevaluateAsync can still be running inside _evalGate.
// Those paths now re-check _disposed after acquiring the gate and bail out safely.
@@ -58,8 +58,13 @@ public sealed class CompiledScriptCache<TContext, TResult>
}
catch
{
// Failed compile — evict so a retry with corrected source can succeed.
_cache.TryRemove(key, out _);
// Failed compile — evict the SPECIFIC faulted Lazy instance so a retry with
// corrected source can succeed. The KeyValuePair<,> overload compares the
// value reference, so if two threads race the same bad source both observe
// the same faulted Lazy and both reach this catch, and a concurrent retry
// re-added a fresh Lazy under the same key between the two removals, the
// second removal does NOT evict the in-flight retry. (Core.Scripting-006.)
_cache.TryRemove(new KeyValuePair<string, Lazy<ScriptEvaluator<TContext, TResult>>>(key, lazy));
throw;
}
}
@@ -103,8 +103,14 @@ public static class DependencyExtractor
}
var pathArg = args[0].Expression;
// Accept any string-literal expression, including raw-string forms which
// tokenize as SingleLineRawStringLiteralToken / MultiLineRawStringLiteralToken
// rather than StringLiteralToken. Checking the expression kind
// (StringLiteralExpression) covers all token kinds Roslyn assigns to literal
// strings, so a """raw""" path is harvested rather than mis-rejected as a
// dynamic path. (Core.Scripting-005.)
if (pathArg is not LiteralExpressionSyntax literal
|| !literal.Token.IsKind(SyntaxKind.StringLiteralToken))
|| !literal.IsKind(SyntaxKind.StringLiteralExpression))
{
_rejections.Add(new DependencyRejection(
Span: pathArg.Span,
@@ -31,6 +31,13 @@ public sealed class DependencyGraph
private readonly Dictionary<string, HashSet<string>> _dependsOn = new(StringComparer.Ordinal);
private readonly Dictionary<string, HashSet<string>> _dependents = new(StringComparer.Ordinal);
// Shared empty set returned from DirectDependencies / DirectDependents on a miss.
// The CascadeAsync DFS and the Kahn topological sort both call DirectDependents
// per leaf per pass; allocating a fresh HashSet each time would churn the GC on
// every change-cascade event. Returning a shared immutable-via-convention empty
// set is safe because callers only enumerate (the IReadOnlySet contract).
private static readonly IReadOnlySet<string> EmptySet = new HashSet<string>(StringComparer.Ordinal);
// Cached topological rank — built lazily by TransitiveDependentsInOrder and
// invalidated whenever the graph is mutated (Add / Clear). Avoids re-running
// a full O(V+E) Kahn pass on every change-cascade event.
@@ -68,7 +75,7 @@ public sealed class DependencyGraph
/// <summary>Tag paths <paramref name="nodeId"/> directly reads.</summary>
public IReadOnlySet<string> DirectDependencies(string nodeId) =>
_dependsOn.TryGetValue(nodeId, out var set) ? set : (IReadOnlySet<string>)new HashSet<string>();
_dependsOn.TryGetValue(nodeId, out var set) ? set : EmptySet;
/// <summary>
/// Tags whose evaluation depends on <paramref name="nodeId"/> — i.e. when
@@ -76,7 +83,7 @@ public sealed class DependencyGraph
/// transitive propagation falls out of the topological sort.
/// </summary>
public IReadOnlySet<string> DirectDependents(string nodeId) =>
_dependents.TryGetValue(nodeId, out var set) ? set : (IReadOnlySet<string>)new HashSet<string>();
_dependents.TryGetValue(nodeId, out var set) ? set : EmptySet;
/// <summary>
/// Full transitive dependent closure of <paramref name="nodeId"/> in topological
@@ -284,7 +291,14 @@ public sealed class DependencyCycleException : Exception
private static string BuildMessage(IReadOnlyList<IReadOnlyList<string>> cycles)
{
var lines = cycles.Select(c => " - " + string.Join(" -> ", c) + " -> " + c[0]);
// Render each cycle as a comma-separated list of MEMBERS rather than an arrowed
// edge path. Tarjan's algorithm returns SCC members in stack-pop order, which is
// not guaranteed to be a valid edge sequence — for an SCC larger than two nodes
// the previously-emitted "A -> B -> C -> A" rendering could list edges that do
// not exist, sending operators looking for the wrong edge. Member framing avoids
// implying an order or set of edges.
var lines = cycles.Select(c =>
" - cycle members: " + string.Join(", ", c));
return "Virtual-tag dependency graph contains cycle(s):\n" + string.Join("\n", lines);
}
}
@@ -15,10 +15,11 @@ namespace ZB.MOM.WW.OtOpcUa.Core.VirtualTags;
/// from a last-known-value cache populated by the subscription callbacks.
/// </para>
/// <para>
/// The subscription path feeds the engine's <c>ChangeTriggerDispatcher</c> so
/// change-driven virtual tags re-evaluate on any upstream delta (value, status,
/// or timestamp). One subscription per distinct upstream tag path; the engine
/// tracks the mapping itself.
/// The subscription path feeds <see cref="VirtualTagEngine"/>'s
/// <c>OnUpstreamChange</c> callback, which updates the engine's value cache and
/// schedules <c>CascadeAsync</c> to re-evaluate every change-driven dependent in
/// topological order. One subscription per distinct upstream tag path; the
/// engine tracks the mapping itself.
/// </para>
/// </remarks>
public interface ITagUpstreamSource
@@ -9,12 +9,24 @@ namespace ZB.MOM.WW.OtOpcUa.Core.VirtualTags;
/// <see cref="System.Threading.Timer"/> per interval-group keeps the wire count
/// low regardless of tag count.
/// </summary>
/// <remarks>
/// <para>
/// Each timer group carries a per-group in-flight flag (see
/// <c>TickGroup.InFlight</c>). When the timer fires while a tick for the same
/// group is still running, the new callback skips the work and increments
/// <see cref="SkippedTickCount"/> rather than blocking a thread-pool thread on
/// the engine's evaluation gate. This bounds the work outstanding at one tick
/// per group, regardless of how long an individual evaluation takes.
/// </para>
/// </remarks>
public sealed class TimerTriggerScheduler : IDisposable
{
private readonly VirtualTagEngine _engine;
private readonly ILogger _logger;
private readonly List<Timer> _timers = [];
private readonly List<TickGroup> _groups = [];
private readonly CancellationTokenSource _cts = new();
private long _skippedTickCount;
private bool _disposed;
public TimerTriggerScheduler(VirtualTagEngine engine, ILogger logger)
@@ -23,6 +35,13 @@ public sealed class TimerTriggerScheduler : IDisposable
_logger = logger ?? throw new ArgumentNullException(nameof(logger));
}
/// <summary>
/// Diagnostic counter: number of timer callbacks that skipped their work because
/// the prior tick for the same group was still running. Exposed for tests +
/// operational metrics. Monotonic; never resets.
/// </summary>
public long SkippedTickCount => Interlocked.Read(ref _skippedTickCount);
/// <summary>
/// Stand up one <see cref="Timer"/> per unique interval. All tags with
/// matching interval share a timer; each tick triggers re-evaluation of the
@@ -41,31 +60,60 @@ public sealed class TimerTriggerScheduler : IDisposable
{
var paths = group.Select(d => d.Path).ToArray();
var interval = group.Key;
var timer = new Timer(_ => Tick(paths), null, interval, interval);
var ctx = new TickGroup(paths);
_groups.Add(ctx);
var timer = new Timer(_ => OnTimer(ctx), null, interval, interval);
_timers.Add(timer);
_logger.Information("TimerTriggerScheduler: {TagCount} tag(s) on {Interval} cadence",
paths.Length, interval);
}
}
private void Tick(IReadOnlyList<string> paths)
private void OnTimer(TickGroup ctx)
{
if (_cts.IsCancellationRequested) return;
foreach (var p in paths)
// Skip the tick when the prior one for this group is still running. Without
// this guard a slow evaluation (or one waiting on the engine's _evalGate) would
// cause subsequent timer callbacks to each pin a thread-pool thread on the
// gate, compounding under high tick rates.
if (Interlocked.CompareExchange(ref ctx.InFlight, 1, 0) != 0)
{
try
Interlocked.Increment(ref _skippedTickCount);
return;
}
// Run async without blocking the timer's pool-thread callback. The task is
// fire-and-forget — failures are logged inside RunTickAsync; the InFlight flag
// is reset in the finally block so the next tick can proceed.
_ = RunTickAsync(ctx);
}
private async Task RunTickAsync(TickGroup ctx)
{
try
{
foreach (var p in ctx.Paths)
{
_engine.EvaluateOneAsync(p, _cts.Token).GetAwaiter().GetResult();
}
catch (OperationCanceledException)
{
return;
}
catch (Exception ex)
{
_logger.Error(ex, "TimerTriggerScheduler evaluate failed for {Path}", p);
if (_cts.IsCancellationRequested) return;
try
{
await _engine.EvaluateOneAsync(p, _cts.Token).ConfigureAwait(false);
}
catch (OperationCanceledException)
{
return;
}
catch (Exception ex)
{
_logger.Error(ex, "TimerTriggerScheduler evaluate failed for {Path}", p);
}
}
}
finally
{
Interlocked.Exchange(ref ctx.InFlight, 0);
}
}
public void Dispose()
@@ -78,6 +126,21 @@ public sealed class TimerTriggerScheduler : IDisposable
try { t.Dispose(); } catch { }
}
_timers.Clear();
_groups.Clear();
_cts.Dispose();
}
private sealed class TickGroup
{
// 0 = idle, 1 = a tick is currently running (or queued) for this group. Use
// Interlocked.CompareExchange so a timer callback observes a consistent "is the
// prior tick still running" answer without taking a lock.
public int InFlight;
public IReadOnlyList<string> Paths { get; }
public TickGroup(IReadOnlyList<string> paths)
{
Paths = paths;
}
}
}
@@ -8,8 +8,9 @@ namespace ZB.MOM.WW.OtOpcUa.Core.VirtualTags;
/// Per-evaluation <see cref="ScriptContext"/> for a virtual-tag script. Reads come
/// out of the engine's last-known-value cache (driver tags updated via the
/// <see cref="ITagUpstreamSource"/> subscription, virtual tags updated by prior
/// evaluations). Writes route through the engine's <c>SetVirtualTag</c> callback so
/// cross-tag write side effects still participate in change-trigger cascades.
/// evaluations). Writes route through <see cref="VirtualTagEngine"/>'s
/// <c>OnScriptSetVirtualTag</c> callback so cross-tag write side effects still
/// participate in change-trigger cascades (via the engine's <c>CascadeAsync</c>).
/// </summary>
/// <remarks>
/// <para>
@@ -24,8 +24,8 @@ namespace ZB.MOM.WW.OtOpcUa.Core.VirtualTags;
/// </param>
/// <param name="TimerInterval">
/// Optional periodic re-evaluation cadence. Null = timer-driven disabled. Both can
/// be enabled simultaneously; independent scheduling paths both feed
/// <c>EvaluationPipeline</c>.
/// be enabled simultaneously; independent scheduling paths both end at
/// <see cref="VirtualTagEngine"/>'s <c>EvaluateInternalAsync</c>.
/// </param>
/// <param name="Historize">
/// When true, every evaluation result is forwarded to the configured
@@ -85,6 +85,13 @@ public sealed class VirtualTagEngine : IDisposable
continue;
}
if (!IsSupportedDataType(def.DataType))
{
compileFailures.Add(
$"{def.Path}: unsupported DataType DriverDataType.{def.DataType} — virtual tags only support scalar primitive types");
continue;
}
try
{
var extraction = DependencyExtractor.Extract(def.ScriptSource);
@@ -108,6 +115,22 @@ public sealed class VirtualTagEngine : IDisposable
}
}
// Validate every ctx.SetVirtualTag write target resolves to a registered virtual
// tag. A script writing to a non-existent virtual path would otherwise be silently
// dropped at runtime by OnScriptSetVirtualTag's warning-and-drop branch; catching
// it here surfaces operator typos as a publish failure.
foreach (var (path, state) in _tags)
{
foreach (var writeTarget in state.Writes)
{
if (!_tags.ContainsKey(writeTarget))
{
compileFailures.Add(
$"{path}: ctx.SetVirtualTag target '{writeTarget}' is not a registered virtual tag");
}
}
}
if (compileFailures.Count > 0)
{
var joined = string.Join("\n ", compileFailures);
@@ -184,9 +207,28 @@ public sealed class VirtualTagEngine : IDisposable
/// </summary>
public IDisposable Subscribe(string path, Action<string, DataValueSnapshot> observer)
{
var list = _observers.GetOrAdd(path, _ => []);
lock (list) { list.Add(observer); }
return new Unsub(this, path, observer);
// Race-safe pattern paired with Unsub.Dispose: if Unsub.Dispose removed the
// dictionary entry between our GetOrAdd and the lock-protected Add, the list
// we hold a reference to is orphaned. Re-check the map under the lock and
// re-insert the list (or grab the current one) if needed, retrying until the
// dictionary observably contains the list we just added our observer to.
while (true)
{
var list = _observers.GetOrAdd(path, _ => []);
lock (list)
{
// Confirm the list is still the dictionary's value for this key. If
// Dispose removed the entry, _observers[path] either doesn't exist or
// points at a different (newer) list — retry.
if (_observers.TryGetValue(path, out var current) && ReferenceEquals(current, list))
{
list.Add(observer);
return new Unsub(this, path, observer);
}
}
// Lost the race — Dispose pruned the list out from under us. Loop and
// either re-create or pick up the newer list.
}
}
/// <summary>
@@ -367,13 +409,24 @@ public sealed class VirtualTagEngine : IDisposable
return target switch
{
DriverDataType.Boolean => Convert.ToBoolean(raw),
DriverDataType.Int16 => Convert.ToInt16(raw),
DriverDataType.Int32 => Convert.ToInt32(raw),
DriverDataType.Int64 => Convert.ToInt64(raw),
DriverDataType.UInt16 => Convert.ToUInt16(raw),
DriverDataType.UInt32 => Convert.ToUInt32(raw),
DriverDataType.UInt64 => Convert.ToUInt64(raw),
DriverDataType.Float32 => Convert.ToSingle(raw),
DriverDataType.Float64 => Convert.ToDouble(raw),
DriverDataType.String => Convert.ToString(raw) ?? string.Empty,
DriverDataType.DateTime => raw is DateTime dt ? dt : Convert.ToDateTime(raw),
_ => raw,
// Any DriverDataType not in the explicit list (currently Reference, or any
// future enum member added without coercion support) must NOT silently
// return the uncoerced raw value — that would surface as a wire-level
// type mismatch on the OPC UA Variant. Throwing here is caught by the
// outer catch and mapped to BadInternalError. Load-time validation in
// IsSupportedDataType ensures operators never publish such a tag.
_ => throw new InvalidOperationException(
$"Virtual-tag CoerceResult does not support DriverDataType.{target}"),
};
}
catch
@@ -384,6 +437,28 @@ public sealed class VirtualTagEngine : IDisposable
}
}
/// <summary>
/// The set of <see cref="DriverDataType"/> values <see cref="CoerceResult"/> can
/// honour. Definitions declaring any other type are rejected at <see cref="Load"/>
/// so an operator typo (or a future enum member added without coercion support) is
/// caught at publish time rather than silently producing a type-mismatched value.
/// </summary>
private static bool IsSupportedDataType(DriverDataType t) => t switch
{
DriverDataType.Boolean => true,
DriverDataType.Int16 => true,
DriverDataType.Int32 => true,
DriverDataType.Int64 => true,
DriverDataType.UInt16 => true,
DriverDataType.UInt32 => true,
DriverDataType.UInt64 => true,
DriverDataType.Float32 => true,
DriverDataType.Float64 => true,
DriverDataType.String => true,
DriverDataType.DateTime => true,
_ => false,
};
private void UnsubscribeFromUpstream()
{
foreach (var s in _upstreamSubscriptions)
@@ -423,7 +498,23 @@ public sealed class VirtualTagEngine : IDisposable
{
if (_engine._observers.TryGetValue(_path, out var list))
{
lock (list) { list.Remove(_observer); }
lock (list)
{
list.Remove(_observer);
// If we removed the last observer, prune the dictionary entry so a
// long-running server doesn't accumulate empty Lists for paths that
// saw transient subscriptions. The emptiness check is inside the same
// lock so a concurrent Subscribe can't slip an observer in after we
// observe the list as empty.
if (list.Count == 0)
{
// ICollection<KeyValuePair<,>> removal is value-typed — only removes
// if both key + value still match (i.e. the dictionary still points
// at this list). This keeps a racing Subscribe's brand-new list safe.
((ICollection<KeyValuePair<string, List<Action<string, DataValueSnapshot>>>>)_engine._observers)
.Remove(new KeyValuePair<string, List<Action<string, DataValueSnapshot>>>(_path, list));
}
}
}
}
}
@@ -6,14 +6,16 @@
<meta name="viewport" content="width=device-width, initial-scale=1.0"/>
<title>OtOpcUa Admin</title>
<base href="/"/>
<link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/bootstrap@5.3.3/dist/css/bootstrap.min.css"/>
@* Admin-010: Bootstrap 5 is vendored under wwwroot/lib/bootstrap/ per admin-ui.md
"Tech Stack" — no public-CDN dependency so air-gapped fleet deployments work. *@
<link rel="stylesheet" href="lib/bootstrap/css/bootstrap.min.css"/>
<link rel="stylesheet" href="theme.css"/>
<link rel="stylesheet" href="app.css"/>
<HeadOutlet/>
</head>
<body>
<Routes/>
<script src="https://cdn.jsdelivr.net/npm/bootstrap@5.3.3/dist/js/bootstrap.bundle.min.js"></script>
<script src="lib/bootstrap/js/bootstrap.bundle.min.js"></script>
<script src="_framework/blazor.web.js"></script>
</body>
</html>
@@ -79,7 +79,7 @@
<div class="mt-3">
<label class="form-label">CSV content (paste or uploaded)</label>
<textarea class="form-control mono" rows="8" @bind="_csvText"
placeholder="# OtOpcUaCsv v1&#10;ZTag,MachineCode,SAPID,EquipmentId,…"/>
placeholder="# OtOpcUaCsv v1&#10;ZTag,MachineCode,SAPID,EquipmentUuid,Name,…"/>
</div>
<div class="mt-3">
<button class="btn btn-sm btn-outline-primary" @onclick="ParseAsync" disabled="@_busy">Parse</button>
@@ -15,8 +15,13 @@ namespace ZB.MOM.WW.OtOpcUa.Admin.Services;
/// columns must all be present. The version bump handshake lets future shapes parse
/// without ambiguity — v2 files go through a different parser variant.</para>
///
/// <para><b>Required columns</b> per decision #117: ZTag, MachineCode, SAPID,
/// EquipmentId, EquipmentUuid, Name, UnsAreaName, UnsLineName.</para>
/// <para><b>Required columns</b> per decision #117 + admin-ui.md "Equipment CSV import"
/// (Admin-012 — revised after adversarial review finding #4): ZTag, MachineCode, SAPID,
/// EquipmentUuid, Name, UnsAreaName, UnsLineName. <b>No <c>EquipmentId</c> column</b> —
/// it is system-derived as <c>'EQ-' + first 12 hex chars of EquipmentUuid</c> via
/// <see cref="ZB.MOM.WW.OtOpcUa.Configuration.Validation.DraftValidator.DeriveEquipmentId"/>
/// and is never accepted from the CSV: operator-supplied values would mint duplicate
/// equipment identity on typos.</para>
///
/// <para><b>Optional columns</b> per decision #139: Manufacturer, Model, SerialNumber,
/// HardwareRevision, SoftwareRevision, YearOfConstruction, AssetLocation,
@@ -32,7 +37,8 @@ public static class EquipmentCsvImporter
public static IReadOnlyList<string> RequiredColumns { get; } = new[]
{
"ZTag", "MachineCode", "SAPID", "EquipmentId", "EquipmentUuid",
// Admin-012: no EquipmentId — derived from EquipmentUuid at finalise time.
"ZTag", "MachineCode", "SAPID", "EquipmentUuid",
"Name", "UnsAreaName", "UnsLineName",
};
@@ -136,7 +142,6 @@ public static class EquipmentCsvImporter
ZTag = cells[colIndex["ZTag"]],
MachineCode = cells[colIndex["MachineCode"]],
SAPID = cells[colIndex["SAPID"]],
EquipmentId = cells[colIndex["EquipmentId"]],
EquipmentUuid = cells[colIndex["EquipmentUuid"]],
Name = cells[colIndex["Name"]],
UnsAreaName = cells[colIndex["UnsAreaName"]],
@@ -157,7 +162,6 @@ public static class EquipmentCsvImporter
"ZTag" => row.ZTag,
"MachineCode" => row.MachineCode,
"SAPID" => row.SAPID,
"EquipmentId" => row.EquipmentId,
"EquipmentUuid" => row.EquipmentUuid,
"Name" => row.Name,
"UnsAreaName" => row.UnsAreaName,
@@ -225,11 +229,11 @@ public static class EquipmentCsvImporter
/// <summary>One parsed equipment row with required + optional fields.</summary>
public sealed class EquipmentCsvRow
{
// Required (decision #117)
// Required (decision #117). Admin-012: no EquipmentId here — derived from
// EquipmentUuid at finalise time so a CSV typo cannot create a duplicate identity.
public required string ZTag { get; init; }
public required string MachineCode { get; init; }
public required string SAPID { get; init; }
public required string EquipmentId { get; init; }
public required string EquipmentUuid { get; init; }
public required string Name { get; init; }
public required string UnsAreaName { get; init; }
@@ -3,6 +3,7 @@ using ZB.MOM.WW.OtOpcUa.Admin.Services;
using ZB.MOM.WW.OtOpcUa.Configuration;
using ZB.MOM.WW.OtOpcUa.Configuration.Entities;
using ZB.MOM.WW.OtOpcUa.Configuration.Enums;
using ZB.MOM.WW.OtOpcUa.Configuration.Validation;
namespace ZB.MOM.WW.OtOpcUa.Admin.Services;
@@ -65,6 +66,14 @@ public sealed class EquipmentImportBatchService(OtOpcUaConfigDbContext db)
foreach (var row in acceptedRows)
{
// Admin-012: EquipmentId is not on the CSV row — derive it from EquipmentUuid
// so the staging row carries the canonical 'EQ-' + first-12-hex form. Rows
// with an unparseable UUID get an empty placeholder; the same finalise path
// that re-derives the UUID (FinaliseBatchAsync) will overwrite it.
var derivedEquipmentId = Guid.TryParse(row.EquipmentUuid, out var parsedUuid)
? DraftValidator.DeriveEquipmentId(parsedUuid)
: string.Empty;
db.EquipmentImportRows.Add(new EquipmentImportRow
{
Id = Guid.NewGuid(),
@@ -73,7 +82,7 @@ public sealed class EquipmentImportBatchService(OtOpcUaConfigDbContext db)
ZTag = row.ZTag,
MachineCode = row.MachineCode,
SAPID = row.SAPID,
EquipmentId = row.EquipmentId,
EquipmentId = derivedEquipmentId,
EquipmentUuid = row.EquipmentUuid,
Name = row.Name,
UnsAreaName = row.UnsAreaName,
@@ -178,11 +187,16 @@ public sealed class EquipmentImportBatchService(OtOpcUaConfigDbContext db)
{
var equipmentUuid = Guid.TryParse(row.EquipmentUuid, out var u) ? u : Guid.NewGuid();
// Admin-012: EquipmentId is always derived from the UUID that actually lands in
// the Equipment row. Using the staging row's pre-derived value would diverge
// when the staged UUID was blank and we just generated a fresh one above.
var derivedEquipmentId = DraftValidator.DeriveEquipmentId(equipmentUuid);
db.Equipment.Add(new Equipment
{
EquipmentRowId = Guid.NewGuid(),
GenerationId = generationId,
EquipmentId = row.EquipmentId,
EquipmentId = derivedEquipmentId,
EquipmentUuid = equipmentUuid,
DriverInstanceId = driverInstanceIdForRows,
UnsLineId = unsLineIdForRows,
@@ -0,0 +1,16 @@
# Bootstrap 5.3.3 — vendored copy
Per `docs/v2/admin-ui.md` "Tech Stack" the Admin UI vendors Bootstrap 5 here so
the app has no public-CDN dependency — air-gapped fleet deployments must work
without internet egress.
| File | Source |
|------|--------|
| `css/bootstrap.min.css` | https://cdn.jsdelivr.net/npm/bootstrap@5.3.3/dist/css/bootstrap.min.css |
| `css/bootstrap.min.css.map` | https://cdn.jsdelivr.net/npm/bootstrap@5.3.3/dist/css/bootstrap.min.css.map |
| `js/bootstrap.bundle.min.js` | https://cdn.jsdelivr.net/npm/bootstrap@5.3.3/dist/js/bootstrap.bundle.min.js |
| `js/bootstrap.bundle.min.js.map` | https://cdn.jsdelivr.net/npm/bootstrap@5.3.3/dist/js/bootstrap.bundle.min.js.map |
Bootstrap is MIT-licensed (https://github.com/twbs/bootstrap/blob/main/LICENSE).
To upgrade, re-download all four files at the matching version and bump this
table.
File diff suppressed because one or more lines are too long
File diff suppressed because one or more lines are too long
File diff suppressed because one or more lines are too long
File diff suppressed because one or more lines are too long
@@ -73,11 +73,20 @@ public sealed class PeerHttpProbeLoop(
{
var url = $"http://{peer.Host}:{peer.DashboardPort}/healthz";
var healthy = false;
// Server-012: bound the request via a linked CTS rather than mutating
// client.Timeout on a factory-vended HttpClient. IHttpClientFactory may pool /
// recycle the underlying handler, so mutating client.Timeout per call races with
// the factory's lifecycle and crosses ownership boundaries. A per-call CTS is the
// canonical way to enforce a per-request deadline.
using var timeoutCts = new CancellationTokenSource(_options.HttpProbeTimeout);
using var linkedCts = CancellationTokenSource.CreateLinkedTokenSource(
cancellationToken, timeoutCts.Token);
try
{
using var client = httpClientFactory.CreateClient(HttpClientName);
client.Timeout = _options.HttpProbeTimeout;
using var response = await client.GetAsync(url, cancellationToken).ConfigureAwait(false);
using var response = await client.GetAsync(url, linkedCts.Token).ConfigureAwait(false);
healthy = response.IsSuccessStatusCode;
}
catch (OperationCanceledException) when (cancellationToken.IsCancellationRequested)
@@ -86,9 +95,9 @@ public sealed class PeerHttpProbeLoop(
}
catch (Exception ex) when (ex is HttpRequestException or TaskCanceledException or OperationCanceledException)
{
// Any transport-level failure counts as unhealthy — connection refused, timeout,
// DNS fail, TLS fail. Swallow + mark unhealthy; don't log every tick, only when
// state transitions.
// Any transport-level failure counts as unhealthy — connection refused, timeout
// (linked CTS expired), DNS fail, TLS fail. Swallow + mark unhealthy; don't log
// every tick, only when state transitions.
healthy = false;
}
@@ -462,6 +462,63 @@ public sealed class DriverNodeManager : CustomNodeManager2, IAddressSpaceBuilder
}
}
/// <summary>
/// Server-006 fallback for the synchronous OnRead/OnWrite hooks when the stack hasn't
/// plumbed an <see cref="IOperationContext.OperationDeadline"/> through. Bounds the
/// <c>.GetAwaiter().GetResult()</c> wait so a stalled driver can't pin a request
/// thread indefinitely (default <c>MaxRequestThreadCount = 100</c>).
/// </summary>
internal static readonly TimeSpan DefaultSynchronousHookTimeout = TimeSpan.FromSeconds(30);
/// <summary>
/// Derives a <see cref="CancellationTokenSource"/> bounded by the OPC UA stack's
/// <see cref="IOperationContext.OperationDeadline"/>, falling back to
/// <paramref name="fallback"/> when no deadline has been plumbed. The synchronous
/// <c>OnReadValue</c> / <c>OnWriteValue</c> stack hooks consume the returned token so
/// a stuck driver call can be abandoned instead of pinning a request thread for the
/// full pipeline timeout (Server-006).
/// </summary>
/// <remarks>
/// The OPC UA stack's <c>SessionSystemContext.OperationContext</c> exposes the
/// per-request deadline computed from the client's <c>TimeoutHint</c> +
/// <see cref="TransportQuotas.OperationTimeout"/>. An expired deadline produces an
/// immediately-cancelled token so the invoker call short-circuits without dispatching
/// to the driver. A <see cref="DateTime.MinValue"/> deadline (the SDK's "not plumbed"
/// sentinel) is treated as missing and falls back to <paramref name="fallback"/>.
/// </remarks>
internal static CancellationTokenSource DeriveOperationCancellation(ISystemContext context, TimeSpan fallback)
{
// Cast: ISystemContext is read-only; SystemContext (the concrete base) exposes
// OperationContext as IOperationContext. Server-side, every ISystemContext the
// stack passes is a SessionSystemContext (subclass of SystemContext), so the cast
// succeeds in practice. The null guard keeps the helper safe for tests / future
// overrides that don't follow that contract.
var opCtx = (context as SystemContext)?.OperationContext;
var deadline = opCtx?.OperationDeadline ?? DateTime.MinValue;
// DateTime.MinValue is the legacy "no deadline plumbed" sentinel; DateTime.MaxValue
// is the SDK's "client didn't supply a TimeoutHint" default (OperationContext's
// initial value when RequestHeader.TimeoutHint == 0). Both must collapse to the
// fallback timeout — otherwise MinValue compares as already-expired and MaxValue
// overflows CancellationTokenSource(TimeSpan)'s Int32.MaxValue-ms ceiling.
if (deadline == DateTime.MinValue || deadline == DateTime.MaxValue)
return new CancellationTokenSource(fallback);
var remaining = deadline - DateTime.UtcNow;
if (remaining <= TimeSpan.Zero)
{
var cts = new CancellationTokenSource();
cts.Cancel();
return cts;
}
// CancellationTokenSource(TimeSpan) caps at ~24.86 days (Int32.MaxValue ms). A
// legitimate per-request deadline far in the future (e.g. a TimeoutHint of hours)
// exceeds that limit; clamp to the fallback so we never throw ArgumentOutOfRangeException
// from inside the read path.
if (remaining.TotalMilliseconds > int.MaxValue) return new CancellationTokenSource(fallback);
return new CancellationTokenSource(remaining);
}
private ServiceResult OnReadValue(ISystemContext context, NodeState node, NumericRange indexRange,
QualifiedName dataEncoding, ref object? value, ref StatusCode statusCode, ref DateTime timestamp)
{
@@ -490,11 +547,15 @@ public sealed class DriverNodeManager : CustomNodeManager2, IAddressSpaceBuilder
}
}
// Server-006: bound the synchronous .GetAwaiter().GetResult() wait by the stack's
// OperationDeadline so a stalled driver can't pin a request thread for the full
// pipeline timeout. The previous CancellationToken.None left every read uncancellable.
using var cts = DeriveOperationCancellation(context, DefaultSynchronousHookTimeout);
var result = _invoker.ExecuteAsync(
DriverCapability.Read,
ResolveHostFor(fullRef),
async ct => (IReadOnlyList<DataValueSnapshot>)await readable.ReadAsync([fullRef], ct).ConfigureAwait(false),
CancellationToken.None).AsTask().GetAwaiter().GetResult();
cts.Token).AsTask().GetAwaiter().GetResult();
if (result.Count == 0)
{
statusCode = StatusCodes.BadNoData;
@@ -505,6 +566,12 @@ public sealed class DriverNodeManager : CustomNodeManager2, IAddressSpaceBuilder
statusCode = snap.StatusCode;
timestamp = snap.ServerTimestampUtc;
}
catch (OperationCanceledException)
{
// The deadline expired or the client cancelled. Surface BadTimeout so the client
// sees the actual outcome (the pre-fix uncancellable path would have hung instead).
statusCode = StatusCodes.BadTimeout;
}
catch (Exception ex)
{
_logger.LogWarning(ex, "OnReadValue failed for {NodeId}", node.NodeId);
@@ -745,11 +812,15 @@ public sealed class DriverNodeManager : CustomNodeManager2, IAddressSpaceBuilder
engine.AddCommentAsync(alarmId, user, comment ?? string.Empty, CancellationToken.None)
.GetAwaiter().GetResult();
// Mark the slot as handled so base.Call skips it. A pre-populated Good
// result (not null and not Bad) is the signal the base class uses to
// skip per-slot dispatch — set StatusCode to Good explicitly.
// Mark the slot as handled so base.Call skips it. Server-008: the SDK's
// CustomNodeManager2.Call (and CallInternalAsync) skip slots whose
// CallMethodRequest.Processed flag is true — the errors[i] value is the
// per-slot status, not the skip signal. Without Processed=true the stack's
// built-in Part 9 Acknowledge/Confirm handler would also fire and the engine
// would observe a double transition.
results[i] = new CallMethodResult { StatusCode = StatusCodes.Good };
errors[i] = ServiceResult.Good;
request.Processed = true;
}
catch (ArgumentException ex)
{
@@ -757,11 +828,15 @@ public sealed class DriverNodeManager : CustomNodeManager2, IAddressSpaceBuilder
// as BadInvalidArgument so the OPC UA client sees a meaningful status.
errors[i] = new ServiceResult(StatusCodes.BadInvalidArgument,
ex.Message, ex.Message);
// The engine rejected the call, but we still routed it — base.Call must
// not re-run the method on the stack's built-in handler. Server-008.
request.Processed = true;
}
catch (Exception ex)
{
errors[i] = new ServiceResult(StatusCodes.BadInternalError,
ex.Message, ex.Message);
request.Processed = true;
}
}
}
@@ -1354,13 +1429,16 @@ public sealed class DriverNodeManager : CustomNodeManager2, IAddressSpaceBuilder
{
var isIdempotent = _writeIdempotentByFullRef.GetValueOrDefault(fullRef!, false);
var capturedValue = value;
// Server-006: same deadline-derived cancellation as OnReadValue so a stalled
// driver write can't pin a request thread for the full pipeline timeout.
using var cts = DeriveOperationCancellation(context, DefaultSynchronousHookTimeout);
var results = _invoker.ExecuteWriteAsync(
ResolveHostFor(fullRef!),
isIdempotent,
async ct => (IReadOnlyList<WriteResult>)await _writable.WriteAsync(
[new DriverWriteRequest(fullRef!, capturedValue)],
ct).ConfigureAwait(false),
CancellationToken.None).AsTask().GetAwaiter().GetResult();
cts.Token).AsTask().GetAwaiter().GetResult();
if (results.Count > 0 && results[0].StatusCode != 0)
{
statusCode = results[0].StatusCode;
@@ -1368,6 +1446,12 @@ public sealed class DriverNodeManager : CustomNodeManager2, IAddressSpaceBuilder
}
return ServiceResult.Good;
}
catch (OperationCanceledException)
{
// Deadline expired or client cancelled — surface BadTimeout instead of the
// generic BadInternalError so the client sees the actual outcome.
return new ServiceResult(StatusCodes.BadTimeout);
}
catch (Exception ex)
{
_logger.LogWarning(ex, "Write failed for {FullRef}", fullRef);
@@ -20,10 +20,15 @@ public enum OpcUaSecurityProfile
/// <summary>
/// OPC UA server endpoint + application-identity configuration. Bound from the
/// <c>OpcUaServer</c> section of <c>appsettings.json</c>. PR 17 minimum-viable scope: no LDAP,
/// no security profiles beyond None — those wire in alongside a future deployment-policy PR
/// that reads from the central config DB instead of appsettings.
/// <c>OpcUaServer</c> section of <c>appsettings.json</c>.
/// </summary>
/// <remarks>
/// Covers the endpoint URL + application identity, the PKI store root + auto-trust toggle,
/// the optional <c>/healthz</c> health-endpoints listener, the configurable transport
/// <see cref="SecurityProfile"/>, the <see cref="Ldap"/> binding for UserName token
/// validation, and the optional <see cref="AnonymousRoles"/> set that grants anonymous
/// sessions a configurable role list. See <c>docs/security.md</c> for the full guide.
/// </remarks>
public sealed class OpcUaServerOptions
{
public const string SectionName = "OpcUaServer";
@@ -15,10 +15,27 @@ namespace ZB.MOM.WW.OtOpcUa.Server.OpcUa;
/// <summary>
/// <see cref="StandardServer"/> subclass that wires one <see cref="DriverNodeManager"/> per
/// registered driver from <see cref="DriverHost"/>. Anonymous endpoint on
/// <c>opc.tcp://0.0.0.0:4840</c>, no security — PR 16 minimum-viable scope; LDAP + security
/// profiles are deferred to their own PR on top of this.
/// registered driver from <see cref="DriverHost"/>. Endpoint URL, transport security
/// profile (<see cref="OpcUaSecurityProfile"/>), and LDAP-backed UserName authentication
/// are all driven by <see cref="OpcUaServerOptions"/>.
/// </summary>
/// <remarks>
/// <para>
/// <see cref="OnImpersonateUser"/> accepts the stack's <c>AnonymousIdentityToken</c>
/// and <c>UserNameIdentityToken</c>. Anonymous sessions either receive the stack's
/// default empty identity (production default) or a configurable role set via
/// <see cref="OpcUaServerOptions.AnonymousRoles"/>. UserName sessions are validated
/// against the injected <see cref="IUserAuthenticator"/> — the production binding is
/// the LDAP authenticator under <see cref="OpcUaServerOptions.Ldap"/>; failed binds
/// throw <see cref="ServiceResultException"/> with
/// <see cref="StatusCodes.BadUserAccessDenied"/>.
/// </para>
/// <para>
/// Authenticated identities are wrapped in <see cref="RoleBasedIdentity"/> which
/// carries the LDAP-resolved roles + groups so <see cref="DriverNodeManager"/>'s
/// server-layer authorization gate can evaluate per-tag ACLs without re-reading LDAP.
/// </para>
/// </remarks>
public sealed class OtOpcUaServer : StandardServer
{
private readonly DriverHost _driverHost;
@@ -184,16 +201,31 @@ public sealed class OtOpcUaServer : StandardServer
/// <see cref="ILdapGroupsBearer"/> (data-plane: <see cref="AuthorizationGate"/> evaluator).
/// Anonymous identity (no roles configured) still uses the stack's default UserIdentity.
/// </summary>
private sealed class RoleBasedIdentity : UserIdentity, IRoleBearer, ILdapGroupsBearer
/// <remarks>
/// Server-004 — the LDAP-resolved display name is stamped onto the base
/// <see cref="UserIdentity.DisplayName"/> property (settable on the stack's
/// <c>UserIdentity</c>) so <c>DriverNodeManager.ResolveCallUser</c> picks it up
/// through the <see cref="IUserIdentity"/> interface. Without that assignment,
/// <c>DisplayName</c> resolves to the username and scripted-alarm
/// Ack/Confirm/Shelve audit entries record the raw username instead of the
/// human-readable LDAP <c>cn</c>.
/// </remarks>
internal sealed class RoleBasedIdentity : UserIdentity, IRoleBearer, ILdapGroupsBearer
{
public IReadOnlyList<string> Roles { get; }
public IReadOnlyList<string> LdapGroups { get; }
public string? Display { get; }
public RoleBasedIdentity(string userName, string? displayName, IReadOnlyList<string> roles, IReadOnlyList<string> ldapGroups)
: base(userName, "")
: base(displayName ?? userName, "")
{
Display = displayName;
// The base UserIdentity(string, string) ctor seeds DisplayName from its first arg
// (the stack's UserIdentity.DisplayName getter is sealed-virtual on this pinned
// SDK version, so it cannot be overridden or re-assigned). Pass the LDAP-resolved
// display name as that first arg so DriverNodeManager.ResolveCallUser — which
// reads IUserIdentity.DisplayName — stamps the human-readable name onto
// scripted-alarm Ack / Confirm / Shelve audit entries. When LDAP returns no
// display name, fall back to the username so every audit row still carries an
// identity (matches the pre-fix observable behaviour).
Roles = roles;
LdapGroups = ldapGroups;
}
@@ -15,7 +15,11 @@ namespace ZB.MOM.WW.OtOpcUa.Server;
/// runs until stopped.
/// </summary>
public sealed class OpcUaServerService(
NodeBootstrap bootstrap,
// Server-014 — consume SealedBootstrap so the Phase 6.1 Stream D resilient-reader +
// GenerationSealedCache + StaleConfigFlag chain actually runs on every boot. The
// legacy NodeBootstrap is still registered for integration tests that construct it
// directly, but the production service uses the generation-sealed path.
SealedBootstrap bootstrap,
DriverHost driverHost,
OpcUaApplicationHost applicationHost,
DriverEquipmentContentRegistry equipmentContentRegistry,
+11 -1
View File
@@ -116,6 +116,12 @@ builder.Services.AddSingleton<ILocalConfigCache>(_ => new LiteDbConfigCache(opti
builder.Services.AddSingleton<DriverHost>();
builder.Services.AddSingleton<NodeBootstrap>();
// Server-014: register the Phase 6.1 Stream D generation-sealed bootstrap chain. Production
// uses SealedBootstrap; NodeBootstrap stays registered for back-compat with the integration
// tests that depend on it directly. StaleConfigFlag is resolved into OpcUaApplicationHost so
// /healthz surfaces the stale-config signal when the cache fallback path serves a request.
builder.Services.AddSealedBootstrap(options);
// Task #248 — driver-instance bootstrap pipeline. DriverFactoryRegistry is the
// type-name → factory map; each driver project's static Register call pre-loads
// its factory so the bootstrapper can materialise DriverInstance rows from the
@@ -217,7 +223,11 @@ builder.Services.AddSingleton<OpcUaApplicationHost>(sp =>
equipmentContentLookup: registry.Get,
historyRouter: sp.GetRequiredService<IHistoryRouter>(),
alarmConditionService: sp.GetRequiredService<AlarmConditionService>(),
configDbHealthy: () => dbHealthCache.IsHealthy);
configDbHealthy: () => dbHealthCache.IsHealthy,
// Server-014: surface StaleConfigFlag into /healthz so a cache-fallback bootstrap
// (central DB unreachable + sealed snapshot served) flips usingStaleConfig=true on
// the next probe. Without this wire-up the flag was inert.
staleConfigFlag: sp.GetRequiredService<ZB.MOM.WW.OtOpcUa.Configuration.LocalCache.StaleConfigFlag>());
});
builder.Services.AddHostedService<OpcUaServerService>();
@@ -12,14 +12,17 @@ namespace ZB.MOM.WW.OtOpcUa.Server;
/// sealed snapshot to fall back to.
/// </summary>
/// <remarks>
/// <para>Alongside the original <see cref="NodeBootstrap"/> (which uses the single-file
/// <see cref="ILocalConfigCache"/>). Program.cs can switch to this one once operators are
/// ready for the generation-sealed semantics. The original stays for backward compat
/// with the three integration tests that construct <see cref="NodeBootstrap"/> directly.</para>
/// <para>Server-014 — registered in DI via <c>ServerWiring.AddSealedBootstrap</c> and
/// consumed by <c>OpcUaServerService</c>. The legacy <see cref="NodeBootstrap"/> stays
/// registered alongside for the three integration tests that construct it directly, but
/// production boots through this wrapper so <see cref="GenerationSealedCache"/> +
/// <see cref="ResilientConfigReader"/> + <see cref="StaleConfigFlag"/> run on every
/// start-up and <c>/healthz</c>'s <c>usingStaleConfig</c> reflects the cache-fallback
/// state.</para>
///
/// <para>Closes release blocker #2 in <c>docs/v2/v2-release-readiness.md</c> — the
/// generation-sealed cache + resilient reader + stale-config flag ship as unit-tested
/// primitives in PR #81 but no production path consumed them until this wrapper.</para>
/// primitives in PR #81; this wrapper is the production consumer that wires them in.</para>
/// </remarks>
public sealed class SealedBootstrap
{
@@ -0,0 +1,57 @@
using Microsoft.Extensions.DependencyInjection;
using ZB.MOM.WW.OtOpcUa.Configuration.LocalCache;
namespace ZB.MOM.WW.OtOpcUa.Server;
/// <summary>
/// DI registration helpers consumed by <c>Program.cs</c>. Extracted so tests can assert
/// the production wire-up actually composes without spinning up the full <c>Host</c>.
/// </summary>
public static class ServerWiring
{
/// <summary>
/// Server-014 — registers the Phase 6.1 Stream D generation-sealed bootstrap chain:
/// <see cref="GenerationSealedCache"/>, <see cref="StaleConfigFlag"/>,
/// <see cref="ResilientConfigReader"/>, and <see cref="SealedBootstrap"/>. Without these
/// registrations <c>OpcUaServerService</c> cannot consume the sealed bootstrap and the
/// <see cref="StaleConfigFlag"/> stays inert — <c>/healthz</c>'s <c>usingStaleConfig</c>
/// never flips on a DB outage with a warm cache.
/// </summary>
/// <remarks>
/// The cache root is sourced from <see cref="NodeOptions.LocalCachePath"/> — same path
/// the legacy <see cref="NodeBootstrap"/> uses for its LiteDB cache, so both bootstrap
/// paths persist alongside each other while the migration completes.
/// </remarks>
public static IServiceCollection AddSealedBootstrap(this IServiceCollection services, NodeOptions options)
{
// Use a sibling directory off LocalCachePath so the LiteDB file and the
// GenerationSealedCache snapshots don't clash. The cache root is a directory;
// LocalCachePath is canonically the LiteDB file path.
var cacheRoot = ResolveCacheRoot(options.LocalCachePath);
// Register NodeOptions only if the caller hasn't already done so — Program.cs
// registers it earlier in its DI chain, but the wiring helper supports standalone
// unit tests that want to compose just the SealedBootstrap chain.
if (!services.Any(d => d.ServiceType == typeof(NodeOptions)))
services.AddSingleton(options);
services.AddSingleton(new GenerationSealedCache(cacheRoot));
services.AddSingleton<StaleConfigFlag>();
services.AddSingleton<ResilientConfigReader>();
services.AddSingleton<SealedBootstrap>();
return services;
}
private static string ResolveCacheRoot(string localCachePath)
{
// LocalCachePath is the LiteDB file (e.g. "config_cache.db"); the sealed cache is a
// directory. Pick a sibling folder so the two don't share a path.
if (string.IsNullOrWhiteSpace(localCachePath))
return Path.Combine(Path.GetTempPath(), "otopcua-sealed-cache");
var dir = Path.GetDirectoryName(localCachePath);
var name = Path.GetFileNameWithoutExtension(localCachePath);
var root = string.IsNullOrEmpty(dir)
? $"{name}.sealed"
: Path.Combine(dir, $"{name}.sealed");
return root;
}
}
@@ -606,6 +606,253 @@ public sealed class ScriptedAlarmEngineTests
"Uncertain-quality inputs are treated as ready — predicate evaluates");
}
// -------------------------------------------------------------------------
// Core.ScriptedAlarms-003: OnEvent emission must not block under _evalGate.
// (1) A slow subscriber must not block the gate for other alarms.
// (2) A subscriber that re-enters the engine (e.g. AcknowledgeAsync) must
// not deadlock against _evalGate. Both regressions are covered here.
// -------------------------------------------------------------------------
[Fact]
public async Task OnEvent_subscriber_can_call_back_into_engine_without_deadlock(/* -003 */)
{
// Re-entrancy regression. When OnEvent emission was inside _evalGate, a
// subscriber that called an engine method (e.g. AcknowledgeAsync) hung
// forever because the non-reentrant SemaphoreSlim refused to re-grant
// the gate the dispatch path was still holding. After the fix, emission
// happens AFTER Release() so the subscriber's call acquires the gate
// cleanly and the operator-driven action completes.
var up = new FakeUpstream();
up.Set("Temp", 50);
var eng = Build(up, out _);
try
{
await eng.LoadAsync([Alarm("HighTemp", """return (int)ctx.GetTag("Temp").Value > 100;""")],
TestContext.Current.CancellationToken);
// Subscriber re-enters the engine via Task.Run so the OnEvent
// dispatch thread is not blocked while waiting. Either way, with
// the fix in place AcknowledgeAsync must acquire _evalGate (the
// dispatch path released it before invoking the subscriber) and
// complete in well under the timeout.
var ackDone = new TaskCompletionSource();
eng.OnEvent += (_, e) =>
{
if (e.Emission != EmissionKind.Activated) return;
_ = Task.Run(async () =>
{
try
{
await eng.AcknowledgeAsync(e.AlarmId, "sub", null, CancellationToken.None);
ackDone.TrySetResult();
}
catch (Exception ex) { ackDone.TrySetException(ex); }
});
};
up.Push("Temp", 150);
var winner = await Task.WhenAny(ackDone.Task, Task.Delay(TimeSpan.FromSeconds(3)));
winner.ShouldBe(ackDone.Task,
"subscriber re-entering the engine must not deadlock against _evalGate");
await ackDone.Task; // surface any inner exception
eng.GetState("HighTemp")!.Acked.ShouldBe(AlarmAckedState.Acknowledged);
}
finally
{
eng.Dispose();
}
}
[Fact]
public void OnEvent_emission_happens_outside_evalGate(/* -003 */)
{
// Direct white-box check on the gate-release ordering: AcknowledgeAsync
// emits the Acknowledged event AFTER releasing the gate. We assert that
// by observing the gate is acquirable from inside the subscriber.
// SemaphoreSlim.Wait(0) returns true only if the count > 0 (gate free).
var up = new FakeUpstream();
up.Set("Temp", 50);
var eng = Build(up, out _);
try
{
eng.LoadAsync([Alarm("HighTemp", """return (int)ctx.GetTag("Temp").Value > 100;""")],
TestContext.Current.CancellationToken).GetAwaiter().GetResult();
// Drive to Active so Acknowledge has something to ack.
up.Push("Temp", 150);
// Use the same WaitForAsync that other tests use — synchronously
// here since this is a non-async test.
for (var i = 0; i < 80 && eng.GetState("HighTemp")!.Active != AlarmActiveState.Active; i++)
Thread.Sleep(25);
eng.GetState("HighTemp")!.Active.ShouldBe(AlarmActiveState.Active);
// Use reflection to peek at _evalGate so the subscriber can probe it.
var gateField = typeof(ScriptedAlarmEngine).GetField(
"_evalGate", System.Reflection.BindingFlags.NonPublic | System.Reflection.BindingFlags.Instance);
gateField.ShouldNotBeNull();
var gate = (SemaphoreSlim)gateField.GetValue(eng)!;
var gateFreeInsideEmission = false;
eng.OnEvent += (_, e) =>
{
if (e.Emission != EmissionKind.Acknowledged) return;
// SemaphoreSlim.Wait(0) — non-blocking try-take. If the gate is
// free we acquire it (count back to 0); release immediately.
if (gate.Wait(0))
{
gateFreeInsideEmission = true;
gate.Release();
}
};
eng.AcknowledgeAsync("HighTemp", "alice", null, CancellationToken.None)
.GetAwaiter().GetResult();
gateFreeInsideEmission.ShouldBeTrue(
"_evalGate must be released before OnEvent fires so subscribers " +
"can call back into the engine without deadlocking");
}
finally
{
eng.Dispose();
}
}
// -------------------------------------------------------------------------
// Core.ScriptedAlarms-006: Dispose must drain in-flight background tasks
// launched by OnUpstreamChange / RunShelvingCheck. Otherwise a re-evaluation
// or shelving check started just before Dispose can keep running and write
// to a (possibly disposed) store after the engine has returned.
// -------------------------------------------------------------------------
[Fact]
public async Task Dispose_drains_in_flight_reevaluation_tasks(/* -006 */)
{
var up = new FakeUpstream();
up.Set("Temp", 50);
var logger = new LoggerConfiguration().CreateLogger();
var slowStore = new BlockingSaveAlarmStateStore();
var eng = new ScriptedAlarmEngine(up, slowStore, new ScriptLoggerFactory(logger), logger);
await eng.LoadAsync([Alarm("A", """return (int)ctx.GetTag("Temp").Value > 100;""")],
TestContext.Current.CancellationToken);
// Block the NEXT save (the one triggered by the push below).
var saveGate = new TaskCompletionSource();
slowStore.BlockNextSave = saveGate;
// Trigger a re-evaluation that will go inside _evalGate and call SaveAsync.
up.Push("Temp", 150);
// Wait until the store's SaveAsync is actually blocked.
await WaitForAsync(() => slowStore.SaveInProgress, timeoutMs: 1000);
// Dispose must wait for the in-flight reevaluation to complete rather
// than returning while a background task still runs.
var disposeTask = Task.Run(() => eng.Dispose());
// Verify Dispose does NOT complete immediately — it should block waiting
// for the in-flight task. Without the -006 fix Dispose returns straight
// away and the background reevaluation can outlive the engine.
var prematureFinish = await Task.WhenAny(disposeTask, Task.Delay(200));
prematureFinish.ShouldNotBe(disposeTask,
"Dispose must block until in-flight background tasks complete");
// Let the save complete and verify Dispose then returns.
saveGate.SetResult();
await disposeTask.WaitAsync(TimeSpan.FromSeconds(3), TestContext.Current.CancellationToken);
slowStore.SaveInProgress.ShouldBeFalse("background task drained before Dispose returned");
}
// -------------------------------------------------------------------------
// Core.ScriptedAlarms-010: predicate evaluation and message-template
// resolution apply different quality bars on purpose. Predicate evaluation
// accepts Uncertain (the predicate can still inspect the value); message
// resolution renders Uncertain as "{?}" so the operator sees the doubt
// explicitly. The two policies are documented in docs/ScriptedAlarms.md.
// -------------------------------------------------------------------------
[Fact]
public async Task Uncertain_quality_drives_predicate_but_renders_question_mark_in_message(/* -010 */)
{
var up = new FakeUpstream();
// Seed with Uncertain quality (severity bit 30 set, bit 31 clear).
up.Set("Temp", 150, statusCode: 0x40000000u);
using var eng = Build(up, out _);
await eng.LoadAsync([
new ScriptedAlarmDefinition(
"HighTemp", "Plant/Line1", "HighTemp",
AlarmKind.LimitAlarm, AlarmSeverity.High,
"Temp {Temp} exceeded limit",
"""return (int)ctx.GetTag("Temp").Value > 100;"""),
], TestContext.Current.CancellationToken);
// Predicate evaluated (Uncertain treated as ready) → alarm Active.
eng.GetState("HighTemp")!.Active.ShouldBe(AlarmActiveState.Active,
"AreInputsReady accepts Uncertain so the predicate runs");
// But the resolved emission message must show "{?}" for the Uncertain
// tag — only Good substitutes into the operator-facing message.
var events = new List<ScriptedAlarmEvent>();
eng.OnEvent += (_, e) => events.Add(e);
up.Push("Temp", 200, statusCode: 0x40000000u); // still Uncertain
// Trigger another evaluation to get an emission (already active, so
// we need a clear → re-activate cycle). Easier: force the same path
// through a comment which emits a CommentAdded message. But comments
// don't run the template. Instead clear it then re-activate.
up.Push("Temp", 50, statusCode: 0u); // Good, predicate becomes false
await WaitForAsync(() => events.Any(e => e.Emission == EmissionKind.Cleared));
events.Clear();
up.Push("Temp", 200, statusCode: 0x40000000u); // Uncertain, predicate true
await WaitForAsync(() => events.Any(e => e.Emission == EmissionKind.Activated));
// The Activated message must show {?} for the Uncertain input.
events.Single(e => e.Emission == EmissionKind.Activated).Message
.ShouldBe("Temp {?} exceeded limit",
"MessageTemplate.Resolve renders non-Good StatusCode as {?} " +
"even though predicate evaluation accepted the Uncertain value");
}
// -------------------------------------------------------------------------
// Core.ScriptedAlarms-008: switch Comments to ImmutableList for O(log n)
// append. The persisted runtime type must be ImmutableList<AlarmComment>
// (which still satisfies IReadOnlyList<AlarmComment> for existing
// consumers).
// -------------------------------------------------------------------------
[Fact]
public async Task Comments_collection_uses_ImmutableList_for_efficient_append(/* -008 */)
{
var up = new FakeUpstream();
up.Set("Temp", 50);
using var eng = Build(up, out _);
await eng.LoadAsync([Alarm("A", "return false;")], TestContext.Current.CancellationToken);
// Add a comment so AppendComment runs.
await eng.AddCommentAsync("A", "alice", "note", TestContext.Current.CancellationToken);
var s = eng.GetState("A")!;
s.Comments.ShouldBeOfType<System.Collections.Immutable.ImmutableList<AlarmComment>>(
"Comments should be an ImmutableList so append is O(log n), not O(n)");
}
// -------------------------------------------------------------------------
// Core.ScriptedAlarms-011: TransitionResult.NoOp's reason parameter must be
// propagated, not silently discarded. The class-level remarks promise a
// diagnostic log line for no-op disabled-alarm evaluations.
// -------------------------------------------------------------------------
[Fact]
public void TransitionResult_NoOp_propagates_reason(/* -011 */)
{
var fresh = AlarmConditionState.Fresh("a-1", DateTime.UtcNow);
var r = TransitionResult.NoOp(fresh, "disabled — predicate result ignored");
r.NoOpReason.ShouldBe("disabled — predicate result ignored",
"NoOp reason must be preserved on the TransitionResult so callers can log it");
}
[Fact]
public void TransitionResult_None_carries_no_reason(/* -011 */)
{
var fresh = AlarmConditionState.Fresh("a-1", DateTime.UtcNow);
var r = TransitionResult.None(fresh);
r.NoOpReason.ShouldBeNull("None() factory has no reason — only NoOp() carries one");
}
private static async Task WaitForAsync(Func<bool> cond, int timeoutMs = 2000)
{
var deadline = DateTime.UtcNow.AddMilliseconds(timeoutMs);
@@ -645,4 +892,37 @@ public sealed class ScriptedAlarmEngineTests
public Task RemoveAsync(string alarmId, CancellationToken ct)
=> _inner.RemoveAsync(alarmId, ct);
}
/// <summary>
/// A store whose SaveAsync can be made to block until the test signals it.
/// Used to verify Dispose drains in-flight background tasks (finding -006).
/// </summary>
private sealed class BlockingSaveAlarmStateStore : IAlarmStateStore
{
private readonly InMemoryAlarmStateStore _inner = new();
public TaskCompletionSource? BlockNextSave { get; set; }
public bool SaveInProgress { get; private set; }
public Task<AlarmConditionState?> LoadAsync(string alarmId, CancellationToken ct)
=> _inner.LoadAsync(alarmId, ct);
public Task<IReadOnlyList<AlarmConditionState>> LoadAllAsync(CancellationToken ct)
=> _inner.LoadAllAsync(ct);
public async Task SaveAsync(AlarmConditionState state, CancellationToken ct)
{
var gate = BlockNextSave;
if (gate is not null)
{
BlockNextSave = null;
SaveInProgress = true;
try { await gate.Task.WaitAsync(ct).ConfigureAwait(false); }
finally { SaveInProgress = false; }
}
await _inner.SaveAsync(state, ct).ConfigureAwait(false);
}
public Task RemoveAsync(string alarmId, CancellationToken ct)
=> _inner.RemoveAsync(alarmId, ct);
}
}
@@ -148,4 +148,77 @@ public sealed class CompiledScriptCacheTests
var cache = new CompiledScriptCache<FakeScriptContext, int>();
Should.Throw<ArgumentNullException>(() => cache.GetOrCompile(null!));
}
[Fact]
public void Failed_compile_eviction_does_not_remove_a_concurrent_retry_entry()
{
// Regression for Core.Scripting-006: when a faulted Lazy is observed by a thread,
// the eviction must scope to that specific Lazy instance, not the key. If a
// concurrent retry has already inserted a fresh Lazy under the same key between
// the throw and the catch-block removal, the buggy TryRemove(key, out _) overload
// evicts the retry entry. The fixed TryRemove(KeyValuePair<,>) overload compares
// value identity, so only the faulted Lazy is removed.
//
// Deterministic setup: pre-populate the cache's internal dictionary with a
// faulted Lazy whose factory itself swaps the entry to a fresh Lazy as a side
// effect during the throw. By the time GetOrCompile reaches its catch block, the
// dictionary holds the fresh entry under the same key — exactly the race window
// the finding describes. The fix must leave the fresh entry in place.
var cache = new CompiledScriptCache<FakeScriptContext, int>();
// Reach the private _cache + HashSource via reflection — they're private, so
// InternalsVisibleTo doesn't help.
var cacheField = typeof(CompiledScriptCache<FakeScriptContext, int>)
.GetField("_cache", System.Reflection.BindingFlags.Instance | System.Reflection.BindingFlags.NonPublic);
cacheField.ShouldNotBeNull();
var dict = (System.Collections.Concurrent.ConcurrentDictionary<
string, Lazy<ScriptEvaluator<FakeScriptContext, int>>>)cacheField!.GetValue(cache)!;
const string source = """return 7;""";
var hashSourceMethod = typeof(CompiledScriptCache<FakeScriptContext, int>)
.GetMethod("HashSource", System.Reflection.BindingFlags.Static | System.Reflection.BindingFlags.NonPublic);
hashSourceMethod.ShouldNotBeNull();
var key = (string)hashSourceMethod!.Invoke(null, [source])!;
// The fresh Lazy is what a concurrent retry would have inserted between the
// faulted-throw and the catch's removal. Materialise it eagerly so we have a
// stable reference to assert identity against.
var fresh = new Lazy<ScriptEvaluator<FakeScriptContext, int>>(
() => ScriptEvaluator<FakeScriptContext, int>.Compile(source),
LazyThreadSafetyMode.ExecutionAndPublication);
// The faulted Lazy throws — but only after swapping its own dictionary entry
// for the fresh Lazy, modelling the race window between the throw and the
// catch-block eviction.
var faulted = new Lazy<ScriptEvaluator<FakeScriptContext, int>>(
() =>
{
dict[key] = fresh;
throw new InvalidOperationException("bad compile");
},
LazyThreadSafetyMode.ExecutionAndPublication);
dict[key] = faulted;
// Drive GetOrCompile through the public API. It observes the faulted Lazy
// currently under `key`, invokes .Value (which swaps in the fresh Lazy then
// throws), and runs the catch block's eviction. The fix removes only the
// specific faulted Lazy instance; the fresh entry survives.
Should.Throw<InvalidOperationException>(() => cache.GetOrCompile(source));
dict.ContainsKey(key).ShouldBeTrue(
"the fresh retry entry that won the race must survive the faulted Lazy eviction (Core.Scripting-006)");
ReferenceEquals(dict[key], fresh).ShouldBeTrue(
"the entry under the key must still be the fresh Lazy — an unconditional TryRemove(key) would have evicted it");
}
[Fact]
public void Failed_compile_path_still_evicts_its_own_faulted_entry()
{
// Companion to the race test above: confirm the fix's value-scoped eviction
// still removes the actual faulted Lazy (so retries with corrected source can
// succeed — the original Core.Scripting test that locked the contract).
var cache = new CompiledScriptCache<FakeScriptContext, int>();
Should.Throw<Exception>(() => cache.GetOrCompile("""return unknownIdentifier + 1;"""));
cache.Count.ShouldBe(0, "faulted Lazy must still be evicted after compile failure");
}
}
@@ -209,4 +209,33 @@ public sealed class DependencyExtractorTests
result.IsValid.ShouldBeTrue();
result.Reads.Count.ShouldBe(2);
}
[Fact]
public void Accepts_single_line_raw_string_literal_path()
{
// A single-line raw string literal ("""Line1/Speed""") tokenizes as
// SingleLineRawStringLiteralToken, not StringLiteralToken — the old check
// would mis-reject it as a "dynamic path". Confirm static raw-string paths are
// harvested. (Core.Scripting-005.)
var src = "return ctx.GetTag(\"\"\"Line1/Speed\"\"\").Value;";
var result = DependencyExtractor.Extract(src);
result.IsValid.ShouldBeTrue();
result.Reads.ShouldContain("Line1/Speed");
result.Rejections.ShouldBeEmpty();
}
[Fact]
public void Accepts_multi_line_raw_string_literal_path()
{
// A multi-line raw string literal tokenizes as MultiLineRawStringLiteralToken.
// Even though it is unusual for a tag path, it is still a static string and
// must not be mis-rejected. (Core.Scripting-005.)
// Note: the multi-line raw string strips the common leading indent and the
// surrounding newlines, leaving exactly the body text.
var src = "return ctx.GetTag(\"\"\"\nLine1/Speed\n\"\"\").Value;";
var result = DependencyExtractor.Extract(src);
result.IsValid.ShouldBeTrue();
result.Reads.ShouldContain("Line1/Speed");
result.Rejections.ShouldBeEmpty();
}
}
@@ -0,0 +1,70 @@
using Shouldly;
using Xunit;
using ZB.MOM.WW.OtOpcUa.Core.Scripting;
namespace ZB.MOM.WW.OtOpcUa.Core.Scripting.Tests;
/// <summary>
/// Locks the boundary semantics of <see cref="ScriptContext.Deadband"/>. The helper
/// is the canonical "ignore small noise" predicate alarm authors compose into bigger
/// expressions; subtle sign / boundary changes here would silently move the
/// active-state edge of every consuming alarm. (Core.Scripting-011.)
/// </summary>
[Trait("Category", "Unit")]
public sealed class ScriptContextTests
{
[Fact]
public void Deadband_returns_false_when_difference_equals_tolerance()
{
// Strict greater-than comparison: a delta exactly equal to tolerance is
// considered "within deadband" and must NOT trip.
ScriptContext.Deadband(current: 10.5, previous: 10.0, tolerance: 0.5).ShouldBeFalse();
}
[Fact]
public void Deadband_returns_true_when_difference_just_exceeds_tolerance()
{
// Any delta strictly greater than tolerance trips the deadband.
ScriptContext.Deadband(current: 10.6, previous: 10.0, tolerance: 0.5).ShouldBeTrue();
}
[Fact]
public void Deadband_returns_false_when_difference_just_below_tolerance()
{
ScriptContext.Deadband(current: 10.4, previous: 10.0, tolerance: 0.5).ShouldBeFalse();
}
[Fact]
public void Deadband_is_symmetric_in_direction_of_change()
{
// Math.Abs ensures the helper trips identically whether the value rose or fell.
ScriptContext.Deadband(current: 9.0, previous: 10.0, tolerance: 0.5).ShouldBeTrue();
ScriptContext.Deadband(current: 11.0, previous: 10.0, tolerance: 0.5).ShouldBeTrue();
}
[Fact]
public void Deadband_returns_false_when_values_are_equal()
{
ScriptContext.Deadband(current: 10.0, previous: 10.0, tolerance: 0.001).ShouldBeFalse();
}
[Fact]
public void Deadband_with_zero_tolerance_returns_true_for_any_difference()
{
// Zero-tolerance is the "trip on any non-equal change" mode. Equality still
// returns false (delta 0 is not strictly greater than 0).
ScriptContext.Deadband(current: 10.0, previous: 10.0, tolerance: 0).ShouldBeFalse();
ScriptContext.Deadband(current: 10.0, previous: 10.000001, tolerance: 0).ShouldBeTrue();
}
[Fact]
public void Deadband_with_negative_tolerance_always_trips_for_unequal_values()
{
// A negative tolerance is nonsensical input but the helper does not guard
// against it — Math.Abs(delta) is always >= 0, so the comparison is
// effectively "any non-equal change". Lock this so a refactor that adds
// (or removes) input validation requires an explicit test update.
ScriptContext.Deadband(current: 10.0, previous: 10.5, tolerance: -1.0).ShouldBeTrue();
ScriptContext.Deadband(current: 10.0, previous: 10.0, tolerance: -1.0).ShouldBeTrue();
}
}
@@ -152,4 +152,47 @@ public sealed class ScriptLogCompanionSinkTests
scriptLogger.ForContext(ScriptLoggerFactory.ScriptNameProperty, "X").Fatal("fatal");
mainSink.Events.Count.ShouldBe(1);
}
[Fact]
public void Factory_plus_companion_sink_integration_surfaces_script_error_in_both_logs()
{
// End-to-end Core.Scripting-011 coverage: an Error-level emission from a logger
// built by ScriptLoggerFactory must land in BOTH the per-script sink (acting as
// the scripts-*.log substitute here) AND the main-log companion at Warning level,
// tagged with the originating ScriptName property. This is the integration the
// server's logger pipeline relies on so operators see script failures in the
// primary log without monitoring the scripts file separately.
var scriptsSink = new CapturingSink();
var mainSink = new CapturingSink();
// Main pipeline — receives only Error+ events forwarded by the companion sink.
var mainLogger = new LoggerConfiguration()
.MinimumLevel.Verbose().WriteTo.Sink(mainSink).CreateLogger();
// Root script pipeline — fans out to the scripts sink (stand-in for the
// rolling scripts-*.log file) AND the companion sink that forwards Error+
// to the main logger.
var rootScriptLogger = new LoggerConfiguration()
.MinimumLevel.Verbose()
.WriteTo.Sink(scriptsSink)
.WriteTo.Sink(new ScriptLogCompanionSink(mainLogger))
.CreateLogger();
// Factory binds the per-script ScriptName property — this is the only way the
// mirror knows which script raised the event.
var factory = new ScriptLoggerFactory(rootScriptLogger);
var perScript = factory.Create("HighTemp");
perScript.Error("threshold exceeded");
// Scripts sink saw the Error at its original level with the ScriptName bound.
scriptsSink.Events.Count.ShouldBe(1);
scriptsSink.Events[0].Level.ShouldBe(LogEventLevel.Error);
((ScalarValue)scriptsSink.Events[0].Properties[ScriptLoggerFactory.ScriptNameProperty]).Value.ShouldBe("HighTemp");
// Main sink saw a Warning-level companion entry tagged with the same ScriptName.
mainSink.Events.Count.ShouldBe(1);
mainSink.Events[0].Level.ShouldBe(LogEventLevel.Warning);
((ScalarValue)mainSink.Events[0].Properties[ScriptLoggerFactory.ScriptNameProperty]).Value.ShouldBe("HighTemp");
((ScalarValue)mainSink.Events[0].Properties["OriginalLevel"]).Value.ShouldBe(LogEventLevel.Error);
}
}
@@ -0,0 +1,56 @@
using Shouldly;
using Xunit;
using ZB.MOM.WW.OtOpcUa.Core.Scripting;
namespace ZB.MOM.WW.OtOpcUa.Core.Scripting.Tests;
/// <summary>
/// Covers the <see cref="ScriptSandbox.Build"/> argument-validation guards. The
/// guards are the only call-site protection against a typo / mis-wired context
/// type silently producing a sandbox missing the concrete context's assembly
/// reference; without coverage, the guards could be deleted by a refactor without
/// any test failing. (Core.Scripting-011.)
/// </summary>
[Trait("Category", "Unit")]
public sealed class ScriptSandboxBuildTests
{
[Fact]
public void Null_context_type_throws_ArgumentNullException()
{
Should.Throw<ArgumentNullException>(() => ScriptSandbox.Build(null!));
}
[Fact]
public void Non_ScriptContext_type_throws_ArgumentException()
{
// ScriptSandbox must reject types that don't derive from ScriptContext —
// ScriptGlobals<TContext> is constrained where TContext : ScriptContext, so
// sneaking a non-derived type past Build would later blow up inside Roslyn.
Should.Throw<ArgumentException>(() => ScriptSandbox.Build(typeof(string)))
.ParamName.ShouldBe("contextType");
}
[Fact]
public void Abstract_ScriptContext_base_type_is_accepted()
{
// The base ScriptContext type satisfies the IsAssignableFrom check, so the
// factory must not reject it even though it cannot be instantiated directly.
// Callers wiring a base-typed sandbox up for diagnostics rely on this.
var options = ScriptSandbox.Build(typeof(ScriptContext));
options.ShouldNotBeNull();
}
[Fact]
public void Concrete_subclass_is_accepted_and_its_assembly_referenced()
{
// The concrete context type's assembly must end up in the allow-listed
// references — otherwise Roslyn cannot resolve ScriptGlobals<TContext> at
// compile. We can't easily inspect the ScriptOptions metadata references
// by-assembly cross-version, so we exercise the end-to-end path instead: a
// script compiled against FakeScriptContext must successfully reach a
// FakeScriptContext member.
var evaluator = ScriptEvaluator<FakeScriptContext, double>.Compile(
"""return (double)ctx.GetTag("X").Value;""");
evaluator.ShouldNotBeNull();
}
}
@@ -153,6 +153,65 @@ public sealed class DependencyGraphTests
g.DirectDependents("A").ShouldBe(new[] { "B" });
}
// ----- Core.VirtualTags-013: DependencyCycleException message must not present SCC as edge path -----
[Fact]
public void DependencyCycleException_message_describes_cycle_members_not_a_fabricated_edge_path()
{
// Regression for Core.VirtualTags-013: Tarjan returns SCC members in stack-pop
// order, NOT in edge-traversal order. The exception message must not render the
// members as "A -> B -> C -> A" — that misleads operators into looking for an
// edge that may not be in the config. Instead the message uses a set-form
// ("members: A, B, C") or a labelled traversal.
var g = new DependencyGraph();
g.Add("A", Set("B"));
g.Add("B", Set("A"));
var ex = Should.Throw<DependencyCycleException>(() => g.TopologicalSort());
// The arrow ("->") notation as used previously (string.Join(" -> ", c) + " -> " + c[0])
// implies an ordered edge path. After the fix, the message must NOT contain the
// closing edge `-> A` (i.e. " -> " + first-member) on its own — the formatting
// must clearly mark the list as cycle MEMBERS rather than an edge sequence.
ex.Message.ShouldContain("cycle");
ex.Message.ShouldContain("A");
ex.Message.ShouldContain("B");
// Verify the message uses a member-list framing ("members:" or "members of cycle"
// or commas) rather than the misleading edge-path framing.
ex.Message.ShouldContain("member", Case.Insensitive,
"message should label entries as cycle members, not present them as an edge path");
}
// ----- Core.VirtualTags-009: empty-set allocation on miss -----
[Fact]
public void DirectDependencies_miss_returns_shared_empty_set_instance()
{
// Regression for Core.VirtualTags-009: calling DirectDependencies for an
// unregistered node should NOT allocate a fresh HashSet each time. The miss
// path returns a shared empty set so the change-cascade hot path doesn't
// churn the GC.
var g = new DependencyGraph();
var a = g.DirectDependencies("Unknown1");
var b = g.DirectDependencies("Unknown2");
a.ShouldBeEmpty();
b.ShouldBeEmpty();
ReferenceEquals(a, b).ShouldBeTrue("miss path must return the shared empty-set instance");
}
[Fact]
public void DirectDependents_miss_returns_shared_empty_set_instance()
{
// Same regression as above for DirectDependents — called from inside the
// CascadeAsync DFS and TopologicalSort Kahn loop, so the miss-path allocation
// is on every change-cascade event.
var g = new DependencyGraph();
var a = g.DirectDependents("LeafA");
var b = g.DirectDependents("LeafB");
a.ShouldBeEmpty();
b.ShouldBeEmpty();
ReferenceEquals(a, b).ShouldBeTrue("miss path must return the shared empty-set instance");
}
[Fact]
public void Deep_graph_no_stack_overflow()
{
@@ -92,6 +92,53 @@ public sealed class TimerTriggerSchedulerTests
}));
}
// ----- Core.VirtualTags-007: timer ticks must not block pool threads and must skip when prior tick is still running -----
[Fact]
public async Task Tick_skips_when_prior_tick_for_the_same_group_is_still_running()
{
// Regression for Core.VirtualTags-007: if a single tick takes longer than the
// interval, subsequent timer callbacks must NOT each pin a thread-pool thread
// waiting on the same evaluation gate. The scheduler tracks an in-flight flag
// per group and skips a new tick when the prior one is still running.
var up = new FakeUpstream();
up.Set("In", 1);
var logger = new LoggerConfiguration().CreateLogger();
// Slow script — each evaluation takes longer than several timer intervals.
const int slowMs = 250;
const int intervalMs = 50;
using var engine = new VirtualTagEngine(up,
new ScriptLoggerFactory(logger), logger);
engine.Load([new VirtualTagDefinition(
"Slow", DriverDataType.Int32,
$$"""
var end = DateTime.UtcNow.AddMilliseconds({{slowMs}});
while (DateTime.UtcNow < end) { }
return (int)ctx.GetTag("In").Value;
""",
ChangeTriggered: false,
TimerInterval: TimeSpan.FromMilliseconds(intervalMs))]);
using var sched = new TimerTriggerScheduler(engine, logger);
sched.Start([new VirtualTagDefinition(
"Slow", DriverDataType.Int32,
"",
ChangeTriggered: false,
TimerInterval: TimeSpan.FromMilliseconds(intervalMs))]);
// Wait long enough for many timer ticks at 50ms while one evaluation
// (~250ms each) holds the engine. Window is 600ms ~ 12 ticks.
await Task.Delay(600);
// With the fix in place, ticks that fire while the previous one for the same
// group is still running are skipped. The skipped count must be measurable; if
// SkippedTickCount is still 0 after 600ms with ~12 ticks fired and a 250ms eval,
// the fix is not working — at minimum 3-4 ticks must have been skipped.
sched.SkippedTickCount.ShouldBeGreaterThan(2,
"ticks that fire while the prior tick for the same group is still running must be skipped");
}
[Fact]
public void Disposed_scheduler_stops_firing()
{
@@ -400,24 +400,25 @@ public sealed class VirtualTagEngineTests
}
[Fact]
public async Task SetVirtualTag_on_non_registered_path_logs_warning_and_does_not_throw()
public async Task SetVirtualTag_on_non_registered_path_is_caught_at_Load()
{
// Arrange: script writes to a path that is not a registered virtual tag.
// Originally validated the runtime warning-and-drop branch in OnScriptSetVirtualTag.
// After Core.VirtualTags-011 the static DependencyExtractor.Writes set is validated
// at Load time, so a literal-string write to a non-existent path is now rejected
// at publish — the dynamic warning path is reserved as a defensive guard for cases
// the static extractor cannot see (currently none, since dynamic paths are also
// rejected at extraction).
var up = new FakeUpstream();
up.Set("In", 1);
using var engine = Build(up);
engine.Load([new VirtualTagDefinition(
"Writer", DriverDataType.Int32,
"""
ctx.SetVirtualTag("NonExistentPath", 99);
return (int)ctx.GetTag("In").Value;
""")]);
// Act + Assert: should not throw; engine stays healthy.
await engine.EvaluateAllAsync(TestContext.Current.CancellationToken);
engine.Read("Writer").StatusCode.ShouldBe(0u, "engine must not fault on write to non-registered path");
engine.Read("Writer").Value.ShouldBe(1);
Should.Throw<InvalidOperationException>(() => engine.Load([
new VirtualTagDefinition("Writer", DriverDataType.Int32,
"""
ctx.SetVirtualTag("NonExistentPath", 99);
return (int)ctx.GetTag("In").Value;
""")
])).Message.ShouldContain("NonExistentPath");
await Task.CompletedTask;
}
@@ -451,6 +452,136 @@ public sealed class VirtualTagEngineTests
engine.Read("Bad").Value.ShouldBeNull();
}
// ----- Core.VirtualTags-011: Writes target validation at Load time -----
[Fact]
public async Task Load_rejects_script_writing_to_unregistered_virtual_tag_path()
{
// Regression for Core.VirtualTags-011: a script that calls
// ctx.SetVirtualTag("Typo", ...) must be caught at publish/load time rather than
// silently dropped at runtime, so operator typos surface as a publish failure.
var up = new FakeUpstream();
using var engine = Build(up);
var ex = Should.Throw<InvalidOperationException>(() => engine.Load([
new VirtualTagDefinition("Writer", DriverDataType.Int32,
"""
ctx.SetVirtualTag("NonRegisteredTarget", 1);
return 0;
"""),
new VirtualTagDefinition("RegisteredTarget", DriverDataType.Int32,
"""return 1;"""),
]));
ex.Message.ShouldContain("Writer");
ex.Message.ShouldContain("NonRegisteredTarget");
await Task.CompletedTask;
}
[Fact]
public async Task Load_accepts_script_writing_to_registered_virtual_tag_path()
{
// Companion to the rejection test: a write to a registered tag must continue to
// load successfully.
var up = new FakeUpstream();
up.Set("In", 1);
using var engine = Build(up);
// No throw — Writer writes to Target which is registered.
engine.Load([
new VirtualTagDefinition("Target", DriverDataType.Int32,
"""return 0;""", ChangeTriggered: false),
new VirtualTagDefinition("Writer", DriverDataType.Int32,
"""
ctx.SetVirtualTag("Target", (int)ctx.GetTag("In").Value);
return 0;
"""),
]);
await engine.EvaluateAllAsync(TestContext.Current.CancellationToken);
engine.Read("Target").Value.ShouldBe(1);
}
// ----- Core.VirtualTags-006: empty observer list left in _observers map -----
[Fact]
public void Subscribe_then_Unsub_prunes_empty_observer_list_for_path()
{
// Regression for Core.VirtualTags-006: disposing the last subscriber for a path
// must remove the dictionary entry so a long-running server with churning OPC UA
// subscriptions does not accumulate an unbounded number of empty List entries.
var up = new FakeUpstream();
using var engine = Build(up);
engine.Load([new VirtualTagDefinition(
"T", DriverDataType.Int32, """return 1;""")]);
// Subscribe, then immediately Dispose — both the only observer.
var sub1 = engine.Subscribe("T", (_, _) => { });
var sub2 = engine.Subscribe("T", (_, _) => { });
sub1.Dispose();
sub2.Dispose();
// The internal map should no longer hold an entry for the path.
// Use the same ConcurrentDictionary type the engine uses; we check via reflection
// on the test-private field so this is robust to future renames inside engine.
var observersField = typeof(VirtualTagEngine).GetField(
"_observers",
System.Reflection.BindingFlags.NonPublic | System.Reflection.BindingFlags.Instance);
observersField.ShouldNotBeNull();
var observers = observersField!.GetValue(engine);
observers.ShouldNotBeNull();
var containsKey = observers!.GetType().GetMethod("ContainsKey")!;
var result = (bool)containsKey.Invoke(observers, new object[] { "T" })!;
result.ShouldBeFalse("disposing the last subscriber must remove the dictionary entry");
}
// ----- Core.VirtualTags-004: CoerceResult default arm leaks uncoerced values -----
[Fact]
public async Task CoerceResult_handles_Int16_UInt16_UInt32_UInt64()
{
// Regression for Core.VirtualTags-004: before the fix, CoerceResult had a default
// arm that returned the script's raw double/string for these types, producing a
// type-mismatched DataValueSnapshot. Verify every integer DriverDataType the engine
// is allowed to declare coerces correctly.
var up = new FakeUpstream();
using var engine = Build(up);
engine.Load([
new VirtualTagDefinition("AsInt16", DriverDataType.Int16, """return 7.0;"""),
new VirtualTagDefinition("AsUInt16", DriverDataType.UInt16, """return 8.0;"""),
new VirtualTagDefinition("AsUInt32", DriverDataType.UInt32, """return 9.0;"""),
new VirtualTagDefinition("AsUInt64", DriverDataType.UInt64, """return 10.0;"""),
]);
await engine.EvaluateAllAsync(TestContext.Current.CancellationToken);
engine.Read("AsInt16").Value.ShouldBeOfType<short>();
engine.Read("AsInt16").Value.ShouldBe((short)7);
engine.Read("AsUInt16").Value.ShouldBeOfType<ushort>();
engine.Read("AsUInt16").Value.ShouldBe((ushort)8);
engine.Read("AsUInt32").Value.ShouldBeOfType<uint>();
engine.Read("AsUInt32").Value.ShouldBe((uint)9);
engine.Read("AsUInt64").Value.ShouldBeOfType<ulong>();
engine.Read("AsUInt64").Value.ShouldBe((ulong)10);
}
[Fact]
public async Task Load_rejects_definition_with_unsupported_DriverDataType()
{
// Regression for Core.VirtualTags-004: any DriverDataType that CoerceResult cannot
// honour must be rejected at Load time so an operator typo (or a future enum
// member added without coercion support) does not silently emit a type-mismatched
// value to OPC UA clients. Reference is unsupported for virtual tags (the engine
// does not synthesize Galaxy attribute references).
var up = new FakeUpstream();
using var engine = Build(up);
var ex = Should.Throw<InvalidOperationException>(() => engine.Load([
new VirtualTagDefinition("Ref", DriverDataType.Reference, """return "Some.Attribute";"""),
]));
ex.Message.ShouldContain("Reference");
ex.Message.ShouldContain("Ref");
await Task.CompletedTask;
}
[Fact]
public async Task Load_rejects_duplicate_path_with_aggregated_error()
{
@@ -0,0 +1,64 @@
using System.IO;
using System.Reflection;
using Shouldly;
using Xunit;
namespace ZB.MOM.WW.OtOpcUa.Admin.Tests;
/// <summary>
/// Regression for Admin-010 — admin-ui.md "Tech Stack" requires Bootstrap 5
/// "vendored under wwwroot/lib/bootstrap/" so the Admin app has no third-party
/// runtime dependency and works in air-gapped fleet deployments. These tests
/// guard against a future re-introduction of the cdn.jsdelivr.net references
/// in App.razor.
/// </summary>
[Trait("Category", "Unit")]
public sealed class BootstrapVendoringTests
{
[Fact]
public void AppRazor_does_not_reference_a_remote_CDN_for_bootstrap()
{
var appRazor = File.ReadAllText(ResolveAdminPath("Components/App.razor"));
appRazor.ShouldNotContain("cdn.jsdelivr.net",
customMessage: "Admin-010: Bootstrap must be served from the vendored copy under wwwroot/lib/bootstrap/, not jsDelivr — air-gapped deployments cannot reach the public CDN.");
appRazor.ShouldNotContain("cdnjs.cloudflare.com",
customMessage: "Admin-010: third-party CDN references regress the vendoring requirement.");
appRazor.ShouldNotContain("unpkg.com",
customMessage: "Admin-010: third-party CDN references regress the vendoring requirement.");
}
[Fact]
public void AppRazor_references_vendored_bootstrap_assets()
{
var appRazor = File.ReadAllText(ResolveAdminPath("Components/App.razor"));
appRazor.ShouldContain("lib/bootstrap/css/bootstrap.min.css",
customMessage: "App.razor must load the vendored Bootstrap stylesheet.");
appRazor.ShouldContain("lib/bootstrap/js/bootstrap.bundle.min.js",
customMessage: "App.razor must load the vendored Bootstrap JS bundle.");
}
[Fact]
public void Vendored_bootstrap_assets_exist_under_wwwroot_lib_bootstrap()
{
var root = ResolveAdminPath("wwwroot/lib/bootstrap");
Directory.Exists(root).ShouldBeTrue($"expected vendored bootstrap directory at '{root}'");
File.Exists(Path.Combine(root, "css", "bootstrap.min.css")).ShouldBeTrue("bootstrap.min.css missing");
File.Exists(Path.Combine(root, "js", "bootstrap.bundle.min.js")).ShouldBeTrue("bootstrap.bundle.min.js missing");
// Sanity-check non-empty (a zero-byte placeholder would still pass File.Exists).
new FileInfo(Path.Combine(root, "css", "bootstrap.min.css")).Length.ShouldBeGreaterThan(100_000);
new FileInfo(Path.Combine(root, "js", "bootstrap.bundle.min.js")).Length.ShouldBeGreaterThan(50_000);
}
/// <summary>Resolve a path under the Admin source project from the test runner's bin folder.</summary>
private static string ResolveAdminPath(string relative)
{
var asmDir = Path.GetDirectoryName(typeof(BootstrapVendoringTests).Assembly.Location)!;
// tests/Server/ZB.MOM.WW.OtOpcUa.Admin.Tests/bin/Debug/net10.0 -> ../../../../../src/Server/...
var repoRoot = Path.GetFullPath(Path.Combine(asmDir, "..", "..", "..", "..", "..", ".."));
return Path.Combine(repoRoot, "src", "Server", "ZB.MOM.WW.OtOpcUa.Admin", relative.Replace('/', Path.DirectorySeparatorChar));
}
}
@@ -7,9 +7,10 @@ namespace ZB.MOM.WW.OtOpcUa.Admin.Tests;
[Trait("Category", "Unit")]
public sealed class EquipmentCsvImporterTests
{
// Admin-012: header no longer includes EquipmentId — that field is system-derived.
private const string Header =
"# OtOpcUaCsv v1\n" +
"ZTag,MachineCode,SAPID,EquipmentId,EquipmentUuid,Name,UnsAreaName,UnsLineName";
"ZTag,MachineCode,SAPID,EquipmentUuid,Name,UnsAreaName,UnsLineName";
[Fact]
public void EmptyFile_Throws()
@@ -20,7 +21,7 @@ public sealed class EquipmentCsvImporterTests
[Fact]
public void MissingVersionMarker_Throws()
{
var csv = "ZTag,MachineCode,SAPID,EquipmentId,EquipmentUuid,Name,UnsAreaName,UnsLineName\nx,x,x,x,x,x,x,x";
var csv = "ZTag,MachineCode,SAPID,EquipmentUuid,Name,UnsAreaName,UnsLineName\nx,x,x,x,x,x,x";
var ex = Should.Throw<InvalidCsvFormatException>(() => EquipmentCsvImporter.Parse(csv));
ex.Message.ShouldContain("# OtOpcUaCsv v1");
@@ -30,8 +31,8 @@ public sealed class EquipmentCsvImporterTests
public void MissingRequiredColumn_Throws()
{
var csv = "# OtOpcUaCsv v1\n" +
"ZTag,MachineCode,SAPID,EquipmentId,Name,UnsAreaName,UnsLineName\n" +
"z1,mc,sap,eq1,Name1,area,line";
"ZTag,MachineCode,SAPID,Name,UnsAreaName,UnsLineName\n" +
"z1,mc,sap,Name1,area,line";
var ex = Should.Throw<InvalidCsvFormatException>(() => EquipmentCsvImporter.Parse(csv));
ex.Message.ShouldContain("EquipmentUuid");
@@ -40,7 +41,7 @@ public sealed class EquipmentCsvImporterTests
[Fact]
public void UnknownColumn_Throws()
{
var csv = Header + ",WeirdColumn\nz1,mc,sap,eq1,uu,Name1,area,line,value";
var csv = Header + ",WeirdColumn\nz1,mc,sap,uu,Name1,area,line,value";
var ex = Should.Throw<InvalidCsvFormatException>(() => EquipmentCsvImporter.Parse(csv));
ex.Message.ShouldContain("WeirdColumn");
@@ -50,8 +51,8 @@ public sealed class EquipmentCsvImporterTests
public void DuplicateColumn_Throws()
{
var csv = "# OtOpcUaCsv v1\n" +
"ZTag,ZTag,MachineCode,SAPID,EquipmentId,EquipmentUuid,Name,UnsAreaName,UnsLineName\n" +
"z1,z1,mc,sap,eq,uu,Name,area,line";
"ZTag,ZTag,MachineCode,SAPID,EquipmentUuid,Name,UnsAreaName,UnsLineName\n" +
"z1,z1,mc,sap,uu,Name,area,line";
Should.Throw<InvalidCsvFormatException>(() => EquipmentCsvImporter.Parse(csv));
}
@@ -59,7 +60,7 @@ public sealed class EquipmentCsvImporterTests
[Fact]
public void ValidSingleRow_RoundTrips()
{
var csv = Header + "\nz-001,MC-1,SAP-1,eq-001,uuid-1,Oven-A,Warsaw,Line-1";
var csv = Header + "\nz-001,MC-1,SAP-1,uuid-1,Oven-A,Warsaw,Line-1";
var result = EquipmentCsvImporter.Parse(csv);
@@ -76,8 +77,8 @@ public sealed class EquipmentCsvImporterTests
public void OptionalColumns_Populated_WhenPresent()
{
var csv = "# OtOpcUaCsv v1\n" +
"ZTag,MachineCode,SAPID,EquipmentId,EquipmentUuid,Name,UnsAreaName,UnsLineName,Manufacturer,Model,SerialNumber,HardwareRevision,SoftwareRevision,YearOfConstruction,AssetLocation,ManufacturerUri,DeviceManualUri\n" +
"z-1,MC,SAP,eq,uuid,Oven,Warsaw,Line1,Siemens,S7-1500,SN123,Rev-1,Fw-2.3,2023,Bldg-3,https://siemens.example,https://siemens.example/manual";
"ZTag,MachineCode,SAPID,EquipmentUuid,Name,UnsAreaName,UnsLineName,Manufacturer,Model,SerialNumber,HardwareRevision,SoftwareRevision,YearOfConstruction,AssetLocation,ManufacturerUri,DeviceManualUri\n" +
"z-1,MC,SAP,uuid,Oven,Warsaw,Line1,Siemens,S7-1500,SN123,Rev-1,Fw-2.3,2023,Bldg-3,https://siemens.example,https://siemens.example/manual";
var result = EquipmentCsvImporter.Parse(csv);
@@ -93,7 +94,7 @@ public sealed class EquipmentCsvImporterTests
[Fact]
public void BlankRequiredField_Rejects_Row()
{
var csv = Header + "\nz-1,MC,SAP,eq,uuid,,Warsaw,Line1"; // Name blank
var csv = Header + "\nz-1,MC,SAP,uuid,,Warsaw,Line1"; // Name blank
var result = EquipmentCsvImporter.Parse(csv);
@@ -106,8 +107,8 @@ public sealed class EquipmentCsvImporterTests
public void DuplicateZTag_Rejects_SecondRow()
{
var csv = Header +
"\nz-1,MC1,SAP1,eq1,u1,N1,A,L1" +
"\nz-1,MC2,SAP2,eq2,u2,N2,A,L1";
"\nz-1,MC1,SAP1,u1,N1,A,L1" +
"\nz-1,MC2,SAP2,u2,N2,A,L1";
var result = EquipmentCsvImporter.Parse(csv);
@@ -121,7 +122,7 @@ public sealed class EquipmentCsvImporterTests
{
// RFC 4180: "" inside a quoted field is an escaped quote.
var csv = Header +
"\n\"z-1\",\"MC\",\"SAP,with,commas\",\"eq\",\"uuid\",\"Oven \"\"Ultra\"\"\",\"Warsaw\",\"Line1\"";
"\n\"z-1\",\"MC\",\"SAP,with,commas\",\"uuid\",\"Oven \"\"Ultra\"\"\",\"Warsaw\",\"Line1\"";
var result = EquipmentCsvImporter.Parse(csv);
@@ -133,7 +134,7 @@ public sealed class EquipmentCsvImporterTests
[Fact]
public void MismatchedColumnCount_Rejects_Row()
{
var csv = Header + "\nz-1,MC,SAP,eq,uuid,Name,Warsaw"; // missing UnsLineName cell
var csv = Header + "\nz-1,MC,SAP,uuid,Name,Warsaw"; // missing UnsLineName cell
var result = EquipmentCsvImporter.Parse(csv);
@@ -146,9 +147,9 @@ public sealed class EquipmentCsvImporterTests
public void BlankLines_BetweenRows_AreIgnored()
{
var csv = Header +
"\nz-1,MC,SAP,eq1,u1,N1,A,L1" +
"\nz-1,MC,SAP,u1,N1,A,L1" +
"\n" +
"\nz-2,MC,SAP,eq2,u2,N2,A,L1";
"\nz-2,MC,SAP,u2,N2,A,L1";
var result = EquipmentCsvImporter.Parse(csv);
@@ -159,8 +160,9 @@ public sealed class EquipmentCsvImporterTests
[Fact]
public void Header_Constants_Match_Decision_117_and_139()
{
// Admin-012: EquipmentId is intentionally absent — derived from EquipmentUuid at finalise time.
EquipmentCsvImporter.RequiredColumns.ShouldBe(
["ZTag", "MachineCode", "SAPID", "EquipmentId", "EquipmentUuid", "Name", "UnsAreaName", "UnsLineName"]);
["ZTag", "MachineCode", "SAPID", "EquipmentUuid", "Name", "UnsAreaName", "UnsLineName"]);
EquipmentCsvImporter.OptionalColumns.ShouldBe(
["Manufacturer", "Model", "SerialNumber", "HardwareRevision", "SoftwareRevision",
@@ -0,0 +1,74 @@
using Shouldly;
using Xunit;
using ZB.MOM.WW.OtOpcUa.Admin.Services;
namespace ZB.MOM.WW.OtOpcUa.Admin.Tests;
/// <summary>
/// Regression for Admin-012 — <c>admin-ui.md</c> ("Equipment CSV import", revised after
/// adversarial review finding #4) requires no <c>EquipmentId</c> column: it is
/// system-derived (<c>'EQ-' + first 12 hex chars of EquipmentUuid</c>) and "never
/// accepted from CSV imports". Operator-supplied EquipmentId would mint duplicate
/// equipment identity on typos.
/// </summary>
[Trait("Category", "Unit")]
public sealed class EquipmentCsvNoEquipmentIdColumnTests
{
[Fact]
public void RequiredColumns_does_not_include_EquipmentId()
{
EquipmentCsvImporter.RequiredColumns
.ShouldNotContain("EquipmentId",
customMessage: "Admin-012: admin-ui.md forbids an EquipmentId column on the CSV import — it is system-derived from EquipmentUuid.");
}
[Fact]
public void OptionalColumns_does_not_include_EquipmentId()
{
EquipmentCsvImporter.OptionalColumns
.ShouldNotContain("EquipmentId",
customMessage: "Admin-012: EquipmentId must not be an optional column either — it is never accepted from the CSV.");
}
[Fact]
public void EquipmentCsvRow_has_no_EquipmentId_property()
{
// The CSV row shape mirrors the accepted columns. Keeping EquipmentId on the
// row would invite the same misuse — drop it so the type system prevents
// accidental population from a future column.
var prop = typeof(EquipmentCsvRow).GetProperty("EquipmentId");
prop.ShouldBeNull("Admin-012: EquipmentCsvRow must not expose an EquipmentId — the value is derived at finalise time.");
}
[Fact]
public void Header_with_EquipmentId_column_is_rejected_as_unknown()
{
// After the fix, EquipmentId is an unknown column — the header validator must
// refuse it like any other unrecognised column so operators get an explicit
// error rather than silently importing the value.
const string csv =
"# OtOpcUaCsv v1\n" +
"ZTag,MachineCode,SAPID,EquipmentId,EquipmentUuid,Name,UnsAreaName,UnsLineName\n" +
"z-1,MC,SAP,eq,uuid,Oven,Warsaw,Line1";
var ex = Should.Throw<InvalidCsvFormatException>(() => EquipmentCsvImporter.Parse(csv));
ex.Message.ShouldContain("EquipmentId",
customMessage: "Importer must reject CSVs that still carry the (now disallowed) EquipmentId column.");
}
[Fact]
public void Valid_csv_without_EquipmentId_is_accepted()
{
// The canonical header should now omit EquipmentId.
const string csv =
"# OtOpcUaCsv v1\n" +
"ZTag,MachineCode,SAPID,EquipmentUuid,Name,UnsAreaName,UnsLineName\n" +
"z-1,MC,SAP,11111111-2222-3333-4444-555555555555,Oven,Warsaw,Line1";
var result = EquipmentCsvImporter.Parse(csv);
result.AcceptedRows.Count.ShouldBe(1);
result.RejectedRows.ShouldBeEmpty();
result.AcceptedRows[0].ZTag.ShouldBe("z-1");
result.AcceptedRows[0].EquipmentUuid.ShouldBe("11111111-2222-3333-4444-555555555555");
}
}
@@ -25,12 +25,12 @@ public sealed class EquipmentImportBatchServiceTests : IDisposable
// Unique SAPID per row — FinaliseBatch reserves ZTag + SAPID via filtered-unique index, so
// two rows sharing a SAPID under different EquipmentUuids collide as intended.
// Admin-012: no EquipmentId on the CSV row — it is derived from EquipmentUuid at stage/finalise.
private static EquipmentCsvRow Row(string zTag, string name = "eq-1") => new()
{
ZTag = zTag,
MachineCode = "mc",
SAPID = $"sap-{zTag}",
EquipmentId = "eq-id",
EquipmentUuid = Guid.NewGuid().ToString(),
Name = name,
UnsAreaName = "area",
@@ -189,7 +189,7 @@ public sealed class EquipmentImportBatchServiceTests : IDisposable
var row = new EquipmentCsvRow
{
ZTag = "z-shared", MachineCode = "mc", SAPID = "sap-shared",
EquipmentId = "eq-1", EquipmentUuid = sharedUuid.ToString(),
EquipmentUuid = sharedUuid.ToString(),
Name = "eq-1", UnsAreaName = "a", UnsLineName = "l",
};
await _svc.StageRowsAsync(batch1.Id, [row], [], CancellationToken.None);
@@ -212,17 +212,18 @@ public sealed class EquipmentImportBatchServiceTests : IDisposable
var rowA = new EquipmentCsvRow
{
ZTag = "z-collide", MachineCode = "mc-a", SAPID = "sap-a",
EquipmentId = "eq-a", EquipmentUuid = Guid.NewGuid().ToString(),
EquipmentUuid = Guid.NewGuid().ToString(),
Name = "a", UnsAreaName = "ar", UnsLineName = "ln",
};
await _svc.StageRowsAsync(batchA.Id, [rowA], [], CancellationToken.None);
await _svc.FinaliseBatchAsync(batchA.Id, 1, "drv", "line", CancellationToken.None);
var batchB = await _svc.CreateBatchAsync("c1", "bob", CancellationToken.None);
var rowBUuid = Guid.NewGuid();
var rowB = new EquipmentCsvRow
{
ZTag = "z-collide", MachineCode = "mc-b", SAPID = "sap-b", // same ZTag, different EquipmentUuid
EquipmentId = "eq-b", EquipmentUuid = Guid.NewGuid().ToString(),
EquipmentUuid = rowBUuid.ToString(),
Name = "b", UnsAreaName = "ar", UnsLineName = "ln",
};
await _svc.StageRowsAsync(batchB.Id, [rowB], [], CancellationToken.None);
@@ -231,9 +232,9 @@ public sealed class EquipmentImportBatchServiceTests : IDisposable
_svc.FinaliseBatchAsync(batchB.Id, 2, "drv", "line", CancellationToken.None));
ex.Message.ShouldContain("z-collide");
// Second finalise must have rolled back — no partial Equipment row for batch B.
// Second finalise must have rolled back — no partial Equipment row for batch B (match by UUID).
var equipmentB = await _db.Equipment.AsNoTracking()
.Where(e => e.EquipmentId == "eq-b")
.Where(e => e.EquipmentUuid == rowBUuid)
.ToListAsync();
equipmentB.ShouldBeEmpty();
}
@@ -245,7 +246,7 @@ public sealed class EquipmentImportBatchServiceTests : IDisposable
var row = new EquipmentCsvRow
{
ZTag = "", MachineCode = "mc", SAPID = "",
EquipmentId = "eq-nil", EquipmentUuid = Guid.NewGuid().ToString(),
EquipmentUuid = Guid.NewGuid().ToString(),
Name = "nil", UnsAreaName = "ar", UnsLineName = "ln",
};
await _svc.StageRowsAsync(batch.Id, [row], [], CancellationToken.None);
@@ -294,7 +295,7 @@ public sealed class EquipmentImportBatchServiceTests : IDisposable
var conflictRow = new EquipmentCsvRow
{
ZTag = "z-taken", MachineCode = "mc", SAPID = "sap-ok",
EquipmentId = "eq-x", EquipmentUuid = importerUuid.ToString(),
EquipmentUuid = importerUuid.ToString(),
Name = "x", UnsAreaName = "ar", UnsLineName = "ln",
};
var cleanRow = Row("z-clean");
@@ -334,7 +335,7 @@ public sealed class EquipmentImportBatchServiceTests : IDisposable
var conflictRow = new EquipmentCsvRow
{
ZTag = "z-free", MachineCode = "mc", SAPID = "sap-taken",
EquipmentId = "eq-y", EquipmentUuid = importerUuid.ToString(),
EquipmentUuid = importerUuid.ToString(),
Name = "y", UnsAreaName = "ar", UnsLineName = "ln",
};
@@ -370,7 +371,7 @@ public sealed class EquipmentImportBatchServiceTests : IDisposable
var row = new EquipmentCsvRow
{
ZTag = "z-mine", MachineCode = "mc", SAPID = "sap-mine",
EquipmentId = "eq-z", EquipmentUuid = sharedUuid.ToString(), // same UUID
EquipmentUuid = sharedUuid.ToString(), // same UUID
Name = "z", UnsAreaName = "ar", UnsLineName = "ln",
};
@@ -405,7 +406,7 @@ public sealed class EquipmentImportBatchServiceTests : IDisposable
var row = new EquipmentCsvRow
{
ZTag = "z-released", MachineCode = "mc", SAPID = "sap-new",
EquipmentId = "eq-new", EquipmentUuid = newImporterUuid.ToString(),
EquipmentUuid = newImporterUuid.ToString(),
Name = "new", UnsAreaName = "ar", UnsLineName = "ln",
};
@@ -0,0 +1,115 @@
using System.Collections.Concurrent;
using System.Reflection;
using Microsoft.AspNetCore.SignalR;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Logging.Abstractions;
using Shouldly;
using Xunit;
using ZB.MOM.WW.OtOpcUa.Admin.Hubs;
using ZB.MOM.WW.OtOpcUa.Admin.Services;
namespace ZB.MOM.WW.OtOpcUa.Admin.Tests;
/// <summary>
/// Regression for Admin-011 — <see cref="FleetStatusPoller"/> kept three plain
/// <c>Dictionary&lt;,&gt;</c> caches that were enumerated/mutated from the steady-state
/// poll loop and cleared from <c>ResetCache()</c> with no synchronization. A concurrent
/// <c>ResetCache()</c> during a poll iteration could throw
/// <see cref="InvalidOperationException"/> or corrupt the dictionary. The fix swaps the
/// caches for <see cref="ConcurrentDictionary{TKey,TValue}"/> so reset + concurrent
/// reads/writes are safe by construction.
/// </summary>
[Trait("Category", "Unit")]
public sealed class FleetStatusPollerConcurrencyTests
{
[Fact]
public void Cache_fields_are_thread_safe_collections()
{
// The fix uses ConcurrentDictionary; that makes ResetCache() and concurrent
// poll-tick mutations safe by construction. Guard the structural choice with
// reflection so a future refactor cannot silently revert to plain Dictionary
// without flipping this guardrail.
var fields = typeof(FleetStatusPoller)
.GetFields(BindingFlags.NonPublic | BindingFlags.Instance)
.Where(f => f.Name is "_last" or "_lastRole" or "_lastResilience")
.ToList();
fields.Count.ShouldBe(3, "expected the three cache fields _last/_lastRole/_lastResilience to exist");
foreach (var f in fields)
{
var type = f.FieldType;
type.IsGenericType.ShouldBeTrue($"{f.Name} should be a generic concurrent collection");
type.GetGenericTypeDefinition().ShouldBe(
typeof(ConcurrentDictionary<,>),
customMessage: $"{f.Name} must be a ConcurrentDictionary<,> so concurrent ResetCache()/poll calls are safe — plain Dictionary regressed Admin-011.");
}
}
[Fact]
public void ResetCache_is_safe_to_call_concurrently_with_cache_mutations()
{
// Stress test — hammer the cache with mutate/clear concurrently. With plain
// Dictionary this throws InvalidOperationException ("Collection was modified")
// or corrupts internal state. With ConcurrentDictionary it must complete cleanly.
var poller = BuildPollerForReflectionTest();
var lastField = typeof(FleetStatusPoller).GetField("_last", BindingFlags.NonPublic | BindingFlags.Instance)!;
var cache = lastField.GetValue(poller)!;
var cacheType = cache.GetType();
var indexer = cacheType.GetProperty("Item")!;
var keyType = cacheType.GetGenericArguments()[0]; // string
var valueType = cacheType.GetGenericArguments()[1]; // NodeStateSnapshot record-struct
var defaultSnapshot = Activator.CreateInstance(valueType)!;
var cts = new CancellationTokenSource(TimeSpan.FromSeconds(2));
var writer = Task.Run(() =>
{
var i = 0;
while (!cts.IsCancellationRequested)
{
indexer.SetValue(cache, defaultSnapshot, new object[] { $"node-{i++ % 64}" });
}
});
var resetter = Task.Run(() =>
{
var method = typeof(FleetStatusPoller).GetMethod("ResetCache", BindingFlags.NonPublic | BindingFlags.Instance)!;
while (!cts.IsCancellationRequested)
{
method.Invoke(poller, null);
}
});
// Should not throw — the whole point is that the two run concurrently safely.
Should.NotThrow(() => Task.WaitAll([writer, resetter]));
}
private static FleetStatusPoller BuildPollerForReflectionTest()
{
// Pass null-style stubs — the poller constructor doesn't touch them and we
// never call ExecuteAsync/PollOnceAsync here (those need a real DB context).
// We only exercise ResetCache + cache mutation by reflection.
var scopeFactory = new StubServiceScopeFactory();
var fleetHub = new StubHubContext<FleetStatusHub>();
var alertHub = new StubHubContext<AlertHub>();
return new FleetStatusPoller(
scopeFactory,
fleetHub,
alertHub,
NullLogger<FleetStatusPoller>.Instance,
new RedundancyMetrics());
}
private sealed class StubServiceScopeFactory : IServiceScopeFactory
{
public IServiceScope CreateScope() => throw new NotImplementedException();
}
private sealed class StubHubContext<THub> : IHubContext<THub> where THub : Hub
{
public IHubClients Clients => throw new NotImplementedException();
public IGroupManager Groups => throw new NotImplementedException();
}
}
@@ -0,0 +1,116 @@
using Opc.Ua;
using Shouldly;
using Xunit;
using ZB.MOM.WW.OtOpcUa.Server.OpcUa;
namespace ZB.MOM.WW.OtOpcUa.Server.Tests;
/// <summary>
/// Regression for Server-006 — synchronous OnReadValue / OnWriteValue stack hooks must
/// derive a <see cref="CancellationToken"/> from the operation deadline so a stalled
/// driver call doesn't pin a request thread for the full pipeline timeout. The shared
/// helper <see cref="DriverNodeManager.DeriveOperationCancellation"/> turns the
/// <see cref="ISystemContext"/>'s <c>OperationDeadline</c> into a linked CTS.
/// </summary>
[Trait("Category", "Unit")]
public sealed class DriverNodeManagerCancellationTests
{
/// <summary>
/// Build a SystemContext bound to the supplied IOperationContext. SystemContext's
/// OperationContext setter is protected, so we use the public <c>Copy</c> method
/// which clones the context onto the supplied operation context.
/// </summary>
private static ISystemContext ContextWithDeadline(DateTime deadline)
=> new SystemContext().Copy(new StubOperationContext(deadline));
[Fact]
public void Future_deadline_produces_uncancelled_token()
{
var ctx = ContextWithDeadline(DateTime.UtcNow.AddSeconds(30));
using var cts = DriverNodeManager.DeriveOperationCancellation(ctx, fallback: TimeSpan.FromSeconds(10));
cts.Token.IsCancellationRequested.ShouldBeFalse();
}
[Fact]
public void Past_deadline_produces_already_cancelled_token()
{
var ctx = ContextWithDeadline(DateTime.UtcNow.AddSeconds(-5));
using var cts = DriverNodeManager.DeriveOperationCancellation(ctx, fallback: TimeSpan.FromSeconds(10));
cts.Token.IsCancellationRequested.ShouldBeTrue(
"an expired OperationDeadline must surface as an immediately-cancelled token so the "
+ "stalled driver call returns without burning a request thread");
}
[Fact]
public void Missing_deadline_uses_fallback_timeout()
{
// No OperationContext attached → no deadline plumbed; helper falls back to the
// supplied timeout so an OnReadValue hook into a stalled driver can't hang the
// request thread indefinitely.
var ctx = new SystemContext();
using var cts = DriverNodeManager.DeriveOperationCancellation(ctx, fallback: TimeSpan.FromMilliseconds(20));
cts.Token.WaitHandle.WaitOne(TimeSpan.FromMilliseconds(500)).ShouldBeTrue(
"fallback timeout must fire so a missing deadline cannot hang the request thread");
cts.Token.IsCancellationRequested.ShouldBeTrue();
}
[Fact]
public void DateTime_MinValue_deadline_uses_fallback_timeout()
{
// IOperationContext.OperationDeadline is `DateTime.MinValue` when the stack hasn't
// plumbed a deadline through. The helper treats that as "no deadline" and falls
// back to the supplied timeout, otherwise an MinValue would surface as
// already-cancelled and short-circuit every read.
var ctx = ContextWithDeadline(DateTime.MinValue);
using var cts = DriverNodeManager.DeriveOperationCancellation(ctx, fallback: TimeSpan.FromMilliseconds(20));
cts.Token.WaitHandle.WaitOne(TimeSpan.FromMilliseconds(500)).ShouldBeTrue();
cts.Token.IsCancellationRequested.ShouldBeTrue();
}
[Fact]
public void DateTime_MaxValue_deadline_uses_fallback_timeout_not_overflow()
{
// OperationContext sets OperationDeadline = DateTime.MaxValue when the client's
// RequestHeader.TimeoutHint is zero (the default). DateTime.MaxValue - UtcNow
// overflows CancellationTokenSource(TimeSpan)'s Int32.MaxValue-ms cap, so the
// helper must collapse it to the fallback — otherwise the read throws
// ArgumentOutOfRangeException from inside DeriveOperationCancellation and surfaces
// as BadInternalError on every read (regression that broke OpcUaServerIntegrationTests).
var ctx = ContextWithDeadline(DateTime.MaxValue);
using var cts = DriverNodeManager.DeriveOperationCancellation(ctx, fallback: TimeSpan.FromSeconds(30));
cts.Token.IsCancellationRequested.ShouldBeFalse("MaxValue deadline + 30 s fallback must produce a live token");
}
[Fact]
public void Null_context_returns_uncancelled_token_with_fallback()
{
// Defensive — OnReadValue receives an ISystemContext from the stack so the helper
// shouldn't NRE if a future override passes through a null context.
using var cts = DriverNodeManager.DeriveOperationCancellation(context: null!, fallback: TimeSpan.FromSeconds(30));
cts.Token.IsCancellationRequested.ShouldBeFalse();
}
/// <summary>Minimal IOperationContext for deadline testing.</summary>
private sealed class StubOperationContext(DateTime deadline) : IOperationContext
{
public DateTime OperationDeadline { get; } = deadline;
public NodeId? SessionId => null;
public IUserIdentity? UserIdentity => null;
public IList<string>? PreferredLocales => null;
public DiagnosticsMasks DiagnosticsMask => DiagnosticsMasks.None;
public StringTable StringTable { get; } = new StringTable();
public StatusCode OperationStatus => StatusCodes.Good;
public string? AuditEntryId => null;
}
}
@@ -115,6 +115,34 @@ public sealed class PeerHttpProbeLoopTests : IDisposable
tracker.Get("B").HttpHealthy.ShouldBeFalse();
}
[Fact]
public async Task Tick_does_not_mutate_factory_vended_client_Timeout()
{
// Server-012: timeouts belong on the named-client registration or a per-request CTS,
// NOT on a factory-vended HttpClient (which IHttpClientFactory may pool/recycle).
// Mutating client.Timeout per tick is at minimum a bad smell and races with
// IHttpClientFactory's lifecycle expectations.
var coordinator = await SeedAndInitializeAsync("A",
("A", RedundancyRole.Primary, "urn:A"),
("B", RedundancyRole.Secondary, "urn:B"));
var tracker = new PeerReachabilityTracker();
var factoryInitialTimeout = TimeSpan.FromMinutes(2);
var factory = new RecordingHttpClientFactory(
_ => new HttpResponseMessage(HttpStatusCode.OK),
factoryInitialTimeout);
var loop = new PeerHttpProbeLoop(coordinator, tracker, factory, NullLogger<PeerHttpProbeLoop>.Instance,
options: new PeerProbeOptions { HttpProbeTimeout = TimeSpan.FromSeconds(3) });
await loop.TickAsync(CancellationToken.None);
factory.LastCreatedClient.ShouldNotBeNull();
factory.LastCreatedClient.Timeout.ShouldBe(factoryInitialTimeout,
"the probe loop must not mutate the factory-vended HttpClient's Timeout — "
+ "per-call timeout should be enforced via a CancellationToken or via "
+ "AddHttpClient.ConfigureHttpClient on the named registration.");
}
// ---- fixture helpers ---------------------------------------------------
private async Task<RedundancyCoordinator> SeedAndInitializeAsync(string selfNodeId, params (string id, RedundancyRole role, string appUri)[] nodes)
@@ -158,4 +186,30 @@ public sealed class PeerHttpProbeLoopTests : IDisposable
=> Task.FromResult(respond(request));
}
}
/// <summary>
/// Server-012 — captures the most-recently-vended <see cref="HttpClient"/> so the
/// test can assert the probe loop didn't mutate its <see cref="HttpClient.Timeout"/>.
/// </summary>
private sealed class RecordingHttpClientFactory(
Func<HttpRequestMessage, HttpResponseMessage> respond,
TimeSpan initialTimeout) : IHttpClientFactory
{
public HttpClient? LastCreatedClient { get; private set; }
public HttpClient CreateClient(string name)
{
var client = new HttpClient(new RecordingHandler(respond), disposeHandler: true)
{
Timeout = initialTimeout,
};
LastCreatedClient = client;
return client;
}
private sealed class RecordingHandler(Func<HttpRequestMessage, HttpResponseMessage> respond) : HttpMessageHandler
{
protected override Task<HttpResponseMessage> SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
=> Task.FromResult(respond(request));
}
}
}
@@ -0,0 +1,176 @@
using Opc.Ua;
using Serilog;
using Shouldly;
using Xunit;
using ZB.MOM.WW.OtOpcUa.Core.Abstractions;
using ZB.MOM.WW.OtOpcUa.Core.AlarmHistorian;
using ZB.MOM.WW.OtOpcUa.Core.ScriptedAlarms;
using ZB.MOM.WW.OtOpcUa.Core.Scripting;
using ZB.MOM.WW.OtOpcUa.Server.OpcUa;
using ZB.MOM.WW.OtOpcUa.Server.Phase7;
namespace ZB.MOM.WW.OtOpcUa.Server.Tests.Phase7;
/// <summary>
/// Regression for Server-008 — <c>RouteScriptedAlarmMethodCalls</c> must mark a handled
/// <see cref="CallMethodRequest"/> slot as <c>Processed = true</c> so the stack's
/// <c>CustomNodeManager2.Call</c> skips it. The pre-fix code relied on the slot's
/// <c>errors[i]</c> being <c>ServiceResult.Good</c>, but the SDK's actual skip predicate is
/// <see cref="CallMethodRequest.Processed"/>; without setting it, the stack's built-in
/// Part 9 Acknowledge / Confirm handler would also fire, producing a double transition.
/// </summary>
[Trait("Category", "Unit")]
public sealed class ScriptedAlarmMethodRoutingProcessedFlagTests
{
private static ScriptedAlarmEngine BuildActiveEngine(string alarmId)
{
var upstream = new CachedTagUpstreamSource();
var logger = new LoggerConfiguration().CreateLogger();
var factory = new ScriptLoggerFactory(logger);
var engine = new ScriptedAlarmEngine(upstream, new InMemoryAlarmStateStore(), factory, logger);
var defs = new List<ScriptedAlarmDefinition>
{
new(AlarmId: alarmId,
EquipmentPath: "/eq",
AlarmName: alarmId,
Kind: AlarmKind.LimitAlarm,
Severity: AlarmSeverity.Medium,
MessageTemplate: "msg",
PredicateScriptSource: "return true;"),
};
engine.LoadAsync(defs, CancellationToken.None).GetAwaiter().GetResult();
return engine;
}
private static ScriptedAlarmEngine BuildEngine(string alarmId)
{
var upstream = new CachedTagUpstreamSource();
var logger = new LoggerConfiguration().CreateLogger();
var factory = new ScriptLoggerFactory(logger);
var engine = new ScriptedAlarmEngine(upstream, new InMemoryAlarmStateStore(), factory, logger);
var defs = new List<ScriptedAlarmDefinition>
{
new(AlarmId: alarmId,
EquipmentPath: "/eq",
AlarmName: alarmId,
Kind: AlarmKind.LimitAlarm,
Severity: AlarmSeverity.Medium,
MessageTemplate: "msg",
PredicateScriptSource: "return false;"),
};
engine.LoadAsync(defs, CancellationToken.None).GetAwaiter().GetResult();
return engine;
}
private static CallMethodRequest AcknowledgeRequest(string conditionNodeId)
=> new()
{
ObjectId = new NodeId(conditionNodeId, 2),
MethodId = MethodIds.AcknowledgeableConditionType_Acknowledge,
InputArguments =
{
new Variant(new byte[] { 1, 2, 3 }),
new Variant(new LocalizedText("ack-comment")),
},
};
private static CallMethodRequest AddCommentRequest(string conditionNodeId)
=> new()
{
ObjectId = new NodeId(conditionNodeId, 2),
MethodId = MethodIds.ConditionType_AddComment,
InputArguments =
{
new Variant(new byte[] { 1, 2, 3 }),
new Variant(new LocalizedText("comment-text")),
},
};
[Fact]
public void Handled_Acknowledge_marks_Processed_true_so_baseCall_skips_the_slot()
{
using var engine = BuildActiveEngine("al-1");
var calls = new List<CallMethodRequest> { AcknowledgeRequest("al-1.Condition") };
var results = new List<CallMethodResult> { new CallMethodResult() };
var errors = new List<ServiceResult> { null! };
var index = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
{
["al-1.Condition"] = "al-1",
};
DriverNodeManager.RouteScriptedAlarmMethodCalls(
new NamedIdentity("ops-user"), calls, results, errors, engine, index);
calls[0].Processed.ShouldBeTrue(
"CustomNodeManager2.Call/CallInternalAsync skips slots with Processed=true. "
+ "Without this flag, base.Call would re-dispatch the Acknowledge to the stack's "
+ "built-in Part 9 handler and the engine would observe a double transition.");
}
[Fact]
public void Handled_AddComment_marks_Processed_true()
{
using var engine = BuildEngine("al-1");
var calls = new List<CallMethodRequest> { AddCommentRequest("al-1.Condition") };
var results = new List<CallMethodResult> { new CallMethodResult() };
var errors = new List<ServiceResult> { null! };
var index = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
{
["al-1.Condition"] = "al-1",
};
DriverNodeManager.RouteScriptedAlarmMethodCalls(
new NamedIdentity("ops-user"), calls, results, errors, engine, index);
calls[0].Processed.ShouldBeTrue("AddComment handled by the engine must not re-dispatch via base.Call");
}
[Fact]
public void Engine_error_path_also_marks_Processed_so_baseCall_does_not_re_run_the_method()
{
using var engine = BuildEngine("al-1");
var calls = new List<CallMethodRequest>
{
// Index maps to an alarm id the engine doesn't know — engine throws
// ArgumentException, helper sets errors[i] = BadInvalidArgument.
AcknowledgeRequest("al-1.Condition"),
};
var results = new List<CallMethodResult> { new CallMethodResult() };
var errors = new List<ServiceResult> { null! };
var index = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
{
["al-1.Condition"] = "al-NOT-IN-ENGINE",
};
DriverNodeManager.RouteScriptedAlarmMethodCalls(
new NamedIdentity("ops-user"), calls, results, errors, engine, index);
ServiceResult.IsBad(errors[0]).ShouldBeTrue("engine error path");
calls[0].Processed.ShouldBeTrue(
"even when the engine returns Bad, the slot was handled — base.Call must not "
+ "re-dispatch the method against the OPC UA built-in handler.");
}
[Fact]
public void Unhandled_slot_leaves_Processed_false_so_baseCall_drives_it()
{
using var engine = BuildActiveEngine("al-1");
var genericMethod = new CallMethodRequest
{
ObjectId = new NodeId("some-driver-method", 2),
MethodId = new NodeId("driver-method", 2),
};
var calls = new List<CallMethodRequest> { genericMethod };
var results = new List<CallMethodResult> { new CallMethodResult() };
var errors = new List<ServiceResult> { null! };
DriverNodeManager.RouteScriptedAlarmMethodCalls(
new NamedIdentity("ops-user"), calls, results, errors, engine,
conditionIdToAlarmId: new Dictionary<string, string>());
calls[0].Processed.ShouldBeFalse("non-alarm methods must fall through to base.Call");
errors[0].ShouldBeNull("unhandled slot's error must stay null for the base implementation");
}
private sealed class NamedIdentity(string displayName) : UserIdentity(displayName, "") { }
}
@@ -0,0 +1,56 @@
using Opc.Ua;
using Shouldly;
using Xunit;
using ZB.MOM.WW.OtOpcUa.Server.OpcUa;
namespace ZB.MOM.WW.OtOpcUa.Server.Tests;
/// <summary>
/// Regression for Server-004 — the production
/// <see cref="OtOpcUaServer.RoleBasedIdentity"/> must surface the LDAP-resolved display
/// name through <see cref="IUserIdentity.DisplayName"/>, since
/// <c>DriverNodeManager.ResolveCallUser</c> reads the base interface property when stamping
/// audit identities on scripted-alarm Acknowledge / Confirm / Shelve calls.
/// </summary>
[Trait("Category", "Unit")]
public sealed class RoleBasedIdentityTests
{
[Fact]
public void DisplayName_returns_LDAP_resolved_display_name_when_present()
{
IUserIdentity identity = new OtOpcUaServer.RoleBasedIdentity(
userName: "alice",
displayName: "Alice Smith",
roles: new[] { "WriteOperate" },
ldapGroups: new[] { "ot_operators" });
identity.DisplayName.ShouldBe("Alice Smith",
"DriverNodeManager.ResolveCallUser reads IUserIdentity.DisplayName for audit entries; "
+ "RoleBasedIdentity must surface the LDAP-resolved name, not just the username.");
}
[Fact]
public void DisplayName_falls_back_to_userName_when_LDAP_display_name_is_null()
{
IUserIdentity identity = new OtOpcUaServer.RoleBasedIdentity(
userName: "alice",
displayName: null,
roles: [],
ldapGroups: []);
identity.DisplayName.ShouldBe("alice",
"absent an LDAP display name, audit entries should still carry the username.");
}
[Fact]
public void ResolveCallUser_yields_LDAP_resolved_display_name()
{
IUserIdentity identity = new OtOpcUaServer.RoleBasedIdentity(
userName: "alice",
displayName: "Alice Smith",
roles: [],
ldapGroups: []);
DriverNodeManager.ResolveCallUser(identity).ShouldBe("Alice Smith");
}
}
@@ -0,0 +1,52 @@
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Logging.Abstractions;
using Shouldly;
using Xunit;
using ZB.MOM.WW.OtOpcUa.Configuration.LocalCache;
namespace ZB.MOM.WW.OtOpcUa.Server.Tests;
/// <summary>
/// Regression for Server-014 — <see cref="SealedBootstrap"/> exists in the source tree and
/// is referenced by <c>docs/v2/v2-release-readiness.md</c> as the closed release blocker for
/// generation-sealed config plumbing, but it was never registered in the production DI
/// container. The release blocker remained de-facto open. This test asserts the DI
/// registrations (which <c>Program.cs</c> performs at startup) actually compose: every
/// dependency <see cref="SealedBootstrap"/> needs — <see cref="GenerationSealedCache"/>,
/// <see cref="ResilientConfigReader"/>, <see cref="StaleConfigFlag"/> — must be resolvable
/// so the production wire-up doesn't fail with a missing-service exception at startup.
/// </summary>
[Trait("Category", "Unit")]
public sealed class SealedBootstrapWiringTests
{
[Fact]
public void SealedBootstrap_and_its_dependencies_are_registered_in_DI()
{
var tempRoot = Path.Combine(Path.GetTempPath(), $"otopcua-sealed-bootstrap-wiring-{Guid.NewGuid():N}");
try
{
// Mirror Program.cs's registrations of NodeOptions + the SealedBootstrap chain.
var services = new ServiceCollection();
ZB.MOM.WW.OtOpcUa.Server.ServerWiring.AddSealedBootstrap(services, new NodeOptions
{
NodeId = "test-node",
ClusterId = "test-cluster",
ConfigDbConnectionString = "Server=fake;Database=fake;Integrated Security=true;",
LocalCachePath = tempRoot,
});
services.AddSingleton(NullLoggerFactory.Instance);
services.AddLogging();
using var sp = services.BuildServiceProvider();
sp.GetRequiredService<GenerationSealedCache>().ShouldNotBeNull();
sp.GetRequiredService<ResilientConfigReader>().ShouldNotBeNull();
sp.GetRequiredService<StaleConfigFlag>().ShouldNotBeNull();
sp.GetRequiredService<SealedBootstrap>().ShouldNotBeNull();
}
finally
{
try { if (Directory.Exists(tempRoot)) Directory.Delete(tempRoot, recursive: true); } catch { }
}
}
}