Compare commits

...

18 Commits

Author SHA1 Message Date
Joseph Doherty
404bfbe7e4 AlarmSurfaceInvoker — wraps IAlarmSource.Subscribe/Unsubscribe/Acknowledge through CapabilityInvoker with multi-host fan-out. Closes alarm-surface slice of task #161 (Phase 6.1 Stream A); the Roslyn invoker-coverage analyzer is split into new task #200 because a DiagnosticAnalyzer project is genuinely its own scaffolding PR (Microsoft.CodeAnalysis.CSharp.Workspaces dep, netstandard2.0 target, Microsoft.CodeAnalysis.Testing harness, ProjectReference OutputItemType=Analyzer wiring, and four corner-case rules I want tests for before shipping). Ship this PR as the runtime guardrail + callable API; the analyzer lands next as the compile-time guardrail. New AlarmSurfaceInvoker class in Core.Resilience. Three methods mirror IAlarmSource's three mutating surfaces: SubscribeAsync (fan-out: group sourceNodeIds by IPerCallHostResolver.ResolveHost, one CapabilityInvoker.ExecuteAsync per host with DriverCapability.AlarmSubscribe so AlarmSubscribe's retry policy kicks in + returns one IAlarmSubscriptionHandle per host); UnsubscribeAsync (single-host, defaultHost); AcknowledgeAsync (fan-out: group AlarmAcknowledgeRequests by resolver-mapped host, run each host's batch through DriverCapability.AlarmAcknowledge which does NOT retry per decision #143 — alarm-ack is a write-shaped op that's not idempotent at the plant-floor level). Drivers without IPerCallHostResolver (Galaxy single MXAccess endpoint, OpcUaClient against one remote, etc.) fall back to defaultHost = DriverInstanceId so breaker + bulkhead keying still happens; drivers with it get one-dead-PLC-doesn't-poison-siblings isolation per decision #144. Single-host single-subscribe returns [handle] with length 1; empty sourceNodeIds fast-paths to [] without a driver call. Five new AlarmSurfaceInvokerTests covering: (a) empty list short-circuits — driver method never called; (b) single-host sub routes via default host — one driver call with full id list; (c) multi-host sub fans out to 2 distinct hosts for 3 src ids mapping to 2 plcs — one driver call per host; (d) Acknowledge does not retry on failure — call count stays at 1 even with exception; (e) Subscribe retries transient failures — call count reaches 3 with a 2-failures-then-success fake. Core.Tests resilience-builder suite 19/19 passing (was 14, +5); Core.Tests whole suite still green. Core project builds 0 errors. Task #200 captures the compile-time guardrail: Roslyn DiagnosticAnalyzer at src/ZB.MOM.WW.OtOpcUa.Analyzers that flags direct invocations of the eleven capability-interface methods inside the Server namespace when the call is NOT inside a CapabilityInvoker.ExecuteAsync/ExecuteWriteAsync/AlarmSurfaceInvoker.*Async lambda. That analyzer is the reason we keep paying the wrapping-class overhead for every new capability.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 23:07:37 -04:00
006af636a0 Merge pull request (#139) - ExternalIdReservation merge in FinaliseBatch 2026-04-19 23:04:25 -04:00
Joseph Doherty
c0751fdda5 ExternalIdReservation merge inside FinaliseBatchAsync. Closes task #197. The FinaliseBatch docstring called this out as a narrower follow-up pending a concurrent-insert test matrix, and the CSV import UI PR (#163) noted that operators would see raw DbUpdate UNIQUE-constraint messages on ZTag/SAPID collision until this landed. Now every finalised-batch row reserves ZTag + SAPID in the same EF transaction as the Equipment inserts, so either both commit atomically or neither does. New MergeReservation helper handles the four outcomes per (Kind, Value) pair: (1) value empty/whitespace → skip the reservation entirely (operator left the optional identifier blank); (2) active reservation exists for same EquipmentUuid → bump LastPublishedAt + reuse (re-finalising a batch against the same equipment must be idempotent, e.g. a retry after a transient DB blip); (3) active reservation exists for a DIFFERENT EquipmentUuid → throw ExternalIdReservationConflictException with the conflicting UUID + originating cluster + first-published timestamp so operator sees exactly who owns the value + where to resolve it (release via sp_ReleaseExternalIdReservation or pick a new ZTag); (4) no active reservation → create a fresh row with FirstPublishedBy = batch.CreatedBy + FirstPublishedAt = transaction time. Pre-commit overlap scan uses one round-trip (WHERE Kind+Value IN the batch's distinct sets, filtered to ReleasedAt IS NULL so explicitly-released values can be re-issued per decision #124) + caches the results in a Dictionary keyed on (Kind, value.ToLowerInvariant()) for O(1) lookup during the row loop. Race-safety catch: if another finalise commits between our cache-load + our SaveChanges, SQL Server surfaces a 2601/2627 unique-index violation against UX_ExternalIdReservation_KindValue_Active — IsReservationUniquenessViolation walks the inner-exception chain for that specific signature + rethrows as ExternalIdReservationConflictException so the UI shows a clean message instead of a raw DbUpdateException. The index-name match means unrelated filtered-unique violations (future indices) don't get mis-classified. Test-fixture Row() helper updated to generate unique SAPID per row (sap-{ZTag}) — the prior shared SAPID="sap" worked only because reservations didn't exist; two rows sharing a SAPID under different EquipmentUuids now collide as intended by decision #124's fleet-wide uniqueness rule. Four new tests: (a) finalise creates both ZTag + SAPID reservations with expected Kind + Value; (b) re-finalising same EquipmentUuid's ZTag from a different batch does not create a duplicate (LastPublishedAt refresh only); (c) different EquipmentUuid claiming the same ZTag throws ExternalIdReservationConflictException with the ZTag value in the message + Equipment row for the second batch is NOT inserted (transaction rolled back cleanly); (d) row with empty ZTag + empty SAPID skips reservation entirely. Full Admin.Tests suite 85/85 passing (was 81 before this PR, +4). Admin project builds 0 errors. Note: the InMemory EF provider doesn't enforce filtered-unique indices, so the IsReservationUniquenessViolation catch is exercised only in the SQL Server integration path — the in-memory tests cover the cache-level conflict detection in MergeReservation instead, which is the first line of defence + catches the same-batch + published-vs-staged cases. The DbUpdate catch protects only the last-second race where two concurrent transactions both passed the cache check.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 23:02:31 -04:00
80e080ecec Merge pull request (#138) - UnsTab drag/drop + 409 conflict modal 2026-04-19 22:32:45 -04:00
Joseph Doherty
5ee510dc1a UnsTab native HTML5 drag/drop + 409 concurrent-edit modal + optimistic-concurrency commit path. Closes UI slice of task #153 (Phase 6.4 Stream A UI follow-up). Playwright E2E smoke is split into new task #199 — Playwright install + WebApplicationFactory + seeded-DB harness is genuinely its own infra-setup PR. Native HTML5 attributes (draggable, @ondragstart, @ondragover, @ondragleave, @ondrop) deliberately over MudBlazor per the task title — no MudBlazor ever joins this project. Two new service methods on UnsService land the data layer the existing UnsImpactAnalyzer assumed but which didn't actually exist: (1) LoadSnapshotAsync(generationId) — walks UnsAreas + UnsLines + per-line equipment counts + builds a UnsTreeSnapshot including a 16-char SHA-256 revision token computed deterministically over the sorted (kind, id, parent, name, notes) tuple-set so it's stable across processes + changes whenever any row is added / modified / deleted; (2) MoveLineAsync(generationId, expectedToken, lineId, targetAreaId) — re-parents one line inside the same draft under an EF transaction, recomputes the current revision token from freshly-loaded rows, and throws DraftRevisionConflictException when the caller-supplied token no longer matches. Token mismatch means another operator mutated the draft between preview + commit + the move rolls back rather than clobbering their work. No-op same-area drop is a silent return. Cross-generation move is prevented by the generationId filter on the transaction reads. UnsTab.razor gains draggable="true" on every line row with @ondragstart capturing the LineId into _dragLineId, and every area row is a drop target (@ondragover with :preventDefault so the browser accepts drops, @ondrop kicking off OnLineDroppedAsync). Drop path loads a fresh snapshot, builds a UnsMoveOperation(Kind=LineMove, source/target cluster matching because cross-cluster is decision-#82 rejected), runs UnsImpactAnalyzer.Analyze + shows a Bootstrap modal rendered inline in the component — modal shows HumanReadableSummary + equipment/tag counts + any CascadeWarnings list. Confirm button calls MoveLineAsync with the snapshot's RevisionToken; DraftRevisionConflictException surfaces a separate red-header "Draft changed — refresh required" modal with a Reload button that re-fetches areas + lines from the DB. New DraftRevisionConflictException in UnsService.cs, co-located with the service that throws it. Five new UnsServiceMoveTests covering LoadSnapshotAsync (areas + lines + equipment counts), RevisionToken stability between two reads, RevisionToken changes on AddLineAsync, MoveLineAsync happy path reparents the line in the DB, MoveLineAsync with stale token throws DraftRevisionConflictException + leaves the DB unchanged. Admin suite 81/81 passing (was 76, +5). Admin project builds 0 errors. Task #199 captures the deferred Playwright E2E smoke — drag a line onto a different area in a real browser, assert preview modal contents, click Confirm, assert the line row shows the new area. That PR stands up a new tests/ZB.MOM.WW.OtOpcUa.Admin.E2ETests project with Playwright + WebApplicationFactory + seeded InMemory DbContext.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 22:30:48 -04:00
543665dedd Merge pull request (#137) - DiffViewer refactor 2026-04-19 22:25:24 -04:00
Joseph Doherty
c8a38bc57b DiffViewer refactor — 6-section plugin pattern + 1000-row cap. Closes task #156 (Phase 6.4 Stream C). Replaces the flat single-table rendering that mixed Namespace/DriverInstance/Equipment/Tag rows into one untyped list with a per-section-card layout that makes draft review actually scannable on non-trivial diffs. New DiffSection.razor reusable component encapsulates the per-section rendering — card header shows Title + Description + a three-badge summary (+added / −removed / ~modified plus a "no changes" grey badge when the section is empty) so operators can glance at a six-card page and see what areas of the draft actually shifted before drilling into any one table. Hard row-cap at DefaultRowCap=1000 per section lives inside the component so a pathological draft (e.g. 20k tags churned by a block rebuild) can't freeze the browser on render — excess rows are silently dropped with a yellow warning banner that surfaces "Showing the first 1000 of N rows" + a pointer to run sp_ComputeGenerationDiff directly for the full set. Body max-height: 400px + overflow-y: auto gives each section its own scroll region so one big section doesn't push the others off screen. DiffViewer.razor refactored to a static Sections table driving a single foreach that instantiates one DiffSection per known TableName. Sections listed in author-order (Namespace → DriverInstance → Equipment → Tag → UnsLine → NodeAcl) — six entries matching the task acceptance criterion. The first four correspond to what sp_ComputeGenerationDiff currently emits; the last two (UnsLine + NodeAcl) render as empty "no changes" cards today + will light up when the proc is extended (tracked in task #196 for NodeAcl; UnsLine proc extension is a natural follow-up since UnsImpactAnalyzer already tracks UNS moves). RowsFor(tableName) replaces the prior flat table — each section filters the overall DiffRow list by its TableName so the proc output format stays stable. Header-bar summary at the top of the page now reads "N rows across M of 6 sections" so operators see overall change weight at a glance before scanning. Two Razor-specific fixes landed along the way: loop variable renamed from section to sec because @section collides with the Razor section directive + trips RZ2005; helper method renamed from Group to RowsFor because the Razor generator gets confused by a parameter-flowing method whose name clashes with LINQ's Group extension (the source-gen output referenced TypeCheck<T> with no argument). Admin project builds 0 errors; Admin.Tests suite 76/76 (unchanged — the refactor is structural + no service-layer logic changed, so the existing DraftValidator + EquipmentService + AdminServicesIntegrationTests cover the consuming paths). No bUnit in this project so the cap behavior isn't unit-tested at the component level; DiffSection.OnParametersSet is small + deterministic (int counts + Take(RowCap)) + reviewed before ship.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 22:23:22 -04:00
cecb84fa5d Merge pull request (#136) - Admin RedundancyTab 2026-04-19 22:16:20 -04:00
Joseph Doherty
13d5a7968b Admin RedundancyTab — per-cluster read-only topology view. Closes the UI slice of task #149 (Phase 6.3 Stream E — Admin UI RedundancyTab + OpenTelemetry metrics + SignalR); the OpenTelemetry metrics + RoleChanged SignalR push are split into new follow-up task #198 because each is a structural add that deserves its own test matrix + NuGet-dep decision rather than riding this UI PR. New /clusters/{ClusterId} Redundancy tab slotted between ACLs and Audit in the existing ClusterDetail tab bar. Shows each ClusterNode row in the cluster with columns Node / Role (Primary green, Secondary blue, Standalone primary-blue badge) / Host / OPC UA port / ServiceLevel base / ApplicationUri (text-break so the long urn: doesn't blow out the table) / Enabled badge / Last seen (relative age via the same FormatAge helper as Hosts.razor, with a yellow "Stale" chip once LastSeenAt crosses the 30s threshold shared with HostStatusService.StaleThreshold — a missed heartbeat plus clock-skew buffer). Four summary cards above the table — total Nodes, Primary count, Secondary count, Stale count. Two guard-rail alerts: (a) red "No Primary or Standalone" when the cluster has no authoritative write target (all rows are Secondaries — read-only until one is promoted by the server-side RedundancyCoordinator apply-lease flow); (b) red "Split-brain" when >1 Primary exists — apply-lease enforcement at the coordinator level should have made this impossible, so the alert implies a hand-edited DB row + an investigation. New ClusterNodeService with ListByClusterAsync (ordered by ServiceLevelBase descending so Primary rows with higher base float to the top) + a static IsStale predicate matching HostStatusService's 30s convention. DI-registered alongside the existing scoped services in Program.cs. Writes (role swap, enable/disable) are deliberately absent from the service — they go through the RedundancyCoordinator apply-lease flow on the server side + direct DB mutation from Admin would race with it. New ClusterNodeServiceTests covering IsStale across null/recent/old LastSeenAt + ListByClusterAsync ordering + cluster filter. 4/4 new tests passing; full Admin.Tests suite 76/76 (was 72 before this PR, +4). Admin project builds 0 errors. Task #198 captures the deferred work: (1) OpenTelemetry Meter for primary/secondary/stale counts + role_transition counter with from/to/node tags + OTLP exporter config; (2) RoleChanged SignalR push — extend FleetStatusPoller to detect RedundancyRole changes on ClusterNode rows + emit a RoleChanged hub message so the RedundancyTab refreshes instantly instead of on-page-load polling.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 22:14:25 -04:00
d1686ed82d Merge pull request (#135) - Equipment CSV import UI 2026-04-19 22:02:36 -04:00
Joseph Doherty
ac69a1c39d Equipment CSV import UI — Stream B.3/B.5 operator page + EquipmentTab "Import CSV" button. Closes the UI slice of task #163 (Phase 6.4 Stream B.3/B.5); the ExternalIdReservation merge follow-up inside FinaliseBatchAsync is split into new task #197 so it gets a proper concurrent-insert test matrix rather than riding this UI PR. New /clusters/{ClusterId}/draft/{GenerationId}/import-equipment page driving the full staged-import flow end-to-end. Operator selects a driver instance + UNS line (both scoped to the draft generation via DriverInstanceService.ListAsync + UnsService.ListLinesAsync dropdowns), pastes or uploads a CSV (InputFile with 5 MiB cap so pathological files can't OOM the server), clicks Parse — EquipmentCsvImporter.Parse runs + shows two side-by-side cards (accepted rows in green with ZTag/Machine/Name/Line columns, rejected rows in red with line-number + reason). Click Stage + Finalise and the page calls CreateBatchAsync → StageRowsAsync → FinaliseBatchAsync in sequence using the authenticated user's identity as CreatedBy; on success, 600ms banner then NavigateTo back to the draft editor so operator sees the newly-imported rows in EquipmentTab without a manual refresh. Parse errors (missing version marker, bad header, malformed CSV) surface InvalidCsvFormatException.Message inline alongside the Parse button — no page reload needed to retry. Finalise errors surface the service-layer exception message (ImportBatchNotFoundException / ImportBatchAlreadyFinalisedException / any DbUpdate* exception from the atomic transaction) so operator sees exactly why the finalise rejected before the tx rolled back. EquipmentTab gains an "Import CSV…" button next to "Add equipment" that NavigateTo's the new page; it needs a ClusterId parameter to build the URL so the @code block adds [Parameter] string ClusterId, and DraftEditor now passes ClusterId="@ClusterId" alongside the existing GenerationId. EquipmentImportBatchService was already implemented in Phase 6.4 Stream B.4 but missing from the Admin DI container — this PR adds AddScoped so the @inject resolves. The FinaliseBatch docstring explicitly defers ExternalIdReservation merge as a narrower follow-up with a concurrent-insert test matrix — task #197 captures that work. For now the finalise may surface a DB-level UNIQUE-constraint violation if a ZTag conflict exists at commit time; the UI shows the raw message + the batch + staged rows are still in the DB for re-use once the conflict is resolved. Admin project builds 0 errors; Admin.Tests 72/72 passing.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 22:00:40 -04:00
30714831fa Merge pull request (#134) - Admin RoleGrants page 2026-04-19 21:48:14 -04:00
Joseph Doherty
44d4448b37 Admin RoleGrants page — LDAP-group → Admin-role mapping CRUD. Closes the RoleGrantsTab slice of task #144 (Phase 6.2 Stream D follow-up); the remaining three sub-items (Probe-this-permission on AclsTab, SignalR invalidation on role/ACL changes, draft-diff ACL section) are split into new follow-up task #196 so each can ship independently. The permission-trie evaluator + ILdapGroupRoleMappingService already exist from Phase 6.2 Streams A + B — this PR adds the consuming UI + the DI registration that was missing. New /role-grants page at Components/Pages/RoleGrants.razor registered in MainLayout's sidebar next to Certificates. Lists every LdapGroupRoleMapping row with columns LDAP group / Role / Scope (Fleet-wide or Cluster:X) / Created / Notes / Revoke. Add-grant form takes LDAP group DN + AdminRole dropdown (ConfigViewer, ConfigEditor, FleetAdmin) + Fleet-wide checkbox + Cluster dropdown (disabled when Fleet-wide checked) + optional Notes. Service-layer invariants — IsSystemWide=true + ClusterId=null, or IsSystemWide=false + ClusterId populated — enforced in ValidateInvariants; UI catches InvalidLdapGroupRoleMappingException and displays the message in a red alert. ILdapGroupRoleMappingService was present in the Configuration project from Stream A but never registered in the Admin DI container — this PR adds the AddScoped registration so the injection can resolve. Control-plane/data-plane separation note rendered in an info banner at the top of the page per decision #150 (these grants do NOT govern OPC UA data-path authorization; NodeAcl rows are read directly by the permission-trie evaluator without consulting role mappings). Admin project builds 0 errors; Admin.Tests 72/72 passing. Task #196 created to track: (1) AclsTab Probe-this-permission form that takes (ldap group, node path, permission flag) and runs it through the permission trie, showing which row granted it + the actual resolved grant; (2) SignalR invalidation — push a RoleGrantsChanged event when rows are created/deleted so connected Admin sessions reload without polling, ditto NodeAclChanged on ACL writes; (3) DiffViewer ACL section — show NodeAcl + LdapGroupRoleMapping deltas between draft + published alongside equipment/uns diffs.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 21:46:21 -04:00
572f8887e4 Merge pull request (#133) - IdentificationFields editor + edit mode 2026-04-19 21:43:18 -04:00
Joseph Doherty
2acea08ced Admin Equipment editor — IdentificationFields component + edit mode + three missing OPC 40010 fields. Closes the UI-editor slice of task #159 (Phase 6.4 Stream D remaining); the DriverNodeManager wire-in + ACL integration test are split into a new follow-up task #195 because they're blocked on a prerequisite that hasn't shipped — the DriverNodeManager does not currently materialize Equipment nodes at all (NodeScopeResolver has an explicit "A future resolver will..." TODO in its decomposition docstring). Shipping the IdentificationFolderBuilder call before the parent walker exists would wire a call that no code path hits, so the wire-in is deferred until the Equipment node walker lands first. New IdentificationFields.razor reusable component renders the 9-field decision #139 grid in a Bootstrap 3-column layout — Manufacturer, Model, SerialNumber, HardwareRevision, SoftwareRevision, YearOfConstruction (InputNumber), AssetLocation, ManufacturerUri (placeholder https://…), DeviceManualUri (placeholder https://…). Takes a required Equipment parameter + 2-way binds every field; no state of its own. Three fields that were missing from the old inline form — AssetLocation, ManufacturerUri, DeviceManualUri — now present, matching IdentificationFolderBuilder.FieldNames exactly. EquipmentTab.razor refactored to consume the component in both create + edit flows. Each table row gains an Edit button next to Remove. StartEdit clones the row into _draft so Cancel doesn't mutate the displayed list row with in-flight edits; on Save, UpdateAsync persists through EquipmentService's existing update path which already handles all 9 Identification fields. SaveAsync branches on _editMode — create still derives EquipmentId from a fresh Uuid via DraftValidator per decision #125, edit keeps the original EquipmentId + EquipmentUuid (immutable once set). FormName renamed equipment-form (was new-equipment) to work for both flows. Admin project builds 0 errors; Admin.Tests 72/72 passing. No new tests shipped — this PR is strictly a Razor-component refactor + two new bound fields + an Edit branch; the existing EquipmentService tests cover both the create + update paths. Task #195 created to track the blocked server-side work: call IdentificationFolderBuilder.Build from DriverNodeManager once the Equipment walker exists, plus an integration test browsing Equipment/Identification as an unauthorized user asserting BadUserAccessDenied per the builder's cross-reference note in docs/v2/acl-design.md §Identification.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 21:41:13 -04:00
49f6c9484e Merge pull request (#132) - Admin /hosts red-badge + Polly telemetry observer 2026-04-19 21:38:11 -04:00
Joseph Doherty
d06cc01a48 Admin /hosts red-badge + resilience columns + Polly telemetry observer. Closes task #164 (the remaining slice of Phase 6.1 Stream E.3 after the earlier publisher + hub PR). Three cooperating pieces wired together so the operator-facing /hosts table actually reflects the live Polly counters that the pipeline builder is producing. DriverResiliencePipelineBuilder gains an optional DriverResilienceStatusTracker ctor param — when non-null, every built pipeline wires Polly's OnRetry/OnOpened/OnClosed strategy-options callbacks into the tracker. OnRetry → tracker.RecordFailure (so ConsecutiveFailures climbs per retry), OnOpened → tracker.RecordBreakerOpen (stamps LastCircuitBreakerOpenUtc), OnClosed → tracker.RecordSuccess (resets the failure counter once the target recovers). Absent tracker = silent, preserving the unit-test constructor path + any deployment that doesn't care about resilience observability. Cancellation stays excluded from the failure count via the existing ShouldHandle predicate. HostStatusService.HostStatusRow extends with four new fields — ConsecutiveFailures, LastCircuitBreakerOpenUtc, CurrentBulkheadDepth, LastRecycleUtc — populated via a second LEFT JOIN onto DriverInstanceResilienceStatuses keyed on (DriverInstanceId, HostName). LEFT JOIN because brand-new hosts haven't been sampled yet; a missing row means zero failures + never-opened breaker, which is the correct default. New FailureFlagThreshold constant (=3, matches plan decision #143's conservative half-of-breaker convention) + IsFlagged predicate so the UI can pre-warn before the breaker actually trips. Hosts.razor paints three new columns between State and Last-transition — Fail# (bold red when flagged), In-flight (bulkhead-depth proxy), Breaker-opened (relative age). Per-row "Flagged" red badge alongside State when IsFlagged is true. Above the first cluster table, a red alert banner summarises the flagged-host count when ≥1 host is flagged, so operators see the problem before scanning rows. Three new tests in DriverResiliencePipelineBuilderTests — Tracker_RecordsFailure_OnEveryRetry verifies ConsecutiveFailures reaches RetryCount after a transient-forever operation, Tracker_StampsBreakerOpen_WhenBreakerTrips verifies LastBreakerOpenUtc is set after threshold failures on a Write pipeline, Tracker_IsolatesCounters_PerHost verifies one dead host does not leak failure counts into a healthy sibling. Full suite — Core.Tests 14/14 resilience-builder tests passing (11 existing + 3 new), Admin.Tests 72/72 passing, Admin project builds 0 errors. SignalR live push of status changes + browser visual review are deliberately left to a follow-up — this PR keeps the structural change minimal (polling refresh already exists in the page's 10s timer; SignalR would be a structural add that touches hub registration + client subscription).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 21:35:54 -04:00
5536e96b46 Merge pull request (#131) - AbCip UDT Template reader 2026-04-19 21:23:34 -04:00
24 changed files with 1915 additions and 67 deletions

View File

@@ -10,6 +10,7 @@
<li class="nav-item"><a class="nav-link text-light" href="/clusters">Clusters</a></li>
<li class="nav-item"><a class="nav-link text-light" href="/reservations">Reservations</a></li>
<li class="nav-item"><a class="nav-link text-light" href="/certificates">Certificates</a></li>
<li class="nav-item"><a class="nav-link text-light" href="/role-grants">Role grants</a></li>
</ul>
<div class="mt-5">

View File

@@ -52,6 +52,7 @@ else
<li class="nav-item"><button class="nav-link @Tab("namespaces")" @onclick='() => _tab = "namespaces"'>Namespaces</button></li>
<li class="nav-item"><button class="nav-link @Tab("drivers")" @onclick='() => _tab = "drivers"'>Drivers</button></li>
<li class="nav-item"><button class="nav-link @Tab("acls")" @onclick='() => _tab = "acls"'>ACLs</button></li>
<li class="nav-item"><button class="nav-link @Tab("redundancy")" @onclick='() => _tab = "redundancy"'>Redundancy</button></li>
<li class="nav-item"><button class="nav-link @Tab("audit")" @onclick='() => _tab = "audit"'>Audit</button></li>
</ul>
@@ -92,6 +93,10 @@ else
{
<AclsTab GenerationId="@_currentDraft.GenerationId" ClusterId="@ClusterId"/>
}
else if (_tab == "redundancy")
{
<RedundancyTab ClusterId="@ClusterId"/>
}
else if (_tab == "audit")
{
<AuditTab ClusterId="@ClusterId"/>

View File

@@ -0,0 +1,90 @@
@using ZB.MOM.WW.OtOpcUa.Admin.Services
@* Per-section diff renderer — the base used by DiffViewer for every known TableName. Caps
output at RowCap rows so a pathological draft (e.g. 20k tags churned) can't freeze the
Blazor render; overflow banner tells operator how many rows were hidden. *@
<div class="card mb-3">
<div class="card-header d-flex justify-content-between align-items-center">
<div>
<strong>@Title</strong>
<small class="text-muted ms-2">@Description</small>
</div>
<div>
@if (_added > 0) { <span class="badge bg-success me-1">+@_added</span> }
@if (_removed > 0) { <span class="badge bg-danger me-1">@_removed</span> }
@if (_modified > 0) { <span class="badge bg-warning text-dark me-1">~@_modified</span> }
@if (_total == 0) { <span class="badge bg-secondary">no changes</span> }
</div>
</div>
@if (_total == 0)
{
<div class="card-body text-muted small">No changes in this section.</div>
}
else
{
@if (_total > RowCap)
{
<div class="alert alert-warning mb-0 small rounded-0">
Showing the first @RowCap of @_total rows — cap protects the browser from megabyte-class
diffs. Inspect the remainder via the SQL <code>sp_ComputeGenerationDiff</code> directly.
</div>
}
<div class="table-responsive" style="max-height: 400px; overflow-y: auto;">
<table class="table table-sm table-hover mb-0">
<thead class="table-light">
<tr><th>LogicalId</th><th style="width: 120px;">Change</th></tr>
</thead>
<tbody>
@foreach (var r in _visibleRows)
{
<tr>
<td><code>@r.LogicalId</code></td>
<td>
@switch (r.ChangeKind)
{
case "Added": <span class="badge bg-success">@r.ChangeKind</span> break;
case "Removed": <span class="badge bg-danger">@r.ChangeKind</span> break;
case "Modified": <span class="badge bg-warning text-dark">@r.ChangeKind</span> break;
default: <span class="badge bg-secondary">@r.ChangeKind</span> break;
}
</td>
</tr>
}
</tbody>
</table>
</div>
}
</div>
@code {
/// <summary>Default row-cap per section — matches task #156's acceptance criterion.</summary>
public const int DefaultRowCap = 1000;
[Parameter, EditorRequired] public string Title { get; set; } = string.Empty;
[Parameter] public string Description { get; set; } = string.Empty;
[Parameter, EditorRequired] public IReadOnlyList<DiffRow> Rows { get; set; } = [];
[Parameter] public int RowCap { get; set; } = DefaultRowCap;
private int _total;
private int _added;
private int _removed;
private int _modified;
private List<DiffRow> _visibleRows = [];
protected override void OnParametersSet()
{
_total = Rows.Count;
_added = 0; _removed = 0; _modified = 0;
foreach (var r in Rows)
{
switch (r.ChangeKind)
{
case "Added": _added++; break;
case "Removed": _removed++; break;
case "Modified": _modified++; break;
}
}
_visibleRows = _total > RowCap ? Rows.Take(RowCap).ToList() : Rows.ToList();
}
}

View File

@@ -28,36 +28,44 @@ else if (_rows.Count == 0)
}
else
{
<table class="table table-hover table-sm">
<thead><tr><th>Table</th><th>LogicalId</th><th>ChangeKind</th></tr></thead>
<tbody>
@foreach (var r in _rows)
{
<tr>
<td>@r.TableName</td>
<td><code>@r.LogicalId</code></td>
<td>
@switch (r.ChangeKind)
{
case "Added": <span class="badge bg-success">@r.ChangeKind</span> break;
case "Removed": <span class="badge bg-danger">@r.ChangeKind</span> break;
case "Modified": <span class="badge bg-warning text-dark">@r.ChangeKind</span> break;
default: <span class="badge bg-secondary">@r.ChangeKind</span> break;
}
</td>
</tr>
}
</tbody>
</table>
<p class="small text-muted mb-3">
@_rows.Count row@(_rows.Count == 1 ? "" : "s") across @_sectionsWithChanges of @Sections.Count sections.
Each section is capped at @DiffSection.DefaultRowCap rows to keep the browser responsive on pathological drafts.
</p>
@foreach (var sec in Sections)
{
<DiffSection Title="@sec.Title"
Description="@sec.Description"
Rows="@RowsFor(sec.TableName)"/>
}
}
@code {
[Parameter] public string ClusterId { get; set; } = string.Empty;
[Parameter] public long GenerationId { get; set; }
/// <summary>
/// Ordered section definitions — each maps a <c>TableName</c> emitted by
/// <c>sp_ComputeGenerationDiff</c> to a human label + description. The proc currently
/// emits Namespace/DriverInstance/Equipment/Tag; UnsLine + NodeAcl entries render as
/// empty "no changes" cards until the proc is extended (tracked in tasks #196 + #156
/// follow-up). Six sections total matches the task #156 target.
/// </summary>
private static readonly IReadOnlyList<SectionDef> Sections = new[]
{
new SectionDef("Namespace", "Namespaces", "OPC UA namespace URIs + enablement"),
new SectionDef("DriverInstance", "Driver instances","Per-cluster driver configuration rows"),
new SectionDef("Equipment", "Equipment", "UNS level-5 rows + identification fields"),
new SectionDef("Tag", "Tags", "Per-device tag definitions + poll-group binding"),
new SectionDef("UnsLine", "UNS structure", "Site / Area / Line hierarchy (proc-extension pending)"),
new SectionDef("NodeAcl", "ACLs", "LDAP-group → node-scope permission grants (proc-extension pending)"),
};
private List<DiffRow>? _rows;
private string _fromLabel = "(empty)";
private string? _error;
private int _sectionsWithChanges;
protected override async Task OnParametersSetAsync()
{
@@ -67,7 +75,13 @@ else
var from = all.FirstOrDefault(g => g.Status == GenerationStatus.Published);
_fromLabel = from is null ? "(empty)" : $"gen {from.GenerationId}";
_rows = await GenerationSvc.ComputeDiffAsync(from?.GenerationId ?? 0, GenerationId, CancellationToken.None);
_sectionsWithChanges = Sections.Count(s => _rows.Any(r => r.TableName == s.TableName));
}
catch (Exception ex) { _error = ex.Message; }
}
private IReadOnlyList<DiffRow> RowsFor(string tableName) =>
_rows?.Where(r => r.TableName == tableName).ToList() ?? [];
private sealed record SectionDef(string TableName, string Title, string Description);
}

View File

@@ -27,7 +27,7 @@
<div class="row">
<div class="col-md-8">
@if (_tab == "equipment") { <EquipmentTab GenerationId="@GenerationId"/> }
@if (_tab == "equipment") { <EquipmentTab GenerationId="@GenerationId" ClusterId="@ClusterId"/> }
else if (_tab == "uns") { <UnsTab GenerationId="@GenerationId" ClusterId="@ClusterId"/> }
else if (_tab == "namespaces") { <NamespacesTab GenerationId="@GenerationId" ClusterId="@ClusterId"/> }
else if (_tab == "drivers") { <DriversTab GenerationId="@GenerationId" ClusterId="@ClusterId"/> }

View File

@@ -2,10 +2,14 @@
@using ZB.MOM.WW.OtOpcUa.Configuration.Entities
@using ZB.MOM.WW.OtOpcUa.Configuration.Validation
@inject EquipmentService EquipmentSvc
@inject NavigationManager Nav
<div class="d-flex justify-content-between mb-3">
<h4>Equipment (draft gen @GenerationId)</h4>
<button class="btn btn-primary btn-sm" @onclick="StartAdd">Add equipment</button>
<div>
<button class="btn btn-outline-primary btn-sm me-2" @onclick="GoImport">Import CSV…</button>
<button class="btn btn-primary btn-sm" @onclick="StartAdd">Add equipment</button>
</div>
</div>
@if (_equipment is null)
@@ -36,7 +40,10 @@ else if (_equipment.Count > 0)
<td>@e.SAPID</td>
<td>@e.Manufacturer / @e.Model</td>
<td>@e.SerialNumber</td>
<td><button class="btn btn-sm btn-outline-danger" @onclick="() => DeleteAsync(e.EquipmentRowId)">Remove</button></td>
<td>
<button class="btn btn-sm btn-outline-secondary me-1" @onclick="() => StartEdit(e)">Edit</button>
<button class="btn btn-sm btn-outline-danger" @onclick="() => DeleteAsync(e.EquipmentRowId)">Remove</button>
</td>
</tr>
}
</tbody>
@@ -47,8 +54,8 @@ else if (_equipment.Count > 0)
{
<div class="card mt-3">
<div class="card-body">
<h5>New equipment</h5>
<EditForm Model="_draft" OnValidSubmit="SaveAsync" FormName="new-equipment">
<h5>@(_editMode ? "Edit equipment" : "New equipment")</h5>
<EditForm Model="_draft" OnValidSubmit="SaveAsync" FormName="equipment-form">
<DataAnnotationsValidator/>
<div class="row g-3">
<div class="col-md-4">
@@ -78,24 +85,13 @@ else if (_equipment.Count > 0)
</div>
</div>
<h6 class="mt-4">OPC 40010 Identification</h6>
<div class="row g-3">
<div class="col-md-4"><label class="form-label">Manufacturer</label><InputText @bind-Value="_draft.Manufacturer" class="form-control"/></div>
<div class="col-md-4"><label class="form-label">Model</label><InputText @bind-Value="_draft.Model" class="form-control"/></div>
<div class="col-md-4"><label class="form-label">Serial number</label><InputText @bind-Value="_draft.SerialNumber" class="form-control"/></div>
<div class="col-md-4"><label class="form-label">Hardware rev</label><InputText @bind-Value="_draft.HardwareRevision" class="form-control"/></div>
<div class="col-md-4"><label class="form-label">Software rev</label><InputText @bind-Value="_draft.SoftwareRevision" class="form-control"/></div>
<div class="col-md-4">
<label class="form-label">Year of construction</label>
<InputNumber @bind-Value="_draft.YearOfConstruction" class="form-control"/>
</div>
</div>
<IdentificationFields Equipment="_draft"/>
@if (_error is not null) { <div class="alert alert-danger mt-3">@_error</div> }
<div class="mt-3">
<button type="submit" class="btn btn-primary btn-sm">Save</button>
<button type="button" class="btn btn-secondary btn-sm ms-2" @onclick="() => _showForm = false">Cancel</button>
<button type="button" class="btn btn-secondary btn-sm ms-2" @onclick="Cancel">Cancel</button>
</div>
</EditForm>
</div>
@@ -104,8 +100,12 @@ else if (_equipment.Count > 0)
@code {
[Parameter] public long GenerationId { get; set; }
[Parameter] public string ClusterId { get; set; } = string.Empty;
private void GoImport() => Nav.NavigateTo($"/clusters/{ClusterId}/draft/{GenerationId}/import-equipment");
private List<Equipment>? _equipment;
private bool _showForm;
private bool _editMode;
private Equipment _draft = NewBlankDraft();
private string? _error;
@@ -125,20 +125,68 @@ else if (_equipment.Count > 0)
private void StartAdd()
{
_draft = NewBlankDraft();
_editMode = false;
_error = null;
_showForm = true;
}
private void StartEdit(Equipment row)
{
// Shallow-clone so Cancel doesn't mutate the list-displayed row with in-flight form edits.
_draft = new Equipment
{
EquipmentRowId = row.EquipmentRowId,
GenerationId = row.GenerationId,
EquipmentId = row.EquipmentId,
EquipmentUuid = row.EquipmentUuid,
DriverInstanceId = row.DriverInstanceId,
DeviceId = row.DeviceId,
UnsLineId = row.UnsLineId,
Name = row.Name,
MachineCode = row.MachineCode,
ZTag = row.ZTag,
SAPID = row.SAPID,
Manufacturer = row.Manufacturer,
Model = row.Model,
SerialNumber = row.SerialNumber,
HardwareRevision = row.HardwareRevision,
SoftwareRevision = row.SoftwareRevision,
YearOfConstruction = row.YearOfConstruction,
AssetLocation = row.AssetLocation,
ManufacturerUri = row.ManufacturerUri,
DeviceManualUri = row.DeviceManualUri,
EquipmentClassRef = row.EquipmentClassRef,
Enabled = row.Enabled,
};
_editMode = true;
_error = null;
_showForm = true;
}
private void Cancel()
{
_showForm = false;
_editMode = false;
}
private async Task SaveAsync()
{
_error = null;
_draft.EquipmentUuid = Guid.NewGuid();
_draft.EquipmentId = DraftValidator.DeriveEquipmentId(_draft.EquipmentUuid);
_draft.GenerationId = GenerationId;
try
{
await EquipmentSvc.CreateAsync(GenerationId, _draft, CancellationToken.None);
if (_editMode)
{
await EquipmentSvc.UpdateAsync(_draft, CancellationToken.None);
}
else
{
_draft.EquipmentUuid = Guid.NewGuid();
_draft.EquipmentId = DraftValidator.DeriveEquipmentId(_draft.EquipmentUuid);
_draft.GenerationId = GenerationId;
await EquipmentSvc.CreateAsync(GenerationId, _draft, CancellationToken.None);
}
_showForm = false;
_editMode = false;
await ReloadAsync();
}
catch (Exception ex) { _error = ex.Message; }

View File

@@ -0,0 +1,49 @@
@using ZB.MOM.WW.OtOpcUa.Configuration.Entities
@* Reusable OPC 40010 Machinery Identification editor. Binds to an Equipment row and renders the
nine decision #139 fields in a consistent 3-column Bootstrap grid. Used by EquipmentTab's
create + edit forms so the same UI renders regardless of which flow opened it. *@
<h6 class="mt-4">OPC 40010 Identification</h6>
<div class="row g-3">
<div class="col-md-4">
<label class="form-label">Manufacturer</label>
<InputText @bind-Value="Equipment!.Manufacturer" class="form-control"/>
</div>
<div class="col-md-4">
<label class="form-label">Model</label>
<InputText @bind-Value="Equipment!.Model" class="form-control"/>
</div>
<div class="col-md-4">
<label class="form-label">Serial number</label>
<InputText @bind-Value="Equipment!.SerialNumber" class="form-control"/>
</div>
<div class="col-md-4">
<label class="form-label">Hardware rev</label>
<InputText @bind-Value="Equipment!.HardwareRevision" class="form-control"/>
</div>
<div class="col-md-4">
<label class="form-label">Software rev</label>
<InputText @bind-Value="Equipment!.SoftwareRevision" class="form-control"/>
</div>
<div class="col-md-4">
<label class="form-label">Year of construction</label>
<InputNumber @bind-Value="Equipment!.YearOfConstruction" class="form-control"/>
</div>
<div class="col-md-4">
<label class="form-label">Asset location</label>
<InputText @bind-Value="Equipment!.AssetLocation" class="form-control"/>
</div>
<div class="col-md-4">
<label class="form-label">Manufacturer URI</label>
<InputText @bind-Value="Equipment!.ManufacturerUri" class="form-control" placeholder="https://…"/>
</div>
<div class="col-md-4">
<label class="form-label">Device manual URI</label>
<InputText @bind-Value="Equipment!.DeviceManualUri" class="form-control" placeholder="https://…"/>
</div>
</div>
@code {
[Parameter, EditorRequired] public Equipment? Equipment { get; set; }
}

View File

@@ -0,0 +1,200 @@
@page "/clusters/{ClusterId}/draft/{GenerationId:long}/import-equipment"
@using Microsoft.AspNetCore.Components.Authorization
@using ZB.MOM.WW.OtOpcUa.Admin.Services
@using ZB.MOM.WW.OtOpcUa.Configuration.Entities
@inject DriverInstanceService DriverSvc
@inject UnsService UnsSvc
@inject EquipmentImportBatchService BatchSvc
@inject NavigationManager Nav
@inject AuthenticationStateProvider AuthProvider
<div class="d-flex justify-content-between align-items-center mb-3">
<div>
<h1 class="mb-0">Equipment CSV import</h1>
<small class="text-muted">Cluster <code>@ClusterId</code> · draft generation @GenerationId</small>
</div>
<a class="btn btn-outline-secondary" href="/clusters/@ClusterId/draft/@GenerationId">Back to draft</a>
</div>
<div class="alert alert-info small mb-3">
Accepts <code>@EquipmentCsvImporter.VersionMarker</code>-headered CSV per Stream B.3.
Required columns: @string.Join(", ", EquipmentCsvImporter.RequiredColumns).
Optional columns cover the OPC 40010 Identification fields. Paste the file contents
or upload directly — the parser runs client-stream-side and shows a row-level preview
before anything lands in the draft. ZTag + SAPID uniqueness across the fleet is NOT
enforced here yet (see task #197); for now the finalise may fail at commit time if a
reservation conflict exists.
</div>
<div class="card mb-3">
<div class="card-body">
<div class="row g-3">
<div class="col-md-5">
<label class="form-label">Target driver instance (for every accepted row)</label>
<select class="form-select" @bind="_driverInstanceId">
<option value="">-- select driver --</option>
@if (_drivers is not null)
{
@foreach (var d in _drivers) { <option value="@d.DriverInstanceId">@d.DriverInstanceId</option> }
}
</select>
</div>
<div class="col-md-5">
<label class="form-label">Target UNS line (for every accepted row)</label>
<select class="form-select" @bind="_unsLineId">
<option value="">-- select line --</option>
@if (_unsLines is not null)
{
@foreach (var l in _unsLines) { <option value="@l.UnsLineId">@l.UnsLineId — @l.Name</option> }
}
</select>
</div>
<div class="col-md-2 pt-4">
<InputFile OnChange="HandleFileAsync" class="form-control form-control-sm" accept=".csv,.txt"/>
</div>
</div>
<div class="mt-3">
<label class="form-label">CSV content (paste or uploaded)</label>
<textarea class="form-control font-monospace" rows="8" @bind="_csvText"
placeholder="# OtOpcUaCsv v1&#10;ZTag,MachineCode,SAPID,EquipmentId,…"/>
</div>
<div class="mt-3">
<button class="btn btn-sm btn-outline-primary" @onclick="ParseAsync" disabled="@_busy">Parse</button>
<button class="btn btn-sm btn-primary ms-2" @onclick="StageAndFinaliseAsync"
disabled="@(_parseResult is null || _parseResult.AcceptedRows.Count == 0 || string.IsNullOrWhiteSpace(_driverInstanceId) || string.IsNullOrWhiteSpace(_unsLineId) || _busy)">
Stage + Finalise
</button>
@if (_parseError is not null) { <span class="alert alert-danger ms-3 py-1 px-2 small">@_parseError</span> }
@if (_result is not null) { <span class="alert alert-success ms-3 py-1 px-2 small">@_result</span> }
</div>
</div>
</div>
@if (_parseResult is not null)
{
<div class="row g-3">
<div class="col-md-6">
<div class="card">
<div class="card-header bg-success text-white">
Accepted (@_parseResult.AcceptedRows.Count)
</div>
<div class="card-body p-0" style="max-height: 400px; overflow-y: auto;">
@if (_parseResult.AcceptedRows.Count == 0)
{
<p class="text-muted p-3 mb-0">No accepted rows.</p>
}
else
{
<table class="table table-sm table-striped mb-0">
<thead>
<tr><th>ZTag</th><th>Machine</th><th>Name</th><th>Line</th></tr>
</thead>
<tbody>
@foreach (var r in _parseResult.AcceptedRows)
{
<tr>
<td><code>@r.ZTag</code></td>
<td>@r.MachineCode</td>
<td>@r.Name</td>
<td>@r.UnsLineName</td>
</tr>
}
</tbody>
</table>
}
</div>
</div>
</div>
<div class="col-md-6">
<div class="card">
<div class="card-header bg-danger text-white">
Rejected (@_parseResult.RejectedRows.Count)
</div>
<div class="card-body p-0" style="max-height: 400px; overflow-y: auto;">
@if (_parseResult.RejectedRows.Count == 0)
{
<p class="text-muted p-3 mb-0">No rejections.</p>
}
else
{
<table class="table table-sm table-striped mb-0">
<thead><tr><th>Line</th><th>Reason</th></tr></thead>
<tbody>
@foreach (var e in _parseResult.RejectedRows)
{
<tr>
<td>@e.LineNumber</td>
<td class="small">@e.Reason</td>
</tr>
}
</tbody>
</table>
}
</div>
</div>
</div>
</div>
}
@code {
[Parameter] public string ClusterId { get; set; } = string.Empty;
[Parameter] public long GenerationId { get; set; }
private List<DriverInstance>? _drivers;
private List<UnsLine>? _unsLines;
private string _driverInstanceId = string.Empty;
private string _unsLineId = string.Empty;
private string _csvText = string.Empty;
private EquipmentCsvParseResult? _parseResult;
private string? _parseError;
private string? _result;
private bool _busy;
protected override async Task OnInitializedAsync()
{
_drivers = await DriverSvc.ListAsync(GenerationId, CancellationToken.None);
_unsLines = await UnsSvc.ListLinesAsync(GenerationId, CancellationToken.None);
}
private async Task HandleFileAsync(InputFileChangeEventArgs e)
{
// 5 MiB cap — refuses pathological uploads that would OOM the server.
using var stream = e.File.OpenReadStream(maxAllowedSize: 5 * 1024 * 1024);
using var reader = new StreamReader(stream);
_csvText = await reader.ReadToEndAsync();
}
private void ParseAsync()
{
_parseError = null;
_parseResult = null;
_result = null;
try { _parseResult = EquipmentCsvImporter.Parse(_csvText); }
catch (InvalidCsvFormatException ex) { _parseError = ex.Message; }
catch (Exception ex) { _parseError = $"Parse failed: {ex.Message}"; }
}
private async Task StageAndFinaliseAsync()
{
if (_parseResult is null) return;
_busy = true;
_result = null;
_parseError = null;
try
{
var auth = await AuthProvider.GetAuthenticationStateAsync();
var createdBy = auth.User.Identity?.Name ?? "unknown";
var batch = await BatchSvc.CreateBatchAsync(ClusterId, createdBy, CancellationToken.None);
await BatchSvc.StageRowsAsync(batch.Id, _parseResult.AcceptedRows, _parseResult.RejectedRows, CancellationToken.None);
await BatchSvc.FinaliseBatchAsync(batch.Id, GenerationId, _driverInstanceId, _unsLineId, CancellationToken.None);
_result = $"Finalised batch {batch.Id:N} — {_parseResult.AcceptedRows.Count} rows added.";
// Pause 600 ms so the success banner is visible, then navigate back.
await Task.Delay(600);
Nav.NavigateTo($"/clusters/{ClusterId}/draft/{GenerationId}");
}
catch (Exception ex) { _parseError = $"Finalise failed: {ex.Message}"; }
finally { _busy = false; }
}
}

View File

@@ -0,0 +1,136 @@
@using ZB.MOM.WW.OtOpcUa.Admin.Services
@using ZB.MOM.WW.OtOpcUa.Configuration.Entities
@using ZB.MOM.WW.OtOpcUa.Configuration.Enums
@inject ClusterNodeService NodeSvc
<h4>Redundancy topology</h4>
<p class="text-muted small">
One row per <code>ClusterNode</code> in this cluster. Role, <code>ApplicationUri</code>,
and <code>ServiceLevelBase</code> are authored separately; the Admin UI shows them read-only
here so operators can confirm the published topology without touching it. LastSeen older than
@((int)ClusterNodeService.StaleThreshold.TotalSeconds)s is flagged Stale — the node has
stopped heart-beating and is likely down. Role swap goes through the server-side
<code>RedundancyCoordinator</code> apply-lease flow, not direct DB edits.
</p>
@if (_nodes is null)
{
<p>Loading…</p>
}
else if (_nodes.Count == 0)
{
<div class="alert alert-warning">
No ClusterNode rows for this cluster. The server process needs at least one entry
(with a non-blank <code>ApplicationUri</code>) before it can start up per OPC UA spec.
</div>
}
else
{
var primaries = _nodes.Count(n => n.RedundancyRole == RedundancyRole.Primary);
var secondaries = _nodes.Count(n => n.RedundancyRole == RedundancyRole.Secondary);
var standalone = _nodes.Count(n => n.RedundancyRole == RedundancyRole.Standalone);
var staleCount = _nodes.Count(ClusterNodeService.IsStale);
<div class="row g-3 mb-4">
<div class="col-md-3"><div class="card"><div class="card-body">
<h6 class="text-muted mb-1">Nodes</h6>
<div class="fs-3">@_nodes.Count</div>
</div></div></div>
<div class="col-md-3"><div class="card border-success"><div class="card-body">
<h6 class="text-muted mb-1">Primary</h6>
<div class="fs-3 text-success">@primaries</div>
</div></div></div>
<div class="col-md-3"><div class="card border-info"><div class="card-body">
<h6 class="text-muted mb-1">Secondary</h6>
<div class="fs-3 text-info">@secondaries</div>
</div></div></div>
<div class="col-md-3"><div class="card @(staleCount > 0 ? "border-warning" : "")"><div class="card-body">
<h6 class="text-muted mb-1">Stale</h6>
<div class="fs-3 @(staleCount > 0 ? "text-warning" : "")">@staleCount</div>
</div></div></div>
</div>
@if (primaries == 0 && standalone == 0)
{
<div class="alert alert-danger small mb-3">
No Primary or Standalone node — the cluster has no authoritative write target. Secondaries
stay read-only until one of them gets promoted via <code>RedundancyCoordinator</code>.
</div>
}
else if (primaries > 1)
{
<div class="alert alert-danger small mb-3">
<strong>Split-brain:</strong> @primaries nodes claim the Primary role. Apply-lease
enforcement should have made this impossible at the coordinator level. Investigate
immediately — one of the rows was likely hand-edited.
</div>
}
<table class="table table-sm table-hover align-middle">
<thead>
<tr>
<th>Node</th>
<th>Role</th>
<th>Host</th>
<th class="text-end">OPC UA port</th>
<th class="text-end">ServiceLevel base</th>
<th>ApplicationUri</th>
<th>Enabled</th>
<th>Last seen</th>
</tr>
</thead>
<tbody>
@foreach (var n in _nodes)
{
<tr class="@RowClass(n)">
<td><code>@n.NodeId</code></td>
<td><span class="badge @RoleBadge(n.RedundancyRole)">@n.RedundancyRole</span></td>
<td>@n.Host</td>
<td class="text-end"><code>@n.OpcUaPort</code></td>
<td class="text-end">@n.ServiceLevelBase</td>
<td class="small text-break"><code>@n.ApplicationUri</code></td>
<td>
@if (n.Enabled) { <span class="badge bg-success">Enabled</span> }
else { <span class="badge bg-secondary">Disabled</span> }
</td>
<td class="small @(ClusterNodeService.IsStale(n) ? "text-warning fw-bold" : "")">
@(n.LastSeenAt is null ? "never" : FormatAge(n.LastSeenAt.Value))
@if (ClusterNodeService.IsStale(n)) { <span class="badge bg-warning text-dark ms-1">Stale</span> }
</td>
</tr>
}
</tbody>
</table>
}
@code {
[Parameter] public string ClusterId { get; set; } = string.Empty;
private List<ClusterNode>? _nodes;
protected override async Task OnParametersSetAsync()
{
_nodes = await NodeSvc.ListByClusterAsync(ClusterId, CancellationToken.None);
}
private static string RowClass(ClusterNode n) =>
ClusterNodeService.IsStale(n) ? "table-warning" :
!n.Enabled ? "table-secondary" : "";
private static string RoleBadge(RedundancyRole r) => r switch
{
RedundancyRole.Primary => "bg-success",
RedundancyRole.Secondary => "bg-info",
RedundancyRole.Standalone => "bg-primary",
_ => "bg-secondary",
};
private static string FormatAge(DateTime t)
{
var age = DateTime.UtcNow - t;
if (age.TotalSeconds < 60) return $"{(int)age.TotalSeconds}s ago";
if (age.TotalMinutes < 60) return $"{(int)age.TotalMinutes}m ago";
if (age.TotalHours < 24) return $"{(int)age.TotalHours}h ago";
return t.ToString("yyyy-MM-dd HH:mm 'UTC'");
}
}

View File

@@ -2,6 +2,13 @@
@using ZB.MOM.WW.OtOpcUa.Configuration.Entities
@inject UnsService UnsSvc
<div class="alert alert-info small mb-3">
Drag any line in the <strong>UNS Lines</strong> table onto an area row in <strong>UNS Areas</strong>
to re-parent it. A preview modal shows the impact (equipment re-home count) + lets you confirm
or cancel. If another operator modifies the draft while you're confirming, you'll see a 409
refresh-required modal instead of clobbering their work.
</div>
<div class="row">
<div class="col-md-6">
<div class="d-flex justify-content-between mb-2">
@@ -14,11 +21,20 @@
else
{
<table class="table table-sm">
<thead><tr><th>AreaId</th><th>Name</th></tr></thead>
<thead><tr><th>AreaId</th><th>Name</th><th class="small text-muted">(drop target)</th></tr></thead>
<tbody>
@foreach (var a in _areas)
{
<tr><td><code>@a.UnsAreaId</code></td><td>@a.Name</td></tr>
<tr class="@(_hoverAreaId == a.UnsAreaId ? "table-primary" : "")"
@ondragover="e => OnAreaDragOver(e, a.UnsAreaId)"
@ondragover:preventDefault
@ondragleave="() => _hoverAreaId = null"
@ondrop="() => OnLineDroppedAsync(a.UnsAreaId)"
@ondrop:preventDefault>
<td><code>@a.UnsAreaId</code></td>
<td>@a.Name</td>
<td class="small text-muted">drop here</td>
</tr>
}
</tbody>
</table>
@@ -35,6 +51,7 @@
</div>
}
</div>
<div class="col-md-6">
<div class="d-flex justify-content-between mb-2">
<h4>UNS Lines</h4>
@@ -50,7 +67,14 @@
<tbody>
@foreach (var l in _lines)
{
<tr><td><code>@l.UnsLineId</code></td><td><code>@l.UnsAreaId</code></td><td>@l.Name</td></tr>
<tr draggable="true"
@ondragstart="() => _dragLineId = l.UnsLineId"
@ondragend="() => { _dragLineId = null; _hoverAreaId = null; }"
style="cursor: grab;">
<td><code>@l.UnsLineId</code></td>
<td><code>@l.UnsAreaId</code></td>
<td>@l.Name</td>
</tr>
}
</tbody>
</table>
@@ -75,6 +99,64 @@
</div>
</div>
@* Preview / confirm modal for a pending drag-drop move *@
@if (_pendingPreview is not null)
{
<div class="modal show d-block" tabindex="-1" style="background-color: rgba(0,0,0,0.5);">
<div class="modal-dialog">
<div class="modal-content">
<div class="modal-header">
<h5 class="modal-title">Confirm UNS move</h5>
<button type="button" class="btn-close" @onclick="CancelMove"></button>
</div>
<div class="modal-body">
<p>@_pendingPreview.HumanReadableSummary</p>
<p class="text-muted small">
Equipment re-homed: <strong>@_pendingPreview.AffectedEquipmentCount</strong>.
Tags re-parented: <strong>@_pendingPreview.AffectedTagCount</strong>.
</p>
@if (_pendingPreview.CascadeWarnings.Count > 0)
{
<div class="alert alert-warning small mb-0">
<ul class="mb-0">
@foreach (var w in _pendingPreview.CascadeWarnings) { <li>@w</li> }
</ul>
</div>
}
</div>
<div class="modal-footer">
<button class="btn btn-secondary" @onclick="CancelMove">Cancel</button>
<button class="btn btn-primary" @onclick="ConfirmMoveAsync" disabled="@_committing">Confirm move</button>
</div>
</div>
</div>
</div>
}
@* 409 concurrent-edit modal — another operator changed the draft between preview + commit *@
@if (_conflictMessage is not null)
{
<div class="modal show d-block" tabindex="-1" style="background-color: rgba(0,0,0,0.5);">
<div class="modal-dialog">
<div class="modal-content border-danger">
<div class="modal-header bg-danger text-white">
<h5 class="modal-title">Draft changed — refresh required</h5>
</div>
<div class="modal-body">
<p>@_conflictMessage</p>
<p class="small text-muted">
Concurrency guard per DraftRevisionToken prevented overwriting the peer
operator's edit. Reload the tab + redo the move on the current draft state.
</p>
</div>
<div class="modal-footer">
<button class="btn btn-primary" @onclick="ReloadAfterConflict">Reload draft</button>
</div>
</div>
</div>
</div>
}
@code {
[Parameter] public long GenerationId { get; set; }
[Parameter] public string ClusterId { get; set; } = string.Empty;
@@ -87,6 +169,13 @@
private string _newLineName = string.Empty;
private string _newLineAreaId = string.Empty;
private string? _dragLineId;
private string? _hoverAreaId;
private UnsImpactPreview? _pendingPreview;
private UnsMoveOperation? _pendingMove;
private bool _committing;
private string? _conflictMessage;
protected override async Task OnParametersSetAsync() => await ReloadAsync();
private async Task ReloadAsync()
@@ -112,4 +201,72 @@
_showLineForm = false;
await ReloadAsync();
}
private void OnAreaDragOver(DragEventArgs _, string areaId) => _hoverAreaId = areaId;
private async Task OnLineDroppedAsync(string targetAreaId)
{
var lineId = _dragLineId;
_hoverAreaId = null;
_dragLineId = null;
if (string.IsNullOrWhiteSpace(lineId)) return;
var line = _lines?.FirstOrDefault(l => l.UnsLineId == lineId);
if (line is null || line.UnsAreaId == targetAreaId) return;
var snapshot = await UnsSvc.LoadSnapshotAsync(GenerationId, CancellationToken.None);
var move = new UnsMoveOperation(
Kind: UnsMoveKind.LineMove,
SourceClusterId: ClusterId,
TargetClusterId: ClusterId,
SourceLineId: lineId,
TargetAreaId: targetAreaId);
try
{
_pendingPreview = UnsImpactAnalyzer.Analyze(snapshot, move);
_pendingMove = move;
}
catch (Exception ex)
{
_conflictMessage = ex.Message; // CrossCluster or validation failure surfaces here
}
}
private void CancelMove()
{
_pendingPreview = null;
_pendingMove = null;
}
private async Task ConfirmMoveAsync()
{
if (_pendingPreview is null || _pendingMove is null) return;
_committing = true;
try
{
await UnsSvc.MoveLineAsync(
GenerationId,
_pendingPreview.RevisionToken,
_pendingMove.SourceLineId!,
_pendingMove.TargetAreaId!,
CancellationToken.None);
_pendingPreview = null;
_pendingMove = null;
await ReloadAsync();
}
catch (DraftRevisionConflictException ex)
{
_pendingPreview = null;
_pendingMove = null;
_conflictMessage = ex.Message;
}
finally { _committing = false; }
}
private async Task ReloadAfterConflict()
{
_conflictMessage = null;
await ReloadAsync();
}
}

View File

@@ -56,6 +56,16 @@ else
</div></div></div>
</div>
@if (_rows.Any(HostStatusService.IsFlagged))
{
var flaggedCount = _rows.Count(HostStatusService.IsFlagged);
<div class="alert alert-danger small mb-3">
<strong>@flaggedCount host@(flaggedCount == 1 ? "" : "s")</strong>
reporting ≥ @HostStatusService.FailureFlagThreshold consecutive failures — circuit breaker
may trip soon. Inspect the resilience columns below to locate.
</div>
}
@foreach (var cluster in _rows.GroupBy(r => r.ClusterId ?? "(unassigned)").OrderBy(g => g.Key))
{
<h2 class="h5 mt-4">Cluster: <code>@cluster.Key</code></h2>
@@ -66,6 +76,9 @@ else
<th>Driver</th>
<th>Host</th>
<th>State</th>
<th class="text-end" title="Consecutive failures — resets when a call succeeds or the breaker closes">Fail#</th>
<th class="text-end" title="In-flight capability calls (bulkhead-depth proxy)">In-flight</th>
<th>Breaker opened</th>
<th>Last transition</th>
<th>Last seen</th>
<th>Detail</th>
@@ -84,10 +97,21 @@ else
{
<span class="badge bg-warning text-dark ms-1">Stale</span>
}
@if (HostStatusService.IsFlagged(r))
{
<span class="badge bg-danger ms-1" title="≥ @HostStatusService.FailureFlagThreshold consecutive failures">Flagged</span>
}
</td>
<td class="text-end small @(HostStatusService.IsFlagged(r) ? "text-danger fw-bold" : "")">
@r.ConsecutiveFailures
</td>
<td class="text-end small">@r.CurrentBulkheadDepth</td>
<td class="small">
@(r.LastCircuitBreakerOpenUtc is null ? "—" : FormatAge(r.LastCircuitBreakerOpenUtc.Value))
</td>
<td class="small">@FormatAge(r.StateChangedUtc)</td>
<td class="small @(HostStatusService.IsStale(r) ? "text-warning" : "")">@FormatAge(r.LastSeenUtc)</td>
<td class="text-truncate small" style="max-width: 320px;" title="@r.Detail">@r.Detail</td>
<td class="text-truncate small" style="max-width: 240px;" title="@r.Detail">@r.Detail</td>
</tr>
}
</tbody>

View File

@@ -0,0 +1,161 @@
@page "/role-grants"
@using ZB.MOM.WW.OtOpcUa.Admin.Services
@using ZB.MOM.WW.OtOpcUa.Configuration.Entities
@using ZB.MOM.WW.OtOpcUa.Configuration.Enums
@using ZB.MOM.WW.OtOpcUa.Configuration.Services
@inject ILdapGroupRoleMappingService RoleSvc
@inject ClusterService ClusterSvc
<h1 class="mb-4">LDAP group → Admin role grants</h1>
<div class="alert alert-info small mb-4">
Maps LDAP groups to Admin UI roles (ConfigViewer / ConfigEditor / FleetAdmin). Control-plane
only — OPC UA data-path authorization reads <code>NodeAcl</code> rows directly and is
unaffected by these mappings (see decision #150). A fleet-wide grant applies across every
cluster; a cluster-scoped grant only binds within the named cluster. The same LDAP group
may hold different roles on different clusters.
</div>
<div class="d-flex justify-content-end mb-3">
<button class="btn btn-primary btn-sm" @onclick="StartAdd">Add grant</button>
</div>
@if (_rows is null)
{
<p>Loading…</p>
}
else if (_rows.Count == 0)
{
<p class="text-muted">No role grants defined yet. Without at least one FleetAdmin grant,
only the bootstrap admin can publish drafts.</p>
}
else
{
<table class="table table-sm table-hover">
<thead>
<tr><th>LDAP group</th><th>Role</th><th>Scope</th><th>Created</th><th>Notes</th><th></th></tr>
</thead>
<tbody>
@foreach (var r in _rows)
{
<tr>
<td><code>@r.LdapGroup</code></td>
<td><span class="badge bg-secondary">@r.Role</span></td>
<td>@(r.IsSystemWide ? "Fleet-wide" : $"Cluster: {r.ClusterId}")</td>
<td class="small">@r.CreatedAtUtc.ToString("yyyy-MM-dd")</td>
<td class="small text-muted">@r.Notes</td>
<td><button class="btn btn-sm btn-outline-danger" @onclick="() => DeleteAsync(r.Id)">Revoke</button></td>
</tr>
}
</tbody>
</table>
}
@if (_showForm)
{
<div class="card mt-3">
<div class="card-body">
<h5>New role grant</h5>
<div class="row g-3">
<div class="col-md-4">
<label class="form-label">LDAP group (DN)</label>
<input class="form-control" @bind="_group" placeholder="cn=fleet-admin,ou=groups,dc=…"/>
</div>
<div class="col-md-3">
<label class="form-label">Role</label>
<select class="form-select" @bind="_role">
@foreach (var r in Enum.GetValues<AdminRole>())
{
<option value="@r">@r</option>
}
</select>
</div>
<div class="col-md-2 pt-4">
<div class="form-check">
<input class="form-check-input" type="checkbox" id="systemWide" @bind="_isSystemWide"/>
<label class="form-check-label" for="systemWide">Fleet-wide</label>
</div>
</div>
<div class="col-md-3">
<label class="form-label">Cluster @(_isSystemWide ? "(disabled)" : "")</label>
<select class="form-select" @bind="_clusterId" disabled="@_isSystemWide">
<option value="">-- select --</option>
@if (_clusters is not null)
{
@foreach (var c in _clusters)
{
<option value="@c.ClusterId">@c.ClusterId</option>
}
}
</select>
</div>
<div class="col-12">
<label class="form-label">Notes (optional)</label>
<input class="form-control" @bind="_notes"/>
</div>
</div>
@if (_error is not null) { <div class="alert alert-danger mt-3">@_error</div> }
<div class="mt-3">
<button class="btn btn-sm btn-primary" @onclick="SaveAsync">Save</button>
<button class="btn btn-sm btn-secondary ms-2" @onclick="() => _showForm = false">Cancel</button>
</div>
</div>
</div>
}
@code {
private IReadOnlyList<LdapGroupRoleMapping>? _rows;
private List<ServerCluster>? _clusters;
private bool _showForm;
private string _group = string.Empty;
private AdminRole _role = AdminRole.ConfigViewer;
private bool _isSystemWide;
private string _clusterId = string.Empty;
private string? _notes;
private string? _error;
protected override async Task OnInitializedAsync() => await ReloadAsync();
private async Task ReloadAsync()
{
_rows = await RoleSvc.ListAllAsync(CancellationToken.None);
_clusters = await ClusterSvc.ListAsync(CancellationToken.None);
}
private void StartAdd()
{
_group = string.Empty;
_role = AdminRole.ConfigViewer;
_isSystemWide = false;
_clusterId = string.Empty;
_notes = null;
_error = null;
_showForm = true;
}
private async Task SaveAsync()
{
_error = null;
try
{
var row = new LdapGroupRoleMapping
{
LdapGroup = _group.Trim(),
Role = _role,
IsSystemWide = _isSystemWide,
ClusterId = _isSystemWide ? null : (string.IsNullOrWhiteSpace(_clusterId) ? null : _clusterId),
Notes = string.IsNullOrWhiteSpace(_notes) ? null : _notes,
};
await RoleSvc.CreateAsync(row, CancellationToken.None);
_showForm = false;
await ReloadAsync();
}
catch (Exception ex) { _error = ex.Message; }
}
private async Task DeleteAsync(Guid id)
{
await RoleSvc.DeleteAsync(id, CancellationToken.None);
await ReloadAsync();
}
}

View File

@@ -48,6 +48,10 @@ builder.Services.AddScoped<ReservationService>();
builder.Services.AddScoped<DraftValidationService>();
builder.Services.AddScoped<AuditLogService>();
builder.Services.AddScoped<HostStatusService>();
builder.Services.AddScoped<ClusterNodeService>();
builder.Services.AddScoped<EquipmentImportBatchService>();
builder.Services.AddScoped<ZB.MOM.WW.OtOpcUa.Configuration.Services.ILdapGroupRoleMappingService,
ZB.MOM.WW.OtOpcUa.Configuration.Services.LdapGroupRoleMappingService>();
// Cert-trust management — reads the OPC UA server's PKI store root so rejected client certs
// can be promoted to trusted via the Admin UI. Singleton: no per-request state, just

View File

@@ -0,0 +1,28 @@
using Microsoft.EntityFrameworkCore;
using ZB.MOM.WW.OtOpcUa.Configuration;
using ZB.MOM.WW.OtOpcUa.Configuration.Entities;
namespace ZB.MOM.WW.OtOpcUa.Admin.Services;
/// <summary>
/// Read-side service for ClusterNode rows + their cluster-scoped redundancy view. Consumed
/// by the RedundancyTab on the cluster detail page. Writes (role swap, node enable/disable)
/// are not supported here — role swap happens through the RedundancyCoordinator apply-lease
/// flow on the server side and would conflict with any direct DB mutation from Admin.
/// </summary>
public sealed class ClusterNodeService(OtOpcUaConfigDbContext db)
{
/// <summary>Stale-threshold matching <c>HostStatusService.StaleThreshold</c> — 30s of clock
/// tolerance covers a missed heartbeat plus publisher GC pauses.</summary>
public static readonly TimeSpan StaleThreshold = TimeSpan.FromSeconds(30);
public Task<List<ClusterNode>> ListByClusterAsync(string clusterId, CancellationToken ct) =>
db.ClusterNodes.AsNoTracking()
.Where(n => n.ClusterId == clusterId)
.OrderByDescending(n => n.ServiceLevelBase)
.ThenBy(n => n.NodeId)
.ToListAsync(ct);
public static bool IsStale(ClusterNode node) =>
node.LastSeenAt is null || DateTime.UtcNow - node.LastSeenAt.Value > StaleThreshold;
}

View File

@@ -2,6 +2,7 @@ using Microsoft.EntityFrameworkCore;
using ZB.MOM.WW.OtOpcUa.Admin.Services;
using ZB.MOM.WW.OtOpcUa.Configuration;
using ZB.MOM.WW.OtOpcUa.Configuration.Entities;
using ZB.MOM.WW.OtOpcUa.Configuration.Enums;
namespace ZB.MOM.WW.OtOpcUa.Admin.Services;
@@ -152,14 +153,37 @@ public sealed class EquipmentImportBatchService(OtOpcUaConfigDbContext db)
try
{
foreach (var row in batch.Rows.Where(r => r.IsAccepted))
// Snapshot active reservations that overlap this batch's ZTag + SAPID set — one
// round-trip instead of N. Released rows (ReleasedAt IS NOT NULL) are ignored so
// an explicitly-released value can be reused.
var accepted = batch.Rows.Where(r => r.IsAccepted).ToList();
var zTags = accepted.Where(r => !string.IsNullOrWhiteSpace(r.ZTag))
.Select(r => r.ZTag).Distinct(StringComparer.OrdinalIgnoreCase).ToList();
var sapIds = accepted.Where(r => !string.IsNullOrWhiteSpace(r.SAPID))
.Select(r => r.SAPID).Distinct(StringComparer.OrdinalIgnoreCase).ToList();
var existingReservations = await db.ExternalIdReservations
.Where(r => r.ReleasedAt == null &&
((r.Kind == ReservationKind.ZTag && zTags.Contains(r.Value)) ||
(r.Kind == ReservationKind.SAPID && sapIds.Contains(r.Value))))
.ToListAsync(ct).ConfigureAwait(false);
var resByKey = existingReservations.ToDictionary(
r => (r.Kind, r.Value.ToLowerInvariant()),
r => r);
var nowUtc = DateTime.UtcNow;
var firstPublishedBy = batch.CreatedBy;
foreach (var row in accepted)
{
var equipmentUuid = Guid.TryParse(row.EquipmentUuid, out var u) ? u : Guid.NewGuid();
db.Equipment.Add(new Equipment
{
EquipmentRowId = Guid.NewGuid(),
GenerationId = generationId,
EquipmentId = row.EquipmentId,
EquipmentUuid = Guid.TryParse(row.EquipmentUuid, out var u) ? u : Guid.NewGuid(),
EquipmentUuid = equipmentUuid,
DriverInstanceId = driverInstanceIdForRows,
UnsLineId = unsLineIdForRows,
Name = row.Name,
@@ -176,10 +200,25 @@ public sealed class EquipmentImportBatchService(OtOpcUaConfigDbContext db)
ManufacturerUri = row.ManufacturerUri,
DeviceManualUri = row.DeviceManualUri,
});
MergeReservation(row.ZTag, ReservationKind.ZTag, equipmentUuid, batch.ClusterId,
firstPublishedBy, nowUtc, resByKey);
MergeReservation(row.SAPID, ReservationKind.SAPID, equipmentUuid, batch.ClusterId,
firstPublishedBy, nowUtc, resByKey);
}
batch.FinalisedAtUtc = DateTime.UtcNow;
await db.SaveChangesAsync(ct).ConfigureAwait(false);
batch.FinalisedAtUtc = nowUtc;
try
{
await db.SaveChangesAsync(ct).ConfigureAwait(false);
}
catch (DbUpdateException ex) when (IsReservationUniquenessViolation(ex))
{
throw new ExternalIdReservationConflictException(
"Finalise rejected: one or more ZTag/SAPID values were reserved by another operator " +
"between batch preview and commit. Inspect active reservations + retry after resolving the conflict.",
ex);
}
if (tx is not null) await tx.CommitAsync(ct).ConfigureAwait(false);
}
catch
@@ -193,6 +232,71 @@ public sealed class EquipmentImportBatchService(OtOpcUaConfigDbContext db)
}
}
/// <summary>
/// Merge one external-ID reservation for an equipment row. Three outcomes:
/// (1) value is empty → skip; (2) reservation exists for same <paramref name="equipmentUuid"/>
/// → bump <c>LastPublishedAt</c>; (3) reservation exists for a different EquipmentUuid
/// → throw <see cref="ExternalIdReservationConflictException"/> with the conflicting UUID
/// so the caller sees which equipment already owns the value; (4) no reservation → create new.
/// </summary>
private void MergeReservation(
string? value,
ReservationKind kind,
Guid equipmentUuid,
string clusterId,
string firstPublishedBy,
DateTime nowUtc,
Dictionary<(ReservationKind, string), ExternalIdReservation> cache)
{
if (string.IsNullOrWhiteSpace(value)) return;
var key = (kind, value.ToLowerInvariant());
if (cache.TryGetValue(key, out var existing))
{
if (existing.EquipmentUuid != equipmentUuid)
throw new ExternalIdReservationConflictException(
$"{kind} '{value}' is already reserved by EquipmentUuid {existing.EquipmentUuid} " +
$"(first published {existing.FirstPublishedAt:u} on cluster '{existing.ClusterId}'). " +
$"Refusing to re-assign to {equipmentUuid}.");
existing.LastPublishedAt = nowUtc;
return;
}
var fresh = new ExternalIdReservation
{
ReservationId = Guid.NewGuid(),
Kind = kind,
Value = value,
EquipmentUuid = equipmentUuid,
ClusterId = clusterId,
FirstPublishedAt = nowUtc,
FirstPublishedBy = firstPublishedBy,
LastPublishedAt = nowUtc,
};
db.ExternalIdReservations.Add(fresh);
cache[key] = fresh;
}
/// <summary>
/// True when the <see cref="DbUpdateException"/> root-cause was the filtered-unique
/// index <c>UX_ExternalIdReservation_KindValue_Active</c> — i.e. another transaction
/// won the race between our cache-load + commit. SQL Server surfaces this as 2601 / 2627.
/// </summary>
private static bool IsReservationUniquenessViolation(DbUpdateException ex)
{
for (Exception? inner = ex; inner is not null; inner = inner.InnerException)
{
if (inner is Microsoft.Data.SqlClient.SqlException sql &&
(sql.Number == 2601 || sql.Number == 2627) &&
sql.Message.Contains("UX_ExternalIdReservation_KindValue_Active", StringComparison.OrdinalIgnoreCase))
{
return true;
}
}
return false;
}
/// <summary>List batches created by the given user. Finalised batches are archived; include them on demand.</summary>
public async Task<IReadOnlyList<EquipmentImportBatch>> ListByUserAsync(string createdBy, bool includeFinalised, CancellationToken ct)
{
@@ -205,3 +309,16 @@ public sealed class EquipmentImportBatchService(OtOpcUaConfigDbContext db)
public sealed class ImportBatchNotFoundException(string message) : Exception(message);
public sealed class ImportBatchAlreadyFinalisedException(string message) : Exception(message);
/// <summary>
/// Thrown when a <c>FinaliseBatchAsync</c> call detects that one of its ZTag/SAPID values is
/// already reserved by a different EquipmentUuid — either from a prior published generation
/// or a concurrent finalise that won the race. The operator sees the message + the conflicting
/// equipment ownership so they can resolve the conflict (pick a new ZTag, release the existing
/// reservation via <c>sp_ReleaseExternalIdReservation</c>, etc.) and retry the finalise.
/// </summary>
public sealed class ExternalIdReservationConflictException : Exception
{
public ExternalIdReservationConflictException(string message) : base(message) { }
public ExternalIdReservationConflictException(string message, Exception inner) : base(message, inner) { }
}

View File

@@ -7,8 +7,9 @@ namespace ZB.MOM.WW.OtOpcUa.Admin.Services;
/// <summary>
/// One row per <see cref="DriverHostStatus"/> record, enriched with the owning
/// <c>ClusterNode.ClusterId</c> when available (left-join). The Admin <c>/hosts</c> page
/// groups by cluster and renders a per-node → per-driver → per-host tree.
/// <c>ClusterNode.ClusterId</c> (left-join) + the per-<c>(DriverInstanceId, HostName)</c>
/// <see cref="DriverInstanceResilienceStatus"/> counters (also left-join) so the Admin
/// <c>/hosts</c> page renders the resilience surface inline with host state.
/// </summary>
public sealed record HostStatusRow(
string NodeId,
@@ -18,7 +19,11 @@ public sealed record HostStatusRow(
DriverHostState State,
DateTime StateChangedUtc,
DateTime LastSeenUtc,
string? Detail);
string? Detail,
int ConsecutiveFailures,
DateTime? LastCircuitBreakerOpenUtc,
int CurrentBulkheadDepth,
DateTime? LastRecycleUtc);
/// <summary>
/// Read-side service for the Admin UI's per-host drill-down. Loads
@@ -36,15 +41,26 @@ public sealed class HostStatusService(OtOpcUaConfigDbContext db)
{
public static readonly TimeSpan StaleThreshold = TimeSpan.FromSeconds(30);
/// <summary>Consecutive-failure threshold at which <see cref="IsFlagged"/> returns <c>true</c>
/// so the Admin UI can paint a red badge. Matches Phase 6.1 decision #143's conservative
/// half-of-breaker-threshold convention — flags before the breaker actually opens.</summary>
public const int FailureFlagThreshold = 3;
public async Task<IReadOnlyList<HostStatusRow>> ListAsync(CancellationToken ct = default)
{
// LEFT JOIN on NodeId so a row persists even when its owning ClusterNode row hasn't
// been created yet (first-boot bootstrap case — keeps the UI from losing sight of
// the reporting server).
// Two LEFT JOINs:
// 1. ClusterNodes on NodeId — row persists even when its owning ClusterNode row
// hasn't been created yet (first-boot bootstrap case).
// 2. DriverInstanceResilienceStatuses on (DriverInstanceId, HostName) — resilience
// counters haven't been sampled yet for brand-new hosts, so a missing row means
// zero failures + never-opened breaker.
var rows = await (from s in db.DriverHostStatuses.AsNoTracking()
join n in db.ClusterNodes.AsNoTracking()
on s.NodeId equals n.NodeId into nodeJoin
from n in nodeJoin.DefaultIfEmpty()
join r in db.DriverInstanceResilienceStatuses.AsNoTracking()
on new { s.DriverInstanceId, s.HostName } equals new { r.DriverInstanceId, r.HostName } into resilJoin
from r in resilJoin.DefaultIfEmpty()
orderby s.NodeId, s.DriverInstanceId, s.HostName
select new HostStatusRow(
s.NodeId,
@@ -54,10 +70,21 @@ public sealed class HostStatusService(OtOpcUaConfigDbContext db)
s.State,
s.StateChangedUtc,
s.LastSeenUtc,
s.Detail)).ToListAsync(ct);
s.Detail,
r != null ? r.ConsecutiveFailures : 0,
r != null ? r.LastCircuitBreakerOpenUtc : null,
r != null ? r.CurrentBulkheadDepth : 0,
r != null ? r.LastRecycleUtc : null)).ToListAsync(ct);
return rows;
}
public static bool IsStale(HostStatusRow row) =>
DateTime.UtcNow - row.LastSeenUtc > StaleThreshold;
/// <summary>
/// Red-badge predicate — <c>true</c> when the host has accumulated enough consecutive
/// failures that an operator should take notice before the breaker trips.
/// </summary>
public static bool IsFlagged(HostStatusRow row) =>
row.ConsecutiveFailures >= FailureFlagThreshold;
}

View File

@@ -1,3 +1,5 @@
using System.Security.Cryptography;
using System.Text;
using Microsoft.EntityFrameworkCore;
using ZB.MOM.WW.OtOpcUa.Configuration;
using ZB.MOM.WW.OtOpcUa.Configuration.Entities;
@@ -47,4 +49,132 @@ public sealed class UnsService(OtOpcUaConfigDbContext db)
await db.SaveChangesAsync(ct);
return line;
}
/// <summary>
/// Build the full UNS tree snapshot for the analyzer. Walks areas + lines in the draft
/// and counts equipment + tags per line. Returns the snapshot plus a deterministic
/// revision token computed by SHA-256'ing the sorted (kind, id, parent, name) tuples —
/// stable across processes + changes whenever any row is added / modified / deleted.
/// </summary>
public async Task<UnsTreeSnapshot> LoadSnapshotAsync(long generationId, CancellationToken ct)
{
var areas = await db.UnsAreas.AsNoTracking()
.Where(a => a.GenerationId == generationId)
.OrderBy(a => a.UnsAreaId)
.ToListAsync(ct);
var lines = await db.UnsLines.AsNoTracking()
.Where(l => l.GenerationId == generationId)
.OrderBy(l => l.UnsLineId)
.ToListAsync(ct);
var equipmentCounts = await db.Equipment.AsNoTracking()
.Where(e => e.GenerationId == generationId)
.GroupBy(e => e.UnsLineId)
.Select(g => new { LineId = g.Key, Count = g.Count() })
.ToListAsync(ct);
var equipmentByLine = equipmentCounts.ToDictionary(x => x.LineId, x => x.Count, StringComparer.OrdinalIgnoreCase);
var lineSummaries = lines.Select(l =>
new UnsLineSummary(
LineId: l.UnsLineId,
Name: l.Name,
EquipmentCount: equipmentByLine.GetValueOrDefault(l.UnsLineId),
TagCount: 0)).ToList();
var areaSummaries = areas.Select(a =>
new UnsAreaSummary(
AreaId: a.UnsAreaId,
Name: a.Name,
LineIds: lines.Where(l => string.Equals(l.UnsAreaId, a.UnsAreaId, StringComparison.OrdinalIgnoreCase))
.Select(l => l.UnsLineId).ToList())).ToList();
return new UnsTreeSnapshot
{
DraftGenerationId = generationId,
RevisionToken = ComputeRevisionToken(areas, lines),
Areas = areaSummaries,
Lines = lineSummaries,
};
}
/// <summary>
/// Atomic re-parent of a line to a new area inside the same draft. The caller must pass
/// the revision token it observed at preview time — a mismatch raises
/// <see cref="DraftRevisionConflictException"/> so the UI can show the 409 concurrent-edit
/// modal instead of silently overwriting a peer's work.
/// </summary>
public async Task MoveLineAsync(
long generationId,
DraftRevisionToken expected,
string lineId,
string targetAreaId,
CancellationToken ct)
{
ArgumentNullException.ThrowIfNull(expected);
ArgumentException.ThrowIfNullOrWhiteSpace(lineId);
ArgumentException.ThrowIfNullOrWhiteSpace(targetAreaId);
var supportsTx = db.Database.IsRelational();
Microsoft.EntityFrameworkCore.Storage.IDbContextTransaction? tx = null;
if (supportsTx) tx = await db.Database.BeginTransactionAsync(ct).ConfigureAwait(false);
try
{
var areas = await db.UnsAreas
.Where(a => a.GenerationId == generationId)
.OrderBy(a => a.UnsAreaId)
.ToListAsync(ct);
var lines = await db.UnsLines
.Where(l => l.GenerationId == generationId)
.OrderBy(l => l.UnsLineId)
.ToListAsync(ct);
var current = ComputeRevisionToken(areas, lines);
if (!current.Matches(expected))
throw new DraftRevisionConflictException(
$"Draft {generationId} changed since preview. Expected revision {expected.Value}, saw {current.Value}. " +
"Refresh + redo the move.");
var line = lines.FirstOrDefault(l => string.Equals(l.UnsLineId, lineId, StringComparison.OrdinalIgnoreCase))
?? throw new InvalidOperationException($"Line '{lineId}' not found in draft {generationId}.");
if (!areas.Any(a => string.Equals(a.UnsAreaId, targetAreaId, StringComparison.OrdinalIgnoreCase)))
throw new InvalidOperationException($"Target area '{targetAreaId}' not found in draft {generationId}.");
if (string.Equals(line.UnsAreaId, targetAreaId, StringComparison.OrdinalIgnoreCase))
return; // no-op drop — same area
line.UnsAreaId = targetAreaId;
await db.SaveChangesAsync(ct);
if (tx is not null) await tx.CommitAsync(ct).ConfigureAwait(false);
}
catch
{
if (tx is not null) await tx.RollbackAsync(ct).ConfigureAwait(false);
throw;
}
finally
{
if (tx is not null) await tx.DisposeAsync().ConfigureAwait(false);
}
}
private static DraftRevisionToken ComputeRevisionToken(IReadOnlyList<UnsArea> areas, IReadOnlyList<UnsLine> lines)
{
var sb = new StringBuilder(capacity: 256 + (areas.Count + lines.Count) * 80);
foreach (var a in areas.OrderBy(a => a.UnsAreaId, StringComparer.Ordinal))
sb.Append("A:").Append(a.UnsAreaId).Append('|').Append(a.Name).Append('|').Append(a.Notes ?? "").Append(';');
foreach (var l in lines.OrderBy(l => l.UnsLineId, StringComparer.Ordinal))
sb.Append("L:").Append(l.UnsLineId).Append('|').Append(l.UnsAreaId).Append('|').Append(l.Name).Append('|').Append(l.Notes ?? "").Append(';');
var hash = SHA256.HashData(Encoding.UTF8.GetBytes(sb.ToString()));
return new DraftRevisionToken(Convert.ToHexStringLower(hash)[..16]);
}
}
/// <summary>Thrown when a UNS move's expected revision token no longer matches the live draft
/// — another operator mutated the draft between preview + commit. Caller surfaces a 409-style
/// "refresh required" modal in the Admin UI.</summary>
public sealed class DraftRevisionConflictException(string message) : Exception(message);

View File

@@ -0,0 +1,129 @@
using ZB.MOM.WW.OtOpcUa.Core.Abstractions;
namespace ZB.MOM.WW.OtOpcUa.Core.Resilience;
/// <summary>
/// Wraps the three mutating surfaces of <see cref="IAlarmSource"/>
/// (<see cref="IAlarmSource.SubscribeAlarmsAsync"/>, <see cref="IAlarmSource.UnsubscribeAlarmsAsync"/>,
/// <see cref="IAlarmSource.AcknowledgeAsync"/>) through <see cref="CapabilityInvoker"/> so the
/// Phase 6.1 resilience pipeline runs — retry semantics match
/// <see cref="DriverCapability.AlarmSubscribe"/> (retries by default) and
/// <see cref="DriverCapability.AlarmAcknowledge"/> (does NOT retry per decision #143).
/// </summary>
/// <remarks>
/// <para>Multi-host dispatch: when the driver implements <see cref="IPerCallHostResolver"/>,
/// each source-node-id is resolved individually + grouped by host so a dead PLC inside a
/// multi-device driver doesn't poison the sibling hosts' breakers. Drivers with a single
/// host fall back to <see cref="IDriver.DriverInstanceId"/> as the single-host key.</para>
///
/// <para>Why this lives here + not on <see cref="CapabilityInvoker"/>: alarm surfaces have a
/// handle-returning shape (SubscribeAlarmsAsync returns <see cref="IAlarmSubscriptionHandle"/>)
/// + a per-call fan-out (AcknowledgeAsync gets a batch of
/// <see cref="AlarmAcknowledgeRequest"/>s that may span multiple hosts). Keeping the fan-out
/// logic here keeps the invoker's execute-overloads narrow.</para>
/// </remarks>
public sealed class AlarmSurfaceInvoker
{
private readonly CapabilityInvoker _invoker;
private readonly IAlarmSource _alarmSource;
private readonly IPerCallHostResolver? _hostResolver;
private readonly string _defaultHost;
public AlarmSurfaceInvoker(
CapabilityInvoker invoker,
IAlarmSource alarmSource,
string defaultHost,
IPerCallHostResolver? hostResolver = null)
{
ArgumentNullException.ThrowIfNull(invoker);
ArgumentNullException.ThrowIfNull(alarmSource);
ArgumentException.ThrowIfNullOrWhiteSpace(defaultHost);
_invoker = invoker;
_alarmSource = alarmSource;
_defaultHost = defaultHost;
_hostResolver = hostResolver;
}
/// <summary>
/// Subscribe to alarm events for a set of source node ids, fanning out by resolved host
/// so per-host breakers / bulkheads apply. Returns one handle per host — callers that
/// don't care about per-host separation may concatenate them.
/// </summary>
public async Task<IReadOnlyList<IAlarmSubscriptionHandle>> SubscribeAsync(
IReadOnlyList<string> sourceNodeIds,
CancellationToken cancellationToken)
{
ArgumentNullException.ThrowIfNull(sourceNodeIds);
if (sourceNodeIds.Count == 0) return [];
var byHost = GroupByHost(sourceNodeIds);
var handles = new List<IAlarmSubscriptionHandle>(byHost.Count);
foreach (var (host, ids) in byHost)
{
var handle = await _invoker.ExecuteAsync(
DriverCapability.AlarmSubscribe,
host,
async ct => await _alarmSource.SubscribeAlarmsAsync(ids, ct).ConfigureAwait(false),
cancellationToken).ConfigureAwait(false);
handles.Add(handle);
}
return handles;
}
/// <summary>Cancel an alarm subscription. Routes through the AlarmSubscribe pipeline for parity.</summary>
public ValueTask UnsubscribeAsync(IAlarmSubscriptionHandle handle, CancellationToken cancellationToken)
{
ArgumentNullException.ThrowIfNull(handle);
return _invoker.ExecuteAsync(
DriverCapability.AlarmSubscribe,
_defaultHost,
async ct => await _alarmSource.UnsubscribeAlarmsAsync(handle, ct).ConfigureAwait(false),
cancellationToken);
}
/// <summary>
/// Acknowledge alarms. Fans out by resolved host; each host's batch runs through the
/// AlarmAcknowledge pipeline (no-retry per decision #143 — an alarm-ack is not idempotent
/// at the plant-floor acknowledgement level even if the OPC UA spec permits re-issue).
/// </summary>
public async Task AcknowledgeAsync(
IReadOnlyList<AlarmAcknowledgeRequest> acknowledgements,
CancellationToken cancellationToken)
{
ArgumentNullException.ThrowIfNull(acknowledgements);
if (acknowledgements.Count == 0) return;
var byHost = _hostResolver is null
? new Dictionary<string, List<AlarmAcknowledgeRequest>> { [_defaultHost] = acknowledgements.ToList() }
: acknowledgements
.GroupBy(a => _hostResolver.ResolveHost(a.SourceNodeId))
.ToDictionary(g => g.Key, g => g.ToList());
foreach (var (host, batch) in byHost)
{
var batchSnapshot = batch; // capture for the lambda
await _invoker.ExecuteAsync(
DriverCapability.AlarmAcknowledge,
host,
async ct => await _alarmSource.AcknowledgeAsync(batchSnapshot, ct).ConfigureAwait(false),
cancellationToken).ConfigureAwait(false);
}
}
private Dictionary<string, List<string>> GroupByHost(IReadOnlyList<string> sourceNodeIds)
{
if (_hostResolver is null)
return new Dictionary<string, List<string>> { [_defaultHost] = sourceNodeIds.ToList() };
var result = new Dictionary<string, List<string>>(StringComparer.Ordinal);
foreach (var id in sourceNodeIds)
{
var host = _hostResolver.ResolveHost(id);
if (!result.TryGetValue(host, out var list))
result[host] = list = new List<string>();
list.Add(id);
}
return result;
}
}

View File

@@ -24,11 +24,21 @@ public sealed class DriverResiliencePipelineBuilder
{
private readonly ConcurrentDictionary<PipelineKey, ResiliencePipeline> _pipelines = new();
private readonly TimeProvider _timeProvider;
private readonly DriverResilienceStatusTracker? _statusTracker;
/// <summary>Construct with the ambient clock (use <see cref="TimeProvider.System"/> in prod).</summary>
public DriverResiliencePipelineBuilder(TimeProvider? timeProvider = null)
/// <param name="timeProvider">Clock source for pipeline timeouts + breaker sampling. Defaults to system.</param>
/// <param name="statusTracker">When non-null, every built pipeline wires Polly telemetry into
/// the tracker — retries increment <c>ConsecutiveFailures</c>, breaker-open stamps
/// <c>LastBreakerOpenUtc</c>, breaker-close resets failures. Feeds Admin <c>/hosts</c> +
/// the Polly bulkhead-depth column. Absent tracker means no telemetry (unit tests +
/// deployments that don't care about resilience observability).</param>
public DriverResiliencePipelineBuilder(
TimeProvider? timeProvider = null,
DriverResilienceStatusTracker? statusTracker = null)
{
_timeProvider = timeProvider ?? TimeProvider.System;
_statusTracker = statusTracker;
}
/// <summary>
@@ -54,8 +64,9 @@ public sealed class DriverResiliencePipelineBuilder
ArgumentException.ThrowIfNullOrWhiteSpace(hostName);
var key = new PipelineKey(driverInstanceId, hostName, capability);
return _pipelines.GetOrAdd(key, static (_, state) => Build(state.capability, state.options, state.timeProvider),
(capability, options, timeProvider: _timeProvider));
return _pipelines.GetOrAdd(key, static (k, state) => Build(
k.DriverInstanceId, k.HostName, state.capability, state.options, state.timeProvider, state.tracker),
(capability, options, timeProvider: _timeProvider, tracker: _statusTracker));
}
/// <summary>Drop cached pipelines for one driver instance (e.g. on ResilienceConfig change). Test + Admin-reload use.</summary>
@@ -74,9 +85,12 @@ public sealed class DriverResiliencePipelineBuilder
public int CachedPipelineCount => _pipelines.Count;
private static ResiliencePipeline Build(
string driverInstanceId,
string hostName,
DriverCapability capability,
DriverResilienceOptions options,
TimeProvider timeProvider)
TimeProvider timeProvider,
DriverResilienceStatusTracker? tracker)
{
var policy = options.Resolve(capability);
var builder = new ResiliencePipelineBuilder { TimeProvider = timeProvider };
@@ -88,7 +102,7 @@ public sealed class DriverResiliencePipelineBuilder
if (policy.RetryCount > 0)
{
builder.AddRetry(new RetryStrategyOptions
var retryOptions = new RetryStrategyOptions
{
MaxRetryAttempts = policy.RetryCount,
BackoffType = DelayBackoffType.Exponential,
@@ -96,19 +110,44 @@ public sealed class DriverResiliencePipelineBuilder
Delay = TimeSpan.FromMilliseconds(100),
MaxDelay = TimeSpan.FromSeconds(5),
ShouldHandle = new PredicateBuilder().Handle<Exception>(ex => ex is not OperationCanceledException),
});
};
if (tracker is not null)
{
retryOptions.OnRetry = args =>
{
tracker.RecordFailure(driverInstanceId, hostName, timeProvider.GetUtcNow().UtcDateTime);
return default;
};
}
builder.AddRetry(retryOptions);
}
if (policy.BreakerFailureThreshold > 0)
{
builder.AddCircuitBreaker(new CircuitBreakerStrategyOptions
var breakerOptions = new CircuitBreakerStrategyOptions
{
FailureRatio = 1.0,
MinimumThroughput = policy.BreakerFailureThreshold,
SamplingDuration = TimeSpan.FromSeconds(30),
BreakDuration = TimeSpan.FromSeconds(15),
ShouldHandle = new PredicateBuilder().Handle<Exception>(ex => ex is not OperationCanceledException),
});
};
if (tracker is not null)
{
breakerOptions.OnOpened = args =>
{
tracker.RecordBreakerOpen(driverInstanceId, hostName, timeProvider.GetUtcNow().UtcDateTime);
return default;
};
breakerOptions.OnClosed = args =>
{
// Closing the breaker means the target recovered — reset the consecutive-
// failure counter so Admin UI stops flashing red for this host.
tracker.RecordSuccess(driverInstanceId, hostName, timeProvider.GetUtcNow().UtcDateTime);
return default;
};
}
builder.AddCircuitBreaker(breakerOptions);
}
return builder.Build();

View File

@@ -0,0 +1,78 @@
using Microsoft.EntityFrameworkCore;
using Shouldly;
using Xunit;
using ZB.MOM.WW.OtOpcUa.Admin.Services;
using ZB.MOM.WW.OtOpcUa.Configuration;
using ZB.MOM.WW.OtOpcUa.Configuration.Entities;
using ZB.MOM.WW.OtOpcUa.Configuration.Enums;
namespace ZB.MOM.WW.OtOpcUa.Admin.Tests;
[Trait("Category", "Unit")]
public sealed class ClusterNodeServiceTests
{
[Fact]
public void IsStale_NullLastSeen_Returns_True()
{
var node = NewNode("A", RedundancyRole.Primary, lastSeenAt: null);
ClusterNodeService.IsStale(node).ShouldBeTrue();
}
[Fact]
public void IsStale_RecentLastSeen_Returns_False()
{
var node = NewNode("A", RedundancyRole.Primary, lastSeenAt: DateTime.UtcNow.AddSeconds(-5));
ClusterNodeService.IsStale(node).ShouldBeFalse();
}
[Fact]
public void IsStale_Old_LastSeen_Returns_True()
{
var node = NewNode("A", RedundancyRole.Primary,
lastSeenAt: DateTime.UtcNow - ClusterNodeService.StaleThreshold - TimeSpan.FromSeconds(1));
ClusterNodeService.IsStale(node).ShouldBeTrue();
}
[Fact]
public async Task ListByClusterAsync_OrdersByServiceLevelBase_Descending_Then_NodeId()
{
using var ctx = NewContext();
ctx.ClusterNodes.AddRange(
NewNode("B-low", RedundancyRole.Secondary, serviceLevelBase: 150, clusterId: "c1"),
NewNode("A-high", RedundancyRole.Primary, serviceLevelBase: 200, clusterId: "c1"),
NewNode("other-cluster", RedundancyRole.Primary, serviceLevelBase: 200, clusterId: "c2"));
await ctx.SaveChangesAsync();
var svc = new ClusterNodeService(ctx);
var rows = await svc.ListByClusterAsync("c1", CancellationToken.None);
rows.Count.ShouldBe(2);
rows[0].NodeId.ShouldBe("A-high"); // higher ServiceLevelBase first
rows[1].NodeId.ShouldBe("B-low");
}
private static ClusterNode NewNode(
string nodeId,
RedundancyRole role,
DateTime? lastSeenAt = null,
int serviceLevelBase = 200,
string clusterId = "c1") => new()
{
NodeId = nodeId,
ClusterId = clusterId,
RedundancyRole = role,
Host = $"{nodeId}.example",
ApplicationUri = $"urn:{nodeId}",
ServiceLevelBase = (byte)serviceLevelBase,
LastSeenAt = lastSeenAt,
CreatedBy = "test",
};
private static OtOpcUaConfigDbContext NewContext()
{
var opts = new DbContextOptionsBuilder<OtOpcUaConfigDbContext>()
.UseInMemoryDatabase(Guid.NewGuid().ToString())
.Options;
return new OtOpcUaConfigDbContext(opts);
}
}

View File

@@ -23,11 +23,13 @@ public sealed class EquipmentImportBatchServiceTests : IDisposable
public void Dispose() => _db.Dispose();
// Unique SAPID per row — FinaliseBatch reserves ZTag + SAPID via filtered-unique index, so
// two rows sharing a SAPID under different EquipmentUuids collide as intended.
private static EquipmentCsvRow Row(string zTag, string name = "eq-1") => new()
{
ZTag = zTag,
MachineCode = "mc",
SAPID = "sap",
SAPID = $"sap-{zTag}",
EquipmentId = "eq-id",
EquipmentUuid = Guid.NewGuid().ToString(),
Name = name,
@@ -162,4 +164,93 @@ public sealed class EquipmentImportBatchServiceTests : IDisposable
await _svc.DropBatchAsync(Guid.NewGuid(), CancellationToken.None);
// no throw
}
[Fact]
public async Task FinaliseBatch_Creates_ExternalIdReservations_ForZTagAndSAPID()
{
var batch = await _svc.CreateBatchAsync("c1", "alice", CancellationToken.None);
await _svc.StageRowsAsync(batch.Id, [Row("z-new-1")], [], CancellationToken.None);
await _svc.FinaliseBatchAsync(batch.Id, 1, "drv", "line", CancellationToken.None);
var active = await _db.ExternalIdReservations.AsNoTracking()
.Where(r => r.ReleasedAt == null)
.ToListAsync();
active.Count.ShouldBe(2);
active.ShouldContain(r => r.Kind == ZB.MOM.WW.OtOpcUa.Configuration.Enums.ReservationKind.ZTag && r.Value == "z-new-1");
active.ShouldContain(r => r.Kind == ZB.MOM.WW.OtOpcUa.Configuration.Enums.ReservationKind.SAPID && r.Value == "sap-z-new-1");
}
[Fact]
public async Task FinaliseBatch_SameEquipmentUuid_ReusesExistingReservation()
{
var batch1 = await _svc.CreateBatchAsync("c1", "alice", CancellationToken.None);
var sharedUuid = Guid.NewGuid();
var row = new EquipmentCsvRow
{
ZTag = "z-shared", MachineCode = "mc", SAPID = "sap-shared",
EquipmentId = "eq-1", EquipmentUuid = sharedUuid.ToString(),
Name = "eq-1", UnsAreaName = "a", UnsLineName = "l",
};
await _svc.StageRowsAsync(batch1.Id, [row], [], CancellationToken.None);
await _svc.FinaliseBatchAsync(batch1.Id, 1, "drv", "line", CancellationToken.None);
var countAfterFirst = _db.ExternalIdReservations.Count(r => r.ReleasedAt == null);
// Second finalise with same EquipmentUuid + same ZTag — should NOT create a duplicate.
var batch2 = await _svc.CreateBatchAsync("c1", "alice", CancellationToken.None);
await _svc.StageRowsAsync(batch2.Id, [row], [], CancellationToken.None);
await _svc.FinaliseBatchAsync(batch2.Id, 2, "drv", "line", CancellationToken.None);
_db.ExternalIdReservations.Count(r => r.ReleasedAt == null).ShouldBe(countAfterFirst);
}
[Fact]
public async Task FinaliseBatch_DifferentEquipmentUuid_SameZTag_Throws_Conflict()
{
var batchA = await _svc.CreateBatchAsync("c1", "alice", CancellationToken.None);
var rowA = new EquipmentCsvRow
{
ZTag = "z-collide", MachineCode = "mc-a", SAPID = "sap-a",
EquipmentId = "eq-a", EquipmentUuid = Guid.NewGuid().ToString(),
Name = "a", UnsAreaName = "ar", UnsLineName = "ln",
};
await _svc.StageRowsAsync(batchA.Id, [rowA], [], CancellationToken.None);
await _svc.FinaliseBatchAsync(batchA.Id, 1, "drv", "line", CancellationToken.None);
var batchB = await _svc.CreateBatchAsync("c1", "bob", CancellationToken.None);
var rowB = new EquipmentCsvRow
{
ZTag = "z-collide", MachineCode = "mc-b", SAPID = "sap-b", // same ZTag, different EquipmentUuid
EquipmentId = "eq-b", EquipmentUuid = Guid.NewGuid().ToString(),
Name = "b", UnsAreaName = "ar", UnsLineName = "ln",
};
await _svc.StageRowsAsync(batchB.Id, [rowB], [], CancellationToken.None);
var ex = await Should.ThrowAsync<ExternalIdReservationConflictException>(() =>
_svc.FinaliseBatchAsync(batchB.Id, 2, "drv", "line", CancellationToken.None));
ex.Message.ShouldContain("z-collide");
// Second finalise must have rolled back — no partial Equipment row for batch B.
var equipmentB = await _db.Equipment.AsNoTracking()
.Where(e => e.EquipmentId == "eq-b")
.ToListAsync();
equipmentB.ShouldBeEmpty();
}
[Fact]
public async Task FinaliseBatch_EmptyZTagAndSAPID_SkipsReservation()
{
var batch = await _svc.CreateBatchAsync("c1", "alice", CancellationToken.None);
var row = new EquipmentCsvRow
{
ZTag = "", MachineCode = "mc", SAPID = "",
EquipmentId = "eq-nil", EquipmentUuid = Guid.NewGuid().ToString(),
Name = "nil", UnsAreaName = "ar", UnsLineName = "ln",
};
await _svc.StageRowsAsync(batch.Id, [row], [], CancellationToken.None);
await _svc.FinaliseBatchAsync(batch.Id, 1, "drv", "line", CancellationToken.None);
_db.ExternalIdReservations.Count().ShouldBe(0);
}
}

View File

@@ -0,0 +1,130 @@
using Microsoft.EntityFrameworkCore;
using Shouldly;
using Xunit;
using ZB.MOM.WW.OtOpcUa.Admin.Services;
using ZB.MOM.WW.OtOpcUa.Configuration;
using ZB.MOM.WW.OtOpcUa.Configuration.Entities;
namespace ZB.MOM.WW.OtOpcUa.Admin.Tests;
[Trait("Category", "Unit")]
public sealed class UnsServiceMoveTests
{
[Fact]
public async Task LoadSnapshotAsync_ReturnsAllAreasAndLines_WithEquipmentCounts()
{
using var ctx = NewContext();
Seed(ctx, draftId: 1, areas: new[] { "area-1", "area-2" },
lines: new[] { ("line-a", "area-1"), ("line-b", "area-1"), ("line-c", "area-2") },
equipmentLines: new[] { "line-a", "line-a", "line-b" });
var svc = new UnsService(ctx);
var snap = await svc.LoadSnapshotAsync(1, CancellationToken.None);
snap.Areas.Count.ShouldBe(2);
snap.Lines.Count.ShouldBe(3);
snap.FindLine("line-a")!.EquipmentCount.ShouldBe(2);
snap.FindLine("line-b")!.EquipmentCount.ShouldBe(1);
snap.FindLine("line-c")!.EquipmentCount.ShouldBe(0);
}
[Fact]
public async Task LoadSnapshotAsync_RevisionToken_IsStable_BetweenTwoReads()
{
using var ctx = NewContext();
Seed(ctx, draftId: 1, areas: new[] { "area-1" }, lines: new[] { ("line-a", "area-1") });
var svc = new UnsService(ctx);
var first = await svc.LoadSnapshotAsync(1, CancellationToken.None);
var second = await svc.LoadSnapshotAsync(1, CancellationToken.None);
second.RevisionToken.Matches(first.RevisionToken).ShouldBeTrue();
}
[Fact]
public async Task LoadSnapshotAsync_RevisionToken_Changes_When_LineAdded()
{
using var ctx = NewContext();
Seed(ctx, draftId: 1, areas: new[] { "area-1" }, lines: new[] { ("line-a", "area-1") });
var svc = new UnsService(ctx);
var before = await svc.LoadSnapshotAsync(1, CancellationToken.None);
await svc.AddLineAsync(1, "area-1", "new-line", null, CancellationToken.None);
var after = await svc.LoadSnapshotAsync(1, CancellationToken.None);
after.RevisionToken.Matches(before.RevisionToken).ShouldBeFalse();
}
[Fact]
public async Task MoveLineAsync_WithMatchingToken_Reparents_Line()
{
using var ctx = NewContext();
Seed(ctx, draftId: 1, areas: new[] { "area-1", "area-2" },
lines: new[] { ("line-a", "area-1") });
var svc = new UnsService(ctx);
var snap = await svc.LoadSnapshotAsync(1, CancellationToken.None);
await svc.MoveLineAsync(1, snap.RevisionToken, "line-a", "area-2", CancellationToken.None);
var moved = await ctx.UnsLines.AsNoTracking().FirstAsync(l => l.UnsLineId == "line-a");
moved.UnsAreaId.ShouldBe("area-2");
}
[Fact]
public async Task MoveLineAsync_WithStaleToken_Throws_DraftRevisionConflict()
{
using var ctx = NewContext();
Seed(ctx, draftId: 1, areas: new[] { "area-1", "area-2" },
lines: new[] { ("line-a", "area-1") });
var svc = new UnsService(ctx);
// Simulate a peer operator's concurrent edit between our preview + commit.
var stale = new DraftRevisionToken("0000000000000000");
await Should.ThrowAsync<DraftRevisionConflictException>(() =>
svc.MoveLineAsync(1, stale, "line-a", "area-2", CancellationToken.None));
var row = await ctx.UnsLines.AsNoTracking().FirstAsync(l => l.UnsLineId == "line-a");
row.UnsAreaId.ShouldBe("area-1");
}
private static void Seed(OtOpcUaConfigDbContext ctx, long draftId,
IEnumerable<string> areas,
IEnumerable<(string line, string area)> lines,
IEnumerable<string>? equipmentLines = null)
{
foreach (var a in areas)
{
ctx.UnsAreas.Add(new UnsArea
{
GenerationId = draftId, UnsAreaId = a, ClusterId = "c1", Name = a,
});
}
foreach (var (line, area) in lines)
{
ctx.UnsLines.Add(new UnsLine
{
GenerationId = draftId, UnsLineId = line, UnsAreaId = area, Name = line,
});
}
foreach (var lineId in equipmentLines ?? [])
{
ctx.Equipment.Add(new Equipment
{
EquipmentRowId = Guid.NewGuid(), GenerationId = draftId,
EquipmentId = $"EQ-{Guid.NewGuid():N}"[..15],
EquipmentUuid = Guid.NewGuid(), DriverInstanceId = "drv",
UnsLineId = lineId, Name = "x", MachineCode = "m",
});
}
ctx.SaveChanges();
}
private static OtOpcUaConfigDbContext NewContext()
{
var opts = new DbContextOptionsBuilder<OtOpcUaConfigDbContext>()
.UseInMemoryDatabase(Guid.NewGuid().ToString())
.Options;
return new OtOpcUaConfigDbContext(opts);
}
}

View File

@@ -0,0 +1,127 @@
using Shouldly;
using Xunit;
using ZB.MOM.WW.OtOpcUa.Core.Abstractions;
using ZB.MOM.WW.OtOpcUa.Core.Resilience;
namespace ZB.MOM.WW.OtOpcUa.Core.Tests.Resilience;
[Trait("Category", "Unit")]
public sealed class AlarmSurfaceInvokerTests
{
private static readonly DriverResilienceOptions TierAOptions = new() { Tier = DriverTier.A };
[Fact]
public async Task SubscribeAsync_EmptyList_ReturnsEmpty_WithoutDriverCall()
{
var driver = new FakeAlarmSource();
var surface = NewSurface(driver, defaultHost: "h");
var handles = await surface.SubscribeAsync([], CancellationToken.None);
handles.Count.ShouldBe(0);
driver.SubscribeCallCount.ShouldBe(0);
}
[Fact]
public async Task SubscribeAsync_SingleHost_RoutesThroughDefaultHost()
{
var driver = new FakeAlarmSource();
var surface = NewSurface(driver, defaultHost: "h1");
var handles = await surface.SubscribeAsync(["src-1", "src-2"], CancellationToken.None);
handles.Count.ShouldBe(1);
driver.SubscribeCallCount.ShouldBe(1);
driver.LastSubscribedIds.ShouldBe(["src-1", "src-2"]);
}
[Fact]
public async Task SubscribeAsync_MultiHost_FansOutByResolvedHost()
{
var driver = new FakeAlarmSource();
var resolver = new StubResolver(new Dictionary<string, string>
{
["src-1"] = "plc-a",
["src-2"] = "plc-b",
["src-3"] = "plc-a",
});
var surface = NewSurface(driver, defaultHost: "default-ignored", resolver: resolver);
var handles = await surface.SubscribeAsync(["src-1", "src-2", "src-3"], CancellationToken.None);
handles.Count.ShouldBe(2); // one per distinct host
driver.SubscribeCallCount.ShouldBe(2); // one driver call per host
}
[Fact]
public async Task AcknowledgeAsync_DoesNotRetry_OnFailure()
{
var driver = new FakeAlarmSource { AcknowledgeShouldThrow = true };
var surface = NewSurface(driver, defaultHost: "h1");
await Should.ThrowAsync<InvalidOperationException>(() =>
surface.AcknowledgeAsync([new AlarmAcknowledgeRequest("s", "c", null)], CancellationToken.None));
driver.AcknowledgeCallCount.ShouldBe(1, "AlarmAcknowledge must not retry — decision #143");
}
[Fact]
public async Task SubscribeAsync_Retries_Transient_Failures()
{
var driver = new FakeAlarmSource { SubscribeFailuresBeforeSuccess = 2 };
var surface = NewSurface(driver, defaultHost: "h1");
await surface.SubscribeAsync(["src"], CancellationToken.None);
driver.SubscribeCallCount.ShouldBe(3, "AlarmSubscribe retries by default — decision #143");
}
private static AlarmSurfaceInvoker NewSurface(
IAlarmSource driver,
string defaultHost,
IPerCallHostResolver? resolver = null)
{
var builder = new DriverResiliencePipelineBuilder();
var invoker = new CapabilityInvoker(builder, "drv-1", () => TierAOptions);
return new AlarmSurfaceInvoker(invoker, driver, defaultHost, resolver);
}
private sealed class FakeAlarmSource : IAlarmSource
{
public int SubscribeCallCount { get; private set; }
public int AcknowledgeCallCount { get; private set; }
public int SubscribeFailuresBeforeSuccess { get; set; }
public bool AcknowledgeShouldThrow { get; set; }
public IReadOnlyList<string> LastSubscribedIds { get; private set; } = [];
public Task<IAlarmSubscriptionHandle> SubscribeAlarmsAsync(
IReadOnlyList<string> sourceNodeIds, CancellationToken cancellationToken)
{
SubscribeCallCount++;
LastSubscribedIds = sourceNodeIds;
if (SubscribeCallCount <= SubscribeFailuresBeforeSuccess)
throw new InvalidOperationException("transient");
return Task.FromResult<IAlarmSubscriptionHandle>(new StubHandle($"h-{SubscribeCallCount}"));
}
public Task UnsubscribeAlarmsAsync(IAlarmSubscriptionHandle handle, CancellationToken cancellationToken)
=> Task.CompletedTask;
public Task AcknowledgeAsync(
IReadOnlyList<AlarmAcknowledgeRequest> acknowledgements, CancellationToken cancellationToken)
{
AcknowledgeCallCount++;
if (AcknowledgeShouldThrow) throw new InvalidOperationException("ack boom");
return Task.CompletedTask;
}
public event EventHandler<AlarmEventArgs>? OnAlarmEvent { add { } remove { } }
}
private sealed record StubHandle(string DiagnosticId) : IAlarmSubscriptionHandle;
private sealed class StubResolver(Dictionary<string, string> map) : IPerCallHostResolver
{
public string ResolveHost(string fullReference) => map[fullReference];
}
}

View File

@@ -219,4 +219,67 @@ public sealed class DriverResiliencePipelineBuilderTests
attempts.ShouldBeLessThanOrEqualTo(1);
}
[Fact]
public async Task Tracker_RecordsFailure_OnEveryRetry()
{
var tracker = new DriverResilienceStatusTracker();
var builder = new DriverResiliencePipelineBuilder(statusTracker: tracker);
var pipeline = builder.GetOrCreate("drv-trk", "host-x", DriverCapability.Read, TierAOptions);
await Should.ThrowAsync<InvalidOperationException>(async () =>
await pipeline.ExecuteAsync(async _ =>
{
await Task.Yield();
throw new InvalidOperationException("always fails");
}));
var snap = tracker.TryGet("drv-trk", "host-x");
snap.ShouldNotBeNull();
var retryCount = TierAOptions.Resolve(DriverCapability.Read).RetryCount;
snap!.ConsecutiveFailures.ShouldBe(retryCount);
}
[Fact]
public async Task Tracker_StampsBreakerOpen_WhenBreakerTrips()
{
var tracker = new DriverResilienceStatusTracker();
var builder = new DriverResiliencePipelineBuilder(statusTracker: tracker);
var pipeline = builder.GetOrCreate("drv-trk", "host-b", DriverCapability.Write, TierAOptions);
var threshold = TierAOptions.Resolve(DriverCapability.Write).BreakerFailureThreshold;
for (var i = 0; i < threshold; i++)
{
await Should.ThrowAsync<InvalidOperationException>(async () =>
await pipeline.ExecuteAsync(async _ =>
{
await Task.Yield();
throw new InvalidOperationException("boom");
}));
}
var snap = tracker.TryGet("drv-trk", "host-b");
snap.ShouldNotBeNull();
snap!.LastBreakerOpenUtc.ShouldNotBeNull();
}
[Fact]
public async Task Tracker_IsolatesCounters_PerHost()
{
var tracker = new DriverResilienceStatusTracker();
var builder = new DriverResiliencePipelineBuilder(statusTracker: tracker);
var dead = builder.GetOrCreate("drv-trk", "dead", DriverCapability.Read, TierAOptions);
var live = builder.GetOrCreate("drv-trk", "live", DriverCapability.Read, TierAOptions);
await Should.ThrowAsync<InvalidOperationException>(async () =>
await dead.ExecuteAsync(async _ =>
{
await Task.Yield();
throw new InvalidOperationException("dead");
}));
await live.ExecuteAsync(async _ => await Task.Yield());
tracker.TryGet("drv-trk", "dead")!.ConsecutiveFailures.ShouldBeGreaterThan(0);
tracker.TryGet("drv-trk", "live").ShouldBeNull();
}
}