Files
2b2991c593d51350e6c9ab078205716e76ebf644
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
DiffViewer ACL section — extend sp_ComputeGenerationDiff with NodeAcl rows. Closes the final slice of task #196 (draft-diff ACL section). The DiffViewer already rendered a placeholder "NodeAcl" card from the task #156 refactor; it stayed empty because the stored proc didn't emit NodeAcl rows. This PR lights the card up by adding a fifth UNION to the proc. Logical id for NodeAcl is the composite LdapGroup + ScopeKind + ScopeId triple — format "cn=group|Cluster|scope-id" or "cn=group|Cluster|(cluster)" when ScopeId is null (Cluster-wide rows). That shape means a permission-only change (same group + same scope, PermissionFlags shifted) appears as a single Modified row with the full triple as its identifier, whereas a scope move (same group, new ScopeId) correctly surfaces as Added + Removed of two different logical ids. CHECKSUM signature covers ClusterId + PermissionFlags + Notes so both operator-visible changes (permission bitmask) and audit-tier changes (notes) round-trip through the diff. New migration 20260420000001_ExtendComputeGenerationDiffWithNodeAcl.cs ships both Up (install V2 proc) + Down (restore the exact V1 proc text shipped in 20260417215224_StoredProcedures so the migration is reversible). Row-id column widens from nvarchar(64) to nvarchar(128) in V2 since the composite key (group DN + scope + scope-id) exceeds 64 chars comfortably — narrow column would silently truncate in prod. Designer .cs cloned from the prior migration since the EF model is unchanged; DiffViewer.razor section description updated to drop the "(proc-extension pending)" note it carried since task #156 — the card will now populate live. Admin + Core full-solution build clean. No unit-test changes needed — the existing StoredProceduresTests cover the proc-exec path + would immediately catch any SQL syntax regression on next SQL Server integration run. Task #196 fully closed now — Probe-this-permission (slice 1, PR 144), SignalR invalidation (slice 2, PR 145), draft-diff ACL section (this PR).
Phase 3 PR 27 — Fleet status dashboard page. New /fleet route shows per-node apply state (ClusterNodeGenerationState joined with ClusterNode for the ClusterId) in a sortable table with summary cards for Total / Applied / Stale / Failed node counts. Stale detection: LastSeenAt older than 30s triggers a table-warning row class + yellow count card. Failed rows get table-danger + red card. Badge classes per LastAppliedStatus: Applied=bg-success, Failed=bg-danger, Applying=bg-info, unknown=bg-secondary. Timestamps rendered as relative-age strings ('42s ago', '15m ago', '3h ago', then absolute date for >24h). Error column is truncated to 320px with the full message in a tooltip so the table stays readable on wide fleets. Initial data load on OnInitializedAsync; auto-refresh every 5s via a Timer that calls InvokeAsync(RefreshAsync) — matches the FleetStatusPoller's 5s cadence so the dashboard sees the most recent state without polling ahead of the broadcaster. A Refresh button also kicks a manual reload; _refreshing gate prevents double-runs when the timer fires during an in-flight query. IServiceScopeFactory (matches FleetStatusPoller's pattern) creates a fresh DI scope per refresh so the per-page DbContext can't race the timer with the render thread; no new DI registrations needed. Live SignalR hub push is deliberately deferred to a follow-up PR — the existing FleetStatusHub + NodeStateChangedMessage already works for external JavaScript clients; wiring an in-process Blazor Server consumer adds HubConnectionBuilder plumbing that's worth its own focused change. Sidebar link added to MainLayout between Overview and Clusters. Full Admin.Tests Unit suite 14 pass / 0 fail — unchanged, no tests regressed. Full Admin build clean (0 errors, 0 warnings). Closes the 'no per-driver dashboard' gap from lmx-followups item #7 at the fleet level; per-host (platform/engine/Modbus PLC) granularity still needs a dedicated page that consumes IHostConnectivityProbe.GetHostStatuses from the Server process — that's the live-SignalR follow-up.
Phase 1 Stream E Admin UI — finish Blazor pages so operators can run the draft → publish → rollback workflow end-to-end without hand-executing SQL. Adds eight new scoped services that wrap the Configuration stored procs + managed validators: EquipmentService (CRUD with auto-derived EquipmentId per decision #125), UnsService (areas + lines), NamespaceService, DriverInstanceService (generic JSON DriverConfig editor per decision #94 — per-driver schema validation lands in each driver's phase), NodeAclService (grant + revoke with bundled-preset permission sets; full per-flag editor + bulk-grant + permission simulator deferred to v2.1), ReservationService (fleet-wide active + released reservation inspector + FleetAdmin-only sp_ReleaseExternalIdReservation wrapper with required-reason invariant), DraftValidationService (hydrates a DraftSnapshot from the draft's rows plus prior-cluster Equipment + active reservations, runs the managed DraftValidator to surface every rule in one pass for inline validation panel), AuditLogService (recent ConfigAuditLog reader). Pages: /clusters list with create-new shortcut; /clusters/new wizard that creates the cluster row + initial empty draft in one go; /clusters/{id} detail with 8 tabs (Overview / Generations / Equipment / UNS Structure / Namespaces / Drivers / ACLs / Audit) — tabs that write always target the active draft, published generations stay read-only; /clusters/{id}/draft/{gen} editor with live validation panel (errors list with stable code + message + context; publish button disabled while any error exists) and tab-embedded sub-components; /clusters/{id}/draft/{gen}/diff three-column view backed by sp_ComputeGenerationDiff with Added/Removed/Modified badges; Generations tab with per-row rollback action wired to sp_RollbackToGeneration; /reservations FleetAdmin-only page (CanPublish policy) with active + released lists and a modal release dialog that enforces non-empty reason and round-trips through sp_ReleaseExternalIdReservation; /login scaffold with stub credential accept + FleetAdmin-role cookie issuance (real LDAP bind via the ScadaLink-parity LdapAuthService is deferred until live GLAuth integration — marked in the login view and in the Phase 1 partial-exit TODO). Layout: sidebar gets Overview / Clusters / Reservations + AuthorizeView with signed-in username + roles + sign-out POST to /auth/logout; cascading authentication state registered for <AuthorizeView> to work in RenderMode.InteractiveServer. Integration testing: AdminServicesIntegrationTests creates a throwaway per-run database (same pattern as the Configuration test fixture), applies all three migrations, and exercises (1) create-cluster → add-namespace+UNS+driver+equipment → validate (expects zero errors) → publish (expects Published status) → rollback (expects one new Published + at least one Superseded); (2) cross-cluster namespace binding draft → validates to BadCrossClusterNamespaceBinding per decision #122. Old flat Components/Pages/Clusters.razor moved to Components/Pages/Clusters/ClustersList.razor so the Clusters folder can host tab sub-components without the razor generator creating a type-and-namespace collision. Dev appsettings.json connection string switched from Integrated Security to sa auth to match the otopcua-mssql container on port 14330 (remapped from 1433 to coexist with the native MSSQL14 Galaxy ZB instance). Browser smoke test completed: home page, clusters list, new-cluster form, cluster detail with a seeded row, reservations (redirected to login for anon user) all return 200 / 302-to-login as expected; full solution 928 pass / 1 pre-existing Phase 0 baseline failure. Phase 1 Stream E items explicitly deferred with TODOs: CSV import for Equipment, SignalR FleetStatusHub + AlertHub real-time push, bulk-grant workflow, permission-simulator trie, merge-equipment draft, AppServer-via-OI-Gateway end-to-end smoke test (decision #142), and the real LDAP bind replacing the Login page stub.
Admin /hosts red-badge + resilience columns + Polly telemetry observer. Closes task #164 (the remaining slice of Phase 6.1 Stream E.3 after the earlier publisher + hub PR). Three cooperating pieces wired together so the operator-facing /hosts table actually reflects the live Polly counters that the pipeline builder is producing. DriverResiliencePipelineBuilder gains an optional DriverResilienceStatusTracker ctor param — when non-null, every built pipeline wires Polly's OnRetry/OnOpened/OnClosed strategy-options callbacks into the tracker. OnRetry → tracker.RecordFailure (so ConsecutiveFailures climbs per retry), OnOpened → tracker.RecordBreakerOpen (stamps LastCircuitBreakerOpenUtc), OnClosed → tracker.RecordSuccess (resets the failure counter once the target recovers). Absent tracker = silent, preserving the unit-test constructor path + any deployment that doesn't care about resilience observability. Cancellation stays excluded from the failure count via the existing ShouldHandle predicate. HostStatusService.HostStatusRow extends with four new fields — ConsecutiveFailures, LastCircuitBreakerOpenUtc, CurrentBulkheadDepth, LastRecycleUtc — populated via a second LEFT JOIN onto DriverInstanceResilienceStatuses keyed on (DriverInstanceId, HostName). LEFT JOIN because brand-new hosts haven't been sampled yet; a missing row means zero failures + never-opened breaker, which is the correct default. New FailureFlagThreshold constant (=3, matches plan decision #143's conservative half-of-breaker convention) + IsFlagged predicate so the UI can pre-warn before the breaker actually trips. Hosts.razor paints three new columns between State and Last-transition — Fail# (bold red when flagged), In-flight (bulkhead-depth proxy), Breaker-opened (relative age). Per-row "Flagged" red badge alongside State when IsFlagged is true. Above the first cluster table, a red alert banner summarises the flagged-host count when ≥1 host is flagged, so operators see the problem before scanning rows. Three new tests in DriverResiliencePipelineBuilderTests — Tracker_RecordsFailure_OnEveryRetry verifies ConsecutiveFailures reaches RetryCount after a transient-forever operation, Tracker_StampsBreakerOpen_WhenBreakerTrips verifies LastBreakerOpenUtc is set after threshold failures on a Write pipeline, Tracker_IsolatesCounters_PerHost verifies one dead host does not leak failure counts into a healthy sibling. Full suite — Core.Tests 14/14 resilience-builder tests passing (11 existing + 3 new), Admin.Tests 72/72 passing, Admin project builds 0 errors. SignalR live push of status changes + browser visual review are deliberately left to a follow-up — this PR keeps the structural change minimal (polling refresh already exists in the page's 10s timer; SignalR would be a structural add that touches hub registration + client subscription).
Phase 1 Stream E Admin UI — finish Blazor pages so operators can run the draft → publish → rollback workflow end-to-end without hand-executing SQL. Adds eight new scoped services that wrap the Configuration stored procs + managed validators: EquipmentService (CRUD with auto-derived EquipmentId per decision #125), UnsService (areas + lines), NamespaceService, DriverInstanceService (generic JSON DriverConfig editor per decision #94 — per-driver schema validation lands in each driver's phase), NodeAclService (grant + revoke with bundled-preset permission sets; full per-flag editor + bulk-grant + permission simulator deferred to v2.1), ReservationService (fleet-wide active + released reservation inspector + FleetAdmin-only sp_ReleaseExternalIdReservation wrapper with required-reason invariant), DraftValidationService (hydrates a DraftSnapshot from the draft's rows plus prior-cluster Equipment + active reservations, runs the managed DraftValidator to surface every rule in one pass for inline validation panel), AuditLogService (recent ConfigAuditLog reader). Pages: /clusters list with create-new shortcut; /clusters/new wizard that creates the cluster row + initial empty draft in one go; /clusters/{id} detail with 8 tabs (Overview / Generations / Equipment / UNS Structure / Namespaces / Drivers / ACLs / Audit) — tabs that write always target the active draft, published generations stay read-only; /clusters/{id}/draft/{gen} editor with live validation panel (errors list with stable code + message + context; publish button disabled while any error exists) and tab-embedded sub-components; /clusters/{id}/draft/{gen}/diff three-column view backed by sp_ComputeGenerationDiff with Added/Removed/Modified badges; Generations tab with per-row rollback action wired to sp_RollbackToGeneration; /reservations FleetAdmin-only page (CanPublish policy) with active + released lists and a modal release dialog that enforces non-empty reason and round-trips through sp_ReleaseExternalIdReservation; /login scaffold with stub credential accept + FleetAdmin-role cookie issuance (real LDAP bind via the ScadaLink-parity LdapAuthService is deferred until live GLAuth integration — marked in the login view and in the Phase 1 partial-exit TODO). Layout: sidebar gets Overview / Clusters / Reservations + AuthorizeView with signed-in username + roles + sign-out POST to /auth/logout; cascading authentication state registered for <AuthorizeView> to work in RenderMode.InteractiveServer. Integration testing: AdminServicesIntegrationTests creates a throwaway per-run database (same pattern as the Configuration test fixture), applies all three migrations, and exercises (1) create-cluster → add-namespace+UNS+driver+equipment → validate (expects zero errors) → publish (expects Published status) → rollback (expects one new Published + at least one Superseded); (2) cross-cluster namespace binding draft → validates to BadCrossClusterNamespaceBinding per decision #122. Old flat Components/Pages/Clusters.razor moved to Components/Pages/Clusters/ClustersList.razor so the Clusters folder can host tab sub-components without the razor generator creating a type-and-namespace collision. Dev appsettings.json connection string switched from Integrated Security to sa auth to match the otopcua-mssql container on port 14330 (remapped from 1433 to coexist with the native MSSQL14 Galaxy ZB instance). Browser smoke test completed: home page, clusters list, new-cluster form, cluster detail with a seeded row, reservations (redirected to login for anon user) all return 200 / 302-to-login as expected; full solution 928 pass / 1 pre-existing Phase 0 baseline failure. Phase 1 Stream E items explicitly deferred with TODOs: CSV import for Equipment, SignalR FleetStatusHub + AlertHub real-time push, bulk-grant workflow, permission-simulator trie, merge-equipment draft, AppServer-via-OI-Gateway end-to-end smoke test (decision #142), and the real LDAP bind replacing the Login page stub.
ACL + role-grant SignalR invalidation — #196 slice 2. Adds the live-push layer so an operator editing permissions in one Admin session sees the change in peer sessions without a manual reload. Covers both axes of task #196's invalidation requirement: cluster-scoped NodeAcl mutations push NodeAclChanged to that cluster's subscribers; fleet-wide LdapGroupRoleMapping CRUD pushes RoleGrantsChanged to every Admin session on the fleet group. New AclChangeNotifier service wraps IHubContext<FleetStatusHub> with two methods: NotifyNodeAclChangedAsync(clusterId, generationId) + NotifyRoleGrantsChangedAsync(). Both are fire-and-forget — a failed hub send logs a warning + returns; the authoritative DB write already committed, so worst-case peers see stale data until their next poll (AclsTab has no polling today; on-parameter-set reload + this signal covers the practical refresh cases). Catching OperationCanceledException separately so request-teardown doesn't log a false-positive hub-failure. NodeAclService constructor gains an optional AclChangeNotifier param (defaults to null so the existing unit tests that pass only a DbContext keep compiling). GrantAsync + RevokeAsync both emit NodeAclChanged after the SaveChanges completes — the Revoke path uses the loaded row's ClusterId + GenerationId for accurate routing since the caller passes only the surrogate rowId. RoleGrants.razor consumes the notifier after every Create + Delete + opens a fleet-scoped HubConnection on first render that reloads the grant list on RoleGrantsChanged. AclsTab.razor opens a cluster-scoped connection on first render and reloads only when the incoming NodeAclChanged message matches both the current ClusterId + GenerationId (so a peer editing a different draft doesn't trigger spurious reloads). Both pages IAsyncDisposable the connection on navigation away. AclChangeNotifier is DI-registered alongside PermissionProbeService. Two new message records in AclChangeNotifier.cs: NodeAclChangedMessage(ClusterId, GenerationId, ObservedAtUtc) + RoleGrantsChangedMessage(ObservedAtUtc). Admin.Tests 92/92 passing (unchanged — the notifier is fire-and-forget + tested at hub level in existing FleetStatusPoller suite). Admin builds 0 errors. One slice of #196 remains: the draft-diff ACL section (extend sp_ComputeGenerationDiff to emit NodeAcl rows + wire the DiffViewer NodeAcl card from the empty placeholder it currently shows). Next PR.