docs(m7): reflect OPC UA / MxGateway UX (T13-T17) across component docs + CLAUDE/stillpending/completion-design

Joseph Doherty
2026-06-18 04:13:21 -04:00
parent 39afa2743e
commit 40928535fd
11 changed files with 158 additions and 19 deletions
+2
@@ -116,6 +116,7 @@ Related repos cloned as sibling directories under `~/Desktop/` — referenced fo
- Inter-cluster communication uses two transports: ClusterClient for command/control (deployments, lifecycle, subscribe/unsubscribe handshake, snapshots) and gRPC server-streaming for real-time data (attribute values, alarm states). Both CentralCommunicationActor and SiteCommunicationActor registered with receptionist. Central creates one ClusterClient per site using NodeA/NodeB as contact points. Sites configure multiple central contact points for failover. Addresses cached in CentralCommunicationActor, refreshed periodically (60s) and on admin changes. Heartbeats serve health monitoring only.
- gRPC streaming channel: SiteStreamGrpcServer on each site node (Kestrel HTTP/2, port 8083); central creates per-site SiteStreamGrpcClient via SiteStreamGrpcClientFactory. Site entity has GrpcNodeAAddress/GrpcNodeBAddress fields. Proto: sitestream.proto with SiteStreamService, SiteStreamEvent (oneof: AttributeValueUpdate, AlarmStateUpdate). DebugStreamEvent message removed (no longer flows through ClusterClient).
- Native alarms: a read-only mirror of native alarms from OPC UA Alarms & Conditions servers and the MxAccess Gateway, unified onto an A&C-style condition model (`AlarmConditionState`: orthogonal Active/Acked/Confirmed/Shelved/Suppressed + 0–1000 severity) plus an `AlarmKind` discriminator (Computed/NativeOpcUa/NativeMxAccess). New DCL capability seam `IAlarmSubscribableConnection` (implemented by the OPC UA and MxGateway adapters); the `DataConnectionActor` opens ONE alarm feed per connection and routes transitions to instances by source-object reference. A `NativeAlarmActor` (peer to the computed `AlarmActor` under `InstanceActor`) mirrors one source binding: snapshot atomic-swap on (re)subscribe, retention (drops once inactive+acked), per-source cap, and site SQLite persistence (`native_alarm_state`, survives failover, cleared on redeploy/undeploy — mirrors static overrides). State streams to central over the additively-enriched gRPC `AlarmStateUpdate` (the existing computed `AlarmStateChanged` was enriched additively) and seeds via the DebugView snapshot. Authoring: `TemplateNativeAlarmSource` / `InstanceNativeAlarmSourceOverride` entities flatten to `ResolvedNativeAlarmSource` (inherit/compose/override); management commands + ManagementActor handlers + CLI (`template/instance native-alarm-source`) + Central UI (template editor tab + instance override panel) + enriched DebugView alarm table. Read-only — no ack-back; no central tables.
- OPC UA / MxGateway UX (M7): operator **Alarm Summary** page (`/monitoring/alarms`, RequireDeployment, read-only) fans out the existing per-instance `DebugViewSnapshot` Ask (SemaphoreSlim-capped, partial-results tolerant) and aggregates client-side — no central alarm store, no aggregated live stream yet (follow-up); shared `AlarmStateBadges` component. OPC UA node browser gains `BrowseNext` continuation paging ("Load more"), a bounded recursive address-space **search** (`IAddressSpaceSearchable` seam; depth + result caps; substring on DisplayName/path), and **type-info** (DataType/ValueRank/Writable on `BrowseNode` for Variables). Attribute-override **CSV bulk import** (`OverrideCsvParser`, all-or-nothing) via InstanceConfigure `InputFile` + CLI `instance import-overrides --file` (native-alarm-source-override CSV deferred). **Verify-endpoint** probe (temporary `RealOpcUaClient`, short timeout, captures an untrusted server cert but NEVER trusts it) + **site-local cert trust**: per-node `CertStoreActor` (runs on every site node, not a singleton) writing the `.der` into the node's OPC UA trusted-peer PKI store; DeploymentManager broadcasts `TrustServerCertCommand`/`RemoveServerCertCommand` to BOTH site nodes so PKI stores stay consistent across failover; Admin-gated cert-management UI (`/design/connections/{id}/certificates`). No central persistence of cert trust (follow-up).
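
The Alarm Summary fan-out described above (a capped number of concurrent per-instance snapshot requests, with failures tolerated rather than fatal) can be sketched roughly as follows. This is a language-neutral illustration in Python, not the project's code: `collect_alarm_summary`, `ask_snapshot`, `FanOutResult`, and the default cap and timeout are all hypothetical names and values, and `asyncio.Semaphore` stands in for the `SemaphoreSlim` cap.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class FanOutResult:
    # instance id -> alarms returned by that instance's snapshot
    snapshots: dict = field(default_factory=dict)
    # instance ids whose request raised or timed out
    failed: list = field(default_factory=list)


async def collect_alarm_summary(instance_ids, ask_snapshot, max_parallel=8, timeout_s=5.0):
    """Ask every instance for its snapshot, at most max_parallel at a time.

    ask_snapshot is a hypothetical async callable standing in for the
    per-instance DebugViewSnapshot Ask.
    """
    gate = asyncio.Semaphore(max_parallel)  # caps concurrent requests
    result = FanOutResult()

    async def ask_one(instance_id):
        async with gate:
            try:
                alarms = await asyncio.wait_for(ask_snapshot(instance_id), timeout_s)
                result.snapshots[instance_id] = alarms
            except Exception:
                # Partial-results tolerance: record the failure and keep going.
                result.failed.append(instance_id)

    await asyncio.gather(*(ask_one(i) for i in instance_ids))
    return result
```

A caller would aggregate `result.snapshots` for display and surface `result.failed` as "no data" rows, which matches the no-central-alarm-store approach: nothing is persisted between page loads.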
### External Integrations
- External System Gateway: HTTP/REST only, JSON serialization, API key + Basic Auth.
@@ -181,6 +182,7 @@ Related repos cloned as sibling directories under `~/Desktop/` — referenced fo
- Cookie+JWT hybrid sessions: HttpOnly/Secure cookie carries an embedded JWT (HMAC-SHA256 shared symmetric key), 15-minute expiry with sliding refresh, 30-minute idle timeout. Cookies are the correct transport for Blazor Server (SignalR circuits).
- LDAP failure: new logins fail; active sessions continue with current roles.
- Load balancer in front of central UI; cookie-embedded JWT + shared Data Protection keys for failover transparency.
- Two-person MxGateway secured writes (M7): two new global roles — `Operator` (initiates) + `Verifier` (approves) — added alongside the canonical `Administrator`/`Designer`/`Deployer`/`Viewer`, with `RequireOperator`/`RequireVerifier` policies. An Operator submits a secured write from the Central UI Secured Writes page (`/operations/secured-writes`); it stays a `Pending` `PendingSecuredWrite` row until a *distinct* Verifier approves it (no-self-approval enforced server-side in the ManagementActor, plus a compare-and-swap race guard). Approval relays a `WriteTagRequest` to the site MxGateway; MxGateway-protocol connections only; each lifecycle event (submit/approve/reject/execute) emits a best-effort `AuditChannel.SecuredWrite` / `AuditKind.SecuredWrite*` central-direct-write row sharing the row id as `CorrelationId`. (SecuredWrite audit rows currently leave `SourceNode` NULL — a logged follow-up.)
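
The two server-side guards on secured-write approval (no self-approval, plus a compare-and-swap on the row status so two Verifiers cannot both approve) can be illustrated with a minimal in-memory sketch. It is written in Python for brevity and makes several assumptions: `SecuredWriteStore`, `approve`, the `"Approved"` status name, and the returned strings are hypothetical, and a lock-protected dict stands in for what would presumably be a conditional database update.

```python
import threading
from dataclasses import dataclass
from typing import Optional


@dataclass
class PendingSecuredWrite:
    id: int
    submitted_by: str
    status: str = "Pending"
    decided_by: Optional[str] = None


class SecuredWriteStore:
    """Hypothetical in-memory stand-in for the pending-write table."""

    def __init__(self):
        self._lock = threading.Lock()
        self._rows = {}
        self._next_id = 1

    def submit(self, operator):
        with self._lock:
            row = PendingSecuredWrite(self._next_id, operator)
            self._rows[row.id] = row
            self._next_id += 1
            return row.id

    def submitted_by(self, row_id):
        with self._lock:
            return self._rows[row_id].submitted_by

    def compare_and_swap(self, row_id, expected, new_status, decided_by):
        # Change status only if it still equals `expected`; report whether we won.
        with self._lock:
            row = self._rows[row_id]
            if row.status != expected:
                return False
            row.status = new_status
            row.decided_by = decided_by
            return True


def approve(store, row_id, verifier):
    # Guard 1: the approver must be a different user than the submitter.
    if store.submitted_by(row_id) == verifier:
        return "rejected: self-approval"
    # Guard 2: only the first decision on a Pending row succeeds.
    if not store.compare_and_swap(row_id, "Pending", "Approved", verifier):
        return "rejected: already decided"
    return "approved"
```

Only the caller that wins the swap would go on to relay the write to the site, so a duplicate approval cannot trigger a second `WriteTagRequest`.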
### Cluster & Failover
- Keep-oldest split-brain resolver with `down-if-alone = on`, 15s stable-after.