Doc refresh (task #205) — requirements updated for multi-driver OtOpcUa three-process deploy

Per-file summary:

- docs/reqs/OpcUaServerReqs.md — rewritten driver-agnostic. OPC-001..OPC-013 re-scoped to multi-driver address-space composition + capability dispatch; OPC-014 AuthorizationGate + permission trie; OPC-015 dynamic ServiceLevel via RedundancyCoordinator; OPC-017 surgical generation-apply rebuild; OPC-012 capability dispatch via CapabilityInvoker (decision #143 idempotence-aware retry); OPC-013 per-host Polly isolation (decision #144); OPC-019 OpenTelemetry metrics. Transport-security profile matrix (OPC-010) + UserName/LDAP (OPC-011) preserved.

- docs/reqs/GalaxyRepositoryReqs.md — scope clarified as Galaxy-driver-only (not platform). GR-001..GR-004 tied to ITagDiscovery.DiscoverAsync + IRediscoverable; all SQL runs inside OtOpcUa.Galaxy.Host and streams to Proxy via named pipe. GR-008 capability wrapping via CapabilityInvoker added. Cross-links to docs/v2/driver-specs.md + docs/GalaxyRepository.md.

- docs/reqs/MxAccessClientReqs.md — scope clarified as Galaxy-Host-only. MXA-001..MXA-009 preserved (STA pump, register/unregister, subscription refcount, auto-reconnect, probe, COM cleanup, operation metrics, error translation). MXA-010 Proxy-side capability wrapping + MXA-011 pipe ACL + per-process shared secret (OTOPCUA_ALLOWED_SID / OTOPCUA_GALAXY_SECRET) added.

- docs/reqs/ServiceHostReqs.md — rewritten for three-process deployment. Shared section (SVC-SHARED-001/002) for Serilog + bootstrap-only appsettings. SRV-* for OtOpcUa.Server (net10 x64, Microsoft.Extensions.Hosting + AddWindowsService, in-process driver hosting, redundancy-node bootstrap). ADM-* for OtOpcUa.Admin (Blazor Server, cookie+LDAP auth, CanEdit/CanPublish policies, sole DB writer, Prometheus /metrics, audit logging). GHX-* for OtOpcUa.Galaxy.Host (TopShelf, net48 x86, named-pipe IPC bootstrap, STA backend lifecycle, crash handling tied to supervisor).

- docs/reqs/ClientRequirements.md — restructured as numbered, verifiable requirements. SHR-* for Client.Shared (single IOpcUaClientService, ConnectionSettings, failover, cross-platform certs, type-coercing write, UI-thread neutrality). CLI-001..CLI-011 cover connect/read/write/browse/subscribe/historyread/alarms/redundancy. UI-001..UI-008 cover connection panel, tree browser, each tab, connection-state reflection, cross-platform build. Reference design content (IOpcUaClientService shape, models, view-model map, mock layout) preserved.

- docs/reqs/StatusDashboardReqs.md — retired cleanly. Replaced with a pointer to docs/v2/admin-ui.md + HLR-015 / HLR-016 / HLR-017 / ADM-*. Mapping table shows each retired DASH-001..DASH-009 requirement's replacement (live cluster-node view via SignalR, Prometheus metrics, driver-instance detail views, etc.). Note that a formal AdminUiReqs.md can be written later if needed for cert compliance.

HighLevelReqs.md was already at the target shape (HLR-001..HLR-018 with Revision header noting retired HLR-009) as of commit f217636; verified identical and no additional edit required.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
Joseph Doherty
2026-04-20 01:31:58 -04:00
parent f217636467
commit 48970af416
6 changed files with 739 additions and 644 deletions

View File

@@ -1,234 +1,266 @@
# OPC UA Server — Component Requirements
Parent: [HLR-001](HighLevelReqs.md#hlr-001-opc-ua-server), [HLR-002](HighLevelReqs.md#hlr-002-galaxy-hierarchy-as-opc-ua-address-space), [HLR-004](HighLevelReqs.md#hlr-004-data-type-mapping)
> **Revision** — Refreshed 2026-04-19 for the OtOpcUa v2 multi-driver platform (task #205). OPC-001…OPC-013 have been rewritten driver-agnostically — they now describe how the core OPC UA server composes multiple driver subtrees, enforces authorization, and invokes capabilities through the Polly-wrapped dispatch path. OPC-014 through OPC-022 are new and cover capability dispatch, per-host Polly isolation, idempotence-aware write retry, `AuthorizationGate`, `ServiceLevel` reporting, the alarm surface, history surface, server-certificate management, and the transport-security profile matrix. Galaxy-specific behavior has been moved out to `GalaxyRepositoryReqs.md` and `MxAccessClientReqs.md`.
Parent: [HLR-001](HighLevelReqs.md#hlr-001-opc-ua-server), [HLR-003](HighLevelReqs.md#hlr-003-address-space-composition-per-namespace), [HLR-009](HighLevelReqs.md#hlr-009-transport-security-and-authentication), [HLR-010](HighLevelReqs.md#hlr-010-per-driver-instance-resilience), [HLR-013](HighLevelReqs.md#hlr-013-cluster-redundancy)
## OPC-001: Server Endpoint
The OPC UA server shall listen on a configurable TCP port (default 4840) using the OPC Foundation .NET Standard stack.
The OPC UA server shall listen on a configurable TCP endpoint using the OPC Foundation .NET Standard stack and expose a single endpoint URL per cluster node.
### Acceptance Criteria
- Server starts and accepts TCP connections on the configured port.
- Port is read from `appsettings.json` under `OpcUa:Port`; defaults to 4840 if absent.
- Endpoint URL format: `opc.tcp://<hostname>:<port>/LmxOpcUa`.
- If the port is in use at startup, log an Error and fail to start (do not silently pick another port).
- Security policy: None (no certificate validation). This is an internal plant-floor service.
- Endpoint URL comes from `ClusterNode.EndpointUrl` in the Config DB (default form `opc.tcp://<hostname>:<port>/OtOpcUa`).
- `ApplicationName` and `ApplicationUri` come from `ClusterNode` fields; `ApplicationUri` is unique per node so redundancy `ServerUriArray` entries are distinguishable.
- Port defaults to 4840. If the port is in use at startup the server shall log Error and fail to start (no silent port reassignment).
- Uses `OPCFoundation.NetStandard.Opc.Ua.Server` NuGet.
- Endpoint URL logged at Information level on startup.
### Details
- Configurable items: port (default 4840), endpoint path (default `/LmxOpcUa`), server application name (default `LmxOpcUa`).
- Server shall use the `OPCFoundation.NetStandard.Opc.Ua.Server` NuGet package.
- On startup, log the endpoint URL at Information level.
- Node-local `appsettings.json` only carries the `Config DB connection + NodeId + ClusterId` bootstrap — actual endpoint topology comes from the Config DB per HLR-011.
---
## OPC-002: Address Space Structure
## OPC-002: Address Space Composition
The server shall create folder nodes for areas and object nodes for automation objects, organized in the same parent-child hierarchy as the Galaxy.
The server shall compose an address space by mounting each active driver instance's subtree under a dedicated OPC UA namespace.
### Acceptance Criteria
- The root folder node has BrowseName `ZB` (hardcoded Galaxy name).
- Objects where `is_area = 1` are created as FolderType nodes (organizational).
- Objects where `is_area = 0` are created as BaseObjectType nodes.
- Parent-child relationships use Organizes references (for areas) and HasComponent references (for contained objects).
- A client browsing Root → Objects → ZB → DEV → TestArea → TestMachine_001 → DelmiaReceiver sees the same structure as `gr/layout.md`.
### Details
- NodeIds use a string-based identifier scheme: `ns=1;s=<tag_name>` for object nodes, `ns=1;s=<tag_name>.<attribute_name>` for variable nodes.
- Infrastructure objects (AppEngines, Platforms) are included in the tree but may have no variable children.
- When `contained_name` is null or empty, fall back to `tag_name` as the BrowseName.
- Each `DriverInstance` in the current published generation registers one `IDriver` implementation in the core.
- Each driver's `ITagDiscovery.DiscoverAsync` result is streamed into the core via `IAddressSpaceBuilder``AddFolder` / `AddVariable` calls; the driver does not buffer the whole tree.
- Each driver instance gets its own namespace index; `NamespaceUri` comes from the `Namespace` row in the Config DB.
- Each cluster has at most one namespace per `Kind` (`Equipment`, `SystemPlatform`, future `Simulated`); enforced by UNIQUE on `(ClusterId, Kind)` in the DB.
- Galaxy driver subtree preserves the contained-name browse structure from the deployed Galaxy (moved to `GalaxyRepositoryReqs.md`).
- Equipment-kind drivers populate the canonical 5-level UNS structure (`Enterprise/Site/Area/Line/Equipment/Signal`).
---
## OPC-003: Variable Nodes for Attributes
## OPC-003: Variable Nodes and Access Levels
Each user-defined attribute on a deployed object shall be represented as an OPC UA variable node under its parent object node.
Each tag produced by a driver's `ITagDiscovery` shall become an OPC UA variable node.
### Acceptance Criteria
- Each row from `attributes.sql` creates one variable node under the matching object node (matched by `gobject_id`).
- Variable node BrowseName and DisplayName are set to `attribute_name`.
- Variable node stores `full_tag_reference` as its runtime MXAccess address.
- Variable node AccessLevel is set based on the attribute's `security_classification` per the mapping in `gr/data_type_mapping.md`.
- FreeAccess (0), Operate (1), Tune (4), Configure (5) → AccessLevel = CurrentRead | CurrentWrite (3).
- SecuredWrite (2), VerifiedWrite (3), ViewOnly (6) → AccessLevel = CurrentRead (1).
- Objects with no user-defined attributes still appear as object nodes with zero children.
### Details
- Security classification determines the OPC UA AccessLevel and UserAccessLevel attributes on each variable node. The OPC UA stack enforces read-only access for nodes with CurrentRead-only access level.
- Attributes whose names start with `_` are already filtered by the SQL query.
- Variable node `BrowseName` and `DisplayName` come from `DriverAttributeInfo`.
- `DataType` is resolved from `DriverDataType` per each driver's spec in `docs/v2/driver-specs.md`.
- `AccessLevel` and `UserAccessLevel` are derived from the tag's `SecurityClassification` and the session's effective permissions walked through the node-ACL permission trie (see OPC-017 `AuthorizationGate`).
- Scalar attributes produce `ValueRank = Scalar`; array attributes produce `ValueRank = OneDimension` with `ArrayDimensions` set from the driver's attribute info.
---
## OPC-004: Browse Name Translation
## OPC-004: Namespace Index Allocation
Browse names shall use contained names (human-readable, scoped to parent). The server shall internally translate browse paths to tag_name references for MXAccess operations.
The server shall register one OPC UA namespace per active driver instance.
### Acceptance Criteria
- A variable node browsed as `ZB/DEV/TestArea/TestMachine_001/DelmiaReceiver/DownloadPath` correctly translates to MXAccess reference `DelmiaReceiver_001.DownloadPath`.
- Translation uses the `tag_name` stored on the parent object node, not the browse path.
- No runtime path parsing — the mapping is baked into each node at build time.
### Details
- Each variable node stores its `full_tag_reference` (e.g., `DelmiaReceiver_001.DownloadPath`) at address-space build time. Read/write operations use this stored reference directly.
- Namespace index 0 remains the standard OPC UA namespace.
- Each driver instance's `Namespace.Uri` becomes a registered namespace; its index is assigned deterministically at startup from the published generation's driver ordering.
- All variable NodeIds use the driver's namespace index; NodeId identifiers are string-shaped and stable across restarts of the same generation.
- Namespace index reshuffles are a publish-time concern; clients reconciling server-relative NodeIds must re-resolve namespace URIs after a new generation is applied.
---
## OPC-005: Data Type Mapping
## OPC-005: Read Operations
Variable nodes shall use OPC UA data types mapped from Galaxy mx_data_type values per the mapping in `gr/data_type_mapping.md`.
The server shall fulfill OPC UA `Read` requests by invoking `IReadable.ReadAsync` on the target driver instance, dispatched through `CapabilityInvoker`.
### Acceptance Criteria
- Every `mx_data_type` value in the mapping table produces the correct OPC UA DataType NodeId on the variable node.
- Unknown/unmapped `mx_data_type` values default to String (i=12).
- ElapsedTime (type 7) maps to Double representing seconds.
### Details
- Full mapping table in `gr/data_type_mapping.md`.
- DateTime conversion: Galaxy may store local time; convert to UTC for OPC UA.
- LocalizedText (type 15): use empty locale string with the text value.
- Every read call at dispatch passes through `Core.Resilience.CapabilityInvoker.InvokeAsync(DriverCapability.Read, …)`.
- Returned `DataValueSnapshot` is converted to an OPC UA `DataValue` with `StatusCode`, source timestamp, and server timestamp.
- If the owning driver instance's Polly circuit is open, the read returns Bad quality immediately without hitting the wire.
- Reads on a node the session has no `Read` bit for in the permission trie return `Bad_UserAccessDenied` before the capability is invoked (OPC-017).
- Read timeout is the Polly timeout leg on the `Read` capability; its duration is per-`(DriverInstanceId, HostName)` and comes from the Config DB.
---
## OPC-006: Array Support
## OPC-006: Write Operations
Attributes marked as arrays shall have ValueRank=1 and ArrayDimensions set to the attribute's array_dimension value.
The server shall fulfill OPC UA `Write` requests by invoking `IWritable.WriteAsync` through `CapabilityInvoker` with **idempotence-aware** retry policy.
### Acceptance Criteria
- `is_array = 1` produces ValueRank = 1 (OneDimension) and ArrayDimensions = `[array_dimension]`.
- `is_array = 0` produces ValueRank = -1 (Scalar) and no ArrayDimensions.
- MXAccess reference for array attributes uses `tag_name.attribute[]` (whole array) format.
### Details
- Individual array element access (`tag_name.attribute[n]`) is not required for initial implementation. Whole-array read/write only.
- If `array_dimension` is null or 0 when `is_array = 1`, log a Warning and default to ArrayDimensions = [0] (variable-length).
- Writes dispatch through `CapabilityInvoker.InvokeAsync(DriverCapability.Write, …)`.
- Writes **do not auto-retry** unless the tag's `TagConfig.WriteIdempotent = true`, or the driver's capability is marked with `[WriteIdempotent]` (decision #143).
- Writes on a node the session lacks the required permission bit for (`WriteOperate`, `WriteTune`, or `WriteConfigure` derived from the tag's `SecurityClassification`) return `Bad_UserAccessDenied` before the capability runs.
- A write into an open circuit returns a driver-shaped error (`Bad_NoCommunication` / `Bad_ServerNotConnected`) without hitting the wire.
- The server shall coerce the written OPC UA value to the driver's expected native type using the node's `DriverDataType` before calling `WriteAsync`.
- Writes to a NodeId not currently in the address space return `Bad_NodeIdUnknown`.
---
## OPC-007: Read Operations
## OPC-007: Subscriptions and Monitored Items
The server shall fulfill OPC UA Read requests by reading the corresponding tag value from MXAccess using the tag_name.AttributeName reference.
The server shall map OPC UA `CreateMonitoredItems` / `DeleteMonitoredItems` to `ISubscribable.SubscribeAsync` / `UnsubscribeAsync` on the owning driver instance.
### Acceptance Criteria
- OPC UA Read request for a variable node results in a read via MXAccess using the node's stored `full_tag_reference`.
- Returned value is converted from the COM variant to the OPC UA data type specified on the node.
- OPC UA StatusCode reflects MXAccess quality: Good maps to Good, Bad/Uncertain map appropriately.
- If MXAccess is not connected, return StatusCode = Bad_NotConnected.
- Read timeout: configurable, default 5 seconds. On timeout, return Bad_Timeout.
### Details
- Prefer cached subscription-delivered values over on-demand reads to reduce COM round-trips.
- If no subscription is active for the tag, perform an on-demand read (AddItem, AdviseSupervisory, wait for first OnDataChange, then UnAdvise/RemoveItem).
- Concurrency: semaphore-limited to configurable max (default 10) concurrent MXAccess operations.
- Subscription setup dispatches through `CapabilityInvoker.InvokeAsync(DriverCapability.Subscribe, …)`.
- Two OPC UA monitored items against the same tag produce exactly one driver-side subscription (ref-counted); last unsubscribe releases the driver-side resource.
- `OnDataChange` callbacks from the driver arrive as `DataValueSnapshot` and are forwarded to all OPC UA monitored items on that tag.
- Driver-side quality maps to OPC UA `StatusCode` per the driver's spec.
- When the owning driver's circuit opens, subscribed items publish Bad quality; when it resets, resumption publishes the cached or freshly-sampled value.
- Across generation applies that preserve a tag's NodeId, existing OPC UA monitored items are preserved (no re-subscribe required on the client).
---
## OPC-008: Write Operations
## OPC-008: Alarm Surface
The server shall fulfill OPC UA Write requests by writing to the corresponding tag via MXAccess.
The server shall expose the OPC UA alarm and condition model backed by each driver's `IAlarmSource` (where implemented).
### Acceptance Criteria
- OPC UA Write request results in an MXAccess `Write()` call with completion confirmed via `OnWriteComplete()` callback.
- Write timeout: configurable, default 5 seconds. On timeout, log Warning and return Bad_Timeout.
- MXSTATUS_PROXY with `success = 0` causes the OPC UA write to return Bad_InternalError with the detail message.
- MXAccess errors 1008 (no permission), 1012 (secured write), 1013 (verified write) return Bad_UserAccessDenied.
- Write to a non-existent tag returns Bad_NodeIdUnknown.
- The server shall attempt to convert the written value to the expected Galaxy data type before passing to Write().
### Details
- Write uses security classification -1 (no security). Galaxy runtime handles security enforcement.
- Write sequence: uses existing subscription handle if available, otherwise AddItem + AdviseSupervisory + Write + await OnWriteComplete + cleanup.
- Concurrent write limit: same semaphore as reads (configurable, default 10).
- Drivers implementing `IAlarmSource` (today: Galaxy, FOCAS, OPC UA Client) produce alarm events that the core maps onto OPC UA `ConditionType` / `AlarmConditionType` instances in the driver's namespace.
- `AlarmSubscribe` dispatches through `CapabilityInvoker.InvokeAsync(DriverCapability.AlarmSubscribe, …)` and retries on transient failure.
- `AlarmAcknowledge` from the OPC UA client dispatches through `CapabilityInvoker.InvokeAsync(DriverCapability.AlarmAcknowledge, …)` and **does not retry** (decision #143 — ack is a write-shaped operation).
- Alarm-ack requires the `AlarmAck` permission bit for the tag / equipment node; otherwise `Bad_UserAccessDenied`.
- Drivers that do not implement `IAlarmSource` contribute no alarm nodes; the core does not synthesize placeholder conditions.
---
## OPC-009: Subscriptions
## OPC-009: Historical Access
The server shall support OPC UA subscriptions by mapping them to MXAccess advisory subscriptions and forwarding data change notifications.
The server shall surface OPC UA Historical Access (HA) via each driver's `IHistoryProvider` (where implemented).
### Acceptance Criteria
- OPC UA CreateMonitoredItems results in MXAccess `AdviseSupervisory()` subscriptions for the requested tags.
- Data changes from `OnDataChange` callback are forwarded as OPC UA notifications to all subscribed clients.
- Shared subscriptions: if two OPC UA clients subscribe to the same tag, only one MXAccess subscription exists (ref-counted).
- Last subscriber unsubscribing triggers UnAdvise/RemoveItem on the MXAccess side.
- After MXAccess reconnect, all active MXAccess subscriptions are re-established automatically.
### Details
- Publishing interval from the OPC UA subscription request is honored on the OPC UA side; MXAccess delivers changes as fast as it receives them.
- OPC UA quality mapping from MXAccess quality integers: 192+ = Good, 64-191 = Uncertain, 0-63 = Bad.
- OnDataChange with MXSTATUS_PROXY failure: deliver notification with Bad quality to subscribed clients.
- `HistoryRead` for `Raw`, `Processed`, `AtTime`, and `Events` dispatches through `CapabilityInvoker.InvokeAsync(DriverCapability.HistoryRead, …)`.
- Drivers implementing `IHistoryProvider` today: Galaxy (Wonderware Historian), OPC UA Client (proxy to remote historian).
- Drivers not implementing `IHistoryProvider` return `Bad_HistoryOperationUnsupported` for history requests on their nodes.
- History reads require the `Read` permission bit on the target node.
---
## OPC-010: Address Space Rebuild
## OPC-010: Transport Security Profiles
When a Galaxy deployment change is detected, the server shall rebuild the address space without dropping existing OPC UA client connections where possible.
The server shall offer OPC UA transport-security profiles resolved at startup by `SecurityProfileResolver`.
### Acceptance Criteria
- When Galaxy Repository detects a deployment change, the OPC UA address space is updated.
- Only changed gobject subtrees are torn down and rebuilt; unchanged nodes, subscriptions, and alarm tracking remain intact.
- Existing OPC UA client sessions are preserved — clients stay connected.
- Subscriptions for tags on unchanged objects continue to work without interruption.
- Subscriptions for tags that no longer exist receive a Bad_NodeIdUnknown status notification.
- Sync is logged at Information level with the number of changed gobjects.
### Details
- Uses incremental subtree sync: compares previous hierarchy+attributes with new, identifies changed gobject IDs, expands to include child subtrees, tears down only affected subtrees, and rebuilds them.
- First build (no cached state) performs a full build.
- If no changes are detected, the sync is a no-op (logged and skipped).
- Alarm tracking and MXAccess subscriptions for unchanged objects are not disrupted.
- Falls back to full rebuild behavior if the entire hierarchy changes.
- Supported profiles: `None`, `Basic256Sha256-Sign`, `Basic256Sha256-SignAndEncrypt`, `Aes128_Sha256_RsaOaep-Sign`, `Aes128_Sha256_RsaOaep-SignAndEncrypt`, `Aes256_Sha256_RsaPss-Sign`, `Aes256_Sha256_RsaPss-SignAndEncrypt`.
- Active profile list comes from `OpcUa.SecurityProfile` in `appsettings.json` (bootstrap config) or Config DB (per-cluster override).
- Server certificate is created at first startup even when only `None` is enabled, because UserName-token encryption depends on an ApplicationInstanceCertificate.
- Certificate store root path is configurable (default `%ProgramData%/OtOpcUa/pki/`).
- `AutoAcceptUntrustedClientCertificates` is a config flag; production deployments set it to `false` and operators add trusted client certs via the Admin UI Cert Trust screen.
---
## OPC-011: Server Diagnostics Node
## OPC-011: UserName Authentication
The server shall expose a ServerStatus node under the standard OPC UA Server object with ServerState, CurrentTime, and StartTime. This is required by the OPC UA specification for compliant servers.
The server shall validate `UserNameIdentityToken` credentials against LDAP (production: Active Directory; dev: GLAuth).
### Acceptance Criteria
- ServerState reports Running during normal operation.
- CurrentTime returns the server's current UTC time.
- StartTime returns the UTC time when the service started.
- If `Ldap.Enabled = false`, all UserName tokens are rejected (`BadUserAccessDenied`).
- When enabled, the server performs an LDAP bind using the supplied credentials via `LdapUserAuthenticator`.
- On successful bind, group memberships resolved from LDAP are mapped through `LdapOptions.GroupToRole` to produce the session's permission bits (`ReadOnly`, `WriteOperate`, `WriteTune`, `WriteConfigure`, `AlarmAck`).
- `LdapAuthenticationProvider` implements both `IUserAuthenticationProvider` and `IRoleProvider`.
- UserName tokens are always carried on an encrypted secure channel (either Sign-and-Encrypt transport, or encrypted token using the server certificate even on a `None` channel).
---
## OPC-012: Namespace Configuration
## OPC-012: Capability Dispatch via CapabilityInvoker
The server shall register a namespace URI at namespace index 1. All application-specific NodeIds shall use this namespace.
Every async capability-interface call the server makes shall route through `Core.Resilience.CapabilityInvoker`.
### Acceptance Criteria
- Namespace URI: `urn:ZB:LmxOpcUa` (Galaxy name is configurable).
- All object and variable NodeIds created from Galaxy data use namespace index 1.
- Standard OPC UA nodes remain in namespace 0.
- `CapabilityInvoker.InvokeAsync` resolves a Polly resilience pipeline keyed on `(DriverInstanceId, HostName, DriverCapability)`.
- Read / Discover / Probe / Subscribe / AlarmSubscribe / HistoryRead pipelines carry Timeout + Retry + CircuitBreaker strategies.
- Write / AlarmAcknowledge pipelines carry Timeout + CircuitBreaker only; Retry is enabled only when the tag or capability carries `[WriteIdempotent]` (decision #143).
- Roslyn diagnostic **OTOPCUA0001** fires on any direct call to a capability-interface method from outside `CapabilityInvoker` (enforced via `ZB.MOM.WW.OtOpcUa.Analyzers`).
---
## OPC-013: Session Management
## OPC-013: Per-Host Polly Isolation
Polly pipelines shall be keyed per `(DriverInstanceId, HostName, DriverCapability)` so that a failing device in one driver does not trip the circuit for another device on the same driver or any other driver (decision #144).
### Acceptance Criteria
- A driver serving `N` devices has `N × capabilityCount` distinct pipelines.
- Circuit-breaker state transitions are telemetry-published per pipeline and appear on the Admin UI + `/metrics`.
- A host-scope fault (e.g. shared PLC gateway) naturally trips all devices behind that host but leaves other hosts untouched.
---
## OPC-014: Authorization Gate and Permission Trie
`Security.AuthorizationGate` shall enforce node-level permissions on every browse, read, write, subscribe, alarm-ack, and history call before dispatch.
### Acceptance Criteria
- Permission bits for the session are assembled at login from LDAP group → role → permission mapping plus Config-DB `NodeAcl` rows that modify permission inheritance along the browse tree.
- The permission trie walks from the addressed node toward the root, inheriting permissions unless a `NodeAcl` overrides; first match wins.
- Missing `Read` bit → `Bad_UserAccessDenied` on Read / Subscribe / HistoryRead.
- Missing `Write*` bit (matching the tag's `SecurityClassification`) → `Bad_UserAccessDenied` on Write.
- Missing `AlarmAck` bit → `Bad_UserAccessDenied` on acknowledge.
- Authorization decisions are made at the server layer only — drivers never enforce authorization and only expose `SecurityClassification` metadata.
---
## OPC-015: ServiceLevel Reporting
The server shall expose a dynamic `ServiceLevel` value computed by `RedundancyCoordinator` + `ServiceLevelCalculator`.
### Acceptance Criteria
- `ServiceLevel` reflects: redundancy role (Primary higher than Secondary), publish state (current generation applied > mid-apply > failed-apply), driver health (any driver instance in open circuit lowers the value), apply-lease state.
- `ServiceLevel` is exposed as a Variable under the standard `Server` object and is readable by any authenticated client.
- Clients that observe Primary's `ServiceLevel` drop below Secondary's should failover per the OPC UA spec.
- Single-node deployments (`NodeCount = 1`) always publish their node as Primary.
---
## OPC-016: Session Management
The server shall support multiple concurrent OPC UA client sessions.
### Acceptance Criteria
- Maximum concurrent sessions: configurable, default 100.
- Session timeout: configurable, default 30 minutes of inactivity.
- Expired sessions are cleaned up and their subscriptions removed.
- Session count is reported to the status dashboard.
- Maximum concurrent sessions and session timeout come from Config DB cluster settings (default 100 sessions, 30-minute idle timeout).
- Expired sessions are cleaned up and their subscriptions and monitored items removed.
- Active session count is reported as a Prometheus gauge on the Admin `/metrics` endpoint.
---
## OPC-017: Address Space Rebuild on Generation Apply
When a new Config DB generation is applied, the server shall surgically update only the affected driver subtrees.
### Acceptance Criteria
- Apply compares the previous generation to the incoming generation and produces per-driver add / modify / remove sets.
- Existing OPC UA sessions, subscriptions, and monitored items are preserved across apply whenever the target NodeId survives the generation change.
- Tags that no longer exist post-apply emit `Bad_NodeIdUnknown` on their subscribed monitored items.
- During apply, the node's `ServiceLevel` is lowered (per `ServiceLevelCalculator`) so redundancy partners temporarily take precedence.
- Galaxy subtree rebuilds triggered by `IRediscoverable` (Galaxy deployment change) are scoped to the Galaxy driver's namespace and follow the same preservation rule (OPC-006 from the v1 file, now subsumed).
---
## OPC-018: Server Diagnostics Nodes
The server shall expose standard OPC UA `Server` object nodes required by the spec.
### Acceptance Criteria
- `ServerStatus` / `ServerState` / `CurrentTime` / `StartTime` populated and compliant with the OPC UA 1.05 spec.
- `ServerCapabilities` declares historical access capabilities for namespaces that have an `IHistoryProvider`-backed driver.
- `ServerRedundancy.RedundancySupport` reflects the cluster's redundancy mode (`None` / `Warm` / `Hot`).
- `ServerRedundancy.ServerUriArray` lists both cluster members' `ApplicationUri` values.
---
## OPC-019: Observability Hooks
The server shall emit OpenTelemetry metrics consumed by the Admin `/metrics` Prometheus endpoint.
### Acceptance Criteria
- Counters: capability calls per `DriverInstanceId` + `DriverCapability`, OPC UA requests per method, alarm events emitted, history reads, generation apply attempts.
- Histograms: capability-call duration per `DriverInstanceId` + `DriverCapability`, OPC UA request duration per method.
- Gauges: circuit-breaker state per pipeline, active OPC UA sessions, active monitored items, subscription queue depth, `ServiceLevel` value, memory-tracking watermarks (Phase 6.1).
- Metric cardinality is bounded — `DriverInstanceId` and `HostName` are the only high-cardinality labels, both controlled by the Config DB.