Files
lmxopcua/docs/reqs/OpcUaServerReqs.md
Joseph Doherty 48970af416 Doc refresh (task #205) — requirements updated for multi-driver OtOpcUa three-process deploy
Per-file summary:

- docs/reqs/OpcUaServerReqs.md — rewritten driver-agnostic. OPC-001..OPC-013 re-scoped to multi-driver address-space composition + capability dispatch; OPC-014 AuthorizationGate + permission trie; OPC-015 dynamic ServiceLevel via RedundancyCoordinator; OPC-017 surgical generation-apply rebuild; OPC-012 capability dispatch via CapabilityInvoker (decision #143 idempotence-aware retry); OPC-013 per-host Polly isolation (decision #144); OPC-019 OpenTelemetry metrics. Transport-security profile matrix (OPC-010) + UserName/LDAP (OPC-011) preserved.

- docs/reqs/GalaxyRepositoryReqs.md — scope clarified as Galaxy-driver-only (not platform). GR-001..GR-004 tied to ITagDiscovery.DiscoverAsync + IRediscoverable; all SQL runs inside OtOpcUa.Galaxy.Host and streams to Proxy via named pipe. GR-008 capability wrapping via CapabilityInvoker added. Cross-links to docs/v2/driver-specs.md + docs/GalaxyRepository.md.

- docs/reqs/MxAccessClientReqs.md — scope clarified as Galaxy-Host-only. MXA-001..MXA-009 preserved (STA pump, register/unregister, subscription refcount, auto-reconnect, probe, COM cleanup, operation metrics, error translation). MXA-010 Proxy-side capability wrapping + MXA-011 pipe ACL + per-process shared secret (OTOPCUA_ALLOWED_SID / OTOPCUA_GALAXY_SECRET) added.

- docs/reqs/ServiceHostReqs.md — rewritten for three-process deployment. Shared section (SVC-SHARED-001/002) for Serilog + bootstrap-only appsettings. SRV-* for OtOpcUa.Server (net10 x64, Microsoft.Extensions.Hosting + AddWindowsService, in-process driver hosting, redundancy-node bootstrap). ADM-* for OtOpcUa.Admin (Blazor Server, cookie+LDAP auth, CanEdit/CanPublish policies, sole DB writer, Prometheus /metrics, audit logging). GHX-* for OtOpcUa.Galaxy.Host (TopShelf, net48 x86, named-pipe IPC bootstrap, STA backend lifecycle, crash handling tied to supervisor).

- docs/reqs/ClientRequirements.md — restructured as numbered, verifiable requirements. SHR-* for Client.Shared (single IOpcUaClientService, ConnectionSettings, failover, cross-platform certs, type-coercing write, UI-thread neutrality). CLI-001..CLI-011 cover connect/read/write/browse/subscribe/historyread/alarms/redundancy. UI-001..UI-008 cover connection panel, tree browser, each tab, connection-state reflection, cross-platform build. Reference design content (IOpcUaClientService shape, models, view-model map, mock layout) preserved.

- docs/reqs/StatusDashboardReqs.md — retired cleanly. Replaced with a pointer to docs/v2/admin-ui.md + HLR-015 / HLR-016 / HLR-017 / ADM-*. Mapping table shows each retired DASH-001..DASH-009 requirement's replacement (live cluster-node view via SignalR, Prometheus metrics, driver-instance detail views, etc.). Note that a formal AdminUiReqs.md can be written later if needed for cert compliance.

HighLevelReqs.md was already at the target shape (HLR-001..HLR-018 with Revision header noting retired HLR-009) as of commit f217636; verified identical and no additional edit required.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 01:31:58 -04:00

267 lines
16 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# OPC UA Server — Component Requirements
> **Revision** — Refreshed 2026-04-19 for the OtOpcUa v2 multi-driver platform (task #205). OPC-001…OPC-013 have been rewritten driver-agnostically — they now describe how the core OPC UA server composes multiple driver subtrees, enforces authorization, and invokes capabilities through the Polly-wrapped dispatch path. OPC-014 through OPC-022 are new and cover capability dispatch, per-host Polly isolation, idempotence-aware write retry, `AuthorizationGate`, `ServiceLevel` reporting, the alarm surface, history surface, server-certificate management, and the transport-security profile matrix. Galaxy-specific behavior has been moved out to `GalaxyRepositoryReqs.md` and `MxAccessClientReqs.md`.
Parent: [HLR-001](HighLevelReqs.md#hlr-001-opc-ua-server), [HLR-003](HighLevelReqs.md#hlr-003-address-space-composition-per-namespace), [HLR-009](HighLevelReqs.md#hlr-009-transport-security-and-authentication), [HLR-010](HighLevelReqs.md#hlr-010-per-driver-instance-resilience), [HLR-013](HighLevelReqs.md#hlr-013-cluster-redundancy)
## OPC-001: Server Endpoint
The OPC UA server shall listen on a configurable TCP endpoint using the OPC Foundation .NET Standard stack and expose a single endpoint URL per cluster node.
### Acceptance Criteria
- Endpoint URL comes from `ClusterNode.EndpointUrl` in the Config DB (default form `opc.tcp://<hostname>:<port>/OtOpcUa`).
- `ApplicationName` and `ApplicationUri` come from `ClusterNode` fields; `ApplicationUri` is unique per node so redundancy `ServerUriArray` entries are distinguishable.
- Port defaults to 4840. If the port is in use at startup the server shall log Error and fail to start (no silent port reassignment).
- Uses `OPCFoundation.NetStandard.Opc.Ua.Server` NuGet.
- Endpoint URL logged at Information level on startup.
### Details
- Node-local `appsettings.json` only carries the `Config DB connection + NodeId + ClusterId` bootstrap — actual endpoint topology comes from the Config DB per HLR-011.
---
## OPC-002: Address Space Composition
The server shall compose an address space by mounting each active driver instance's subtree under a dedicated OPC UA namespace.
### Acceptance Criteria
- Each `DriverInstance` in the current published generation registers one `IDriver` implementation in the core.
- Each driver's `ITagDiscovery.DiscoverAsync` result is streamed into the core via `IAddressSpaceBuilder``AddFolder` / `AddVariable` calls; the driver does not buffer the whole tree.
- Each driver instance gets its own namespace index; `NamespaceUri` comes from the `Namespace` row in the Config DB.
- Each cluster has at most one namespace per `Kind` (`Equipment`, `SystemPlatform`, future `Simulated`); enforced by UNIQUE on `(ClusterId, Kind)` in the DB.
- Galaxy driver subtree preserves the contained-name browse structure from the deployed Galaxy (moved to `GalaxyRepositoryReqs.md`).
- Equipment-kind drivers populate the canonical 5-level UNS structure (`Enterprise/Site/Area/Line/Equipment/Signal`).
---
## OPC-003: Variable Nodes and Access Levels
Each tag produced by a driver's `ITagDiscovery` shall become an OPC UA variable node.
### Acceptance Criteria
- Variable node `BrowseName` and `DisplayName` come from `DriverAttributeInfo`.
- `DataType` is resolved from `DriverDataType` per each driver's spec in `docs/v2/driver-specs.md`.
- `AccessLevel` and `UserAccessLevel` are derived from the tag's `SecurityClassification` and the session's effective permissions walked through the node-ACL permission trie (see OPC-017 `AuthorizationGate`).
- Scalar attributes produce `ValueRank = Scalar`; array attributes produce `ValueRank = OneDimension` with `ArrayDimensions` set from the driver's attribute info.
---
## OPC-004: Namespace Index Allocation
The server shall register one OPC UA namespace per active driver instance.
### Acceptance Criteria
- Namespace index 0 remains the standard OPC UA namespace.
- Each driver instance's `Namespace.Uri` becomes a registered namespace; its index is assigned deterministically at startup from the published generation's driver ordering.
- All variable NodeIds use the driver's namespace index; NodeId identifiers are string-shaped and stable across restarts of the same generation.
- Namespace index reshuffles are a publish-time concern; clients reconciling server-relative NodeIds must re-resolve namespace URIs after a new generation is applied.
---
## OPC-005: Read Operations
The server shall fulfill OPC UA `Read` requests by invoking `IReadable.ReadAsync` on the target driver instance, dispatched through `CapabilityInvoker`.
### Acceptance Criteria
- Every read call at dispatch passes through `Core.Resilience.CapabilityInvoker.InvokeAsync(DriverCapability.Read, …)`.
- Returned `DataValueSnapshot` is converted to an OPC UA `DataValue` with `StatusCode`, source timestamp, and server timestamp.
- If the owning driver instance's Polly circuit is open, the read returns Bad quality immediately without hitting the wire.
- Reads on a node the session has no `Read` bit for in the permission trie return `Bad_UserAccessDenied` before the capability is invoked (OPC-017).
- Read timeout is the Polly timeout leg on the `Read` capability; its duration is per-`(DriverInstanceId, HostName)` and comes from the Config DB.
---
## OPC-006: Write Operations
The server shall fulfill OPC UA `Write` requests by invoking `IWritable.WriteAsync` through `CapabilityInvoker` with **idempotence-aware** retry policy.
### Acceptance Criteria
- Writes dispatch through `CapabilityInvoker.InvokeAsync(DriverCapability.Write, …)`.
- Writes **do not auto-retry** unless the tag's `TagConfig.WriteIdempotent = true`, or the driver's capability is marked with `[WriteIdempotent]` (decision #143).
- Writes on a node the session lacks the required permission bit for (`WriteOperate`, `WriteTune`, or `WriteConfigure` derived from the tag's `SecurityClassification`) return `Bad_UserAccessDenied` before the capability runs.
- A write into an open circuit returns a driver-shaped error (`Bad_NoCommunication` / `Bad_ServerNotConnected`) without hitting the wire.
- The server shall coerce the written OPC UA value to the driver's expected native type using the node's `DriverDataType` before calling `WriteAsync`.
- Writes to a NodeId not currently in the address space return `Bad_NodeIdUnknown`.
---
## OPC-007: Subscriptions and Monitored Items
The server shall map OPC UA `CreateMonitoredItems` / `DeleteMonitoredItems` to `ISubscribable.SubscribeAsync` / `UnsubscribeAsync` on the owning driver instance.
### Acceptance Criteria
- Subscription setup dispatches through `CapabilityInvoker.InvokeAsync(DriverCapability.Subscribe, …)`.
- Two OPC UA monitored items against the same tag produce exactly one driver-side subscription (ref-counted); last unsubscribe releases the driver-side resource.
- `OnDataChange` callbacks from the driver arrive as `DataValueSnapshot` and are forwarded to all OPC UA monitored items on that tag.
- Driver-side quality maps to OPC UA `StatusCode` per the driver's spec.
- When the owning driver's circuit opens, subscribed items publish Bad quality; when it resets, resumption publishes the cached or freshly-sampled value.
- Across generation applies that preserve a tag's NodeId, existing OPC UA monitored items are preserved (no re-subscribe required on the client).
---
## OPC-008: Alarm Surface
The server shall expose the OPC UA alarm and condition model backed by each driver's `IAlarmSource` (where implemented).
### Acceptance Criteria
- Drivers implementing `IAlarmSource` (today: Galaxy, FOCAS, OPC UA Client) produce alarm events that the core maps onto OPC UA `ConditionType` / `AlarmConditionType` instances in the driver's namespace.
- `AlarmSubscribe` dispatches through `CapabilityInvoker.InvokeAsync(DriverCapability.AlarmSubscribe, …)` and retries on transient failure.
- `AlarmAcknowledge` from the OPC UA client dispatches through `CapabilityInvoker.InvokeAsync(DriverCapability.AlarmAcknowledge, …)` and **does not retry** (decision #143 — ack is a write-shaped operation).
- Alarm-ack requires the `AlarmAck` permission bit for the tag / equipment node; otherwise `Bad_UserAccessDenied`.
- Drivers that do not implement `IAlarmSource` contribute no alarm nodes; the core does not synthesize placeholder conditions.
---
## OPC-009: Historical Access
The server shall surface OPC UA Historical Access (HA) via each driver's `IHistoryProvider` (where implemented).
### Acceptance Criteria
- `HistoryRead` for `Raw`, `Processed`, `AtTime`, and `Events` dispatches through `CapabilityInvoker.InvokeAsync(DriverCapability.HistoryRead, …)`.
- Drivers implementing `IHistoryProvider` today: Galaxy (Wonderware Historian), OPC UA Client (proxy to remote historian).
- Drivers not implementing `IHistoryProvider` return `Bad_HistoryOperationUnsupported` for history requests on their nodes.
- History reads require the `Read` permission bit on the target node.
---
## OPC-010: Transport Security Profiles
The server shall offer OPC UA transport-security profiles resolved at startup by `SecurityProfileResolver`.
### Acceptance Criteria
- Supported profiles: `None`, `Basic256Sha256-Sign`, `Basic256Sha256-SignAndEncrypt`, `Aes128_Sha256_RsaOaep-Sign`, `Aes128_Sha256_RsaOaep-SignAndEncrypt`, `Aes256_Sha256_RsaPss-Sign`, `Aes256_Sha256_RsaPss-SignAndEncrypt`.
- Active profile list comes from `OpcUa.SecurityProfile` in `appsettings.json` (bootstrap config) or Config DB (per-cluster override).
- Server certificate is created at first startup even when only `None` is enabled, because UserName-token encryption depends on an ApplicationInstanceCertificate.
- Certificate store root path is configurable (default `%ProgramData%/OtOpcUa/pki/`).
- `AutoAcceptUntrustedClientCertificates` is a config flag; production deployments set it to `false` and operators add trusted client certs via the Admin UI Cert Trust screen.
---
## OPC-011: UserName Authentication
The server shall validate `UserNameIdentityToken` credentials against LDAP (production: Active Directory; dev: GLAuth).
### Acceptance Criteria
- If `Ldap.Enabled = false`, all UserName tokens are rejected (`BadUserAccessDenied`).
- When enabled, the server performs an LDAP bind using the supplied credentials via `LdapUserAuthenticator`.
- On successful bind, group memberships resolved from LDAP are mapped through `LdapOptions.GroupToRole` to produce the session's permission bits (`ReadOnly`, `WriteOperate`, `WriteTune`, `WriteConfigure`, `AlarmAck`).
- `LdapAuthenticationProvider` implements both `IUserAuthenticationProvider` and `IRoleProvider`.
- UserName tokens are always carried on an encrypted secure channel (either Sign-and-Encrypt transport, or encrypted token using the server certificate even on a `None` channel).
---
## OPC-012: Capability Dispatch via CapabilityInvoker
Every async capability-interface call the server makes shall route through `Core.Resilience.CapabilityInvoker`.
### Acceptance Criteria
- `CapabilityInvoker.InvokeAsync` resolves a Polly resilience pipeline keyed on `(DriverInstanceId, HostName, DriverCapability)`.
- Read / Discover / Probe / Subscribe / AlarmSubscribe / HistoryRead pipelines carry Timeout + Retry + CircuitBreaker strategies.
- Write / AlarmAcknowledge pipelines carry Timeout + CircuitBreaker only; Retry is enabled only when the tag or capability carries `[WriteIdempotent]` (decision #143).
- Roslyn diagnostic **OTOPCUA0001** fires on any direct call to a capability-interface method from outside `CapabilityInvoker` (enforced via `ZB.MOM.WW.OtOpcUa.Analyzers`).
---
## OPC-013: Per-Host Polly Isolation
Polly pipelines shall be keyed per `(DriverInstanceId, HostName, DriverCapability)` so that a failing device in one driver does not trip the circuit for another device on the same driver or any other driver (decision #144).
### Acceptance Criteria
- A driver serving `N` devices has `N × capabilityCount` distinct pipelines.
- Circuit-breaker state transitions are telemetry-published per pipeline and appear on the Admin UI + `/metrics`.
- A host-scope fault (e.g. shared PLC gateway) naturally trips all devices behind that host but leaves other hosts untouched.
---
## OPC-014: Authorization Gate and Permission Trie
`Security.AuthorizationGate` shall enforce node-level permissions on every browse, read, write, subscribe, alarm-ack, and history call before dispatch.
### Acceptance Criteria
- Permission bits for the session are assembled at login from LDAP group → role → permission mapping plus Config-DB `NodeAcl` rows that modify permission inheritance along the browse tree.
- The permission trie walks from the addressed node toward the root, inheriting permissions unless a `NodeAcl` overrides; first match wins.
- Missing `Read` bit → `Bad_UserAccessDenied` on Read / Subscribe / HistoryRead.
- Missing `Write*` bit (matching the tag's `SecurityClassification`) → `Bad_UserAccessDenied` on Write.
- Missing `AlarmAck` bit → `Bad_UserAccessDenied` on acknowledge.
- Authorization decisions are made at the server layer only — drivers never enforce authorization and only expose `SecurityClassification` metadata.
---
## OPC-015: ServiceLevel Reporting
The server shall expose a dynamic `ServiceLevel` value computed by `RedundancyCoordinator` + `ServiceLevelCalculator`.
### Acceptance Criteria
- `ServiceLevel` reflects: redundancy role (Primary higher than Secondary), publish state (current generation applied > mid-apply > failed-apply), driver health (any driver instance in open circuit lowers the value), apply-lease state.
- `ServiceLevel` is exposed as a Variable under the standard `Server` object and is readable by any authenticated client.
- Clients that observe Primary's `ServiceLevel` drop below Secondary's should failover per the OPC UA spec.
- Single-node deployments (`NodeCount = 1`) always publish their node as Primary.
---
## OPC-016: Session Management
The server shall support multiple concurrent OPC UA client sessions.
### Acceptance Criteria
- Maximum concurrent sessions and session timeout come from Config DB cluster settings (default 100 sessions, 30-minute idle timeout).
- Expired sessions are cleaned up and their subscriptions and monitored items removed.
- Active session count is reported as a Prometheus gauge on the Admin `/metrics` endpoint.
---
## OPC-017: Address Space Rebuild on Generation Apply
When a new Config DB generation is applied, the server shall surgically update only the affected driver subtrees.
### Acceptance Criteria
- Apply compares the previous generation to the incoming generation and produces per-driver add / modify / remove sets.
- Existing OPC UA sessions, subscriptions, and monitored items are preserved across apply whenever the target NodeId survives the generation change.
- Tags that no longer exist post-apply emit `Bad_NodeIdUnknown` on their subscribed monitored items.
- During apply, the node's `ServiceLevel` is lowered (per `ServiceLevelCalculator`) so redundancy partners temporarily take precedence.
- Galaxy subtree rebuilds triggered by `IRediscoverable` (Galaxy deployment change) are scoped to the Galaxy driver's namespace and follow the same preservation rule (OPC-006 from the v1 file, now subsumed).
---
## OPC-018: Server Diagnostics Nodes
The server shall expose standard OPC UA `Server` object nodes required by the spec.
### Acceptance Criteria
- `ServerStatus` / `ServerState` / `CurrentTime` / `StartTime` populated and compliant with the OPC UA 1.05 spec.
- `ServerCapabilities` declares historical access capabilities for namespaces that have an `IHistoryProvider`-backed driver.
- `ServerRedundancy.RedundancySupport` reflects the cluster's redundancy mode (`None` / `Warm` / `Hot`).
- `ServerRedundancy.ServerUriArray` lists both cluster members' `ApplicationUri` values.
---
## OPC-019: Observability Hooks
The server shall emit OpenTelemetry metrics consumed by the Admin `/metrics` Prometheus endpoint.
### Acceptance Criteria
- Counters: capability calls per `DriverInstanceId` + `DriverCapability`, OPC UA requests per method, alarm events emitted, history reads, generation apply attempts.
- Histograms: capability-call duration per `DriverInstanceId` + `DriverCapability`, OPC UA request duration per method.
- Gauges: circuit-breaker state per pipeline, active OPC UA sessions, active monitored items, subscription queue depth, `ServiceLevel` value, memory-tracking watermarks (Phase 6.1).
- Metric cardinality is bounded — `DriverInstanceId` and `HostName` are the only high-cardinality labels, both controlled by the Config DB.