Per-file summary:

- docs/reqs/OpcUaServerReqs.md: rewritten driver-agnostic. OPC-001..OPC-013 re-scoped to multi-driver address-space composition + capability dispatch; OPC-014 AuthorizationGate + permission trie; OPC-015 dynamic ServiceLevel via RedundancyCoordinator; OPC-017 surgical generation-apply rebuild; OPC-012 capability dispatch via CapabilityInvoker (decision #143 idempotence-aware retry); OPC-013 per-host Polly isolation (decision #144); OPC-019 OpenTelemetry metrics. Transport-security profile matrix (OPC-010) + UserName/LDAP (OPC-011) preserved.
- docs/reqs/GalaxyRepositoryReqs.md: scope clarified as Galaxy-driver-only (not platform). GR-001..GR-004 tied to ITagDiscovery.DiscoverAsync + IRediscoverable; all SQL runs inside OtOpcUa.Galaxy.Host and streams to Proxy via named pipe. GR-008 capability wrapping via CapabilityInvoker added. Cross-links to docs/v2/driver-specs.md + docs/GalaxyRepository.md.
- docs/reqs/MxAccessClientReqs.md: scope clarified as Galaxy-Host-only. MXA-001..MXA-009 preserved (STA pump, register/unregister, subscription refcount, auto-reconnect, probe, COM cleanup, operation metrics, error translation). MXA-010 Proxy-side capability wrapping + MXA-011 pipe ACL + per-process shared secret (OTOPCUA_ALLOWED_SID / OTOPCUA_GALAXY_SECRET) added.
- docs/reqs/ServiceHostReqs.md: rewritten for three-process deployment. Shared section (SVC-SHARED-001/002) for Serilog + bootstrap-only appsettings. SRV-* for OtOpcUa.Server (net10 x64, Microsoft.Extensions.Hosting + AddWindowsService, in-process driver hosting, redundancy-node bootstrap). ADM-* for OtOpcUa.Admin (Blazor Server, cookie+LDAP auth, CanEdit/CanPublish policies, sole DB writer, Prometheus /metrics, audit logging). GHX-* for OtOpcUa.Galaxy.Host (TopShelf, net48 x86, named-pipe IPC bootstrap, STA backend lifecycle, crash handling tied to supervisor).
- docs/reqs/ClientRequirements.md: restructured as numbered, verifiable requirements. SHR-* for Client.Shared (single IOpcUaClientService, ConnectionSettings, failover, cross-platform certs, type-coercing write, UI-thread neutrality). CLI-001..CLI-011 cover connect/read/write/browse/subscribe/historyread/alarms/redundancy. UI-001..UI-008 cover connection panel, tree browser, each tab, connection-state reflection, cross-platform build. Reference design content (IOpcUaClientService shape, models, view-model map, mock layout) preserved.
- docs/reqs/StatusDashboardReqs.md: retired cleanly. Replaced with a pointer to docs/v2/admin-ui.md + HLR-015 / HLR-016 / HLR-017 / ADM-*. Mapping table shows each retired DASH-001..DASH-009 requirement's replacement (live cluster-node view via SignalR, Prometheus metrics, driver-instance detail views, etc.). Note that a formal AdminUiReqs.md can be written later if needed for cert compliance.

HighLevelReqs.md was already at the target shape (HLR-001..HLR-018 with Revision header noting retired HLR-009) as of commit f217636; verified identical and no additional edit required.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
# OPC UA Server — Component Requirements
> **Revision:** Refreshed 2026-04-19 for the OtOpcUa v2 multi-driver platform (task #205). OPC-001 through OPC-013 have been rewritten to be driver-agnostic: they now describe how the core OPC UA server composes multiple driver subtrees and invokes capabilities through the Polly-wrapped dispatch path, including the alarm and history surfaces, server-certificate management, the transport-security profile matrix, capability dispatch with idempotence-aware write retry (OPC-012), and per-host Polly isolation (OPC-013). OPC-014 through OPC-019 are new and cover `AuthorizationGate` and the permission trie, `ServiceLevel` reporting, session management, address-space rebuild on generation apply, server diagnostics nodes, and observability hooks. Galaxy-specific behavior has been moved out to `GalaxyRepositoryReqs.md` and `MxAccessClientReqs.md`.

Parent: [HLR-001](HighLevelReqs.md#hlr-001-opc-ua-server), [HLR-003](HighLevelReqs.md#hlr-003-address-space-composition-per-namespace), [HLR-009](HighLevelReqs.md#hlr-009-transport-security-and-authentication), [HLR-010](HighLevelReqs.md#hlr-010-per-driver-instance-resilience), [HLR-013](HighLevelReqs.md#hlr-013-cluster-redundancy)
## OPC-001: Server Endpoint

The OPC UA server shall listen on a configurable TCP endpoint using the OPC Foundation .NET Standard stack and expose a single endpoint URL per cluster node.

### Acceptance Criteria

- Endpoint URL comes from `ClusterNode.EndpointUrl` in the Config DB (default form `opc.tcp://<hostname>:<port>/OtOpcUa`).
- `ApplicationName` and `ApplicationUri` come from `ClusterNode` fields; `ApplicationUri` is unique per node so redundancy `ServerUriArray` entries are distinguishable.
- Port defaults to 4840. If the port is in use at startup, the server shall log at Error level and fail to start (no silent port reassignment).
- Uses the `OPCFoundation.NetStandard.Opc.Ua.Server` NuGet package.
- Endpoint URL is logged at Information level on startup.

### Details

- Node-local `appsettings.json` carries only the `Config DB connection + NodeId + ClusterId` bootstrap; the actual endpoint topology comes from the Config DB per HLR-011.
---

## OPC-002: Address Space Composition

The server shall compose an address space by mounting each active driver instance's subtree under a dedicated OPC UA namespace.

### Acceptance Criteria

- Each `DriverInstance` in the current published generation registers one `IDriver` implementation in the core.
- Each driver's `ITagDiscovery.DiscoverAsync` result is streamed into the core through `IAddressSpaceBuilder` (`AddFolder` / `AddVariable` calls); the driver does not buffer the whole tree.
- Each driver instance gets its own namespace index; `NamespaceUri` comes from the `Namespace` row in the Config DB.
- Each cluster has at most one namespace per `Kind` (`Equipment`, `SystemPlatform`, future `Simulated`); enforced by UNIQUE on `(ClusterId, Kind)` in the DB.
- Galaxy driver subtree preserves the contained-name browse structure from the deployed Galaxy (moved to `GalaxyRepositoryReqs.md`).
- Equipment-kind drivers populate the canonical 5-level UNS structure, with each signal as a leaf under the five hierarchy levels (`Enterprise/Site/Area/Line/Equipment/Signal`).
---

## OPC-003: Variable Nodes and Access Levels

Each tag produced by a driver's `ITagDiscovery` shall become an OPC UA variable node.

### Acceptance Criteria

- Variable node `BrowseName` and `DisplayName` come from `DriverAttributeInfo`.
- `DataType` is resolved from `DriverDataType` per each driver's spec in `docs/v2/driver-specs.md`.
- `AccessLevel` and `UserAccessLevel` are derived from the tag's `SecurityClassification` and the session's effective permissions walked through the node-ACL permission trie (see OPC-014 `AuthorizationGate`).
- Scalar attributes produce `ValueRank = Scalar`; array attributes produce `ValueRank = OneDimension` with `ArrayDimensions` set from the driver's attribute info.
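The value-rank rule in the last criterion can be sketched as follows. This is an illustrative Python sketch, not the server's implementation: `AttributeInfo` and `value_rank_for` are hypothetical names standing in for the driver's attribute metadata. The numeric constants (`-1` for Scalar, `1` for OneDimension) are the standard OPC UA `ValueRank` values, and a dimension length of `0` is used here to mean "length not known".

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Standard OPC UA ValueRank constants.
VALUE_RANK_SCALAR = -1
VALUE_RANK_ONE_DIMENSION = 1


@dataclass(frozen=True)
class AttributeInfo:
    """Hypothetical stand-in for the driver's attribute metadata."""
    name: str
    is_array: bool
    array_length: Optional[int] = None


def value_rank_for(attr: AttributeInfo) -> Tuple[int, Optional[Tuple[int, ...]]]:
    """Return (ValueRank, ArrayDimensions) for a discovered attribute."""
    if not attr.is_array:
        # Scalars carry no ArrayDimensions.
        return VALUE_RANK_SCALAR, None
    # One-dimensional array; 0 marks a dimension whose length is not known.
    return VALUE_RANK_ONE_DIMENSION, (attr.array_length or 0,)
```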
---

## OPC-004: Namespace Index Allocation

The server shall register one OPC UA namespace per active driver instance.

### Acceptance Criteria

- Namespace index 0 remains the standard OPC UA namespace.
- Each driver instance's `Namespace.Uri` becomes a registered namespace; its index is assigned deterministically at startup from the published generation's driver ordering.
- All variable NodeIds use the driver's namespace index; NodeId identifiers are string-shaped and stable across restarts of the same generation.
- Namespace index reshuffles are a publish-time concern; clients reconciling server-relative NodeIds must re-resolve namespace indexes from their namespace URIs after a new generation is applied.
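A minimal sketch of deterministic index allocation, assuming the published generation yields an ordered list of namespace URIs. The function names and the example URIs in the test are hypothetical. Starting driver namespaces at index 2 is also an assumption: it reserves index 1 for the server's own local namespace, which this document does not specify.

```python
from typing import Dict, List

OPC_UA_STANDARD_NS = "http://opcfoundation.org/UA/"  # always index 0


def allocate_namespace_indexes(
    driver_namespace_uris: List[str], first_index: int = 2
) -> Dict[str, int]:
    """Assign indexes in the published generation's driver order.

    first_index=2 is an assumption (index 1 left for the server's local namespace).
    """
    table = {OPC_UA_STANDARD_NS: 0}
    next_index = first_index
    for uri in driver_namespace_uris:
        if uri in table:
            raise ValueError(f"duplicate namespace URI: {uri}")
        table[uri] = next_index
        next_index += 1
    return table


def string_node_id(table: Dict[str, int], uri: str, identifier: str) -> str:
    """Build a string-shaped NodeId in the driver's namespace."""
    return f"ns={table[uri]};s={identifier}"
```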
---

## OPC-005: Read Operations

The server shall fulfill OPC UA `Read` requests by invoking `IReadable.ReadAsync` on the target driver instance, dispatched through `CapabilityInvoker`.

### Acceptance Criteria

- Every read is dispatched through `Core.Resilience.CapabilityInvoker.InvokeAsync(DriverCapability.Read, …)`.
- Returned `DataValueSnapshot` is converted to an OPC UA `DataValue` with `StatusCode`, source timestamp, and server timestamp.
- If the owning driver instance's Polly circuit is open, the read returns Bad quality immediately without hitting the wire.
- Reads on a node the session has no `Read` bit for in the permission trie return `Bad_UserAccessDenied` before the capability is invoked (OPC-014).
- Read timeout is the Polly timeout leg on the `Read` capability; its duration is per-`(DriverInstanceId, HostName)` and comes from the Config DB.
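The ordering implied by these criteria (authorization first, then the circuit check, then the driver call) can be illustrated with a small Python sketch. `dispatch_read` and its parameters are hypothetical; the generic `"Bad"` status stands in for whatever Bad-quality code the server actually returns on an open circuit, which the criteria do not name.

```python
from typing import Callable, FrozenSet, Tuple

BAD_USER_ACCESS_DENIED = "Bad_UserAccessDenied"
BAD = "Bad"    # placeholder for the unspecified Bad-quality status code
GOOD = "Good"


def dispatch_read(
    session_bits: FrozenSet[str],
    circuit_open: bool,
    read_from_driver: Callable[[], object],
) -> Tuple[object, str]:
    """Return (value, status) for one read request."""
    # 1. Authorization comes first: without the Read bit nothing is invoked.
    if "Read" not in session_bits:
        return None, BAD_USER_ACCESS_DENIED
    # 2. An open circuit answers immediately without touching the wire.
    if circuit_open:
        return None, BAD
    # 3. Only now is the driver capability called.
    return read_from_driver(), GOOD
```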
---

## OPC-006: Write Operations

The server shall fulfill OPC UA `Write` requests by invoking `IWritable.WriteAsync` through `CapabilityInvoker` with **idempotence-aware** retry policy.

### Acceptance Criteria

- Writes dispatch through `CapabilityInvoker.InvokeAsync(DriverCapability.Write, …)`.
- Writes **do not auto-retry** unless the tag's `TagConfig.WriteIdempotent = true`, or the driver's capability is marked with `[WriteIdempotent]` (decision #143).
- Writes on a node the session lacks the required permission bit for (`WriteOperate`, `WriteTune`, or `WriteConfigure` derived from the tag's `SecurityClassification`) return `Bad_UserAccessDenied` before the capability runs.
- A write into an open circuit returns a driver-shaped error (`Bad_NoCommunication` / `Bad_ServerNotConnected`) without hitting the wire.
- The server shall coerce the written OPC UA value to the driver's expected native type using the node's `DriverDataType` before calling `WriteAsync`.
- Writes to a NodeId not currently in the address space return `Bad_NodeIdUnknown`.
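The idempotence-aware retry rule can be sketched as below. This is a simplified Python illustration rather than the Polly-based implementation: `invoke_write`, `TransientDriverError`, and the `max_retries` default are hypothetical. The point it shows is that a non-idempotent write gets exactly one attempt, because repeating it after an ambiguous failure could apply the value twice.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class TransientDriverError(Exception):
    """Hypothetical marker for a retryable failure (timeout, dropped link)."""


def invoke_write(
    write: Callable[[], T], *, write_idempotent: bool, max_retries: int = 2
) -> T:
    """Run a write once, retrying only when it is declared idempotent."""
    attempts = 1 + (max_retries if write_idempotent else 0)
    last_error: Optional[TransientDriverError] = None
    for _ in range(attempts):
        try:
            return write()
        except TransientDriverError as exc:
            last_error = exc  # remember and try again if attempts remain
    assert last_error is not None
    raise last_error
```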
---

## OPC-007: Subscriptions and Monitored Items

The server shall map OPC UA `CreateMonitoredItems` / `DeleteMonitoredItems` to `ISubscribable.SubscribeAsync` / `UnsubscribeAsync` on the owning driver instance.

### Acceptance Criteria

- Subscription setup dispatches through `CapabilityInvoker.InvokeAsync(DriverCapability.Subscribe, …)`.
- Two OPC UA monitored items against the same tag produce exactly one driver-side subscription (ref-counted); last unsubscribe releases the driver-side resource.
- `OnDataChange` callbacks from the driver arrive as `DataValueSnapshot` and are forwarded to all OPC UA monitored items on that tag.
- Driver-side quality maps to OPC UA `StatusCode` per the driver's spec.
- When the owning driver's circuit opens, subscribed items publish Bad quality; when the circuit closes again, publishing resumes with the cached or freshly sampled value.
- Across generation applies that preserve a tag's NodeId, existing OPC UA monitored items are preserved (no re-subscribe required on the client).
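The ref-counting rule (many monitored items, one driver-side subscription per tag) can be sketched in Python as follows. `SubscriptionRefCounter` and its callbacks are hypothetical names; a real implementation would call `SubscribeAsync` / `UnsubscribeAsync` and need synchronization, both omitted here.

```python
from typing import Callable, Dict


class SubscriptionRefCounter:
    """Tracks monitored items per tag and subscribes driver-side only once."""

    def __init__(
        self, subscribe: Callable[[str], None], unsubscribe: Callable[[str], None]
    ) -> None:
        self._subscribe = subscribe
        self._unsubscribe = unsubscribe
        self._counts: Dict[str, int] = {}

    def add_monitored_item(self, tag: str) -> None:
        count = self._counts.get(tag, 0)
        if count == 0:
            # First monitored item on this tag: create the driver subscription.
            self._subscribe(tag)
        self._counts[tag] = count + 1

    def remove_monitored_item(self, tag: str) -> None:
        count = self._counts.get(tag, 0)
        if count == 0:
            raise KeyError(tag)
        if count == 1:
            # Last monitored item gone: release the driver-side resource.
            del self._counts[tag]
            self._unsubscribe(tag)
        else:
            self._counts[tag] = count - 1
```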
---

## OPC-008: Alarm Surface

The server shall expose the OPC UA alarm and condition model backed by each driver's `IAlarmSource` (where implemented).

### Acceptance Criteria

- Drivers implementing `IAlarmSource` (today: Galaxy, FOCAS, OPC UA Client) produce alarm events that the core maps onto OPC UA `ConditionType` / `AlarmConditionType` instances in the driver's namespace.
- `AlarmSubscribe` dispatches through `CapabilityInvoker.InvokeAsync(DriverCapability.AlarmSubscribe, …)` and retries on transient failure.
- `AlarmAcknowledge` from the OPC UA client dispatches through `CapabilityInvoker.InvokeAsync(DriverCapability.AlarmAcknowledge, …)` and **does not retry** (decision #143: ack is a write-shaped operation).
- Alarm-ack requires the `AlarmAck` permission bit for the tag / equipment node; otherwise `Bad_UserAccessDenied`.
- Drivers that do not implement `IAlarmSource` contribute no alarm nodes; the core does not synthesize placeholder conditions.
---

## OPC-009: Historical Access

The server shall surface OPC UA Historical Access (HA) via each driver's `IHistoryProvider` (where implemented).

### Acceptance Criteria

- `HistoryRead` for `Raw`, `Processed`, `AtTime`, and `Events` dispatches through `CapabilityInvoker.InvokeAsync(DriverCapability.HistoryRead, …)`.
- Drivers implementing `IHistoryProvider` today: Galaxy (Wonderware Historian), OPC UA Client (proxy to remote historian).
- Drivers not implementing `IHistoryProvider` return `Bad_HistoryOperationUnsupported` for history requests on their nodes.
- History reads require the `Read` permission bit on the target node.
---

## OPC-010: Transport Security Profiles

The server shall offer OPC UA transport-security profiles resolved at startup by `SecurityProfileResolver`.

### Acceptance Criteria

- Supported profiles: `None`, `Basic256Sha256-Sign`, `Basic256Sha256-SignAndEncrypt`, `Aes128_Sha256_RsaOaep-Sign`, `Aes128_Sha256_RsaOaep-SignAndEncrypt`, `Aes256_Sha256_RsaPss-Sign`, `Aes256_Sha256_RsaPss-SignAndEncrypt`.
- Active profile list comes from `OpcUa.SecurityProfile` in `appsettings.json` (bootstrap config) or Config DB (per-cluster override).
- Server certificate is created at first startup even when only `None` is enabled, because UserName-token encryption depends on an ApplicationInstanceCertificate.
- Certificate store root path is configurable (default `%ProgramData%/OtOpcUa/pki/`).
- `AutoAcceptUntrustedClientCertificates` is a config flag; production deployments set it to `false` and operators add trusted client certs via the Admin UI Cert Trust screen.
---

## OPC-011: UserName Authentication

The server shall validate `UserNameIdentityToken` credentials against LDAP (production: Active Directory; dev: GLAuth).

### Acceptance Criteria

- If `Ldap.Enabled = false`, all UserName tokens are rejected (`Bad_UserAccessDenied`).
- When enabled, the server performs an LDAP bind using the supplied credentials via `LdapUserAuthenticator`.
- On successful bind, group memberships resolved from LDAP are mapped through `LdapOptions.GroupToRole` to produce the session's permission bits (`ReadOnly`, `WriteOperate`, `WriteTune`, `WriteConfigure`, `AlarmAck`).
- `LdapAuthenticationProvider` implements both `IUserAuthenticationProvider` and `IRoleProvider`.
- UserName tokens are always encrypted in transit: either the secure channel is Sign-and-Encrypt, or the token itself is encrypted with the server certificate (including on a `None` channel).
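The group-to-role mapping step can be illustrated with a short Python sketch. The `Permission` flag names mirror the permission bits listed above; `permissions_for` and the group names used in the test are hypothetical, and the real `LdapOptions.GroupToRole` shape may differ.

```python
from enum import Flag, auto
from typing import Dict, Iterable


class Permission(Flag):
    NONE = 0
    READ_ONLY = auto()
    WRITE_OPERATE = auto()
    WRITE_TUNE = auto()
    WRITE_CONFIGURE = auto()
    ALARM_ACK = auto()


def permissions_for(
    ldap_groups: Iterable[str], group_to_role: Dict[str, Permission]
) -> Permission:
    """Union the permission bits of every mapped group; unmapped groups add nothing."""
    result = Permission.NONE
    for group in ldap_groups:
        result |= group_to_role.get(group, Permission.NONE)
    return result
```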
---

## OPC-012: Capability Dispatch via CapabilityInvoker

Every async capability-interface call the server makes shall route through `Core.Resilience.CapabilityInvoker`.

### Acceptance Criteria

- `CapabilityInvoker.InvokeAsync` resolves a Polly resilience pipeline keyed on `(DriverInstanceId, HostName, DriverCapability)`.
- Read / Discover / Probe / Subscribe / AlarmSubscribe / HistoryRead pipelines carry Timeout + Retry + CircuitBreaker strategies.
- Write / AlarmAcknowledge pipelines carry Timeout + CircuitBreaker only; Retry is enabled only when the tag or capability carries `[WriteIdempotent]` (decision #143).
- Roslyn diagnostic **OTOPCUA0001** fires on any direct call to a capability-interface method from outside `CapabilityInvoker` (enforced via `ZB.MOM.WW.OtOpcUa.Analyzers`).
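The strategy selection described above can be summarized in a small Python sketch. It only models which strategies a pipeline carries, not Polly itself; `strategies_for` is a hypothetical helper, and the tuple order simply follows the order in which the criteria list the strategies.

```python
from enum import Enum
from typing import Tuple


class DriverCapability(Enum):
    READ = "Read"
    DISCOVER = "Discover"
    PROBE = "Probe"
    SUBSCRIBE = "Subscribe"
    ALARM_SUBSCRIBE = "AlarmSubscribe"
    HISTORY_READ = "HistoryRead"
    WRITE = "Write"
    ALARM_ACKNOWLEDGE = "AlarmAcknowledge"


# Capabilities that change state on the device and so must not be replayed blindly.
WRITE_SHAPED = frozenset({DriverCapability.WRITE, DriverCapability.ALARM_ACKNOWLEDGE})


def strategies_for(
    capability: DriverCapability, write_idempotent: bool = False
) -> Tuple[str, ...]:
    """Return the strategy names a pipeline for this capability would carry."""
    if capability in WRITE_SHAPED and not write_idempotent:
        return ("Timeout", "CircuitBreaker")
    return ("Timeout", "Retry", "CircuitBreaker")
```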
---

## OPC-013: Per-Host Polly Isolation

Polly pipelines shall be keyed per `(DriverInstanceId, HostName, DriverCapability)` so that a failing device in one driver does not trip the circuit for another device on the same driver or any other driver (decision #144).

### Acceptance Criteria

- A driver serving `N` devices has `N × capabilityCount` distinct pipelines.
- Circuit-breaker state transitions are telemetry-published per pipeline and appear on the Admin UI + `/metrics`.
- A host-scope fault (e.g. shared PLC gateway) naturally trips all devices behind that host but leaves other hosts untouched.
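The isolation property follows from the key alone, as this Python sketch illustrates. `Pipeline` is a toy consecutive-failure breaker, not Polly's circuit breaker, and the failure threshold of 3 is an arbitrary assumption; `PipelineRegistry` is a hypothetical name.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

PipelineKey = Tuple[str, str, str]  # (DriverInstanceId, HostName, DriverCapability)


@dataclass
class Pipeline:
    failure_threshold: int = 3      # assumed value, for illustration only
    consecutive_failures: int = 0

    @property
    def circuit_open(self) -> bool:
        return self.consecutive_failures >= self.failure_threshold

    def record_failure(self) -> None:
        self.consecutive_failures += 1

    def record_success(self) -> None:
        self.consecutive_failures = 0


class PipelineRegistry:
    """One pipeline per key, created lazily on first use."""

    def __init__(self) -> None:
        self._pipelines: Dict[PipelineKey, Pipeline] = {}

    def get(self, driver_instance_id: str, host_name: str, capability: str) -> Pipeline:
        key = (driver_instance_id, host_name, capability)
        if key not in self._pipelines:
            self._pipelines[key] = Pipeline()
        return self._pipelines[key]

    def __len__(self) -> int:
        return len(self._pipelines)
```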
---

## OPC-014: Authorization Gate and Permission Trie

`Security.AuthorizationGate` shall enforce node-level permissions on every browse, read, write, subscribe, alarm-ack, and history call before dispatch.

### Acceptance Criteria

- Permission bits for the session are assembled at login from LDAP group → role → permission mapping plus Config-DB `NodeAcl` rows that modify permission inheritance along the browse tree.
- The permission trie walks from the addressed node toward the root, inheriting permissions unless a `NodeAcl` overrides; first match wins.
- Missing `Read` bit → `Bad_UserAccessDenied` on Read / Subscribe / HistoryRead.
- Missing `Write*` bit (matching the tag's `SecurityClassification`) → `Bad_UserAccessDenied` on Write.
- Missing `AlarmAck` bit → `Bad_UserAccessDenied` on acknowledge.
- Authorization decisions are made at the server layer only: drivers never enforce authorization and only expose `SecurityClassification` metadata.
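The node-to-root walk with "first match wins" can be sketched in Python. This flattens the trie into a dictionary keyed by browse path, which is an implementation shortcut for illustration; `effective_permissions`, `authorize`, and the example paths in the test are hypothetical. The sketch also assumes a matching `NodeAcl` replaces the inherited bits outright, which the criteria do not spell out.

```python
from typing import Dict, FrozenSet, Tuple

BrowsePath = Tuple[str, ...]  # () is the root


def effective_permissions(
    node_path: BrowsePath,
    node_acls: Dict[BrowsePath, FrozenSet[str]],
    session_bits: FrozenSet[str],
) -> FrozenSet[str]:
    """Walk from the addressed node toward the root; the nearest NodeAcl wins."""
    path = node_path
    while True:
        if path in node_acls:
            return node_acls[path]
        if not path:
            # Reached the root without an override: inherit the session's bits.
            return session_bits
        path = path[:-1]


def authorize(
    node_path: BrowsePath,
    required_bit: str,
    node_acls: Dict[BrowsePath, FrozenSet[str]],
    session_bits: FrozenSet[str],
) -> str:
    """Return the status the gate would produce before any dispatch."""
    granted = effective_permissions(node_path, node_acls, session_bits)
    return "Good" if required_bit in granted else "Bad_UserAccessDenied"
```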
---

## OPC-015: ServiceLevel Reporting

The server shall expose a dynamic `ServiceLevel` value computed by `RedundancyCoordinator` + `ServiceLevelCalculator`.

### Acceptance Criteria

- `ServiceLevel` reflects: redundancy role (Primary higher than Secondary), publish state (current generation applied > mid-apply > failed-apply), driver health (any driver instance in open circuit lowers the value), apply-lease state.
- `ServiceLevel` is exposed as a Variable under the standard `Server` object and is readable by any authenticated client.
- Clients that observe Primary's `ServiceLevel` drop below Secondary's should fail over per the OPC UA spec.
- Single-node deployments (`NodeCount = 1`) always publish their node as Primary.
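One way to combine these inputs into a single byte value is sketched below in Python. Every number here is an invented assumption chosen only to preserve the orderings stated above; the actual `ServiceLevelCalculator` weights are not given in this document, and apply-lease state is left out of the sketch entirely.

```python
# All weights below are illustrative assumptions, not the real calculator's values.
PRIMARY_BASE = 200
SECONDARY_BASE = 150
PUBLISH_PENALTY = {"applied": 0, "mid-apply": 60, "failed-apply": 120}
OPEN_CIRCUIT_PENALTY = 30


def service_level(is_primary: bool, publish_state: str, any_circuit_open: bool) -> int:
    """Return a ServiceLevel in the OPC UA Byte range 0..255."""
    level = PRIMARY_BASE if is_primary else SECONDARY_BASE
    level -= PUBLISH_PENALTY[publish_state]
    if any_circuit_open:
        level -= OPEN_CIRCUIT_PENALTY
    return max(0, min(255, level))
```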
---

## OPC-016: Session Management

The server shall support multiple concurrent OPC UA client sessions.

### Acceptance Criteria

- Maximum concurrent sessions and session timeout come from Config DB cluster settings (default 100 sessions, 30-minute idle timeout).
- Expired sessions are cleaned up and their subscriptions and monitored items removed.
- Active session count is reported as a Prometheus gauge on the Admin `/metrics` endpoint.
---

## OPC-017: Address Space Rebuild on Generation Apply

When a new Config DB generation is applied, the server shall surgically update only the affected driver subtrees.

### Acceptance Criteria

- Apply compares the previous generation to the incoming generation and produces per-driver add / modify / remove sets.
- Existing OPC UA sessions, subscriptions, and monitored items are preserved across apply whenever the target NodeId survives the generation change.
- Tags that no longer exist post-apply emit `Bad_NodeIdUnknown` on their subscribed monitored items.
- During apply, the node's `ServiceLevel` is lowered (per `ServiceLevelCalculator`) so redundancy partners temporarily take precedence.
- Galaxy subtree rebuilds triggered by `IRediscoverable` (Galaxy deployment change) are scoped to the Galaxy driver's namespace and follow the same preservation rule (OPC-006 from the v1 file, now subsumed).
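The per-driver add / modify / remove comparison can be sketched in Python. The generation shape used here (driver id mapped to a dictionary of node identifier to tag configuration) is an assumption for illustration, as are the names `ChangeSet` and `diff_generations`. Drivers with no changes are omitted from the result, which is what keeps the rebuild surgical.

```python
from dataclasses import dataclass, field
from typing import Dict, Set

# Assumed shape: driver instance id -> {node identifier -> tag config}.
Generation = Dict[str, Dict[str, dict]]


@dataclass
class ChangeSet:
    added: Set[str] = field(default_factory=set)
    modified: Set[str] = field(default_factory=set)
    removed: Set[str] = field(default_factory=set)

    def is_empty(self) -> bool:
        return not (self.added or self.modified or self.removed)


def diff_generations(previous: Generation, incoming: Generation) -> Dict[str, ChangeSet]:
    """Return a ChangeSet per driver that actually changed."""
    changes: Dict[str, ChangeSet] = {}
    for driver_id in sorted(previous.keys() | incoming.keys()):
        old = previous.get(driver_id, {})
        new = incoming.get(driver_id, {})
        change = ChangeSet(
            added=set(new) - set(old),
            removed=set(old) - set(new),
            # Nodes present in both generations whose config differs.
            modified={n for n in set(old) & set(new) if old[n] != new[n]},
        )
        if not change.is_empty():
            changes[driver_id] = change
    return changes
```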
---

## OPC-018: Server Diagnostics Nodes

The server shall expose standard OPC UA `Server` object nodes required by the spec.

### Acceptance Criteria

- `ServerStatus` / `ServerState` / `CurrentTime` / `StartTime` populated and compliant with the OPC UA 1.05 spec.
- `ServerCapabilities` declares historical access capabilities for namespaces that have an `IHistoryProvider`-backed driver.
- `ServerRedundancy.RedundancySupport` reflects the cluster's redundancy mode (`None` / `Warm` / `Hot`).
- `ServerRedundancy.ServerUriArray` lists both cluster members' `ApplicationUri` values.
---

## OPC-019: Observability Hooks

The server shall emit OpenTelemetry metrics consumed by the Admin `/metrics` Prometheus endpoint.

### Acceptance Criteria

- Counters: capability calls per `DriverInstanceId` + `DriverCapability`, OPC UA requests per method, alarm events emitted, history reads, generation apply attempts.
- Histograms: capability-call duration per `DriverInstanceId` + `DriverCapability`, OPC UA request duration per method.
- Gauges: circuit-breaker state per pipeline, active OPC UA sessions, active monitored items, subscription queue depth, `ServiceLevel` value, memory-tracking watermarks (Phase 6.1).
- Metric cardinality is bounded: `DriverInstanceId` and `HostName` are the only high-cardinality labels, both controlled by the Config DB.