lmxopcua/docs/reqs/OpcUaServerReqs.md
Joseph Doherty 48970af416 Doc refresh (task #205) — requirements updated for multi-driver OtOpcUa three-process deploy
Per-file summary:

- docs/reqs/OpcUaServerReqs.md — rewritten driver-agnostic. OPC-001..OPC-013 re-scoped to multi-driver address-space composition + capability dispatch; OPC-014 AuthorizationGate + permission trie; OPC-015 dynamic ServiceLevel via RedundancyCoordinator; OPC-017 surgical generation-apply rebuild; OPC-012 capability dispatch via CapabilityInvoker (decision #143 idempotence-aware retry); OPC-013 per-host Polly isolation (decision #144); OPC-019 OpenTelemetry metrics. Transport-security profile matrix (OPC-010) + UserName/LDAP (OPC-011) preserved.

- docs/reqs/GalaxyRepositoryReqs.md — scope clarified as Galaxy-driver-only (not platform). GR-001..GR-004 tied to ITagDiscovery.DiscoverAsync + IRediscoverable; all SQL runs inside OtOpcUa.Galaxy.Host and streams to Proxy via named pipe. GR-008 capability wrapping via CapabilityInvoker added. Cross-links to docs/v2/driver-specs.md + docs/GalaxyRepository.md.

- docs/reqs/MxAccessClientReqs.md — scope clarified as Galaxy-Host-only. MXA-001..MXA-009 preserved (STA pump, register/unregister, subscription refcount, auto-reconnect, probe, COM cleanup, operation metrics, error translation). MXA-010 Proxy-side capability wrapping + MXA-011 pipe ACL + per-process shared secret (OTOPCUA_ALLOWED_SID / OTOPCUA_GALAXY_SECRET) added.

- docs/reqs/ServiceHostReqs.md — rewritten for three-process deployment. Shared section (SVC-SHARED-001/002) for Serilog + bootstrap-only appsettings. SRV-* for OtOpcUa.Server (net10 x64, Microsoft.Extensions.Hosting + AddWindowsService, in-process driver hosting, redundancy-node bootstrap). ADM-* for OtOpcUa.Admin (Blazor Server, cookie+LDAP auth, CanEdit/CanPublish policies, sole DB writer, Prometheus /metrics, audit logging). GHX-* for OtOpcUa.Galaxy.Host (TopShelf, net48 x86, named-pipe IPC bootstrap, STA backend lifecycle, crash handling tied to supervisor).

- docs/reqs/ClientRequirements.md — restructured as numbered, verifiable requirements. SHR-* for Client.Shared (single IOpcUaClientService, ConnectionSettings, failover, cross-platform certs, type-coercing write, UI-thread neutrality). CLI-001..CLI-011 cover connect/read/write/browse/subscribe/historyread/alarms/redundancy. UI-001..UI-008 cover connection panel, tree browser, each tab, connection-state reflection, cross-platform build. Reference design content (IOpcUaClientService shape, models, view-model map, mock layout) preserved.

- docs/reqs/StatusDashboardReqs.md — retired cleanly. Replaced with a pointer to docs/v2/admin-ui.md + HLR-015 / HLR-016 / HLR-017 / ADM-*. Mapping table shows each retired DASH-001..DASH-009 requirement's replacement (live cluster-node view via SignalR, Prometheus metrics, driver-instance detail views, etc.). Note that a formal AdminUiReqs.md can be written later if needed for cert compliance.

HighLevelReqs.md was already at the target shape (HLR-001..HLR-018 with Revision header noting retired HLR-009) as of commit f217636; verified identical and no additional edit required.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 01:31:58 -04:00

OPC UA Server — Component Requirements

Revision — Refreshed 2026-04-19 for the OtOpcUa v2 multi-driver platform (task #205). OPC-001…OPC-013 have been rewritten driver-agnostically — they now describe how the core OPC UA server composes multiple driver subtrees, enforces authorization, and invokes capabilities through the Polly-wrapped dispatch path. OPC-014 through OPC-022 are new and cover capability dispatch, per-host Polly isolation, idempotence-aware write retry, AuthorizationGate, ServiceLevel reporting, the alarm surface, history surface, server-certificate management, and the transport-security profile matrix. Galaxy-specific behavior has been moved out to GalaxyRepositoryReqs.md and MxAccessClientReqs.md.

Parent: HLR-001, HLR-003, HLR-009, HLR-010, HLR-013

OPC-001: Server Endpoint

The OPC UA server shall listen on a configurable TCP endpoint using the OPC Foundation .NET Standard stack and expose a single endpoint URL per cluster node.

Acceptance Criteria

  • Endpoint URL comes from ClusterNode.EndpointUrl in the Config DB (default form opc.tcp://<hostname>:<port>/OtOpcUa).
  • ApplicationName and ApplicationUri come from ClusterNode fields; ApplicationUri is unique per node so redundancy ServerUriArray entries are distinguishable.
  • Port defaults to 4840. If the port is already in use at startup, the server shall log at Error level and fail to start (no silent port reassignment).
  • Uses OPCFoundation.NetStandard.Opc.Ua.Server NuGet.
  • Endpoint URL logged at Information level on startup.

Details

  • Node-local appsettings.json only carries the Config DB connection + NodeId + ClusterId bootstrap — actual endpoint topology comes from the Config DB per HLR-011.
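
As a minimal illustration of the default endpoint form above, the following Python sketch builds the URL from a hostname and port. The server itself is .NET; `build_endpoint_url` and `DEFAULT_PORT` are hypothetical names used only for this example.

```python
DEFAULT_PORT = 4840  # default port named in the acceptance criteria


def build_endpoint_url(hostname: str, port: int = DEFAULT_PORT) -> str:
    """Return the default-form URL opc.tcp://<hostname>:<port>/OtOpcUa."""
    if not hostname:
        raise ValueError("hostname must not be empty")
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return f"opc.tcp://{hostname}:{port}/OtOpcUa"
```

In the real deployment the value is read from ClusterNode.EndpointUrl rather than assembled locally; the helper only shows the shape of the default.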

OPC-002: Address Space Composition

The server shall compose an address space by mounting each active driver instance's subtree under a dedicated OPC UA namespace.

Acceptance Criteria

  • Each DriverInstance in the current published generation registers one IDriver implementation in the core.
  • Each driver's ITagDiscovery.DiscoverAsync result is streamed into the core via IAddressSpaceBuilder.AddFolder / AddVariable calls; the driver does not buffer the whole tree.
  • Each driver instance gets its own namespace index; NamespaceUri comes from the Namespace row in the Config DB.
  • Each cluster has at most one namespace per Kind (Equipment, SystemPlatform, future Simulated); enforced by UNIQUE on (ClusterId, Kind) in the DB.
  • Galaxy driver subtree preserves the contained-name browse structure from the deployed Galaxy (the detailed requirement now lives in GalaxyRepositoryReqs.md).
  • Equipment-kind drivers populate the canonical 5-level UNS structure (Enterprise/Site/Area/Line/Equipment), with each Signal as a leaf under its Equipment node.
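
The streaming requirement above (discovery results pushed into the builder as they are found, with no whole-tree buffer in the driver) can be sketched as follows. This is a Python illustration under assumptions: `RecordingBuilder` is a hypothetical stand-in for the core's builder, and slash-separated tag paths are an invented convention for the example.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Tuple


@dataclass
class RecordingBuilder:
    """Hypothetical stand-in for the core's address-space builder."""
    calls: List[Tuple[str, str]] = field(default_factory=list)

    def add_folder(self, path: str) -> None:
        self.calls.append(("folder", path))

    def add_variable(self, path: str) -> None:
        self.calls.append(("variable", path))


def discover(builder: RecordingBuilder, tags: Iterable[str]) -> int:
    """Push tags into the builder one at a time; return how many were added."""
    seen_folders = set()  # only folder paths are remembered, never variables
    count = 0
    for tag in tags:  # `tags` may be a lazy generator
        parts = tag.split("/")
        for depth in range(1, len(parts)):
            folder = "/".join(parts[:depth])
            if folder not in seen_folders:  # emit each folder once, parents first
                seen_folders.add(folder)
                builder.add_folder(folder)
        builder.add_variable(tag)
        count += 1
    return count
```

Because the function consumes an iterable, the caller can feed it a generator backed by a database cursor or a pipe without materializing the full tag list.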

OPC-003: Variable Nodes and Access Levels

Each tag produced by a driver's ITagDiscovery shall become an OPC UA variable node.

Acceptance Criteria

  • Variable node BrowseName and DisplayName come from DriverAttributeInfo.
  • DataType is resolved from DriverDataType per each driver's spec in docs/v2/driver-specs.md.
  • AccessLevel and UserAccessLevel are derived from the tag's SecurityClassification and the session's effective permissions walked through the node-ACL permission trie (see OPC-014 AuthorizationGate).
  • Scalar attributes produce ValueRank = Scalar; array attributes produce ValueRank = OneDimension with ArrayDimensions set from the driver's attribute info.
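
The scalar/array rule above maps onto the OPC UA ValueRank constants (Scalar is -1, OneDimension is 1). A small Python sketch, assuming a hypothetical convention where the driver reports `None` for scalars and an element count for arrays:

```python
from typing import List, Optional, Tuple

VALUE_RANK_SCALAR = -1        # OPC UA ValueRank for scalar values
VALUE_RANK_ONE_DIMENSION = 1  # OPC UA ValueRank for one-dimensional arrays


def value_rank_for(array_length: Optional[int]) -> Tuple[int, Optional[List[int]]]:
    """Return (ValueRank, ArrayDimensions) for a discovered attribute.

    array_length is None for scalars; this convention is an assumption
    of the sketch, not taken from DriverAttributeInfo.
    """
    if array_length is None:
        return VALUE_RANK_SCALAR, None
    if array_length < 0:
        raise ValueError("array length must be non-negative")
    return VALUE_RANK_ONE_DIMENSION, [array_length]
```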

OPC-004: Namespace Index Allocation

The server shall register one OPC UA namespace per active driver instance.

Acceptance Criteria

  • Namespace index 0 remains the standard OPC UA namespace.
  • Each driver instance's Namespace.Uri becomes a registered namespace; its index is assigned deterministically at startup from the published generation's driver ordering.
  • All variable NodeIds use the driver's namespace index; NodeId identifiers are string-shaped and stable across restarts of the same generation.
  • Namespace index reshuffles are a publish-time concern; clients reconciling server-relative NodeIds must re-resolve namespace URIs after a new generation is applied.
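
To illustrate why indexes are deterministic within a generation but may reshuffle across generations, here is a Python sketch of ordering-based allocation. The function name is hypothetical, and starting driver namespaces at index 2 is an assumption (OPC UA reserves index 1 for the local server's own URI).

```python
from typing import Dict, List

STANDARD_NAMESPACE_URI = "http://opcfoundation.org/UA/"  # always index 0


def allocate_namespace_indexes(ordered_driver_uris: List[str],
                               first_driver_index: int = 2) -> Dict[str, int]:
    """Assign consecutive indexes in the generation's driver ordering."""
    indexes: Dict[str, int] = {}
    next_index = first_driver_index
    for uri in ordered_driver_uris:
        if uri == STANDARD_NAMESPACE_URI:
            raise ValueError("a driver may not claim the standard namespace")
        if uri in indexes:
            raise ValueError(f"duplicate namespace URI: {uri}")
        indexes[uri] = next_index
        next_index += 1
    return indexes
```

The same ordering always yields the same indexes, while a new generation with a different ordering yields different ones. This is why clients should resolve NodeIds by namespace URI rather than by cached index.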

OPC-005: Read Operations

The server shall fulfill OPC UA Read requests by invoking IReadable.ReadAsync on the target driver instance, dispatched through CapabilityInvoker.

Acceptance Criteria

  • Every read is dispatched through Core.Resilience.CapabilityInvoker.InvokeAsync(DriverCapability.Read, …).
  • Returned DataValueSnapshot is converted to an OPC UA DataValue with StatusCode, source timestamp, and server timestamp.
  • If the owning driver instance's Polly circuit is open, the read returns Bad quality immediately without hitting the wire.
  • Reads on a node the session has no Read bit for in the permission trie return Bad_UserAccessDenied before the capability is invoked (OPC-014).
  • Read timeout is the Polly timeout leg on the Read capability; its duration is per-(DriverInstanceId, HostName) and comes from the Config DB.
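
The ordering implied by these criteria (authorization check, then circuit check, then the driver call) can be sketched in Python. All names here are hypothetical, and the set-based permission check stands in for the real permission trie.

```python
from dataclasses import dataclass
from typing import Callable, List, Set


@dataclass
class ReadResult:
    status: str
    value: object = None


def dispatch_read(node_id: str,
                  readable_nodes: Set[str],
                  circuit_open: bool,
                  read_from_driver: Callable[[str], object]) -> ReadResult:
    # 1. Authorization runs first, so a denied read never reaches the driver.
    if node_id not in readable_nodes:
        return ReadResult("Bad_UserAccessDenied")
    # 2. An open circuit short-circuits without hitting the wire.
    if circuit_open:
        return ReadResult("Bad")
    # 3. Only now is the driver invoked.
    return ReadResult("Good", read_from_driver(node_id))


# Usage: a fake driver that records which nodes were actually read.
driver_calls: List[str] = []


def fake_driver_read(node_id: str) -> object:
    driver_calls.append(node_id)
    return 42


denied = dispatch_read("n1", set(), False, fake_driver_read)
tripped = dispatch_read("n1", {"n1"}, True, fake_driver_read)
served = dispatch_read("n1", {"n1"}, False, fake_driver_read)
```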

OPC-006: Write Operations

The server shall fulfill OPC UA Write requests by invoking IWritable.WriteAsync through CapabilityInvoker with idempotence-aware retry policy.

Acceptance Criteria

  • Writes dispatch through CapabilityInvoker.InvokeAsync(DriverCapability.Write, …).
  • Writes do not auto-retry unless the tag's TagConfig.WriteIdempotent = true, or the driver's capability is marked with [WriteIdempotent] (decision #143).
  • Writes on a node the session lacks the required permission bit for (WriteOperate, WriteTune, or WriteConfigure derived from the tag's SecurityClassification) return Bad_UserAccessDenied before the capability runs.
  • A write into an open circuit returns a driver-shaped error (Bad_NoCommunication / Bad_ServerNotConnected) without hitting the wire.
  • The server shall coerce the written OPC UA value to the driver's expected native type using the node's DriverDataType before calling WriteAsync.
  • Writes to a NodeId not currently in the address space return Bad_NodeIdUnknown.
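
The idempotence-aware retry rule (decision #143) can be sketched as below. This is a Python illustration rather than the Polly configuration; `TransientError`, `invoke_write`, `FlakyWrite`, and the retry count are hypothetical.

```python
from typing import Callable


class TransientError(Exception):
    """Hypothetical marker for a retryable driver failure."""


def invoke_write(write: Callable[[], None],
                 write_idempotent: bool,
                 max_retries: int = 2) -> int:
    """Run write() and return the number of attempts made.

    Retries happen only when the write is marked idempotent; otherwise
    the first failure propagates to the caller.
    """
    attempts_allowed = 1 + (max_retries if write_idempotent else 0)
    for attempt in range(1, attempts_allowed + 1):
        try:
            write()
            return attempt
        except TransientError:
            if attempt == attempts_allowed:
                raise
    raise AssertionError("unreachable")


class FlakyWrite:
    """Test double that fails a fixed number of times, then succeeds."""

    def __init__(self, failures: int) -> None:
        self.failures = failures
        self.calls = 0

    def __call__(self) -> None:
        self.calls += 1
        if self.calls <= self.failures:
            raise TransientError("simulated transient failure")
```

The point of the default is safety: replaying a non-idempotent write (for example, an increment or a pulse command) could apply its effect twice on the device.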

OPC-007: Subscriptions and Monitored Items

The server shall map OPC UA CreateMonitoredItems / DeleteMonitoredItems to ISubscribable.SubscribeAsync / UnsubscribeAsync on the owning driver instance.

Acceptance Criteria

  • Subscription setup dispatches through CapabilityInvoker.InvokeAsync(DriverCapability.Subscribe, …).
  • Two OPC UA monitored items against the same tag produce exactly one driver-side subscription (ref-counted); last unsubscribe releases the driver-side resource.
  • OnDataChange callbacks from the driver arrive as DataValueSnapshot and are forwarded to all OPC UA monitored items on that tag.
  • Driver-side quality maps to OPC UA StatusCode per the driver's spec.
  • When the owning driver's circuit opens, subscribed items publish Bad quality; when the circuit closes again, the items resume publishing with the cached or freshly sampled value.
  • Across generation applies that preserve a tag's NodeId, existing OPC UA monitored items are preserved (no re-subscribe required on the client).
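
The ref-counting rule above (many monitored items, one driver-side subscription per tag) can be sketched as follows in Python. `SubscriptionRefCounter` and its callbacks are hypothetical names for illustration.

```python
from typing import Callable, Dict


class SubscriptionRefCounter:
    """Share one driver-side subscription among all monitored items on a tag."""

    def __init__(self,
                 subscribe: Callable[[str], None],
                 unsubscribe: Callable[[str], None]) -> None:
        self._subscribe = subscribe
        self._unsubscribe = unsubscribe
        self._counts: Dict[str, int] = {}

    def add_monitored_item(self, tag: str) -> None:
        count = self._counts.get(tag, 0)
        if count == 0:
            self._subscribe(tag)  # first item creates the driver subscription
        self._counts[tag] = count + 1

    def remove_monitored_item(self, tag: str) -> None:
        count = self._counts.get(tag, 0)
        if count == 0:
            raise KeyError(f"no monitored items on tag {tag!r}")
        if count == 1:
            del self._counts[tag]
            self._unsubscribe(tag)  # last item releases the driver resource
        else:
            self._counts[tag] = count - 1
```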

OPC-008: Alarm Surface

The server shall expose the OPC UA alarm and condition model backed by each driver's IAlarmSource (where implemented).

Acceptance Criteria

  • Drivers implementing IAlarmSource (today: Galaxy, FOCAS, OPC UA Client) produce alarm events that the core maps onto OPC UA ConditionType / AlarmConditionType instances in the driver's namespace.
  • AlarmSubscribe dispatches through CapabilityInvoker.InvokeAsync(DriverCapability.AlarmSubscribe, …) and retries on transient failure.
  • AlarmAcknowledge from the OPC UA client dispatches through CapabilityInvoker.InvokeAsync(DriverCapability.AlarmAcknowledge, …) and does not retry (decision #143 — ack is a write-shaped operation).
  • Alarm-ack requires the AlarmAck permission bit for the tag / equipment node; otherwise Bad_UserAccessDenied.
  • Drivers that do not implement IAlarmSource contribute no alarm nodes; the core does not synthesize placeholder conditions.

OPC-009: Historical Access

The server shall surface OPC UA Historical Access (HA) via each driver's IHistoryProvider (where implemented).

Acceptance Criteria

  • HistoryRead for Raw, Processed, AtTime, and Events dispatches through CapabilityInvoker.InvokeAsync(DriverCapability.HistoryRead, …).
  • Drivers implementing IHistoryProvider today: Galaxy (Wonderware Historian), OPC UA Client (proxy to remote historian).
  • Drivers not implementing IHistoryProvider return Bad_HistoryOperationUnsupported for history requests on their nodes.
  • History reads require the Read permission bit on the target node.

OPC-010: Transport Security Profiles

The server shall offer OPC UA transport-security profiles resolved at startup by SecurityProfileResolver.

Acceptance Criteria

  • Supported profiles: None, Basic256Sha256-Sign, Basic256Sha256-SignAndEncrypt, Aes128_Sha256_RsaOaep-Sign, Aes128_Sha256_RsaOaep-SignAndEncrypt, Aes256_Sha256_RsaPss-Sign, Aes256_Sha256_RsaPss-SignAndEncrypt.
  • Active profile list comes from OpcUa.SecurityProfile in appsettings.json (bootstrap config) or Config DB (per-cluster override).
  • Server certificate is created at first startup even when only None is enabled, because UserName-token encryption depends on an ApplicationInstanceCertificate.
  • Certificate store root path is configurable (default %ProgramData%/OtOpcUa/pki/).
  • AutoAcceptUntrustedClientCertificates is a config flag; production deployments set it to false and operators add trusted client certs via the Admin UI Cert Trust screen.

OPC-011: UserName Authentication

The server shall validate UserNameIdentityToken credentials against LDAP (production: Active Directory; dev: GLAuth).

Acceptance Criteria

  • If Ldap.Enabled = false, all UserName tokens are rejected (Bad_UserAccessDenied).
  • When enabled, the server performs an LDAP bind using the supplied credentials via LdapUserAuthenticator.
  • On successful bind, group memberships resolved from LDAP are mapped through LdapOptions.GroupToRole to produce the session's permission bits (ReadOnly, WriteOperate, WriteTune, WriteConfigure, AlarmAck).
  • LdapAuthenticationProvider implements both IUserAuthenticationProvider and IRoleProvider.
  • UserName tokens are always encrypted in transit: either by a SignAndEncrypt secure channel, or by token-level encryption with the server certificate when the channel itself is None.
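
The group-to-role mapping step can be sketched in Python as a union of permission bits. The `Permission` flag names follow the list above; the function name and the example group names are hypothetical, and the one-role-per-group mapping is an assumption of the sketch.

```python
from enum import Flag, auto
from typing import Dict, Iterable


class Permission(Flag):
    NONE = 0
    ReadOnly = auto()
    WriteOperate = auto()
    WriteTune = auto()
    WriteConfigure = auto()
    AlarmAck = auto()


def permissions_for(groups: Iterable[str],
                    group_to_role: Dict[str, str]) -> Permission:
    """Union the permission bits of every mapped LDAP group."""
    granted = Permission.NONE
    for group in groups:
        role = group_to_role.get(group)  # unmapped groups grant nothing
        if role is not None:
            granted |= Permission[role]
    return granted


# Usage with invented group names:
example_mapping = {"ot-viewers": "ReadOnly", "ot-operators": "WriteOperate"}
example_session = permissions_for(["ot-viewers", "ot-operators", "unrelated"],
                                  example_mapping)
```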

OPC-012: Capability Dispatch via CapabilityInvoker

Every async capability-interface call the server makes shall route through Core.Resilience.CapabilityInvoker.

Acceptance Criteria

  • CapabilityInvoker.InvokeAsync resolves a Polly resilience pipeline keyed on (DriverInstanceId, HostName, DriverCapability).
  • Read / Discover / Probe / Subscribe / AlarmSubscribe / HistoryRead pipelines carry Timeout + Retry + CircuitBreaker strategies.
  • Write / AlarmAcknowledge pipelines carry Timeout + CircuitBreaker only; Retry is enabled only when the tag or capability carries [WriteIdempotent] (decision #143).
  • Roslyn diagnostic OTOPCUA0001 fires on any direct call to a capability-interface method from outside CapabilityInvoker (enforced via ZB.MOM.WW.OtOpcUa.Analyzers).

OPC-013: Per-Host Polly Isolation

Polly pipelines shall be keyed per (DriverInstanceId, HostName, DriverCapability) so that a failing device in one driver does not trip the circuit for another device on the same driver or any other driver (decision #144).

Acceptance Criteria

  • A driver serving N devices has N × capabilityCount distinct pipelines.
  • Circuit-breaker state transitions are telemetry-published per pipeline and appear on the Admin UI + /metrics.
  • A host-scope fault (e.g. shared PLC gateway) naturally trips all devices behind that host but leaves other hosts untouched.
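
A minimal Python sketch of the keying scheme, showing that tripping one pipeline leaves its neighbors closed. `PipelineRegistry` is a hypothetical name, and the boolean circuit state is a simplification of a real circuit breaker.

```python
from typing import Dict, Tuple

# (DriverInstanceId, HostName, DriverCapability)
PipelineKey = Tuple[str, str, str]


class PipelineRegistry:
    """Track one circuit state per pipeline key."""

    def __init__(self) -> None:
        self._circuit_open: Dict[PipelineKey, bool] = {}

    def ensure(self, driver_instance_id: str, host_name: str,
               capability: str) -> PipelineKey:
        key = (driver_instance_id, host_name, capability)
        self._circuit_open.setdefault(key, False)  # created closed on first use
        return key

    def trip(self, key: PipelineKey) -> None:
        self._circuit_open[key] = True

    def is_open(self, key: PipelineKey) -> bool:
        return self._circuit_open[key]

    def __len__(self) -> int:
        return len(self._circuit_open)


# Usage: one driver, three hosts, three capabilities -> nine pipelines.
registry = PipelineRegistry()
hosts = ["plc-1", "plc-2", "plc-3"]
capabilities = ["Read", "Write", "Subscribe"]
keys = {(h, c): registry.ensure("driver-a", h, c)
        for h in hosts for c in capabilities}
registry.trip(keys[("plc-1", "Read")])
```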

OPC-014: Authorization Gate and Permission Trie

Security.AuthorizationGate shall enforce node-level permissions on every browse, read, write, subscribe, alarm-ack, and history call before dispatch.

Acceptance Criteria

  • Permission bits for the session are assembled at login from LDAP group → role → permission mapping plus Config-DB NodeAcl rows that modify permission inheritance along the browse tree.
  • Permission resolution walks the trie from the addressed node toward the root, inheriting permissions unless a NodeAcl overrides them; the first match wins.
  • Missing Read bit → Bad_UserAccessDenied on Read / Subscribe / HistoryRead.
  • Missing Write* bit (matching the tag's SecurityClassification) → Bad_UserAccessDenied on Write.
  • Missing AlarmAck bit → Bad_UserAccessDenied on acknowledge.
  • Authorization decisions are made at the server layer only — drivers never enforce authorization and only expose SecurityClassification metadata.
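
The leaf-to-root walk with "first match wins" can be sketched in Python as below. This is a simplification under assumptions: paths are slash-separated strings rather than a real trie, a NodeAcl is modeled as fully replacing the inherited set, and all names are hypothetical.

```python
from typing import Dict, FrozenSet


def effective_permissions(node_path: str,
                          node_acls: Dict[str, FrozenSet[str]],
                          session_default: FrozenSet[str]) -> FrozenSet[str]:
    """Walk from the addressed node toward the root; the nearest ACL wins."""
    parts = node_path.split("/")
    while parts:
        candidate = "/".join(parts)
        if candidate in node_acls:
            return node_acls[candidate]  # first match wins
        parts.pop()  # step one level toward the root
    return session_default  # no override anywhere on the path


# Usage with invented paths:
example_acls = {
    "Site/Line1": frozenset({"Read"}),
    "Site/Line1/Pump": frozenset({"Read", "WriteOperate"}),
}
no_permissions: FrozenSet[str] = frozenset()
```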

OPC-015: ServiceLevel Reporting

The server shall expose a dynamic ServiceLevel value computed by RedundancyCoordinator + ServiceLevelCalculator.

Acceptance Criteria

  • ServiceLevel reflects: redundancy role (Primary higher than Secondary), publish state (current generation applied > mid-apply > failed-apply), driver health (any driver instance in open circuit lowers the value), apply-lease state.
  • ServiceLevel is exposed as a Variable under the standard Server object and is readable by any authenticated client.
  • Clients that observe Primary's ServiceLevel drop below Secondary's should fail over per the OPC UA spec.
  • Single-node deployments (NodeCount = 1) always publish their node as Primary.
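
The ordering constraints above can be illustrated with a toy calculator in Python. Every numeric weight below is invented for the example and is not the ServiceLevelCalculator's actual scoring; apply-lease state is left out because the criteria do not say how it affects the value. The result is clamped to 0..255 because ServiceLevel is a Byte in OPC UA.

```python
# Hypothetical scores; only their relative order reflects the requirement.
PUBLISH_SCORE = {"applied": 200, "mid-apply": 100, "failed-apply": 50}
PRIMARY_BONUS = 40
OPEN_CIRCUIT_PENALTY = 30


def service_level(is_primary: bool, publish_state: str,
                  any_driver_circuit_open: bool) -> int:
    """Combine role, publish state, and driver health into a 0..255 value."""
    level = PUBLISH_SCORE[publish_state]
    if is_primary:
        level += PRIMARY_BONUS
    if any_driver_circuit_open:
        level -= OPEN_CIRCUIT_PENALTY
    return max(0, min(255, level))
```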

OPC-016: Session Management

The server shall support multiple concurrent OPC UA client sessions.

Acceptance Criteria

  • Maximum concurrent sessions and session timeout come from Config DB cluster settings (default 100 sessions, 30-minute idle timeout).
  • Expired sessions are cleaned up and their subscriptions and monitored items removed.
  • Active session count is reported as a Prometheus gauge on the Admin /metrics endpoint.

OPC-017: Address Space Rebuild on Generation Apply

When a new Config DB generation is applied, the server shall surgically update only the affected driver subtrees.

Acceptance Criteria

  • Apply compares the previous generation to the incoming generation and produces per-driver add / modify / remove sets.
  • Existing OPC UA sessions, subscriptions, and monitored items are preserved across apply whenever the target NodeId survives the generation change.
  • Tags that no longer exist post-apply emit Bad_NodeIdUnknown on their subscribed monitored items.
  • During apply, the node's ServiceLevel is lowered (per ServiceLevelCalculator) so redundancy partners temporarily take precedence.
  • Galaxy subtree rebuilds triggered by IRediscoverable (Galaxy deployment change) are scoped to the Galaxy driver's namespace and follow the same preservation rule (this subsumes OPC-006 from the v1 file).
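
The generation comparison in the first criterion can be sketched as a dictionary diff. This Python example handles one driver's tags keyed by NodeId; the function name and the dict-of-config shape are assumptions for illustration.

```python
from typing import Dict, List, Tuple


def diff_generation(previous: Dict[str, dict],
                    incoming: Dict[str, dict]) -> Tuple[List[str], List[str], List[str]]:
    """Return (added, modified, removed) NodeIds between two generations."""
    added = sorted(incoming.keys() - previous.keys())
    removed = sorted(previous.keys() - incoming.keys())
    modified = sorted(node_id
                      for node_id in previous.keys() & incoming.keys()
                      if previous[node_id] != incoming[node_id])
    return added, modified, removed


# Usage with invented tag configs:
previous_generation = {"Pump.Speed": {"type": "Double"}, "Pump.State": {"type": "Int32"}}
incoming_generation = {"Pump.Speed": {"type": "Float"}, "Pump.Flow": {"type": "Double"}}
added, modified, removed = diff_generation(previous_generation, incoming_generation)
```

NodeIds that appear in neither the added nor the removed set survive the apply, which is the condition under which existing monitored items are preserved.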

OPC-018: Server Diagnostics Nodes

The server shall expose standard OPC UA Server object nodes required by the spec.

Acceptance Criteria

  • ServerStatus / ServerState / CurrentTime / StartTime populated and compliant with the OPC UA 1.05 spec.
  • ServerCapabilities declares historical access capabilities for namespaces that have an IHistoryProvider-backed driver.
  • ServerRedundancy.RedundancySupport reflects the cluster's redundancy mode (None / Warm / Hot).
  • ServerRedundancy.ServerUriArray lists both cluster members' ApplicationUri values.

OPC-019: Observability Hooks

The server shall emit OpenTelemetry metrics consumed by the Admin /metrics Prometheus endpoint.

Acceptance Criteria

  • Counters: capability calls per DriverInstanceId + DriverCapability, OPC UA requests per method, alarm events emitted, history reads, generation apply attempts.
  • Histograms: capability-call duration per DriverInstanceId + DriverCapability, OPC UA request duration per method.
  • Gauges: circuit-breaker state per pipeline, active OPC UA sessions, active monitored items, subscription queue depth, ServiceLevel value, memory-tracking watermarks (Phase 6.1).
  • Metric cardinality is bounded — DriverInstanceId and HostName are the only high-cardinality labels, both controlled by the Config DB.