Recover stashed driver-gaps work from pre-v2-mxgw-merge working tree

Captures uncommitted work that lived in the working tree on
v2-mxgw-integration but was orthogonal to the migration. Stashed
during the v2-mxgw merge to master (2026-04-30) and replanted here on
a feature branch off master so it's git-visible rather than living in
the stash list.

Two distinct buckets:

1. Tracked fixture/config refinements (10 files, ~36 lines):
   - scripts/e2e/test-opcuaclient.ps1
   - src/ZB.MOM.WW.OtOpcUa.Admin/appsettings.json
   - 4 docker-compose.yml under tests/.../IntegrationTests/Docker/
     (AbCip, Modbus, OpcUaClient, S7)
   - 4 fixture .cs files (AbServerFixture, ModbusSimulatorFixture,
     OpcPlcFixture, Snap7ServerFixture)

2. Untracked driver-gaps queue artifacts (~8000 lines):
   - docs/plans/{abcip,ablegacy,focas,opcuaclient,s7,twincat}-plan.md
     — per-driver gap plans
   - docs/featuregaps.md — cross-cutting analysis
   - docs/v2/focas-deployment.md, docs/v2/implementation/focas-simulator-plan.md
   - followup.md — auto/driver-gaps queue follow-ups
   - scripts/queue/ — PR-queue automation tooling (12 files including
     pr-manifest.yaml at 1473 lines)

This commit is a snapshot for recoverability — review and split into
focused PRs (or discard) before merging anywhere downstream.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Joseph Doherty
2026-04-30 08:28:01 -04:00
parent ae7106dfce
commit 2d07d716dc
33 changed files with 8074 additions and 14 deletions

@@ -0,0 +1,863 @@
# OpcUaClient Driver — Implementation Plan
> Source of gap analysis: [featuregaps.md → OpcUaClient](../featuregaps.md#opcuaclient-opc-ua-aggregation-client)
>
> Covers Build = Yes items only. Numbering matches the featuregaps Recommendations table.
## Summary
The OpcUaClient driver already ships 8/8 capability interfaces and a working
end-to-end Session/Subscription/MonitoredItem/HistoryRead pipeline backed by
the OPC Foundation `OPCFoundation.NetStandard.Opc.Ua.Client` SDK. Most of the
14 Build = Yes gaps are operability or curation knobs — config surface +
plumbing into existing SDK calls — rather than new protocol implementation.
A small number need genuinely new SDK plumbing (Reverse Connect,
ModelChangeEvent subscribe) and one (`ReadEventsAsync`) needs a coordinated
cross-driver interface change.
The plan groups the work into five phases, ordered to deliver per-tag /
per-subscription operability first (highest-frequency operator pain), then
curation, then change tracking, then connectivity, then historical+HA. Each
PR sticks to one feature-gap row so reviews stay narrow.
## Phased delivery
| Phase | Theme | Gaps | PRs | Notes |
| :---: | --- | --- | :---: | --- |
| 1 | Operability knobs | #5, #6, #15, #17, #20 | 5 | Pure SDK config surface; no new wire flows |
| 2 | Discovery & curation | #2, #7, #8, #9 | 4 | Touches `ITagDiscovery` + adds method invoke |
| 3 | Change tracking | #10 | 1 | New session-level subscription on `Server` node |
| 4 | Connectivity | #1 | 1 | Reverse Connect — new listener path |
| 5 | Historical & redundancy | #12, #13, #14 | 3 | Includes the cross-driver `IHistoryProvider` change |
**Total: 14 PRs across 5 phases.** Phases 1-3 land independently against
the existing single-session model. Phase 4 can ship in parallel with phases
2-3 because its driver changes are confined to the connect path (the
failover sweep) plus a new listener helper. Phase 5's first PR is a
prerequisite for the `ReadEventsAsync` work in every other history-capable
driver and must coordinate with them.
## Per-PR detail
### Phase 1 — Operability knobs
#### PR-1: Per-subscription tuning (gap #6)
**Goal**: lift the hard-coded `KeepAliveCount=10`, `LifetimeCount=1000`,
`MaxNotificationsPerPublish=0`, `Priority=0`, `PublishingInterval` floor of
50 ms into `OpcUaClientDriverOptions` so high-event-rate servers can be
defended against (`MaxNotificationsPerPublish=0` is unlimited — the
documented DoS surface) and high-tag-count deployments can split by
priority.
**SDK API**:
- `Subscription.SetPublishingMode(bool, ct)` for runtime enable/disable
- `SubscriptionOptions.PublishingInterval / KeepAliveCount / LifetimeCount /
MaxNotificationsPerPublish / Priority` set at create-time
- New options class `OpcUaSubscriptionDefaults` (publish interval floor,
keep-alive count, lifetime count, max notifications, priority)
**Files**:
- `src/.../OpcUaClient/OpcUaClientDriverOptions.cs` — add `Subscriptions`
sub-section
- `src/.../OpcUaClient/OpcUaClientDriver.cs` — `SubscribeAsync` reads from
options
- `src/.../OpcUaClient/OpcUaClientDriver.cs` — `SubscribeAlarmsAsync` reuses
the same defaults but with `Priority=1`, higher than the data
subscriptions' `Priority=0`, so alarms aren't starved during data bursts
**Tests**: `OpcUaClientSubscribeAndProbeTests` — assert options propagate;
add a stress unit test (mocked `Subscription`) that asserts custom
`MaxNotificationsPerPublish` is forwarded so a value > 0 actually reaches
the SDK.
**Risks**: Setting `LifetimeCount` too low against a server with publish-
throttling can drop subscriptions; doc the formula (`LifetimeCount >=
3 * KeepAliveCount`).
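
The formula above can be checked when the options are loaded. Below is a minimal, language-neutral sketch in Python (the driver itself is C#). The class name `SubscriptionDefaults`, its field names, and the `validate` helper are hypothetical stand-ins for the proposed `Subscriptions` options block; the default values are the hard-coded ones quoted in the Goal.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SubscriptionDefaults:
    """Hypothetical mirror of the proposed `Subscriptions` options block."""

    publishing_interval_ms: int = 50        # floor quoted in the plan
    keep_alive_count: int = 10
    lifetime_count: int = 1000
    max_notifications_per_publish: int = 0  # 0 means unlimited
    priority: int = 0

    def validate(self) -> List[str]:
        """Return human-readable problems; an empty list means the block is usable."""
        problems: List[str] = []
        if self.keep_alive_count < 1:
            problems.append("KeepAliveCount must be at least 1")
        if self.lifetime_count < 3 * self.keep_alive_count:
            problems.append(
                f"LifetimeCount {self.lifetime_count} is below "
                f"3 * KeepAliveCount ({3 * self.keep_alive_count})"
            )
        if self.max_notifications_per_publish < 0:
            problems.append("MaxNotificationsPerPublish must be 0 (unlimited) or positive")
        return problems
```

Returning a list of problems rather than raising on the first one lets a config loader report every issue in a single pass; whether the real options class should fail hard or only warn is left open here.
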
**Docs / fixture / e2e**: new "Subscription tuning" subsection in
`docs/drivers/OpcUaClient.md` (create if missing) documenting the
`Subscriptions` options block with the `LifetimeCount >= 3 *
KeepAliveCount` formula; cross-link from the "Advanced options" section
of `docs/Client.CLI.md` so CLI users discover the knobs. Fixture: opc-plc
already publishes fast tickers (`FastUInt1` @ 100 ms) sufficient for
coverage — no fixture-side change. Integration test in
`tests/ZB.MOM.WW.OtOpcUa.Driver.OpcUaClient.IntegrationTests/` asserting
custom `KeepAliveCount` / `Priority` reach the wire (capture via
`OpcPlcFixture` keepalive count). E2E: extend
`scripts/e2e/test-opcuaclient.ps1` with a stage that sets a non-default
publish interval and confirms the local subscription honours it.
---
#### PR-2: Per-tag advanced subscription tuning incl. deadband (gap #5)
**Goal**: surface `SamplingInterval`, `QueueSize`, `DiscardOldest`,
`MonitoringMode`, and `DataChangeFilter` (DeadbandType=Absolute/Percent +
Trigger=Status/StatusValue/StatusValueTimestamp) per-tag. Deadband is the
baseline analog noise filter every commercial UA aggregator ships and the
single feature most likely to cut bandwidth on busy plants.
**SDK API**:
- `MonitoredItem.Filter = new DataChangeFilter { Trigger =
DataChangeTrigger.StatusValue, DeadbandType = (uint)DeadbandType.Absolute,
DeadbandValue = 0.5 }`
- `MonitoredItemOptions.QueueSize / DiscardOldest / SamplingInterval /
MonitoringMode`
- Per-tag override structure: extend the `SubscribeAsync` parameter shape
(or add an overload accepting an `IReadOnlyList<MonitoredTagSpec>`) — note
this requires coordinating with `ISubscribable` so the per-tag carrier
reaches the driver.
**Files**:
- `src/.../Core.Abstractions/ISubscribable.cs` — add overload
`SubscribeAsync(IReadOnlyList<MonitoredTagSpec>, ...)` keeping old API
for source compat
- `src/.../OpcUaClient/OpcUaClientDriver.cs` — translate spec → SDK filter
**Tests**: assert `DataChangeFilter` lands on the `MonitoredItem.Filter` for
each kind of trigger; assert PercentDeadband requires server-side
EURange (server returns `BadFilterNotAllowed` if not configured) — capture
the StatusCode and surface as a usable error.
**Risks**: cross-cutting `ISubscribable` change. Mitigation: ship the
overload as additive — existing single-arg path still exists.
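
To make the Absolute vs Percent distinction concrete, here is a small Python sketch of the comparison a server is expected to apply, assuming the usual semantics: Absolute compares the raw change against `DeadbandValue`, Percent compares it against that percentage of the EURange span. The function name `exceeds_deadband` and its string-typed `deadband_type` argument are hypothetical; in the driver the filtering happens upstream, so this only illustrates what the tests should expect.

```python
from typing import Optional, Tuple


def exceeds_deadband(
    last: float,
    current: float,
    deadband_type: str,
    deadband_value: float,
    eu_range: Optional[Tuple[float, float]] = None,
) -> bool:
    """Return True when the change is large enough to be reported."""
    delta = abs(current - last)
    if deadband_type == "None":
        return delta != 0
    if deadband_type == "Absolute":
        return delta > deadband_value
    if deadband_type == "Percent":
        if eu_range is None:
            # Stands in for the server rejecting the filter when EURange is missing.
            raise ValueError("Percent deadband needs an EURange (low, high)")
        low, high = eu_range
        return delta > (deadband_value / 100.0) * (high - low)
    raise ValueError(f"unknown deadband type: {deadband_type}")
```

With `DeadbandValue = 0.5` (Absolute), a move from 10.0 to 10.4 is suppressed while a move to 10.6 is reported; with a 1 % Percent deadband on an EURange of 0..100, the threshold is 1.0 engineering unit.
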
**Docs / fixture / e2e**: new "Per-tag deadband and monitoring filters"
section in `docs/drivers/OpcUaClient.md` (create if missing) with worked
examples of Absolute vs Percent deadband + the EURange prerequisite;
update `docs/Client.CLI.md` `subscribe` command page with the new tag-
config syntax for `--deadband` / `--queue-size` / `--discard-oldest`;
update `docs/Client.UI.md` Subscriptions tab section to mirror. Fixture:
`OpcPlcFixture` / `OpcPlcProfile` seeds an analog (`StepUp` already
oscillates) and confirms `EURange` is published — extend the profile to
flag noisy nodes. Integration test in
`tests/ZB.MOM.WW.OtOpcUa.Driver.OpcUaClient.IntegrationTests/` asserts
publish suppression below the deadband threshold. E2E: add a
`-DeadbandValue` stage to `scripts/e2e/test-opcuaclient.ps1` (and a
`deadband` knob to `scripts/e2e/e2e-config.sample.json`) that subscribes,
asserts no spurious updates within the band.
---
#### PR-3: Honor server `OperationLimits` (gap #15)
**Goal**: read `Server.ServerCapabilities.OperationLimits.MaxNodesPerRead /
Write / Browse / HistoryReadData` once after Session activation, cache,
and chunk batch operations to those caps client-side. Today the SDK chunks
on its internal default; against an undersized embedded UA server this
results in `BadTooManyOperations`.
**SDK API**:
- After session open: `Session.ReadAsync` of
`VariableIds.Server_ServerCapabilities_OperationLimits_MaxNodesPerRead`
+ sibling NodeIds. The SDK exposes `Session.OperationLimits` after
`FetchOperationLimits` is called — prefer that path.
- `Session.FetchOperationLimitsAsync(ct)` (1.5+); fallback: explicit Read.
**Files**:
- `src/.../OpcUaClient/OpcUaClientDriver.cs` — call
`FetchOperationLimitsAsync` post-`OpenSessionOnEndpointAsync`; honour
caps in `ReadAsync`, `WriteAsync`, `BrowseRecursiveAsync`,
`EnrichAndRegisterVariablesAsync`, `ExecuteHistoryReadAsync`.
**Tests**: mock `Session.OperationLimits` to a value below the test batch
size and assert the driver issues N wire calls instead of one.
**Risks**: a zero on the server means "no limit" per Part 5 — don't divide
by zero.
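
The chunking rule, including the zero-means-unlimited case, can be sketched as follows. This is an illustrative Python helper, not the driver's code; `chunk_by_limit` is a hypothetical name and the limit is assumed to have already been fetched from the server.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def chunk_by_limit(items: Sequence[T], server_limit: int) -> Iterator[List[T]]:
    """Yield batches no larger than server_limit; 0 means the server set no limit."""
    if server_limit < 0:
        raise ValueError("operation limit cannot be negative")
    if not items:
        return
    # Treat 0 as "one batch with everything" instead of dividing by it.
    size = len(items) if server_limit == 0 else server_limit
    for start in range(0, len(items), size):
        yield list(items[start:start + size])
```

A five-item read against a cap of 2 becomes three wire calls (2 + 2 + 1), while a cap of 0 keeps the single call, which is the behaviour the unit test above asserts.
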
**Docs / fixture / e2e**: new "Server OperationLimits handling"
subsection in `docs/drivers/OpcUaClient.md` documenting the auto-fetch
behaviour, the zero-means-unlimited semantics, and how to override via
options if the server reports limits it cannot actually honour. Fixture: opc-plc
publishes the standard ServerCapabilities tree out of the box — no
container-side change; the `OpcPlcFixture` seed validates the IDs at
collection init. Integration test asserts batch reads chunk to the
fetched cap. No e2e change needed (the script's batch sizes are already
small).
---
#### PR-4: Diagnostics counters (gap #17)
**Goal**: expose per-driver counters on `DriverHealth` (or a sibling
`DriverDiagnostics` surface): publish-request count, notifications-per-
second EWMA, missing-publish-request count, dropped-notification rate,
session-reset count. Operators currently see only `LastSuccessfulRead`
+ last error.
**SDK API**:
- `Subscription.Notification` event fires per published notification — bump
a counter
- `Subscription.PublishStateChanged` event for missed-publish detection
- `Session.PublishError` event for channel-level errors
- `Session.SessionClosing`/`SessionConfigurationChanged` for session-reset
attribution
**Files**:
- `src/.../OpcUaClient/OpcUaClientDriver.cs` — instrument hooks; expose via
`IDriver.GetDiagnostics()` or extend `DriverHealth`
- `src/.../Core.Abstractions/IDriver.cs` — confirm where the counter shape
lives; if `DriverHealth` is too rigid, add `IDriverDiagnostics` (mirrors
the Modbus `driver-diagnostics` RPC pattern from #154)
**Tests**: synthetic notification fan-out → assert counters increment;
session close → assert reset count bumps.
**Risks**: counters need to be lock-free hot-path safe; use
`Interlocked.Increment` and a single sliding-window clock per counter.
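
One way to keep the hot path cheap is to let the notification handler bump plain integers and fold them into the EWMA from a periodic tick. The Python sketch below shows only the arithmetic; `EwmaRate`, the half-life parameter, and the injected clock values are assumptions for illustration, and the C# version would use `Interlocked.Increment` for the counters as noted above.

```python
import math


class EwmaRate:
    """Exponentially weighted events-per-second estimate with an injected clock."""

    def __init__(self, start_s: float, half_life_s: float = 5.0) -> None:
        self._half_life_s = half_life_s
        self._last_tick_s = start_s
        self._pending = 0      # events seen since the last tick
        self.total = 0         # monotonic lifetime counter
        self.rate_per_s = 0.0

    def record(self, count: int = 1) -> None:
        """Hot path: integer bumps only, no floating-point work."""
        self._pending += count
        self.total += count

    def tick(self, now_s: float) -> float:
        """Fold pending events into the average; call from a timer."""
        elapsed = now_s - self._last_tick_s
        if elapsed <= 0:
            return self.rate_per_s
        instant = self._pending / elapsed
        # Weight chosen so an old sample's influence halves every half_life_s.
        alpha = 1.0 - math.exp(-math.log(2.0) * elapsed / self._half_life_s)
        self.rate_per_s += alpha * (instant - self.rate_per_s)
        self._pending = 0
        self._last_tick_s = now_s
        return self.rate_per_s
```

Passing the clock in as an argument keeps the unit test deterministic: no sleeps are needed to assert how the rate decays.
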
**Docs / fixture / e2e**: new "Driver diagnostics" section in
`docs/drivers/OpcUaClient.md` enumerating each counter and the event
that bumps it; cross-link to the `driver-diagnostics` Admin RPC
documented for Modbus (#154 pattern). Fixture: no opc-plc change
required. Integration test exercises `IDriverDiagnostics` after
forcing a session close. E2E: extend
`scripts/e2e/test-opcuaclient.ps1` with a "diagnostics snapshot" stage
that asserts publish/notification counters are non-zero after the
subscribe stage.
---
#### PR-5: CRL / revocation handling (gap #20)
**Goal**: explicit revoked-cert handling in `CertificateValidator` plus a
`RejectSHA1SignedCertificates` knob. Today the validator hooks
`BadCertificateUntrusted` only — a revoked cert silently fails as
"untrusted" with no operator-visible distinction.
**SDK API**:
- `CertificateValidator.CertificateValidation` event — inspect
`e.Error.StatusCode` for `BadCertificateRevoked`,
`BadCertificateRevocationUnknown`,
`BadCertificateIssuerRevocationUnknown`,
`BadCertificatePolicyCheckFailed`
- `SecurityConfiguration.RejectSHA1SignedCertificates`,
`SecurityConfiguration.RejectUnknownRevocationStatus`,
`SecurityConfiguration.MinimumCertificateKeySize` — direct config
bool/int knobs already on the SDK type
- `CertificateTrustList.AddCRL` / per-store CRL directories under
`%LocalAppData%\OtOpcUa\pki\{trusted,issuers}\crl\`
**Files**:
- `src/.../OpcUaClient/OpcUaClientDriver.cs`: `BuildApplicationConfigurationAsync`
honours new options, validator handler distinguishes revoked vs untrusted
in the surfaced error message
- `src/.../OpcUaClient/OpcUaClientDriverOptions.cs` — add
`RejectSHA1SignedCertificates`, `RejectUnknownRevocationStatus`,
`MinimumCertificateKeySize`
**Tests**: feed a SHA1-signed test cert and a revoked cert through the
validator with the new knobs on/off.
**Risks**: PKI directory layout changes — existing deployments need a
migration note.
**Docs / fixture / e2e**: new "Certificate revocation and SHA1 rejection"
subsection in `docs/drivers/OpcUaClient.md` documenting the CRL
directory layout under `%LocalAppData%\OtOpcUa\pki\{trusted,issuers}\crl\`
and the new options (with a migration note for existing PKI stores);
cross-link from `docs/security.md`. Fixture: extend
`OpcPlcFixture` / `Docker/docker-compose.yml` with an optional secured
endpoint variant and a SHA1-signed test cert checked into the test
project's resources for the validator unit test. Integration test
exercises a revoked cert via a local CRL drop. E2E: add a
`-Insecure:$false` smoke stage to `scripts/e2e/test-opcuaclient.ps1`
that asserts a revoked cert produces a distinguishable error message.
---
### Phase 2 — Discovery & curation
#### PR-6: Discovery URL `FindServers` (gap #2)
**Goal**: accept a discovery URL (`opc.tcp://host:4840` pointing at the
LDS or the server's own discovery endpoint) and surface advertised servers
+ endpoints to the operator without manual policy/mode tuple copy.
**SDK API**:
- `DiscoveryClient.CreateAsync(appConfig, new Uri(url), DiagnosticsMasks.None, ct)`
- `DiscoveryClient.FindServersAsync(null, ct)` → `ApplicationDescription[]`
- `DiscoveryClient.GetEndpointsAsync(null, ct)` per advertised `DiscoveryUrl`
**Files**:
- `src/.../OpcUaClient/OpcUaClientDriver.cs` — new internal
`DiscoverServersAsync` helper; extend the Admin-side discovery RPC to
invoke it (driver-diagnostics pattern from #154)
- `src/.../OpcUaClient/OpcUaClientDriverOptions.cs` — add
`DiscoveryUrl` knob (alternative to explicit `EndpointUrls` — when set
the driver runs `FindServers` at init and feeds the result into the
failover candidate list)
**Tests**: mock `DiscoveryClient` returning two advertised servers each
with three endpoints; assert the candidate list reflects the policy/mode
filter applied client-side.
**Risks**: `FindServers` itself usually requires `SecurityMode=None`;
spell out in the doc that the discovery channel is unsecured even when
the data channel will be encrypted.
**Docs / fixture / e2e**: new "Discovery URL (`FindServers`)" section in
`docs/drivers/OpcUaClient.md` with the unsecured-discovery-vs-secured-
data caveat called out; cross-link from `docs/Client.CLI.md` if a
`discover` CLI command surfaces. Fixture: opc-plc already responds to
`FindServers` on the same endpoint — `OpcPlcFixture` adds a discovery
probe at collection init. Integration test exercises the helper against
the live opc-plc container and asserts at least one
`ApplicationDescription` returned. E2E: replace the hard-coded
`-RemoteUrl` stage in `scripts/e2e/test-opcuaclient.ps1` with an
optional `-DiscoveryUrl` mode that picks the first advertised endpoint.
---
#### PR-7: Selective import / namespace remap (gap #7)
**Goal**: per-branch include/exclude rules, namespace-URI remapping, and
re-keyed BrowseNames — the curation surface every commercial aggregator
ships.
**Approach**: extend `OpcUaClientDriverOptions` with a `Curation` section:
- `IncludePaths: string[]` — glob or NodeId-rooted prefix list; only paths
matching are imported
- `ExcludePaths: string[]` — wins over Include (Include is allow-list,
Exclude is block-list)
- `NamespaceRemap: Dictionary<string,string>` — upstream NS URI →
local-side alias for BrowseName generation
- `RootAlias: string` — default `"Remote"`; replaces the hardcoded folder
name today
**SDK API** — none new; this is pure local filtering inside
`BrowseRecursiveAsync` and `EnrichAndRegisterVariablesAsync`.
**Files**:
- `src/.../OpcUaClient/OpcUaClientDriverOptions.cs`
- `src/.../OpcUaClient/OpcUaClientDriver.cs`:
`BrowseRecursiveAsync` consults the rule set; helper
`MapNamespaceForBrowseName` handles NS remap
**Tests**: synthetic browse tree, exercise include/exclude/remap each
independently and combined; verify the cap accounting in
`MaxDiscoveredNodes` excludes filtered nodes.
**Risks**: glob semantics — pin to a small subset (`*`, `?` only — no
character classes or `**`) to keep the doc + behaviour simple.
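
A sketch of the restricted glob and the Exclude-wins rule, in Python for brevity. The helper names are hypothetical, and two behaviours are assumptions not settled by the plan: `*` is allowed to match across path separators (there is no `**` to distinguish), and an empty Include list imports everything.

```python
import re
from typing import Iterable


def glob_to_regex(pattern: str) -> re.Pattern:
    """Translate a pattern where only * and ? are special; all else is literal."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")   # assumption: * may span path separators
        elif ch == "?":
            parts.append(".")
        else:
            parts.append(re.escape(ch))  # brackets etc. stay literal
    return re.compile("".join(parts), re.DOTALL)


def is_imported(path: str, include: Iterable[str], exclude: Iterable[str]) -> bool:
    """Exclude wins over Include; an empty Include list is treated as allow-all."""
    if any(glob_to_regex(p).fullmatch(path) for p in exclude):
        return False
    include = list(include)
    if not include:
        return True  # assumption: no allow-list means import everything
    return any(glob_to_regex(p).fullmatch(path) for p in include)
```

Writing the translator by hand instead of reusing a general glob library is deliberate: general-purpose matchers typically honour character classes, which the plan explicitly rules out.
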
**Docs / fixture / e2e**: new "Curation: include/exclude and namespace
remap" section in `docs/drivers/OpcUaClient.md` with worked examples of
each rule kind and the supported glob subset; update
`docs/drivers/OpcUaClient-Test-Fixture.md` "Coverage map" with the new
filtering rows. Fixture: extend `OpcPlcProfile` to enumerate which
upstream namespaces are exercised so curation tests can target them.
Integration test seeds an Include + Exclude + Remap rule and asserts
the local tree reflects the filter. E2E: add a
`-IncludePath` / `-NamespaceRemap` set of params to
`scripts/e2e/test-opcuaclient.ps1` that asserts the local browse depth
matches the rule.
---
#### PR-8: Type definition mirroring (gap #8)
**Goal**: walk the upstream `Types` folder (`ObjectTypes`,
`VariableTypes`, `DataTypes`, `ReferenceTypes`) and project them into the
local address space so downstream UI clients keep type-aware rendering and
structured DataTypes decode correctly.
**SDK API**:
- `Session.NodeCache.FetchNode(typeNodeId)` for type metadata
- `Session.LoadDataTypeSystem` — for structured DataType encoding
- `Session.FetchTypeTree(NodeIdCollection)` — populates the session's
type cache from the server
**Files**:
- `src/.../OpcUaClient/OpcUaClientDriver.cs` — new pass-3 in `DiscoverAsync`
that walks `i=86` (Types folder) under the curation rules, registers a
parallel type subtree, and links variables to their TypeDefinition via
HasTypeDefinition references on the address-space builder
- `src/.../Core.Abstractions/IAddressSpaceBuilder.cs` — confirm whether
the builder accepts type nodes; if not, extend it (this is likely a
prerequisite — if so, it gets its own preceding PR-8a)
**Tests**: mock browse returning `BaseObjectType -> DerivedThing`;
assert local builder receives the type node + the HasTypeDefinition link.
**Risks**: significant. Type mirroring touches `IAddressSpaceBuilder`
which is a cross-cutting interface every driver depends on. If
`IAddressSpaceBuilder` already supports type nodes (Galaxy has type-like
templates), reuse that surface; otherwise this PR splits.
**Docs / fixture / e2e**: new "Type mirroring" section in
`docs/drivers/OpcUaClient.md` documenting which type nodes get walked
and how downstream UA clients see the HasTypeDefinition references; also
note in `docs/Client.UI.md` that the Browse tree now shows mirrored
types. Fixture: opc-plc already exposes the standard `Types` folder;
extend `OpcPlcProfile` to assert at least one custom ObjectType is
present. Integration test browses the local Types folder post-discovery
and asserts the upstream type chain landed. No e2e change needed beyond
extending the existing browse stage to walk under `Types`.
---
#### PR-9: Method node mirroring + `Call` passthrough (gap #9)
**Goal**: discover `NodeClass.Method` nodes in the browse pass, expose
them on the local address space, and forward `Call` invocations as
`Session.CallAsync` against the upstream node. The driver already calls
`AcknowledgeableConditionType.Acknowledge` for A&C — generalize that path.
**SDK API**:
- `Session.CallAsync(requestHeader, methodsToCall: CallMethodRequestCollection, ct)`
returning `CallMethodResultCollection`
- Browse picks up Method nodes once the `NodeClassMask` is lifted; it must
additionally browse `HasProperty` to discover `InputArguments` /
`OutputArguments` for argument translation
**Files**:
- `src/.../Core.Abstractions/IDriver.cs` — add `IMethodInvoker` capability
interface (this is a NEW capability, not a tweak to an existing one)
- `src/.../OpcUaClient/OpcUaClientDriver.cs` — implement
`IMethodInvoker.InvokeAsync(string objectId, string methodId,
IReadOnlyList<object?> inputs, ct)`; refactor `AcknowledgeAsync` to
reuse the common path
- `src/.../Server/...` node-manager — wire `IMethodInvoker` to the OPC UA
server's `MethodNode.OnCallMethod` hook so downstream Call requests
reach the driver
**Tests**: mock `Session.CallAsync` returning Good + an output collection;
assert pass-through fidelity. Also assert per-argument `BadInvalidArgument`
codes pass through.
**Risks**: high — adds a new capability interface. Other drivers that
*could* support methods (Galaxy via `OnExecute` scripts, FOCAS via FOCAS
commands) gain a clean extension point but each is its own follow-up.
**Docs / fixture / e2e**: new "Method nodes and Call passthrough"
section in `docs/drivers/OpcUaClient.md` explaining how method calls
flow through the aggregator (input/output argument translation, error-
code passthrough); add a `call` command page to `docs/Client.CLI.md`
covering the new path; mirror in `docs/Client.UI.md` if a UI surface
ships. Fixture: opc-plc already exposes the standard
`Server.GetMonitoredItems` method — `OpcPlcFixture` registers it as the
canonical method-call target. Integration test in
`tests/ZB.MOM.WW.OtOpcUa.Driver.OpcUaClient.IntegrationTests/` invokes
`Server.GetMonitoredItems` through the aggregator. E2E: add a
`-MethodNodeId` stage to `scripts/e2e/test-opcuaclient.ps1` that calls
the method through the local server and asserts the output matches the
direct upstream call.
---
### Phase 3 — Change tracking
#### PR-10: Auto re-import on `ModelChangeEvent` (gap #10)
**Goal**: subscribe to `BaseModelChangeEventType` /
`GeneralModelChangeEventType` on the upstream server's `i=2253` Server
node so when the upstream topology changes (new tag added, type modified)
the driver triggers a `ReinitializeAsync`-style re-import without
operator action.
**SDK API**:
- A second `Subscription` on the Session, monitoring the `Server` node
(`ObjectIds.Server`) with an `EventFilter` whose SelectClauses reference
`BaseModelChangeEventType` and (optionally) the `Changes` property of
`GeneralModelChangeEventType`
- On notification: enqueue a debounced re-discover (don't react to every
event during a bulk topology edit — coalesce over a 2-5 s window)
**Files**:
- `src/.../OpcUaClient/OpcUaClientDriver.cs` — add `_modelChangeSubscription`
field; new `SubscribeModelChangesAsync` invoked at the end of
`InitializeAsync`; debounce timer that calls `ReinitializeAsync` on the
driver host
- `src/.../OpcUaClient/OpcUaClientDriverOptions.cs` — add
`WatchModelChanges: bool` (default true) +
`ModelChangeDebounce: TimeSpan` (default 5s)
**Tests**: synthetic event injection on the mock Session's notification
stream; assert one debounced re-import call regardless of N events
arriving in the window.
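
The "one re-import per burst" expectation can be modelled without timers. The Python sketch below assumes a trailing-edge debounce, where every event restarts the quiet-period timer; a fixed window measured from the first event would be an equally valid reading of the plan. `debounce_fire_times` is a hypothetical helper that works on recorded timestamps so the test stays deterministic.

```python
from typing import List, Optional, Sequence


def debounce_fire_times(event_times_s: Sequence[float], window_s: float) -> List[float]:
    """Trailing-edge debounce: fire once, window_s after the last event of each burst."""
    fires: List[float] = []
    deadline: Optional[float] = None
    for t in sorted(event_times_s):
        if deadline is not None and t >= deadline:
            fires.append(deadline)  # the quiet period elapsed before this event
        deadline = t + window_s     # every event restarts the timer
    if deadline is not None:
        fires.append(deadline)
    return fires
```

With a 5 s window, events at 0.0, 0.5, 1.0 and 4.0 s yield a single re-import at 9.0 s, and a late straggler at 10 s after a burst ending at 1 s yields a second one.
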
**Risks**: re-import while a downstream client is mid-browse — needs
serialization on `_gate` like the rest of the driver; document that
clients see a brief gap in the address space during reload.
**Docs / fixture / e2e**: new "Auto re-import on ModelChangeEvent"
section in `docs/drivers/OpcUaClient.md` documenting the debounce window,
the `_gate` serialization, and the brief browse-gap during reload.
Fixture: opc-plc supports runtime topology mutation via the
`addnode`/`addtag` HTTP control endpoint — extend `OpcPlcFixture` with
a helper that triggers a model change. Integration test asserts a
single re-import call after a burst of synthetic model change events.
E2E: add a "topology change" stage to
`scripts/e2e/test-opcuaclient.ps1` that calls the opc-plc control
endpoint, then asserts the local server reflects the new node within
the debounce window.
---
### Phase 4 — Connectivity
#### PR-11: Reverse Connect (gap #1)
**Goal**: support server-initiated client connect for OT-DMZ outbound-only
firewalls. The upstream server connects *to* us on a TCP listener; we
respond as the client. Hard requirement for many regulated plant networks.
**SDK API**:
- `Opc.Ua.Client.ReverseConnectManager` — manages a TCP listener on the
configured port and dispatches incoming reverse-connect requests
- `ReverseConnectManager.AddEndpoint(Uri reverseEndpoint)` — listener URI
e.g. `opc.tcp://0.0.0.0:4844`
- `ReverseConnectManager.WaitForConnection(endpointUrl, serverUri, ct)`:
blocks until the configured server initiates a reverse connect
- `Session.Create(appConfig, reverseConnection, endpoint, ...)`:
alternative session-create overload accepting the
`ITransportWaitingConnection` returned by the manager
**Files**:
- `src/.../OpcUaClient/OpcUaClientDriverOptions.cs` — add
`ReverseConnect: { Enabled, ListenerUrl, ExpectedServerUri }` section
- `src/.../OpcUaClient/OpcUaClientDriver.cs` — when reverse-connect is
enabled, replace the failover sweep with `WaitForConnection` and fall
through into the same session-create path
- New helper `ReverseConnectListener` — owns the manager lifecycle, one
listener per driver-host process (singleton across instances if multiple
reverse-connect drivers are configured)
**Tests**: spin up a `ReverseConnectClient` test against an opc-plc
container started with `--rc opc.tcp://host:4844` to verify end-to-end.
Unit tests mock `ITransportWaitingConnection`.
**Risks**: highest of the plan. Reverse Connect changes the
listen-vs-dial direction; if multiple OpcUaClient driver instances both
listen on the same port the manager must multiplex. opc-plc supports
reverse connect (`--rc` flag) so the integration test pattern from
`docs/drivers/OpcUaClient-Test-Fixture.md` extends cleanly.
**Docs / fixture / e2e**: new "Reverse Connect" section in
`docs/drivers/OpcUaClient.md` (create if missing) documenting the
listener URL config, the OT-DMZ outbound-only use case, and the shared-
listener singleton model; update `docs/drivers/OpcUaClient-Test-Fixture.md`
with the new "Reverse Connect coverage" row. Fixture: extend
`Docker/docker-compose.yml` with an `opc-plc-rc` service variant that
adds `--rc opc.tcp://host.docker.internal:4844`; `OpcPlcFixture` gains
a `[CollectionDefinition]` that wires up the reverse-connect listener
on the test side. Integration test asserts a session opens via the
reverse path. E2E: add a `-ReverseConnect` switch to
`scripts/e2e/test-opcuaclient.ps1` that flips the driver to listener
mode and verifies the bridge stage still passes.
---
### Phase 5 — Historical & redundancy
#### PR-12: `IHistoryProvider.ReadEventsAsync` interface fix + driver impl (gap #12)
**Goal**: extend `IHistoryProvider.ReadEventsAsync` to carry an
`EventFilter SelectClauses` parameter so HistoryRead Events can return
the right field projection, and implement the OPC UA Client passthrough.
**This is a cross-driver concern.** `IHistoryProvider` lives in
`Core.Abstractions` and every driver that opts into history (Galaxy,
OpcUaClient, plus any future historian-backed Tier-A driver) inherits the
default. Changing the signature is source-breaking — coordinate as one PR
that:
1. Adds the `IReadOnlyList<EventFieldProjection>` (or equivalent
abstract `EventFilterSpec`) parameter
2. Updates Galaxy's existing override (currently the only override) to
honour the projection (best-effort — the Galaxy A&E log has a fixed
field set so most projections degrade to the default columns)
3. Lands the OpcUaClient passthrough using `Session.HistoryReadAsync` with
`ReadEventDetails`
**SDK API**:
- `ReadEventDetails { StartTime, EndTime, NumValuesPerNode, Filter }`
- `Session.HistoryReadAsync` is already the call we use for Raw — pass
`new ExtensionObject(new ReadEventDetails { ... })` for events
- `HistoryEvent.Events: HistoryEventFieldList[]` — unwrap into
`HistoricalEvent` records
**Files**:
- `src/.../Core.Abstractions/IHistoryProvider.cs` — interface change
- `src/.../Driver.Galaxy.../*HistoryProvider*.cs` — adjust signature
- `src/.../OpcUaClient/OpcUaClientDriver.cs` — implement
`ReadEventsAsync`; reuse `ExecuteHistoryReadAsync` shape
- Server-side history facade — propagate the new parameter
**Tests**: integration test against opc-plc with
`--alm` (alarm sim already enabled per the fixture doc) — verify the
SelectClause projection comes back correctly.
**Risks**: the cross-driver interface change is the riskiest single
ergonomic call in this plan. If we can't fit the new parameter without
breaking every driver's `IHistoryProvider` impl, fall back to a sibling
`IEventHistoryProvider` interface and only the OPC UA Client + Galaxy
implement it. **Decide this in the PR review.**
**Docs / fixture / e2e**: new "HistoryRead Events" section in
`docs/drivers/OpcUaClient.md` documenting the `EventFilter`-aware
passthrough; update `docs/Client.CLI.md` `historyread` page to cover
event-mode reads. **Cross-driver doc updates** (this PR adds an
"`IHistoryProvider.ReadEventsAsync` signature change — see
`docs/plans/opcuaclient-plan.md` PR-12" note to every other driver
plan that has a history surface): `docs/plans/abcip-plan.md`,
`docs/plans/ablegacy-plan.md`, `docs/plans/focas-plan.md`,
`docs/plans/s7-plan.md`, `docs/plans/twincat-plan.md`, the Galaxy plan
family (`docs/plans/galaxy-*.md` if/when present, and the LMX equivalent
if it lands), and any Modbus plan. Galaxy is the only existing
implementor and gets a real signature update in this PR; the others
get a heads-up note so future work tracks the new shape. Fixture: opc-
plc runs with `--alm` already (per existing fixture doc) — no compose
change. Integration test issues a HistoryRead Events with a non-default
SelectClause and asserts the projected fields. E2E: extend
`scripts/e2e/test-opcuaclient.ps1` with a "history events" stage
gated on the `--alm` simulator producing at least one event.
---
#### PR-13: Full Aggregate function set (gap #13)
**Goal**: extend `HistoryAggregateType` from the 5 enum values today
(Average/Minimum/Maximum/Total/Count) to the OPC UA Part 13 standard
catalog of 30+ aggregates that historian-class clients expect.
**SDK API**: `ObjectIds.AggregateFunction_*` constants — one per
aggregate. The SDK already exposes them; this is pure mapping work.
Aggregates to add (Part 13 §5):
- `TimeAverage`, `TimeAverage2`
- `Interpolative`
- `MinimumActualTime`, `MaximumActualTime`, `Range`, `Range2`
- `AnnotationCount`, `DurationGood`, `DurationBad`,
`PercentGood`, `PercentBad`
- `WorstQuality`, `WorstQuality2`
- `StandardDeviationSample`, `StandardDeviationPopulation`,
`VarianceSample`, `VariancePopulation`
- `NumberOfTransitions`
- `Start`, `End`, `Delta`, `StartBound`, `EndBound`
- `DurationInStateZero`, `DurationInStateNonZero`
**Files**:
- `src/.../Core.Abstractions/IHistoryProvider.cs` — extend
`HistoryAggregateType` enum (additive — existing values keep their
ordinal)
- `src/.../OpcUaClient/OpcUaClientDriver.cs`:
`MapAggregateToNodeId` switch grows; default arm rejects `out of range`
**Tests**: parametrized unit test sweeping every enum value — assert
each maps to a non-null `NodeId` in the SDK's well-known set.
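
The additive-enum and sweep-test idea can be sketched in Python as below. The ordinals of the five existing values (Average = 0 through Count = 4) are an assumption, and the function returns the name of the SDK constant rather than a real `NodeId`, since this sketch has no SDK to resolve against; `map_aggregate_to_symbol` is a hypothetical stand-in for `MapAggregateToNodeId`.

```python
from enum import IntEnum

# Existing members first so their ordinals do not move (assumed 0..4).
_EXISTING = ["Average", "Minimum", "Maximum", "Total", "Count"]
_ADDED = [
    "TimeAverage", "TimeAverage2", "Interpolative",
    "MinimumActualTime", "MaximumActualTime", "Range", "Range2",
    "AnnotationCount", "DurationGood", "DurationBad",
    "PercentGood", "PercentBad", "WorstQuality", "WorstQuality2",
    "StandardDeviationSample", "StandardDeviationPopulation",
    "VarianceSample", "VariancePopulation", "NumberOfTransitions",
    "Start", "End", "Delta", "StartBound", "EndBound",
    "DurationInStateZero", "DurationInStateNonZero",
]

HistoryAggregateType = IntEnum("HistoryAggregateType", _EXISTING + _ADDED, start=0)


def map_aggregate_to_symbol(value: int) -> str:
    """Return the `AggregateFunction_*` constant name for an enum value."""
    try:
        member = HistoryAggregateType(value)
    except ValueError:
        # Equivalent of the switch's default arm.
        raise ValueError(f"aggregate {value} is out of range") from None
    return f"AggregateFunction_{member.name}"
```

Appending the new names after the existing five is what keeps serialized ordinals stable; inserting them alphabetically would silently renumber `Count` and friends.
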
**Risks**: low — this is mapping work. Drivers without a real historian
(everything except Galaxy + OpcUaClient) keep throwing `NotSupported`.
**Docs / fixture / e2e**: extend the "HistoryRead aggregates" section in
`docs/drivers/OpcUaClient.md` with the full Part 13 catalog and which
aggregates require server-side support; update the `historyread` page
in `docs/Client.CLI.md` to enumerate the new `--aggregate` values.
Fixture: opc-plc historian support is limited —
flag in `docs/drivers/OpcUaClient-Test-Fixture.md` that the new
aggregates are unit-tested via the SDK's well-known NodeId set, not
exercised wire-side. Integration test sweeps every enum value and
asserts the mapping; gated-skip for aggregates the live opc-plc image
doesn't honour. No e2e change.
---
#### PR-14: `ServerUriArray` redundant failover (gap #14)
**Goal**: read upstream `Server.ServerArray` and
`ServerRedundancyType.RedundancySupport` at session activation; when the
upstream server advertises non-`None` redundancy, fail over mid-session
on a `ServiceLevel` drop without losing client subscriptions. Today our
`EndpointUrls` is a one-shot connect-attempt list, not a live redundancy
group.
**SDK API**:
- `Session.ReadValueAsync(VariableIds.Server_ServerArray, ct)`
→ URI list
- `Session.ReadValueAsync(VariableIds.Server_ServiceLevel, ct)` polled or
subscribed via MonitoredItem
- Subscribe `Server_ServiceLevel` on the existing alarm subscription so
drops propagate via the publish channel
- On low `ServiceLevel`: open a parallel session against the next URI in
`ServerArray`, call `TransferSubscriptionsAsync(subscriptions, sendInitialValues, ct)`
on the new session to move the live subscriptions, then swap the
`Session` reference
**Files**:
- `src/.../OpcUaClient/OpcUaClientDriver.cs` — new
`MonitorServerRedundancyAsync` method; integrate with the existing
`OnKeepAlive` / `SessionReconnectHandler` machinery so reconnect and
redundancy-failover share the subscription-transfer code path
- `src/.../OpcUaClient/OpcUaClientDriverOptions.cs` — add
`Redundancy: { Enabled, ServiceLevelThreshold (default 200) }`
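
The failover decision itself can be kept separate from the SDK calls, which makes it easy to unit-test. The sketch below is a minimal illustration under stated assumptions: `RedundancyOptions` and `FailoverPolicy` are hypothetical names, "low" is taken to mean strictly below `ServiceLevelThreshold`, and candidates are tried round-robin in `ServerArray` order. Resolving a server URI to an endpoint URL and transferring subscriptions are out of scope here.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical options shape mirroring `Redundancy: { Enabled, ServiceLevelThreshold }`.
public sealed record RedundancyOptions(bool Enabled = false, byte ServiceLevelThreshold = 200);

public static class FailoverPolicy
{
    // Returns the next server URI to try, or null when no failover should happen.
    public static string? SelectFailoverTarget(
        RedundancyOptions options,
        IReadOnlyList<string> serverArray,
        string activeUri,
        byte serviceLevel)
    {
        // Redundancy disabled, or the active server is still healthy enough.
        if (!options.Enabled || serviceLevel >= options.ServiceLevelThreshold)
            return null;

        // Locate the active server; -1 means it is not listed in the array.
        int active = -1;
        for (int i = 0; i < serverArray.Count; i++)
        {
            if (string.Equals(serverArray[i], activeUri, StringComparison.Ordinal))
            {
                active = i;
                break;
            }
        }

        // Walk the array round-robin, starting just after the active entry.
        for (int step = 1; step <= serverArray.Count; step++)
        {
            string candidate = serverArray[(active + step) % serverArray.Count];
            if (!string.Equals(candidate, activeUri, StringComparison.Ordinal))
                return candidate;
        }

        return null; // empty array, or the active server is the only entry
    }
}
```

With the default threshold of 200, a call such as `SelectFailoverTarget(new RedundancyOptions(Enabled: true), servers, activeUri, 100)` picks the entry after the active one, while a `ServiceLevel` of 200 or above leaves the session alone.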
**Tests**: with two opc-plc containers behind the driver,
artificially drop ServiceLevel on the active one and assert the
secondary takes over; assert subscription handles stay valid.
**Risks**: redundancy is the second-riskiest item after Reverse Connect.
The SDK's `TransferSubscriptions` has known edge cases when the
secondary's `SecureChannel` rejects the source-channel's authentication
token; document that the secondary must trust the same client cert as the
primary.
**Docs / fixture / e2e**: new "Upstream redundancy (`ServerArray`)"
section in `docs/drivers/OpcUaClient.md` with the ServiceLevel
threshold, the shared-cert prerequisite for `TransferSubscriptions`,
and the ops runbook for forcing a failover; cross-link from
`docs/Redundancy.md` (which today covers OUR server's redundancy —
add a "vs upstream-side redundancy" note). Fixture: extend
`Docker/docker-compose.yml` with a second `opc-plc-secondary` service
on a different port; `OpcPlcFixture` gains a multi-endpoint variant.
Integration test drops the active server's ServiceLevel and asserts
the secondary takes over with subscription handles intact. E2E: add a
`-PrimaryUrl` / `-SecondaryUrl` pair to
`scripts/e2e/test-opcuaclient.ps1` (and matching keys to
`scripts/e2e/e2e-config.sample.json`) that scripts a primary stop +
asserts the bridge stage continues to pass.
---
## Documentation, fixture, and e2e impact
Consolidated index of every doc page, fixture asset, and e2e script touched
by the plan above. Authoritative for review — if a PR's `Docs / fixture /
e2e` line references a path not listed here, that's a checklist gap.
### Driver user docs
- `docs/drivers/OpcUaClient.md` — **create on first PR that needs it
(PR-1)** if not present, then extend with one section per PR (PR-1
through PR-14) covering: subscription tuning, per-tag deadband, OperationLimits
handling, diagnostics counters, CRL/SHA1, FindServers, curation,
type mirroring, methods, ModelChangeEvent, Reverse Connect, history
events, aggregates, upstream redundancy.
- `docs/drivers/OpcUaClient-Test-Fixture.md` — coverage map updated for
curation (PR-7), Reverse Connect (PR-11), aggregates note (PR-13),
redundancy multi-endpoint variant (PR-14).
- `docs/Client.CLI.md` — extended for subscribe deadband syntax (PR-2),
any `discover` command (PR-6), `call` command (PR-9), `historyread`
event mode (PR-12), `--aggregate` enum expansion (PR-13).
- `docs/Client.UI.md` — extended for Subscriptions tab deadband fields
(PR-2), Browse-tree type rendering note (PR-8), Method-call surface
(PR-9) if it ships.
- `docs/security.md` — cross-link from PR-5 (CRL/SHA1 knobs).
- `docs/Redundancy.md` — cross-link from PR-14 (note distinguishing
server-side redundancy from upstream-side redundancy).
### Fixture assets
- `tests/ZB.MOM.WW.OtOpcUa.Driver.OpcUaClient.IntegrationTests/Docker/docker-compose.yml`
— add `opc-plc-rc` (PR-11) and `opc-plc-secondary` (PR-14) service
variants; optional secured endpoint (PR-5).
- `tests/ZB.MOM.WW.OtOpcUa.Driver.OpcUaClient.IntegrationTests/OpcPlcFixture.cs`
— discovery probe at collection init (PR-6), reverse-connect listener
(PR-11), multi-endpoint variant (PR-14), model-change helper (PR-10).
- `tests/ZB.MOM.WW.OtOpcUa.Driver.OpcUaClient.IntegrationTests/OpcPlcProfile.cs`
— flag noisy analogs for deadband (PR-2), enumerate exercised
namespaces for curation (PR-7), record at least one custom ObjectType
(PR-8).
- New integration tests added per PR; all live under the existing
`tests/ZB.MOM.WW.OtOpcUa.Driver.OpcUaClient.IntegrationTests/`
collection.
- Test certs (PR-5): SHA1-signed + revoked test fixtures checked into
the unit-test project's resources.
### E2E scripts
- `scripts/e2e/test-opcuaclient.ps1` — new stages added per PR (subscription
tuning PR-1, deadband PR-2, diagnostics PR-4, CRL PR-5, discovery
PR-6, curation PR-7, method call PR-9, topology change PR-10,
reverse connect PR-11, history events PR-12, redundancy failover
PR-14). The script is the single integration point for every
driver-level e2e — keep the stages ordered top-down by phase.
- `scripts/e2e/e2e-config.sample.json` — new keys: `deadband`,
`discoveryUrl`, `includePath`, `namespaceRemap`, `methodNodeId`,
`reverseConnect`, `primaryUrl`, `secondaryUrl`.
- `scripts/e2e/test-all.ps1` — no structural change; the existing
`opcuaclient` block forwards new params after wiring them through
`e2e-config.sample.json`.
### Cross-driver impact (PR-12 — `IHistoryProvider.ReadEventsAsync`)
PR-12 changes the `IHistoryProvider.ReadEventsAsync` signature in
`Core.Abstractions` (or introduces a sibling `IEventHistoryProvider`
— pinned in PR-12 review per Open Question 2). A signature change is
source-breaking for every driver that opts into history, so PR-12 must
add an explicit "interface change — adopt new signature when this
driver implements `ReadEventsAsync`" note to:
- `docs/plans/abcip-plan.md`
- `docs/plans/ablegacy-plan.md`
- `docs/plans/focas-plan.md`
- `docs/plans/s7-plan.md`
- `docs/plans/twincat-plan.md`
- The Galaxy plan family — `docs/plans/galaxy-*.md` if/when those
pages exist; Galaxy is the only current implementor and gets the
real signature update in PR-12, not just a note.
- The LMX plan — `docs/plans/lmx-*.md` if/when it lands (current state:
the LMX driver's history surface is implicit through Galaxy; revisit
during PR-12 review).
- A Modbus plan page if/when one exists; Modbus does not implement
history today but the heads-up note tracks the cross-driver shape.
The cross-driver note text should be a one-paragraph "Heads up: the
`IHistoryProvider.ReadEventsAsync` interface gained an
`EventFilterSpec` parameter in OpcUaClient PR-12 (`docs/plans/opcuaclient-plan.md`).
If/when this driver implements event-history, adopt the new signature."
This pattern keeps each driver plan stable while the cross-cutting
breakage is owned by one PR.
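
To make the two candidate shapes concrete, here is a rough sketch of both, plus the field projection an event-history driver would apply. Everything in it is hypothetical: the real `ReadEventsAsync` signature, the members of `EventFilterSpec`, and the result type are decided in PR-12 review. Returning `null` for a selected field the event does not carry is a choice made for this sketch only.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical placeholder types; names and members are assumptions.
public sealed record EventFilterSpec(IReadOnlyList<string> SelectFields);
public sealed record HistoricalEvent(IReadOnlyDictionary<string, object?> Fields);

// Option A: one interface. Every history-capable driver sees the new parameter.
public interface IHistoryProvider
{
    Task<IReadOnlyList<HistoricalEvent>> ReadEventsAsync(
        string sourceNodeId, DateTime startUtc, DateTime endUtc,
        EventFilterSpec filter, CancellationToken ct);
}

// Option B: sibling interface. Only event-history drivers implement it.
public interface IEventHistoryProvider
{
    Task<IReadOnlyList<HistoricalEvent>> ReadEventsAsync(
        string sourceNodeId, DateTime startUtc, DateTime endUtc,
        EventFilterSpec filter, CancellationToken ct);
}

public static class EventProjection
{
    // Keeps only the fields named in the filter, in filter order.
    public static HistoricalEvent Project(HistoricalEvent source, EventFilterSpec filter)
    {
        var projected = new Dictionary<string, object?>();
        foreach (string field in filter.SelectFields)
        {
            // Fields the event does not carry are reported as null.
            projected[field] = source.Fields.TryGetValue(field, out object? value) ? value : null;
        }
        return new HistoricalEvent(projected);
    }
}
```

Under Option A a driver that never stores events still has to compile against the new parameter; under Option B it simply does not implement the sibling interface.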
---
## Skip-rated items (for context)
These featuregaps rows are **Build = No** and intentionally omitted from
the plan above:
| # | Gap | Why we're skipping |
| :---: | --- | --- |
| 3 | Multicast / LDS-ME registration | Server-side responsibility, not the aggregator's. |
| 4 | GDS push management (Part 12) | Significant infra; rare for our deployment scale. |
| 11 | HistoryUpdate / Modified / Annotation passthrough | MES backfill scope; defer. |
| 16 | Connection / session pooling for multi-instance scale-out | Premature; current per-instance model is simple and adequate. |
| 18 | Kerberos / OAuth2 / JWT identity | Significant security work; defer until AD integration drives it (separate workstream). |
| 19 | Write attribute scope beyond `Value` | Niche; rarely used in OPC UA practice. |
If any of these get prioritized later they slot cleanly between the phases
above — none have prerequisites among the Build = Yes items.
## Open questions
1. **`ISubscribable` overload vs new method (PR-2)**: per-tag spec
carrier is needed for deadband; do we extend the existing
`SubscribeAsync` overload or add `SubscribeWithSpecsAsync`? The
former is source-breaking but cleaner; the latter is additive but
leaves two parallel paths.
2. **`IHistoryProvider.ReadEventsAsync` shape (PR-12)**: does the
`EventFilterSpec` parameter live on `IHistoryProvider` (one interface,
every driver gets it) or on a sibling `IEventHistoryProvider` (two
interfaces, only event-history drivers implement)? Memory entry
suggests the former; preference depends on whether non-OPC-UA drivers
ever expect to project arbitrary event fields. **Pin this in PR-12
review.**
3. **`IMethodInvoker` capability (PR-9)**: does this become the 9th
capability interface (currently 8/8) or is it folded into
`IWritable` as a method-invoke variant? Adding a 9th interface is
the cleaner model and matches the spec layering.
4. **Type mirroring address-space surface (PR-8)**: does
`IAddressSpaceBuilder` already accept type nodes? If yes, PR-8 is
straightforward; if no, it splits into a prerequisite PR-8a that
extends the builder, then PR-8b for the OpcUaClient wire-up. The
answer determines whether PR-8 ships in Phase 2 or slips to a later
phase.
5. **Reverse Connect listener ownership (PR-11)**: one listener per
driver instance (port collision when multiple reverse-connect
drivers run in the same process) vs one shared listener with an
`expectedServerUri` dispatcher. Shared is the right answer; pin
the singleton lifetime to the driver-host.
6. **Phase 1 ship order**: PR-1, PR-3, PR-4, PR-5 are independent and can
land in parallel. PR-2 depends on the `ISubscribable` interface
decision (Q1) — recommend landing PR-1 first to validate the
`OpcUaSubscriptionDefaults` shape, then PR-2.
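
For question 5, the shared-listener option reduces to a small routing table keyed by `expectedServerUri`. The sketch below is an assumption-laden illustration: `ReverseConnectDispatcher` is a hypothetical name, it does not use the SDK's reverse-connect types, and the socket listener that would call `Dispatch` is left out. It only shows that one process-wide instance can route inbound connections to many driver instances without a port per driver.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical process-wide dispatcher owned by the driver host.
public sealed class ReverseConnectDispatcher
{
    private readonly ConcurrentDictionary<string, Action<string>> _handlers =
        new(StringComparer.Ordinal);

    // A driver instance registers the server URI it expects to hear from.
    // Disposing the returned handle removes the registration.
    public IDisposable Register(string expectedServerUri, Action<string> onConnection)
    {
        if (!_handlers.TryAdd(expectedServerUri, onConnection))
            throw new InvalidOperationException(
                $"A driver is already registered for '{expectedServerUri}'.");
        return new Registration(() => _handlers.TryRemove(expectedServerUri, out _));
    }

    // Called by the single shared listener for each inbound connection.
    // Returns false when no driver is waiting for that server URI.
    public bool Dispatch(string serverUri, string endpointUrl)
    {
        if (!_handlers.TryGetValue(serverUri, out Action<string>? handler))
            return false;
        handler(endpointUrl);
        return true;
    }

    private sealed class Registration : IDisposable
    {
        private Action? _remove;
        public Registration(Action remove) => _remove = remove;

        public void Dispose()
        {
            _remove?.Invoke();
            _remove = null; // make repeated Dispose calls harmless
        }
    }
}
```

Tying each registration to the driver instance's own lifetime (register on initialize, dispose on shutdown) keeps the singleton from holding callbacks into drivers that have already stopped.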