Per-kind override shape: each hook receives the pre-filtered nodesToProcess list (NodeHandles for nodes this manager claimed), iterates them, resolves handle.NodeId.Identifier to the driver-side full reference string, and dispatches to the right IHistoryProvider method. Write back into the outer results + errors slots at handle.Index (not the local loop counter — nodesToProcess is a filtered subset of nodesToRead, so indexing by the loop counter lands in the wrong slot for mixed-manager batches).

WriteResult helper sets both results[i] AND errors[i]; this matters because MasterNodeManager merges them and leaving errors[i] at its default (BadHistoryOperationUnsupported) overrides a Good result with Unsupported on the wire — this was the subtle failure mode that masked a correctly-constructed HistoryData response during debugging.

Failure-isolation per node: NotSupportedException from a driver that doesn't implement a particular HistoryProvider method translates to BadHistoryOperationUnsupported in that slot; generic exceptions log and surface BadInternalError; unresolvable NodeIds get BadNodeIdUnknown. The batch continues unconditionally.

Aggregate mapping: MapAggregate translates ObjectIds.AggregateFunction_Average / Minimum / Maximum / Total / Count to the driver's HistoryAggregateType enum. Null for anything else (e.g. TimeAverage, Interpolative) so the handler surfaces BadAggregateNotSupported at the batch level — per Part 13, one unsupported aggregate means the whole request fails since ReadProcessedDetails carries one aggregate list for all nodes.

BuildHistoryData wraps driver DataValueSnapshots as Opc.Ua.HistoryData in an ExtensionObject; BuildHistoryEvent wraps HistoricalEvents as Opc.Ua.HistoryEvent with the canonical BaseEventType field list (EventId, SourceName, Message, Severity, Time, ReceiveTime — the order OPC UA clients that didn't customize the SelectClause expect). ToDataValue preserves null SourceTimestamp (Galaxy historian rows often carry only ServerTimestamp) — synthesizing a SourceTimestamp would lie about actual sample time.

Two address-space changes were required to make the stack dispatch reach the per-kind hooks at all: (1) historized variables get AccessLevels.HistoryRead added to their AccessLevel byte — the base's early-gate check on ((variable.AccessLevel & HistoryRead) != 0) was rejecting requests before our override ever ran; (2) the driver-root folder gets EventNotifiers.HistoryRead | SubscribeToEvents so HistoryReadEvents can target it (the conventional pattern for alarm-history browse against a driver-owned object). Document the 'set both bits' requirement inline since it's not obvious from the surface API.

OpcHistoryReadResult alias: Opc.Ua.HistoryReadResult (service-layer per-node result) collides with Core.Abstractions.HistoryReadResult (driver-side samples + continuation point) by type name; the alias 'using OpcHistoryReadResult = Opc.Ua.HistoryReadResult' keeps the override signatures unambiguous and the test project applies the mirror pattern for its stub driver impl.

Tests — DriverNodeManagerHistoryMappingTests (12 new Category=Unit cases): MapAggregate translates each supported aggregate NodeId via reflection-backed theory (guards against the stack renaming AggregateFunction_* constants); returns null for unsupported NodeIds (TimeAverage) and null input; BuildHistoryData wraps samples with correct DataValues + SourceTimestamp preservation; BuildHistoryEvent emits the 6-element BaseEventType field list in canonical order (regression guard for a future 'respect the client's SelectClauses' change); null SourceName / Message translate to empty-string Variants (nullable-Variant refactor trap); ToDataValue preserves StatusCode + both timestamps; ToDataValue leaves SourceTimestamp at default when the snapshot omits it.

HistoryReadIntegrationTests (5 new Category=Integration): drives a real OPC UA client Session.HistoryRead against a fake HistoryDriver through the running server. Covers raw round-trip (verifies per-node DataValue ordering + values); processed with Average aggregate (captures the driver's received aggregate + interval, asserting MapAggregate routed correctly); unsupported aggregate (TimeAverage → BadAggregateNotSupported); at-time (forwards the per-timestamp list); events (BaseEventType field list shape, SelectClauses populated to satisfy the stack's filter validator).

Server.Tests Unit: 55 pass / 0 fail (43 prior + 12 new mapping). Server.Tests Integration: 14 pass / 0 fail (9 prior + 5 new history). Full solution build clean, 0 errors.

lmx-followups.md #1 updated to 'DONE (PRs 35 + 38)' with two explicit deferred items: continuation-point plumbing (driver returns null today so pass-through is fine) and per-SelectClause evaluation in HistoryReadEvents (clients with custom field selections get the canonical BaseEventType layout today).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
181 lines
9.6 KiB
Markdown
# LMX Galaxy bridge — remaining follow-ups

State after PR 19: the Galaxy driver is functionally at v1 parity through the
`IDriver` abstraction; the OPC UA server runs with LDAP-authenticated
Basic256Sha256 endpoints and alarms are observable through
`AlarmConditionState.ReportEvent`. The items below are what remains
LMX-specific before the stack can fully replace the v1 deployment, in
rough priority order.

## 1. Proxy-side `IHistoryProvider` for `ReadAtTime` / `ReadEvents` — **DONE (PRs 35 + 38)**

PR 35 extended `IHistoryProvider` with `ReadAtTimeAsync` + `ReadEventsAsync`
(default throwing implementations so existing impls keep compiling), added the
`HistoricalEvent` + `HistoricalEventsResult` records to `Core.Abstractions`,
and implemented both methods in `GalaxyProxyDriver` on top of the PR 10 / PR 11
IPC messages.

PR 38 wired the OPC UA HistoryRead service-handler through
`DriverNodeManager` by overriding `CustomNodeManager2`'s four per-kind hooks —
`HistoryReadRawModified` / `HistoryReadProcessed` / `HistoryReadAtTime` /
`HistoryReadEvents`. Each walks `nodesToProcess`, resolves the driver-side
full reference from `NodeId.Identifier`, dispatches to the right
`IHistoryProvider` method, and populates the paired results + errors lists
(both must be set — the MasterNodeManager merges them and a Good result with
an unset error slot serializes as `BadHistoryOperationUnsupported` on the
wire). Historized variables gain `AccessLevels.HistoryRead` so the stack
dispatches; the driver root folder gains `EventNotifiers.HistoryRead` so
`HistoryReadEvents` can target it.

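The write-back rule is easiest to see in isolation. The following is a minimal sketch, not the `DriverNodeManager` code: `Status`, `Handle`, and `HistoryWriteBackSketch` are hypothetical stand-ins for the stack's status codes and `NodeHandle`, and both lists are simplified to hold a bare status (the real results list carries history payloads). It illustrates two points: every node writes both slots, and the slot is chosen by the handle's index into the outer request, not by the loop position.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins; the real override works with the stack's own types.
public enum Status { Good, BadHistoryOperationUnsupported, BadInternalError, BadNodeIdUnknown }

// Index is the node's position in the OUTER request; FullReference is null
// when the NodeId could not be resolved to a driver-side reference.
public sealed record Handle(int Index, string? FullReference);

public static class HistoryWriteBackSketch
{
    // Sets BOTH slots so a merged response never keeps a stale default error.
    public static void WriteResult(IList<Status> results, IList<Status> errors, int index, Status status)
    {
        results[index] = status;
        errors[index] = status;
    }

    public static void Process(
        IReadOnlyList<Handle> nodesToProcess,
        Func<string, Status> readFromDriver,
        IList<Status> results,
        IList<Status> errors)
    {
        foreach (var handle in nodesToProcess)
        {
            Status status;
            if (handle.FullReference is null)
            {
                status = Status.BadNodeIdUnknown;
            }
            else
            {
                try { status = readFromDriver(handle.FullReference); }
                catch (NotSupportedException) { status = Status.BadHistoryOperationUnsupported; }
                catch (Exception) { status = Status.BadInternalError; } // real code would also log
            }

            // nodesToProcess is a filtered subset, so use handle.Index, not a loop counter.
            WriteResult(results, errors, handle.Index, status);
        }
    }
}
```

A failure in one node only affects that node's slot; the loop carries on to the next handle.
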
Aggregate translation uses a small `MapAggregate` helper that handles
`Average` / `Minimum` / `Maximum` / `Total` / `Count` (the enum surface the
driver exposes) and returns null for unsupported aggregates so the handler
can surface `BadAggregateNotSupported`. Raw+Processed+AtTime wrap driver
samples as `HistoryData` in an `ExtensionObject`; Events emits a
`HistoryEvent` with the standard BaseEventType field list (EventId /
SourceName / Message / Severity / Time / ReceiveTime) — custom
`SelectClause` evaluation is an explicit follow-up.

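As an illustration of the null-for-unsupported contract, here is a small sketch under stated assumptions. The real helper takes the stack's aggregate NodeIds; this version uses plain strings named after the `AggregateFunction_*` constants so it stays self-contained. `AggregateMappingSketch` and `TryMapAll` are hypothetical names, and the enum only mirrors the five members listed above.

```csharp
using System.Collections.Generic;

// Mirrors the five aggregates named in the text; the real enum lives driver-side.
public enum HistoryAggregateType { Average, Minimum, Maximum, Total, Count }

public static class AggregateMappingSketch
{
    // String keys stand in for the stack's AggregateFunction_* NodeIds (assumption).
    private static readonly Dictionary<string, HistoryAggregateType> Supported = new()
    {
        ["AggregateFunction_Average"] = HistoryAggregateType.Average,
        ["AggregateFunction_Minimum"] = HistoryAggregateType.Minimum,
        ["AggregateFunction_Maximum"] = HistoryAggregateType.Maximum,
        ["AggregateFunction_Total"] = HistoryAggregateType.Total,
        ["AggregateFunction_Count"] = HistoryAggregateType.Count,
    };

    // Null means "unsupported"; null input is treated the same way.
    public static HistoryAggregateType? MapAggregate(string? aggregateId) =>
        aggregateId is not null && Supported.TryGetValue(aggregateId, out var mapped) ? mapped : null;

    // A single unsupported aggregate rejects the whole list, so the caller can
    // fail the request with BadAggregateNotSupported instead of a partial answer.
    public static bool TryMapAll(IEnumerable<string?> aggregateIds, out List<HistoryAggregateType> mapped)
    {
        mapped = new List<HistoryAggregateType>();
        foreach (var id in aggregateIds)
        {
            var one = MapAggregate(id);
            if (one is null) return false;
            mapped.Add(one.Value);
        }
        return true;
    }
}
```
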
**Tests**:

- `DriverNodeManagerHistoryMappingTests` — 12 unit cases pinning
  `MapAggregate`, `BuildHistoryData`, `BuildHistoryEvent`, `ToDataValue`.
- `HistoryReadIntegrationTests` — 5 end-to-end cases drive a real OPC UA
  client (`Session.HistoryRead`) against a fake `IHistoryProvider` driver
  through the running stack. Covers raw round-trip, processed with Average
  aggregate, unsupported aggregate → `BadAggregateNotSupported`, at-time
  timestamp forwarding, and events field-list shape.

**Deferred**:
- Continuation-point plumbing via `Session.Save/RestoreHistoryContinuationPoint`.
  Driver returns null continuations today so the pass-through is fine.
- Per-`SelectClause` evaluation in HistoryReadEvents — clients that send a
  custom field selection currently get the standard BaseEventType layout.

## 2. Write-gating by role — **DONE (PR 26)**

Landed in PR 26. `WriteAuthzPolicy` in `Server/Security/` maps
`SecurityClassification` → required role (`FreeAccess` → no role required,
`Operate`/`SecuredWrite` → `WriteOperate`, `Tune` → `WriteTune`,
`Configure`/`VerifiedWrite` → `WriteConfigure`, `ViewOnly` → deny regardless).
`DriverNodeManager` caches the classification per variable during discovery and
checks the session's roles (via `IRoleBearer`) in `OnWriteValue` before calling
`IWritable.WriteAsync`. Roles do not cascade — a session with `WriteOperate`
can't write a `Tune` attribute unless it also carries `WriteTune`.

See `feedback_acl_at_server_layer.md` in memory for the architectural directive
that authz stays at the server layer and never delegates to driver-specific auth.

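The classification-to-role table in this section can be sketched as a single switch. This is illustrative only: `WriteAuthzSketch` is a hypothetical name, and representing roles as strings is an assumption (the actual `WriteAuthzPolicy` and `IRoleBearer` shapes are not shown in this document). The enum members are the classifications named above.

```csharp
using System.Collections.Generic;

public enum SecurityClassification
{
    FreeAccess, Operate, SecuredWrite, Tune, Configure, VerifiedWrite, ViewOnly
}

public static class WriteAuthzSketch
{
    // Each classification requires exactly one role; holding a different
    // write role never substitutes for it (no cascading).
    public static bool IsAllowed(SecurityClassification classification, ICollection<string> sessionRoles) =>
        classification switch
        {
            SecurityClassification.FreeAccess => true,
            SecurityClassification.ViewOnly => false, // denied regardless of roles
            SecurityClassification.Operate or SecurityClassification.SecuredWrite
                => sessionRoles.Contains("WriteOperate"),
            SecurityClassification.Tune => sessionRoles.Contains("WriteTune"),
            SecurityClassification.Configure or SecurityClassification.VerifiedWrite
                => sessionRoles.Contains("WriteConfigure"),
            _ => false, // unknown classifications fail closed
        };
}
```

With this shape, a session carrying only `WriteOperate` is refused on a `Tune` attribute, which matches the non-cascading rule stated above.
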
## 3. Admin UI client-cert trust management — **DONE (PR 28)**

PR 28 shipped `/certificates` in the Admin UI. `CertTrustService` reads the OPC
UA server's PKI store root (`OpcUaServerOptions.PkiStoreRoot` — default
`%ProgramData%\OtOpcUa\pki`) and lists rejected + trusted certs by parsing the
`.der` files directly, so it has no `Opc.Ua` dependency and runs on any
Admin host that can reach the shared PKI directory.

Operator actions: Trust (moves `rejected/certs/*.der` → `trusted/certs/*.der`),
Delete rejected, Revoke trust. The OPC UA stack re-reads the trusted store on
each new client handshake, so no explicit reload signal is needed —
operators retry the rejected client's connection after trusting.

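Because the Trust action is a plain file move inside the PKI store, it can be sketched with `System.IO` alone. The code below is an assumption-laden illustration, not `CertTrustService`: the class and method names are hypothetical, the file-name guard is an added precaution, and only the `rejected/certs` and `trusted/certs` layout comes from the text above.

```csharp
using System;
using System.IO;

public static class CertTrustSketch
{
    // Moves one rejected certificate file into the trusted store and
    // returns the new path. No stack reload is triggered here.
    public static string Trust(string pkiStoreRoot, string derFileName)
    {
        // Accept only a bare file name so a crafted value cannot escape the store.
        if (Path.GetFileName(derFileName) != derFileName)
            throw new ArgumentException("Expected a bare file name.", nameof(derFileName));

        var source = Path.Combine(pkiStoreRoot, "rejected", "certs", derFileName);
        var trustedDir = Path.Combine(pkiStoreRoot, "trusted", "certs");
        Directory.CreateDirectory(trustedDir); // no-op when it already exists

        var destination = Path.Combine(trustedDir, derFileName);
        File.Move(source, destination, overwrite: true);
        return destination;
    }
}
```
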
Deferred: flipping `AutoAcceptUntrustedClientCertificates` to `false` as the
deployment default. That's a production-hardening config change, not a code
gap — the Admin UI is now ready to be the trust gate.

## 4. Live-LDAP integration test — **DONE (PR 31)**

PR 31 shipped `Server.Tests/LdapUserAuthenticatorLiveTests.cs` — 6 live-bind
tests against the dev GLAuth instance at `localhost:3893`, skipped cleanly
when the port is unreachable. Covers: valid bind, wrong password, unknown
user, empty credentials, single-group → WriteOperate mapping, multi-group
admin user surfacing all mapped roles.

Also added `UserNameAttribute` to `LdapOptions` (default `uid` for RFC 2307
compat) so Active Directory deployments can configure `sAMAccountName` /
`userPrincipalName` without code changes. `LdapUserAuthenticatorAdCompatTests`
(5 unit guards) pins the AD-shape DN parsing + filter escape behaviors. See
`docs/security.md` §"Active Directory configuration" for the AD appsettings
snippet.

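To show why a configurable attribute and filter escaping go together, here is a sketch of building a user search filter. It applies the standard RFC 4515 escapes; the project's own escaping code is not shown in this document, so treat `LdapFilterSketch` and both method names as hypothetical.

```csharp
using System.Text;

public static class LdapFilterSketch
{
    // Escapes the characters RFC 4515 reserves inside a filter value.
    public static string Escape(string value)
    {
        var sb = new StringBuilder(value.Length);
        foreach (var ch in value)
        {
            sb.Append(ch switch
            {
                '\\' => @"\5c",
                '*' => @"\2a",
                '(' => @"\28",
                ')' => @"\29",
                '\0' => @"\00",
                _ => ch.ToString(),
            });
        }
        return sb.ToString();
    }

    // userNameAttribute would come from configuration, e.g. "uid" or "sAMAccountName".
    public static string BuildUserFilter(string userNameAttribute, string userName) =>
        $"({userNameAttribute}={Escape(userName)})";
}
```

For example, `BuildUserFilter("sAMAccountName", "j*doe")` yields `(sAMAccountName=j\2adoe)`, so a user-supplied wildcard cannot widen the search.
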
Deferred: asserting `session.Identity` end-to-end on the server side (i.e.
drive a full OPC UA session with username/password, then read an
`IHostConnectivityProbe`-style "whoami" node to verify the role surfaced).
That needs a test-only address-space node and is a separate PR.

## 5. Full Galaxy live-service smoke test against the merged v2 stack — **IN PROGRESS (PRs 36 + 37)**

PR 36 shipped the prerequisites helper (`AvevaPrerequisites`) that probes
every dependency a live smoke test needs and produces actionable skip
messages.

PR 37 shipped the live-stack smoke test project structure:
`tests/Driver.Galaxy.Proxy.Tests/LiveStack/` with `LiveStackFixture` (connects
to the *already-running* `OtOpcUaGalaxyHost` Windows service via named pipe;
never spawns the Host process) and `LiveStackSmokeTests` covering:

- Fixture initializes successfully (IPC handshake succeeds end-to-end).
- Driver reports `DriverState.Healthy` post-handshake.
- `DiscoverAsync` returns at least one variable from the live Galaxy.
- `GetHostStatuses` reports at least one Platform/AppEngine host.
- `ReadAsync` on a discovered variable round-trips through
  Proxy → Host pipe → MXAccess → back without a BadInternalError.

Shared secret + pipe name resolve from `OTOPCUA_GALAXY_SECRET` /
`OTOPCUA_GALAXY_PIPE` env vars, falling back to reading the service's
registry-stored Environment values (requires elevated test host).

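The env-var-first lookup described above can be sketched as follows. The registry read is abstracted behind a delegate because its exact key path is not given here; `LiveStackSettingsSketch`, `Resolve`, and the `fallback` parameter are hypothetical names, not the fixture's API.

```csharp
using System;

public static class LiveStackSettingsSketch
{
    // Prefers the process environment; consults the fallback only when the
    // variable is missing or empty. The fallback may itself return null.
    public static string? Resolve(string envVarName, Func<string, string?> fallback)
    {
        var value = Environment.GetEnvironmentVariable(envVarName);
        return string.IsNullOrEmpty(value) ? fallback(envVarName) : value;
    }
}
```

A fixture could call `Resolve("OTOPCUA_GALAXY_PIPE", ReadFromServiceRegistry)`, where `ReadFromServiceRegistry` is whatever elevated lookup the test host supports, and skip the test when the result is null.
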
**Remaining**:
- Install + run the `OtOpcUaGalaxyHost` + `OtOpcUa` services on the dev box
  (`scripts/install/Install-Services.ps1`) so the skip-on-unready tests
  actually execute and the smoke PR lands green.
- Subscribe-and-receive-data-change fact (needs a known tag that actually
  ticks; deferred until operators confirm a scratch tag exists).
- Write-and-roundtrip fact (needs a test-only UDA or agreed scratch tag
  so we can't accidentally mutate a process-critical value).

## 6. Second driver instance on the same server — **DONE (PR 32)**

`Server.Tests/MultipleDriverInstancesIntegrationTests.cs` registers two
drivers with distinct `DriverInstanceId`s on one `DriverHost`, spins up the
full OPC UA server, and asserts three behaviors: (1) each driver's namespace
URI (`urn:OtOpcUa:{id}`) resolves to a distinct index in the client's
NamespaceUris, (2) browsing one subtree returns that driver's folder and
does NOT leak the other driver's folder, (3) reads route to the correct
driver — the alpha instance returns 42 while beta returns 99, so a misroute
would surface at the assertion layer.

Deferred: the alarm-event multi-driver parity case (two drivers each raising
a `GalaxyAlarmEvent`, assert each condition lands on its owning instance's
condition node). Alarm tracking already has its own integration test
(`AlarmSubscription*`); the multi-driver alarm case would need a stub
`IAlarmSource` that's worth its own focused PR.

## 7. Host-status per-AppEngine granularity → Admin UI dashboard — **DONE (PRs 33 + 34)**

**PR 33** landed the data layer: `DriverHostStatus` entity + migration with
composite key `(NodeId, DriverInstanceId, HostName)` and two query-supporting
indexes (per-cluster drill-down on `NodeId`, stale-row detection on
`LastSeenUtc`).

**PR 34** wired the publisher + consumer. `HostStatusPublisher` is a
`BackgroundService` in the Server process that walks every registered
`IHostConnectivityProbe`-capable driver every 10s, calls
`GetHostStatuses()`, and upserts rows (`LastSeenUtc` advances each tick;
`State` + `StateChangedUtc` update on transitions). Admin UI `/hosts` page
groups by cluster, shows four summary cards (Hosts / Running / Stale /
Faulted), and flags rows whose `LastSeenUtc` is older than 30s as Stale so
operators see crashed Servers without waiting for a state change.

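The tick and staleness rules from PR 34 reduce to two small functions. This sketch keeps the row in memory and models `State` as a string for brevity; both are assumptions, as are the `DriverHostStatusRow` and `HostStatusSketch` names. The 30s threshold and the column names come from the description above.

```csharp
using System;

// In-memory stand-in for one DriverHostStatus row (only the columns used here).
public sealed class DriverHostStatusRow
{
    public string State { get; set; } = "";
    public DateTime StateChangedUtc { get; set; }
    public DateTime LastSeenUtc { get; set; }
}

public static class HostStatusSketch
{
    public static readonly TimeSpan StaleAfter = TimeSpan.FromSeconds(30);

    // Publisher tick: LastSeenUtc always advances; State and StateChangedUtc
    // change only when the observed state differs from the stored one.
    public static void ApplyTick(DriverHostStatusRow row, string observedState, DateTime nowUtc)
    {
        row.LastSeenUtc = nowUtc;
        if (row.State != observedState)
        {
            row.State = observedState;
            row.StateChangedUtc = nowUtc;
        }
    }

    // Dashboard side: a row is Stale once its heartbeat is older than the threshold,
    // whatever its last recorded State says.
    public static bool IsStale(DriverHostStatusRow row, DateTime nowUtc) =>
        nowUtc - row.LastSeenUtc > StaleAfter;
}
```

With a 10s publish interval, a row survives two missed ticks before the third pushes it past the 30s threshold, which is why a crashed Server shows up as Stale even though its last `State` may still read Running.
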
Deferred as follow-ups:

- Event-driven push (subscribe to `OnHostStatusChanged` per driver for
  sub-heartbeat latency). Adds DriverHost lifecycle-event plumbing;
  10s polling is fine for operator-scale use.
- Failure-count column — needs the publisher to track a transition history
  per host, not just current-state.
- SignalR fan-out to the Admin page (currently the page polls the DB, not
  a hub). The DB-polled version is fine at current cadence but a hub push
  would eliminate the 10s race where a new row sits in the DB before the
  Admin page notices.