diff --git a/docs/README.md b/docs/README.md
index 09d3012..aa1c842 100644
--- a/docs/README.md
+++ b/docs/README.md
@@ -98,7 +98,6 @@ Design decisions + phase plans + execution notes. Load-bearing cross-references
 - [v2/test-data-sources.md](v2/test-data-sources.md) — integration-test simulator matrix (includes the pinned libplctag `ab_server` version for AB CIP tests)
 - [v2/multi-host-dispatch.md](v2/multi-host-dispatch.md) — per-PLC circuit breakers (Phase 6.1 decision #144)
 - [v2/v2-release-readiness.md](v2/v2-release-readiness.md) — release-readiness tracker
-- [v2/lmx-followups.md](v2/lmx-followups.md) — historical Galaxy-bridge follow-ups (pre-PR-7.2)
 - [v2/implementation/phase-*-*.md](v2/implementation/) — per-phase execution plans with exit-gate evidence
 
 ## v1 archive
diff --git a/docs/v2/lmx-followups.md b/docs/v2/lmx-followups.md
deleted file mode 100644
index 1c36b1a..0000000
--- a/docs/v2/lmx-followups.md
+++ /dev/null
@@ -1,195 +0,0 @@
-# LMX Galaxy bridge — remaining follow-ups
-
-State after PR 19: the Galaxy driver is functionally at v1 parity through the
-`IDriver` abstraction; the OPC UA server runs with LDAP-authenticated
-Basic256Sha256 endpoints and alarms are observable through
-`AlarmConditionState.ReportEvent`. The items below are what remains LMX-
-specific before the stack can fully replace the v1 deployment, in
-rough priority order.
-
-## 1. Proxy-side `IHistoryProvider` for `ReadAtTime` / `ReadEvents` — **DONE (PRs 35 + 38)**
-
-PR 35 extended `IHistoryProvider` with `ReadAtTimeAsync` + `ReadEventsAsync`
-(default throwing implementations so existing impls keep compiling), added the
-`HistoricalEvent` + `HistoricalEventsResult` records to `Core.Abstractions`,
-and implemented both methods in `GalaxyProxyDriver` on top of the PR 10 / PR 11
-IPC messages.
-
-PR 38 wired the OPC UA HistoryRead service-handler through
-`DriverNodeManager` by overriding `CustomNodeManager2`'s four per-kind hooks —
-`HistoryReadRawModified` / `HistoryReadProcessed` / `HistoryReadAtTime` /
-`HistoryReadEvents`. Each walks `nodesToProcess`, resolves the driver-side
-full reference from `NodeId.Identifier`, dispatches to the right
-`IHistoryProvider` method, and populates the paired results + errors lists
-(both must be set — the MasterNodeManager merges them and a Good result with
-an unset error slot serializes as `BadHistoryOperationUnsupported` on the
-wire). Historized variables gain `AccessLevels.HistoryRead` so the stack
-dispatches; the driver root folder gains `EventNotifiers.HistoryRead` so
-`HistoryReadEvents` can target it.
-
-Aggregate translation uses a small `MapAggregate` helper that handles
-`Average` / `Minimum` / `Maximum` / `Total` / `Count` (the enum surface the
-driver exposes) and returns null for unsupported aggregates so the handler
-can surface `BadAggregateNotSupported`. Raw+Processed+AtTime wrap driver
-samples as `HistoryData` in an `ExtensionObject`; Events emits a
-`HistoryEvent` with the standard BaseEventType field list (EventId /
-SourceName / Message / Severity / Time / ReceiveTime) — custom
-`SelectClause` evaluation is an explicit follow-up.
-
-**Tests**:
-
-- `DriverNodeManagerHistoryMappingTests` — 12 unit cases pinning
-  `MapAggregate`, `BuildHistoryData`, `BuildHistoryEvent`, `ToDataValue`.
-- `HistoryReadIntegrationTests` — 5 end-to-end cases drive a real OPC UA
-  client (`Session.HistoryRead`) against a fake `IHistoryProvider` driver
-  through the running stack. Covers raw round-trip, processed with Average
-  aggregate, unsupported aggregate → `BadAggregateNotSupported`, at-time
-  timestamp forwarding, and events field-list shape.
-
-**Deferred**:
-- Continuation-point plumbing via `Session.Save/RestoreHistoryContinuationPoint`.
-  Driver returns null continuations today so the pass-through is fine.
-- Per-`SelectClause` evaluation in HistoryReadEvents — clients that send a
-  custom field selection currently get the standard BaseEventType layout.
-
-## 2. Write-gating by role — **DONE (PR 26)**
-
-Landed in PR 26. `WriteAuthzPolicy` in `Server/Security/` maps
-`SecurityClassification` → required role (`FreeAccess` → no role required,
-`Operate`/`SecuredWrite` → `WriteOperate`, `Tune` → `WriteTune`,
-`Configure`/`VerifiedWrite` → `WriteConfigure`, `ViewOnly` → deny regardless).
-`DriverNodeManager` caches the classification per variable during discovery and
-checks the session's roles (via `IRoleBearer`) in `OnWriteValue` before calling
-`IWritable.WriteAsync`. Roles do not cascade — a session with `WriteOperate`
-can't write a `Tune` attribute unless it also carries `WriteTune`.
-
-See `feedback_acl_at_server_layer.md` in memory for the architectural directive
-that authz stays at the server layer and never delegates to driver-specific auth.
-
-## 3. Admin UI client-cert trust management — **DONE (PR 28)**
-
-PR 28 shipped `/certificates` in the Admin UI. `CertTrustService` reads the OPC
-UA server's PKI store root (`OpcUaServerOptions.PkiStoreRoot` — default
-`%ProgramData%\OtOpcUa\pki`) and lists rejected + trusted certs by parsing the
-`.der` files directly, so it has no `Opc.Ua` dependency and runs on any
-Admin host that can reach the shared PKI directory.
-
-Operator actions: Trust (moves `rejected/certs/*.der` → `trusted/certs/*.der`),
-Delete rejected, Revoke trust. The OPC UA stack re-reads the trusted store on
-each new client handshake, so no explicit reload signal is needed —
-operators retry the rejected client's connection after trusting.
-
-Deferred: flipping `AutoAcceptUntrustedClientCertificates` to `false` as the
-deployment default. That's a production-hardening config change, not a code
-gap — the Admin UI is now ready to be the trust gate.
-
-## 4. Live-LDAP integration test — **DONE (PR 31)**
-
-PR 31 shipped `Server.Tests/LdapUserAuthenticatorLiveTests.cs` — 6 live-bind
-tests against the dev GLAuth instance at `localhost:3893`, skipped cleanly
-when the port is unreachable. Covers: valid bind, wrong password, unknown
-user, empty credentials, single-group → WriteOperate mapping, multi-group
-admin user surfacing all mapped roles.
-
-Also added `UserNameAttribute` to `LdapOptions` (default `uid` for RFC 2307
-compat) so Active Directory deployments can configure `sAMAccountName` /
-`userPrincipalName` without code changes. `LdapUserAuthenticatorAdCompatTests`
-(5 unit guards) pins the AD-shape DN parsing + filter escape behaviors. See
-`docs/security.md` §"Active Directory configuration" for the AD appsettings
-snippet.
-
-Deferred: asserting `session.Identity` end-to-end on the server side (i.e.
-drive a full OPC UA session with username/password, then read an
-`IHostConnectivityProbe`-style "whoami" node to verify the role surfaced).
-That needs a test-only address-space node and is a separate PR.
-
-## 5. Full Galaxy live-service smoke test against the merged v2 stack — **IN PROGRESS (PRs 36 + 37)**
-
-PR 36 shipped the prerequisites helper (`AvevaPrerequisites`) that probes
-every dependency a live smoke test needs and produces actionable skip
-messages.
-
-PR 37 shipped the live-stack smoke test project structure:
-`tests/Driver.Galaxy.Proxy.Tests/LiveStack/` with `LiveStackFixture` (connects
-to the *already-running* `OtOpcUaGalaxyHost` Windows service via named pipe;
-never spawns the Host process) and `LiveStackSmokeTests` covering:
-
-- Fixture initializes successfully (IPC handshake succeeds end-to-end).
-- Driver reports `DriverState.Healthy` post-handshake.
-- `DiscoverAsync` returns at least one variable from the live Galaxy.
-- `GetHostStatuses` reports at least one Platform/AppEngine host.
-- `ReadAsync` on a discovered variable round-trips through
-  Proxy → Host pipe → MXAccess → back without a BadInternalError.
-
-Shared secret + pipe name resolve from `OTOPCUA_GALAXY_SECRET` /
-`OTOPCUA_GALAXY_PIPE` env vars, falling back to reading the service's
-registry-stored Environment values (requires elevated test host).
-
-**PR 40** added the write + subscribe facts targeting
-`DelmiaReceiver_001.TestAttribute` (the writable Boolean UDA the dev Galaxy
-ships under TestMachine_001) — write-then-read with a 5s scan-window poll +
-restore-on-finally, and subscribe-then-write asserting both an initial-value
-OnDataChange and a post-write OnDataChange. PR 39 added the elevated-shell
-short-circuit so a developer running from an admin window gets an actionable
-skip instead of `UnauthorizedAccessException`.
-
-**Run the live tests** (from a NORMAL non-admin PowerShell):
-
-```powershell
-$env:OTOPCUA_GALAXY_SECRET = Get-Content C:\Users\dohertj2\Desktop\lmxopcua\.local\galaxy-host-secret.txt
-cd C:\Users\dohertj2\Desktop\lmxopcua
-dotnet test tests\ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy.Tests --filter "FullyQualifiedName~LiveStackSmokeTests"
-```
-
-Expected: 7/7 pass against the running `OtOpcUaGalaxyHost` service.
-
-**Remaining for #5 in production-grade form**:
-- Confirm the suite passes from a non-elevated shell (operator action).
-- Add similar facts for an alarm-source attribute once `TestMachine_001` (or
-  a sibling) carries a deployed alarm condition — the current dev Galaxy's
-  TestAttribute isn't alarm-flagged.
-
-## 6. Second driver instance on the same server — **DONE (PR 32)**
-
-`Server.Tests/MultipleDriverInstancesIntegrationTests.cs` registers two
-drivers with distinct `DriverInstanceId`s on one `DriverHost`, spins up the
-full OPC UA server, and asserts three behaviors: (1) each driver's namespace
-URI (`urn:OtOpcUa:{id}`) resolves to a distinct index in the client's
-NamespaceUris, (2) browsing one subtree returns that driver's folder and
-does NOT leak the other driver's folder, (3) reads route to the correct
-driver — the alpha instance returns 42 while beta returns 99, so a misroute
-would surface at the assertion layer.
-
-Deferred: the alarm-event multi-driver parity case (two drivers each raising
-a `GalaxyAlarmEvent`, assert each condition lands on its owning instance's
-condition node). Alarm tracking already has its own integration test
-(`AlarmSubscription*`); the multi-driver alarm case would need a stub
-`IAlarmSource` that's worth its own focused PR.
-
-## 7. Host-status per-AppEngine granularity → Admin UI dashboard — **DONE (PRs 33 + 34)**
-
-**PR 33** landed the data layer: `DriverHostStatus` entity + migration with
-composite key `(NodeId, DriverInstanceId, HostName)` and two query-supporting
-indexes (per-cluster drill-down on `NodeId`, stale-row detection on
-`LastSeenUtc`).
-
-**PR 34** wired the publisher + consumer. `HostStatusPublisher` is a
-`BackgroundService` in the Server process that walks every registered
-`IHostConnectivityProbe`-capable driver every 10s, calls
-`GetHostStatuses()`, and upserts rows (`LastSeenUtc` advances each tick;
-`State` + `StateChangedUtc` update on transitions). Admin UI `/hosts` page
-groups by cluster, shows four summary cards (Hosts / Running / Stale /
-Faulted), and flags rows whose `LastSeenUtc` is older than 30s as Stale so
-operators see crashed Servers without waiting for a state change.
-
-Deferred as follow-ups:
-
-- Event-driven push (subscribe to `OnHostStatusChanged` per driver for
-  sub-heartbeat latency). Adds DriverHost lifecycle-event plumbing;
-  10s polling is fine for operator-scale use.
-- Failure-count column — needs the publisher to track a transition history
-  per host, not just current-state.
-- SignalR fan-out to the Admin page (currently the page polls the DB, not
-  a hub). The DB-polled version is fine at current cadence but a hub push
-  would eliminate the 10s race where a new row sits in the DB before the
-  Admin page notices.
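
Reviewer note, not part of the patch: the upsert and staleness rules that section 7 of the removed notes describes can be sketched language-neutrally as below. This is a minimal Python illustration, not the project's .NET code. The 30s threshold and the `LastSeenUtc` / `State` / `StateChangedUtc` semantics come from the notes. `HostStatusRow`, `upsert`, `is_stale`, and the sample key values are hypothetical names chosen for the sketch.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Threshold taken from the notes ("older than 30s"); everything else is assumed.
STALE_AFTER = timedelta(seconds=30)


@dataclass
class HostStatusRow:
    # Hypothetical stand-in for the DriverHostStatus entity; the first three
    # fields mirror the composite key (NodeId, DriverInstanceId, HostName).
    node_id: str
    driver_instance_id: str
    host_name: str
    state: str
    state_changed_utc: datetime
    last_seen_utc: datetime


def upsert(rows, key, state, now):
    """Advance last_seen_utc on every tick; touch state fields only on a transition."""
    row = rows.get(key)
    if row is None:
        rows[key] = HostStatusRow(*key, state, now, now)
    else:
        row.last_seen_utc = now
        if row.state != state:
            row.state = state
            row.state_changed_utc = now
    return rows[key]


def is_stale(row, now):
    """A row is stale when the publisher has not refreshed it for more than 30s."""
    return now - row.last_seen_utc > STALE_AFTER


# Usage: two ticks with an unchanged state, then the publisher stops reporting.
t0 = datetime(2024, 1, 1, tzinfo=timezone.utc)
rows = {}
key = ("node-a", "galaxy-1", "AppEngine_001")  # illustrative values only
upsert(rows, key, "Running", t0)
upsert(rows, key, "Running", t0 + timedelta(seconds=10))
stale_later = is_stale(rows[key], t0 + timedelta(seconds=45))
```

Keeping `LastSeenUtc` separate from `StateChangedUtc` is what lets a crashed publisher show up as Stale even though the last recorded state never changed.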