Compare commits


8 Commits

Author SHA1 Message Date
Joseph Doherty
8adc8f5ab8 Phase 3 PR 37 — End-to-end live-stack Galaxy smoke test. Closes the code side of LMX follow-up #5; once OtOpcUaGalaxyHost is installed + started on the dev box, the suite exercises the full topology GalaxyProxyDriver in-process → named-pipe IPC → running OtOpcUaGalaxyHost Windows service → MxAccessGalaxyBackend → live MXAccess runtime → real deployed Galaxy objects. Never spawns the Host process itself — connects to the already-running service per project_galaxy_host_service.md, which is the only way to exercise the production COM-apartment + service-account + pipe-ACL configuration.
LiveStackConfig resolves the pipe name + per-install shared secret from two sources in order: OTOPCUA_GALAXY_PIPE + OTOPCUA_GALAXY_SECRET env vars first (for CI / benchwork overrides), then the service's per-process Environment registry values under HKLM\SYSTEM\CurrentControlSet\Services\OtOpcUaGalaxyHost (what Install-Services.ps1 writes at install time). Registry read requires the test host to run elevated on most boxes — the skip message says so explicitly so operators see the right remediation. Hard-coded secrets are deliberately avoided: the installer generates 32 fresh random bytes per install, a committed secret would diverge from production the moment the service is re-installed.
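The resolution order described above can be sketched roughly as follows. This is an illustration only: `LiveStackConfigSketch`, its `Resolve` method, and the two lookup delegates are hypothetical names, the real `LiveStackConfig` is not shown in this diff, and the sketch assumes both values must come from the same source.

```csharp
using System;

// Hypothetical shape for illustration; not the project's actual LiveStackConfig.
public sealed record LiveStackConfigSketch(string PipeName, string SharedSecret, string Source)
{
    // envLookup stands in for Environment.GetEnvironmentVariable.
    // registryLookup stands in for reading the service's Environment values
    // from the registry; it returns null when a value is missing or unreadable.
    public static LiveStackConfigSketch? Resolve(
        Func<string, string?> envLookup,
        Func<string, string?> registryLookup)
    {
        // 1. Environment variables win (CI / bench overrides).
        var pipe = envLookup("OTOPCUA_GALAXY_PIPE");
        var secret = envLookup("OTOPCUA_GALAXY_SECRET");
        if (!string.IsNullOrEmpty(pipe) && !string.IsNullOrEmpty(secret))
            return new LiveStackConfigSketch(pipe, secret, "env");

        // 2. Fall back to the values the installer wrote for the service.
        pipe = registryLookup("OTOPCUA_GALAXY_PIPE");
        secret = registryLookup("OTOPCUA_GALAXY_SECRET");
        if (!string.IsNullOrEmpty(pipe) && !string.IsNullOrEmpty(secret))
            return new LiveStackConfigSketch(pipe, secret, "registry");

        // 3. Nothing discoverable: the caller turns this into a skip reason.
        return null;
    }
}
```

Passing the lookups as delegates keeps the sketch testable without elevation or a real registry; a fixture could supply `Environment.GetEnvironmentVariable` for the first and a registry reader for the second.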
LiveStackFixture is an IAsyncLifetime that (1) runs AvevaPrerequisites.CheckAllAsync with CheckGalaxyHostPipe=true + CheckHistorian=false — produces a structured PrerequisiteReport whose SkipReason is the exact operator-facing 'here's what you need to fix' text, (2) resolves LiveStackConfig and surfaces a clear skip when the secret isn't discoverable, (3) instantiates GalaxyProxyDriver + calls InitializeAsync (the IPC handshake), capturing a skip with the exception detail + common-cause hints (secret mismatch, SID not in pipe ACL, Host's backend couldn't connect to ZB) rather than letting a NullRef cascade through every subsequent test. SkipIfUnavailable() translates the captured SkipReason into Assert.Skip at the top of every fact so tests read as cleanly-skipped with a visible reason, not silently-passed or crashed.
LiveStackSmokeTests (5 facts, Collection=LiveStack, Category=LiveGalaxy): Fixture_initialized_successfully (cheapest possible end-to-end assertion — if this passes, the IPC handshake worked); Driver_reports_Healthy_after_IPC_handshake (DriverHealth.State post-connect); DiscoverAsync_returns_at_least_one_variable_from_live_galaxy (captures every Variable() call from DiscoverAsync via CapturingAddressSpaceBuilder and asserts > 0 — zero here usually means the Host couldn't read ZB, the skip message names OTOPCUA_GALAXY_ZB_CONN to check); GetHostStatuses_reports_at_least_one_platform (IHostConnectivityProbe surface — zero means the probe loop hasn't fired or no Platform is deployed locally); Can_read_a_discovered_variable_from_live_galaxy (reads the first discovered attribute's full reference, asserts status != BadInternalError — Galaxy's Uncertain-quality-until-first-Engine-scan is intentionally NOT treated as failure since it depends on runtime state that varies across test runs). Read-only by design; writes need an agreed scratch tag to avoid mutating a process-critical attribute — deferred to a follow-up PR that reuses this fixture.
CapturingAddressSpaceBuilder is a minimal IAddressSpaceBuilder that flattens every Variable() call into a list so tests can inspect what discovery produced without booting the full OPC UA node-manager stack; alarm annotation + property calls are no-ops. Scoped private to the test class.
Galaxy.Proxy.Tests csproj gains a ProjectReference to Driver.Galaxy.TestSupport (PR 36) for AvevaPrerequisites. The NU1702 warning about the Host project being net48-referenced-by-net10 is pre-existing from the HostSubprocessParityTests — Proxy.Tests only needs the Host EXE path for that parity scenario, not type surface.
Test run on THIS machine (OtOpcUaGalaxyHost not yet installed): Skipped! Failed 0, Passed 0, Skipped 5 — each skip message includes the full prerequisites report pointing at the missing service. Once the service is installed + started (scripts\install\Install-Services.ps1), the 5 facts will execute against live Galaxy. Proxy.Tests Unit: 17 pass / 0 fail (unchanged — new tests are Category=LiveGalaxy, separate suite). Full Proxy build clean. Memory already captures the 'live tests run via already-running service, don't spawn' convention (project_galaxy_host_service.md).
lmx-followups.md #5 updated: status is 'IN PROGRESS' across PRs 36 + 37 with the explicit remaining work (install + start services, subscribe-and-receive, write round-trip).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 16:49:51 -04:00
261869d84e Merge pull request 'Phase 3 PR 36 — AVEVA prerequisites test-support library' (#35) from phase-3-pr36-aveva-prerequisites into v2 2026-04-18 16:44:41 -04:00
Joseph Doherty
08c90d19fd Phase 3 PR 36 — AVEVA prerequisites test-support library. New tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.TestSupport multi-targeted class library (net10.0 + net48 so both the modern and the MXAccess-COM x86 test projects can consume it) that probes every piece of the AVEVA System Platform + OtOpcUa stack a live-Galaxy test depends on and returns a structured PrerequisiteReport. Closes the gap where live-smoke tests silently returned 'unreachable' without telling operators which specific piece failed.
AvevaPrerequisites.CheckAllAsync walks eight probe categories producing PrerequisiteCheck rows each with Name (e.g. 'service:aaBootstrap', 'sql:ZB', 'com:LMXProxy', 'registry:ArchestrA.Framework'), Category (AvevaCoreService / AvevaSoftService / AvevaInstall / MxAccessCom / GalaxyRepository / AvevaHistorian / OtOpcUaService / Environment), Status (Pass / Warn / Fail / Skip), and operator-facing Detail message. Report aggregates them: IsLivetestReady (no Fails anywhere) and IsAvevaSideReady (AVEVA-side categories pass, our v2 services can be absent while still considering the environment AVEVA-ready) so different test tiers can use the right threshold.
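A minimal sketch of how the two readiness flags might be derived from the check rows. The type names are hypothetical, "pass" is assumed here to mean "no Fail rows", and only the `OtOpcUaService` category is assumed to count as "our side"; the actual `PrerequisiteReport` may draw these lines differently.

```csharp
using System.Collections.Generic;
using System.Linq;

public enum Status { Pass, Warn, Fail, Skip }

// Hypothetical row shape mirroring the Name / Category / Status / Detail columns above.
public sealed record Check(string Name, string Category, Status Status, string Detail);

public sealed class ReportSketch
{
    // Assumption: only this category belongs to our own v2 services.
    private static readonly HashSet<string> OurCategories = new() { "OtOpcUaService" };

    public IReadOnlyList<Check> Checks { get; }

    public ReportSketch(IReadOnlyList<Check> checks) => Checks = checks;

    // Ready for live tests: no Fail rows anywhere.
    public bool IsLivetestReady => Checks.All(c => c.Status != Status.Fail);

    // AVEVA side ready: ignore rows for our own services, then require no Fail rows.
    public bool IsAvevaSideReady =>
        Checks.Where(c => !OurCategories.Contains(c.Category))
              .All(c => c.Status != Status.Fail);
}
```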
Individual probes: ServiceProbe.Check queries the Windows Service Control Manager via System.ServiceProcess.ServiceController — treats DemandStart+Stopped as Warn (NmxSvc is DemandStart by design; master pulls it up) but AutoStart+Stopped as Fail; not-installed is Fail for hard-required services, Warn for soft ones; non-Windows hosts get Skip; transitional states like StartPending get Warn with a 'try again' hint.
RegistryProbe reads HKLM\SOFTWARE\WOW6432Node\ArchestrA\{Framework,Framework\Platform,MSIInstall} — Framework key presence + populated InstallPath/RootPath values mean System Platform installed; PfeConfigOptions in the Platform subkey (format 'PlatformId=N,EngineId=N,...') indicates a Platform has been deployed from the IDE (PlatformId=0 means never deployed — MXAccess will connect but every subscription will be Bad quality); RebootRequired='True' under MSIInstall surfaces as a loud warn since post-patch behavior is undefined.
MxAccessComProbe resolves the LMXProxy.LMXProxyServer ProgID → CLSID → HKLM\SOFTWARE\Classes\WOW6432Node\CLSID\{guid}\InprocServer32, verifying the registered file exists on disk (catches the orphan-registry case where a previous uninstall left the ProgID registered but the DLL is gone — distinguishes it from the 'totally not installed' case by message); also emits a Warn when the test process is 64-bit (MXAccess COM activation fails with REGDB_E_CLASSNOTREG 0x80040154 regardless of registration, so seeing this warning tells operators why the activation would fail even on a fully-installed machine).
SqlProbe tests Galaxy Repository via Microsoft.Data.SqlClient using the Windows-auth localhost connection string the repo code defaults to — distinguishes 'SQL Server unreachable' (connection fails) from 'ZB database does not exist' (SELECT DB_ID('ZB') returns null) because they have different remediation paths (sc.exe start MSSQLSERVER vs. restore from .cab backup); a secondary CheckDeployedObjectCountAsync query on 'gobject WHERE deployed_version > 0' warns when the count is zero because discovery smoke tests will return empty hierarchies.
NamedPipeProbe opens a 2s NamedPipeClientStream against OtOpcUaGalaxyHost's pipe ('OtOpcUaGalaxy' per the installer default) — pipe accepting a connection proves the Host service is listening; disconnects immediately so we don't consume a session slot.
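The ServiceProbe classification rules above can be written as a small pure function. This is a sketch with hypothetical enum and method names; the real probe works from `ServiceController` state, and the fallback arm (for example a Disabled service that is stopped) is an assumption the commit message does not cover.

```csharp
public enum StartMode { Auto, Demand, Disabled }
public enum RunState { Running, Stopped, StartPending, StopPending }
public enum CheckStatus { Pass, Warn, Fail, Skip }

public static class ServiceStatusSketch
{
    public static CheckStatus Classify(
        bool isWindows, bool installed, bool hardRequired, StartMode mode, RunState state)
    {
        // Non-Windows hosts cannot query the Service Control Manager.
        if (!isWindows) return CheckStatus.Skip;

        // Missing service: Fail if hard-required, Warn if soft.
        if (!installed) return hardRequired ? CheckStatus.Fail : CheckStatus.Warn;

        return state switch
        {
            RunState.Running => CheckStatus.Pass,
            // Transitional states: Warn so the operator can try again.
            RunState.StartPending or RunState.StopPending => CheckStatus.Warn,
            // Demand-start services are expected to be stopped until something pulls them up.
            RunState.Stopped when mode == StartMode.Demand => CheckStatus.Warn,
            RunState.Stopped when mode == StartMode.Auto => CheckStatus.Fail,
            // Assumption: anything else (e.g. Disabled + Stopped) is treated as Fail.
            _ => CheckStatus.Fail,
        };
    }
}
```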
Service lists kept as internal static data so tests can inspect + override: CoreServices (aaBootstrap + aaGR + NmxSvc + MSSQLSERVER — hard fail if missing), SoftServices (aaLogger + aaUserValidator + aaGlobalDataCacheMonitorSvr — warn only; stack runs without them but diagnostics/auth are degraded), HistorianServices (aahClientAccessPoint + aahGateway — opt-in via Options.CheckHistorian, only matters for HistoryRead IPC paths), OtOpcUaServices (our OtOpcUaGalaxyHost hard-required for end-to-end live tests + OtOpcUa warn + GLAuth warn). Narrower entry points CheckRepositoryOnlyAsync and CheckGalaxyHostPipeOnlyAsync for tests that only care about specific subsystems — avoid paying the full probe cost on every GalaxyRepositoryLiveSmokeTests fact.
Multi-targeting mechanics: System.ServiceProcess.ServiceController + Microsoft.Win32.Registry are NuGet packages on net10 but in-box BCL references on net48; csproj conditions Package vs Reference by TargetFramework. Microsoft.Data.SqlClient v6 supports both frameworks so single PackageReference. Net48Polyfills.cs provides IsExternalInit shim (records/init-only setters) and SupportedOSPlatformAttribute stub so the same Probe sources compile on both frameworks without per-callsite preprocessor guards — lets Roslyn's platform-compatibility analyzer stay useful on net10 without breaking net48 builds.
Existing GalaxyRepositoryLiveSmokeTests updated to delegate its skip decision to AvevaPrerequisites.CheckRepositoryOnlyAsync (legacy ZbReachableAsync kept as a compatibility adapter so the in-test 'if (!await ZbReachableAsync()) return;' pattern keeps working while the surrounding fixtures gradually migrate to Assert.Skip-with-reason). Slnx file registers the new project.
Tests — AvevaPrerequisitesLiveTests (8 new Integration cases, Category=LiveGalaxy): the helper correctly reports Framework install (registry pass), aaBootstrap Running (service pass), aaGR Running (service pass), MxAccess COM registered (com pass), ZB database reachable (sql pass), deployed-object count > 0 (warn-upgraded-to-pass because this box has 49 objects deployed), the AVEVA side is ready even when our own services (OtOpcUaGalaxyHost) aren't installed yet (IsAvevaSideReady=true), and the helper emits rows for OtOpcUaGalaxyHost + OtOpcUa + GLAuth even when not installed (regression guard — nobody can accidentally ship a check that omits our own services). Full Galaxy.Host.Tests Category=LiveGalaxy suite: 13 pass (5 prior smoke + 8 new prerequisites). Full solution build clean, 0 errors.
What's NOT in this PR: end-to-end Galaxy stack smoke (Proxy → Host pipe → MXAccess → real Galaxy tag). That's the next PR — this one is the gate the end-to-end smoke will call first to produce actionable skip messages instead of silent returns.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 16:36:13 -04:00
5cc120d836 Merge pull request 'Phase 3 PR 35 — IHistoryProvider gains ReadAtTime + ReadEvents; Proxy implements both' (#34) from phase-3-pr35-history-readtime-readevents into v2 2026-04-18 16:12:43 -04:00
Joseph Doherty
bf329b05d8 Phase 3 PR 35 — IHistoryProvider gains ReadAtTimeAsync + ReadEventsAsync; GalaxyProxyDriver implements both. Extends Core.Abstractions.IHistoryProvider with two new methods that round out the OPC UA Part 11 HistoryRead surface (HistoryReadAtTime + HistoryReadEvents are the last two modes not covered by the PR 19-era ReadRawAsync + ReadProcessedAsync) and wires GalaxyProxyDriver to call the existing PR-10/PR-11 IPC contracts the Host already implements.
Interface additions use C# default interface implementations that throw NotSupportedException — existing IHistoryProvider implementations keep compiling, only drivers whose backend carries the relevant capability override. This matches the 'capabilities are optional per driver' design already used by IHistoryProvider.ReadProcessedAsync's docs (Modbus / OPC UA Client drivers never had an event historian and the default-throw path lets callers see BadHistoryOperationUnsupported naturally). New HistoricalEvent record models one historian row (EventId, SourceName, EventTimeUtc + ReceivedTimeUtc — process vs historian-persist timestamps, Message, Severity mapped to OPC UA's 1-1000 range); HistoricalEventsResult pairs the event list with a continuation-point token for future batching. Both live in Core.Abstractions so downstream (Proxy, Host, Server) reference a single domain shape — no Shared-contract leak into the driver-facing interface.
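The default-interface-implementation pattern described above looks roughly like this. The interface and member shapes are simplified stand-ins (hypothetical names and signatures), not the actual `Core.Abstractions.IHistoryProvider`, and the pattern needs a runtime that supports default interface methods.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Simplified stand-in for the real interface; signatures are assumptions.
public interface IHistoryProviderSketch
{
    Task<IReadOnlyList<double>> ReadRawAsync(string reference, CancellationToken ct);

    // Default body: drivers without this capability inherit the throw,
    // so existing implementations compile unchanged.
    Task<IReadOnlyList<double>> ReadAtTimeAsync(
        string reference, IReadOnlyList<DateTime> timestampsUtc, CancellationToken ct)
        => throw new NotSupportedException("ReadAtTime is not supported by this driver.");
}

// A driver written before the new method existed: no changes needed.
public sealed class RawOnlyDriver : IHistoryProviderSketch
{
    public Task<IReadOnlyList<double>> ReadRawAsync(string reference, CancellationToken ct)
        => Task.FromResult<IReadOnlyList<double>>(Array.Empty<double>());
}
```

Note that a default member is only reachable through the interface type, so callers holding a `RawOnlyDriver` variable would need to cast to `IHistoryProviderSketch` before calling `ReadAtTimeAsync`.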
GalaxyProxyDriver.ReadAtTimeAsync maps the domain DateTime[] to Unix-ms longs, calls CallAsync on the existing MessageKind.HistoryReadAtTimeRequest, and trusts the Host's one-sample-per-requested-timestamp contract (the Host pads with bad-quality snapshots for timestamps it can't interpolate; re-aligning on the Proxy side would duplicate the Host's interpolation policy logic). ReadEventsAsync does the same for HistoryReadEventsRequest; ToHistoricalEvent translates GalaxyHistoricalEvent (MessagePack-annotated, Unix-ms) to the domain record, explicitly tagging DateTimeKind.Utc on both timestamp fields so downstream serializers (JSON, OPC UA types) don't apply an unexpected local-time offset.
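The Unix-millisecond conversions mentioned above can be sketched with `DateTimeOffset`, which yields `DateTimeKind.Utc` on the way back. `UnixMsSketch` and its method names are hypothetical, and treating `Unspecified` input as UTC is an assumption about the caller's intent rather than something the commit states.

```csharp
using System;

public static class UnixMsSketch
{
    // Domain DateTime -> wire Unix-ms.
    public static long ToUnixMs(DateTime value)
    {
        // Assumption: Unspecified values are already UTC. Without this step,
        // DateTimeOffset would interpret them as local time.
        var asUtc = value.Kind switch
        {
            DateTimeKind.Utc => value,
            DateTimeKind.Local => value.ToUniversalTime(),
            _ => DateTime.SpecifyKind(value, DateTimeKind.Utc),
        };
        return new DateTimeOffset(asUtc).ToUnixTimeMilliseconds();
    }

    // Wire Unix-ms -> domain DateTime; UtcDateTime carries Kind == Utc.
    public static DateTime FromUnixMs(long unixMs)
        => DateTimeOffset.FromUnixTimeMilliseconds(unixMs).UtcDateTime;
}
```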
Tests — HistoricalEventMappingTests (3 new Proxy.Tests unit cases): every field maps correctly from wire to domain; null SourceName and null DisplayText preserve through the mapping (system events without a source come out with null so callers can distinguish them from alarm events); both timestamps come out as DateTimeKind.Utc (regression guard against a future refactor using DateTime.FromFileTimeUtc or similar that defaults to Unspecified). Driver.Galaxy.Proxy.Tests Unit suite: 17 pass / 0 fail (14 prior + 3 new). Full solution build clean, 0 errors.
Scope exclusions — DriverNodeManager HistoryRead service-handler wiring (on the OPC UA Server side, where HistoryReadAtTime and HistoryReadEvents service requests land) and the full-loop integration test (OPC UA client → server → IPC → Host → HistorianDataSource → back) are deferred to a focused follow-up PR. The capability surface is the load-bearing change; wiring the service handlers is mechanical in comparison and worth its own PR for reviewability. docs/v2/lmx-followups.md #1 updated with the split.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 16:08:27 -04:00
2584379e75 Merge pull request 'Phase 3 PR 34 — Host-status publisher (Server) + /hosts drill-down page (Admin)' (#33) from phase-3-pr34-host-status-publisher-page into v2 2026-04-18 16:04:20 -04:00
Joseph Doherty
ef2a810b2d Phase 3 PR 34 — Host-status publisher (Server) + /hosts drill-down page (Admin). Closes LMX follow-up #7 by wiring together the data layer from PR 33. Server.HostStatusPublisher is a BackgroundService that walks every driver registered in DriverHost every 10 seconds, skips drivers that don't implement IHostConnectivityProbe, calls GetHostStatuses() on each probe-capable driver, and upserts one DriverHostStatus row per (NodeId, DriverInstanceId, HostName) into the central config DB. Upsert path: SingleOrDefaultAsync on the composite PK; if no row exists, Add a new one; if a row exists, LastSeenUtc advances unconditionally (heartbeat) and State + StateChangedUtc update only on transitions so Admin UI can distinguish 'still reporting, still Running' from 'freshly transitioned to Running'. MapState translates Core.Abstractions.HostState to Configuration.Enums.DriverHostState (intentional duplicate enum — Configuration project stays free of driver-runtime deps per PR 33's choice). If a driver's GetHostStatuses throws, log warning and skip that driver this tick — never take down the Server on a publisher failure. If the DB is unreachable, log warning + retry next heartbeat (no buffering — next tick's current-state snapshot is more useful than replaying stale transitions after a long outage). 2-second startup delay so NodeBootstrap's RegisterAsync calls land before the first publish tick, then tick runs immediately so a freshly-started Server surfaces its host topology in the Admin UI without waiting a full interval.
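The upsert rule (heartbeat always advances, state fields change only on a transition) can be illustrated with an in-memory stand-in for the table. All type names here are hypothetical; the real publisher goes through EF Core and `OtOpcUaConfigDbContext`, and stamping `StateChangedUtc` with the insert time for a brand-new row is an assumption.

```csharp
using System;
using System.Collections.Generic;

public enum HostStateSketch { Unknown, Running, Stopped, Faulted }

public sealed class HostRow
{
    public HostStateSketch State { get; set; }
    public DateTime StateChangedUtc { get; set; }
    public DateTime LastSeenUtc { get; set; }
}

public sealed class HostStatusStoreSketch
{
    // Keyed by the composite (NodeId, DriverInstanceId, HostName).
    private readonly Dictionary<(string, string, string), HostRow> _rows = new();

    public int Count => _rows.Count;

    public HostRow Upsert(string nodeId, string driverInstanceId, string hostName,
                          HostStateSketch state, DateTime nowUtc)
    {
        var key = (nodeId, driverInstanceId, hostName);
        if (!_rows.TryGetValue(key, out var row))
        {
            // First sighting: insert a new row (assumed to stamp both timestamps).
            row = new HostRow { State = state, StateChangedUtc = nowUtc, LastSeenUtc = nowUtc };
            _rows[key] = row;
            return row;
        }

        // Heartbeat: always advance LastSeenUtc.
        row.LastSeenUtc = nowUtc;

        // Transition: update State + StateChangedUtc only when the state differs.
        if (row.State != state)
        {
            row.State = state;
            row.StateChangedUtc = nowUtc;
        }
        return row;
    }
}
```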
Polling chosen over event-driven for initial scope: simpler, matches Admin UI consumer cadence, avoids DriverHost lifecycle-event plumbing that doesn't exist today. Event-driven push for sub-heartbeat latency is a straightforward follow-up.
Admin.Services.HostStatusService left-joins DriverHostStatus against ClusterNode on NodeId so rows persist even when the ClusterNode entry doesn't exist yet (first-boot bootstrap case). StaleThreshold = 30s — covers one missed publisher heartbeat plus a generous buffer for clock skew and GC pauses. Admin Components/Pages/Hosts.razor — FleetAdmin-visible page grouped by cluster (handles the '(unassigned)' case for rows without a matching ClusterNode). Four summary cards (Hosts / Running / Stale / Faulted); per-cluster table with Node / Driver / Host / State + Stale-badge / Last-transition / Last-seen / Detail columns; 10s auto-refresh via IServiceScopeFactory timer pattern matching FleetStatusPoller + Fleet dashboard (PR 27). Row-class highlighting: Faulted → table-danger, Stale → table-warning, else default. State badge maps DriverHostState enum to bootstrap color classes. Sidebar link added between 'Fleet status' and 'Clusters'.
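A minimal sketch of the staleness rule, assuming it reduces to comparing the age of `LastSeenUtc` with the 30s threshold. The real `HostStatusService.IsStale` takes a row, as the Razor page in this diff shows; the explicit `nowUtc` parameter and the strict greater-than comparison are assumptions made to keep the sketch deterministic.

```csharp
using System;

public static class StaleSketch
{
    // One missed 10s heartbeat plus a buffer for clock skew and GC pauses.
    public static readonly TimeSpan StaleThreshold = TimeSpan.FromSeconds(30);

    // A row is stale when its last heartbeat is older than the threshold.
    public static bool IsStale(DateTime lastSeenUtc, DateTime nowUtc)
        => nowUtc - lastSeenUtc > StaleThreshold;
}
```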
Server csproj adds Microsoft.EntityFrameworkCore.SqlServer 10.0.0 + registers OtOpcUaConfigDbContext in Program.cs scoped via NodeOptions.ConfigDbConnectionString (no Admin-style manual SQL raw — the DbContext is the only access path, keeps migrations owner-of-record).
Tests — HostStatusPublisherTests (4 new Integration cases, uses per-run throwaway DB matching the FleetStatusPollerTests pattern): publisher upserts one row per host from each probe-capable driver and skips non-probe drivers; second tick advances LastSeenUtc without creating duplicate rows (upsert pattern verified end-to-end); state change between ticks updates State AND StateChangedUtc (datetime2(3) rounds to millisecond precision so comparison uses 1ms tolerance — documented inline); MapState translates every HostState enum member. Server.Tests Integration: 4 new tests pass. Admin build clean, Admin.Tests Unit still 23 / 0. docs/v2/lmx-followups.md item #7 marked DONE with three explicit deferred items (event-driven push, failure-count column, SignalR fan-out).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 15:51:55 -04:00
a7764e50f3 Merge pull request 'Phase 3 PR 33 — DriverHostStatus entity + migration (LMX #7 data layer)' (#32) from phase-3-pr33-driverhoststatus-entity into v2 2026-04-18 15:43:37 -04:00
30 changed files with 2199 additions and 36 deletions


@@ -21,6 +21,7 @@
 <Project Path="tests/ZB.MOM.WW.OtOpcUa.Admin.Tests/ZB.MOM.WW.OtOpcUa.Admin.Tests.csproj"/>
 <Project Path="tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared.Tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared.Tests.csproj"/>
 <Project Path="tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Tests.csproj"/>
+<Project Path="tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.TestSupport/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.TestSupport.csproj"/>
 <Project Path="tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy.Tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy.Tests.csproj"/>
 <Project Path="tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.E2E/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.E2E.csproj"/>
 <Project Path="tests/ZB.MOM.WW.OtOpcUa.Driver.Modbus.Tests/ZB.MOM.WW.OtOpcUa.Driver.Modbus.Tests.csproj"/>


@@ -9,19 +9,18 @@ rough priority order.
 ## 1. Proxy-side `IHistoryProvider` for `ReadAtTime` / `ReadEvents`
-**Status**: Host-side IPC shipped (PR 10 + PR 11). Proxy consumer not written.
+**Status**: Capability surface complete (PR 35). OPC UA HistoryRead service-handler
+wiring in `DriverNodeManager` remains as the next step; integration-test still
+pending.
-PR 10 added `HistoryReadAtTimeRequest/Response` on the IPC wire and
-`MxAccessGalaxyBackend.HistoryReadAtTimeAsync` delegates to
-`HistorianDataSource.ReadAtTimeAsync`. PR 11 did the same for events
-(`HistoryReadEventsRequest/Response` + `GalaxyHistoricalEvent`). The Proxy
-side (`GalaxyProxyDriver`) doesn't call those yet — `Core.Abstractions.IHistoryProvider`
-only exposes `ReadRawAsync` + `ReadProcessedAsync`.
+PR 35 extended `IHistoryProvider` with `ReadAtTimeAsync` + `ReadEventsAsync`
+(default throwing implementations so existing impls keep compiling), added the
+`HistoricalEvent` + `HistoricalEventsResult` records to
+`Core.Abstractions`, and implemented both methods in `GalaxyProxyDriver` on top
+of the PR 10 / PR 11 IPC messages. Wire-to-domain mapping (`ToHistoricalEvent`)
+is unit-tested for field fidelity, null-preservation, and `DateTimeKind.Utc`.
-**To do**:
-- Extend `IHistoryProvider` with `ReadAtTimeAsync(string, DateTime[], …)` and
-`ReadEventsAsync(string?, DateTime, DateTime, int, …)`.
-- `GalaxyProxyDriver` calls the new IPC message kinds.
+**Remaining**:
 - `DriverNodeManager` wires the new capability methods onto `HistoryRead`
 `AtTime` + `Events` service handlers.
 - Integration test: OPC UA client calls `HistoryReadAtTime` / `HistoryReadEvents`,
@@ -78,18 +77,36 @@ drive a full OPC UA session with username/password, then read an
 `IHostConnectivityProbe`-style "whoami" node to verify the role surfaced).
 That needs a test-only address-space node and is a separate PR.
-## 5. Full Galaxy live-service smoke test against the merged v2 stack
+## 5. Full Galaxy live-service smoke test against the merged v2 stack — **IN PROGRESS (PRs 36 + 37)**
-**Status**: Individual pieces have live smoke tests (PR 5 MXAccess, PR 13
-probe manager, PR 14 alarm tracker), but the full loop — OPC UA client →
-`OtOpcUaServer` → `GalaxyProxyDriver` (in-process) → named-pipe to
-Galaxy.Host subprocess → live MXAccess runtime → real Galaxy objects — has
-no single end-to-end smoke test.
+PR 36 shipped the prerequisites helper (`AvevaPrerequisites`) that probes
+every dependency a live smoke test needs and produces actionable skip
+messages.
-**To do**:
-- Test that spawns the full topology, discovers a deployed Galaxy object,
-subscribes to one of its attributes, writes a value back, and asserts the
-write round-tripped through MXAccess. Skip when ArchestrA isn't running.
+PR 37 shipped the live-stack smoke test project structure:
+`tests/Driver.Galaxy.Proxy.Tests/LiveStack/` with `LiveStackFixture` (connects
+to the *already-running* `OtOpcUaGalaxyHost` Windows service via named pipe;
+never spawns the Host process) and `LiveStackSmokeTests` covering:
+- Fixture initializes successfully (IPC handshake succeeds end-to-end).
+- Driver reports `DriverState.Healthy` post-handshake.
+- `DiscoverAsync` returns at least one variable from the live Galaxy.
+- `GetHostStatuses` reports at least one Platform/AppEngine host.
+- `ReadAsync` on a discovered variable round-trips through
+Proxy → Host pipe → MXAccess → back without a BadInternalError.
+Shared secret + pipe name resolve from `OTOPCUA_GALAXY_SECRET` /
+`OTOPCUA_GALAXY_PIPE` env vars, falling back to reading the service's
+registry-stored Environment values (requires elevated test host).
+**Remaining**:
+- Install + run the `OtOpcUaGalaxyHost` + `OtOpcUa` services on the dev box
+(`scripts/install/Install-Services.ps1`) so the skip-on-unready tests
+actually execute and the smoke PR lands green.
+- Subscribe-and-receive-data-change fact (needs a known tag that actually
+ticks; deferred until operators confirm a scratch tag exists).
+- Write-and-roundtrip fact (needs a test-only UDA or agreed scratch tag
+so we can't accidentally mutate a process-critical value).
 ## 6. Second driver instance on the same server — **DONE (PR 32)**
@@ -108,13 +125,30 @@ condition node). Alarm tracking already has its own integration test
 (`AlarmSubscription*`); the multi-driver alarm case would need a stub
 `IAlarmSource` that's worth its own focused PR.
-## 7. Host-status per-AppEngine granularity → Admin UI dashboard
+## 7. Host-status per-AppEngine granularity → Admin UI dashboard — **DONE (PRs 33 + 34)**
-**Status**: PR 13 ships per-platform/per-AppEngine `ScanState` probing; PR 17
-surfaces the resulting `OnHostStatusChanged` events through OPC UA. Admin
-UI doesn't render a per-host dashboard yet.
+**PR 33** landed the data layer: `DriverHostStatus` entity + migration with
+composite key `(NodeId, DriverInstanceId, HostName)` and two query-supporting
+indexes (per-cluster drill-down on `NodeId`, stale-row detection on
+`LastSeenUtc`).
-**To do**:
-- SignalR hub push of `HostStatusChangedEventArgs` to the Admin UI.
-- Dashboard page showing each tracked host, current state, last transition
-time, failure count.
+**PR 34** wired the publisher + consumer. `HostStatusPublisher` is a
+`BackgroundService` in the Server process that walks every registered
+`IHostConnectivityProbe`-capable driver every 10s, calls
+`GetHostStatuses()`, and upserts rows (`LastSeenUtc` advances each tick;
+`State` + `StateChangedUtc` update on transitions). Admin UI `/hosts` page
+groups by cluster, shows four summary cards (Hosts / Running / Stale /
+Faulted), and flags rows whose `LastSeenUtc` is older than 30s as Stale so
+operators see crashed Servers without waiting for a state change.
+Deferred as follow-ups:
+- Event-driven push (subscribe to `OnHostStatusChanged` per driver for
+sub-heartbeat latency). Adds DriverHost lifecycle-event plumbing;
+10s polling is fine for operator-scale use.
+- Failure-count column — needs the publisher to track a transition history
+per host, not just current-state.
+- SignalR fan-out to the Admin page (currently the page polls the DB, not
+a hub). The DB-polled version is fine at current cadence but a hub push
+would eliminate the 10s race where a new row sits in the DB before the
+Admin page notices.


@@ -6,6 +6,7 @@
 <ul class="nav flex-column">
 <li class="nav-item"><a class="nav-link text-light" href="/">Overview</a></li>
 <li class="nav-item"><a class="nav-link text-light" href="/fleet">Fleet status</a></li>
+<li class="nav-item"><a class="nav-link text-light" href="/hosts">Host status</a></li>
 <li class="nav-item"><a class="nav-link text-light" href="/clusters">Clusters</a></li>
 <li class="nav-item"><a class="nav-link text-light" href="/reservations">Reservations</a></li>
 <li class="nav-item"><a class="nav-link text-light" href="/certificates">Certificates</a></li>


@@ -0,0 +1,160 @@
@page "/hosts"
@using Microsoft.EntityFrameworkCore
@using ZB.MOM.WW.OtOpcUa.Admin.Services
@using ZB.MOM.WW.OtOpcUa.Configuration.Enums
@inject IServiceScopeFactory ScopeFactory
@implements IDisposable
<h1 class="mb-4">Driver host status</h1>
<div class="d-flex align-items-center mb-3 gap-2">
<button class="btn btn-sm btn-outline-primary" @onclick="RefreshAsync" disabled="@_refreshing">
@if (_refreshing) { <span class="spinner-border spinner-border-sm me-1" /> }
Refresh
</button>
<span class="text-muted small">
Auto-refresh every @RefreshIntervalSeconds s. Last updated: @(_lastRefreshUtc?.ToString("HH:mm:ss 'UTC'") ?? "—")
</span>
</div>
<div class="alert alert-info small mb-4">
Each row is one host reported by a driver instance on a server node. Galaxy drivers report
per-Platform / per-AppEngine entries; Modbus drivers report the PLC endpoint. The Server's
publisher refreshes every row on its 10-second heartbeat; rows whose LastSeen is older than
30s are flagged Stale, which usually means the owning Server process has crashed or lost
its DB connection.
</div>
@if (_rows is null)
{
<p>Loading…</p>
}
else if (_rows.Count == 0)
{
<div class="alert alert-secondary">
No host-status rows yet. The Server publishes its first tick 2s after startup; if this list stays empty, check that the Server is running and the driver implements <code>IHostConnectivityProbe</code>.
</div>
}
else
{
<div class="row g-3 mb-4">
<div class="col-md-3"><div class="card"><div class="card-body">
<h6 class="text-muted mb-1">Hosts</h6>
<div class="fs-3">@_rows.Count</div>
</div></div></div>
<div class="col-md-3"><div class="card border-success"><div class="card-body">
<h6 class="text-muted mb-1">Running</h6>
<div class="fs-3 text-success">@_rows.Count(r => r.State == DriverHostState.Running && !HostStatusService.IsStale(r))</div>
</div></div></div>
<div class="col-md-3"><div class="card border-warning"><div class="card-body">
<h6 class="text-muted mb-1">Stale</h6>
<div class="fs-3 text-warning">@_rows.Count(HostStatusService.IsStale)</div>
</div></div></div>
<div class="col-md-3"><div class="card border-danger"><div class="card-body">
<h6 class="text-muted mb-1">Faulted</h6>
<div class="fs-3 text-danger">@_rows.Count(r => r.State == DriverHostState.Faulted)</div>
</div></div></div>
</div>
@foreach (var cluster in _rows.GroupBy(r => r.ClusterId ?? "(unassigned)").OrderBy(g => g.Key))
{
<h2 class="h5 mt-4">Cluster: <code>@cluster.Key</code></h2>
<table class="table table-sm table-hover align-middle">
<thead>
<tr>
<th>Node</th>
<th>Driver</th>
<th>Host</th>
<th>State</th>
<th>Last transition</th>
<th>Last seen</th>
<th>Detail</th>
</tr>
</thead>
<tbody>
@foreach (var r in cluster)
{
<tr class="@RowClass(r)">
<td><code>@r.NodeId</code></td>
<td><code>@r.DriverInstanceId</code></td>
<td>@r.HostName</td>
<td>
<span class="badge @StateBadge(r.State)">@r.State</span>
@if (HostStatusService.IsStale(r))
{
<span class="badge bg-warning text-dark ms-1">Stale</span>
}
</td>
<td class="small">@FormatAge(r.StateChangedUtc)</td>
<td class="small @(HostStatusService.IsStale(r) ? "text-warning" : "")">@FormatAge(r.LastSeenUtc)</td>
<td class="text-truncate small" style="max-width: 320px;" title="@r.Detail">@r.Detail</td>
</tr>
}
</tbody>
</table>
}
}
@code {
// Mirrors HostStatusPublisher.HeartbeatInterval — polling ahead of the broadcaster
// produces stale-looking rows mid-cycle.
private const int RefreshIntervalSeconds = 10;
private List<HostStatusRow>? _rows;
private bool _refreshing;
private DateTime? _lastRefreshUtc;
private Timer? _timer;
protected override async Task OnInitializedAsync()
{
await RefreshAsync();
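// The callback is effectively async void: an exception escaping RefreshAsync (e.g. a
// transient DB failure) would be unhandled, so catch it and let the next tick retry.
_timer = new Timer(async _ =>
{
try { await InvokeAsync(RefreshAsync); }
catch (Exception) { /* keep the last good snapshot; next tick retries */ }
},
state: null,
dueTime: TimeSpan.FromSeconds(RefreshIntervalSeconds),
period: TimeSpan.FromSeconds(RefreshIntervalSeconds));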
}
private async Task RefreshAsync()
{
if (_refreshing) return;
_refreshing = true;
try
{
using var scope = ScopeFactory.CreateScope();
var svc = scope.ServiceProvider.GetRequiredService<HostStatusService>();
_rows = (await svc.ListAsync()).ToList();
_lastRefreshUtc = DateTime.UtcNow;
}
finally
{
_refreshing = false;
StateHasChanged();
}
}
private static string RowClass(HostStatusRow r) => r.State switch
{
DriverHostState.Faulted => "table-danger",
_ when HostStatusService.IsStale(r) => "table-warning",
_ => "",
};
private static string StateBadge(DriverHostState s) => s switch
{
DriverHostState.Running => "bg-success",
DriverHostState.Stopped => "bg-secondary",
DriverHostState.Faulted => "bg-danger",
_ => "bg-secondary",
};
private static string FormatAge(DateTime t)
{
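var age = DateTime.UtcNow - t;
// Clock skew between Server and Admin can put t slightly in the future; clamp to zero.
if (age < TimeSpan.Zero) age = TimeSpan.Zero;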
if (age.TotalSeconds < 60) return $"{(int)age.TotalSeconds}s ago";
if (age.TotalMinutes < 60) return $"{(int)age.TotalMinutes}m ago";
if (age.TotalHours < 24) return $"{(int)age.TotalHours}h ago";
return t.ToString("yyyy-MM-dd HH:mm 'UTC'");
}
public void Dispose() => _timer?.Dispose();
}

@@ -47,6 +47,7 @@ builder.Services.AddScoped<NodeAclService>();
builder.Services.AddScoped<ReservationService>();
builder.Services.AddScoped<DraftValidationService>();
builder.Services.AddScoped<AuditLogService>();
builder.Services.AddScoped<HostStatusService>();
// Cert-trust management: reads the OPC UA server's PKI store root so rejected client certs
// can be promoted to trusted via the Admin UI. Singleton: no per-request state, just

@@ -0,0 +1,63 @@
using Microsoft.EntityFrameworkCore;
using ZB.MOM.WW.OtOpcUa.Configuration;
using ZB.MOM.WW.OtOpcUa.Configuration.Entities;
using ZB.MOM.WW.OtOpcUa.Configuration.Enums;
namespace ZB.MOM.WW.OtOpcUa.Admin.Services;
/// <summary>
/// One row per <see cref="DriverHostStatus"/> record, enriched with the owning
/// <c>ClusterNode.ClusterId</c> when available (left-join). The Admin <c>/hosts</c> page
/// groups by cluster and renders a per-node → per-driver → per-host tree.
/// </summary>
public sealed record HostStatusRow(
string NodeId,
string? ClusterId,
string DriverInstanceId,
string HostName,
DriverHostState State,
DateTime StateChangedUtc,
DateTime LastSeenUtc,
string? Detail);
/// <summary>
/// Read-side service for the Admin UI's per-host drill-down. Loads
/// <see cref="DriverHostStatus"/> rows (written by the Server process's
/// <c>HostStatusPublisher</c>) and left-joins <c>ClusterNode</c> so each row knows which
/// cluster it belongs to — the Admin UI groups by cluster for the fleet-wide view.
/// </summary>
/// <remarks>
/// The publisher heartbeat is 10s (<c>HostStatusPublisher.HeartbeatInterval</c>). The
/// Admin page also polls every ~10s and treats rows with <c>LastSeenUtc</c> older than
/// <c>StaleThreshold</c> (30s) as stale. That is three heartbeat intervals: enough to
/// tolerate a missed heartbeat plus a generous buffer for clock skew and publisher GC pauses.
/// </remarks>
public sealed class HostStatusService(OtOpcUaConfigDbContext db)
{
public static readonly TimeSpan StaleThreshold = TimeSpan.FromSeconds(30);
public async Task<IReadOnlyList<HostStatusRow>> ListAsync(CancellationToken ct = default)
{
// LEFT JOIN on NodeId so a row persists even when its owning ClusterNode row hasn't
// been created yet (first-boot bootstrap case — keeps the UI from losing sight of
// the reporting server).
var rows = await (from s in db.DriverHostStatuses.AsNoTracking()
join n in db.ClusterNodes.AsNoTracking()
on s.NodeId equals n.NodeId into nodeJoin
from n in nodeJoin.DefaultIfEmpty()
orderby s.NodeId, s.DriverInstanceId, s.HostName
select new HostStatusRow(
s.NodeId,
n != null ? n.ClusterId : null,
s.DriverInstanceId,
s.HostName,
s.State,
s.StateChangedUtc,
s.LastSeenUtc,
s.Detail)).ToListAsync(ct);
return rows;
}
public static bool IsStale(HostStatusRow row) =>
DateTime.UtcNow - row.LastSeenUtc > StaleThreshold;
}

@@ -30,6 +30,52 @@ public interface IHistoryProvider
TimeSpan interval,
HistoryAggregateType aggregate,
CancellationToken cancellationToken);
/// <summary>
/// Read one sample per requested timestamp — OPC UA HistoryReadAtTime service. The
/// driver interpolates (or returns the prior-boundary sample) when no exact match
/// exists. Optional; drivers that can't interpolate throw <see cref="NotSupportedException"/>.
/// </summary>
/// <remarks>
/// Default implementation throws. Drivers opt in by overriding; keeps existing
/// <c>IHistoryProvider</c> implementations compiling without forcing a ReadAtTime path
/// they may not have a backend for.
/// </remarks>
Task<HistoryReadResult> ReadAtTimeAsync(
string fullReference,
IReadOnlyList<DateTime> timestampsUtc,
CancellationToken cancellationToken)
=> throw new NotSupportedException(
$"{GetType().Name} does not implement ReadAtTimeAsync. " +
"Drivers whose backends support at-time reads override this method.");
/// <summary>
/// Read historical alarm/event records — OPC UA HistoryReadEvents service. Distinct
/// from the live event stream — historical rows come from an event historian (Galaxy's
/// Alarm Provider history log, etc.) rather than the driver's active subscription.
/// </summary>
/// <param name="sourceName">
/// Optional filter: null means "all sources", otherwise restrict to events from that
/// source-object name. Drivers may ignore the filter if the backend doesn't support it.
/// </param>
/// <param name="startUtc">Inclusive lower bound on <c>EventTimeUtc</c>.</param>
/// <param name="endUtc">Exclusive upper bound on <c>EventTimeUtc</c>.</param>
/// <param name="maxEvents">Upper cap on returned events — the driver's backend enforces this.</param>
/// <param name="cancellationToken">Request cancellation.</param>
/// <remarks>
/// Default implementation throws. Only drivers with an event historian (Galaxy via the
/// Wonderware Alarm &amp; Events log) override. Modbus / the OPC UA Client driver stay
/// with the default and let callers see <c>BadHistoryOperationUnsupported</c>.
/// </remarks>
Task<HistoricalEventsResult> ReadEventsAsync(
string? sourceName,
DateTime startUtc,
DateTime endUtc,
int maxEvents,
CancellationToken cancellationToken)
=> throw new NotSupportedException(
$"{GetType().Name} does not implement ReadEventsAsync. " +
"Drivers whose backends have an event historian override this method.");
}
/// <summary>Result of a HistoryRead call.</summary>
@@ -48,3 +94,29 @@ public enum HistoryAggregateType
Total,
Count,
}
/// <summary>
/// One row returned by <see cref="IHistoryProvider.ReadEventsAsync"/> — a historical
/// alarm/event record, not the OPC UA live-event stream. Fields match the minimum set the
/// Server needs to populate a <c>HistoryEventFieldList</c> for HistoryReadEvents responses.
/// </summary>
/// <param name="EventId">Stable unique id for the event — driver-specific format.</param>
/// <param name="SourceName">Source object that emitted the event. May differ from the <c>sourceName</c> filter the caller passed (fuzzy matches).</param>
/// <param name="EventTimeUtc">Process-side timestamp — when the event actually occurred.</param>
/// <param name="ReceivedTimeUtc">Historian-side timestamp — when the historian persisted the row; may lag <paramref name="EventTimeUtc"/> by the historian's buffer flush cadence.</param>
/// <param name="Message">Human-readable message text.</param>
/// <param name="Severity">OPC UA severity (1-1000). Drivers map their native priority scale onto this range.</param>
public sealed record HistoricalEvent(
string EventId,
string? SourceName,
DateTime EventTimeUtc,
DateTime ReceivedTimeUtc,
string? Message,
ushort Severity);
/// <summary>Result of a <see cref="IHistoryProvider.ReadEventsAsync"/> call.</summary>
/// <param name="Events">Events in chronological order by <c>EventTimeUtc</c>.</param>
/// <param name="ContinuationPoint">Opaque token for the next call when more events are available; null when complete.</param>
public sealed record HistoricalEventsResult(
IReadOnlyList<HistoricalEvent> Events,
byte[]? ContinuationPoint);

@@ -339,6 +339,64 @@ public sealed class GalaxyProxyDriver(GalaxyProxyOptions options)
return new HistoryReadResult(samples, ContinuationPoint: null);
}
public async Task<HistoryReadResult> ReadAtTimeAsync(
string fullReference, IReadOnlyList<DateTime> timestampsUtc, CancellationToken cancellationToken)
{
var client = RequireClient();
var resp = await client.CallAsync<HistoryReadAtTimeRequest, HistoryReadAtTimeResponse>(
MessageKind.HistoryReadAtTimeRequest,
new HistoryReadAtTimeRequest
{
SessionId = _sessionId,
TagReference = fullReference,
TimestampsUtcUnixMs = [.. timestampsUtc.Select(t => new DateTimeOffset(t, TimeSpan.Zero).ToUnixTimeMilliseconds())],
},
MessageKind.HistoryReadAtTimeResponse,
cancellationToken);
if (!resp.Success)
throw new InvalidOperationException($"Galaxy.Host HistoryReadAtTime failed: {resp.Error}");
// ReadAtTime returns one sample per requested timestamp in the same order — the Host
// pads with bad-quality snapshots when a timestamp can't be interpolated, so response
// length matches request length exactly. We trust that contract rather than
// re-aligning here, because the Host is the source-of-truth for interpolation policy.
IReadOnlyList<DataValueSnapshot> samples = [.. resp.Values.Select(ToSnapshot)];
return new HistoryReadResult(samples, ContinuationPoint: null);
}
public async Task<HistoricalEventsResult> ReadEventsAsync(
string? sourceName, DateTime startUtc, DateTime endUtc, int maxEvents, CancellationToken cancellationToken)
{
var client = RequireClient();
var resp = await client.CallAsync<HistoryReadEventsRequest, HistoryReadEventsResponse>(
MessageKind.HistoryReadEventsRequest,
new HistoryReadEventsRequest
{
SessionId = _sessionId,
SourceName = sourceName,
StartUtcUnixMs = new DateTimeOffset(startUtc, TimeSpan.Zero).ToUnixTimeMilliseconds(),
EndUtcUnixMs = new DateTimeOffset(endUtc, TimeSpan.Zero).ToUnixTimeMilliseconds(),
MaxEvents = maxEvents,
},
MessageKind.HistoryReadEventsResponse,
cancellationToken);
if (!resp.Success)
throw new InvalidOperationException($"Galaxy.Host HistoryReadEvents failed: {resp.Error}");
IReadOnlyList<HistoricalEvent> events = [.. resp.Events.Select(ToHistoricalEvent)];
return new HistoricalEventsResult(events, ContinuationPoint: null);
}
internal static HistoricalEvent ToHistoricalEvent(GalaxyHistoricalEvent wire) => new(
EventId: wire.EventId,
SourceName: wire.SourceName,
EventTimeUtc: DateTimeOffset.FromUnixTimeMilliseconds(wire.EventTimeUtcUnixMs).UtcDateTime,
ReceivedTimeUtc: DateTimeOffset.FromUnixTimeMilliseconds(wire.ReceivedTimeUtcUnixMs).UtcDateTime,
Message: wire.DisplayText,
Severity: wire.Severity);
/// <summary>
/// Maps the OPC UA Part 13 aggregate enum onto the Wonderware Historian
/// AnalogSummaryQuery column names consumed by <c>HistorianDataSource.ReadAggregateAsync</c>.

@@ -0,0 +1,143 @@
using Microsoft.EntityFrameworkCore;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Hosting;
using Microsoft.Extensions.Logging;
using ZB.MOM.WW.OtOpcUa.Configuration;
using ZB.MOM.WW.OtOpcUa.Configuration.Entities;
using ZB.MOM.WW.OtOpcUa.Configuration.Enums;
using ZB.MOM.WW.OtOpcUa.Core.Abstractions;
using ZB.MOM.WW.OtOpcUa.Core.Hosting;
namespace ZB.MOM.WW.OtOpcUa.Server;
/// <summary>
/// Walks every registered driver once per heartbeat interval, asks each
/// <see cref="IHostConnectivityProbe"/>-capable driver for its current
/// <see cref="HostConnectivityStatus"/> list, and upserts one
/// <see cref="DriverHostStatus"/> row per (NodeId, DriverInstanceId, HostName) into the
/// central config DB. Powers the Admin UI's per-host drill-down page (LMX follow-up #7).
/// </summary>
/// <remarks>
/// <para>
/// Polling rather than event-driven: simpler, and matches the cadence the Admin UI
/// consumes. An event-subscription optimization (push on <c>OnHostStatusChanged</c> for
/// immediate reflection) is a straightforward follow-up but adds lifecycle complexity
/// — drivers can be registered after the publisher starts, and subscribing to each
/// one's event on register + unsubscribing on unregister requires DriverHost to expose
/// lifecycle events it doesn't today.
/// </para>
/// <para>
/// <see cref="DriverHostStatus.LastSeenUtc"/> advances every heartbeat so the Admin UI
/// can flag stale rows from a crashed Server process independent of
/// <see cref="DriverHostStatus.State"/>: when the publisher stops heartbeating, a row
/// keeps its last written State in the DB but its LastSeenUtc ages out, which is the
/// signal operators actually want.
/// </para>
/// <para>
/// If the DB is unreachable on a given tick, the publisher logs and moves on — it
/// does not retry or buffer. The next heartbeat picks up the current-state snapshot,
/// which is more useful than replaying stale transitions after a long outage.
/// </para>
/// </remarks>
public sealed class HostStatusPublisher(
DriverHost driverHost,
NodeOptions nodeOptions,
IServiceScopeFactory scopeFactory,
ILogger<HostStatusPublisher> logger) : BackgroundService
{
internal static readonly TimeSpan HeartbeatInterval = TimeSpan.FromSeconds(10);
protected override async Task ExecuteAsync(CancellationToken stoppingToken)
{
// Wait a short moment at startup so NodeBootstrap's RegisterAsync calls have had a
// chance to land. First tick runs immediately after so a freshly-started Server
// surfaces its host topology in the Admin UI without waiting a full interval.
try { await Task.Delay(TimeSpan.FromSeconds(2), stoppingToken); }
catch (OperationCanceledException) { return; }
while (!stoppingToken.IsCancellationRequested)
{
try { await PublishOnceAsync(stoppingToken); }
catch (OperationCanceledException) { return; }
catch (Exception ex)
{
// Never take down the Server on a publisher failure. Log and continue —
// stale-row detection on the Admin side will surface the outage.
logger.LogWarning(ex, "Host-status publisher tick failed — will retry next heartbeat");
}
try { await Task.Delay(HeartbeatInterval, stoppingToken); }
catch (OperationCanceledException) { return; }
}
}
internal async Task PublishOnceAsync(CancellationToken ct)
{
var driverIds = driverHost.RegisteredDriverIds;
if (driverIds.Count == 0) return;
var now = DateTime.UtcNow;
using var scope = scopeFactory.CreateScope();
var db = scope.ServiceProvider.GetRequiredService<OtOpcUaConfigDbContext>();
foreach (var driverId in driverIds)
{
var driver = driverHost.GetDriver(driverId);
if (driver is not IHostConnectivityProbe probe) continue;
IReadOnlyList<HostConnectivityStatus> statuses;
try { statuses = probe.GetHostStatuses(); }
catch (Exception ex)
{
logger.LogWarning(ex, "Driver {DriverId} GetHostStatuses threw — skipping this tick", driverId);
continue;
}
foreach (var status in statuses)
{
await UpsertAsync(db, driverId, status, now, ct);
}
}
await db.SaveChangesAsync(ct);
}
private async Task UpsertAsync(OtOpcUaConfigDbContext db, string driverId,
HostConnectivityStatus status, DateTime now, CancellationToken ct)
{
var mapped = MapState(status.State);
var existing = await db.DriverHostStatuses.SingleOrDefaultAsync(r =>
r.NodeId == nodeOptions.NodeId
&& r.DriverInstanceId == driverId
&& r.HostName == status.HostName, ct);
if (existing is null)
{
db.DriverHostStatuses.Add(new DriverHostStatus
{
NodeId = nodeOptions.NodeId,
DriverInstanceId = driverId,
HostName = status.HostName,
State = mapped,
StateChangedUtc = status.LastChangedUtc,
LastSeenUtc = now,
});
return;
}
existing.LastSeenUtc = now;
if (existing.State != mapped)
{
existing.State = mapped;
existing.StateChangedUtc = status.LastChangedUtc;
}
}
internal static DriverHostState MapState(HostState state) => state switch
{
HostState.Running => DriverHostState.Running,
HostState.Stopped => DriverHostState.Stopped,
HostState.Faulted => DriverHostState.Faulted,
_ => DriverHostState.Unknown,
};
}

@@ -1,8 +1,10 @@
using Microsoft.EntityFrameworkCore;
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Hosting;
using Microsoft.Extensions.Logging;
using Serilog;
using ZB.MOM.WW.OtOpcUa.Configuration;
using ZB.MOM.WW.OtOpcUa.Configuration.LocalCache;
using ZB.MOM.WW.OtOpcUa.Core.Hosting;
using ZB.MOM.WW.OtOpcUa.Server;
@@ -72,5 +74,11 @@ builder.Services.AddSingleton<NodeBootstrap>();
builder.Services.AddSingleton<OpcUaApplicationHost>();
builder.Services.AddHostedService<OpcUaServerService>();
// Central-config DB access for the host-status publisher (LMX follow-up #7). Scoped context
// so per-heartbeat change-tracking stays isolated; publisher opens one scope per tick.
builder.Services.AddDbContext<OtOpcUaConfigDbContext>(opt =>
opt.UseSqlServer(options.ConfigDbConnectionString));
builder.Services.AddHostedService<HostStatusPublisher>();
var host = builder.Build();
await host.RunAsync();

@@ -24,6 +24,7 @@
<PackageReference Include="OPCFoundation.NetStandard.Opc.Ua.Server" Version="1.5.374.126"/>
<PackageReference Include="OPCFoundation.NetStandard.Opc.Ua.Configuration" Version="1.5.374.126"/>
<PackageReference Include="Novell.Directory.Ldap.NETStandard" Version="3.6.0"/>
<PackageReference Include="Microsoft.EntityFrameworkCore.SqlServer" Version="10.0.0"/>
</ItemGroup>
<ItemGroup>

@@ -0,0 +1,127 @@
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
using Shouldly;
using Xunit;
using Xunit.Abstractions;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.TestSupport;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Tests
{
/// <summary>
/// Exercises <see cref="AvevaPrerequisites"/> against the live dev box so the helper
/// itself gets integration coverage — i.e. "do the probes return Pass for things that
/// really are Pass?" as validated against this machine's known-installed topology.
/// Category <c>LiveGalaxy</c> so CI / clean dev boxes skip cleanly.
/// </summary>
[Trait("Category", "LiveGalaxy")]
public sealed class AvevaPrerequisitesLiveTests
{
private readonly ITestOutputHelper _output;
public AvevaPrerequisitesLiveTests(ITestOutputHelper output) => _output = output;
[Fact]
public async Task CheckAll_on_live_box_reports_Framework_install()
{
var report = await AvevaPrerequisites.CheckAllAsync();
_output.WriteLine(report.ToString());
report.Checks.ShouldContain(c =>
c.Name == "registry:ArchestrA.Framework" && c.Status == PrerequisiteStatus.Pass,
"ArchestrA Framework registry root should be found on this machine.");
}
[Fact]
public async Task CheckAll_on_live_box_reports_aaBootstrap_running()
{
var report = await AvevaPrerequisites.CheckAllAsync();
var bootstrap = report.Checks.FirstOrDefault(c => c.Name == "service:aaBootstrap");
bootstrap.ShouldNotBeNull();
bootstrap.Status.ShouldBe(PrerequisiteStatus.Pass,
$"aaBootstrap must be Running for any live-Galaxy test to work — detail: {bootstrap.Detail}");
}
[Fact]
public async Task CheckAll_on_live_box_reports_aaGR_running()
{
var report = await AvevaPrerequisites.CheckAllAsync();
var gr = report.Checks.FirstOrDefault(c => c.Name == "service:aaGR");
gr.ShouldNotBeNull();
gr.Status.ShouldBe(PrerequisiteStatus.Pass,
$"aaGR (Galaxy Repository) must be Running — detail: {gr.Detail}");
}
[Fact]
public async Task CheckAll_on_live_box_reports_MxAccess_COM_registered()
{
var report = await AvevaPrerequisites.CheckAllAsync();
var com = report.Checks.FirstOrDefault(c => c.Name == "com:LMXProxy");
com.ShouldNotBeNull();
com.Status.ShouldBe(PrerequisiteStatus.Pass,
$"LMXProxy.LMXProxyServer ProgID must resolve to an InprocServer32 DLL — detail: {com.Detail}");
}
[Fact]
public async Task CheckRepositoryOnly_on_live_box_reports_ZB_reachable()
{
var report = await AvevaPrerequisites.CheckRepositoryOnlyAsync(ct: CancellationToken.None);
var zb = report.Checks.FirstOrDefault(c => c.Name == "sql:ZB");
zb.ShouldNotBeNull();
zb.Status.ShouldBe(PrerequisiteStatus.Pass,
$"ZB database must be reachable via SQL Server Windows auth — detail: {zb.Detail}");
}
[Fact]
public async Task CheckRepositoryOnly_on_live_box_reports_non_zero_deployed_objects()
{
// This box has 49 deployed objects per the research; we just assert > 0 so adding/
// removing objects doesn't break the test.
var report = await AvevaPrerequisites.CheckRepositoryOnlyAsync();
var deployed = report.Checks.FirstOrDefault(c => c.Name == "sql:ZB.deployedObjects");
deployed.ShouldNotBeNull();
deployed.Status.ShouldBe(PrerequisiteStatus.Pass,
$"At least one deployed gobject should exist — detail: {deployed.Detail}");
}
[Fact]
public async Task Aveva_side_is_ready_on_this_machine()
{
// Narrower than "livetest ready" — our own services (OtOpcUa / OtOpcUaGalaxyHost)
// may not be installed on a developer's box while they're actively iterating on
// them, but the AVEVA side (Framework / Galaxy Repository / MXAccess COM /
// SQL / core services) should always be up on a machine with System Platform
// installed. This assertion is what gates live-Galaxy tests that go straight to
// the Galaxy Repository without routing through our stack.
var report = await AvevaPrerequisites.CheckAllAsync(
new AvevaPrerequisites.Options { CheckGalaxyHostPipe = false });
_output.WriteLine(report.ToString());
_output.WriteLine(report.Warnings ?? "no warnings");
// Enumerate AVEVA-side failures (if any) for an actionable assertion message.
var avevaFails = report.Checks
.Where(c => c.Status == PrerequisiteStatus.Fail &&
c.Category != PrerequisiteCategory.OtOpcUaService)
.ToList();
report.IsAvevaSideReady.ShouldBeTrue(
avevaFails.Count == 0
? "unexpected state"
: "AVEVA-side failures: " + string.Join(" ; ",
avevaFails.Select(f => $"{f.Name}: {f.Detail}")));
}
[Fact]
public async Task Report_captures_OtOpcUa_services_state_even_when_not_installed()
{
// The helper reports the status of the OtOpcUaGalaxyHost, OtOpcUa, and GLAuth services
// even if they're not installed yet; absence is itself an actionable signal. This test
// doesn't assert Pass/Fail on those services (their state depends on what's
// installed when the test runs). It only asserts the helper EMITTED the rows,
// so nobody can ship a prerequisite check that silently omits these services.
var report = await AvevaPrerequisites.CheckAllAsync();
report.Checks.ShouldContain(c => c.Name == "service:OtOpcUaGalaxyHost");
report.Checks.ShouldContain(c => c.Name == "service:OtOpcUa");
report.Checks.ShouldContain(c => c.Name == "service:GLAuth");
}
}
}

@@ -6,6 +6,7 @@ using Xunit;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.Galaxy;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared.Contracts;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.TestSupport;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Tests
{
@@ -16,6 +17,11 @@ namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Tests
/// SQL the v1 Host uses, proving the lift is byte-for-byte equivalent at the
/// <c>DiscoverHierarchyResponse</c> shape.
/// </summary>
/// <remarks>
/// Since PR 36, skip logic is delegated to <see cref="AvevaPrerequisites.CheckRepositoryOnlyAsync"/>
/// so operators see exactly why a test skipped ("ZB db not found" vs "SQL Server
/// unreachable") instead of a silent return.
/// </remarks>
[Trait("Category", "LiveGalaxy")]
public sealed class GalaxyRepositoryLiveSmokeTests
{
@@ -26,15 +32,20 @@ namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Tests
CommandTimeoutSeconds = 10,
};
private static async Task<string?> RepositorySkipReasonAsync()
{
using var cts = new CancellationTokenSource(TimeSpan.FromSeconds(4));
var report = await AvevaPrerequisites.CheckRepositoryOnlyAsync(
DevZbOptions().ConnectionString, cts.Token);
return report.SkipReason;
}
private static async Task<bool> ZbReachableAsync()
{
// Legacy silent-skip adapter: keeps the existing tests compiling while
// gradually migrating to the Skip-with-reason pattern. Returns true when the
// prerequisite check has no Fail entries.
return (await RepositorySkipReasonAsync()) is null;
}
[Fact]

@@ -23,6 +23,7 @@
<ItemGroup>
<ProjectReference Include="..\..\src\ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host\ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.csproj"/>
<ProjectReference Include="..\ZB.MOM.WW.OtOpcUa.Driver.Galaxy.TestSupport\ZB.MOM.WW.OtOpcUa.Driver.Galaxy.TestSupport.csproj"/>
<Reference Include="System.ServiceProcess"/>
<!-- IMxProxy's delegate signatures mention ArchestrA.MxAccess.MXSTATUS_PROXY, so tests
implementing the interface must resolve that type at compile time. -->

@@ -0,0 +1,81 @@
using Shouldly;
using Xunit;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared.Contracts;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy.Tests;
/// <summary>
/// Pins <see cref="GalaxyProxyDriver.ToHistoricalEvent"/> — the wire-to-domain mapping
/// from <see cref="GalaxyHistoricalEvent"/> (MessagePack-annotated IPC contract,
/// Unix-ms timestamps) to <c>Core.Abstractions.HistoricalEvent</c> (domain record,
/// <see cref="DateTime"/> timestamps). Added in PR 35 alongside the new
/// <c>IHistoryProvider.ReadEventsAsync</c> method.
/// </summary>
[Trait("Category", "Unit")]
public sealed class HistoricalEventMappingTests
{
[Fact]
public void Maps_every_field_from_wire_to_domain_record()
{
var wire = new GalaxyHistoricalEvent
{
EventId = "evt-42",
SourceName = "Tank1.HiAlarm",
EventTimeUtcUnixMs = 1_700_000_000_000L, // 2023-11-14T22:13:20.000Z
ReceivedTimeUtcUnixMs = 1_700_000_000_500L,
DisplayText = "High level reached",
Severity = 750,
};
var domain = GalaxyProxyDriver.ToHistoricalEvent(wire);
domain.EventId.ShouldBe("evt-42");
domain.SourceName.ShouldBe("Tank1.HiAlarm");
domain.EventTimeUtc.ShouldBe(new DateTime(2023, 11, 14, 22, 13, 20, DateTimeKind.Utc));
domain.ReceivedTimeUtc.ShouldBe(new DateTime(2023, 11, 14, 22, 13, 20, 500, DateTimeKind.Utc));
domain.Message.ShouldBe("High level reached");
domain.Severity.ShouldBe((ushort)750);
}
[Fact]
public void Preserves_null_SourceName_and_DisplayText()
{
// Historical rows from the Galaxy event historian often omit source or message for
// system events (e.g. time sync). The mapping must preserve null — callers use it to
// distinguish system events from alarm events.
var wire = new GalaxyHistoricalEvent
{
EventId = "sys-1",
SourceName = null,
EventTimeUtcUnixMs = 0,
ReceivedTimeUtcUnixMs = 0,
DisplayText = null,
Severity = 1,
};
var domain = GalaxyProxyDriver.ToHistoricalEvent(wire);
domain.SourceName.ShouldBeNull();
domain.Message.ShouldBeNull();
}
[Fact]
public void EventTime_and_ReceivedTime_are_produced_as_DateTimeKind_Utc()
{
// Unix-ms timestamps come off the wire timezone-agnostic; the mapping must tag the
// resulting DateTime as Utc so downstream serializers (JSON, OPC UA types) don't apply
// an unexpected local-time offset.
var wire = new GalaxyHistoricalEvent
{
EventId = "e",
EventTimeUtcUnixMs = 1_000L,
ReceivedTimeUtcUnixMs = 2_000L,
};
var domain = GalaxyProxyDriver.ToHistoricalEvent(wire);
domain.EventTimeUtc.Kind.ShouldBe(DateTimeKind.Utc);
domain.ReceivedTimeUtc.Kind.ShouldBe(DateTimeKind.Utc);
}
}

@@ -0,0 +1,75 @@
using System.Runtime.InteropServices;
using System.Runtime.Versioning;
using Microsoft.Win32;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy.Tests.LiveStack;
/// <summary>
/// Resolves the pipe name + shared secret the live <see cref="GalaxyProxyDriver"/> needs
/// to connect to a running <c>OtOpcUaGalaxyHost</c> Windows service. Two sources are
/// consulted, first match wins:
/// <list type="number">
/// <item>Explicit env vars (<c>OTOPCUA_GALAXY_PIPE</c>, <c>OTOPCUA_GALAXY_SECRET</c>) — lets CI / benchwork override.</item>
/// <item>The service's per-process <c>Environment</c> registry values under
/// <c>HKLM\SYSTEM\CurrentControlSet\Services\OtOpcUaGalaxyHost</c> — what
/// <c>Install-Services.ps1</c> writes at install time. Requires the test to run as a
/// principal with read access to that registry key (typically Administrators).</item>
/// </list>
/// </summary>
/// <remarks>
/// Deliberately not hard-coded in source: the shared secret is rotated per install (the
/// installer generates 32 random bytes and stores the base64 string). A hard-coded secret
/// in tests would diverge from production the moment someone re-installed the service.
/// </remarks>
public sealed record LiveStackConfig(string PipeName, string SharedSecret, string? Source)
{
public const string EnvPipeName = "OTOPCUA_GALAXY_PIPE";
public const string EnvSharedSecret = "OTOPCUA_GALAXY_SECRET";
public const string ServiceRegistryKey =
@"SYSTEM\CurrentControlSet\Services\OtOpcUaGalaxyHost";
public const string DefaultPipeName = "OtOpcUaGalaxy";
public static LiveStackConfig? Resolve()
{
var envPipe = Environment.GetEnvironmentVariable(EnvPipeName);
var envSecret = Environment.GetEnvironmentVariable(EnvSharedSecret);
if (!string.IsNullOrWhiteSpace(envPipe) && !string.IsNullOrWhiteSpace(envSecret))
return new LiveStackConfig(envPipe, envSecret, "env vars");
if (!RuntimeInformation.IsOSPlatform(OSPlatform.Windows))
return null;
return FromServiceRegistry();
}
[SupportedOSPlatform("windows")]
private static LiveStackConfig? FromServiceRegistry()
{
try
{
using var key = Registry.LocalMachine.OpenSubKey(ServiceRegistryKey);
if (key is null) return null;
var env = key.GetValue("Environment") as string[];
if (env is null || env.Length == 0) return null;
string? pipe = null, secret = null;
foreach (var line in env)
{
var eq = line.IndexOf('=');
if (eq <= 0) continue;
var name = line[..eq];
var value = line[(eq + 1)..];
if (name.Equals(EnvPipeName, StringComparison.OrdinalIgnoreCase)) pipe = value;
else if (name.Equals(EnvSharedSecret, StringComparison.OrdinalIgnoreCase)) secret = value;
}
if (string.IsNullOrWhiteSpace(secret)) return null;
return new LiveStackConfig(pipe ?? DefaultPipeName, secret, "service registry");
}
catch
{
// Access denied / key missing / malformed — caller gets null and surfaces a Skip.
return null;
}
}
}
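The `NAME=value` walk in `FromServiceRegistry` can be exercised without touching the registry. The sketch below is a hypothetical standalone helper (`EnvBlockSketch` is not part of this PR) that mirrors the loop's rules: split on the first `=`, skip entries with no name, and compare names case-insensitively. Splitting on the first `=` matters because a base64 secret may itself end in `=` padding.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration only; not part of the PR.
// Parses "NAME=value" entries the way FromServiceRegistry walks the
// service's Environment REG_MULTI_SZ value.
public static class EnvBlockSketch
{
    public static Dictionary<string, string> Parse(IEnumerable<string> lines)
    {
        // Case-insensitive keys, matching the OrdinalIgnoreCase comparisons in the PR.
        var result = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        foreach (var line in lines)
        {
            var eq = line.IndexOf('=');
            if (eq <= 0) continue; // no '=' at all, or an empty name: skip the entry

            // Split on the FIRST '=' only, so '=' characters inside the value survive.
            result[line.Substring(0, eq)] = line.Substring(eq + 1);
        }
        return result;
    }
}
```

Usage: `EnvBlockSketch.Parse(new[] { "OTOPCUA_GALAXY_SECRET=abc=" })["otopcua_galaxy_secret"]` yields `abc=`, padding intact. `Substring` is used instead of range syntax so the sketch also compiles on net48.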

@@ -0,0 +1,120 @@
using System.Threading;
using System.Threading.Tasks;
using Xunit;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.TestSupport;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy.Tests.LiveStack;
/// <summary>
/// Connects a single <see cref="GalaxyProxyDriver"/> to the already-running
/// <c>OtOpcUaGalaxyHost</c> Windows service for the lifetime of a test class. Uses
/// <see cref="AvevaPrerequisites"/> to decide whether to proceed; on failure,
/// <see cref="SkipReason"/> is populated and each test calls <see cref="SkipIfUnavailable"/>
/// to translate that into <c>Assert.Skip</c>.
/// </summary>
/// <remarks>
/// <para>
/// <b>Does NOT spawn the Host process.</b> Production deploys <c>OtOpcUaGalaxyHost</c>
/// as a standalone Windows service — spawning a second instance from a test would
/// bypass the COM-apartment + service-account setup and fail differently than
/// production (see <c>project_galaxy_host_service.md</c> memory).
/// </para>
/// <para>
/// <b>Shared-secret handling</b>: read from <see cref="LiveStackConfig"/> — env vars
/// first, then the service's registry-stored <c>Environment</c> values. Requires
/// the test process to have read access to
/// <c>HKLM\SYSTEM\CurrentControlSet\Services\OtOpcUaGalaxyHost</c>; on a dev box
/// that typically means running the test host elevated, or exporting
/// <c>OTOPCUA_GALAXY_SECRET</c> out-of-band.
/// </para>
/// </remarks>
public sealed class LiveStackFixture : IAsyncLifetime
{
public GalaxyProxyDriver? Driver { get; private set; }
public string? SkipReason { get; private set; }
public PrerequisiteReport? PrerequisiteReport { get; private set; }
public LiveStackConfig? Config { get; private set; }
public async ValueTask InitializeAsync()
{
// 1. AVEVA + OtOpcUa service state — actionable diagnostic if anything is missing.
using var cts = new CancellationTokenSource(TimeSpan.FromSeconds(10));
PrerequisiteReport = await AvevaPrerequisites.CheckAllAsync(
new AvevaPrerequisites.Options { CheckGalaxyHostPipe = true, CheckHistorian = false },
cts.Token);
if (!PrerequisiteReport.IsLivetestReady)
{
SkipReason = PrerequisiteReport.SkipReason;
return;
}
// 2. Secret / pipe-name resolution. If the service is running but we can't discover its
// env vars from registry (non-elevated test host), a clear message beats a silent
// connect-rejected failure 10 seconds later.
Config = LiveStackConfig.Resolve();
if (Config is null)
{
SkipReason =
$"Cannot resolve shared secret. Set {LiveStackConfig.EnvSharedSecret} (and optionally " +
$"{LiveStackConfig.EnvPipeName}) in the environment, or run the test host elevated so it " +
$"can read HKLM\\{LiveStackConfig.ServiceRegistryKey}\\Environment.";
return;
}
// 3. Connect. InitializeAsync does the pipe connect + handshake; a 5-second
// ConnectTimeout gives enough headroom for a service that just started.
Driver = new GalaxyProxyDriver(new GalaxyProxyOptions
{
DriverInstanceId = "live-stack-smoke",
PipeName = Config.PipeName,
SharedSecret = Config.SharedSecret,
ConnectTimeout = TimeSpan.FromSeconds(5),
});
try
{
await Driver.InitializeAsync(driverConfigJson: "{}", CancellationToken.None);
}
catch (Exception ex)
{
SkipReason =
$"Connected to named pipe '{Config.PipeName}' but GalaxyProxyDriver.InitializeAsync failed: " +
$"{ex.GetType().Name}: {ex.Message}. Common causes: shared secret mismatch (rotated after last install), " +
$"service account SID not in pipe ACL (installer sets OTOPCUA_ALLOWED_SID to the service account — " +
$"test must run as that user), or Host's backend couldn't connect to ZB.";
Driver.Dispose();
Driver = null;
return;
}
}
public async ValueTask DisposeAsync()
{
if (Driver is not null)
{
try { await Driver.ShutdownAsync(CancellationToken.None); } catch { /* best-effort */ }
Driver.Dispose();
}
}
/// <summary>
/// Translate <see cref="SkipReason"/> into <c>Assert.Skip</c>. Tests call this at the
/// top of every fact so a fixture init failure shows up as a cleanly-skipped test with
/// the full prerequisites report, not a cascading NullReferenceException on
/// <see cref="Driver"/>.
/// </summary>
public void SkipIfUnavailable()
{
if (SkipReason is not null) Assert.Skip(SkipReason);
}
}
[CollectionDefinition(Name)]
public sealed class LiveStackCollection : ICollectionFixture<LiveStackFixture>
{
public const string Name = "LiveStack";
}

@@ -0,0 +1,147 @@
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
using Shouldly;
using Xunit;
using ZB.MOM.WW.OtOpcUa.Core.Abstractions;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy.Tests.LiveStack;
/// <summary>
/// End-to-end smoke against the installed <c>OtOpcUaGalaxyHost</c> Windows service.
/// Closes LMX follow-up #5 — exercises the full topology: <see cref="GalaxyProxyDriver"/>
/// in-process → named-pipe IPC → <c>OtOpcUaGalaxyHost</c> service → <c>MxAccessGalaxyBackend</c> →
/// live MXAccess runtime → real Galaxy objects + attributes.
/// </summary>
/// <remarks>
/// <para>
/// <b>Preconditions</b> (all checked by <see cref="LiveStackFixture"/>, surfaced via
/// <c>Assert.Skip</c> when missing):
/// </para>
/// <list type="bullet">
/// <item>AVEVA System Platform installed + Platform deployed.</item>
/// <item><c>aaBootstrap</c> / <c>aaGR</c> / <c>NmxSvc</c> / <c>MSSQLSERVER</c> running.</item>
/// <item>MXAccess COM server registered.</item>
/// <item>ZB database exists with at least one deployed gobject.</item>
/// <item><c>OtOpcUaGalaxyHost</c> service installed + running (named pipe accepting connections).</item>
/// <item>Shared secret discoverable via <c>OTOPCUA_GALAXY_SECRET</c> env var or the
/// service's registry Environment values (test host typically needs to be elevated
/// to read the latter).</item>
/// <item>Test process runs as the account listed in the service's pipe ACL
/// (<c>OTOPCUA_ALLOWED_SID</c>, typically the service account per decision #76).</item>
/// </list>
/// <para>
/// Tests here are deliberately read-only. Writes against live Galaxy attributes are a
/// separate concern — they need a test-only UDA or an agreed scratch tag so they can't
/// accidentally mutate a process-critical value. Adding a write test is a follow-up
/// PR that reuses this fixture.
/// </para>
/// </remarks>
[Trait("Category", "LiveGalaxy")]
[Collection(LiveStackCollection.Name)]
public sealed class LiveStackSmokeTests(LiveStackFixture fixture)
{
[Fact]
public void Fixture_initialized_successfully()
{
fixture.SkipIfUnavailable();
// If the fixture init succeeded, Driver is non-null and InitializeAsync completed.
// This is the cheapest possible assertion that the IPC handshake worked end-to-end;
// every other test in this class depends on it.
fixture.Driver.ShouldNotBeNull();
fixture.Config.ShouldNotBeNull();
fixture.PrerequisiteReport.ShouldNotBeNull();
fixture.PrerequisiteReport!.IsLivetestReady.ShouldBeTrue(fixture.PrerequisiteReport.SkipReason);
}
[Fact]
public void Driver_reports_Healthy_after_IPC_handshake()
{
fixture.SkipIfUnavailable();
var health = fixture.Driver!.GetHealth();
health.State.ShouldBe(DriverState.Healthy,
$"Expected Healthy after successful IPC connect; LastError={health.LastError}");
}
[Fact]
public async Task DiscoverAsync_returns_at_least_one_variable_from_live_galaxy()
{
fixture.SkipIfUnavailable();
var builder = new CapturingAddressSpaceBuilder();
using var cts = new CancellationTokenSource(TimeSpan.FromSeconds(30));
await fixture.Driver!.DiscoverAsync(builder, cts.Token);
builder.Variables.Count.ShouldBeGreaterThan(0,
"Live Galaxy has > 0 deployed objects per the prereq check — at least one variable must be discovered. " +
"Zero usually means the Host couldn't read ZB (check OTOPCUA_GALAXY_ZB_CONN in the service Environment).");
// Every discovered attribute must carry a non-empty FullName so the OPC UA server can
// route reads/writes back. Regression guard — PR 19 normalized this across drivers.
builder.Variables.ShouldAllBe(v => !string.IsNullOrEmpty(v.AttributeInfo.FullName));
}
[Fact]
public void GetHostStatuses_reports_at_least_one_platform()
{
fixture.SkipIfUnavailable();
var statuses = fixture.Driver!.GetHostStatuses();
statuses.Count.ShouldBeGreaterThan(0,
"Live Galaxy must report at least one Platform/AppEngine host via IHostConnectivityProbe. " +
"Zero means the Host's probe loop hasn't completed its first tick or the Platform isn't deployed locally.");
// Host names are driver-opaque to the Core but non-empty by contract.
statuses.ShouldAllBe(h => !string.IsNullOrEmpty(h.HostName));
}
[Fact]
public async Task Can_read_a_discovered_variable_from_live_galaxy()
{
fixture.SkipIfUnavailable();
var builder = new CapturingAddressSpaceBuilder();
using var cts = new CancellationTokenSource(TimeSpan.FromSeconds(30));
await fixture.Driver!.DiscoverAsync(builder, cts.Token);
builder.Variables.Count.ShouldBeGreaterThan(0);
// Pick the first discovered variable. Read-only smoke — we don't assert on Value,
// only that a ReadAsync round-trip through Proxy → Host pipe → MXAccess → back
// returns a snapshot with a non-BadInternalError status. Galaxy attributes default to
// Uncertain quality until the Engine's first scan publishes them, which is fine here.
var full = builder.Variables[0].AttributeInfo.FullName;
var snapshots = await fixture.Driver!.ReadAsync([full], cts.Token);
snapshots.Count.ShouldBe(1);
var snap = snapshots[0];
snap.StatusCode.ShouldNotBe(0x80020000u,
$"Read returned BadInternalError for {full} — the Host couldn't fulfil the request. " +
$"Investigate: the Host service's logs at {System.Environment.GetFolderPath(System.Environment.SpecialFolder.CommonApplicationData)}\\OtOpcUa\\Galaxy\\logs.");
}
/// <summary>
/// Minimal <see cref="IAddressSpaceBuilder"/> implementation that captures every
/// Variable() call into a flat list so tests can inspect what discovery produced
/// without running the full OPC UA node-manager stack.
/// </summary>
private sealed class CapturingAddressSpaceBuilder : IAddressSpaceBuilder
{
public List<(string BrowseName, DriverAttributeInfo AttributeInfo)> Variables { get; } = [];
public IAddressSpaceBuilder Folder(string browseName, string displayName) => this;
public IVariableHandle Variable(string browseName, string displayName, DriverAttributeInfo attributeInfo)
{
Variables.Add((browseName, attributeInfo));
return new NoopHandle(attributeInfo.FullName);
}
public void AddProperty(string browseName, DriverDataType dataType, object? value) { }
private sealed class NoopHandle(string fullReference) : IVariableHandle
{
public string FullReference { get; } = fullReference;
public IAlarmConditionSink MarkAsAlarmCondition(AlarmConditionInfo info) => new NoopSink();
private sealed class NoopSink : IAlarmConditionSink
{
public void OnTransition(AlarmEventArgs args) { }
}
}
}
}
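The read test rejects only one specific status code. For context, OPC UA encodes severity in the top two bits of the 32-bit status code (00 Good, 01 Uncertain, 10 Bad), so `0x80020000` (BadInternalError) is a Bad code, while the Uncertain quality that Galaxy attributes start with is still accepted. A minimal sketch of that classification follows; `StatusCodeSketch` is an illustrative name, not a type from this PR or from any OPC UA SDK.

```csharp
using System;

// Illustrative only: classifies an OPC UA status code by its severity bits.
public static class StatusCodeSketch
{
    public const uint BadInternalError = 0x80020000u;

    // Severity lives in bits 30-31 of the status code.
    public static string Severity(uint statusCode)
    {
        switch (statusCode >> 30)
        {
            case 0: return "Good";
            case 1: return "Uncertain";
            case 2: return "Bad";
            default: return "Reserved"; // bit pattern 11 is not assigned a severity
        }
    }
}
```

A stricter smoke test could assert that the severity is not `"Bad"` at all, but that would also trip on transient Bad codes unrelated to the Host, which may be why the PR checks only the one value.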

@@ -22,6 +22,7 @@
<ItemGroup>
<ProjectReference Include="..\..\src\ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy\ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy.csproj"/>
<ProjectReference Include="..\..\src\ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host\ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.csproj"/>
<ProjectReference Include="..\ZB.MOM.WW.OtOpcUa.Driver.Galaxy.TestSupport\ZB.MOM.WW.OtOpcUa.Driver.Galaxy.TestSupport.csproj"/>
</ItemGroup>
<ItemGroup>

@@ -0,0 +1,163 @@
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.TestSupport.Probes;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.TestSupport;
/// <summary>
/// Entry point for live-AVEVA test fixtures. Runs every relevant probe and returns a
/// <see cref="PrerequisiteReport"/> whose <c>SkipReason</c> feeds <c>Assert.Skip</c> when
/// the environment isn't set up. Non-Windows hosts get a single aggregated Skip row per
/// category instead of a flood of individual skips.
/// </summary>
/// <remarks>
/// <para><b>Call shape</b>:</para>
/// <code>
/// var report = await AvevaPrerequisites.CheckAllAsync();
/// if (report.SkipReason is not null) Assert.Skip(report.SkipReason);
/// </code>
/// <para><b>Categories in rough order of 'would I want to know first?'</b>:</para>
/// <list type="number">
/// <item>Environment — process bitness, OS platform, RPCSS up.</item>
/// <item>AvevaInstall — Framework registry, install paths, no pending reboot.</item>
/// <item>AvevaCoreService — aaBootstrap / aaGR / NmxSvc / MSSQLSERVER running.</item>
/// <item>MxAccessCom — LMXProxy.LMXProxyServer ProgID → CLSID → file-on-disk.</item>
/// <item>GalaxyRepository — SQL reachable, ZB exists, deployed-object count.</item>
/// <item>OtOpcUaService — our two Windows services + GLAuth.</item>
/// <item>AvevaSoftService — aaLogger etc., warn only.</item>
/// <item>AvevaHistorian — aahClientAccessPoint etc., optional.</item>
/// </list>
/// <para><b>What's NOT checked here</b>: end-to-end subscribe / read / write against a real
/// Galaxy tag. That's the job of the live-smoke tests this helper gates — the helper just
/// tells them whether running is worthwhile.</para>
/// </remarks>
public static class AvevaPrerequisites
{
// -------- Individual service lists (kept as data so tests can inspect / override) --------
/// <summary>Services whose absence means live-Galaxy tests can't run at all.</summary>
internal static readonly (string Name, string Purpose)[] CoreServices =
[
("aaBootstrap", "master service that starts the Platform process + brokers aa* communication"),
("aaGR", "Galaxy Repository host — mediates IDE / runtime access to ZB"),
("NmxSvc", "Network Message Exchange — MXAccess + Bootstrap transport"),
("MSSQLSERVER", "SQL Server instance that hosts the ZB database"),
];
/// <summary>Warn-but-don't-fail AVEVA services.</summary>
internal static readonly (string Name, string Purpose)[] SoftServices =
[
("aaLogger", "ArchestrA Logger — diagnostic log receiver; stack runs without it but error visibility suffers"),
("aaUserValidator", "OS user/group auth for ArchestrA security; only required when Galaxy security mode isn't 'Open'"),
("aaGlobalDataCacheMonitorSvr", "cross-platform global data cache; single-node dev boxes run fine without it"),
];
/// <summary>Optional AVEVA Historian services — only required for HistoryRead IPC paths.</summary>
internal static readonly (string Name, string Purpose)[] HistorianServices =
[
("aahClientAccessPoint", "AVEVA Historian Client Access Point — HistoryRead IPC endpoint"),
("aahGateway", "AVEVA Historian Gateway"),
];
/// <summary>OtOpcUa-stack Windows services + third-party deps we manage.</summary>
internal static readonly (string Name, string Purpose, bool HardRequired)[] OtOpcUaServices =
[
("OtOpcUaGalaxyHost", "Galaxy.Host out-of-process service (net48 x86, STA + MXAccess)", true),
("OtOpcUa", "Main OPC UA server service (hosts Proxy + DriverHost + Admin-facing DB publisher)", false),
("GLAuth", "LDAP server (dev only) — glauth.exe on localhost:3893", false),
];
// -------- Orchestrator --------
public static async Task<PrerequisiteReport> CheckAllAsync(
Options? options = null, CancellationToken ct = default)
{
options ??= new Options();
var checks = new List<PrerequisiteCheck>();
// Environment
checks.Add(MxAccessComProbe.CheckProcessBitness());
// AvevaInstall — registry + files
checks.Add(RegistryProbe.CheckFrameworkInstalled());
checks.Add(RegistryProbe.CheckPlatformDeployed());
checks.Add(RegistryProbe.CheckRebootPending());
// AvevaCoreService
foreach (var (name, purpose) in CoreServices)
checks.Add(ServiceProbe.Check(name, PrerequisiteCategory.AvevaCoreService, hardRequired: true, whatItDoes: purpose));
// MxAccessCom
checks.Add(MxAccessComProbe.Check());
// GalaxyRepository
checks.Add(await SqlProbe.CheckZbDatabaseAsync(options.SqlConnectionString, ct));
// Deployed-object count only makes sense if the DB check passed.
if (checks[checks.Count - 1].Status == PrerequisiteStatus.Pass)
checks.Add(await SqlProbe.CheckDeployedObjectCountAsync(options.SqlConnectionString, ct));
// OtOpcUaService
foreach (var (name, purpose, hard) in OtOpcUaServices)
checks.Add(ServiceProbe.Check(name, PrerequisiteCategory.OtOpcUaService, hardRequired: hard, whatItDoes: purpose));
if (options.CheckGalaxyHostPipe)
checks.Add(await NamedPipeProbe.CheckGalaxyHostPipeAsync(options.GalaxyHostPipeName, ct));
// AvevaSoftService
foreach (var (name, purpose) in SoftServices)
checks.Add(ServiceProbe.Check(name, PrerequisiteCategory.AvevaSoftService, hardRequired: false, whatItDoes: purpose));
// AvevaHistorian
if (options.CheckHistorian)
{
foreach (var (name, purpose) in HistorianServices)
checks.Add(ServiceProbe.Check(name, PrerequisiteCategory.AvevaHistorian, hardRequired: false, whatItDoes: purpose));
}
return new PrerequisiteReport(checks);
}
/// <summary>
/// Narrower check for tests that only need the Galaxy Repository (SQL) path — don't
/// pay the cost of probing every aa* service when the test only reads gobject rows.
/// </summary>
public static async Task<PrerequisiteReport> CheckRepositoryOnlyAsync(
string? sqlConnectionString = null, CancellationToken ct = default)
{
var checks = new List<PrerequisiteCheck>
{
await SqlProbe.CheckZbDatabaseAsync(sqlConnectionString, ct),
};
if (checks[0].Status == PrerequisiteStatus.Pass)
checks.Add(await SqlProbe.CheckDeployedObjectCountAsync(sqlConnectionString, ct));
return new PrerequisiteReport(checks);
}
/// <summary>
/// Narrower check for the named-pipe endpoint — tests that drive the full Proxy
/// against a live Galaxy.Host service don't need the SQL or AVEVA-internal probes
/// (the Host does that work internally; we just need the pipe to accept).
/// </summary>
public static async Task<PrerequisiteReport> CheckGalaxyHostPipeOnlyAsync(
string? pipeName = null, CancellationToken ct = default)
{
var checks = new List<PrerequisiteCheck>
{
await NamedPipeProbe.CheckGalaxyHostPipeAsync(pipeName, ct),
};
return new PrerequisiteReport(checks);
}
/// <summary>Knobs for <see cref="CheckAllAsync"/>.</summary>
public sealed class Options
{
/// <summary>SQL Server connection string — defaults to Windows-auth <c>localhost\ZB</c>.</summary>
public string? SqlConnectionString { get; init; }
/// <summary>Named-pipe endpoint for OtOpcUaGalaxyHost — defaults to <c>OtOpcUaGalaxy</c>.</summary>
public string? GalaxyHostPipeName { get; init; }
/// <summary>Include the named-pipe probe. On by default; set to <c>false</c> for tests that don't need it, since an unreachable pipe costs up to two seconds of connect timeout.</summary>
public bool CheckGalaxyHostPipe { get; init; } = true;
/// <summary>Include Historian service probes. Off by default — Historian is optional.</summary>
public bool CheckHistorian { get; init; } = false;
}
}

@@ -0,0 +1,26 @@
#if NET48
// Polyfills for types that C# 9+ language features depend on but the net48 BCL doesn't
// ship. They let the same .cs files build on both target frameworks without preprocessor
// guards at the call sites.
namespace System.Runtime.CompilerServices
{
/// <summary>Required by C# 9 <c>init</c>-only setters and <c>record</c> types.</summary>
internal static class IsExternalInit { }
}
namespace System.Runtime.Versioning
{
/// <summary>
/// Minimal shim for the .NET 5+ <c>SupportedOSPlatformAttribute</c>. On net10 the real
/// attribute is a pure marker for the compiler's platform-compatibility analysis; net48
/// doesn't define it, so this shim lets the same <c>[SupportedOSPlatform("windows")]</c>
/// source compile there. The shim is internal and valid on every attribute target to keep
/// its surface minimal.
/// </summary>
[AttributeUsage(AttributeTargets.All, Inherited = false, AllowMultiple = true)]
internal sealed class SupportedOSPlatformAttribute(string platformName) : Attribute
{
public string PlatformName { get; } = platformName;
}
}
#endif

@@ -0,0 +1,44 @@
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.TestSupport;
/// <summary>One prerequisite probe's outcome. <see cref="AvevaPrerequisites"/> returns many of these.</summary>
/// <param name="Name">Short diagnostic id — e.g. <c>service:aaBootstrap</c>, <c>sql:ZB</c>, <c>registry:ArchestrA.Framework</c>.</param>
/// <param name="Category">Which subsystem the probe belongs to — lets callers filter (e.g. "Historian warns don't gate the core Galaxy smoke").</param>
/// <param name="Status">Outcome.</param>
/// <param name="Detail">One-line specific message an operator can act on — <c>"aaGR not installed — install the Galaxy Repository role from the System Platform setup"</c> beats <c>"failed"</c>.</param>
public sealed record PrerequisiteCheck(
string Name,
PrerequisiteCategory Category,
PrerequisiteStatus Status,
string Detail);
public enum PrerequisiteStatus
{
/// <summary>Prerequisite is met; no action needed.</summary>
Pass,
/// <summary>Soft dependency missing — stack still runs but some feature (e.g. logging) is degraded.</summary>
Warn,
/// <summary>Hard dependency missing — live tests can't proceed; <see cref="PrerequisiteReport.SkipReason"/> surfaces this.</summary>
Fail,
/// <summary>Probe wasn't applicable in this environment (e.g. non-Windows host, Historian not installed).</summary>
Skip,
}
public enum PrerequisiteCategory
{
/// <summary>Platform sanity — process bitness, OS platform, DCOM/RPCSS.</summary>
Environment,
/// <summary>Hard-required Windows services for the AVEVA stack (aaBootstrap, aaGR, NmxSvc, MSSQLSERVER).</summary>
AvevaCoreService,
/// <summary>Soft-required AVEVA Windows services (aaLogger, aaUserValidator) — warn only.</summary>
AvevaSoftService,
/// <summary>ArchestrA Framework install markers (registry + files).</summary>
AvevaInstall,
/// <summary>MXAccess COM server registration + file on disk.</summary>
MxAccessCom,
/// <summary>SQL Server reachability + ZB database presence + deployed-object count.</summary>
GalaxyRepository,
/// <summary>Historian services (optional — only required for HistoryRead IPC paths).</summary>
AvevaHistorian,
/// <summary>OtOpcUa-side services (OtOpcUa, OtOpcUaGalaxyHost) + third-party deps (GLAuth).</summary>
OtOpcUaService,
}

@@ -0,0 +1,94 @@
using System.Text;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.TestSupport;
/// <summary>
/// Aggregated result of an <see cref="AvevaPrerequisites.CheckAllAsync"/> run. Test fixtures
/// typically read <see cref="SkipReason"/> to produce the argument for xUnit's
/// <c>Assert.Skip</c> when any hard dependency failed.
/// </summary>
public sealed class PrerequisiteReport
{
public IReadOnlyList<PrerequisiteCheck> Checks { get; }
public PrerequisiteReport(IEnumerable<PrerequisiteCheck> checks)
{
Checks = [.. checks];
}
/// <summary>True when every probe is Pass / Warn / Skip — no Fail entries.</summary>
public bool IsLivetestReady => !Checks.Any(c => c.Status == PrerequisiteStatus.Fail);
/// <summary>
/// True when only the AVEVA-side probes pass — ignores failures in the
/// <see cref="PrerequisiteCategory.OtOpcUaService"/> category. Lets a live-test gate
/// say "AVEVA is ready even if the v2 services aren't installed yet" without
/// conflating the two. Useful for tests that exercise Galaxy directly (e.g.
/// <see cref="GalaxyRepositoryLiveSmokeTests"/>) rather than through our stack.
/// </summary>
public bool IsAvevaSideReady =>
!Checks.Any(c => c.Status == PrerequisiteStatus.Fail && c.Category != PrerequisiteCategory.OtOpcUaService);
/// <summary>
/// Multi-line message for <c>Assert.Skip</c> when a hard dependency isn't met. Returns
/// null when <see cref="IsLivetestReady"/> is true.
/// </summary>
public string? SkipReason
{
get
{
var fails = Checks.Where(c => c.Status == PrerequisiteStatus.Fail).ToList();
if (fails.Count == 0) return null;
var sb = new StringBuilder();
sb.AppendLine($"Live-AVEVA prerequisites not met ({fails.Count} failed):");
foreach (var f in fails)
sb.AppendLine($" • [{f.Category}] {f.Name} — {f.Detail}");
sb.Append("Run `Get-Service aa*` / `sqlcmd -S localhost -d ZB -E -Q \"SELECT 1\"` to triage.");
return sb.ToString();
}
}
/// <summary>
/// Human-readable summary of warnings — caller decides whether to log or ignore. Useful
/// when a live test does pass but an operator should know their environment is degraded.
/// </summary>
public string? Warnings
{
get
{
var warns = Checks.Where(c => c.Status == PrerequisiteStatus.Warn).ToList();
if (warns.Count == 0) return null;
var sb = new StringBuilder();
sb.AppendLine($"AVEVA prerequisites with warnings ({warns.Count}):");
foreach (var w in warns)
sb.AppendLine($" • [{w.Category}] {w.Name} — {w.Detail}");
return sb.ToString();
}
}
/// <summary>
/// Throw <see cref="InvalidOperationException"/> if any <paramref name="categories"/>
/// contain a Fail — useful when a specific test needs, say, Galaxy Repository but doesn't
/// care about Historian. Call before <c>Assert.Skip</c> if you want to be strict.
/// </summary>
public void RequireCategories(params PrerequisiteCategory[] categories)
{
var set = categories.ToHashSet();
var fails = Checks.Where(c => c.Status == PrerequisiteStatus.Fail && set.Contains(c.Category)).ToList();
if (fails.Count == 0) return;
var detail = string.Join("; ", fails.Select(f => $"{f.Name}: {f.Detail}"));
throw new InvalidOperationException($"Required prerequisite categories failed: {detail}");
}
public override string ToString()
{
var sb = new StringBuilder();
sb.AppendLine($"PrerequisiteReport: {Checks.Count} checks");
foreach (var c in Checks)
sb.AppendLine($" [{c.Status,-4}] {c.Category}/{c.Name}: {c.Detail}");
return sb.ToString();
}
}

@@ -0,0 +1,102 @@
using System.Runtime.InteropServices;
using System.Runtime.Versioning;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.TestSupport.Probes;
/// <summary>
/// Confirms MXAccess COM server registration by resolving the
/// <c>LMXProxy.LMXProxyServer</c> ProgID to its CLSID, then checking that the CLSID's
/// 32-bit <c>InprocServer32</c> entry points at a file that exists on disk.
/// </summary>
/// <remarks>
/// A common failure mode on partial installs: ProgID is registered but the CLSID
/// InprocServer32 DLL is missing (previous install uninstalled but registry orphan remains).
/// This probe surfaces that case with an actionable message instead of the
/// <c>0x80040154 REGDB_E_CLASSNOTREG</c> you'd see from a late COM activation failure.
/// </remarks>
public static class MxAccessComProbe
{
public const string ProgId = "LMXProxy.LMXProxyServer";
public const string VersionedProgId = "LMXProxy.LMXProxyServer.1";
public static PrerequisiteCheck Check()
{
if (!RuntimeInformation.IsOSPlatform(OSPlatform.Windows))
{
return new PrerequisiteCheck("com:LMXProxy", PrerequisiteCategory.MxAccessCom,
PrerequisiteStatus.Skip, "COM registration probes only run on Windows.");
}
return CheckWindows();
}
[SupportedOSPlatform("windows")]
private static PrerequisiteCheck CheckWindows()
{
try
{
var (clsid, dll) = RegistryProbe.ResolveProgIdToInproc(ProgId);
if (clsid is null)
{
return new PrerequisiteCheck("com:LMXProxy", PrerequisiteCategory.MxAccessCom,
PrerequisiteStatus.Fail,
$"ProgID {ProgId} not registered — MXAccess COM server isn't installed. " +
$"Install System Platform's MXAccess component and re-run.");
}
if (string.IsNullOrWhiteSpace(dll))
{
return new PrerequisiteCheck("com:LMXProxy", PrerequisiteCategory.MxAccessCom,
PrerequisiteStatus.Fail,
$"ProgID {ProgId} → CLSID {clsid} but InprocServer32 is empty. " +
$"Registry is orphaned; re-register with: regsvr32 /s LmxProxy.dll (from an elevated cmd in the Framework bin dir).");
}
// Resolve the recorded path — sometimes registered as a bare filename that the COM
// runtime resolves via the current process's DLL-search path. Accept either an
// absolute path that exists, or a bare filename whose resolution we can't verify
// without loading it (treat as Pass-with-note).
if (Path.IsPathRooted(dll))
{
if (!File.Exists(dll))
{
return new PrerequisiteCheck("com:LMXProxy", PrerequisiteCategory.MxAccessCom,
PrerequisiteStatus.Fail,
$"ProgID {ProgId} → CLSID {clsid} → InprocServer32 {dll}, but the file is missing. " +
$"Re-install the Framework or restore from backup.");
}
return new PrerequisiteCheck("com:LMXProxy", PrerequisiteCategory.MxAccessCom,
PrerequisiteStatus.Pass,
$"ProgID {ProgId} → {dll} (file exists).");
}
return new PrerequisiteCheck("com:LMXProxy", PrerequisiteCategory.MxAccessCom,
PrerequisiteStatus.Pass,
$"ProgID {ProgId} → {dll} (bare filename — relies on DLL search-path resolution at COM activation time).");
}
catch (Exception ex)
{
return new PrerequisiteCheck("com:LMXProxy", PrerequisiteCategory.MxAccessCom,
PrerequisiteStatus.Warn,
$"Probe failed: {ex.GetType().Name}: {ex.Message}");
}
}
/// <summary>
/// Warn when running as a 64-bit process — MXAccess COM activation will fail with
/// <c>0x80040154</c> regardless of registration state. The production drivers run net48
/// x86; xunit hosts run 64-bit by default so this often surfaces first.
/// </summary>
public static PrerequisiteCheck CheckProcessBitness()
{
if (Environment.Is64BitProcess)
{
return new PrerequisiteCheck("env:ProcessBitness", PrerequisiteCategory.Environment,
PrerequisiteStatus.Warn,
"Test host is 64-bit. Direct MXAccess COM activation would fail with REGDB_E_CLASSNOTREG (0x80040154); " +
"the production driver workaround is to run Galaxy.Host as a 32-bit process. Tests that only " +
"talk to the Host service over the named pipe aren't affected.");
}
return new PrerequisiteCheck("env:ProcessBitness", PrerequisiteCategory.Environment,
PrerequisiteStatus.Pass, "Test host is 32-bit.");
}
}

@@ -0,0 +1,59 @@
using System.IO.Pipes;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.TestSupport.Probes;
/// <summary>
/// Verifies the <c>OtOpcUaGalaxyHost</c> named-pipe endpoint is accepting connections,
/// i.e. the endpoint the Proxy connects to at boot before it starts its handshake. A clean
/// pipe connect without sending any framed message proves the Host service is listening;
/// we disconnect immediately so we don't consume a session slot.
/// </summary>
/// <remarks>
/// Default pipe name matches the installer script's <c>OTOPCUA_GALAXY_PIPE</c> default.
/// Override when the Host service was installed with a non-default name (custom deployments).
/// </remarks>
public static class NamedPipeProbe
{
public const string DefaultGalaxyHostPipeName = "OtOpcUaGalaxy";
public static async Task<PrerequisiteCheck> CheckGalaxyHostPipeAsync(
string? pipeName = null, CancellationToken ct = default)
{
pipeName ??= DefaultGalaxyHostPipeName;
try
{
using var client = new NamedPipeClientStream(
serverName: ".",
pipeName: pipeName,
direction: PipeDirection.InOut,
options: PipeOptions.Asynchronous);
using var cts = CancellationTokenSource.CreateLinkedTokenSource(ct);
cts.CancelAfter(TimeSpan.FromSeconds(2));
await client.ConnectAsync(cts.Token);
return new PrerequisiteCheck("pipe:OtOpcUaGalaxyHost", PrerequisiteCategory.OtOpcUaService,
PrerequisiteStatus.Pass,
$@"Pipe \\.\pipe\{pipeName} accepted a connection — OtOpcUaGalaxyHost is listening.");
}
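catch (OperationCanceledException) when (ct.IsCancellationRequested)
{
// The caller cancelled the probe; that says nothing about the state of the service.
return new PrerequisiteCheck("pipe:OtOpcUaGalaxyHost", PrerequisiteCategory.OtOpcUaService,
PrerequisiteStatus.Skip,
$@"Pipe \\.\pipe\{pipeName} probe was cancelled by the caller before it completed.");
}
catch (OperationCanceledException)
{
// Only the internal 2-second deadline fired.
return new PrerequisiteCheck("pipe:OtOpcUaGalaxyHost", PrerequisiteCategory.OtOpcUaService,
PrerequisiteStatus.Fail,
$@"Pipe \\.\pipe\{pipeName} not connectable within 2s; the OtOpcUaGalaxyHost service is probably not running. " +
"Start with: sc.exe start OtOpcUaGalaxyHost");
}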
catch (TimeoutException)
{
return new PrerequisiteCheck("pipe:OtOpcUaGalaxyHost", PrerequisiteCategory.OtOpcUaService,
PrerequisiteStatus.Fail,
$@"Pipe \\.\pipe\{pipeName} connect timed out — service may be starting or stuck. " +
"Check: sc.exe query OtOpcUaGalaxyHost");
}
catch (Exception ex)
{
return new PrerequisiteCheck("pipe:OtOpcUaGalaxyHost", PrerequisiteCategory.OtOpcUaService,
PrerequisiteStatus.Fail,
$@"Pipe \\.\pipe\{pipeName} connect failed: {ex.GetType().Name}: {ex.Message}");
}
}
}
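
Reviewer note: the probe combines a caller token with its own 2-second deadline through a linked `CancellationTokenSource`. One way to tell the two cancellation sources apart is an exception filter on the caller's token. This is a minimal sketch under stated assumptions: `ConnectOutcome`, `TimeoutClassifier` and `TryConnectAsync` are hypothetical names, and the `connect` delegate stands in for `NamedPipeClientStream.ConnectAsync` so the logic can run without a pipe.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical types for illustration; not part of this PR.
public enum ConnectOutcome { Connected, TimedOut, CallerCancelled }

public static class TimeoutClassifier
{
    // `connect` stands in for NamedPipeClientStream.ConnectAsync(token).
    public static async Task<ConnectOutcome> TryConnectAsync(
        Func<CancellationToken, Task> connect, TimeSpan timeout, CancellationToken ct = default)
    {
        // Linked source: cancels when the caller cancels OR when the deadline elapses.
        using var cts = CancellationTokenSource.CreateLinkedTokenSource(ct);
        cts.CancelAfter(timeout);
        try
        {
            await connect(cts.Token);
            return ConnectOutcome.Connected;
        }
        catch (OperationCanceledException) when (ct.IsCancellationRequested)
        {
            // The caller's own token fired, for example because the test run is shutting down.
            return ConnectOutcome.CallerCancelled;
        }
        catch (OperationCanceledException)
        {
            // The caller did not cancel, so the internal deadline must have fired.
            return ConnectOutcome.TimedOut;
        }
    }
}
```

Passing `t => Task.Delay(Timeout.Infinite, t)` as `connect` simulates a pipe that never accepts, which exercises the deadline path without a running service.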


@@ -0,0 +1,162 @@
using System.Runtime.InteropServices;
using System.Runtime.Versioning;
using Microsoft.Win32;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.TestSupport.Probes;
/// <summary>
/// Reads HKLM registry keys to confirm ArchestrA Framework / System Platform install
/// markers. Matches the registered paths documented in
/// <c>docs/v2/implementation/</c> — System Platform is 32-bit so keys live under
/// <c>HKLM\SOFTWARE\WOW6432Node\ArchestrA\...</c>.
/// </summary>
public static class RegistryProbe
{
// Canonical install roots per the research on our dev box (System Platform 2020 R2).
public const string ArchestrARootKey = @"SOFTWARE\WOW6432Node\ArchestrA";
public const string FrameworkKey = @"SOFTWARE\WOW6432Node\ArchestrA\Framework";
public const string PlatformKey = @"SOFTWARE\WOW6432Node\ArchestrA\Framework\Platform";
public const string MsiInstallKey = @"SOFTWARE\WOW6432Node\ArchestrA\MSIInstall";
public static PrerequisiteCheck CheckFrameworkInstalled()
{
if (!RuntimeInformation.IsOSPlatform(OSPlatform.Windows))
{
return new PrerequisiteCheck("registry:ArchestrA.Framework", PrerequisiteCategory.AvevaInstall,
PrerequisiteStatus.Skip, "Registry probes only run on Windows.");
}
return FrameworkInstalledWindows();
}
public static PrerequisiteCheck CheckPlatformDeployed()
{
if (!RuntimeInformation.IsOSPlatform(OSPlatform.Windows))
{
return new PrerequisiteCheck("registry:ArchestrA.Platform.Deployed", PrerequisiteCategory.AvevaInstall,
PrerequisiteStatus.Skip, "Registry probes only run on Windows.");
}
return PlatformDeployedWindows();
}
public static PrerequisiteCheck CheckRebootPending()
{
if (!RuntimeInformation.IsOSPlatform(OSPlatform.Windows))
{
return new PrerequisiteCheck("registry:ArchestrA.RebootPending", PrerequisiteCategory.AvevaInstall,
PrerequisiteStatus.Skip, "Registry probes only run on Windows.");
}
return RebootPendingWindows();
}
[SupportedOSPlatform("windows")]
private static PrerequisiteCheck FrameworkInstalledWindows()
{
try
{
using var key = Registry.LocalMachine.OpenSubKey(FrameworkKey);
if (key is null)
{
return new PrerequisiteCheck("registry:ArchestrA.Framework", PrerequisiteCategory.AvevaInstall,
PrerequisiteStatus.Fail,
$"Missing {FrameworkKey} — ArchestrA Framework isn't installed. Install AVEVA System Platform from the setup media.");
}
var installPath = key.GetValue("InstallPath") as string;
var rootPath = key.GetValue("RootPath") as string;
if (string.IsNullOrWhiteSpace(installPath) || string.IsNullOrWhiteSpace(rootPath))
{
return new PrerequisiteCheck("registry:ArchestrA.Framework", PrerequisiteCategory.AvevaInstall,
PrerequisiteStatus.Warn,
"Framework key exists but InstallPath or RootPath is missing; the install may be incomplete.");
}
return new PrerequisiteCheck("registry:ArchestrA.Framework", PrerequisiteCategory.AvevaInstall,
PrerequisiteStatus.Pass,
$"Installed at {installPath} (RootPath {rootPath}).");
}
catch (Exception ex)
{
return new PrerequisiteCheck("registry:ArchestrA.Framework", PrerequisiteCategory.AvevaInstall,
PrerequisiteStatus.Warn,
$"Probe failed: {ex.GetType().Name}: {ex.Message}");
}
}
[SupportedOSPlatform("windows")]
private static PrerequisiteCheck PlatformDeployedWindows()
{
try
{
using var key = Registry.LocalMachine.OpenSubKey(PlatformKey);
var pfeConfig = key?.GetValue("PfeConfigOptions") as string;
if (string.IsNullOrWhiteSpace(pfeConfig))
{
return new PrerequisiteCheck("registry:ArchestrA.Platform.Deployed", PrerequisiteCategory.AvevaInstall,
PrerequisiteStatus.Warn,
"No Platform object deployed locally (Platform\\PfeConfigOptions empty). MXAccess will connect but subscriptions will fail. Deploy a Platform from the IDE.");
}
// PfeConfigOptions format: "PlatformId=N,EngineId=N,EngineName=...,..."
// A non-deployed state leaves PlatformId=0 or the value empty.
// IndexOf is used rather than Contains(string, StringComparison) because that
// overload does not exist on the net48 target this project also builds for.
if (pfeConfig.IndexOf("PlatformId=0,", StringComparison.OrdinalIgnoreCase) >= 0)
{
return new PrerequisiteCheck("registry:ArchestrA.Platform.Deployed", PrerequisiteCategory.AvevaInstall,
PrerequisiteStatus.Warn,
"Platform never deployed (PfeConfigOptions has PlatformId=0). Deploy a Platform from the IDE before running live tests.");
}
return new PrerequisiteCheck("registry:ArchestrA.Platform.Deployed", PrerequisiteCategory.AvevaInstall,
PrerequisiteStatus.Pass,
$"Platform deployed ({pfeConfig}).");
}
catch (Exception ex)
{
return new PrerequisiteCheck("registry:ArchestrA.Platform.Deployed", PrerequisiteCategory.AvevaInstall,
PrerequisiteStatus.Warn,
$"Probe failed: {ex.GetType().Name}: {ex.Message}");
}
}
[SupportedOSPlatform("windows")]
private static PrerequisiteCheck RebootPendingWindows()
{
try
{
using var key = Registry.LocalMachine.OpenSubKey(MsiInstallKey);
var rebootRequired = key?.GetValue("RebootRequired") as string;
if (string.Equals(rebootRequired, "True", StringComparison.OrdinalIgnoreCase))
{
return new PrerequisiteCheck("registry:ArchestrA.RebootPending", PrerequisiteCategory.AvevaInstall,
PrerequisiteStatus.Warn,
"An ArchestrA patch has been installed but the machine hasn't rebooted. Post-patch behavior is undefined until a reboot.");
}
return new PrerequisiteCheck("registry:ArchestrA.RebootPending", PrerequisiteCategory.AvevaInstall,
PrerequisiteStatus.Pass,
"No pending reboot flagged.");
}
catch (Exception ex)
{
return new PrerequisiteCheck("registry:ArchestrA.RebootPending", PrerequisiteCategory.AvevaInstall,
PrerequisiteStatus.Warn,
$"Probe failed: {ex.GetType().Name}: {ex.Message}");
}
}
/// <summary>
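/// Reads the CLSID registered for the given ProgID (used by <see cref="ComProgIdCheck"/>) and
/// resolves the 32-bit <c>InprocServer32</c> file path. Both tuple elements are null when the
/// CLSID is missing; only <c>InprocDllPath</c> is null when the CLSID exists but has no 32-bit server.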
/// </summary>
[SupportedOSPlatform("windows")]
internal static (string? Clsid, string? InprocDllPath) ResolveProgIdToInproc(string progId)
{
using var progIdKey = Registry.ClassesRoot.OpenSubKey($@"{progId}\CLSID");
var clsid = progIdKey?.GetValue(null) as string;
if (string.IsNullOrWhiteSpace(clsid)) return (null, null);
// 32-bit COM server under Wow6432Node\CLSID\{guid}\InprocServer32 default value.
using var inproc = Registry.LocalMachine.OpenSubKey(
$@"SOFTWARE\Classes\WOW6432Node\CLSID\{clsid}\InprocServer32");
var dll = inproc?.GetValue(null) as string;
return (clsid, dll);
}
}
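
Reviewer note: the deployed-platform check matches the substring `PlatformId=0,`, which relies on the pair being followed by a comma. Given the `PlatformId=N,EngineId=N,EngineName=...` format stated in the comment, the value could also be parsed into key/value pairs. The following is a sketch only; `PfeConfigParser` and `TryGetPlatformId` are hypothetical names, and the format is assumed to be exactly what the source comment describes.

```csharp
#nullable enable
using System;

// Hypothetical helper for illustration; not part of this PR.
public static class PfeConfigParser
{
    // Returns true and the numeric id when a "PlatformId=<int>" pair is present,
    // regardless of its position in the comma-separated list.
    public static bool TryGetPlatformId(string? pfeConfig, out int platformId)
    {
        platformId = 0;
        if (string.IsNullOrWhiteSpace(pfeConfig)) return false;

        foreach (var pair in pfeConfig.Split(','))
        {
            // Split on the first '=' only, so values containing '=' stay intact.
            var parts = pair.Split(new[] { '=' }, 2);
            if (parts.Length == 2 &&
                parts[0].Trim().Equals("PlatformId", StringComparison.OrdinalIgnoreCase) &&
                int.TryParse(parts[1].Trim(), out platformId))
            {
                return true;
            }
        }

        platformId = 0;
        return false;
    }
}
```

With this shape, a caller would treat `false` or an id of `0` as "not deployed", which also covers a `PlatformId=0` pair at the end of the string.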


@@ -0,0 +1,85 @@
using System.Runtime.InteropServices;
using System.Runtime.Versioning;
using System.ServiceProcess;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.TestSupport.Probes;
/// <summary>
/// Queries the Windows Service Control Manager to report whether a named service is
/// installed, its current state, and its start type. Non-Windows hosts return Skip.
/// </summary>
public static class ServiceProbe
{
public static PrerequisiteCheck Check(
string serviceName,
PrerequisiteCategory category,
bool hardRequired,
string whatItDoes)
{
if (!RuntimeInformation.IsOSPlatform(OSPlatform.Windows))
{
return new PrerequisiteCheck(
Name: $"service:{serviceName}",
Category: category,
Status: PrerequisiteStatus.Skip,
Detail: "Service probes only run on Windows.");
}
return CheckWindows(serviceName, category, hardRequired, whatItDoes);
}
[SupportedOSPlatform("windows")]
private static PrerequisiteCheck CheckWindows(
string serviceName, PrerequisiteCategory category, bool hardRequired, string whatItDoes)
{
try
{
using var sc = new ServiceController(serviceName);
// Touch the Status to force the SCM lookup; if the service doesn't exist, this throws
// InvalidOperationException with message "Service ... was not found on computer.".
var status = sc.Status;
var startType = sc.StartType;
return status switch
{
ServiceControllerStatus.Running => new PrerequisiteCheck(
$"service:{serviceName}", category, PrerequisiteStatus.Pass,
$"Running ({whatItDoes})"),
// Manual-start (demand-start) services such as NmxSvc that are Stopped are not
// necessarily a failure: the master service (aaBootstrap) brings them up on demand.
// Treat Stopped + Manual as Warn so operators see the situation but tests still proceed.
ServiceControllerStatus.Stopped when startType == ServiceStartMode.Manual =>
new PrerequisiteCheck(
$"service:{serviceName}", category, PrerequisiteStatus.Warn,
$"Installed but Stopped (start type Manual — {whatItDoes}). " +
"Will be pulled up on demand by the master service; fine for tests."),
ServiceControllerStatus.Stopped => Fail(
$"Installed but Stopped. Start with: sc.exe start {serviceName} ({whatItDoes})"),
_ => new PrerequisiteCheck(
$"service:{serviceName}", category, PrerequisiteStatus.Warn,
$"Transitional state {status} ({whatItDoes}) — try again in a few seconds."),
};
PrerequisiteCheck Fail(string detail) => new(
$"service:{serviceName}", category,
hardRequired ? PrerequisiteStatus.Fail : PrerequisiteStatus.Warn,
detail);
}
catch (InvalidOperationException ex) when (ex.Message.IndexOf("was not found", StringComparison.OrdinalIgnoreCase) >= 0)
{
return new PrerequisiteCheck(
$"service:{serviceName}", category,
hardRequired ? PrerequisiteStatus.Fail : PrerequisiteStatus.Warn,
$"Not installed ({whatItDoes}). Install the relevant System Platform component and retry.");
}
catch (Exception ex)
{
return new PrerequisiteCheck(
$"service:{serviceName}", category, PrerequisiteStatus.Warn,
$"Probe failed ({ex.GetType().Name}: {ex.Message}) — treat as unknown.");
}
}
}
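
Reviewer note: the "not installed" branch keys off the English exception message. A locale-independent alternative is to inspect the inner `Win32Exception`, assuming (as current `ServiceController` implementations do, though this PR does not confirm it) that the SCM failure is wrapped as the inner exception. `ServiceErrors` and `IsServiceMissing` are hypothetical names; 1060 is the Win32 `ERROR_SERVICE_DOES_NOT_EXIST` code.

```csharp
using System;
using System.ComponentModel;

// Hypothetical helper for illustration; not part of this PR.
public static class ServiceErrors
{
    // Win32 ERROR_SERVICE_DOES_NOT_EXIST.
    public const int ErrorServiceDoesNotExist = 1060;

    // True when the exception looks like ServiceController's "service not installed" failure:
    // an InvalidOperationException whose inner Win32Exception carries error 1060.
    public static bool IsServiceMissing(Exception ex) =>
        ex is InvalidOperationException &&
        ex.InnerException is Win32Exception w32 &&
        w32.NativeErrorCode == ErrorServiceDoesNotExist;
}
```

It could be used as an exception filter, for example `catch (InvalidOperationException ex) when (ServiceErrors.IsServiceMissing(ex))`.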


@@ -0,0 +1,88 @@
using Microsoft.Data.SqlClient;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.TestSupport.Probes;
/// <summary>
/// Verifies the Galaxy Repository SQL side: SQL Server reachable, <c>ZB</c> database
/// present, and at least one deployed object exists (so live tests have something to read).
/// Reuses the Windows-auth connection string the repo code defaults to.
/// </summary>
public static class SqlProbe
{
public const string DefaultConnectionString =
"Server=localhost;Database=ZB;Integrated Security=True;TrustServerCertificate=True;Encrypt=False;Connect Timeout=3;";
public static async Task<PrerequisiteCheck> CheckZbDatabaseAsync(
string? connectionString = null, CancellationToken ct = default)
{
connectionString ??= DefaultConnectionString;
try
{
using var conn = new SqlConnection(connectionString);
await conn.OpenAsync(ct);
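// DB_ID returns NULL when the database doesn't exist on the connected server, a distinct
// failure mode from "server unreachable" that deserves a distinct message. This branch is
// only reachable when the connection string targets another initial catalog; with
// Database=ZB a missing database already fails inside OpenAsync.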
using var cmd = conn.CreateCommand();
cmd.CommandText = "SELECT DB_ID('ZB')";
var dbIdObj = await cmd.ExecuteScalarAsync(ct);
if (dbIdObj is null || dbIdObj is DBNull)
{
return new PrerequisiteCheck("sql:ZB", PrerequisiteCategory.GalaxyRepository,
PrerequisiteStatus.Fail,
"SQL Server reachable but database ZB does not exist. " +
"Create the Galaxy from the IDE or restore a .cab backup.");
}
return new PrerequisiteCheck("sql:ZB", PrerequisiteCategory.GalaxyRepository,
PrerequisiteStatus.Pass, "Connected; ZB database exists.");
}
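catch (SqlException ex) when (ex.Number == 4060)
{
// Error 4060: the server answered but could not open the database named in the connection string.
return new PrerequisiteCheck("sql:ZB", PrerequisiteCategory.GalaxyRepository,
PrerequisiteStatus.Fail,
$"SQL Server reachable but the requested database could not be opened: {ex.Message} " +
"Create the Galaxy from the IDE or restore a .cab backup, and check that this login has access.");
}
catch (SqlException ex)
{
return new PrerequisiteCheck("sql:ZB", PrerequisiteCategory.GalaxyRepository,
PrerequisiteStatus.Fail,
$"SQL Server unreachable: {ex.Message}. Ensure MSSQLSERVER service is running (sc.exe start MSSQLSERVER) and TCP 1433 is open.");
}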
catch (Exception ex)
{
return new PrerequisiteCheck("sql:ZB", PrerequisiteCategory.GalaxyRepository,
PrerequisiteStatus.Fail,
$"Unexpected probe error: {ex.GetType().Name}: {ex.Message}");
}
}
/// <summary>
/// Returns the count of deployed Galaxy objects (<c>deployed_version &gt; 0</c>). Zero
/// isn't a hard failure — lets someone boot a fresh Galaxy and still get meaningful
/// test-suite output — but it IS a warning because any live-read smoke will have
/// nothing to read.
/// </summary>
public static async Task<PrerequisiteCheck> CheckDeployedObjectCountAsync(
string? connectionString = null, CancellationToken ct = default)
{
connectionString ??= DefaultConnectionString;
try
{
using var conn = new SqlConnection(connectionString);
await conn.OpenAsync(ct);
using var cmd = conn.CreateCommand();
cmd.CommandText = "SELECT COUNT(*) FROM gobject WHERE deployed_version > 0";
var countObj = await cmd.ExecuteScalarAsync(ct);
var count = countObj is int i ? i : 0;
return count > 0
? new PrerequisiteCheck("sql:ZB.deployedObjects", PrerequisiteCategory.GalaxyRepository,
PrerequisiteStatus.Pass, $"{count} objects deployed — live reads have data to return.")
: new PrerequisiteCheck("sql:ZB.deployedObjects", PrerequisiteCategory.GalaxyRepository,
PrerequisiteStatus.Warn,
"ZB contains no deployed objects. Discovery smoke tests will return empty hierarchies; " +
"deploy at least a Platform + AppEngine from the IDE to exercise the read path.");
}
catch (Exception ex)
{
return new PrerequisiteCheck("sql:ZB.deployedObjects", PrerequisiteCategory.GalaxyRepository,
PrerequisiteStatus.Warn,
$"Couldn't count deployed objects: {ex.GetType().Name}: {ex.Message}");
}
}
}


@@ -0,0 +1,38 @@
<Project Sdk="Microsoft.NET.Sdk">
<PropertyGroup>
<!-- Multi-target: net10.0 for modern consumer projects (Galaxy.Proxy.Tests, E2E, Admin.Tests),
net48 for the Galaxy.Host.Tests project that has to stay on .NET Framework x86 for its
MXAccess-COM parent project. The helper uses no OS-level APIs that differ between the
two frameworks (registry / SQL / ServiceController are surface-compatible). -->
<TargetFrameworks>net10.0;net48</TargetFrameworks>
<Nullable>enable</Nullable>
<ImplicitUsings>enable</ImplicitUsings>
<LangVersion>latest</LangVersion>
<IsPackable>false</IsPackable>
<RootNamespace>ZB.MOM.WW.OtOpcUa.Driver.Galaxy.TestSupport</RootNamespace>
</PropertyGroup>
<ItemGroup Condition="'$(TargetFramework)' == 'net10.0'">
<!-- System.ServiceProcess.ServiceController + Microsoft.Win32.Registry are cross-platform
assemblies that throw PlatformNotSupportedException on non-Windows; the probes in
this project guard with RuntimeInformation.IsOSPlatform(OSPlatform.Windows) so they
return Skip on Linux/macOS rather than crashing the test host. -->
<PackageReference Include="System.ServiceProcess.ServiceController" Version="10.0.0"/>
<PackageReference Include="Microsoft.Win32.Registry" Version="5.0.0"/>
<PackageReference Include="Microsoft.Data.SqlClient" Version="6.0.1"/>
</ItemGroup>
<ItemGroup Condition="'$(TargetFramework)' == 'net48'">
<!-- net48 ships System.ServiceProcess + Microsoft.Win32 in-box via BCL references. -->
<Reference Include="System.ServiceProcess"/>
<!-- Microsoft.Data.SqlClient v6 supports net462+; both targets reference the same version for consistency. -->
<PackageReference Include="Microsoft.Data.SqlClient" Version="6.0.1"/>
</ItemGroup>
<ItemGroup>
<NuGetAuditSuppress Include="https://github.com/advisories/GHSA-37gx-xxp4-5rgx"/>
<NuGetAuditSuppress Include="https://github.com/advisories/GHSA-w3x6-4m5h-cxqf"/>
</ItemGroup>
</Project>


@@ -0,0 +1,197 @@
using Microsoft.EntityFrameworkCore;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Logging.Abstractions;
using Shouldly;
using Xunit;
using ZB.MOM.WW.OtOpcUa.Configuration;
using ZB.MOM.WW.OtOpcUa.Configuration.Enums;
using ZB.MOM.WW.OtOpcUa.Core.Abstractions;
using ZB.MOM.WW.OtOpcUa.Core.Hosting;
using ZB.MOM.WW.OtOpcUa.Server;
namespace ZB.MOM.WW.OtOpcUa.Server.Tests;
[Trait("Category", "Integration")]
public sealed class HostStatusPublisherTests : IDisposable
{
private const string DefaultServer = "localhost,14330";
private const string DefaultSaPassword = "OtOpcUaDev_2026!";
private readonly string _databaseName = $"OtOpcUaPublisher_{Guid.NewGuid():N}";
private readonly string _connectionString;
private readonly ServiceProvider _sp;
public HostStatusPublisherTests()
{
var server = Environment.GetEnvironmentVariable("OTOPCUA_CONFIG_TEST_SERVER") ?? DefaultServer;
var password = Environment.GetEnvironmentVariable("OTOPCUA_CONFIG_TEST_SA_PASSWORD") ?? DefaultSaPassword;
_connectionString =
$"Server={server};Database={_databaseName};User Id=sa;Password={password};TrustServerCertificate=True;Encrypt=False;";
var services = new ServiceCollection();
services.AddLogging();
services.AddDbContext<OtOpcUaConfigDbContext>(o => o.UseSqlServer(_connectionString));
_sp = services.BuildServiceProvider();
using var scope = _sp.CreateScope();
scope.ServiceProvider.GetRequiredService<OtOpcUaConfigDbContext>().Database.Migrate();
}
public void Dispose()
{
_sp.Dispose();
using var conn = new Microsoft.Data.SqlClient.SqlConnection(
new Microsoft.Data.SqlClient.SqlConnectionStringBuilder(_connectionString) { InitialCatalog = "master" }.ConnectionString);
conn.Open();
using var cmd = conn.CreateCommand();
cmd.CommandText = $@"
IF DB_ID(N'{_databaseName}') IS NOT NULL
BEGIN
ALTER DATABASE [{_databaseName}] SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
DROP DATABASE [{_databaseName}];
END";
cmd.ExecuteNonQuery();
}
[Fact]
public async Task Publisher_upserts_one_row_per_host_reported_by_each_probe_driver()
{
var driverHost = new DriverHost();
await driverHost.RegisterAsync(new ProbeStubDriver("driver-a",
new HostConnectivityStatus("HostA1", HostState.Running, DateTime.UtcNow),
new HostConnectivityStatus("HostA2", HostState.Stopped, DateTime.UtcNow)),
"{}", CancellationToken.None);
await driverHost.RegisterAsync(new NonProbeStubDriver("driver-no-probe"), "{}", CancellationToken.None);
var nodeOptions = NewNodeOptions("node-a");
var publisher = new HostStatusPublisher(driverHost, nodeOptions, _sp.GetRequiredService<IServiceScopeFactory>(),
NullLogger<HostStatusPublisher>.Instance);
await publisher.PublishOnceAsync(CancellationToken.None);
using var scope = _sp.CreateScope();
var db = scope.ServiceProvider.GetRequiredService<OtOpcUaConfigDbContext>();
var rows = await db.DriverHostStatuses.AsNoTracking().ToListAsync();
rows.Count.ShouldBe(2, "driver-no-probe doesn't implement IHostConnectivityProbe — no rows for it");
rows.ShouldContain(r => r.HostName == "HostA1" && r.State == DriverHostState.Running && r.DriverInstanceId == "driver-a");
rows.ShouldContain(r => r.HostName == "HostA2" && r.State == DriverHostState.Stopped && r.DriverInstanceId == "driver-a");
rows.ShouldAllBe(r => r.NodeId == "node-a");
}
[Fact]
public async Task Second_tick_updates_LastSeenUtc_without_creating_duplicate_rows()
{
var driver = new ProbeStubDriver("driver-x",
new HostConnectivityStatus("HostX", HostState.Running, DateTime.UtcNow));
var driverHost = new DriverHost();
await driverHost.RegisterAsync(driver, "{}", CancellationToken.None);
var publisher = new HostStatusPublisher(driverHost, NewNodeOptions("node-x"),
_sp.GetRequiredService<IServiceScopeFactory>(),
NullLogger<HostStatusPublisher>.Instance);
await publisher.PublishOnceAsync(CancellationToken.None);
var firstSeen = await SingleRowAsync("node-x", "driver-x", "HostX");
await Task.Delay(50); // guarantee a later wall-clock value so LastSeenUtc advances
await publisher.PublishOnceAsync(CancellationToken.None);
var secondSeen = await SingleRowAsync("node-x", "driver-x", "HostX");
secondSeen.LastSeenUtc.ShouldBeGreaterThan(firstSeen.LastSeenUtc,
"heartbeat advances LastSeenUtc so Admin can stale-flag rows from crashed Servers");
// Still exactly one row — a naive Add-every-tick would have thrown or duplicated.
using var scope = _sp.CreateScope();
var db = scope.ServiceProvider.GetRequiredService<OtOpcUaConfigDbContext>();
(await db.DriverHostStatuses.CountAsync(r => r.NodeId == "node-x")).ShouldBe(1);
}
[Fact]
public async Task State_change_between_ticks_updates_State_and_StateChangedUtc()
{
var driver = new ProbeStubDriver("driver-y",
new HostConnectivityStatus("HostY", HostState.Running, DateTime.UtcNow.AddSeconds(-10)));
var driverHost = new DriverHost();
await driverHost.RegisterAsync(driver, "{}", CancellationToken.None);
var publisher = new HostStatusPublisher(driverHost, NewNodeOptions("node-y"),
_sp.GetRequiredService<IServiceScopeFactory>(),
NullLogger<HostStatusPublisher>.Instance);
await publisher.PublishOnceAsync(CancellationToken.None);
var before = await SingleRowAsync("node-y", "driver-y", "HostY");
// Swap the driver's reported state to Faulted with a newer transition timestamp.
var newChange = DateTime.UtcNow;
driver.Statuses = [new HostConnectivityStatus("HostY", HostState.Faulted, newChange)];
await publisher.PublishOnceAsync(CancellationToken.None);
var after = await SingleRowAsync("node-y", "driver-y", "HostY");
after.State.ShouldBe(DriverHostState.Faulted);
// datetime2(3) has millisecond precision while DateTime.UtcNow carries 100 ns ticks,
// so the stored value loses its sub-millisecond part. Compare with a 1 ms tolerance.
after.StateChangedUtc.ShouldBe(newChange, tolerance: TimeSpan.FromMilliseconds(1));
after.StateChangedUtc.ShouldBeGreaterThan(before.StateChangedUtc,
"StateChangedUtc must advance when the state actually changed");
before.State.ShouldBe(DriverHostState.Running);
}
[Fact]
public void MapState_translates_every_HostState_member()
{
HostStatusPublisher.MapState(HostState.Running).ShouldBe(DriverHostState.Running);
HostStatusPublisher.MapState(HostState.Stopped).ShouldBe(DriverHostState.Stopped);
HostStatusPublisher.MapState(HostState.Faulted).ShouldBe(DriverHostState.Faulted);
HostStatusPublisher.MapState(HostState.Unknown).ShouldBe(DriverHostState.Unknown);
}
private async Task<Configuration.Entities.DriverHostStatus> SingleRowAsync(string node, string driver, string host)
{
using var scope = _sp.CreateScope();
var db = scope.ServiceProvider.GetRequiredService<OtOpcUaConfigDbContext>();
return await db.DriverHostStatuses.AsNoTracking()
.SingleAsync(r => r.NodeId == node && r.DriverInstanceId == driver && r.HostName == host);
}
private static NodeOptions NewNodeOptions(string nodeId) => new()
{
NodeId = nodeId,
ClusterId = "cluster-t",
ConfigDbConnectionString = "unused-publisher-gets-db-from-scope",
};
private sealed class ProbeStubDriver(string id, params HostConnectivityStatus[] initial)
: IDriver, IHostConnectivityProbe
{
public HostConnectivityStatus[] Statuses { get; set; } = initial;
public string DriverInstanceId => id;
public string DriverType => "ProbeStub";
public event EventHandler<HostStatusChangedEventArgs>? OnHostStatusChanged;
public Task InitializeAsync(string driverConfigJson, CancellationToken ct) => Task.CompletedTask;
public Task ReinitializeAsync(string driverConfigJson, CancellationToken ct) => Task.CompletedTask;
public Task ShutdownAsync(CancellationToken ct) => Task.CompletedTask;
public DriverHealth GetHealth() => new(DriverState.Healthy, DateTime.UtcNow, null);
public long GetMemoryFootprint() => 0;
public Task FlushOptionalCachesAsync(CancellationToken ct) => Task.CompletedTask;
public IReadOnlyList<HostConnectivityStatus> GetHostStatuses() => Statuses;
// Keeps the compiler happy — event is part of the interface contract even if unused here.
internal void Raise(HostStatusChangedEventArgs e) => OnHostStatusChanged?.Invoke(this, e);
}
private sealed class NonProbeStubDriver(string id) : IDriver
{
public string DriverInstanceId => id;
public string DriverType => "NonProbeStub";
public Task InitializeAsync(string driverConfigJson, CancellationToken ct) => Task.CompletedTask;
public Task ReinitializeAsync(string driverConfigJson, CancellationToken ct) => Task.CompletedTask;
public Task ShutdownAsync(CancellationToken ct) => Task.CompletedTask;
public DriverHealth GetHealth() => new(DriverState.Healthy, DateTime.UtcNow, null);
public long GetMemoryFootprint() => 0;
public Task FlushOptionalCachesAsync(CancellationToken ct) => Task.CompletedTask;
}
}
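
Reviewer note: the `StateChangedUtc` assertion uses a 1 ms tolerance because the column keeps only millisecond precision. The sketch below shows why that tolerance is sufficient when the stored value drops sub-millisecond ticks. `TimestampCompare` and both of its methods are hypothetical names, and truncation is used here only to simulate the precision loss; whether the database rounds or truncates is not established by this PR.

```csharp
using System;

// Hypothetical helper for illustration; not part of this PR.
public static class TimestampCompare
{
    // Simulates a millisecond-precision store by dropping the sub-millisecond ticks.
    public static DateTime TruncateToMilliseconds(DateTime value) =>
        new DateTime(value.Ticks - (value.Ticks % TimeSpan.TicksPerMillisecond), value.Kind);

    // True when the two timestamps differ by no more than the given tolerance.
    public static bool EqualWithin(DateTime expected, DateTime actual, TimeSpan tolerance) =>
        (expected - actual).Duration() <= tolerance;
}
```

Truncation loses less than 1 ms and rounding to the nearest millisecond loses at most 0.5 ms, so a 1 ms tolerance covers either behavior, whereas an exact comparison fails for any timestamp with sub-millisecond ticks.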