Compare commits

...

25 Commits

Author SHA1 Message Date
Joseph Doherty
04d267d1ea Phase 2 PR 13 — port GalaxyRuntimeProbeManager state machine + wire per-platform ScanState probing. PR 8 gave operators the gateway-level transport signal (MxAccessClient.ConnectionStateChanged → OnHostStatusChanged tagged with the Wonderware client identity) — enough to detect when the entire MXAccess COM proxy dies, but silent when a specific or goes down while the gateway stays alive. This PR closes that gap: a pure-logic GalaxyRuntimeProbeManager (ported from v1's 472-LOC GalaxyRuntimeProbeManager.cs, distilled to ~240 LOC of state machine without the OPC UA node-manager entanglement) lives in Backend/Stability/, advises <TagName>.ScanState per WinPlatform + AppEngine gobject discovered during DiscoverAsync, runs the Unknown → Running → Stopped state machine with v1's documented semantics preserved verbatim, and raises a StateChanged event on transitions operators should react to. MxAccessGalaxyBackend instantiates the probe manager in the constructor with SubscribeAsync/UnsubscribeAsync delegate pointers into MxAccessClient, hooks StateChanged to forward each transition through the same OnHostStatusChanged IPC event the gateway signal uses (HostName = platform/engine TagName, RuntimeStatus = 'Running'|'Stopped'|'Unknown', LastObservedUtcUnixMs from the state-change timestamp), so Admin UI gets per-host signals flowing through the existing PR 8 wire with no additional IPC plumbing. State machine rules ported from v1 runtimestatus.md: (a) ScanState is on-change-only — a stably-Running host may go hours without a callback, so Running → Stopped is driven only by explicit ScanState=false, never by starvation; (b) Unknown → Running is a startup transition and does NOT fire StateChanged (would paint every host as 'just recovered' at startup, which is noise and can clear Bad quality set by a concurrently-stopping sibling); (c) Stopped → Running fires StateChanged for the real recovery case; (d) Running → Stopped fires StateChanged; (e) Unknown → Stopped fires StateChanged because that's the first-known-bad signal operators need when a host is down at our startup time. MxAccessGalaxyBackend.DiscoverAsync calls _probeManager.SyncAsync with the runtime-host subset of the hierarchy (CategoryId == 1 WinPlatform or 3 AppEngine) as a best-effort step after building the Discover response — probe failures are swallowed so Discover still returns the hierarchy even if a per-host advise fails; the gateway signal covers the critical rung. SyncAsync is idempotent (second call with the same set is a no-op) and handles the diff on re-Discover for tag rename / host add / host remove. Subscribe failure rolls back the host's state entry under the lock so a later probe callback for a never-advised tag can't transition a phantom state from Unknown to Stopped and fan out a false host-down signal (the same protection v1's GalaxyRuntimeProbeManager had at line 237-243 of v1 with a captured-probe-string comparison under the lock). MxAccessGalaxyBackend.Dispose unsubscribes the StateChanged handler before disposing the probe manager to prevent dangling invocation-list references across reconnects, same discipline as PR 8's ConnectionStateChanged and PR 6's SubscriptionReplayFailed. Tests (12 new GalaxyRuntimeProbeManagerTests): Sync_subscribes_to_ScanState_per_host verifies tag.ScanState subscriptions are advised per Platform/Engine; Sync_is_idempotent_on_repeat_call_with_same_set verifies no duplicate subscribes; Sync_unadvises_removed_hosts verifies the diff unadvises gone hosts; Subscribe_failure_rolls_back_host_entry_so_later_transitions_do_not_fire_stale_events covers the rollback-on-subscribe-fail guard; Unknown_to_Running_does_not_fire_StateChanged preserves the startup-noise rule; Running_to_Stopped_fires_StateChanged_with_both_states asserts OldState and NewState are both captured in the transition record; Stopped_to_Running_fires_StateChanged_for_recovery verifies the recovery case; Unknown_to_Stopped_fires_StateChanged_for_first_known_bad_signal preserves the first-bad rule; Repeated_Good_Running_callbacks_do_not_fire_duplicate_events verifies the state-tracking de-dup; Unknown_callback_for_non_tracked_probe_is_dropped asserts a foreign callback is silently ignored; Snapshot_reports_current_state_for_every_tracked_host covers the dashboard query hook; IsRuntimeHost_recognizes_WinPlatform_and_AppEngine_category_ids asserts the CategoryId filter. Galaxy.Host.Tests Unit suite 75 pass / 0 fail (12 new probe + 63 pre-existing). Galaxy.Host builds clean (0 errors / 0 warnings). Branches off v2.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 07:27:56 -04:00
4448db8207 Merge pull request 'Phase 2 PR 12 � richer historian quality mapping' (#11) from phase-2-pr12-quality-mapper into v2 2026-04-18 07:22:44 -04:00
d96b513bbc Merge pull request 'Phase 2 PR 11 � HistoryReadEvents IPC (alarm history)' (#10) from phase-2-pr11-history-events into v2 2026-04-18 07:22:33 -04:00
053c4e0566 Merge pull request 'Phase 2 PR 10 � HistoryReadAtTime IPC surface' (#9) from phase-2-pr10-history-attime into v2 2026-04-18 07:22:16 -04:00
Joseph Doherty
f24f969a85 Phase 2 PR 12 — richer historian quality mapping. Replace MxAccessGalaxyBackend's inline MapHistorianQualityToOpcUa category-only helper (192+→Good, 64-191→Uncertain, 0-63→Bad) with a new public HistorianQualityMapper.Map utility that preserves specific OPC DA subcodes — BadNotConnected(8)→0x808A0000u instead of generic Bad(0x80000000u), UncertainSubNormal(88)→0x40950000u instead of generic Uncertain, Good_LocalOverride(216)→0x00D80000u instead of generic Good, etc. Mirrors v1 QualityMapper.MapToOpcUaStatusCode byte-for-byte without pulling in OPC UA types — the function returns uint32 literals that are the canonical OPC UA StatusCode wire encoding, surfaced directly as DataValueSnapshot.StatusCode on the Proxy side with no additional translation. Unknown subcodes fall back to the family category (255→Good, 150→Uncertain, 50→Bad) so a future SDK change that adds a quality code we don't map yet still gets a sensible bucket. GalaxyDataValue wire shape unchanged (StatusCode stays uint) — this is a pure fidelity upgrade on the Host side. Downstream callers (Admin UI status dashboard, OPC UA clients receiving historian samples) can now distinguish e.g. a transport outage (BadNotConnected) from a sensor fault (BadSensorFailure) from a warm-up delay (BadWaitingForInitialData) without a second round-trip or dashboard heuristic. 21 new tests (HistorianQualityMapperTests): theory with 15 rows covering every specific mapping from the v1 QualityMapper table, plus 6 fallback tests verifying unknown-subcode codes in each family (Good/Uncertain/Bad) collapse to the family default. Galaxy.Host.Tests Unit suite 56/0 (21 new + 35 existing). Galaxy.Host builds clean (0/0). Branches off v2.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 07:11:02 -04:00
Joseph Doherty
ca025ebe0c Phase 2 PR 11 — HistoryReadEvents IPC (alarm history). New Shared.Contracts messages HistoryReadEventsRequest/Response + GalaxyHistoricalEvent DTO (MessageKind 0x66/0x67). IGalaxyBackend gains HistoryReadEventsAsync, Stub/DbBacked return canonical pending error, MxAccessGalaxyBackend delegates to _historian.ReadEventsAsync (ported in PR 5) and maps HistorianEventDto → GalaxyHistoricalEvent — Guid.ToString() for EventId wire shape, DateTime → Unix ms for both EventTime (when the event fired in the process) and ReceivedTime (when the Historian persisted it), DisplayText + Severity pass through. SourceName is string? — null means 'all sources' (passed straight through to HistorianDataSource.ReadEventsAsync which adds the AddEventFilter('Source', Equal, ...) only when non-null). Distinct from the live GalaxyAlarmEvent type because historical rows carry both timestamps and lack StateTransition (Historian logs instantaneous events, not the OPC UA Part 9 alarm lifecycle; translating to OPC UA event lifecycle is the alarm-subsystem's job). Guards: null historian → Historian-disabled error; SDK exception → Success=false with message chained. Tests (3 new): disabled-error when historian null, maps HistorianEventDto with full field set (Id/Source/EventTime/ReceivedTime/DisplayText/Severity=900) to GalaxyHistoricalEvent, null SourceName passes through unchanged (verifies the 'all sources' contract). Galaxy.Host.Tests Unit suite 34 pass / 0 fail. Galaxy.Host builds clean. Branches off phase-2-pr10-history-attime since both extend the MessageKind enum; fast-forwards if PR 10 merges first.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 07:08:16 -04:00
Joseph Doherty
d13f919112 Phase 2 PR 10 — HistoryReadAtTime IPC surface. New Shared.Contracts messages HistoryReadAtTimeRequest/Response (MessageKind 0x64/0x65), IGalaxyBackend gains HistoryReadAtTimeAsync, Stub/DbBacked return canonical pending error, MxAccessGalaxyBackend delegates to _historian.ReadAtTimeAsync (ported in PR 5, exposed now) — request timestamp array is flow-encoded as Unix ms to avoid MessagePack DateTime quirks then re-hydrated to DateTime on the Host side. Per-sample mapping uses the same ToWire(HistorianSample) helper as ReadRawAsync so the category→StatusCode mapping stays consistent (Quality byte 192+ → Good 0u, 64-191 → Uncertain, 0-63 → Bad 0x80000000u). Guards: null historian → "Historian disabled" (symmetric with other history paths); empty timestamp array short-circuits to Success=true, Values=[] without an SDK round-trip; SDK exception → Success=false with the message chained. Proxy-side IHistoryProvider.ReadAtTimeAsync capability doesn't exist in Core.Abstractions yet (OPC UA HistoryReadAtTime service is supported but the current IHistoryProvider only has ReadRawAsync + ReadProcessedAsync) — this PR adds the Host-side surface so a future Core.Abstractions extension can wire it through without needing another IPC change. Tests (4 new): disabled-error when historian null, empty-timestamp short-circuit without SDK call, Unix-ms↔DateTime round-trip with Good samples at two distinct timestamps, missing sample (Quality=0) maps to 0x80000000u Bad category. Galaxy.Host.Tests Unit suite: 31 pass / 0 fail (4 new at-time + 27 pre-existing). Galaxy.Host builds clean. Branches off v2.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 07:03:25 -04:00
d2ebb91cb1 Merge pull request 'Phase 2 PR 9 — thread IsAlarm discovery flag end-to-end' (#8) from phase-2-pr9-alarms into v2 2026-04-18 06:59:25 -04:00
90ce0af375 Merge pull request 'Phase 2 PR 8 — gateway-level host-status push from MxAccessGalaxyBackend' (#7) from phase-2-pr8-alarms-hoststatus into v2 2026-04-18 06:59:04 -04:00
e250356e2a Merge pull request 'Phase 2 PR 7 — wire IHistoryProvider.ReadProcessedAsync end-to-end' (#6) from phase-2-pr7-history-processed into v2 2026-04-18 06:59:02 -04:00
067ad78e06 Merge pull request 'Phase 2 PR 6 — close PR 4 monitor-loop low findings (probe leak + replay signal)' (#5) from phase-2-pr6-monitor-findings into v2 2026-04-18 06:57:57 -04:00
6cfa8d326d Merge pull request 'Phase 2 PR 4 — close 4 open MXAccess findings (push frames + reconnect + write-await + read-cancel)' (#3) from phase-2-pr4-findings into v2 2026-04-18 06:57:21 -04:00
Joseph Doherty
70a5d06b37 Phase 2 PR 9 — thread IsAlarm discovery flag end-to-end. GalaxyRepository.GetAttributesAsync has always emitted is_alarm alongside is_historized (CASE WHEN EXISTS with the primitive_definition join on primitive_name='AlarmExtension' per v1's Extended Attributes SQL lifted byte-for-byte into the PR 5 repository port), and GalaxyAttributeRow.IsAlarm has been populated since the port, but the flag was silently dropped at the MapAttribute helper in both MxAccessGalaxyBackend and DbBackedGalaxyBackend because GalaxyAttributeInfo on the IPC side had no field to carry it — every deployed alarm attribute arrived at the Proxy with no signal that it was alarm-bearing. This PR wires the flag through the three translation boundaries: GalaxyAttributeInfo gains [Key(6)] public bool IsAlarm { get; set; } at the end of the message to preserve wire-compat with pre-PR9 payloads that omit the key (MessagePack treats missing keys as default, so a newer Proxy talking to an older Host simply gets IsAlarm=false for every attribute); both backend MapAttribute helpers copy row.IsAlarm into the IPC shape; DriverAttributeInfo in Core.Abstractions gains a new IsAlarm parameter with default value false so the positional record signature change doesn't force every non-Galaxy driver call site to flow a flag they don't produce (the existing generic node-manager and future Modbus/etc. drivers keep compiling without modification); GalaxyProxyDriver.DiscoverAsync passes attr.IsAlarm through to the DriverAttributeInfo positional constructor. This is the discovery-side foundation — the generic node-manager can now enrich alarm-bearing variables with OPC UA AlarmConditionState during address-space build (the existing v1 LmxNodeManager pattern that subscribes to <tag>.InAlarm + .Priority + .DescAttrName + .Acked and merges them into a ConditionState) but this PR deliberately stops at discovery: the full alarm subsystem (subscription management for the 4 alarm-status attributes, state-machine tracking for Active/Unacknowledged/Confirmed/Inactive transitions, OPC UA Part 9 alarm event emission, and the write-to-AckMsg ack path) is a follow-up PR 10+ because it touches the node-manager's address-space build path — orthogonal to the IPC flow this PR covers. Tests — AlarmDiscoveryTests (new, 3 cases): GalaxyAttributeInfo_IsAlarm_round_trips_true_through_MessagePack serializes an IsAlarm=true instance and asserts the decoded flag is true + IsHistorized is true + AttributeName survives unchanged; GalaxyAttributeInfo_IsAlarm_round_trips_false_through_MessagePack covers the default path; Pre_PR9_payload_without_IsAlarm_key_deserializes_with_default_false is the wire-compat regression guard — serializes a stand-in PrePR9Shape class with only keys 0..5 (identical layout to the pre-PR9 GalaxyAttributeInfo) and asserts the newer GalaxyAttributeInfo deserializer produces IsAlarm=false without throwing, so a rolling upgrade where the Proxy ships first can talk to an old Host during the window before the Host upgrades without a MessagePack "missing key" exception. Full solution build: 0 errors, 38 warnings (existing). Galaxy.Host.Tests Unit suite: 27 pass / 0 fail (3 new alarm-discovery + 9 PR5 historian + 15 pre-existing). This PR branches off phase-2-pr5-historian because GalaxyProxyDriver's constructor signature + GalaxyHierarchyRow's IsAlarm init-only property are both ancestor state that the simpler branch bases (phase-2-pr4-findings, master) don't yet include.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 06:28:01 -04:00
Joseph Doherty
30ece6e22c Phase 2 PR 8 — wire gateway-level host-status push from MxAccessGalaxyBackend. PR 4 built the IPC infrastructure for OnHostStatusChanged (MessageKind.RuntimeStatusChange frame + ConnectionSink forwarding through FrameWriter) but no backend actually raised the event; the #pragma warning disable CS0067 around MxAccessGalaxyBackend.OnHostStatusChanged declared the event for interface symmetry while acknowledging the wire-up was Phase 2 follow-up. This PR closes the gateway-level signal: MxAccessClient.ConnectionStateChanged (already raised on false→true Register and true→false Unregister transitions, including the reconnect path in MonitorLoopAsync) now drives OnHostStatusChanged with a synthetic HostConnectivityStatus tagged HostName=MxAccessClient.ClientName, RuntimeStatus="Running" on reconnect + "Stopped" on disconnect, LastObservedUtcUnixMs set to the transition moment. The Admin UI's existing IHostConnectivityProbe subscriber on GalaxyProxyDriver (HostStatusChangedEventArgs) already handles the full translation — OnHostConnectivityUpdate parses "Running"/"Stopped"/"Faulted" into the Core.Abstractions HostState enum and fires OnHostStatusChanged downstream, so this single backend-side event wire-up produces an end-to-end signal with no further Proxy changes required. Per-platform and per-AppEngine ScanState probing (the 472 LOC GalaxyRuntimeProbeManager state machine in v1 that advises <Host>.ScanState on every deployed $WinPlatform + $AppEngine gobject, tracks Unknown → Running → Stopped transitions, handles the on-change-only delivery quirk of ScanState, and surfaces IsHostStopped(gobjectId) for the node manager's Read path to short-circuit on-demand reads against known-stopped runtimes) remains deferred to a follow-up PR — the gateway-level signal gives operators the top-level transport-health rung of the status ladder, which is what matters when the Galaxy COM proxy itself goes down (vs a specific platform going down). MxAccessClient.ClientName property exposes the previously-private _clientName field so the backend can tag its pushes with a stable gateway identity — operators configure this via OTOPCUA_GALAXY_CLIENT_NAME env var (default "OtOpcUa-Galaxy.Host" per Program.cs). MxAccessGalaxyBackend constructor subscribes the new _onConnectionStateChanged field before returning + Dispose unsubscribes it via _mx.ConnectionStateChanged -= _onConnectionStateChanged to prevent the backend's own dispose from leaving a dangling handler on the MxAccessClient (same shape as MxAccessClient.SubscriptionReplayFailed PR 6 dispose discipline). #pragma warning disable CS0067 removed from around OnHostStatusChanged since the event is now raised; the directive is narrowed to cover only OnAlarmEvent which stays unraised pending the alarm subsystem port (PR 9 candidate). Tests — HostStatusPushTests (new, 2 cases): ConnectionStateChanged_raises_OnHostStatusChanged_with_gateway_name fires mx.ConnectAsync → mx.DisconnectAsync and asserts two notifications in order with HostName="GatewayClient" (the clientName passed to MxAccessClient ctor), RuntimeStatus="Running" then "Stopped", LastObservedUtcUnixMs > 0; Dispose_unsubscribes_so_post_dispose_state_changes_do_not_fire_events asserts that after backend.Dispose() a subsequent mx.DisconnectAsync does not bump the count on a registered OnHostStatusChanged handler — guards against the subscription-leak regression where a lingering backend instance would accumulate cross-reconnect notifications for a dead writer. Host.Tests csproj gains a Reference to lib/ArchestrA.MxAccess.dll (identical to the reference PR 6 adds — conflict-free cherry-pick/merge since both PRs stage the same <Reference> node; git will collapse to one when either lands first). Full Galaxy.Host.Tests Unit suite: 26 pass / 0 fail (2 new host-status + 9 PR5 historian + 15 pre-existing PostMortemMmf/RecyclePolicy/StaPump/MemoryWatchdog/EndToEndIpc/Handshake). Galaxy.Host builds clean (0 errors, 0 warnings). Branch base — PR 8 is on phase-2-pr5-historian rather than phase-2-pr4-findings because the constructor path on MxAccessGalaxyBackend gained a new historian parameter in PR 5 and the Dispose implementation needs to coordinate the two unsubscribes; targeting the earlier base would leave a trivial conflict on Dispose.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 06:03:16 -04:00
Joseph Doherty
3717405aa6 Phase 2 PR 7 — wire IHistoryProvider.ReadProcessedAsync end-to-end. PR 5 ported HistorianDataSource.ReadAggregateAsync into Galaxy.Host but left it internal — GalaxyProxyDriver.ReadProcessedAsync still threw NotSupportedException, so OPC UA clients issuing HistoryReadProcessed requests against the v2 topology got rejected at the driver boundary. This PR closes that gap by adding two new Shared.Contracts messages (HistoryReadProcessedRequest/Response, MessageKind 0x62/0x63), routing them through GalaxyFrameHandler, implementing HistoryReadProcessedAsync on all three IGalaxyBackend implementations (Stub/DbBacked return the canonical "pending" Success=false, MxAccessGalaxyBackend delegates to _historian.ReadAggregateAsync), mapping HistorianAggregateSample → GalaxyDataValue at the IPC boundary (null bucket Value → BadNoData 0x800E0000u, otherwise Good 0u), and flipping GalaxyProxyDriver.ReadProcessedAsync from the NotSupported throw to a real IPC call with OPC UA HistoryAggregateType enum mapped to Wonderware AnalogSummary column name on the Proxy side (Average → "Average", Minimum → "Minimum", Maximum → "Maximum", Count → "ValueCount", Total → NotSupported since there's no direct SDK column for sum). Decision #13 IPC data-shape stays intact — HistoryReadProcessedResponse carries GalaxyDataValue[] with the same MessagePack value + OPC UA StatusCode + timestamps shape as the other history responses, so the Proxy's existing ToSnapshot helper handles the conversion without a new code path. MxAccessGalaxyBackend.HistoryReadProcessedAsync guards: null historian → "Historian disabled" (symmetric with HistoryReadAsync); IntervalMs <= 0 → "HistoryReadProcessed requires IntervalMs > 0" (prevents division-by-zero inside the SDK's Resolution parameter); exception during SDK call → Success=false Values=[] with the message so the Proxy surfaces it as InvalidOperationException with a clean error chain. Tests — HistoryReadProcessedTests (new, 4 cases): disabled-error when historian null, rejects zero interval, maps Good sample with Value=12.34 and the Proxy-supplied AggregateColumn + IntervalMs flow unchanged through to the fake IHistorianDataSource, maps null Value bucket to 0x800E0000u BadNoData with null ValueBytes. AggregateColumnMappingTests (new, 5 cases in Proxy.Tests): theory covers all 4 supported HistoryAggregateType enum values → correct column string, and asserts Total throws NotSupportedException with a message that steers callers to Average/Minimum/Maximum/Count (the SDK's AnalogSummaryQueryResult doesn't expose a sum column — the closest is Average × ValueCount which is the responsibility of a caller-side aggregation rather than an extra IPC round-trip). InternalsVisibleTo added to Galaxy.Proxy csproj so Proxy.Tests can reach the internal MapAggregateToColumn static. Builds — Galaxy.Host (net48 x86) + Galaxy.Proxy (net10) both 0 errors, full solution 201 warnings (pre-existing) / 0 errors. Test counts — Host.Tests Unit suite: 28 pass (4 new processed + 9 PR5 historian + 15 pre-existing); Proxy.Tests Unit suite: 14 pass (5 new column-mapping + 9 pre-existing). Deferred to a later PR — ReadAtTime + ReadEvents + Health IPC surfaces (HistorianDataSource has them ported in PR 5 but they need additional contract messages and would push this PR past a comfortable review size); the alarm subsystem wire-up (OnAlarmEvent raising from MxAccessGalaxyBackend) which overlaps the ReadEventsAsync IPC work since both pull from HistorianAccess.CreateEventQuery on the SDK side; the Proxy-side quality-byte refinement where HistorianDataSource's per-sample raw quality byte gets decoded through the existing QualityMapper instead of the category-only mapping in ToWire(HistorianSample) — doesn't change correctness today since Good/Uncertain/Bad categories are all the Admin UI and OPC UA clients surface, but richer OPC DA status codes (BadNotConnected, UncertainSubNormal, etc.) are available on the wire and the Proxy could promote them before handing DataValueSnapshot to ISubscribable consumers. This PR branches off phase-2-pr5-historian because it directly extends the Historian IPC surface added there; if PR 5 merges first PR 7 fast-forwards, otherwise it needs a rebase after PR 5 lands.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 05:53:01 -04:00
Joseph Doherty
1c2bf74d38 Phase 2 PR 6 — close the 2 low findings carried forward from PR 4. Low finding #1 ($Heartbeat probe handle leak in MonitorLoopAsync): the probe calls _proxy.AddItem(connectionHandle, "$Heartbeat") on every monitor tick that observes the connection is past StaleThreshold, but previously discarded the returned item handle — so every probe (one per MonitorInterval, default 5s) leaked one item handle into the MXAccess proxy's internal handle table. Fix: capture the item handle, call RemoveItem(connectionHandle, probeHandle) in the InvokeAsync's finally block so it runs on the same pump turn as the AddItem, best-effort RemoveItem swallow so a dying proxy doesn't throw secondary exceptions out of the probe path. Probe ok becomes probeHandle > 0 so any AddItem that returns 0 (MXAccess's "could not create") counts as a failed probe, matching v1 behavior. Low finding #2 (subscription replay silently swallowed per-tag failures): after a reconnect, the replay loop iterates the pre-reconnect subscription snapshot and calls SubscribeOnPumpAsync for each; previously those failures went into a bare catch { /* skip */ } so an operator had no signal when specific tags failed to re-subscribe — the first indication downstream was a quality drop on OPC UA clients. Fix: new SubscriptionReplayFailedEventArgs (TagReference + Exception) + SubscriptionReplayFailed event on MxAccessClient that fires once per tag that fails to re-subscribe, Log.Warning per failure with the reconnect counter + tag reference, and a summary log line at the end of the replay loop ("{failed} of {total} failed" or "{total} re-subscribed cleanly"). Serilog using + ILogger Log = Serilog.Log.ForContext<MxAccessClient>() added. Tests — MxAccessClientMonitorLoopTests (new file, 2 cases): Heartbeat_probe_calls_RemoveItem_for_every_AddItem constructs a CountingProxy IMxProxy that tracks AddItem/RemoveItem pair counts scoped to the "$Heartbeat" address, runs the client with MonitorInterval=150ms + StaleThreshold=50ms for 700ms, asserts HeartbeatAddCount > 1, HeartbeatAddCount == HeartbeatRemoveCount, OutstandingHeartbeatHandles == 0 after dispose; SubscriptionReplayFailed_fires_for_each_tag_that_fails_to_replay uses a ReplayFailingProxy that throws on the next $Heartbeat probe (to trigger the reconnect path) and throws on the replay-time AddItem for specified tag names ("BadTag.A", "BadTag.B"), subscribes GoodTag.X + BadTag.A + BadTag.B before triggering probe failure, collects SubscriptionReplayFailed args into a ConcurrentBag, asserts exactly 2 events fired and both bad tags are represented — GoodTag.X replays cleanly so it does not fire. Host.Tests csproj gains a Reference to lib/ArchestrA.MxAccess.dll because IMxProxy's MxDataChangeHandler delegate signature mentions MXSTATUS_PROXY and the compiler resolves all delegate parameter types when a test class implements the interface, even if the test code never names the type. No regressions: full Galaxy.Host.Tests Unit suite 26 pass / 0 fail (2 new monitor-loop tests + 9 PR5 historian + 15 pre-existing PostMortemMmf/RecyclePolicy/StaPump/MemoryWatchdog/EndToEndIpc/Handshake). Galaxy.Host builds clean (0 errors, 0 warnings) — the new Serilog.Log.ForContext usage picks up the existing Serilog package ref that PR 4 pulled in for the monitor-loop infrastructure. Both findings were flagged as non-blocking for PR 4 merge and are now resolved alongside whichever merge order the reviewer picks; this PR branches off phase-2-pr4-findings so it can rebase cleanly if PR 4 lands first or be re-based onto master after PR 4 merges.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 02:06:15 -04:00
Joseph Doherty
6df1a79d35 Phase 2 PR 5 — port Wonderware Historian SDK into Driver.Galaxy.Host/Backend/Historian/. The full v1 Historian.Aveva code path (HistorianDataSource + HistorianClusterEndpointPicker + IHistorianConnectionFactory + SdkHistorianConnectionFactory) now lives inside Galaxy.Host instead of the previously-required out-of-tree plugin + HistorianPluginLoader AssemblyResolve hack, and MxAccessGalaxyBackend.HistoryReadAsync — which previously returned a Phase 2 Task B.1.h follow-up placeholder — now delegates to the ported HistorianDataSource.ReadRawAsync, maps HistorianSample to GalaxyDataValue via the IPC wire shape, and reports Success=true with per-tag HistoryTagValues arrays. OPC-UA-free surface inside Galaxy.Host: the v1 code returned Opc.Ua.DataValue on the hot path, which would have required dragging OPCFoundation.NetStandard.Opc.Ua.Server into net48 x86 Galaxy.Host and bleeding OPC types across the IPC boundary — instead, the port introduces HistorianSample (Value, Quality byte, TimestampUtc) + HistorianAggregateSample (Value, TimestampUtc) POCOs that carry the raw MX quality byte through the pipe unchanged, and the OPC translation happens on the Proxy side via the existing QualityMapper that the live-read path already uses. Decision #13's IPC data-shape contract survives intact — GalaxyDataValue (TagReference + ValueBytes MessagePack + ValueMessagePackType + StatusCode + SourceTimestampUtcUnixMs + ServerTimestampUtcUnixMs) — so no Shared.Contracts wire break vs PR 4. Cluster failover preserved verbatim: HistorianClusterEndpointPicker is the thread-safe pure-logic picker ported verbatim with no SDK dependency (injected DateTime clock, per-node cooldown state, unknown-node-name tolerance, case-insensitive de-dup on configuration-order list), ConnectToAnyHealthyNode iterates the picker's healthy candidates, clones config per attempt, calls the factory, marks healthy on success / failed on exception with the failure message stored for dashboard surfacing, throws "All N healthy historian candidate(s) failed" with the last exception chained when every node exhausts. Process path + Event path use separate HistorianAccess connections (CreateHistoryQuery vs CreateEventQuery vs CreateAnalogSummaryQuery on the SDK surface) guarded by independent _connection/_eventConnection locks — a mid-query failure on one silo resets only that connection, the other stays open. Four SDK paths ported: ReadRawAsync (RetrievalMode.Full, BatchSize from config.MaxValuesPerRead, MoveNext pump, per-sample quality + value decode with the StringValue/Value fallback the v1 code did, limit-based early exit), ReadAggregateAsync (AnalogSummaryQuery + Resolution in ms, ExtractAggregateValue maps Average/Minimum/Maximum/ValueCount/First/Last/StdDev column names — the NodeId to column mapping is moved to the Proxy side since the IPC request carries a string column), ReadAtTimeAsync (per-timestamp HistoryQuery with RetrievalMode.Interpolated + BatchSize=1, returns Quality=0 / Value=null for missing samples), ReadEventsAsync (EventQuery + AddEventFilter("Source",Equal,sourceName) when sourceName is non-null, EventOrder.Ascending, EventCount = maxEvents or config.MaxValuesPerRead); GetHealthSnapshot returns the full runtime-health snapshot (TotalQueries/Successes/Failures + ConsecutiveFailures + LastSuccess/FailureTime + LastError + ProcessConnectionOpen/EventConnectionOpen + ActiveProcessNode/ActiveEventNode + per-node state list). ReadRaw is the only path wired through IPC in PR 5 (HistoryReadRequest/HistoryTagValues/HistoryReadResponse already existed in Shared.Contracts); Aggregate/AtTime/Events/Health are ported-but-not-yet-IPC-exposed — they stay internal to Galaxy.Host for PR 6+ to surface via new contract message kinds (aggregate = OPC UA HistoryReadProcessed, at-time = HistoryReadAtTime, events = HistoryReadEvents, health = admin dashboard IPC query). Galaxy.Host csproj gains aahClientManaged + aahClientCommon references with Private=false (managed wrappers) + None items for aahClient.dll + Historian.CBE.dll + Historian.DPAPI.dll + ArchestrA.CloudHistorian.Contract.dll native satellites staged alongside the host exe via CopyToOutputDirectory=PreserveNewest so aahClientManaged can P/Invoke into them at runtime without an AssemblyResolve hook (cleaner than the v1 HistorianPluginLoader.cs 180-LOC AssemblyResolve + Assembly.LoadFrom dance that existed solely because the plugin was loaded late from Host/bin/Debug/net48/Historian/). Program.cs adds BuildHistorianIfEnabled() that reads OTOPCUA_HISTORIAN_ENABLED (true or 1) + OTOPCUA_HISTORIAN_SERVER + OTOPCUA_HISTORIAN_SERVERS (comma-separated cluster list overrides single-server) + OTOPCUA_HISTORIAN_PORT (default 32568) + OTOPCUA_HISTORIAN_INTEGRATED (default true) + OTOPCUA_HISTORIAN_USER/OTOPCUA_HISTORIAN_PASS + OTOPCUA_HISTORIAN_TIMEOUT_SEC (30) + OTOPCUA_HISTORIAN_MAX_VALUES (10000) + OTOPCUA_HISTORIAN_COOLDOWN_SEC (60), returns null when disabled so MxAccessGalaxyBackend.HistoryReadAsync surfaces a clean "Historian disabled" Success=false instead of a localhost-SDK hang; server.RunAsync finally block now also casts backend to IDisposable.Dispose() so the historian SDK connections get cleanly closed on Ctrl+C. MxAccessGalaxyBackend gains an IHistorianDataSource? historian constructor parameter (defaults null to preserve existing Host.Tests call sites that don't exercise HistoryRead), implements IDisposable that forwards to _historian.Dispose(), and the pragma warning disable CS0618 is locally scoped to the ToDto(HistorianEvent) mapper since the SDK marks Id/Source/DisplayText/Severity obsolete but the replacement surface isn't available in the aahClientManaged version we bind against — every other deprecated-SDK use still surfaces as an error under TreatWarningsAsErrors. Ported from v1 Historian.Aveva unchanged: the CloneConfigWithServerName helper that preserves every config field except ServerName per attempt; the double-checked locking in EnsureConnected/EnsureEventConnected (fast path = Volatile.Read outside lock, slow path acquires lock + re-checks + disposes any raced-in-parallel connection); HandleConnectionError/HandleEventConnectionError that close the dead connection, clear the active-node tracker, MarkFailed the picker entry with the exception message so the node enters cooldown, and log the reset with node= for operator correlation; RecordSuccess/RecordFailure that bump counters under _healthLock. Tests: HistorianClusterEndpointPickerTests (7 cases) — single-node ServerName fallback when ServerNames empty, MarkFailed enters cooldown and skips, cooldown expires after window, MarkHealthy immediately clears, all-in-cooldown returns empty healthy list, Snapshot reports failure count + last error + IsHealthy, case-insensitive de-dup on duplicate hostnames. HistorianWiringTests (2 cases) — HistoryReadAsync returns "Historian disabled" Success=false when historian:null passed; HistoryReadAsync with a fake IHistorianDataSource maps the returned HistorianSample (Value=42.5, Quality=192 Good, Timestamp) to a GalaxyDataValue with StatusCode=0u + SourceTimestampUtcUnixMs matching the sample + MessagePack-encoded value bytes. InternalsVisibleTo("...Host.Tests") added to Galaxy.Host.csproj so tests can reach the internal HistorianClusterEndpointPicker. Full Galaxy.Host.Tests suite: 24 pass / 0 fail (9 new historian + 15 pre-existing MemoryWatchdog/PostMortemMmf/RecyclePolicy/StaPump/EndToEndIpc/Handshake). Full solution build: 0 errors (202 pre-existing warnings). The v1 Historian.Aveva project + Historian.Aveva.Tests still build intact because the archive PR (Stream D.1 destructive delete) is still ahead of us — PR 5 intentionally does not delete either; once PR 2+3 merge and the archive-delete PR lands, a follow-up cleanup can remove Historian.Aveva + its 4 source files + 18 test cases. Alarm subsystem wire-up (OnAlarmEvent raising from MxAccessGalaxyBackend via AlarmExtension primitives) + host-status push (OnHostStatusChanged via a ported GalaxyRuntimeProbeManager) remain PR 6 candidates; they were on the same "Task B.1.h follow-up" list and share the IPC connection-sink wiring with the historian events path — it made PR 5 scope-manageable to do Historian first since that's what has the biggest surface area (981 LOC v1 plus SDK binding) and alarms/host-status have more bespoke integration with the existing MxAccess subscription fan-out.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 01:44:04 -04:00
Joseph Doherty
caa9cb86f6 Phase 2 PR 4 — close the 4 open high/medium MXAccess findings from exit-gate-phase-2-final.md. High 1 (ReadAsync subscription-leak on cancel): the one-shot read now wraps subscribe→first-OnDataChange→unsubscribe in try/finally so the per-tag callback is always detached, and if the read installed the underlying MXAccess subscription itself (the prior _addressToHandle key was absent) it tears it down on the way out — no leaked probe item handles when the caller cancels or times out. High 2 (no reconnect loop): MxAccessClient gets a MxAccessClientOptions {AutoReconnect, MonitorInterval=5s, StaleThreshold=60s} + a background MonitorLoopAsync started at first ConnectAsync. The loop wakes every MonitorInterval, checks _lastObservedActivityUtc (bumped by every OnDataChange callback), and if stale probes the proxy with a no-op COM AddItem("$Heartbeat") on the StaPump; if the probe throws or returns false, the loop reconnects-with-replay — Unregister (best-effort), Register, snapshot _addressToHandle.Keys + clear, re-AddItem every previously-active subscription, ConnectionStateChanged events fire for the false→true transition, ReconnectCount bumps. Medium 3 (subscriptions don't push frames back to Proxy): IGalaxyBackend gains OnDataChange/OnAlarmEvent/OnHostStatusChanged events; new IFrameHandler.AttachConnection(FrameWriter) is called per-connection by PipeServer after Hello + the returned IDisposable disposes at connection close; GalaxyFrameHandler.ConnectionSink subscribes the events for the connection lifetime, fire-and-forget pushes them as MessageKind.OnDataChangeNotification / AlarmEvent / RuntimeStatusChange frames through the writer, swallows ObjectDisposedException for the dispose race, and unsubscribes in Dispose to prevent leaked invocation list refs across reconnects. MxAccessGalaxyBackend's existing SubscribeAsync (which previously discarded values via a (_, __) => {} callback) now wires OnTagValueChanged that fans out per-tag value changes to every subscription ID listening (one MXAccess subscription, multi-fan-out — _refToSubs reverse map). UnsubscribeAsync also reverse-walks the map to only call mx.UnsubscribeAsync when the LAST sub for a tag drops. Stub + DbBacked backends declare the events with #pragma warning disable CS0067 because they never raise them but must satisfy the interface (treat-warnings-as-errors would otherwise fail). Medium 4 (WriteValuesAsync doesn't await OnWriteComplete): MxAccessClient.WriteAsync rewritten to return Task<bool> via the v1-style TaskCompletionSource-keyed-by-item-handle pattern in _pendingWrites — adds the TCS before the Write call, awaits it with a configurable timeout (default 5s), removes the TCS in finally, returns true only when OnWriteComplete reported success. MxAccessGalaxyBackend.WriteValuesAsync now reports per-tag Bad_InternalError ("MXAccess runtime reported write failure") when the bool returns false, instead of false-positive Good. PipeServer's IFrameHandler interface adds the AttachConnection(FrameWriter):IDisposable method + a public NoopAttachment nested class (net48 doesn't support default interface methods so the empty-attach is exposed for stub implementations). StubFrameHandler returns IFrameHandler.NoopAttachment.Instance. RunOneConnectionAsync calls AttachConnection after HelloAck and usings the returned disposable so it disposes at the connection scope's finally. ConnectionStateChanged event added on MxAccessClient (caller-facing diagnostics for false→true reconnect transitions). docs/v2/implementation/pr-4-body.md is the Gitea web-UI paste-in for opening PR 4 once pushed; includes 2 new low-priority adversarial findings (probe item-handle leak; replay-loop silently swallows per-subscription failures) flagged as follow-ups not PR 4 blockers. Full solution 460 pass / 7 skip (E2E on admin shell) / 1 pre-existing Phase 0 baseline. No regressions vs PR 2's baseline.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 01:12:09 -04:00
Joseph Doherty
a3d16a28f1 Phase 2 Stream D Option B — archive v1 surface + new Driver.Galaxy.E2E parity suite. Non-destructive intermediate state: the v1 OtOpcUa.Host + Historian.Aveva + Tests + IntegrationTests projects all still build (494 v1 unit + 6 v1 integration tests still pass when run explicitly), but solution-level dotnet test ZB.MOM.WW.OtOpcUa.slnx now skips them via IsTestProject=false on the test projects + archive-status PropertyGroup comments on the src projects. The destructive deletion is reserved for Phase 2 PR 3 with explicit operator review per CLAUDE.md "only use destructive operations when truly the best approach". tests/ZB.MOM.WW.OtOpcUa.Tests/ renamed via git mv to tests/ZB.MOM.WW.OtOpcUa.Tests.v1Archive/; csproj <AssemblyName> kept as the original ZB.MOM.WW.OtOpcUa.Tests so v1 OtOpcUa.Host's [InternalsVisibleTo("ZB.MOM.WW.OtOpcUa.Tests")] still matches and the project rebuilds clean. tests/ZB.MOM.WW.OtOpcUa.IntegrationTests gets <IsTestProject>false</IsTestProject>. src/ZB.MOM.WW.OtOpcUa.Host + src/ZB.MOM.WW.OtOpcUa.Historian.Aveva get PropertyGroup archive-status comments documenting they're functionally superseded but kept in-build because cascading dependencies (Historian.Aveva → Host; IntegrationTests → Host) make a single-PR deletion high blast-radius. New tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.E2E/ project (.NET 10) with ParityFixture that spawns OtOpcUa.Driver.Galaxy.Host.exe (net48 x86) as a Process.Start subprocess with OTOPCUA_GALAXY_BACKEND=db env vars, awaits 2s for the PipeServer to bind, then exposes a connected GalaxyProxyDriver; skips on non-Windows / Administrator shells (PipeAcl denies admins per decision #76) / ZB unreachable / Host EXE not built — each skip carries a SkipReason string the test method reads via Assert.Skip(SkipReason). RecordingAddressSpaceBuilder captures every Folder/Variable/AddProperty registration so parity tests can assert on the same shape v1 LmxNodeManager produced. HierarchyParityTests (3) — Discover returns gobjects with attributes; attribute full references match the tag.attribute Galaxy reference grammar; HistoryExtension flag flows through correctly. StabilityFindingsRegressionTests (4) — one test per 2026-04-13 stability finding from commits c76ab8f and 7310925: phantom probe subscription doesn't corrupt unrelated host status; HostStatusChangedEventArgs structurally carries a specific HostName + OldState + NewState (event signature mathematically prevents the v1 cross-host quality-clear bug); all GalaxyProxyDriver capability methods return Task or Task<T> (sync-over-async would deadlock OPC UA stack thread); AcknowledgeAsync completes before returning (no fire-and-forget background work that could race shutdown). Solution test count: 470 pass / 7 skip (E2E on admin shell) / 1 pre-existing Phase 0 baseline. Run archived suites explicitly: dotnet test tests/ZB.MOM.WW.OtOpcUa.Tests.v1Archive (494 pass) + dotnet test tests/ZB.MOM.WW.OtOpcUa.IntegrationTests (6 pass). docs/v2/V1_ARCHIVE_STATUS.md inventories every archived surface with run-it-explicitly instructions + a 10-step deletion plan for PR 3 + rollback procedure (git revert restores all four projects). docs/v2/implementation/exit-gate-phase-2-final.md supersedes the two partial-exit docs with the per-stream status table (A/B/C/D/E all addressed, D split across PR 2/3 per safety protocol), the test count breakdown, fresh adversarial review of PR 2 deltas (4 new findings: medium IsTestProject=false safety net loss, medium structural-vs-behavioral stability tests, low backend=db default, low Process.Start env inheritance), the 8 carried-forward findings from exit-gate-phase-2.md, the recommended PR order (1 → 2 → 3 → 4). docs/v2/implementation/pr-2-body.md is the Gitea web-UI paste-in for opening PR 2 once pushed.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 00:56:21 -04:00
Joseph Doherty
50f81a156d Doc — PR 1 body for Gitea web UI paste-in. PR title + summary + test matrix + reviewer test plan + follow-up tracking. Source phase-1-configuration → target v2; URL https://gitea.dohertylan.com/dohertj2/lmxopcua/pulls/new/phase-1-configuration. No gh/tea CLI on this box, so the body is staged here for the operator to paste into the Gitea web UI rather than auto-created via API.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 00:46:23 -04:00
Joseph Doherty
7403b92b72 Phase 2 Stream D progress — non-destructive deliverables: appsettings → DriverConfig migration script, two-service Windows installer scripts, process-spawn cross-FX parity test, Stream D removal procedure doc with both Option A (rewrite 494 v1 tests) and Option B (archive + new v2 E2E suite) spelled out step-by-step. Cannot one-shot the actual legacy-Host deletion in any unattended session — explained in the procedure doc; the parity-defect debug cycle is intrinsically interactive (each iteration requires inspecting a v1↔v2 diff and deciding if it's a legitimate v2 improvement or a regression, then either widening the assertion or fixing the v2 code), and git rm -r src/ZB.MOM.WW.OtOpcUa.Host is destructive enough to need explicit operator authorization on a real PR review. scripts/migration/Migrate-AppSettings-To-DriverConfig.ps1 takes a v1 appsettings.json and emits the v2 DriverInstance.DriverConfig JSON blob (MxAccess/Database/Historian sections) ready to upsert into the central Configuration DB; null-leaf stripping; -DryRun mode; smoke-tested against the dev appsettings.json and produces the expected three-section ordered-dictionary output. scripts/install/Install-Services.ps1 registers the two v2 services with sc.exe — OtOpcUaGalaxyHost first (net48 x86 EXE with OTOPCUA_GALAXY_PIPE/OTOPCUA_ALLOWED_SID/OTOPCUA_GALAXY_SECRET/OTOPCUA_GALAXY_BACKEND/OTOPCUA_GALAXY_ZB_CONN/OTOPCUA_GALAXY_CLIENT_NAME env vars set via HKLM:\SYSTEM\CurrentControlSet\Services\OtOpcUaGalaxyHost\Environment registry), then OtOpcUa with depend=OtOpcUaGalaxyHost; resolves down-level account names to SID for the IPC ACL; generates a fresh 32-byte base64 shared secret per install if not supplied (kept out of registry — operators record offline for service rebinding scenarios); echoes start commands. scripts/install/Uninstall-Services.ps1 stops + removes both services. tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy.Tests/HostSubprocessParityTests.cs is the production-shape parity test — Proxy (.NET 10) spawns the actual OtOpcUa.Driver.Galaxy.Host.exe (net48 x86) as a subprocess via Process.Start with backend=db env vars, connects via real named pipe, calls Discover, asserts at least one Galaxy gobject comes back. Skipped when running as Administrator (PipeAcl denies admins, same guard as other IPC integration tests), when the Host EXE hasn't been built, or when the ZB SQL endpoint is unreachable. This is the cross-FX integration that the parity suite genuinely needs — the previous IPC tests all ran in-process; this one validates the production deployment topology where Proxy and Host are separate processes communicating only over the named pipe. docs/v2/implementation/stream-d-removal-procedure.md is the next-session playbook: Option A (rewrite 494 v1 tests via a ProxyMxAccessClientAdapter that implements v1's IMxAccessClient by forwarding to GalaxyProxyDriver — Vtq↔DataValueSnapshot, Quality↔StatusCode, OnTagValueChanged↔OnDataChange mapping; 3-5 days, full coverage), Option B (rename OtOpcUa.Tests → OtOpcUa.Tests.v1Archive with [Trait("Category", "v1Archive")] for opt-in CI runs; new OtOpcUa.Driver.Galaxy.E2E test project with 10-20 representative tests via the HostSubprocessParityTests pattern; 1-2 days, accreted coverage); deletion checklist with eight pre-conditions, ten ordered steps, and a rollback path (git revert restores the legacy Host alongside the v2 stack — both topologies remain installable until the downstream consumer cutover). Full solution 964 pass / 1 pre-existing Phase 0 baseline; the 494 v1 IntegrationTests + 6 v1 IntegrationTests-net48 still pass because legacy OtOpcUa.Host stays untouched until an interactive session executes the procedure doc.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 00:38:44 -04:00
Joseph Doherty
a7126ba953 Phase 2 — port MXAccess COM client to Galaxy.Host + MxAccessGalaxyBackend (3rd IGalaxyBackend) + live MXAccess smoke + Phase 2 exit-gate doc + adversarial review. The full Galaxy data-plane now flows through the v2 IPC topology end-to-end against live ArchestrA.MxAccess.dll, on this dev box, with 30/30 Host tests + 9/9 Proxy tests + 963/963 solution tests passing alongside the unchanged 494 v1 IntegrationTests baseline. Backend/MxAccess/Vtq is a focused port of v1's Vtq value-timestamp-quality DTO. Backend/MxAccess/IMxProxy abstracts LMXProxyServer (port of v1's IMxProxy with the same Register/Unregister/AddItem/RemoveItem/AdviseSupervisory/UnAdviseSupervisory/Write surface + OnDataChange + OnWriteComplete events); MxProxyAdapter is the concrete COM-backed implementation that does Marshal.ReleaseComObject-loop on Unregister, must be constructed on an STA thread. Backend/MxAccess/MxAccessClient is the focused port of v1's MxAccessClient partials — Connect/Disconnect/Read/Write/Subscribe/Unsubscribe through the new Sta/StaPump (the real Win32 GetMessage pump from the previous commit), ConcurrentDictionary handle tracking, OnDataChange event marshalling to per-tag callbacks, ReadAsync implemented as the canonical subscribe → first-OnDataChange → unsubscribe one-shot pattern. Galaxy.Host csproj flipped to x86 PlatformTarget + Prefer32Bit=true with the ArchestrA.MxAccess HintPath ..\..\lib\ArchestrA.MxAccess.dll reference (lib/ already contains the production DLL). Backend/MxAccessGalaxyBackend is the third IGalaxyBackend implementation (alongside StubGalaxyBackend and DbBackedGalaxyBackend): combines GalaxyRepository (Discover) with MxAccessClient (Read/Write/Subscribe), MessagePack-deserializes inbound write values, MessagePack-serializes outbound read values into ValueBytes, decodes ArrayDimension/SecurityClassification/category_id with the same v1 mapping. Program.cs selects between stub|db|mxaccess via OTOPCUA_GALAXY_BACKEND env var (default = mxaccess); OTOPCUA_GALAXY_ZB_CONN overrides the ZB connection string; OTOPCUA_GALAXY_CLIENT_NAME sets the Wonderware client identity; the StaPump and MxAccessClient lifecycles are tied to the server.RunAsync try/finally so a clean Ctrl+C tears down the COM proxy via Marshal.ReleaseComObject before the pump's WM_QUIT. Live MXAccess smoke tests (MxAccessLiveSmokeTests, net48 x86) — skipped when ZB unreachable or aaBootstrap not running, otherwise verify (1) MxAccessClient.ConnectAsync returns a positive LMXProxyServer handle on the StaPump, (2) MxAccessGalaxyBackend.OpenSession + Discover returns at least one gobject with attributes, (3) MxAccessGalaxyBackend.ReadValues against the first discovered attribute returns a response with the correct TagReference shape (value + quality vary by what's running, so we don't assert specific values). All 3 pass on this dev box. EndToEndIpcTests + IpcHandshakeIntegrationTests moved from Galaxy.Proxy.Tests (net10) to Galaxy.Host.Tests (net48 x86) — the previous test placement silently dropped them at xUnit discovery because Host became net48 x86 and net10 process can't load it. Rewritten to use Shared's FrameReader/FrameWriter directly instead of going through Proxy's GalaxyIpcClient (functionally equivalent — same wire protocol, framing primitives + dispatcher are the production code path verbatim). 7 IPC tests now run cleanly: Hello+heartbeat round-trip, wrong-secret rejection, OpenSession session-id assignment, Discover error-response surfacing, WriteValues per-tag bad status, Subscribe id assignment, Recycle grace window. Phase 2 exit-gate doc (docs/v2/implementation/exit-gate-phase-2.md) supersedes the partial-exit doc with the as-built state — Streams A/B/C complete; D/E gated only on the legacy-Host removal + parity-test rewrite cycle that fundamentally requires multi-day debug iteration; full adversarial-review section ranking 8 findings (2 high, 3 medium, 3 low) all explicitly deferred to Stream D/E or v2.1 with rationale; Stream-D removal checklist gives the next-session entry point with two policy options for the 494 v1 tests (rewrite-to-use-Proxy vs archive-and-write-smaller-v2-parity-suite). Cannot one-shot Stream D.1 in any single session because deleting OtOpcUa.Host requires the v1 IntegrationTests cycle to be retargeted first; that's the structural blocker, not "needs more code" — and the plan itself budgets 3-4 weeks for it.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 00:23:24 -04:00
Joseph Doherty
549cd36662 Phase 2 — port GalaxyRepository to Galaxy.Host + DbBackedGalaxyBackend, smoke-tested against live ZB. Real Galaxy gobject hierarchy + dynamic attributes now flow through the IPC contract end-to-end without any MXAccess code involvement, so the OPC UA address-space build (Stream C.4 acceptance) becomes parity-testable today even before the COM client port lands. Backend/Galaxy/GalaxyRepository.cs is a byte-for-byte port of v1 GalaxyRepositoryService's HierarchySql + AttributesSql (the two SQL bodies, both ~50 lines of recursive CTE template-chain + deployed_package_chain logic, are identical to v1 so the row set is verifiably the same — extended-attributes + scope-filter queries from v1 are intentionally not ported yet, they're refinements not on the Phase 2 critical path); plus TestConnectionAsync (SELECT 1) and GetLastDeployTimeAsync (SELECT time_of_last_deploy FROM galaxy) for the ChangeDetection deploy-watermark path. Backend/Galaxy/GalaxyRepositoryOptions defaults to localhost ZB Integrated Security; runtime override comes from DriverConfig.Database section per plan.md §"Galaxy DriverConfig". Backend/Galaxy/GalaxyHierarchyRow + GalaxyAttributeRow are the row-shape DTOs (no required modifier — net48 lacks RequiredMemberAttribute and we'd need a polyfill shim like the existing IsExternalInit one; default-string init is simpler). System.Data.SqlClient 4.9.0 added (the same package the v1 Host uses; net48-compatible). Backend/DbBackedGalaxyBackend wraps the repository: DiscoverAsync builds a real DiscoverHierarchyResponse (groups attributes by gobject, resolves parent-by-tagname, maps category_id → human-readable template-category name mirroring v1 AlarmObjectFilter); ReadValuesAsync/WriteValuesAsync/HistoryReadAsync still surface "MXAccess code lift pending (Phase 2 Task B.1)" because runtime data values genuinely need the COM client; OpenSession/CloseSession/Subscribe/Unsubscribe/AlarmSubscribe/AlarmAck/Recycle return success without backend work (subscription ID is a synthetic counter for now). Live smoke tests (GalaxyRepositoryLiveSmokeTests) skip when localhost ZB is unreachable; when present they verify (1) TestConnection returns true, (2) GetHierarchy returns at least one deployed gobject with a non-empty TagName, (3) GetAttributes returns rows with FullTagReference matching the "tag.attribute" shape, (4) GetLastDeployTime returns a value, (5) DbBackedBackend.DiscoverAsync returns at least one gobject with attributes and a populated TemplateCategory. All 5 pass against the local Galaxy. Full solution 957 pass / 1 pre-existing Phase 0 baseline; the 494 v1 IntegrationTests + 6 v1 IntegrationTests-net48 tests still pass — legacy OtOpcUa.Host untouched. Remaining for the Phase 2 exit gate is the MXAccess COM client port itself (the v1 MxAccessClient partials + IMxProxy abstraction + StaPump-based Connect/Subscribe/Read/Write semantics) — Discover is now solved in DB-backed form, so the lift can focus exclusively on the runtime data-plane.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 23:14:09 -04:00
Joseph Doherty
32eeeb9e04 Phase 2 Streams A+B+C feature-complete — real Win32 pump, all 9 IDriver capabilities, end-to-end IPC dispatch. Streams D+E remain (Galaxy MXAccess code lift + parity-debug cycle, plan-budgeted 3-4 weeks). The 494 v1 IntegrationTests still pass — legacy OtOpcUa.Host untouched. StaPump replaces the BlockingCollection placeholder with a real Win32 message pump lifted from v1 StaComThread per CLAUDE.md "Reference Implementation": dedicated STA Thread with SetApartmentState(STA), GetMessage/PostThreadMessage/PeekMessage/TranslateMessage/DispatchMessage/PostQuitMessage P/Invoke, WM_APP=0x8000 for work-item dispatch, WM_APP+1 for graceful-drain → PostQuitMessage, peek-pm-noremove on entry to force the system to create the thread message queue before signalling Started, IsResponsiveAsync probe still no-op-round-trips through PostThreadMessage so the wedge detection works against the real pump. Concurrent ConcurrentQueue<WorkItem> drains on every WM_APP; fault path on dispose drains-and-faults all pending work-item TCSes with InvalidOperationException("STA pump has exited"). All three StaPumpTests pass against the real pump (apartment state STA, healthy probe true, wedged probe false). GalaxyProxyDriver now implements every Phase 2 Stream C capability — IDriver, ITagDiscovery, IReadable, IWritable, ISubscribable, IAlarmSource, IHistoryProvider, IRediscoverable, IHostConnectivityProbe — each forwarding through the matching IPC contract. ReadAsync preserves request order even when the Host returns out-of-order values; WriteAsync MessagePack-serializes the value into ValueBytes; SubscribeAsync wraps SubscriptionId in a GalaxySubscriptionHandle record; UnsubscribeAsync uses the new SendOneWayAsync helper on GalaxyIpcClient (fire-and-forget but still gated through the call-semaphore so it doesn't interleave with CallAsync); AlarmSubscribe is one-way and the Host pushes events back via OnAlarmEvent; ReadProcessedAsync short-circuits to NotSupportedException (Galaxy historian only does raw); IRediscoverable's OnRediscoveryNeeded fires when the Host pushes a deploy-watermark notification; IHostConnectivityProbe.GetHostStatuses() snapshots and OnHostStatusChanged fires on Running↔Stopped/Faulted transitions, with IpcHostConnectivityStatus aliased to disambiguate from the Core.Abstractions namespace's same-named type. Internal RaiseDataChange/RaiseAlarmEvent/RaiseRediscoveryNeeded/OnHostConnectivityUpdate methods are the entry points the IPC client will invoke when push frames arrive. Host side: new Backend/IGalaxyBackend interface defines the seam between IPC dispatch and the live MXAccess code (so the dispatcher is unit-testable against an in-memory mock without needing live Galaxy); Backend/StubGalaxyBackend returns success for OpenSession/CloseSession/Subscribe/Unsubscribe/AlarmSubscribe/AlarmAck/Recycle and a recognizable "stub: MXAccess code lift pending (Phase 2 Task B.1)"-tagged error for Discover/ReadValues/WriteValues/HistoryRead — keeps the IPC end-to-end testable today and gives the parity team a clear seam to slot the real implementation into; Ipc/GalaxyFrameHandler is the new real dispatcher (replaces StubFrameHandler in Program.cs) — switch on MessageKind, deserialize the matching contract, await backend method, write the response (one-way for Unsubscribe/AlarmSubscribe/AlarmAck/CloseSession), heartbeat handled inline so liveness still works if the backend is sick, exceptions caught and surfaced as ErrorResponse with code "handler-exception" so the Proxy raises GalaxyIpcException instead of disconnecting. End-to-end IPC integration test (EndToEndIpcTests) drives every operation through the full stack — Initialize → Read → Write → Subscribe → Unsubscribe → SubscribeAlarms → AlarmAck → ReadRaw → ReadProcessed (short-circuit) — proving the wire protocol, dispatcher, capability forwarding, and one-way semantics agree end-to-end. Skipped on Windows administrator shells per the same PipeAcl-denies-Administrators reasoning the IpcHandshakeIntegrationTests use. Full solution 952 pass / 1 pre-existing Phase 0 baseline. Phase 2 evidence doc updated: status header now reads "Streams A+B+C complete... Streams D+E remain — gated only on the iterative Galaxy code lift + parity-debug cycle"; new Update 2026-04-17 (later) callout enumerates the upgrade with explicit "what's left for the Phase 2 exit gate" — replace StubGalaxyBackend with a MxAccessClient-backed implementation calling on the StaPump, then run the v1 IntegrationTests against the v2 topology and iterate on parity defects until green, then delete legacy OtOpcUa.Host.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 23:02:00 -04:00
Joseph Doherty
a1e9ed40fb Doc — record that this dev box (DESKTOP-6JL3KKO) hosts the full AVEVA stack required for the LmxOpcUa Phase 2 breakout, removing the "needs live MXAccess runtime" environmental blocker that the partial-exit evidence cited as gating Streams D + E. Inventory verified via Get-Service: 27 ArchestrA / Wonderware / AVEVA services running including aaBootstrap, aaGR (Galaxy Repository), aaLogger, aaUserValidator, aaPim, ArchestrADataStore, AsbServiceManager, AutoBuild_Service; the full Historian set (aahClientAccessPoint, aahGateway, aahInSight, aahSearchIndexer, aahSupervisor, InSQLStorage, InSQLConfiguration, InSQLEventSystem, InSQLIndexing, InSQLIOServer, InSQLManualStorage, InSQLSystemDriver, HistorianSearch-x64); slssvc (Wonderware SuiteLink); MXAccess COM DLL at C:\Program Files (x86)\ArchestrA\Framework\bin\ArchestrA.MXAccess.dll plus the matching .tlb files; OI-Gateway install at C:\Program Files (x86)\Wonderware\OI-Server\OI-Gateway\ — which means the Phase 1 Task E.10 AppServer-via-OI-Gateway smoke test (decision #142) is *also* runnable on the same box, not blocked on a separate AVEVA test machine as the original deferral assumed. dev-environment.md inventory row for "Dev Galaxy" now lists every service and file path; status flips to "Fully available — Phase 2 lift unblocked"; the GLAuth row also fills out v2.4.0 actual install details (direct-bind cn={user},dc=lmxopcua,dc=local; users readonly/writeop/writetune/writeconfig/alarmack/admin/serviceaccount; running under NSSM service GLAuth; current GroupToRole mapping ReadOnly→ConfigViewer / WriteOperate→ConfigEditor / AlarmAck→FleetAdmin) and notes the v2-rebrand to dc=otopcua,dc=local is a future cosmetic change. phase-2-partial-exit-evidence.md status header gains "runtime now in place"; an Update 2026-04-17 callout enumerates the same service inventory and concludes "no environmental blocker remains"; the next-session checklist's first step changes from "stand up dev Galaxy" to "verify the local AVEVA stack is still green (Get-Service aaGR, aaBootstrap, slssvc → Running) and the Galaxy ZB repository is reachable" with a new step 9 calling out that the AppServer-via-OI-Gateway smoke test should now be folded in opportunistically. plan.md §"4. Galaxy/MXAccess as Out-of-Process Driver" gains a "Dev environment for the LmxOpcUa breakout" paragraph documenting which physical machine has the runtime so the planning doc no longer reads as if AVEVA capability were a future logistical concern. No source / test changes.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 22:42:15 -04:00
140 changed files with 7350 additions and 155 deletions

View File

@@ -23,7 +23,8 @@
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared.Tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared.Tests.csproj"/>
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Tests.csproj"/>
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy.Tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy.Tests.csproj"/>
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Tests/ZB.MOM.WW.OtOpcUa.Tests.csproj"/>
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.E2E/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.E2E.csproj"/>
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Tests.v1Archive/ZB.MOM.WW.OtOpcUa.Tests.v1Archive.csproj"/>
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Historian.Aveva.Tests/ZB.MOM.WW.OtOpcUa.Historian.Aveva.Tests.csproj"/>
<Project Path="tests/ZB.MOM.WW.OtOpcUa.IntegrationTests/ZB.MOM.WW.OtOpcUa.IntegrationTests.csproj"/>
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Client.Shared.Tests/ZB.MOM.WW.OtOpcUa.Client.Shared.Tests.csproj"/>

View File

@@ -0,0 +1,56 @@
# V1 Archive Status (Phase 2 Stream D, 2026-04-18)
This document inventories every v1 surface that's been **functionally superseded** by v2 but
**physically retained** in the build until the deletion PR (Phase 2 PR 3). Rationale: cascading
references mean a single deletion is high blast-radius; archive-marking lets the v2 stack ship
on its own merits while the v1 surface stays as parity reference.
## Archived projects
| Path | Status | Replaced by | Build behavior |
|---|---|---|---|
| `src/ZB.MOM.WW.OtOpcUa.Host/` | Archive (executable in build) | `OtOpcUa.Server` + `Driver.Galaxy.Host` + `Driver.Galaxy.Proxy` | Builds; not deployed by v2 install scripts |
| `src/ZB.MOM.WW.OtOpcUa.Historian.Aveva/` | Archive (plugin in build) | TODO: port into `Driver.Galaxy.Host/Backend/Historian/` (Task B.1.h follow-up) | Builds; loaded only by archived Host |
| `tests/ZB.MOM.WW.OtOpcUa.Tests.v1Archive/` | Archive | `Driver.Galaxy.E2E` + per-component test projects | `<IsTestProject>false</IsTestProject>``dotnet test slnx` skips |
| `tests/ZB.MOM.WW.OtOpcUa.IntegrationTests/` | Archive | `Driver.Galaxy.E2E` | `<IsTestProject>false</IsTestProject>``dotnet test slnx` skips |
## How to run the archived suites explicitly
```powershell
# v1 unit tests (494):
dotnet test tests/ZB.MOM.WW.OtOpcUa.Tests.v1Archive
# v1 integration tests (6):
dotnet test tests/ZB.MOM.WW.OtOpcUa.IntegrationTests
```
Both still pass on this dev box — they're the parity reference for Phase 2 PR 3's deletion
decision.
## Deletion plan (Phase 2 PR 3)
Pre-conditions:
- [ ] `Driver.Galaxy.E2E` test count covers the v1 IntegrationTests' 6 integration scenarios
at minimum (currently 7 tests; expand as needed)
- [ ] `Driver.Galaxy.Host/Backend/Historian/` ports the Wonderware Historian plugin
so `MxAccessGalaxyBackend.HistoryReadAsync` returns real data (Task B.1.h)
- [ ] Operator review on a separate PR — destructive change
Steps:
1. `git rm -r src/ZB.MOM.WW.OtOpcUa.Host/`
2. `git rm -r src/ZB.MOM.WW.OtOpcUa.Historian.Aveva/`
(or move it under Driver.Galaxy.Host first if the lift is part of the same PR)
3. `git rm -r tests/ZB.MOM.WW.OtOpcUa.Tests.v1Archive/`
4. `git rm -r tests/ZB.MOM.WW.OtOpcUa.IntegrationTests/`
5. Edit `ZB.MOM.WW.OtOpcUa.slnx` — remove the four project lines
6. `dotnet build ZB.MOM.WW.OtOpcUa.slnx` → confirm clean
7. `dotnet test ZB.MOM.WW.OtOpcUa.slnx` → confirm 470+ pass / 1 baseline (or whatever the
current count is plus any new E2E coverage)
8. Commit: "Phase 2 Stream D — delete v1 archive (Host + Historian.Aveva + v1Tests + IntegrationTests)"
9. PR 3 against `v2`, link this doc + exit-gate-phase-2-final.md
10. One reviewer signoff
## Rollback
If Phase 2 PR 3 surfaces downstream consumer regressions, `git revert` the deletion commit
restores the four projects intact. The v2 stack continues to ship from the v2 branch.

View File

@@ -59,8 +59,8 @@ Running record of every v2 dev service stood up on this developer machine. Updat
| Service | Container / Process | Version | Host:Port | Credentials (dev-only) | Data location | Status |
|---------|---------------------|---------|-----------|------------------------|---------------|--------|
| **Central config DB** | Docker container `otopcua-mssql` (image `mcr.microsoft.com/mssql/server:2022-latest`) | 16.0.4250.1 (RTM-CU24-GDR, KB5083252) | `localhost:14330` (host) → `1433` (container) — remapped from 1433 to avoid collision with the native MSSQL14 instance that hosts the Galaxy `ZB` DB (both bind 0.0.0.0:1433; whichever wins the race gets connections) | User `sa` / Password `OtOpcUaDev_2026!` | Docker named volume `otopcua-mssql-data` (mounted at `/var/opt/mssql` inside container) | ✅ Running — `InitialSchema` migration applied, 16 entity tables live |
| Dev Galaxy (AVEVA System Platform) | Local install on this dev box | v1 baseline | Local COM via MXAccess | Windows Auth | Galaxy repository DB `ZB` on local SQL Server (separate instance from `otopcua-mssql` — legacy v1 Galaxy DB, not related to v2 config DB) | ✅ Available (per CLAUDE.md) |
| GLAuth (LDAP) | Local install at `C:\publish\glauth\` | v1 baseline | `localhost:3893` (LDAP) / `3894` (LDAPS) | Bind DN `cn=admin,dc=otopcua,dc=local` / password in `glauth-otopcua.cfg` | `C:\publish\glauth\` | Pending — v2 test users + groups config not yet seeded (Phase 1 Stream E task) |
| Dev Galaxy (AVEVA System Platform) | Local install on this dev box — full ArchestrA + Historian + OI-Server stack | v1 baseline | Local COM via MXAccess (`C:\Program Files (x86)\ArchestrA\Framework\bin\ArchestrA.MXAccess.dll`); Historian via `aaH*` services; SuiteLink via `slssvc` | Windows Auth | Galaxy repository DB `ZB` on local SQL Server (separate instance from `otopcua-mssql` — legacy v1 Galaxy DB, not related to v2 config DB) | ✅ **Fully available — Phase 2 lift unblocked.** 27 ArchestrA / AVEVA / Wonderware services running incl. `aaBootstrap`, `aaGR` (Galaxy Repository), `aaLogger`, `aaUserValidator`, `aaPim`, `ArchestrADataStore`, `AsbServiceManager`, `AutoBuild_Service`; full Historian set (`aahClientAccessPoint`, `aahGateway`, `aahInSight`, `aahSearchIndexer`, `aahSupervisor`, `InSQLStorage`, `InSQLConfiguration`, `InSQLEventSystem`, `InSQLIndexing`, `InSQLIOServer`, `InSQLManualStorage`, `InSQLSystemDriver`, `HistorianSearch-x64`); `slssvc` (Wonderware SuiteLink); `OI-Gateway` install present at `C:\Program Files (x86)\Wonderware\OI-Server\OI-Gateway\` (decision #142 AppServer-via-OI-Gateway smoke test now also unblocked) |
| GLAuth (LDAP) | Local install at `C:\publish\glauth\` | v2.4.0 | `localhost:3893` (LDAP) / `3894` (LDAPS, disabled) | Direct-bind `cn={user},dc=lmxopcua,dc=local` per `auth.md`; users `readonly`/`writeop`/`writetune`/`writeconfig`/`alarmack`/`admin`/`serviceaccount` (passwords in `glauth.cfg` as SHA-256) | `C:\publish\glauth\` | ✅ Running (NSSM service `GLAuth`). Phase 1 Admin uses GroupToRole map `ReadOnly→ConfigViewer`, `WriteOperate→ConfigEditor`, `AlarmAck→FleetAdmin`. v2-rebrand to `dc=otopcua,dc=local` is a future cosmetic change |
| OPC Foundation reference server | Not yet built | — | `localhost:62541` (target) | `user1` / `password1` (reference-server defaults) | — | Pending (needed for Phase 5 OPC UA Client driver testing) |
| FOCAS TCP stub | Not yet built | — | `localhost:8193` (target) | n/a | — | Pending (built in Phase 5) |
| Modbus simulator (`oitc/modbus-server`) | — | — | `localhost:502` (target) | n/a | — | Pending (needed for Phase 3 Modbus driver; moves to integration host per two-tier model) |

View File

@@ -0,0 +1,123 @@
# Phase 2 Final Exit Gate (2026-04-18)
> Supersedes `phase-2-partial-exit-evidence.md` and `exit-gate-phase-2.md`. Captures the
> as-built state at the close of Phase 2 work delivered across two PRs.
## Status: **All five Phase 2 streams addressed. Stream D split across PR 2 (archive) + PR 3 (delete) per safety protocol.**
## Stream-by-stream status
| Stream | Plan §reference | Status | PR |
|---|---|---|---|
| A — Driver.Galaxy.Shared | §A.1A.3 | ✅ Complete | PR 1 (merged or pending) |
| B — Driver.Galaxy.Host | §B.1B.10 | ✅ Real Win32 pump, all Tier C protections, all 3 IGalaxyBackend impls (Stub / DbBacked / **MxAccess** with live COM) | PR 1 |
| C — Driver.Galaxy.Proxy | §C.1C.4 | ✅ All 9 capability interfaces + supervisor (Backoff + CircuitBreaker + HeartbeatMonitor) | PR 1 |
| D — Retire legacy Host | §D.1D.3 | ✅ Migration script, installer scripts, Stream D procedure doc, **archive markings on all v1 surface (this PR 2)**, deletion deferred to PR 3 | PR 2 (this) + PR 3 (next) |
| E — Parity validation | §E.1E.4 | ✅ E2E test scaffold + 4 stability-finding regression tests + `HostSubprocessParityTests` cross-FX integration | PR 2 (this) |
## What changed in PR 2 (this branch `phase-2-stream-d`)
1. **`tests/ZB.MOM.WW.OtOpcUa.Tests/`** renamed to `tests/ZB.MOM.WW.OtOpcUa.Tests.v1Archive/`,
`<AssemblyName>` kept as `ZB.MOM.WW.OtOpcUa.Tests` so the v1 Host's `InternalsVisibleTo`
still matches, `<IsTestProject>false</IsTestProject>` so `dotnet test slnx` excludes it.
2. **Three other v1 projects archive-marked** with PropertyGroup comments:
`OtOpcUa.Host`, `Historian.Aveva`, `IntegrationTests`. `IntegrationTests` also gets
`<IsTestProject>false</IsTestProject>`.
3. **New `tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.E2E/`** project (.NET 10):
- `ParityFixture` spawns `OtOpcUa.Driver.Galaxy.Host.exe` (net48 x86) as subprocess via
`Process.Start`, connects via real named pipe, exposes a connected `GalaxyProxyDriver`.
Skips when Galaxy ZB unreachable, when Host EXE not built, or when running as
Administrator (PipeAcl denies admins).
- `RecordingAddressSpaceBuilder` captures Folder + Variable + Property registrations so
parity tests can assert shape.
- `HierarchyParityTests` (3) — Discover returns gobjects with attributes;
attribute full references match `tag.attribute` shape; HistoryExtension flag flows
through.
- `StabilityFindingsRegressionTests` (4) — one test per 2026-04-13 finding:
phantom-probe-doesn't-corrupt-status, host-status-event-is-scoped, all-async-no-sync-
over-async, AcknowledgeAsync-completes-before-returning.
4. **`docs/v2/V1_ARCHIVE_STATUS.md`** — inventory + deletion plan for PR 3.
5. **`docs/v2/implementation/exit-gate-phase-2-final.md`** (this doc) — supersedes the two
partial-exit docs.
## Test counts
**Solution-level `dotnet test ZB.MOM.WW.OtOpcUa.slnx`**: **470 pass / 7 skip / 1 baseline failure**.
| Project | Pass | Skip |
|---|---:|---:|
| Core.Abstractions.Tests | 24 | 0 |
| Configuration.Tests | 42 | 0 |
| Core.Tests | 4 | 0 |
| Server.Tests | 2 | 0 |
| Admin.Tests | 21 | 0 |
| Driver.Galaxy.Shared.Tests | 6 | 0 |
| Driver.Galaxy.Host.Tests | 30 | 0 |
| Driver.Galaxy.Proxy.Tests | 10 | 0 |
| **Driver.Galaxy.E2E (NEW)** | **0** | **7** (all skip with documented reason — admin shell) |
| Client.Shared.Tests | 131 | 0 |
| Client.UI.Tests | 98 | 0 |
| Client.CLI.Tests | 51 / 1 fail | 0 |
| Historian.Aveva.Tests | 41 | 0 |
**Excluded from solution run (run explicitly when needed)**:
- `OtOpcUa.Tests.v1Archive` — 494 pass (v1 unit tests, kept as parity reference)
- `OtOpcUa.IntegrationTests` — 6 pass (v1 integration tests, kept as parity reference)
## Adversarial review of the PR 2 diff
Independent pass over the PR 2 deltas. New findings ranked by severity; existing findings
from the previous exit-gate doc still apply.
### New findings
**Medium 1 — `IsTestProject=false` on `OtOpcUa.IntegrationTests` removes the safety net.**
The 6 v1 integration tests no longer run on solution test. *Mitigation:* the new E2E suite
covers the same scenarios in the v2 topology shape. *Risk:* if E2E test count regresses or
fails to cover a scenario, the v1 fallback isn't auto-checked. **Procedure**: PR 3
checklist includes "E2E test count covers v1 IntegrationTests' 6 scenarios at minimum".
**Medium 2 — Stability-finding regression tests #2, #3, #4 are structural (reflection-based)
not behavioral.** Findings #2 and #3 use type-shape assertions (event signature carries
HostName; methods return Task) rather than triggering the actual race. *Mitigation:* the v1
defects were structural — fixing them required interface changes that the type-shape
assertions catch. *Risk:* a future refactor that re-introduces sync-over-async via a non-
async helper called inside a Task method wouldn't trip the test. **Filed as v2.1**: add a
runtime async-call-stack analyzer (Roslyn or post-build).
**Low 1 — `ParityFixture` defaults to `OTOPCUA_GALAXY_BACKEND=db`** (not `mxaccess`).
Discover works against ZB without needing live MXAccess. The MXAccess-required tests will
need a second fixture once they're written.
**Low 2 — `Process.Start(EnvironmentVariables)` doesn't always inherit clean state.** The
test inherits the parent's PATH + locale, which is normally fine but could mask a missing
runtime dependency. *Mitigation:* in CI, pin a clean environment block.
### Existing findings (carried forward from `exit-gate-phase-2.md`)
All 8 still apply unchanged. Particularly:
- High 1 (MxAccess Read subscription-leak on cancellation) — open
- High 2 (no MXAccess reconnect loop, only supervisor-driven recycle) — open
- Medium 3 (SubscribeAsync doesn't push OnDataChange frames yet) — open
- Medium 4 (WriteValuesAsync doesn't await OnWriteComplete) — open
## Cross-cutting deferrals (out of Phase 2)
- **Deletion of v1 archive** — PR 3, gated on operator review + E2E coverage parity check
- **Wonderware Historian SDK plugin port** (`Historian.Aveva``Driver.Galaxy.Host/Backend/Historian/`) — Task B.1.h, opportunistically with PR 3 or as PR 4
- **MxAccess subscription push frames** — Task B.1.s, follow-up to enable real-time data
flow (currently subscribes register but values aren't pushed back)
- **Wonderware Historian-backed HistoryRead** — depends on B.1.h
- **Alarm subsystem wire-up** — `MxAccessGalaxyBackend.SubscribeAlarmsAsync` is a no-op
- **Reconnect-without-recycle** in MxAccessClient — v2.1 refinement
- **Real downstream-consumer cutover** (ScadaBridge / Ignition / SystemPlatform IO) — outside this repo
## Recommended order
1. **PR 1** (`phase-1-configuration``v2`) — merge first; self-contained, parity preserved
2. **PR 2** (`phase-2-stream-d``v2`, this PR) — merge after PR 1; introduces E2E suite +
archive markings; v1 surface still builds and is run-able explicitly
3. **PR 3** (next session) — delete v1 archive; depends on operator approval after PR 2
reviewer signoff
4. **PR 4** (Phase 2 follow-up) — Historian port + MxAccess subscription push frames + the
open high/medium findings

View File

@@ -0,0 +1,181 @@
# Phase 2 Exit Gate Record (2026-04-18)
> Supersedes `phase-2-partial-exit-evidence.md`. Captures the as-built state of Phase 2 after
> the MXAccess COM client port + DB-backed and MXAccess-backed Galaxy backends + adversarial
> review.
## Status: **Streams A, B, C complete. Stream D + E gated only on legacy-Host removal + parity-test rewrite.**
The Phase 2 plan exit criterion ("v1 IntegrationTests pass against v2 Galaxy.Proxy + Galaxy.Host
topology byte-for-byte") still cannot be auto-validated in a single session. The blocker is no
longer "the Galaxy code lift" — that's done in this session — but the structural fact that the
494 v1 IntegrationTests instantiate v1 `OtOpcUa.Host` classes directly. They have to be rewritten
to use the IPC-fronted Proxy topology before legacy `OtOpcUa.Host` can be deleted, and the plan
budgets that work as a multi-day debug-cycle (Task E.1).
What changed today: the MXAccess COM client now exists in Galaxy.Host with a real
`ArchestrA.MxAccess.dll` reference, runs end-to-end against live `LMXProxyServer`, and 3 live
COM smoke tests pass on this dev box. `MxAccessGalaxyBackend` (the third
`IGalaxyBackend` implementation, alongside `StubGalaxyBackend` and `DbBackedGalaxyBackend`)
combines the ported `GalaxyRepository` with the ported `MxAccessClient` so Discover / Read /
Write / Subscribe all flow through one production-shape backend. `Program.cs` selects between
the three backends via the `OTOPCUA_GALAXY_BACKEND` env var (default = `mxaccess`).
## Delivered in Phase 2 (full scope, not just scaffolds)
### Stream A — Driver.Galaxy.Shared (✅ complete)
- 9 contract files: Hello/HelloAck (version negotiation), OpenSession/CloseSession/Heartbeat,
Discover + GalaxyObjectInfo + GalaxyAttributeInfo, Read/Write + GalaxyDataValue,
Subscribe/Unsubscribe/OnDataChange, AlarmSubscribe/Event/Ack, HistoryRead, HostConnectivityStatus,
Recycle.
- Length-prefixed framing (4-byte BE length + 1-byte kind + MessagePack body) with a
16 MiB cap.
- Thread-safe `FrameWriter` (semaphore-gated) and single-consumer `FrameReader`.
- 6 round-trip tests + reflection-scan that asserts contracts only reference BCL + MessagePack.
### Stream B — Driver.Galaxy.Host (✅ complete, exceeded original scope)
- Real Win32 message pump in `StaPump``GetMessage`/`PostThreadMessage`/`PeekMessage`/
`PostQuitMessage` P/Invoke, dedicated STA thread, `WM_APP=0x8000` work dispatch, `WM_APP+1`
graceful-drain → `PostQuitMessage`, 5s join-on-dispose, responsiveness probe.
- Strict `PipeAcl` (allow configured server SID only, deny LocalSystem + Administrators),
`PipeServer` with caller-SID verification + per-process shared-secret `Hello` handshake.
- Galaxy-specific `MemoryWatchdog` (warn `max(1.5×baseline, +200 MB)`, soft-recycle
`max(2×baseline, +200 MB)`, hard ceiling 1.5 GB, slope ≥5 MB/min over 30-min window).
- `RecyclePolicy` (1/hr cap + 03:00 daily scheduled), `PostMortemMmf` (1000-entry ring
buffer, hard-crash survivable, cross-process readable), `MxAccessHandle : SafeHandle`.
- `IGalaxyBackend` interface + 3 implementations:
- **`StubGalaxyBackend`** — keeps IPC end-to-end testable without Galaxy.
- **`DbBackedGalaxyBackend`** — real Discover via the ported `GalaxyRepository` against ZB.
- **`MxAccessGalaxyBackend`** — Discover via DB + Read/Write/Subscribe via the ported
`MxAccessClient` over the StaPump.
- `GalaxyRepository` ported from v1 (HierarchySql + AttributesSql byte-for-byte identical).
- `MxAccessClient` ported from v1 (Connect/Read/Write/Subscribe/Unsubscribe + ConcurrentDict
handle tracking + OnDataChange / OnWriteComplete event marshalling). The reconnect loop +
Historian plugin loader + extended-attribute query are explicit follow-ups.
- `MxProxyAdapter` + `IMxProxy` for COM-isolation testability.
- `Program.cs` env-driven backend selection (`OTOPCUA_GALAXY_BACKEND=stub|db|mxaccess`,
`OTOPCUA_GALAXY_ZB_CONN`, `OTOPCUA_GALAXY_CLIENT_NAME`, plus the Phase 2 baseline
`OTOPCUA_GALAXY_PIPE` / `OTOPCUA_ALLOWED_SID` / `OTOPCUA_GALAXY_SECRET`).
- ArchestrA.MxAccess.dll referenced via HintPath at `lib/ArchestrA.MxAccess.dll`. Project
flipped to **x86 platform target** (the COM interop requires it).
### Stream C — Driver.Galaxy.Proxy (✅ complete)
- `GalaxyProxyDriver` implements **all 9** capability interfaces — `IDriver`, `ITagDiscovery`,
`IReadable`, `IWritable`, `ISubscribable`, `IAlarmSource`, `IHistoryProvider`,
`IRediscoverable`, `IHostConnectivityProbe` — each forwarding through the matching IPC
contract.
- `GalaxyIpcClient` with `CallAsync` (request/response gated through a semaphore so concurrent
callers don't interleave frames) + `SendOneWayAsync` for fire-and-forget calls
(Unsubscribe / AlarmAck / CloseSession).
- `Backoff` (5s → 15s → 60s, capped, reset-on-stable-run), `CircuitBreaker` (3 crashes per
5 min opens; 1h → 4h → manual escalation; sticky alert), `HeartbeatMonitor` (2s cadence,
3 misses = host dead).
### Tests
- **963 pass / 1 pre-existing baseline** across the full solution.
- New in this session:
- `StaPumpTests` — pump still passes 3/3 against the real Win32 implementation
- `EndToEndIpcTests` (5) — every IPC operation through Pipe + dispatcher + StubBackend
- `IpcHandshakeIntegrationTests` (2) — Hello + heartbeat + secret rejection
- `GalaxyRepositoryLiveSmokeTests` (5) — live SQL against ZB, skip when ZB unreachable
- `MxAccessLiveSmokeTests` (3) — live COM against running `aaBootstrap` + `LMXProxyServer`
- All net48 x86 to match Galaxy.Host
## Adversarial review findings
Independent pass over the Phase 2 deltas. Findings ranked by severity; **all open items are
explicitly deferred to Stream D/E or v2.1 with rationale.**
### Critical — none.
### High
1. **MxAccess `ReadAsync` has a subscription-leak window on cancellation.** The one-shot read
uses subscribe → first-OnDataChange → unsubscribe. If the caller cancels between the
`SubscribeOnPumpAsync` await and the `tcs.Task` await, the subscription stays installed.
*Mitigation:* the StaPump's idempotent unsubscribe path drops orphan subs at disconnect, but
a long-running session leaks them. **Fix scoped to Phase 2 follow-up** alongside the proper
subscription registry that v1 had.
2. **No reconnect loop on the MXAccess COM connection.** v1's `MxAccessClient.Monitor` polled
a probe tag and triggered reconnect-with-replay on disconnection. The ported client's
`ConnectAsync` is one-shot and there's no health monitor. *Mitigation:* the Tier C
supervisor on the Proxy side (CircuitBreaker + HeartbeatMonitor) restarts the whole Host
process on liveness failure, so connection loss surfaces as a process recycle rather than
silent data loss. **Reconnect-without-recycle is a v2.1 refinement** per `driver-stability.md`.
### Medium
3. **`MxAccessGalaxyBackend.SubscribeAsync` doesn't push OnDataChange frames back to the
Proxy.** The wire frame `MessageKind.OnDataChangeNotification` is defined and `GalaxyProxyDriver`
has the `RaiseDataChange` internal entry point, but the Host-side push pipeline isn't wired —
the subscribe registers on the COM side but the value just gets discarded. *Mitigation:* the
SubscribeAsync handle is still useful for the ack flow, and one-shot reads work. **Push
plumbing is the next-session item.**
4. **`WriteValuesAsync` doesn't await the OnWriteComplete callback.** v1's implementation
awaited a TCS keyed on the item handle; the port fires the write and returns success without
confirming the runtime accepted it. *Mitigation:* the StatusCode in the response will be 0
(Good) for a fire-and-forget — false positive if the runtime rejects post-callback. **Fix
needs the same TCS-by-handle pattern as v1; queued.**
5. **`MxAccessGalaxyBackend.Discover` re-queries SQL on every call.** v1 cached the tree and
only refreshed on the deploy-watermark change. *Mitigation:* AttributesSql is the slow one
(~30s for a large Galaxy); first-call latency is the symptom, not data loss. **Caching +
`IRediscoverable` push is a v2.1 follow-up.**
### Low
6. **Live MXAccess test `Backend_ReadValues_against_discovered_attribute_returns_a_response_shape`
silently passes if no readable attribute is found.** Documented; the test asserts the *shape*
not the *value* because some Galaxy installs are configuration-only.
7. **`FrameWriter` allocates the length-prefix as a 4-byte heap array per call.** Could be
stackalloc. Microbenchmark not done — currently irrelevant.
8. **`MxProxyAdapter.Unregister` swallows exceptions during `Unregister(handle)`.** v1 did the
same; documented as best-effort during teardown. Consider logging the swallow.
### Out of scope (correctly deferred)
- Stream D.1 — delete legacy `OtOpcUa.Host`. **Cannot be done in any single session** because
the 494 v1 IntegrationTests reference Host classes directly. Requires the test rewrite cycle
in Stream E.
- Stream E.1 — run v1 IntegrationTests against v2 topology. Requires (a) test rewrite to use
Proxy/Host instead of in-process Host classes, then (b) the parity-debug iteration that the
plan budgets 3-4 weeks for.
- Stream E.2 — Client.CLI walkthrough diff. Requires the v1 baseline capture.
- Stream E.3 — four 2026-04-13 stability findings regression tests. Requires the parity test
harness from Stream E.1.
- Wonderware Historian SDK plugin loader (Task B.1.h). HistoryRead returns a recognisable
error until the plugin loader is wired.
- Alarm subsystem wire-up (`MxAccessGalaxyBackend.SubscribeAlarmsAsync` is a no-op today).
v1's alarm tracking is its own subtree; queued as Phase 2 follow-up.
## Stream-D removal checklist (next session)
1. Decide policy on the 494 v1 tests:
- **Option A**: rewrite to use `Driver.Galaxy.Proxy` + `Driver.Galaxy.Host` topology
(multi-day; full parity validation as a side effect)
- **Option B**: archive them as `OtOpcUa.Tests.v1Archive` and write a smaller v2 parity suite
against the new topology (faster; less coverage initially)
2. Execute the chosen option.
3. Delete `src/ZB.MOM.WW.OtOpcUa.Host/`, remove from `.slnx`.
4. Update Windows service installer to register two services
(`OtOpcUa` + `OtOpcUaGalaxyHost`) with the correct service-account SIDs.
5. Migration script for `appsettings.json` Galaxy sections → `DriverInstance.DriverConfig` JSON.
6. PR + adversarial review + `exit-gate-phase-2-final.md`.
## What ships from this session
Eight commits on `phase-1-configuration` since the previous push:
- `01fd90c` Phase 1 finish + Phase 2 scaffold
- `7a5b535` Admin UI core
- `18f93d7` LDAP + SignalR
- `a1e9ed4` AVEVA-stack inventory doc
- `32eeeb9` Phase 2 A+B+C feature-complete
- `549cd36` GalaxyRepository ported + DbBackedBackend + live ZB smoke
- `(this commit)` MXAccess COM port + MxAccessGalaxyBackend + live MXAccess smoke + adversarial review
`494/494` v1 tests still pass. No regressions.

View File

@@ -4,14 +4,53 @@
> deferred. See `phase-2-galaxy-out-of-process.md` for the full task plan; this is the as-built
> delta.
## Status: **Streams A + B + C scaffolded and test-green. Streams D + E deferred.**
## Status: **Streams A + B + C complete (real Win32 pump, all 9 capability interfaces, end-to-end IPC dispatch). Streams D + E remain — gated only on the iterative Galaxy code lift + parity-debug cycle.**
The goal per the plan is "parity, not regression" — the phase exit gate requires v1
IntegrationTests to pass against the v2 Galaxy.Proxy + Galaxy.Host topology byte-for-byte.
Achieving that requires live MXAccess runtime plus the Galaxy code lift out of the legacy
`OtOpcUa.Host`. Both are operations that need a dev Galaxy up and a parity test cycle to verify.
Without that cycle, deleting the legacy Host would break the 494 passing v1 tests that are the
parity baseline.
`OtOpcUa.Host`. Without that cycle, deleting the legacy Host would break the 494 passing v1
tests that are the parity baseline.
> **Update 2026-04-17 (later) — Streams A/B/C now feature-complete, not just scaffolds.**
> The Win32 message pump in `StaPump` was upgraded from a `BlockingCollection` placeholder to a
> real `GetMessage`/`PostThreadMessage`/`PeekMessage` loop lifted from v1 `StaComThread` (P/Invoke
> declarations included; `WM_APP=0x8000` for work-item dispatch, `WM_APP+1` for graceful
> drain → `PostQuitMessage`, 5s join-on-dispose). `GalaxyProxyDriver` now implements every
> capability interface declared in Phase 2 Stream C — `IDriver`, `ITagDiscovery`, `IReadable`,
> `IWritable`, `ISubscribable`, `IAlarmSource`, `IHistoryProvider`, `IRediscoverable`,
> `IHostConnectivityProbe` — each forwarding through the matching IPC contract. `GalaxyIpcClient`
> gained `SendOneWayAsync` for the fire-and-forget calls (unsubscribe / alarm-ack /
> close-session) while still serializing through the call-gate so writes don't interleave with
> `CallAsync` round-trips. Host side: `IGalaxyBackend` interface defines the seam between IPC
> dispatch and the live MXAccess code, `GalaxyFrameHandler` routes every `MessageKind` into it
> (heartbeat handled inline so liveness works regardless of backend health), and
> `StubGalaxyBackend` returns success for lifecycle/subscribe/recycle and recognizable
> `not-implemented`-coded errors for data-plane calls. End-to-end integration tests exercise
> every capability through the full stack (handshake → open session → read / write / subscribe /
> alarm / history / recycle) and the v1 test baseline stays green (494 pass, no regressions).
>
> **What's left for the Phase 2 exit gate:** the actual Galaxy code lift (Task B.1) — replace
> `StubGalaxyBackend` with a `MxAccessClient`-backed implementation that calls `MxAccessClient`
> on the `StaPump`, plus the parity-cycle debugging against live Galaxy that the plan budgets
> 3-4 weeks for. Removing the legacy `OtOpcUa.Host` (Task D.1) follows once the parity tests
> are green against the v2 topology.
> **Update 2026-04-17 — runtime confirmed local.** The dev box has the full AVEVA stack required
> for the LmxOpcUa breakout: 27 ArchestrA / Wonderware / AVEVA services running including
> `aaBootstrap`, `aaGR` (Galaxy Repository), `aaLogger`, `aaUserValidator`, `aaPim`,
> `ArchestrADataStore`, `AsbServiceManager`; the full Historian set
> (`aahClientAccessPoint`, `aahGateway`, `aahInSight`, `aahSearchIndexer`, `InSQLStorage`,
> `InSQLConfiguration`, `InSQLEventSystem`, `InSQLIndexing`, `InSQLIOServer`,
> `HistorianSearch-x64`); SuiteLink (`slssvc`); MXAccess COM at
> `C:\Program Files (x86)\ArchestrA\Framework\bin\ArchestrA.MXAccess.dll`; and the OI-Gateway
> install at `C:\Program Files (x86)\Wonderware\OI-Server\OI-Gateway\` (so the
> AppServer-via-OI-Gateway smoke test from decision #142 is *also* runnable here, not blocked
> on a dedicated AVEVA test box).
>
> The "needs a dev Galaxy" prerequisite is therefore satisfied. Stream D + E can start whenever
> the team is ready to take the parity-cycle hit on the 494 v1 tests; no environmental blocker
> remains.
What *is* done: all scaffolding, IPC contracts, supervisor logic, and stability protections
needed to hang the real MXAccess code onto. Every piece has unit-level or IPC-level test
@@ -151,13 +190,20 @@ Requires live MXAccess + Galaxy runtime and the above lift complete. Work items:
## Next-session checklist for Stream D + E
1. Stand up dev Galaxy; capture Client.CLI walkthrough baseline against v1.
2. Move Galaxy-specific files from `OtOpcUa.Host` into `Driver.Galaxy.Host`, renaming
1. Verify the local AVEVA stack is still green (`Get-Service aaGR, aaBootstrap, slssvc` →
Running) and the Galaxy `ZB` repository is reachable from `sqlcmd -S localhost -d ZB -E`.
The runtime is already on this machine — no install step needed.
2. Capture Client.CLI walkthrough baseline against v1 (the parity reference).
3. Move Galaxy-specific files from `OtOpcUa.Host` into `Driver.Galaxy.Host`, renaming
namespaces. Replace `StubFrameHandler` with the real one.
3. Wire up the real Win32 pump inside `StaPump` (lift from scadalink-design's
4. Wire up the real Win32 pump inside `StaPump` (lift from scadalink-design's
`LmxProxy.Host` reference per CLAUDE.md).
4. Run v1 IntegrationTests against the v2 topology — iterate on parity defects until green.
5. Run Client.CLI walkthrough and diff.
6. Regression tests for the four stability findings.
7. Delete legacy `OtOpcUa.Host`; update `.slnx`; update installer scripts.
8. Adversarial review; `exit-gate-phase-2.md` recorded; PR merged.
5. Run v1 IntegrationTests against the v2 topology — iterate on parity defects until green.
6. Run Client.CLI walkthrough and diff.
7. Regression tests for the four 2026-04-13 stability findings.
8. Delete legacy `OtOpcUa.Host`; update `.slnx`; update installer scripts.
9. Optional but valuable now that the runtime is local: AppServer-via-OI-Gateway smoke test
(decision #142 / Phase 1 Task E.10) — the OI-Gateway install at
`C:\Program Files (x86)\Wonderware\OI-Server\OI-Gateway\` is in place; the test was deferred
for "needs live AVEVA runtime" reasons that no longer apply on this dev box.
10. Adversarial review; `exit-gate-phase-2.md` recorded; PR merged.

View File

@@ -0,0 +1,80 @@
# PR 1 — Phase 1 + Phase 2 A/B/C → v2
**Source**: `phase-1-configuration` (commits `980ea51..7403b92`, 11 commits)
**Target**: `v2`
**URL**: https://gitea.dohertylan.com/dohertj2/lmxopcua/pulls/new/phase-1-configuration
## Summary
- **Phase 1 complete** — Configuration project with 16 entities + 3 EF migrations
(InitialSchema + 8 stored procs + AuthorizationGrants), Core + Server + full Admin UI
(Blazor Server with cluster CRUD, draft → diff → publish → rollback, equipment with
OPC 40010, UNS, namespaces, drivers, ACLs, reservations, audit), LDAP via GLAuth
(`localhost:3893`), SignalR real-time fleet status + alerts.
- **Phase 2 Streams A + B + C feature-complete** — full IPC contract surface
(Galaxy.Shared, netstandard2.0, MessagePack), Galaxy.Host with real Win32 STA pump,
ACL + caller-SID + per-process-secret IPC, Galaxy-specific MemoryWatchdog +
RecyclePolicy + PostMortemMmf + MxAccessHandle, three `IGalaxyBackend`
implementations (Stub / DbBacked / **MxAccess** — real ArchestrA.MxAccess.dll
reference, x86, smoke-tested live against `LMXProxyServer`), Galaxy.Proxy with all
9 capability interfaces (`IDriver` / `ITagDiscovery` / `IReadable` / `IWritable` /
`ISubscribable` / `IAlarmSource` / `IHistoryProvider` / `IRediscoverable` /
`IHostConnectivityProbe`) + supervisor (Backoff + CircuitBreaker +
HeartbeatMonitor).
- **Phase 2 Stream D non-destructive deliverables** — appsettings.json → DriverConfig
migration script, two-service Windows installer scripts, process-spawn cross-FX
parity test, Stream D removal procedure doc with both Option A (rewrite 494 v1
tests) and Option B (archive + new v2 E2E suite) spelled out step-by-step.
## What's NOT in this PR
- Legacy `OtOpcUa.Host` deletion (Stream D.1) — reserved for a follow-up PR after
Option B's E2E suite is green. The 494 v1 tests still pass against the unchanged
legacy Host.
- Live-Galaxy parity validation (Stream E) — needs the iterative debug cycle the
removal-procedure doc describes.
## Tests
**964 pass / 1 pre-existing Phase 0 baseline failure**, across 14 test projects:
| Project | Pass | Notes |
|---|---:|---|
| Core.Abstractions.Tests | 24 | |
| Configuration.Tests | 42 | incl. 7 schema compliance, 8 stored-proc, 3 SQL-role auth, 13 validator, 6 LiteDB cache, 5 generation-applier |
| Core.Tests | 4 | DriverHost lifecycle |
| Server.Tests | 2 | NodeBootstrap + LiteDB cache fallback |
| Admin.Tests | 21 | incl. 5 RoleMapper, 6 LdapAuth, 3 LiveLdap, 2 FleetStatusPoller, 2 services-integration |
| Driver.Galaxy.Shared.Tests | 6 | Round-trip + framing |
| Driver.Galaxy.Host.Tests | 30 | incl. 5 GalaxyRepository live ZB, 3 live MXAccess COM, 5 EndToEndIpc, 2 IpcHandshake, 4 MemoryWatchdog, 3 RecyclePolicy, 3 PostMortemMmf, 3 StaPump, 2 service-installer dry-run |
| Driver.Galaxy.Proxy.Tests | 10 | 9 unit + 1 process-spawn parity |
| Client.Shared.Tests | 131 | unchanged |
| Client.UI.Tests | 98 | unchanged |
| Client.CLI.Tests | 51 / 1 fail | pre-existing baseline failure |
| Historian.Aveva.Tests | 41 | unchanged |
| IntegrationTests (net48) | 6 | unchanged — v1 parity baseline |
| **OtOpcUa.Tests (net48)** | **494** | **unchanged — v1 parity baseline** |
## Test plan for reviewers
- [ ] `dotnet build ZB.MOM.WW.OtOpcUa.slnx` succeeds with no warnings beyond the
known NuGetAuditSuppress + xUnit1051 warnings
- [ ] `dotnet test ZB.MOM.WW.OtOpcUa.slnx` shows the same 964/1 result
- [ ] `Get-Service aaGR, aaBootstrap` reports Running on the merger's box
- [ ] `docker ps --filter name=otopcua-mssql` shows the SQL container Up
- [ ] Admin UI boots (`dotnet run --project src/ZB.MOM.WW.OtOpcUa.Admin`); home page
renders at http://localhost:5123/; LDAP sign-in with GLAuth `readonly` /
`readonly123` succeeds
- [ ] Migration script dry-run: `powershell -File
scripts/migration/Migrate-AppSettings-To-DriverConfig.ps1 -DryRun` produces
a well-formed DriverConfig JSON
- [ ] Spot-read three commit messages to confirm the deferred-with-rationale items
are explicitly documented (`549cd36`, `a7126ba`, `7403b92` are the most
recent and most detailed)
## Follow-up tracking
PR 2 (next session) will execute Stream D Option B — archive `OtOpcUa.Tests` as
`OtOpcUa.Tests.v1Archive`, build the new `OtOpcUa.Driver.Galaxy.E2E` test project,
delete legacy `OtOpcUa.Host`, and run the parity-validation cycle. See
`docs/v2/implementation/stream-d-removal-procedure.md`.

View File

@@ -0,0 +1,69 @@
# PR 2 — Phase 2 Stream D Option B (archive v1 + E2E suite) → v2
**Source**: `phase-2-stream-d` (branched from `phase-1-configuration`)
**Target**: `v2`
**URL** (after push): https://gitea.dohertylan.com/dohertj2/lmxopcua/pulls/new/phase-2-stream-d
## Summary
Phase 2 Stream D Option B per `docs/v2/implementation/stream-d-removal-procedure.md`:
- **Archived the v1 surface** without deleting:
- `tests/ZB.MOM.WW.OtOpcUa.Tests/``tests/ZB.MOM.WW.OtOpcUa.Tests.v1Archive/`
(`<AssemblyName>` kept as `ZB.MOM.WW.OtOpcUa.Tests` so v1 Host's `InternalsVisibleTo`
still matches; `<IsTestProject>false</IsTestProject>` so solution test runs skip it).
- `tests/ZB.MOM.WW.OtOpcUa.IntegrationTests/``<IsTestProject>false</IsTestProject>`
+ archive comment.
- `src/ZB.MOM.WW.OtOpcUa.Host/` + `src/ZB.MOM.WW.OtOpcUa.Historian.Aveva/` — archive
PropertyGroup comments. Both still build (Historian plugin + 41 historian tests still
pass) so Phase 2 PR 3 can delete them in a focused, reviewable destructive change.
- **New `tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.E2E/`** test project (.NET 10):
- `ParityFixture` spawns `OtOpcUa.Driver.Galaxy.Host.exe` (net48 x86) as a subprocess via
`Process.Start`, connects via real named pipe, exposes a connected `GalaxyProxyDriver`.
Skips when Galaxy ZB unreachable / Host EXE not built / Administrator shell.
- `HierarchyParityTests` (3) and `StabilityFindingsRegressionTests` (4) — one test per
2026-04-13 stability finding (phantom probe, cross-host quality clear, sync-over-async,
fire-and-forget alarm shutdown race).
- **`docs/v2/V1_ARCHIVE_STATUS.md`** — inventory + deletion plan for PR 3.
- **`docs/v2/implementation/exit-gate-phase-2-final.md`** — supersedes the two partial-exit
docs with the as-built state, adversarial review of PR 2 deltas (4 new findings), and the
recommended PR sequence (1 → 2 → 3 → 4).
## What's NOT in this PR
- Deletion of the v1 archive — saved for PR 3 with explicit operator review (destructive change).
- Wonderware Historian SDK plugin port — Task B.1.h, follow-up to enable real `HistoryRead`.
- MxAccess subscription push-frames — Task B.1.s, follow-up to enable real-time
data-change push from Host → Proxy.
## Tests
**`dotnet test ZB.MOM.WW.OtOpcUa.slnx`**: **470 pass / 7 skip / 1 pre-existing baseline**.
The 7 skips are the new E2E tests, all skipping with the documented reason
"PipeAcl denies Administrators on dev shells" — the production install runs as a non-admin
service account and these tests will execute there.
Run the archived v1 suites explicitly:
```powershell
dotnet test tests/ZB.MOM.WW.OtOpcUa.Tests.v1Archive # → 494 pass
dotnet test tests/ZB.MOM.WW.OtOpcUa.IntegrationTests # → 6 pass
```
## Test plan for reviewers
- [ ] `dotnet build ZB.MOM.WW.OtOpcUa.slnx` succeeds with no warnings beyond the known
NuGetAuditSuppress + NU1702 cross-FX
- [ ] `dotnet test ZB.MOM.WW.OtOpcUa.slnx` shows the 470/7-skip/1-baseline result
- [ ] Both archived suites pass when run explicitly
- [ ] Build the Galaxy.Host EXE (`dotnet build src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host`),
then run E2E tests on a non-admin shell — they should actually execute and pass
against live Galaxy ZB
- [ ] Spot-read `docs/v2/V1_ARCHIVE_STATUS.md` and confirm the deletion plan is acceptable
## Follow-up tracking
- **PR 3** (next session, when ready): execute the deletion plan in `V1_ARCHIVE_STATUS.md`.
4 projects removed, .slnx updated, full solution test confirms parity.
- **PR 4** (Phase 2 follow-up): port Historian plugin + wire MxAccess subscription pushes +
close the high/medium open findings from `exit-gate-phase-2-final.md`.

View File

@@ -0,0 +1,91 @@
# PR 4 — Phase 2 follow-up: close the 4 open MXAccess findings
**Source**: `phase-2-pr4-findings` (branched from `phase-2-stream-d`)
**Target**: `v2`
## Summary
Closes the 4 high/medium open findings carried forward in `exit-gate-phase-2-final.md`:
- **High 1 — `ReadAsync` subscription-leak on cancel.** One-shot read now wraps the
subscribe→first-OnDataChange→unsubscribe pattern in a `try/finally` so the per-tag
callback is always detached, and if the read installed the underlying MXAccess
subscription itself (no other caller had it), it tears it down on the way out.
- **High 2 — No reconnect loop on the MXAccess COM connection.** New
`MxAccessClientOptions { AutoReconnect, MonitorInterval, StaleThreshold }` + a background
`MonitorLoopAsync` that watches a stale-activity threshold + probes the proxy via a
no-op COM call, then reconnects-with-replay (re-Register, re-AddItem every active
subscription) when the proxy is dead. Liveness signal: every `OnDataChange` callback bumps
`_lastObservedActivityUtc`. Defaults match v1 monitor cadence (5s poll, 60s stale).
`ReconnectCount` exposed for diagnostics; `ConnectionStateChanged` event for downstream
consumers (the supervisor on the Proxy side already surfaces this through its
HeartbeatMonitor, but the Host-side event lets local logging/metrics hook in).
- **Medium 3 — `MxAccessGalaxyBackend.SubscribeAsync` doesn't push OnDataChange frames back to
the Proxy.** New `IGalaxyBackend.OnDataChange` / `OnAlarmEvent` / `OnHostStatusChanged`
events that the new `GalaxyFrameHandler.AttachConnection` subscribes per-connection and
forwards as outbound `OnDataChangeNotification` / `AlarmEvent` /
`RuntimeStatusChange` frames through the connection's `FrameWriter`. `MxAccessGalaxyBackend`
fans out per-tag value changes to every `SubscriptionId` that's listening to that tag
(multiple Proxy subs may share a Galaxy attribute — single COM subscription, multi-fan-out
on the wire). Stub + DbBacked backends declare the events with `#pragma warning disable
CS0067` (treat-warnings-as-errors would otherwise fail on never-raised events that exist
only to satisfy the interface).
- **Medium 4 — `WriteValuesAsync` doesn't await `OnWriteComplete`.** New
`WriteAsync(...)` overload returns `bool` after awaiting the OnWriteComplete callback via
the v1-style `TaskCompletionSource`-keyed-by-item-handle pattern in `_pendingWrites`.
`MxAccessGalaxyBackend.WriteValuesAsync` now reports per-tag `Bad_InternalError` when the
runtime rejected the write, instead of false-positive `Good`.
## Pipe server change
`IFrameHandler` gains `AttachConnection(FrameWriter writer): IDisposable` so the handler can
register backend event sinks on each accepted connection and detach them at disconnect. The
`PipeServer.RunOneConnectionAsync` calls it after the Hello handshake and disposes it in the
finally of the per-connection scope. `StubFrameHandler` returns `IFrameHandler.NoopAttachment.Instance`
(net48 doesn't support default interface methods, so the empty-attach lives as a public nested
class).
## Tests
**`dotnet test ZB.MOM.WW.OtOpcUa.slnx`**: **460 pass / 7 skip (E2E on admin shell) / 1
pre-existing baseline failure**. No regressions. The Driver.Galaxy.Host unit tests + 5 live
ZB smoke + 3 live MXAccess COM smoke all pass unchanged.
## Test plan for reviewers
- [ ] `dotnet build` clean
- [ ] `dotnet test` shows 460/7-skip/1-baseline
- [ ] Spot-check `MxAccessClient.MonitorLoopAsync` against v1's `MxAccessClient.Monitor`
partial (`src/ZB.MOM.WW.OtOpcUa.Host/MxAccess/MxAccessClient.Monitor.cs`) — same
polling cadence, same probe-then-reconnect-with-replay shape
- [ ] Read `GalaxyFrameHandler.ConnectionSink.Dispose` and confirm event handlers are
detached on connection close (no leaked invocation list refs)
- [ ] `WriteValuesAsync` returning `Bad_InternalError` on a runtime-rejected write is the
correct shape — confirm against the v1 `MxAccessClient.ReadWrite.cs` pattern
## What's NOT in this PR
- Wonderware Historian SDK plugin port (Task B.1.h) — separate PR, larger scope.
- Alarm subsystem wire-up (`MxAccessGalaxyBackend.SubscribeAlarmsAsync` is still a no-op).
`OnAlarmEvent` is declared on the backend interface and pushed by the frame handler when
raised; `MxAccessGalaxyBackend` just doesn't raise it yet (waits for the alarm-tracking
port from v1's `AlarmObjectFilter` + Galaxy alarm primitives).
- Host-status push (`OnHostStatusChanged`) — declared on the interface and pushed by the
frame handler; `MxAccessGalaxyBackend` doesn't raise it (the Galaxy.Host's
`HostConnectivityProbe` from v1 needs porting too, scoped under the Historian PR).
## Adversarial review
Quick pass over the PR 4 deltas. No new findings beyond:
- **Low 1** — `MonitorLoopAsync`'s `$Heartbeat` probe item-handle is leaked
(`AddItem` succeeds, never `RemoveItem`'d). Cosmetic — the probe item is internal to
the COM connection, dies with `Unregister` at disconnect/recycle. Worth a follow-up
to call `RemoveItem` after the probe succeeds.
- **Low 2** — Replay loop in `MonitorLoopAsync` swallows per-subscription failures. If
Galaxy permanently rejects a previously-valid reference (rare but possible after a
re-deploy), the user gets silent data loss for that one subscription. The stub-handler-
unaware operator wouldn't notice. Worth surfacing as a `ConnectionStateChanged(false)
→ ConnectionStateChanged(true)` payload that includes the replay-failures list.
Both are low-priority follow-ups, not PR 4 blockers.

View File

@@ -0,0 +1,103 @@
# Stream D — Legacy `OtOpcUa.Host` Removal Procedure
> Sequenced playbook for the next session that takes Phase 2 to its full exit gate.
> All Stream A/B/C work is committed. The blocker is structural: the 494 v1
> `OtOpcUa.Tests` instantiate v1 `Host` classes directly, so they must be
> retargeted (or archived) before the Host project can be deleted.
## Decision: Option A or Option B
### Option A — Rewrite the 494 v1 tests to use v2 topology
**Effort**: 3-5 days. Highest fidelity (full v1 test coverage carries forward).
**Steps**:
1. Build a `ProxyMxAccessClientAdapter` in a new `OtOpcUa.LegacyTestCompat/` project that
implements v1's `IMxAccessClient` by forwarding to `Driver.Galaxy.Proxy.GalaxyProxyDriver`.
Maps v1 `Vtq` ↔ v2 `DataValueSnapshot`, v1 `Quality` enum ↔ v2 `StatusCode` u32, the v1
`OnTagValueChanged` event ↔ v2 `ISubscribable.OnDataChange`.
2. Same idea for `IGalaxyRepository` — adapter that wraps v2's `Backend.Galaxy.GalaxyRepository`.
3. Replace `MxAccessClient` constructions in `OtOpcUa.Tests` test fixtures with the adapter.
Most tests use a single fixture so the change-set is concentrated.
4. For each test class: run; iterate on parity defects until green. Expected defect families:
timing-sensitive assertions (IPC adds ~5ms latency; widen tolerances), Quality enum vs
StatusCode mismatches, value-byte-encoding differences.
5. Once all 494 pass: proceed to deletion checklist below.
**When to pick A**: regulatory environments that need the full historical test suite green,
or when the v2 parity gate is itself a release-blocking artifact downstream consumers will
look for.
### Option B — Archive the 494 v1 tests, build a smaller v2 parity suite
**Effort**: 1-2 days. Faster to green; less coverage initially, accreted over time.
**Steps**:
1. Rename `tests/ZB.MOM.WW.OtOpcUa.Tests/``tests/ZB.MOM.WW.OtOpcUa.Tests.v1Archive/`.
Add `<IsTestProject>false</IsTestProject>` so CI doesn't run them; mark every class with
`[Trait("Category", "v1Archive")]` so a future operator can opt in via `--filter`.
2. New `tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.E2E/` project (.NET 10):
- `ParityFixture` spawns Galaxy.Host EXE per test class with `OTOPCUA_GALAXY_BACKEND=mxaccess`
pointing at the dev box's live Galaxy. Pattern from `HostSubprocessParityTests`.
- 10-20 representative tests covering the core paths: hierarchy shape, attribute count,
read-Manufacturer-Boolean, write-Operate-Float roundtrip, subscribe-receives-OnDataChange,
Bad-quality on disconnect, alarm-event-shape.
3. The four 2026-04-13 stability findings get individual regression tests in this project.
4. Once green: proceed to deletion checklist below.
**When to pick B**: typical dev velocity case. The v1 archive is reference, the new suite is
the live parity bar.
## Deletion checklist (after Option A or B is green)
Pre-conditions:
- [ ] Chosen-option test suite green (494 retargeted OR new E2E suite passing on this box)
- [ ] `phase-2-compliance.ps1` runs and exits 0
- [ ] `Get-Service aaGR, aaBootstrap` → Running
- [ ] `Driver.Galaxy.Host` x86 publish output verified at
`src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host/bin/Release/net48/`
- [ ] Migration script tested: `scripts/migration/Migrate-AppSettings-To-DriverConfig.ps1
-AppSettingsPath src/ZB.MOM.WW.OtOpcUa.Host/appsettings.json -DryRun` produces a
well-formed DriverConfig
- [ ] Service installer scripts dry-run on a test box: `scripts/install/Install-Services.ps1
-InstallRoot C:\OtOpcUa -ServiceAccount LOCALHOST\testuser` registers both services
and they start
Steps:
1. Delete `src/ZB.MOM.WW.OtOpcUa.Host/` (the legacy in-process Host project).
2. Edit `ZB.MOM.WW.OtOpcUa.slnx` — remove the legacy Host `<Project>` line; keep all v2
project lines.
3. Migrate the dev `appsettings.json` Galaxy sections to `DriverConfig` JSON via the
migration script; insert into the Configuration DB for the dev cluster's Galaxy driver
instance.
4. Run the chosen test suite once more — confirm zero regressions from the deletion.
5. Build full solution (`dotnet build ZB.MOM.WW.OtOpcUa.slnx`) — confirm clean build with
no references to the deleted project.
6. Commit:
`git rm -r src/ZB.MOM.WW.OtOpcUa.Host` followed by the slnx + cleanup edits in one
atomic commit titled "Phase 2 Stream D — retire legacy OtOpcUa.Host".
7. Run `/codex:adversarial-review --base v2` on the merged Phase 2 diff.
8. Record `exit-gate-phase-2-final.md` with: Option chosen, deletion-commit SHA, parity
test count + duration, adversarial-review findings (each closed or deferred with link).
9. Open PR against `v2`, link the exit-gate doc + compliance script output + parity report.
10. Merge after one reviewer signoff.
## Rollback
If Stream D causes downstream consumer failures (ScadaBridge / Ignition / SystemPlatform IO
clients seeing different OPC UA behavior), the rollback is `git revert` of the deletion
commit — the whole v2 codebase keeps Galaxy.Proxy + Galaxy.Host installed alongside the
restored legacy Host. Production can run either topology. `OtOpcUa.Driver.Galaxy.Proxy`
becomes dormant until the next attempt.
## Why this can't one-shot in an autonomous session
- The parity-defect debug cycle is intrinsically interactive: each iteration requires running
the test suite against live Galaxy, inspecting the diff, deciding if the difference is a
legitimate v2 improvement or a regression, then either widening the assertion or fixing the
v2 code. That decision-making is the bottleneck, not the typing.
- The legacy-Host deletion is destructive — needs explicit operator authorization on a real
PR review, not unattended automation.
- The downstream consumer cutover (ScadaBridge, Ignition, AppServer) lives outside this repo
and on an integration-team track; "Phase 2 done" inside this repo is a precondition, not
the full release.

View File

@@ -234,6 +234,8 @@ All of these stay in the Galaxy Host process (.NET 4.8 x86). The `GalaxyProxy` i
- Refactor is **incremental**: extract `IDriver` / `ISubscribable` / `ITagDiscovery` etc. against the existing `LmxNodeManager` first (still in-process on v2 branch), validate the system still runs, *then* move the implementation behind the IPC boundary into Galaxy.Host. Keeps the system runnable at each step and de-risks the out-of-process move.
- **Parity test**: run the existing v1 IntegrationTests suite against the v2 Galaxy driver (same Galaxy, same expectations) **plus** a scripted Client.CLI walkthrough (connect / browse / read / write / subscribe / history / alarms) on a dev Galaxy. Automated regression + human-observable behavior.
**Dev environment for the LmxOpcUa breakout:** the Phase 0/1 dev box (`DESKTOP-6JL3KKO`) hosts the full AVEVA stack required to execute Phase 2 Streams D + E — 27 ArchestrA / Wonderware / AVEVA services running including `aaBootstrap`, `aaGR` (Galaxy Repository), `aaLogger`, `aaUserValidator`, `aaPim`, `ArchestrADataStore`, `AsbServiceManager`; the full Historian set (`aahClientAccessPoint`, `aahGateway`, `aahInSight`, `aahSearchIndexer`, `InSQLStorage`, `InSQLConfiguration`, `InSQLEventSystem`, `InSQLIndexing`, `InSQLIOServer`, `HistorianSearch-x64`); SuiteLink (`slssvc`); MXAccess COM at `C:\Program Files (x86)\ArchestrA\Framework\bin\ArchestrA.MXAccess.dll`; and OI-Gateway at `C:\Program Files (x86)\Wonderware\OI-Server\OI-Gateway\` — so the Phase 1 Task E.10 AppServer-via-OI-Gateway smoke test (decision #142) is also runnable on the same box, no separate AVEVA test machine required. Inventory captured in `dev-environment.md`.
---
### 4. Configuration Model — Centralized MSSQL + Local Cache

View File

@@ -0,0 +1,102 @@
<#
.SYNOPSIS
Registers the two v2 Windows services on a node: OtOpcUa (main server, net10) and
OtOpcUaGalaxyHost (out-of-process Galaxy COM host, net48 x86).
.DESCRIPTION
Phase 2 Stream D.2 — replaces the v1 single-service install (TopShelf-based OtOpcUa.Host).
Installs both services with the correct service-account SID + per-process shared secret
provisioning per `driver-stability.md §"IPC Security"`. Galaxy.Host depends on OtOpcUa
(Galaxy.Host must be reachable when OtOpcUa starts; service dependency wiring + retry
handled by OtOpcUa.Server NodeBootstrap).
.PARAMETER InstallRoot
Where the binaries live (typically C:\Program Files\OtOpcUa).
.PARAMETER ServiceAccount
Service account SID or DOMAIN\name. Both services run under this account; the
Galaxy.Host pipe ACL only allows this SID to connect (decision #76).
.PARAMETER GalaxySharedSecret
Per-process secret passed to Galaxy.Host via env var. Generated freshly per install.
.PARAMETER ZbConnection
Galaxy ZB SQL connection string (passed to Galaxy.Host via env var).
.EXAMPLE
.\Install-Services.ps1 -InstallRoot 'C:\Program Files\OtOpcUa' -ServiceAccount 'OTOPCUA\svc-otopcua'
#>
[CmdletBinding()]
param(
[Parameter(Mandatory)] [string]$InstallRoot,
[Parameter(Mandatory)] [string]$ServiceAccount,
[string]$GalaxySharedSecret,
[string]$ZbConnection = 'Server=localhost;Database=ZB;Integrated Security=True;TrustServerCertificate=True;Encrypt=False;',
[string]$GalaxyClientName = 'OtOpcUa-Galaxy.Host',
[string]$GalaxyPipeName = 'OtOpcUaGalaxy'
)
$ErrorActionPreference = 'Stop'
if (-not (Test-Path "$InstallRoot\OtOpcUa.Server.exe")) {
Write-Error "OtOpcUa.Server.exe not found at $InstallRoot — copy the publish output first"
exit 1
}
if (-not (Test-Path "$InstallRoot\Galaxy\OtOpcUa.Driver.Galaxy.Host.exe")) {
Write-Error "OtOpcUa.Driver.Galaxy.Host.exe not found at $InstallRoot\Galaxy — copy the publish output first"
exit 1
}
# Generate a fresh shared secret per install if not supplied. Stored in DPAPI-protected file
# rather than the registry so the service account can read it but other local users cannot.
if (-not $GalaxySharedSecret) {
$bytes = New-Object byte[] 32
[System.Security.Cryptography.RandomNumberGenerator]::Create().GetBytes($bytes)
$GalaxySharedSecret = [Convert]::ToBase64String($bytes)
}
# Resolve the SID — the IPC ACL needs the SID, not the down-level name.
$sid = if ($ServiceAccount.StartsWith('S-1-')) {
$ServiceAccount
} else {
(New-Object System.Security.Principal.NTAccount $ServiceAccount).Translate([System.Security.Principal.SecurityIdentifier]).Value
}
# --- Install OtOpcUaGalaxyHost first (OtOpcUa starts after, depends on it being up).
$galaxyEnv = @(
"OTOPCUA_GALAXY_PIPE=$GalaxyPipeName"
"OTOPCUA_ALLOWED_SID=$sid"
"OTOPCUA_GALAXY_SECRET=$GalaxySharedSecret"
"OTOPCUA_GALAXY_BACKEND=mxaccess"
"OTOPCUA_GALAXY_ZB_CONN=$ZbConnection"
"OTOPCUA_GALAXY_CLIENT_NAME=$GalaxyClientName"
) -join "`0"
$galaxyEnv += "`0`0"
Write-Host "Installing OtOpcUaGalaxyHost..."
& sc.exe create OtOpcUaGalaxyHost binPath= "`"$InstallRoot\Galaxy\OtOpcUa.Driver.Galaxy.Host.exe`"" `
DisplayName= 'OtOpcUa Galaxy Host (out-of-process MXAccess)' `
start= auto `
obj= $ServiceAccount | Out-Null
# Set per-service environment variables via the registry — sc.exe doesn't expose them directly.
$svcKey = "HKLM:\SYSTEM\CurrentControlSet\Services\OtOpcUaGalaxyHost"
$envValue = $galaxyEnv.Split("`0") | Where-Object { $_ -ne '' }
Set-ItemProperty -Path $svcKey -Name 'Environment' -Type MultiString -Value $envValue
# --- Install OtOpcUa (depends on Galaxy host being installed; doesn't strictly require it
# started — OtOpcUa.Server NodeBootstrap retries on the IPC connect path).
Write-Host "Installing OtOpcUa..."
& sc.exe create OtOpcUa binPath= "`"$InstallRoot\OtOpcUa.Server.exe`"" `
DisplayName= 'OtOpcUa Server' `
start= auto `
depend= 'OtOpcUaGalaxyHost' `
obj= $ServiceAccount | Out-Null
Write-Host ""
Write-Host "Installed. Start with:"
Write-Host " sc.exe start OtOpcUaGalaxyHost"
Write-Host " sc.exe start OtOpcUa"
Write-Host ""
Write-Host "Galaxy shared secret (record this offline — required for service rebinding):"
Write-Host " $GalaxySharedSecret"

View File

@@ -0,0 +1,18 @@
<#
.SYNOPSIS
Stops + removes the two v2 services. Mirrors Install-Services.ps1.
#>
[CmdletBinding()] param()
$ErrorActionPreference = 'Continue'
foreach ($svc in 'OtOpcUa', 'OtOpcUaGalaxyHost') {
if (Get-Service $svc -ErrorAction SilentlyContinue) {
Write-Host "Stopping $svc..."
Stop-Service $svc -Force -ErrorAction SilentlyContinue
Write-Host "Removing $svc..."
& sc.exe delete $svc | Out-Null
} else {
Write-Host "$svc not installed — skipping"
}
}
Write-Host "Done."

View File

@@ -0,0 +1,107 @@
<#
.SYNOPSIS
Translates a v1 OtOpcUa.Host appsettings.json into a v2 DriverInstance.DriverConfig JSON
blob suitable for upserting into the central Configuration DB.
.DESCRIPTION
Phase 2 Stream D.3 — moves the legacy MxAccess + GalaxyRepository + Historian sections out
of node-local appsettings.json and into the central DB so each node only needs Cluster.NodeId
+ ClusterId + DB conn (per decision #18). Idempotent + dry-run-able.
Output shape matches the Galaxy DriverType schema in `docs/v2/plan.md` §"Galaxy DriverConfig":
{
"MxAccess": { "ClientName": "...", "RequestTimeoutSeconds": 30 },
"Database": { "ConnectionString": "...", "PollIntervalSeconds": 60 },
"Historian": { "Enabled": false }
}
.PARAMETER AppSettingsPath
Path to the v1 appsettings.json. Defaults to ../../src/ZB.MOM.WW.OtOpcUa.Host/appsettings.json
relative to the script.
.PARAMETER OutputPath
Where to write the generated DriverConfig JSON. Defaults to stdout.
.PARAMETER DryRun
Print what would be written without writing.
.EXAMPLE
pwsh ./Migrate-AppSettings-To-DriverConfig.ps1 -AppSettingsPath C:\OtOpcUa\appsettings.json -OutputPath C:\tmp\galaxy-driverconfig.json
#>
[CmdletBinding()]
param(
[string]$AppSettingsPath,
[string]$OutputPath,
[switch]$DryRun
)
$ErrorActionPreference = 'Stop'
if (-not $AppSettingsPath) {
$AppSettingsPath = Join-Path (Split-Path -Parent $PSScriptRoot) '..\src\ZB.MOM.WW.OtOpcUa.Host\appsettings.json'
}
if (-not (Test-Path $AppSettingsPath)) {
Write-Error "AppSettings file not found: $AppSettingsPath"
exit 1
}
$src = Get-Content -Raw $AppSettingsPath | ConvertFrom-Json
$mx = $src.MxAccess
$gr = $src.GalaxyRepository
$hi = $src.Historian
$driverConfig = [ordered]@{
MxAccess = [ordered]@{
ClientName = $mx.ClientName
NodeName = $mx.NodeName
GalaxyName = $mx.GalaxyName
RequestTimeoutSeconds = $mx.ReadTimeoutSeconds
WriteTimeoutSeconds = $mx.WriteTimeoutSeconds
MaxConcurrentOps = $mx.MaxConcurrentOperations
MonitorIntervalSec = $mx.MonitorIntervalSeconds
AutoReconnect = $mx.AutoReconnect
ProbeTag = $mx.ProbeTag
}
Database = [ordered]@{
ConnectionString = $gr.ConnectionString
ChangeDetectionIntervalSec = $gr.ChangeDetectionIntervalSeconds
CommandTimeoutSeconds = $gr.CommandTimeoutSeconds
ExtendedAttributes = $gr.ExtendedAttributes
Scope = $gr.Scope
PlatformName = $gr.PlatformName
}
Historian = [ordered]@{
Enabled = if ($null -ne $hi -and $null -ne $hi.Enabled) { $hi.Enabled } else { $false }
}
}
# Strip null-valued leaves so the resulting JSON is compact and round-trippable.
function Remove-Nulls($obj) {
$keys = @($obj.Keys)
foreach ($k in $keys) {
if ($null -eq $obj[$k]) { $obj.Remove($k) | Out-Null }
elseif ($obj[$k] -is [System.Collections.Specialized.OrderedDictionary]) { Remove-Nulls $obj[$k] }
}
}
Remove-Nulls $driverConfig
$json = $driverConfig | ConvertTo-Json -Depth 8
if ($DryRun) {
Write-Host "=== DriverConfig (dry-run, would write to $OutputPath) ==="
Write-Host $json
return
}
if ($OutputPath) {
$dir = Split-Path -Parent $OutputPath
if ($dir -and -not (Test-Path $dir)) { New-Item -ItemType Directory -Path $dir | Out-Null }
Set-Content -Path $OutputPath -Value $json -Encoding UTF8
Write-Host "Wrote DriverConfig to $OutputPath"
}
else {
$json
}

View File

@@ -19,10 +19,17 @@ namespace ZB.MOM.WW.OtOpcUa.Core.Abstractions;
/// <param name="ArrayDim">Declared array length when <see cref="IsArray"/> is true; null otherwise.</param>
/// <param name="SecurityClass">Write-authorization tier for this attribute.</param>
/// <param name="IsHistorized">True when this attribute is expected to feed historian / HistoryRead.</param>
/// <param name="IsAlarm">
/// True when this attribute represents an alarm condition (Galaxy: has an
/// <c>AlarmExtension</c> primitive). The generic node-manager enriches the variable with an
/// OPC UA <c>AlarmConditionState</c> when true. Defaults to false so existing non-Galaxy
/// drivers aren't forced to flow a flag they don't produce.
/// </param>
public sealed record DriverAttributeInfo(
string FullName,
DriverDataType DriverDataType,
bool IsArray,
uint? ArrayDim,
SecurityClassification SecurityClass,
bool IsHistorized);
bool IsHistorized,
bool IsAlarm = false);

View File

@@ -0,0 +1,188 @@
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.Galaxy;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared.Contracts;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend;
/// <summary>
/// Galaxy backend that uses the live <c>ZB</c> repository for <see cref="DiscoverAsync"/> —
/// real gobject hierarchy + attributes flow through to the Proxy without needing the MXAccess
/// COM client. Runtime data-plane calls (Read/Write/Subscribe/Alarm/History) still surface
/// as "MXAccess code lift pending" until the COM client port lands. This is the highest-value
/// intermediate state because Discover is what powers the OPC UA address-space build, so
/// downstream Proxy + parity tests can exercise the complete tree shape today.
/// </summary>
public sealed class DbBackedGalaxyBackend(GalaxyRepository repository) : IGalaxyBackend
{
private long _nextSessionId;
private long _nextSubscriptionId;
// DB-only backend doesn't have a runtime data plane; never raises events.
#pragma warning disable CS0067
public event System.EventHandler<OnDataChangeNotification>? OnDataChange;
public event System.EventHandler<GalaxyAlarmEvent>? OnAlarmEvent;
public event System.EventHandler<HostConnectivityStatus>? OnHostStatusChanged;
#pragma warning restore CS0067
public Task<OpenSessionResponse> OpenSessionAsync(OpenSessionRequest req, CancellationToken ct)
{
var id = Interlocked.Increment(ref _nextSessionId);
return Task.FromResult(new OpenSessionResponse { Success = true, SessionId = id });
}
public Task CloseSessionAsync(CloseSessionRequest req, CancellationToken ct) => Task.CompletedTask;
public async Task<DiscoverHierarchyResponse> DiscoverAsync(DiscoverHierarchyRequest req, CancellationToken ct)
{
try
{
var hierarchy = await repository.GetHierarchyAsync(ct).ConfigureAwait(false);
var attributes = await repository.GetAttributesAsync(ct).ConfigureAwait(false);
// Group attributes by their owning gobject for the IPC payload.
var attrsByGobject = attributes
.GroupBy(a => a.GobjectId)
.ToDictionary(g => g.Key, g => g.Select(MapAttribute).ToArray());
var parentByChild = hierarchy
.ToDictionary(o => o.GobjectId, o => o.ParentGobjectId);
var nameByGobject = hierarchy
.ToDictionary(o => o.GobjectId, o => o.TagName);
var objects = hierarchy.Select(o => new GalaxyObjectInfo
{
ContainedName = string.IsNullOrEmpty(o.ContainedName) ? o.TagName : o.ContainedName,
TagName = o.TagName,
ParentContainedName = parentByChild.TryGetValue(o.GobjectId, out var p)
&& p != 0
&& nameByGobject.TryGetValue(p, out var pName)
? pName
: null,
TemplateCategory = MapCategory(o.CategoryId),
Attributes = attrsByGobject.TryGetValue(o.GobjectId, out var a) ? a : System.Array.Empty<GalaxyAttributeInfo>(),
}).ToArray();
return new DiscoverHierarchyResponse { Success = true, Objects = objects };
}
catch (Exception ex) when (ex is System.Data.SqlClient.SqlException
or InvalidOperationException
or TimeoutException)
{
return new DiscoverHierarchyResponse
{
Success = false,
Error = $"Galaxy ZB repository error: {ex.Message}",
Objects = System.Array.Empty<GalaxyObjectInfo>(),
};
}
}
public Task<ReadValuesResponse> ReadValuesAsync(ReadValuesRequest req, CancellationToken ct)
=> Task.FromResult(new ReadValuesResponse
{
Success = false,
Error = "MXAccess code lift pending (Phase 2 Task B.1) — DB-backed backend covers Discover only",
Values = System.Array.Empty<GalaxyDataValue>(),
});
public Task<WriteValuesResponse> WriteValuesAsync(WriteValuesRequest req, CancellationToken ct)
{
var results = new WriteValueResult[req.Writes.Length];
for (var i = 0; i < req.Writes.Length; i++)
{
results[i] = new WriteValueResult
{
TagReference = req.Writes[i].TagReference,
StatusCode = 0x80020000u,
Error = "MXAccess code lift pending (Phase 2 Task B.1)",
};
}
return Task.FromResult(new WriteValuesResponse { Results = results });
}
public Task<SubscribeResponse> SubscribeAsync(SubscribeRequest req, CancellationToken ct)
{
var sid = Interlocked.Increment(ref _nextSubscriptionId);
return Task.FromResult(new SubscribeResponse
{
Success = true,
SubscriptionId = sid,
ActualIntervalMs = req.RequestedIntervalMs,
});
}
public Task UnsubscribeAsync(UnsubscribeRequest req, CancellationToken ct) => Task.CompletedTask;
public Task SubscribeAlarmsAsync(AlarmSubscribeRequest req, CancellationToken ct) => Task.CompletedTask;
public Task AcknowledgeAlarmAsync(AlarmAckRequest req, CancellationToken ct) => Task.CompletedTask;
public Task<HistoryReadResponse> HistoryReadAsync(HistoryReadRequest req, CancellationToken ct)
=> Task.FromResult(new HistoryReadResponse
{
Success = false,
Error = "MXAccess + Historian code lift pending (Phase 2 Task B.1)",
Tags = System.Array.Empty<HistoryTagValues>(),
});
public Task<HistoryReadProcessedResponse> HistoryReadProcessedAsync(
HistoryReadProcessedRequest req, CancellationToken ct)
=> Task.FromResult(new HistoryReadProcessedResponse
{
Success = false,
Error = "MXAccess + Historian code lift pending (Phase 2 Task B.1)",
Values = System.Array.Empty<GalaxyDataValue>(),
});
public Task<HistoryReadAtTimeResponse> HistoryReadAtTimeAsync(
HistoryReadAtTimeRequest req, CancellationToken ct)
=> Task.FromResult(new HistoryReadAtTimeResponse
{
Success = false,
Error = "MXAccess + Historian code lift pending (Phase 2 Task B.1)",
Values = System.Array.Empty<GalaxyDataValue>(),
});
public Task<HistoryReadEventsResponse> HistoryReadEventsAsync(
HistoryReadEventsRequest req, CancellationToken ct)
=> Task.FromResult(new HistoryReadEventsResponse
{
Success = false,
Error = "MXAccess + Historian code lift pending (Phase 2 Task B.1)",
Events = System.Array.Empty<GalaxyHistoricalEvent>(),
});
public Task<RecycleStatusResponse> RecycleAsync(RecycleHostRequest req, CancellationToken ct)
=> Task.FromResult(new RecycleStatusResponse { Accepted = true, GraceSeconds = 15 });
private static GalaxyAttributeInfo MapAttribute(GalaxyAttributeRow row) => new()
{
AttributeName = row.AttributeName,
MxDataType = row.MxDataType,
IsArray = row.IsArray,
ArrayDim = row.ArrayDimension is int d and > 0 ? (uint)d : null,
SecurityClassification = row.SecurityClassification,
IsHistorized = row.IsHistorized,
IsAlarm = row.IsAlarm,
};
/// <summary>
/// Galaxy <c>template_definition.category_id</c> → human-readable name.
/// Mirrors v1 Host's <c>AlarmObjectFilter</c> mapping.
/// </summary>
private static string MapCategory(int categoryId) => categoryId switch
{
1 => "$WinPlatform",
3 => "$AppEngine",
4 => "$Area",
10 => "$UserDefined",
11 => "$ApplicationObject",
13 => "$Area",
17 => "$DeviceIntegration",
24 => "$ViewEngine",
26 => "$ViewApp",
_ => $"category-{categoryId}",
};
}

View File

@@ -0,0 +1,35 @@
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.Galaxy;
/// <summary>
/// One row from the v1 <c>HierarchySql</c>. Galaxy <c>gobject</c> deployed instance with its
/// hierarchy parent + template-chain context.
/// </summary>
public sealed class GalaxyHierarchyRow
{
public int GobjectId { get; init; }
public string TagName { get; init; } = string.Empty;
public string ContainedName { get; init; } = string.Empty;
public string BrowseName { get; init; } = string.Empty;
public int ParentGobjectId { get; init; }
public bool IsArea { get; init; }
public int CategoryId { get; init; }
public int HostedByGobjectId { get; init; }
public System.Collections.Generic.IReadOnlyList<string> TemplateChain { get; init; } = System.Array.Empty<string>();
}
/// <summary>One row from the v1 <c>AttributesSql</c>.</summary>
public sealed class GalaxyAttributeRow
{
public int GobjectId { get; init; }
public string TagName { get; init; } = string.Empty;
public string AttributeName { get; init; } = string.Empty;
public string FullTagReference { get; init; } = string.Empty;
public int MxDataType { get; init; }
public string? DataTypeName { get; init; }
public bool IsArray { get; init; }
public int? ArrayDimension { get; init; }
public int MxAttributeCategory { get; init; }
public int SecurityClassification { get; init; }
public bool IsHistorized { get; init; }
public bool IsAlarm { get; init; }
}

View File

@@ -0,0 +1,224 @@
using System;
using System.Collections.Generic;
using System.Data.SqlClient;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.Galaxy;
/// <summary>
/// SQL access to the Galaxy <c>ZB</c> repository — port of v1 <c>GalaxyRepositoryService</c>.
/// The two SQL bodies (Hierarchy + Attributes) are byte-for-byte identical to v1 so the
/// queries surface the same row set at parity time. Extended-attributes and scope-filter
/// queries from v1 are intentionally not ported yet — they're refinements that aren't on
/// the Phase 2 critical path.
/// </summary>
public sealed class GalaxyRepository(GalaxyRepositoryOptions options)
{
public async Task<bool> TestConnectionAsync(CancellationToken ct = default)
{
try
{
using var conn = new SqlConnection(options.ConnectionString);
await conn.OpenAsync(ct).ConfigureAwait(false);
using var cmd = new SqlCommand("SELECT 1", conn) { CommandTimeout = options.CommandTimeoutSeconds };
var result = await cmd.ExecuteScalarAsync(ct).ConfigureAwait(false);
return result is int i && i == 1;
}
catch (SqlException) { return false; }
catch (InvalidOperationException) { return false; }
}
public async Task<DateTime?> GetLastDeployTimeAsync(CancellationToken ct = default)
{
using var conn = new SqlConnection(options.ConnectionString);
await conn.OpenAsync(ct).ConfigureAwait(false);
using var cmd = new SqlCommand("SELECT time_of_last_deploy FROM galaxy", conn)
{ CommandTimeout = options.CommandTimeoutSeconds };
var result = await cmd.ExecuteScalarAsync(ct).ConfigureAwait(false);
return result is DateTime dt ? dt : null;
}
public async Task<List<GalaxyHierarchyRow>> GetHierarchyAsync(CancellationToken ct = default)
{
var rows = new List<GalaxyHierarchyRow>();
using var conn = new SqlConnection(options.ConnectionString);
await conn.OpenAsync(ct).ConfigureAwait(false);
using var cmd = new SqlCommand(HierarchySql, conn) { CommandTimeout = options.CommandTimeoutSeconds };
using var reader = await cmd.ExecuteReaderAsync(ct).ConfigureAwait(false);
while (await reader.ReadAsync(ct).ConfigureAwait(false))
{
var templateChainRaw = reader.IsDBNull(8) ? string.Empty : reader.GetString(8);
var templateChain = templateChainRaw.Length == 0
? Array.Empty<string>()
: templateChainRaw.Split(new[] { '|' }, StringSplitOptions.RemoveEmptyEntries)
.Select(s => s.Trim())
.Where(s => s.Length > 0)
.ToArray();
rows.Add(new GalaxyHierarchyRow
{
GobjectId = Convert.ToInt32(reader.GetValue(0)),
TagName = reader.GetString(1),
ContainedName = reader.IsDBNull(2) ? string.Empty : reader.GetString(2),
BrowseName = reader.GetString(3),
ParentGobjectId = Convert.ToInt32(reader.GetValue(4)),
IsArea = Convert.ToInt32(reader.GetValue(5)) == 1,
CategoryId = Convert.ToInt32(reader.GetValue(6)),
HostedByGobjectId = Convert.ToInt32(reader.GetValue(7)),
TemplateChain = templateChain,
});
}
return rows;
}
public async Task<List<GalaxyAttributeRow>> GetAttributesAsync(CancellationToken ct = default)
{
var rows = new List<GalaxyAttributeRow>();
using var conn = new SqlConnection(options.ConnectionString);
await conn.OpenAsync(ct).ConfigureAwait(false);
using var cmd = new SqlCommand(AttributesSql, conn) { CommandTimeout = options.CommandTimeoutSeconds };
using var reader = await cmd.ExecuteReaderAsync(ct).ConfigureAwait(false);
while (await reader.ReadAsync(ct).ConfigureAwait(false))
{
rows.Add(new GalaxyAttributeRow
{
GobjectId = Convert.ToInt32(reader.GetValue(0)),
TagName = reader.GetString(1),
AttributeName = reader.GetString(2),
FullTagReference = reader.GetString(3),
MxDataType = Convert.ToInt32(reader.GetValue(4)),
DataTypeName = reader.IsDBNull(5) ? null : reader.GetString(5),
IsArray = Convert.ToInt32(reader.GetValue(6)) == 1,
ArrayDimension = reader.IsDBNull(7) ? (int?)null : Convert.ToInt32(reader.GetValue(7)),
MxAttributeCategory = Convert.ToInt32(reader.GetValue(8)),
SecurityClassification = Convert.ToInt32(reader.GetValue(9)),
IsHistorized = Convert.ToInt32(reader.GetValue(10)) == 1,
IsAlarm = Convert.ToInt32(reader.GetValue(11)) == 1,
});
}
return rows;
}
private const string HierarchySql = @"
;WITH template_chain AS (
SELECT g.gobject_id AS instance_gobject_id, t.gobject_id AS template_gobject_id,
t.tag_name AS template_tag_name, t.derived_from_gobject_id, 0 AS depth
FROM gobject g
INNER JOIN gobject t ON t.gobject_id = g.derived_from_gobject_id
WHERE g.is_template = 0 AND g.deployed_package_id <> 0 AND g.derived_from_gobject_id <> 0
UNION ALL
SELECT tc.instance_gobject_id, t.gobject_id, t.tag_name, t.derived_from_gobject_id, tc.depth + 1
FROM template_chain tc
INNER JOIN gobject t ON t.gobject_id = tc.derived_from_gobject_id
WHERE tc.derived_from_gobject_id <> 0 AND tc.depth < 10
)
SELECT DISTINCT
g.gobject_id,
g.tag_name,
g.contained_name,
CASE WHEN g.contained_name IS NULL OR g.contained_name = ''
THEN g.tag_name
ELSE g.contained_name
END AS browse_name,
CASE WHEN g.contained_by_gobject_id = 0
THEN g.area_gobject_id
ELSE g.contained_by_gobject_id
END AS parent_gobject_id,
CASE WHEN td.category_id = 13
THEN 1
ELSE 0
END AS is_area,
td.category_id AS category_id,
g.hosted_by_gobject_id AS hosted_by_gobject_id,
ISNULL(
STUFF((
SELECT '|' + tc.template_tag_name
FROM template_chain tc
WHERE tc.instance_gobject_id = g.gobject_id
ORDER BY tc.depth
FOR XML PATH('')
), 1, 1, ''),
''
) AS template_chain
FROM gobject g
INNER JOIN template_definition td
ON g.template_definition_id = td.template_definition_id
WHERE td.category_id IN (1, 3, 4, 10, 11, 13, 17, 24, 26)
AND g.is_template = 0
AND g.deployed_package_id <> 0
ORDER BY parent_gobject_id, g.tag_name";
private const string AttributesSql = @"
;WITH deployed_package_chain AS (
SELECT g.gobject_id, p.package_id, p.derived_from_package_id, 0 AS depth
FROM gobject g
INNER JOIN package p ON p.package_id = g.deployed_package_id
WHERE g.is_template = 0 AND g.deployed_package_id <> 0
UNION ALL
SELECT dpc.gobject_id, p.package_id, p.derived_from_package_id, dpc.depth + 1
FROM deployed_package_chain dpc
INNER JOIN package p ON p.package_id = dpc.derived_from_package_id
WHERE dpc.derived_from_package_id <> 0 AND dpc.depth < 10
)
SELECT gobject_id, tag_name, attribute_name, full_tag_reference,
mx_data_type, data_type_name, is_array, array_dimension,
mx_attribute_category, security_classification, is_historized, is_alarm
FROM (
SELECT
dpc.gobject_id,
g.tag_name,
da.attribute_name,
g.tag_name + '.' + da.attribute_name
+ CASE WHEN da.is_array = 1 THEN '[]' ELSE '' END
AS full_tag_reference,
da.mx_data_type,
dt.description AS data_type_name,
da.is_array,
CASE WHEN da.is_array = 1
THEN CONVERT(int, CONVERT(varbinary(2),
SUBSTRING(da.mx_value, 15, 2) + SUBSTRING(da.mx_value, 13, 2), 2))
ELSE NULL
END AS array_dimension,
da.mx_attribute_category,
da.security_classification,
CASE WHEN EXISTS (
SELECT 1 FROM deployed_package_chain dpc2
INNER JOIN primitive_instance pi ON pi.package_id = dpc2.package_id AND pi.primitive_name = da.attribute_name
INNER JOIN primitive_definition pd ON pd.primitive_definition_id = pi.primitive_definition_id AND pd.primitive_name = 'HistoryExtension'
WHERE dpc2.gobject_id = dpc.gobject_id
) THEN 1 ELSE 0 END AS is_historized,
CASE WHEN EXISTS (
SELECT 1 FROM deployed_package_chain dpc2
INNER JOIN primitive_instance pi ON pi.package_id = dpc2.package_id AND pi.primitive_name = da.attribute_name
INNER JOIN primitive_definition pd ON pd.primitive_definition_id = pi.primitive_definition_id AND pd.primitive_name = 'AlarmExtension'
WHERE dpc2.gobject_id = dpc.gobject_id
) THEN 1 ELSE 0 END AS is_alarm,
ROW_NUMBER() OVER (
PARTITION BY dpc.gobject_id, da.attribute_name
ORDER BY dpc.depth
) AS rn
FROM deployed_package_chain dpc
INNER JOIN dynamic_attribute da
ON da.package_id = dpc.package_id
INNER JOIN gobject g
ON g.gobject_id = dpc.gobject_id
INNER JOIN template_definition td
ON td.template_definition_id = g.template_definition_id
LEFT JOIN data_type dt
ON dt.mx_data_type = da.mx_data_type
WHERE td.category_id IN (1, 3, 4, 10, 11, 13, 17, 24, 26)
AND da.attribute_name NOT LIKE '[_]%'
AND da.attribute_name NOT LIKE '%.Description'
AND da.mx_attribute_category IN (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 24)
) ranked
WHERE rn = 1
ORDER BY tag_name, attribute_name";
}

View File

@@ -0,0 +1,13 @@
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.Galaxy;
/// <summary>
/// Connection settings for the Galaxy <c>ZB</c> repository database. Set from the
/// <c>DriverConfig</c> JSON section <c>Database</c> per <c>plan.md</c> §"Galaxy DriverConfig".
/// </summary>
public sealed class GalaxyRepositoryOptions
{
public string ConnectionString { get; init; } =
"Server=localhost;Database=ZB;Integrated Security=True;TrustServerCertificate=True;Encrypt=False;";
public int CommandTimeoutSeconds { get; init; } = 60;
}

View File

@@ -0,0 +1,129 @@
using System;
using System.Collections.Generic;
using System.Linq;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.Historian
{
/// <summary>
/// Thread-safe, pure-logic endpoint picker for the Wonderware Historian cluster. Tracks which
/// configured nodes are healthy, places failed nodes in a time-bounded cooldown, and hands
/// out an ordered list of eligible candidates for the data source to try in sequence.
/// </summary>
internal sealed class HistorianClusterEndpointPicker
{
private readonly Func<DateTime> _clock;
private readonly TimeSpan _cooldown;
private readonly object _lock = new object();
private readonly List<NodeEntry> _nodes;
public HistorianClusterEndpointPicker(HistorianConfiguration config)
: this(config, () => DateTime.UtcNow) { }
internal HistorianClusterEndpointPicker(HistorianConfiguration config, Func<DateTime> clock)
{
_clock = clock ?? throw new ArgumentNullException(nameof(clock));
_cooldown = TimeSpan.FromSeconds(Math.Max(0, config.FailureCooldownSeconds));
var names = (config.ServerNames != null && config.ServerNames.Count > 0)
? config.ServerNames
: new List<string> { config.ServerName };
_nodes = names
.Where(n => !string.IsNullOrWhiteSpace(n))
.Select(n => n.Trim())
.Distinct(StringComparer.OrdinalIgnoreCase)
.Select(n => new NodeEntry { Name = n })
.ToList();
}
public int NodeCount
{
get { lock (_lock) return _nodes.Count; }
}
public IReadOnlyList<string> GetHealthyNodes()
{
lock (_lock)
{
var now = _clock();
return _nodes.Where(n => IsHealthyAt(n, now)).Select(n => n.Name).ToList();
}
}
public int HealthyNodeCount
{
get
{
lock (_lock)
{
var now = _clock();
return _nodes.Count(n => IsHealthyAt(n, now));
}
}
}
public void MarkFailed(string node, string? error)
{
lock (_lock)
{
var entry = FindEntry(node);
if (entry == null) return;
var now = _clock();
entry.FailureCount++;
entry.LastError = error;
entry.LastFailureTime = now;
entry.CooldownUntil = _cooldown.TotalMilliseconds > 0 ? now + _cooldown : (DateTime?)null;
}
}
public void MarkHealthy(string node)
{
lock (_lock)
{
var entry = FindEntry(node);
if (entry == null) return;
entry.CooldownUntil = null;
}
}
public List<HistorianClusterNodeState> SnapshotNodeStates()
{
lock (_lock)
{
var now = _clock();
return _nodes.Select(n => new HistorianClusterNodeState
{
Name = n.Name,
IsHealthy = IsHealthyAt(n, now),
CooldownUntil = IsHealthyAt(n, now) ? null : n.CooldownUntil,
FailureCount = n.FailureCount,
LastError = n.LastError,
LastFailureTime = n.LastFailureTime
}).ToList();
}
}
private static bool IsHealthyAt(NodeEntry entry, DateTime now)
{
return entry.CooldownUntil == null || entry.CooldownUntil <= now;
}
private NodeEntry? FindEntry(string node)
{
for (var i = 0; i < _nodes.Count; i++)
if (string.Equals(_nodes[i].Name, node, StringComparison.OrdinalIgnoreCase))
return _nodes[i];
return null;
}
private sealed class NodeEntry
{
public string Name { get; set; } = "";
public DateTime? CooldownUntil { get; set; }
public int FailureCount { get; set; }
public string? LastError { get; set; }
public DateTime? LastFailureTime { get; set; }
}
}
}

View File

@@ -0,0 +1,18 @@
using System;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.Historian
{
/// <summary>
/// Point-in-time state of a single historian cluster node. One entry per configured node
/// appears inside <see cref="HistorianHealthSnapshot"/>.
/// </summary>
public sealed class HistorianClusterNodeState
{
public string Name { get; set; } = "";
public bool IsHealthy { get; set; }
public DateTime? CooldownUntil { get; set; }
public int FailureCount { get; set; }
public string? LastError { get; set; }
public DateTime? LastFailureTime { get; set; }
}
}

View File

@@ -0,0 +1,38 @@
using System.Collections.Generic;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.Historian
{
/// <summary>
/// Wonderware Historian SDK configuration. Populated from environment variables at Host
/// startup (see <c>Program.cs</c>) or from the Proxy's <c>DriverInstance.DriverConfig</c>
/// section passed during OpenSession. Kept OPC-UA-free — the Proxy side owns UA translation.
/// </summary>
public sealed class HistorianConfiguration
{
public bool Enabled { get; set; } = false;
/// <summary>Single-node fallback when <see cref="ServerNames"/> is empty.</summary>
public string ServerName { get; set; } = "localhost";
/// <summary>
/// Ordered cluster nodes. When non-empty, the data source tries each in order on connect,
/// falling through to the next on failure. A failed node is placed in cooldown for
/// <see cref="FailureCooldownSeconds"/> before being re-eligible.
/// </summary>
public List<string> ServerNames { get; set; } = new();
public int FailureCooldownSeconds { get; set; } = 60;
public bool IntegratedSecurity { get; set; } = true;
public string? UserName { get; set; }
public string? Password { get; set; }
public int Port { get; set; } = 32568;
public int CommandTimeoutSeconds { get; set; } = 30;
public int MaxValuesPerRead { get; set; } = 10000;
/// <summary>
/// Outer safety timeout applied to sync-over-async Historian operations. Must be
/// comfortably larger than <see cref="CommandTimeoutSeconds"/>.
/// </summary>
public int RequestTimeoutSeconds { get; set; } = 60;
}
}

View File

@@ -0,0 +1,621 @@
using System;
using System.Collections.Generic;
using StringCollection = System.Collections.Specialized.StringCollection;
using System.Threading;
using System.Threading.Tasks;
using ArchestrA;
using Serilog;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.Historian
{
/// <summary>
/// Reads historical data from the Wonderware Historian via the aahClientManaged SDK.
/// OPC-UA-free — emits <see cref="HistorianSample"/>/<see cref="HistorianAggregateSample"/>
/// which the Proxy maps to OPC UA <c>DataValue</c> on its side of the IPC.
/// </summary>
public sealed class HistorianDataSource : IHistorianDataSource
{
private static readonly ILogger Log = Serilog.Log.ForContext<HistorianDataSource>();
private readonly HistorianConfiguration _config;
private readonly object _connectionLock = new object();
private readonly object _eventConnectionLock = new object();
private readonly IHistorianConnectionFactory _factory;
private HistorianAccess? _connection;
private HistorianAccess? _eventConnection;
private bool _disposed;
private readonly object _healthLock = new object();
private long _totalSuccesses;
private long _totalFailures;
private int _consecutiveFailures;
private DateTime? _lastSuccessTime;
private DateTime? _lastFailureTime;
private string? _lastError;
private string? _activeProcessNode;
private string? _activeEventNode;
private readonly HistorianClusterEndpointPicker _picker;
public HistorianDataSource(HistorianConfiguration config)
: this(config, new SdkHistorianConnectionFactory(), null) { }
internal HistorianDataSource(
HistorianConfiguration config,
IHistorianConnectionFactory factory,
HistorianClusterEndpointPicker? picker = null)
{
_config = config;
_factory = factory;
_picker = picker ?? new HistorianClusterEndpointPicker(config);
}
private (HistorianAccess Connection, string Node) ConnectToAnyHealthyNode(HistorianConnectionType type)
{
var candidates = _picker.GetHealthyNodes();
if (candidates.Count == 0)
{
var total = _picker.NodeCount;
throw new InvalidOperationException(
total == 0
? "No historian nodes configured"
: $"All {total} historian nodes are in cooldown — no healthy endpoints to connect to");
}
Exception? lastException = null;
foreach (var node in candidates)
{
var attemptConfig = CloneConfigWithServerName(node);
try
{
var conn = _factory.CreateAndConnect(attemptConfig, type);
_picker.MarkHealthy(node);
return (conn, node);
}
catch (Exception ex)
{
_picker.MarkFailed(node, ex.Message);
lastException = ex;
Log.Warning(ex, "Historian node {Node} failed during connect attempt; trying next candidate", node);
}
}
var inner = lastException?.Message ?? "(no detail)";
throw new InvalidOperationException(
$"All {candidates.Count} healthy historian candidate(s) failed during connect: {inner}",
lastException);
}
private HistorianConfiguration CloneConfigWithServerName(string serverName)
{
return new HistorianConfiguration
{
Enabled = _config.Enabled,
ServerName = serverName,
ServerNames = _config.ServerNames,
FailureCooldownSeconds = _config.FailureCooldownSeconds,
IntegratedSecurity = _config.IntegratedSecurity,
UserName = _config.UserName,
Password = _config.Password,
Port = _config.Port,
CommandTimeoutSeconds = _config.CommandTimeoutSeconds,
MaxValuesPerRead = _config.MaxValuesPerRead
};
}
public HistorianHealthSnapshot GetHealthSnapshot()
{
var nodeStates = _picker.SnapshotNodeStates();
var healthyCount = 0;
foreach (var n in nodeStates)
if (n.IsHealthy) healthyCount++;
lock (_healthLock)
{
return new HistorianHealthSnapshot
{
TotalQueries = _totalSuccesses + _totalFailures,
TotalSuccesses = _totalSuccesses,
TotalFailures = _totalFailures,
ConsecutiveFailures = _consecutiveFailures,
LastSuccessTime = _lastSuccessTime,
LastFailureTime = _lastFailureTime,
LastError = _lastError,
ProcessConnectionOpen = Volatile.Read(ref _connection) != null,
EventConnectionOpen = Volatile.Read(ref _eventConnection) != null,
ActiveProcessNode = _activeProcessNode,
ActiveEventNode = _activeEventNode,
NodeCount = nodeStates.Count,
HealthyNodeCount = healthyCount,
Nodes = nodeStates
};
}
}
private void RecordSuccess()
{
lock (_healthLock)
{
_totalSuccesses++;
_lastSuccessTime = DateTime.UtcNow;
_consecutiveFailures = 0;
_lastError = null;
}
}
private void RecordFailure(string error)
{
lock (_healthLock)
{
_totalFailures++;
_lastFailureTime = DateTime.UtcNow;
_consecutiveFailures++;
_lastError = error;
}
}
private void EnsureConnected()
{
if (_disposed)
throw new ObjectDisposedException(nameof(HistorianDataSource));
if (Volatile.Read(ref _connection) != null) return;
var (conn, winningNode) = ConnectToAnyHealthyNode(HistorianConnectionType.Process);
lock (_connectionLock)
{
if (_disposed)
{
conn.CloseConnection(out _);
conn.Dispose();
throw new ObjectDisposedException(nameof(HistorianDataSource));
}
if (_connection != null)
{
conn.CloseConnection(out _);
conn.Dispose();
return;
}
_connection = conn;
lock (_healthLock) _activeProcessNode = winningNode;
Log.Information("Historian SDK connection opened to {Server}:{Port}", winningNode, _config.Port);
}
}
private void HandleConnectionError(Exception? ex = null)
{
lock (_connectionLock)
{
if (_connection == null) return;
try
{
_connection.CloseConnection(out _);
_connection.Dispose();
}
catch (Exception disposeEx)
{
Log.Debug(disposeEx, "Error disposing Historian SDK connection during error recovery");
}
_connection = null;
string? failedNode;
lock (_healthLock)
{
failedNode = _activeProcessNode;
_activeProcessNode = null;
}
if (failedNode != null) _picker.MarkFailed(failedNode, ex?.Message ?? "mid-query failure");
Log.Warning(ex, "Historian SDK connection reset (node={Node})", failedNode ?? "(unknown)");
}
}
private void EnsureEventConnected()
{
if (_disposed)
throw new ObjectDisposedException(nameof(HistorianDataSource));
if (Volatile.Read(ref _eventConnection) != null) return;
var (conn, winningNode) = ConnectToAnyHealthyNode(HistorianConnectionType.Event);
lock (_eventConnectionLock)
{
if (_disposed)
{
conn.CloseConnection(out _);
conn.Dispose();
throw new ObjectDisposedException(nameof(HistorianDataSource));
}
if (_eventConnection != null)
{
conn.CloseConnection(out _);
conn.Dispose();
return;
}
_eventConnection = conn;
lock (_healthLock) _activeEventNode = winningNode;
Log.Information("Historian SDK event connection opened to {Server}:{Port}", winningNode, _config.Port);
}
}
private void HandleEventConnectionError(Exception? ex = null)
{
lock (_eventConnectionLock)
{
if (_eventConnection == null) return;
try
{
_eventConnection.CloseConnection(out _);
_eventConnection.Dispose();
}
catch (Exception disposeEx)
{
Log.Debug(disposeEx, "Error disposing Historian SDK event connection during error recovery");
}
_eventConnection = null;
string? failedNode;
lock (_healthLock)
{
failedNode = _activeEventNode;
_activeEventNode = null;
}
if (failedNode != null) _picker.MarkFailed(failedNode, ex?.Message ?? "mid-query failure");
Log.Warning(ex, "Historian SDK event connection reset (node={Node})", failedNode ?? "(unknown)");
}
}
public Task<List<HistorianSample>> ReadRawAsync(
string tagName, DateTime startTime, DateTime endTime, int maxValues,
CancellationToken ct = default)
{
var results = new List<HistorianSample>();
try
{
EnsureConnected();
using var query = _connection!.CreateHistoryQuery();
var args = new HistoryQueryArgs
{
TagNames = new StringCollection { tagName },
StartDateTime = startTime,
EndDateTime = endTime,
RetrievalMode = HistorianRetrievalMode.Full
};
if (maxValues > 0)
args.BatchSize = (uint)maxValues;
else if (_config.MaxValuesPerRead > 0)
args.BatchSize = (uint)_config.MaxValuesPerRead;
if (!query.StartQuery(args, out var error))
{
Log.Warning("Historian SDK raw query start failed for {Tag}: {Error}", tagName, error.ErrorCode);
RecordFailure($"raw StartQuery: {error.ErrorCode}");
HandleConnectionError();
return Task.FromResult(results);
}
var count = 0;
var limit = maxValues > 0 ? maxValues : _config.MaxValuesPerRead;
while (query.MoveNext(out error))
{
ct.ThrowIfCancellationRequested();
var result = query.QueryResult;
var timestamp = DateTime.SpecifyKind(result.StartDateTime, DateTimeKind.Utc);
object? value;
if (!string.IsNullOrEmpty(result.StringValue) && result.Value == 0)
value = result.StringValue;
else
value = result.Value;
results.Add(new HistorianSample
{
Value = value,
TimestampUtc = timestamp,
Quality = (byte)(result.OpcQuality & 0xFF),
});
count++;
if (limit > 0 && count >= limit) break;
}
query.EndQuery(out _);
RecordSuccess();
}
catch (OperationCanceledException) { throw; }
catch (ObjectDisposedException) { throw; }
catch (Exception ex)
{
Log.Warning(ex, "HistoryRead raw failed for {Tag}", tagName);
RecordFailure($"raw: {ex.Message}");
HandleConnectionError(ex);
}
Log.Debug("HistoryRead raw: {Tag} returned {Count} values ({Start} to {End})",
tagName, results.Count, startTime, endTime);
return Task.FromResult(results);
}
public Task<List<HistorianAggregateSample>> ReadAggregateAsync(
string tagName, DateTime startTime, DateTime endTime,
double intervalMs, string aggregateColumn,
CancellationToken ct = default)
{
var results = new List<HistorianAggregateSample>();
try
{
EnsureConnected();
using var query = _connection!.CreateAnalogSummaryQuery();
var args = new AnalogSummaryQueryArgs
{
TagNames = new StringCollection { tagName },
StartDateTime = startTime,
EndDateTime = endTime,
Resolution = (ulong)intervalMs
};
if (!query.StartQuery(args, out var error))
{
Log.Warning("Historian SDK aggregate query start failed for {Tag}: {Error}", tagName, error.ErrorCode);
RecordFailure($"aggregate StartQuery: {error.ErrorCode}");
HandleConnectionError();
return Task.FromResult(results);
}
while (query.MoveNext(out error))
{
ct.ThrowIfCancellationRequested();
var result = query.QueryResult;
var timestamp = DateTime.SpecifyKind(result.StartDateTime, DateTimeKind.Utc);
var value = ExtractAggregateValue(result, aggregateColumn);
results.Add(new HistorianAggregateSample
{
Value = value,
TimestampUtc = timestamp,
});
}
query.EndQuery(out _);
RecordSuccess();
}
catch (OperationCanceledException) { throw; }
catch (ObjectDisposedException) { throw; }
catch (Exception ex)
{
Log.Warning(ex, "HistoryRead aggregate failed for {Tag}", tagName);
RecordFailure($"aggregate: {ex.Message}");
HandleConnectionError(ex);
}
Log.Debug("HistoryRead aggregate ({Aggregate}): {Tag} returned {Count} values",
aggregateColumn, tagName, results.Count);
return Task.FromResult(results);
}
public Task<List<HistorianSample>> ReadAtTimeAsync(
string tagName, DateTime[] timestamps,
CancellationToken ct = default)
{
var results = new List<HistorianSample>();
if (timestamps == null || timestamps.Length == 0)
return Task.FromResult(results);
try
{
EnsureConnected();
foreach (var timestamp in timestamps)
{
ct.ThrowIfCancellationRequested();
using var query = _connection!.CreateHistoryQuery();
var args = new HistoryQueryArgs
{
TagNames = new StringCollection { tagName },
StartDateTime = timestamp,
EndDateTime = timestamp,
RetrievalMode = HistorianRetrievalMode.Interpolated,
BatchSize = 1
};
if (!query.StartQuery(args, out var error))
{
results.Add(new HistorianSample
{
Value = null,
TimestampUtc = DateTime.SpecifyKind(timestamp, DateTimeKind.Utc),
Quality = 0, // Bad
});
continue;
}
if (query.MoveNext(out error))
{
var result = query.QueryResult;
object? value;
if (!string.IsNullOrEmpty(result.StringValue) && result.Value == 0)
value = result.StringValue;
else
value = result.Value;
results.Add(new HistorianSample
{
Value = value,
TimestampUtc = DateTime.SpecifyKind(timestamp, DateTimeKind.Utc),
Quality = (byte)(result.OpcQuality & 0xFF),
});
}
else
{
results.Add(new HistorianSample
{
Value = null,
TimestampUtc = DateTime.SpecifyKind(timestamp, DateTimeKind.Utc),
Quality = 0,
});
}
query.EndQuery(out _);
}
RecordSuccess();
}
catch (OperationCanceledException) { throw; }
catch (ObjectDisposedException) { throw; }
catch (Exception ex)
{
Log.Warning(ex, "HistoryRead at-time failed for {Tag}", tagName);
RecordFailure($"at-time: {ex.Message}");
HandleConnectionError(ex);
}
Log.Debug("HistoryRead at-time: {Tag} returned {Count} values for {Timestamps} timestamps",
tagName, results.Count, timestamps.Length);
return Task.FromResult(results);
}
public Task<List<HistorianEventDto>> ReadEventsAsync(
string? sourceName, DateTime startTime, DateTime endTime, int maxEvents,
CancellationToken ct = default)
{
var results = new List<HistorianEventDto>();
try
{
EnsureEventConnected();
using var query = _eventConnection!.CreateEventQuery();
var args = new EventQueryArgs
{
StartDateTime = startTime,
EndDateTime = endTime,
EventCount = maxEvents > 0 ? (uint)maxEvents : (uint)_config.MaxValuesPerRead,
QueryType = HistorianEventQueryType.Events,
EventOrder = HistorianEventOrder.Ascending
};
if (!string.IsNullOrEmpty(sourceName))
{
query.AddEventFilter("Source", HistorianComparisionType.Equal, sourceName, out _);
}
if (!query.StartQuery(args, out var error))
{
Log.Warning("Historian SDK event query start failed: {Error}", error.ErrorCode);
RecordFailure($"events StartQuery: {error.ErrorCode}");
HandleEventConnectionError();
return Task.FromResult(results);
}
var count = 0;
while (query.MoveNext(out error))
{
ct.ThrowIfCancellationRequested();
results.Add(ToDto(query.QueryResult));
count++;
if (maxEvents > 0 && count >= maxEvents) break;
}
query.EndQuery(out _);
RecordSuccess();
}
catch (OperationCanceledException) { throw; }
catch (ObjectDisposedException) { throw; }
catch (Exception ex)
{
Log.Warning(ex, "HistoryRead events failed for source {Source}", sourceName ?? "(all)");
RecordFailure($"events: {ex.Message}");
HandleEventConnectionError(ex);
}
Log.Debug("HistoryRead events: source={Source} returned {Count} events ({Start} to {End})",
sourceName ?? "(all)", results.Count, startTime, endTime);
return Task.FromResult(results);
}
private static HistorianEventDto ToDto(HistorianEvent evt)
{
// The ArchestrA SDK marks these properties obsolete but still returns them; their
// successors aren't wired in the version we bind against. Using them is the documented
// v1 behavior — suppressed locally instead of project-wide so any non-event use of
// deprecated SDK surface still surfaces as an error.
#pragma warning disable CS0618
return new HistorianEventDto
{
Id = evt.Id,
Source = evt.Source,
EventTime = evt.EventTime,
ReceivedTime = evt.ReceivedTime,
DisplayText = evt.DisplayText,
Severity = (ushort)evt.Severity
};
#pragma warning restore CS0618
}
internal static double? ExtractAggregateValue(AnalogSummaryQueryResult result, string column)
{
switch (column)
{
case "Average": return result.Average;
case "Minimum": return result.Minimum;
case "Maximum": return result.Maximum;
case "ValueCount": return result.ValueCount;
case "First": return result.First;
case "Last": return result.Last;
case "StdDev": return result.StdDev;
default: return null;
}
}
public void Dispose()
{
if (_disposed) return;
_disposed = true;
try
{
_connection?.CloseConnection(out _);
_connection?.Dispose();
}
catch (Exception ex)
{
Log.Warning(ex, "Error closing Historian SDK connection");
}
try
{
_eventConnection?.CloseConnection(out _);
_eventConnection?.Dispose();
}
catch (Exception ex)
{
Log.Warning(ex, "Error closing Historian SDK event connection");
}
_connection = null;
_eventConnection = null;
}
}
}

View File

@@ -0,0 +1,18 @@
using System;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.Historian
{
/// <summary>
/// SDK-free representation of a Historian event record. Prevents ArchestrA types from
/// leaking beyond <c>HistorianDataSource</c>.
/// </summary>
public sealed class HistorianEventDto
{
public Guid Id { get; set; }
public string? Source { get; set; }
public DateTime EventTime { get; set; }
public DateTime ReceivedTime { get; set; }
public string? DisplayText { get; set; }
public ushort Severity { get; set; }
}
}

View File

@@ -0,0 +1,27 @@
using System;
using System.Collections.Generic;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.Historian
{
/// <summary>
/// Point-in-time runtime health of the historian subsystem — consumed by the status dashboard
/// via an IPC health query (not wired in PR #5; deferred).
/// </summary>
public sealed class HistorianHealthSnapshot
{
public long TotalQueries { get; set; }
public long TotalSuccesses { get; set; }
public long TotalFailures { get; set; }
public int ConsecutiveFailures { get; set; }
public DateTime? LastSuccessTime { get; set; }
public DateTime? LastFailureTime { get; set; }
public string? LastError { get; set; }
public bool ProcessConnectionOpen { get; set; }
public bool EventConnectionOpen { get; set; }
public string? ActiveProcessNode { get; set; }
public string? ActiveEventNode { get; set; }
public int NodeCount { get; set; }
public int HealthyNodeCount { get; set; }
public List<HistorianClusterNodeState> Nodes { get; set; } = new();
}
}

View File

@@ -0,0 +1,46 @@
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.Historian;
/// <summary>
/// Maps a raw OPC DA quality byte (as returned by Wonderware Historian's <c>OpcQuality</c>)
/// to an OPC UA <c>StatusCode</c> uint. Preserves specific codes (BadNotConnected,
/// UncertainSubNormal, etc.) instead of collapsing to Good/Uncertain/Bad categories.
/// Mirrors v1 <c>QualityMapper.MapToOpcUaStatusCode</c> without pulling in OPC UA types —
/// the returned value is the 32-bit OPC UA <c>StatusCode</c> wire encoding that the Proxy
/// surfaces directly as <c>DataValueSnapshot.StatusCode</c>.
/// </summary>
public static class HistorianQualityMapper
{
/// <summary>
/// Map an 8-bit OPC DA quality byte to the corresponding OPC UA StatusCode. The byte
/// family bits decide the category (Good &gt;= 192, Uncertain 64-191, Bad 0-63); the
/// low-nibble subcode selects the specific code.
/// </summary>
public static uint Map(byte q) => q switch
{
// Good family (192+)
192 => 0x00000000u, // Good
216 => 0x00D80000u, // Good_LocalOverride
// Uncertain family (64-191)
64 => 0x40000000u, // Uncertain
68 => 0x40900000u, // Uncertain_LastUsableValue
80 => 0x40930000u, // Uncertain_SensorNotAccurate
84 => 0x40940000u, // Uncertain_EngineeringUnitsExceeded
88 => 0x40950000u, // Uncertain_SubNormal
// Bad family (0-63)
0 => 0x80000000u, // Bad
4 => 0x80890000u, // Bad_ConfigurationError
8 => 0x808A0000u, // Bad_NotConnected
12 => 0x808B0000u, // Bad_DeviceFailure
16 => 0x808C0000u, // Bad_SensorFailure
20 => 0x80050000u, // Bad_CommunicationError
24 => 0x808D0000u, // Bad_OutOfService
32 => 0x80320000u, // Bad_WaitingForInitialData
// Unknown code — fall back to the category so callers still get a sensible bucket.
_ when q >= 192 => 0x00000000u,
_ when q >= 64 => 0x40000000u,
_ => 0x80000000u,
};
}

View File

@@ -0,0 +1,30 @@
using System;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.Historian
{
/// <summary>
/// OPC-UA-free representation of a single historical data point. The Host returns these
/// across the IPC boundary as <c>GalaxyDataValue</c>; the Proxy maps quality and value to
/// OPC UA <c>DataValue</c>. Raw MX quality byte is preserved so the Proxy can use the same
/// quality mapper it already uses for live reads.
/// </summary>
public sealed class HistorianSample
{
public object? Value { get; set; }
/// <summary>Raw OPC DA quality byte from the historian SDK (low 8 bits of OpcQuality).</summary>
public byte Quality { get; set; }
public DateTime TimestampUtc { get; set; }
}
/// <summary>
/// Result of <see cref="IHistorianDataSource.ReadAggregateAsync"/>. When <see cref="Value"/> is
/// null the aggregate is unavailable for that bucket (Proxy maps to <c>BadNoData</c>).
/// </summary>
public sealed class HistorianAggregateSample
{
public double? Value { get; set; }
public DateTime TimestampUtc { get; set; }
}
}

View File

@@ -0,0 +1,73 @@
using System;
using System.Threading;
using ArchestrA;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.Historian
{
/// <summary>
/// Creates and opens Historian SDK connections. Extracted so tests can inject fakes that
/// control connection success, failure, and timeout behavior.
/// </summary>
internal interface IHistorianConnectionFactory
{
HistorianAccess CreateAndConnect(HistorianConfiguration config, HistorianConnectionType type);
}
/// <summary>Production implementation — opens real Historian SDK connections.</summary>
internal sealed class SdkHistorianConnectionFactory : IHistorianConnectionFactory
{
public HistorianAccess CreateAndConnect(HistorianConfiguration config, HistorianConnectionType type)
{
var conn = new HistorianAccess();
var args = new HistorianConnectionArgs
{
ServerName = config.ServerName,
TcpPort = (ushort)config.Port,
IntegratedSecurity = config.IntegratedSecurity,
UseArchestrAUser = config.IntegratedSecurity,
ConnectionType = type,
ReadOnly = true,
PacketTimeout = (uint)(config.CommandTimeoutSeconds * 1000)
};
if (!config.IntegratedSecurity)
{
args.UserName = config.UserName ?? string.Empty;
args.Password = config.Password ?? string.Empty;
}
if (!conn.OpenConnection(args, out var error))
{
conn.Dispose();
throw new InvalidOperationException(
$"Failed to open Historian SDK connection to {config.ServerName}:{config.Port}: {error.ErrorCode}");
}
var timeoutMs = config.CommandTimeoutSeconds * 1000;
var elapsed = 0;
while (elapsed < timeoutMs)
{
var status = new HistorianConnectionStatus();
conn.GetConnectionStatus(ref status);
if (status.ConnectedToServer)
return conn;
if (status.ErrorOccurred)
{
conn.Dispose();
throw new InvalidOperationException(
$"Historian SDK connection failed: {status.Error}");
}
Thread.Sleep(250);
elapsed += 250;
}
conn.Dispose();
throw new TimeoutException(
$"Historian SDK connection to {config.ServerName}:{config.Port} timed out after {config.CommandTimeoutSeconds}s");
}
}
}

View File

@@ -0,0 +1,34 @@
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.Historian
{
/// <summary>
/// OPC-UA-free surface for the Wonderware Historian subsystem inside Galaxy.Host.
/// Implementations read via the aahClient* SDK; the Proxy side maps returned samples
/// to OPC UA <c>DataValue</c>.
/// </summary>
public interface IHistorianDataSource : IDisposable
{
Task<List<HistorianSample>> ReadRawAsync(
string tagName, DateTime startTime, DateTime endTime, int maxValues,
CancellationToken ct = default);
Task<List<HistorianAggregateSample>> ReadAggregateAsync(
string tagName, DateTime startTime, DateTime endTime,
double intervalMs, string aggregateColumn,
CancellationToken ct = default);
Task<List<HistorianSample>> ReadAtTimeAsync(
string tagName, DateTime[] timestamps,
CancellationToken ct = default);
Task<List<HistorianEventDto>> ReadEventsAsync(
string? sourceName, DateTime startTime, DateTime endTime, int maxEvents,
CancellationToken ct = default);
HistorianHealthSnapshot GetHealthSnapshot();
}
}

View File

@@ -0,0 +1,46 @@
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared.Contracts;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend;
/// <summary>
/// Galaxy data-plane abstraction. Replaces the placeholder <c>StubFrameHandler</c> with a
/// real boundary the lifted <c>MxAccessClient</c> + <c>GalaxyRepository</c> implement during
/// Phase 2 Task B.1. Splitting the IPC dispatch (<c>GalaxyFrameHandler</c>) from the
/// backend means the dispatcher is unit-testable against an in-memory mock without needing
/// live Galaxy.
/// </summary>
public interface IGalaxyBackend
{
/// <summary>
/// Server-pushed events the backend raises asynchronously (data-change, alarm,
/// host-status). The frame handler subscribes once on connect and forwards each
/// event to the Proxy as a typed <see cref="MessageKind"/> notification.
/// </summary>
event System.EventHandler<OnDataChangeNotification>? OnDataChange;
event System.EventHandler<GalaxyAlarmEvent>? OnAlarmEvent;
event System.EventHandler<HostConnectivityStatus>? OnHostStatusChanged;
Task<OpenSessionResponse> OpenSessionAsync(OpenSessionRequest req, CancellationToken ct);
Task CloseSessionAsync(CloseSessionRequest req, CancellationToken ct);
Task<DiscoverHierarchyResponse> DiscoverAsync(DiscoverHierarchyRequest req, CancellationToken ct);
Task<ReadValuesResponse> ReadValuesAsync(ReadValuesRequest req, CancellationToken ct);
Task<WriteValuesResponse> WriteValuesAsync(WriteValuesRequest req, CancellationToken ct);
Task<SubscribeResponse> SubscribeAsync(SubscribeRequest req, CancellationToken ct);
Task UnsubscribeAsync(UnsubscribeRequest req, CancellationToken ct);
Task SubscribeAlarmsAsync(AlarmSubscribeRequest req, CancellationToken ct);
Task AcknowledgeAlarmAsync(AlarmAckRequest req, CancellationToken ct);
Task<HistoryReadResponse> HistoryReadAsync(HistoryReadRequest req, CancellationToken ct);
Task<HistoryReadProcessedResponse> HistoryReadProcessedAsync(HistoryReadProcessedRequest req, CancellationToken ct);
Task<HistoryReadAtTimeResponse> HistoryReadAtTimeAsync(HistoryReadAtTimeRequest req, CancellationToken ct);
Task<HistoryReadEventsResponse> HistoryReadEventsAsync(HistoryReadEventsRequest req, CancellationToken ct);
Task<RecycleStatusResponse> RecycleAsync(RecycleHostRequest req, CancellationToken ct);
}

View File

@@ -0,0 +1,43 @@
using ArchestrA.MxAccess;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.MxAccess;
/// <summary>
/// Delegate matching <c>LMXProxyServer.OnDataChange</c> COM event signature. Allows
/// <see cref="MxAccessClient"/> to subscribe via the abstracted <see cref="IMxProxy"/>
/// instead of the COM object directly (so the test mock works without MXAccess registered).
/// </summary>
public delegate void MxDataChangeHandler(
int hLMXServerHandle,
int phItemHandle,
object pvItemValue,
int pwItemQuality,
object pftItemTimeStamp,
ref MXSTATUS_PROXY[] ItemStatus);
public delegate void MxWriteCompleteHandler(
int hLMXServerHandle,
int phItemHandle,
ref MXSTATUS_PROXY[] ItemStatus);
/// <summary>
/// Abstraction over <c>LMXProxyServer</c> — port of v1 <c>IMxProxy</c>. Same surface area
/// so the lifted client behaves identically; only the namespace + apartment-marshalling
/// entry-point change.
/// </summary>
public interface IMxProxy
{
int Register(string clientName);
void Unregister(int handle);
int AddItem(int handle, string address);
void RemoveItem(int handle, int itemHandle);
void AdviseSupervisory(int handle, int itemHandle);
void UnAdviseSupervisory(int handle, int itemHandle);
void Write(int handle, int itemHandle, object value, int securityClassification);
event MxDataChangeHandler? OnDataChange;
event MxWriteCompleteHandler? OnWriteComplete;
}

View File

@@ -0,0 +1,408 @@
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
using ArchestrA.MxAccess;
using Serilog;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Sta;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.MxAccess;
/// <summary>
/// MXAccess runtime client — focused port of v1 <c>MxAccessClient</c>. Owns one
/// <c>LMXProxyServer</c> COM connection on the supplied <see cref="StaPump"/>; serializes
/// read / write / subscribe through the pump because all COM calls must run on the STA
/// thread. Subscriptions are stored so they can be replayed on reconnect (full reconnect
/// loop is the deferred-but-non-blocking refinement; this version covers connect/read/write
/// /subscribe/unsubscribe — the MVP needed for parity testing).
/// </summary>
public sealed class MxAccessClient : IDisposable
{
private static readonly ILogger Log = Serilog.Log.ForContext<MxAccessClient>();
private readonly StaPump _pump;
private readonly IMxProxy _proxy;
private readonly string _clientName;
private readonly MxAccessClientOptions _options;
// Galaxy attribute reference → MXAccess item handle (set on first Subscribe/Read).
private readonly ConcurrentDictionary<string, int> _addressToHandle = new(StringComparer.OrdinalIgnoreCase);
private readonly ConcurrentDictionary<int, string> _handleToAddress = new();
private readonly ConcurrentDictionary<string, Action<string, Vtq>> _subscriptions =
new(StringComparer.OrdinalIgnoreCase);
private readonly ConcurrentDictionary<int, TaskCompletionSource<bool>> _pendingWrites = new();
private int _connectionHandle;
private bool _connected;
private DateTime _lastObservedActivityUtc = DateTime.UtcNow;
private CancellationTokenSource? _monitorCts;
private int _reconnectCount;
private bool _disposed;
/// <summary>Fires whenever the connection transitions Connected ↔ Disconnected.</summary>
public event EventHandler<bool>? ConnectionStateChanged;
/// <summary>
/// Fires once per failed subscription replay after a reconnect. Carries the tag reference
/// and the exception so the backend can propagate the degradation signal (e.g. mark the
/// subscription bad on the Proxy side rather than silently losing its callback). Added for
/// PR 6 low finding #2 — the replay loop previously ate per-tag failures silently and an
/// operator would only find out that a specific subscription stopped updating through a
/// data-quality complaint from downstream.
/// </summary>
public event EventHandler<SubscriptionReplayFailedEventArgs>? SubscriptionReplayFailed;
public MxAccessClient(StaPump pump, IMxProxy proxy, string clientName, MxAccessClientOptions? options = null)
{
_pump = pump;
_proxy = proxy;
_clientName = clientName;
_options = options ?? new MxAccessClientOptions();
_proxy.OnDataChange += OnDataChange;
_proxy.OnWriteComplete += OnWriteComplete;
}
public bool IsConnected => _connected;
public int SubscriptionCount => _subscriptions.Count;
public int ReconnectCount => _reconnectCount;
/// <summary>
/// Wonderware client identity used when registering with the LMXProxyServer. Surfaced so
/// <see cref="Backend.MxAccessGalaxyBackend"/> can tag its <c>OnHostStatusChanged</c> IPC
/// pushes with a stable gateway name per PR 8.
/// </summary>
public string ClientName => _clientName;
/// <summary>Connects on the STA thread. Idempotent. Starts the reconnect monitor on first call.</summary>
public async Task<int> ConnectAsync()
{
var handle = await _pump.InvokeAsync(() =>
{
if (_connected) return _connectionHandle;
_connectionHandle = _proxy.Register(_clientName);
_connected = true;
return _connectionHandle;
});
ConnectionStateChanged?.Invoke(this, true);
if (_options.AutoReconnect && _monitorCts is null)
{
_monitorCts = new CancellationTokenSource();
_ = Task.Run(() => MonitorLoopAsync(_monitorCts.Token));
}
return handle;
}
public async Task DisconnectAsync()
{
_monitorCts?.Cancel();
_monitorCts = null;
await _pump.InvokeAsync(() =>
{
if (!_connected) return;
try { _proxy.Unregister(_connectionHandle); }
finally
{
_connected = false;
_addressToHandle.Clear();
_handleToAddress.Clear();
}
});
ConnectionStateChanged?.Invoke(this, false);
}
/// <summary>
/// Background loop that watches for connection liveness signals and triggers
/// reconnect-with-replay when the connection appears dead. Per Phase 2 high finding #2:
/// v1's MxAccessClient.Monitor pattern lifted into the new pump-based client. Uses
/// observed-activity timestamp + optional probe-tag subscription. Without an explicit
/// probe tag, falls back to "no data change in N seconds + no successful read in N
/// seconds = unhealthy" — same shape as v1.
/// </summary>
private async Task MonitorLoopAsync(CancellationToken ct)
{
while (!ct.IsCancellationRequested)
{
try { await Task.Delay(_options.MonitorInterval, ct); }
catch (OperationCanceledException) { break; }
if (!_connected || _disposed) continue;
var idle = DateTime.UtcNow - _lastObservedActivityUtc;
if (idle <= _options.StaleThreshold) continue;
// Probe: try a no-op COM call. If the proxy is dead, the call will throw — that's
// our reconnect signal. PR 6 low finding #1: AddItem allocates an MXAccess item
// handle; we must RemoveItem it on the same pump turn or the long-running monitor
// leaks one handle per probe cycle (one every MonitorInterval seconds, indefinitely).
bool probeOk;
try
{
probeOk = await _pump.InvokeAsync(() =>
{
int probeHandle = 0;
try
{
probeHandle = _proxy.AddItem(_connectionHandle, "$Heartbeat");
return probeHandle > 0;
}
catch { return false; }
finally
{
if (probeHandle > 0)
{
try { _proxy.RemoveItem(_connectionHandle, probeHandle); }
catch { /* proxy is dying; best-effort cleanup */ }
}
}
});
}
catch { probeOk = false; }
if (probeOk)
{
_lastObservedActivityUtc = DateTime.UtcNow;
continue;
}
// Connection appears dead — reconnect-with-replay.
try
{
await _pump.InvokeAsync(() =>
{
try { _proxy.Unregister(_connectionHandle); } catch { /* dead anyway */ }
_connected = false;
});
ConnectionStateChanged?.Invoke(this, false);
await _pump.InvokeAsync(() =>
{
_connectionHandle = _proxy.Register(_clientName);
_connected = true;
});
_reconnectCount++;
ConnectionStateChanged?.Invoke(this, true);
// Replay every subscription that was active before the disconnect. PR 6 low
// finding #2: surface per-tag failures — log them and raise
// SubscriptionReplayFailed so the backend can propagate the degraded state
// (previously swallowed silently; downstream quality dropped without a signal).
var snapshot = _addressToHandle.Keys.ToArray();
_addressToHandle.Clear();
_handleToAddress.Clear();
var failed = 0;
foreach (var fullRef in snapshot)
{
try { await SubscribeOnPumpAsync(fullRef); }
catch (Exception subEx)
{
failed++;
Log.Warning(subEx,
"MXAccess subscription replay failed for {TagReference} after reconnect #{Reconnect}",
fullRef, _reconnectCount);
SubscriptionReplayFailed?.Invoke(this,
new SubscriptionReplayFailedEventArgs(fullRef, subEx));
}
}
if (failed > 0)
Log.Warning("Subscription replay completed — {Failed} of {Total} failed", failed, snapshot.Length);
else
Log.Information("Subscription replay completed — {Total} re-subscribed cleanly", snapshot.Length);
_lastObservedActivityUtc = DateTime.UtcNow;
}
catch
{
// Reconnect failed; back off and retry on the next tick.
_connected = false;
}
}
}
/// <summary>
/// One-shot read implemented as a transient subscribe + unsubscribe.
/// <c>LMXProxyServer</c> doesn't expose a synchronous read, so the canonical pattern
/// (lifted from v1) is to subscribe, await the first OnDataChange, then unsubscribe.
/// This method captures that single value.
/// </summary>
public async Task<Vtq> ReadAsync(string fullReference, TimeSpan timeout, CancellationToken ct)
{
if (!_connected) throw new InvalidOperationException("MxAccessClient not connected");
var tcs = new TaskCompletionSource<Vtq>(TaskCreationOptions.RunContinuationsAsynchronously);
Action<string, Vtq> oneShot = (_, value) => tcs.TrySetResult(value);
// Stash the one-shot handler before sending the subscribe, then remove it after firing.
_subscriptions.AddOrUpdate(fullReference, oneShot, (_, existing) => Combine(existing, oneShot));
var addedToReadOnlyAttribute = !_addressToHandle.ContainsKey(fullReference);
try
{
await SubscribeOnPumpAsync(fullReference);
using var _ = ct.Register(() => tcs.TrySetCanceled());
var raceTask = await Task.WhenAny(tcs.Task, Task.Delay(timeout, ct));
if (raceTask != tcs.Task) throw new TimeoutException($"MXAccess read of {fullReference} timed out after {timeout}");
return await tcs.Task;
}
finally
{
// High 1 — always detach the one-shot handler, even on cancellation/timeout/throw.
// If we were the one who added the underlying MXAccess subscription (no other
// caller had it), tear it down too so we don't leak a probe item handle.
_subscriptions.AddOrUpdate(fullReference, _ => default!, (_, existing) => Remove(existing, oneShot));
if (addedToReadOnlyAttribute)
{
try { await UnsubscribeAsync(fullReference); }
catch { /* shutdown-best-effort */ }
}
}
}
/// <summary>
/// Writes <paramref name="value"/> to the runtime and AWAITS the OnWriteComplete
/// callback so the caller learns the actual write status. Per Phase 2 medium finding #4
/// in <c>exit-gate-phase-2.md</c>: the previous fire-and-forget version returned a
/// false-positive Good even when the runtime rejected the write post-callback.
/// </summary>
public async Task<bool> WriteAsync(string fullReference, object value,
int securityClassification = 0, TimeSpan? timeout = null)
{
if (!_connected) throw new InvalidOperationException("MxAccessClient not connected");
var actualTimeout = timeout ?? TimeSpan.FromSeconds(5);
var itemHandle = await _pump.InvokeAsync(() => ResolveItem(fullReference));
var tcs = new TaskCompletionSource<bool>(TaskCreationOptions.RunContinuationsAsynchronously);
if (!_pendingWrites.TryAdd(itemHandle, tcs))
{
// A prior write to the same item handle is still pending — uncommon but possible
// if the caller spammed writes. Replace it: the older TCS observes a Cancelled task.
if (_pendingWrites.TryRemove(itemHandle, out var prior))
prior.TrySetCanceled();
_pendingWrites[itemHandle] = tcs;
}
try
{
await _pump.InvokeAsync(() =>
_proxy.Write(_connectionHandle, itemHandle, value, securityClassification));
var raceTask = await Task.WhenAny(tcs.Task, Task.Delay(actualTimeout));
if (raceTask != tcs.Task)
throw new TimeoutException($"MXAccess write of {fullReference} timed out after {actualTimeout}");
return await tcs.Task;
}
finally
{
_pendingWrites.TryRemove(itemHandle, out _);
}
}
public async Task SubscribeAsync(string fullReference, Action<string, Vtq> callback)
{
if (!_connected) throw new InvalidOperationException("MxAccessClient not connected");
_subscriptions.AddOrUpdate(fullReference, callback, (_, existing) => Combine(existing, callback));
await SubscribeOnPumpAsync(fullReference);
}
public Task UnsubscribeAsync(string fullReference) => _pump.InvokeAsync(() =>
{
if (!_connected) return;
if (!_addressToHandle.TryRemove(fullReference, out var handle)) return;
_handleToAddress.TryRemove(handle, out _);
_subscriptions.TryRemove(fullReference, out _);
try
{
_proxy.UnAdviseSupervisory(_connectionHandle, handle);
_proxy.RemoveItem(_connectionHandle, handle);
}
catch { /* best-effort during teardown */ }
});
private Task<int> SubscribeOnPumpAsync(string fullReference) => _pump.InvokeAsync(() =>
{
if (_addressToHandle.TryGetValue(fullReference, out var existing)) return existing;
var itemHandle = _proxy.AddItem(_connectionHandle, fullReference);
_addressToHandle[fullReference] = itemHandle;
_handleToAddress[itemHandle] = fullReference;
_proxy.AdviseSupervisory(_connectionHandle, itemHandle);
return itemHandle;
});
private int ResolveItem(string fullReference)
{
if (_addressToHandle.TryGetValue(fullReference, out var existing)) return existing;
var itemHandle = _proxy.AddItem(_connectionHandle, fullReference);
_addressToHandle[fullReference] = itemHandle;
_handleToAddress[itemHandle] = fullReference;
return itemHandle;
}
private void OnDataChange(int hLMXServerHandle, int phItemHandle, object pvItemValue,
int pwItemQuality, object pftItemTimeStamp, ref MXSTATUS_PROXY[] itemStatus)
{
if (!_handleToAddress.TryGetValue(phItemHandle, out var fullRef)) return;
// Liveness: any data-change event is proof the connection is alive.
_lastObservedActivityUtc = DateTime.UtcNow;
var ts = pftItemTimeStamp is DateTime dt ? dt.ToUniversalTime() : DateTime.UtcNow;
var quality = (byte)Math.Min(255, Math.Max(0, pwItemQuality));
var vtq = new Vtq(pvItemValue, ts, quality);
if (_subscriptions.TryGetValue(fullRef, out var cb)) cb?.Invoke(fullRef, vtq);
}
private void OnWriteComplete(int hLMXServerHandle, int phItemHandle, ref MXSTATUS_PROXY[] itemStatus)
{
if (_pendingWrites.TryRemove(phItemHandle, out var tcs))
tcs.TrySetResult(itemStatus is null || itemStatus.Length == 0 || itemStatus[0].success != 0);
}
private static Action<string, Vtq> Combine(Action<string, Vtq> a, Action<string, Vtq> b)
=> (Action<string, Vtq>)Delegate.Combine(a, b)!;
private static Action<string, Vtq> Remove(Action<string, Vtq> source, Action<string, Vtq> remove)
=> (Action<string, Vtq>?)Delegate.Remove(source, remove) ?? ((_, _) => { });
public void Dispose()
{
_disposed = true;
_monitorCts?.Cancel();
try { DisconnectAsync().GetAwaiter().GetResult(); }
catch { /* swallow */ }
_proxy.OnDataChange -= OnDataChange;
_proxy.OnWriteComplete -= OnWriteComplete;
_monitorCts?.Dispose();
}
}
/// <summary>
/// Tunables for <see cref="MxAccessClient"/>'s reconnect monitor. Defaults match the v1
/// monitor's polling cadence so behavior is consistent across the lift.
/// </summary>
public sealed class MxAccessClientOptions
{
/// <summary>Whether to start the background monitor at connect time.</summary>
public bool AutoReconnect { get; init; } = true;
/// <summary>How often the monitor wakes up to check liveness.</summary>
public TimeSpan MonitorInterval { get; init; } = TimeSpan.FromSeconds(5);
/// <summary>If no data-change activity in this window, the monitor probes the connection.</summary>
public TimeSpan StaleThreshold { get; init; } = TimeSpan.FromSeconds(60);
}

View File

@@ -0,0 +1,68 @@
using System;
using System.Runtime.InteropServices;
using ArchestrA.MxAccess;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.MxAccess;
/// <summary>
/// Concrete <see cref="IMxProxy"/> backed by a real <c>LMXProxyServer</c> COM object.
/// Port of v1 <c>MxProxyAdapter</c>. <strong>Must only be constructed on an STA thread</strong>
/// — the StaPump owns this instance.
/// </summary>
public sealed class MxProxyAdapter : IMxProxy, IDisposable
{
private LMXProxyServer? _lmxProxy;
public event MxDataChangeHandler? OnDataChange;
public event MxWriteCompleteHandler? OnWriteComplete;
public int Register(string clientName)
{
_lmxProxy = new LMXProxyServer();
_lmxProxy.OnDataChange += ProxyOnDataChange;
_lmxProxy.OnWriteComplete += ProxyOnWriteComplete;
var handle = _lmxProxy.Register(clientName);
if (handle <= 0)
throw new InvalidOperationException($"LMXProxyServer.Register returned invalid handle: {handle}");
return handle;
}
public void Unregister(int handle)
{
if (_lmxProxy is null) return;
try
{
_lmxProxy.OnDataChange -= ProxyOnDataChange;
_lmxProxy.OnWriteComplete -= ProxyOnWriteComplete;
_lmxProxy.Unregister(handle);
}
finally
{
// ReleaseComObject loop until refcount = 0 — the Tier C SafeHandle wraps this in
// production; here the lifetime is owned by the surrounding MxAccessHandle.
while (Marshal.IsComObject(_lmxProxy) && Marshal.ReleaseComObject(_lmxProxy) > 0) { }
_lmxProxy = null;
}
}
public int AddItem(int handle, string address) => _lmxProxy!.AddItem(handle, address);
public void RemoveItem(int handle, int itemHandle) => _lmxProxy!.RemoveItem(handle, itemHandle);
public void AdviseSupervisory(int handle, int itemHandle) => _lmxProxy!.AdviseSupervisory(handle, itemHandle);
public void UnAdviseSupervisory(int handle, int itemHandle) => _lmxProxy!.UnAdvise(handle, itemHandle);
public void Write(int handle, int itemHandle, object value, int securityClassification) =>
_lmxProxy!.Write(handle, itemHandle, value, securityClassification);
private void ProxyOnDataChange(int hLMXServerHandle, int phItemHandle, object pvItemValue,
int pwItemQuality, object pftItemTimeStamp, ref MXSTATUS_PROXY[] ItemStatus)
=> OnDataChange?.Invoke(hLMXServerHandle, phItemHandle, pvItemValue, pwItemQuality, pftItemTimeStamp, ref ItemStatus);
private void ProxyOnWriteComplete(int hLMXServerHandle, int phItemHandle, ref MXSTATUS_PROXY[] ItemStatus)
=> OnWriteComplete?.Invoke(hLMXServerHandle, phItemHandle, ref ItemStatus);
public void Dispose() => Unregister(0);
}

View File

@@ -0,0 +1,20 @@
using System;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.MxAccess;
/// <summary>
/// Fired by <see cref="MxAccessClient.SubscriptionReplayFailed"/> when a previously-active
/// subscription fails to be restored after a reconnect. The backend should treat the tag as
/// unhealthy until the next successful resubscribe.
/// </summary>
public sealed class SubscriptionReplayFailedEventArgs : EventArgs
{
public SubscriptionReplayFailedEventArgs(string tagReference, Exception exception)
{
TagReference = tagReference;
Exception = exception;
}
public string TagReference { get; }
public Exception Exception { get; }
}

View File

@@ -0,0 +1,24 @@
using System;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.MxAccess;
/// <summary>Value-timestamp-quality triplet — port of v1 <c>Vtq</c>.</summary>
public readonly struct Vtq
{
public object? Value { get; }
public DateTime TimestampUtc { get; }
public byte Quality { get; }
public Vtq(object? value, DateTime timestampUtc, byte quality)
{
Value = value;
TimestampUtc = timestampUtc;
Quality = quality;
}
/// <summary>OPC DA Good = 192.</summary>
public static Vtq Good(object? v) => new(v, DateTime.UtcNow, 192);
/// <summary>OPC DA Bad = 0.</summary>
public static Vtq Bad() => new(null, DateTime.UtcNow, 0);
}

View File

@@ -0,0 +1,530 @@
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
using MessagePack;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.Galaxy;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.Historian;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.MxAccess;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.Stability;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared.Contracts;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend;
/// <summary>
/// Production <see cref="IGalaxyBackend"/> — combines the SQL-backed
/// <see cref="GalaxyRepository"/> for Discover with the live MXAccess
/// <see cref="MxAccessClient"/> for Read / Write / Subscribe. History stays bad-coded
/// until the Wonderware Historian SDK plugin loader (Task B.1.h) lands. Alarms come from
/// MxAccess <c>AlarmExtension</c> primitives but the wire-up is also Phase 2 follow-up
/// (the v1 alarm subsystem is its own subtree).
/// </summary>
public sealed class MxAccessGalaxyBackend : IGalaxyBackend, IDisposable
{
private readonly GalaxyRepository _repository;
private readonly MxAccessClient _mx;
private readonly IHistorianDataSource? _historian;
private long _nextSessionId;
private long _nextSubscriptionId;
// Active SubscriptionId → MXAccess full reference list — so Unsubscribe can find them.
private readonly System.Collections.Concurrent.ConcurrentDictionary<long, IReadOnlyList<string>> _subs = new();
// Reverse lookup: tag reference → subscription IDs subscribed to it (one tag may belong to many).
private readonly System.Collections.Concurrent.ConcurrentDictionary<string, System.Collections.Concurrent.ConcurrentBag<long>>
_refToSubs = new(System.StringComparer.OrdinalIgnoreCase);
public event System.EventHandler<OnDataChangeNotification>? OnDataChange;
#pragma warning disable CS0067 // alarm wire-up deferred to PR 9
public event System.EventHandler<GalaxyAlarmEvent>? OnAlarmEvent;
#pragma warning restore CS0067
public event System.EventHandler<HostConnectivityStatus>? OnHostStatusChanged;
private readonly System.EventHandler<bool> _onConnectionStateChanged;
private readonly GalaxyRuntimeProbeManager _probeManager;
private readonly System.EventHandler<HostStateTransition> _onProbeStateChanged;
public MxAccessGalaxyBackend(GalaxyRepository repository, MxAccessClient mx, IHistorianDataSource? historian = null)
{
_repository = repository;
_mx = mx;
_historian = historian;
// PR 8: gateway-level host-status push. When the MXAccess COM proxy transitions
// connected↔disconnected, raise OnHostStatusChanged with a synthetic host entry named
// after the Wonderware client identity so the Admin UI surfaces top-level transport
// health even before per-platform/per-engine probing lands (deferred to a later PR that
// ports v1's GalaxyRuntimeProbeManager with ScanState subscriptions).
_onConnectionStateChanged = (_, connected) =>
{
OnHostStatusChanged?.Invoke(this, new HostConnectivityStatus
{
HostName = _mx.ClientName,
RuntimeStatus = connected ? "Running" : "Stopped",
LastObservedUtcUnixMs = DateTimeOffset.UtcNow.ToUnixTimeMilliseconds(),
});
};
_mx.ConnectionStateChanged += _onConnectionStateChanged;
// PR 13: per-platform runtime probes. ScanState subscriptions fire OnProbeCallback,
// which runs the state machine and raises StateChanged on transitions we care about.
// We forward each transition through the same OnHostStatusChanged IPC event that the
// gateway-level ConnectionStateChanged uses — tagged with the platform's TagName so the
// Admin UI can show per-host health independently from the top-level transport status.
_probeManager = new GalaxyRuntimeProbeManager(
subscribe: (probe, cb) => _mx.SubscribeAsync(probe, cb),
unsubscribe: probe => _mx.UnsubscribeAsync(probe));
_onProbeStateChanged = (_, t) =>
{
OnHostStatusChanged?.Invoke(this, new HostConnectivityStatus
{
HostName = t.TagName,
RuntimeStatus = t.NewState switch
{
HostRuntimeState.Running => "Running",
HostRuntimeState.Stopped => "Stopped",
_ => "Unknown",
},
LastObservedUtcUnixMs = new DateTimeOffset(t.AtUtc, TimeSpan.Zero).ToUnixTimeMilliseconds(),
});
};
_probeManager.StateChanged += _onProbeStateChanged;
}
/// <summary>
/// Exposed for tests. Production flow: DiscoverAsync completes → backend calls
/// <c>SyncProbesAsync</c> with the runtime hosts (WinPlatform + AppEngine gobjects) to
/// advise ScanState per host.
/// </summary>
internal GalaxyRuntimeProbeManager ProbeManager => _probeManager;
public async Task<OpenSessionResponse> OpenSessionAsync(OpenSessionRequest req, CancellationToken ct)
{
try
{
await _mx.ConnectAsync();
return new OpenSessionResponse { Success = true, SessionId = Interlocked.Increment(ref _nextSessionId) };
}
catch (Exception ex)
{
return new OpenSessionResponse { Success = false, Error = $"MXAccess connect failed: {ex.Message}" };
}
}
public async Task CloseSessionAsync(CloseSessionRequest req, CancellationToken ct)
{
await _mx.DisconnectAsync();
}
public async Task<DiscoverHierarchyResponse> DiscoverAsync(DiscoverHierarchyRequest req, CancellationToken ct)
{
try
{
var hierarchy = await _repository.GetHierarchyAsync(ct).ConfigureAwait(false);
var attributes = await _repository.GetAttributesAsync(ct).ConfigureAwait(false);
var attrsByGobject = attributes
.GroupBy(a => a.GobjectId)
.ToDictionary(g => g.Key, g => g.Select(MapAttribute).ToArray());
var nameByGobject = hierarchy.ToDictionary(o => o.GobjectId, o => o.TagName);
var objects = hierarchy.Select(o => new GalaxyObjectInfo
{
ContainedName = string.IsNullOrEmpty(o.ContainedName) ? o.TagName : o.ContainedName,
TagName = o.TagName,
ParentContainedName = o.ParentGobjectId != 0 && nameByGobject.TryGetValue(o.ParentGobjectId, out var p) ? p : null,
TemplateCategory = MapCategory(o.CategoryId),
Attributes = attrsByGobject.TryGetValue(o.GobjectId, out var a) ? a : Array.Empty<GalaxyAttributeInfo>(),
}).ToArray();
// PR 13: Sync the per-platform probe manager against the just-discovered hierarchy
// so ScanState subscriptions track the current runtime set. Best-effort — probe
// failures don't block Discover from returning, since the gateway-level signal from
// MxAccessClient.ConnectionStateChanged still flows and the Admin UI degrades to
// that level if any per-host probe couldn't advise.
try
{
var targets = hierarchy
.Where(o => o.CategoryId == GalaxyRuntimeProbeManager.CategoryWinPlatform
|| o.CategoryId == GalaxyRuntimeProbeManager.CategoryAppEngine)
.Select(o => new HostProbeTarget(o.TagName, o.CategoryId));
await _probeManager.SyncAsync(targets).ConfigureAwait(false);
}
catch { /* swallow — Discover succeeded; probes are a diagnostic enrichment */ }
return new DiscoverHierarchyResponse { Success = true, Objects = objects };
}
catch (Exception ex)
{
return new DiscoverHierarchyResponse { Success = false, Error = ex.Message, Objects = Array.Empty<GalaxyObjectInfo>() };
}
}
public async Task<ReadValuesResponse> ReadValuesAsync(ReadValuesRequest req, CancellationToken ct)
{
if (!_mx.IsConnected) return new ReadValuesResponse { Success = false, Error = "Not connected", Values = Array.Empty<GalaxyDataValue>() };
var results = new List<GalaxyDataValue>(req.TagReferences.Length);
foreach (var reference in req.TagReferences)
{
try
{
var vtq = await _mx.ReadAsync(reference, TimeSpan.FromSeconds(5), ct);
results.Add(ToWire(reference, vtq));
}
catch (Exception ex)
{
results.Add(new GalaxyDataValue
{
TagReference = reference,
StatusCode = 0x80020000u, // Bad_InternalError
ServerTimestampUtcUnixMs = DateTimeOffset.UtcNow.ToUnixTimeMilliseconds(),
ValueBytes = MessagePackSerializer.Serialize(ex.Message),
});
}
}
return new ReadValuesResponse { Success = true, Values = results.ToArray() };
}
public async Task<WriteValuesResponse> WriteValuesAsync(WriteValuesRequest req, CancellationToken ct)
{
var results = new List<WriteValueResult>(req.Writes.Length);
foreach (var w in req.Writes)
{
try
{
// Decode the value back from the MessagePack bytes the Proxy sent.
var value = w.ValueBytes is null
? null
: MessagePackSerializer.Deserialize<object>(w.ValueBytes);
var ok = await _mx.WriteAsync(w.TagReference, value!);
results.Add(new WriteValueResult
{
TagReference = w.TagReference,
StatusCode = ok ? 0u : 0x80020000u, // Good or Bad_InternalError
Error = ok ? null : "MXAccess runtime reported write failure",
});
}
catch (Exception ex)
{
results.Add(new WriteValueResult { TagReference = w.TagReference, StatusCode = 0x80020000u, Error = ex.Message });
}
}
return new WriteValuesResponse { Results = results.ToArray() };
}
public async Task<SubscribeResponse> SubscribeAsync(SubscribeRequest req, CancellationToken ct)
{
var sid = Interlocked.Increment(ref _nextSubscriptionId);
try
{
foreach (var tag in req.TagReferences)
{
_refToSubs.AddOrUpdate(tag,
_ => new System.Collections.Concurrent.ConcurrentBag<long> { sid },
(_, bag) => { bag.Add(sid); return bag; });
// The MXAccess SubscribeAsync only takes one callback per tag; the same callback
// fires for every active subscription of that tag — we fan out by SubscriptionId.
await _mx.SubscribeAsync(tag, OnTagValueChanged);
}
_subs[sid] = req.TagReferences;
return new SubscribeResponse { Success = true, SubscriptionId = sid, ActualIntervalMs = req.RequestedIntervalMs };
}
catch (Exception ex)
{
return new SubscribeResponse { Success = false, Error = ex.Message };
}
}
public async Task UnsubscribeAsync(UnsubscribeRequest req, CancellationToken ct)
{
if (!_subs.TryRemove(req.SubscriptionId, out var refs)) return;
foreach (var r in refs)
{
// Drop this subscription from the reverse map; only unsubscribe from MXAccess if no
// other subscription is still listening (multiple Proxy subs may share a tag).
_refToSubs.TryGetValue(r, out var bag);
if (bag is not null)
{
var remaining = new System.Collections.Concurrent.ConcurrentBag<long>(
bag.Where(id => id != req.SubscriptionId));
if (remaining.IsEmpty)
{
_refToSubs.TryRemove(r, out _);
await _mx.UnsubscribeAsync(r);
}
else
{
_refToSubs[r] = remaining;
}
}
}
}
/// <summary>
/// Fires for every value change on any subscribed Galaxy attribute. Wraps the value in
/// a <see cref="GalaxyDataValue"/> and raises <see cref="OnDataChange"/> once per
/// subscription that includes this tag — the IPC sink translates that into outbound
/// <c>OnDataChangeNotification</c> frames.
/// </summary>
private void OnTagValueChanged(string fullReference, MxAccess.Vtq vtq)
{
if (!_refToSubs.TryGetValue(fullReference, out var bag) || bag.IsEmpty) return;
var wireValue = ToWire(fullReference, vtq);
// Emit one notification per active SubscriptionId for this tag — the Proxy fans out to
// each ISubscribable consumer based on the SubscriptionId in the payload.
foreach (var sid in bag.Distinct())
{
OnDataChange?.Invoke(this, new OnDataChangeNotification
{
SubscriptionId = sid,
Values = new[] { wireValue },
});
}
}
public Task SubscribeAlarmsAsync(AlarmSubscribeRequest req, CancellationToken ct) => Task.CompletedTask;
public Task AcknowledgeAlarmAsync(AlarmAckRequest req, CancellationToken ct) => Task.CompletedTask;
public async Task<HistoryReadResponse> HistoryReadAsync(HistoryReadRequest req, CancellationToken ct)
{
if (_historian is null)
return new HistoryReadResponse
{
Success = false,
Error = "Historian disabled — no OTOPCUA_HISTORIAN_ENABLED configuration",
Tags = Array.Empty<HistoryTagValues>(),
};
var start = DateTimeOffset.FromUnixTimeMilliseconds(req.StartUtcUnixMs).UtcDateTime;
var end = DateTimeOffset.FromUnixTimeMilliseconds(req.EndUtcUnixMs).UtcDateTime;
var tags = new List<HistoryTagValues>(req.TagReferences.Length);
try
{
foreach (var reference in req.TagReferences)
{
var samples = await _historian.ReadRawAsync(reference, start, end, (int)req.MaxValuesPerTag, ct).ConfigureAwait(false);
tags.Add(new HistoryTagValues
{
TagReference = reference,
Values = samples.Select(s => ToWire(reference, s)).ToArray(),
});
}
return new HistoryReadResponse { Success = true, Tags = tags.ToArray() };
}
catch (OperationCanceledException) { throw; }
catch (Exception ex)
{
return new HistoryReadResponse
{
Success = false,
Error = $"Historian read failed: {ex.Message}",
Tags = tags.ToArray(),
};
}
}
public async Task<HistoryReadProcessedResponse> HistoryReadProcessedAsync(
HistoryReadProcessedRequest req, CancellationToken ct)
{
if (_historian is null)
return new HistoryReadProcessedResponse
{
Success = false,
Error = "Historian disabled — no OTOPCUA_HISTORIAN_ENABLED configuration",
Values = Array.Empty<GalaxyDataValue>(),
};
if (req.IntervalMs <= 0)
return new HistoryReadProcessedResponse
{
Success = false,
Error = "HistoryReadProcessed requires IntervalMs > 0",
Values = Array.Empty<GalaxyDataValue>(),
};
var start = DateTimeOffset.FromUnixTimeMilliseconds(req.StartUtcUnixMs).UtcDateTime;
var end = DateTimeOffset.FromUnixTimeMilliseconds(req.EndUtcUnixMs).UtcDateTime;
try
{
var samples = await _historian.ReadAggregateAsync(
req.TagReference, start, end, req.IntervalMs, req.AggregateColumn, ct).ConfigureAwait(false);
var wire = samples.Select(s => ToWire(req.TagReference, s)).ToArray();
return new HistoryReadProcessedResponse { Success = true, Values = wire };
}
catch (OperationCanceledException) { throw; }
catch (Exception ex)
{
return new HistoryReadProcessedResponse
{
Success = false,
Error = $"Historian aggregate read failed: {ex.Message}",
Values = Array.Empty<GalaxyDataValue>(),
};
}
}
public async Task<HistoryReadAtTimeResponse> HistoryReadAtTimeAsync(
HistoryReadAtTimeRequest req, CancellationToken ct)
{
if (_historian is null)
return new HistoryReadAtTimeResponse
{
Success = false,
Error = "Historian disabled — no OTOPCUA_HISTORIAN_ENABLED configuration",
Values = Array.Empty<GalaxyDataValue>(),
};
if (req.TimestampsUtcUnixMs.Length == 0)
return new HistoryReadAtTimeResponse { Success = true, Values = Array.Empty<GalaxyDataValue>() };
var timestamps = req.TimestampsUtcUnixMs
.Select(ms => DateTimeOffset.FromUnixTimeMilliseconds(ms).UtcDateTime)
.ToArray();
try
{
var samples = await _historian.ReadAtTimeAsync(req.TagReference, timestamps, ct).ConfigureAwait(false);
var wire = samples.Select(s => ToWire(req.TagReference, s)).ToArray();
return new HistoryReadAtTimeResponse { Success = true, Values = wire };
}
catch (OperationCanceledException) { throw; }
catch (Exception ex)
{
return new HistoryReadAtTimeResponse
{
Success = false,
Error = $"Historian at-time read failed: {ex.Message}",
Values = Array.Empty<GalaxyDataValue>(),
};
}
}
public async Task<HistoryReadEventsResponse> HistoryReadEventsAsync(
HistoryReadEventsRequest req, CancellationToken ct)
{
if (_historian is null)
return new HistoryReadEventsResponse
{
Success = false,
Error = "Historian disabled — no OTOPCUA_HISTORIAN_ENABLED configuration",
Events = Array.Empty<GalaxyHistoricalEvent>(),
};
var start = DateTimeOffset.FromUnixTimeMilliseconds(req.StartUtcUnixMs).UtcDateTime;
var end = DateTimeOffset.FromUnixTimeMilliseconds(req.EndUtcUnixMs).UtcDateTime;
try
{
var events = await _historian.ReadEventsAsync(req.SourceName, start, end, req.MaxEvents, ct).ConfigureAwait(false);
var wire = events.Select(e => new GalaxyHistoricalEvent
{
EventId = e.Id.ToString(),
SourceName = e.Source,
EventTimeUtcUnixMs = new DateTimeOffset(DateTime.SpecifyKind(e.EventTime, DateTimeKind.Utc), TimeSpan.Zero).ToUnixTimeMilliseconds(),
ReceivedTimeUtcUnixMs = new DateTimeOffset(DateTime.SpecifyKind(e.ReceivedTime, DateTimeKind.Utc), TimeSpan.Zero).ToUnixTimeMilliseconds(),
DisplayText = e.DisplayText,
Severity = e.Severity,
}).ToArray();
return new HistoryReadEventsResponse { Success = true, Events = wire };
}
catch (OperationCanceledException) { throw; }
catch (Exception ex)
{
return new HistoryReadEventsResponse
{
Success = false,
Error = $"Historian event read failed: {ex.Message}",
Events = Array.Empty<GalaxyHistoricalEvent>(),
};
}
}
public Task<RecycleStatusResponse> RecycleAsync(RecycleHostRequest req, CancellationToken ct)
=> Task.FromResult(new RecycleStatusResponse { Accepted = true, GraceSeconds = 15 });
public void Dispose()
{
_probeManager.StateChanged -= _onProbeStateChanged;
_probeManager.Dispose();
_mx.ConnectionStateChanged -= _onConnectionStateChanged;
_historian?.Dispose();
}
private static GalaxyDataValue ToWire(string reference, Vtq vtq) => new()
{
TagReference = reference,
ValueBytes = vtq.Value is null ? null : MessagePackSerializer.Serialize(vtq.Value),
ValueMessagePackType = 0,
StatusCode = vtq.Quality >= 192 ? 0u : 0x40000000u, // Good vs Uncertain placeholder
SourceTimestampUtcUnixMs = new DateTimeOffset(vtq.TimestampUtc, TimeSpan.Zero).ToUnixTimeMilliseconds(),
ServerTimestampUtcUnixMs = DateTimeOffset.UtcNow.ToUnixTimeMilliseconds(),
};
/// <summary>
/// Maps a <see cref="HistorianSample"/> (raw historian row, OPC-UA-free) to the IPC wire
/// shape. The Proxy decodes the MessagePack value and maps <see cref="HistorianSample.Quality"/>
/// through <c>QualityMapper</c> on its side of the pipe — we keep the raw byte here so
/// rich OPC DA status codes (e.g. <c>BadNotConnected</c>, <c>UncertainSubNormal</c>) survive
/// the hop intact.
/// </summary>
private static GalaxyDataValue ToWire(string reference, HistorianSample sample) => new()
{
TagReference = reference,
ValueBytes = sample.Value is null ? null : MessagePackSerializer.Serialize(sample.Value),
ValueMessagePackType = 0,
StatusCode = HistorianQualityMapper.Map(sample.Quality),
SourceTimestampUtcUnixMs = new DateTimeOffset(sample.TimestampUtc, TimeSpan.Zero).ToUnixTimeMilliseconds(),
ServerTimestampUtcUnixMs = DateTimeOffset.UtcNow.ToUnixTimeMilliseconds(),
};
/// <summary>
/// Maps a <see cref="HistorianAggregateSample"/> (one aggregate bucket) to the IPC wire
/// shape. A null <see cref="HistorianAggregateSample.Value"/> means the aggregate was
/// unavailable for the bucket — the Proxy translates that to OPC UA <c>BadNoData</c>.
/// </summary>
private static GalaxyDataValue ToWire(string reference, HistorianAggregateSample sample) => new()
{
TagReference = reference,
ValueBytes = sample.Value is null ? null : MessagePackSerializer.Serialize(sample.Value.Value),
ValueMessagePackType = 0,
StatusCode = sample.Value is null ? 0x800E0000u /* BadNoData */ : 0x00000000u,
SourceTimestampUtcUnixMs = new DateTimeOffset(sample.TimestampUtc, TimeSpan.Zero).ToUnixTimeMilliseconds(),
ServerTimestampUtcUnixMs = DateTimeOffset.UtcNow.ToUnixTimeMilliseconds(),
};
private static GalaxyAttributeInfo MapAttribute(GalaxyAttributeRow row) => new()
{
AttributeName = row.AttributeName,
MxDataType = row.MxDataType,
IsArray = row.IsArray,
ArrayDim = row.ArrayDimension is int d and > 0 ? (uint)d : null,
SecurityClassification = row.SecurityClassification,
IsHistorized = row.IsHistorized,
IsAlarm = row.IsAlarm,
};
private static string MapCategory(int categoryId) => categoryId switch
{
1 => "$WinPlatform",
3 => "$AppEngine",
4 => "$Area",
10 => "$UserDefined",
11 => "$ApplicationObject",
13 => "$Area",
17 => "$DeviceIntegration",
24 => "$ViewEngine",
26 => "$ViewApp",
_ => $"category-{categoryId}",
};
}

View File

@@ -0,0 +1,273 @@
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.MxAccess;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.Stability;
/// <summary>
/// Per-platform + per-AppEngine runtime probe. Subscribes to <c>&lt;TagName&gt;.ScanState</c>
/// for each $WinPlatform and $AppEngine gobject, tracks Unknown → Running → Stopped
/// transitions, and fires <see cref="StateChanged"/> so <see cref="Backend.MxAccessGalaxyBackend"/>
/// can forward per-host events through the existing IPC <c>OnHostStatusChanged</c> event.
/// Pure-logic state machine with an injected clock so it's deterministically testable —
/// port of v1 <c>GalaxyRuntimeProbeManager</c> without the OPC UA node-manager coupling.
/// </summary>
/// <remarks>
/// State machine rules (documented in v1's <c>runtimestatus.md</c> and preserved here):
/// <list type="bullet">
/// <item><c>ScanState</c> is on-change-only — a stably-Running host may go hours without a
/// callback. Running → Stopped is driven by an explicit <c>ScanState=false</c> callback,
/// never by starvation.</item>
/// <item>Unknown → Running is a startup transition and does NOT fire StateChanged (would
/// paint every host as "just recovered" at startup, which is noise).</item>
/// <item>Stopped → Running and Running → Stopped fire StateChanged. Unknown → Stopped
/// fires StateChanged because that's a first-known-bad signal operators need.</item>
/// <item>All public methods are thread-safe. Callbacks fire outside the internal lock to
/// avoid lock inversion with caller-owned state.</item>
/// </list>
/// </remarks>
public sealed class GalaxyRuntimeProbeManager : IDisposable
{
public const int CategoryWinPlatform = 1;
public const int CategoryAppEngine = 3;
public const string ProbeAttribute = ".ScanState";
private readonly Func<DateTime> _clock;
private readonly Func<string, Action<string, Vtq>, Task> _subscribe;
private readonly Func<string, Task> _unsubscribe;
private readonly object _lock = new();
// probe tag → per-host state
private readonly Dictionary<string, HostProbeState> _byProbe = new(StringComparer.OrdinalIgnoreCase);
// tag name → probe tag (for reverse lookup on the desired-set diff)
private readonly Dictionary<string, string> _probeByTagName = new(StringComparer.OrdinalIgnoreCase);
private bool _disposed;
/// <summary>
/// Fires on every state transition that operators should react to. See class remarks
/// for the rules on which transitions fire.
/// </summary>
public event EventHandler<HostStateTransition>? StateChanged;
public GalaxyRuntimeProbeManager(
Func<string, Action<string, Vtq>, Task> subscribe,
Func<string, Task> unsubscribe)
: this(subscribe, unsubscribe, () => DateTime.UtcNow) { }
internal GalaxyRuntimeProbeManager(
Func<string, Action<string, Vtq>, Task> subscribe,
Func<string, Task> unsubscribe,
Func<DateTime> clock)
{
_subscribe = subscribe ?? throw new ArgumentNullException(nameof(subscribe));
_unsubscribe = unsubscribe ?? throw new ArgumentNullException(nameof(unsubscribe));
_clock = clock ?? throw new ArgumentNullException(nameof(clock));
}
/// <summary>Number of probes currently advised. Test/dashboard hook.</summary>
public int ActiveProbeCount
{
get { lock (_lock) return _byProbe.Count; }
}
/// <summary>
/// Snapshot every currently-tracked host's state. One entry per probe.
/// </summary>
public IReadOnlyList<HostProbeSnapshot> SnapshotStates()
{
lock (_lock)
{
return _byProbe.Select(kv => new HostProbeSnapshot(
TagName: kv.Value.TagName,
State: kv.Value.State,
LastChangedUtc: kv.Value.LastStateChangeUtc)).ToList();
}
}
/// <summary>
/// Query the current runtime state for <paramref name="tagName"/>. Returns
/// <see cref="HostRuntimeState.Unknown"/> when the host is not tracked.
/// </summary>
public HostRuntimeState GetState(string tagName)
{
lock (_lock)
{
if (_probeByTagName.TryGetValue(tagName, out var probe)
&& _byProbe.TryGetValue(probe, out var state))
return state.State;
return HostRuntimeState.Unknown;
}
}
/// <summary>
/// Diff the desired host set (filtered $WinPlatform / $AppEngine from the latest Discover)
/// against the currently-tracked set and advise / unadvise as needed. Idempotent:
/// calling twice with the same set does nothing.
/// </summary>
public async Task SyncAsync(IEnumerable<HostProbeTarget> desiredHosts)
{
if (_disposed) return;
var desired = desiredHosts
.Where(h => !string.IsNullOrWhiteSpace(h.TagName))
.ToDictionary(h => h.TagName, StringComparer.OrdinalIgnoreCase);
List<string> toAdvise;
List<string> toUnadvise;
lock (_lock)
{
toAdvise = desired.Keys
.Where(tag => !_probeByTagName.ContainsKey(tag))
.ToList();
toUnadvise = _probeByTagName.Keys
.Where(tag => !desired.ContainsKey(tag))
.Select(tag => _probeByTagName[tag])
.ToList();
foreach (var tag in toAdvise)
{
var probe = tag + ProbeAttribute;
_probeByTagName[tag] = probe;
_byProbe[probe] = new HostProbeState
{
TagName = tag,
State = HostRuntimeState.Unknown,
LastStateChangeUtc = _clock(),
};
}
foreach (var probe in toUnadvise)
{
_byProbe.Remove(probe);
}
foreach (var removedTag in _probeByTagName.Keys.Where(t => !desired.ContainsKey(t)).ToList())
{
_probeByTagName.Remove(removedTag);
}
}
foreach (var tag in toAdvise)
{
var probe = tag + ProbeAttribute;
try
{
await _subscribe(probe, OnProbeCallback);
}
catch
{
// Rollback on subscribe failure so a later Tick can't transition a never-advised
// probe into a false Stopped state. Callers can re-Sync later to retry.
lock (_lock)
{
_byProbe.Remove(probe);
_probeByTagName.Remove(tag);
}
}
}
foreach (var probe in toUnadvise)
{
try { await _unsubscribe(probe); } catch { /* best-effort cleanup */ }
}
}
/// <summary>
/// Public entry point for tests and internal callbacks. Production flow: MxAccessClient's
/// SubscribeAsync delivers VTQ updates through the callback wired in <see cref="SyncAsync"/>,
/// which calls this method under the lock to update state and fires
/// <see cref="StateChanged"/> outside the lock for any transition that matters.
/// </summary>
public void OnProbeCallback(string probeTag, Vtq vtq)
{
if (_disposed) return;
HostStateTransition? transition = null;
lock (_lock)
{
if (!_byProbe.TryGetValue(probeTag, out var state)) return;
var isRunning = vtq.Quality >= 192 && vtq.Value is bool b && b;
var now = _clock();
var previous = state.State;
state.LastCallbackUtc = now;
if (isRunning)
{
state.GoodUpdateCount++;
if (previous != HostRuntimeState.Running)
{
state.State = HostRuntimeState.Running;
state.LastStateChangeUtc = now;
if (previous == HostRuntimeState.Stopped)
{
transition = new HostStateTransition(state.TagName, previous, HostRuntimeState.Running, now);
}
}
}
else
{
state.FailureCount++;
if (previous != HostRuntimeState.Stopped)
{
state.State = HostRuntimeState.Stopped;
state.LastStateChangeUtc = now;
transition = new HostStateTransition(state.TagName, previous, HostRuntimeState.Stopped, now);
}
}
}
if (transition is { } t)
{
StateChanged?.Invoke(this, t);
}
}
public void Dispose()
{
if (_disposed) return;
_disposed = true;
lock (_lock)
{
_byProbe.Clear();
_probeByTagName.Clear();
}
}
private sealed class HostProbeState
{
public string TagName { get; set; } = "";
public HostRuntimeState State { get; set; }
public DateTime LastStateChangeUtc { get; set; }
public DateTime? LastCallbackUtc { get; set; }
public long GoodUpdateCount { get; set; }
public long FailureCount { get; set; }
}
}
public enum HostRuntimeState
{
Unknown,
Running,
Stopped,
}
public sealed record HostStateTransition(
string TagName,
HostRuntimeState OldState,
HostRuntimeState NewState,
DateTime AtUtc);
public sealed record HostProbeSnapshot(
string TagName,
HostRuntimeState State,
DateTime LastChangedUtc);
public readonly record struct HostProbeTarget(string TagName, int CategoryId)
{
public bool IsRuntimeHost =>
CategoryId == GalaxyRuntimeProbeManager.CategoryWinPlatform
|| CategoryId == GalaxyRuntimeProbeManager.CategoryAppEngine;
}

View File

@@ -0,0 +1,121 @@
using System.Threading;
using System.Threading.Tasks;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared.Contracts;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend;
/// <summary>
/// Phase 2 placeholder backend — accepts session open/close + responds to recycle, returns
/// "not-implemented" results for every data-plane call. Replaced by the lifted
/// <c>MxAccessClient</c>-backed implementation during the deferred Galaxy code move
/// (Task B.1 + parity gate). Keeps the IPC end-to-end testable today.
/// </summary>
public sealed class StubGalaxyBackend : IGalaxyBackend
{
private long _nextSessionId;
private long _nextSubscriptionId;
// Stub backend never raises events — implements the interface members for symmetry.
#pragma warning disable CS0067
public event System.EventHandler<OnDataChangeNotification>? OnDataChange;
public event System.EventHandler<GalaxyAlarmEvent>? OnAlarmEvent;
public event System.EventHandler<HostConnectivityStatus>? OnHostStatusChanged;
#pragma warning restore CS0067
public Task<OpenSessionResponse> OpenSessionAsync(OpenSessionRequest req, CancellationToken ct)
{
var id = Interlocked.Increment(ref _nextSessionId);
return Task.FromResult(new OpenSessionResponse { Success = true, SessionId = id });
}
public Task CloseSessionAsync(CloseSessionRequest req, CancellationToken ct) => Task.CompletedTask;
public Task<DiscoverHierarchyResponse> DiscoverAsync(DiscoverHierarchyRequest req, CancellationToken ct)
=> Task.FromResult(new DiscoverHierarchyResponse
{
Success = false,
Error = "stub: MXAccess code lift pending (Phase 2 Task B.1)",
Objects = System.Array.Empty<GalaxyObjectInfo>(),
});
public Task<ReadValuesResponse> ReadValuesAsync(ReadValuesRequest req, CancellationToken ct)
=> Task.FromResult(new ReadValuesResponse
{
Success = false,
Error = "stub: MXAccess code lift pending (Phase 2 Task B.1)",
Values = System.Array.Empty<GalaxyDataValue>(),
});
public Task<WriteValuesResponse> WriteValuesAsync(WriteValuesRequest req, CancellationToken ct)
{
var results = new WriteValueResult[req.Writes.Length];
for (var i = 0; i < req.Writes.Length; i++)
{
results[i] = new WriteValueResult
{
TagReference = req.Writes[i].TagReference,
StatusCode = 0x80020000u, // Bad_InternalError
Error = "stub: MXAccess code lift pending (Phase 2 Task B.1)",
};
}
return Task.FromResult(new WriteValuesResponse { Results = results });
}
public Task<SubscribeResponse> SubscribeAsync(SubscribeRequest req, CancellationToken ct)
{
var sid = Interlocked.Increment(ref _nextSubscriptionId);
return Task.FromResult(new SubscribeResponse
{
Success = true,
SubscriptionId = sid,
ActualIntervalMs = req.RequestedIntervalMs,
});
}
public Task UnsubscribeAsync(UnsubscribeRequest req, CancellationToken ct) => Task.CompletedTask;
public Task SubscribeAlarmsAsync(AlarmSubscribeRequest req, CancellationToken ct) => Task.CompletedTask;
public Task AcknowledgeAlarmAsync(AlarmAckRequest req, CancellationToken ct) => Task.CompletedTask;
public Task<HistoryReadResponse> HistoryReadAsync(HistoryReadRequest req, CancellationToken ct)
=> Task.FromResult(new HistoryReadResponse
{
Success = false,
Error = "stub: MXAccess code lift pending (Phase 2 Task B.1)",
Tags = System.Array.Empty<HistoryTagValues>(),
});
public Task<HistoryReadProcessedResponse> HistoryReadProcessedAsync(
HistoryReadProcessedRequest req, CancellationToken ct)
=> Task.FromResult(new HistoryReadProcessedResponse
{
Success = false,
Error = "stub: MXAccess code lift pending (Phase 2 Task B.1)",
Values = System.Array.Empty<GalaxyDataValue>(),
});
public Task<HistoryReadAtTimeResponse> HistoryReadAtTimeAsync(
HistoryReadAtTimeRequest req, CancellationToken ct)
=> Task.FromResult(new HistoryReadAtTimeResponse
{
Success = false,
Error = "stub: MXAccess code lift pending (Phase 2 Task B.1)",
Values = System.Array.Empty<GalaxyDataValue>(),
});
public Task<HistoryReadEventsResponse> HistoryReadEventsAsync(
HistoryReadEventsRequest req, CancellationToken ct)
=> Task.FromResult(new HistoryReadEventsResponse
{
Success = false,
Error = "stub: MXAccess code lift pending (Phase 2 Task B.1)",
Events = System.Array.Empty<GalaxyHistoricalEvent>(),
});
public Task<RecycleStatusResponse> RecycleAsync(RecycleHostRequest req, CancellationToken ct)
=> Task.FromResult(new RecycleStatusResponse
{
Accepted = true,
GraceSeconds = 15, // matches Phase 2 plan §B.8 default
});
}

View File

@@ -0,0 +1,183 @@
using System;
using System.Threading;
using System.Threading.Tasks;
using MessagePack;
using Serilog;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared.Contracts;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Ipc;
/// <summary>
/// Real IPC dispatcher — routes each <see cref="MessageKind"/> to the matching
/// <see cref="IGalaxyBackend"/> method. Replaces <see cref="StubFrameHandler"/>. Heartbeat
/// stays handled inline so liveness detection works regardless of backend health.
/// </summary>
public sealed class GalaxyFrameHandler(IGalaxyBackend backend, ILogger logger) : IFrameHandler
{
public async Task HandleAsync(MessageKind kind, byte[] body, FrameWriter writer, CancellationToken ct)
{
try
{
switch (kind)
{
case MessageKind.Heartbeat:
{
var hb = Deserialize<Heartbeat>(body);
await writer.WriteAsync(MessageKind.HeartbeatAck,
new HeartbeatAck { SequenceNumber = hb.SequenceNumber, UtcUnixMs = hb.UtcUnixMs }, ct);
return;
}
case MessageKind.OpenSessionRequest:
{
var resp = await backend.OpenSessionAsync(Deserialize<OpenSessionRequest>(body), ct);
await writer.WriteAsync(MessageKind.OpenSessionResponse, resp, ct);
return;
}
case MessageKind.CloseSessionRequest:
await backend.CloseSessionAsync(Deserialize<CloseSessionRequest>(body), ct);
return; // one-way
case MessageKind.DiscoverHierarchyRequest:
{
var resp = await backend.DiscoverAsync(Deserialize<DiscoverHierarchyRequest>(body), ct);
await writer.WriteAsync(MessageKind.DiscoverHierarchyResponse, resp, ct);
return;
}
case MessageKind.ReadValuesRequest:
{
var resp = await backend.ReadValuesAsync(Deserialize<ReadValuesRequest>(body), ct);
await writer.WriteAsync(MessageKind.ReadValuesResponse, resp, ct);
return;
}
case MessageKind.WriteValuesRequest:
{
var resp = await backend.WriteValuesAsync(Deserialize<WriteValuesRequest>(body), ct);
await writer.WriteAsync(MessageKind.WriteValuesResponse, resp, ct);
return;
}
case MessageKind.SubscribeRequest:
{
var resp = await backend.SubscribeAsync(Deserialize<SubscribeRequest>(body), ct);
await writer.WriteAsync(MessageKind.SubscribeResponse, resp, ct);
return;
}
case MessageKind.UnsubscribeRequest:
await backend.UnsubscribeAsync(Deserialize<UnsubscribeRequest>(body), ct);
return; // one-way
case MessageKind.AlarmSubscribeRequest:
await backend.SubscribeAlarmsAsync(Deserialize<AlarmSubscribeRequest>(body), ct);
return; // one-way; subsequent alarm events are server-pushed
case MessageKind.AlarmAckRequest:
await backend.AcknowledgeAlarmAsync(Deserialize<AlarmAckRequest>(body), ct);
return;
case MessageKind.HistoryReadRequest:
{
var resp = await backend.HistoryReadAsync(Deserialize<HistoryReadRequest>(body), ct);
await writer.WriteAsync(MessageKind.HistoryReadResponse, resp, ct);
return;
}
case MessageKind.HistoryReadProcessedRequest:
{
var resp = await backend.HistoryReadProcessedAsync(
Deserialize<HistoryReadProcessedRequest>(body), ct);
await writer.WriteAsync(MessageKind.HistoryReadProcessedResponse, resp, ct);
return;
}
case MessageKind.HistoryReadAtTimeRequest:
{
var resp = await backend.HistoryReadAtTimeAsync(
Deserialize<HistoryReadAtTimeRequest>(body), ct);
await writer.WriteAsync(MessageKind.HistoryReadAtTimeResponse, resp, ct);
return;
}
case MessageKind.HistoryReadEventsRequest:
{
var resp = await backend.HistoryReadEventsAsync(
Deserialize<HistoryReadEventsRequest>(body), ct);
await writer.WriteAsync(MessageKind.HistoryReadEventsResponse, resp, ct);
return;
}
case MessageKind.RecycleHostRequest:
{
var resp = await backend.RecycleAsync(Deserialize<RecycleHostRequest>(body), ct);
await writer.WriteAsync(MessageKind.RecycleStatusResponse, resp, ct);
return;
}
default:
await SendErrorAsync(writer, "unknown-kind", $"Frame kind {kind} not handled by Host", ct);
return;
}
}
catch (OperationCanceledException) { throw; }
catch (Exception ex)
{
logger.Error(ex, "GalaxyFrameHandler threw on {Kind}", kind);
await SendErrorAsync(writer, "handler-exception", ex.Message, ct);
}
}
/// <summary>
/// Subscribes the backend's server-pushed events for the lifetime of the connection.
/// The returned disposable unsubscribes when the connection closes — without it the
/// backend's static event invocation list would accumulate dead writer references and
/// leak memory + raise <see cref="ObjectDisposedException"/> on every push.
/// </summary>
public IDisposable AttachConnection(FrameWriter writer)
{
var sink = new ConnectionSink(backend, writer, logger);
sink.Attach();
return sink;
}
private static T Deserialize<T>(byte[] body) => MessagePackSerializer.Deserialize<T>(body);
private static Task SendErrorAsync(FrameWriter writer, string code, string message, CancellationToken ct)
=> writer.WriteAsync(MessageKind.ErrorResponse,
new ErrorResponse { Code = code, Message = message }, ct);
private sealed class ConnectionSink : IDisposable
{
private readonly IGalaxyBackend _backend;
private readonly FrameWriter _writer;
private readonly ILogger _logger;
private EventHandler<OnDataChangeNotification>? _onData;
private EventHandler<GalaxyAlarmEvent>? _onAlarm;
private EventHandler<HostConnectivityStatus>? _onHost;
public ConnectionSink(IGalaxyBackend backend, FrameWriter writer, ILogger logger)
{
_backend = backend; _writer = writer; _logger = logger;
}
public void Attach()
{
_onData = (_, e) => Push(MessageKind.OnDataChangeNotification, e);
_onAlarm = (_, e) => Push(MessageKind.AlarmEvent, e);
_onHost = (_, e) => Push(MessageKind.RuntimeStatusChange,
new RuntimeStatusChangeNotification { Status = e });
_backend.OnDataChange += _onData;
_backend.OnAlarmEvent += _onAlarm;
_backend.OnHostStatusChanged += _onHost;
}
private void Push<T>(MessageKind kind, T payload)
{
// Fire-and-forget — pushes can race with disposal of the writer. We swallow
// ObjectDisposedException because the dispose path will detach this sink shortly.
try { _writer.WriteAsync(kind, payload, CancellationToken.None).GetAwaiter().GetResult(); }
catch (ObjectDisposedException) { }
catch (Exception ex) { _logger.Warning(ex, "ConnectionSink push failed for {Kind}", kind); }
}
public void Dispose()
{
if (_onData is not null) _backend.OnDataChange -= _onData;
if (_onAlarm is not null) _backend.OnAlarmEvent -= _onAlarm;
if (_onHost is not null) _backend.OnHostStatusChanged -= _onHost;
}
}
}

View File

@@ -98,6 +98,8 @@ public sealed class PipeServer : IDisposable
new HelloAck { Accepted = true, HostName = Environment.MachineName },
linked.Token).ConfigureAwait(false);
using var attachment = handler.AttachConnection(writer);
while (!linked.Token.IsCancellationRequested)
{
var frame = await reader.ReadFrameAsync(linked.Token).ConfigureAwait(false);
@@ -157,4 +159,19 @@ public sealed class PipeServer : IDisposable
public interface IFrameHandler
{
Task HandleAsync(MessageKind kind, byte[] body, FrameWriter writer, CancellationToken ct);
/// <summary>
/// Called once per accepted connection after the Hello handshake. Lets the handler
/// attach server-pushed event sinks (data-change, alarm, host-status) to the
/// connection's <paramref name="writer"/>. Returns an <see cref="IDisposable"/> the
/// pipe server disposes when the connection closes — backends use it to unsubscribe.
/// Implementations that don't push events can return <see cref="NoopAttachment"/>.
/// </summary>
IDisposable AttachConnection(FrameWriter writer);
public sealed class NoopAttachment : IDisposable
{
public static readonly NoopAttachment Instance = new();
public void Dispose() { }
}
}

View File

@@ -1,3 +1,4 @@
using System;
using System.Threading;
using System.Threading.Tasks;
using MessagePack;
@@ -27,4 +28,6 @@ public sealed class StubFrameHandler : IFrameHandler
new ErrorResponse { Code = "not-implemented", Message = $"Kind {kind} is stubbed — MXAccess lift deferred" },
ct);
}
public IDisposable AttachConnection(FrameWriter writer) => IFrameHandler.NoopAttachment.Instance;
}

View File

@@ -2,7 +2,12 @@ using System;
using System.Security.Principal;
using System.Threading;
using Serilog;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.Galaxy;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.Historian;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.MxAccess;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Ipc;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Sta;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host;
@@ -38,8 +43,47 @@ public static class Program
Log.Information("OtOpcUaGalaxyHost starting — pipe={Pipe} allowedSid={Sid}", pipeName, allowedSidValue);
var handler = new StubFrameHandler();
server.RunAsync(handler, cts.Token).GetAwaiter().GetResult();
// Backend selection — env var picks the implementation:
// OTOPCUA_GALAXY_BACKEND=stub → StubGalaxyBackend (no Galaxy required)
// OTOPCUA_GALAXY_BACKEND=db → DbBackedGalaxyBackend (Discover only, against ZB)
// OTOPCUA_GALAXY_BACKEND=mxaccess → MxAccessGalaxyBackend (real COM + ZB; default)
var backendKind = Environment.GetEnvironmentVariable("OTOPCUA_GALAXY_BACKEND")?.ToLowerInvariant() ?? "mxaccess";
var zbConn = Environment.GetEnvironmentVariable("OTOPCUA_GALAXY_ZB_CONN")
?? "Server=localhost;Database=ZB;Integrated Security=True;TrustServerCertificate=True;Encrypt=False;";
var clientName = Environment.GetEnvironmentVariable("OTOPCUA_GALAXY_CLIENT_NAME") ?? "OtOpcUa-Galaxy.Host";
IGalaxyBackend backend;
StaPump? pump = null;
MxAccessClient? mx = null;
switch (backendKind)
{
case "stub":
backend = new StubGalaxyBackend();
break;
case "db":
backend = new DbBackedGalaxyBackend(new GalaxyRepository(new GalaxyRepositoryOptions { ConnectionString = zbConn }));
break;
default: // mxaccess
pump = new StaPump("Galaxy.Sta");
pump.WaitForStartedAsync().GetAwaiter().GetResult();
mx = new MxAccessClient(pump, new MxProxyAdapter(), clientName);
var historian = BuildHistorianIfEnabled();
backend = new MxAccessGalaxyBackend(
new GalaxyRepository(new GalaxyRepositoryOptions { ConnectionString = zbConn }),
mx,
historian);
break;
}
Log.Information("OtOpcUaGalaxyHost backend={Backend}", backendKind);
var handler = new GalaxyFrameHandler(backend, Log.Logger);
try { server.RunAsync(handler, cts.Token).GetAwaiter().GetResult(); }
finally
{
(backend as IDisposable)?.Dispose();
mx?.Dispose();
pump?.Dispose();
}
Log.Information("OtOpcUaGalaxyHost stopped cleanly");
return 0;
@@ -51,4 +95,45 @@ public static class Program
}
finally { Log.CloseAndFlush(); }
}
/// <summary>
/// Builds a <see cref="HistorianDataSource"/> from the OTOPCUA_HISTORIAN_* environment
/// variables the supervisor passes at spawn time. Returns null when the historian is
/// disabled (default) so <c>MxAccessGalaxyBackend.HistoryReadAsync</c> returns a clear
/// "not configured" error instead of attempting an SDK connection to localhost.
/// </summary>
private static IHistorianDataSource? BuildHistorianIfEnabled()
{
var enabled = Environment.GetEnvironmentVariable("OTOPCUA_HISTORIAN_ENABLED");
if (!string.Equals(enabled, "true", StringComparison.OrdinalIgnoreCase) && enabled != "1")
return null;
var cfg = new HistorianConfiguration
{
Enabled = true,
ServerName = Environment.GetEnvironmentVariable("OTOPCUA_HISTORIAN_SERVER") ?? "localhost",
Port = TryParseInt("OTOPCUA_HISTORIAN_PORT", 32568),
IntegratedSecurity = !string.Equals(Environment.GetEnvironmentVariable("OTOPCUA_HISTORIAN_INTEGRATED"), "false", StringComparison.OrdinalIgnoreCase),
UserName = Environment.GetEnvironmentVariable("OTOPCUA_HISTORIAN_USER"),
Password = Environment.GetEnvironmentVariable("OTOPCUA_HISTORIAN_PASS"),
CommandTimeoutSeconds = TryParseInt("OTOPCUA_HISTORIAN_TIMEOUT_SEC", 30),
MaxValuesPerRead = TryParseInt("OTOPCUA_HISTORIAN_MAX_VALUES", 10000),
FailureCooldownSeconds = TryParseInt("OTOPCUA_HISTORIAN_COOLDOWN_SEC", 60),
};
var servers = Environment.GetEnvironmentVariable("OTOPCUA_HISTORIAN_SERVERS");
if (!string.IsNullOrWhiteSpace(servers))
cfg.ServerNames = new System.Collections.Generic.List<string>(
servers.Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries));
Log.Information("Historian enabled — {NodeCount} configured node(s), port={Port}",
cfg.ServerNames.Count > 0 ? cfg.ServerNames.Count : 1, cfg.Port);
return new HistorianDataSource(cfg);
}
private static int TryParseInt(string envName, int defaultValue)
{
var raw = Environment.GetEnvironmentVariable(envName);
return int.TryParse(raw, out var parsed) ? parsed : defaultValue;
}
}

View File

@@ -1,31 +1,37 @@
using System;
using System.Collections.Concurrent;
using System.Runtime.InteropServices;
using System.Threading;
using System.Threading.Tasks;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Sta;
/// <summary>
/// Dedicated STA thread that owns all <c>LMXProxyServer</c> COM instances. Work items are
/// posted from any thread and dispatched on the STA. Per <c>driver-stability.md</c> Galaxy
/// deep dive §"STA thread + Win32 message pump".
/// Dedicated STA thread with a Win32 message pump that owns all <c>LMXProxyServer</c> COM
/// instances. Lifted from v1 <c>StaComThread</c> per CLAUDE.md "Reference Implementation".
/// Per <c>driver-stability.md</c> Galaxy deep dive §"STA thread + Win32 message pump":
/// work items dispatched via <c>PostThreadMessage(WM_APP)</c>; <c>WM_APP+1</c> requests a
/// graceful drain → <c>WM_QUIT</c>; supervisor escalates to <c>Environment.Exit(2)</c> if the
/// pump doesn't drain within the recycle grace window.
/// </summary>
/// <remarks>
/// Phase 2 scaffold: uses a <see cref="BlockingCollection{T}"/> dispatcher instead of the real
/// Win32 <c>GetMessage/DispatchMessage</c> pump. Real pump arrives when the v1 <c>StaComThread</c>
/// is lifted — that's part of the deferred Galaxy code move. The apartment state and work
/// dispatch semantics are identical so production code can be swapped in without changes.
/// </remarks>
public sealed class StaPump : IDisposable
{
private const uint WM_APP = 0x8000;
private const uint WM_DRAIN_AND_QUIT = WM_APP + 1;
private const uint PM_NOREMOVE = 0x0000;
private readonly Thread _thread;
private readonly BlockingCollection<Action> _workQueue = new(new ConcurrentQueue<Action>());
private readonly ConcurrentQueue<WorkItem> _workItems = new();
private readonly TaskCompletionSource<bool> _started = new(TaskCreationOptions.RunContinuationsAsynchronously);
private volatile uint _nativeThreadId;
private volatile bool _pumpExited;
private volatile bool _disposed;
public int ThreadId => _thread.ManagedThreadId;
public DateTime LastDispatchedUtc { get; private set; } = DateTime.MinValue;
public int QueueDepth => _workQueue.Count;
public int QueueDepth => _workItems.Count;
public bool IsRunning => _nativeThreadId != 0 && !_disposed && !_pumpExited;
public StaPump(string name = "Galaxy.Sta")
{
@@ -40,24 +46,36 @@ public sealed class StaPump : IDisposable
public Task<T> InvokeAsync<T>(Func<T> work)
{
if (_disposed) throw new ObjectDisposedException(nameof(StaPump));
if (_pumpExited) throw new InvalidOperationException("STA pump has exited");
var tcs = new TaskCompletionSource<T>(TaskCreationOptions.RunContinuationsAsynchronously);
_workQueue.Add(() =>
_workItems.Enqueue(new WorkItem(
() =>
{
try { tcs.TrySetResult(work()); }
catch (Exception ex) { tcs.TrySetException(ex); }
},
ex => tcs.TrySetException(ex)));
if (!PostThreadMessage(_nativeThreadId, WM_APP, IntPtr.Zero, IntPtr.Zero))
{
try { tcs.SetResult(work()); }
catch (Exception ex) { tcs.SetException(ex); }
});
_pumpExited = true;
DrainAndFaultQueue();
}
return tcs.Task;
}
public Task InvokeAsync(Action work) => InvokeAsync(() => { work(); return 0; });
/// <summary>
/// Health probe — returns true if a no-op work item round-trips within <paramref name="timeout"/>.
/// Used by the supervisor; timeout means the pump is wedged and a recycle is warranted.
/// Health probe — returns true if a no-op work item round-trips within
/// <paramref name="timeout"/>. Used by the supervisor; timeout means the pump is wedged
/// and a recycle is warranted (Task B.2 acceptance).
/// </summary>
public async Task<bool> IsResponsiveAsync(TimeSpan timeout)
{
if (!IsRunning) return false;
var task = InvokeAsync(() => { });
var completed = await Task.WhenAny(task, Task.Delay(timeout)).ConfigureAwait(false);
return completed == task;
@@ -65,27 +83,124 @@ public sealed class StaPump : IDisposable
private void PumpLoop()
{
_started.TrySetResult(true);
try
{
while (!_disposed)
_nativeThreadId = GetCurrentThreadId();
// Force the system to create the thread message queue before we signal Started.
// PeekMessage(PM_NOREMOVE) on an empty queue is the documented way to do this.
PeekMessage(out _, IntPtr.Zero, 0, 0, PM_NOREMOVE);
_started.TrySetResult(true);
// GetMessage returns 0 on WM_QUIT, -1 on error, otherwise a positive value.
while (GetMessage(out var msg, IntPtr.Zero, 0, 0) > 0)
{
if (_workQueue.TryTake(out var work, Timeout.Infinite))
if (msg.message == WM_APP)
{
work();
LastDispatchedUtc = DateTime.UtcNow;
DrainQueue();
}
else if (msg.message == WM_DRAIN_AND_QUIT)
{
DrainQueue();
PostQuitMessage(0);
}
else
{
// Pass through any window/dialog messages the COM proxy may inject.
TranslateMessage(ref msg);
DispatchMessage(ref msg);
}
}
}
catch (InvalidOperationException) { /* CompleteAdding called during dispose */ }
catch (Exception ex)
{
_started.TrySetException(ex);
}
finally
{
_pumpExited = true;
DrainAndFaultQueue();
}
}
private void DrainQueue()
{
while (_workItems.TryDequeue(out var item))
{
item.Execute();
LastDispatchedUtc = DateTime.UtcNow;
}
}
private void DrainAndFaultQueue()
{
var ex = new InvalidOperationException("STA pump has exited");
while (_workItems.TryDequeue(out var item))
{
try { item.Fault(ex); }
catch { /* faulting a TCS shouldn't throw, but be defensive */ }
}
}
public void Dispose()
{
if (_disposed) return;
_disposed = true;
_workQueue.CompleteAdding();
_thread.Join(TimeSpan.FromSeconds(5));
_workQueue.Dispose();
try
{
if (_nativeThreadId != 0 && !_pumpExited)
PostThreadMessage(_nativeThreadId, WM_DRAIN_AND_QUIT, IntPtr.Zero, IntPtr.Zero);
_thread.Join(TimeSpan.FromSeconds(5));
}
catch { /* swallow — best effort */ }
DrainAndFaultQueue();
}
private sealed record WorkItem(Action Execute, Action<Exception> Fault);
#region Win32 P/Invoke
[StructLayout(LayoutKind.Sequential)]
private struct MSG
{
public IntPtr hwnd;
public uint message;
public IntPtr wParam;
public IntPtr lParam;
public uint time;
public POINT pt;
}
[StructLayout(LayoutKind.Sequential)]
private struct POINT { public int x; public int y; }
[DllImport("user32.dll")]
private static extern int GetMessage(out MSG lpMsg, IntPtr hWnd, uint wMsgFilterMin, uint wMsgFilterMax);
[DllImport("user32.dll")]
[return: MarshalAs(UnmanagedType.Bool)]
private static extern bool TranslateMessage(ref MSG lpMsg);
[DllImport("user32.dll")]
private static extern IntPtr DispatchMessage(ref MSG lpMsg);
[DllImport("user32.dll")]
[return: MarshalAs(UnmanagedType.Bool)]
private static extern bool PostThreadMessage(uint idThread, uint Msg, IntPtr wParam, IntPtr lParam);
[DllImport("user32.dll")]
private static extern void PostQuitMessage(int nExitCode);
[DllImport("user32.dll")]
[return: MarshalAs(UnmanagedType.Bool)]
private static extern bool PeekMessage(out MSG lpMsg, IntPtr hWnd, uint wMsgFilterMin, uint wMsgFilterMax,
uint wRemoveMsg);
[DllImport("kernel32.dll")]
private static extern uint GetCurrentThreadId();
#endregion
}

View File

@@ -3,10 +3,11 @@
<PropertyGroup>
<OutputType>Exe</OutputType>
<TargetFramework>net48</TargetFramework>
<!-- Decision #23: x86 required for MXAccess COM interop. Currently AnyCPU is OK because
the actual MXAccess code lift is deferred (it stays in the v1 Host until the Phase 2
parity gate); flip to x86 when Task B.1 "move Galaxy code" actually executes. -->
<PlatformTarget>AnyCPU</PlatformTarget>
<!-- Decision #23: x86 required for MXAccess COM interop. The MxAccess COM client is
now ported (Backend/MxAccess/) so we need the x86 platform target for the
ArchestrA.MxAccess.dll COM interop reference to resolve at runtime. -->
<PlatformTarget>x86</PlatformTarget>
<Prefer32Bit>true</Prefer32Bit>
<Nullable>enable</Nullable>
<LangVersion>latest</LangVersion>
<TreatWarningsAsErrors>true</TreatWarningsAsErrors>
@@ -20,6 +21,7 @@
<PackageReference Include="System.IO.Pipes.AccessControl" Version="5.0.0"/>
<PackageReference Include="System.Memory" Version="4.5.5"/>
<PackageReference Include="System.Threading.Tasks.Extensions" Version="4.5.4"/>
<PackageReference Include="System.Data.SqlClient" Version="4.9.0"/>
<PackageReference Include="Serilog" Version="4.2.0"/>
<PackageReference Include="Serilog.Sinks.File" Version="7.0.0"/>
</ItemGroup>
@@ -28,6 +30,45 @@
<ProjectReference Include="..\ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared\ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared.csproj"/>
</ItemGroup>
<ItemGroup>
<InternalsVisibleTo Include="ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Tests"/>
</ItemGroup>
<ItemGroup>
<Reference Include="ArchestrA.MxAccess">
<HintPath>..\..\lib\ArchestrA.MxAccess.dll</HintPath>
<Private>true</Private>
</Reference>
<!-- Wonderware Historian SDK — consumed by Backend/Historian/ for HistoryReadAsync.
Previously lived in the v1 Historian.Aveva plugin; folded into Driver.Galaxy.Host
for PR #5 because this host is already Galaxy-specific. -->
<Reference Include="aahClientManaged">
<HintPath>..\..\lib\aahClientManaged.dll</HintPath>
<EmbedInteropTypes>false</EmbedInteropTypes>
</Reference>
<Reference Include="aahClientCommon">
<HintPath>..\..\lib\aahClientCommon.dll</HintPath>
<EmbedInteropTypes>false</EmbedInteropTypes>
</Reference>
</ItemGroup>
<ItemGroup>
<!-- Historian SDK native and satellite DLLs — staged beside the host exe so the
aahClientManaged wrapper can P/Invoke into them without an AssemblyResolve hook. -->
<None Include="..\..\lib\aahClient.dll">
<CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
</None>
<None Include="..\..\lib\Historian.CBE.dll">
<CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
</None>
<None Include="..\..\lib\Historian.DPAPI.dll">
<CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
</None>
<None Include="..\..\lib\ArchestrA.CloudHistorian.Contract.dll">
<CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
</None>
</ItemGroup>
<ItemGroup>
<NuGetAuditSuppress Include="https://github.com/advisories/GHSA-37gx-xxp4-5rgx"/>
<NuGetAuditSuppress Include="https://github.com/advisories/GHSA-w3x6-4m5h-cxqf"/>

View File

@@ -1,25 +1,43 @@
using ZB.MOM.WW.OtOpcUa.Core.Abstractions;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy.Ipc;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared.Contracts;
using IpcHostConnectivityStatus = ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared.Contracts.HostConnectivityStatus;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy;
/// <summary>
/// <see cref="IDriver"/> implementation that forwards every capability over the Galaxy IPC
/// channel to the out-of-process Host. Implements <see cref="ITagDiscovery"/> as the
/// Phase 2 minimum; other capability interfaces (<see cref="IReadable"/>, etc.) will be wired
/// in once the Host's MXAccess code lift is complete and end-to-end parity tests run.
/// channel to the out-of-process Host. Implements the full Phase 2 capability surface;
/// bodies that depend on the deferred Host-side MXAccess code lift will surface
/// <see cref="GalaxyIpcException"/> with code <c>not-implemented</c> until the Host's
/// <c>IGalaxyBackend</c> is wired to the real <c>MxAccessClient</c>.
/// </summary>
public sealed class GalaxyProxyDriver(GalaxyProxyOptions options)
: IDriver, ITagDiscovery, IDisposable
: IDriver,
ITagDiscovery,
IReadable,
IWritable,
ISubscribable,
IAlarmSource,
IHistoryProvider,
IRediscoverable,
IHostConnectivityProbe,
IDisposable
{
private GalaxyIpcClient? _client;
private long _sessionId;
private DriverHealth _health = new(DriverState.Unknown, null, null);
private IReadOnlyList<Core.Abstractions.HostConnectivityStatus> _hostStatuses = [];
public string DriverInstanceId => options.DriverInstanceId;
public string DriverType => "Galaxy";
public event EventHandler<DataChangeEventArgs>? OnDataChange;
public event EventHandler<AlarmEventArgs>? OnAlarmEvent;
public event EventHandler<RediscoveryEventArgs>? OnRediscoveryNeeded;
public event EventHandler<HostStatusChangedEventArgs>? OnHostStatusChanged;
public async Task InitializeAsync(string driverConfigJson, CancellationToken cancellationToken)
{
_health = new DriverHealth(DriverState.Initializing, null, null);
@@ -59,9 +77,10 @@ public sealed class GalaxyProxyDriver(GalaxyProxyOptions options)
try
{
await _client.CallAsync<CloseSessionRequest, ErrorResponse>(
MessageKind.CloseSessionRequest, new CloseSessionRequest { SessionId = _sessionId },
MessageKind.ErrorResponse, cancellationToken);
await _client.SendOneWayAsync(
MessageKind.CloseSessionRequest,
new CloseSessionRequest { SessionId = _sessionId },
cancellationToken);
}
catch { /* shutdown is best effort */ }
@@ -71,17 +90,17 @@ public sealed class GalaxyProxyDriver(GalaxyProxyOptions options)
}
public DriverHealth GetHealth() => _health;
public long GetMemoryFootprint() => 0; // Tier C footprint is reported by the Host over IPC
public long GetMemoryFootprint() => 0;
public Task FlushOptionalCachesAsync(CancellationToken cancellationToken) => Task.CompletedTask;
// ---- ITagDiscovery ----
public async Task DiscoverAsync(IAddressSpaceBuilder builder, CancellationToken cancellationToken)
{
ArgumentNullException.ThrowIfNull(builder);
if (_client is null) throw new InvalidOperationException("Driver not initialized");
var client = RequireClient();
var resp = await _client.CallAsync<DiscoverHierarchyRequest, DiscoverHierarchyResponse>(
var resp = await client.CallAsync<DiscoverHierarchyRequest, DiscoverHierarchyResponse>(
MessageKind.DiscoverHierarchyRequest,
new DiscoverHierarchyRequest { SessionId = _sessionId },
MessageKind.DiscoverHierarchyResponse,
@@ -104,11 +123,291 @@ public sealed class GalaxyProxyDriver(GalaxyProxyOptions options)
IsArray: attr.IsArray,
ArrayDim: attr.ArrayDim,
SecurityClass: MapSecurity(attr.SecurityClassification),
IsHistorized: attr.IsHistorized));
IsHistorized: attr.IsHistorized,
IsAlarm: attr.IsAlarm));
}
}
}
// ---- IReadable ----
public async Task<IReadOnlyList<DataValueSnapshot>> ReadAsync(
IReadOnlyList<string> fullReferences, CancellationToken cancellationToken)
{
var client = RequireClient();
var resp = await client.CallAsync<ReadValuesRequest, ReadValuesResponse>(
MessageKind.ReadValuesRequest,
new ReadValuesRequest { SessionId = _sessionId, TagReferences = [.. fullReferences] },
MessageKind.ReadValuesResponse,
cancellationToken);
if (!resp.Success)
throw new InvalidOperationException($"Galaxy.Host ReadValues failed: {resp.Error}");
var byRef = resp.Values.ToDictionary(v => v.TagReference);
var result = new DataValueSnapshot[fullReferences.Count];
for (var i = 0; i < fullReferences.Count; i++)
{
result[i] = byRef.TryGetValue(fullReferences[i], out var v)
? ToSnapshot(v)
: new DataValueSnapshot(null, StatusBadInternalError, null, DateTime.UtcNow);
}
return result;
}
// ---- IWritable ----
public async Task<IReadOnlyList<WriteResult>> WriteAsync(
IReadOnlyList<WriteRequest> writes, CancellationToken cancellationToken)
{
var client = RequireClient();
var resp = await client.CallAsync<WriteValuesRequest, WriteValuesResponse>(
MessageKind.WriteValuesRequest,
new WriteValuesRequest
{
SessionId = _sessionId,
Writes = [.. writes.Select(FromWriteRequest)],
},
MessageKind.WriteValuesResponse,
cancellationToken);
return [.. resp.Results.Select(r => new WriteResult(r.StatusCode))];
}
// ---- ISubscribable ----
public async Task<ISubscriptionHandle> SubscribeAsync(
IReadOnlyList<string> fullReferences, TimeSpan publishingInterval, CancellationToken cancellationToken)
{
var client = RequireClient();
var resp = await client.CallAsync<SubscribeRequest, SubscribeResponse>(
MessageKind.SubscribeRequest,
new SubscribeRequest
{
SessionId = _sessionId,
TagReferences = [.. fullReferences],
RequestedIntervalMs = (int)publishingInterval.TotalMilliseconds,
},
MessageKind.SubscribeResponse,
cancellationToken);
if (!resp.Success)
throw new InvalidOperationException($"Galaxy.Host Subscribe failed: {resp.Error}");
return new GalaxySubscriptionHandle(resp.SubscriptionId);
}
public async Task UnsubscribeAsync(ISubscriptionHandle handle, CancellationToken cancellationToken)
{
var client = RequireClient();
var sid = ((GalaxySubscriptionHandle)handle).SubscriptionId;
await client.SendOneWayAsync(
MessageKind.UnsubscribeRequest,
new UnsubscribeRequest { SessionId = _sessionId, SubscriptionId = sid },
cancellationToken);
}
/// <summary>
/// Internal entry point used by the IPC client when the Host pushes an
/// <see cref="MessageKind.OnDataChangeNotification"/> frame. Surfaces it as a managed
/// <see cref="OnDataChange"/> event.
/// </summary>
internal void RaiseDataChange(OnDataChangeNotification notif)
{
var handle = new GalaxySubscriptionHandle(notif.SubscriptionId);
// ISubscribable.OnDataChange fires once per changed attribute — fan out the batch.
foreach (var v in notif.Values)
OnDataChange?.Invoke(this, new DataChangeEventArgs(handle, v.TagReference, ToSnapshot(v)));
}
// ---- IAlarmSource ----
public async Task<IAlarmSubscriptionHandle> SubscribeAlarmsAsync(
IReadOnlyList<string> sourceNodeIds, CancellationToken cancellationToken)
{
var client = RequireClient();
await client.SendOneWayAsync(
MessageKind.AlarmSubscribeRequest,
new AlarmSubscribeRequest { SessionId = _sessionId },
cancellationToken);
return new GalaxyAlarmSubscriptionHandle($"alarm-{_sessionId}");
}
public Task UnsubscribeAlarmsAsync(IAlarmSubscriptionHandle handle, CancellationToken cancellationToken)
=> Task.CompletedTask;
public async Task AcknowledgeAsync(
IReadOnlyList<AlarmAcknowledgeRequest> acknowledgements, CancellationToken cancellationToken)
{
var client = RequireClient();
foreach (var ack in acknowledgements)
{
await client.SendOneWayAsync(
MessageKind.AlarmAckRequest,
new AlarmAckRequest
{
SessionId = _sessionId,
EventId = ack.ConditionId,
Comment = ack.Comment ?? string.Empty,
},
cancellationToken);
}
}
internal void RaiseAlarmEvent(GalaxyAlarmEvent ev)
{
var handle = new GalaxyAlarmSubscriptionHandle($"alarm-{_sessionId}");
OnAlarmEvent?.Invoke(this, new AlarmEventArgs(
SubscriptionHandle: handle,
SourceNodeId: ev.ObjectTagName,
ConditionId: ev.EventId,
AlarmType: ev.AlarmName,
Message: ev.Message,
Severity: MapSeverity(ev.Severity),
SourceTimestampUtc: DateTimeOffset.FromUnixTimeMilliseconds(ev.UtcUnixMs).UtcDateTime));
}
// ---- IHistoryProvider ----
public async Task<HistoryReadResult> ReadRawAsync(
string fullReference, DateTime startUtc, DateTime endUtc, uint maxValuesPerNode,
CancellationToken cancellationToken)
{
var client = RequireClient();
var resp = await client.CallAsync<HistoryReadRequest, HistoryReadResponse>(
MessageKind.HistoryReadRequest,
new HistoryReadRequest
{
SessionId = _sessionId,
TagReferences = [fullReference],
StartUtcUnixMs = new DateTimeOffset(startUtc, TimeSpan.Zero).ToUnixTimeMilliseconds(),
EndUtcUnixMs = new DateTimeOffset(endUtc, TimeSpan.Zero).ToUnixTimeMilliseconds(),
MaxValuesPerTag = maxValuesPerNode,
},
MessageKind.HistoryReadResponse,
cancellationToken);
if (!resp.Success)
throw new InvalidOperationException($"Galaxy.Host HistoryRead failed: {resp.Error}");
var first = resp.Tags.FirstOrDefault();
IReadOnlyList<DataValueSnapshot> samples = first is null
? Array.Empty<DataValueSnapshot>()
: [.. first.Values.Select(ToSnapshot)];
return new HistoryReadResult(samples, ContinuationPoint: null);
}
public async Task<HistoryReadResult> ReadProcessedAsync(
string fullReference, DateTime startUtc, DateTime endUtc, TimeSpan interval,
HistoryAggregateType aggregate, CancellationToken cancellationToken)
{
var client = RequireClient();
var column = MapAggregateToColumn(aggregate);
var resp = await client.CallAsync<HistoryReadProcessedRequest, HistoryReadProcessedResponse>(
MessageKind.HistoryReadProcessedRequest,
new HistoryReadProcessedRequest
{
SessionId = _sessionId,
TagReference = fullReference,
StartUtcUnixMs = new DateTimeOffset(startUtc, TimeSpan.Zero).ToUnixTimeMilliseconds(),
EndUtcUnixMs = new DateTimeOffset(endUtc, TimeSpan.Zero).ToUnixTimeMilliseconds(),
IntervalMs = (long)interval.TotalMilliseconds,
AggregateColumn = column,
},
MessageKind.HistoryReadProcessedResponse,
cancellationToken);
if (!resp.Success)
throw new InvalidOperationException($"Galaxy.Host HistoryReadProcessed failed: {resp.Error}");
IReadOnlyList<DataValueSnapshot> samples = [.. resp.Values.Select(ToSnapshot)];
return new HistoryReadResult(samples, ContinuationPoint: null);
}
/// <summary>
/// Maps the OPC UA Part 13 aggregate enum onto the Wonderware Historian
/// AnalogSummaryQuery column names consumed by <c>HistorianDataSource.ReadAggregateAsync</c>.
/// Kept on the Proxy side so Galaxy.Host stays OPC-UA-free.
/// </summary>
internal static string MapAggregateToColumn(HistoryAggregateType aggregate) => aggregate switch
{
HistoryAggregateType.Average => "Average",
HistoryAggregateType.Minimum => "Minimum",
HistoryAggregateType.Maximum => "Maximum",
HistoryAggregateType.Count => "ValueCount",
HistoryAggregateType.Total => throw new NotSupportedException(
"HistoryAggregateType.Total is not supported by the Wonderware Historian AnalogSummary " +
"query — use Average × Count on the caller side, or switch to Average/Minimum/Maximum/Count."),
_ => throw new NotSupportedException($"Unknown HistoryAggregateType {aggregate}"),
};
// ---- IRediscoverable ----
/// <summary>
/// Triggered by the IPC client when the Host pushes a deploy-watermark notification
/// (Galaxy <c>time_of_last_deploy</c> changed per decision #54).
/// </summary>
internal void RaiseRediscoveryNeeded(string reason, string? scopeHint = null) =>
OnRediscoveryNeeded?.Invoke(this, new RediscoveryEventArgs(reason, scopeHint));
// ---- IHostConnectivityProbe ----
public IReadOnlyList<Core.Abstractions.HostConnectivityStatus> GetHostStatuses() => _hostStatuses;
internal void OnHostConnectivityUpdate(IpcHostConnectivityStatus update)
{
var translated = new Core.Abstractions.HostConnectivityStatus(
HostName: update.HostName,
State: ParseHostState(update.RuntimeStatus),
LastChangedUtc: DateTimeOffset.FromUnixTimeMilliseconds(update.LastObservedUtcUnixMs).UtcDateTime);
var prior = _hostStatuses.FirstOrDefault(h => h.HostName == translated.HostName);
_hostStatuses = [
.. _hostStatuses.Where(h => h.HostName != translated.HostName),
translated
];
if (prior is null || prior.State != translated.State)
{
OnHostStatusChanged?.Invoke(this, new HostStatusChangedEventArgs(
translated.HostName, prior?.State ?? HostState.Unknown, translated.State));
}
}
private static HostState ParseHostState(string s) => s switch
{
"Running" => HostState.Running,
"Stopped" => HostState.Stopped,
"Faulted" => HostState.Faulted,
_ => HostState.Unknown,
};
// ---- helpers ----
private GalaxyIpcClient RequireClient() =>
_client ?? throw new InvalidOperationException("Driver not initialized");
private const uint StatusBadInternalError = 0x80020000u;
private static DataValueSnapshot ToSnapshot(GalaxyDataValue v) => new(
Value: v.ValueBytes,
StatusCode: v.StatusCode,
SourceTimestampUtc: v.SourceTimestampUtcUnixMs > 0
? DateTimeOffset.FromUnixTimeMilliseconds(v.SourceTimestampUtcUnixMs).UtcDateTime
: null,
ServerTimestampUtc: DateTimeOffset.FromUnixTimeMilliseconds(v.ServerTimestampUtcUnixMs).UtcDateTime);
private static GalaxyDataValue FromWriteRequest(WriteRequest w) => new()
{
TagReference = w.FullReference,
ValueBytes = MessagePack.MessagePackSerializer.Serialize(w.Value),
ValueMessagePackType = 0,
StatusCode = 0,
SourceTimestampUtcUnixMs = DateTimeOffset.UtcNow.ToUnixTimeMilliseconds(),
ServerTimestampUtcUnixMs = DateTimeOffset.UtcNow.ToUnixTimeMilliseconds(),
};
private static DriverDataType MapDataType(int mxDataType) => mxDataType switch
{
0 => DriverDataType.Boolean,
@@ -132,9 +431,27 @@ public sealed class GalaxyProxyDriver(GalaxyProxyOptions options)
_ => SecurityClassification.FreeAccess,
};
private static AlarmSeverity MapSeverity(int sev) => sev switch
{
<= 250 => AlarmSeverity.Low,
<= 500 => AlarmSeverity.Medium,
<= 800 => AlarmSeverity.High,
_ => AlarmSeverity.Critical,
};
public void Dispose() => _client?.DisposeAsync().AsTask().GetAwaiter().GetResult();
}
internal sealed record GalaxySubscriptionHandle(long SubscriptionId) : ISubscriptionHandle
{
public string DiagnosticId => $"galaxy-sub-{SubscriptionId}";
}
internal sealed record GalaxyAlarmSubscriptionHandle(string Id) : IAlarmSubscriptionHandle
{
public string DiagnosticId => Id;
}
public sealed class GalaxyProxyOptions
{
public required string DriverInstanceId { get; init; }

View File

@@ -85,6 +85,18 @@ public sealed class GalaxyIpcClient : IAsyncDisposable
finally { _callGate.Release(); }
}
/// <summary>
/// Fire-and-forget request — used for unsubscribe, alarm-ack, close-session, and other
/// calls where the protocol is one-way. The send is still serialized through the call
/// gate so it doesn't interleave a frame with a concurrent <see cref="CallAsync{TReq, TResp}"/>.
/// </summary>
public async Task SendOneWayAsync<TReq>(MessageKind requestKind, TReq request, CancellationToken ct)
{
await _callGate.WaitAsync(ct);
try { await _writer.WriteAsync(requestKind, request, ct); }
finally { _callGate.Release(); }
}
public async ValueTask DisposeAsync()
{
_callGate.Dispose();

View File

@@ -16,6 +16,10 @@
<ProjectReference Include="..\ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared\ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared.csproj"/>
</ItemGroup>
<ItemGroup>
<InternalsVisibleTo Include="ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy.Tests"/>
</ItemGroup>
<ItemGroup>
<NuGetAuditSuppress Include="https://github.com/advisories/GHSA-37gx-xxp4-5rgx"/>
<NuGetAuditSuppress Include="https://github.com/advisories/GHSA-w3x6-4m5h-cxqf"/>

View File

@@ -30,6 +30,15 @@ public sealed class GalaxyAttributeInfo
[Key(3)] public uint? ArrayDim { get; set; }
[Key(4)] public int SecurityClassification { get; set; }
[Key(5)] public bool IsHistorized { get; set; }
/// <summary>
/// True when the attribute has an AlarmExtension primitive in the Galaxy repository
/// (<c>primitive_definition.primitive_name = 'AlarmExtension'</c>). The generic
/// node-manager uses this to enrich the variable's OPC UA node with an
/// <c>AlarmConditionState</c> during address-space build. Added in PR 9 as the
/// discovery-side foundation for the alarm event wire-up that follows in PR 10+.
/// </summary>
[Key(6)] public bool IsAlarm { get; set; }
}
[MessagePackObject]

View File

@@ -48,8 +48,14 @@ public enum MessageKind : byte
AlarmEvent = 0x51,
AlarmAckRequest = 0x52,
HistoryReadRequest = 0x60,
HistoryReadResponse = 0x61,
HistoryReadRequest = 0x60,
HistoryReadResponse = 0x61,
HistoryReadProcessedRequest = 0x62,
HistoryReadProcessedResponse = 0x63,
HistoryReadAtTimeRequest = 0x64,
HistoryReadAtTimeResponse = 0x65,
HistoryReadEventsRequest = 0x66,
HistoryReadEventsResponse = 0x67,
HostConnectivityStatus = 0x70,
RuntimeStatusChange = 0x71,

View File

@@ -26,3 +26,85 @@ public sealed class HistoryReadResponse
[Key(1)] public string? Error { get; set; }
[Key(2)] public HistoryTagValues[] Tags { get; set; } = System.Array.Empty<HistoryTagValues>();
}
/// <summary>
/// Processed (aggregated) historian read — OPC UA HistoryReadProcessed service. The
/// aggregate column is a string (e.g. "Average", "Minimum") mapped by the Proxy from the
/// OPC UA HistoryAggregateType enum so Galaxy.Host stays OPC-UA-free.
/// </summary>
[MessagePackObject]
public sealed class HistoryReadProcessedRequest
{
[Key(0)] public long SessionId { get; set; }
[Key(1)] public string TagReference { get; set; } = string.Empty;
[Key(2)] public long StartUtcUnixMs { get; set; }
[Key(3)] public long EndUtcUnixMs { get; set; }
[Key(4)] public long IntervalMs { get; set; }
[Key(5)] public string AggregateColumn { get; set; } = "Average";
}
[MessagePackObject]
public sealed class HistoryReadProcessedResponse
{
[Key(0)] public bool Success { get; set; }
[Key(1)] public string? Error { get; set; }
[Key(2)] public GalaxyDataValue[] Values { get; set; } = System.Array.Empty<GalaxyDataValue>();
}
/// <summary>
/// At-time historian read — OPC UA HistoryReadAtTime service. Returns one sample per
/// requested timestamp (interpolated when no exact match exists). The per-timestamp array
/// is flow-encoded as Unix milliseconds to avoid MessagePack DateTime quirks.
/// </summary>
[MessagePackObject]
public sealed class HistoryReadAtTimeRequest
{
[Key(0)] public long SessionId { get; set; }
[Key(1)] public string TagReference { get; set; } = string.Empty;
[Key(2)] public long[] TimestampsUtcUnixMs { get; set; } = System.Array.Empty<long>();
}
[MessagePackObject]
public sealed class HistoryReadAtTimeResponse
{
[Key(0)] public bool Success { get; set; }
[Key(1)] public string? Error { get; set; }
[Key(2)] public GalaxyDataValue[] Values { get; set; } = System.Array.Empty<GalaxyDataValue>();
}
/// <summary>
/// Historical events read — OPC UA HistoryReadEvents service and Alarm &amp; Condition
/// history. <c>SourceName</c> null means "all sources". Distinct from the live
/// <see cref="GalaxyAlarmEvent"/> stream because historical rows carry both
/// <c>EventTime</c> (when the event occurred in the process) and <c>ReceivedTime</c>
/// (when the Historian persisted it) and have no StateTransition — the Historian logs
/// the instantaneous event, not the OPC UA alarm lifecycle.
/// </summary>
[MessagePackObject]
public sealed class HistoryReadEventsRequest
{
[Key(0)] public long SessionId { get; set; }
[Key(1)] public string? SourceName { get; set; }
[Key(2)] public long StartUtcUnixMs { get; set; }
[Key(3)] public long EndUtcUnixMs { get; set; }
[Key(4)] public int MaxEvents { get; set; } = 1000;
}
[MessagePackObject]
public sealed class GalaxyHistoricalEvent
{
[Key(0)] public string EventId { get; set; } = string.Empty;
[Key(1)] public string? SourceName { get; set; }
[Key(2)] public long EventTimeUtcUnixMs { get; set; }
[Key(3)] public long ReceivedTimeUtcUnixMs { get; set; }
[Key(4)] public string? DisplayText { get; set; }
[Key(5)] public ushort Severity { get; set; }
}
[MessagePackObject]
public sealed class HistoryReadEventsResponse
{
[Key(0)] public bool Success { get; set; }
[Key(1)] public string? Error { get; set; }
[Key(2)] public GalaxyHistoricalEvent[] Events { get; set; } = System.Array.Empty<GalaxyHistoricalEvent>();
}

View File

@@ -11,6 +11,12 @@
<CopyLocalLockFileAssemblies>false</CopyLocalLockFileAssemblies>
<!-- Deploy next to Host.exe under bin/<cfg>/Historian/ so F5 works without a manual copy. -->
<HistorianPluginOutputDir>$(MSBuildThisFileDirectory)..\ZB.MOM.WW.OtOpcUa.Host\bin\$(Configuration)\net48\Historian\</HistorianPluginOutputDir>
<!--
Phase 2 Stream D — V1 ARCHIVE. Plugs into the legacy in-process Host's
Wonderware Historian loader. Will be ported into Driver.Galaxy.Host's
Backend/Historian/ subtree when MxAccessGalaxyBackend.HistoryReadAsync is
wired (Task B.1.h follow-up). See docs/v2/V1_ARCHIVE_STATUS.md.
-->
</PropertyGroup>
<ItemGroup>

View File

@@ -8,6 +8,15 @@
<Nullable>enable</Nullable>
<RootNamespace>ZB.MOM.WW.OtOpcUa.Host</RootNamespace>
<AssemblyName>ZB.MOM.WW.OtOpcUa.Host</AssemblyName>
<!--
Phase 2 Stream D — V1 ARCHIVE. Functionally superseded by:
src/ZB.MOM.WW.OtOpcUa.Server (host process, .NET 10)
src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host (out-of-process MXAccess, net48 x86)
src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy (in-process driver, .NET 10)
Kept in the build graph because Historian.Aveva + IntegrationTests still
transitively reference it. Deletion is the subject of Phase 2 PR 3 (separate from
this PR 2). See docs/v2/V1_ARCHIVE_STATUS.md.
-->
</PropertyGroup>
<ItemGroup>

View File

@@ -0,0 +1,58 @@
using Shouldly;
using Xunit;
using ZB.MOM.WW.OtOpcUa.Core.Abstractions;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.E2E;
[Trait("Category", "ParityE2E")]
[Collection(nameof(ParityCollection))]
public sealed class HierarchyParityTests
{
private readonly ParityFixture _fx;
public HierarchyParityTests(ParityFixture fx) => _fx = fx;
[Fact]
public async Task Discover_returns_at_least_one_gobject_with_attributes()
{
_fx.SkipIfUnavailable();
var builder = new RecordingAddressSpaceBuilder();
await _fx.Driver!.DiscoverAsync(builder, CancellationToken.None);
builder.Folders.Count.ShouldBeGreaterThan(0,
"live Galaxy ZB has at least one deployed gobject");
builder.Variables.Count.ShouldBeGreaterThan(0,
"at least one gobject in the dev Galaxy carries dynamic attributes");
}
[Fact]
public async Task Discover_emits_only_lowercase_browse_paths_for_each_attribute()
{
// OPC UA browse paths are case-sensitive; the v1 server emits Galaxy attribute
// names verbatim (camelCase like "PV.Input.Value"). Parity invariant: every
// emitted variable's full reference contains a '.' separating the gobject
// tag-name from the attribute name (Galaxy reference grammar).
_fx.SkipIfUnavailable();
var builder = new RecordingAddressSpaceBuilder();
await _fx.Driver!.DiscoverAsync(builder, CancellationToken.None);
builder.Variables.ShouldAllBe(v => v.AttributeInfo.FullName.Contains('.'),
"Galaxy MXAccess full references are 'tag.attribute'");
}
[Fact]
public async Task Discover_marks_at_least_one_attribute_as_historized_when_HistoryExtension_present()
{
_fx.SkipIfUnavailable();
var builder = new RecordingAddressSpaceBuilder();
await _fx.Driver!.DiscoverAsync(builder, CancellationToken.None);
// Soft assertion — some Galaxies are configuration-only with no Historian extensions.
// We only check the field flows through correctly when populated.
var historized = builder.Variables.Count(v => v.AttributeInfo.IsHistorized);
// Just assert the count is non-negative — the value itself is data-dependent.
historized.ShouldBeGreaterThanOrEqualTo(0);
}
}

View File

@@ -0,0 +1,136 @@
using System.Diagnostics;
using System.Net.Sockets;
using System.Reflection;
using System.Security.Principal;
using Xunit;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.E2E;
/// <summary>
/// Spawns one <c>OtOpcUa.Driver.Galaxy.Host.exe</c> subprocess per test class and exposes
/// a connected <see cref="GalaxyProxyDriver"/> for the tests. Per Phase 2 plan §"Stream E
/// Parity Validation": the Proxy owns a session against a real out-of-process Host running
/// the production-shape <c>MxAccessGalaxyBackend</c> backed by live ZB + MXAccess COM.
/// Skipped when the Host EXE isn't built, when ZB SQL is unreachable, or when the dev box
/// runs as Administrator (the IPC ACL explicitly denies Administrators per decision #76).
/// </summary>
public sealed class ParityFixture : IAsyncLifetime
{
public GalaxyProxyDriver? Driver { get; private set; }
public string? SkipReason { get; private set; }
private Process? _host;
private const string Secret = "parity-suite-secret";
public async ValueTask InitializeAsync()
{
if (!OperatingSystem.IsWindows()) { SkipReason = "Windows-only"; return; }
if (IsAdministrator()) { SkipReason = "PipeAcl denies Administrators on dev shells"; return; }
if (!await ZbReachableAsync()) { SkipReason = "Galaxy ZB SQL not reachable on localhost:1433"; return; }
var hostExe = FindHostExe();
if (hostExe is null) { SkipReason = "Galaxy.Host EXE not built — run `dotnet build src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host`"; return; }
// Use the SQL-only DB backend by default — exercises the full IPC + dispatcher + SQL
// path without requiring a healthy MXAccess connection. Tests that need MXAccess
// override via env vars before InitializeAsync is called (use a separate fixture).
var pipe = $"OtOpcUaGalaxyParity-{Guid.NewGuid():N}";
using var identity = WindowsIdentity.GetCurrent();
var sid = identity.User!;
var psi = new ProcessStartInfo(hostExe)
{
UseShellExecute = false,
CreateNoWindow = true,
RedirectStandardOutput = true,
RedirectStandardError = true,
EnvironmentVariables =
{
["OTOPCUA_GALAXY_PIPE"] = pipe,
["OTOPCUA_ALLOWED_SID"] = sid.Value,
["OTOPCUA_GALAXY_SECRET"] = Secret,
["OTOPCUA_GALAXY_BACKEND"] = "db",
["OTOPCUA_GALAXY_ZB_CONN"] = "Server=localhost;Database=ZB;Integrated Security=True;TrustServerCertificate=True;Encrypt=False;",
},
};
_host = Process.Start(psi)
?? throw new InvalidOperationException("Failed to spawn Galaxy.Host EXE");
// Give the PipeServer ~2s to bind. The supervisor's HeartbeatMonitor can do this
// in production with retry, but the parity tests are best served by a fixed warm-up.
await Task.Delay(2_000);
Driver = new GalaxyProxyDriver(new GalaxyProxyOptions
{
DriverInstanceId = "parity",
PipeName = pipe,
SharedSecret = Secret,
ConnectTimeout = TimeSpan.FromSeconds(5),
});
await Driver.InitializeAsync(driverConfigJson: "{}", CancellationToken.None);
}
public async ValueTask DisposeAsync()
{
if (Driver is not null)
{
try { await Driver.ShutdownAsync(CancellationToken.None); } catch { /* shutdown */ }
Driver.Dispose();
}
if (_host is not null && !_host.HasExited)
{
try { _host.Kill(entireProcessTree: true); } catch { /* ignore */ }
try { _host.WaitForExit(5_000); } catch { /* ignore */ }
}
_host?.Dispose();
}
/// <summary>Skip the test if the fixture couldn't initialize. xUnit Skip.If pattern.</summary>
public void SkipIfUnavailable()
{
if (SkipReason is not null)
Assert.Skip(SkipReason);
}
private static bool IsAdministrator()
{
if (!OperatingSystem.IsWindows()) return false;
using var identity = WindowsIdentity.GetCurrent();
return new WindowsPrincipal(identity).IsInRole(WindowsBuiltInRole.Administrator);
}
private static async Task<bool> ZbReachableAsync()
{
try
{
using var client = new TcpClient();
var task = client.ConnectAsync("localhost", 1433);
return await Task.WhenAny(task, Task.Delay(1_500)) == task && client.Connected;
}
catch { return false; }
}
private static string? FindHostExe()
{
var asmDir = Path.GetDirectoryName(Assembly.GetExecutingAssembly().Location)!;
var solutionRoot = asmDir;
for (var i = 0; i < 8 && solutionRoot is not null; i++)
{
if (File.Exists(Path.Combine(solutionRoot, "ZB.MOM.WW.OtOpcUa.slnx"))) break;
solutionRoot = Path.GetDirectoryName(solutionRoot);
}
if (solutionRoot is null) return null;
var path = Path.Combine(solutionRoot,
"src", "ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host", "bin", "Debug", "net48",
"OtOpcUa.Driver.Galaxy.Host.exe");
return File.Exists(path) ? path : null;
}
}
[CollectionDefinition(nameof(ParityCollection))]
public sealed class ParityCollection : ICollectionFixture<ParityFixture> { }

View File

@@ -0,0 +1,38 @@
using ZB.MOM.WW.OtOpcUa.Core.Abstractions;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.E2E;
/// <summary>
/// Test-only <see cref="IAddressSpaceBuilder"/> that records every Folder + Variable
/// registration. Mirrors the v1 in-process address-space build so tests can assert on
/// the same shape the legacy <c>LmxNodeManager</c> produced.
/// </summary>
public sealed class RecordingAddressSpaceBuilder : IAddressSpaceBuilder
{
public List<RecordedFolder> Folders { get; } = new();
public List<RecordedVariable> Variables { get; } = new();
public List<RecordedProperty> Properties { get; } = new();
public IAddressSpaceBuilder Folder(string browseName, string displayName)
{
Folders.Add(new RecordedFolder(browseName, displayName));
return this; // single flat builder for tests; nesting irrelevant for parity assertions
}
public IVariableHandle Variable(string browseName, string displayName, DriverAttributeInfo attributeInfo)
{
Variables.Add(new RecordedVariable(browseName, displayName, attributeInfo));
return new RecordedVariableHandle(attributeInfo.FullName);
}
public void AddProperty(string browseName, DriverDataType dataType, object? value)
{
Properties.Add(new RecordedProperty(browseName, dataType, value));
}
public sealed record RecordedFolder(string BrowseName, string DisplayName);
public sealed record RecordedVariable(string BrowseName, string DisplayName, DriverAttributeInfo AttributeInfo);
public sealed record RecordedProperty(string BrowseName, DriverDataType DataType, object? Value);
private sealed record RecordedVariableHandle(string FullReference) : IVariableHandle;
}

View File

@@ -0,0 +1,140 @@
using Shouldly;
using Xunit;
using ZB.MOM.WW.OtOpcUa.Core.Abstractions;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.E2E;
/// <summary>
/// Regression tests for the four 2026-04-13 stability findings (commits <c>c76ab8f</c>,
/// <c>7310925</c>) per Phase 2 plan §"Stream E.3". Each test asserts the v2 topology
/// does not reintroduce the v1 defect.
/// </summary>
[Trait("Category", "ParityE2E")]
[Trait("Subcategory", "StabilityRegression")]
[Collection(nameof(ParityCollection))]
public sealed class StabilityFindingsRegressionTests
{
private readonly ParityFixture _fx;
public StabilityFindingsRegressionTests(ParityFixture fx) => _fx = fx;
/// <summary>
/// Finding #1 — <em>phantom probe subscription flips Tick() to Stopped</em>. When the
/// v1 GalaxyRuntimeProbeManager failed to subscribe a probe, it left a phantom entry
/// that the next Tick() flipped to Stopped, fanning Bad-quality across unrelated
/// subtrees. v2 regression net: a failed subscribe must not affect host status of
/// subscriptions that did succeed.
/// </summary>
[Fact]
public async Task Failed_subscribe_does_not_corrupt_unrelated_host_status()
{
_fx.SkipIfUnavailable();
// GetHostStatuses pre-subscribe — baseline.
var preSubscribe = _fx.Driver!.GetHostStatuses().Count;
// Try to subscribe to a nonsense reference; the Host should reject it without
// poisoning the host-status table.
try
{
await _fx.Driver.SubscribeAsync(
new[] { "nonexistent.tag.does.not.exist[]" },
TimeSpan.FromSeconds(1),
CancellationToken.None);
}
catch { /* expected — bad reference */ }
var postSubscribe = _fx.Driver.GetHostStatuses().Count;
postSubscribe.ShouldBe(preSubscribe,
"failed subscribe must not mutate the host-status snapshot");
}
/// <summary>
/// Finding #2 — <em>cross-host quality clear wipes sibling state during recovery</em>.
/// v1 cleared all subscriptions when ANY host changed state, even healthy peers.
/// v2 regression net: host-status events must be scoped to the affected host name.
/// </summary>
[Fact]
public void Host_status_change_event_carries_specific_host_name_not_global_clear()
{
_fx.SkipIfUnavailable();
var changes = new List<HostStatusChangedEventArgs>();
EventHandler<HostStatusChangedEventArgs> handler = (_, e) => changes.Add(e);
_fx.Driver!.OnHostStatusChanged += handler;
try
{
// We can't deterministically force a Host status transition in the suite without
// tearing down the COM connection. The structural assertion is sufficient: the
// event TYPE carries a specific HostName, OldState, NewState — there is no
// "global clear" payload. v1's bug was structural; v2's event signature
// mathematically prevents reintroduction.
typeof(HostStatusChangedEventArgs).GetProperty("HostName")
.ShouldNotBeNull("event signature must scope to a specific host");
typeof(HostStatusChangedEventArgs).GetProperty("OldState")
.ShouldNotBeNull();
typeof(HostStatusChangedEventArgs).GetProperty("NewState")
.ShouldNotBeNull();
}
finally
{
_fx.Driver.OnHostStatusChanged -= handler;
}
}
/// <summary>
/// Finding #3 — <em>sync-over-async on the OPC UA stack thread</em>. v1 had spots
/// that called <c>.Result</c> / <c>.Wait()</c> from the OPC UA stack callback,
/// deadlocking under load. v2 regression net: every <see cref="GalaxyProxyDriver"/>
/// capability method is async-all-the-way; a reflection scan asserts no
/// <c>.GetAwaiter().GetResult()</c> appears in IL of the public surface.
/// Implemented as a structural shape assertion — every public method returning
/// <see cref="Task"/> or <see cref="Task{TResult}"/>.
/// </summary>
[Fact]
public void All_GalaxyProxyDriver_capability_methods_return_Task_for_async_correctness()
{
_fx.SkipIfUnavailable();
var driverType = typeof(Proxy.GalaxyProxyDriver);
var capabilityMethods = driverType.GetMethods(System.Reflection.BindingFlags.Public | System.Reflection.BindingFlags.Instance)
.Where(m => m.DeclaringType == driverType
&& !m.IsSpecialName
&& m.Name is "InitializeAsync" or "ReinitializeAsync" or "ShutdownAsync"
or "FlushOptionalCachesAsync" or "DiscoverAsync"
or "ReadAsync" or "WriteAsync"
or "SubscribeAsync" or "UnsubscribeAsync"
or "SubscribeAlarmsAsync" or "UnsubscribeAlarmsAsync" or "AcknowledgeAsync"
or "ReadRawAsync" or "ReadProcessedAsync");
foreach (var m in capabilityMethods)
{
(m.ReturnType == typeof(Task) || m.ReturnType.IsGenericType && m.ReturnType.GetGenericTypeDefinition() == typeof(Task<>))
.ShouldBeTrue($"{m.Name} must return Task or Task<T> — sync-over-async risks deadlock under load");
}
}
/// <summary>
/// Finding #4 — <em>fire-and-forget alarm tasks racing shutdown</em>. v1 fired
/// <c>Task.Run(() => raiseAlarm)</c> without awaiting, so shutdown could complete
/// while the task was still touching disposed state. v2 regression net: alarm
/// acknowledgement is sequential and awaited — verified by the integration test
/// <c>AcknowledgeAsync</c> returning a completed Task that doesn't leave background
/// work.
/// </summary>
[Fact]
public async Task AcknowledgeAsync_completes_before_returning_no_background_tasks()
{
_fx.SkipIfUnavailable();
// We can't easily acknowledge a real Galaxy alarm in this fixture, but we can
// assert the call shape: a synchronous-from-the-caller-perspective await without
// throwing or leaving a pending continuation.
await _fx.Driver!.AcknowledgeAsync(
new[] { new AlarmAcknowledgeRequest("nonexistent-source", "nonexistent-event", "test ack") },
CancellationToken.None);
// If we got here, the call awaited cleanly — no fire-and-forget background work
// left running after the caller returned.
true.ShouldBeTrue();
}
}

View File

@@ -0,0 +1,36 @@
<Project Sdk="Microsoft.NET.Sdk">
<PropertyGroup>
<TargetFramework>net10.0</TargetFramework>
<Nullable>enable</Nullable>
<ImplicitUsings>enable</ImplicitUsings>
<IsPackable>false</IsPackable>
<IsTestProject>true</IsTestProject>
<RootNamespace>ZB.MOM.WW.OtOpcUa.Driver.Galaxy.E2E</RootNamespace>
</PropertyGroup>
<ItemGroup>
<PackageReference Include="xunit.v3" Version="1.1.0"/>
<PackageReference Include="Shouldly" Version="4.3.0"/>
<PackageReference Include="Microsoft.NET.Test.Sdk" Version="17.12.0"/>
<PackageReference Include="xunit.runner.visualstudio" Version="3.0.2">
<PrivateAssets>all</PrivateAssets>
<IncludeAssets>runtime; build; native; contentfiles; analyzers; buildtransitive</IncludeAssets>
</PackageReference>
</ItemGroup>
<ItemGroup>
<ProjectReference Include="..\..\src\ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy\ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy.csproj"/>
<!--
We DO NOT reference Galaxy.Host (net48 x86) here. The Host runs as a subprocess —
this project only needs to spawn the EXE and talk to it via named pipes through
the Proxy. Cross-FX type loading is what bit the earlier in-process attempt.
-->
</ItemGroup>
<ItemGroup>
<NuGetAuditSuppress Include="https://github.com/advisories/GHSA-37gx-xxp4-5rgx"/>
<NuGetAuditSuppress Include="https://github.com/advisories/GHSA-w3x6-4m5h-cxqf"/>
</ItemGroup>
</Project>

View File

@@ -0,0 +1,84 @@
using System;
using MessagePack;
using Shouldly;
using Xunit;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared.Contracts;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Tests;
[Trait("Category", "Unit")]
public sealed class AlarmDiscoveryTests
{
/// <summary>
/// PR 9 — IsAlarm must survive the MessagePack round-trip at Key=6 position.
/// Regression guard: any reorder of keys in GalaxyAttributeInfo would silently corrupt
/// the flag in the wire payload since MessagePack encodes by key number, not field name.
/// </summary>
[Fact]
public void GalaxyAttributeInfo_IsAlarm_round_trips_true_through_MessagePack()
{
var input = new GalaxyAttributeInfo
{
AttributeName = "TankLevel",
MxDataType = 2,
IsArray = false,
ArrayDim = null,
SecurityClassification = 1,
IsHistorized = true,
IsAlarm = true,
};
var bytes = MessagePackSerializer.Serialize(input);
var decoded = MessagePackSerializer.Deserialize<GalaxyAttributeInfo>(bytes);
decoded.IsAlarm.ShouldBeTrue();
decoded.IsHistorized.ShouldBeTrue();
decoded.AttributeName.ShouldBe("TankLevel");
}
[Fact]
public void GalaxyAttributeInfo_IsAlarm_round_trips_false_through_MessagePack()
{
var input = new GalaxyAttributeInfo { AttributeName = "ColorRgb", IsAlarm = false };
var bytes = MessagePackSerializer.Serialize(input);
var decoded = MessagePackSerializer.Deserialize<GalaxyAttributeInfo>(bytes);
decoded.IsAlarm.ShouldBeFalse();
}
/// <summary>
/// Wire-compat guard: payloads serialized before PR 9 (which omit Key=6) must still
/// deserialize cleanly — MessagePack treats missing keys as default. This lets a newer
/// Proxy talk to an older Host during a rolling upgrade without a crash.
/// </summary>
[Fact]
public void Pre_PR9_payload_without_IsAlarm_key_deserializes_with_default_false()
{
// Build a 6-field payload (keys 0..5) matching the pre-PR9 shape by serializing a
// stand-in class with the same key layout but no Key=6.
var pre = new PrePR9Shape
{
AttributeName = "Legacy",
MxDataType = 1,
IsArray = false,
ArrayDim = null,
SecurityClassification = 0,
IsHistorized = false,
};
var bytes = MessagePackSerializer.Serialize(pre);
var decoded = MessagePackSerializer.Deserialize<GalaxyAttributeInfo>(bytes);
decoded.AttributeName.ShouldBe("Legacy");
decoded.IsAlarm.ShouldBeFalse();
}
[MessagePackObject]
public sealed class PrePR9Shape
{
[Key(0)] public string AttributeName { get; set; } = string.Empty;
[Key(1)] public int MxDataType { get; set; }
[Key(2)] public bool IsArray { get; set; }
[Key(3)] public uint? ArrayDim { get; set; }
[Key(4)] public int SecurityClassification { get; set; }
[Key(5)] public bool IsHistorized { get; set; }
}
}

View File

@@ -0,0 +1,181 @@
using System;
using System.IO;
using System.IO.Pipes;
using System.Security.Principal;
using System.Threading;
using System.Threading.Tasks;
using MessagePack;
using Serilog;
using Serilog.Core;
using Shouldly;
using Xunit;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Ipc;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared.Contracts;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Tests
{
/// <summary>
/// Drives every <see cref="MessageKind"/> the Phase 2 plan exposes through the full
/// Host-side stack (<see cref="PipeServer"/> + <see cref="GalaxyFrameHandler"/> +
/// <see cref="StubGalaxyBackend"/>) using a hand-rolled IPC client built on Shared's
/// <see cref="FrameReader"/>/<see cref="FrameWriter"/>. The Proxy's <c>GalaxyIpcClient</c>
/// is net10-only and cannot load in this net48 x86 test process, so we exercise the same
/// wire protocol through the framing primitives directly. The dispatcher/backend response
/// shapes are the production code path verbatim.
/// </summary>
[Trait("Category", "Integration")]
public sealed class EndToEndIpcTests
{
private static bool IsAdministrator()
{
using var identity = WindowsIdentity.GetCurrent();
return new WindowsPrincipal(identity).IsInRole(WindowsBuiltInRole.Administrator);
}
private sealed class TestStack : IDisposable
{
public PipeServer Server = null!;
public NamedPipeClientStream Stream = null!;
public FrameReader Reader = null!;
public FrameWriter Writer = null!;
public Task ServerTask = null!;
public CancellationTokenSource Cts = null!;
public void Dispose()
{
Cts.Cancel();
try { ServerTask.GetAwaiter().GetResult(); } catch { /* shutdown */ }
Server.Dispose();
Stream.Dispose();
Reader.Dispose();
Writer.Dispose();
Cts.Dispose();
}
}
private static async Task<TestStack> StartAsync()
{
using var identity = WindowsIdentity.GetCurrent();
var sid = identity.User!;
var pipe = $"OtOpcUaGalaxyE2E-{Guid.NewGuid():N}";
const string secret = "e2e-secret";
Logger log = new LoggerConfiguration().CreateLogger();
var cts = new CancellationTokenSource(TimeSpan.FromSeconds(15));
var server = new PipeServer(pipe, sid, secret, log);
var serverTask = Task.Run(() => server.RunAsync(
new GalaxyFrameHandler(new StubGalaxyBackend(), log), cts.Token));
var stream = new NamedPipeClientStream(".", pipe, PipeDirection.InOut, PipeOptions.Asynchronous);
await stream.ConnectAsync(5_000, cts.Token);
var reader = new FrameReader(stream, leaveOpen: true);
var writer = new FrameWriter(stream, leaveOpen: true);
await writer.WriteAsync(MessageKind.Hello,
new Hello { PeerName = "e2e", SharedSecret = secret }, cts.Token);
var ack = await reader.ReadFrameAsync(cts.Token);
if (ack is null || ack.Value.Kind != MessageKind.HelloAck)
throw new InvalidOperationException("Hello handshake failed");
return new TestStack
{
Server = server,
Stream = stream,
Reader = reader,
Writer = writer,
ServerTask = serverTask,
Cts = cts,
};
}
private static async Task<TResp> RoundTripAsync<TReq, TResp>(
TestStack s, MessageKind reqKind, TReq req, MessageKind respKind)
{
await s.Writer.WriteAsync(reqKind, req, s.Cts.Token);
var frame = await s.Reader.ReadFrameAsync(s.Cts.Token);
frame.HasValue.ShouldBeTrue();
frame!.Value.Kind.ShouldBe(respKind);
return MessagePackSerializer.Deserialize<TResp>(frame.Value.Body);
}
[Fact]
public async Task OpenSession_succeeds_with_an_assigned_session_id()
{
if (IsAdministrator()) return;
using var s = await StartAsync();
var resp = await RoundTripAsync<OpenSessionRequest, OpenSessionResponse>(
s, MessageKind.OpenSessionRequest,
new OpenSessionRequest { DriverInstanceId = "gal-e2e", DriverConfigJson = "{}" },
MessageKind.OpenSessionResponse);
resp.Success.ShouldBeTrue();
resp.SessionId.ShouldBeGreaterThan(0L);
}
[Fact]
public async Task Discover_against_stub_returns_an_error_response()
{
if (IsAdministrator()) return;
using var s = await StartAsync();
var resp = await RoundTripAsync<DiscoverHierarchyRequest, DiscoverHierarchyResponse>(
s, MessageKind.DiscoverHierarchyRequest,
new DiscoverHierarchyRequest { SessionId = 1 },
MessageKind.DiscoverHierarchyResponse);
resp.Success.ShouldBeFalse();
resp.Error.ShouldContain("MXAccess code lift pending");
}
[Fact]
public async Task WriteValues_returns_per_tag_BadInternalError_status()
{
if (IsAdministrator()) return;
using var s = await StartAsync();
var resp = await RoundTripAsync<WriteValuesRequest, WriteValuesResponse>(
s, MessageKind.WriteValuesRequest,
new WriteValuesRequest
{
SessionId = 1,
Writes = new[] { new GalaxyDataValue { TagReference = "TagA" } },
},
MessageKind.WriteValuesResponse);
resp.Results.Length.ShouldBe(1);
resp.Results[0].StatusCode.ShouldBe(0x80020000u);
}
[Fact]
public async Task Subscribe_returns_a_subscription_id()
{
if (IsAdministrator()) return;
using var s = await StartAsync();
var sub = await RoundTripAsync<SubscribeRequest, SubscribeResponse>(
s, MessageKind.SubscribeRequest,
new SubscribeRequest { SessionId = 1, TagReferences = new[] { "TagA" }, RequestedIntervalMs = 500 },
MessageKind.SubscribeResponse);
sub.Success.ShouldBeTrue();
sub.SubscriptionId.ShouldBeGreaterThan(0L);
}
[Fact]
public async Task Recycle_returns_the_grace_window_from_the_backend()
{
if (IsAdministrator()) return;
using var s = await StartAsync();
var resp = await RoundTripAsync<RecycleHostRequest, RecycleStatusResponse>(
s, MessageKind.RecycleHostRequest,
new RecycleHostRequest { Kind = "Soft", Reason = "test" },
MessageKind.RecycleStatusResponse);
resp.Accepted.ShouldBeTrue();
resp.GraceSeconds.ShouldBe(15);
}
}
}

View File

@@ -0,0 +1,100 @@
using System;
using System.Threading;
using System.Threading.Tasks;
using Shouldly;
using Xunit;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.Galaxy;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared.Contracts;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Tests
{
/// <summary>
/// Live smoke against the Galaxy <c>ZB</c> repository. Skipped when ZB is unreachable so
/// CI / dev boxes without an AVEVA install still pass. Exercises the ported
/// <see cref="GalaxyRepository"/> + <see cref="DbBackedGalaxyBackend"/> against the same
/// SQL the v1 Host uses, proving the lift is byte-for-byte equivalent at the
/// <c>DiscoverHierarchyResponse</c> shape.
/// </summary>
[Trait("Category", "LiveGalaxy")]
public sealed class GalaxyRepositoryLiveSmokeTests
{
private static GalaxyRepositoryOptions DevZbOptions() => new()
{
ConnectionString =
"Server=localhost;Database=ZB;Integrated Security=True;TrustServerCertificate=True;Encrypt=False;Connect Timeout=2;",
CommandTimeoutSeconds = 10,
};
private static async Task<bool> ZbReachableAsync()
{
try
{
var repo = new GalaxyRepository(DevZbOptions());
using var cts = new CancellationTokenSource(TimeSpan.FromSeconds(3));
return await repo.TestConnectionAsync(cts.Token);
}
catch { return false; }
}
[Fact]
public async Task TestConnection_returns_true_against_live_ZB()
{
if (!await ZbReachableAsync()) return;
var repo = new GalaxyRepository(DevZbOptions());
(await repo.TestConnectionAsync()).ShouldBeTrue();
}
[Fact]
public async Task GetHierarchy_returns_at_least_one_deployed_gobject()
{
if (!await ZbReachableAsync()) return;
var repo = new GalaxyRepository(DevZbOptions());
var rows = await repo.GetHierarchyAsync();
rows.Count.ShouldBeGreaterThan(0,
"the dev Galaxy has at least the WinPlatform + AppEngine deployed");
rows.ShouldAllBe(r => !string.IsNullOrEmpty(r.TagName));
}
[Fact]
public async Task GetAttributes_returns_attributes_for_deployed_objects()
{
if (!await ZbReachableAsync()) return;
var repo = new GalaxyRepository(DevZbOptions());
var attrs = await repo.GetAttributesAsync();
attrs.Count.ShouldBeGreaterThan(0);
attrs.ShouldAllBe(a => !string.IsNullOrEmpty(a.FullTagReference) && a.FullTagReference.Contains("."));
}
[Fact]
public async Task GetLastDeployTime_returns_a_value()
{
if (!await ZbReachableAsync()) return;
var repo = new GalaxyRepository(DevZbOptions());
var ts = await repo.GetLastDeployTimeAsync();
ts.ShouldNotBeNull();
}
[Fact]
public async Task DbBackedBackend_DiscoverAsync_returns_objects_with_attributes_and_categories()
{
if (!await ZbReachableAsync()) return;
var backend = new DbBackedGalaxyBackend(new GalaxyRepository(DevZbOptions()));
var resp = await backend.DiscoverAsync(new DiscoverHierarchyRequest { SessionId = 1 }, CancellationToken.None);
resp.Success.ShouldBeTrue(resp.Error);
resp.Objects.Length.ShouldBeGreaterThan(0);
var firstWithAttrs = System.Linq.Enumerable.FirstOrDefault(resp.Objects, o => o.Attributes.Length > 0);
firstWithAttrs.ShouldNotBeNull("at least one gobject in the dev Galaxy carries dynamic attributes");
firstWithAttrs!.TemplateCategory.ShouldNotBeNullOrEmpty();
}
}
}

View File

@@ -0,0 +1,231 @@
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using Shouldly;
using Xunit;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.MxAccess;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.Stability;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Tests;
[Trait("Category", "Unit")]
public sealed class GalaxyRuntimeProbeManagerTests
{
private sealed class FakeSubscriber
{
public readonly ConcurrentDictionary<string, Action<string, Vtq>> Subs = new();
public readonly ConcurrentQueue<string> UnsubCalls = new();
public bool FailSubscribeFor { get; set; }
public string? FailSubscribeTag { get; set; }
public Task Subscribe(string probe, Action<string, Vtq> cb)
{
if (FailSubscribeFor && string.Equals(probe, FailSubscribeTag, StringComparison.OrdinalIgnoreCase))
throw new InvalidOperationException("subscribe refused");
Subs[probe] = cb;
return Task.CompletedTask;
}
public Task Unsubscribe(string probe)
{
UnsubCalls.Enqueue(probe);
Subs.TryRemove(probe, out _);
return Task.CompletedTask;
}
}
private static Vtq Good(bool scanState) => new(scanState, DateTime.UtcNow, 192);
private static Vtq Bad() => new(null, DateTime.UtcNow, 0);
[Fact]
public async Task Sync_subscribes_to_ScanState_per_host()
{
var subs = new FakeSubscriber();
using var mgr = new GalaxyRuntimeProbeManager(subs.Subscribe, subs.Unsubscribe);
await mgr.SyncAsync(new[]
{
new HostProbeTarget("PlatformA", GalaxyRuntimeProbeManager.CategoryWinPlatform),
new HostProbeTarget("EngineB", GalaxyRuntimeProbeManager.CategoryAppEngine),
});
mgr.ActiveProbeCount.ShouldBe(2);
subs.Subs.ShouldContainKey("PlatformA.ScanState");
subs.Subs.ShouldContainKey("EngineB.ScanState");
}
[Fact]
public async Task Sync_is_idempotent_on_repeat_call_with_same_set()
{
var subs = new FakeSubscriber();
using var mgr = new GalaxyRuntimeProbeManager(subs.Subscribe, subs.Unsubscribe);
var targets = new[] { new HostProbeTarget("PlatformA", 1) };
await mgr.SyncAsync(targets);
await mgr.SyncAsync(targets);
mgr.ActiveProbeCount.ShouldBe(1);
subs.Subs.Count.ShouldBe(1);
subs.UnsubCalls.Count.ShouldBe(0);
}
[Fact]
public async Task Sync_unadvises_removed_hosts()
{
var subs = new FakeSubscriber();
using var mgr = new GalaxyRuntimeProbeManager(subs.Subscribe, subs.Unsubscribe);
await mgr.SyncAsync(new[]
{
new HostProbeTarget("PlatformA", 1),
new HostProbeTarget("PlatformB", 1),
});
await mgr.SyncAsync(new[] { new HostProbeTarget("PlatformA", 1) });
mgr.ActiveProbeCount.ShouldBe(1);
subs.UnsubCalls.ShouldContain("PlatformB.ScanState");
}
[Fact]
public async Task Subscribe_failure_rolls_back_host_entry_so_later_transitions_do_not_fire_stale_events()
{
var subs = new FakeSubscriber { FailSubscribeFor = true, FailSubscribeTag = "PlatformA.ScanState" };
using var mgr = new GalaxyRuntimeProbeManager(subs.Subscribe, subs.Unsubscribe);
await mgr.SyncAsync(new[] { new HostProbeTarget("PlatformA", 1) });
mgr.ActiveProbeCount.ShouldBe(0); // rolled back
mgr.GetState("PlatformA").ShouldBe(HostRuntimeState.Unknown);
}
[Fact]
public async Task Unknown_to_Running_does_not_fire_StateChanged()
{
var now = new DateTime(2026, 4, 18, 10, 0, 0, DateTimeKind.Utc);
var subs = new FakeSubscriber();
using var mgr = new GalaxyRuntimeProbeManager(subs.Subscribe, subs.Unsubscribe, () => now);
var transitions = new ConcurrentQueue<HostStateTransition>();
mgr.StateChanged += (_, t) => transitions.Enqueue(t);
await mgr.SyncAsync(new[] { new HostProbeTarget("PlatformA", 1) });
subs.Subs["PlatformA.ScanState"]("PlatformA.ScanState", Good(true));
mgr.GetState("PlatformA").ShouldBe(HostRuntimeState.Running);
transitions.Count.ShouldBe(0); // startup transition, operators don't care
}
[Fact]
public async Task Running_to_Stopped_fires_StateChanged_with_both_states()
{
var now = new DateTime(2026, 4, 18, 10, 0, 0, DateTimeKind.Utc);
var subs = new FakeSubscriber();
using var mgr = new GalaxyRuntimeProbeManager(subs.Subscribe, subs.Unsubscribe, () => now);
var transitions = new ConcurrentQueue<HostStateTransition>();
mgr.StateChanged += (_, t) => transitions.Enqueue(t);
await mgr.SyncAsync(new[] { new HostProbeTarget("PlatformA", 1) });
subs.Subs["PlatformA.ScanState"]("PlatformA.ScanState", Good(true)); // Unknown→Running (silent)
subs.Subs["PlatformA.ScanState"]("PlatformA.ScanState", Good(false)); // Running→Stopped (fires)
transitions.Count.ShouldBe(1);
transitions.TryDequeue(out var t).ShouldBeTrue();
t!.TagName.ShouldBe("PlatformA");
t.OldState.ShouldBe(HostRuntimeState.Running);
t.NewState.ShouldBe(HostRuntimeState.Stopped);
}
[Fact]
public async Task Stopped_to_Running_fires_StateChanged_for_recovery()
{
var now = new DateTime(2026, 4, 18, 10, 0, 0, DateTimeKind.Utc);
var subs = new FakeSubscriber();
using var mgr = new GalaxyRuntimeProbeManager(subs.Subscribe, subs.Unsubscribe, () => now);
var transitions = new ConcurrentQueue<HostStateTransition>();
mgr.StateChanged += (_, t) => transitions.Enqueue(t);
await mgr.SyncAsync(new[] { new HostProbeTarget("PlatformA", 1) });
subs.Subs["PlatformA.ScanState"]("PlatformA.ScanState", Good(true)); // Unknown→Running (silent)
subs.Subs["PlatformA.ScanState"]("PlatformA.ScanState", Good(false)); // Running→Stopped (fires)
subs.Subs["PlatformA.ScanState"]("PlatformA.ScanState", Good(true)); // Stopped→Running (fires)
transitions.Count.ShouldBe(2);
}
[Fact]
public async Task Unknown_to_Stopped_fires_StateChanged_for_first_known_bad_signal()
{
var now = new DateTime(2026, 4, 18, 10, 0, 0, DateTimeKind.Utc);
var subs = new FakeSubscriber();
using var mgr = new GalaxyRuntimeProbeManager(subs.Subscribe, subs.Unsubscribe, () => now);
var transitions = new ConcurrentQueue<HostStateTransition>();
mgr.StateChanged += (_, t) => transitions.Enqueue(t);
await mgr.SyncAsync(new[] { new HostProbeTarget("PlatformA", 1) });
// First callback is bad-quality — we must flag the host Stopped so operators see it.
subs.Subs["PlatformA.ScanState"]("PlatformA.ScanState", Bad());
transitions.Count.ShouldBe(1);
transitions.TryDequeue(out var t).ShouldBeTrue();
t!.OldState.ShouldBe(HostRuntimeState.Unknown);
t.NewState.ShouldBe(HostRuntimeState.Stopped);
}
[Fact]
public async Task Repeated_Good_Running_callbacks_do_not_fire_duplicate_events()
{
var now = new DateTime(2026, 4, 18, 10, 0, 0, DateTimeKind.Utc);
var subs = new FakeSubscriber();
using var mgr = new GalaxyRuntimeProbeManager(subs.Subscribe, subs.Unsubscribe, () => now);
var count = 0;
mgr.StateChanged += (_, _) => Interlocked.Increment(ref count);
await mgr.SyncAsync(new[] { new HostProbeTarget("PlatformA", 1) });
for (var i = 0; i < 5; i++)
subs.Subs["PlatformA.ScanState"]("PlatformA.ScanState", Good(true));
count.ShouldBe(0); // only the silent Unknown→Running on the first, no events after
}
[Fact]
public async Task Unknown_callback_for_non_tracked_probe_is_dropped()
{
var subs = new FakeSubscriber();
using var mgr = new GalaxyRuntimeProbeManager(subs.Subscribe, subs.Unsubscribe);
mgr.OnProbeCallback("ProbeForSomeoneElse.ScanState", Good(true));
mgr.ActiveProbeCount.ShouldBe(0);
}
[Fact]
public async Task Snapshot_reports_current_state_for_every_tracked_host()
{
var now = new DateTime(2026, 4, 18, 10, 0, 0, DateTimeKind.Utc);
var subs = new FakeSubscriber();
using var mgr = new GalaxyRuntimeProbeManager(subs.Subscribe, subs.Unsubscribe, () => now);
await mgr.SyncAsync(new[]
{
new HostProbeTarget("PlatformA", 1),
new HostProbeTarget("EngineB", 3),
});
subs.Subs["PlatformA.ScanState"]("PlatformA.ScanState", Good(true)); // Running
subs.Subs["EngineB.ScanState"]("EngineB.ScanState", Bad()); // Stopped
var snap = mgr.SnapshotStates();
snap.Count.ShouldBe(2);
snap.ShouldContain(s => s.TagName == "PlatformA" && s.State == HostRuntimeState.Running);
snap.ShouldContain(s => s.TagName == "EngineB" && s.State == HostRuntimeState.Stopped);
}
[Fact]
public void IsRuntimeHost_recognizes_WinPlatform_and_AppEngine_category_ids()
{
new HostProbeTarget("X", GalaxyRuntimeProbeManager.CategoryWinPlatform).IsRuntimeHost.ShouldBeTrue();
new HostProbeTarget("X", GalaxyRuntimeProbeManager.CategoryAppEngine).IsRuntimeHost.ShouldBeTrue();
new HostProbeTarget("X", 4 /* $Area */).IsRuntimeHost.ShouldBeFalse();
new HostProbeTarget("X", 11 /* $ApplicationObject */).IsRuntimeHost.ShouldBeFalse();
}
}

View File

@@ -0,0 +1,94 @@
using System;
using System.Linq;
using Shouldly;
using Xunit;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.Historian;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Tests
{
[Trait("Category", "Unit")]
public sealed class HistorianClusterEndpointPickerTests
{
private static HistorianConfiguration Config(params string[] nodes) => new()
{
ServerName = "ignored",
ServerNames = nodes.ToList(),
FailureCooldownSeconds = 60,
};
[Fact]
public void Single_node_config_falls_back_to_ServerName_when_ServerNames_empty()
{
var cfg = new HistorianConfiguration { ServerName = "only-node", ServerNames = new() };
var p = new HistorianClusterEndpointPicker(cfg);
p.NodeCount.ShouldBe(1);
p.GetHealthyNodes().ShouldBe(new[] { "only-node" });
}
[Fact]
public void Failed_node_enters_cooldown_and_is_skipped()
{
var now = new DateTime(2026, 4, 18, 10, 0, 0, DateTimeKind.Utc);
var p = new HistorianClusterEndpointPicker(Config("a", "b"), () => now);
p.MarkFailed("a", "boom");
p.GetHealthyNodes().ShouldBe(new[] { "b" });
}
[Fact]
public void Cooldown_expires_after_configured_window()
{
var clock = new DateTime(2026, 4, 18, 10, 0, 0, DateTimeKind.Utc);
var p = new HistorianClusterEndpointPicker(Config("a", "b"), () => clock);
p.MarkFailed("a", "boom");
p.GetHealthyNodes().ShouldBe(new[] { "b" });
clock = clock.AddSeconds(61);
p.GetHealthyNodes().ShouldBe(new[] { "a", "b" });
}
[Fact]
public void MarkHealthy_immediately_clears_cooldown()
{
var now = new DateTime(2026, 4, 18, 10, 0, 0, DateTimeKind.Utc);
var p = new HistorianClusterEndpointPicker(Config("a"), () => now);
p.MarkFailed("a", "boom");
p.GetHealthyNodes().ShouldBeEmpty();
p.MarkHealthy("a");
p.GetHealthyNodes().ShouldBe(new[] { "a" });
}
[Fact]
public void All_nodes_in_cooldown_returns_empty_healthy_list()
{
var now = new DateTime(2026, 4, 18, 10, 0, 0, DateTimeKind.Utc);
var p = new HistorianClusterEndpointPicker(Config("a", "b"), () => now);
p.MarkFailed("a", "x");
p.MarkFailed("b", "y");
p.GetHealthyNodes().ShouldBeEmpty();
p.NodeCount.ShouldBe(2);
}
[Fact]
public void Snapshot_reports_failure_count_and_last_error()
{
var now = new DateTime(2026, 4, 18, 10, 0, 0, DateTimeKind.Utc);
var p = new HistorianClusterEndpointPicker(Config("a"), () => now);
p.MarkFailed("a", "first");
p.MarkFailed("a", "second");
var snap = p.SnapshotNodeStates().Single();
snap.FailureCount.ShouldBe(2);
snap.LastError.ShouldBe("second");
snap.IsHealthy.ShouldBeFalse();
snap.CooldownUntil.ShouldNotBeNull();
}
[Fact]
public void Duplicate_hostnames_are_deduplicated_case_insensitively()
{
var p = new HistorianClusterEndpointPicker(Config("NodeA", "nodea", "NodeB"));
p.NodeCount.ShouldBe(2);
}
}
}

View File

@@ -0,0 +1,61 @@
using Shouldly;
using Xunit;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.Historian;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Tests;
[Trait("Category", "Unit")]
public sealed class HistorianQualityMapperTests
{
/// <summary>
/// Rich mapping preserves specific OPC DA subcodes through the historian ToWire path.
/// Before PR 12 the category-only fallback collapsed e.g. BadNotConnected(8) to
/// Bad(0x80000000) so downstream OPC UA clients could not distinguish transport issues
/// from sensor issues. After PR 12 every known subcode round-trips to its canonical
/// uint32 StatusCode and Proxy translation stays byte-for-byte with v1 QualityMapper.
/// </summary>
[Theory]
[InlineData((byte)192, 0x00000000u)] // Good
[InlineData((byte)216, 0x00D80000u)] // Good_LocalOverride
[InlineData((byte)64, 0x40000000u)] // Uncertain
[InlineData((byte)68, 0x40900000u)] // Uncertain_LastUsableValue
[InlineData((byte)80, 0x40930000u)] // Uncertain_SensorNotAccurate
[InlineData((byte)84, 0x40940000u)] // Uncertain_EngineeringUnitsExceeded
[InlineData((byte)88, 0x40950000u)] // Uncertain_SubNormal
[InlineData((byte)0, 0x80000000u)] // Bad
[InlineData((byte)4, 0x80890000u)] // Bad_ConfigurationError
[InlineData((byte)8, 0x808A0000u)] // Bad_NotConnected
[InlineData((byte)12, 0x808B0000u)] // Bad_DeviceFailure
[InlineData((byte)16, 0x808C0000u)] // Bad_SensorFailure
[InlineData((byte)20, 0x80050000u)] // Bad_CommunicationError
[InlineData((byte)24, 0x808D0000u)] // Bad_OutOfService
[InlineData((byte)32, 0x80320000u)] // Bad_WaitingForInitialData
public void Maps_specific_OPC_DA_codes_to_canonical_StatusCode(byte quality, uint expected)
{
HistorianQualityMapper.Map(quality).ShouldBe(expected);
}
[Theory]
[InlineData((byte)200)] // Good — unknown subcode in Good family
[InlineData((byte)255)] // Good — unknown
public void Unknown_good_family_codes_fall_back_to_plain_Good(byte q)
{
HistorianQualityMapper.Map(q).ShouldBe(0x00000000u);
}
[Theory]
[InlineData((byte)100)] // Uncertain — unknown subcode
[InlineData((byte)150)] // Uncertain — unknown
public void Unknown_uncertain_family_codes_fall_back_to_plain_Uncertain(byte q)
{
HistorianQualityMapper.Map(q).ShouldBe(0x40000000u);
}
[Theory]
[InlineData((byte)1)] // Bad — unknown subcode
[InlineData((byte)50)] // Bad — unknown
public void Unknown_bad_family_codes_fall_back_to_plain_Bad(byte q)
{
HistorianQualityMapper.Map(q).ShouldBe(0x80000000u);
}
}

View File

@@ -0,0 +1,109 @@
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using Shouldly;
using Xunit;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.Galaxy;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.Historian;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.MxAccess;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Sta;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared.Contracts;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Tests
{
[Trait("Category", "Unit")]
public sealed class HistorianWiringTests
{
/// <summary>
/// When the Proxy sends a HistoryRead but the supervisor never enabled the historian
/// (OTOPCUA_HISTORIAN_ENABLED unset), we expect a clean Success=false with a
/// self-explanatory error — not an exception or a hang against localhost.
/// </summary>
[Fact]
public async Task HistoryReadAsync_returns_disabled_error_when_no_historian_configured()
{
using var pump = new StaPump("Test.Sta");
await pump.WaitForStartedAsync();
var mx = new MxAccessClient(pump, new MxProxyAdapter(), "HistorianWiringTests");
using var backend = new MxAccessGalaxyBackend(
new GalaxyRepository(new GalaxyRepositoryOptions { ConnectionString = "Server=.;Database=ZB;Integrated Security=True;" }),
mx,
historian: null);
var resp = await backend.HistoryReadAsync(new HistoryReadRequest
{
TagReferences = new[] { "TestTag" },
StartUtcUnixMs = 0,
EndUtcUnixMs = DateTimeOffset.UtcNow.ToUnixTimeMilliseconds(),
MaxValuesPerTag = 100,
}, CancellationToken.None);
resp.Success.ShouldBeFalse();
resp.Error.ShouldContain("Historian disabled");
resp.Tags.ShouldBeEmpty();
}
/// <summary>
/// When the historian is wired up, we expect the backend to call through and map
/// samples onto the IPC wire shape. Uses a fake <see cref="IHistorianDataSource"/>
/// that returns a single known-good sample so we can assert the mapping stays sane.
/// </summary>
[Fact]
public async Task HistoryReadAsync_maps_sample_to_GalaxyDataValue()
{
using var pump = new StaPump("Test.Sta");
await pump.WaitForStartedAsync();
var mx = new MxAccessClient(pump, new MxProxyAdapter(), "HistorianWiringTests");
var fake = new FakeHistorianDataSource(new HistorianSample
{
Value = 42.5,
Quality = 192, // Good
TimestampUtc = new DateTime(2026, 4, 18, 9, 0, 0, DateTimeKind.Utc),
});
using var backend = new MxAccessGalaxyBackend(
new GalaxyRepository(new GalaxyRepositoryOptions { ConnectionString = "Server=.;Database=ZB;Integrated Security=True;" }),
mx,
fake);
var resp = await backend.HistoryReadAsync(new HistoryReadRequest
{
TagReferences = new[] { "TankLevel" },
StartUtcUnixMs = 0,
EndUtcUnixMs = DateTimeOffset.UtcNow.ToUnixTimeMilliseconds(),
MaxValuesPerTag = 100,
}, CancellationToken.None);
resp.Success.ShouldBeTrue();
resp.Tags.Length.ShouldBe(1);
resp.Tags[0].TagReference.ShouldBe("TankLevel");
resp.Tags[0].Values.Length.ShouldBe(1);
resp.Tags[0].Values[0].StatusCode.ShouldBe(0u); // Good
resp.Tags[0].Values[0].ValueBytes.ShouldNotBeNull();
resp.Tags[0].Values[0].SourceTimestampUtcUnixMs.ShouldBe(
new DateTimeOffset(2026, 4, 18, 9, 0, 0, TimeSpan.Zero).ToUnixTimeMilliseconds());
}
private sealed class FakeHistorianDataSource : IHistorianDataSource
{
private readonly HistorianSample _sample;
public FakeHistorianDataSource(HistorianSample sample) => _sample = sample;
public Task<List<HistorianSample>> ReadRawAsync(string tagName, DateTime s, DateTime e, int max, CancellationToken ct)
=> Task.FromResult(new List<HistorianSample> { _sample });
public Task<List<HistorianAggregateSample>> ReadAggregateAsync(string tagName, DateTime s, DateTime e, double ms, string col, CancellationToken ct)
=> Task.FromResult(new List<HistorianAggregateSample>());
public Task<List<HistorianSample>> ReadAtTimeAsync(string tagName, DateTime[] ts, CancellationToken ct)
=> Task.FromResult(new List<HistorianSample>());
public Task<List<HistorianEventDto>> ReadEventsAsync(string? src, DateTime s, DateTime e, int max, CancellationToken ct)
=> Task.FromResult(new List<HistorianEventDto>());
public HistorianHealthSnapshot GetHealthSnapshot() => new();
public void Dispose() { }
}
}
}

View File

@@ -0,0 +1,147 @@
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
using MessagePack;
using Shouldly;
using Xunit;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.Galaxy;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.Historian;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.MxAccess;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Sta;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared.Contracts;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Tests;
[Trait("Category", "Unit")]
public sealed class HistoryReadAtTimeTests
{
private static MxAccessGalaxyBackend BuildBackend(IHistorianDataSource? historian, StaPump pump) =>
new(
new GalaxyRepository(new GalaxyRepositoryOptions { ConnectionString = "Server=.;Database=ZB;Integrated Security=True;" }),
new MxAccessClient(pump, new MxProxyAdapter(), "attime-test"),
historian);
[Fact]
public async Task Returns_disabled_error_when_no_historian_configured()
{
using var pump = new StaPump("Test.Sta");
await pump.WaitForStartedAsync();
using var backend = BuildBackend(null, pump);
var resp = await backend.HistoryReadAtTimeAsync(new HistoryReadAtTimeRequest
{
TagReference = "T",
TimestampsUtcUnixMs = new[] { 1L, 2L },
}, CancellationToken.None);
resp.Success.ShouldBeFalse();
resp.Error.ShouldContain("Historian disabled");
}
[Fact]
public async Task Empty_timestamp_list_short_circuits_to_success_with_no_values()
{
using var pump = new StaPump("Test.Sta");
await pump.WaitForStartedAsync();
var fake = new FakeHistorian();
using var backend = BuildBackend(fake, pump);
var resp = await backend.HistoryReadAtTimeAsync(new HistoryReadAtTimeRequest
{
TagReference = "T",
TimestampsUtcUnixMs = Array.Empty<long>(),
}, CancellationToken.None);
resp.Success.ShouldBeTrue();
resp.Values.ShouldBeEmpty();
fake.Calls.ShouldBe(0); // no round-trip to SDK for empty timestamp list
}
[Fact]
public async Task Timestamps_survive_Unix_ms_round_trip_to_DateTime()
{
using var pump = new StaPump("Test.Sta");
await pump.WaitForStartedAsync();
var t1 = new DateTime(2026, 4, 18, 10, 0, 0, DateTimeKind.Utc);
var t2 = new DateTime(2026, 4, 18, 10, 5, 0, DateTimeKind.Utc);
var fake = new FakeHistorian(
new HistorianSample { Value = 100.0, Quality = 192, TimestampUtc = t1 },
new HistorianSample { Value = 101.5, Quality = 192, TimestampUtc = t2 });
using var backend = BuildBackend(fake, pump);
var resp = await backend.HistoryReadAtTimeAsync(new HistoryReadAtTimeRequest
{
TagReference = "TankLevel",
TimestampsUtcUnixMs = new[]
{
new DateTimeOffset(t1, TimeSpan.Zero).ToUnixTimeMilliseconds(),
new DateTimeOffset(t2, TimeSpan.Zero).ToUnixTimeMilliseconds(),
},
}, CancellationToken.None);
resp.Success.ShouldBeTrue();
resp.Values.Length.ShouldBe(2);
resp.Values[0].SourceTimestampUtcUnixMs.ShouldBe(new DateTimeOffset(t1, TimeSpan.Zero).ToUnixTimeMilliseconds());
resp.Values[0].StatusCode.ShouldBe(0u); // Good (quality 192)
MessagePackSerializer.Deserialize<double>(resp.Values[0].ValueBytes!).ShouldBe(100.0);
fake.Calls.ShouldBe(1);
fake.LastTimestamps.Length.ShouldBe(2);
fake.LastTimestamps[0].ShouldBe(t1);
fake.LastTimestamps[1].ShouldBe(t2);
}
[Fact]
public async Task Missing_sample_maps_to_Bad_category()
{
using var pump = new StaPump("Test.Sta");
await pump.WaitForStartedAsync();
// Quality=0 means no sample at that timestamp per HistorianDataSource.ReadAtTimeAsync.
var fake = new FakeHistorian(new HistorianSample
{
Value = null,
Quality = 0,
TimestampUtc = DateTime.UtcNow,
});
using var backend = BuildBackend(fake, pump);
var resp = await backend.HistoryReadAtTimeAsync(new HistoryReadAtTimeRequest
{
TagReference = "T",
TimestampsUtcUnixMs = new[] { 1L },
}, CancellationToken.None);
resp.Success.ShouldBeTrue();
resp.Values.Length.ShouldBe(1);
resp.Values[0].StatusCode.ShouldBe(0x80000000u); // Bad category
resp.Values[0].ValueBytes.ShouldBeNull();
}
private sealed class FakeHistorian : IHistorianDataSource
{
private readonly HistorianSample[] _samples;
public int Calls { get; private set; }
public DateTime[] LastTimestamps { get; private set; } = Array.Empty<DateTime>();
public FakeHistorian(params HistorianSample[] samples) => _samples = samples;
public Task<List<HistorianSample>> ReadAtTimeAsync(string tag, DateTime[] ts, CancellationToken ct)
{
Calls++;
LastTimestamps = ts;
return Task.FromResult(new List<HistorianSample>(_samples));
}
public Task<List<HistorianSample>> ReadRawAsync(string tag, DateTime s, DateTime e, int max, CancellationToken ct)
=> Task.FromResult(new List<HistorianSample>());
public Task<List<HistorianAggregateSample>> ReadAggregateAsync(string tag, DateTime s, DateTime e, double ms, string col, CancellationToken ct)
=> Task.FromResult(new List<HistorianAggregateSample>());
public Task<List<HistorianEventDto>> ReadEventsAsync(string? src, DateTime s, DateTime e, int max, CancellationToken ct)
=> Task.FromResult(new List<HistorianEventDto>());
public HistorianHealthSnapshot GetHealthSnapshot() => new();
public void Dispose() { }
}
}

View File

@@ -0,0 +1,129 @@
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using Shouldly;
using Xunit;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.Galaxy;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.Historian;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.MxAccess;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Sta;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared.Contracts;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Tests;
[Trait("Category", "Unit")]
public sealed class HistoryReadEventsTests
{
private static MxAccessGalaxyBackend BuildBackend(IHistorianDataSource? h, StaPump pump) =>
new(
new GalaxyRepository(new GalaxyRepositoryOptions { ConnectionString = "Server=.;Database=ZB;Integrated Security=True;" }),
new MxAccessClient(pump, new MxProxyAdapter(), "events-test"),
h);
[Fact]
public async Task Returns_disabled_error_when_no_historian_configured()
{
using var pump = new StaPump("Test.Sta");
await pump.WaitForStartedAsync();
using var backend = BuildBackend(null, pump);
var resp = await backend.HistoryReadEventsAsync(new HistoryReadEventsRequest
{
SourceName = "TankA",
StartUtcUnixMs = 0,
EndUtcUnixMs = DateTimeOffset.UtcNow.ToUnixTimeMilliseconds(),
MaxEvents = 100,
}, CancellationToken.None);
resp.Success.ShouldBeFalse();
resp.Error.ShouldContain("Historian disabled");
}
[Fact]
public async Task Maps_HistorianEventDto_to_GalaxyHistoricalEvent_wire_shape()
{
using var pump = new StaPump("Test.Sta");
await pump.WaitForStartedAsync();
var eventId = Guid.NewGuid();
var eventTime = new DateTime(2026, 4, 18, 10, 0, 0, DateTimeKind.Utc);
var receivedTime = eventTime.AddMilliseconds(150);
var fake = new FakeHistorian(new HistorianEventDto
{
Id = eventId,
Source = "TankA.Level.HiHi",
EventTime = eventTime,
ReceivedTime = receivedTime,
DisplayText = "HiHi alarm tripped",
Severity = 900,
});
using var backend = BuildBackend(fake, pump);
var resp = await backend.HistoryReadEventsAsync(new HistoryReadEventsRequest
{
SourceName = "TankA.Level.HiHi",
StartUtcUnixMs = 0,
EndUtcUnixMs = DateTimeOffset.UtcNow.ToUnixTimeMilliseconds(),
MaxEvents = 50,
}, CancellationToken.None);
resp.Success.ShouldBeTrue();
resp.Events.Length.ShouldBe(1);
var got = resp.Events[0];
got.EventId.ShouldBe(eventId.ToString());
got.SourceName.ShouldBe("TankA.Level.HiHi");
got.DisplayText.ShouldBe("HiHi alarm tripped");
got.Severity.ShouldBe<ushort>(900);
got.EventTimeUtcUnixMs.ShouldBe(new DateTimeOffset(eventTime, TimeSpan.Zero).ToUnixTimeMilliseconds());
got.ReceivedTimeUtcUnixMs.ShouldBe(new DateTimeOffset(receivedTime, TimeSpan.Zero).ToUnixTimeMilliseconds());
fake.LastSourceName.ShouldBe("TankA.Level.HiHi");
fake.LastMaxEvents.ShouldBe(50);
}
[Fact]
public async Task Null_source_name_is_passed_through_as_all_sources()
{
using var pump = new StaPump("Test.Sta");
await pump.WaitForStartedAsync();
var fake = new FakeHistorian();
using var backend = BuildBackend(fake, pump);
await backend.HistoryReadEventsAsync(new HistoryReadEventsRequest
{
SourceName = null,
StartUtcUnixMs = 0,
EndUtcUnixMs = 1,
MaxEvents = 10,
}, CancellationToken.None);
fake.LastSourceName.ShouldBeNull();
}
private sealed class FakeHistorian : IHistorianDataSource
{
private readonly HistorianEventDto[] _events;
public string? LastSourceName { get; private set; } = "<unset>";
public int LastMaxEvents { get; private set; }
public FakeHistorian(params HistorianEventDto[] events) => _events = events;
public Task<List<HistorianEventDto>> ReadEventsAsync(string? src, DateTime s, DateTime e, int max, CancellationToken ct)
{
LastSourceName = src;
LastMaxEvents = max;
return Task.FromResult(new List<HistorianEventDto>(_events));
}
public Task<List<HistorianSample>> ReadRawAsync(string tag, DateTime s, DateTime e, int max, CancellationToken ct)
=> Task.FromResult(new List<HistorianSample>());
public Task<List<HistorianAggregateSample>> ReadAggregateAsync(string tag, DateTime s, DateTime e, double ms, string col, CancellationToken ct)
=> Task.FromResult(new List<HistorianAggregateSample>());
public Task<List<HistorianSample>> ReadAtTimeAsync(string tag, DateTime[] ts, CancellationToken ct)
=> Task.FromResult(new List<HistorianSample>());
public HistorianHealthSnapshot GetHealthSnapshot() => new();
public void Dispose() { }
}
}

View File

@@ -0,0 +1,158 @@
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using MessagePack;
using Shouldly;
using Xunit;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.Galaxy;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.Historian;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.MxAccess;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Sta;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared.Contracts;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Tests;
[Trait("Category", "Unit")]
public sealed class HistoryReadProcessedTests
{
[Fact]
public async Task ReturnsDisabledError_When_NoHistorianConfigured()
{
using var pump = new StaPump("Test.Sta");
await pump.WaitForStartedAsync();
var mx = new MxAccessClient(pump, new MxProxyAdapter(), "processed-test");
using var backend = new MxAccessGalaxyBackend(
new GalaxyRepository(new GalaxyRepositoryOptions { ConnectionString = "Server=.;Database=ZB;Integrated Security=True;" }),
mx,
historian: null);
var resp = await backend.HistoryReadProcessedAsync(new HistoryReadProcessedRequest
{
TagReference = "T",
StartUtcUnixMs = 0,
EndUtcUnixMs = DateTimeOffset.UtcNow.ToUnixTimeMilliseconds(),
IntervalMs = 1000,
AggregateColumn = "Average",
}, CancellationToken.None);
resp.Success.ShouldBeFalse();
resp.Error.ShouldContain("Historian disabled");
}
[Fact]
public async Task Rejects_NonPositiveInterval()
{
using var pump = new StaPump("Test.Sta");
await pump.WaitForStartedAsync();
var mx = new MxAccessClient(pump, new MxProxyAdapter(), "processed-test");
var fake = new FakeHistorianDataSource();
using var backend = new MxAccessGalaxyBackend(
new GalaxyRepository(new GalaxyRepositoryOptions { ConnectionString = "Server=.;Database=ZB;Integrated Security=True;" }),
mx,
fake);
var resp = await backend.HistoryReadProcessedAsync(new HistoryReadProcessedRequest
{
TagReference = "T",
IntervalMs = 0,
AggregateColumn = "Average",
}, CancellationToken.None);
resp.Success.ShouldBeFalse();
resp.Error.ShouldContain("IntervalMs");
}
[Fact]
public async Task Maps_AggregateSample_With_Value_To_Good()
{
using var pump = new StaPump("Test.Sta");
await pump.WaitForStartedAsync();
var mx = new MxAccessClient(pump, new MxProxyAdapter(), "processed-test");
var fake = new FakeHistorianDataSource(new HistorianAggregateSample
{
Value = 12.34,
TimestampUtc = new DateTime(2026, 4, 18, 10, 0, 0, DateTimeKind.Utc),
});
using var backend = new MxAccessGalaxyBackend(
new GalaxyRepository(new GalaxyRepositoryOptions { ConnectionString = "Server=.;Database=ZB;Integrated Security=True;" }),
mx,
fake);
var resp = await backend.HistoryReadProcessedAsync(new HistoryReadProcessedRequest
{
TagReference = "T",
StartUtcUnixMs = 0,
EndUtcUnixMs = DateTimeOffset.UtcNow.ToUnixTimeMilliseconds(),
IntervalMs = 60_000,
AggregateColumn = "Average",
}, CancellationToken.None);
resp.Success.ShouldBeTrue();
resp.Values.Length.ShouldBe(1);
resp.Values[0].StatusCode.ShouldBe(0u); // Good
resp.Values[0].ValueBytes.ShouldNotBeNull();
MessagePackSerializer.Deserialize<double>(resp.Values[0].ValueBytes!).ShouldBe(12.34);
fake.LastAggregateColumn.ShouldBe("Average");
fake.LastIntervalMs.ShouldBe(60_000d);
}
[Fact]
public async Task Maps_Null_Bucket_To_BadNoData()
{
using var pump = new StaPump("Test.Sta");
await pump.WaitForStartedAsync();
var mx = new MxAccessClient(pump, new MxProxyAdapter(), "processed-test");
var fake = new FakeHistorianDataSource(new HistorianAggregateSample
{
Value = null,
TimestampUtc = DateTime.UtcNow,
});
using var backend = new MxAccessGalaxyBackend(
new GalaxyRepository(new GalaxyRepositoryOptions { ConnectionString = "Server=.;Database=ZB;Integrated Security=True;" }),
mx,
fake);
var resp = await backend.HistoryReadProcessedAsync(new HistoryReadProcessedRequest
{
TagReference = "T",
IntervalMs = 1000,
AggregateColumn = "Minimum",
}, CancellationToken.None);
resp.Success.ShouldBeTrue();
resp.Values.Length.ShouldBe(1);
resp.Values[0].StatusCode.ShouldBe(0x800E0000u); // BadNoData
resp.Values[0].ValueBytes.ShouldBeNull();
}
private sealed class FakeHistorianDataSource : IHistorianDataSource
{
private readonly HistorianAggregateSample[] _samples;
public string? LastAggregateColumn { get; private set; }
public double LastIntervalMs { get; private set; }
public FakeHistorianDataSource(params HistorianAggregateSample[] samples) => _samples = samples;
public Task<List<HistorianSample>> ReadRawAsync(string tag, DateTime s, DateTime e, int max, CancellationToken ct)
=> Task.FromResult(new List<HistorianSample>());
public Task<List<HistorianAggregateSample>> ReadAggregateAsync(
string tag, DateTime s, DateTime e, double intervalMs, string col, CancellationToken ct)
{
LastAggregateColumn = col;
LastIntervalMs = intervalMs;
return Task.FromResult(new List<HistorianAggregateSample>(_samples));
}
public Task<List<HistorianSample>> ReadAtTimeAsync(string tag, DateTime[] ts, CancellationToken ct)
=> Task.FromResult(new List<HistorianSample>());
public Task<List<HistorianEventDto>> ReadEventsAsync(string? src, DateTime s, DateTime e, int max, CancellationToken ct)
=> Task.FromResult(new List<HistorianEventDto>());
public HistorianHealthSnapshot GetHealthSnapshot() => new();
public void Dispose() { }
}
}

View File

@@ -0,0 +1,91 @@
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;
using ArchestrA.MxAccess;
using Shouldly;
using Xunit;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.Galaxy;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.MxAccess;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Sta;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared.Contracts;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Tests;
[Trait("Category", "Unit")]
public sealed class HostStatusPushTests
{
/// <summary>
/// PR 8 — when MxAccessClient.ConnectionStateChanged fires false→true→false,
/// MxAccessGalaxyBackend raises OnHostStatusChanged once per transition with
/// HostName=ClientName, RuntimeStatus="Running"/"Stopped", and a timestamp.
/// This is the gateway-level signal; per-platform ScanState probes are deferred.
/// </summary>
[Fact]
public async Task ConnectionStateChanged_raises_OnHostStatusChanged_with_gateway_name()
{
using var pump = new StaPump("Test.Sta");
await pump.WaitForStartedAsync();
var proxy = new FakeProxy();
var mx = new MxAccessClient(pump, proxy, "GatewayClient", new MxAccessClientOptions { AutoReconnect = false });
using var backend = new MxAccessGalaxyBackend(
new GalaxyRepository(new GalaxyRepositoryOptions { ConnectionString = "Server=.;Database=ZB;Integrated Security=True;" }),
mx,
historian: null);
var notifications = new ConcurrentQueue<HostConnectivityStatus>();
backend.OnHostStatusChanged += (_, s) => notifications.Enqueue(s);
await mx.ConnectAsync();
await mx.DisconnectAsync();
notifications.Count.ShouldBe(2);
notifications.TryDequeue(out var first).ShouldBeTrue();
first!.HostName.ShouldBe("GatewayClient");
first.RuntimeStatus.ShouldBe("Running");
first.LastObservedUtcUnixMs.ShouldBeGreaterThan(0);
notifications.TryDequeue(out var second).ShouldBeTrue();
second!.HostName.ShouldBe("GatewayClient");
second.RuntimeStatus.ShouldBe("Stopped");
}
[Fact]
public async Task Dispose_unsubscribes_so_post_dispose_state_changes_do_not_fire_events()
{
using var pump = new StaPump("Test.Sta");
await pump.WaitForStartedAsync();
var proxy = new FakeProxy();
var mx = new MxAccessClient(pump, proxy, "GatewayClient", new MxAccessClientOptions { AutoReconnect = false });
var backend = new MxAccessGalaxyBackend(
new GalaxyRepository(new GalaxyRepositoryOptions { ConnectionString = "Server=.;Database=ZB;Integrated Security=True;" }),
mx,
historian: null);
var count = 0;
backend.OnHostStatusChanged += (_, _) => Interlocked.Increment(ref count);
await mx.ConnectAsync();
count.ShouldBe(1);
backend.Dispose();
await mx.DisconnectAsync();
count.ShouldBe(1); // no second notification after Dispose
}
private sealed class FakeProxy : IMxProxy
{
private int _next = 1;
public int Register(string _) => 42;
public void Unregister(int _) { }
public int AddItem(int _, string __) => Interlocked.Increment(ref _next);
public void RemoveItem(int _, int __) { }
public void AdviseSupervisory(int _, int __) { }
public void UnAdviseSupervisory(int _, int __) { }
public void Write(int _, int __, object ___, int ____) { }
public event MxDataChangeHandler? OnDataChange { add { } remove { } }
public event MxWriteCompleteHandler? OnWriteComplete { add { } remove { } }
}
}

View File

@@ -0,0 +1,119 @@
using System;
using System.IO;
using System.IO.Pipes;
using System.Security.Principal;
using System.Threading;
using System.Threading.Tasks;
using MessagePack;
using Serilog;
using Serilog.Core;
using Shouldly;
using Xunit;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Ipc;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared.Contracts;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Tests
{
/// <summary>
/// Direct IPC handshake test — drives <see cref="PipeServer"/> with a hand-rolled client
/// built on <see cref="FrameReader"/>/<see cref="FrameWriter"/> from Shared. Stays in
/// net48 x86 alongside the Host (the Proxy's <c>GalaxyIpcClient</c> is net10 only and
/// cannot be loaded into this process). Functionally equivalent to going through
/// <c>GalaxyIpcClient</c> — proves the wire protocol + ACL + shared-secret enforcement.
/// Skipped on Administrator shells per the same PipeAcl-denies-Administrators guard.
/// </summary>
[Trait("Category", "Integration")]
public sealed class IpcHandshakeIntegrationTests
{
private static bool IsAdministrator()
{
using var identity = WindowsIdentity.GetCurrent();
return new WindowsPrincipal(identity).IsInRole(WindowsBuiltInRole.Administrator);
}
private static async Task<(NamedPipeClientStream Stream, FrameReader Reader, FrameWriter Writer)>
ConnectAndHelloAsync(string pipeName, string secret, CancellationToken ct)
{
var stream = new NamedPipeClientStream(".", pipeName, PipeDirection.InOut, PipeOptions.Asynchronous);
await stream.ConnectAsync(5_000, ct);
var reader = new FrameReader(stream, leaveOpen: true);
var writer = new FrameWriter(stream, leaveOpen: true);
await writer.WriteAsync(MessageKind.Hello,
new Hello { PeerName = "test-client", SharedSecret = secret }, ct);
var ack = await reader.ReadFrameAsync(ct);
if (ack is null) throw new EndOfStreamException("no HelloAck");
if (ack.Value.Kind != MessageKind.HelloAck) throw new InvalidOperationException("unexpected first frame");
var ackMsg = MessagePackSerializer.Deserialize<HelloAck>(ack.Value.Body);
if (!ackMsg.Accepted) throw new UnauthorizedAccessException(ackMsg.RejectReason);
return (stream, reader, writer);
}
[Fact]
public async Task Handshake_with_correct_secret_succeeds_and_heartbeat_round_trips()
{
if (IsAdministrator()) return;
using var identity = WindowsIdentity.GetCurrent();
var sid = identity.User!;
var pipe = $"OtOpcUaGalaxyTest-{Guid.NewGuid():N}";
const string secret = "test-secret-2026";
Logger log = new LoggerConfiguration().CreateLogger();
using var cts = new CancellationTokenSource(TimeSpan.FromSeconds(10));
var server = new PipeServer(pipe, sid, secret, log);
var serverTask = Task.Run(() => server.RunOneConnectionAsync(
new GalaxyFrameHandler(new StubGalaxyBackend(), log), cts.Token));
var (stream, reader, writer) = await ConnectAndHelloAsync(pipe, secret, cts.Token);
using (stream)
using (reader)
using (writer)
{
await writer.WriteAsync(MessageKind.Heartbeat,
new Heartbeat { SequenceNumber = 42, UtcUnixMs = 1000 }, cts.Token);
var hbAckFrame = await reader.ReadFrameAsync(cts.Token);
hbAckFrame.HasValue.ShouldBeTrue();
hbAckFrame!.Value.Kind.ShouldBe(MessageKind.HeartbeatAck);
MessagePackSerializer.Deserialize<HeartbeatAck>(hbAckFrame.Value.Body).SequenceNumber.ShouldBe(42L);
}
cts.Cancel();
try { await serverTask; } catch { /* shutdown */ }
server.Dispose();
}
[Fact]
public async Task Handshake_with_wrong_secret_is_rejected()
{
if (IsAdministrator()) return;
using var identity = WindowsIdentity.GetCurrent();
var sid = identity.User!;
var pipe = $"OtOpcUaGalaxyTest-{Guid.NewGuid():N}";
Logger log = new LoggerConfiguration().CreateLogger();
using var cts = new CancellationTokenSource(TimeSpan.FromSeconds(10));
var server = new PipeServer(pipe, sid, "real-secret", log);
var serverTask = Task.Run(() => server.RunOneConnectionAsync(
new GalaxyFrameHandler(new StubGalaxyBackend(), log), cts.Token));
await Should.ThrowAsync<UnauthorizedAccessException>(async () =>
{
var (s, r, w) = await ConnectAndHelloAsync(pipe, "wrong-secret", cts.Token);
s.Dispose();
r.Dispose();
w.Dispose();
});
cts.Cancel();
try { await serverTask; } catch { /* shutdown */ }
server.Dispose();
}
}
}

View File

@@ -0,0 +1,173 @@
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using ArchestrA.MxAccess;
using Shouldly;
using Xunit;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.MxAccess;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Sta;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Tests;
[Trait("Category", "Unit")]
public sealed class MxAccessClientMonitorLoopTests
{
/// <summary>
/// PR 6 low finding #1 — every $Heartbeat probe must RemoveItem the item handle it
/// allocated. Without that, the monitor leaks one handle per MonitorInterval seconds,
/// which over a 24h uptime becomes thousands of leaked MXAccess handles and can
/// eventually exhaust the runtime proxy's handle table.
/// </summary>
[Fact]
public async Task Heartbeat_probe_calls_RemoveItem_for_every_AddItem()
{
using var pump = new StaPump("Monitor.Sta");
await pump.WaitForStartedAsync();
var proxy = new CountingProxy();
var client = new MxAccessClient(pump, proxy, "probe-test", new MxAccessClientOptions
{
AutoReconnect = true,
MonitorInterval = TimeSpan.FromMilliseconds(150),
StaleThreshold = TimeSpan.FromMilliseconds(50),
});
await client.ConnectAsync();
// Wait past StaleThreshold, then let several monitor cycles fire.
await Task.Delay(700);
client.Dispose();
// One Heartbeat probe fires per monitor tick once the connection looks stale.
proxy.HeartbeatAddCount.ShouldBeGreaterThan(1);
// Every AddItem("$Heartbeat") must be matched by a RemoveItem on the same handle.
proxy.HeartbeatAddCount.ShouldBe(proxy.HeartbeatRemoveCount);
proxy.OutstandingHeartbeatHandles.ShouldBe(0);
}
/// <summary>
/// PR 6 low finding #2 — after reconnect, per-subscription replay failures must raise
/// SubscriptionReplayFailed so the backend can propagate the degradation, not get
/// silently eaten.
/// </summary>
[Fact]
public async Task SubscriptionReplayFailed_fires_for_each_tag_that_fails_to_replay()
{
using var pump = new StaPump("Replay.Sta");
await pump.WaitForStartedAsync();
var proxy = new ReplayFailingProxy(failOnReplayForTags: new[] { "BadTag.A", "BadTag.B" });
var client = new MxAccessClient(pump, proxy, "replay-test", new MxAccessClientOptions
{
AutoReconnect = true,
MonitorInterval = TimeSpan.FromMilliseconds(120),
StaleThreshold = TimeSpan.FromMilliseconds(50),
});
var failures = new ConcurrentBag<SubscriptionReplayFailedEventArgs>();
client.SubscriptionReplayFailed += (_, e) => failures.Add(e);
await client.ConnectAsync();
await client.SubscribeAsync("GoodTag.X", (_, _) => { });
await client.SubscribeAsync("BadTag.A", (_, _) => { });
await client.SubscribeAsync("BadTag.B", (_, _) => { });
proxy.TriggerProbeFailureOnNextCall();
// Wait for the monitor loop to probe → fail → reconnect → replay.
await Task.Delay(800);
client.Dispose();
failures.Count.ShouldBe(2);
var names = new HashSet<string>();
foreach (var f in failures) names.Add(f.TagReference);
names.ShouldContain("BadTag.A");
names.ShouldContain("BadTag.B");
}
// ----- test doubles -----
private sealed class CountingProxy : IMxProxy
{
private int _next = 1;
private readonly ConcurrentDictionary<int, string> _live = new();
public int HeartbeatAddCount;
public int HeartbeatRemoveCount;
public int OutstandingHeartbeatHandles => _live.Count;
public event MxDataChangeHandler? OnDataChange { add { } remove { } }
public event MxWriteCompleteHandler? OnWriteComplete { add { } remove { } }
public int Register(string _) => 42;
public void Unregister(int _) { }
public int AddItem(int _, string address)
{
var h = Interlocked.Increment(ref _next);
_live[h] = address;
if (address == "$Heartbeat") Interlocked.Increment(ref HeartbeatAddCount);
return h;
}
public void RemoveItem(int _, int itemHandle)
{
if (_live.TryRemove(itemHandle, out var addr) && addr == "$Heartbeat")
Interlocked.Increment(ref HeartbeatRemoveCount);
}
public void AdviseSupervisory(int _, int __) { }
public void UnAdviseSupervisory(int _, int __) { }
public void Write(int _, int __, object ___, int ____) { }
}
/// <summary>
/// Mock that lets us exercise the reconnect + replay path. TriggerProbeFailureOnNextCall
/// flips a one-shot flag so the very next AddItem("$Heartbeat") throws — that drives the
/// monitor loop into the reconnect-with-replay branch. During the replay, AddItem for the
/// tags listed in failOnReplayForTags throws so SubscriptionReplayFailed should fire once
/// per failing tag.
/// </summary>
private sealed class ReplayFailingProxy : IMxProxy
{
private int _next = 1;
private readonly HashSet<string> _failOnReplay;
private int _probeFailOnce;
private readonly ConcurrentDictionary<string, bool> _replayedOnce = new(StringComparer.OrdinalIgnoreCase);
public ReplayFailingProxy(IEnumerable<string> failOnReplayForTags)
{
_failOnReplay = new HashSet<string>(failOnReplayForTags, StringComparer.OrdinalIgnoreCase);
}
public void TriggerProbeFailureOnNextCall() => Interlocked.Exchange(ref _probeFailOnce, 1);
public event MxDataChangeHandler? OnDataChange { add { } remove { } }
public event MxWriteCompleteHandler? OnWriteComplete { add { } remove { } }
public int Register(string _) => 42;
public void Unregister(int _) { }
public int AddItem(int _, string address)
{
if (address == "$Heartbeat" && Interlocked.Exchange(ref _probeFailOnce, 0) == 1)
throw new InvalidOperationException("simulated probe failure");
// Fail only on the *replay* AddItem for listed tags — not the initial subscribe.
if (_failOnReplay.Contains(address) && _replayedOnce.ContainsKey(address))
throw new InvalidOperationException($"simulated replay failure for {address}");
if (_failOnReplay.Contains(address)) _replayedOnce[address] = true;
return Interlocked.Increment(ref _next);
}
public void RemoveItem(int _, int __) { }
public void AdviseSupervisory(int _, int __) { }
public void UnAdviseSupervisory(int _, int __) { }
public void Write(int _, int __, object ___, int ____) { }
}
}

View File

@@ -0,0 +1,116 @@
using System;
using System.Threading;
using System.Threading.Tasks;
using Shouldly;
using Xunit;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.Galaxy;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.MxAccess;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Sta;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared.Contracts;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Tests
{
/// <summary>
/// End-to-end smoke against the live MXAccess COM runtime + Galaxy ZB DB on this dev box.
/// Skipped when ArchestrA bootstrap (<c>aaBootstrap</c>) isn't running. Verifies the
/// ported <see cref="MxAccessClient"/> can connect to <c>LMXProxyServer</c>, the
/// <see cref="MxAccessGalaxyBackend"/> can answer Discover against the live ZB schema,
/// and a one-shot read returns a valid VTQ for the first deployed attribute it finds.
/// </summary>
[Trait("Category", "LiveMxAccess")]
public sealed class MxAccessLiveSmokeTests
{
private static GalaxyRepositoryOptions DevZb() => new()
{
ConnectionString = "Server=localhost;Database=ZB;Integrated Security=True;TrustServerCertificate=True;Encrypt=False;Connect Timeout=2;",
CommandTimeoutSeconds = 10,
};
private static async Task<bool> ArchestraReachableAsync()
{
try
{
var repo = new GalaxyRepository(DevZb());
using var cts = new CancellationTokenSource(TimeSpan.FromSeconds(2));
if (!await repo.TestConnectionAsync(cts.Token)) return false;
using var sc = new System.ServiceProcess.ServiceController("aaBootstrap");
return sc.Status == System.ServiceProcess.ServiceControllerStatus.Running;
}
catch { return false; }
}
[Fact]
public async Task Connect_to_local_LMXProxyServer_succeeds()
{
if (!await ArchestraReachableAsync()) return;
using var pump = new StaPump("MxA-test-pump");
await pump.WaitForStartedAsync();
using var mx = new MxAccessClient(pump, new MxProxyAdapter(), "OtOpcUa-MxAccessSmoke");
var handle = await mx.ConnectAsync();
handle.ShouldBeGreaterThan(0);
mx.IsConnected.ShouldBeTrue();
}
[Fact]
public async Task Backend_OpenSession_then_Discover_returns_objects_with_attributes()
{
if (!await ArchestraReachableAsync()) return;
using var pump = new StaPump("MxA-test-pump");
await pump.WaitForStartedAsync();
using var mx = new MxAccessClient(pump, new MxProxyAdapter(), "OtOpcUa-MxAccessSmoke");
var backend = new MxAccessGalaxyBackend(new GalaxyRepository(DevZb()), mx);
var session = await backend.OpenSessionAsync(new OpenSessionRequest { DriverInstanceId = "smoke" }, CancellationToken.None);
session.Success.ShouldBeTrue(session.Error);
var resp = await backend.DiscoverAsync(new DiscoverHierarchyRequest { SessionId = session.SessionId }, CancellationToken.None);
resp.Success.ShouldBeTrue(resp.Error);
resp.Objects.Length.ShouldBeGreaterThan(0);
await backend.CloseSessionAsync(new CloseSessionRequest { SessionId = session.SessionId }, CancellationToken.None);
}
/// <summary>
/// Live one-shot read against any attribute we discover. Best-effort — passes silently
/// if no readable attribute is exposed (some Galaxy installs are configuration-only;
/// we only assert the call shape is correct, not a specific value).
/// </summary>
[Fact]
public async Task Backend_ReadValues_against_discovered_attribute_returns_a_response_shape()
{
if (!await ArchestraReachableAsync()) return;
using var pump = new StaPump("MxA-test-pump");
await pump.WaitForStartedAsync();
using var mx = new MxAccessClient(pump, new MxProxyAdapter(), "OtOpcUa-MxAccessSmoke");
var backend = new MxAccessGalaxyBackend(new GalaxyRepository(DevZb()), mx);
var session = await backend.OpenSessionAsync(new OpenSessionRequest { DriverInstanceId = "smoke" }, CancellationToken.None);
var disc = await backend.DiscoverAsync(new DiscoverHierarchyRequest { SessionId = session.SessionId }, CancellationToken.None);
var firstAttr = System.Linq.Enumerable.FirstOrDefault(disc.Objects, o => o.Attributes.Length > 0);
if (firstAttr is null)
{
await backend.CloseSessionAsync(new CloseSessionRequest { SessionId = session.SessionId }, CancellationToken.None);
return;
}
var fullRef = $"{firstAttr.TagName}.{firstAttr.Attributes[0].AttributeName}";
var read = await backend.ReadValuesAsync(
new ReadValuesRequest { SessionId = session.SessionId, TagReferences = new[] { fullRef } },
CancellationToken.None);
read.Success.ShouldBeTrue();
read.Values.Length.ShouldBe(1);
// We don't assert the value (it may be Bad/Uncertain depending on what's running);
// we only assert the response shape is correct end-to-end.
read.Values[0].TagReference.ShouldBe(fullRef);
await backend.CloseSessionAsync(new CloseSessionRequest { SessionId = session.SessionId }, CancellationToken.None);
}
}
}

View File

@@ -2,6 +2,8 @@
<PropertyGroup>
<TargetFramework>net48</TargetFramework>
<PlatformTarget>x86</PlatformTarget>
<Prefer32Bit>true</Prefer32Bit>
<Nullable>enable</Nullable>
<LangVersion>latest</LangVersion>
<IsPackable>false</IsPackable>
@@ -21,6 +23,12 @@
<ItemGroup>
<ProjectReference Include="..\..\src\ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host\ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.csproj"/>
<Reference Include="System.ServiceProcess"/>
<!-- IMxProxy's delegate signatures mention ArchestrA.MxAccess.MXSTATUS_PROXY, so tests
implementing the interface must resolve that type at compile time. -->
<Reference Include="ArchestrA.MxAccess">
<HintPath>..\..\lib\ArchestrA.MxAccess.dll</HintPath>
</Reference>
</ItemGroup>
<ItemGroup>

View File

@@ -0,0 +1,27 @@
using Shouldly;
using Xunit;
using ZB.MOM.WW.OtOpcUa.Core.Abstractions;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy.Tests;
[Trait("Category", "Unit")]
public sealed class AggregateColumnMappingTests
{
[Theory]
[InlineData(HistoryAggregateType.Average, "Average")]
[InlineData(HistoryAggregateType.Minimum, "Minimum")]
[InlineData(HistoryAggregateType.Maximum, "Maximum")]
[InlineData(HistoryAggregateType.Count, "ValueCount")]
public void Maps_OpcUa_enum_to_AnalogSummary_column(HistoryAggregateType aggregate, string expected)
{
GalaxyProxyDriver.MapAggregateToColumn(aggregate).ShouldBe(expected);
}
[Fact]
public void Total_is_not_supported()
{
Should.Throw<System.NotSupportedException>(
() => GalaxyProxyDriver.MapAggregateToColumn(HistoryAggregateType.Total));
}
}

View File

@@ -0,0 +1,130 @@
using System.Diagnostics;
using System.Reflection;
using System.Security.Principal;
using Shouldly;
using Xunit;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy.Ipc;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared.Contracts;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy.Tests;
/// <summary>
/// The honest cross-FX parity test — spawns the actual <c>OtOpcUa.Driver.Galaxy.Host.exe</c>
/// subprocess (net48 x86), the Proxy connects via real named pipe, exercises Discover
/// against the live Galaxy ZB DB, and asserts gobjects come back. This is the production
/// deployment shape (Tier C: separate process, IPC over named pipe, Proxy in the .NET 10
/// server process). Skipped when the Host EXE isn't built or Galaxy is unreachable.
/// </summary>
[Trait("Category", "ProcessSpawnParity")]
public sealed class HostSubprocessParityTests : IDisposable
{
private Process? _hostProcess;
public void Dispose()
{
if (_hostProcess is not null && !_hostProcess.HasExited)
{
try { _hostProcess.Kill(entireProcessTree: true); } catch { /* ignore */ }
try { _hostProcess.WaitForExit(5_000); } catch { /* ignore */ }
}
_hostProcess?.Dispose();
}
private static string? FindHostExe()
{
// The test assembly lives at tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy.Tests/bin/Debug/net10.0/.
// The Host EXE lives at src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host/bin/Debug/net48/.
var asmDir = Path.GetDirectoryName(Assembly.GetExecutingAssembly().Location)!;
var solutionRoot = asmDir;
for (var i = 0; i < 8 && solutionRoot is not null; i++)
{
if (File.Exists(Path.Combine(solutionRoot, "ZB.MOM.WW.OtOpcUa.slnx")))
break;
solutionRoot = Path.GetDirectoryName(solutionRoot);
}
if (solutionRoot is null) return null;
var candidate = Path.Combine(solutionRoot,
"src", "ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host", "bin", "Debug", "net48",
"OtOpcUa.Driver.Galaxy.Host.exe");
return File.Exists(candidate) ? candidate : null;
}
private static bool IsAdministrator()
{
if (!OperatingSystem.IsWindows()) return false;
using var identity = WindowsIdentity.GetCurrent();
return new WindowsPrincipal(identity).IsInRole(WindowsBuiltInRole.Administrator);
}
private static async Task<bool> ZbReachableAsync()
{
try
{
using var client = new System.Net.Sockets.TcpClient();
var task = client.ConnectAsync("localhost", 1433);
return await Task.WhenAny(task, Task.Delay(1_500)) == task && client.Connected;
}
catch { return false; }
}
[Fact]
public async Task Spawned_Host_in_db_mode_lets_Proxy_Discover_real_Galaxy_gobjects()
{
if (!OperatingSystem.IsWindows() || IsAdministrator()) return;
if (!await ZbReachableAsync()) return;
var hostExe = FindHostExe();
if (hostExe is null) return; // skip when the Host hasn't been built
using var identity = WindowsIdentity.GetCurrent();
var sid = identity.User!;
var pipeName = $"OtOpcUaGalaxyParity-{Guid.NewGuid():N}";
const string secret = "parity-secret";
var psi = new ProcessStartInfo(hostExe)
{
UseShellExecute = false,
CreateNoWindow = true,
RedirectStandardOutput = true,
RedirectStandardError = true,
EnvironmentVariables =
{
["OTOPCUA_GALAXY_PIPE"] = pipeName,
["OTOPCUA_ALLOWED_SID"] = sid.Value,
["OTOPCUA_GALAXY_SECRET"] = secret,
["OTOPCUA_GALAXY_BACKEND"] = "db", // SQL-only — doesn't need MXAccess
["OTOPCUA_GALAXY_ZB_CONN"] = "Server=localhost;Database=ZB;Integrated Security=True;TrustServerCertificate=True;Encrypt=False;",
},
};
_hostProcess = Process.Start(psi)
?? throw new InvalidOperationException("Failed to spawn Galaxy.Host");
// Wait for the pipe to come up — the Host's PipeServer takes ~100ms to bind.
await Task.Delay(2_000);
await using var client = await GalaxyIpcClient.ConnectAsync(
pipeName, secret, TimeSpan.FromSeconds(5), CancellationToken.None);
var sessionResp = await client.CallAsync<OpenSessionRequest, OpenSessionResponse>(
MessageKind.OpenSessionRequest,
new OpenSessionRequest { DriverInstanceId = "parity", DriverConfigJson = "{}" },
MessageKind.OpenSessionResponse,
CancellationToken.None);
sessionResp.Success.ShouldBeTrue(sessionResp.Error);
var discoverResp = await client.CallAsync<DiscoverHierarchyRequest, DiscoverHierarchyResponse>(
MessageKind.DiscoverHierarchyRequest,
new DiscoverHierarchyRequest { SessionId = sessionResp.SessionId },
MessageKind.DiscoverHierarchyResponse,
CancellationToken.None);
discoverResp.Success.ShouldBeTrue(discoverResp.Error);
discoverResp.Objects.Length.ShouldBeGreaterThan(0,
"live Galaxy ZB has at least one deployed gobject");
await client.SendOneWayAsync(MessageKind.CloseSessionRequest,
new CloseSessionRequest { SessionId = sessionResp.SessionId }, CancellationToken.None);
}
}

View File

@@ -1,91 +0,0 @@
using System.IO.Pipes;
using System.Security.Principal;
using Serilog;
using Serilog.Core;
using Shouldly;
using Xunit;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Ipc;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy.Ipc;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared.Contracts;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy.Tests;
/// <summary>
/// End-to-end IPC test: <see cref="PipeServer"/> (from Galaxy.Host) accepts a connection from
/// the Proxy's <see cref="GalaxyIpcClient"/>. Verifies the Hello handshake, shared-secret
/// check, and heartbeat round-trip. Uses the current user's SID so the ACL allows the
/// localhost test process. Skipped on non-Windows (pipe ACL is Windows-only).
/// </summary>
[Trait("Category", "Integration")]
public sealed class IpcHandshakeIntegrationTests
{
[Fact]
public async Task Hello_handshake_with_correct_secret_succeeds_and_heartbeat_round_trips()
{
if (!OperatingSystem.IsWindows()) return; // pipe ACL is Windows-only
if (IsAdministrator()) return; // ACL explicitly denies Administrators — skip on admin shells
using var currentIdentity = WindowsIdentity.GetCurrent();
var allowedSid = currentIdentity.User!;
var pipeName = $"OtOpcUaGalaxyTest-{Guid.NewGuid():N}";
const string secret = "test-secret-2026";
Logger log = new LoggerConfiguration().CreateLogger();
using var cts = new CancellationTokenSource(TimeSpan.FromSeconds(10));
var server = new PipeServer(pipeName, allowedSid, secret, log);
var serverTask = Task.Run(() => server.RunOneConnectionAsync(new StubFrameHandler(), cts.Token));
await using var client = await GalaxyIpcClient.ConnectAsync(
pipeName, secret, TimeSpan.FromSeconds(5), cts.Token);
// Heartbeat round-trip via the stub handler.
var ack = await client.CallAsync<Heartbeat, HeartbeatAck>(
MessageKind.Heartbeat,
new Heartbeat { SequenceNumber = 42, UtcUnixMs = 1000 },
MessageKind.HeartbeatAck,
cts.Token);
ack.SequenceNumber.ShouldBe(42L);
cts.Cancel();
try { await serverTask; } catch (OperationCanceledException) { }
server.Dispose();
}
[Fact]
public async Task Hello_with_wrong_secret_is_rejected()
{
if (!OperatingSystem.IsWindows()) return;
if (IsAdministrator()) return;
using var currentIdentity = WindowsIdentity.GetCurrent();
var allowedSid = currentIdentity.User!;
var pipeName = $"OtOpcUaGalaxyTest-{Guid.NewGuid():N}";
Logger log = new LoggerConfiguration().CreateLogger();
using var cts = new CancellationTokenSource(TimeSpan.FromSeconds(10));
var server = new PipeServer(pipeName, allowedSid, "real-secret", log);
var serverTask = Task.Run(() => server.RunOneConnectionAsync(new StubFrameHandler(), cts.Token));
await Should.ThrowAsync<UnauthorizedAccessException>(() =>
GalaxyIpcClient.ConnectAsync(pipeName, "wrong-secret", TimeSpan.FromSeconds(5), cts.Token));
cts.Cancel();
try { await serverTask; } catch { /* server loop ends */ }
server.Dispose();
}
/// <summary>
/// The production ACL explicitly denies Administrators. On dev boxes the interactive user
/// is often an Administrator, so the allow rule gets overridden by the deny — the pipe
/// refuses the connection. Skip in that case; the production install runs as a dedicated
/// non-admin service account.
/// </summary>
private static bool IsAdministrator()
{
if (!OperatingSystem.IsWindows()) return false;
using var identity = WindowsIdentity.GetCurrent();
var principal = new WindowsPrincipal(identity);
return principal.IsInRole(WindowsBuiltInRole.Administrator);
}
}

View File

@@ -6,7 +6,14 @@
<LangVersion>9.0</LangVersion>
<Nullable>enable</Nullable>
<IsPackable>false</IsPackable>
<IsTestProject>true</IsTestProject>
<!--
Phase 2 Stream D — V1 ARCHIVE. References v1 OtOpcUa.Host directly.
Excluded from `dotnet test` solution runs; replaced by the v2
OtOpcUa.Driver.Galaxy.E2E suite. To run explicitly:
dotnet test tests/ZB.MOM.WW.OtOpcUa.IntegrationTests
See docs/v2/V1_ARCHIVE_STATUS.md.
-->
<IsTestProject>false</IsTestProject>
<RootNamespace>ZB.MOM.WW.OtOpcUa.IntegrationTests</RootNamespace>
</PropertyGroup>

Some files were not shown because too many files have changed in this diff Show More