Compare commits

...

27 Commits

Author SHA1 Message Date
Joseph Doherty
46834a43bd Phase 3 PR 17 — complete OPC UA server startup end-to-end + integration test. PR 16 shipped the materialization shape (DriverNodeManager / OtOpcUaServer) without the activation glue; this PR finishes the scope so an external OPC UA client can actually connect, browse, and read. New OpcUaServerOptions DTO bound from the OpcUaServer section of appsettings.json (EndpointUrl default opc.tcp://0.0.0.0:4840/OtOpcUa, ApplicationName, ApplicationUri, PkiStoreRoot default %ProgramData%\OtOpcUa\pki, AutoAcceptUntrustedClientCertificates default true for dev — production flips via config). OpcUaApplicationHost wraps Opc.Ua.Configuration.ApplicationInstance: BuildConfiguration constructs the ApplicationConfiguration programmatically (no external XML) with SecurityConfiguration pointing at <PkiStoreRoot>/own, /issuers, /trusted, /rejected directories — stack auto-creates the cert folders on first run and generates a self-signed application certificate via CheckApplicationInstanceCertificate, ServerConfiguration.BaseAddresses set to the endpoint URL + SecurityPolicies just None + UserTokenPolicies just Anonymous with PolicyId='Anonymous' + SecurityPolicyUri=None so the client's UserTokenPolicy lookup succeeds at OpenSession, TransportQuotas.OperationTimeout=15s + MinRequestThreadCount=5 / MaxRequestThreadCount=100 / MaxQueuedRequestCount=200, CertificateValidator auto-accepts untrusted when configured. StartAsync creates the OtOpcUaServer (passes DriverHost + ILoggerFactory so one DriverNodeManager is created per registered driver in CreateMasterNodeManager from PR 16), calls ApplicationInstance.Start(server) to bind the endpoint, then walks each DriverNodeManager and drives a fresh GenericDriverNodeManager.BuildAddressSpaceAsync against it so the driver's discovery streams into the address space that's already serving clients. Per-driver discovery is isolated per decision #12: a discovery exception marks the driver's subtree faulted but the server stays up serving the other drivers' subtrees. DriverHost.GetDriver(instanceId) public accessor added alongside the existing GetHealth so OtOpcUaServer can enumerate drivers during CreateMasterNodeManager. DriverNodeManager.Driver property made public so OpcUaApplicationHost can identify which driver each node manager wraps during the discovery loop. OpcUaServerService constructor takes OpcUaApplicationHost — ExecuteAsync sequence now: bootstrap.LoadCurrentGenerationAsync → applicationHost.StartAsync → infinite Task.Delay until stop. StopAsync disposes the application host (which stops the server via OtOpcUaServer.Stop) before disposing DriverHost. Program.cs binds OpcUaServerOptions from appsettings + registers OpcUaApplicationHost + OpcUaServerOptions as singletons. Integration test (OpcUaServerIntegrationTests, Category=Integration): IAsyncLifetime spins up the server on a random non-default port (48400+random for test isolation) with a per-test-run PKI store root (%temp%/otopcua-test-<guid>) + a FakeDriver registered in DriverHost that has ITagDiscovery + IReadable implementations — DiscoverAsync registers TestFolder>Var1, ReadAsync returns 42. Client_can_connect_and_browse_driver_subtree creates an in-process OPC UA client session via CoreClientUtils.SelectEndpoint (which talks to the running server's GetEndpoints and fetches the live EndpointDescription with the actual PolicyId), browses the fake driver's root, asserts TestFolder appears in the returned references. Client_can_read_a_driver_variable_through_the_node_manager constructs the variable NodeId using the namespace index the server registered (urn:OtOpcUa:fake), calls Session.ReadValue, asserts the DataValue.Value is 42 — the whole pipeline (client → server endpoint → DriverNodeManager.OnReadValue → FakeDriver.ReadAsync → back through the node manager → response to client) round-trips correctly. Dispose tears down the session, server, driver host, and PKI store directory. Full solution: 0 errors, 165 tests pass (8 Core unit + 14 Proxy unit + 24 Configuration unit + 6 Shared unit + 91 Galaxy.Host unit + 4 Server (2 unit NodeBootstrap + 2 new integration) + 18 Admin). End-to-end outcome: PR 14's GalaxyAlarmTracker alarm events now flow through PR 15's GenericDriverNodeManager event forwarder → PR 16's ConditionSink → OPC UA AlarmConditionState.ReportEvent → out to every OPC UA client subscribed to the alarm condition. The full alarm subsystem (driver-side subscription of the Galaxy 4-attribute quartet, Core-side routing by source node id, Server-side AlarmConditionState materialization with ReportEvent dispatch) is now complete and observable through any compliant OPC UA client. LDAP / security-profile wire-up (replacing the anonymous-only endpoint with BasicSignAndEncrypt + user identity mapping to NodePermissions role) is the next layer — it reuses the same ApplicationConfiguration plumbing this PR introduces but needs a deployment-policy source (central config DB) for the cert trust decisions.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 08:18:37 -04:00
7683b94287 Merge pull request 'Phase 3 PR 16 — concrete OPC UA server scaffolding + AlarmConditionState materialization' (#15) from phase-3-pr16-opcua-server into v2 2026-04-18 08:10:44 -04:00
Joseph Doherty
f53c39a598 Phase 3 PR 16 — concrete OPC UA server scaffolding with AlarmConditionState materialization. Introduces the OPCFoundation.NetStandard.Opc.Ua.Server package (v1.5.374.126, same version the v1 stack already uses) and two new server-side classes: DriverNodeManager : CustomNodeManager2 is the concrete realization of PR 15's IAddressSpaceBuilder contract — Folder() creates FolderState nodes under an Organizes hierarchy rooted at ObjectsFolder > DriverInstanceId; Variable() creates BaseDataVariableState with DataType mapped from DriverDataType (Boolean/Int32/Float/Double/String/DateTime) + ValueRank (Scalar or OneDimension) + AccessLevel CurrentReadOrWrite; AddProperty() creates PropertyState with HasProperty reference. Read hook wires OnReadValue per variable to route to IReadable.ReadAsync; Write hook wires OnWriteValue to route to IWritable.WriteAsync and surface per-tag StatusCode. MarkAsAlarmCondition() materializes an OPC UA AlarmConditionState child of the variable, seeded from AlarmConditionInfo (SourceName, InitialSeverity → UA severity via Low=250/Medium=500/High=700/Critical=900, InitialDescription), initial state Enabled + Acknowledged + Inactive + Retain=false. Returns an IAlarmConditionSink whose OnTransition updates alarm.Severity/Time/Message and switches state per AlarmType string ('Active' → SetActiveState(true) + SetAcknowledgedState(false) + Retain=true; 'Acknowledged' → SetAcknowledgedState(true); 'Inactive' → SetActiveState(false) + Retain=false if already Acked) then calls alarm.ReportEvent to emit the OPC UA event to subscribed clients. Galaxy's GalaxyAlarmTracker (PR 14) now lands at a concrete AlarmConditionState node instead of just raising an unobserved C# event. OtOpcUaServer : StandardServer wires one DriverNodeManager per DriverHost.GetDriver during CreateMasterNodeManager — anonymous endpoint, no security profile (minimum-viable; LDAP + security-profile wire-up is the next PR). DriverHost gains public GetDriver(instanceId) so the server can enumerate drivers at startup. NestedBuilder inner class in DriverNodeManager implements IAddressSpaceBuilder by temporarily retargeting the parent's _currentFolder during each call so Folder→Variable→AddProperty land under the correct subtree — not thread-safe if discovery ran concurrently, but GenericDriverNodeManager.BuildAddressSpaceAsync is sequential per driver so this is safe by construction. NuGet audit suppress for GHSA-h958-fxgg-g7w3 (moderate-severity in OPCFoundation.NetStandard.Opc.Ua.Core 1.5.374.126; v1 stack already accepts this risk on the same package version). PR 16 is scoped as scaffolding — the actual server startup (ApplicationInstance, certificate config, endpoint binding, session management wiring into OpcUaServerService.ExecuteAsync) is deferred to a follow-up PR because it needs ApplicationConfiguration XML + optional-cert-store logic that depends on per-deployment policy decisions. The materialization shape is complete: a subsequent PR adds 100 LOC to start the server and all the already-written IAddressSpaceBuilder + alarm-condition + read/write wire-up activates end-to-end. Full solution: 0 errors, 152 unit tests pass (no new tests this PR — DriverNodeManager unit testing needs an IServerInternal mock which is heavyweight; live-endpoint integration tests land alongside the server-startup PR).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 08:00:36 -04:00
d569c39f30 Merge pull request 'Phase 3 PR 15 — alarm-condition contract in abstract layer' (#14) from phase-3-pr15-alarm-contract into v2 2026-04-18 07:54:30 -04:00
Joseph Doherty
190d09cdeb Phase 3 PR 15 — alarm-condition contract in IAddressSpaceBuilder + wire OnAlarmEvent through GenericDriverNodeManager. IAddressSpaceBuilder.IVariableHandle gains MarkAsAlarmCondition(AlarmConditionInfo) which returns an IAlarmConditionSink. AlarmConditionInfo carries SourceName/InitialSeverity/InitialDescription. Concrete address-space builders (the upcoming PR 16 OPC UA server backend) materialize a sibling AlarmConditionState node on the first call; the sink receives every lifecycle transition the generic node manager forwards. GenericDriverNodeManager gains a CapturingBuilder wrapper that transparently wraps every Folder/Variable call — the wrapper observes MarkAsAlarmCondition calls without participating in materialization, captures the resulting IAlarmConditionSink into an internal source-node-id → sink ConcurrentDictionary keyed by IVariableHandle.FullReference. After DiscoverAsync completes, if the driver implements IAlarmSource the node manager subscribes to OnAlarmEvent and routes every AlarmEventArgs to the sink registered for args.SourceNodeId — unknown source ids are dropped silently (may belong to another driver or to a variable the builder chose not to flag). Dispose unsubscribes the forwarder to prevent dangling invocation-list references across node-manager rebuilds. GalaxyProxyDriver.DiscoverAsync now calls handle.MarkAsAlarmCondition(new AlarmConditionInfo(fullName, AlarmSeverity.Medium, null)) on every attr.IsAlarm=true variable — severity seed is Medium because the live Priority byte arrives through the subsequent GalaxyAlarmEvent stream (which PR 14's GalaxyAlarmTracker now emits); the Admin UI sees the severity update on the first transition. RecordingAddressSpaceBuilder in Driver.Galaxy.E2E gains a RecordedAlarmCondition list + a RecordingSink implementation that captures AlarmEventArgs for test assertion — the E2E parity suite can now verify alarm-condition registration shape in addition to folder/variable shape. Tests (4 new GenericDriverNodeManagerTests): Alarm_events_are_routed_to_the_sink_registered_for_the_matching_source_node_id — 2 alarms registered (Tank.HiHi + Heater.OverTemp), driver raises an event for Tank.HiHi, the Tank.HiHi sink captures the payload, the Heater.OverTemp sink does not (tag-scoped fan-out, not broadcast); Non_alarm_variables_do_not_register_sinks — plain Tank.Level in the same discover is not in TrackedAlarmSources; Unknown_source_node_id_is_dropped_silently — a transition for Unknown.Source doesn't reach any sink + no exception; Dispose_unsubscribes_from_OnAlarmEvent — post-dispose, a transition for a previously-registered tag is no-op because the forwarder detached. InternalsVisibleTo('ZB.MOM.WW.OtOpcUa.Core.Tests') added to Core csproj so TrackedAlarmSources internal property is visible to the test. Full solution: 0 errors, 152 unit tests pass (8 Core + 14 Proxy + 14 Admin + 24 Configuration + 6 Shared + 84 Galaxy.Host + 2 Server). PR 16 will implement the concrete OPC UA address-space builder that materializes AlarmConditionState from this contract.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 07:51:35 -04:00
4e0040e670 Merge pull request 'Phase 2 PR 14 — alarm subsystem (subscribe to alarm attribute quartet + raise GalaxyAlarmEvent)' (#13) from phase-2-pr14-alarm-subsystem into v2 2026-04-18 07:37:49 -04:00
91cb2a1355 Merge pull request 'Phase 2 PR 13 — port GalaxyRuntimeProbeManager + per-platform ScanState probing' (#12) from phase-2-pr13-runtime-probe into v2 2026-04-18 07:37:41 -04:00
Joseph Doherty
c14624f012 Phase 2 PR 14 — alarm subsystem wire-up. Per IsAlarm=true attribute (PR 9 added the discovery flag), GalaxyAlarmTracker in Backend/Alarms/ advises the four Galaxy alarm-state attributes: .InAlarm (boolean alarm active), .Priority (int severity), .DescAttrName (human-readable description), .Acked (boolean acknowledged). Runs the OPC UA Part 9 alarm lifecycle state machine simplified for the Galaxy AlarmExtension model and raises AlarmTransition events on transitions operators must react to — Active (InAlarm false→true, default Unacknowledged), Acknowledged (Acked false→true while InAlarm still true), Inactive (InAlarm true→false). MxAccessGalaxyBackend instantiates the tracker in its constructor with delegate-based subscribe/unsubscribe/write pointers to MxAccessClient, hooks TransitionRaised to forward each transition through the existing OnAlarmEvent IPC event that PR 4 ConnectionSink wires into MessageKind.AlarmEvent frames — no new contract messages required since GalaxyAlarmEvent already exists in Shared.Contracts. Field mapping: EventId = fresh Guid.ToString('N') per transition, ObjectTagName = alarm attribute full reference, AlarmName = alarm attribute full reference, Severity = tracked Priority, StateTransition = 'Active'|'Acknowledged'|'Inactive', Message = DescAttrName or tag fallback, UtcUnixMs = transition time. DiscoverAsync caches every IsAlarm=true attribute's full reference (tag.attribute) into _discoveredAlarmTags (ConcurrentBag cleared-then-filled on every re-Discover to track Galaxy redeploys). SubscribeAlarmsAsync iterates the cache and advises each via GalaxyAlarmTracker.TrackAsync; best-effort per-alarm — a subscribe failure on one alarm doesn't abort the whole call since operators prefer partial alarm coverage to none. Tracker is internally idempotent on repeat Track calls (second invocation for same alarm tag is a no-op; already-subscribed check short-circuits before the 4 MXAccess sub calls). Subscribe-failure rollback inside TrackAsync removes the alarm state + unadvises any of the 4 that did succeed so a partial advise can't leak a phantom tracking entry. AcknowledgeAlarmAsync routes to tracker.AcknowledgeAsync which writes the operator comment to <tag>.AckMsg via MxAccessClient.WriteAsync — writes use the existing MXAccess OnWriteComplete TCS-by-handle path (PR 4 Medium 4) so a runtime-refused ack bubbles up as Success=false rather than false-positive. State-machine quirks preserved from v1: (1) initial Acked=true on subscribe does NOT fire Acknowledged (alarm at rest, pre-acknowledged — default state is Acked=true so the first subscribe callback is a no-op transition), (2) Acked false→true only fires Acknowledged when InAlarm is currently true (acking a latched-inactive alarm is not a user-visible transition), (3) Active transition clears the Acked flag in-state so the next Acked callback correctly fires Acknowledged (v1 had this buried in the ConditionState logic; we track it on the AlarmState struct directly). Priority value handled as int/short/long via type pattern match with int.MaxValue guard — Galaxy attribute category returns varying CLR types (Int32 is canonical but some older templates use Int16), and a long overflow cast to int would silently corrupt the severity. Dispose cascade in MxAccessGalaxyBackend.Dispose: alarm-tracker unsubscribe→dispose, probe-manager unsubscribe→dispose, mx.ConnectionStateChanged detach, historian dispose — same discipline PR 6 / PR 8 / PR 13 established so dangling invocation-list refs don't survive a backend recycle. #pragma warning disable CS0067 around OnAlarmEvent removed since the event is now raised. Tests (9 new, GalaxyAlarmTrackerTests): four-attribute subscribe per alarm, idempotent repeat-track, InAlarm false→true fires Active with Priority + Desc, InAlarm true→false fires Inactive, Acked false→true while InAlarm fires Acknowledged, Acked transition while InAlarm=false does not fire, AckMsg write path carries the comment, snapshot reports latest four fields, foreign probe callback for a non-tracked tag is silently dropped. Full Galaxy.Host.Tests Unit suite 84 pass / 0 fail (9 new alarm + 12 PR 13 probe + 21 PR 12 quality + 42 pre-existing). Galaxy.Host builds clean (0/0). Branches off phase-2-pr13-runtime-probe so the MxAccessGalaxyBackend constructor/Dispose chain gets the probe-manager + alarm-tracker wire-up in a coherent order; fast-forwards if PR 13 merges first.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 07:34:13 -04:00
Joseph Doherty
04d267d1ea Phase 2 PR 13 — port GalaxyRuntimeProbeManager state machine + wire per-platform ScanState probing. PR 8 gave operators the gateway-level transport signal (MxAccessClient.ConnectionStateChanged → OnHostStatusChanged tagged with the Wonderware client identity) — enough to detect when the entire MXAccess COM proxy dies, but silent when a specific or goes down while the gateway stays alive. This PR closes that gap: a pure-logic GalaxyRuntimeProbeManager (ported from v1's 472-LOC GalaxyRuntimeProbeManager.cs, distilled to ~240 LOC of state machine without the OPC UA node-manager entanglement) lives in Backend/Stability/, advises <TagName>.ScanState per WinPlatform + AppEngine gobject discovered during DiscoverAsync, runs the Unknown → Running → Stopped state machine with v1's documented semantics preserved verbatim, and raises a StateChanged event on transitions operators should react to. MxAccessGalaxyBackend instantiates the probe manager in the constructor with SubscribeAsync/UnsubscribeAsync delegate pointers into MxAccessClient, hooks StateChanged to forward each transition through the same OnHostStatusChanged IPC event the gateway signal uses (HostName = platform/engine TagName, RuntimeStatus = 'Running'|'Stopped'|'Unknown', LastObservedUtcUnixMs from the state-change timestamp), so Admin UI gets per-host signals flowing through the existing PR 8 wire with no additional IPC plumbing. State machine rules ported from v1 runtimestatus.md: (a) ScanState is on-change-only — a stably-Running host may go hours without a callback, so Running → Stopped is driven only by explicit ScanState=false, never by starvation; (b) Unknown → Running is a startup transition and does NOT fire StateChanged (would paint every host as 'just recovered' at startup, which is noise and can clear Bad quality set by a concurrently-stopping sibling); (c) Stopped → Running fires StateChanged for the real recovery case; (d) Running → Stopped fires StateChanged; (e) Unknown → Stopped fires StateChanged because that's the first-known-bad signal operators need when a host is down at our startup time. MxAccessGalaxyBackend.DiscoverAsync calls _probeManager.SyncAsync with the runtime-host subset of the hierarchy (CategoryId == 1 WinPlatform or 3 AppEngine) as a best-effort step after building the Discover response — probe failures are swallowed so Discover still returns the hierarchy even if a per-host advise fails; the gateway signal covers the critical rung. SyncAsync is idempotent (second call with the same set is a no-op) and handles the diff on re-Discover for tag rename / host add / host remove. Subscribe failure rolls back the host's state entry under the lock so a later probe callback for a never-advised tag can't transition a phantom state from Unknown to Stopped and fan out a false host-down signal (the same protection v1's GalaxyRuntimeProbeManager had at line 237-243 of v1 with a captured-probe-string comparison under the lock). MxAccessGalaxyBackend.Dispose unsubscribes the StateChanged handler before disposing the probe manager to prevent dangling invocation-list references across reconnects, same discipline as PR 8's ConnectionStateChanged and PR 6's SubscriptionReplayFailed. Tests (12 new GalaxyRuntimeProbeManagerTests): Sync_subscribes_to_ScanState_per_host verifies tag.ScanState subscriptions are advised per Platform/Engine; Sync_is_idempotent_on_repeat_call_with_same_set verifies no duplicate subscribes; Sync_unadvises_removed_hosts verifies the diff unadvises gone hosts; Subscribe_failure_rolls_back_host_entry_so_later_transitions_do_not_fire_stale_events covers the rollback-on-subscribe-fail guard; Unknown_to_Running_does_not_fire_StateChanged preserves the startup-noise rule; Running_to_Stopped_fires_StateChanged_with_both_states asserts OldState and NewState are both captured in the transition record; Stopped_to_Running_fires_StateChanged_for_recovery verifies the recovery case; Unknown_to_Stopped_fires_StateChanged_for_first_known_bad_signal preserves the first-bad rule; Repeated_Good_Running_callbacks_do_not_fire_duplicate_events verifies the state-tracking de-dup; Unknown_callback_for_non_tracked_probe_is_dropped asserts a foreign callback is silently ignored; Snapshot_reports_current_state_for_every_tracked_host covers the dashboard query hook; IsRuntimeHost_recognizes_WinPlatform_and_AppEngine_category_ids asserts the CategoryId filter. Galaxy.Host.Tests Unit suite 75 pass / 0 fail (12 new probe + 63 pre-existing). Galaxy.Host builds clean (0 errors / 0 warnings). Branches off v2.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 07:27:56 -04:00
4448db8207 Merge pull request 'Phase 2 PR 12 � richer historian quality mapping' (#11) from phase-2-pr12-quality-mapper into v2 2026-04-18 07:22:44 -04:00
d96b513bbc Merge pull request 'Phase 2 PR 11 � HistoryReadEvents IPC (alarm history)' (#10) from phase-2-pr11-history-events into v2 2026-04-18 07:22:33 -04:00
053c4e0566 Merge pull request 'Phase 2 PR 10 � HistoryReadAtTime IPC surface' (#9) from phase-2-pr10-history-attime into v2 2026-04-18 07:22:16 -04:00
Joseph Doherty
f24f969a85 Phase 2 PR 12 — richer historian quality mapping. Replace MxAccessGalaxyBackend's inline MapHistorianQualityToOpcUa category-only helper (192+→Good, 64-191→Uncertain, 0-63→Bad) with a new public HistorianQualityMapper.Map utility that preserves specific OPC DA subcodes — BadNotConnected(8)→0x808A0000u instead of generic Bad(0x80000000u), UncertainSubNormal(88)→0x40950000u instead of generic Uncertain, Good_LocalOverride(216)→0x00D80000u instead of generic Good, etc. Mirrors v1 QualityMapper.MapToOpcUaStatusCode byte-for-byte without pulling in OPC UA types — the function returns uint32 literals that are the canonical OPC UA StatusCode wire encoding, surfaced directly as DataValueSnapshot.StatusCode on the Proxy side with no additional translation. Unknown subcodes fall back to the family category (255→Good, 150→Uncertain, 50→Bad) so a future SDK change that adds a quality code we don't map yet still gets a sensible bucket. GalaxyDataValue wire shape unchanged (StatusCode stays uint) — this is a pure fidelity upgrade on the Host side. Downstream callers (Admin UI status dashboard, OPC UA clients receiving historian samples) can now distinguish e.g. a transport outage (BadNotConnected) from a sensor fault (BadSensorFailure) from a warm-up delay (BadWaitingForInitialData) without a second round-trip or dashboard heuristic. 21 new tests (HistorianQualityMapperTests): theory with 15 rows covering every specific mapping from the v1 QualityMapper table, plus 6 fallback tests verifying unknown-subcode codes in each family (Good/Uncertain/Bad) collapse to the family default. Galaxy.Host.Tests Unit suite 56/0 (21 new + 35 existing). Galaxy.Host builds clean (0/0). Branches off v2.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 07:11:02 -04:00
Joseph Doherty
ca025ebe0c Phase 2 PR 11 — HistoryReadEvents IPC (alarm history). New Shared.Contracts messages HistoryReadEventsRequest/Response + GalaxyHistoricalEvent DTO (MessageKind 0x66/0x67). IGalaxyBackend gains HistoryReadEventsAsync, Stub/DbBacked return canonical pending error, MxAccessGalaxyBackend delegates to _historian.ReadEventsAsync (ported in PR 5) and maps HistorianEventDto → GalaxyHistoricalEvent — Guid.ToString() for EventId wire shape, DateTime → Unix ms for both EventTime (when the event fired in the process) and ReceivedTime (when the Historian persisted it), DisplayText + Severity pass through. SourceName is string? — null means 'all sources' (passed straight through to HistorianDataSource.ReadEventsAsync which adds the AddEventFilter('Source', Equal, ...) only when non-null). Distinct from the live GalaxyAlarmEvent type because historical rows carry both timestamps and lack StateTransition (Historian logs instantaneous events, not the OPC UA Part 9 alarm lifecycle; translating to OPC UA event lifecycle is the alarm-subsystem's job). Guards: null historian → Historian-disabled error; SDK exception → Success=false with message chained. Tests (3 new): disabled-error when historian null, maps HistorianEventDto with full field set (Id/Source/EventTime/ReceivedTime/DisplayText/Severity=900) to GalaxyHistoricalEvent, null SourceName passes through unchanged (verifies the 'all sources' contract). Galaxy.Host.Tests Unit suite 34 pass / 0 fail. Galaxy.Host builds clean. Branches off phase-2-pr10-history-attime since both extend the MessageKind enum; fast-forwards if PR 10 merges first.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 07:08:16 -04:00
Joseph Doherty
d13f919112 Phase 2 PR 10 — HistoryReadAtTime IPC surface. New Shared.Contracts messages HistoryReadAtTimeRequest/Response (MessageKind 0x64/0x65), IGalaxyBackend gains HistoryReadAtTimeAsync, Stub/DbBacked return canonical pending error, MxAccessGalaxyBackend delegates to _historian.ReadAtTimeAsync (ported in PR 5, exposed now) — request timestamp array is flow-encoded as Unix ms to avoid MessagePack DateTime quirks then re-hydrated to DateTime on the Host side. Per-sample mapping uses the same ToWire(HistorianSample) helper as ReadRawAsync so the category→StatusCode mapping stays consistent (Quality byte 192+ → Good 0u, 64-191 → Uncertain, 0-63 → Bad 0x80000000u). Guards: null historian → "Historian disabled" (symmetric with other history paths); empty timestamp array short-circuits to Success=true, Values=[] without an SDK round-trip; SDK exception → Success=false with the message chained. Proxy-side IHistoryProvider.ReadAtTimeAsync capability doesn't exist in Core.Abstractions yet (OPC UA HistoryReadAtTime service is supported but the current IHistoryProvider only has ReadRawAsync + ReadProcessedAsync) — this PR adds the Host-side surface so a future Core.Abstractions extension can wire it through without needing another IPC change. Tests (4 new): disabled-error when historian null, empty-timestamp short-circuit without SDK call, Unix-ms↔DateTime round-trip with Good samples at two distinct timestamps, missing sample (Quality=0) maps to 0x80000000u Bad category. Galaxy.Host.Tests Unit suite: 31 pass / 0 fail (4 new at-time + 27 pre-existing). Galaxy.Host builds clean. Branches off v2.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 07:03:25 -04:00
d2ebb91cb1 Merge pull request 'Phase 2 PR 9 — thread IsAlarm discovery flag end-to-end' (#8) from phase-2-pr9-alarms into v2 2026-04-18 06:59:25 -04:00
90ce0af375 Merge pull request 'Phase 2 PR 8 — gateway-level host-status push from MxAccessGalaxyBackend' (#7) from phase-2-pr8-alarms-hoststatus into v2 2026-04-18 06:59:04 -04:00
e250356e2a Merge pull request 'Phase 2 PR 7 — wire IHistoryProvider.ReadProcessedAsync end-to-end' (#6) from phase-2-pr7-history-processed into v2 2026-04-18 06:59:02 -04:00
067ad78e06 Merge pull request 'Phase 2 PR 6 — close PR 4 monitor-loop low findings (probe leak + replay signal)' (#5) from phase-2-pr6-monitor-findings into v2 2026-04-18 06:57:57 -04:00
6cfa8d326d Merge pull request 'Phase 2 PR 4 — close 4 open MXAccess findings (push frames + reconnect + write-await + read-cancel)' (#3) from phase-2-pr4-findings into v2 2026-04-18 06:57:21 -04:00
Joseph Doherty
70a5d06b37 Phase 2 PR 9 — thread IsAlarm discovery flag end-to-end. GalaxyRepository.GetAttributesAsync has always emitted is_alarm alongside is_historized (CASE WHEN EXISTS with the primitive_definition join on primitive_name='AlarmExtension' per v1's Extended Attributes SQL lifted byte-for-byte into the PR 5 repository port), and GalaxyAttributeRow.IsAlarm has been populated since the port, but the flag was silently dropped at the MapAttribute helper in both MxAccessGalaxyBackend and DbBackedGalaxyBackend because GalaxyAttributeInfo on the IPC side had no field to carry it — every deployed alarm attribute arrived at the Proxy with no signal that it was alarm-bearing. This PR wires the flag through the three translation boundaries: GalaxyAttributeInfo gains [Key(6)] public bool IsAlarm { get; set; } at the end of the message to preserve wire-compat with pre-PR9 payloads that omit the key (MessagePack treats missing keys as default, so a newer Proxy talking to an older Host simply gets IsAlarm=false for every attribute); both backend MapAttribute helpers copy row.IsAlarm into the IPC shape; DriverAttributeInfo in Core.Abstractions gains a new IsAlarm parameter with default value false so the positional record signature change doesn't force every non-Galaxy driver call site to flow a flag they don't produce (the existing generic node-manager and future Modbus/etc. drivers keep compiling without modification); GalaxyProxyDriver.DiscoverAsync passes attr.IsAlarm through to the DriverAttributeInfo positional constructor. This is the discovery-side foundation — the generic node-manager can now enrich alarm-bearing variables with OPC UA AlarmConditionState during address-space build (the existing v1 LmxNodeManager pattern that subscribes to <tag>.InAlarm + .Priority + .DescAttrName + .Acked and merges them into a ConditionState) but this PR deliberately stops at discovery: the full alarm subsystem (subscription management for the 4 alarm-status attributes, state-machine tracking for Active/Unacknowledged/Confirmed/Inactive transitions, OPC UA Part 9 alarm event emission, and the write-to-AckMsg ack path) is a follow-up PR 10+ because it touches the node-manager's address-space build path — orthogonal to the IPC flow this PR covers. Tests — AlarmDiscoveryTests (new, 3 cases): GalaxyAttributeInfo_IsAlarm_round_trips_true_through_MessagePack serializes an IsAlarm=true instance and asserts the decoded flag is true + IsHistorized is true + AttributeName survives unchanged; GalaxyAttributeInfo_IsAlarm_round_trips_false_through_MessagePack covers the default path; Pre_PR9_payload_without_IsAlarm_key_deserializes_with_default_false is the wire-compat regression guard — serializes a stand-in PrePR9Shape class with only keys 0..5 (identical layout to the pre-PR9 GalaxyAttributeInfo) and asserts the newer GalaxyAttributeInfo deserializer produces IsAlarm=false without throwing, so a rolling upgrade where the Proxy ships first can talk to an old Host during the window before the Host upgrades without a MessagePack "missing key" exception. Full solution build: 0 errors, 38 warnings (existing). Galaxy.Host.Tests Unit suite: 27 pass / 0 fail (3 new alarm-discovery + 9 PR5 historian + 15 pre-existing). This PR branches off phase-2-pr5-historian because GalaxyProxyDriver's constructor signature + GalaxyHierarchyRow's IsAlarm init-only property are both ancestor state that the simpler branch bases (phase-2-pr4-findings, master) don't yet include.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 06:28:01 -04:00
Joseph Doherty
30ece6e22c Phase 2 PR 8 — wire gateway-level host-status push from MxAccessGalaxyBackend. PR 4 built the IPC infrastructure for OnHostStatusChanged (MessageKind.RuntimeStatusChange frame + ConnectionSink forwarding through FrameWriter) but no backend actually raised the event; the #pragma warning disable CS0067 around MxAccessGalaxyBackend.OnHostStatusChanged declared the event for interface symmetry while acknowledging the wire-up was Phase 2 follow-up. This PR closes the gateway-level signal: MxAccessClient.ConnectionStateChanged (already raised on false→true Register and true→false Unregister transitions, including the reconnect path in MonitorLoopAsync) now drives OnHostStatusChanged with a synthetic HostConnectivityStatus tagged HostName=MxAccessClient.ClientName, RuntimeStatus="Running" on reconnect + "Stopped" on disconnect, LastObservedUtcUnixMs set to the transition moment. The Admin UI's existing IHostConnectivityProbe subscriber on GalaxyProxyDriver (HostStatusChangedEventArgs) already handles the full translation — OnHostConnectivityUpdate parses "Running"/"Stopped"/"Faulted" into the Core.Abstractions HostState enum and fires OnHostStatusChanged downstream, so this single backend-side event wire-up produces an end-to-end signal with no further Proxy changes required. Per-platform and per-AppEngine ScanState probing (the 472 LOC GalaxyRuntimeProbeManager state machine in v1 that advises <Host>.ScanState on every deployed $WinPlatform + $AppEngine gobject, tracks Unknown → Running → Stopped transitions, handles the on-change-only delivery quirk of ScanState, and surfaces IsHostStopped(gobjectId) for the node manager's Read path to short-circuit on-demand reads against known-stopped runtimes) remains deferred to a follow-up PR — the gateway-level signal gives operators the top-level transport-health rung of the status ladder, which is what matters when the Galaxy COM proxy itself goes down (vs a specific platform going down). MxAccessClient.ClientName property exposes the previously-private _clientName field so the backend can tag its pushes with a stable gateway identity — operators configure this via OTOPCUA_GALAXY_CLIENT_NAME env var (default "OtOpcUa-Galaxy.Host" per Program.cs). MxAccessGalaxyBackend constructor subscribes the new _onConnectionStateChanged field before returning + Dispose unsubscribes it via _mx.ConnectionStateChanged -= _onConnectionStateChanged to prevent the backend's own dispose from leaving a dangling handler on the MxAccessClient (same shape as MxAccessClient.SubscriptionReplayFailed PR 6 dispose discipline). #pragma warning disable CS0067 removed from around OnHostStatusChanged since the event is now raised; the directive is narrowed to cover only OnAlarmEvent which stays unraised pending the alarm subsystem port (PR 9 candidate). Tests — HostStatusPushTests (new, 2 cases): ConnectionStateChanged_raises_OnHostStatusChanged_with_gateway_name fires mx.ConnectAsync → mx.DisconnectAsync and asserts two notifications in order with HostName="GatewayClient" (the clientName passed to MxAccessClient ctor), RuntimeStatus="Running" then "Stopped", LastObservedUtcUnixMs > 0; Dispose_unsubscribes_so_post_dispose_state_changes_do_not_fire_events asserts that after backend.Dispose() a subsequent mx.DisconnectAsync does not bump the count on a registered OnHostStatusChanged handler — guards against the subscription-leak regression where a lingering backend instance would accumulate cross-reconnect notifications for a dead writer. Host.Tests csproj gains a Reference to lib/ArchestrA.MxAccess.dll (identical to the reference PR 6 adds — conflict-free cherry-pick/merge since both PRs stage the same <Reference> node; git will collapse to one when either lands first). Full Galaxy.Host.Tests Unit suite: 26 pass / 0 fail (2 new host-status + 9 PR5 historian + 15 pre-existing PostMortemMmf/RecyclePolicy/StaPump/MemoryWatchdog/EndToEndIpc/Handshake). Galaxy.Host builds clean (0 errors, 0 warnings). Branch base — PR 8 is on phase-2-pr5-historian rather than phase-2-pr4-findings because the constructor path on MxAccessGalaxyBackend gained a new historian parameter in PR 5 and the Dispose implementation needs to coordinate the two unsubscribes; targeting the earlier base would leave a trivial conflict on Dispose.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 06:03:16 -04:00
Joseph Doherty
3717405aa6 Phase 2 PR 7 — wire IHistoryProvider.ReadProcessedAsync end-to-end. PR 5 ported HistorianDataSource.ReadAggregateAsync into Galaxy.Host but left it internal — GalaxyProxyDriver.ReadProcessedAsync still threw NotSupportedException, so OPC UA clients issuing HistoryReadProcessed requests against the v2 topology got rejected at the driver boundary. This PR closes that gap by adding two new Shared.Contracts messages (HistoryReadProcessedRequest/Response, MessageKind 0x62/0x63), routing them through GalaxyFrameHandler, implementing HistoryReadProcessedAsync on all three IGalaxyBackend implementations (Stub/DbBacked return the canonical "pending" Success=false, MxAccessGalaxyBackend delegates to _historian.ReadAggregateAsync), mapping HistorianAggregateSample → GalaxyDataValue at the IPC boundary (null bucket Value → BadNoData 0x800E0000u, otherwise Good 0u), and flipping GalaxyProxyDriver.ReadProcessedAsync from the NotSupported throw to a real IPC call with OPC UA HistoryAggregateType enum mapped to Wonderware AnalogSummary column name on the Proxy side (Average → "Average", Minimum → "Minimum", Maximum → "Maximum", Count → "ValueCount", Total → NotSupported since there's no direct SDK column for sum). Decision #13 IPC data-shape stays intact — HistoryReadProcessedResponse carries GalaxyDataValue[] with the same MessagePack value + OPC UA StatusCode + timestamps shape as the other history responses, so the Proxy's existing ToSnapshot helper handles the conversion without a new code path. MxAccessGalaxyBackend.HistoryReadProcessedAsync guards: null historian → "Historian disabled" (symmetric with HistoryReadAsync); IntervalMs <= 0 → "HistoryReadProcessed requires IntervalMs > 0" (prevents division-by-zero inside the SDK's Resolution parameter); exception during SDK call → Success=false Values=[] with the message so the Proxy surfaces it as InvalidOperationException with a clean error chain. Tests — HistoryReadProcessedTests (new, 4 cases): disabled-error when historian null, rejects zero interval, maps Good sample with Value=12.34 and the Proxy-supplied AggregateColumn + IntervalMs flow unchanged through to the fake IHistorianDataSource, maps null Value bucket to 0x800E0000u BadNoData with null ValueBytes. AggregateColumnMappingTests (new, 5 cases in Proxy.Tests): theory covers all 4 supported HistoryAggregateType enum values → correct column string, and asserts Total throws NotSupportedException with a message that steers callers to Average/Minimum/Maximum/Count (the SDK's AnalogSummaryQueryResult doesn't expose a sum column — the closest is Average × ValueCount which is the responsibility of a caller-side aggregation rather than an extra IPC round-trip). InternalsVisibleTo added to Galaxy.Proxy csproj so Proxy.Tests can reach the internal MapAggregateToColumn static. Builds — Galaxy.Host (net48 x86) + Galaxy.Proxy (net10) both 0 errors, full solution 201 warnings (pre-existing) / 0 errors. Test counts — Host.Tests Unit suite: 28 pass (4 new processed + 9 PR5 historian + 15 pre-existing); Proxy.Tests Unit suite: 14 pass (5 new column-mapping + 9 pre-existing). Deferred to a later PR — ReadAtTime + ReadEvents + Health IPC surfaces (HistorianDataSource has them ported in PR 5 but they need additional contract messages and would push this PR past a comfortable review size); the alarm subsystem wire-up (OnAlarmEvent raising from MxAccessGalaxyBackend) which overlaps the ReadEventsAsync IPC work since both pull from HistorianAccess.CreateEventQuery on the SDK side; the Proxy-side quality-byte refinement where HistorianDataSource's per-sample raw quality byte gets decoded through the existing QualityMapper instead of the category-only mapping in ToWire(HistorianSample) — doesn't change correctness today since Good/Uncertain/Bad categories are all the Admin UI and OPC UA clients surface, but richer OPC DA status codes (BadNotConnected, UncertainSubNormal, etc.) are available on the wire and the Proxy could promote them before handing DataValueSnapshot to ISubscribable consumers. This PR branches off phase-2-pr5-historian because it directly extends the Historian IPC surface added there; if PR 5 merges first PR 7 fast-forwards, otherwise it needs a rebase after PR 5 lands.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 05:53:01 -04:00
Joseph Doherty
1c2bf74d38 Phase 2 PR 6 — close the 2 low findings carried forward from PR 4. Low finding #1 ($Heartbeat probe handle leak in MonitorLoopAsync): the probe calls _proxy.AddItem(connectionHandle, "$Heartbeat") on every monitor tick that observes the connection is past StaleThreshold, but previously discarded the returned item handle — so every probe (one per MonitorInterval, default 5s) leaked one item handle into the MXAccess proxy's internal handle table. Fix: capture the item handle, call RemoveItem(connectionHandle, probeHandle) in the InvokeAsync's finally block so it runs on the same pump turn as the AddItem, best-effort RemoveItem swallow so a dying proxy doesn't throw secondary exceptions out of the probe path. Probe ok becomes probeHandle > 0 so any AddItem that returns 0 (MXAccess's "could not create") counts as a failed probe, matching v1 behavior. Low finding #2 (subscription replay silently swallowed per-tag failures): after a reconnect, the replay loop iterates the pre-reconnect subscription snapshot and calls SubscribeOnPumpAsync for each; previously those failures went into a bare catch { /* skip */ } so an operator had no signal when specific tags failed to re-subscribe — the first indication downstream was a quality drop on OPC UA clients. Fix: new SubscriptionReplayFailedEventArgs (TagReference + Exception) + SubscriptionReplayFailed event on MxAccessClient that fires once per tag that fails to re-subscribe, Log.Warning per failure with the reconnect counter + tag reference, and a summary log line at the end of the replay loop ("{failed} of {total} failed" or "{total} re-subscribed cleanly"). Serilog using + ILogger Log = Serilog.Log.ForContext<MxAccessClient>() added. Tests — MxAccessClientMonitorLoopTests (new file, 2 cases): Heartbeat_probe_calls_RemoveItem_for_every_AddItem constructs a CountingProxy IMxProxy that tracks AddItem/RemoveItem pair counts scoped to the "$Heartbeat" address, runs the client with MonitorInterval=150ms + StaleThreshold=50ms for 700ms, asserts HeartbeatAddCount > 1, HeartbeatAddCount == HeartbeatRemoveCount, OutstandingHeartbeatHandles == 0 after dispose; SubscriptionReplayFailed_fires_for_each_tag_that_fails_to_replay uses a ReplayFailingProxy that throws on the next $Heartbeat probe (to trigger the reconnect path) and throws on the replay-time AddItem for specified tag names ("BadTag.A", "BadTag.B"), subscribes GoodTag.X + BadTag.A + BadTag.B before triggering probe failure, collects SubscriptionReplayFailed args into a ConcurrentBag, asserts exactly 2 events fired and both bad tags are represented — GoodTag.X replays cleanly so it does not fire. Host.Tests csproj gains a Reference to lib/ArchestrA.MxAccess.dll because IMxProxy's MxDataChangeHandler delegate signature mentions MXSTATUS_PROXY and the compiler resolves all delegate parameter types when a test class implements the interface, even if the test code never names the type. No regressions: full Galaxy.Host.Tests Unit suite 26 pass / 0 fail (2 new monitor-loop tests + 9 PR5 historian + 15 pre-existing PostMortemMmf/RecyclePolicy/StaPump/MemoryWatchdog/EndToEndIpc/Handshake). Galaxy.Host builds clean (0 errors, 0 warnings) — the new Serilog.Log.ForContext usage picks up the existing Serilog package ref that PR 4 pulled in for the monitor-loop infrastructure. Both findings were flagged as non-blocking for PR 4 merge and are now resolved alongside whichever merge order the reviewer picks; this PR branches off phase-2-pr4-findings so it can rebase cleanly if PR 4 lands first or be re-based onto master after PR 4 merges.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 02:06:15 -04:00
Joseph Doherty
6df1a79d35 Phase 2 PR 5 — port Wonderware Historian SDK into Driver.Galaxy.Host/Backend/Historian/. The full v1 Historian.Aveva code path (HistorianDataSource + HistorianClusterEndpointPicker + IHistorianConnectionFactory + SdkHistorianConnectionFactory) now lives inside Galaxy.Host instead of the previously-required out-of-tree plugin + HistorianPluginLoader AssemblyResolve hack, and MxAccessGalaxyBackend.HistoryReadAsync — which previously returned a Phase 2 Task B.1.h follow-up placeholder — now delegates to the ported HistorianDataSource.ReadRawAsync, maps HistorianSample to GalaxyDataValue via the IPC wire shape, and reports Success=true with per-tag HistoryTagValues arrays. OPC-UA-free surface inside Galaxy.Host: the v1 code returned Opc.Ua.DataValue on the hot path, which would have required dragging OPCFoundation.NetStandard.Opc.Ua.Server into net48 x86 Galaxy.Host and bleeding OPC types across the IPC boundary — instead, the port introduces HistorianSample (Value, Quality byte, TimestampUtc) + HistorianAggregateSample (Value, TimestampUtc) POCOs that carry the raw MX quality byte through the pipe unchanged, and the OPC translation happens on the Proxy side via the existing QualityMapper that the live-read path already uses. Decision #13's IPC data-shape contract survives intact — GalaxyDataValue (TagReference + ValueBytes MessagePack + ValueMessagePackType + StatusCode + SourceTimestampUtcUnixMs + ServerTimestampUtcUnixMs) — so no Shared.Contracts wire break vs PR 4. Cluster failover preserved verbatim: HistorianClusterEndpointPicker is the thread-safe pure-logic picker ported verbatim with no SDK dependency (injected DateTime clock, per-node cooldown state, unknown-node-name tolerance, case-insensitive de-dup on configuration-order list), ConnectToAnyHealthyNode iterates the picker's healthy candidates, clones config per attempt, calls the factory, marks healthy on success / failed on exception with the failure message stored for dashboard surfacing, throws "All N healthy historian candidate(s) failed" with the last exception chained when every node exhausts. Process path + Event path use separate HistorianAccess connections (CreateHistoryQuery vs CreateEventQuery vs CreateAnalogSummaryQuery on the SDK surface) guarded by independent _connection/_eventConnection locks — a mid-query failure on one silo resets only that connection, the other stays open. Four SDK paths ported: ReadRawAsync (RetrievalMode.Full, BatchSize from config.MaxValuesPerRead, MoveNext pump, per-sample quality + value decode with the StringValue/Value fallback the v1 code did, limit-based early exit), ReadAggregateAsync (AnalogSummaryQuery + Resolution in ms, ExtractAggregateValue maps Average/Minimum/Maximum/ValueCount/First/Last/StdDev column names — the NodeId to column mapping is moved to the Proxy side since the IPC request carries a string column), ReadAtTimeAsync (per-timestamp HistoryQuery with RetrievalMode.Interpolated + BatchSize=1, returns Quality=0 / Value=null for missing samples), ReadEventsAsync (EventQuery + AddEventFilter("Source",Equal,sourceName) when sourceName is non-null, EventOrder.Ascending, EventCount = maxEvents or config.MaxValuesPerRead); GetHealthSnapshot returns the full runtime-health snapshot (TotalQueries/Successes/Failures + ConsecutiveFailures + LastSuccess/FailureTime + LastError + ProcessConnectionOpen/EventConnectionOpen + ActiveProcessNode/ActiveEventNode + per-node state list). ReadRaw is the only path wired through IPC in PR 5 (HistoryReadRequest/HistoryTagValues/HistoryReadResponse already existed in Shared.Contracts); Aggregate/AtTime/Events/Health are ported-but-not-yet-IPC-exposed — they stay internal to Galaxy.Host for PR 6+ to surface via new contract message kinds (aggregate = OPC UA HistoryReadProcessed, at-time = HistoryReadAtTime, events = HistoryReadEvents, health = admin dashboard IPC query). Galaxy.Host csproj gains aahClientManaged + aahClientCommon references with Private=false (managed wrappers) + None items for aahClient.dll + Historian.CBE.dll + Historian.DPAPI.dll + ArchestrA.CloudHistorian.Contract.dll native satellites staged alongside the host exe via CopyToOutputDirectory=PreserveNewest so aahClientManaged can P/Invoke into them at runtime without an AssemblyResolve hook (cleaner than the v1 HistorianPluginLoader.cs 180-LOC AssemblyResolve + Assembly.LoadFrom dance that existed solely because the plugin was loaded late from Host/bin/Debug/net48/Historian/). Program.cs adds BuildHistorianIfEnabled() that reads OTOPCUA_HISTORIAN_ENABLED (true or 1) + OTOPCUA_HISTORIAN_SERVER + OTOPCUA_HISTORIAN_SERVERS (comma-separated cluster list overrides single-server) + OTOPCUA_HISTORIAN_PORT (default 32568) + OTOPCUA_HISTORIAN_INTEGRATED (default true) + OTOPCUA_HISTORIAN_USER/OTOPCUA_HISTORIAN_PASS + OTOPCUA_HISTORIAN_TIMEOUT_SEC (30) + OTOPCUA_HISTORIAN_MAX_VALUES (10000) + OTOPCUA_HISTORIAN_COOLDOWN_SEC (60), returns null when disabled so MxAccessGalaxyBackend.HistoryReadAsync surfaces a clean "Historian disabled" Success=false instead of a localhost-SDK hang; server.RunAsync finally block now also casts backend to IDisposable.Dispose() so the historian SDK connections get cleanly closed on Ctrl+C. MxAccessGalaxyBackend gains an IHistorianDataSource? historian constructor parameter (defaults null to preserve existing Host.Tests call sites that don't exercise HistoryRead), implements IDisposable that forwards to _historian.Dispose(), and the pragma warning disable CS0618 is locally scoped to the ToDto(HistorianEvent) mapper since the SDK marks Id/Source/DisplayText/Severity obsolete but the replacement surface isn't available in the aahClientManaged version we bind against — every other deprecated-SDK use still surfaces as an error under TreatWarningsAsErrors. Ported from v1 Historian.Aveva unchanged: the CloneConfigWithServerName helper that preserves every config field except ServerName per attempt; the double-checked locking in EnsureConnected/EnsureEventConnected (fast path = Volatile.Read outside lock, slow path acquires lock + re-checks + disposes any raced-in-parallel connection); HandleConnectionError/HandleEventConnectionError that close the dead connection, clear the active-node tracker, MarkFailed the picker entry with the exception message so the node enters cooldown, and log the reset with node= for operator correlation; RecordSuccess/RecordFailure that bump counters under _healthLock. Tests: HistorianClusterEndpointPickerTests (7 cases) — single-node ServerName fallback when ServerNames empty, MarkFailed enters cooldown and skips, cooldown expires after window, MarkHealthy immediately clears, all-in-cooldown returns empty healthy list, Snapshot reports failure count + last error + IsHealthy, case-insensitive de-dup on duplicate hostnames. HistorianWiringTests (2 cases) — HistoryReadAsync returns "Historian disabled" Success=false when historian:null passed; HistoryReadAsync with a fake IHistorianDataSource maps the returned HistorianSample (Value=42.5, Quality=192 Good, Timestamp) to a GalaxyDataValue with StatusCode=0u + SourceTimestampUtcUnixMs matching the sample + MessagePack-encoded value bytes. InternalsVisibleTo("...Host.Tests") added to Galaxy.Host.csproj so tests can reach the internal HistorianClusterEndpointPicker. Full Galaxy.Host.Tests suite: 24 pass / 0 fail (9 new historian + 15 pre-existing MemoryWatchdog/PostMortemMmf/RecyclePolicy/StaPump/EndToEndIpc/Handshake). Full solution build: 0 errors (202 pre-existing warnings). The v1 Historian.Aveva project + Historian.Aveva.Tests still build intact because the archive PR (Stream D.1 destructive delete) is still ahead of us — PR 5 intentionally does not delete either; once PR 2+3 merge and the archive-delete PR lands, a follow-up cleanup can remove Historian.Aveva + its 4 source files + 18 test cases. Alarm subsystem wire-up (OnAlarmEvent raising from MxAccessGalaxyBackend via AlarmExtension primitives) + host-status push (OnHostStatusChanged via a ported GalaxyRuntimeProbeManager) remain PR 6 candidates; they were on the same "Task B.1.h follow-up" list and share the IPC connection-sink wiring with the historian events path — it made PR 5 scope-manageable to do Historian first since that's what has the biggest surface area (981 LOC v1 plus SDK binding) and alarms/host-status have more bespoke integration with the existing MxAccess subscription fan-out.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 01:44:04 -04:00
Joseph Doherty
caa9cb86f6 Phase 2 PR 4 — close the 4 open high/medium MXAccess findings from exit-gate-phase-2-final.md. High 1 (ReadAsync subscription-leak on cancel): the one-shot read now wraps subscribe→first-OnDataChange→unsubscribe in try/finally so the per-tag callback is always detached, and if the read installed the underlying MXAccess subscription itself (the prior _addressToHandle key was absent) it tears it down on the way out — no leaked probe item handles when the caller cancels or times out. High 2 (no reconnect loop): MxAccessClient gets a MxAccessClientOptions {AutoReconnect, MonitorInterval=5s, StaleThreshold=60s} + a background MonitorLoopAsync started at first ConnectAsync. The loop wakes every MonitorInterval, checks _lastObservedActivityUtc (bumped by every OnDataChange callback), and if stale probes the proxy with a no-op COM AddItem("$Heartbeat") on the StaPump; if the probe throws or returns false, the loop reconnects-with-replay — Unregister (best-effort), Register, snapshot _addressToHandle.Keys + clear, re-AddItem every previously-active subscription, ConnectionStateChanged events fire for the false→true transition, ReconnectCount bumps. Medium 3 (subscriptions don't push frames back to Proxy): IGalaxyBackend gains OnDataChange/OnAlarmEvent/OnHostStatusChanged events; new IFrameHandler.AttachConnection(FrameWriter) is called per-connection by PipeServer after Hello + the returned IDisposable disposes at connection close; GalaxyFrameHandler.ConnectionSink subscribes the events for the connection lifetime, fire-and-forget pushes them as MessageKind.OnDataChangeNotification / AlarmEvent / RuntimeStatusChange frames through the writer, swallows ObjectDisposedException for the dispose race, and unsubscribes in Dispose to prevent leaked invocation list refs across reconnects. MxAccessGalaxyBackend's existing SubscribeAsync (which previously discarded values via a (_, __) => {} callback) now wires OnTagValueChanged that fans out per-tag value changes to every subscription ID listening (one MXAccess subscription, multi-fan-out — _refToSubs reverse map). UnsubscribeAsync also reverse-walks the map to only call mx.UnsubscribeAsync when the LAST sub for a tag drops. Stub + DbBacked backends declare the events with #pragma warning disable CS0067 because they never raise them but must satisfy the interface (treat-warnings-as-errors would otherwise fail). Medium 4 (WriteValuesAsync doesn't await OnWriteComplete): MxAccessClient.WriteAsync rewritten to return Task<bool> via the v1-style TaskCompletionSource-keyed-by-item-handle pattern in _pendingWrites — adds the TCS before the Write call, awaits it with a configurable timeout (default 5s), removes the TCS in finally, returns true only when OnWriteComplete reported success. MxAccessGalaxyBackend.WriteValuesAsync now reports per-tag Bad_InternalError ("MXAccess runtime reported write failure") when the bool returns false, instead of false-positive Good. PipeServer's IFrameHandler interface adds the AttachConnection(FrameWriter):IDisposable method + a public NoopAttachment nested class (net48 doesn't support default interface methods so the empty-attach is exposed for stub implementations). StubFrameHandler returns IFrameHandler.NoopAttachment.Instance. RunOneConnectionAsync calls AttachConnection after HelloAck and usings the returned disposable so it disposes at the connection scope's finally. ConnectionStateChanged event added on MxAccessClient (caller-facing diagnostics for false→true reconnect transitions). docs/v2/implementation/pr-4-body.md is the Gitea web-UI paste-in for opening PR 4 once pushed; includes 2 new low-priority adversarial findings (probe item-handle leak; replay-loop silently swallows per-subscription failures) flagged as follow-ups not PR 4 blockers. Full solution 460 pass / 7 skip (E2E on admin shell) / 1 pre-existing Phase 0 baseline. No regressions vs PR 2's baseline.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 01:12:09 -04:00
Joseph Doherty
a3d16a28f1 Phase 2 Stream D Option B — archive v1 surface + new Driver.Galaxy.E2E parity suite. Non-destructive intermediate state: the v1 OtOpcUa.Host + Historian.Aveva + Tests + IntegrationTests projects all still build (494 v1 unit + 6 v1 integration tests still pass when run explicitly), but solution-level dotnet test ZB.MOM.WW.OtOpcUa.slnx now skips them via IsTestProject=false on the test projects + archive-status PropertyGroup comments on the src projects. The destructive deletion is reserved for Phase 2 PR 3 with explicit operator review per CLAUDE.md "only use destructive operations when truly the best approach". tests/ZB.MOM.WW.OtOpcUa.Tests/ renamed via git mv to tests/ZB.MOM.WW.OtOpcUa.Tests.v1Archive/; csproj <AssemblyName> kept as the original ZB.MOM.WW.OtOpcUa.Tests so v1 OtOpcUa.Host's [InternalsVisibleTo("ZB.MOM.WW.OtOpcUa.Tests")] still matches and the project rebuilds clean. tests/ZB.MOM.WW.OtOpcUa.IntegrationTests gets <IsTestProject>false</IsTestProject>. src/ZB.MOM.WW.OtOpcUa.Host + src/ZB.MOM.WW.OtOpcUa.Historian.Aveva get PropertyGroup archive-status comments documenting they're functionally superseded but kept in-build because cascading dependencies (Historian.Aveva → Host; IntegrationTests → Host) make a single-PR deletion high blast-radius. New tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.E2E/ project (.NET 10) with ParityFixture that spawns OtOpcUa.Driver.Galaxy.Host.exe (net48 x86) as a Process.Start subprocess with OTOPCUA_GALAXY_BACKEND=db env vars, awaits 2s for the PipeServer to bind, then exposes a connected GalaxyProxyDriver; skips on non-Windows / Administrator shells (PipeAcl denies admins per decision #76) / ZB unreachable / Host EXE not built — each skip carries a SkipReason string the test method reads via Assert.Skip(SkipReason). RecordingAddressSpaceBuilder captures every Folder/Variable/AddProperty registration so parity tests can assert on the same shape v1 LmxNodeManager produced. HierarchyParityTests (3) — Discover returns gobjects with attributes; attribute full references match the tag.attribute Galaxy reference grammar; HistoryExtension flag flows through correctly. StabilityFindingsRegressionTests (4) — one test per 2026-04-13 stability finding from commits c76ab8f and 7310925: phantom probe subscription doesn't corrupt unrelated host status; HostStatusChangedEventArgs structurally carries a specific HostName + OldState + NewState (event signature mathematically prevents the v1 cross-host quality-clear bug); all GalaxyProxyDriver capability methods return Task or Task<T> (sync-over-async would deadlock OPC UA stack thread); AcknowledgeAsync completes before returning (no fire-and-forget background work that could race shutdown). Solution test count: 470 pass / 7 skip (E2E on admin shell) / 1 pre-existing Phase 0 baseline. Run archived suites explicitly: dotnet test tests/ZB.MOM.WW.OtOpcUa.Tests.v1Archive (494 pass) + dotnet test tests/ZB.MOM.WW.OtOpcUa.IntegrationTests (6 pass). docs/v2/V1_ARCHIVE_STATUS.md inventories every archived surface with run-it-explicitly instructions + a 10-step deletion plan for PR 3 + rollback procedure (git revert restores all four projects). docs/v2/implementation/exit-gate-phase-2-final.md supersedes the two partial-exit docs with the per-stream status table (A/B/C/D/E all addressed, D split across PR 2/3 per safety protocol), the test count breakdown, fresh adversarial review of PR 2 deltas (4 new findings: medium IsTestProject=false safety net loss, medium structural-vs-behavioral stability tests, low backend=db default, low Process.Start env inheritance), the 8 carried-forward findings from exit-gate-phase-2.md, the recommended PR order (1 → 2 → 3 → 4). docs/v2/implementation/pr-2-body.md is the Gitea web-UI paste-in for opening PR 2 once pushed.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-18 00:56:21 -04:00
133 changed files with 6146 additions and 75 deletions

View File

@@ -23,7 +23,8 @@
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared.Tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared.Tests.csproj"/> <Project Path="tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared.Tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared.Tests.csproj"/>
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Tests.csproj"/> <Project Path="tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Tests.csproj"/>
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy.Tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy.Tests.csproj"/> <Project Path="tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy.Tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy.Tests.csproj"/>
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Tests/ZB.MOM.WW.OtOpcUa.Tests.csproj"/> <Project Path="tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.E2E/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.E2E.csproj"/>
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Tests.v1Archive/ZB.MOM.WW.OtOpcUa.Tests.v1Archive.csproj"/>
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Historian.Aveva.Tests/ZB.MOM.WW.OtOpcUa.Historian.Aveva.Tests.csproj"/> <Project Path="tests/ZB.MOM.WW.OtOpcUa.Historian.Aveva.Tests/ZB.MOM.WW.OtOpcUa.Historian.Aveva.Tests.csproj"/>
<Project Path="tests/ZB.MOM.WW.OtOpcUa.IntegrationTests/ZB.MOM.WW.OtOpcUa.IntegrationTests.csproj"/> <Project Path="tests/ZB.MOM.WW.OtOpcUa.IntegrationTests/ZB.MOM.WW.OtOpcUa.IntegrationTests.csproj"/>
<Project Path="tests/ZB.MOM.WW.OtOpcUa.Client.Shared.Tests/ZB.MOM.WW.OtOpcUa.Client.Shared.Tests.csproj"/> <Project Path="tests/ZB.MOM.WW.OtOpcUa.Client.Shared.Tests/ZB.MOM.WW.OtOpcUa.Client.Shared.Tests.csproj"/>

View File

@@ -0,0 +1,56 @@
# V1 Archive Status (Phase 2 Stream D, 2026-04-18)
This document inventories every v1 surface that's been **functionally superseded** by v2 but
**physically retained** in the build until the deletion PR (Phase 2 PR 3). Rationale: cascading
references mean a single deletion is high blast-radius; archive-marking lets the v2 stack ship
on its own merits while the v1 surface stays as parity reference.
## Archived projects
| Path | Status | Replaced by | Build behavior |
|---|---|---|---|
| `src/ZB.MOM.WW.OtOpcUa.Host/` | Archive (executable in build) | `OtOpcUa.Server` + `Driver.Galaxy.Host` + `Driver.Galaxy.Proxy` | Builds; not deployed by v2 install scripts |
| `src/ZB.MOM.WW.OtOpcUa.Historian.Aveva/` | Archive (plugin in build) | TODO: port into `Driver.Galaxy.Host/Backend/Historian/` (Task B.1.h follow-up) | Builds; loaded only by archived Host |
| `tests/ZB.MOM.WW.OtOpcUa.Tests.v1Archive/` | Archive | `Driver.Galaxy.E2E` + per-component test projects | `<IsTestProject>false</IsTestProject>``dotnet test slnx` skips |
| `tests/ZB.MOM.WW.OtOpcUa.IntegrationTests/` | Archive | `Driver.Galaxy.E2E` | `<IsTestProject>false</IsTestProject>``dotnet test slnx` skips |
## How to run the archived suites explicitly
```powershell
# v1 unit tests (494):
dotnet test tests/ZB.MOM.WW.OtOpcUa.Tests.v1Archive
# v1 integration tests (6):
dotnet test tests/ZB.MOM.WW.OtOpcUa.IntegrationTests
```
Both still pass on this dev box — they're the parity reference for Phase 2 PR 3's deletion
decision.
## Deletion plan (Phase 2 PR 3)
Pre-conditions:
- [ ] `Driver.Galaxy.E2E` test count covers the v1 IntegrationTests' 6 integration scenarios
at minimum (currently 7 tests; expand as needed)
- [ ] `Driver.Galaxy.Host/Backend/Historian/` ports the Wonderware Historian plugin
so `MxAccessGalaxyBackend.HistoryReadAsync` returns real data (Task B.1.h)
- [ ] Operator review on a separate PR — destructive change
Steps:
1. `git rm -r src/ZB.MOM.WW.OtOpcUa.Host/`
2. `git rm -r src/ZB.MOM.WW.OtOpcUa.Historian.Aveva/`
(or move it under Driver.Galaxy.Host first if the lift is part of the same PR)
3. `git rm -r tests/ZB.MOM.WW.OtOpcUa.Tests.v1Archive/`
4. `git rm -r tests/ZB.MOM.WW.OtOpcUa.IntegrationTests/`
5. Edit `ZB.MOM.WW.OtOpcUa.slnx` — remove the four project lines
6. `dotnet build ZB.MOM.WW.OtOpcUa.slnx` → confirm clean
7. `dotnet test ZB.MOM.WW.OtOpcUa.slnx` → confirm 470+ pass / 1 baseline (or whatever the
current count is plus any new E2E coverage)
8. Commit: "Phase 2 Stream D — delete v1 archive (Host + Historian.Aveva + v1Tests + IntegrationTests)"
9. PR 3 against `v2`, link this doc + exit-gate-phase-2-final.md
10. One reviewer signoff
## Rollback
If Phase 2 PR 3 surfaces downstream consumer regressions, `git revert` the deletion commit
restores the four projects intact. The v2 stack continues to ship from the v2 branch.

View File

@@ -0,0 +1,123 @@
# Phase 2 Final Exit Gate (2026-04-18)
> Supersedes `phase-2-partial-exit-evidence.md` and `exit-gate-phase-2.md`. Captures the
> as-built state at the close of Phase 2 work delivered across two PRs.
## Status: **All five Phase 2 streams addressed. Stream D split across PR 2 (archive) + PR 3 (delete) per safety protocol.**
## Stream-by-stream status
| Stream | Plan §reference | Status | PR |
|---|---|---|---|
| A — Driver.Galaxy.Shared | §A.1A.3 | ✅ Complete | PR 1 (merged or pending) |
| B — Driver.Galaxy.Host | §B.1B.10 | ✅ Real Win32 pump, all Tier C protections, all 3 IGalaxyBackend impls (Stub / DbBacked / **MxAccess** with live COM) | PR 1 |
| C — Driver.Galaxy.Proxy | §C.1C.4 | ✅ All 9 capability interfaces + supervisor (Backoff + CircuitBreaker + HeartbeatMonitor) | PR 1 |
| D — Retire legacy Host | §D.1D.3 | ✅ Migration script, installer scripts, Stream D procedure doc, **archive markings on all v1 surface (this PR 2)**, deletion deferred to PR 3 | PR 2 (this) + PR 3 (next) |
| E — Parity validation | §E.1E.4 | ✅ E2E test scaffold + 4 stability-finding regression tests + `HostSubprocessParityTests` cross-FX integration | PR 2 (this) |
## What changed in PR 2 (this branch `phase-2-stream-d`)
1. **`tests/ZB.MOM.WW.OtOpcUa.Tests/`** renamed to `tests/ZB.MOM.WW.OtOpcUa.Tests.v1Archive/`,
`<AssemblyName>` kept as `ZB.MOM.WW.OtOpcUa.Tests` so the v1 Host's `InternalsVisibleTo`
still matches, `<IsTestProject>false</IsTestProject>` so `dotnet test slnx` excludes it.
2. **Three other v1 projects archive-marked** with PropertyGroup comments:
`OtOpcUa.Host`, `Historian.Aveva`, `IntegrationTests`. `IntegrationTests` also gets
`<IsTestProject>false</IsTestProject>`.
3. **New `tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.E2E/`** project (.NET 10):
- `ParityFixture` spawns `OtOpcUa.Driver.Galaxy.Host.exe` (net48 x86) as subprocess via
`Process.Start`, connects via real named pipe, exposes a connected `GalaxyProxyDriver`.
Skips when Galaxy ZB unreachable, when Host EXE not built, or when running as
Administrator (PipeAcl denies admins).
- `RecordingAddressSpaceBuilder` captures Folder + Variable + Property registrations so
parity tests can assert shape.
- `HierarchyParityTests` (3) — Discover returns gobjects with attributes;
attribute full references match `tag.attribute` shape; HistoryExtension flag flows
through.
- `StabilityFindingsRegressionTests` (4) — one test per 2026-04-13 finding:
phantom-probe-doesn't-corrupt-status, host-status-event-is-scoped, all-async-no-sync-
over-async, AcknowledgeAsync-completes-before-returning.
4. **`docs/v2/V1_ARCHIVE_STATUS.md`** — inventory + deletion plan for PR 3.
5. **`docs/v2/implementation/exit-gate-phase-2-final.md`** (this doc) — supersedes the two
partial-exit docs.
## Test counts
**Solution-level `dotnet test ZB.MOM.WW.OtOpcUa.slnx`**: **470 pass / 7 skip / 1 baseline failure**.
| Project | Pass | Skip |
|---|---:|---:|
| Core.Abstractions.Tests | 24 | 0 |
| Configuration.Tests | 42 | 0 |
| Core.Tests | 4 | 0 |
| Server.Tests | 2 | 0 |
| Admin.Tests | 21 | 0 |
| Driver.Galaxy.Shared.Tests | 6 | 0 |
| Driver.Galaxy.Host.Tests | 30 | 0 |
| Driver.Galaxy.Proxy.Tests | 10 | 0 |
| **Driver.Galaxy.E2E (NEW)** | **0** | **7** (all skip with documented reason — admin shell) |
| Client.Shared.Tests | 131 | 0 |
| Client.UI.Tests | 98 | 0 |
| Client.CLI.Tests | 51 / 1 fail | 0 |
| Historian.Aveva.Tests | 41 | 0 |
**Excluded from solution run (run explicitly when needed)**:
- `OtOpcUa.Tests.v1Archive` — 494 pass (v1 unit tests, kept as parity reference)
- `OtOpcUa.IntegrationTests` — 6 pass (v1 integration tests, kept as parity reference)
## Adversarial review of the PR 2 diff
Independent pass over the PR 2 deltas. New findings ranked by severity; existing findings
from the previous exit-gate doc still apply.
### New findings
**Medium 1 — `IsTestProject=false` on `OtOpcUa.IntegrationTests` removes the safety net.**
The 6 v1 integration tests no longer run on solution test. *Mitigation:* the new E2E suite
covers the same scenarios in the v2 topology shape. *Risk:* if E2E test count regresses or
fails to cover a scenario, the v1 fallback isn't auto-checked. **Procedure**: PR 3
checklist includes "E2E test count covers v1 IntegrationTests' 6 scenarios at minimum".
**Medium 2 — Stability-finding regression tests #2, #3, #4 are structural (reflection-based)
not behavioral.** Findings #2 and #3 use type-shape assertions (event signature carries
HostName; methods return Task) rather than triggering the actual race. *Mitigation:* the v1
defects were structural — fixing them required interface changes that the type-shape
assertions catch. *Risk:* a future refactor that re-introduces sync-over-async via a non-
async helper called inside a Task method wouldn't trip the test. **Filed as v2.1**: add a
runtime async-call-stack analyzer (Roslyn or post-build).
**Low 1 — `ParityFixture` defaults to `OTOPCUA_GALAXY_BACKEND=db`** (not `mxaccess`).
Discover works against ZB without needing live MXAccess. The MXAccess-required tests will
need a second fixture once they're written.
**Low 2 — `Process.Start(EnvironmentVariables)` doesn't always inherit clean state.** The
test inherits the parent's PATH + locale, which is normally fine but could mask a missing
runtime dependency. *Mitigation:* in CI, pin a clean environment block.
### Existing findings (carried forward from `exit-gate-phase-2.md`)
All 8 still apply unchanged. Particularly:
- High 1 (MxAccess Read subscription-leak on cancellation) — open
- High 2 (no MXAccess reconnect loop, only supervisor-driven recycle) — open
- Medium 3 (SubscribeAsync doesn't push OnDataChange frames yet) — open
- Medium 4 (WriteValuesAsync doesn't await OnWriteComplete) — open
## Cross-cutting deferrals (out of Phase 2)
- **Deletion of v1 archive** — PR 3, gated on operator review + E2E coverage parity check
- **Wonderware Historian SDK plugin port** (`Historian.Aveva``Driver.Galaxy.Host/Backend/Historian/`) — Task B.1.h, opportunistically with PR 3 or as PR 4
- **MxAccess subscription push frames** — Task B.1.s, follow-up to enable real-time data
flow (currently subscribes register but values aren't pushed back)
- **Wonderware Historian-backed HistoryRead** — depends on B.1.h
- **Alarm subsystem wire-up** — `MxAccessGalaxyBackend.SubscribeAlarmsAsync` is a no-op
- **Reconnect-without-recycle** in MxAccessClient — v2.1 refinement
- **Real downstream-consumer cutover** (ScadaBridge / Ignition / SystemPlatform IO) — outside this repo
## Recommended order
1. **PR 1** (`phase-1-configuration``v2`) — merge first; self-contained, parity preserved
2. **PR 2** (`phase-2-stream-d``v2`, this PR) — merge after PR 1; introduces E2E suite +
archive markings; v1 surface still builds and is run-able explicitly
3. **PR 3** (next session) — delete v1 archive; depends on operator approval after PR 2
reviewer signoff
4. **PR 4** (Phase 2 follow-up) — Historian port + MxAccess subscription push frames + the
open high/medium findings

View File

@@ -0,0 +1,69 @@
# PR 2 — Phase 2 Stream D Option B (archive v1 + E2E suite) → v2
**Source**: `phase-2-stream-d` (branched from `phase-1-configuration`)
**Target**: `v2`
**URL** (after push): https://gitea.dohertylan.com/dohertj2/lmxopcua/pulls/new/phase-2-stream-d
## Summary
Phase 2 Stream D Option B per `docs/v2/implementation/stream-d-removal-procedure.md`:
- **Archived the v1 surface** without deleting:
- `tests/ZB.MOM.WW.OtOpcUa.Tests/``tests/ZB.MOM.WW.OtOpcUa.Tests.v1Archive/`
(`<AssemblyName>` kept as `ZB.MOM.WW.OtOpcUa.Tests` so v1 Host's `InternalsVisibleTo`
still matches; `<IsTestProject>false</IsTestProject>` so solution test runs skip it).
- `tests/ZB.MOM.WW.OtOpcUa.IntegrationTests/``<IsTestProject>false</IsTestProject>`
+ archive comment.
- `src/ZB.MOM.WW.OtOpcUa.Host/` + `src/ZB.MOM.WW.OtOpcUa.Historian.Aveva/` — archive
PropertyGroup comments. Both still build (Historian plugin + 41 historian tests still
pass) so Phase 2 PR 3 can delete them in a focused, reviewable destructive change.
- **New `tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.E2E/`** test project (.NET 10):
- `ParityFixture` spawns `OtOpcUa.Driver.Galaxy.Host.exe` (net48 x86) as a subprocess via
`Process.Start`, connects via real named pipe, exposes a connected `GalaxyProxyDriver`.
Skips when Galaxy ZB unreachable / Host EXE not built / Administrator shell.
- `HierarchyParityTests` (3) and `StabilityFindingsRegressionTests` (4) — one test per
2026-04-13 stability finding (phantom probe, cross-host quality clear, sync-over-async,
fire-and-forget alarm shutdown race).
- **`docs/v2/V1_ARCHIVE_STATUS.md`** — inventory + deletion plan for PR 3.
- **`docs/v2/implementation/exit-gate-phase-2-final.md`** — supersedes the two partial-exit
docs with the as-built state, adversarial review of PR 2 deltas (4 new findings), and the
recommended PR sequence (1 → 2 → 3 → 4).
## What's NOT in this PR
- Deletion of the v1 archive — saved for PR 3 with explicit operator review (destructive change).
- Wonderware Historian SDK plugin port — Task B.1.h, follow-up to enable real `HistoryRead`.
- MxAccess subscription push-frames — Task B.1.s, follow-up to enable real-time
data-change push from Host → Proxy.
## Tests
**`dotnet test ZB.MOM.WW.OtOpcUa.slnx`**: **470 pass / 7 skip / 1 pre-existing baseline**.
The 7 skips are the new E2E tests, all skipping with the documented reason
"PipeAcl denies Administrators on dev shells" — the production install runs as a non-admin
service account and these tests will execute there.
Run the archived v1 suites explicitly:
```powershell
dotnet test tests/ZB.MOM.WW.OtOpcUa.Tests.v1Archive # → 494 pass
dotnet test tests/ZB.MOM.WW.OtOpcUa.IntegrationTests # → 6 pass
```
## Test plan for reviewers
- [ ] `dotnet build ZB.MOM.WW.OtOpcUa.slnx` succeeds with no warnings beyond the known
NuGetAuditSuppress + NU1702 cross-FX
- [ ] `dotnet test ZB.MOM.WW.OtOpcUa.slnx` shows the 470/7-skip/1-baseline result
- [ ] Both archived suites pass when run explicitly
- [ ] Build the Galaxy.Host EXE (`dotnet build src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host`),
then run E2E tests on a non-admin shell — they should actually execute and pass
against live Galaxy ZB
- [ ] Spot-read `docs/v2/V1_ARCHIVE_STATUS.md` and confirm the deletion plan is acceptable
## Follow-up tracking
- **PR 3** (next session, when ready): execute the deletion plan in `V1_ARCHIVE_STATUS.md`.
4 projects removed, .slnx updated, full solution test confirms parity.
- **PR 4** (Phase 2 follow-up): port Historian plugin + wire MxAccess subscription pushes +
close the high/medium open findings from `exit-gate-phase-2-final.md`.

View File

@@ -0,0 +1,91 @@
# PR 4 — Phase 2 follow-up: close the 4 open MXAccess findings
**Source**: `phase-2-pr4-findings` (branched from `phase-2-stream-d`)
**Target**: `v2`
## Summary
Closes the 4 high/medium open findings carried forward in `exit-gate-phase-2-final.md`:
- **High 1 — `ReadAsync` subscription-leak on cancel.** One-shot read now wraps the
subscribe→first-OnDataChange→unsubscribe pattern in a `try/finally` so the per-tag
callback is always detached, and if the read installed the underlying MXAccess
subscription itself (no other caller had it), it tears it down on the way out.
- **High 2 — No reconnect loop on the MXAccess COM connection.** New
`MxAccessClientOptions { AutoReconnect, MonitorInterval, StaleThreshold }` + a background
`MonitorLoopAsync` that watches a stale-activity threshold + probes the proxy via a
no-op COM call, then reconnects-with-replay (re-Register, re-AddItem every active
subscription) when the proxy is dead. Liveness signal: every `OnDataChange` callback bumps
`_lastObservedActivityUtc`. Defaults match v1 monitor cadence (5s poll, 60s stale).
`ReconnectCount` exposed for diagnostics; `ConnectionStateChanged` event for downstream
consumers (the supervisor on the Proxy side already surfaces this through its
HeartbeatMonitor, but the Host-side event lets local logging/metrics hook in).
- **Medium 3 — `MxAccessGalaxyBackend.SubscribeAsync` doesn't push OnDataChange frames back to
the Proxy.** New `IGalaxyBackend.OnDataChange` / `OnAlarmEvent` / `OnHostStatusChanged`
events that the new `GalaxyFrameHandler.AttachConnection` subscribes per-connection and
forwards as outbound `OnDataChangeNotification` / `AlarmEvent` /
`RuntimeStatusChange` frames through the connection's `FrameWriter`. `MxAccessGalaxyBackend`
fans out per-tag value changes to every `SubscriptionId` that's listening to that tag
(multiple Proxy subs may share a Galaxy attribute — single COM subscription, multi-fan-out
on the wire). Stub + DbBacked backends declare the events with `#pragma warning disable
CS0067` (treat-warnings-as-errors would otherwise fail on never-raised events that exist
only to satisfy the interface).
- **Medium 4 — `WriteValuesAsync` doesn't await `OnWriteComplete`.** New
`WriteAsync(...)` overload returns `bool` after awaiting the OnWriteComplete callback via
the v1-style `TaskCompletionSource`-keyed-by-item-handle pattern in `_pendingWrites`.
`MxAccessGalaxyBackend.WriteValuesAsync` now reports per-tag `Bad_InternalError` when the
runtime rejected the write, instead of false-positive `Good`.
## Pipe server change
`IFrameHandler` gains `AttachConnection(FrameWriter writer): IDisposable` so the handler can
register backend event sinks on each accepted connection and detach them at disconnect. The
`PipeServer.RunOneConnectionAsync` calls it after the Hello handshake and disposes it in the
finally of the per-connection scope. `StubFrameHandler` returns `IFrameHandler.NoopAttachment.Instance`
(net48 doesn't support default interface methods, so the empty-attach lives as a public nested
class).
## Tests
**`dotnet test ZB.MOM.WW.OtOpcUa.slnx`**: **460 pass / 7 skip (E2E on admin shell) / 1
pre-existing baseline failure**. No regressions. The Driver.Galaxy.Host unit tests + 5 live
ZB smoke + 3 live MXAccess COM smoke all pass unchanged.
## Test plan for reviewers
- [ ] `dotnet build` clean
- [ ] `dotnet test` shows 460/7-skip/1-baseline
- [ ] Spot-check `MxAccessClient.MonitorLoopAsync` against v1's `MxAccessClient.Monitor`
partial (`src/ZB.MOM.WW.OtOpcUa.Host/MxAccess/MxAccessClient.Monitor.cs`) — same
polling cadence, same probe-then-reconnect-with-replay shape
- [ ] Read `GalaxyFrameHandler.ConnectionSink.Dispose` and confirm event handlers are
detached on connection close (no leaked invocation list refs)
- [ ] `WriteValuesAsync` returning `Bad_InternalError` on a runtime-rejected write is the
correct shape — confirm against the v1 `MxAccessClient.ReadWrite.cs` pattern
## What's NOT in this PR
- Wonderware Historian SDK plugin port (Task B.1.h) — separate PR, larger scope.
- Alarm subsystem wire-up (`MxAccessGalaxyBackend.SubscribeAlarmsAsync` is still a no-op).
`OnAlarmEvent` is declared on the backend interface and pushed by the frame handler when
raised; `MxAccessGalaxyBackend` just doesn't raise it yet (waits for the alarm-tracking
port from v1's `AlarmObjectFilter` + Galaxy alarm primitives).
- Host-status push (`OnHostStatusChanged`) — declared on the interface and pushed by the
frame handler; `MxAccessGalaxyBackend` doesn't raise it (the Galaxy.Host's
`HostConnectivityProbe` from v1 needs porting too, scoped under the Historian PR).
## Adversarial review
Quick pass over the PR 4 deltas. No new findings beyond:
- **Low 1** — `MonitorLoopAsync`'s `$Heartbeat` probe item-handle is leaked
(`AddItem` succeeds, never `RemoveItem`'d). Cosmetic — the probe item is internal to
the COM connection, dies with `Unregister` at disconnect/recycle. Worth a follow-up
to call `RemoveItem` after the probe succeeds.
- **Low 2** — Replay loop in `MonitorLoopAsync` swallows per-subscription failures. If
Galaxy permanently rejects a previously-valid reference (rare but possible after a
re-deploy), the user gets silent data loss for that one subscription. The stub-handler-
unaware operator wouldn't notice. Worth surfacing as a `ConnectionStateChanged(false)
→ ConnectionStateChanged(true)` payload that includes the replay-failures list.
Both are low-priority follow-ups, not PR 4 blockers.

View File

@@ -19,10 +19,17 @@ namespace ZB.MOM.WW.OtOpcUa.Core.Abstractions;
/// <param name="ArrayDim">Declared array length when <see cref="IsArray"/> is true; null otherwise.</param> /// <param name="ArrayDim">Declared array length when <see cref="IsArray"/> is true; null otherwise.</param>
/// <param name="SecurityClass">Write-authorization tier for this attribute.</param> /// <param name="SecurityClass">Write-authorization tier for this attribute.</param>
/// <param name="IsHistorized">True when this attribute is expected to feed historian / HistoryRead.</param> /// <param name="IsHistorized">True when this attribute is expected to feed historian / HistoryRead.</param>
/// <param name="IsAlarm">
/// True when this attribute represents an alarm condition (Galaxy: has an
/// <c>AlarmExtension</c> primitive). The generic node-manager enriches the variable with an
/// OPC UA <c>AlarmConditionState</c> when true. Defaults to false so existing non-Galaxy
/// drivers aren't forced to flow a flag they don't produce.
/// </param>
public sealed record DriverAttributeInfo( public sealed record DriverAttributeInfo(
string FullName, string FullName,
DriverDataType DriverDataType, DriverDataType DriverDataType,
bool IsArray, bool IsArray,
uint? ArrayDim, uint? ArrayDim,
SecurityClassification SecurityClass, SecurityClassification SecurityClass,
bool IsHistorized); bool IsHistorized,
bool IsAlarm = false);

View File

@@ -42,4 +42,39 @@ public interface IVariableHandle
{ {
/// <summary>Driver-side full reference for read/write addressing.</summary> /// <summary>Driver-side full reference for read/write addressing.</summary>
string FullReference { get; } string FullReference { get; }
/// <summary>
/// Annotate this variable with an OPC UA <c>AlarmConditionState</c>. Drivers with
/// <see cref="DriverAttributeInfo.IsAlarm"/> = true call this during discovery so the
/// concrete address-space builder can materialize a sibling condition node. The returned
/// sink receives lifecycle transitions raised through <see cref="IAlarmSource.OnAlarmEvent"/>
/// — the generic node manager wires the subscription; the concrete builder decides how
/// to surface the state (e.g. OPC UA <c>AlarmConditionState.Activate</c>,
/// <c>Acknowledge</c>, <c>Deactivate</c>).
/// </summary>
IAlarmConditionSink MarkAsAlarmCondition(AlarmConditionInfo info);
}
/// <summary>
/// Metadata used to materialize an OPC UA <c>AlarmConditionState</c> sibling for a variable.
/// Populated by the driver's discovery step; concrete builders decide how to surface it.
/// </summary>
/// <param name="SourceName">Human-readable alarm name used for the <c>SourceName</c> event field.</param>
/// <param name="InitialSeverity">Severity at address-space build time; updates arrive via <see cref="IAlarmConditionSink"/>.</param>
/// <param name="InitialDescription">Initial description; updates arrive via <see cref="IAlarmConditionSink"/>.</param>
public sealed record AlarmConditionInfo(
string SourceName,
AlarmSeverity InitialSeverity,
string? InitialDescription);
/// <summary>
/// Sink a concrete address-space builder returns from <see cref="IVariableHandle.MarkAsAlarmCondition"/>.
/// The generic node manager routes per-alarm <see cref="IAlarmSource.OnAlarmEvent"/> payloads here —
/// the sink translates the transition into an OPC UA condition state change or whatever the
/// concrete builder's backing address space supports.
/// </summary>
public interface IAlarmConditionSink
{
/// <summary>Push an alarm transition (Active / Acknowledged / Inactive) for this condition.</summary>
void OnTransition(AlarmEventArgs args);
} }

View File

@@ -24,6 +24,17 @@ public sealed class DriverHost : IAsyncDisposable
return _drivers.TryGetValue(driverInstanceId, out var d) ? d.GetHealth() : null; return _drivers.TryGetValue(driverInstanceId, out var d) ? d.GetHealth() : null;
} }
/// <summary>
/// Look up a registered driver by instance id. Used by the OPC UA server runtime
/// (<c>OtOpcUaServer</c>) to instantiate one <c>DriverNodeManager</c> per driver at
/// startup. Returns null when the driver is not registered.
/// </summary>
public IDriver? GetDriver(string driverInstanceId)
{
lock (_lock)
return _drivers.TryGetValue(driverInstanceId, out var d) ? d : null;
}
/// <summary> /// <summary>
/// Registers the driver and calls <see cref="IDriver.InitializeAsync"/>. If initialization /// Registers the driver and calls <see cref="IDriver.InitializeAsync"/>. If initialization
/// throws, the driver is kept in the registry so the operator can retry; quality on its /// throws, the driver is kept in the registry so the operator can retry; quality on its

View File

@@ -1,27 +1,41 @@
using System.Collections.Concurrent;
using ZB.MOM.WW.OtOpcUa.Core.Abstractions; using ZB.MOM.WW.OtOpcUa.Core.Abstractions;
namespace ZB.MOM.WW.OtOpcUa.Core.OpcUa; namespace ZB.MOM.WW.OtOpcUa.Core.OpcUa;
/// <summary> /// <summary>
/// Generic, driver-agnostic backbone for populating the OPC UA address space from an /// Generic, driver-agnostic backbone for populating the OPC UA address space from an
/// <see cref="IDriver"/>. The Galaxy-specific subclass (<c>GalaxyNodeManager</c>) is deferred /// <see cref="IDriver"/>. Walks the driver's discovery, wires the alarm + data-change +
/// to Phase 2 per decision #62 — this class is the foundation that Phase 2 ports the v1 /// rediscovery subscription events, and hands each variable to the supplied
/// <c>LmxNodeManager</c> logic into. /// <see cref="IAddressSpaceBuilder"/>. Concrete OPC UA server implementations provide the
/// builder — see the Server project's <c>OpcUaAddressSpaceBuilder</c> for the materialization
/// against <c>CustomNodeManager2</c>.
/// </summary> /// </summary>
/// <remarks> /// <remarks>
/// Phase 1 status: scaffold only. The v1 <c>LmxNodeManager</c> in the legacy Host is unchanged /// Per <c>docs/v2/plan.md</c> decision #52 + #62 — Core owns the node tree, drivers stream
/// so IntegrationTests continue to pass. Phase 2 will lift-and-shift its logic here, swapping /// <c>Folder</c>/<c>Variable</c> calls, alarm-bearing variables are annotated via
/// <c>IMxAccessClient</c> for <see cref="IDriver"/> and <c>GalaxyAttributeInfo</c> for /// <see cref="IVariableHandle.MarkAsAlarmCondition"/> and subsequent
/// <see cref="DriverAttributeInfo"/>. /// <see cref="IAlarmSource.OnAlarmEvent"/> payloads route to the sink the builder returned.
/// </remarks> /// </remarks>
public abstract class GenericDriverNodeManager(IDriver driver) public class GenericDriverNodeManager(IDriver driver) : IDisposable
{ {
protected IDriver Driver { get; } = driver ?? throw new ArgumentNullException(nameof(driver)); protected IDriver Driver { get; } = driver ?? throw new ArgumentNullException(nameof(driver));
public string DriverInstanceId => Driver.DriverInstanceId; public string DriverInstanceId => Driver.DriverInstanceId;
// Source tag (DriverAttributeInfo.FullName) → alarm-condition sink. Populated during
// BuildAddressSpaceAsync by a recording IAddressSpaceBuilder implementation that captures the
// IVariableHandle per attr.IsAlarm=true variable and calls MarkAsAlarmCondition.
private readonly ConcurrentDictionary<string, IAlarmConditionSink> _alarmSinks =
new(StringComparer.OrdinalIgnoreCase);
private EventHandler<AlarmEventArgs>? _alarmForwarder;
private bool _disposed;
/// <summary> /// <summary>
/// Populates the address space by streaming nodes from the driver into the supplied builder. /// Populates the address space by streaming nodes from the driver into the supplied builder,
/// wraps the builder so alarm-condition sinks are captured, subscribes to the driver's
/// alarm event stream, and routes each transition to the matching sink by <c>SourceNodeId</c>.
/// Driver exceptions are isolated per decision #12 — the driver's subtree is marked Faulted, /// Driver exceptions are isolated per decision #12 — the driver's subtree is marked Faulted,
/// but other drivers remain available. /// but other drivers remain available.
/// </summary> /// </summary>
@@ -32,6 +46,73 @@ public abstract class GenericDriverNodeManager(IDriver driver)
if (Driver is not ITagDiscovery discovery) if (Driver is not ITagDiscovery discovery)
throw new NotSupportedException($"Driver '{Driver.DriverInstanceId}' does not implement ITagDiscovery."); throw new NotSupportedException($"Driver '{Driver.DriverInstanceId}' does not implement ITagDiscovery.");
await discovery.DiscoverAsync(builder, ct); var capturing = new CapturingBuilder(builder, _alarmSinks);
await discovery.DiscoverAsync(capturing, ct);
if (Driver is IAlarmSource alarmSource)
{
_alarmForwarder = (_, e) =>
{
// Route the alarm to the sink registered for the originating variable, if any.
// Unknown source ids are dropped silently — they may belong to another driver or
// to a variable the builder chose not to flag as an alarm condition.
if (_alarmSinks.TryGetValue(e.SourceNodeId, out var sink))
sink.OnTransition(e);
};
alarmSource.OnAlarmEvent += _alarmForwarder;
}
}
public void Dispose()
{
if (_disposed) return;
_disposed = true;
if (_alarmForwarder is not null && Driver is IAlarmSource alarmSource)
{
alarmSource.OnAlarmEvent -= _alarmForwarder;
}
_alarmSinks.Clear();
}
/// <summary>
/// Snapshot the current alarm-sink registry by source node id. Diagnostic + test hook;
/// not part of the hot path.
/// </summary>
internal IReadOnlyCollection<string> TrackedAlarmSources => _alarmSinks.Keys.ToList();
/// <summary>
/// Wraps the caller-supplied <see cref="IAddressSpaceBuilder"/> so every
/// <see cref="IVariableHandle.MarkAsAlarmCondition"/> call registers the returned sink in
/// the node manager's source-node-id map. The builder itself drives materialization;
/// this wrapper only observes.
/// </summary>
private sealed class CapturingBuilder(
IAddressSpaceBuilder inner,
ConcurrentDictionary<string, IAlarmConditionSink> sinks) : IAddressSpaceBuilder
{
public IAddressSpaceBuilder Folder(string browseName, string displayName)
=> new CapturingBuilder(inner.Folder(browseName, displayName), sinks);
public IVariableHandle Variable(string browseName, string displayName, DriverAttributeInfo attributeInfo)
=> new CapturingHandle(inner.Variable(browseName, displayName, attributeInfo), sinks);
public void AddProperty(string browseName, DriverDataType dataType, object? value)
=> inner.AddProperty(browseName, dataType, value);
}
private sealed class CapturingHandle(
IVariableHandle inner,
ConcurrentDictionary<string, IAlarmConditionSink> sinks) : IVariableHandle
{
public string FullReference => inner.FullReference;
public IAlarmConditionSink MarkAsAlarmCondition(AlarmConditionInfo info)
{
var sink = inner.MarkAsAlarmCondition(info);
// Register by the driver-side full reference so the alarm forwarder can look it up
// using AlarmEventArgs.SourceNodeId (which the driver populates with the same tag).
sinks[inner.FullReference] = sink;
return sink;
}
} }
} }

View File

@@ -16,6 +16,10 @@
<ProjectReference Include="..\ZB.MOM.WW.OtOpcUa.Configuration\ZB.MOM.WW.OtOpcUa.Configuration.csproj"/> <ProjectReference Include="..\ZB.MOM.WW.OtOpcUa.Configuration\ZB.MOM.WW.OtOpcUa.Configuration.csproj"/>
</ItemGroup> </ItemGroup>
<ItemGroup>
<InternalsVisibleTo Include="ZB.MOM.WW.OtOpcUa.Core.Tests"/>
</ItemGroup>
<ItemGroup> <ItemGroup>
<NuGetAuditSuppress Include="https://github.com/advisories/GHSA-37gx-xxp4-5rgx"/> <NuGetAuditSuppress Include="https://github.com/advisories/GHSA-37gx-xxp4-5rgx"/>
<NuGetAuditSuppress Include="https://github.com/advisories/GHSA-w3x6-4m5h-cxqf"/> <NuGetAuditSuppress Include="https://github.com/advisories/GHSA-w3x6-4m5h-cxqf"/>

View File

@@ -0,0 +1,260 @@
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.MxAccess;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.Alarms;
/// <summary>
/// Subscribes to the four Galaxy alarm attributes (<c>.InAlarm</c>, <c>.Priority</c>,
/// <c>.DescAttrName</c>, <c>.Acked</c>) per alarm-bearing attribute discovered during
/// <c>DiscoverAsync</c>. Maintains one <see cref="AlarmState"/> per alarm, raises
/// <see cref="AlarmTransition"/> on lifecycle transitions (Active / Unacknowledged /
/// Acknowledged / Inactive). Ack path writes <c>.AckMsg</c>. Pure-logic state machine
/// with delegate-based subscribe/write so it's testable against in-memory fakes.
/// </summary>
/// <remarks>
/// Transitions emitted (OPC UA Part 9 alarm lifecycle, simplified for the Galaxy model):
/// <list type="bullet">
/// <item><c>Active</c> — InAlarm false → true. Default to Unacknowledged.</item>
/// <item><c>Acknowledged</c> — Acked false → true while InAlarm is still true.</item>
/// <item><c>Inactive</c> — InAlarm true → false. If still unacknowledged the alarm
/// is marked latched-inactive-unack; next Ack transitions straight to Inactive.</item>
/// </list>
/// </remarks>
public sealed class GalaxyAlarmTracker : IDisposable
{
public const string InAlarmAttr = ".InAlarm";
public const string PriorityAttr = ".Priority";
public const string DescAttrNameAttr = ".DescAttrName";
public const string AckedAttr = ".Acked";
public const string AckMsgAttr = ".AckMsg";
private readonly Func<string, Action<string, Vtq>, Task> _subscribe;
private readonly Func<string, Task> _unsubscribe;
private readonly Func<string, object, Task<bool>> _write;
private readonly Func<DateTime> _clock;
// Alarm tag (attribute full ref, e.g. "Tank.Level.HiHi") → state.
private readonly ConcurrentDictionary<string, AlarmState> _alarms =
new(StringComparer.OrdinalIgnoreCase);
// Reverse lookup: probed tag (".InAlarm" etc.) → owning alarm tag.
private readonly ConcurrentDictionary<string, (string AlarmTag, AlarmField Field)> _probeToAlarm =
new(StringComparer.OrdinalIgnoreCase);
private bool _disposed;
public event EventHandler<AlarmTransition>? TransitionRaised;
public GalaxyAlarmTracker(
Func<string, Action<string, Vtq>, Task> subscribe,
Func<string, Task> unsubscribe,
Func<string, object, Task<bool>> write)
: this(subscribe, unsubscribe, write, () => DateTime.UtcNow) { }
internal GalaxyAlarmTracker(
Func<string, Action<string, Vtq>, Task> subscribe,
Func<string, Task> unsubscribe,
Func<string, object, Task<bool>> write,
Func<DateTime> clock)
{
_subscribe = subscribe ?? throw new ArgumentNullException(nameof(subscribe));
_unsubscribe = unsubscribe ?? throw new ArgumentNullException(nameof(unsubscribe));
_write = write ?? throw new ArgumentNullException(nameof(write));
_clock = clock ?? throw new ArgumentNullException(nameof(clock));
}
public int TrackedAlarmCount => _alarms.Count;
/// <summary>
/// Advise the four alarm attributes for <paramref name="alarmTag"/>. Idempotent —
/// repeat calls for the same alarm tag are a no-op. Subscribe failure for any of the
/// four rolls back the alarm entry so a stale callback cannot promote a phantom.
/// </summary>
public async Task TrackAsync(string alarmTag)
{
if (_disposed || string.IsNullOrWhiteSpace(alarmTag)) return;
if (_alarms.ContainsKey(alarmTag)) return;
var state = new AlarmState { AlarmTag = alarmTag };
if (!_alarms.TryAdd(alarmTag, state)) return;
var probes = new[]
{
(Tag: alarmTag + InAlarmAttr, Field: AlarmField.InAlarm),
(Tag: alarmTag + PriorityAttr, Field: AlarmField.Priority),
(Tag: alarmTag + DescAttrNameAttr, Field: AlarmField.DescAttrName),
(Tag: alarmTag + AckedAttr, Field: AlarmField.Acked),
};
foreach (var p in probes)
{
_probeToAlarm[p.Tag] = (alarmTag, p.Field);
}
try
{
foreach (var p in probes)
{
await _subscribe(p.Tag, OnProbeCallback).ConfigureAwait(false);
}
}
catch
{
// Rollback so a partial advise doesn't leak state.
_alarms.TryRemove(alarmTag, out _);
foreach (var p in probes)
{
_probeToAlarm.TryRemove(p.Tag, out _);
try { await _unsubscribe(p.Tag).ConfigureAwait(false); } catch { }
}
throw;
}
}
/// <summary>
/// Drop every tracked alarm. Unadvises all 4 probes per alarm as best-effort.
/// </summary>
public async Task ClearAsync()
{
_alarms.Clear();
foreach (var kv in _probeToAlarm.ToList())
{
_probeToAlarm.TryRemove(kv.Key, out _);
try { await _unsubscribe(kv.Key).ConfigureAwait(false); } catch { }
}
}
/// <summary>
/// Operator ack — write the comment text into <c>&lt;alarmTag&gt;.AckMsg</c>.
/// Returns false when the runtime reports the write failed.
/// </summary>
public Task<bool> AcknowledgeAsync(string alarmTag, string comment)
{
if (_disposed || string.IsNullOrWhiteSpace(alarmTag))
return Task.FromResult(false);
return _write(alarmTag + AckMsgAttr, comment ?? string.Empty);
}
/// <summary>
/// Subscription callback entry point. Exposed for tests and for the Backend to route
/// fan-out callbacks through. Runs the state machine and fires TransitionRaised
/// outside the lock.
/// </summary>
public void OnProbeCallback(string probeTag, Vtq vtq)
{
if (_disposed) return;
if (!_probeToAlarm.TryGetValue(probeTag, out var link)) return;
if (!_alarms.TryGetValue(link.AlarmTag, out var state)) return;
AlarmTransition? transition = null;
var now = _clock();
lock (state.Lock)
{
switch (link.Field)
{
case AlarmField.InAlarm:
{
var wasActive = state.InAlarm;
var isActive = vtq.Value is bool b && b;
state.InAlarm = isActive;
state.LastUpdateUtc = now;
if (!wasActive && isActive)
{
state.Acked = false;
state.LastTransitionUtc = now;
transition = new AlarmTransition(state.AlarmTag, AlarmStateTransition.Active, state.Priority, state.DescAttrName, now);
}
else if (wasActive && !isActive)
{
state.LastTransitionUtc = now;
transition = new AlarmTransition(state.AlarmTag, AlarmStateTransition.Inactive, state.Priority, state.DescAttrName, now);
}
break;
}
case AlarmField.Priority:
if (vtq.Value is int pi) state.Priority = pi;
else if (vtq.Value is short ps) state.Priority = ps;
else if (vtq.Value is long pl && pl <= int.MaxValue) state.Priority = (int)pl;
state.LastUpdateUtc = now;
break;
case AlarmField.DescAttrName:
state.DescAttrName = vtq.Value as string;
state.LastUpdateUtc = now;
break;
case AlarmField.Acked:
{
var wasAcked = state.Acked;
var isAcked = vtq.Value is bool b && b;
state.Acked = isAcked;
state.LastUpdateUtc = now;
// Fire Acknowledged only when transitioning false→true. Don't fire on initial
// subscribe callback (wasAcked==isAcked in that case because the state starts
// with Acked=false and the initial probe is usually true for an un-active alarm).
if (!wasAcked && isAcked && state.InAlarm)
{
state.LastTransitionUtc = now;
transition = new AlarmTransition(state.AlarmTag, AlarmStateTransition.Acknowledged, state.Priority, state.DescAttrName, now);
}
break;
}
}
}
if (transition is { } t)
{
TransitionRaised?.Invoke(this, t);
}
}
public IReadOnlyList<AlarmSnapshot> SnapshotStates()
{
return _alarms.Values.Select(s =>
{
lock (s.Lock)
return new AlarmSnapshot(s.AlarmTag, s.InAlarm, s.Acked, s.Priority, s.DescAttrName);
}).ToList();
}
public void Dispose()
{
if (_disposed) return;
_disposed = true;
_alarms.Clear();
_probeToAlarm.Clear();
}
private sealed class AlarmState
{
public readonly object Lock = new();
public string AlarmTag = "";
public bool InAlarm;
public bool Acked = true; // default ack'd so first false→true on subscribe doesn't misfire
public int Priority;
public string? DescAttrName;
public DateTime LastUpdateUtc;
public DateTime LastTransitionUtc;
}
private enum AlarmField { InAlarm, Priority, DescAttrName, Acked }
}
public enum AlarmStateTransition { Active, Acknowledged, Inactive }
public sealed record AlarmTransition(
string AlarmTag,
AlarmStateTransition Transition,
int Priority,
string? DescAttrName,
DateTime AtUtc);
public sealed record AlarmSnapshot(
string AlarmTag,
bool InAlarm,
bool Acked,
int Priority,
string? DescAttrName);

View File

@@ -21,6 +21,13 @@ public sealed class DbBackedGalaxyBackend(GalaxyRepository repository) : IGalaxy
private long _nextSessionId; private long _nextSessionId;
private long _nextSubscriptionId; private long _nextSubscriptionId;
// DB-only backend doesn't have a runtime data plane; never raises events.
#pragma warning disable CS0067
public event System.EventHandler<OnDataChangeNotification>? OnDataChange;
public event System.EventHandler<GalaxyAlarmEvent>? OnAlarmEvent;
public event System.EventHandler<HostConnectivityStatus>? OnHostStatusChanged;
#pragma warning restore CS0067
public Task<OpenSessionResponse> OpenSessionAsync(OpenSessionRequest req, CancellationToken ct) public Task<OpenSessionResponse> OpenSessionAsync(OpenSessionRequest req, CancellationToken ct)
{ {
var id = Interlocked.Increment(ref _nextSessionId); var id = Interlocked.Increment(ref _nextSessionId);
@@ -120,6 +127,33 @@ public sealed class DbBackedGalaxyBackend(GalaxyRepository repository) : IGalaxy
Tags = System.Array.Empty<HistoryTagValues>(), Tags = System.Array.Empty<HistoryTagValues>(),
}); });
public Task<HistoryReadProcessedResponse> HistoryReadProcessedAsync(
HistoryReadProcessedRequest req, CancellationToken ct)
=> Task.FromResult(new HistoryReadProcessedResponse
{
Success = false,
Error = "MXAccess + Historian code lift pending (Phase 2 Task B.1)",
Values = System.Array.Empty<GalaxyDataValue>(),
});
public Task<HistoryReadAtTimeResponse> HistoryReadAtTimeAsync(
HistoryReadAtTimeRequest req, CancellationToken ct)
=> Task.FromResult(new HistoryReadAtTimeResponse
{
Success = false,
Error = "MXAccess + Historian code lift pending (Phase 2 Task B.1)",
Values = System.Array.Empty<GalaxyDataValue>(),
});
public Task<HistoryReadEventsResponse> HistoryReadEventsAsync(
HistoryReadEventsRequest req, CancellationToken ct)
=> Task.FromResult(new HistoryReadEventsResponse
{
Success = false,
Error = "MXAccess + Historian code lift pending (Phase 2 Task B.1)",
Events = System.Array.Empty<GalaxyHistoricalEvent>(),
});
public Task<RecycleStatusResponse> RecycleAsync(RecycleHostRequest req, CancellationToken ct) public Task<RecycleStatusResponse> RecycleAsync(RecycleHostRequest req, CancellationToken ct)
=> Task.FromResult(new RecycleStatusResponse { Accepted = true, GraceSeconds = 15 }); => Task.FromResult(new RecycleStatusResponse { Accepted = true, GraceSeconds = 15 });
@@ -131,6 +165,7 @@ public sealed class DbBackedGalaxyBackend(GalaxyRepository repository) : IGalaxy
ArrayDim = row.ArrayDimension is int d and > 0 ? (uint)d : null, ArrayDim = row.ArrayDimension is int d and > 0 ? (uint)d : null,
SecurityClassification = row.SecurityClassification, SecurityClassification = row.SecurityClassification,
IsHistorized = row.IsHistorized, IsHistorized = row.IsHistorized,
IsAlarm = row.IsAlarm,
}; };
/// <summary> /// <summary>

View File

@@ -0,0 +1,129 @@
using System;
using System.Collections.Generic;
using System.Linq;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.Historian
{
/// <summary>
/// Thread-safe, pure-logic endpoint picker for the Wonderware Historian cluster. Tracks which
/// configured nodes are healthy, places failed nodes in a time-bounded cooldown, and hands
/// out an ordered list of eligible candidates for the data source to try in sequence.
/// </summary>
internal sealed class HistorianClusterEndpointPicker
{
private readonly Func<DateTime> _clock;
private readonly TimeSpan _cooldown;
private readonly object _lock = new object();
private readonly List<NodeEntry> _nodes;
public HistorianClusterEndpointPicker(HistorianConfiguration config)
: this(config, () => DateTime.UtcNow) { }
internal HistorianClusterEndpointPicker(HistorianConfiguration config, Func<DateTime> clock)
{
_clock = clock ?? throw new ArgumentNullException(nameof(clock));
_cooldown = TimeSpan.FromSeconds(Math.Max(0, config.FailureCooldownSeconds));
var names = (config.ServerNames != null && config.ServerNames.Count > 0)
? config.ServerNames
: new List<string> { config.ServerName };
_nodes = names
.Where(n => !string.IsNullOrWhiteSpace(n))
.Select(n => n.Trim())
.Distinct(StringComparer.OrdinalIgnoreCase)
.Select(n => new NodeEntry { Name = n })
.ToList();
}
public int NodeCount
{
get { lock (_lock) return _nodes.Count; }
}
public IReadOnlyList<string> GetHealthyNodes()
{
lock (_lock)
{
var now = _clock();
return _nodes.Where(n => IsHealthyAt(n, now)).Select(n => n.Name).ToList();
}
}
public int HealthyNodeCount
{
get
{
lock (_lock)
{
var now = _clock();
return _nodes.Count(n => IsHealthyAt(n, now));
}
}
}
public void MarkFailed(string node, string? error)
{
lock (_lock)
{
var entry = FindEntry(node);
if (entry == null) return;
var now = _clock();
entry.FailureCount++;
entry.LastError = error;
entry.LastFailureTime = now;
entry.CooldownUntil = _cooldown.TotalMilliseconds > 0 ? now + _cooldown : (DateTime?)null;
}
}
public void MarkHealthy(string node)
{
lock (_lock)
{
var entry = FindEntry(node);
if (entry == null) return;
entry.CooldownUntil = null;
}
}
public List<HistorianClusterNodeState> SnapshotNodeStates()
{
lock (_lock)
{
var now = _clock();
return _nodes.Select(n => new HistorianClusterNodeState
{
Name = n.Name,
IsHealthy = IsHealthyAt(n, now),
CooldownUntil = IsHealthyAt(n, now) ? null : n.CooldownUntil,
FailureCount = n.FailureCount,
LastError = n.LastError,
LastFailureTime = n.LastFailureTime
}).ToList();
}
}
private static bool IsHealthyAt(NodeEntry entry, DateTime now)
{
return entry.CooldownUntil == null || entry.CooldownUntil <= now;
}
private NodeEntry? FindEntry(string node)
{
for (var i = 0; i < _nodes.Count; i++)
if (string.Equals(_nodes[i].Name, node, StringComparison.OrdinalIgnoreCase))
return _nodes[i];
return null;
}
private sealed class NodeEntry
{
public string Name { get; set; } = "";
public DateTime? CooldownUntil { get; set; }
public int FailureCount { get; set; }
public string? LastError { get; set; }
public DateTime? LastFailureTime { get; set; }
}
}
}

View File

@@ -0,0 +1,18 @@
using System;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.Historian
{
/// <summary>
/// Point-in-time state of a single historian cluster node. One entry per configured node
/// appears inside <see cref="HistorianHealthSnapshot"/>.
/// </summary>
public sealed class HistorianClusterNodeState
{
public string Name { get; set; } = "";
public bool IsHealthy { get; set; }
public DateTime? CooldownUntil { get; set; }
public int FailureCount { get; set; }
public string? LastError { get; set; }
public DateTime? LastFailureTime { get; set; }
}
}

View File

@@ -0,0 +1,38 @@
using System.Collections.Generic;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.Historian
{
/// <summary>
/// Wonderware Historian SDK configuration. Populated from environment variables at Host
/// startup (see <c>Program.cs</c>) or from the Proxy's <c>DriverInstance.DriverConfig</c>
/// section passed during OpenSession. Kept OPC-UA-free — the Proxy side owns UA translation.
/// </summary>
public sealed class HistorianConfiguration
{
public bool Enabled { get; set; } = false;
/// <summary>Single-node fallback when <see cref="ServerNames"/> is empty.</summary>
public string ServerName { get; set; } = "localhost";
/// <summary>
/// Ordered cluster nodes. When non-empty, the data source tries each in order on connect,
/// falling through to the next on failure. A failed node is placed in cooldown for
/// <see cref="FailureCooldownSeconds"/> before being re-eligible.
/// </summary>
public List<string> ServerNames { get; set; } = new();
public int FailureCooldownSeconds { get; set; } = 60;
public bool IntegratedSecurity { get; set; } = true;
public string? UserName { get; set; }
public string? Password { get; set; }
public int Port { get; set; } = 32568;
public int CommandTimeoutSeconds { get; set; } = 30;
public int MaxValuesPerRead { get; set; } = 10000;
/// <summary>
/// Outer safety timeout applied to sync-over-async Historian operations. Must be
/// comfortably larger than <see cref="CommandTimeoutSeconds"/>.
/// </summary>
public int RequestTimeoutSeconds { get; set; } = 60;
}
}

View File

@@ -0,0 +1,621 @@
using System;
using System.Collections.Generic;
using StringCollection = System.Collections.Specialized.StringCollection;
using System.Threading;
using System.Threading.Tasks;
using ArchestrA;
using Serilog;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.Historian
{
/// <summary>
/// Reads historical data from the Wonderware Historian via the aahClientManaged SDK.
/// OPC-UA-free — emits <see cref="HistorianSample"/>/<see cref="HistorianAggregateSample"/>
/// which the Proxy maps to OPC UA <c>DataValue</c> on its side of the IPC.
/// </summary>
public sealed class HistorianDataSource : IHistorianDataSource
{
private static readonly ILogger Log = Serilog.Log.ForContext<HistorianDataSource>();
private readonly HistorianConfiguration _config;
private readonly object _connectionLock = new object();
private readonly object _eventConnectionLock = new object();
private readonly IHistorianConnectionFactory _factory;
private HistorianAccess? _connection;
private HistorianAccess? _eventConnection;
private bool _disposed;
private readonly object _healthLock = new object();
private long _totalSuccesses;
private long _totalFailures;
private int _consecutiveFailures;
private DateTime? _lastSuccessTime;
private DateTime? _lastFailureTime;
private string? _lastError;
private string? _activeProcessNode;
private string? _activeEventNode;
private readonly HistorianClusterEndpointPicker _picker;
public HistorianDataSource(HistorianConfiguration config)
: this(config, new SdkHistorianConnectionFactory(), null) { }
internal HistorianDataSource(
HistorianConfiguration config,
IHistorianConnectionFactory factory,
HistorianClusterEndpointPicker? picker = null)
{
_config = config;
_factory = factory;
_picker = picker ?? new HistorianClusterEndpointPicker(config);
}
private (HistorianAccess Connection, string Node) ConnectToAnyHealthyNode(HistorianConnectionType type)
{
var candidates = _picker.GetHealthyNodes();
if (candidates.Count == 0)
{
var total = _picker.NodeCount;
throw new InvalidOperationException(
total == 0
? "No historian nodes configured"
: $"All {total} historian nodes are in cooldown — no healthy endpoints to connect to");
}
Exception? lastException = null;
foreach (var node in candidates)
{
var attemptConfig = CloneConfigWithServerName(node);
try
{
var conn = _factory.CreateAndConnect(attemptConfig, type);
_picker.MarkHealthy(node);
return (conn, node);
}
catch (Exception ex)
{
_picker.MarkFailed(node, ex.Message);
lastException = ex;
Log.Warning(ex, "Historian node {Node} failed during connect attempt; trying next candidate", node);
}
}
var inner = lastException?.Message ?? "(no detail)";
throw new InvalidOperationException(
$"All {candidates.Count} healthy historian candidate(s) failed during connect: {inner}",
lastException);
}
private HistorianConfiguration CloneConfigWithServerName(string serverName)
{
return new HistorianConfiguration
{
Enabled = _config.Enabled,
ServerName = serverName,
ServerNames = _config.ServerNames,
FailureCooldownSeconds = _config.FailureCooldownSeconds,
IntegratedSecurity = _config.IntegratedSecurity,
UserName = _config.UserName,
Password = _config.Password,
Port = _config.Port,
CommandTimeoutSeconds = _config.CommandTimeoutSeconds,
MaxValuesPerRead = _config.MaxValuesPerRead
};
}
public HistorianHealthSnapshot GetHealthSnapshot()
{
var nodeStates = _picker.SnapshotNodeStates();
var healthyCount = 0;
foreach (var n in nodeStates)
if (n.IsHealthy) healthyCount++;
lock (_healthLock)
{
return new HistorianHealthSnapshot
{
TotalQueries = _totalSuccesses + _totalFailures,
TotalSuccesses = _totalSuccesses,
TotalFailures = _totalFailures,
ConsecutiveFailures = _consecutiveFailures,
LastSuccessTime = _lastSuccessTime,
LastFailureTime = _lastFailureTime,
LastError = _lastError,
ProcessConnectionOpen = Volatile.Read(ref _connection) != null,
EventConnectionOpen = Volatile.Read(ref _eventConnection) != null,
ActiveProcessNode = _activeProcessNode,
ActiveEventNode = _activeEventNode,
NodeCount = nodeStates.Count,
HealthyNodeCount = healthyCount,
Nodes = nodeStates
};
}
}
private void RecordSuccess()
{
lock (_healthLock)
{
_totalSuccesses++;
_lastSuccessTime = DateTime.UtcNow;
_consecutiveFailures = 0;
_lastError = null;
}
}
private void RecordFailure(string error)
{
lock (_healthLock)
{
_totalFailures++;
_lastFailureTime = DateTime.UtcNow;
_consecutiveFailures++;
_lastError = error;
}
}
private void EnsureConnected()
{
if (_disposed)
throw new ObjectDisposedException(nameof(HistorianDataSource));
if (Volatile.Read(ref _connection) != null) return;
var (conn, winningNode) = ConnectToAnyHealthyNode(HistorianConnectionType.Process);
lock (_connectionLock)
{
if (_disposed)
{
conn.CloseConnection(out _);
conn.Dispose();
throw new ObjectDisposedException(nameof(HistorianDataSource));
}
if (_connection != null)
{
conn.CloseConnection(out _);
conn.Dispose();
return;
}
_connection = conn;
lock (_healthLock) _activeProcessNode = winningNode;
Log.Information("Historian SDK connection opened to {Server}:{Port}", winningNode, _config.Port);
}
}
private void HandleConnectionError(Exception? ex = null)
{
lock (_connectionLock)
{
if (_connection == null) return;
try
{
_connection.CloseConnection(out _);
_connection.Dispose();
}
catch (Exception disposeEx)
{
Log.Debug(disposeEx, "Error disposing Historian SDK connection during error recovery");
}
_connection = null;
string? failedNode;
lock (_healthLock)
{
failedNode = _activeProcessNode;
_activeProcessNode = null;
}
if (failedNode != null) _picker.MarkFailed(failedNode, ex?.Message ?? "mid-query failure");
Log.Warning(ex, "Historian SDK connection reset (node={Node})", failedNode ?? "(unknown)");
}
}
private void EnsureEventConnected()
{
if (_disposed)
throw new ObjectDisposedException(nameof(HistorianDataSource));
if (Volatile.Read(ref _eventConnection) != null) return;
var (conn, winningNode) = ConnectToAnyHealthyNode(HistorianConnectionType.Event);
lock (_eventConnectionLock)
{
if (_disposed)
{
conn.CloseConnection(out _);
conn.Dispose();
throw new ObjectDisposedException(nameof(HistorianDataSource));
}
if (_eventConnection != null)
{
conn.CloseConnection(out _);
conn.Dispose();
return;
}
_eventConnection = conn;
lock (_healthLock) _activeEventNode = winningNode;
Log.Information("Historian SDK event connection opened to {Server}:{Port}", winningNode, _config.Port);
}
}
private void HandleEventConnectionError(Exception? ex = null)
{
lock (_eventConnectionLock)
{
if (_eventConnection == null) return;
try
{
_eventConnection.CloseConnection(out _);
_eventConnection.Dispose();
}
catch (Exception disposeEx)
{
Log.Debug(disposeEx, "Error disposing Historian SDK event connection during error recovery");
}
_eventConnection = null;
string? failedNode;
lock (_healthLock)
{
failedNode = _activeEventNode;
_activeEventNode = null;
}
if (failedNode != null) _picker.MarkFailed(failedNode, ex?.Message ?? "mid-query failure");
Log.Warning(ex, "Historian SDK event connection reset (node={Node})", failedNode ?? "(unknown)");
}
}
public Task<List<HistorianSample>> ReadRawAsync(
string tagName, DateTime startTime, DateTime endTime, int maxValues,
CancellationToken ct = default)
{
var results = new List<HistorianSample>();
try
{
EnsureConnected();
using var query = _connection!.CreateHistoryQuery();
var args = new HistoryQueryArgs
{
TagNames = new StringCollection { tagName },
StartDateTime = startTime,
EndDateTime = endTime,
RetrievalMode = HistorianRetrievalMode.Full
};
if (maxValues > 0)
args.BatchSize = (uint)maxValues;
else if (_config.MaxValuesPerRead > 0)
args.BatchSize = (uint)_config.MaxValuesPerRead;
if (!query.StartQuery(args, out var error))
{
Log.Warning("Historian SDK raw query start failed for {Tag}: {Error}", tagName, error.ErrorCode);
RecordFailure($"raw StartQuery: {error.ErrorCode}");
HandleConnectionError();
return Task.FromResult(results);
}
var count = 0;
var limit = maxValues > 0 ? maxValues : _config.MaxValuesPerRead;
while (query.MoveNext(out error))
{
ct.ThrowIfCancellationRequested();
var result = query.QueryResult;
var timestamp = DateTime.SpecifyKind(result.StartDateTime, DateTimeKind.Utc);
object? value;
if (!string.IsNullOrEmpty(result.StringValue) && result.Value == 0)
value = result.StringValue;
else
value = result.Value;
results.Add(new HistorianSample
{
Value = value,
TimestampUtc = timestamp,
Quality = (byte)(result.OpcQuality & 0xFF),
});
count++;
if (limit > 0 && count >= limit) break;
}
query.EndQuery(out _);
RecordSuccess();
}
catch (OperationCanceledException) { throw; }
catch (ObjectDisposedException) { throw; }
catch (Exception ex)
{
Log.Warning(ex, "HistoryRead raw failed for {Tag}", tagName);
RecordFailure($"raw: {ex.Message}");
HandleConnectionError(ex);
}
Log.Debug("HistoryRead raw: {Tag} returned {Count} values ({Start} to {End})",
tagName, results.Count, startTime, endTime);
return Task.FromResult(results);
}
public Task<List<HistorianAggregateSample>> ReadAggregateAsync(
string tagName, DateTime startTime, DateTime endTime,
double intervalMs, string aggregateColumn,
CancellationToken ct = default)
{
var results = new List<HistorianAggregateSample>();
try
{
EnsureConnected();
using var query = _connection!.CreateAnalogSummaryQuery();
var args = new AnalogSummaryQueryArgs
{
TagNames = new StringCollection { tagName },
StartDateTime = startTime,
EndDateTime = endTime,
Resolution = (ulong)intervalMs
};
if (!query.StartQuery(args, out var error))
{
Log.Warning("Historian SDK aggregate query start failed for {Tag}: {Error}", tagName, error.ErrorCode);
RecordFailure($"aggregate StartQuery: {error.ErrorCode}");
HandleConnectionError();
return Task.FromResult(results);
}
while (query.MoveNext(out error))
{
ct.ThrowIfCancellationRequested();
var result = query.QueryResult;
var timestamp = DateTime.SpecifyKind(result.StartDateTime, DateTimeKind.Utc);
var value = ExtractAggregateValue(result, aggregateColumn);
results.Add(new HistorianAggregateSample
{
Value = value,
TimestampUtc = timestamp,
});
}
query.EndQuery(out _);
RecordSuccess();
}
catch (OperationCanceledException) { throw; }
catch (ObjectDisposedException) { throw; }
catch (Exception ex)
{
Log.Warning(ex, "HistoryRead aggregate failed for {Tag}", tagName);
RecordFailure($"aggregate: {ex.Message}");
HandleConnectionError(ex);
}
Log.Debug("HistoryRead aggregate ({Aggregate}): {Tag} returned {Count} values",
aggregateColumn, tagName, results.Count);
return Task.FromResult(results);
}
public Task<List<HistorianSample>> ReadAtTimeAsync(
string tagName, DateTime[] timestamps,
CancellationToken ct = default)
{
var results = new List<HistorianSample>();
if (timestamps == null || timestamps.Length == 0)
return Task.FromResult(results);
try
{
EnsureConnected();
foreach (var timestamp in timestamps)
{
ct.ThrowIfCancellationRequested();
using var query = _connection!.CreateHistoryQuery();
var args = new HistoryQueryArgs
{
TagNames = new StringCollection { tagName },
StartDateTime = timestamp,
EndDateTime = timestamp,
RetrievalMode = HistorianRetrievalMode.Interpolated,
BatchSize = 1
};
if (!query.StartQuery(args, out var error))
{
results.Add(new HistorianSample
{
Value = null,
TimestampUtc = DateTime.SpecifyKind(timestamp, DateTimeKind.Utc),
Quality = 0, // Bad
});
continue;
}
if (query.MoveNext(out error))
{
var result = query.QueryResult;
object? value;
if (!string.IsNullOrEmpty(result.StringValue) && result.Value == 0)
value = result.StringValue;
else
value = result.Value;
results.Add(new HistorianSample
{
Value = value,
TimestampUtc = DateTime.SpecifyKind(timestamp, DateTimeKind.Utc),
Quality = (byte)(result.OpcQuality & 0xFF),
});
}
else
{
results.Add(new HistorianSample
{
Value = null,
TimestampUtc = DateTime.SpecifyKind(timestamp, DateTimeKind.Utc),
Quality = 0,
});
}
query.EndQuery(out _);
}
RecordSuccess();
}
catch (OperationCanceledException) { throw; }
catch (ObjectDisposedException) { throw; }
catch (Exception ex)
{
Log.Warning(ex, "HistoryRead at-time failed for {Tag}", tagName);
RecordFailure($"at-time: {ex.Message}");
HandleConnectionError(ex);
}
Log.Debug("HistoryRead at-time: {Tag} returned {Count} values for {Timestamps} timestamps",
tagName, results.Count, timestamps.Length);
return Task.FromResult(results);
}
public Task<List<HistorianEventDto>> ReadEventsAsync(
string? sourceName, DateTime startTime, DateTime endTime, int maxEvents,
CancellationToken ct = default)
{
var results = new List<HistorianEventDto>();
try
{
EnsureEventConnected();
using var query = _eventConnection!.CreateEventQuery();
var args = new EventQueryArgs
{
StartDateTime = startTime,
EndDateTime = endTime,
EventCount = maxEvents > 0 ? (uint)maxEvents : (uint)_config.MaxValuesPerRead,
QueryType = HistorianEventQueryType.Events,
EventOrder = HistorianEventOrder.Ascending
};
if (!string.IsNullOrEmpty(sourceName))
{
query.AddEventFilter("Source", HistorianComparisionType.Equal, sourceName, out _);
}
if (!query.StartQuery(args, out var error))
{
Log.Warning("Historian SDK event query start failed: {Error}", error.ErrorCode);
RecordFailure($"events StartQuery: {error.ErrorCode}");
HandleEventConnectionError();
return Task.FromResult(results);
}
var count = 0;
while (query.MoveNext(out error))
{
ct.ThrowIfCancellationRequested();
results.Add(ToDto(query.QueryResult));
count++;
if (maxEvents > 0 && count >= maxEvents) break;
}
query.EndQuery(out _);
RecordSuccess();
}
catch (OperationCanceledException) { throw; }
catch (ObjectDisposedException) { throw; }
catch (Exception ex)
{
Log.Warning(ex, "HistoryRead events failed for source {Source}", sourceName ?? "(all)");
RecordFailure($"events: {ex.Message}");
HandleEventConnectionError(ex);
}
Log.Debug("HistoryRead events: source={Source} returned {Count} events ({Start} to {End})",
sourceName ?? "(all)", results.Count, startTime, endTime);
return Task.FromResult(results);
}
private static HistorianEventDto ToDto(HistorianEvent evt)
{
// The ArchestrA SDK marks these properties obsolete but still returns them; their
// successors aren't wired in the version we bind against. Using them is the documented
// v1 behavior — suppressed locally instead of project-wide so any non-event use of
// deprecated SDK surface still surfaces as an error.
#pragma warning disable CS0618
return new HistorianEventDto
{
Id = evt.Id,
Source = evt.Source,
EventTime = evt.EventTime,
ReceivedTime = evt.ReceivedTime,
DisplayText = evt.DisplayText,
Severity = (ushort)evt.Severity
};
#pragma warning restore CS0618
}
internal static double? ExtractAggregateValue(AnalogSummaryQueryResult result, string column)
{
switch (column)
{
case "Average": return result.Average;
case "Minimum": return result.Minimum;
case "Maximum": return result.Maximum;
case "ValueCount": return result.ValueCount;
case "First": return result.First;
case "Last": return result.Last;
case "StdDev": return result.StdDev;
default: return null;
}
}
public void Dispose()
{
if (_disposed) return;
_disposed = true;
try
{
_connection?.CloseConnection(out _);
_connection?.Dispose();
}
catch (Exception ex)
{
Log.Warning(ex, "Error closing Historian SDK connection");
}
try
{
_eventConnection?.CloseConnection(out _);
_eventConnection?.Dispose();
}
catch (Exception ex)
{
Log.Warning(ex, "Error closing Historian SDK event connection");
}
_connection = null;
_eventConnection = null;
}
}
}

View File

@@ -0,0 +1,18 @@
using System;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.Historian
{
/// <summary>
/// SDK-free representation of a Historian event record. Prevents ArchestrA types from
/// leaking beyond <c>HistorianDataSource</c>.
/// </summary>
public sealed class HistorianEventDto
{
public Guid Id { get; set; }
public string? Source { get; set; }
public DateTime EventTime { get; set; }
public DateTime ReceivedTime { get; set; }
public string? DisplayText { get; set; }
public ushort Severity { get; set; }
}
}

View File

@@ -0,0 +1,27 @@
using System;
using System.Collections.Generic;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.Historian
{
/// <summary>
/// Point-in-time runtime health of the historian subsystem — consumed by the status dashboard
/// via an IPC health query (not wired in PR #5; deferred).
/// </summary>
public sealed class HistorianHealthSnapshot
{
public long TotalQueries { get; set; }
public long TotalSuccesses { get; set; }
public long TotalFailures { get; set; }
public int ConsecutiveFailures { get; set; }
public DateTime? LastSuccessTime { get; set; }
public DateTime? LastFailureTime { get; set; }
public string? LastError { get; set; }
public bool ProcessConnectionOpen { get; set; }
public bool EventConnectionOpen { get; set; }
public string? ActiveProcessNode { get; set; }
public string? ActiveEventNode { get; set; }
public int NodeCount { get; set; }
public int HealthyNodeCount { get; set; }
public List<HistorianClusterNodeState> Nodes { get; set; } = new();
}
}

View File

@@ -0,0 +1,46 @@
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.Historian;
/// <summary>
/// Maps a raw OPC DA quality byte (as returned by Wonderware Historian's <c>OpcQuality</c>)
/// to an OPC UA <c>StatusCode</c> uint. Preserves specific codes (BadNotConnected,
/// UncertainSubNormal, etc.) instead of collapsing to Good/Uncertain/Bad categories.
/// Mirrors v1 <c>QualityMapper.MapToOpcUaStatusCode</c> without pulling in OPC UA types —
/// the returned value is the 32-bit OPC UA <c>StatusCode</c> wire encoding that the Proxy
/// surfaces directly as <c>DataValueSnapshot.StatusCode</c>.
/// </summary>
public static class HistorianQualityMapper
{
/// <summary>
/// Map an 8-bit OPC DA quality byte to the corresponding OPC UA StatusCode. The byte
/// family bits decide the category (Good &gt;= 192, Uncertain 64-191, Bad 0-63); the
/// low-nibble subcode selects the specific code.
/// </summary>
public static uint Map(byte q) => q switch
{
// Good family (192+)
192 => 0x00000000u, // Good
216 => 0x00D80000u, // Good_LocalOverride
// Uncertain family (64-191)
64 => 0x40000000u, // Uncertain
68 => 0x40900000u, // Uncertain_LastUsableValue
80 => 0x40930000u, // Uncertain_SensorNotAccurate
84 => 0x40940000u, // Uncertain_EngineeringUnitsExceeded
88 => 0x40950000u, // Uncertain_SubNormal
// Bad family (0-63)
0 => 0x80000000u, // Bad
4 => 0x80890000u, // Bad_ConfigurationError
8 => 0x808A0000u, // Bad_NotConnected
12 => 0x808B0000u, // Bad_DeviceFailure
16 => 0x808C0000u, // Bad_SensorFailure
20 => 0x80050000u, // Bad_CommunicationError
24 => 0x808D0000u, // Bad_OutOfService
32 => 0x80320000u, // Bad_WaitingForInitialData
// Unknown code — fall back to the category so callers still get a sensible bucket.
_ when q >= 192 => 0x00000000u,
_ when q >= 64 => 0x40000000u,
_ => 0x80000000u,
};
}

View File

@@ -0,0 +1,30 @@
using System;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.Historian
{
/// <summary>
/// OPC-UA-free representation of a single historical data point. The Host returns these
/// across the IPC boundary as <c>GalaxyDataValue</c>; the Proxy maps quality and value to
/// OPC UA <c>DataValue</c>. Raw MX quality byte is preserved so the Proxy can use the same
/// quality mapper it already uses for live reads.
/// </summary>
public sealed class HistorianSample
{
public object? Value { get; set; }
/// <summary>Raw OPC DA quality byte from the historian SDK (low 8 bits of OpcQuality).</summary>
public byte Quality { get; set; }
public DateTime TimestampUtc { get; set; }
}
/// <summary>
/// Result of <see cref="IHistorianDataSource.ReadAggregateAsync"/>. When <see cref="Value"/> is
/// null the aggregate is unavailable for that bucket (Proxy maps to <c>BadNoData</c>).
/// </summary>
public sealed class HistorianAggregateSample
{
public double? Value { get; set; }
public DateTime TimestampUtc { get; set; }
}
}

View File

@@ -0,0 +1,73 @@
using System;
using System.Threading;
using ArchestrA;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.Historian
{
/// <summary>
/// Creates and opens Historian SDK connections. Extracted so tests can inject fakes that
/// control connection success, failure, and timeout behavior.
/// </summary>
internal interface IHistorianConnectionFactory
{
HistorianAccess CreateAndConnect(HistorianConfiguration config, HistorianConnectionType type);
}
/// <summary>Production implementation — opens real Historian SDK connections.</summary>
internal sealed class SdkHistorianConnectionFactory : IHistorianConnectionFactory
{
public HistorianAccess CreateAndConnect(HistorianConfiguration config, HistorianConnectionType type)
{
var conn = new HistorianAccess();
var args = new HistorianConnectionArgs
{
ServerName = config.ServerName,
TcpPort = (ushort)config.Port,
IntegratedSecurity = config.IntegratedSecurity,
UseArchestrAUser = config.IntegratedSecurity,
ConnectionType = type,
ReadOnly = true,
PacketTimeout = (uint)(config.CommandTimeoutSeconds * 1000)
};
if (!config.IntegratedSecurity)
{
args.UserName = config.UserName ?? string.Empty;
args.Password = config.Password ?? string.Empty;
}
if (!conn.OpenConnection(args, out var error))
{
conn.Dispose();
throw new InvalidOperationException(
$"Failed to open Historian SDK connection to {config.ServerName}:{config.Port}: {error.ErrorCode}");
}
var timeoutMs = config.CommandTimeoutSeconds * 1000;
var elapsed = 0;
while (elapsed < timeoutMs)
{
var status = new HistorianConnectionStatus();
conn.GetConnectionStatus(ref status);
if (status.ConnectedToServer)
return conn;
if (status.ErrorOccurred)
{
conn.Dispose();
throw new InvalidOperationException(
$"Historian SDK connection failed: {status.Error}");
}
Thread.Sleep(250);
elapsed += 250;
}
conn.Dispose();
throw new TimeoutException(
$"Historian SDK connection to {config.ServerName}:{config.Port} timed out after {config.CommandTimeoutSeconds}s");
}
}
}

View File

@@ -0,0 +1,34 @@
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.Historian
{
/// <summary>
/// OPC-UA-free surface for the Wonderware Historian subsystem inside Galaxy.Host.
/// Implementations read via the aahClient* SDK; the Proxy side maps returned samples
/// to OPC UA <c>DataValue</c>.
/// </summary>
public interface IHistorianDataSource : IDisposable
{
Task<List<HistorianSample>> ReadRawAsync(
string tagName, DateTime startTime, DateTime endTime, int maxValues,
CancellationToken ct = default);
Task<List<HistorianAggregateSample>> ReadAggregateAsync(
string tagName, DateTime startTime, DateTime endTime,
double intervalMs, string aggregateColumn,
CancellationToken ct = default);
Task<List<HistorianSample>> ReadAtTimeAsync(
string tagName, DateTime[] timestamps,
CancellationToken ct = default);
Task<List<HistorianEventDto>> ReadEventsAsync(
string? sourceName, DateTime startTime, DateTime endTime, int maxEvents,
CancellationToken ct = default);
HistorianHealthSnapshot GetHealthSnapshot();
}
}

View File

@@ -14,6 +14,15 @@ namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend;
/// </summary> /// </summary>
public interface IGalaxyBackend public interface IGalaxyBackend
{ {
/// <summary>
/// Server-pushed events the backend raises asynchronously (data-change, alarm,
/// host-status). The frame handler subscribes once on connect and forwards each
/// event to the Proxy as a typed <see cref="MessageKind"/> notification.
/// </summary>
event System.EventHandler<OnDataChangeNotification>? OnDataChange;
event System.EventHandler<GalaxyAlarmEvent>? OnAlarmEvent;
event System.EventHandler<HostConnectivityStatus>? OnHostStatusChanged;
Task<OpenSessionResponse> OpenSessionAsync(OpenSessionRequest req, CancellationToken ct); Task<OpenSessionResponse> OpenSessionAsync(OpenSessionRequest req, CancellationToken ct);
Task CloseSessionAsync(CloseSessionRequest req, CancellationToken ct); Task CloseSessionAsync(CloseSessionRequest req, CancellationToken ct);
@@ -29,6 +38,9 @@ public interface IGalaxyBackend
Task AcknowledgeAlarmAsync(AlarmAckRequest req, CancellationToken ct); Task AcknowledgeAlarmAsync(AlarmAckRequest req, CancellationToken ct);
Task<HistoryReadResponse> HistoryReadAsync(HistoryReadRequest req, CancellationToken ct); Task<HistoryReadResponse> HistoryReadAsync(HistoryReadRequest req, CancellationToken ct);
Task<HistoryReadProcessedResponse> HistoryReadProcessedAsync(HistoryReadProcessedRequest req, CancellationToken ct);
Task<HistoryReadAtTimeResponse> HistoryReadAtTimeAsync(HistoryReadAtTimeRequest req, CancellationToken ct);
Task<HistoryReadEventsResponse> HistoryReadEventsAsync(HistoryReadEventsRequest req, CancellationToken ct);
Task<RecycleStatusResponse> RecycleAsync(RecycleHostRequest req, CancellationToken ct); Task<RecycleStatusResponse> RecycleAsync(RecycleHostRequest req, CancellationToken ct);
} }

View File

@@ -1,8 +1,10 @@
using System; using System;
using System.Collections.Concurrent; using System.Collections.Concurrent;
using System.Linq;
using System.Threading; using System.Threading;
using System.Threading.Tasks; using System.Threading.Tasks;
using ArchestrA.MxAccess; using ArchestrA.MxAccess;
using Serilog;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Sta; using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Sta;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.MxAccess; namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.MxAccess;
@@ -17,9 +19,12 @@ namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.MxAccess;
/// </summary> /// </summary>
public sealed class MxAccessClient : IDisposable public sealed class MxAccessClient : IDisposable
{ {
private static readonly ILogger Log = Serilog.Log.ForContext<MxAccessClient>();
private readonly StaPump _pump; private readonly StaPump _pump;
private readonly IMxProxy _proxy; private readonly IMxProxy _proxy;
private readonly string _clientName; private readonly string _clientName;
private readonly MxAccessClientOptions _options;
// Galaxy attribute reference → MXAccess item handle (set on first Subscribe/Read). // Galaxy attribute reference → MXAccess item handle (set on first Subscribe/Read).
private readonly ConcurrentDictionary<string, int> _addressToHandle = new(StringComparer.OrdinalIgnoreCase); private readonly ConcurrentDictionary<string, int> _addressToHandle = new(StringComparer.OrdinalIgnoreCase);
@@ -30,21 +35,49 @@ public sealed class MxAccessClient : IDisposable
private int _connectionHandle; private int _connectionHandle;
private bool _connected; private bool _connected;
private DateTime _lastObservedActivityUtc = DateTime.UtcNow;
private CancellationTokenSource? _monitorCts;
private int _reconnectCount;
private bool _disposed;
public MxAccessClient(StaPump pump, IMxProxy proxy, string clientName) /// <summary>Fires whenever the connection transitions Connected ↔ Disconnected.</summary>
public event EventHandler<bool>? ConnectionStateChanged;
/// <summary>
/// Fires once per failed subscription replay after a reconnect. Carries the tag reference
/// and the exception so the backend can propagate the degradation signal (e.g. mark the
/// subscription bad on the Proxy side rather than silently losing its callback). Added for
/// PR 6 low finding #2 — the replay loop previously ate per-tag failures silently and an
/// operator would only find out that a specific subscription stopped updating through a
/// data-quality complaint from downstream.
/// </summary>
public event EventHandler<SubscriptionReplayFailedEventArgs>? SubscriptionReplayFailed;
public MxAccessClient(StaPump pump, IMxProxy proxy, string clientName, MxAccessClientOptions? options = null)
{ {
_pump = pump; _pump = pump;
_proxy = proxy; _proxy = proxy;
_clientName = clientName; _clientName = clientName;
_options = options ?? new MxAccessClientOptions();
_proxy.OnDataChange += OnDataChange; _proxy.OnDataChange += OnDataChange;
_proxy.OnWriteComplete += OnWriteComplete; _proxy.OnWriteComplete += OnWriteComplete;
} }
public bool IsConnected => _connected; public bool IsConnected => _connected;
public int SubscriptionCount => _subscriptions.Count; public int SubscriptionCount => _subscriptions.Count;
public int ReconnectCount => _reconnectCount;
/// <summary>Connects on the STA thread. Idempotent.</summary> /// <summary>
public Task<int> ConnectAsync() => _pump.InvokeAsync(() => /// Wonderware client identity used when registering with the LMXProxyServer. Surfaced so
/// <see cref="Backend.MxAccessGalaxyBackend"/> can tag its <c>OnHostStatusChanged</c> IPC
/// pushes with a stable gateway name per PR 8.
/// </summary>
public string ClientName => _clientName;
/// <summary>Connects on the STA thread. Idempotent. Starts the reconnect monitor on first call.</summary>
public async Task<int> ConnectAsync()
{
var handle = await _pump.InvokeAsync(() =>
{ {
if (_connected) return _connectionHandle; if (_connected) return _connectionHandle;
_connectionHandle = _proxy.Register(_clientName); _connectionHandle = _proxy.Register(_clientName);
@@ -52,7 +85,23 @@ public sealed class MxAccessClient : IDisposable
return _connectionHandle; return _connectionHandle;
}); });
public Task DisconnectAsync() => _pump.InvokeAsync(() => ConnectionStateChanged?.Invoke(this, true);
if (_options.AutoReconnect && _monitorCts is null)
{
_monitorCts = new CancellationTokenSource();
_ = Task.Run(() => MonitorLoopAsync(_monitorCts.Token));
}
return handle;
}
public async Task DisconnectAsync()
{
_monitorCts?.Cancel();
_monitorCts = null;
await _pump.InvokeAsync(() =>
{ {
if (!_connected) return; if (!_connected) return;
try { _proxy.Unregister(_connectionHandle); } try { _proxy.Unregister(_connectionHandle); }
@@ -64,6 +113,118 @@ public sealed class MxAccessClient : IDisposable
} }
}); });
ConnectionStateChanged?.Invoke(this, false);
}
/// <summary>
/// Background loop that watches for connection liveness signals and triggers
/// reconnect-with-replay when the connection appears dead. Per Phase 2 high finding #2:
/// v1's MxAccessClient.Monitor pattern lifted into the new pump-based client. Uses
/// observed-activity timestamp + optional probe-tag subscription. Without an explicit
/// probe tag, falls back to "no data change in N seconds + no successful read in N
/// seconds = unhealthy" — same shape as v1.
/// </summary>
private async Task MonitorLoopAsync(CancellationToken ct)
{
while (!ct.IsCancellationRequested)
{
try { await Task.Delay(_options.MonitorInterval, ct); }
catch (OperationCanceledException) { break; }
if (!_connected || _disposed) continue;
var idle = DateTime.UtcNow - _lastObservedActivityUtc;
if (idle <= _options.StaleThreshold) continue;
// Probe: try a no-op COM call. If the proxy is dead, the call will throw — that's
// our reconnect signal. PR 6 low finding #1: AddItem allocates an MXAccess item
// handle; we must RemoveItem it on the same pump turn or the long-running monitor
// leaks one handle per probe cycle (one every MonitorInterval seconds, indefinitely).
bool probeOk;
try
{
probeOk = await _pump.InvokeAsync(() =>
{
int probeHandle = 0;
try
{
probeHandle = _proxy.AddItem(_connectionHandle, "$Heartbeat");
return probeHandle > 0;
}
catch { return false; }
finally
{
if (probeHandle > 0)
{
try { _proxy.RemoveItem(_connectionHandle, probeHandle); }
catch { /* proxy is dying; best-effort cleanup */ }
}
}
});
}
catch { probeOk = false; }
if (probeOk)
{
_lastObservedActivityUtc = DateTime.UtcNow;
continue;
}
// Connection appears dead — reconnect-with-replay.
try
{
await _pump.InvokeAsync(() =>
{
try { _proxy.Unregister(_connectionHandle); } catch { /* dead anyway */ }
_connected = false;
});
ConnectionStateChanged?.Invoke(this, false);
await _pump.InvokeAsync(() =>
{
_connectionHandle = _proxy.Register(_clientName);
_connected = true;
});
_reconnectCount++;
ConnectionStateChanged?.Invoke(this, true);
// Replay every subscription that was active before the disconnect. PR 6 low
// finding #2: surface per-tag failures — log them and raise
// SubscriptionReplayFailed so the backend can propagate the degraded state
// (previously swallowed silently; downstream quality dropped without a signal).
var snapshot = _addressToHandle.Keys.ToArray();
_addressToHandle.Clear();
_handleToAddress.Clear();
var failed = 0;
foreach (var fullRef in snapshot)
{
try { await SubscribeOnPumpAsync(fullRef); }
catch (Exception subEx)
{
failed++;
Log.Warning(subEx,
"MXAccess subscription replay failed for {TagReference} after reconnect #{Reconnect}",
fullRef, _reconnectCount);
SubscriptionReplayFailed?.Invoke(this,
new SubscriptionReplayFailedEventArgs(fullRef, subEx));
}
}
if (failed > 0)
Log.Warning("Subscription replay completed — {Failed} of {Total} failed", failed, snapshot.Length);
else
Log.Information("Subscription replay completed — {Total} re-subscribed cleanly", snapshot.Length);
_lastObservedActivityUtc = DateTime.UtcNow;
}
catch
{
// Reconnect failed; back off and retry on the next tick.
_connected = false;
}
}
}
/// <summary> /// <summary>
/// One-shot read implemented as a transient subscribe + unsubscribe. /// One-shot read implemented as a transient subscribe + unsubscribe.
/// <c>LMXProxyServer</c> doesn't expose a synchronous read, so the canonical pattern /// <c>LMXProxyServer</c> doesn't expose a synchronous read, so the canonical pattern
@@ -79,26 +240,72 @@ public sealed class MxAccessClient : IDisposable
// Stash the one-shot handler before sending the subscribe, then remove it after firing. // Stash the one-shot handler before sending the subscribe, then remove it after firing.
_subscriptions.AddOrUpdate(fullReference, oneShot, (_, existing) => Combine(existing, oneShot)); _subscriptions.AddOrUpdate(fullReference, oneShot, (_, existing) => Combine(existing, oneShot));
var addedToReadOnlyAttribute = !_addressToHandle.ContainsKey(fullReference);
var itemHandle = await SubscribeOnPumpAsync(fullReference); try
{
await SubscribeOnPumpAsync(fullReference);
using var _ = ct.Register(() => tcs.TrySetCanceled()); using var _ = ct.Register(() => tcs.TrySetCanceled());
var raceTask = await Task.WhenAny(tcs.Task, Task.Delay(timeout, ct)); var raceTask = await Task.WhenAny(tcs.Task, Task.Delay(timeout, ct));
if (raceTask != tcs.Task) throw new TimeoutException($"MXAccess read of {fullReference} timed out after {timeout}"); if (raceTask != tcs.Task) throw new TimeoutException($"MXAccess read of {fullReference} timed out after {timeout}");
// Detach the one-shot handler. return await tcs.Task;
}
finally
{
// High 1 — always detach the one-shot handler, even on cancellation/timeout/throw.
// If we were the one who added the underlying MXAccess subscription (no other
// caller had it), tear it down too so we don't leak a probe item handle.
_subscriptions.AddOrUpdate(fullReference, _ => default!, (_, existing) => Remove(existing, oneShot)); _subscriptions.AddOrUpdate(fullReference, _ => default!, (_, existing) => Remove(existing, oneShot));
if (addedToReadOnlyAttribute)
{
try { await UnsubscribeAsync(fullReference); }
catch { /* shutdown-best-effort */ }
}
}
}
/// <summary>
/// Writes <paramref name="value"/> to the runtime and AWAITS the OnWriteComplete
/// callback so the caller learns the actual write status. Per Phase 2 medium finding #4
/// in <c>exit-gate-phase-2.md</c>: the previous fire-and-forget version returned a
/// false-positive Good even when the runtime rejected the write post-callback.
/// </summary>
public async Task<bool> WriteAsync(string fullReference, object value,
int securityClassification = 0, TimeSpan? timeout = null)
{
if (!_connected) throw new InvalidOperationException("MxAccessClient not connected");
var actualTimeout = timeout ?? TimeSpan.FromSeconds(5);
var itemHandle = await _pump.InvokeAsync(() => ResolveItem(fullReference));
var tcs = new TaskCompletionSource<bool>(TaskCreationOptions.RunContinuationsAsynchronously);
if (!_pendingWrites.TryAdd(itemHandle, tcs))
{
// A prior write to the same item handle is still pending — uncommon but possible
// if the caller spammed writes. Replace it: the older TCS observes a Cancelled task.
if (_pendingWrites.TryRemove(itemHandle, out var prior))
prior.TrySetCanceled();
_pendingWrites[itemHandle] = tcs;
}
try
{
await _pump.InvokeAsync(() =>
_proxy.Write(_connectionHandle, itemHandle, value, securityClassification));
var raceTask = await Task.WhenAny(tcs.Task, Task.Delay(actualTimeout));
if (raceTask != tcs.Task)
throw new TimeoutException($"MXAccess write of {fullReference} timed out after {actualTimeout}");
return await tcs.Task; return await tcs.Task;
} }
finally
public Task WriteAsync(string fullReference, object value, int securityClassification = 0) =>
_pump.InvokeAsync(() =>
{ {
if (!_connected) throw new InvalidOperationException("MxAccessClient not connected"); _pendingWrites.TryRemove(itemHandle, out _);
var itemHandle = ResolveItem(fullReference); }
_proxy.Write(_connectionHandle, itemHandle, value, securityClassification); }
});
public async Task SubscribeAsync(string fullReference, Action<string, Vtq> callback) public async Task SubscribeAsync(string fullReference, Action<string, Vtq> callback)
{ {
@@ -148,6 +355,9 @@ public sealed class MxAccessClient : IDisposable
{ {
if (!_handleToAddress.TryGetValue(phItemHandle, out var fullRef)) return; if (!_handleToAddress.TryGetValue(phItemHandle, out var fullRef)) return;
// Liveness: any data-change event is proof the connection is alive.
_lastObservedActivityUtc = DateTime.UtcNow;
var ts = pftItemTimeStamp is DateTime dt ? dt.ToUniversalTime() : DateTime.UtcNow; var ts = pftItemTimeStamp is DateTime dt ? dt.ToUniversalTime() : DateTime.UtcNow;
var quality = (byte)Math.Min(255, Math.Max(0, pwItemQuality)); var quality = (byte)Math.Min(255, Math.Max(0, pwItemQuality));
var vtq = new Vtq(pvItemValue, ts, quality); var vtq = new Vtq(pvItemValue, ts, quality);
@@ -169,10 +379,30 @@ public sealed class MxAccessClient : IDisposable
public void Dispose() public void Dispose()
{ {
_disposed = true;
_monitorCts?.Cancel();
try { DisconnectAsync().GetAwaiter().GetResult(); } try { DisconnectAsync().GetAwaiter().GetResult(); }
catch { /* swallow */ } catch { /* swallow */ }
_proxy.OnDataChange -= OnDataChange; _proxy.OnDataChange -= OnDataChange;
_proxy.OnWriteComplete -= OnWriteComplete; _proxy.OnWriteComplete -= OnWriteComplete;
_monitorCts?.Dispose();
} }
} }
/// <summary>
/// Tunables for <see cref="MxAccessClient"/>'s reconnect monitor. Defaults match the v1
/// monitor's polling cadence so behavior is consistent across the lift.
/// </summary>
public sealed class MxAccessClientOptions
{
/// <summary>Whether to start the background monitor at connect time.</summary>
public bool AutoReconnect { get; init; } = true;
/// <summary>How often the monitor wakes up to check liveness.</summary>
public TimeSpan MonitorInterval { get; init; } = TimeSpan.FromSeconds(5);
/// <summary>If no data-change activity in this window, the monitor probes the connection.</summary>
public TimeSpan StaleThreshold { get; init; } = TimeSpan.FromSeconds(60);
}

View File

@@ -0,0 +1,20 @@
using System;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.MxAccess;
/// <summary>
/// Fired by <see cref="MxAccessClient.SubscriptionReplayFailed"/> when a previously-active
/// subscription fails to be restored after a reconnect. The backend should treat the tag as
/// unhealthy until the next successful resubscribe.
/// </summary>
public sealed class SubscriptionReplayFailedEventArgs : EventArgs
{
public SubscriptionReplayFailedEventArgs(string tagReference, Exception exception)
{
TagReference = tagReference;
Exception = exception;
}
public string TagReference { get; }
public Exception Exception { get; }
}

View File

@@ -4,8 +4,11 @@ using System.Linq;
using System.Threading; using System.Threading;
using System.Threading.Tasks; using System.Threading.Tasks;
using MessagePack; using MessagePack;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.Alarms;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.Galaxy; using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.Galaxy;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.Historian;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.MxAccess; using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.MxAccess;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.Stability;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared.Contracts; using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared.Contracts;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend; namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend;
@@ -18,22 +21,114 @@ namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend;
/// MxAccess <c>AlarmExtension</c> primitives but the wire-up is also Phase 2 follow-up /// MxAccess <c>AlarmExtension</c> primitives but the wire-up is also Phase 2 follow-up
/// (the v1 alarm subsystem is its own subtree). /// (the v1 alarm subsystem is its own subtree).
/// </summary> /// </summary>
public sealed class MxAccessGalaxyBackend : IGalaxyBackend public sealed class MxAccessGalaxyBackend : IGalaxyBackend, IDisposable
{ {
private readonly GalaxyRepository _repository; private readonly GalaxyRepository _repository;
private readonly MxAccessClient _mx; private readonly MxAccessClient _mx;
private readonly IHistorianDataSource? _historian;
private long _nextSessionId; private long _nextSessionId;
private long _nextSubscriptionId; private long _nextSubscriptionId;
// Active SubscriptionId → MXAccess full reference list — so Unsubscribe can find them. // Active SubscriptionId → MXAccess full reference list — so Unsubscribe can find them.
private readonly System.Collections.Concurrent.ConcurrentDictionary<long, IReadOnlyList<string>> _subs = new(); private readonly System.Collections.Concurrent.ConcurrentDictionary<long, IReadOnlyList<string>> _subs = new();
// Reverse lookup: tag reference → subscription IDs subscribed to it (one tag may belong to many).
private readonly System.Collections.Concurrent.ConcurrentDictionary<string, System.Collections.Concurrent.ConcurrentBag<long>>
_refToSubs = new(System.StringComparer.OrdinalIgnoreCase);
public MxAccessGalaxyBackend(GalaxyRepository repository, MxAccessClient mx) public event System.EventHandler<OnDataChangeNotification>? OnDataChange;
public event System.EventHandler<GalaxyAlarmEvent>? OnAlarmEvent;
public event System.EventHandler<HostConnectivityStatus>? OnHostStatusChanged;
private readonly System.EventHandler<bool> _onConnectionStateChanged;
private readonly GalaxyRuntimeProbeManager _probeManager;
private readonly System.EventHandler<HostStateTransition> _onProbeStateChanged;
private readonly GalaxyAlarmTracker _alarmTracker;
private readonly System.EventHandler<AlarmTransition> _onAlarmTransition;
// Cached during DiscoverAsync so SubscribeAlarmsAsync knows which attributes to advise.
// One entry per IsAlarm=true attribute in the last discovered hierarchy.
private readonly System.Collections.Concurrent.ConcurrentBag<string> _discoveredAlarmTags = new();
public MxAccessGalaxyBackend(GalaxyRepository repository, MxAccessClient mx, IHistorianDataSource? historian = null)
{ {
_repository = repository; _repository = repository;
_mx = mx; _mx = mx;
_historian = historian;
// PR 8: gateway-level host-status push. When the MXAccess COM proxy transitions
// connected↔disconnected, raise OnHostStatusChanged with a synthetic host entry named
// after the Wonderware client identity so the Admin UI surfaces top-level transport
// health even before per-platform/per-engine probing lands (deferred to a later PR that
// ports v1's GalaxyRuntimeProbeManager with ScanState subscriptions).
_onConnectionStateChanged = (_, connected) =>
{
OnHostStatusChanged?.Invoke(this, new HostConnectivityStatus
{
HostName = _mx.ClientName,
RuntimeStatus = connected ? "Running" : "Stopped",
LastObservedUtcUnixMs = DateTimeOffset.UtcNow.ToUnixTimeMilliseconds(),
});
};
_mx.ConnectionStateChanged += _onConnectionStateChanged;
// PR 13: per-platform runtime probes. ScanState subscriptions fire OnProbeCallback,
// which runs the state machine and raises StateChanged on transitions we care about.
// We forward each transition through the same OnHostStatusChanged IPC event that the
// gateway-level ConnectionStateChanged uses — tagged with the platform's TagName so the
// Admin UI can show per-host health independently from the top-level transport status.
_probeManager = new GalaxyRuntimeProbeManager(
subscribe: (probe, cb) => _mx.SubscribeAsync(probe, cb),
unsubscribe: probe => _mx.UnsubscribeAsync(probe));
_onProbeStateChanged = (_, t) =>
{
OnHostStatusChanged?.Invoke(this, new HostConnectivityStatus
{
HostName = t.TagName,
RuntimeStatus = t.NewState switch
{
HostRuntimeState.Running => "Running",
HostRuntimeState.Stopped => "Stopped",
_ => "Unknown",
},
LastObservedUtcUnixMs = new DateTimeOffset(t.AtUtc, TimeSpan.Zero).ToUnixTimeMilliseconds(),
});
};
_probeManager.StateChanged += _onProbeStateChanged;
// PR 14: alarm subsystem. Per IsAlarm=true attribute discovered, subscribe to the four
// alarm-state attributes (.InAlarm/.Priority/.DescAttrName/.Acked), track lifecycle,
// and raise GalaxyAlarmEvent on transitions — forwarded through the existing
// OnAlarmEvent IPC event that the PR 4 ConnectionSink already wires into AlarmEvent frames.
_alarmTracker = new GalaxyAlarmTracker(
subscribe: (tag, cb) => _mx.SubscribeAsync(tag, cb),
unsubscribe: tag => _mx.UnsubscribeAsync(tag),
write: (tag, v) => _mx.WriteAsync(tag, v));
_onAlarmTransition = (_, t) => OnAlarmEvent?.Invoke(this, new GalaxyAlarmEvent
{
EventId = Guid.NewGuid().ToString("N"),
ObjectTagName = t.AlarmTag,
AlarmName = t.AlarmTag,
Severity = t.Priority,
StateTransition = t.Transition switch
{
AlarmStateTransition.Active => "Active",
AlarmStateTransition.Acknowledged => "Acknowledged",
AlarmStateTransition.Inactive => "Inactive",
_ => "Unknown",
},
Message = t.DescAttrName ?? t.AlarmTag,
UtcUnixMs = new DateTimeOffset(t.AtUtc, TimeSpan.Zero).ToUnixTimeMilliseconds(),
});
_alarmTracker.TransitionRaised += _onAlarmTransition;
} }
/// <summary>
/// Exposed for tests. Production flow: DiscoverAsync completes → backend calls
/// <c>SyncProbesAsync</c> with the runtime hosts (WinPlatform + AppEngine gobjects) to
/// advise ScanState per host.
/// </summary>
internal GalaxyRuntimeProbeManager ProbeManager => _probeManager;
public async Task<OpenSessionResponse> OpenSessionAsync(OpenSessionRequest req, CancellationToken ct) public async Task<OpenSessionResponse> OpenSessionAsync(OpenSessionRequest req, CancellationToken ct)
{ {
try try
@@ -73,6 +168,34 @@ public sealed class MxAccessGalaxyBackend : IGalaxyBackend
Attributes = attrsByGobject.TryGetValue(o.GobjectId, out var a) ? a : Array.Empty<GalaxyAttributeInfo>(), Attributes = attrsByGobject.TryGetValue(o.GobjectId, out var a) ? a : Array.Empty<GalaxyAttributeInfo>(),
}).ToArray(); }).ToArray();
// PR 14: cache alarm-bearing attribute full refs so SubscribeAlarmsAsync can advise
// them on demand. Format matches the Galaxy reference grammar <tag>.<attr>.
var freshAlarmTags = attributes
.Where(a => a.IsAlarm)
.Select(a => nameByGobject.TryGetValue(a.GobjectId, out var tn)
? tn + "." + a.AttributeName
: null)
.Where(s => !string.IsNullOrWhiteSpace(s))
.Cast<string>()
.ToArray();
while (_discoveredAlarmTags.TryTake(out _)) { }
foreach (var t in freshAlarmTags) _discoveredAlarmTags.Add(t);
// PR 13: Sync the per-platform probe manager against the just-discovered hierarchy
// so ScanState subscriptions track the current runtime set. Best-effort — probe
// failures don't block Discover from returning, since the gateway-level signal from
// MxAccessClient.ConnectionStateChanged still flows and the Admin UI degrades to
// that level if any per-host probe couldn't advise.
try
{
var targets = hierarchy
.Where(o => o.CategoryId == GalaxyRuntimeProbeManager.CategoryWinPlatform
|| o.CategoryId == GalaxyRuntimeProbeManager.CategoryAppEngine)
.Select(o => new HostProbeTarget(o.TagName, o.CategoryId));
await _probeManager.SyncAsync(targets).ConfigureAwait(false);
}
catch { /* swallow — Discover succeeded; probes are a diagnostic enrichment */ }
return new DiscoverHierarchyResponse { Success = true, Objects = objects }; return new DiscoverHierarchyResponse { Success = true, Objects = objects };
} }
catch (Exception ex) catch (Exception ex)
@@ -120,8 +243,13 @@ public sealed class MxAccessGalaxyBackend : IGalaxyBackend
? null ? null
: MessagePackSerializer.Deserialize<object>(w.ValueBytes); : MessagePackSerializer.Deserialize<object>(w.ValueBytes);
await _mx.WriteAsync(w.TagReference, value!); var ok = await _mx.WriteAsync(w.TagReference, value!);
results.Add(new WriteValueResult { TagReference = w.TagReference, StatusCode = 0 }); results.Add(new WriteValueResult
{
TagReference = w.TagReference,
StatusCode = ok ? 0u : 0x80020000u, // Good or Bad_InternalError
Error = ok ? null : "MXAccess runtime reported write failure",
});
} }
catch (Exception ex) catch (Exception ex)
{ {
@@ -137,12 +265,16 @@ public sealed class MxAccessGalaxyBackend : IGalaxyBackend
try try
{ {
// For each requested tag, register a subscription that publishes back via the
// shared MXAccess data-change handler. The OnDataChange push frame to the Proxy
// is wired in the upcoming subscription-push pass; for now the value is captured
// for the first ReadAsync to hit it (so the subscribe surface itself is functional).
foreach (var tag in req.TagReferences) foreach (var tag in req.TagReferences)
await _mx.SubscribeAsync(tag, (_, __) => { /* push-frame plumbing in next iteration */ }); {
_refToSubs.AddOrUpdate(tag,
_ => new System.Collections.Concurrent.ConcurrentBag<long> { sid },
(_, bag) => { bag.Add(sid); return bag; });
// The MXAccess SubscribeAsync only takes one callback per tag; the same callback
// fires for every active subscription of that tag — we fan out by SubscriptionId.
await _mx.SubscribeAsync(tag, OnTagValueChanged);
}
_subs[sid] = req.TagReferences; _subs[sid] = req.TagReferences;
return new SubscribeResponse { Success = true, SubscriptionId = sid, ActualIntervalMs = req.RequestedIntervalMs }; return new SubscribeResponse { Success = true, SubscriptionId = sid, ActualIntervalMs = req.RequestedIntervalMs };
@@ -157,23 +289,255 @@ public sealed class MxAccessGalaxyBackend : IGalaxyBackend
{ {
if (!_subs.TryRemove(req.SubscriptionId, out var refs)) return; if (!_subs.TryRemove(req.SubscriptionId, out var refs)) return;
foreach (var r in refs) foreach (var r in refs)
{
// Drop this subscription from the reverse map; only unsubscribe from MXAccess if no
// other subscription is still listening (multiple Proxy subs may share a tag).
_refToSubs.TryGetValue(r, out var bag);
if (bag is not null)
{
var remaining = new System.Collections.Concurrent.ConcurrentBag<long>(
bag.Where(id => id != req.SubscriptionId));
if (remaining.IsEmpty)
{
_refToSubs.TryRemove(r, out _);
await _mx.UnsubscribeAsync(r); await _mx.UnsubscribeAsync(r);
} }
else
{
_refToSubs[r] = remaining;
}
}
}
}
public Task SubscribeAlarmsAsync(AlarmSubscribeRequest req, CancellationToken ct) => Task.CompletedTask; /// <summary>
public Task AcknowledgeAlarmAsync(AlarmAckRequest req, CancellationToken ct) => Task.CompletedTask; /// Fires for every value change on any subscribed Galaxy attribute. Wraps the value in
/// a <see cref="GalaxyDataValue"/> and raises <see cref="OnDataChange"/> once per
/// subscription that includes this tag — the IPC sink translates that into outbound
/// <c>OnDataChangeNotification</c> frames.
/// </summary>
private void OnTagValueChanged(string fullReference, MxAccess.Vtq vtq)
{
if (!_refToSubs.TryGetValue(fullReference, out var bag) || bag.IsEmpty) return;
public Task<HistoryReadResponse> HistoryReadAsync(HistoryReadRequest req, CancellationToken ct) var wireValue = ToWire(fullReference, vtq);
=> Task.FromResult(new HistoryReadResponse // Emit one notification per active SubscriptionId for this tag — the Proxy fans out to
// each ISubscribable consumer based on the SubscriptionId in the payload.
foreach (var sid in bag.Distinct())
{
OnDataChange?.Invoke(this, new OnDataChangeNotification
{
SubscriptionId = sid,
Values = new[] { wireValue },
});
}
}
/// <summary>
/// PR 14: advise every alarm-bearing attribute's 4-attr quartet. Best-effort per-alarm —
/// a subscribe failure on one alarm doesn't abort the whole call, since operators prefer
/// partial alarm coverage to none. Idempotent on repeat calls (tracker internally
/// skips already-tracked alarms).
/// </summary>
public async Task SubscribeAlarmsAsync(AlarmSubscribeRequest req, CancellationToken ct)
{
foreach (var tag in _discoveredAlarmTags)
{
try { await _alarmTracker.TrackAsync(tag).ConfigureAwait(false); }
catch { /* swallow per-alarm — tracker rolls back its own state on failure */ }
}
}
/// <summary>
/// PR 14: route operator ack through the tracker's AckMsg write path. EventId on the
/// incoming request maps directly to the alarm full reference (Proxy-side naming
/// convention from GalaxyProxyDriver.RaiseAlarmEvent → ev.EventId).
/// </summary>
public async Task AcknowledgeAlarmAsync(AlarmAckRequest req, CancellationToken ct)
{
// EventId carries a per-transition Guid.ToString("N"); there's no reverse map from
// event id to alarm tag yet, so v1's convention (ack targets the condition) is matched
// by reading the alarm name from the Comment envelope: v1 packed "<tag>|<comment>".
// Until the Proxy is updated to send the alarm tag separately, fall back to treating
// the EventId as the alarm tag — Client CLI passes it through unchanged.
var tag = req.EventId;
if (!string.IsNullOrWhiteSpace(tag))
{
try { await _alarmTracker.AcknowledgeAsync(tag, req.Comment ?? string.Empty).ConfigureAwait(false); }
catch { /* swallow — ack failures surface via MxAccessClient.WriteAsync logs */ }
}
}
public async Task<HistoryReadResponse> HistoryReadAsync(HistoryReadRequest req, CancellationToken ct)
{
if (_historian is null)
return new HistoryReadResponse
{ {
Success = false, Success = false,
Error = "Wonderware Historian plugin loader not yet wired (Phase 2 Task B.1.h follow-up)", Error = "Historian disabled — no OTOPCUA_HISTORIAN_ENABLED configuration",
Tags = Array.Empty<HistoryTagValues>(), Tags = Array.Empty<HistoryTagValues>(),
};
var start = DateTimeOffset.FromUnixTimeMilliseconds(req.StartUtcUnixMs).UtcDateTime;
var end = DateTimeOffset.FromUnixTimeMilliseconds(req.EndUtcUnixMs).UtcDateTime;
var tags = new List<HistoryTagValues>(req.TagReferences.Length);
try
{
foreach (var reference in req.TagReferences)
{
var samples = await _historian.ReadRawAsync(reference, start, end, (int)req.MaxValuesPerTag, ct).ConfigureAwait(false);
tags.Add(new HistoryTagValues
{
TagReference = reference,
Values = samples.Select(s => ToWire(reference, s)).ToArray(),
}); });
}
return new HistoryReadResponse { Success = true, Tags = tags.ToArray() };
}
catch (OperationCanceledException) { throw; }
catch (Exception ex)
{
return new HistoryReadResponse
{
Success = false,
Error = $"Historian read failed: {ex.Message}",
Tags = tags.ToArray(),
};
}
}
public async Task<HistoryReadProcessedResponse> HistoryReadProcessedAsync(
HistoryReadProcessedRequest req, CancellationToken ct)
{
if (_historian is null)
return new HistoryReadProcessedResponse
{
Success = false,
Error = "Historian disabled — no OTOPCUA_HISTORIAN_ENABLED configuration",
Values = Array.Empty<GalaxyDataValue>(),
};
if (req.IntervalMs <= 0)
return new HistoryReadProcessedResponse
{
Success = false,
Error = "HistoryReadProcessed requires IntervalMs > 0",
Values = Array.Empty<GalaxyDataValue>(),
};
var start = DateTimeOffset.FromUnixTimeMilliseconds(req.StartUtcUnixMs).UtcDateTime;
var end = DateTimeOffset.FromUnixTimeMilliseconds(req.EndUtcUnixMs).UtcDateTime;
try
{
var samples = await _historian.ReadAggregateAsync(
req.TagReference, start, end, req.IntervalMs, req.AggregateColumn, ct).ConfigureAwait(false);
var wire = samples.Select(s => ToWire(req.TagReference, s)).ToArray();
return new HistoryReadProcessedResponse { Success = true, Values = wire };
}
catch (OperationCanceledException) { throw; }
catch (Exception ex)
{
return new HistoryReadProcessedResponse
{
Success = false,
Error = $"Historian aggregate read failed: {ex.Message}",
Values = Array.Empty<GalaxyDataValue>(),
};
}
}
public async Task<HistoryReadAtTimeResponse> HistoryReadAtTimeAsync(
HistoryReadAtTimeRequest req, CancellationToken ct)
{
if (_historian is null)
return new HistoryReadAtTimeResponse
{
Success = false,
Error = "Historian disabled — no OTOPCUA_HISTORIAN_ENABLED configuration",
Values = Array.Empty<GalaxyDataValue>(),
};
if (req.TimestampsUtcUnixMs.Length == 0)
return new HistoryReadAtTimeResponse { Success = true, Values = Array.Empty<GalaxyDataValue>() };
var timestamps = req.TimestampsUtcUnixMs
.Select(ms => DateTimeOffset.FromUnixTimeMilliseconds(ms).UtcDateTime)
.ToArray();
try
{
var samples = await _historian.ReadAtTimeAsync(req.TagReference, timestamps, ct).ConfigureAwait(false);
var wire = samples.Select(s => ToWire(req.TagReference, s)).ToArray();
return new HistoryReadAtTimeResponse { Success = true, Values = wire };
}
catch (OperationCanceledException) { throw; }
catch (Exception ex)
{
return new HistoryReadAtTimeResponse
{
Success = false,
Error = $"Historian at-time read failed: {ex.Message}",
Values = Array.Empty<GalaxyDataValue>(),
};
}
}
public async Task<HistoryReadEventsResponse> HistoryReadEventsAsync(
HistoryReadEventsRequest req, CancellationToken ct)
{
if (_historian is null)
return new HistoryReadEventsResponse
{
Success = false,
Error = "Historian disabled — no OTOPCUA_HISTORIAN_ENABLED configuration",
Events = Array.Empty<GalaxyHistoricalEvent>(),
};
var start = DateTimeOffset.FromUnixTimeMilliseconds(req.StartUtcUnixMs).UtcDateTime;
var end = DateTimeOffset.FromUnixTimeMilliseconds(req.EndUtcUnixMs).UtcDateTime;
try
{
var events = await _historian.ReadEventsAsync(req.SourceName, start, end, req.MaxEvents, ct).ConfigureAwait(false);
var wire = events.Select(e => new GalaxyHistoricalEvent
{
EventId = e.Id.ToString(),
SourceName = e.Source,
EventTimeUtcUnixMs = new DateTimeOffset(DateTime.SpecifyKind(e.EventTime, DateTimeKind.Utc), TimeSpan.Zero).ToUnixTimeMilliseconds(),
ReceivedTimeUtcUnixMs = new DateTimeOffset(DateTime.SpecifyKind(e.ReceivedTime, DateTimeKind.Utc), TimeSpan.Zero).ToUnixTimeMilliseconds(),
DisplayText = e.DisplayText,
Severity = e.Severity,
}).ToArray();
return new HistoryReadEventsResponse { Success = true, Events = wire };
}
catch (OperationCanceledException) { throw; }
catch (Exception ex)
{
return new HistoryReadEventsResponse
{
Success = false,
Error = $"Historian event read failed: {ex.Message}",
Events = Array.Empty<GalaxyHistoricalEvent>(),
};
}
}
public Task<RecycleStatusResponse> RecycleAsync(RecycleHostRequest req, CancellationToken ct) public Task<RecycleStatusResponse> RecycleAsync(RecycleHostRequest req, CancellationToken ct)
=> Task.FromResult(new RecycleStatusResponse { Accepted = true, GraceSeconds = 15 }); => Task.FromResult(new RecycleStatusResponse { Accepted = true, GraceSeconds = 15 });
public void Dispose()
{
_alarmTracker.TransitionRaised -= _onAlarmTransition;
_alarmTracker.Dispose();
_probeManager.StateChanged -= _onProbeStateChanged;
_probeManager.Dispose();
_mx.ConnectionStateChanged -= _onConnectionStateChanged;
_historian?.Dispose();
}
private static GalaxyDataValue ToWire(string reference, Vtq vtq) => new() private static GalaxyDataValue ToWire(string reference, Vtq vtq) => new()
{ {
TagReference = reference, TagReference = reference,
@@ -184,6 +548,39 @@ public sealed class MxAccessGalaxyBackend : IGalaxyBackend
ServerTimestampUtcUnixMs = DateTimeOffset.UtcNow.ToUnixTimeMilliseconds(), ServerTimestampUtcUnixMs = DateTimeOffset.UtcNow.ToUnixTimeMilliseconds(),
}; };
/// <summary>
/// Maps a <see cref="HistorianSample"/> (raw historian row, OPC-UA-free) to the IPC wire
/// shape. The Proxy decodes the MessagePack value and maps <see cref="HistorianSample.Quality"/>
/// through <c>QualityMapper</c> on its side of the pipe — we keep the raw byte here so
/// rich OPC DA status codes (e.g. <c>BadNotConnected</c>, <c>UncertainSubNormal</c>) survive
/// the hop intact.
/// </summary>
private static GalaxyDataValue ToWire(string reference, HistorianSample sample) => new()
{
TagReference = reference,
ValueBytes = sample.Value is null ? null : MessagePackSerializer.Serialize(sample.Value),
ValueMessagePackType = 0,
StatusCode = HistorianQualityMapper.Map(sample.Quality),
SourceTimestampUtcUnixMs = new DateTimeOffset(sample.TimestampUtc, TimeSpan.Zero).ToUnixTimeMilliseconds(),
ServerTimestampUtcUnixMs = DateTimeOffset.UtcNow.ToUnixTimeMilliseconds(),
};
/// <summary>
/// Maps a <see cref="HistorianAggregateSample"/> (one aggregate bucket) to the IPC wire
/// shape. A null <see cref="HistorianAggregateSample.Value"/> means the aggregate was
/// unavailable for the bucket — the Proxy translates that to OPC UA <c>BadNoData</c>.
/// </summary>
private static GalaxyDataValue ToWire(string reference, HistorianAggregateSample sample) => new()
{
TagReference = reference,
ValueBytes = sample.Value is null ? null : MessagePackSerializer.Serialize(sample.Value.Value),
ValueMessagePackType = 0,
StatusCode = sample.Value is null ? 0x800E0000u /* BadNoData */ : 0x00000000u,
SourceTimestampUtcUnixMs = new DateTimeOffset(sample.TimestampUtc, TimeSpan.Zero).ToUnixTimeMilliseconds(),
ServerTimestampUtcUnixMs = DateTimeOffset.UtcNow.ToUnixTimeMilliseconds(),
};
private static GalaxyAttributeInfo MapAttribute(GalaxyAttributeRow row) => new() private static GalaxyAttributeInfo MapAttribute(GalaxyAttributeRow row) => new()
{ {
AttributeName = row.AttributeName, AttributeName = row.AttributeName,
@@ -192,6 +589,7 @@ public sealed class MxAccessGalaxyBackend : IGalaxyBackend
ArrayDim = row.ArrayDimension is int d and > 0 ? (uint)d : null, ArrayDim = row.ArrayDimension is int d and > 0 ? (uint)d : null,
SecurityClassification = row.SecurityClassification, SecurityClassification = row.SecurityClassification,
IsHistorized = row.IsHistorized, IsHistorized = row.IsHistorized,
IsAlarm = row.IsAlarm,
}; };
private static string MapCategory(int categoryId) => categoryId switch private static string MapCategory(int categoryId) => categoryId switch

View File

@@ -0,0 +1,273 @@
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.MxAccess;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.Stability;
/// <summary>
/// Per-platform + per-AppEngine runtime probe. Subscribes to <c>&lt;TagName&gt;.ScanState</c>
/// for each $WinPlatform and $AppEngine gobject, tracks Unknown → Running → Stopped
/// transitions, and fires <see cref="StateChanged"/> so <see cref="Backend.MxAccessGalaxyBackend"/>
/// can forward per-host events through the existing IPC <c>OnHostStatusChanged</c> event.
/// Pure-logic state machine with an injected clock so it's deterministically testable —
/// port of v1 <c>GalaxyRuntimeProbeManager</c> without the OPC UA node-manager coupling.
/// </summary>
/// <remarks>
/// State machine rules (documented in v1's <c>runtimestatus.md</c> and preserved here):
/// <list type="bullet">
/// <item><c>ScanState</c> is on-change-only — a stably-Running host may go hours without a
/// callback. Running → Stopped is driven by an explicit <c>ScanState=false</c> callback,
/// never by starvation.</item>
/// <item>Unknown → Running is a startup transition and does NOT fire StateChanged (would
/// paint every host as "just recovered" at startup, which is noise).</item>
/// <item>Stopped → Running and Running → Stopped fire StateChanged. Unknown → Stopped
/// fires StateChanged because that's a first-known-bad signal operators need.</item>
/// <item>All public methods are thread-safe. Callbacks fire outside the internal lock to
/// avoid lock inversion with caller-owned state.</item>
/// </list>
/// </remarks>
public sealed class GalaxyRuntimeProbeManager : IDisposable
{
public const int CategoryWinPlatform = 1;
public const int CategoryAppEngine = 3;
public const string ProbeAttribute = ".ScanState";
private readonly Func<DateTime> _clock;
private readonly Func<string, Action<string, Vtq>, Task> _subscribe;
private readonly Func<string, Task> _unsubscribe;
private readonly object _lock = new();
// probe tag → per-host state
private readonly Dictionary<string, HostProbeState> _byProbe = new(StringComparer.OrdinalIgnoreCase);
// tag name → probe tag (for reverse lookup on the desired-set diff)
private readonly Dictionary<string, string> _probeByTagName = new(StringComparer.OrdinalIgnoreCase);
private bool _disposed;
/// <summary>
/// Fires on every state transition that operators should react to. See class remarks
/// for the rules on which transitions fire.
/// </summary>
public event EventHandler<HostStateTransition>? StateChanged;
public GalaxyRuntimeProbeManager(
Func<string, Action<string, Vtq>, Task> subscribe,
Func<string, Task> unsubscribe)
: this(subscribe, unsubscribe, () => DateTime.UtcNow) { }
internal GalaxyRuntimeProbeManager(
Func<string, Action<string, Vtq>, Task> subscribe,
Func<string, Task> unsubscribe,
Func<DateTime> clock)
{
_subscribe = subscribe ?? throw new ArgumentNullException(nameof(subscribe));
_unsubscribe = unsubscribe ?? throw new ArgumentNullException(nameof(unsubscribe));
_clock = clock ?? throw new ArgumentNullException(nameof(clock));
}
/// <summary>Number of probes currently advised. Test/dashboard hook.</summary>
public int ActiveProbeCount
{
get { lock (_lock) return _byProbe.Count; }
}
/// <summary>
/// Snapshot every currently-tracked host's state. One entry per probe.
/// </summary>
public IReadOnlyList<HostProbeSnapshot> SnapshotStates()
{
lock (_lock)
{
return _byProbe.Select(kv => new HostProbeSnapshot(
TagName: kv.Value.TagName,
State: kv.Value.State,
LastChangedUtc: kv.Value.LastStateChangeUtc)).ToList();
}
}
/// <summary>
/// Query the current runtime state for <paramref name="tagName"/>. Returns
/// <see cref="HostRuntimeState.Unknown"/> when the host is not tracked.
/// </summary>
public HostRuntimeState GetState(string tagName)
{
lock (_lock)
{
if (_probeByTagName.TryGetValue(tagName, out var probe)
&& _byProbe.TryGetValue(probe, out var state))
return state.State;
return HostRuntimeState.Unknown;
}
}
/// <summary>
/// Diff the desired host set (filtered $WinPlatform / $AppEngine from the latest Discover)
/// against the currently-tracked set and advise / unadvise as needed. Idempotent:
/// calling twice with the same set does nothing.
/// </summary>
public async Task SyncAsync(IEnumerable<HostProbeTarget> desiredHosts)
{
if (_disposed) return;
var desired = desiredHosts
.Where(h => !string.IsNullOrWhiteSpace(h.TagName))
.ToDictionary(h => h.TagName, StringComparer.OrdinalIgnoreCase);
List<string> toAdvise;
List<string> toUnadvise;
lock (_lock)
{
toAdvise = desired.Keys
.Where(tag => !_probeByTagName.ContainsKey(tag))
.ToList();
toUnadvise = _probeByTagName.Keys
.Where(tag => !desired.ContainsKey(tag))
.Select(tag => _probeByTagName[tag])
.ToList();
foreach (var tag in toAdvise)
{
var probe = tag + ProbeAttribute;
_probeByTagName[tag] = probe;
_byProbe[probe] = new HostProbeState
{
TagName = tag,
State = HostRuntimeState.Unknown,
LastStateChangeUtc = _clock(),
};
}
foreach (var probe in toUnadvise)
{
_byProbe.Remove(probe);
}
foreach (var removedTag in _probeByTagName.Keys.Where(t => !desired.ContainsKey(t)).ToList())
{
_probeByTagName.Remove(removedTag);
}
}
foreach (var tag in toAdvise)
{
var probe = tag + ProbeAttribute;
try
{
await _subscribe(probe, OnProbeCallback);
}
catch
{
// Rollback on subscribe failure so a later Tick can't transition a never-advised
// probe into a false Stopped state. Callers can re-Sync later to retry.
lock (_lock)
{
_byProbe.Remove(probe);
_probeByTagName.Remove(tag);
}
}
}
foreach (var probe in toUnadvise)
{
try { await _unsubscribe(probe); } catch { /* best-effort cleanup */ }
}
}
/// <summary>
/// Public entry point for tests and internal callbacks. Production flow: MxAccessClient's
/// SubscribeAsync delivers VTQ updates through the callback wired in <see cref="SyncAsync"/>,
/// which calls this method under the lock to update state and fires
/// <see cref="StateChanged"/> outside the lock for any transition that matters.
/// </summary>
public void OnProbeCallback(string probeTag, Vtq vtq)
{
if (_disposed) return;
HostStateTransition? transition = null;
lock (_lock)
{
if (!_byProbe.TryGetValue(probeTag, out var state)) return;
var isRunning = vtq.Quality >= 192 && vtq.Value is bool b && b;
var now = _clock();
var previous = state.State;
state.LastCallbackUtc = now;
if (isRunning)
{
state.GoodUpdateCount++;
if (previous != HostRuntimeState.Running)
{
state.State = HostRuntimeState.Running;
state.LastStateChangeUtc = now;
if (previous == HostRuntimeState.Stopped)
{
transition = new HostStateTransition(state.TagName, previous, HostRuntimeState.Running, now);
}
}
}
else
{
state.FailureCount++;
if (previous != HostRuntimeState.Stopped)
{
state.State = HostRuntimeState.Stopped;
state.LastStateChangeUtc = now;
transition = new HostStateTransition(state.TagName, previous, HostRuntimeState.Stopped, now);
}
}
}
if (transition is { } t)
{
StateChanged?.Invoke(this, t);
}
}
public void Dispose()
{
if (_disposed) return;
_disposed = true;
lock (_lock)
{
_byProbe.Clear();
_probeByTagName.Clear();
}
}
private sealed class HostProbeState
{
public string TagName { get; set; } = "";
public HostRuntimeState State { get; set; }
public DateTime LastStateChangeUtc { get; set; }
public DateTime? LastCallbackUtc { get; set; }
public long GoodUpdateCount { get; set; }
public long FailureCount { get; set; }
}
}
public enum HostRuntimeState
{
Unknown,
Running,
Stopped,
}
public sealed record HostStateTransition(
string TagName,
HostRuntimeState OldState,
HostRuntimeState NewState,
DateTime AtUtc);
public sealed record HostProbeSnapshot(
string TagName,
HostRuntimeState State,
DateTime LastChangedUtc);
public readonly record struct HostProbeTarget(string TagName, int CategoryId)
{
public bool IsRuntimeHost =>
CategoryId == GalaxyRuntimeProbeManager.CategoryWinPlatform
|| CategoryId == GalaxyRuntimeProbeManager.CategoryAppEngine;
}

View File

@@ -15,6 +15,13 @@ public sealed class StubGalaxyBackend : IGalaxyBackend
private long _nextSessionId; private long _nextSessionId;
private long _nextSubscriptionId; private long _nextSubscriptionId;
// Stub backend never raises events — implements the interface members for symmetry.
#pragma warning disable CS0067
public event System.EventHandler<OnDataChangeNotification>? OnDataChange;
public event System.EventHandler<GalaxyAlarmEvent>? OnAlarmEvent;
public event System.EventHandler<HostConnectivityStatus>? OnHostStatusChanged;
#pragma warning restore CS0067
public Task<OpenSessionResponse> OpenSessionAsync(OpenSessionRequest req, CancellationToken ct) public Task<OpenSessionResponse> OpenSessionAsync(OpenSessionRequest req, CancellationToken ct)
{ {
var id = Interlocked.Increment(ref _nextSessionId); var id = Interlocked.Increment(ref _nextSessionId);
@@ -78,6 +85,33 @@ public sealed class StubGalaxyBackend : IGalaxyBackend
Tags = System.Array.Empty<HistoryTagValues>(), Tags = System.Array.Empty<HistoryTagValues>(),
}); });
public Task<HistoryReadProcessedResponse> HistoryReadProcessedAsync(
HistoryReadProcessedRequest req, CancellationToken ct)
=> Task.FromResult(new HistoryReadProcessedResponse
{
Success = false,
Error = "stub: MXAccess code lift pending (Phase 2 Task B.1)",
Values = System.Array.Empty<GalaxyDataValue>(),
});
public Task<HistoryReadAtTimeResponse> HistoryReadAtTimeAsync(
HistoryReadAtTimeRequest req, CancellationToken ct)
=> Task.FromResult(new HistoryReadAtTimeResponse
{
Success = false,
Error = "stub: MXAccess code lift pending (Phase 2 Task B.1)",
Values = System.Array.Empty<GalaxyDataValue>(),
});
public Task<HistoryReadEventsResponse> HistoryReadEventsAsync(
HistoryReadEventsRequest req, CancellationToken ct)
=> Task.FromResult(new HistoryReadEventsResponse
{
Success = false,
Error = "stub: MXAccess code lift pending (Phase 2 Task B.1)",
Events = System.Array.Empty<GalaxyHistoricalEvent>(),
});
public Task<RecycleStatusResponse> RecycleAsync(RecycleHostRequest req, CancellationToken ct) public Task<RecycleStatusResponse> RecycleAsync(RecycleHostRequest req, CancellationToken ct)
=> Task.FromResult(new RecycleStatusResponse => Task.FromResult(new RecycleStatusResponse
{ {

View File

@@ -80,6 +80,27 @@ public sealed class GalaxyFrameHandler(IGalaxyBackend backend, ILogger logger) :
await writer.WriteAsync(MessageKind.HistoryReadResponse, resp, ct); await writer.WriteAsync(MessageKind.HistoryReadResponse, resp, ct);
return; return;
} }
case MessageKind.HistoryReadProcessedRequest:
{
var resp = await backend.HistoryReadProcessedAsync(
Deserialize<HistoryReadProcessedRequest>(body), ct);
await writer.WriteAsync(MessageKind.HistoryReadProcessedResponse, resp, ct);
return;
}
case MessageKind.HistoryReadAtTimeRequest:
{
var resp = await backend.HistoryReadAtTimeAsync(
Deserialize<HistoryReadAtTimeRequest>(body), ct);
await writer.WriteAsync(MessageKind.HistoryReadAtTimeResponse, resp, ct);
return;
}
case MessageKind.HistoryReadEventsRequest:
{
var resp = await backend.HistoryReadEventsAsync(
Deserialize<HistoryReadEventsRequest>(body), ct);
await writer.WriteAsync(MessageKind.HistoryReadEventsResponse, resp, ct);
return;
}
case MessageKind.RecycleHostRequest: case MessageKind.RecycleHostRequest:
{ {
var resp = await backend.RecycleAsync(Deserialize<RecycleHostRequest>(body), ct); var resp = await backend.RecycleAsync(Deserialize<RecycleHostRequest>(body), ct);
@@ -99,9 +120,64 @@ public sealed class GalaxyFrameHandler(IGalaxyBackend backend, ILogger logger) :
} }
} }
/// <summary>
/// Subscribes the backend's server-pushed events for the lifetime of the connection.
/// The returned disposable unsubscribes when the connection closes — without it the
/// backend's static event invocation list would accumulate dead writer references and
/// leak memory + raise <see cref="ObjectDisposedException"/> on every push.
/// </summary>
public IDisposable AttachConnection(FrameWriter writer)
{
var sink = new ConnectionSink(backend, writer, logger);
sink.Attach();
return sink;
}
private static T Deserialize<T>(byte[] body) => MessagePackSerializer.Deserialize<T>(body); private static T Deserialize<T>(byte[] body) => MessagePackSerializer.Deserialize<T>(body);
private static Task SendErrorAsync(FrameWriter writer, string code, string message, CancellationToken ct) private static Task SendErrorAsync(FrameWriter writer, string code, string message, CancellationToken ct)
=> writer.WriteAsync(MessageKind.ErrorResponse, => writer.WriteAsync(MessageKind.ErrorResponse,
new ErrorResponse { Code = code, Message = message }, ct); new ErrorResponse { Code = code, Message = message }, ct);
private sealed class ConnectionSink : IDisposable
{
private readonly IGalaxyBackend _backend;
private readonly FrameWriter _writer;
private readonly ILogger _logger;
private EventHandler<OnDataChangeNotification>? _onData;
private EventHandler<GalaxyAlarmEvent>? _onAlarm;
private EventHandler<HostConnectivityStatus>? _onHost;
public ConnectionSink(IGalaxyBackend backend, FrameWriter writer, ILogger logger)
{
_backend = backend; _writer = writer; _logger = logger;
}
public void Attach()
{
_onData = (_, e) => Push(MessageKind.OnDataChangeNotification, e);
_onAlarm = (_, e) => Push(MessageKind.AlarmEvent, e);
_onHost = (_, e) => Push(MessageKind.RuntimeStatusChange,
new RuntimeStatusChangeNotification { Status = e });
_backend.OnDataChange += _onData;
_backend.OnAlarmEvent += _onAlarm;
_backend.OnHostStatusChanged += _onHost;
}
private void Push<T>(MessageKind kind, T payload)
{
// Fire-and-forget — pushes can race with disposal of the writer. We swallow
// ObjectDisposedException because the dispose path will detach this sink shortly.
try { _writer.WriteAsync(kind, payload, CancellationToken.None).GetAwaiter().GetResult(); }
catch (ObjectDisposedException) { }
catch (Exception ex) { _logger.Warning(ex, "ConnectionSink push failed for {Kind}", kind); }
}
public void Dispose()
{
if (_onData is not null) _backend.OnDataChange -= _onData;
if (_onAlarm is not null) _backend.OnAlarmEvent -= _onAlarm;
if (_onHost is not null) _backend.OnHostStatusChanged -= _onHost;
}
}
} }

View File

@@ -98,6 +98,8 @@ public sealed class PipeServer : IDisposable
new HelloAck { Accepted = true, HostName = Environment.MachineName }, new HelloAck { Accepted = true, HostName = Environment.MachineName },
linked.Token).ConfigureAwait(false); linked.Token).ConfigureAwait(false);
using var attachment = handler.AttachConnection(writer);
while (!linked.Token.IsCancellationRequested) while (!linked.Token.IsCancellationRequested)
{ {
var frame = await reader.ReadFrameAsync(linked.Token).ConfigureAwait(false); var frame = await reader.ReadFrameAsync(linked.Token).ConfigureAwait(false);
@@ -157,4 +159,19 @@ public sealed class PipeServer : IDisposable
public interface IFrameHandler public interface IFrameHandler
{ {
Task HandleAsync(MessageKind kind, byte[] body, FrameWriter writer, CancellationToken ct); Task HandleAsync(MessageKind kind, byte[] body, FrameWriter writer, CancellationToken ct);
/// <summary>
/// Called once per accepted connection after the Hello handshake. Lets the handler
/// attach server-pushed event sinks (data-change, alarm, host-status) to the
/// connection's <paramref name="writer"/>. Returns an <see cref="IDisposable"/> the
/// pipe server disposes when the connection closes — backends use it to unsubscribe.
/// Implementations that don't push events can return <see cref="NoopAttachment"/>.
/// </summary>
IDisposable AttachConnection(FrameWriter writer);
public sealed class NoopAttachment : IDisposable
{
public static readonly NoopAttachment Instance = new();
public void Dispose() { }
}
} }

View File

@@ -1,3 +1,4 @@
using System;
using System.Threading; using System.Threading;
using System.Threading.Tasks; using System.Threading.Tasks;
using MessagePack; using MessagePack;
@@ -27,4 +28,6 @@ public sealed class StubFrameHandler : IFrameHandler
new ErrorResponse { Code = "not-implemented", Message = $"Kind {kind} is stubbed — MXAccess lift deferred" }, new ErrorResponse { Code = "not-implemented", Message = $"Kind {kind} is stubbed — MXAccess lift deferred" },
ct); ct);
} }
public IDisposable AttachConnection(FrameWriter writer) => IFrameHandler.NoopAttachment.Instance;
} }

View File

@@ -4,6 +4,7 @@ using System.Threading;
using Serilog; using Serilog;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend; using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.Galaxy; using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.Galaxy;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.Historian;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.MxAccess; using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.MxAccess;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Ipc; using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Ipc;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Sta; using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Sta;
@@ -66,9 +67,11 @@ public static class Program
pump = new StaPump("Galaxy.Sta"); pump = new StaPump("Galaxy.Sta");
pump.WaitForStartedAsync().GetAwaiter().GetResult(); pump.WaitForStartedAsync().GetAwaiter().GetResult();
mx = new MxAccessClient(pump, new MxProxyAdapter(), clientName); mx = new MxAccessClient(pump, new MxProxyAdapter(), clientName);
var historian = BuildHistorianIfEnabled();
backend = new MxAccessGalaxyBackend( backend = new MxAccessGalaxyBackend(
new GalaxyRepository(new GalaxyRepositoryOptions { ConnectionString = zbConn }), new GalaxyRepository(new GalaxyRepositoryOptions { ConnectionString = zbConn }),
mx); mx,
historian);
break; break;
} }
@@ -77,6 +80,7 @@ public static class Program
try { server.RunAsync(handler, cts.Token).GetAwaiter().GetResult(); } try { server.RunAsync(handler, cts.Token).GetAwaiter().GetResult(); }
finally finally
{ {
(backend as IDisposable)?.Dispose();
mx?.Dispose(); mx?.Dispose();
pump?.Dispose(); pump?.Dispose();
} }
@@ -91,4 +95,45 @@ public static class Program
} }
finally { Log.CloseAndFlush(); } finally { Log.CloseAndFlush(); }
} }
/// <summary>
/// Builds a <see cref="HistorianDataSource"/> from the OTOPCUA_HISTORIAN_* environment
/// variables the supervisor passes at spawn time. Returns null when the historian is
/// disabled (default) so <c>MxAccessGalaxyBackend.HistoryReadAsync</c> returns a clear
/// "not configured" error instead of attempting an SDK connection to localhost.
/// </summary>
private static IHistorianDataSource? BuildHistorianIfEnabled()
{
var enabled = Environment.GetEnvironmentVariable("OTOPCUA_HISTORIAN_ENABLED");
if (!string.Equals(enabled, "true", StringComparison.OrdinalIgnoreCase) && enabled != "1")
return null;
var cfg = new HistorianConfiguration
{
Enabled = true,
ServerName = Environment.GetEnvironmentVariable("OTOPCUA_HISTORIAN_SERVER") ?? "localhost",
Port = TryParseInt("OTOPCUA_HISTORIAN_PORT", 32568),
IntegratedSecurity = !string.Equals(Environment.GetEnvironmentVariable("OTOPCUA_HISTORIAN_INTEGRATED"), "false", StringComparison.OrdinalIgnoreCase),
UserName = Environment.GetEnvironmentVariable("OTOPCUA_HISTORIAN_USER"),
Password = Environment.GetEnvironmentVariable("OTOPCUA_HISTORIAN_PASS"),
CommandTimeoutSeconds = TryParseInt("OTOPCUA_HISTORIAN_TIMEOUT_SEC", 30),
MaxValuesPerRead = TryParseInt("OTOPCUA_HISTORIAN_MAX_VALUES", 10000),
FailureCooldownSeconds = TryParseInt("OTOPCUA_HISTORIAN_COOLDOWN_SEC", 60),
};
var servers = Environment.GetEnvironmentVariable("OTOPCUA_HISTORIAN_SERVERS");
if (!string.IsNullOrWhiteSpace(servers))
cfg.ServerNames = new System.Collections.Generic.List<string>(
servers.Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries));
Log.Information("Historian enabled — {NodeCount} configured node(s), port={Port}",
cfg.ServerNames.Count > 0 ? cfg.ServerNames.Count : 1, cfg.Port);
return new HistorianDataSource(cfg);
}
private static int TryParseInt(string envName, int defaultValue)
{
var raw = Environment.GetEnvironmentVariable(envName);
return int.TryParse(raw, out var parsed) ? parsed : defaultValue;
}
} }

View File

@@ -30,11 +30,43 @@
<ProjectReference Include="..\ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared\ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared.csproj"/> <ProjectReference Include="..\ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared\ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared.csproj"/>
</ItemGroup> </ItemGroup>
<ItemGroup>
<InternalsVisibleTo Include="ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Tests"/>
</ItemGroup>
<ItemGroup> <ItemGroup>
<Reference Include="ArchestrA.MxAccess"> <Reference Include="ArchestrA.MxAccess">
<HintPath>..\..\lib\ArchestrA.MxAccess.dll</HintPath> <HintPath>..\..\lib\ArchestrA.MxAccess.dll</HintPath>
<Private>true</Private> <Private>true</Private>
</Reference> </Reference>
<!-- Wonderware Historian SDK — consumed by Backend/Historian/ for HistoryReadAsync.
Previously lived in the v1 Historian.Aveva plugin; folded into Driver.Galaxy.Host
for PR #5 because this host is already Galaxy-specific. -->
<Reference Include="aahClientManaged">
<HintPath>..\..\lib\aahClientManaged.dll</HintPath>
<EmbedInteropTypes>false</EmbedInteropTypes>
</Reference>
<Reference Include="aahClientCommon">
<HintPath>..\..\lib\aahClientCommon.dll</HintPath>
<EmbedInteropTypes>false</EmbedInteropTypes>
</Reference>
</ItemGroup>
<ItemGroup>
<!-- Historian SDK native and satellite DLLs — staged beside the host exe so the
aahClientManaged wrapper can P/Invoke into them without an AssemblyResolve hook. -->
<None Include="..\..\lib\aahClient.dll">
<CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
</None>
<None Include="..\..\lib\Historian.CBE.dll">
<CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
</None>
<None Include="..\..\lib\Historian.DPAPI.dll">
<CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
</None>
<None Include="..\..\lib\ArchestrA.CloudHistorian.Contract.dll">
<CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
</None>
</ItemGroup> </ItemGroup>
<ItemGroup> <ItemGroup>

View File

@@ -114,16 +114,31 @@ public sealed class GalaxyProxyDriver(GalaxyProxyOptions options)
var folder = builder.Folder(obj.ContainedName, obj.ContainedName); var folder = builder.Folder(obj.ContainedName, obj.ContainedName);
foreach (var attr in obj.Attributes) foreach (var attr in obj.Attributes)
{ {
folder.Variable( var fullName = $"{obj.TagName}.{attr.AttributeName}";
var handle = folder.Variable(
attr.AttributeName, attr.AttributeName,
attr.AttributeName, attr.AttributeName,
new DriverAttributeInfo( new DriverAttributeInfo(
FullName: $"{obj.TagName}.{attr.AttributeName}", FullName: fullName,
DriverDataType: MapDataType(attr.MxDataType), DriverDataType: MapDataType(attr.MxDataType),
IsArray: attr.IsArray, IsArray: attr.IsArray,
ArrayDim: attr.ArrayDim, ArrayDim: attr.ArrayDim,
SecurityClass: MapSecurity(attr.SecurityClassification), SecurityClass: MapSecurity(attr.SecurityClassification),
IsHistorized: attr.IsHistorized)); IsHistorized: attr.IsHistorized,
IsAlarm: attr.IsAlarm));
// PR 15: when Galaxy flags the attribute as alarm-bearing (AlarmExtension
// primitive), register an alarm-condition sink so the generic node manager
// can route OnAlarmEvent payloads for this tag to the concrete address-space
// builder. Severity default Medium — the live severity arrives through
// AlarmEventArgs once MxAccessGalaxyBackend's tracker starts firing.
if (attr.IsAlarm)
{
handle.MarkAsAlarmCondition(new AlarmConditionInfo(
SourceName: fullName,
InitialSeverity: AlarmSeverity.Medium,
InitialDescription: null));
}
} }
} }
} }
@@ -296,10 +311,50 @@ public sealed class GalaxyProxyDriver(GalaxyProxyOptions options)
return new HistoryReadResult(samples, ContinuationPoint: null); return new HistoryReadResult(samples, ContinuationPoint: null);
} }
public Task<HistoryReadResult> ReadProcessedAsync( public async Task<HistoryReadResult> ReadProcessedAsync(
string fullReference, DateTime startUtc, DateTime endUtc, TimeSpan interval, string fullReference, DateTime startUtc, DateTime endUtc, TimeSpan interval,
HistoryAggregateType aggregate, CancellationToken cancellationToken) HistoryAggregateType aggregate, CancellationToken cancellationToken)
=> throw new NotSupportedException("Galaxy historian processed reads are not supported in v2; use ReadRawAsync."); {
var client = RequireClient();
var column = MapAggregateToColumn(aggregate);
var resp = await client.CallAsync<HistoryReadProcessedRequest, HistoryReadProcessedResponse>(
MessageKind.HistoryReadProcessedRequest,
new HistoryReadProcessedRequest
{
SessionId = _sessionId,
TagReference = fullReference,
StartUtcUnixMs = new DateTimeOffset(startUtc, TimeSpan.Zero).ToUnixTimeMilliseconds(),
EndUtcUnixMs = new DateTimeOffset(endUtc, TimeSpan.Zero).ToUnixTimeMilliseconds(),
IntervalMs = (long)interval.TotalMilliseconds,
AggregateColumn = column,
},
MessageKind.HistoryReadProcessedResponse,
cancellationToken);
if (!resp.Success)
throw new InvalidOperationException($"Galaxy.Host HistoryReadProcessed failed: {resp.Error}");
IReadOnlyList<DataValueSnapshot> samples = [.. resp.Values.Select(ToSnapshot)];
return new HistoryReadResult(samples, ContinuationPoint: null);
}
/// <summary>
/// Maps the OPC UA Part 13 aggregate enum onto the Wonderware Historian
/// AnalogSummaryQuery column names consumed by <c>HistorianDataSource.ReadAggregateAsync</c>.
/// Kept on the Proxy side so Galaxy.Host stays OPC-UA-free.
/// </summary>
internal static string MapAggregateToColumn(HistoryAggregateType aggregate) => aggregate switch
{
HistoryAggregateType.Average => "Average",
HistoryAggregateType.Minimum => "Minimum",
HistoryAggregateType.Maximum => "Maximum",
HistoryAggregateType.Count => "ValueCount",
HistoryAggregateType.Total => throw new NotSupportedException(
"HistoryAggregateType.Total is not supported by the Wonderware Historian AnalogSummary " +
"query — use Average × Count on the caller side, or switch to Average/Minimum/Maximum/Count."),
_ => throw new NotSupportedException($"Unknown HistoryAggregateType {aggregate}"),
};
// ---- IRediscoverable ---- // ---- IRediscoverable ----

View File

@@ -16,6 +16,10 @@
<ProjectReference Include="..\ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared\ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared.csproj"/> <ProjectReference Include="..\ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared\ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared.csproj"/>
</ItemGroup> </ItemGroup>
<ItemGroup>
<InternalsVisibleTo Include="ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy.Tests"/>
</ItemGroup>
<ItemGroup> <ItemGroup>
<NuGetAuditSuppress Include="https://github.com/advisories/GHSA-37gx-xxp4-5rgx"/> <NuGetAuditSuppress Include="https://github.com/advisories/GHSA-37gx-xxp4-5rgx"/>
<NuGetAuditSuppress Include="https://github.com/advisories/GHSA-w3x6-4m5h-cxqf"/> <NuGetAuditSuppress Include="https://github.com/advisories/GHSA-w3x6-4m5h-cxqf"/>

View File

@@ -30,6 +30,15 @@ public sealed class GalaxyAttributeInfo
[Key(3)] public uint? ArrayDim { get; set; } [Key(3)] public uint? ArrayDim { get; set; }
[Key(4)] public int SecurityClassification { get; set; } [Key(4)] public int SecurityClassification { get; set; }
[Key(5)] public bool IsHistorized { get; set; } [Key(5)] public bool IsHistorized { get; set; }
/// <summary>
/// True when the attribute has an AlarmExtension primitive in the Galaxy repository
/// (<c>primitive_definition.primitive_name = 'AlarmExtension'</c>). The generic
/// node-manager uses this to enrich the variable's OPC UA node with an
/// <c>AlarmConditionState</c> during address-space build. Added in PR 9 as the
/// discovery-side foundation for the alarm event wire-up that follows in PR 10+.
/// </summary>
[Key(6)] public bool IsAlarm { get; set; }
} }
[MessagePackObject] [MessagePackObject]

View File

@@ -50,6 +50,12 @@ public enum MessageKind : byte
HistoryReadRequest = 0x60, HistoryReadRequest = 0x60,
HistoryReadResponse = 0x61, HistoryReadResponse = 0x61,
HistoryReadProcessedRequest = 0x62,
HistoryReadProcessedResponse = 0x63,
HistoryReadAtTimeRequest = 0x64,
HistoryReadAtTimeResponse = 0x65,
HistoryReadEventsRequest = 0x66,
HistoryReadEventsResponse = 0x67,
HostConnectivityStatus = 0x70, HostConnectivityStatus = 0x70,
RuntimeStatusChange = 0x71, RuntimeStatusChange = 0x71,

View File

@@ -26,3 +26,85 @@ public sealed class HistoryReadResponse
[Key(1)] public string? Error { get; set; } [Key(1)] public string? Error { get; set; }
[Key(2)] public HistoryTagValues[] Tags { get; set; } = System.Array.Empty<HistoryTagValues>(); [Key(2)] public HistoryTagValues[] Tags { get; set; } = System.Array.Empty<HistoryTagValues>();
} }
/// <summary>
/// Processed (aggregated) historian read — OPC UA HistoryReadProcessed service. The
/// aggregate column is a string (e.g. "Average", "Minimum") mapped by the Proxy from the
/// OPC UA HistoryAggregateType enum so Galaxy.Host stays OPC-UA-free.
/// </summary>
[MessagePackObject]
public sealed class HistoryReadProcessedRequest
{
[Key(0)] public long SessionId { get; set; }
[Key(1)] public string TagReference { get; set; } = string.Empty;
[Key(2)] public long StartUtcUnixMs { get; set; }
[Key(3)] public long EndUtcUnixMs { get; set; }
[Key(4)] public long IntervalMs { get; set; }
[Key(5)] public string AggregateColumn { get; set; } = "Average";
}
[MessagePackObject]
public sealed class HistoryReadProcessedResponse
{
[Key(0)] public bool Success { get; set; }
[Key(1)] public string? Error { get; set; }
[Key(2)] public GalaxyDataValue[] Values { get; set; } = System.Array.Empty<GalaxyDataValue>();
}
/// <summary>
/// At-time historian read — OPC UA HistoryReadAtTime service. Returns one sample per
/// requested timestamp (interpolated when no exact match exists). The per-timestamp array
/// is flow-encoded as Unix milliseconds to avoid MessagePack DateTime quirks.
/// </summary>
[MessagePackObject]
public sealed class HistoryReadAtTimeRequest
{
[Key(0)] public long SessionId { get; set; }
[Key(1)] public string TagReference { get; set; } = string.Empty;
[Key(2)] public long[] TimestampsUtcUnixMs { get; set; } = System.Array.Empty<long>();
}
[MessagePackObject]
public sealed class HistoryReadAtTimeResponse
{
[Key(0)] public bool Success { get; set; }
[Key(1)] public string? Error { get; set; }
[Key(2)] public GalaxyDataValue[] Values { get; set; } = System.Array.Empty<GalaxyDataValue>();
}
/// <summary>
/// Historical events read — OPC UA HistoryReadEvents service and Alarm &amp; Condition
/// history. <c>SourceName</c> null means "all sources". Distinct from the live
/// <see cref="GalaxyAlarmEvent"/> stream because historical rows carry both
/// <c>EventTime</c> (when the event occurred in the process) and <c>ReceivedTime</c>
/// (when the Historian persisted it) and have no StateTransition — the Historian logs
/// the instantaneous event, not the OPC UA alarm lifecycle.
/// </summary>
[MessagePackObject]
public sealed class HistoryReadEventsRequest
{
[Key(0)] public long SessionId { get; set; }
[Key(1)] public string? SourceName { get; set; }
[Key(2)] public long StartUtcUnixMs { get; set; }
[Key(3)] public long EndUtcUnixMs { get; set; }
[Key(4)] public int MaxEvents { get; set; } = 1000;
}
[MessagePackObject]
public sealed class GalaxyHistoricalEvent
{
[Key(0)] public string EventId { get; set; } = string.Empty;
[Key(1)] public string? SourceName { get; set; }
[Key(2)] public long EventTimeUtcUnixMs { get; set; }
[Key(3)] public long ReceivedTimeUtcUnixMs { get; set; }
[Key(4)] public string? DisplayText { get; set; }
[Key(5)] public ushort Severity { get; set; }
}
[MessagePackObject]
public sealed class HistoryReadEventsResponse
{
[Key(0)] public bool Success { get; set; }
[Key(1)] public string? Error { get; set; }
[Key(2)] public GalaxyHistoricalEvent[] Events { get; set; } = System.Array.Empty<GalaxyHistoricalEvent>();
}

View File

@@ -11,6 +11,12 @@
<CopyLocalLockFileAssemblies>false</CopyLocalLockFileAssemblies> <CopyLocalLockFileAssemblies>false</CopyLocalLockFileAssemblies>
<!-- Deploy next to Host.exe under bin/<cfg>/Historian/ so F5 works without a manual copy. --> <!-- Deploy next to Host.exe under bin/<cfg>/Historian/ so F5 works without a manual copy. -->
<HistorianPluginOutputDir>$(MSBuildThisFileDirectory)..\ZB.MOM.WW.OtOpcUa.Host\bin\$(Configuration)\net48\Historian\</HistorianPluginOutputDir> <HistorianPluginOutputDir>$(MSBuildThisFileDirectory)..\ZB.MOM.WW.OtOpcUa.Host\bin\$(Configuration)\net48\Historian\</HistorianPluginOutputDir>
<!--
Phase 2 Stream D — V1 ARCHIVE. Plugs into the legacy in-process Host's
Wonderware Historian loader. Will be ported into Driver.Galaxy.Host's
Backend/Historian/ subtree when MxAccessGalaxyBackend.HistoryReadAsync is
wired (Task B.1.h follow-up). See docs/v2/V1_ARCHIVE_STATUS.md.
-->
</PropertyGroup> </PropertyGroup>
<ItemGroup> <ItemGroup>

View File

@@ -8,6 +8,15 @@
<Nullable>enable</Nullable> <Nullable>enable</Nullable>
<RootNamespace>ZB.MOM.WW.OtOpcUa.Host</RootNamespace> <RootNamespace>ZB.MOM.WW.OtOpcUa.Host</RootNamespace>
<AssemblyName>ZB.MOM.WW.OtOpcUa.Host</AssemblyName> <AssemblyName>ZB.MOM.WW.OtOpcUa.Host</AssemblyName>
<!--
Phase 2 Stream D — V1 ARCHIVE. Functionally superseded by:
src/ZB.MOM.WW.OtOpcUa.Server (host process, .NET 10)
src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host (out-of-process MXAccess, net48 x86)
src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy (in-process driver, .NET 10)
Kept in the build graph because Historian.Aveva + IntegrationTests still
transitively reference it. Deletion is the subject of Phase 2 PR 3 (separate from
this PR 2). See docs/v2/V1_ARCHIVE_STATUS.md.
-->
</PropertyGroup> </PropertyGroup>
<ItemGroup> <ItemGroup>

View File

@@ -0,0 +1,363 @@
using System.Globalization;
using Microsoft.Extensions.Logging;
using Opc.Ua;
using Opc.Ua.Server;
using ZB.MOM.WW.OtOpcUa.Core.Abstractions;
using DriverWriteRequest = ZB.MOM.WW.OtOpcUa.Core.Abstractions.WriteRequest;
namespace ZB.MOM.WW.OtOpcUa.Server.OpcUa;
/// <summary>
/// Concrete <see cref="CustomNodeManager2"/> that materializes the driver's address space
/// into OPC UA nodes. Implements <see cref="IAddressSpaceBuilder"/> itself so
/// <c>GenericDriverNodeManager.BuildAddressSpaceAsync</c> can stream nodes directly into the
/// OPC UA server's namespace. PR 15's <c>MarkAsAlarmCondition</c> hook creates a sibling
/// <see cref="AlarmConditionState"/> node per alarm-flagged variable; subsequent driver
/// <c>OnAlarmEvent</c> pushes land through the returned sink to drive Activate /
/// Acknowledge / Deactivate transitions.
/// </summary>
/// <remarks>
/// Read / Subscribe / Write are routed to the driver's capability interfaces — the node
/// manager holds references to <see cref="IReadable"/>, <see cref="ISubscribable"/>, and
/// <see cref="IWritable"/> when present. Nodes with no driver backing (plain folders) are
/// served directly from the internal PredefinedNodes table.
/// </remarks>
public sealed class DriverNodeManager : CustomNodeManager2, IAddressSpaceBuilder
{
private readonly IDriver _driver;
private readonly IReadable? _readable;
private readonly IWritable? _writable;
private readonly ILogger<DriverNodeManager> _logger;
/// <summary>The driver whose address space this node manager exposes.</summary>
public IDriver Driver => _driver;
private FolderState? _driverRoot;
private readonly Dictionary<string, BaseDataVariableState> _variablesByFullRef = new(StringComparer.OrdinalIgnoreCase);
// Active building folder — set per Folder() call so Variable() lands under the right parent.
// A stack would support nested folders; we use a single current folder because IAddressSpaceBuilder
// returns a child builder per Folder call and the caller threads nesting through those references.
private FolderState _currentFolder = null!;
public DriverNodeManager(IServerInternal server, ApplicationConfiguration configuration,
IDriver driver, ILogger<DriverNodeManager> logger)
: base(server, configuration, namespaceUris: $"urn:OtOpcUa:{driver.DriverInstanceId}")
{
_driver = driver;
_readable = driver as IReadable;
_writable = driver as IWritable;
_logger = logger;
}
protected override NodeStateCollection LoadPredefinedNodes(ISystemContext context) => new();
public override void CreateAddressSpace(IDictionary<NodeId, IList<IReference>> externalReferences)
{
lock (Lock)
{
_driverRoot = new FolderState(null)
{
SymbolicName = _driver.DriverInstanceId,
ReferenceTypeId = ReferenceTypeIds.Organizes,
TypeDefinitionId = ObjectTypeIds.FolderType,
NodeId = new NodeId(_driver.DriverInstanceId, NamespaceIndex),
BrowseName = new QualifiedName(_driver.DriverInstanceId, NamespaceIndex),
DisplayName = new LocalizedText(_driver.DriverInstanceId),
EventNotifier = EventNotifiers.None,
};
// Link under Objects folder so clients see the driver subtree at browse root.
if (!externalReferences.TryGetValue(ObjectIds.ObjectsFolder, out var references))
{
references = new List<IReference>();
externalReferences[ObjectIds.ObjectsFolder] = references;
}
references.Add(new NodeStateReference(ReferenceTypeIds.Organizes, false, _driverRoot.NodeId));
AddPredefinedNode(SystemContext, _driverRoot);
_currentFolder = _driverRoot;
}
}
// ------- IAddressSpaceBuilder implementation (PR 15 contract) -------
public IAddressSpaceBuilder Folder(string browseName, string displayName)
{
lock (Lock)
{
var folder = new FolderState(_currentFolder)
{
SymbolicName = browseName,
ReferenceTypeId = ReferenceTypeIds.Organizes,
TypeDefinitionId = ObjectTypeIds.FolderType,
NodeId = new NodeId($"{_currentFolder.NodeId.Identifier}/{browseName}", NamespaceIndex),
BrowseName = new QualifiedName(browseName, NamespaceIndex),
DisplayName = new LocalizedText(displayName),
};
_currentFolder.AddChild(folder);
AddPredefinedNode(SystemContext, folder);
return new NestedBuilder(this, folder);
}
}
public IVariableHandle Variable(string browseName, string displayName, DriverAttributeInfo attributeInfo)
{
lock (Lock)
{
var v = new BaseDataVariableState(_currentFolder)
{
SymbolicName = browseName,
ReferenceTypeId = ReferenceTypeIds.Organizes,
TypeDefinitionId = VariableTypeIds.BaseDataVariableType,
NodeId = new NodeId(attributeInfo.FullName, NamespaceIndex),
BrowseName = new QualifiedName(browseName, NamespaceIndex),
DisplayName = new LocalizedText(displayName),
DataType = MapDataType(attributeInfo.DriverDataType),
ValueRank = attributeInfo.IsArray ? ValueRanks.OneDimension : ValueRanks.Scalar,
AccessLevel = AccessLevels.CurrentReadOrWrite,
UserAccessLevel = AccessLevels.CurrentReadOrWrite,
Historizing = attributeInfo.IsHistorized,
};
_currentFolder.AddChild(v);
AddPredefinedNode(SystemContext, v);
_variablesByFullRef[attributeInfo.FullName] = v;
v.OnReadValue = OnReadValue;
v.OnWriteValue = OnWriteValue;
return new VariableHandle(this, v, attributeInfo.FullName);
}
}
public void AddProperty(string browseName, DriverDataType dataType, object? value)
{
lock (Lock)
{
var p = new PropertyState(_currentFolder)
{
SymbolicName = browseName,
ReferenceTypeId = ReferenceTypeIds.HasProperty,
TypeDefinitionId = VariableTypeIds.PropertyType,
NodeId = new NodeId($"{_currentFolder.NodeId.Identifier}/{browseName}", NamespaceIndex),
BrowseName = new QualifiedName(browseName, NamespaceIndex),
DisplayName = new LocalizedText(browseName),
DataType = MapDataType(dataType),
ValueRank = ValueRanks.Scalar,
Value = value,
};
_currentFolder.AddChild(p);
AddPredefinedNode(SystemContext, p);
}
}
private ServiceResult OnReadValue(ISystemContext context, NodeState node, NumericRange indexRange,
QualifiedName dataEncoding, ref object? value, ref StatusCode statusCode, ref DateTime timestamp)
{
if (_readable is null)
{
statusCode = StatusCodes.BadNotReadable;
return ServiceResult.Good;
}
try
{
var fullRef = node.NodeId.Identifier as string ?? "";
var result = _readable.ReadAsync([fullRef], CancellationToken.None).GetAwaiter().GetResult();
if (result.Count == 0)
{
statusCode = StatusCodes.BadNoData;
return ServiceResult.Good;
}
var snap = result[0];
value = snap.Value;
statusCode = snap.StatusCode;
timestamp = snap.ServerTimestampUtc;
}
catch (Exception ex)
{
_logger.LogWarning(ex, "OnReadValue failed for {NodeId}", node.NodeId);
statusCode = StatusCodes.BadInternalError;
}
return ServiceResult.Good;
}
private static NodeId MapDataType(DriverDataType t) => t switch
{
DriverDataType.Boolean => DataTypeIds.Boolean,
DriverDataType.Int32 => DataTypeIds.Int32,
DriverDataType.Float32 => DataTypeIds.Float,
DriverDataType.Float64 => DataTypeIds.Double,
DriverDataType.String => DataTypeIds.String,
DriverDataType.DateTime => DataTypeIds.DateTime,
_ => DataTypeIds.BaseDataType,
};
/// <summary>
/// Nested builder returned by <see cref="Folder"/>. Temporarily retargets the parent's
/// <see cref="_currentFolder"/> during each call so Variable/Folder calls land under the
/// correct subtree. Not thread-safe if callers drive Discovery concurrently — but
/// <c>GenericDriverNodeManager</c> discovery is sequential per driver.
/// </summary>
private sealed class NestedBuilder(DriverNodeManager owner, FolderState folder) : IAddressSpaceBuilder
{
public IAddressSpaceBuilder Folder(string browseName, string displayName)
{
var prior = owner._currentFolder;
owner._currentFolder = folder;
try { return owner.Folder(browseName, displayName); }
finally { owner._currentFolder = prior; }
}
public IVariableHandle Variable(string browseName, string displayName, DriverAttributeInfo attributeInfo)
{
var prior = owner._currentFolder;
owner._currentFolder = folder;
try { return owner.Variable(browseName, displayName, attributeInfo); }
finally { owner._currentFolder = prior; }
}
public void AddProperty(string browseName, DriverDataType dataType, object? value)
{
var prior = owner._currentFolder;
owner._currentFolder = folder;
try { owner.AddProperty(browseName, dataType, value); }
finally { owner._currentFolder = prior; }
}
}
private sealed class VariableHandle : IVariableHandle
{
private readonly DriverNodeManager _owner;
private readonly BaseDataVariableState _variable;
public string FullReference { get; }
public VariableHandle(DriverNodeManager owner, BaseDataVariableState variable, string fullRef)
{
_owner = owner;
_variable = variable;
FullReference = fullRef;
}
public IAlarmConditionSink MarkAsAlarmCondition(AlarmConditionInfo info)
{
lock (_owner.Lock)
{
var alarm = new AlarmConditionState(_variable)
{
SymbolicName = _variable.BrowseName.Name + "_Condition",
ReferenceTypeId = ReferenceTypeIds.HasComponent,
NodeId = new NodeId(FullReference + ".Condition", _owner.NamespaceIndex),
BrowseName = new QualifiedName(_variable.BrowseName.Name + "_Condition", _owner.NamespaceIndex),
DisplayName = new LocalizedText(info.SourceName),
};
alarm.Create(_owner.SystemContext, alarm.NodeId, alarm.BrowseName, alarm.DisplayName, false);
alarm.SourceName.Value = info.SourceName;
alarm.Severity.Value = (ushort)MapSeverity(info.InitialSeverity);
alarm.Message.Value = new LocalizedText(info.InitialDescription ?? info.SourceName);
alarm.EnabledState.Value = new LocalizedText("Enabled");
alarm.EnabledState.Id.Value = true;
alarm.Retain.Value = false;
alarm.AckedState.Value = new LocalizedText("Acknowledged");
alarm.AckedState.Id.Value = true;
alarm.ActiveState.Value = new LocalizedText("Inactive");
alarm.ActiveState.Id.Value = false;
_variable.AddChild(alarm);
_owner.AddPredefinedNode(_owner.SystemContext, alarm);
return new ConditionSink(_owner, alarm);
}
}
private static int MapSeverity(AlarmSeverity s) => s switch
{
AlarmSeverity.Low => 250,
AlarmSeverity.Medium => 500,
AlarmSeverity.High => 700,
AlarmSeverity.Critical => 900,
_ => 500,
};
}
private sealed class ConditionSink(DriverNodeManager owner, AlarmConditionState alarm)
: IAlarmConditionSink
{
public void OnTransition(AlarmEventArgs args)
{
lock (owner.Lock)
{
alarm.Severity.Value = (ushort)MapSeverity(args.Severity);
alarm.Time.Value = args.SourceTimestampUtc;
alarm.Message.Value = new LocalizedText(args.Message);
// Map the driver's transition type to OPC UA Part 9 state. The driver uses
// AlarmEventArgs but the state transition kind is encoded in AlarmType by
// convention — Galaxy's GalaxyAlarmTracker emits "Active"/"Acknowledged"/"Inactive".
switch (args.AlarmType)
{
case "Active":
alarm.SetActiveState(owner.SystemContext, true);
alarm.SetAcknowledgedState(owner.SystemContext, false);
alarm.Retain.Value = true;
break;
case "Acknowledged":
alarm.SetAcknowledgedState(owner.SystemContext, true);
break;
case "Inactive":
alarm.SetActiveState(owner.SystemContext, false);
// Retain stays true until the condition is both Inactive and Acknowledged
// so alarm clients keep the record in their condition refresh snapshot.
if (alarm.AckedState.Id.Value) alarm.Retain.Value = false;
break;
}
alarm.ClearChangeMasks(owner.SystemContext, true);
alarm.ReportEvent(owner.SystemContext, alarm);
}
}
private static int MapSeverity(AlarmSeverity s) => s switch
{
AlarmSeverity.Low => 250,
AlarmSeverity.Medium => 500,
AlarmSeverity.High => 700,
AlarmSeverity.Critical => 900,
_ => 500,
};
}
/// <summary>
/// Per-variable write hook wired on each <see cref="BaseDataVariableState"/>. Routes the
/// value into the driver's <see cref="IWritable"/> and surfaces its per-tag status code.
/// </summary>
private ServiceResult OnWriteValue(ISystemContext context, NodeState node, NumericRange indexRange,
QualifiedName dataEncoding, ref object? value, ref StatusCode statusCode, ref DateTime timestamp)
{
if (_writable is null) return StatusCodes.BadNotWritable;
var fullRef = node.NodeId.Identifier as string;
if (string.IsNullOrEmpty(fullRef)) return StatusCodes.BadNodeIdUnknown;
try
{
var results = _writable.WriteAsync(
[new DriverWriteRequest(fullRef!, value)],
CancellationToken.None).GetAwaiter().GetResult();
if (results.Count > 0 && results[0].StatusCode != 0)
{
statusCode = results[0].StatusCode;
return ServiceResult.Good;
}
return ServiceResult.Good;
}
catch (Exception ex)
{
_logger.LogWarning(ex, "Write failed for {FullRef}", fullRef);
return new ServiceResult(StatusCodes.BadInternalError);
}
}
// Diagnostics hook for tests — number of variables registered in this node manager.
internal int VariableCount => _variablesByFullRef.Count;
internal bool TryGetVariable(string fullRef, out BaseDataVariableState? v)
=> _variablesByFullRef.TryGetValue(fullRef, out v!);
}

View File

@@ -0,0 +1,181 @@
using Microsoft.Extensions.Logging;
using Opc.Ua;
using Opc.Ua.Configuration;
using ZB.MOM.WW.OtOpcUa.Core.Hosting;
using ZB.MOM.WW.OtOpcUa.Core.OpcUa;
namespace ZB.MOM.WW.OtOpcUa.Server.OpcUa;
/// <summary>
/// Wraps <see cref="ApplicationInstance"/> to bring the OPC UA server online — builds an
/// <see cref="ApplicationConfiguration"/> programmatically (no external XML file), ensures
/// the application certificate exists in the PKI store (auto-generates self-signed on first
/// run), starts the server, then walks each <see cref="DriverNodeManager"/> and invokes
/// <see cref="GenericDriverNodeManager.BuildAddressSpaceAsync"/> against it so the driver's
/// discovery streams into the already-running server's address space.
/// </summary>
public sealed class OpcUaApplicationHost : IAsyncDisposable
{
private readonly OpcUaServerOptions _options;
private readonly DriverHost _driverHost;
private readonly ILoggerFactory _loggerFactory;
private readonly ILogger<OpcUaApplicationHost> _logger;
private ApplicationInstance? _application;
private OtOpcUaServer? _server;
private bool _disposed;
public OpcUaApplicationHost(OpcUaServerOptions options, DriverHost driverHost,
ILoggerFactory loggerFactory, ILogger<OpcUaApplicationHost> logger)
{
_options = options;
_driverHost = driverHost;
_loggerFactory = loggerFactory;
_logger = logger;
}
public OtOpcUaServer? Server => _server;
/// <summary>
/// Builds the <see cref="ApplicationConfiguration"/>, validates/creates the application
/// certificate, constructs + starts the <see cref="OtOpcUaServer"/>, then drives
/// <see cref="GenericDriverNodeManager.BuildAddressSpaceAsync"/> per registered driver so
/// the address space is populated before the first client connects.
/// </summary>
public async Task StartAsync(CancellationToken ct)
{
_application = new ApplicationInstance
{
ApplicationName = _options.ApplicationName,
ApplicationType = ApplicationType.Server,
ApplicationConfiguration = BuildConfiguration(),
};
var hasCert = await _application.CheckApplicationInstanceCertificate(silent: true, minimumKeySize: CertificateFactory.DefaultKeySize).ConfigureAwait(false);
if (!hasCert)
throw new InvalidOperationException(
$"OPC UA application certificate could not be validated or created in {_options.PkiStoreRoot}");
_server = new OtOpcUaServer(_driverHost, _loggerFactory);
await _application.Start(_server).ConfigureAwait(false);
_logger.LogInformation("OPC UA server started — endpoint={Endpoint} driverCount={Count}",
_options.EndpointUrl, _server.DriverNodeManagers.Count);
// Drive each driver's discovery through its node manager. The node manager IS the
// IAddressSpaceBuilder; GenericDriverNodeManager captures alarm-condition sinks into
// its internal map and wires OnAlarmEvent → sink routing.
foreach (var nodeManager in _server.DriverNodeManagers)
{
var driverId = nodeManager.Driver.DriverInstanceId;
try
{
var generic = new GenericDriverNodeManager(nodeManager.Driver);
await generic.BuildAddressSpaceAsync(nodeManager, ct).ConfigureAwait(false);
_logger.LogInformation("Address space populated for driver {Driver}", driverId);
}
catch (Exception ex)
{
// Per decision #12: driver exceptions isolate — log and keep the server serving
// the other drivers' subtrees. Re-building this one takes a Reinitialize call.
_logger.LogError(ex, "Discovery failed for driver {Driver}; subtree faulted", driverId);
}
}
}
private ApplicationConfiguration BuildConfiguration()
{
Directory.CreateDirectory(_options.PkiStoreRoot);
var cfg = new ApplicationConfiguration
{
ApplicationName = _options.ApplicationName,
ApplicationUri = _options.ApplicationUri,
ApplicationType = ApplicationType.Server,
ProductUri = "urn:OtOpcUa:Server",
SecurityConfiguration = new SecurityConfiguration
{
ApplicationCertificate = new CertificateIdentifier
{
StoreType = CertificateStoreType.Directory,
StorePath = Path.Combine(_options.PkiStoreRoot, "own"),
SubjectName = "CN=" + _options.ApplicationName,
},
TrustedIssuerCertificates = new CertificateTrustList
{
StoreType = CertificateStoreType.Directory,
StorePath = Path.Combine(_options.PkiStoreRoot, "issuers"),
},
TrustedPeerCertificates = new CertificateTrustList
{
StoreType = CertificateStoreType.Directory,
StorePath = Path.Combine(_options.PkiStoreRoot, "trusted"),
},
RejectedCertificateStore = new CertificateTrustList
{
StoreType = CertificateStoreType.Directory,
StorePath = Path.Combine(_options.PkiStoreRoot, "rejected"),
},
AutoAcceptUntrustedCertificates = _options.AutoAcceptUntrustedClientCertificates,
AddAppCertToTrustedStore = true,
},
TransportConfigurations = new TransportConfigurationCollection(),
TransportQuotas = new TransportQuotas { OperationTimeout = 15000 },
ServerConfiguration = new ServerConfiguration
{
BaseAddresses = new StringCollection { _options.EndpointUrl },
SecurityPolicies = new ServerSecurityPolicyCollection
{
new ServerSecurityPolicy
{
SecurityMode = MessageSecurityMode.None,
SecurityPolicyUri = SecurityPolicies.None,
},
},
UserTokenPolicies = new UserTokenPolicyCollection
{
new UserTokenPolicy(UserTokenType.Anonymous)
{
PolicyId = "Anonymous",
SecurityPolicyUri = SecurityPolicies.None,
},
},
MinRequestThreadCount = 5,
MaxRequestThreadCount = 100,
MaxQueuedRequestCount = 200,
},
TraceConfiguration = new TraceConfiguration(),
};
cfg.Validate(ApplicationType.Server).GetAwaiter().GetResult();
if (cfg.SecurityConfiguration.AutoAcceptUntrustedCertificates)
{
cfg.CertificateValidator.CertificateValidation += (_, e) =>
{
if (e.Error.StatusCode == StatusCodes.BadCertificateUntrusted)
e.Accept = true;
};
}
return cfg;
}
public async ValueTask DisposeAsync()
{
if (_disposed) return;
_disposed = true;
try
{
_server?.Stop();
}
catch (Exception ex)
{
_logger.LogWarning(ex, "OPC UA server stop threw during dispose");
}
await Task.CompletedTask;
}
}

View File

@@ -0,0 +1,42 @@
namespace ZB.MOM.WW.OtOpcUa.Server.OpcUa;
/// <summary>
/// OPC UA server endpoint + application-identity configuration. Bound from the
/// <c>OpcUaServer</c> section of <c>appsettings.json</c>. PR 17 minimum-viable scope: no LDAP,
/// no security profiles beyond None — those wire in alongside a future deployment-policy PR
/// that reads from the central config DB instead of appsettings.
/// </summary>
public sealed class OpcUaServerOptions
{
public const string SectionName = "OpcUaServer";
/// <summary>
/// Fully-qualified endpoint URI clients connect to. Use <c>0.0.0.0</c> to bind all
/// interfaces; the stack rewrites to the machine's hostname for the returned endpoint
/// description at GetEndpoints time.
/// </summary>
public string EndpointUrl { get; init; } = "opc.tcp://0.0.0.0:4840/OtOpcUa";
/// <summary>Human-readable application name surfaced in the endpoint description.</summary>
public string ApplicationName { get; init; } = "OtOpcUa Server";
/// <summary>Stable application URI — must match the subjectAltName of the app cert.</summary>
public string ApplicationUri { get; init; } = "urn:OtOpcUa:Server";
/// <summary>
/// Directory where the OPC UA stack stores the application certificate + trusted /
/// rejected cert folders. Defaults to <c>%ProgramData%\OtOpcUa\pki</c>; the stack
/// creates the directory tree on first run and generates a self-signed cert.
/// </summary>
public string PkiStoreRoot { get; init; } =
System.IO.Path.Combine(
Environment.GetFolderPath(Environment.SpecialFolder.CommonApplicationData),
"OtOpcUa", "pki");
/// <summary>
/// When true, the stack auto-trusts client certs on first connect. Dev-default = true,
/// production deployments should flip this to false and manually trust clients via the
/// Admin UI.
/// </summary>
public bool AutoAcceptUntrustedClientCertificates { get; init; } = true;
}

View File

@@ -0,0 +1,62 @@
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Logging;
using Opc.Ua;
using Opc.Ua.Server;
using ZB.MOM.WW.OtOpcUa.Core.Abstractions;
using ZB.MOM.WW.OtOpcUa.Core.Hosting;
using ZB.MOM.WW.OtOpcUa.Core.OpcUa;
namespace ZB.MOM.WW.OtOpcUa.Server.OpcUa;
/// <summary>
/// <see cref="StandardServer"/> subclass that wires one <see cref="DriverNodeManager"/> per
/// registered driver from <see cref="DriverHost"/>. Anonymous endpoint on
/// <c>opc.tcp://0.0.0.0:4840</c>, no security — PR 16 minimum-viable scope; LDAP + security
/// profiles are deferred to their own PR on top of this.
/// </summary>
public sealed class OtOpcUaServer : StandardServer
{
private readonly DriverHost _driverHost;
private readonly ILoggerFactory _loggerFactory;
private readonly List<DriverNodeManager> _driverNodeManagers = new();
public OtOpcUaServer(DriverHost driverHost, ILoggerFactory loggerFactory)
{
_driverHost = driverHost;
_loggerFactory = loggerFactory;
}
/// <summary>
/// Read-only snapshot of the driver node managers materialized at server start. Used by
/// the generic-driver-node-manager-driven discovery flow after the server starts — the
/// host walks each entry and invokes
/// <c>GenericDriverNodeManager.BuildAddressSpaceAsync(manager)</c> passing the manager
/// as its own <see cref="IAddressSpaceBuilder"/>.
/// </summary>
public IReadOnlyList<DriverNodeManager> DriverNodeManagers => _driverNodeManagers;
protected override MasterNodeManager CreateMasterNodeManager(IServerInternal server, ApplicationConfiguration configuration)
{
foreach (var driverId in _driverHost.RegisteredDriverIds)
{
var driver = _driverHost.GetDriver(driverId);
if (driver is null) continue;
var logger = _loggerFactory.CreateLogger<DriverNodeManager>();
var manager = new DriverNodeManager(server, configuration, driver, logger);
_driverNodeManagers.Add(manager);
}
return new MasterNodeManager(server, configuration, null, _driverNodeManagers.ToArray());
}
protected override ServerProperties LoadServerProperties() => new()
{
ManufacturerName = "OtOpcUa",
ProductName = "OtOpcUa.Server",
ProductUri = "urn:OtOpcUa:Server",
SoftwareVersion = "2.0.0",
BuildNumber = "0",
BuildDate = DateTime.UtcNow,
};
}

View File

@@ -1,18 +1,20 @@
using Microsoft.Extensions.Hosting; using Microsoft.Extensions.Hosting;
using Microsoft.Extensions.Logging; using Microsoft.Extensions.Logging;
using ZB.MOM.WW.OtOpcUa.Core.Hosting; using ZB.MOM.WW.OtOpcUa.Core.Hosting;
using ZB.MOM.WW.OtOpcUa.Server.OpcUa;
namespace ZB.MOM.WW.OtOpcUa.Server; namespace ZB.MOM.WW.OtOpcUa.Server;
/// <summary> /// <summary>
/// BackgroundService that owns the OPC UA server lifecycle (decision #30, replacing TopShelf). /// BackgroundService that owns the OPC UA server lifecycle (decision #30, replacing TopShelf).
/// Bootstraps config, starts the <see cref="DriverHost"/>, and runs until stopped. /// Bootstraps config, starts the <see cref="DriverHost"/>, starts the OPC UA server via
/// Phase 1 scope: bootstrap-only — the OPC UA transport layer that serves endpoints stays in /// <see cref="OpcUaApplicationHost"/>, drives each driver's discovery into the address space,
/// the legacy Host until the Phase 2 cutover. /// runs until stopped.
/// </summary> /// </summary>
public sealed class OpcUaServerService( public sealed class OpcUaServerService(
NodeBootstrap bootstrap, NodeBootstrap bootstrap,
DriverHost driverHost, DriverHost driverHost,
OpcUaApplicationHost applicationHost,
ILogger<OpcUaServerService> logger) : BackgroundService ILogger<OpcUaServerService> logger) : BackgroundService
{ {
protected override async Task ExecuteAsync(CancellationToken stoppingToken) protected override async Task ExecuteAsync(CancellationToken stoppingToken)
@@ -22,8 +24,11 @@ public sealed class OpcUaServerService(
var result = await bootstrap.LoadCurrentGenerationAsync(stoppingToken); var result = await bootstrap.LoadCurrentGenerationAsync(stoppingToken);
logger.LogInformation("Bootstrap complete: source={Source} generation={Gen}", result.Source, result.GenerationId); logger.LogInformation("Bootstrap complete: source={Source} generation={Gen}", result.Source, result.GenerationId);
// Phase 1: no drivers are wired up at bootstrap — Galaxy still lives in legacy Host. // PR 17: stand up the OPC UA server + drive discovery per registered driver. Driver
// Phase 2 will register drivers here based on the fetched generation. // registration itself (RegisterAsync on DriverHost) happens during an earlier DI
// extension once the central config DB query + per-driver factory land; for now the
// server comes up with whatever drivers are in DriverHost at start time.
await applicationHost.StartAsync(stoppingToken);
logger.LogInformation("OtOpcUa.Server running. Hosted drivers: {Count}", driverHost.RegisteredDriverIds.Count); logger.LogInformation("OtOpcUa.Server running. Hosted drivers: {Count}", driverHost.RegisteredDriverIds.Count);
@@ -40,6 +45,7 @@ public sealed class OpcUaServerService(
public override async Task StopAsync(CancellationToken cancellationToken) public override async Task StopAsync(CancellationToken cancellationToken)
{ {
await base.StopAsync(cancellationToken); await base.StopAsync(cancellationToken);
await applicationHost.DisposeAsync();
await driverHost.DisposeAsync(); await driverHost.DisposeAsync();
} }
} }

View File

@@ -5,6 +5,7 @@ using Serilog;
using ZB.MOM.WW.OtOpcUa.Configuration.LocalCache; using ZB.MOM.WW.OtOpcUa.Configuration.LocalCache;
using ZB.MOM.WW.OtOpcUa.Core.Hosting; using ZB.MOM.WW.OtOpcUa.Core.Hosting;
using ZB.MOM.WW.OtOpcUa.Server; using ZB.MOM.WW.OtOpcUa.Server;
using ZB.MOM.WW.OtOpcUa.Server.OpcUa;
var builder = Host.CreateApplicationBuilder(args); var builder = Host.CreateApplicationBuilder(args);
@@ -29,10 +30,23 @@ var options = new NodeOptions
LocalCachePath = nodeSection.GetValue<string>("LocalCachePath") ?? "config_cache.db", LocalCachePath = nodeSection.GetValue<string>("LocalCachePath") ?? "config_cache.db",
}; };
var opcUaSection = builder.Configuration.GetSection(OpcUaServerOptions.SectionName);
var opcUaOptions = new OpcUaServerOptions
{
EndpointUrl = opcUaSection.GetValue<string>("EndpointUrl") ?? "opc.tcp://0.0.0.0:4840/OtOpcUa",
ApplicationName = opcUaSection.GetValue<string>("ApplicationName") ?? "OtOpcUa Server",
ApplicationUri = opcUaSection.GetValue<string>("ApplicationUri") ?? "urn:OtOpcUa:Server",
PkiStoreRoot = opcUaSection.GetValue<string>("PkiStoreRoot")
?? Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.CommonApplicationData), "OtOpcUa", "pki"),
AutoAcceptUntrustedClientCertificates = opcUaSection.GetValue<bool?>("AutoAcceptUntrustedClientCertificates") ?? true,
};
builder.Services.AddSingleton(options); builder.Services.AddSingleton(options);
builder.Services.AddSingleton(opcUaOptions);
builder.Services.AddSingleton<ILocalConfigCache>(_ => new LiteDbConfigCache(options.LocalCachePath)); builder.Services.AddSingleton<ILocalConfigCache>(_ => new LiteDbConfigCache(options.LocalCachePath));
builder.Services.AddSingleton<DriverHost>(); builder.Services.AddSingleton<DriverHost>();
builder.Services.AddSingleton<NodeBootstrap>(); builder.Services.AddSingleton<NodeBootstrap>();
builder.Services.AddSingleton<OpcUaApplicationHost>();
builder.Services.AddHostedService<OpcUaServerService>(); builder.Services.AddHostedService<OpcUaServerService>();
var host = builder.Build(); var host = builder.Build();

View File

@@ -21,6 +21,8 @@
<PackageReference Include="Serilog.Settings.Configuration" Version="9.0.0"/> <PackageReference Include="Serilog.Settings.Configuration" Version="9.0.0"/>
<PackageReference Include="Serilog.Sinks.Console" Version="6.0.0"/> <PackageReference Include="Serilog.Sinks.Console" Version="6.0.0"/>
<PackageReference Include="Serilog.Sinks.File" Version="7.0.0"/> <PackageReference Include="Serilog.Sinks.File" Version="7.0.0"/>
<PackageReference Include="OPCFoundation.NetStandard.Opc.Ua.Server" Version="1.5.374.126"/>
<PackageReference Include="OPCFoundation.NetStandard.Opc.Ua.Configuration" Version="1.5.374.126"/>
</ItemGroup> </ItemGroup>
<ItemGroup> <ItemGroup>
@@ -30,6 +32,9 @@
<ItemGroup> <ItemGroup>
<NuGetAuditSuppress Include="https://github.com/advisories/GHSA-37gx-xxp4-5rgx"/> <NuGetAuditSuppress Include="https://github.com/advisories/GHSA-37gx-xxp4-5rgx"/>
<NuGetAuditSuppress Include="https://github.com/advisories/GHSA-w3x6-4m5h-cxqf"/> <NuGetAuditSuppress Include="https://github.com/advisories/GHSA-w3x6-4m5h-cxqf"/>
<!-- OPCFoundation.NetStandard.Opc.Ua.Core advisory — v1 already uses this package at the
same version, risk already accepted in the v1 stack. -->
<NuGetAuditSuppress Include="https://github.com/advisories/GHSA-h958-fxgg-g7w3"/>
</ItemGroup> </ItemGroup>
</Project> </Project>

View File

@@ -0,0 +1,163 @@
using Shouldly;
using Xunit;
using ZB.MOM.WW.OtOpcUa.Core.Abstractions;
using ZB.MOM.WW.OtOpcUa.Core.OpcUa;
namespace ZB.MOM.WW.OtOpcUa.Core.Tests;
[Trait("Category", "Unit")]
public sealed class GenericDriverNodeManagerTests
{
/// <summary>
/// BuildAddressSpaceAsync walks the driver's discovery through the caller's builder. Every
/// variable marked with MarkAsAlarmCondition captures its sink in the node manager; later,
/// IAlarmSource.OnAlarmEvent payloads are routed by SourceNodeId to the matching sink.
/// This is the plumbing that PR 16's concrete OPC UA builder will use to update the actual
/// AlarmConditionState nodes.
/// </summary>
[Fact]
public async Task Alarm_events_are_routed_to_the_sink_registered_for_the_matching_source_node_id()
{
var driver = new FakeDriver();
var builder = new RecordingBuilder();
using var nm = new GenericDriverNodeManager(driver);
await nm.BuildAddressSpaceAsync(builder, CancellationToken.None);
builder.Alarms.Count.ShouldBe(2);
nm.TrackedAlarmSources.Count.ShouldBe(2);
// Simulate the driver raising a transition for one of the alarms.
var args = new AlarmEventArgs(
SubscriptionHandle: new FakeHandle("s1"),
SourceNodeId: "Tank.HiHi",
ConditionId: "cond-1",
AlarmType: "Tank.HiHi",
Message: "Level exceeded",
Severity: AlarmSeverity.High,
SourceTimestampUtc: DateTime.UtcNow);
driver.RaiseAlarm(args);
builder.Alarms["Tank.HiHi"].Received.Count.ShouldBe(1);
builder.Alarms["Tank.HiHi"].Received[0].Message.ShouldBe("Level exceeded");
// The other alarm sink never received a payload — fan-out is tag-scoped.
builder.Alarms["Heater.OverTemp"].Received.Count.ShouldBe(0);
}
[Fact]
public async Task Non_alarm_variables_do_not_register_sinks()
{
var driver = new FakeDriver();
var builder = new RecordingBuilder();
using var nm = new GenericDriverNodeManager(driver);
await nm.BuildAddressSpaceAsync(builder, CancellationToken.None);
// FakeDriver registers 2 alarm-bearing variables + 1 plain variable.
nm.TrackedAlarmSources.ShouldNotContain("Tank.Level"); // the plain one
}
[Fact]
public async Task Unknown_source_node_id_is_dropped_silently()
{
var driver = new FakeDriver();
var builder = new RecordingBuilder();
using var nm = new GenericDriverNodeManager(driver);
await nm.BuildAddressSpaceAsync(builder, CancellationToken.None);
driver.RaiseAlarm(new AlarmEventArgs(
new FakeHandle("s1"), "Unknown.Source", "c", "t", "m", AlarmSeverity.Low, DateTime.UtcNow));
builder.Alarms.Values.All(s => s.Received.Count == 0).ShouldBeTrue();
}
[Fact]
public async Task Dispose_unsubscribes_from_OnAlarmEvent()
{
var driver = new FakeDriver();
var builder = new RecordingBuilder();
var nm = new GenericDriverNodeManager(driver);
await nm.BuildAddressSpaceAsync(builder, CancellationToken.None);
nm.Dispose();
driver.RaiseAlarm(new AlarmEventArgs(
new FakeHandle("s1"), "Tank.HiHi", "c", "t", "m", AlarmSeverity.Low, DateTime.UtcNow));
// No sink should have received it — the forwarder was detached.
builder.Alarms["Tank.HiHi"].Received.Count.ShouldBe(0);
}
// --- test doubles ---
private sealed class FakeDriver : IDriver, ITagDiscovery, IAlarmSource
{
public string DriverInstanceId => "fake";
public string DriverType => "Fake";
public event EventHandler<AlarmEventArgs>? OnAlarmEvent;
public Task InitializeAsync(string driverConfigJson, CancellationToken ct) => Task.CompletedTask;
public Task ReinitializeAsync(string driverConfigJson, CancellationToken ct) => Task.CompletedTask;
public Task ShutdownAsync(CancellationToken ct) => Task.CompletedTask;
public DriverHealth GetHealth() => new(DriverState.Healthy, DateTime.UtcNow, null);
public long GetMemoryFootprint() => 0;
public Task FlushOptionalCachesAsync(CancellationToken ct) => Task.CompletedTask;
public Task DiscoverAsync(IAddressSpaceBuilder builder, CancellationToken ct)
{
var folder = builder.Folder("Tank", "Tank");
var lvl = folder.Variable("Level", "Level", new DriverAttributeInfo(
"Tank.Level", DriverDataType.Float64, false, null, SecurityClassification.FreeAccess, false, IsAlarm: false));
var hiHi = folder.Variable("HiHi", "HiHi", new DriverAttributeInfo(
"Tank.HiHi", DriverDataType.Boolean, false, null, SecurityClassification.FreeAccess, false, IsAlarm: true));
hiHi.MarkAsAlarmCondition(new AlarmConditionInfo("Tank.HiHi", AlarmSeverity.High, "High-high alarm"));
var heater = builder.Folder("Heater", "Heater");
var ot = heater.Variable("OverTemp", "OverTemp", new DriverAttributeInfo(
"Heater.OverTemp", DriverDataType.Boolean, false, null, SecurityClassification.FreeAccess, false, IsAlarm: true));
ot.MarkAsAlarmCondition(new AlarmConditionInfo("Heater.OverTemp", AlarmSeverity.Critical, "Over-temperature"));
return Task.CompletedTask;
}
public void RaiseAlarm(AlarmEventArgs args) => OnAlarmEvent?.Invoke(this, args);
public Task<IAlarmSubscriptionHandle> SubscribeAlarmsAsync(IReadOnlyList<string> _, CancellationToken __)
=> Task.FromResult<IAlarmSubscriptionHandle>(new FakeHandle("sub"));
public Task UnsubscribeAlarmsAsync(IAlarmSubscriptionHandle _, CancellationToken __) => Task.CompletedTask;
public Task AcknowledgeAsync(IReadOnlyList<AlarmAcknowledgeRequest> _, CancellationToken __) => Task.CompletedTask;
}
private sealed class FakeHandle(string diagnosticId) : IAlarmSubscriptionHandle
{
public string DiagnosticId { get; } = diagnosticId;
}
private sealed class RecordingBuilder : IAddressSpaceBuilder
{
public Dictionary<string, RecordingSink> Alarms { get; } = new(StringComparer.OrdinalIgnoreCase);
public IAddressSpaceBuilder Folder(string _, string __) => this;
public IVariableHandle Variable(string _, string __, DriverAttributeInfo info)
=> new Handle(info.FullName, Alarms);
public void AddProperty(string _, DriverDataType __, object? ___) { }
public sealed class Handle(string fullRef, Dictionary<string, RecordingSink> alarms) : IVariableHandle
{
public string FullReference { get; } = fullRef;
public IAlarmConditionSink MarkAsAlarmCondition(AlarmConditionInfo _)
{
var sink = new RecordingSink();
alarms[FullReference] = sink;
return sink;
}
}
public sealed class RecordingSink : IAlarmConditionSink
{
public List<AlarmEventArgs> Received { get; } = new();
public void OnTransition(AlarmEventArgs args) => Received.Add(args);
}
}
}

View File

@@ -0,0 +1,58 @@
using Shouldly;
using Xunit;
using ZB.MOM.WW.OtOpcUa.Core.Abstractions;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.E2E;
[Trait("Category", "ParityE2E")]
[Collection(nameof(ParityCollection))]
public sealed class HierarchyParityTests
{
private readonly ParityFixture _fx;
public HierarchyParityTests(ParityFixture fx) => _fx = fx;
[Fact]
public async Task Discover_returns_at_least_one_gobject_with_attributes()
{
_fx.SkipIfUnavailable();
var builder = new RecordingAddressSpaceBuilder();
await _fx.Driver!.DiscoverAsync(builder, CancellationToken.None);
builder.Folders.Count.ShouldBeGreaterThan(0,
"live Galaxy ZB has at least one deployed gobject");
builder.Variables.Count.ShouldBeGreaterThan(0,
"at least one gobject in the dev Galaxy carries dynamic attributes");
}
[Fact]
public async Task Discover_emits_only_lowercase_browse_paths_for_each_attribute()
{
// OPC UA browse paths are case-sensitive; the v1 server emits Galaxy attribute
// names verbatim (camelCase like "PV.Input.Value"). Parity invariant: every
// emitted variable's full reference contains a '.' separating the gobject
// tag-name from the attribute name (Galaxy reference grammar).
_fx.SkipIfUnavailable();
var builder = new RecordingAddressSpaceBuilder();
await _fx.Driver!.DiscoverAsync(builder, CancellationToken.None);
builder.Variables.ShouldAllBe(v => v.AttributeInfo.FullName.Contains('.'),
"Galaxy MXAccess full references are 'tag.attribute'");
}
[Fact]
public async Task Discover_marks_at_least_one_attribute_as_historized_when_HistoryExtension_present()
{
_fx.SkipIfUnavailable();
var builder = new RecordingAddressSpaceBuilder();
await _fx.Driver!.DiscoverAsync(builder, CancellationToken.None);
// Soft assertion — some Galaxies are configuration-only with no Historian extensions.
// We only check the field flows through correctly when populated.
var historized = builder.Variables.Count(v => v.AttributeInfo.IsHistorized);
// Just assert the count is non-negative — the value itself is data-dependent.
historized.ShouldBeGreaterThanOrEqualTo(0);
}
}

View File

@@ -0,0 +1,136 @@
using System.Diagnostics;
using System.Net.Sockets;
using System.Reflection;
using System.Security.Principal;
using Xunit;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.E2E;
/// <summary>
/// Spawns one <c>OtOpcUa.Driver.Galaxy.Host.exe</c> subprocess per test class and exposes
/// a connected <see cref="GalaxyProxyDriver"/> for the tests. Per Phase 2 plan §"Stream E
/// Parity Validation": the Proxy owns a session against a real out-of-process Host running
/// the production-shape <c>MxAccessGalaxyBackend</c> backed by live ZB + MXAccess COM.
/// Skipped when the Host EXE isn't built, when ZB SQL is unreachable, or when the dev box
/// runs as Administrator (the IPC ACL explicitly denies Administrators per decision #76).
/// </summary>
public sealed class ParityFixture : IAsyncLifetime
{
public GalaxyProxyDriver? Driver { get; private set; }
public string? SkipReason { get; private set; }
private Process? _host;
private const string Secret = "parity-suite-secret";
public async ValueTask InitializeAsync()
{
if (!OperatingSystem.IsWindows()) { SkipReason = "Windows-only"; return; }
if (IsAdministrator()) { SkipReason = "PipeAcl denies Administrators on dev shells"; return; }
if (!await ZbReachableAsync()) { SkipReason = "Galaxy ZB SQL not reachable on localhost:1433"; return; }
var hostExe = FindHostExe();
if (hostExe is null) { SkipReason = "Galaxy.Host EXE not built — run `dotnet build src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host`"; return; }
// Use the SQL-only DB backend by default — exercises the full IPC + dispatcher + SQL
// path without requiring a healthy MXAccess connection. Tests that need MXAccess
// override via env vars before InitializeAsync is called (use a separate fixture).
var pipe = $"OtOpcUaGalaxyParity-{Guid.NewGuid():N}";
using var identity = WindowsIdentity.GetCurrent();
var sid = identity.User!;
var psi = new ProcessStartInfo(hostExe)
{
UseShellExecute = false,
CreateNoWindow = true,
RedirectStandardOutput = true,
RedirectStandardError = true,
EnvironmentVariables =
{
["OTOPCUA_GALAXY_PIPE"] = pipe,
["OTOPCUA_ALLOWED_SID"] = sid.Value,
["OTOPCUA_GALAXY_SECRET"] = Secret,
["OTOPCUA_GALAXY_BACKEND"] = "db",
["OTOPCUA_GALAXY_ZB_CONN"] = "Server=localhost;Database=ZB;Integrated Security=True;TrustServerCertificate=True;Encrypt=False;",
},
};
_host = Process.Start(psi)
?? throw new InvalidOperationException("Failed to spawn Galaxy.Host EXE");
// Give the PipeServer ~2s to bind. The supervisor's HeartbeatMonitor can do this
// in production with retry, but the parity tests are best served by a fixed warm-up.
await Task.Delay(2_000);
Driver = new GalaxyProxyDriver(new GalaxyProxyOptions
{
DriverInstanceId = "parity",
PipeName = pipe,
SharedSecret = Secret,
ConnectTimeout = TimeSpan.FromSeconds(5),
});
await Driver.InitializeAsync(driverConfigJson: "{}", CancellationToken.None);
}
public async ValueTask DisposeAsync()
{
if (Driver is not null)
{
try { await Driver.ShutdownAsync(CancellationToken.None); } catch { /* shutdown */ }
Driver.Dispose();
}
if (_host is not null && !_host.HasExited)
{
try { _host.Kill(entireProcessTree: true); } catch { /* ignore */ }
try { _host.WaitForExit(5_000); } catch { /* ignore */ }
}
_host?.Dispose();
}
/// <summary>Skip the test if the fixture couldn't initialize. xUnit Skip.If pattern.</summary>
public void SkipIfUnavailable()
{
if (SkipReason is not null)
Assert.Skip(SkipReason);
}
private static bool IsAdministrator()
{
if (!OperatingSystem.IsWindows()) return false;
using var identity = WindowsIdentity.GetCurrent();
return new WindowsPrincipal(identity).IsInRole(WindowsBuiltInRole.Administrator);
}
private static async Task<bool> ZbReachableAsync()
{
try
{
using var client = new TcpClient();
var task = client.ConnectAsync("localhost", 1433);
return await Task.WhenAny(task, Task.Delay(1_500)) == task && client.Connected;
}
catch { return false; }
}
private static string? FindHostExe()
{
var asmDir = Path.GetDirectoryName(Assembly.GetExecutingAssembly().Location)!;
var solutionRoot = asmDir;
for (var i = 0; i < 8 && solutionRoot is not null; i++)
{
if (File.Exists(Path.Combine(solutionRoot, "ZB.MOM.WW.OtOpcUa.slnx"))) break;
solutionRoot = Path.GetDirectoryName(solutionRoot);
}
if (solutionRoot is null) return null;
var path = Path.Combine(solutionRoot,
"src", "ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host", "bin", "Debug", "net48",
"OtOpcUa.Driver.Galaxy.Host.exe");
return File.Exists(path) ? path : null;
}
}
[CollectionDefinition(nameof(ParityCollection))]
public sealed class ParityCollection : ICollectionFixture<ParityFixture> { }

View File

@@ -0,0 +1,70 @@
using ZB.MOM.WW.OtOpcUa.Core.Abstractions;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.E2E;
/// <summary>
/// Test-only <see cref="IAddressSpaceBuilder"/> that records every Folder + Variable
/// registration. Mirrors the v1 in-process address-space build so tests can assert on
/// the same shape the legacy <c>LmxNodeManager</c> produced.
/// </summary>
public sealed class RecordingAddressSpaceBuilder : IAddressSpaceBuilder
{
public List<RecordedFolder> Folders { get; } = new();
public List<RecordedVariable> Variables { get; } = new();
public List<RecordedProperty> Properties { get; } = new();
public List<RecordedAlarmCondition> AlarmConditions { get; } = new();
public IAddressSpaceBuilder Folder(string browseName, string displayName)
{
Folders.Add(new RecordedFolder(browseName, displayName));
return this; // single flat builder for tests; nesting irrelevant for parity assertions
}
public IVariableHandle Variable(string browseName, string displayName, DriverAttributeInfo attributeInfo)
{
Variables.Add(new RecordedVariable(browseName, displayName, attributeInfo));
return new RecordedVariableHandle(attributeInfo.FullName, AlarmConditions);
}
public void AddProperty(string browseName, DriverDataType dataType, object? value)
{
Properties.Add(new RecordedProperty(browseName, dataType, value));
}
public sealed record RecordedFolder(string BrowseName, string DisplayName);
public sealed record RecordedVariable(string BrowseName, string DisplayName, DriverAttributeInfo AttributeInfo);
public sealed record RecordedProperty(string BrowseName, DriverDataType DataType, object? Value);
public sealed record RecordedAlarmCondition(string SourceNodeId, AlarmConditionInfo Info);
public sealed record RecordedAlarmTransition(string SourceNodeId, AlarmEventArgs Args);
/// <summary>
/// Sink the tests assert on to verify the alarm event forwarder routed a transition
/// to the correct source-node-id. One entry per <see cref="IAlarmSource.OnAlarmEvent"/>.
/// </summary>
public List<RecordedAlarmTransition> AlarmTransitions { get; } = new();
private sealed class RecordedVariableHandle : IVariableHandle
{
private readonly List<RecordedAlarmCondition> _conditions;
public string FullReference { get; }
public RecordedVariableHandle(string fullReference, List<RecordedAlarmCondition> conditions)
{
FullReference = fullReference;
_conditions = conditions;
}
public IAlarmConditionSink MarkAsAlarmCondition(AlarmConditionInfo info)
{
_conditions.Add(new RecordedAlarmCondition(FullReference, info));
return new RecordingSink(FullReference);
}
private sealed class RecordingSink : IAlarmConditionSink
{
public string SourceNodeId { get; }
public List<AlarmEventArgs> Received { get; } = new();
public RecordingSink(string sourceNodeId) => SourceNodeId = sourceNodeId;
public void OnTransition(AlarmEventArgs args) => Received.Add(args);
}
}
}

View File

@@ -0,0 +1,140 @@
using Shouldly;
using Xunit;
using ZB.MOM.WW.OtOpcUa.Core.Abstractions;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.E2E;
/// <summary>
/// Regression tests for the four 2026-04-13 stability findings (commits <c>c76ab8f</c>,
/// <c>7310925</c>) per Phase 2 plan §"Stream E.3". Each test asserts the v2 topology
/// does not reintroduce the v1 defect.
/// </summary>
[Trait("Category", "ParityE2E")]
[Trait("Subcategory", "StabilityRegression")]
[Collection(nameof(ParityCollection))]
public sealed class StabilityFindingsRegressionTests
{
private readonly ParityFixture _fx;
public StabilityFindingsRegressionTests(ParityFixture fx) => _fx = fx;
/// <summary>
/// Finding #1 — <em>phantom probe subscription flips Tick() to Stopped</em>. When the
/// v1 GalaxyRuntimeProbeManager failed to subscribe a probe, it left a phantom entry
/// that the next Tick() flipped to Stopped, fanning Bad-quality across unrelated
/// subtrees. v2 regression net: a failed subscribe must not affect host status of
/// subscriptions that did succeed.
/// </summary>
[Fact]
public async Task Failed_subscribe_does_not_corrupt_unrelated_host_status()
{
_fx.SkipIfUnavailable();
// GetHostStatuses pre-subscribe — baseline.
var preSubscribe = _fx.Driver!.GetHostStatuses().Count;
// Try to subscribe to a nonsense reference; the Host should reject it without
// poisoning the host-status table.
try
{
await _fx.Driver.SubscribeAsync(
new[] { "nonexistent.tag.does.not.exist[]" },
TimeSpan.FromSeconds(1),
CancellationToken.None);
}
catch { /* expected — bad reference */ }
var postSubscribe = _fx.Driver.GetHostStatuses().Count;
postSubscribe.ShouldBe(preSubscribe,
"failed subscribe must not mutate the host-status snapshot");
}
/// <summary>
/// Finding #2 — <em>cross-host quality clear wipes sibling state during recovery</em>.
/// v1 cleared all subscriptions when ANY host changed state, even healthy peers.
/// v2 regression net: host-status events must be scoped to the affected host name.
/// </summary>
[Fact]
public void Host_status_change_event_carries_specific_host_name_not_global_clear()
{
_fx.SkipIfUnavailable();
var changes = new List<HostStatusChangedEventArgs>();
EventHandler<HostStatusChangedEventArgs> handler = (_, e) => changes.Add(e);
_fx.Driver!.OnHostStatusChanged += handler;
try
{
// We can't deterministically force a Host status transition in the suite without
// tearing down the COM connection. The structural assertion is sufficient: the
// event TYPE carries a specific HostName, OldState, NewState — there is no
// "global clear" payload. v1's bug was structural; v2's event signature
// mathematically prevents reintroduction.
typeof(HostStatusChangedEventArgs).GetProperty("HostName")
.ShouldNotBeNull("event signature must scope to a specific host");
typeof(HostStatusChangedEventArgs).GetProperty("OldState")
.ShouldNotBeNull();
typeof(HostStatusChangedEventArgs).GetProperty("NewState")
.ShouldNotBeNull();
}
finally
{
_fx.Driver.OnHostStatusChanged -= handler;
}
}
/// <summary>
/// Finding #3 — <em>sync-over-async on the OPC UA stack thread</em>. v1 had spots
/// that called <c>.Result</c> / <c>.Wait()</c> from the OPC UA stack callback,
/// deadlocking under load. v2 regression net: every <see cref="GalaxyProxyDriver"/>
/// capability method is async-all-the-way; a reflection scan asserts no
/// <c>.GetAwaiter().GetResult()</c> appears in IL of the public surface.
/// Implemented as a structural shape assertion — every public method returning
/// <see cref="Task"/> or <see cref="Task{TResult}"/>.
/// </summary>
[Fact]
public void All_GalaxyProxyDriver_capability_methods_return_Task_for_async_correctness()
{
_fx.SkipIfUnavailable();
var driverType = typeof(Proxy.GalaxyProxyDriver);
var capabilityMethods = driverType.GetMethods(System.Reflection.BindingFlags.Public | System.Reflection.BindingFlags.Instance)
.Where(m => m.DeclaringType == driverType
&& !m.IsSpecialName
&& m.Name is "InitializeAsync" or "ReinitializeAsync" or "ShutdownAsync"
or "FlushOptionalCachesAsync" or "DiscoverAsync"
or "ReadAsync" or "WriteAsync"
or "SubscribeAsync" or "UnsubscribeAsync"
or "SubscribeAlarmsAsync" or "UnsubscribeAlarmsAsync" or "AcknowledgeAsync"
or "ReadRawAsync" or "ReadProcessedAsync");
foreach (var m in capabilityMethods)
{
(m.ReturnType == typeof(Task) || m.ReturnType.IsGenericType && m.ReturnType.GetGenericTypeDefinition() == typeof(Task<>))
.ShouldBeTrue($"{m.Name} must return Task or Task<T> — sync-over-async risks deadlock under load");
}
}
/// <summary>
/// Finding #4 — <em>fire-and-forget alarm tasks racing shutdown</em>. v1 fired
/// <c>Task.Run(() => raiseAlarm)</c> without awaiting, so shutdown could complete
/// while the task was still touching disposed state. v2 regression net: alarm
/// acknowledgement is sequential and awaited — verified by the integration test
/// <c>AcknowledgeAsync</c> returning a completed Task that doesn't leave background
/// work.
/// </summary>
[Fact]
public async Task AcknowledgeAsync_completes_before_returning_no_background_tasks()
{
_fx.SkipIfUnavailable();
// We can't easily acknowledge a real Galaxy alarm in this fixture, but we can
// assert the call shape: a synchronous-from-the-caller-perspective await without
// throwing or leaving a pending continuation.
await _fx.Driver!.AcknowledgeAsync(
new[] { new AlarmAcknowledgeRequest("nonexistent-source", "nonexistent-event", "test ack") },
CancellationToken.None);
// If we got here, the call awaited cleanly — no fire-and-forget background work
// left running after the caller returned.
true.ShouldBeTrue();
}
}

View File

@@ -0,0 +1,36 @@
<Project Sdk="Microsoft.NET.Sdk">
<PropertyGroup>
<TargetFramework>net10.0</TargetFramework>
<Nullable>enable</Nullable>
<ImplicitUsings>enable</ImplicitUsings>
<IsPackable>false</IsPackable>
<IsTestProject>true</IsTestProject>
<RootNamespace>ZB.MOM.WW.OtOpcUa.Driver.Galaxy.E2E</RootNamespace>
</PropertyGroup>
<ItemGroup>
<PackageReference Include="xunit.v3" Version="1.1.0"/>
<PackageReference Include="Shouldly" Version="4.3.0"/>
<PackageReference Include="Microsoft.NET.Test.Sdk" Version="17.12.0"/>
<PackageReference Include="xunit.runner.visualstudio" Version="3.0.2">
<PrivateAssets>all</PrivateAssets>
<IncludeAssets>runtime; build; native; contentfiles; analyzers; buildtransitive</IncludeAssets>
</PackageReference>
</ItemGroup>
<ItemGroup>
<ProjectReference Include="..\..\src\ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy\ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy.csproj"/>
<!--
We DO NOT reference Galaxy.Host (net48 x86) here. The Host runs as a subprocess —
this project only needs to spawn the EXE and talk to it via named pipes through
the Proxy. Cross-FX type loading is what bit the earlier in-process attempt.
-->
</ItemGroup>
<ItemGroup>
<NuGetAuditSuppress Include="https://github.com/advisories/GHSA-37gx-xxp4-5rgx"/>
<NuGetAuditSuppress Include="https://github.com/advisories/GHSA-w3x6-4m5h-cxqf"/>
</ItemGroup>
</Project>

View File

@@ -0,0 +1,84 @@
using System;
using MessagePack;
using Shouldly;
using Xunit;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared.Contracts;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Tests;
[Trait("Category", "Unit")]
public sealed class AlarmDiscoveryTests
{
/// <summary>
/// PR 9 — IsAlarm must survive the MessagePack round-trip at Key=6 position.
/// Regression guard: any reorder of keys in GalaxyAttributeInfo would silently corrupt
/// the flag in the wire payload since MessagePack encodes by key number, not field name.
/// </summary>
[Fact]
public void GalaxyAttributeInfo_IsAlarm_round_trips_true_through_MessagePack()
{
var input = new GalaxyAttributeInfo
{
AttributeName = "TankLevel",
MxDataType = 2,
IsArray = false,
ArrayDim = null,
SecurityClassification = 1,
IsHistorized = true,
IsAlarm = true,
};
var bytes = MessagePackSerializer.Serialize(input);
var decoded = MessagePackSerializer.Deserialize<GalaxyAttributeInfo>(bytes);
decoded.IsAlarm.ShouldBeTrue();
decoded.IsHistorized.ShouldBeTrue();
decoded.AttributeName.ShouldBe("TankLevel");
}
[Fact]
public void GalaxyAttributeInfo_IsAlarm_round_trips_false_through_MessagePack()
{
var input = new GalaxyAttributeInfo { AttributeName = "ColorRgb", IsAlarm = false };
var bytes = MessagePackSerializer.Serialize(input);
var decoded = MessagePackSerializer.Deserialize<GalaxyAttributeInfo>(bytes);
decoded.IsAlarm.ShouldBeFalse();
}
/// <summary>
/// Wire-compat guard: payloads serialized before PR 9 (which omit Key=6) must still
/// deserialize cleanly — MessagePack treats missing keys as default. This lets a newer
/// Proxy talk to an older Host during a rolling upgrade without a crash.
/// </summary>
[Fact]
public void Pre_PR9_payload_without_IsAlarm_key_deserializes_with_default_false()
{
// Build a 6-field payload (keys 0..5) matching the pre-PR9 shape by serializing a
// stand-in class with the same key layout but no Key=6.
var pre = new PrePR9Shape
{
AttributeName = "Legacy",
MxDataType = 1,
IsArray = false,
ArrayDim = null,
SecurityClassification = 0,
IsHistorized = false,
};
var bytes = MessagePackSerializer.Serialize(pre);
var decoded = MessagePackSerializer.Deserialize<GalaxyAttributeInfo>(bytes);
decoded.AttributeName.ShouldBe("Legacy");
decoded.IsAlarm.ShouldBeFalse();
}
[MessagePackObject]
public sealed class PrePR9Shape
{
[Key(0)] public string AttributeName { get; set; } = string.Empty;
[Key(1)] public int MxDataType { get; set; }
[Key(2)] public bool IsArray { get; set; }
[Key(3)] public uint? ArrayDim { get; set; }
[Key(4)] public int SecurityClassification { get; set; }
[Key(5)] public bool IsHistorized { get; set; }
}
}

View File

@@ -0,0 +1,190 @@
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;
using Shouldly;
using Xunit;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.Alarms;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.MxAccess;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Tests;
[Trait("Category", "Unit")]
public sealed class GalaxyAlarmTrackerTests
{
private sealed class FakeSubscriber
{
public readonly ConcurrentDictionary<string, Action<string, Vtq>> Subs = new();
public readonly ConcurrentQueue<string> Unsubs = new();
public readonly ConcurrentQueue<(string Tag, object Value)> Writes = new();
public bool WriteReturns { get; set; } = true;
public Task Subscribe(string tag, Action<string, Vtq> cb)
{
Subs[tag] = cb;
return Task.CompletedTask;
}
public Task Unsubscribe(string tag)
{
Unsubs.Enqueue(tag);
Subs.TryRemove(tag, out _);
return Task.CompletedTask;
}
public Task<bool> Write(string tag, object value)
{
Writes.Enqueue((tag, value));
return Task.FromResult(WriteReturns);
}
}
private static Vtq Bool(bool v) => new(v, DateTime.UtcNow, 192);
private static Vtq Int(int v) => new(v, DateTime.UtcNow, 192);
private static Vtq Str(string v) => new(v, DateTime.UtcNow, 192);
[Fact]
public async Task Track_subscribes_to_four_alarm_attributes()
{
var fake = new FakeSubscriber();
using var t = new GalaxyAlarmTracker(fake.Subscribe, fake.Unsubscribe, fake.Write);
await t.TrackAsync("Tank.Level.HiHi");
fake.Subs.ShouldContainKey("Tank.Level.HiHi.InAlarm");
fake.Subs.ShouldContainKey("Tank.Level.HiHi.Priority");
fake.Subs.ShouldContainKey("Tank.Level.HiHi.DescAttrName");
fake.Subs.ShouldContainKey("Tank.Level.HiHi.Acked");
t.TrackedAlarmCount.ShouldBe(1);
}
[Fact]
public async Task Track_is_idempotent_on_repeat_call()
{
var fake = new FakeSubscriber();
using var t = new GalaxyAlarmTracker(fake.Subscribe, fake.Unsubscribe, fake.Write);
await t.TrackAsync("Alarm.A");
await t.TrackAsync("Alarm.A");
t.TrackedAlarmCount.ShouldBe(1);
fake.Subs.Count.ShouldBe(4); // 4 sub calls, not 8
}
[Fact]
public async Task InAlarm_false_to_true_fires_Active_transition()
{
var fake = new FakeSubscriber();
using var t = new GalaxyAlarmTracker(fake.Subscribe, fake.Unsubscribe, fake.Write);
var transitions = new ConcurrentQueue<AlarmTransition>();
t.TransitionRaised += (_, tr) => transitions.Enqueue(tr);
await t.TrackAsync("Alarm.A");
fake.Subs["Alarm.A.Priority"]("Alarm.A.Priority", Int(500));
fake.Subs["Alarm.A.DescAttrName"]("Alarm.A.DescAttrName", Str("TankLevelHiHi"));
fake.Subs["Alarm.A.InAlarm"]("Alarm.A.InAlarm", Bool(true));
transitions.Count.ShouldBe(1);
transitions.TryDequeue(out var tr).ShouldBeTrue();
tr!.Transition.ShouldBe(AlarmStateTransition.Active);
tr.Priority.ShouldBe(500);
tr.DescAttrName.ShouldBe("TankLevelHiHi");
}
[Fact]
public async Task InAlarm_true_to_false_fires_Inactive_transition()
{
var fake = new FakeSubscriber();
using var t = new GalaxyAlarmTracker(fake.Subscribe, fake.Unsubscribe, fake.Write);
var transitions = new ConcurrentQueue<AlarmTransition>();
t.TransitionRaised += (_, tr) => transitions.Enqueue(tr);
await t.TrackAsync("Alarm.A");
fake.Subs["Alarm.A.InAlarm"]("Alarm.A.InAlarm", Bool(true));
fake.Subs["Alarm.A.InAlarm"]("Alarm.A.InAlarm", Bool(false));
transitions.Count.ShouldBe(2);
transitions.TryDequeue(out _);
transitions.TryDequeue(out var tr).ShouldBeTrue();
tr!.Transition.ShouldBe(AlarmStateTransition.Inactive);
}
[Fact]
public async Task Acked_false_to_true_fires_Acknowledged_while_InAlarm_is_true()
{
var fake = new FakeSubscriber();
using var t = new GalaxyAlarmTracker(fake.Subscribe, fake.Unsubscribe, fake.Write);
var transitions = new ConcurrentQueue<AlarmTransition>();
t.TransitionRaised += (_, tr) => transitions.Enqueue(tr);
await t.TrackAsync("Alarm.A");
fake.Subs["Alarm.A.InAlarm"]("Alarm.A.InAlarm", Bool(true)); // Active, clears Acked flag
fake.Subs["Alarm.A.Acked"]("Alarm.A.Acked", Bool(true)); // Acknowledged
transitions.Count.ShouldBe(2);
transitions.TryDequeue(out _);
transitions.TryDequeue(out var tr).ShouldBeTrue();
tr!.Transition.ShouldBe(AlarmStateTransition.Acknowledged);
}
[Fact]
public async Task Acked_transition_while_InAlarm_is_false_does_not_fire()
{
var fake = new FakeSubscriber();
using var t = new GalaxyAlarmTracker(fake.Subscribe, fake.Unsubscribe, fake.Write);
var transitions = new ConcurrentQueue<AlarmTransition>();
t.TransitionRaised += (_, tr) => transitions.Enqueue(tr);
await t.TrackAsync("Alarm.A");
// Initial Acked=true on subscribe (alarm is at rest, pre-ack'd) — should not fire.
fake.Subs["Alarm.A.Acked"]("Alarm.A.Acked", Bool(true));
transitions.Count.ShouldBe(0);
}
[Fact]
public async Task Acknowledge_writes_AckMsg_with_comment()
{
var fake = new FakeSubscriber();
using var t = new GalaxyAlarmTracker(fake.Subscribe, fake.Unsubscribe, fake.Write);
await t.TrackAsync("Alarm.A");
var ok = await t.AcknowledgeAsync("Alarm.A", "acknowledged by operator");
ok.ShouldBeTrue();
fake.Writes.Count.ShouldBe(1);
fake.Writes.TryDequeue(out var w).ShouldBeTrue();
w.Tag.ShouldBe("Alarm.A.AckMsg");
w.Value.ShouldBe("acknowledged by operator");
}
[Fact]
public async Task Snapshot_reports_latest_fields()
{
var fake = new FakeSubscriber();
using var t = new GalaxyAlarmTracker(fake.Subscribe, fake.Unsubscribe, fake.Write);
await t.TrackAsync("Alarm.A");
fake.Subs["Alarm.A.InAlarm"]("Alarm.A.InAlarm", Bool(true));
fake.Subs["Alarm.A.Priority"]("Alarm.A.Priority", Int(900));
fake.Subs["Alarm.A.DescAttrName"]("Alarm.A.DescAttrName", Str("MyAlarm"));
fake.Subs["Alarm.A.Acked"]("Alarm.A.Acked", Bool(true));
var snap = t.SnapshotStates();
snap.Count.ShouldBe(1);
snap[0].InAlarm.ShouldBeTrue();
snap[0].Acked.ShouldBeTrue();
snap[0].Priority.ShouldBe(900);
snap[0].DescAttrName.ShouldBe("MyAlarm");
}
[Fact]
public async Task Foreign_probe_callback_is_dropped()
{
var fake = new FakeSubscriber();
using var t = new GalaxyAlarmTracker(fake.Subscribe, fake.Unsubscribe, fake.Write);
var transitions = new ConcurrentQueue<AlarmTransition>();
t.TransitionRaised += (_, tr) => transitions.Enqueue(tr);
// No TrackAsync was called — this callback is foreign and should be silently ignored.
t.OnProbeCallback("Unknown.InAlarm", Bool(true));
transitions.Count.ShouldBe(0);
}
}

View File

@@ -0,0 +1,231 @@
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using Shouldly;
using Xunit;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.MxAccess;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.Stability;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Tests;
[Trait("Category", "Unit")]
public sealed class GalaxyRuntimeProbeManagerTests
{
private sealed class FakeSubscriber
{
public readonly ConcurrentDictionary<string, Action<string, Vtq>> Subs = new();
public readonly ConcurrentQueue<string> UnsubCalls = new();
public bool FailSubscribeFor { get; set; }
public string? FailSubscribeTag { get; set; }
public Task Subscribe(string probe, Action<string, Vtq> cb)
{
if (FailSubscribeFor && string.Equals(probe, FailSubscribeTag, StringComparison.OrdinalIgnoreCase))
throw new InvalidOperationException("subscribe refused");
Subs[probe] = cb;
return Task.CompletedTask;
}
public Task Unsubscribe(string probe)
{
UnsubCalls.Enqueue(probe);
Subs.TryRemove(probe, out _);
return Task.CompletedTask;
}
}
private static Vtq Good(bool scanState) => new(scanState, DateTime.UtcNow, 192);
private static Vtq Bad() => new(null, DateTime.UtcNow, 0);
[Fact]
public async Task Sync_subscribes_to_ScanState_per_host()
{
var subs = new FakeSubscriber();
using var mgr = new GalaxyRuntimeProbeManager(subs.Subscribe, subs.Unsubscribe);
await mgr.SyncAsync(new[]
{
new HostProbeTarget("PlatformA", GalaxyRuntimeProbeManager.CategoryWinPlatform),
new HostProbeTarget("EngineB", GalaxyRuntimeProbeManager.CategoryAppEngine),
});
mgr.ActiveProbeCount.ShouldBe(2);
subs.Subs.ShouldContainKey("PlatformA.ScanState");
subs.Subs.ShouldContainKey("EngineB.ScanState");
}
[Fact]
public async Task Sync_is_idempotent_on_repeat_call_with_same_set()
{
var subs = new FakeSubscriber();
using var mgr = new GalaxyRuntimeProbeManager(subs.Subscribe, subs.Unsubscribe);
var targets = new[] { new HostProbeTarget("PlatformA", 1) };
await mgr.SyncAsync(targets);
await mgr.SyncAsync(targets);
mgr.ActiveProbeCount.ShouldBe(1);
subs.Subs.Count.ShouldBe(1);
subs.UnsubCalls.Count.ShouldBe(0);
}
[Fact]
public async Task Sync_unadvises_removed_hosts()
{
var subs = new FakeSubscriber();
using var mgr = new GalaxyRuntimeProbeManager(subs.Subscribe, subs.Unsubscribe);
await mgr.SyncAsync(new[]
{
new HostProbeTarget("PlatformA", 1),
new HostProbeTarget("PlatformB", 1),
});
await mgr.SyncAsync(new[] { new HostProbeTarget("PlatformA", 1) });
mgr.ActiveProbeCount.ShouldBe(1);
subs.UnsubCalls.ShouldContain("PlatformB.ScanState");
}
[Fact]
public async Task Subscribe_failure_rolls_back_host_entry_so_later_transitions_do_not_fire_stale_events()
{
var subs = new FakeSubscriber { FailSubscribeFor = true, FailSubscribeTag = "PlatformA.ScanState" };
using var mgr = new GalaxyRuntimeProbeManager(subs.Subscribe, subs.Unsubscribe);
await mgr.SyncAsync(new[] { new HostProbeTarget("PlatformA", 1) });
mgr.ActiveProbeCount.ShouldBe(0); // rolled back
mgr.GetState("PlatformA").ShouldBe(HostRuntimeState.Unknown);
}
[Fact]
public async Task Unknown_to_Running_does_not_fire_StateChanged()
{
var now = new DateTime(2026, 4, 18, 10, 0, 0, DateTimeKind.Utc);
var subs = new FakeSubscriber();
using var mgr = new GalaxyRuntimeProbeManager(subs.Subscribe, subs.Unsubscribe, () => now);
var transitions = new ConcurrentQueue<HostStateTransition>();
mgr.StateChanged += (_, t) => transitions.Enqueue(t);
await mgr.SyncAsync(new[] { new HostProbeTarget("PlatformA", 1) });
subs.Subs["PlatformA.ScanState"]("PlatformA.ScanState", Good(true));
mgr.GetState("PlatformA").ShouldBe(HostRuntimeState.Running);
transitions.Count.ShouldBe(0); // startup transition, operators don't care
}
[Fact]
public async Task Running_to_Stopped_fires_StateChanged_with_both_states()
{
var now = new DateTime(2026, 4, 18, 10, 0, 0, DateTimeKind.Utc);
var subs = new FakeSubscriber();
using var mgr = new GalaxyRuntimeProbeManager(subs.Subscribe, subs.Unsubscribe, () => now);
var transitions = new ConcurrentQueue<HostStateTransition>();
mgr.StateChanged += (_, t) => transitions.Enqueue(t);
await mgr.SyncAsync(new[] { new HostProbeTarget("PlatformA", 1) });
subs.Subs["PlatformA.ScanState"]("PlatformA.ScanState", Good(true)); // Unknown→Running (silent)
subs.Subs["PlatformA.ScanState"]("PlatformA.ScanState", Good(false)); // Running→Stopped (fires)
transitions.Count.ShouldBe(1);
transitions.TryDequeue(out var t).ShouldBeTrue();
t!.TagName.ShouldBe("PlatformA");
t.OldState.ShouldBe(HostRuntimeState.Running);
t.NewState.ShouldBe(HostRuntimeState.Stopped);
}
[Fact]
public async Task Stopped_to_Running_fires_StateChanged_for_recovery()
{
var now = new DateTime(2026, 4, 18, 10, 0, 0, DateTimeKind.Utc);
var subs = new FakeSubscriber();
using var mgr = new GalaxyRuntimeProbeManager(subs.Subscribe, subs.Unsubscribe, () => now);
var transitions = new ConcurrentQueue<HostStateTransition>();
mgr.StateChanged += (_, t) => transitions.Enqueue(t);
await mgr.SyncAsync(new[] { new HostProbeTarget("PlatformA", 1) });
subs.Subs["PlatformA.ScanState"]("PlatformA.ScanState", Good(true)); // Unknown→Running (silent)
subs.Subs["PlatformA.ScanState"]("PlatformA.ScanState", Good(false)); // Running→Stopped (fires)
subs.Subs["PlatformA.ScanState"]("PlatformA.ScanState", Good(true)); // Stopped→Running (fires)
transitions.Count.ShouldBe(2);
}
[Fact]
public async Task Unknown_to_Stopped_fires_StateChanged_for_first_known_bad_signal()
{
var now = new DateTime(2026, 4, 18, 10, 0, 0, DateTimeKind.Utc);
var subs = new FakeSubscriber();
using var mgr = new GalaxyRuntimeProbeManager(subs.Subscribe, subs.Unsubscribe, () => now);
var transitions = new ConcurrentQueue<HostStateTransition>();
mgr.StateChanged += (_, t) => transitions.Enqueue(t);
await mgr.SyncAsync(new[] { new HostProbeTarget("PlatformA", 1) });
// First callback is bad-quality — we must flag the host Stopped so operators see it.
subs.Subs["PlatformA.ScanState"]("PlatformA.ScanState", Bad());
transitions.Count.ShouldBe(1);
transitions.TryDequeue(out var t).ShouldBeTrue();
t!.OldState.ShouldBe(HostRuntimeState.Unknown);
t.NewState.ShouldBe(HostRuntimeState.Stopped);
}
[Fact]
public async Task Repeated_Good_Running_callbacks_do_not_fire_duplicate_events()
{
var now = new DateTime(2026, 4, 18, 10, 0, 0, DateTimeKind.Utc);
var subs = new FakeSubscriber();
using var mgr = new GalaxyRuntimeProbeManager(subs.Subscribe, subs.Unsubscribe, () => now);
var count = 0;
mgr.StateChanged += (_, _) => Interlocked.Increment(ref count);
await mgr.SyncAsync(new[] { new HostProbeTarget("PlatformA", 1) });
for (var i = 0; i < 5; i++)
subs.Subs["PlatformA.ScanState"]("PlatformA.ScanState", Good(true));
count.ShouldBe(0); // only the silent Unknown→Running on the first, no events after
}
[Fact]
public async Task Unknown_callback_for_non_tracked_probe_is_dropped()
{
var subs = new FakeSubscriber();
using var mgr = new GalaxyRuntimeProbeManager(subs.Subscribe, subs.Unsubscribe);
mgr.OnProbeCallback("ProbeForSomeoneElse.ScanState", Good(true));
mgr.ActiveProbeCount.ShouldBe(0);
}
[Fact]
public async Task Snapshot_reports_current_state_for_every_tracked_host()
{
var now = new DateTime(2026, 4, 18, 10, 0, 0, DateTimeKind.Utc);
var subs = new FakeSubscriber();
using var mgr = new GalaxyRuntimeProbeManager(subs.Subscribe, subs.Unsubscribe, () => now);
await mgr.SyncAsync(new[]
{
new HostProbeTarget("PlatformA", 1),
new HostProbeTarget("EngineB", 3),
});
subs.Subs["PlatformA.ScanState"]("PlatformA.ScanState", Good(true)); // Running
subs.Subs["EngineB.ScanState"]("EngineB.ScanState", Bad()); // Stopped
var snap = mgr.SnapshotStates();
snap.Count.ShouldBe(2);
snap.ShouldContain(s => s.TagName == "PlatformA" && s.State == HostRuntimeState.Running);
snap.ShouldContain(s => s.TagName == "EngineB" && s.State == HostRuntimeState.Stopped);
}
[Fact]
public void IsRuntimeHost_recognizes_WinPlatform_and_AppEngine_category_ids()
{
new HostProbeTarget("X", GalaxyRuntimeProbeManager.CategoryWinPlatform).IsRuntimeHost.ShouldBeTrue();
new HostProbeTarget("X", GalaxyRuntimeProbeManager.CategoryAppEngine).IsRuntimeHost.ShouldBeTrue();
new HostProbeTarget("X", 4 /* $Area */).IsRuntimeHost.ShouldBeFalse();
new HostProbeTarget("X", 11 /* $ApplicationObject */).IsRuntimeHost.ShouldBeFalse();
}
}

View File

@@ -0,0 +1,94 @@
using System;
using System.Linq;
using Shouldly;
using Xunit;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.Historian;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Tests
{
[Trait("Category", "Unit")]
public sealed class HistorianClusterEndpointPickerTests
{
private static HistorianConfiguration Config(params string[] nodes) => new()
{
ServerName = "ignored",
ServerNames = nodes.ToList(),
FailureCooldownSeconds = 60,
};
[Fact]
public void Single_node_config_falls_back_to_ServerName_when_ServerNames_empty()
{
var cfg = new HistorianConfiguration { ServerName = "only-node", ServerNames = new() };
var p = new HistorianClusterEndpointPicker(cfg);
p.NodeCount.ShouldBe(1);
p.GetHealthyNodes().ShouldBe(new[] { "only-node" });
}
[Fact]
public void Failed_node_enters_cooldown_and_is_skipped()
{
var now = new DateTime(2026, 4, 18, 10, 0, 0, DateTimeKind.Utc);
var p = new HistorianClusterEndpointPicker(Config("a", "b"), () => now);
p.MarkFailed("a", "boom");
p.GetHealthyNodes().ShouldBe(new[] { "b" });
}
[Fact]
public void Cooldown_expires_after_configured_window()
{
var clock = new DateTime(2026, 4, 18, 10, 0, 0, DateTimeKind.Utc);
var p = new HistorianClusterEndpointPicker(Config("a", "b"), () => clock);
p.MarkFailed("a", "boom");
p.GetHealthyNodes().ShouldBe(new[] { "b" });
clock = clock.AddSeconds(61);
p.GetHealthyNodes().ShouldBe(new[] { "a", "b" });
}
[Fact]
public void MarkHealthy_immediately_clears_cooldown()
{
var now = new DateTime(2026, 4, 18, 10, 0, 0, DateTimeKind.Utc);
var p = new HistorianClusterEndpointPicker(Config("a"), () => now);
p.MarkFailed("a", "boom");
p.GetHealthyNodes().ShouldBeEmpty();
p.MarkHealthy("a");
p.GetHealthyNodes().ShouldBe(new[] { "a" });
}
[Fact]
public void All_nodes_in_cooldown_returns_empty_healthy_list()
{
var now = new DateTime(2026, 4, 18, 10, 0, 0, DateTimeKind.Utc);
var p = new HistorianClusterEndpointPicker(Config("a", "b"), () => now);
p.MarkFailed("a", "x");
p.MarkFailed("b", "y");
p.GetHealthyNodes().ShouldBeEmpty();
p.NodeCount.ShouldBe(2);
}
[Fact]
public void Snapshot_reports_failure_count_and_last_error()
{
var now = new DateTime(2026, 4, 18, 10, 0, 0, DateTimeKind.Utc);
var p = new HistorianClusterEndpointPicker(Config("a"), () => now);
p.MarkFailed("a", "first");
p.MarkFailed("a", "second");
var snap = p.SnapshotNodeStates().Single();
snap.FailureCount.ShouldBe(2);
snap.LastError.ShouldBe("second");
snap.IsHealthy.ShouldBeFalse();
snap.CooldownUntil.ShouldNotBeNull();
}
[Fact]
public void Duplicate_hostnames_are_deduplicated_case_insensitively()
{
var p = new HistorianClusterEndpointPicker(Config("NodeA", "nodea", "NodeB"));
p.NodeCount.ShouldBe(2);
}
}
}

View File

@@ -0,0 +1,61 @@
using Shouldly;
using Xunit;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.Historian;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Tests;
[Trait("Category", "Unit")]
public sealed class HistorianQualityMapperTests
{
/// <summary>
/// Rich mapping preserves specific OPC DA subcodes through the historian ToWire path.
/// Before PR 12 the category-only fallback collapsed e.g. BadNotConnected(8) to
/// Bad(0x80000000) so downstream OPC UA clients could not distinguish transport issues
/// from sensor issues. After PR 12 every known subcode round-trips to its canonical
/// uint32 StatusCode and Proxy translation stays byte-for-byte with v1 QualityMapper.
/// </summary>
[Theory]
[InlineData((byte)192, 0x00000000u)] // Good
[InlineData((byte)216, 0x00D80000u)] // Good_LocalOverride
[InlineData((byte)64, 0x40000000u)] // Uncertain
[InlineData((byte)68, 0x40900000u)] // Uncertain_LastUsableValue
[InlineData((byte)80, 0x40930000u)] // Uncertain_SensorNotAccurate
[InlineData((byte)84, 0x40940000u)] // Uncertain_EngineeringUnitsExceeded
[InlineData((byte)88, 0x40950000u)] // Uncertain_SubNormal
[InlineData((byte)0, 0x80000000u)] // Bad
[InlineData((byte)4, 0x80890000u)] // Bad_ConfigurationError
[InlineData((byte)8, 0x808A0000u)] // Bad_NotConnected
[InlineData((byte)12, 0x808B0000u)] // Bad_DeviceFailure
[InlineData((byte)16, 0x808C0000u)] // Bad_SensorFailure
[InlineData((byte)20, 0x80050000u)] // Bad_CommunicationError
[InlineData((byte)24, 0x808D0000u)] // Bad_OutOfService
[InlineData((byte)32, 0x80320000u)] // Bad_WaitingForInitialData
public void Maps_specific_OPC_DA_codes_to_canonical_StatusCode(byte quality, uint expected)
{
HistorianQualityMapper.Map(quality).ShouldBe(expected);
}
[Theory]
[InlineData((byte)200)] // Good — unknown subcode in Good family
[InlineData((byte)255)] // Good — unknown
public void Unknown_good_family_codes_fall_back_to_plain_Good(byte q)
{
HistorianQualityMapper.Map(q).ShouldBe(0x00000000u);
}
[Theory]
[InlineData((byte)100)] // Uncertain — unknown subcode
[InlineData((byte)150)] // Uncertain — unknown
public void Unknown_uncertain_family_codes_fall_back_to_plain_Uncertain(byte q)
{
HistorianQualityMapper.Map(q).ShouldBe(0x40000000u);
}
[Theory]
[InlineData((byte)1)] // Bad — unknown subcode
[InlineData((byte)50)] // Bad — unknown
public void Unknown_bad_family_codes_fall_back_to_plain_Bad(byte q)
{
HistorianQualityMapper.Map(q).ShouldBe(0x80000000u);
}
}

View File

@@ -0,0 +1,109 @@
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using Shouldly;
using Xunit;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.Galaxy;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.Historian;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.MxAccess;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Sta;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared.Contracts;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Tests
{
[Trait("Category", "Unit")]
public sealed class HistorianWiringTests
{
/// <summary>
/// When the Proxy sends a HistoryRead but the supervisor never enabled the historian
/// (OTOPCUA_HISTORIAN_ENABLED unset), we expect a clean Success=false with a
/// self-explanatory error — not an exception or a hang against localhost.
/// </summary>
[Fact]
public async Task HistoryReadAsync_returns_disabled_error_when_no_historian_configured()
{
using var pump = new StaPump("Test.Sta");
await pump.WaitForStartedAsync();
var mx = new MxAccessClient(pump, new MxProxyAdapter(), "HistorianWiringTests");
using var backend = new MxAccessGalaxyBackend(
new GalaxyRepository(new GalaxyRepositoryOptions { ConnectionString = "Server=.;Database=ZB;Integrated Security=True;" }),
mx,
historian: null);
var resp = await backend.HistoryReadAsync(new HistoryReadRequest
{
TagReferences = new[] { "TestTag" },
StartUtcUnixMs = 0,
EndUtcUnixMs = DateTimeOffset.UtcNow.ToUnixTimeMilliseconds(),
MaxValuesPerTag = 100,
}, CancellationToken.None);
resp.Success.ShouldBeFalse();
resp.Error.ShouldContain("Historian disabled");
resp.Tags.ShouldBeEmpty();
}
/// <summary>
/// When the historian is wired up, we expect the backend to call through and map
/// samples onto the IPC wire shape. Uses a fake <see cref="IHistorianDataSource"/>
/// that returns a single known-good sample so we can assert the mapping stays sane.
/// </summary>
[Fact]
public async Task HistoryReadAsync_maps_sample_to_GalaxyDataValue()
{
using var pump = new StaPump("Test.Sta");
await pump.WaitForStartedAsync();
var mx = new MxAccessClient(pump, new MxProxyAdapter(), "HistorianWiringTests");
var fake = new FakeHistorianDataSource(new HistorianSample
{
Value = 42.5,
Quality = 192, // Good
TimestampUtc = new DateTime(2026, 4, 18, 9, 0, 0, DateTimeKind.Utc),
});
using var backend = new MxAccessGalaxyBackend(
new GalaxyRepository(new GalaxyRepositoryOptions { ConnectionString = "Server=.;Database=ZB;Integrated Security=True;" }),
mx,
fake);
var resp = await backend.HistoryReadAsync(new HistoryReadRequest
{
TagReferences = new[] { "TankLevel" },
StartUtcUnixMs = 0,
EndUtcUnixMs = DateTimeOffset.UtcNow.ToUnixTimeMilliseconds(),
MaxValuesPerTag = 100,
}, CancellationToken.None);
resp.Success.ShouldBeTrue();
resp.Tags.Length.ShouldBe(1);
resp.Tags[0].TagReference.ShouldBe("TankLevel");
resp.Tags[0].Values.Length.ShouldBe(1);
resp.Tags[0].Values[0].StatusCode.ShouldBe(0u); // Good
resp.Tags[0].Values[0].ValueBytes.ShouldNotBeNull();
resp.Tags[0].Values[0].SourceTimestampUtcUnixMs.ShouldBe(
new DateTimeOffset(2026, 4, 18, 9, 0, 0, TimeSpan.Zero).ToUnixTimeMilliseconds());
}
private sealed class FakeHistorianDataSource : IHistorianDataSource
{
private readonly HistorianSample _sample;
public FakeHistorianDataSource(HistorianSample sample) => _sample = sample;
public Task<List<HistorianSample>> ReadRawAsync(string tagName, DateTime s, DateTime e, int max, CancellationToken ct)
=> Task.FromResult(new List<HistorianSample> { _sample });
public Task<List<HistorianAggregateSample>> ReadAggregateAsync(string tagName, DateTime s, DateTime e, double ms, string col, CancellationToken ct)
=> Task.FromResult(new List<HistorianAggregateSample>());
public Task<List<HistorianSample>> ReadAtTimeAsync(string tagName, DateTime[] ts, CancellationToken ct)
=> Task.FromResult(new List<HistorianSample>());
public Task<List<HistorianEventDto>> ReadEventsAsync(string? src, DateTime s, DateTime e, int max, CancellationToken ct)
=> Task.FromResult(new List<HistorianEventDto>());
public HistorianHealthSnapshot GetHealthSnapshot() => new();
public void Dispose() { }
}
}
}

View File

@@ -0,0 +1,147 @@
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
using MessagePack;
using Shouldly;
using Xunit;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.Galaxy;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.Historian;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.MxAccess;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Sta;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared.Contracts;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Tests;
[Trait("Category", "Unit")]
public sealed class HistoryReadAtTimeTests
{
private static MxAccessGalaxyBackend BuildBackend(IHistorianDataSource? historian, StaPump pump) =>
new(
new GalaxyRepository(new GalaxyRepositoryOptions { ConnectionString = "Server=.;Database=ZB;Integrated Security=True;" }),
new MxAccessClient(pump, new MxProxyAdapter(), "attime-test"),
historian);
[Fact]
public async Task Returns_disabled_error_when_no_historian_configured()
{
using var pump = new StaPump("Test.Sta");
await pump.WaitForStartedAsync();
using var backend = BuildBackend(null, pump);
var resp = await backend.HistoryReadAtTimeAsync(new HistoryReadAtTimeRequest
{
TagReference = "T",
TimestampsUtcUnixMs = new[] { 1L, 2L },
}, CancellationToken.None);
resp.Success.ShouldBeFalse();
resp.Error.ShouldContain("Historian disabled");
}
[Fact]
public async Task Empty_timestamp_list_short_circuits_to_success_with_no_values()
{
using var pump = new StaPump("Test.Sta");
await pump.WaitForStartedAsync();
var fake = new FakeHistorian();
using var backend = BuildBackend(fake, pump);
var resp = await backend.HistoryReadAtTimeAsync(new HistoryReadAtTimeRequest
{
TagReference = "T",
TimestampsUtcUnixMs = Array.Empty<long>(),
}, CancellationToken.None);
resp.Success.ShouldBeTrue();
resp.Values.ShouldBeEmpty();
fake.Calls.ShouldBe(0); // no round-trip to SDK for empty timestamp list
}
[Fact]
public async Task Timestamps_survive_Unix_ms_round_trip_to_DateTime()
{
using var pump = new StaPump("Test.Sta");
await pump.WaitForStartedAsync();
var t1 = new DateTime(2026, 4, 18, 10, 0, 0, DateTimeKind.Utc);
var t2 = new DateTime(2026, 4, 18, 10, 5, 0, DateTimeKind.Utc);
var fake = new FakeHistorian(
new HistorianSample { Value = 100.0, Quality = 192, TimestampUtc = t1 },
new HistorianSample { Value = 101.5, Quality = 192, TimestampUtc = t2 });
using var backend = BuildBackend(fake, pump);
var resp = await backend.HistoryReadAtTimeAsync(new HistoryReadAtTimeRequest
{
TagReference = "TankLevel",
TimestampsUtcUnixMs = new[]
{
new DateTimeOffset(t1, TimeSpan.Zero).ToUnixTimeMilliseconds(),
new DateTimeOffset(t2, TimeSpan.Zero).ToUnixTimeMilliseconds(),
},
}, CancellationToken.None);
resp.Success.ShouldBeTrue();
resp.Values.Length.ShouldBe(2);
resp.Values[0].SourceTimestampUtcUnixMs.ShouldBe(new DateTimeOffset(t1, TimeSpan.Zero).ToUnixTimeMilliseconds());
resp.Values[0].StatusCode.ShouldBe(0u); // Good (quality 192)
MessagePackSerializer.Deserialize<double>(resp.Values[0].ValueBytes!).ShouldBe(100.0);
fake.Calls.ShouldBe(1);
fake.LastTimestamps.Length.ShouldBe(2);
fake.LastTimestamps[0].ShouldBe(t1);
fake.LastTimestamps[1].ShouldBe(t2);
}
[Fact]
public async Task Missing_sample_maps_to_Bad_category()
{
using var pump = new StaPump("Test.Sta");
await pump.WaitForStartedAsync();
// Quality=0 means no sample at that timestamp per HistorianDataSource.ReadAtTimeAsync.
var fake = new FakeHistorian(new HistorianSample
{
Value = null,
Quality = 0,
TimestampUtc = DateTime.UtcNow,
});
using var backend = BuildBackend(fake, pump);
var resp = await backend.HistoryReadAtTimeAsync(new HistoryReadAtTimeRequest
{
TagReference = "T",
TimestampsUtcUnixMs = new[] { 1L },
}, CancellationToken.None);
resp.Success.ShouldBeTrue();
resp.Values.Length.ShouldBe(1);
resp.Values[0].StatusCode.ShouldBe(0x80000000u); // Bad category
resp.Values[0].ValueBytes.ShouldBeNull();
}
private sealed class FakeHistorian : IHistorianDataSource
{
private readonly HistorianSample[] _samples;
public int Calls { get; private set; }
public DateTime[] LastTimestamps { get; private set; } = Array.Empty<DateTime>();
public FakeHistorian(params HistorianSample[] samples) => _samples = samples;
public Task<List<HistorianSample>> ReadAtTimeAsync(string tag, DateTime[] ts, CancellationToken ct)
{
Calls++;
LastTimestamps = ts;
return Task.FromResult(new List<HistorianSample>(_samples));
}
public Task<List<HistorianSample>> ReadRawAsync(string tag, DateTime s, DateTime e, int max, CancellationToken ct)
=> Task.FromResult(new List<HistorianSample>());
public Task<List<HistorianAggregateSample>> ReadAggregateAsync(string tag, DateTime s, DateTime e, double ms, string col, CancellationToken ct)
=> Task.FromResult(new List<HistorianAggregateSample>());
public Task<List<HistorianEventDto>> ReadEventsAsync(string? src, DateTime s, DateTime e, int max, CancellationToken ct)
=> Task.FromResult(new List<HistorianEventDto>());
public HistorianHealthSnapshot GetHealthSnapshot() => new();
public void Dispose() { }
}
}

View File

@@ -0,0 +1,129 @@
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using Shouldly;
using Xunit;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.Galaxy;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.Historian;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.MxAccess;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Sta;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared.Contracts;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Tests;
[Trait("Category", "Unit")]
public sealed class HistoryReadEventsTests
{
private static MxAccessGalaxyBackend BuildBackend(IHistorianDataSource? h, StaPump pump) =>
new(
new GalaxyRepository(new GalaxyRepositoryOptions { ConnectionString = "Server=.;Database=ZB;Integrated Security=True;" }),
new MxAccessClient(pump, new MxProxyAdapter(), "events-test"),
h);
[Fact]
public async Task Returns_disabled_error_when_no_historian_configured()
{
using var pump = new StaPump("Test.Sta");
await pump.WaitForStartedAsync();
using var backend = BuildBackend(null, pump);
var resp = await backend.HistoryReadEventsAsync(new HistoryReadEventsRequest
{
SourceName = "TankA",
StartUtcUnixMs = 0,
EndUtcUnixMs = DateTimeOffset.UtcNow.ToUnixTimeMilliseconds(),
MaxEvents = 100,
}, CancellationToken.None);
resp.Success.ShouldBeFalse();
resp.Error.ShouldContain("Historian disabled");
}
[Fact]
public async Task Maps_HistorianEventDto_to_GalaxyHistoricalEvent_wire_shape()
{
using var pump = new StaPump("Test.Sta");
await pump.WaitForStartedAsync();
var eventId = Guid.NewGuid();
var eventTime = new DateTime(2026, 4, 18, 10, 0, 0, DateTimeKind.Utc);
var receivedTime = eventTime.AddMilliseconds(150);
var fake = new FakeHistorian(new HistorianEventDto
{
Id = eventId,
Source = "TankA.Level.HiHi",
EventTime = eventTime,
ReceivedTime = receivedTime,
DisplayText = "HiHi alarm tripped",
Severity = 900,
});
using var backend = BuildBackend(fake, pump);
var resp = await backend.HistoryReadEventsAsync(new HistoryReadEventsRequest
{
SourceName = "TankA.Level.HiHi",
StartUtcUnixMs = 0,
EndUtcUnixMs = DateTimeOffset.UtcNow.ToUnixTimeMilliseconds(),
MaxEvents = 50,
}, CancellationToken.None);
resp.Success.ShouldBeTrue();
resp.Events.Length.ShouldBe(1);
var got = resp.Events[0];
got.EventId.ShouldBe(eventId.ToString());
got.SourceName.ShouldBe("TankA.Level.HiHi");
got.DisplayText.ShouldBe("HiHi alarm tripped");
got.Severity.ShouldBe<ushort>(900);
got.EventTimeUtcUnixMs.ShouldBe(new DateTimeOffset(eventTime, TimeSpan.Zero).ToUnixTimeMilliseconds());
got.ReceivedTimeUtcUnixMs.ShouldBe(new DateTimeOffset(receivedTime, TimeSpan.Zero).ToUnixTimeMilliseconds());
fake.LastSourceName.ShouldBe("TankA.Level.HiHi");
fake.LastMaxEvents.ShouldBe(50);
}
[Fact]
public async Task Null_source_name_is_passed_through_as_all_sources()
{
using var pump = new StaPump("Test.Sta");
await pump.WaitForStartedAsync();
var fake = new FakeHistorian();
using var backend = BuildBackend(fake, pump);
await backend.HistoryReadEventsAsync(new HistoryReadEventsRequest
{
SourceName = null,
StartUtcUnixMs = 0,
EndUtcUnixMs = 1,
MaxEvents = 10,
}, CancellationToken.None);
fake.LastSourceName.ShouldBeNull();
}
private sealed class FakeHistorian : IHistorianDataSource
{
private readonly HistorianEventDto[] _events;
public string? LastSourceName { get; private set; } = "<unset>";
public int LastMaxEvents { get; private set; }
public FakeHistorian(params HistorianEventDto[] events) => _events = events;
public Task<List<HistorianEventDto>> ReadEventsAsync(string? src, DateTime s, DateTime e, int max, CancellationToken ct)
{
LastSourceName = src;
LastMaxEvents = max;
return Task.FromResult(new List<HistorianEventDto>(_events));
}
public Task<List<HistorianSample>> ReadRawAsync(string tag, DateTime s, DateTime e, int max, CancellationToken ct)
=> Task.FromResult(new List<HistorianSample>());
public Task<List<HistorianAggregateSample>> ReadAggregateAsync(string tag, DateTime s, DateTime e, double ms, string col, CancellationToken ct)
=> Task.FromResult(new List<HistorianAggregateSample>());
public Task<List<HistorianSample>> ReadAtTimeAsync(string tag, DateTime[] ts, CancellationToken ct)
=> Task.FromResult(new List<HistorianSample>());
public HistorianHealthSnapshot GetHealthSnapshot() => new();
public void Dispose() { }
}
}

View File

@@ -0,0 +1,158 @@
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using MessagePack;
using Shouldly;
using Xunit;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.Galaxy;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.Historian;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.MxAccess;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Sta;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared.Contracts;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Tests;
[Trait("Category", "Unit")]
public sealed class HistoryReadProcessedTests
{
[Fact]
public async Task ReturnsDisabledError_When_NoHistorianConfigured()
{
using var pump = new StaPump("Test.Sta");
await pump.WaitForStartedAsync();
var mx = new MxAccessClient(pump, new MxProxyAdapter(), "processed-test");
using var backend = new MxAccessGalaxyBackend(
new GalaxyRepository(new GalaxyRepositoryOptions { ConnectionString = "Server=.;Database=ZB;Integrated Security=True;" }),
mx,
historian: null);
var resp = await backend.HistoryReadProcessedAsync(new HistoryReadProcessedRequest
{
TagReference = "T",
StartUtcUnixMs = 0,
EndUtcUnixMs = DateTimeOffset.UtcNow.ToUnixTimeMilliseconds(),
IntervalMs = 1000,
AggregateColumn = "Average",
}, CancellationToken.None);
resp.Success.ShouldBeFalse();
resp.Error.ShouldContain("Historian disabled");
}
[Fact]
public async Task Rejects_NonPositiveInterval()
{
using var pump = new StaPump("Test.Sta");
await pump.WaitForStartedAsync();
var mx = new MxAccessClient(pump, new MxProxyAdapter(), "processed-test");
var fake = new FakeHistorianDataSource();
using var backend = new MxAccessGalaxyBackend(
new GalaxyRepository(new GalaxyRepositoryOptions { ConnectionString = "Server=.;Database=ZB;Integrated Security=True;" }),
mx,
fake);
var resp = await backend.HistoryReadProcessedAsync(new HistoryReadProcessedRequest
{
TagReference = "T",
IntervalMs = 0,
AggregateColumn = "Average",
}, CancellationToken.None);
resp.Success.ShouldBeFalse();
resp.Error.ShouldContain("IntervalMs");
}
[Fact]
public async Task Maps_AggregateSample_With_Value_To_Good()
{
using var pump = new StaPump("Test.Sta");
await pump.WaitForStartedAsync();
var mx = new MxAccessClient(pump, new MxProxyAdapter(), "processed-test");
var fake = new FakeHistorianDataSource(new HistorianAggregateSample
{
Value = 12.34,
TimestampUtc = new DateTime(2026, 4, 18, 10, 0, 0, DateTimeKind.Utc),
});
using var backend = new MxAccessGalaxyBackend(
new GalaxyRepository(new GalaxyRepositoryOptions { ConnectionString = "Server=.;Database=ZB;Integrated Security=True;" }),
mx,
fake);
var resp = await backend.HistoryReadProcessedAsync(new HistoryReadProcessedRequest
{
TagReference = "T",
StartUtcUnixMs = 0,
EndUtcUnixMs = DateTimeOffset.UtcNow.ToUnixTimeMilliseconds(),
IntervalMs = 60_000,
AggregateColumn = "Average",
}, CancellationToken.None);
resp.Success.ShouldBeTrue();
resp.Values.Length.ShouldBe(1);
resp.Values[0].StatusCode.ShouldBe(0u); // Good
resp.Values[0].ValueBytes.ShouldNotBeNull();
MessagePackSerializer.Deserialize<double>(resp.Values[0].ValueBytes!).ShouldBe(12.34);
fake.LastAggregateColumn.ShouldBe("Average");
fake.LastIntervalMs.ShouldBe(60_000d);
}
[Fact]
public async Task Maps_Null_Bucket_To_BadNoData()
{
using var pump = new StaPump("Test.Sta");
await pump.WaitForStartedAsync();
var mx = new MxAccessClient(pump, new MxProxyAdapter(), "processed-test");
var fake = new FakeHistorianDataSource(new HistorianAggregateSample
{
Value = null,
TimestampUtc = DateTime.UtcNow,
});
using var backend = new MxAccessGalaxyBackend(
new GalaxyRepository(new GalaxyRepositoryOptions { ConnectionString = "Server=.;Database=ZB;Integrated Security=True;" }),
mx,
fake);
var resp = await backend.HistoryReadProcessedAsync(new HistoryReadProcessedRequest
{
TagReference = "T",
IntervalMs = 1000,
AggregateColumn = "Minimum",
}, CancellationToken.None);
resp.Success.ShouldBeTrue();
resp.Values.Length.ShouldBe(1);
resp.Values[0].StatusCode.ShouldBe(0x800E0000u); // BadNoData
resp.Values[0].ValueBytes.ShouldBeNull();
}
private sealed class FakeHistorianDataSource : IHistorianDataSource
{
private readonly HistorianAggregateSample[] _samples;
public string? LastAggregateColumn { get; private set; }
public double LastIntervalMs { get; private set; }
public FakeHistorianDataSource(params HistorianAggregateSample[] samples) => _samples = samples;
public Task<List<HistorianSample>> ReadRawAsync(string tag, DateTime s, DateTime e, int max, CancellationToken ct)
=> Task.FromResult(new List<HistorianSample>());
public Task<List<HistorianAggregateSample>> ReadAggregateAsync(
string tag, DateTime s, DateTime e, double intervalMs, string col, CancellationToken ct)
{
LastAggregateColumn = col;
LastIntervalMs = intervalMs;
return Task.FromResult(new List<HistorianAggregateSample>(_samples));
}
public Task<List<HistorianSample>> ReadAtTimeAsync(string tag, DateTime[] ts, CancellationToken ct)
=> Task.FromResult(new List<HistorianSample>());
public Task<List<HistorianEventDto>> ReadEventsAsync(string? src, DateTime s, DateTime e, int max, CancellationToken ct)
=> Task.FromResult(new List<HistorianEventDto>());
public HistorianHealthSnapshot GetHealthSnapshot() => new();
public void Dispose() { }
}
}

View File

@@ -0,0 +1,91 @@
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;
using ArchestrA.MxAccess;
using Shouldly;
using Xunit;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.Galaxy;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.MxAccess;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Sta;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared.Contracts;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Tests;
[Trait("Category", "Unit")]
public sealed class HostStatusPushTests
{
/// <summary>
/// PR 8 — when MxAccessClient.ConnectionStateChanged fires false→true→false,
/// MxAccessGalaxyBackend raises OnHostStatusChanged once per transition with
/// HostName=ClientName, RuntimeStatus="Running"/"Stopped", and a timestamp.
/// This is the gateway-level signal; per-platform ScanState probes are deferred.
/// </summary>
[Fact]
public async Task ConnectionStateChanged_raises_OnHostStatusChanged_with_gateway_name()
{
using var pump = new StaPump("Test.Sta");
await pump.WaitForStartedAsync();
var proxy = new FakeProxy();
var mx = new MxAccessClient(pump, proxy, "GatewayClient", new MxAccessClientOptions { AutoReconnect = false });
using var backend = new MxAccessGalaxyBackend(
new GalaxyRepository(new GalaxyRepositoryOptions { ConnectionString = "Server=.;Database=ZB;Integrated Security=True;" }),
mx,
historian: null);
var notifications = new ConcurrentQueue<HostConnectivityStatus>();
backend.OnHostStatusChanged += (_, s) => notifications.Enqueue(s);
await mx.ConnectAsync();
await mx.DisconnectAsync();
notifications.Count.ShouldBe(2);
notifications.TryDequeue(out var first).ShouldBeTrue();
first!.HostName.ShouldBe("GatewayClient");
first.RuntimeStatus.ShouldBe("Running");
first.LastObservedUtcUnixMs.ShouldBeGreaterThan(0);
notifications.TryDequeue(out var second).ShouldBeTrue();
second!.HostName.ShouldBe("GatewayClient");
second.RuntimeStatus.ShouldBe("Stopped");
}
[Fact]
public async Task Dispose_unsubscribes_so_post_dispose_state_changes_do_not_fire_events()
{
using var pump = new StaPump("Test.Sta");
await pump.WaitForStartedAsync();
var proxy = new FakeProxy();
var mx = new MxAccessClient(pump, proxy, "GatewayClient", new MxAccessClientOptions { AutoReconnect = false });
var backend = new MxAccessGalaxyBackend(
new GalaxyRepository(new GalaxyRepositoryOptions { ConnectionString = "Server=.;Database=ZB;Integrated Security=True;" }),
mx,
historian: null);
var count = 0;
backend.OnHostStatusChanged += (_, _) => Interlocked.Increment(ref count);
await mx.ConnectAsync();
count.ShouldBe(1);
backend.Dispose();
await mx.DisconnectAsync();
count.ShouldBe(1); // no second notification after Dispose
}
private sealed class FakeProxy : IMxProxy
{
private int _next = 1;
public int Register(string _) => 42;
public void Unregister(int _) { }
public int AddItem(int _, string __) => Interlocked.Increment(ref _next);
public void RemoveItem(int _, int __) { }
public void AdviseSupervisory(int _, int __) { }
public void UnAdviseSupervisory(int _, int __) { }
public void Write(int _, int __, object ___, int ____) { }
public event MxDataChangeHandler? OnDataChange { add { } remove { } }
public event MxWriteCompleteHandler? OnWriteComplete { add { } remove { } }
}
}

View File

@@ -0,0 +1,173 @@
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using ArchestrA.MxAccess;
using Shouldly;
using Xunit;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Backend.MxAccess;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Sta;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Tests;
[Trait("Category", "Unit")]
public sealed class MxAccessClientMonitorLoopTests
{
/// <summary>
/// PR 6 low finding #1 — every $Heartbeat probe must RemoveItem the item handle it
/// allocated. Without that, the monitor leaks one handle per MonitorInterval seconds,
/// which over a 24h uptime becomes thousands of leaked MXAccess handles and can
/// eventually exhaust the runtime proxy's handle table.
/// </summary>
[Fact]
public async Task Heartbeat_probe_calls_RemoveItem_for_every_AddItem()
{
using var pump = new StaPump("Monitor.Sta");
await pump.WaitForStartedAsync();
var proxy = new CountingProxy();
var client = new MxAccessClient(pump, proxy, "probe-test", new MxAccessClientOptions
{
AutoReconnect = true,
MonitorInterval = TimeSpan.FromMilliseconds(150),
StaleThreshold = TimeSpan.FromMilliseconds(50),
});
await client.ConnectAsync();
// Wait past StaleThreshold, then let several monitor cycles fire.
await Task.Delay(700);
client.Dispose();
// One Heartbeat probe fires per monitor tick once the connection looks stale.
proxy.HeartbeatAddCount.ShouldBeGreaterThan(1);
// Every AddItem("$Heartbeat") must be matched by a RemoveItem on the same handle.
proxy.HeartbeatAddCount.ShouldBe(proxy.HeartbeatRemoveCount);
proxy.OutstandingHeartbeatHandles.ShouldBe(0);
}
/// <summary>
/// PR 6 low finding #2 — after reconnect, per-subscription replay failures must raise
/// SubscriptionReplayFailed so the backend can propagate the degradation, not get
/// silently eaten.
/// </summary>
[Fact]
public async Task SubscriptionReplayFailed_fires_for_each_tag_that_fails_to_replay()
{
using var pump = new StaPump("Replay.Sta");
await pump.WaitForStartedAsync();
var proxy = new ReplayFailingProxy(failOnReplayForTags: new[] { "BadTag.A", "BadTag.B" });
var client = new MxAccessClient(pump, proxy, "replay-test", new MxAccessClientOptions
{
AutoReconnect = true,
MonitorInterval = TimeSpan.FromMilliseconds(120),
StaleThreshold = TimeSpan.FromMilliseconds(50),
});
var failures = new ConcurrentBag<SubscriptionReplayFailedEventArgs>();
client.SubscriptionReplayFailed += (_, e) => failures.Add(e);
await client.ConnectAsync();
await client.SubscribeAsync("GoodTag.X", (_, _) => { });
await client.SubscribeAsync("BadTag.A", (_, _) => { });
await client.SubscribeAsync("BadTag.B", (_, _) => { });
proxy.TriggerProbeFailureOnNextCall();
// Wait for the monitor loop to probe → fail → reconnect → replay.
await Task.Delay(800);
client.Dispose();
failures.Count.ShouldBe(2);
var names = new HashSet<string>();
foreach (var f in failures) names.Add(f.TagReference);
names.ShouldContain("BadTag.A");
names.ShouldContain("BadTag.B");
}
// ----- test doubles -----
private sealed class CountingProxy : IMxProxy
{
private int _next = 1;
private readonly ConcurrentDictionary<int, string> _live = new();
public int HeartbeatAddCount;
public int HeartbeatRemoveCount;
public int OutstandingHeartbeatHandles => _live.Count;
public event MxDataChangeHandler? OnDataChange { add { } remove { } }
public event MxWriteCompleteHandler? OnWriteComplete { add { } remove { } }
public int Register(string _) => 42;
public void Unregister(int _) { }
public int AddItem(int _, string address)
{
var h = Interlocked.Increment(ref _next);
_live[h] = address;
if (address == "$Heartbeat") Interlocked.Increment(ref HeartbeatAddCount);
return h;
}
public void RemoveItem(int _, int itemHandle)
{
if (_live.TryRemove(itemHandle, out var addr) && addr == "$Heartbeat")
Interlocked.Increment(ref HeartbeatRemoveCount);
}
public void AdviseSupervisory(int _, int __) { }
public void UnAdviseSupervisory(int _, int __) { }
public void Write(int _, int __, object ___, int ____) { }
}
/// <summary>
/// Mock that lets us exercise the reconnect + replay path. TriggerProbeFailureOnNextCall
/// flips a one-shot flag so the very next AddItem("$Heartbeat") throws — that drives the
/// monitor loop into the reconnect-with-replay branch. During the replay, AddItem for the
/// tags listed in failOnReplayForTags throws so SubscriptionReplayFailed should fire once
/// per failing tag.
/// </summary>
private sealed class ReplayFailingProxy : IMxProxy
{
private int _next = 1;
private readonly HashSet<string> _failOnReplay;
private int _probeFailOnce;
private readonly ConcurrentDictionary<string, bool> _replayedOnce = new(StringComparer.OrdinalIgnoreCase);
public ReplayFailingProxy(IEnumerable<string> failOnReplayForTags)
{
_failOnReplay = new HashSet<string>(failOnReplayForTags, StringComparer.OrdinalIgnoreCase);
}
public void TriggerProbeFailureOnNextCall() => Interlocked.Exchange(ref _probeFailOnce, 1);
public event MxDataChangeHandler? OnDataChange { add { } remove { } }
public event MxWriteCompleteHandler? OnWriteComplete { add { } remove { } }
public int Register(string _) => 42;
public void Unregister(int _) { }
public int AddItem(int _, string address)
{
if (address == "$Heartbeat" && Interlocked.Exchange(ref _probeFailOnce, 0) == 1)
throw new InvalidOperationException("simulated probe failure");
// Fail only on the *replay* AddItem for listed tags — not the initial subscribe.
if (_failOnReplay.Contains(address) && _replayedOnce.ContainsKey(address))
throw new InvalidOperationException($"simulated replay failure for {address}");
if (_failOnReplay.Contains(address)) _replayedOnce[address] = true;
return Interlocked.Increment(ref _next);
}
public void RemoveItem(int _, int __) { }
public void AdviseSupervisory(int _, int __) { }
public void UnAdviseSupervisory(int _, int __) { }
public void Write(int _, int __, object ___, int ____) { }
}
}

View File

@@ -24,6 +24,11 @@
<ItemGroup> <ItemGroup>
<ProjectReference Include="..\..\src\ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host\ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.csproj"/> <ProjectReference Include="..\..\src\ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host\ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.csproj"/>
<Reference Include="System.ServiceProcess"/> <Reference Include="System.ServiceProcess"/>
<!-- IMxProxy's delegate signatures mention ArchestrA.MxAccess.MXSTATUS_PROXY, so tests
implementing the interface must resolve that type at compile time. -->
<Reference Include="ArchestrA.MxAccess">
<HintPath>..\..\lib\ArchestrA.MxAccess.dll</HintPath>
</Reference>
</ItemGroup> </ItemGroup>
<ItemGroup> <ItemGroup>

View File

@@ -0,0 +1,27 @@
using Shouldly;
using Xunit;
using ZB.MOM.WW.OtOpcUa.Core.Abstractions;
using ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy;
namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Proxy.Tests;
[Trait("Category", "Unit")]
public sealed class AggregateColumnMappingTests
{
[Theory]
[InlineData(HistoryAggregateType.Average, "Average")]
[InlineData(HistoryAggregateType.Minimum, "Minimum")]
[InlineData(HistoryAggregateType.Maximum, "Maximum")]
[InlineData(HistoryAggregateType.Count, "ValueCount")]
public void Maps_OpcUa_enum_to_AnalogSummary_column(HistoryAggregateType aggregate, string expected)
{
GalaxyProxyDriver.MapAggregateToColumn(aggregate).ShouldBe(expected);
}
[Fact]
public void Total_is_not_supported()
{
Should.Throw<System.NotSupportedException>(
() => GalaxyProxyDriver.MapAggregateToColumn(HistoryAggregateType.Total));
}
}

View File

@@ -6,7 +6,14 @@
<LangVersion>9.0</LangVersion> <LangVersion>9.0</LangVersion>
<Nullable>enable</Nullable> <Nullable>enable</Nullable>
<IsPackable>false</IsPackable> <IsPackable>false</IsPackable>
<IsTestProject>true</IsTestProject> <!--
Phase 2 Stream D — V1 ARCHIVE. References v1 OtOpcUa.Host directly.
Excluded from `dotnet test` solution runs; replaced by the v2
OtOpcUa.Driver.Galaxy.E2E suite. To run explicitly:
dotnet test tests/ZB.MOM.WW.OtOpcUa.IntegrationTests
See docs/v2/V1_ARCHIVE_STATUS.md.
-->
<IsTestProject>false</IsTestProject>
<RootNamespace>ZB.MOM.WW.OtOpcUa.IntegrationTests</RootNamespace> <RootNamespace>ZB.MOM.WW.OtOpcUa.IntegrationTests</RootNamespace>
</PropertyGroup> </PropertyGroup>

View File

@@ -0,0 +1,158 @@
using Microsoft.Extensions.Logging.Abstractions;
using Opc.Ua;
using Opc.Ua.Client;
using Opc.Ua.Configuration;
using Shouldly;
using Xunit;
using ZB.MOM.WW.OtOpcUa.Core.Abstractions;
using ZB.MOM.WW.OtOpcUa.Core.Hosting;
using ZB.MOM.WW.OtOpcUa.Server.OpcUa;
namespace ZB.MOM.WW.OtOpcUa.Server.Tests;
[Trait("Category", "Integration")]
public sealed class OpcUaServerIntegrationTests : IAsyncLifetime
{
// Use a non-default port + per-test-run PKI root to avoid colliding with anything else
// running on the box (a live v1 Host or a developer's previous run).
private static readonly int Port = 48400 + Random.Shared.Next(0, 99);
private readonly string _endpoint = $"opc.tcp://localhost:{Port}/OtOpcUaTest";
private readonly string _pkiRoot = Path.Combine(Path.GetTempPath(), $"otopcua-test-{Guid.NewGuid():N}");
private DriverHost _driverHost = null!;
private OpcUaApplicationHost _server = null!;
private FakeDriver _driver = null!;
public async ValueTask InitializeAsync()
{
_driverHost = new DriverHost();
_driver = new FakeDriver();
await _driverHost.RegisterAsync(_driver, "{}", CancellationToken.None);
var options = new OpcUaServerOptions
{
EndpointUrl = _endpoint,
ApplicationName = "OtOpcUaTest",
ApplicationUri = "urn:OtOpcUa:Server:Test",
PkiStoreRoot = _pkiRoot,
AutoAcceptUntrustedClientCertificates = true,
};
_server = new OpcUaApplicationHost(options, _driverHost, NullLoggerFactory.Instance,
NullLogger<OpcUaApplicationHost>.Instance);
await _server.StartAsync(CancellationToken.None);
}
public async ValueTask DisposeAsync()
{
await _server.DisposeAsync();
await _driverHost.DisposeAsync();
try { Directory.Delete(_pkiRoot, recursive: true); } catch { /* best-effort */ }
}
[Fact]
public async Task Client_can_connect_and_browse_driver_subtree()
{
using var session = await OpenSessionAsync();
// Browse the driver subtree registered under ObjectsFolder. FakeDriver registers one
// folder ("TestFolder") with one variable ("Var1"), so we expect to see our driver's
// root folder plus standard UA children.
var rootRef = new NodeId("fake", (ushort)session.NamespaceUris.GetIndex("urn:OtOpcUa:fake"));
session.Browse(null, null, rootRef, 0, BrowseDirection.Forward, ReferenceTypeIds.HierarchicalReferences,
true, (uint)NodeClass.Object | (uint)NodeClass.Variable, out _, out var references);
references.Count.ShouldBeGreaterThan(0);
references.ShouldContain(r => r.BrowseName.Name == "TestFolder");
}
[Fact]
public async Task Client_can_read_a_driver_variable_through_the_node_manager()
{
using var session = await OpenSessionAsync();
var nsIndex = (ushort)session.NamespaceUris.GetIndex("urn:OtOpcUa:fake");
var varNodeId = new NodeId("TestFolder.Var1", nsIndex);
var dv = session.ReadValue(varNodeId);
dv.ShouldNotBeNull();
// FakeDriver.ReadAsync returns 42 as the value.
dv.Value.ShouldBe(42);
}
private async Task<ISession> OpenSessionAsync()
{
var cfg = new ApplicationConfiguration
{
ApplicationName = "OtOpcUaTestClient",
ApplicationUri = "urn:OtOpcUa:TestClient",
ApplicationType = ApplicationType.Client,
SecurityConfiguration = new SecurityConfiguration
{
ApplicationCertificate = new CertificateIdentifier
{
StoreType = CertificateStoreType.Directory,
StorePath = Path.Combine(_pkiRoot, "client-own"),
SubjectName = "CN=OtOpcUaTestClient",
},
TrustedIssuerCertificates = new CertificateTrustList { StoreType = CertificateStoreType.Directory, StorePath = Path.Combine(_pkiRoot, "client-issuers") },
TrustedPeerCertificates = new CertificateTrustList { StoreType = CertificateStoreType.Directory, StorePath = Path.Combine(_pkiRoot, "client-trusted") },
RejectedCertificateStore = new CertificateTrustList { StoreType = CertificateStoreType.Directory, StorePath = Path.Combine(_pkiRoot, "client-rejected") },
AutoAcceptUntrustedCertificates = true,
AddAppCertToTrustedStore = true,
},
TransportConfigurations = new TransportConfigurationCollection(),
TransportQuotas = new TransportQuotas { OperationTimeout = 15000 },
ClientConfiguration = new ClientConfiguration { DefaultSessionTimeout = 60000 },
};
await cfg.Validate(ApplicationType.Client);
cfg.CertificateValidator.CertificateValidation += (_, e) => e.Accept = true;
var instance = new ApplicationInstance { ApplicationConfiguration = cfg, ApplicationType = ApplicationType.Client };
await instance.CheckApplicationInstanceCertificate(true, CertificateFactory.DefaultKeySize);
// Let the client fetch the live endpoint description from the running server so the
// UserTokenPolicy it signs with matches what the server actually advertised (including
// the PolicyId = "Anonymous" the server sets).
var selected = CoreClientUtils.SelectEndpoint(cfg, _endpoint, useSecurity: false);
var endpointConfig = EndpointConfiguration.Create(cfg);
var configuredEndpoint = new ConfiguredEndpoint(null, selected, endpointConfig);
return await Session.Create(cfg, configuredEndpoint, false, "OtOpcUaTestClientSession", 60000,
new UserIdentity(new AnonymousIdentityToken()), null);
}
/// <summary>
/// Minimum driver that implements enough of IDriver + ITagDiscovery + IReadable to drive
/// the integration test. Returns a single folder with one variable that reads as 42.
/// </summary>
private sealed class FakeDriver : IDriver, ITagDiscovery, IReadable
{
public string DriverInstanceId => "fake";
public string DriverType => "Fake";
public Task InitializeAsync(string driverConfigJson, CancellationToken ct) => Task.CompletedTask;
public Task ReinitializeAsync(string driverConfigJson, CancellationToken ct) => Task.CompletedTask;
public Task ShutdownAsync(CancellationToken ct) => Task.CompletedTask;
public DriverHealth GetHealth() => new(DriverState.Healthy, DateTime.UtcNow, null);
public long GetMemoryFootprint() => 0;
public Task FlushOptionalCachesAsync(CancellationToken ct) => Task.CompletedTask;
public Task DiscoverAsync(IAddressSpaceBuilder builder, CancellationToken ct)
{
var folder = builder.Folder("TestFolder", "TestFolder");
folder.Variable("Var1", "Var1", new DriverAttributeInfo(
"TestFolder.Var1", DriverDataType.Int32, false, null, SecurityClassification.FreeAccess, false, IsAlarm: false));
return Task.CompletedTask;
}
public Task<IReadOnlyList<DataValueSnapshot>> ReadAsync(
IReadOnlyList<string> fullReferences, CancellationToken cancellationToken)
{
var now = DateTime.UtcNow;
IReadOnlyList<DataValueSnapshot> result =
fullReferences.Select(_ => new DataValueSnapshot(42, 0u, now, now)).ToArray();
return Task.FromResult(result);
}
}
}

View File

@@ -14,6 +14,7 @@
<PackageReference Include="Shouldly" Version="4.3.0"/> <PackageReference Include="Shouldly" Version="4.3.0"/>
<PackageReference Include="Microsoft.NET.Test.Sdk" Version="17.12.0"/> <PackageReference Include="Microsoft.NET.Test.Sdk" Version="17.12.0"/>
<PackageReference Include="Microsoft.Extensions.Logging.Abstractions" Version="10.0.0"/> <PackageReference Include="Microsoft.Extensions.Logging.Abstractions" Version="10.0.0"/>
<PackageReference Include="OPCFoundation.NetStandard.Opc.Ua.Client" Version="1.5.374.126"/>
<PackageReference Include="xunit.runner.visualstudio" Version="3.0.2"> <PackageReference Include="xunit.runner.visualstudio" Version="3.0.2">
<PrivateAssets>all</PrivateAssets> <PrivateAssets>all</PrivateAssets>
<IncludeAssets>runtime; build; native; contentfiles; analyzers; buildtransitive</IncludeAssets> <IncludeAssets>runtime; build; native; contentfiles; analyzers; buildtransitive</IncludeAssets>
@@ -27,6 +28,7 @@
<ItemGroup> <ItemGroup>
<NuGetAuditSuppress Include="https://github.com/advisories/GHSA-37gx-xxp4-5rgx"/> <NuGetAuditSuppress Include="https://github.com/advisories/GHSA-37gx-xxp4-5rgx"/>
<NuGetAuditSuppress Include="https://github.com/advisories/GHSA-w3x6-4m5h-cxqf"/> <NuGetAuditSuppress Include="https://github.com/advisories/GHSA-w3x6-4m5h-cxqf"/>
<NuGetAuditSuppress Include="https://github.com/advisories/GHSA-h958-fxgg-g7w3"/>
</ItemGroup> </ItemGroup>
</Project> </Project>

Some files were not shown because too many files have changed in this diff Show More