Phase 2 PR 5 — Wonderware Historian SDK port into Driver.Galaxy.Host #4

Closed
dohertj2 wants to merge 1 commit from phase-2-pr5-historian into phase-2-pr4-findings

Summary

Phase 2 PR 5 — port the Wonderware Historian SDK into Driver.Galaxy.Host/Backend/Historian/, wiring MxAccessGalaxyBackend.HistoryReadAsync end-to-end and closing the last Phase 2 Task B.1.h follow-up that was still returning a placeholder error.

Key decisions:

  • OPC-UA-free surface inside Galaxy.Host. v1 returned Opc.Ua.DataValue on the hot historian path, which would have required dragging OPCFoundation.NetStandard.Opc.Ua.Server into net48 x86 Galaxy.Host and leaking OPC types across the IPC boundary. Instead, PR 5 introduces HistorianSample + HistorianAggregateSample POCOs that carry the raw MX quality byte through the pipe unchanged; the Proxy side does the OPC translation via the existing QualityMapper already used for live reads. Decision #13's GalaxyDataValue contract survives intact — no Shared wire break vs PR 4.
  • Folded into Galaxy.Host. v1 shipped the historian as an external plugin loaded via 180 LOC of HistorianPluginLoader + AssemblyResolve + Assembly.LoadFrom. That indirection existed solely because the plugin was staged in Host/bin/Debug/net48/Historian/ at deploy time. Since Driver.Galaxy.Host is already Galaxy-specific, the plugin boundary is pure overhead here — the port lives directly under Backend/Historian/ and Galaxy.Host.csproj carries the SDK refs + native DLL staging inline.
  • Cluster failover preserved verbatim. HistorianClusterEndpointPicker is the thread-safe pure-logic picker ported unchanged (injected clock, per-node cooldown, case-insensitive de-dup). ConnectToAnyHealthyNode iterates healthy candidates, clones config per attempt, marks healthy-on-success / failed-on-exception, and throws with the last exception chained when all nodes exhaust.
  • Four SDK paths ported, one IPC-exposed in PR 5. ReadRawAsync is wired through IPC now (HistoryReadRequest was already in Shared.Contracts). ReadAggregateAsync / ReadAtTimeAsync / ReadEventsAsync / GetHealthSnapshot are ported-but-not-yet-IPC-exposed — they stay internal to Galaxy.Host until PR 6+ surfaces them via new contract message kinds.
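
The picker behaviour listed under "Cluster failover preserved verbatim" (injected clock, per-node cooldown, case-insensitive de-dup, fallback to the single ServerName) can be illustrated with a minimal sketch. This is not the ported HistorianClusterEndpointPicker: the class name EndpointPickerSketch, its constructor shape, and its method signatures are assumptions made for illustration only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sketch of a cooldown-based endpoint picker.
// Names and shapes are illustrative, not the real HistorianClusterEndpointPicker.
public sealed class EndpointPickerSketch
{
    private readonly object _gate = new object();
    private readonly List<string> _order = new List<string>(); // configuration order
    private readonly HashSet<string> _known =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase);
    private readonly Dictionary<string, DateTime> _cooldownUntilUtc =
        new Dictionary<string, DateTime>(StringComparer.OrdinalIgnoreCase);
    private readonly Func<DateTime> _utcNow; // injected clock keeps tests deterministic
    private readonly TimeSpan _cooldown;

    public EndpointPickerSketch(IEnumerable<string> serverNames, string fallbackServerName,
                                TimeSpan cooldown, Func<DateTime> utcNow)
    {
        _cooldown = cooldown;
        _utcNow = utcNow;
        foreach (var raw in serverNames ?? Enumerable.Empty<string>())
        {
            var name = (raw ?? "").Trim();
            // HashSet.Add returns false for a case-insensitive duplicate.
            if (name.Length > 0 && _known.Add(name)) _order.Add(name);
        }
        // Fall back to the single server name when the cluster list is empty.
        if (_order.Count == 0 && !string.IsNullOrWhiteSpace(fallbackServerName))
        {
            var name = fallbackServerName.Trim();
            _known.Add(name);
            _order.Add(name);
        }
    }

    // Nodes not currently in cooldown, in configuration order.
    public IReadOnlyList<string> GetHealthyNodes()
    {
        lock (_gate)
        {
            var now = _utcNow();
            return _order
                .Where(n => !_cooldownUntilUtc.TryGetValue(n, out var until) || until <= now)
                .ToList();
        }
    }

    // Unknown node names are ignored rather than treated as an error.
    public void MarkFailed(string node)
    {
        lock (_gate)
        {
            if (_known.Contains(node)) _cooldownUntilUtc[node] = _utcNow() + _cooldown;
        }
    }

    // Clears the cooldown immediately.
    public void MarkHealthy(string node)
    {
        lock (_gate) { _cooldownUntilUtc.Remove(node); }
    }
}
```

Injecting the clock as a Func<DateTime> is what lets cooldown expiry be tested without sleeping: a test advances a captured DateTime variable and asks for the healthy list again.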

Changes

| Area | Change |
|------|--------|
| src/.../Backend/Historian/ | New subtree: 9 files, ~1100 LOC |
| Backend/MxAccessGalaxyBackend.cs | HistoryReadAsync now delegates to IHistorianDataSource.ReadRawAsync + maps samples to GalaxyDataValue; gains IDisposable to close historian connections at shutdown |
| Program.cs | BuildHistorianIfEnabled() reads OTOPCUA_HISTORIAN_* env vars, returns null when disabled so the backend surfaces a clean Historian disabled error |
| Driver.Galaxy.Host.csproj | References aahClientManaged + aahClientCommon; stages aahClient.dll + Historian.CBE.dll + Historian.DPAPI.dll + ArchestrA.CloudHistorian.Contract.dll alongside the host exe; InternalsVisibleTo("...Host.Tests") added so the endpoint picker stays testable |
| tests/.../Host.Tests/HistorianClusterEndpointPickerTests.cs | 7 test cases: fallback to ServerName, cooldown enters/expires, MarkHealthy clears, all-in-cooldown empty list, Snapshot reports state, case-insensitive de-dup |
| tests/.../Host.Tests/HistorianWiringTests.cs | 2 test cases: disabled returns Success=false with clear error; fake historian maps HistorianSample(42.5, Good, ts) to GalaxyDataValue{StatusCode=0u, source ts matches, MessagePack bytes non-null} |
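
The HistorianWiringTests row expects a Good sample to arrive as StatusCode=0u with a matching source timestamp in Unix milliseconds. A rough sketch of that kind of translation follows. It assumes the MX quality byte uses the OPC DA layout (top two bits select Good / Uncertain / Bad, so 192 is Good) and maps only to the three OPC UA severity classes; the existing QualityMapper may well be more granular, and QualityMappingSketch and its members are hypothetical names.

```csharp
using System;

// Hypothetical sketch: coarse quality-byte to OPC UA status-code mapping.
// Assumes an OPC DA style quality byte; not the project's QualityMapper.
public static class QualityMappingSketch
{
    public const uint Good = 0x00000000u;      // OPC UA Good
    public const uint Uncertain = 0x40000000u; // OPC UA Uncertain
    public const uint Bad = 0x80000000u;       // OPC UA Bad

    public static uint ToStatusCode(byte quality)
    {
        // Bits 7..6 of an OPC DA quality byte carry the major quality class.
        switch (quality & 0xC0)
        {
            case 0xC0: return Good;
            case 0x40: return Uncertain;
            default: return Bad; // 0x00 is Bad; 0x80 is reserved and treated as Bad here
        }
    }

    // Converts a UTC timestamp to Unix milliseconds, the unit the
    // SourceTimestampUtcUnixMs field name implies.
    public static long ToUnixMs(DateTime timestampUtc)
    {
        var utc = DateTime.SpecifyKind(timestampUtc, DateTimeKind.Utc);
        return new DateTimeOffset(utc).ToUnixTimeMilliseconds();
    }
}
```

Under these assumptions, QualityMappingSketch.ToStatusCode(192) yields 0u, which is consistent with the value the wiring test asserts.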

Test plan

  • [x] dotnet build src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host/ (0 errors)
  • [x] dotnet build ZB.MOM.WW.OtOpcUa.slnx (0 errors, 202 pre-existing warnings)
  • [x] dotnet test tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host.Tests/ --filter "Category=Unit" (24/24 pass: 9 new + 15 pre-existing)
  • [ ] Reviewer: run dotnet test ZB.MOM.WW.OtOpcUa.slnx and expect a full-solution pass count consistent with the PR 4 baseline
  • [ ] Reviewer: on a machine with the Wonderware Historian SDK installed and a live Historian endpoint, spawn the host with OTOPCUA_HISTORIAN_ENABLED=true OTOPCUA_HISTORIAN_SERVER=<hostname> and fire a HistoryRead from the Client CLI; it should return live historical samples instead of the PR 4 placeholder error
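
For reviewers setting up the manual check, the environment-variable handling that BuildHistorianIfEnabled() is described as doing can be sketched roughly as below. The variable names and defaults (port 32568, integrated security on, 30 s timeout, 10000 max values, 60 s cooldown, OTOPCUA_HISTORIAN_SERVERS overriding OTOPCUA_HISTORIAN_SERVER) come from the commit message; the HistorianOptionsSketch type, the injected getEnv delegate, and the case-insensitive handling of "true" are assumptions, not the actual Program.cs code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical options bag; member names are illustrative,
// defaults are the ones stated in the PR.
public sealed class HistorianOptionsSketch
{
    public List<string> Servers = new List<string>();
    public int Port = 32568;
    public bool IntegratedSecurity = true;
    public string UserName = "";
    public string Password = "";
    public int TimeoutSeconds = 30;
    public int MaxValuesPerRead = 10000;
    public int CooldownSeconds = 60;
}

public static class HistorianEnvSketch
{
    // getEnv is injected so the parser can run without touching the process environment.
    // Returns null when the historian is disabled.
    public static HistorianOptionsSketch FromEnvironment(Func<string, string> getEnv)
    {
        if (!IsTrue(getEnv("OTOPCUA_HISTORIAN_ENABLED"), false)) return null;

        var options = new HistorianOptionsSketch
        {
            Port = ParseInt(getEnv("OTOPCUA_HISTORIAN_PORT"), 32568),
            IntegratedSecurity = IsTrue(getEnv("OTOPCUA_HISTORIAN_INTEGRATED"), true),
            UserName = getEnv("OTOPCUA_HISTORIAN_USER") ?? "",
            Password = getEnv("OTOPCUA_HISTORIAN_PASS") ?? "",
            TimeoutSeconds = ParseInt(getEnv("OTOPCUA_HISTORIAN_TIMEOUT_SEC"), 30),
            MaxValuesPerRead = ParseInt(getEnv("OTOPCUA_HISTORIAN_MAX_VALUES"), 10000),
            CooldownSeconds = ParseInt(getEnv("OTOPCUA_HISTORIAN_COOLDOWN_SEC"), 60),
        };

        // The comma-separated cluster list overrides the single-server variable.
        var cluster = (getEnv("OTOPCUA_HISTORIAN_SERVERS") ?? "")
            .Split(',').Select(s => s.Trim()).Where(s => s.Length > 0).ToList();
        var single = getEnv("OTOPCUA_HISTORIAN_SERVER");
        if (cluster.Count > 0) options.Servers = cluster;
        else if (!string.IsNullOrWhiteSpace(single)) options.Servers.Add(single.Trim());
        return options; // Servers stays empty if neither variable is set
    }

    // Accepts "1" or "true"; treating "true" case-insensitively is an assumption.
    private static bool IsTrue(string value, bool fallback)
    {
        if (string.IsNullOrWhiteSpace(value)) return fallback;
        value = value.Trim();
        return value == "1" || value.Equals("true", StringComparison.OrdinalIgnoreCase);
    }

    private static int ParseInt(string value, int fallback)
    {
        int parsed;
        return int.TryParse(value, out parsed) ? parsed : fallback;
    }
}
```

In a real host the delegate would typically be Environment.GetEnvironmentVariable, for example HistorianEnvSketch.FromEnvironment(Environment.GetEnvironmentVariable).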

Follow-ups / deferred

  • PR 6: Expose Aggregate / AtTime / Events / Health through IPC. The Galaxy.Host code is already ported and exercised by unit tests; the missing piece is the Shared.Contracts messages + handler branches + Proxy-side IHistoryProvider.ReadProcessedAsync wiring.
  • PR 6: Port the v1 alarm subsystem (AlarmExtension → OnAlarmEvent) from MxAccessGalaxyBackend. Shares the IPC ConnectionSink path that PR 4 built for OnDataChange.
  • PR 6: Port GalaxyRuntimeProbeManager → OnHostStatusChanged. Same pattern.
  • Archive-delete PR: Once PR 2 + PR 3 merge, the v1 Historian.Aveva + Historian.Aveva.Tests projects (the two remaining archived-but-still-building v1 surfaces) can be removed alongside the rest of the v1 stack.

🤖 Generated with Claude Code

dohertj2 added 1 commit 2026-04-18 01:48:28 -04:00
Phase 2 PR 5: port Wonderware Historian SDK into Driver.Galaxy.Host/Backend/Historian/.

The full v1 Historian.Aveva code path (HistorianDataSource + HistorianClusterEndpointPicker + IHistorianConnectionFactory + SdkHistorianConnectionFactory) now lives inside Galaxy.Host instead of the previously-required out-of-tree plugin + HistorianPluginLoader AssemblyResolve hack. MxAccessGalaxyBackend.HistoryReadAsync, which previously returned a Phase 2 Task B.1.h follow-up placeholder, now delegates to the ported HistorianDataSource.ReadRawAsync, maps HistorianSample to GalaxyDataValue via the IPC wire shape, and reports Success=true with per-tag HistoryTagValues arrays.

OPC-UA-free surface inside Galaxy.Host: the v1 code returned Opc.Ua.DataValue on the hot path, which would have required dragging OPCFoundation.NetStandard.Opc.Ua.Server into net48 x86 Galaxy.Host and bleeding OPC types across the IPC boundary. Instead, the port introduces HistorianSample (Value, Quality byte, TimestampUtc) + HistorianAggregateSample (Value, TimestampUtc) POCOs that carry the raw MX quality byte through the pipe unchanged, and the OPC translation happens on the Proxy side via the existing QualityMapper that the live-read path already uses. Decision #13's IPC data-shape contract, GalaxyDataValue (TagReference + ValueBytes MessagePack + ValueMessagePackType + StatusCode + SourceTimestampUtcUnixMs + ServerTimestampUtcUnixMs), survives intact, so there is no Shared.Contracts wire break vs PR 4.

Cluster failover preserved verbatim: HistorianClusterEndpointPicker is the thread-safe pure-logic picker ported verbatim with no SDK dependency (injected DateTime clock, per-node cooldown state, unknown-node-name tolerance, case-insensitive de-dup on configuration-order list). ConnectToAnyHealthyNode iterates the picker's healthy candidates, clones config per attempt, calls the factory, marks healthy on success / failed on exception with the failure message stored for dashboard surfacing, and throws "All N healthy historian candidate(s) failed" with the last exception chained when every node exhausts. Process path + Event path use separate HistorianAccess connections (CreateHistoryQuery vs CreateEventQuery vs CreateAnalogSummaryQuery on the SDK surface) guarded by independent _connection/_eventConnection locks: a mid-query failure on one silo resets only that connection, the other stays open.

Four SDK paths ported:

  • ReadRawAsync: RetrievalMode.Full, BatchSize from config.MaxValuesPerRead, MoveNext pump, per-sample quality + value decode with the StringValue/Value fallback the v1 code did, limit-based early exit.
  • ReadAggregateAsync: AnalogSummaryQuery + Resolution in ms; ExtractAggregateValue maps Average/Minimum/Maximum/ValueCount/First/Last/StdDev column names. The NodeId to column mapping is moved to the Proxy side since the IPC request carries a string column.
  • ReadAtTimeAsync: per-timestamp HistoryQuery with RetrievalMode.Interpolated + BatchSize=1; returns Quality=0 / Value=null for missing samples.
  • ReadEventsAsync: EventQuery + AddEventFilter("Source",Equal,sourceName) when sourceName is non-null, EventOrder.Ascending, EventCount = maxEvents or config.MaxValuesPerRead.

GetHealthSnapshot returns the full runtime-health snapshot (TotalQueries/Successes/Failures + ConsecutiveFailures + LastSuccess/FailureTime + LastError + ProcessConnectionOpen/EventConnectionOpen + ActiveProcessNode/ActiveEventNode + per-node state list).

ReadRaw is the only path wired through IPC in PR 5 (HistoryReadRequest/HistoryTagValues/HistoryReadResponse already existed in Shared.Contracts). Aggregate/AtTime/Events/Health are ported-but-not-yet-IPC-exposed: they stay internal to Galaxy.Host for PR 6+ to surface via new contract message kinds (aggregate = OPC UA HistoryReadProcessed, at-time = HistoryReadAtTime, events = HistoryReadEvents, health = admin dashboard IPC query).

Galaxy.Host csproj gains aahClientManaged + aahClientCommon references with Private=false (managed wrappers) + None items for aahClient.dll + Historian.CBE.dll + Historian.DPAPI.dll + ArchestrA.CloudHistorian.Contract.dll native satellites staged alongside the host exe via CopyToOutputDirectory=PreserveNewest, so aahClientManaged can P/Invoke into them at runtime without an AssemblyResolve hook (cleaner than the v1 HistorianPluginLoader.cs 180-LOC AssemblyResolve + Assembly.LoadFrom dance that existed solely because the plugin was loaded late from Host/bin/Debug/net48/Historian/).

Program.cs adds BuildHistorianIfEnabled() that reads:

  • OTOPCUA_HISTORIAN_ENABLED (true or 1)
  • OTOPCUA_HISTORIAN_SERVER
  • OTOPCUA_HISTORIAN_SERVERS (comma-separated cluster list overrides single-server)
  • OTOPCUA_HISTORIAN_PORT (default 32568)
  • OTOPCUA_HISTORIAN_INTEGRATED (default true)
  • OTOPCUA_HISTORIAN_USER/OTOPCUA_HISTORIAN_PASS
  • OTOPCUA_HISTORIAN_TIMEOUT_SEC (30)
  • OTOPCUA_HISTORIAN_MAX_VALUES (10000)
  • OTOPCUA_HISTORIAN_COOLDOWN_SEC (60)

It returns null when disabled so MxAccessGalaxyBackend.HistoryReadAsync surfaces a clean "Historian disabled" Success=false instead of a localhost-SDK hang. The server.RunAsync finally block now also casts backend to IDisposable and calls Dispose() so the historian SDK connections get cleanly closed on Ctrl+C.

MxAccessGalaxyBackend gains an IHistorianDataSource? historian constructor parameter (defaults null to preserve existing Host.Tests call sites that don't exercise HistoryRead) and implements IDisposable that forwards to _historian.Dispose(). The pragma warning disable CS0618 is locally scoped to the ToDto(HistorianEvent) mapper since the SDK marks Id/Source/DisplayText/Severity obsolete but the replacement surface isn't available in the aahClientManaged version we bind against; every other deprecated-SDK use still surfaces as an error under TreatWarningsAsErrors.

Ported from v1 Historian.Aveva unchanged:

  • The CloneConfigWithServerName helper that preserves every config field except ServerName per attempt.
  • The double-checked locking in EnsureConnected/EnsureEventConnected (fast path = Volatile.Read outside lock, slow path acquires lock + re-checks + disposes any raced-in-parallel connection).
  • HandleConnectionError/HandleEventConnectionError that close the dead connection, clear the active-node tracker, MarkFailed the picker entry with the exception message so the node enters cooldown, and log the reset with node= for operator correlation.
  • RecordSuccess/RecordFailure that bump counters under _healthLock.

Tests:

  • HistorianClusterEndpointPickerTests (7 cases): single-node ServerName fallback when ServerNames empty, MarkFailed enters cooldown and skips, cooldown expires after window, MarkHealthy immediately clears, all-in-cooldown returns empty healthy list, Snapshot reports failure count + last error + IsHealthy, case-insensitive de-dup on duplicate hostnames.
  • HistorianWiringTests (2 cases): HistoryReadAsync returns "Historian disabled" Success=false when historian:null passed; HistoryReadAsync with a fake IHistorianDataSource maps the returned HistorianSample (Value=42.5, Quality=192 Good, Timestamp) to a GalaxyDataValue with StatusCode=0u + SourceTimestampUtcUnixMs matching the sample + MessagePack-encoded value bytes.

InternalsVisibleTo("...Host.Tests") added to Galaxy.Host.csproj so tests can reach the internal HistorianClusterEndpointPicker. Full Galaxy.Host.Tests suite: 24 pass / 0 fail (9 new historian + 15 pre-existing MemoryWatchdog/PostMortemMmf/RecyclePolicy/StaPump/EndToEndIpc/Handshake). Full solution build: 0 errors (202 pre-existing warnings).

The v1 Historian.Aveva project + Historian.Aveva.Tests still build intact because the archive PR (Stream D.1 destructive delete) is still ahead of us. PR 5 intentionally does not delete either; once PR 2+3 merge and the archive-delete PR lands, a follow-up cleanup can remove Historian.Aveva + its 4 source files + 18 test cases.

Alarm subsystem wire-up (OnAlarmEvent raising from MxAccessGalaxyBackend via AlarmExtension primitives) + host-status push (OnHostStatusChanged via a ported GalaxyRuntimeProbeManager) remain PR 6 candidates. They were on the same "Task B.1.h follow-up" list and share the IPC connection-sink wiring with the historian events path. It made PR 5 scope-manageable to do Historian first since that's what has the biggest surface area (981 LOC v1 plus SDK binding), and alarms/host-status have more bespoke integration with the existing MxAccess subscription fan-out.

6df1a79d35
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
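
The failover loop the commit message attributes to ConnectToAnyHealthyNode (try each healthy candidate, mark healthy on success, mark failed on exception, throw with the last exception chained when every candidate fails) can be sketched as follows. This is a simplified illustration, not the ported method: the generic signature, the delegates standing in for the config clone + connection factory and for the picker, and the choice of InvalidOperationException are all assumptions.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch of a try-each-healthy-node failover loop.
public static class FailoverSketch
{
    public static TConn ConnectToAnyHealthyNode<TConn>(
        IReadOnlyList<string> healthyNodes,
        Func<string, TConn> connect,        // stands in for "clone config + call factory"
        Action<string> markHealthy,         // stands in for the picker's MarkHealthy
        Action<string, string> markFailed)  // stands in for the picker's MarkFailed
    {
        Exception last = null;
        foreach (var node in healthyNodes)
        {
            try
            {
                var connection = connect(node);
                markHealthy(node);
                return connection; // first success wins
            }
            catch (Exception ex)
            {
                // Record the failure message so the node can enter cooldown,
                // then fall through to the next candidate.
                markFailed(node, ex.Message);
                last = ex;
            }
        }

        // Every candidate failed: chain the last exception as InnerException.
        throw new InvalidOperationException(
            "All " + healthyNodes.Count + " healthy historian candidate(s) failed", last);
    }
}
```

Chaining the last exception keeps the most recent root cause visible to the caller while the per-node failure messages remain available through whatever markFailed records.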

Closing: the PR5 historian commit (6df1a79) was already an ancestor of v2 when PR #3 merged (it sat on the pr4-findings branch). No additional work required; the Historian port is live on v2.

dohertj2 closed this pull request 2026-04-18 06:58:35 -04:00
dohertj2 referenced this issue from a commit 2026-04-18 15:25:05 -04:00
Phase 3 PR 31 — Live-LDAP integration test + Active Directory compatibility. Closes LMX follow-up #4 with 6 live-bind tests in Server.Tests/LdapUserAuthenticatorLiveTests.cs against the dev GLAuth instance at localhost:3893 (skipped cleanly when unreachable via Assert.Skip + a clear SkipReason — matches the GalaxyRepositoryLiveSmokeTests pattern). Coverage: valid credentials bind + surface DisplayName; wrong password fails; unknown user fails; empty credentials fail pre-flight without touching the directory; writeop user's memberOf maps through GroupToRole to WriteOperate (the exact string WriteAuthzPolicy.IsAllowed expects); admin user surfaces all four mapped roles (WriteOperate + WriteTune + WriteConfigure + AlarmAck) proving memberOf parsing doesn't stop after the first match. While wiring this up, the authenticator's hard-coded user-lookup filter 'uid=<name>' didn't match GLAuth (which keys users by cn and doesn't populate uid) — AND it doesn't match Active Directory either, which uses sAMAccountName. Added UserNameAttribute to LdapOptions (default 'uid' for RFC 2307 backcompat) so deployments override to 'cn' / 'sAMAccountName' / 'userPrincipalName' as the directory requires; authenticator filter now interpolates the configured attribute. The default stays 'uid' so existing test fixtures and OpenLDAP installs keep working without a config change — a regression guard in LdapUserAuthenticatorAdCompatTests.LdapOptions_default_UserNameAttribute_is_uid_for_rfc2307_compat pins this so a future 'helpful' default change can't silently break anyone.

Reference: dohertj2/lmxopcua#4