lmxopcua/docs/v1/reqs/MxAccessClientReqs.md

Galaxy Driver — MXAccess Client Requirements

Revision — Refreshed 2026-04-19 for the OtOpcUa v2 multi-driver platform (task #205). Scope narrowed: this document covers the MXAccess surface inside OtOpcUa.Galaxy.Host (.NET Framework 4.8 x86 Windows service). The in-server Driver.Galaxy.Proxy implements the IReadable / IWritable / ISubscribable / IAlarmSource / IHistoryProvider capability interfaces and routes every wire call through the named pipe to this Host process. The STA thread + reconnect playback + subscription refcount requirements from v1 are preserved; what changed is where they live (Host service, not the Server process). MXA-010 (proxy-side wrapping) and MXA-011 (pipe ACL / shared secret) are new.

Parent: HLR-002, HLR-005, HLR-007

Driver scope: Galaxy only. Process scope: OtOpcUa.Galaxy.Host (Host side) and Driver.Galaxy.Proxy (server-side forwarder).

MXA-001: STA Thread with Message Pump

All MXAccess COM objects shall be created and called on a dedicated STA thread running a Win32 message pump to ensure COM callbacks are delivered.

Acceptance Criteria

  • A dedicated thread is created with ApartmentState.STA before any MXAccess COM object is instantiated; implementation lives in StaPump inside OtOpcUa.Galaxy.Host.
  • The thread runs a Win32 message pump using GetMessage / TranslateMessage / DispatchMessage.
  • Work items are marshalled to the STA thread via PostThreadMessage(WM_APP) and a concurrent queue.
  • All COM object creation (LMXProxyServer), method calls, and event callbacks happen on this thread.
  • The thread is named Galaxy.Sta for diagnostics.

Details

  • If the STA thread dies unexpectedly, log Fatal and trigger Host service shutdown. The supervisor restarts the Host under its driver-stability policy (docs/v2/driver-stability.md). COM objects on the dead thread are unrecoverable; no in-process recovery is attempted.
  • RunAsync(Action) returns a Task that completes when the action executes on the STA thread. Callers can await it.
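
The marshalling contract above (one dedicated thread, a work queue, and an awaitable result per work item) can be sketched independently of COM. The Host itself is .NET and pumps Win32 messages; the Python below is only an illustration of the queue-plus-future shape, and the names DedicatedThreadPump and run_async are hypothetical stand-ins for StaPump and RunAsync.

```python
import queue
import threading
from concurrent.futures import Future


class DedicatedThreadPump:
    """Runs every submitted action on one named worker thread (stand-in for the STA thread)."""

    def __init__(self, name="Galaxy.Sta"):
        self._work = queue.Queue()
        self._thread = threading.Thread(target=self._pump, name=name, daemon=True)
        self._thread.start()

    def _pump(self):
        # Plays the role of the message loop: block until work arrives, run it, repeat.
        while True:
            item = self._work.get()
            if item is None:  # shutdown sentinel posted by stop()
                break
            action, future = item
            try:
                future.set_result(action())
            except Exception as exc:
                # Failures travel back to the caller instead of killing the pump thread.
                future.set_exception(exc)

    def run_async(self, action):
        """Queue an action; the returned Future completes once it has run on the pump thread."""
        future = Future()
        self._work.put((action, future))
        return future

    def stop(self):
        self._work.put(None)
        self._thread.join()
```

A caller would submit work with `pump.run_async(lambda: ...)` and wait on `.result(timeout=...)`, which mirrors awaiting the Task returned by RunAsync.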

MXA-002: Connection Lifecycle

The Host shall support Register/Unregister lifecycle with the LMXProxyServer COM object, tracking the connection handle.

Acceptance Criteria

  • Register(clientName) is called on the STA thread and returns a positive connection handle on success.
  • Handle ≤ 0 → descriptive error thrown; Host reports DriverHealth.Unavailable via the pipe so the Proxy reports Bad quality to the core.
  • Unregister(handle) is called during disconnect after all subscriptions are removed.
  • Client name comes from OTOPCUA_GALAXY_CLIENT_NAME environment variable; default OtOpcUa-Galaxy.Host. Must be unique per MXAccess registration (a cluster's Primary and Secondary each get their own client-name suffix via node override).
  • Connection state transitions: Disconnected → Connecting → Connected → Disconnecting → Disconnected (and Error from any state).

Details

  • ConnectedSince (UTC) recorded after successful Register.
  • ReconnectCount tracked for diagnostics and /metrics.
  • State changes are emitted over the pipe as DriverHealth updates.
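
The state transitions listed in the acceptance criteria can be written down as a small table-driven state machine. This is a minimal sketch in Python rather than the Host's .NET code. The class names are hypothetical, and the exit from Error (back to Connecting) is an assumption, since the requirement only says that Error is reachable from any state.

```python
from enum import Enum


class ConnectionState(Enum):
    DISCONNECTED = "Disconnected"
    CONNECTING = "Connecting"
    CONNECTED = "Connected"
    DISCONNECTING = "Disconnecting"
    ERROR = "Error"


# Forward transitions named in MXA-002. Error is handled separately below.
_ALLOWED = {
    ConnectionState.DISCONNECTED: {ConnectionState.CONNECTING},
    ConnectionState.CONNECTING: {ConnectionState.CONNECTED},
    ConnectionState.CONNECTED: {ConnectionState.DISCONNECTING},
    ConnectionState.DISCONNECTING: {ConnectionState.DISCONNECTED},
    # Assumption: a reconnect attempt is the way out of Error.
    ConnectionState.ERROR: {ConnectionState.CONNECTING},
}


class ConnectionTracker:
    """Validates transitions and notifies a listener (e.g. a DriverHealth publisher)."""

    def __init__(self, on_change=None):
        self.state = ConnectionState.DISCONNECTED
        self._on_change = on_change or (lambda old, new: None)

    def transition(self, new_state):
        # Error is reachable from any state; everything else must follow the table.
        if new_state is not ConnectionState.ERROR and new_state not in _ALLOWED[self.state]:
            raise ValueError(f"illegal transition {self.state.value} -> {new_state.value}")
        old, self.state = self.state, new_state
        self._on_change(old, new_state)
```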

MXA-003: Tag Subscription

The Host shall support subscribing to tags via AddItem + AdviseSupervisory, receiving value updates through OnDataChange callbacks.

Acceptance Criteria

  • Subscribe sequence: AddItem(handle, address) returns item handle, then AdviseSupervisory(handle, itemHandle) starts the subscription.
  • OnDataChange callback delivers value, quality, timestamp, and MXSTATUS_PROXY array.
  • Item address format: tag_name.AttributeName for scalars, tag_name.AttributeName[] for whole arrays.
  • AddItem failure → Warning logged, failure propagated over the pipe to the Proxy.
  • Bidirectional maps of address ↔ itemHandle maintained for callback resolution.
  • Multi-client refcounting: two Proxy-side subscribe calls for the same address produce one MXAccess subscription; the last unsubscribe drops the refcount to zero and triggers UnAdvise / RemoveItem.

Details

  • AdviseSupervisory (not Advise) is used because this is a background service without an interactive user session.
  • Stored subscriptions dictionary maps address → callback for reconnect replay.
  • On reconnect, every entry in stored subscriptions is re-subscribed (AddItem + AdviseSupervisory with new handles).
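
The refcounting and bidirectional-map rules above can be illustrated with a short sketch. It is written in Python for brevity, not taken from the Host, and it assumes the four MXAccess calls are injected as plain callables (add_item, advise, unadvise, remove_item are hypothetical parameter names) so the bookkeeping can be shown without COM.

```python
class RefCountedSubscriptions:
    """One underlying subscription per address, shared by any number of subscribers."""

    def __init__(self, add_item, advise, unadvise, remove_item):
        self._add_item = add_item
        self._advise = advise
        self._unadvise = unadvise
        self._remove_item = remove_item
        self._refs = {}               # address -> number of active subscribers
        self._handle_by_address = {}  # address -> item handle
        self._address_by_handle = {}  # item handle -> address (callback resolution)

    def subscribe(self, address):
        count = self._refs.get(address, 0)
        if count == 0:
            # First subscriber: create the real subscription and record both map directions.
            item_handle = self._add_item(address)
            self._advise(item_handle)
            self._handle_by_address[address] = item_handle
            self._address_by_handle[item_handle] = address
        self._refs[address] = count + 1

    def unsubscribe(self, address):
        count = self._refs.get(address, 0)
        if count == 0:
            return  # nothing to release
        if count > 1:
            self._refs[address] = count - 1
            return
        # Last subscriber: tear down in UnAdvise -> RemoveItem order.
        del self._refs[address]
        item_handle = self._handle_by_address.pop(address)
        del self._address_by_handle[item_handle]
        self._unadvise(item_handle)
        self._remove_item(item_handle)

    def address_for(self, item_handle):
        """Resolve the address for an incoming data-change callback, or None if unknown."""
        return self._address_by_handle.get(item_handle)
```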

MXA-004: Tag Read/Write

The Host shall support synchronous-style read and write operations, marshalled to the STA thread, with configurable timeouts.

Acceptance Criteria

  • Read pattern: prefer cached subscription value; fall back to subscribe-get-first-value-unsubscribe (AddItem → AdviseSupervisory → wait for OnDataChange → UnAdvise → RemoveItem).
  • Write: AddItem → AdviseSupervisory → Write() → await OnWriteComplete callback → cleanup.
  • Read timeout: Galaxy:ReadTimeoutSeconds in driver config (default 5 seconds) — enforced on the Host side in addition to the Proxy-side Polly Timeout leg.
  • Write timeout: Galaxy:WriteTimeoutSeconds (default 5 seconds) — enforced similarly.
  • Concurrent operation limit: configurable semaphore (Galaxy:MaxConcurrentOperations, default 10).
  • All operations marshalled to the STA thread.

Details

  • Write uses security classification -1 (no security). Galaxy runtime enforces security; OtOpcUa authorization is enforced server-side before the call ever reaches the pipe (per OPC-014 AuthorizationGate).
  • OnWriteComplete: check MXSTATUS_PROXY.success. If 0, extract detail code and propagate as an error over the pipe.
  • COM exceptions are translated to meaningful error messages.
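
The read pattern (cached value first, otherwise a bounded and time-limited one-shot subscription) can be sketched as follows. This is an illustrative Python model, not the Host's implementation. TagReader and subscribe_once are hypothetical names, and letting cached reads bypass the concurrency limit is an assumption that the requirement does not state either way.

```python
import asyncio


class TagReader:
    """Cache-first read with a concurrency gate and a timeout on the fallback path."""

    def __init__(self, cache, subscribe_once, timeout_seconds=5.0, max_concurrent=10):
        self._cache = cache                    # address -> last subscribed value
        self._subscribe_once = subscribe_once  # async callable: address -> first value
        self._timeout = timeout_seconds        # mirrors Galaxy:ReadTimeoutSeconds
        self._gate = asyncio.Semaphore(max_concurrent)  # mirrors Galaxy:MaxConcurrentOperations

    async def read(self, address):
        if address in self._cache:
            # Preferred path: an existing subscription already holds a value.
            return self._cache[address]
        async with self._gate:
            # Fallback path: subscribe, wait for the first value, unsubscribe.
            # wait_for raises asyncio.TimeoutError if no value arrives in time.
            return await asyncio.wait_for(self._subscribe_once(address), self._timeout)
```

Constructing the reader inside a running coroutine keeps the semaphore bound to the right event loop on older Python versions.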

MXA-005: Auto-Reconnect

The Host shall monitor connection health and automatically reconnect on failure, replaying all stored subscriptions after reconnect.

Acceptance Criteria

  • Monitor loop runs on a background thread at Galaxy:MonitorIntervalSeconds (default 5 seconds).
  • On disconnect, attempt reconnect. On success, replay all stored subscriptions.
  • On reconnect failure, log Warning and retry at next interval (no exponential backoff inside the Host; the Proxy-side Polly pipeline handles cross-process backoff against pipe failures).
  • Reconnect count is incremented on each successful reconnect.
  • Monitor loop is cancellable for clean Host shutdown.

Details

  • Reconnect cleans up old COM objects before creating new ones.
  • After reconnect, probe subscription (MXA-006) is re-established first, then stored subscriptions.
  • No max retry limit — keep trying indefinitely until the Host service is stopped.
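
The monitor behaviour described above (fixed interval, no backoff, no retry cap, probe replayed before stored subscriptions, cancellable) fits in one loop. The sketch below is a Python illustration under the assumption that the connection operations are passed in as callables; run_monitor and its parameter names are hypothetical.

```python
import threading


def run_monitor(is_connected, reconnect, replay_probe, replay_subscriptions,
                stop_event, interval_seconds=5.0, log=print):
    """Fixed-interval reconnect loop; returns the number of successful reconnects."""
    reconnect_count = 0
    while not stop_event.is_set():
        if not is_connected():
            try:
                reconnect()
            except Exception as exc:
                # No backoff and no retry limit: just try again on the next tick.
                log(f"Warning: reconnect failed, retrying next interval: {exc}")
            else:
                reconnect_count += 1
                replay_probe()           # probe subscription first (MXA-006)
                replay_subscriptions()   # then every stored subscription
        # wait() doubles as the sleep and the cancellation point for clean shutdown.
        stop_event.wait(interval_seconds)
    return reconnect_count
```

Using `stop_event.wait(interval)` instead of a plain sleep means a shutdown request is noticed immediately rather than after the remainder of the interval.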

MXA-006: Probe-Based Health Monitoring

The Host shall optionally subscribe to a configurable probe tag and use OnDataChange callback staleness to detect silent connection failures.

Acceptance Criteria

  • Probe tag address configured via Galaxy:ProbeTag. If unset, probe monitoring is disabled.
  • _lastProbeValueTime (UTC) is updated on each OnDataChange for the probe tag.
  • If DateTime.UtcNow - _lastProbeValueTime > staleThreshold, force disconnect and reconnect.
  • Stale threshold: Galaxy:ProbeStaleThresholdSeconds (default 60 seconds).
  • The Proxy side implements IHostConnectivityProbe so that the core's CapabilityInvoker records probe outcomes with DriverCapability.Probe telemetry.

Details

  • The probe tag should be an attribute the Galaxy runtime updates regularly (platform heartbeat, area timestamp). Specific tag is site-dependent.
  • After forced reconnect, reset _lastProbeValueTime to DateTime.UtcNow.
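
The staleness rule is a single comparison against the last probe timestamp. The Python sketch below models it with an injectable clock so the threshold can be exercised without waiting. ProbeWatch is a hypothetical name, and seeding the timestamp at construction time is an assumption.

```python
from datetime import datetime, timedelta, timezone


class ProbeWatch:
    """Tracks the last probe update and reports when it is older than the threshold."""

    def __init__(self, stale_threshold_seconds=60, now=lambda: datetime.now(timezone.utc)):
        self._threshold = timedelta(seconds=stale_threshold_seconds)
        self._now = now
        # Assumption: start the clock at construction so a fresh watch is not stale.
        self._last_probe_value_time = now()

    def on_probe_data_change(self):
        self._last_probe_value_time = self._now()

    def is_stale(self):
        # Strictly greater than, matching "UtcNow - _lastProbeValueTime > staleThreshold".
        return self._now() - self._last_probe_value_time > self._threshold

    def reset_after_reconnect(self):
        # Avoids an immediate second forced reconnect before the probe has reported.
        self._last_probe_value_time = self._now()
```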

MXA-007: COM Cleanup

On disconnect or disposal, the Host shall unwire event handlers, unadvise/remove all items, unregister, and release COM objects via Marshal.ReleaseComObject.

Acceptance Criteria

  • Cleanup order: UnAdvise all active subscriptions → RemoveItem all items → unwire OnDataChange and OnWriteComplete handlers → Unregister → Marshal.ReleaseComObject.
  • On dispose: run disconnect if still connected, then dispose STA thread.
  • Each cleanup step wrapped in try/catch (cleanup must not throw).
  • After cleanup: handle maps cleared, pending write TaskCompletionSource (TCS) entries abandoned, COM reference set to null.

Details

  • Stored subscriptions are NOT cleared on disconnect (preserved for reconnect replay). Only cleared on Dispose.
  • Event handlers unwired BEFORE Unregister (else callbacks may fire on a dead object).
  • Marshal.ReleaseComObject in a finally block, always.
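
The cleanup ordering and the "cleanup must not throw" rule can be expressed as an ordered list of guarded steps. The sketch below is Python pseudostructure for the idea, not the Host's .NET code. The step names are labels for the operations listed above, and run_cleanup is a hypothetical helper.

```python
# Order taken from the acceptance criteria: handlers are unwired before Unregister,
# and the COM release always comes last.
CLEANUP_ORDER = ("UnAdvise", "RemoveItem", "UnwireHandlers", "Unregister", "ReleaseComObject")


def run_cleanup(actions, log=print):
    """Run every step in order; a failing step is logged and never stops the rest.

    `actions` maps each name in CLEANUP_ORDER to a zero-argument callable.
    Returns the names of the steps that failed.
    """
    failed = []
    for name in CLEANUP_ORDER:
        try:
            actions[name]()
        except Exception as exc:  # cleanup must not throw
            failed.append(name)
            log(f"cleanup step {name} failed: {exc}")
    return failed
```

Because each step has its own guard, the final release step still runs after an earlier failure, which is the effect the finally-block requirement is after.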

MXA-008: Operation Metrics

The MXAccess Host shall record timing and success/failure for Read, Write, and Subscribe operations.

Acceptance Criteria

  • Each operation records duration (ms) + success/failure.
  • Metrics exposed over the pipe to the Proxy, which re-publishes them via OpenTelemetry → Prometheus under DriverInstanceId = "galaxy-*", HostName = "galaxy.host".
  • Rolling 1000-entry buffer for percentile calculation.
  • Uses an ITimingScope pattern: using (var scope = metrics.BeginOperation("read")) { ... }.
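
A rolling buffer with a scoped timer can be sketched in a few lines. The Python below is an illustration of the pattern, not the ITimingScope implementation. OperationMetrics and begin_operation are hypothetical names, keeping one buffer per operation name is an assumption, and the nearest-rank percentile method is likewise an assumption, since the requirement does not name one.

```python
import math
import time
from collections import deque
from contextlib import contextmanager


class OperationMetrics:
    """Per-operation rolling buffer of (duration_ms, success) samples."""

    def __init__(self, capacity=1000):
        self._capacity = capacity
        self._samples = {}  # operation name -> deque of (duration_ms, success)

    def record(self, name, duration_ms, success):
        # deque(maxlen=...) silently drops the oldest sample once the buffer is full.
        self._samples.setdefault(name, deque(maxlen=self._capacity)).append((duration_ms, success))

    @contextmanager
    def begin_operation(self, name):
        """Scoped timer: records duration and whether the body raised."""
        start = time.perf_counter()
        success = True
        try:
            yield
        except BaseException:
            success = False
            raise
        finally:
            self.record(name, (time.perf_counter() - start) * 1000.0, success)

    def count(self, name):
        return len(self._samples.get(name, ()))

    def failures(self, name):
        return sum(1 for _, ok in self._samples.get(name, ()) if not ok)

    def percentile(self, name, pct):
        """Nearest-rank percentile of recorded durations, or None with no samples."""
        durations = sorted(d for d, _ in self._samples.get(name, ()))
        if not durations:
            return None
        rank = max(1, math.ceil(pct * len(durations) / 100))
        return durations[rank - 1]
```

Usage follows the scope shape from the requirement: `with metrics.begin_operation("read"): ...`.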

MXA-009: Error Code Translation

The Host shall translate known MXAccess error codes from MXSTATUS_PROXY.detail into human-readable messages for logging and OPC UA status propagation.

Acceptance Criteria

  • Error 1008 → "User lacks security permission"
  • Error 1012 → "Secured write required (one signature)"
  • Error 1013 → "Verified write required (two signatures)"
  • Unknown error codes logged with their numeric value.
  • Translated messages flow back through the pipe and surface in OPC UA StatusCode descriptions and Server logs.
  • Errors 1008 / 1012 / 1013 on write operations map to Bad_UserAccessDenied at the OPC UA surface.
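
The translation rules above amount to a lookup table plus one mapping rule for writes. The sketch below is Python for illustration only. The function name translate_mx_error is hypothetical, and the status for cases the requirement does not cover is deliberately left as None rather than guessed.

```python
# Codes and messages are the three listed in MXA-009.
KNOWN_ERRORS = {
    1008: "User lacks security permission",
    1012: "Secured write required (one signature)",
    1013: "Verified write required (two signatures)",
}


def translate_mx_error(detail, is_write):
    """Return (message, opc_status) for an MXSTATUS_PROXY.detail code.

    opc_status is "Bad_UserAccessDenied" for the known codes on writes and None
    otherwise, because the requirement does not specify other mappings.
    """
    if detail in KNOWN_ERRORS:
        status = "Bad_UserAccessDenied" if is_write else None
        return KNOWN_ERRORS[detail], status
    # Unknown codes keep their numeric value so they can be looked up later.
    return f"Unknown MXAccess error code {detail}", None
```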

MXA-010: Proxy-Side Capability Wrapping

Driver.Galaxy.Proxy shall implement the capability interfaces as thin forwarders that serialize every call through the named pipe and route every call through CapabilityInvoker.

Acceptance Criteria

  • Driver.Galaxy.Proxy implements IDriver + IReadable + IWritable + ISubscribable + ITagDiscovery + IRediscoverable + IAlarmSource + IHistoryProvider + IHostConnectivityProbe.
  • Each implementation uses CapabilityInvoker.InvokeAsync(DriverCapability.<...>, …) — direct pipe calls bypassing the invoker are caught by Roslyn OTOPCUA0001.
  • Each method serializes a MessagePack request frame, sends over the pipe, awaits the response frame, deserializes, returns.
  • Pipe disconnect mid-call → CapabilityInvoker's circuit breaker counts the failure; sustained disconnect opens the circuit and Galaxy nodes surface Bad quality until the pipe reconnects.
  • Proxy tolerates Host service restarts — it automatically reconnects and replays subscription setup (parallel to MXA-005 but across the IPC boundary).

MXA-011: Pipe Security

The named pipe between Proxy and Host shall be restricted to the Server's runtime principal via SID-based ACL and authenticated with a per-process shared secret.

Acceptance Criteria

  • Pipe name from OTOPCUA_GALAXY_PIPE environment variable; default OtOpcUaGalaxy.
  • Allowed SID passed as OTOPCUA_ALLOWED_SID — only the declared principal (typically the Server service account) can open the pipe; Administrators is explicitly NOT granted (per the project_galaxy_host_installed memory note).
  • Shared secret passed via OTOPCUA_GALAXY_SECRET at spawn time; the Proxy must present the matching secret on the opening handshake.
  • Secret is process-scoped (regenerated per Host restart) and never persisted to disk or Config DB.
  • Pipe ACL denials are logged as Warning with the rejected principal SID.

Details

  • Environment variables are passed by the supervisor launching the Host (docs/v2/driver-stability.md).
  • Dev-box secret is stored at .local/galaxy-host-secret.txt for NSSM-wrapped development runs (memory note: project_galaxy_host_installed).
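
The shared-secret part of the handshake can be sketched as follows. This Python example only models generating a fresh secret per spawn and comparing it in constant time. The SID-based pipe ACL is a Windows concern and is not modelled, the function names are hypothetical, and the 32-byte hex secret format is an assumption.

```python
import hmac
import secrets


def spawn_environment(pipe_name="OtOpcUaGalaxy", allowed_sid=""):
    """Build the environment a supervisor would hand to a freshly spawned Host.

    The secret is generated here on every call, so it changes per Host restart
    and lives only in process environments, never on disk.
    """
    return {
        "OTOPCUA_GALAXY_PIPE": pipe_name,
        "OTOPCUA_ALLOWED_SID": allowed_sid,
        "OTOPCUA_GALAXY_SECRET": secrets.token_hex(32),  # assumed format
    }


def verify_handshake(presented_secret, host_env):
    """Host-side check of the secret presented on the opening handshake."""
    expected = host_env.get("OTOPCUA_GALAXY_SECRET", "")
    if not expected:
        return False  # a Host without a secret accepts nobody
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(presented_secret.encode(), expected.encode())
```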