Galaxy Driver — MXAccess Client Requirements
Revision: Refreshed 2026-04-19 for the OtOpcUa v2 multi-driver platform (task #205). Scope narrowed: this document covers the MXAccess surface inside `OtOpcUa.Galaxy.Host` (.NET Framework 4.8 x86 Windows service). The in-server `Driver.Galaxy.Proxy` implements the `IReadable` / `IWritable` / `ISubscribable` / `IAlarmSource` / `IHistoryProvider` capability interfaces and routes every wire call through the named pipe to this Host process. The STA thread, reconnect playback, and subscription refcount requirements from v1 are preserved; what changed is where they live (Host service, not the Server process). MXA-010 (proxy-side wrapping) and MXA-011 (pipe ACL / shared secret) are new.
Parent: HLR-002, HLR-005, HLR-007
Driver scope: Galaxy only. Process scope: OtOpcUa.Galaxy.Host (Host side) and Driver.Galaxy.Proxy (server-side forwarder).
MXA-001: STA Thread with Message Pump
All MXAccess COM objects shall be created and called on a dedicated STA thread running a Win32 message pump to ensure COM callbacks are delivered.
Acceptance Criteria
- A dedicated thread is created with `ApartmentState.STA` before any MXAccess COM object is instantiated; implementation lives in `StaPump` inside `OtOpcUa.Galaxy.Host`.
- The thread runs a Win32 message pump using `GetMessage` / `TranslateMessage` / `DispatchMessage`.
- Work items are marshalled to the STA thread via `PostThreadMessage(WM_APP)` and a concurrent queue.
- All COM object creation (`LMXProxyServer`), method calls, and event callbacks happen on this thread.
- Thread name `Galaxy.Sta` (for diagnostics).
Details
- If the STA thread dies unexpectedly, log Fatal and trigger Host service shutdown. The supervisor restarts the Host under its driver-stability policy (`docs/v2/driver-stability.md`). COM objects on the dead thread are unrecoverable; no in-process recovery is attempted.
- `RunAsync(Action)` returns a `Task` that completes when the action executes on the STA thread. Callers can `await` it.
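
The marshalling contract above (queue a work item, get back something the caller can wait on) can be sketched in a language-neutral way. The Python below is an illustration only, not the Host's C# `StaPump`: a plain worker thread and `queue.Queue` stand in for the STA apartment and Win32 message pump, and the class name `SingleThreadPump` and its methods are hypothetical.

```python
import queue
import threading
from concurrent.futures import Future


class SingleThreadPump:
    """Runs every submitted callable on one dedicated, named thread."""

    _STOP = object()  # sentinel that ends the pump loop

    def __init__(self, name="Galaxy.Sta"):
        self._work = queue.Queue()
        self._thread = threading.Thread(target=self._pump, name=name, daemon=True)
        self._thread.start()

    def _pump(self):
        while True:
            item = self._work.get()
            if item is self._STOP:
                break
            func, future = item
            try:
                future.set_result(func())
            except Exception as exc:  # surface the failure to the waiting caller
                future.set_exception(exc)

    def run_async(self, func):
        """Queue func for the pump thread; the Future completes once it has run."""
        future = Future()
        self._work.put((func, future))
        return future

    def stop(self):
        self._work.put(self._STOP)
        self._thread.join()


pump = SingleThreadPump()
# The callable executes on the pump thread, not on the caller's thread.
thread_name = pump.run_async(lambda: threading.current_thread().name).result(timeout=5)
pump.stop()
```

Because every callable runs on the same thread, objects created inside one work item can safely be touched by later work items, which is the property the COM apartment rule depends on.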
MXA-002: Connection Lifecycle
The Host shall support Register/Unregister lifecycle with the LMXProxyServer COM object, tracking the connection handle.
Acceptance Criteria
- `Register(clientName)` is called on the STA thread and returns a positive connection handle on success.
- Handle ≤ 0 → descriptive error thrown; Host reports `DriverHealth.Unavailable` via the pipe so the Proxy reports Bad quality to the core.
- `Unregister(handle)` is called during disconnect after all subscriptions are removed.
- Client name comes from the `OTOPCUA_GALAXY_CLIENT_NAME` environment variable; default `OtOpcUa-Galaxy.Host`. Must be unique per MXAccess registration (a cluster's Primary and Secondary each get their own client-name suffix via node override).
- Connection state transitions: Disconnected → Connecting → Connected → Disconnecting → Disconnected (and Error from any state).
Details
- `ConnectedSince` (UTC) recorded after successful Register. `ReconnectCount` tracked for diagnostics and `/metrics`.
- State changes are emitted over the pipe as `DriverHealth` updates.
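
The state transitions and the handle check can be written down as a small table. This is a hedged Python sketch, not the Host's code: the names `ConnState`, `can_transition`, and `check_register_handle` are hypothetical, and the edge out of Error is an assumption because the requirement only says Error is reachable from any state.

```python
from enum import Enum


class ConnState(Enum):
    DISCONNECTED = "Disconnected"
    CONNECTING = "Connecting"
    CONNECTED = "Connected"
    DISCONNECTING = "Disconnecting"
    ERROR = "Error"


# Happy-path edges from the acceptance criteria.
_ALLOWED = {
    ConnState.DISCONNECTED: {ConnState.CONNECTING},
    ConnState.CONNECTING: {ConnState.CONNECTED},
    ConnState.CONNECTED: {ConnState.DISCONNECTING},
    ConnState.DISCONNECTING: {ConnState.DISCONNECTED},
    # Assumption: recovery from Error restarts at Connecting; the spec does not say.
    ConnState.ERROR: {ConnState.CONNECTING},
}


def can_transition(current, target):
    """Error is reachable from any state; everything else follows the table."""
    return target is ConnState.ERROR or target in _ALLOWED[current]


def check_register_handle(handle):
    """Register returns a positive handle on success; anything else is an error."""
    if handle <= 0:
        raise ConnectionError(f"MXAccess Register failed: handle {handle} is not positive")
    return handle
```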
MXA-003: Tag Subscription
The Host shall support subscribing to tags via AddItem + AdviseSupervisory, receiving value updates through OnDataChange callbacks.
Acceptance Criteria
- Subscribe sequence: `AddItem(handle, address)` returns item handle, then `AdviseSupervisory(handle, itemHandle)` starts the subscription.
- `OnDataChange` callback delivers value, quality, timestamp, and MXSTATUS_PROXY array.
- Item address format: `tag_name.AttributeName` for scalars, `tag_name.AttributeName[]` for whole arrays.
- AddItem failure → Warning logged, failure propagated over the pipe to the Proxy.
- Bidirectional maps of `address ↔ itemHandle` maintained for callback resolution.
- Multi-client refcounting: two Proxy-side subscribe calls for the same address produce one MXAccess subscription; refcount decrement on the last unsubscribe triggers `UnAdvise` / `RemoveItem`.
Details
- `AdviseSupervisory` (not `Advise`) is used because this is a background service without an interactive user session.
- Stored subscriptions dictionary maps address → callback for reconnect replay.
- On reconnect, every entry in stored subscriptions is re-subscribed (AddItem + AdviseSupervisory with new handles).
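
The refcounting rule and the bidirectional handle maps can be illustrated with a small table class. This is a sketch under stated assumptions, written in Python rather than the Host's C#: `FakeMx` is a hypothetical stand-in for the COM object that only records calls, `SubscriptionTable` is a hypothetical name, and `Tank1.Level` is an invented address.

```python
class FakeMx:
    """Stand-in for the COM object; records calls and hands out item handles."""

    def __init__(self):
        self.calls = []
        self._next = 1

    def add_item(self, address):
        handle = self._next
        self._next += 1
        self.calls.append(("AddItem", address))
        return handle

    def advise_supervisory(self, item_handle):
        self.calls.append(("AdviseSupervisory", item_handle))

    def un_advise(self, item_handle):
        self.calls.append(("UnAdvise", item_handle))

    def remove_item(self, item_handle):
        self.calls.append(("RemoveItem", item_handle))


class SubscriptionTable:
    def __init__(self, mx):
        self._mx = mx
        self._handle_by_address = {}
        self._address_by_handle = {}
        self._refcount = {}

    def subscribe(self, address):
        if address in self._refcount:          # already advised: just count the caller
            self._refcount[address] += 1
            return
        handle = self._mx.add_item(address)
        self._mx.advise_supervisory(handle)
        self._handle_by_address[address] = handle
        self._address_by_handle[handle] = address
        self._refcount[address] = 1

    def unsubscribe(self, address):
        self._refcount[address] -= 1
        if self._refcount[address] > 0:        # other callers still need the item
            return
        handle = self._handle_by_address.pop(address)
        del self._address_by_handle[handle]
        del self._refcount[address]
        self._mx.un_advise(handle)
        self._mx.remove_item(handle)

    def address_for(self, item_handle):
        """Resolve a callback's item handle back to its address."""
        return self._address_by_handle[item_handle]


mx = FakeMx()
table = SubscriptionTable(mx)
table.subscribe("Tank1.Level")
table.subscribe("Tank1.Level")     # second caller: no new MXAccess subscription
table.unsubscribe("Tank1.Level")
calls_after_first_unsubscribe = list(mx.calls)
table.unsubscribe("Tank1.Level")   # last caller: UnAdvise + RemoveItem
```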
MXA-004: Tag Read/Write
The Host shall support synchronous-style read and write operations, marshalled to the STA thread, with configurable timeouts.
Acceptance Criteria
- Read pattern: prefer cached subscription value; fall back to subscribe-get-first-value-unsubscribe (AddItem → AdviseSupervisory → wait for OnDataChange → UnAdvise → RemoveItem).
- Write: AddItem → AdviseSupervisory → `Write()` → await `OnWriteComplete` callback → cleanup.
- Read timeout: `Galaxy:ReadTimeoutSeconds` in driver config (default 5 seconds), enforced on the Host side in addition to the Proxy-side Polly `Timeout` leg.
- Write timeout: `Galaxy:WriteTimeoutSeconds` (default 5 seconds), enforced similarly.
- Concurrent operation limit: configurable semaphore (`Galaxy:MaxConcurrentOperations`, default 10).
- All operations marshalled to the STA thread.
Details
- Write uses security classification `-1` (no security). Galaxy runtime enforces security; OtOpcUa authorization is enforced server-side before the call ever reaches the pipe (per OPC-014 `AuthorizationGate`).
- `OnWriteComplete`: check `MXSTATUS_PROXY.success`. If 0, extract detail code and propagate as an error over the pipe.
- COM exceptions translated to meaningful error messages.
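
The combination of a concurrency cap and a per-operation timeout might look like the following. It is a Python `asyncio` sketch rather than the Host's implementation, the class name `OperationGate` is hypothetical, and it assumes the timeout starts once a slot has been acquired (the requirement does not say whether queueing time counts).

```python
import asyncio


class OperationGate:
    """Caps concurrent operations and applies a per-operation timeout."""

    def __init__(self, max_concurrent=10, timeout_seconds=5.0):
        self._slots = asyncio.Semaphore(max_concurrent)
        self._timeout = timeout_seconds

    async def run(self, operation):
        async with self._slots:  # wait for a free slot first
            # Assumption: the timeout clock starts only after the slot is held.
            return await asyncio.wait_for(operation(), self._timeout)


async def _demo():
    gate = OperationGate(max_concurrent=2, timeout_seconds=0.05)
    running = 0
    peak = 0

    async def fast_read():
        nonlocal running, peak
        running += 1
        peak = max(peak, running)
        await asyncio.sleep(0.01)
        running -= 1
        return "ok"

    async def stuck_write():
        await asyncio.sleep(1)  # never answers within the timeout

    results = await asyncio.gather(*(gate.run(fast_read) for _ in range(5)))
    try:
        await gate.run(stuck_write)
        timed_out = False
    except asyncio.TimeoutError:
        timed_out = True
    return results, peak, timed_out


results, peak, timed_out = asyncio.run(_demo())
```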
MXA-005: Auto-Reconnect
The Host shall monitor connection health and automatically reconnect on failure, replaying all stored subscriptions after reconnect.
Acceptance Criteria
- Monitor loop runs on a background thread at `Galaxy:MonitorIntervalSeconds` (default 5 seconds).
- On disconnect, attempt reconnect. On success, replay all stored subscriptions.
- On reconnect failure, log Warning and retry at next interval (no exponential backoff inside the Host; the Proxy-side Polly pipeline handles cross-process backoff against pipe failures).
- Reconnect count is incremented on each successful reconnect.
- Monitor loop is cancellable for clean Host shutdown.
Details
- Reconnect cleans up old COM objects before creating new ones.
- After reconnect, probe subscription (MXA-006) is re-established first, then stored subscriptions.
- No max retry limit — keep trying indefinitely until the Host service is stopped.
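
One pass of the monitor loop, with the ordering from the details above, can be sketched as follows. This is illustrative Python, not the Host's code: `monitor_once`, `monitor_loop`, and the `client` methods they call (`is_connected`, `cleanup`, `connect`, `subscribe`, `reconnect_count`) are hypothetical names, and `stop_event` is assumed to be a `threading.Event`.

```python
import logging

log = logging.getLogger("galaxy.monitor")


def monitor_once(client, stored_subscriptions, probe_tag=None):
    """One monitor pass. Returns True if a reconnect succeeded on this pass."""
    if client.is_connected():
        return False
    try:
        client.cleanup()   # release old COM objects before creating new ones
        client.connect()
    except Exception as exc:
        log.warning("reconnect failed, retrying next interval: %s", exc)
        return False
    if probe_tag is not None:
        client.subscribe(probe_tag)      # probe subscription first (MXA-006)
    for address in stored_subscriptions:
        client.subscribe(address)        # then replay stored subscriptions
    client.reconnect_count += 1          # counted only on success
    return True


def monitor_loop(client, stored_subscriptions, stop_event,
                 interval_seconds=5.0, probe_tag=None):
    """Fixed-interval retry, no backoff, no retry cap; ends when stop_event is set."""
    while not stop_event.is_set():
        monitor_once(client, stored_subscriptions, probe_tag)
        stop_event.wait(interval_seconds)  # returns early on shutdown
```

Waiting on the event instead of sleeping is what makes the loop cancellable: setting the event wakes the thread immediately rather than after the remaining interval.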
MXA-006: Probe-Based Health Monitoring
The Host shall optionally subscribe to a configurable probe tag and use OnDataChange callback staleness to detect silent connection failures.
Acceptance Criteria
- Probe tag address configured via `Galaxy:ProbeTag`. If unset, probe monitoring is disabled.
- Track `_lastProbeValueTime` (UTC) updated on each OnDataChange for the probe tag.
- If `DateTime.UtcNow - _lastProbeValueTime > staleThreshold`, force disconnect and reconnect.
- Stale threshold: `Galaxy:ProbeStaleThresholdSeconds` (default 60 seconds).
- Implements `IHostConnectivityProbe` on the Proxy side so the core's `CapabilityInvoker` records probe outcomes with `DriverCapability.Probe` telemetry.
Details
- The probe tag should be an attribute the Galaxy runtime updates regularly (platform heartbeat, area timestamp). Specific tag is site-dependent.
- After forced reconnect, reset `_lastProbeValueTime` to `DateTime.UtcNow`.
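
The staleness rule reduces to one timestamp comparison. The Python sketch below mirrors it with an injectable clock so the threshold boundary can be tested; the class name `ProbeMonitor` and its methods are hypothetical, and initializing the timestamp at construction is an assumption.

```python
from datetime import datetime, timedelta, timezone


class ProbeMonitor:
    def __init__(self, stale_threshold_seconds=60, now=None):
        self._threshold = timedelta(seconds=stale_threshold_seconds)
        self._now = now or (lambda: datetime.now(timezone.utc))
        # Assumption: start the clock at construction so a fresh monitor is not stale.
        self._last_probe_value_time = self._now()

    def on_data_change(self):
        """Called for every OnDataChange on the probe tag."""
        self._last_probe_value_time = self._now()

    def is_stale(self):
        """True when the probe has been silent for longer than the threshold."""
        return self._now() - self._last_probe_value_time > self._threshold

    def after_forced_reconnect(self):
        """Reset the timestamp so the new connection gets a full threshold window."""
        self._last_probe_value_time = self._now()
```

The comparison is strictly greater-than, matching the acceptance criterion: a probe value exactly `staleThreshold` old does not yet force a reconnect.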
MXA-007: COM Cleanup
On disconnect or disposal, the Host shall unwire event handlers, unadvise/remove all items, unregister, and release COM objects via Marshal.ReleaseComObject.
Acceptance Criteria
- Cleanup order: UnAdvise all active subscriptions → RemoveItem all items → unwire OnDataChange and OnWriteComplete handlers → Unregister → `Marshal.ReleaseComObject`.
- On dispose: run disconnect if still connected, then dispose STA thread.
- Each cleanup step wrapped in try/catch (cleanup must not throw).
- After cleanup: handle maps cleared, pending write TCS entries abandoned, COM reference set to null.
Details
- Stored subscriptions are NOT cleared on disconnect (preserved for reconnect replay). Only cleared on Dispose.
- Event handlers unwired BEFORE Unregister (else callbacks may fire on a dead object).
- `Marshal.ReleaseComObject` in a `finally` block, always.
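
The cleanup ordering, the per-step exception guard, and the unconditional release can be sketched together. This is Python for illustration only: `disconnect`, `_safe`, and the `mx` methods (`un_advise`, `remove_item`, `unwire_handlers`, `unregister`, `release`) are hypothetical stand-ins for the COM calls named above.

```python
import logging

log = logging.getLogger("galaxy.cleanup")


def _safe(step_name, action):
    """Run one cleanup step; log and swallow any failure so later steps still run."""
    try:
        action()
    except Exception as exc:
        log.warning("cleanup step %s failed: %s", step_name, exc)


def disconnect(mx, item_handles, conn_handle):
    try:
        for h in item_handles:
            _safe("UnAdvise", lambda h=h: mx.un_advise(h))
        for h in item_handles:
            _safe("RemoveItem", lambda h=h: mx.remove_item(h))
        _safe("unwire handlers", mx.unwire_handlers)        # before Unregister
        _safe("Unregister", lambda: mx.unregister(conn_handle))
    finally:
        _safe("release", mx.release)   # stands in for Marshal.ReleaseComObject
        item_handles.clear()           # handle maps cleared after cleanup
```

Stored subscriptions are deliberately not an argument here: per the details above they survive disconnect and are cleared only on dispose.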
MXA-008: Operation Metrics
The MXAccess Host shall record timing and success/failure for Read, Write, and Subscribe operations.
Acceptance Criteria
- Each operation records duration (ms) + success/failure.
- Metrics exposed over the pipe to the Proxy, which re-publishes them via OpenTelemetry → Prometheus under `DriverInstanceId = "galaxy-*"`, `HostName = "galaxy.host"`.
- Rolling 1000-entry buffer for percentile calculation.
- Uses an `ITimingScope` pattern: `using (var scope = metrics.BeginOperation("read")) { ... }`.
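
The rolling buffer and timing scope can be approximated as below. This Python sketch is not the Host's `ITimingScope` implementation: the class `OperationMetrics` is hypothetical, it assumes one buffer per operation name (the requirement does not say whether the 1000 entries are shared), and it uses the nearest-rank percentile definition as an assumption.

```python
import math
import time
from collections import deque
from contextlib import contextmanager


class OperationMetrics:
    def __init__(self, capacity=1000):
        self._capacity = capacity
        self._samples = {}  # operation name -> deque of (duration_ms, success)

    def record(self, operation, duration_ms, success):
        buf = self._samples.setdefault(operation, deque(maxlen=self._capacity))
        buf.append((duration_ms, success))  # oldest entry drops off when full

    @contextmanager
    def begin_operation(self, operation):
        """Times the enclosed block; an exception marks the operation as failed."""
        start = time.perf_counter()
        success = True
        try:
            yield
        except Exception:
            success = False
            raise
        finally:
            elapsed_ms = (time.perf_counter() - start) * 1000.0
            self.record(operation, elapsed_ms, success)

    def percentile(self, operation, pct):
        """Nearest-rank percentile over the retained samples, or None if empty."""
        durations = sorted(d for d, _ in self._samples.get(operation, ()))
        if not durations:
            return None
        rank = max(1, math.ceil(len(durations) * pct / 100))
        return durations[rank - 1]
```

Usage follows the same shape as the C# `using` scope: `with metrics.begin_operation("read"): ...` records one sample when the block exits, whether it returns or raises.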
MXA-009: Error Code Translation
The Host shall translate known MXAccess error codes from MXSTATUS_PROXY.detail into human-readable messages for logging and OPC UA status propagation.
Acceptance Criteria
- Error 1008 → "User lacks security permission"
- Error 1012 → "Secured write required (one signature)"
- Error 1013 → "Verified write required (two signatures)"
- Unknown error codes logged with their numeric value.
- Translated messages flow back through the pipe and surface in OPC UA `StatusCode` descriptions and Server logs.
- Errors 1008 / 1012 / 1013 on write operations map to `Bad_UserAccessDenied` at the OPC UA surface.
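
The translation rules above amount to a lookup table plus a fallback. A minimal Python sketch, with hypothetical function names and an assumed wording for the unknown-code message:

```python
# Messages taken verbatim from the acceptance criteria.
MX_ERROR_MESSAGES = {
    1008: "User lacks security permission",
    1012: "Secured write required (one signature)",
    1013: "Verified write required (two signatures)",
}

# Detail codes that map to Bad_UserAccessDenied on write operations.
ACCESS_DENIED_ON_WRITE = frozenset({1008, 1012, 1013})


def translate_mx_error(detail_code):
    """Human-readable message; unknown codes keep their numeric value."""
    # Assumption: the exact wording for unknown codes is not specified.
    return MX_ERROR_MESSAGES.get(detail_code, f"Unknown MXAccess error {detail_code}")


def write_status_name(detail_code):
    """Symbolic OPC UA status for a failed write, or None where this requirement is silent."""
    return "Bad_UserAccessDenied" if detail_code in ACCESS_DENIED_ON_WRITE else None
```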
MXA-010: Proxy-Side Capability Wrapping
Driver.Galaxy.Proxy shall implement the capability interfaces as thin forwarders that serialize every call through the named pipe and route every call through CapabilityInvoker.
Acceptance Criteria
- `Driver.Galaxy.Proxy` implements `IDriver` + `IReadable` + `IWritable` + `ISubscribable` + `ITagDiscovery` + `IRediscoverable` + `IAlarmSource` + `IHistoryProvider` + `IHostConnectivityProbe`.
- Each implementation uses `CapabilityInvoker.InvokeAsync(DriverCapability.<...>, …)`; direct pipe calls bypassing the invoker are caught by Roslyn `OTOPCUA0001`.
- Each method serializes a MessagePack request frame, sends over the pipe, awaits the response frame, deserializes, returns.
- Pipe disconnect mid-call → `CapabilityInvoker`'s circuit breaker counts the failure; sustained disconnect opens the circuit and Galaxy nodes surface Bad quality until the pipe reconnects.
- Proxy tolerates Host service restarts: it automatically reconnects and replays subscription setup (parallel to MXA-005 but across the IPC boundary).
MXA-011: Pipe Security
The named pipe between Proxy and Host shall be restricted to the Server's runtime principal via SID-based ACL and authenticated with a per-process shared secret.
Acceptance Criteria
- Pipe name from the `OTOPCUA_GALAXY_PIPE` environment variable; default `OtOpcUaGalaxy`.
- Allowed SID passed as `OTOPCUA_ALLOWED_SID`; only the declared principal (typically the Server service account) can open the pipe. `Administrators` is explicitly NOT granted (per the `project_galaxy_host_installed` memory note).
- Shared secret passed via `OTOPCUA_GALAXY_SECRET` at spawn time; the Proxy must present the matching secret on the opening handshake.
- Secret is process-scoped (regenerated per Host restart) and never persisted to disk or Config DB.
- Pipe ACL denials are logged as Warning with the rejected principal SID.
Details
- Environment variables are passed by the supervisor launching the Host (`docs/v2/driver-stability.md`).
- Dev-box secret is stored at `.local/galaxy-host-secret.txt` for NSSM-wrapped development runs (memory note: `project_galaxy_host_installed`).
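
The secret lifecycle (generate per launch, pass through the child environment, compare on handshake) can be sketched as follows. This is illustrative Python, not the supervisor's code: the function names are hypothetical, the 32-byte secret length is an assumption, and the constant-time comparison is a common hardening choice that this requirement does not itself mandate. The pipe ACL is a Windows concern and is not modelled here.

```python
import hmac
import os
import secrets


def generate_secret():
    """Fresh random secret per Host launch; held in memory and the child environment only."""
    return secrets.token_hex(32)  # assumption: 32 random bytes, hex-encoded


def child_environment(pipe_name, allowed_sid, secret):
    """Environment for the spawned Host, using the variable names from the criteria."""
    env = dict(os.environ)
    env["OTOPCUA_GALAXY_PIPE"] = pipe_name
    env["OTOPCUA_ALLOWED_SID"] = allowed_sid
    env["OTOPCUA_GALAXY_SECRET"] = secret
    return env


def handshake_ok(presented, expected):
    """Constant-time comparison so timing does not reveal how much of the secret matched."""
    return hmac.compare_digest(presented.encode("utf-8"), expected.encode("utf-8"))
```

Because the secret is regenerated on every launch, a value captured from a previous Host process is useless after a restart.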