# MXAccess Client — Component Requirements

Parent: [HLR-003](HighLevelReqs.md#hlr-003-mxaccess-runtime-data-access), [HLR-008](HighLevelReqs.md#hlr-008-connection-resilience)

## MXA-001: STA Thread with Message Pump

All MXAccess COM objects shall be created and called on a dedicated STA thread running a Win32 message pump to ensure COM callbacks are delivered.

### Acceptance Criteria

- A dedicated thread is created with `ApartmentState.STA` before any MXAccess COM objects are instantiated.
- The thread runs a Win32 message pump using a `GetMessage`/`TranslateMessage`/`DispatchMessage` loop.
- Work items are marshalled to the STA thread via `PostThreadMessage(WM_APP)` and a concurrent queue.
- The STA thread processes work items between message pump iterations.
- All COM object creation (`LMXProxyServer` constructor), method calls, and event callbacks happen on this thread.

### Details

- Thread name: `MxAccess-STA` (for diagnostics).
- If the STA thread dies unexpectedly, log Fatal and trigger service shutdown. Do not attempt to create a replacement thread (COM objects on the dead thread are unrecoverable).
- The `RunAsync(Action)` method returns a `Task` that completes when the action executes on the STA thread. Callers can `await` it.

---

## MXA-002: Connection Lifecycle

The client shall support the Register/Unregister lifecycle with the LMXProxyServer COM object, tracking the connection handle.

### Acceptance Criteria

- `Register(clientName)` is called on the STA thread and returns a positive connection handle on success.
- If `Register` returns a handle <= 0, throw an exception with a descriptive error message.
- `Unregister(handle)` is called during disconnect after all subscriptions are removed.
- Client name: configurable via `MxAccess:ClientName`, default `LmxOpcUa`. Must be unique per MXAccess registration.
- Connection state transitions: Disconnected → Connecting → Connected → Disconnecting → Disconnected (and Error from any state).
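
The state transitions and the handle check for MXA-002 could be sketched roughly as follows. This is only an illustration: `ConnectionState`, `ConnectionStateMachine`, `TryTransition`, and `EnsureValidHandle` are hypothetical names, not part of MXAccess or of this specification, and the exit from `Error` back to `Connecting` is an assumption (the requirement only says Error can be entered from any state).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical names; the requirement only fixes the states and their order.
public enum ConnectionState { Disconnected, Connecting, Connected, Disconnecting, Error }

public sealed class ConnectionStateMachine
{
    // Forward transitions listed in MXA-002.
    private static readonly Dictionary<ConnectionState, ConnectionState[]> Allowed =
        new Dictionary<ConnectionState, ConnectionState[]>
        {
            { ConnectionState.Disconnected,  new[] { ConnectionState.Connecting } },
            { ConnectionState.Connecting,    new[] { ConnectionState.Connected } },
            { ConnectionState.Connected,     new[] { ConnectionState.Disconnecting } },
            { ConnectionState.Disconnecting, new[] { ConnectionState.Disconnected } },
            // Assumption: recovery from Error restarts at Connecting.
            { ConnectionState.Error,         new[] { ConnectionState.Connecting } },
        };

    private readonly object _gate = new object();

    public ConnectionState State { get; private set; } = ConnectionState.Disconnected;

    // Raised after each accepted transition with (previous, current).
    public event Action<ConnectionState, ConnectionState> StateChanged;

    // Returns false and leaves State unchanged if the transition is not allowed.
    public bool TryTransition(ConnectionState next)
    {
        ConnectionState previous;
        lock (_gate)
        {
            previous = State;
            // Error may be entered from any state.
            bool ok = next == ConnectionState.Error
                      || Array.IndexOf(Allowed[previous], next) >= 0;
            if (!ok) return false;
            State = next;
        }
        StateChanged?.Invoke(previous, next);
        return true;
    }

    // Applies the "handle <= 0 means failure" rule to a value returned by Register.
    public static int EnsureValidHandle(int handle, string clientName)
    {
        if (handle <= 0)
            throw new InvalidOperationException(
                $"MXAccess Register failed for client '{clientName}': returned handle {handle}.");
        return handle;
    }
}
```

A caller would typically move to `Connecting`, run `Register` on the STA thread, pass the result through `EnsureValidHandle`, and then move to `Connected` or `Error` depending on the outcome. Dashboard and health check consumers can subscribe to `StateChanged`.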
### Details

- `ConnectedSince` timestamp (UTC) is recorded after successful Register.
- `ReconnectCount` is tracked for diagnostics and dashboard display.
- State change events are raised for dashboard and health check consumption.

---

## MXA-003: Tag Subscription

The client shall support subscribing to tags via AddItem + AdviseSupervisory, receiving value updates through OnDataChange callbacks.

### Acceptance Criteria

- Subscribe sequence: `AddItem(handle, address)` returns item handle, then `AdviseSupervisory(handle, itemHandle)` starts the subscription.
- `OnDataChange` callback delivers value, quality (integer), timestamp, and MXSTATUS_PROXY array.
- Item address format: `tag_name.AttributeName` for scalars, `tag_name.AttributeName[]` for whole arrays.
- If AddItem fails (e.g., tag does not exist), log Warning and return failure to caller.
- Bidirectional maps of `address ↔ itemHandle` are maintained for callback resolution.

### Details

- Use `AdviseSupervisory` (not `Advise`) because this is a background service with no interactive user session. AdviseSupervisory allows secured/verified writes without user authentication.
- Stored subscriptions dictionary maps address to callback for reconnect replay.
- On reconnect, all entries in stored subscriptions are re-subscribed (AddItem + AdviseSupervisory with new handles).

---

## MXA-004: Tag Read/Write

The client shall support synchronous-style read and write operations, marshalled to the STA thread, with configurable timeouts.

### Acceptance Criteria

- Read: implemented as subscribe-get-first-value-unsubscribe pattern (AddItem → AdviseSupervisory → wait for OnDataChange → UnAdvise → RemoveItem).
- Write: AddItem → AdviseSupervisory → `Write()` → await `OnWriteComplete` callback → cleanup.
- Read timeout: configurable via `MxAccess:ReadTimeoutSeconds`, default 5 seconds.
- Write timeout: configurable via `MxAccess:WriteTimeoutSeconds`, default 5 seconds. On timeout, log Warning and return timeout error.
- Concurrent operation limit: configurable semaphore via `MxAccess:MaxConcurrentOperations`, default 10.
- All operations marshalled to the STA thread.

### Details

- Write uses security classification -1 (no security). Galaxy runtime handles security enforcement.
- `OnWriteComplete` callback: check MXSTATUS_PROXY `success` field. If 0, extract detail code and propagate error.
- COM exceptions (`COMException` with HRESULT) are caught and translated to meaningful error messages.

---

## MXA-005: Auto-Reconnect

The client shall monitor connection health and automatically reconnect on failure, replaying all stored subscriptions after reconnect.

### Acceptance Criteria

- Monitor loop runs on a background thread, checking connection health at configurable interval (`MxAccess:MonitorIntervalSeconds`, default 5 seconds).
- If disconnected, attempt reconnect. On success, replay all stored subscriptions.
- On reconnect failure, log Warning and retry at next interval (no exponential backoff; reconnect as quickly as possible on a plant-floor service).
- Reconnect count is incremented on each successful reconnect.
- Monitor loop is cancellable (for clean shutdown).

### Details

- Reconnect cleans up old COM objects before creating new ones.
- After reconnect, probe subscription is re-established first, then stored subscriptions.
- No max retry limit: keep trying indefinitely until service is stopped.

---

## MXA-006: Probe-Based Health Monitoring

The client shall optionally subscribe to a configurable probe tag and use OnDataChange callback staleness to detect silent connection failures.

### Acceptance Criteria

- Subscribe to a configurable probe tag (a known-good Galaxy attribute that changes periodically).
- Track `_lastProbeValueTime` (UTC) updated on each OnDataChange for the probe tag.
- If `DateTime.UtcNow - _lastProbeValueTime > staleThreshold`, force disconnect and reconnect.
- Probe tag address: configurable via `MxAccess:ProbeTag`.
  If not configured, probe monitoring is disabled.
- Stale threshold: configurable via `MxAccess:ProbeStaleThresholdSeconds`, default 60 seconds.

### Details

- The probe tag should be an attribute that the Galaxy runtime updates regularly (e.g., a platform heartbeat or area-level timestamp). The specific tag is site-dependent.
- After forced reconnect, reset `_lastProbeValueTime` to `DateTime.UtcNow` to give the new connection a full threshold window.

---

## MXA-007: COM Cleanup

On disconnect or disposal, the client shall unwire event handlers, unadvise/remove all items, unregister, and release COM objects via `Marshal.ReleaseComObject`.

### Acceptance Criteria

- Cleanup order: UnAdvise all active subscriptions → RemoveItem all items → unwire OnDataChange and OnWriteComplete event handlers → Unregister → `Marshal.ReleaseComObject`.
- On dispose: run disconnect if still connected, then dispose STA thread.
- Each cleanup step is wrapped in try/catch (cleanup must not throw).
- After cleanup: handle maps are cleared, pending write TCS entries are abandoned, COM reference is set to null.

### Details

- `_storedSubscriptions` is NOT cleared on disconnect (preserved for reconnect replay). Only cleared on Dispose.
- Event handlers must be unwired BEFORE Unregister, or callbacks may fire on a dead object.
- `Marshal.ReleaseComObject` in a finally block, always, even if earlier steps fail.

---

## MXA-008: Operation Metrics

The MXAccess client shall record timing and success/failure for Read, Write, and Subscribe operations.

### Acceptance Criteria

- Each operation records: duration (ms), success/failure.
- Metrics are available for the status dashboard: count, success rate, avg/min/max/P95 latency.
- Uses a rolling 1000-entry buffer for percentile calculation.
- Metrics are exposed via a queryable interface consumed by the status report service.

### Details

- Uses an `ITimingScope` pattern: `using (var scope = metrics.BeginOperation("read")) { ...
}` for automatic timing and success tracking.
- Metrics are periodically logged at Debug level for diagnostics.

---

## MXA-009: Error Code Translation

The client shall translate known MXAccess error codes from MXSTATUS_PROXY.detail into human-readable messages for logging and OPC UA status propagation.

### Acceptance Criteria

- Error 1008 → "User lacks security permission"
- Error 1012 → "Secured write required (one signature)"
- Error 1013 → "Verified write required (two signatures)"
- Unknown error codes are logged with their numeric value.
- Translated messages are included in OPC UA StatusCode descriptions and log entries.
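
The mapping above could be implemented with a simple lookup table, sketched below. The class name `MxErrorCodes`, the method `Translate`, and the exact wording of the fallback message for unknown codes are assumptions made for illustration; only the three code-to-message pairs come from this requirement.

```csharp
using System.Collections.Generic;

// Hypothetical helper; only the three mappings below are taken from MXA-009.
public static class MxErrorCodes
{
    private static readonly Dictionary<int, string> Known = new Dictionary<int, string>
    {
        { 1008, "User lacks security permission" },
        { 1012, "Secured write required (one signature)" },
        { 1013, "Verified write required (two signatures)" },
    };

    // Translates an MXSTATUS_PROXY detail code into a human-readable message.
    // Unknown codes keep their numeric value in the message (wording is an assumption).
    public static string Translate(int detail)
    {
        return Known.TryGetValue(detail, out var message)
            ? message
            : $"Unknown MXAccess error code {detail}";
    }
}
```

The returned string can then be used both in the log entry and in the OPC UA StatusCode description, so that the two always agree.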