# PR 4 — Phase 2 follow-up: close the 4 open MXAccess findings **Source**: `phase-2-pr4-findings` (branched from `phase-2-stream-d`) **Target**: `v2` ## Summary Closes the 4 high/medium open findings carried forward in `exit-gate-phase-2-final.md`: - **High 1 — `ReadAsync` subscription-leak on cancel.** One-shot read now wraps the subscribe→first-OnDataChange→unsubscribe pattern in a `try/finally` so the per-tag callback is always detached, and if the read installed the underlying MXAccess subscription itself (no other caller had it), it tears it down on the way out. - **High 2 — No reconnect loop on the MXAccess COM connection.** New `MxAccessClientOptions { AutoReconnect, MonitorInterval, StaleThreshold }` + a background `MonitorLoopAsync` that watches a stale-activity threshold + probes the proxy via a no-op COM call, then reconnects-with-replay (re-Register, re-AddItem every active subscription) when the proxy is dead. Liveness signal: every `OnDataChange` callback bumps `_lastObservedActivityUtc`. Defaults match v1 monitor cadence (5s poll, 60s stale). `ReconnectCount` exposed for diagnostics; `ConnectionStateChanged` event for downstream consumers (the supervisor on the Proxy side already surfaces this through its HeartbeatMonitor, but the Host-side event lets local logging/metrics hook in). - **Medium 3 — `MxAccessGalaxyBackend.SubscribeAsync` doesn't push OnDataChange frames back to the Proxy.** New `IGalaxyBackend.OnDataChange` / `OnAlarmEvent` / `OnHostStatusChanged` events that the new `GalaxyFrameHandler.AttachConnection` subscribes per-connection and forwards as outbound `OnDataChangeNotification` / `AlarmEvent` / `RuntimeStatusChange` frames through the connection's `FrameWriter`. `MxAccessGalaxyBackend` fans out per-tag value changes to every `SubscriptionId` that's listening to that tag (multiple Proxy subs may share a Galaxy attribute — single COM subscription, multi-fan-out on the wire). Stub + DbBacked backends declare the events with `#pragma warning disable CS0067` (treat-warnings-as-errors would otherwise fail on never-raised events that exist only to satisfy the interface). - **Medium 4 — `WriteValuesAsync` doesn't await `OnWriteComplete`.** New `WriteAsync(...)` overload returns `bool` after awaiting the OnWriteComplete callback via the v1-style `TaskCompletionSource`-keyed-by-item-handle pattern in `_pendingWrites`. `MxAccessGalaxyBackend.WriteValuesAsync` now reports per-tag `Bad_InternalError` when the runtime rejected the write, instead of false-positive `Good`. ## Pipe server change `IFrameHandler` gains `AttachConnection(FrameWriter writer): IDisposable` so the handler can register backend event sinks on each accepted connection and detach them at disconnect. The `PipeServer.RunOneConnectionAsync` calls it after the Hello handshake and disposes it in the finally of the per-connection scope. `StubFrameHandler` returns `IFrameHandler.NoopAttachment.Instance` (net48 doesn't support default interface methods, so the empty-attach lives as a public nested class). ## Tests **`dotnet test ZB.MOM.WW.OtOpcUa.slnx`**: **460 pass / 7 skip (E2E on admin shell) / 1 pre-existing baseline failure**. No regressions. The Driver.Galaxy.Host unit tests + 5 live ZB smoke + 3 live MXAccess COM smoke all pass unchanged. ## Test plan for reviewers - [ ] `dotnet build` clean - [ ] `dotnet test` shows 460/7-skip/1-baseline - [ ] Spot-check `MxAccessClient.MonitorLoopAsync` against v1's `MxAccessClient.Monitor` partial (`src/ZB.MOM.WW.OtOpcUa.Host/MxAccess/MxAccessClient.Monitor.cs`) — same polling cadence, same probe-then-reconnect-with-replay shape - [ ] Read `GalaxyFrameHandler.ConnectionSink.Dispose` and confirm event handlers are detached on connection close (no leaked invocation list refs) - [ ] `WriteValuesAsync` returning `Bad_InternalError` on a runtime-rejected write is the correct shape — confirm against the v1 `MxAccessClient.ReadWrite.cs` pattern ## What's NOT in this PR - Wonderware Historian SDK plugin port (Task B.1.h) — separate PR, larger scope. - Alarm subsystem wire-up (`MxAccessGalaxyBackend.SubscribeAlarmsAsync` is still a no-op). `OnAlarmEvent` is declared on the backend interface and pushed by the frame handler when raised; `MxAccessGalaxyBackend` just doesn't raise it yet (waits for the alarm-tracking port from v1's `AlarmObjectFilter` + Galaxy alarm primitives). - Host-status push (`OnHostStatusChanged`) — declared on the interface and pushed by the frame handler; `MxAccessGalaxyBackend` doesn't raise it (the Galaxy.Host's `HostConnectivityProbe` from v1 needs porting too, scoped under the Historian PR). ## Adversarial review Quick pass over the PR 4 deltas. No new findings beyond: - **Low 1** — `MonitorLoopAsync`'s `$Heartbeat` probe item-handle is leaked (`AddItem` succeeds, never `RemoveItem`'d). Cosmetic — the probe item is internal to the COM connection, dies with `Unregister` at disconnect/recycle. Worth a follow-up to call `RemoveItem` after the probe succeeds. - **Low 2** — Replay loop in `MonitorLoopAsync` swallows per-subscription failures. If Galaxy permanently rejects a previously-valid reference (rare but possible after a re-deploy), the user gets silent data loss for that one subscription. The stub-handler- unaware operator wouldn't notice. Worth surfacing as a `ConnectionStateChanged(false) → ConnectionStateChanged(true)` payload that includes the replay-failures list. Both are low-priority follow-ups, not PR 4 blockers.