5.5 KiB
PR 4 — Phase 2 follow-up: close the 4 open MXAccess findings
Source: phase-2-pr4-findings (branched from phase-2-stream-d)
Target: v2
Summary
Closes the 4 high/medium open findings carried forward in exit-gate-phase-2-final.md:
- High 1 —
ReadAsyncsubscription-leak on cancel. One-shot read now wraps the subscribe→first-OnDataChange→unsubscribe pattern in atry/finallyso the per-tag callback is always detached, and if the read installed the underlying MXAccess subscription itself (no other caller had it), it tears it down on the way out. - High 2 — No reconnect loop on the MXAccess COM connection. New
MxAccessClientOptions { AutoReconnect, MonitorInterval, StaleThreshold }+ a backgroundMonitorLoopAsyncthat watches a stale-activity threshold + probes the proxy via a no-op COM call, then reconnects-with-replay (re-Register, re-AddItem every active subscription) when the proxy is dead. Liveness signal: everyOnDataChangecallback bumps_lastObservedActivityUtc. Defaults match v1 monitor cadence (5s poll, 60s stale).ReconnectCountexposed for diagnostics;ConnectionStateChangedevent for downstream consumers (the supervisor on the Proxy side already surfaces this through its HeartbeatMonitor, but the Host-side event lets local logging/metrics hook in). - Medium 3 —
MxAccessGalaxyBackend.SubscribeAsyncdoesn't push OnDataChange frames back to the Proxy. NewIGalaxyBackend.OnDataChange/OnAlarmEvent/OnHostStatusChangedevents that the newGalaxyFrameHandler.AttachConnectionsubscribes per-connection and forwards as outboundOnDataChangeNotification/AlarmEvent/RuntimeStatusChangeframes through the connection'sFrameWriter.MxAccessGalaxyBackendfans out per-tag value changes to everySubscriptionIdthat's listening to that tag (multiple Proxy subs may share a Galaxy attribute — single COM subscription, multi-fan-out on the wire). Stub + DbBacked backends declare the events with#pragma warning disable CS0067(treat-warnings-as-errors would otherwise fail on never-raised events that exist only to satisfy the interface). - Medium 4 —
WriteValuesAsyncdoesn't awaitOnWriteComplete. NewWriteAsync(...)overload returnsboolafter awaiting the OnWriteComplete callback via the v1-styleTaskCompletionSource-keyed-by-item-handle pattern in_pendingWrites.MxAccessGalaxyBackend.WriteValuesAsyncnow reports per-tagBad_InternalErrorwhen the runtime rejected the write, instead of false-positiveGood.
Pipe server change
IFrameHandler gains AttachConnection(FrameWriter writer): IDisposable so the handler can
register backend event sinks on each accepted connection and detach them at disconnect. The
PipeServer.RunOneConnectionAsync calls it after the Hello handshake and disposes it in the
finally of the per-connection scope. StubFrameHandler returns IFrameHandler.NoopAttachment.Instance
(net48 doesn't support default interface methods, so the empty-attach lives as a public nested
class).
Tests
dotnet test ZB.MOM.WW.OtOpcUa.slnx: 460 pass / 7 skip (E2E on admin shell) / 1
pre-existing baseline failure. No regressions. The Driver.Galaxy.Host unit tests + 5 live
ZB smoke + 3 live MXAccess COM smoke all pass unchanged.
Test plan for reviewers
dotnet buildcleandotnet testshows 460/7-skip/1-baseline- Spot-check
MxAccessClient.MonitorLoopAsyncagainst v1'sMxAccessClient.Monitorpartial (src/ZB.MOM.WW.OtOpcUa.Host/MxAccess/MxAccessClient.Monitor.cs) — same polling cadence, same probe-then-reconnect-with-replay shape - Read
GalaxyFrameHandler.ConnectionSink.Disposeand confirm event handlers are detached on connection close (no leaked invocation list refs) WriteValuesAsyncreturningBad_InternalErroron a runtime-rejected write is the correct shape — confirm against the v1MxAccessClient.ReadWrite.cspattern
What's NOT in this PR
- Wonderware Historian SDK plugin port (Task B.1.h) — separate PR, larger scope.
- Alarm subsystem wire-up (
MxAccessGalaxyBackend.SubscribeAlarmsAsyncis still a no-op).OnAlarmEventis declared on the backend interface and pushed by the frame handler when raised;MxAccessGalaxyBackendjust doesn't raise it yet (waits for the alarm-tracking port from v1'sAlarmObjectFilter+ Galaxy alarm primitives). - Host-status push (
OnHostStatusChanged) — declared on the interface and pushed by the frame handler;MxAccessGalaxyBackenddoesn't raise it (the Galaxy.Host'sHostConnectivityProbefrom v1 needs porting too, scoped under the Historian PR).
Adversarial review
Quick pass over the PR 4 deltas. No new findings beyond:
- Low 1 —
MonitorLoopAsync's$Heartbeatprobe item-handle is leaked (AddItemsucceeds, neverRemoveItem'd). Cosmetic — the probe item is internal to the COM connection, dies withUnregisterat disconnect/recycle. Worth a follow-up to callRemoveItemafter the probe succeeds. - Low 2 — Replay loop in
MonitorLoopAsyncswallows per-subscription failures. If Galaxy permanently rejects a previously-valid reference (rare but possible after a re-deploy), the user gets silent data loss for that one subscription. The stub-handler- unaware operator wouldn't notice. Worth surfacing as aConnectionStateChanged(false) → ConnectionStateChanged(true)payload that includes the replay-failures list.
Both are low-priority follow-ups, not PR 4 blockers.