Files
lmxopcua/codereviews/review.md
Joseph Doherty ee7e190fab Add multi-client subscription sync and concurrency integration tests
9 tests verifying server handles multiple simultaneous OPC UA clients:

Subscription sync:
- 3 clients subscribe to same tag, all receive data changes
- Client disconnect doesn't affect other clients' subscriptions
- Client unsubscribe doesn't affect other clients' subscriptions
- Clients subscribing to different tags receive only their own data

Concurrency:
- 5 clients browse simultaneously, all get identical results
- 5 clients browse different nodes concurrently, all succeed
- 4 clients browse+subscribe simultaneously, no interference
- 3 clients subscribe+browse concurrently, no deadlock (timeout-guarded)
- Rapid connect/disconnect cycles (10x), server stays stable

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-25 06:54:47 -04:00

3.4 KiB

Full Review: src

Overall verdict: patch is incorrect

I reviewed the current implementation under src/ as a code review, not just the Git diff. The main issues are correctness bugs in the OPC UA subscription path and MXAccess recovery/write handling.

Findings

[P1] OPC UA monitored items never trigger MXAccess subscriptions

File: src/ZB.MOM.WW.LmxOpcUa.Host/OpcUa/LmxNodeManager.cs:327-345

LmxNodeManager has a SubscribeTag helper, but nothing in the node manager calls it when OPC UA clients create monitored items. There is also no monitored-item lifecycle override in this class. The result is that browsing or subscribing from an OPC UA client does not call _mxAccessClient.SubscribeAsync(...), so live MXAccess data changes are never started for those tags. Clients can still do synchronous reads, but any client expecting pushed value updates will see stale BadWaitingForInitialData or last-cached values instead of runtime updates.

[P1] Write timeouts are reported as successful writes

File: src/ZB.MOM.WW.LmxOpcUa.Host/MxAccess/MxAccessClient.ReadWrite.cs:101-105

On write timeout, the cancellation callback completes the pending task with true. That means a missing OnWriteComplete callback is treated as a successful write, and LmxNodeManager.Write(...) will return ServiceResult.Good upstream. In any timeout scenario where MXAccess never acknowledged the write, OPC UA clients will be told the write succeeded even though the runtime value may never have changed.

[P1] Auto-reconnect stops permanently after one failed reconnect attempt

Files:

  • src/ZB.MOM.WW.LmxOpcUa.Host/MxAccess/MxAccessClient.Connection.cs:101-110
  • src/ZB.MOM.WW.LmxOpcUa.Host/MxAccess/MxAccessClient.Monitor.cs:38-45

The monitor only retries when _state == ConnectionState.Disconnected, but both ConnectAsync and ReconnectAsync move the client to ConnectionState.Error on a failed attempt. After the first reconnect failure, the monitor loop no longer matches its reconnect condition, so temporary outages become permanent until the whole service is restarted.

[P2] Address-space construction depends on parent-before-child ordering

File: src/ZB.MOM.WW.LmxOpcUa.Host/OpcUa/LmxNodeManager.cs:98-104

BuildAddressSpace attaches each object to rootFolder whenever its parent is not already in nodeMap. That only works if the input hierarchy is topologically ordered with every parent appearing before every descendant. The method does not enforce that ordering itself, so any unsorted hierarchy list will silently build the wrong OPC UA tree by promoting children to the root level.

[P3] Startup always performs an unnecessary second rebuild

Files:

  • src/ZB.MOM.WW.LmxOpcUa.Host/OpcUaService.cs:142-182
  • src/ZB.MOM.WW.LmxOpcUa.Host/GalaxyRepository/ChangeDetectionService.cs:46-60

OpcUaService.Start() already reads the Galaxy hierarchy/attributes and builds the initial address space before change detection starts. ChangeDetectionService then unconditionally fires OnGalaxyChanged on its first poll, even when the deploy timestamp has not changed, causing an immediate second full DB fetch and address-space rebuild on every startup. That doubles startup load and rebuild latency for no functional gain.

Notes

  • This was a static review of the src tree.
  • I attempted dotnet test ZB.MOM.WW.LmxOpcUa.slnx, but the run timed out after about 124 seconds, so I did not rely on a full green test pass for validation.