LmxProxy v2 — Fix Failing Write Integration Tests
Run this prompt with Claude Code from the lmxproxy/ directory.
Prompt
You are debugging and fixing 2 failing integration tests for the LmxProxy v2 gRPC proxy service. The tests are WriteAndReadBack and WriteBatchAndWait in the integration test project.
Context
Read these documents before starting:
- CLAUDE.md: project-level instructions and architecture
- docs/deviations.md: deviation #7 describes the failure (OnWriteComplete COM callback not firing)
- mxaccess_documentation.md: the official MxAccess Toolkit reference. Search for the "OnWriteComplete", "Write() method", and "AdviseSupervisory" sections.
Read these source files:
- src/ZB.MOM.WW.LmxProxy.Host/MxAccess/MxAccessClient.ReadWrite.cs: current v2 write implementation
- src/ZB.MOM.WW.LmxProxy.Host/MxAccess/MxAccessClient.EventHandlers.cs: OnWriteComplete callback handler
- src/ZB.MOM.WW.LmxProxy.Host/MxAccess/MxAccessClient.Connection.cs: COM event wiring (_lmxProxy.OnWriteComplete += OnWriteComplete)
- src-reference/ZB.MOM.WW.LmxProxy.Host/Implementation/MxAccessClient.ReadWrite.cs: v1 write implementation (for comparison)
- src-reference/ZB.MOM.WW.LmxProxy.Host/Implementation/MxAccessClient.EventHandlers.cs: v1 OnWriteComplete handler
- tests/ZB.MOM.WW.LmxProxy.Client.IntegrationTests/WriteTests.cs: failing WriteAndReadBack test
- tests/ZB.MOM.WW.LmxProxy.Client.IntegrationTests/WriteBatchAndWaitTests.cs: failing WriteBatchAndWait test
Problem Statement
The OnWriteComplete COM callback from MxAccess never fires within the write timeout (default 5s). The write operation follows this flow:
- AddItem(): registers the tag with MxAccess
- AdviseSupervisory(): establishes a supervisory connection for the tag
- Store a TaskCompletionSource<bool> in _pendingWrites[itemHandle]
- Write(): sends the write to MxAccess
- Wait for the OnWriteComplete callback to resolve the TCS. This never fires, causing a timeout.
The v1 code used the same pattern and presumably worked, so the issue is one of:
- (a) MxAccess completes the write synchronously and never fires OnWriteComplete for simple (non-secured, non-verified) writes. The documentation says: "The LMXProxyInterface triggers an event for OnWriteComplete when your program calls the Write() or WriteSecured() function." However, this may not be true in all configurations or for all attribute types.
- (b) The COM event subscription wiring (_lmxProxy.OnWriteComplete += OnWriteComplete) is syntactically correct, but the callback doesn't fire because the thread that called Write() (a thread pool thread via Task.Run) isn't pumping COM messages. Note: STA threading was abandoned (see deviation #2 in docs/deviations.md) because it caused other issues with OnDataChange callbacks.
- (c) There's a difference in how v1 and v2 initialize or interact with the COM object that affects event delivery.
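As a rough illustration of the pattern described in the problem statement, the sketch below registers a TaskCompletionSource<bool> per item handle and waits for either a completion callback or a timeout. The class name, the sendWrite delegate, and the method signatures are hypothetical stand-ins, not the real MxAccessClient or MxAccess API. Only the _pendingWrites idea comes from the description above.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Hypothetical stand-in for the write path; not the real MxAccessClient.
public sealed class PendingWriteSketch
{
    private readonly ConcurrentDictionary<int, TaskCompletionSource<bool>> _pendingWrites =
        new ConcurrentDictionary<int, TaskCompletionSource<bool>>();

    // Registers a pending write, sends it, then waits for the callback or the timeout.
    public async Task WriteAsync(int itemHandle, Action sendWrite, TimeSpan timeout)
    {
        var tcs = new TaskCompletionSource<bool>(TaskCreationOptions.RunContinuationsAsynchronously);
        _pendingWrites[itemHandle] = tcs;
        try
        {
            sendWrite(); // stands in for the Write() call

            var finished = await Task.WhenAny(tcs.Task, Task.Delay(timeout));
            if (finished != tcs.Task)
                throw new TimeoutException($"No write-complete callback for handle {itemHandle}.");

            if (!await tcs.Task)
                throw new InvalidOperationException($"Write reported failure for handle {itemHandle}.");
        }
        finally
        {
            _pendingWrites.TryRemove(itemHandle, out _);
        }
    }

    // Stands in for the OnWriteComplete handler: resolves the matching pending write.
    public void OnWriteComplete(int itemHandle, bool success)
    {
        if (_pendingWrites.TryGetValue(itemHandle, out var tcs))
            tcs.TrySetResult(success);
    }
}
```

If nothing ever calls OnWriteComplete for the handle, WriteAsync can only end in a TimeoutException, which matches the failure the two tests report.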
Investigation Steps
These steps require SSH access to windev where the v2 Host is deployed.
Step 1: Check if the write actually succeeds despite no callback.
The WriteAndReadBack test writes a value and then reads it back. The test fails because WriteAsync throws a TimeoutException (OnWriteComplete never fires). But the write itself may have succeeded at the MxAccess level.
SSH to windev (ssh windev) and:
- Check the v2 Host service logs in C:\publish-v2\logs\ for any write-related log entries
- Look for "Write failed", "WriteComplete", or "timeout" messages
Step 2: Add a fire-and-forget write mode.
If the write succeeds at the MxAccess level but OnWriteComplete never fires, the simplest fix is to bypass the callback wait. Modify MxAccessClient.ReadWrite.cs:
- After calling _lmxProxy.Write(), immediately resolve the TCS with success instead of waiting for the callback
- Keep the OnWriteComplete handler wired up. If it does fire, it can log the result for diagnostics, but it shouldn't block the write path
- Add a configuration option WriteConfirmationMode with values FireAndForget (default) and WaitForCallback, so the behavior can be switched if needed
The rationale: the MxAccess documentation's sample application (Chapter 6) uses OnWriteComplete to detect whether a secured or verified write is needed, then retries with WriteSecured(). For simple supervisory writes (which is what LmxProxy does), the Write() call itself is the confirmation — if it doesn't throw, the write was accepted.
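One possible shape for the WriteConfirmationMode option is sketched below. The enum name and its two values come from this step. The options class, its FromSetting helper, and the idea of parsing the value from a string setting are assumptions for illustration, and the project's real configuration types may look different.

```csharp
using System;

// Values named in Step 2; FireAndForget is listed first so it is also the zero default.
public enum WriteConfirmationMode
{
    FireAndForget,
    WaitForCallback
}

// Hypothetical options class; the real configuration type and key names may differ.
public sealed class WriteOptionsSketch
{
    public WriteConfirmationMode WriteConfirmationMode { get; set; } =
        WriteConfirmationMode.FireAndForget;

    // Parses a setting string, keeping the default for null, unknown, or out-of-range values.
    public static WriteOptionsSketch FromSetting(string value)
    {
        var options = new WriteOptionsSketch();
        if (Enum.TryParse<WriteConfirmationMode>(value, ignoreCase: true, out var mode)
            && Enum.IsDefined(typeof(WriteConfirmationMode), mode))
        {
            options.WriteConfirmationMode = mode;
        }
        return options;
    }
}
```

Falling back to FireAndForget on unrecognized input keeps a typo in the setting from silently re-enabling the callback wait that is currently timing out.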
Step 3: Implement the fix.
In MxAccessClient.ReadWrite.cs, modify SetupWriteOperationAsync:
// After _lmxProxy.Write():
// Immediately complete the write — OnWriteComplete may not fire for supervisory writes
tcs.TrySetResult(true);
In the default FireAndForget path, skip the _pendingWrites[itemHandle] = tcs tracking since nothing waits on it there; keep the tracking for WaitForCallback mode. Keep OnWriteComplete wired for logging/diagnostics.
Clean up WaitForWriteCompletionAsync — in fire-and-forget mode, the TCS is already completed so the await returns immediately. The cleanup (UnAdvise + RemoveItem) should still happen.
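A minimal sketch of how the two modes could share one completion path follows. The sendWrite and cleanup delegates are hypothetical stand-ins for the Write() call and the UnAdvise + RemoveItem cleanup, and the method is not the real SetupWriteOperationAsync or WaitForWriteCompletionAsync.

```csharp
using System;
using System.Threading.Tasks;

public static class WriteCompletionSketch
{
    // sendWrite and cleanup are hypothetical delegates standing in for the
    // Write() call and the UnAdvise + RemoveItem cleanup.
    public static async Task WriteAsync(
        TaskCompletionSource<bool> tcs,
        Action sendWrite,
        Action cleanup,
        bool waitForCallback,
        TimeSpan timeout)
    {
        try
        {
            sendWrite(); // an exception here propagates as the write failure

            if (!waitForCallback)
                tcs.TrySetResult(true); // fire-and-forget: an accepted write counts as done

            var finished = await Task.WhenAny(tcs.Task, Task.Delay(timeout));
            if (finished != tcs.Task)
                throw new TimeoutException("Write was not confirmed within the timeout.");

            await tcs.Task; // already completed; rethrows if the TCS was faulted
        }
        finally
        {
            cleanup(); // runs on success, timeout, and failure alike
        }
    }
}
```

Because the TCS is already completed in fire-and-forget mode, the wait returns immediately and the cleanup in the finally block still runs exactly once.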
Step 4: Consider an alternative approach — poll-based write confirmation.
If fire-and-forget is too loose (we want to confirm the write took effect), consider a poll-based approach similar to WriteBatchAndWaitAsync: after writing, read the tag back and compare. This is more reliable than depending on a COM callback. However, this adds latency and may not be needed — the WriteAndReadBack test already does a read-back verification.
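The read-back idea could look roughly like the sketch below. The readAsync delegate is a hypothetical stand-in for whatever read call the client exposes, and the polling interval and timeout are caller-chosen assumptions rather than values from the project.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

public static class ReadBackSketch
{
    // Polls readAsync until it returns the expected value or the timeout elapses.
    // readAsync is a hypothetical delegate that returns the tag's current value.
    public static async Task<bool> ConfirmWriteAsync<T>(
        Func<CancellationToken, Task<T>> readAsync,
        T expected,
        TimeSpan pollInterval,
        TimeSpan timeout,
        CancellationToken ct = default)
    {
        var elapsed = Stopwatch.StartNew();
        while (true)
        {
            ct.ThrowIfCancellationRequested();

            var current = await readAsync(ct);
            if (EqualityComparer<T>.Default.Equals(current, expected))
                return true;

            // Stop if another poll would not fit inside the timeout.
            if (elapsed.Elapsed + pollInterval > timeout)
                return false;

            await Task.Delay(pollInterval, ct);
        }
    }
}
```

The latency cost mentioned above shows up here as at least one extra read per write, plus one poll interval for every read that still returns the old value.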
Step 5: Build and deploy.
After making changes:
ssh windev "cd C:\source\lmxproxy && git pull && dotnet build src\ZB.MOM.WW.LmxProxy.Host -c Release -r win-x86 --no-self-contained -o C:\publish-v2"
Restart the v2 service:
ssh windev "net stop LmxProxyV2 && net start LmxProxyV2"
Step 6: Run the integration tests.
From the Mac:
cd tests/ZB.MOM.WW.LmxProxy.Client.IntegrationTests
dotnet test --filter "WriteAndReadBack|WriteBatchAndWait" -v n
Both tests should pass. If they do, run the full integration test suite:
dotnet test -v n
Step 7: Update deviations document.
If the fix works, update docs/deviations.md deviation #7 to reflect the resolution. Change the status from pending to resolved and document what was done.
Guardrails
- Do not reintroduce STA threading. The MTA/Task.Run approach works for OnDataChange callbacks and subscriptions. Do not change the threading model.
- Do not modify the integration tests unless they have a genuine bug (e.g., wrong assertion, wrong tag name). The tests define the expected behavior; fix the implementation to match.
- Do not modify the proto file or gRPC contracts. This is a Host-side implementation fix only.
- Keep the OnWriteComplete handler wired. Even if we don't wait on it, the callback provides diagnostic information (security errors, verified write requirements) that should be logged.
- Commit with message:
fix(lmxproxy): resolve write timeout — bypass OnWriteComplete callback for supervisory writes