From 73b2b2f6d74ec347795ad925bad667f95c0aacfe Mon Sep 17 00:00:00 2001
From: Joseph Doherty
Date: Sun, 22 Mar 2026 05:02:15 -0400
Subject: [PATCH] docs(lmxproxy): add STA message pump gap analysis with implementation guide

Documents when the full STA+Application.Run() approach is needed (secured/verified writes), why our first attempt failed, the correct pattern using Form.BeginInvoke(), and tradeoffs vs fire-and-forget.

Co-Authored-By: Claude Opus 4.6 (1M context)
---
 lmxproxy/docs/sta_gap.md | 167 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 167 insertions(+)
 create mode 100644 lmxproxy/docs/sta_gap.md

diff --git a/lmxproxy/docs/sta_gap.md b/lmxproxy/docs/sta_gap.md
new file mode 100644
index 0000000..da03845
--- /dev/null
+++ b/lmxproxy/docs/sta_gap.md
@@ -0,0 +1,167 @@
+# STA Message Pump Gap: OnWriteComplete COM Callback
+
+**Status**: Documented gap. Fire-and-forget workaround in place (deviation #7). Full fix deferred until secured/verified writes are needed.
+
+## When This Matters
+
+The current fire-and-forget write approach works for **supervisory writes** where:
+- Security is handled at the LmxProxy API key level, not MxAccess attribute level
+- Writes succeed synchronously (no secured/verified write requirements)
+- Write confirmation is handled at the application level (read-back in `WriteBatchAndWait`)
+
+This gap becomes a **blocking issue** if any of these scenarios arise:
+- **Secured writes (MxAccess error 1012)**: Attribute requires ArchestrA user authentication. `OnWriteComplete` returns the error, and the caller must retry with `WriteSecured()`.
+- **Verified writes (MxAccess error 1013)**: Attribute requires two-user verification. Same retry pattern.
+- **Write failure detection**: MxAccess accepts the `Write()` call but can't complete it (e.g., downstream device failure). `OnWriteComplete` is the only notification of this; without it, the caller assumes success.
+
+## Root Cause
+
+The MxAccess documentation (Write() Method) states: *"Upon completion of the write, your program receives notification of the success/failure status through the OnWriteComplete() event"* and *"that item should not be taken off advise or removed from the internal tables until the OnWriteComplete() event is received."*
+
+`OnWriteComplete` **should** fire after every `Write()` call. It doesn't in our service because:
+- MxAccess is a COM component designed for Windows Forms apps with a UI message loop
+- COM event callbacks are delivered via the Windows message pump
+- Our Topshelf Windows service has no message pump: `Write()` is called from thread pool threads (`Task.Run`) with no message loop
+- `OnDataChange` works because MxAccess fires it proactively on its own internal threads; `OnWriteComplete` is a response callback that needs message-pump-based marshaling
+
+## Correct Solution: Dedicated STA Thread + `Application.Run()`
+
+Based on research (Stephen Toub, MSDN Magazine 2007; Microsoft Learn COM interop docs; community patterns), the correct approach is a dedicated STA thread running a Windows Forms message pump via `Application.Run()`.
+
+### Architecture
+
+```
+Service main thread (MTA)
+  │
+  ├── gRPC server threads (handle client RPCs)
+  │     │
+  │     └── Marshal COM calls via Form.BeginInvoke() ──┐
+  │                                                    │
+  └── Dedicated STA thread                             │
+        │                                              │
+        ├── Creates LMXProxyServerClass COM object     │
+        ├── Wires event handlers (OnDataChange,        │
+        │   OnWriteComplete, OperationComplete)        │
+        ├── Runs Application.Run() ← continuous        │
+        │   message pump                               │
+        │                                              │
+        └── Hidden Form receives BeginInvoke calls ◄───┘
+              │
+              ├── Executes COM operations (Read, Write,
+              │   AddItem, AdviseSupervisory, etc.)
+              │
+              └── COM callbacks delivered via message pump
+                  (OnWriteComplete, OnDataChange, etc.)
+```
+
+### Implementation Pattern
+
+```csharp
+// In MxAccessClient constructor or Start():
+var initDone = new ManualResetEventSlim(false);
+
+_staThread = new Thread(() =>
+{
+    // 1. Create hidden form for marshaling
+    _marshalForm = new Form();
+    var unusedHandle = _marshalForm.Handle; // reading Handle forces HWND creation without showing (CreateHandle() is protected)
+
+    // 2. Create COM objects ON THIS THREAD
+    _lmxProxy = new LMXProxyServerClass();
+    _lmxProxy.OnDataChange += OnDataChange;
+    _lmxProxy.OnWriteComplete += OnWriteComplete;
+
+    // 3. Signal that init is complete
+    initDone.Set();
+
+    // 4. Run message pump (blocks until ExitThread, pumps COM callbacks)
+    Application.Run();
+});
+_staThread.Name = "MxAccess-STA";
+_staThread.IsBackground = true;
+_staThread.SetApartmentState(ApartmentState.STA);
+_staThread.Start();
+
+initDone.Wait(); // wait for COM objects to be ready
+```
+
+### Dispatching Work to the STA Thread
+
+```csharp
+// All COM calls must go through the hidden form's invoke:
+public Task<Vtq> ReadAsync(string address, CancellationToken ct)
+{
+    var tcs = new TaskCompletionSource<Vtq>(); // Vtq: placeholder name for the value/timestamp/quality result type
+    _marshalForm.BeginInvoke((Action)(() =>
+    {
+        try
+        {
+            // COM call executes on STA thread
+            int handle = _lmxProxy.AddItem(_connectionHandle, address);
+            _lmxProxy.AdviseSupervisory(_connectionHandle, handle);
+            // ... etc
+            tcs.SetResult(vtq);
+        }
+        catch (Exception ex)
+        {
+            tcs.SetException(ex);
+        }
+    }));
+    return tcs.Task;
+}
+```
+
+### Shutdown
+
+```csharp
+// To stop the message pump:
+_marshalForm.BeginInvoke((Action)(() =>
+{
+    // Clean up COM objects on STA thread
+    // ... UnAdvise, RemoveItem, Unregister ...
+    Marshal.ReleaseComObject(_lmxProxy);
+    Application.ExitThread(); // stops Application.Run()
+}));
+_staThread.Join(TimeSpan.FromSeconds(10));
+```
+
+### Why Our First Attempt Failed
+
+Our original `StaDispatchThread` (Phase 2) used `BlockingCollection<T>.Take()` to wait for work items, with `Application.DoEvents()` between items.
+This failed because:
+
+| Our failed approach | Correct approach |
+|---|---|
+| `BlockingCollection<T>.Take()` blocks the STA thread, preventing the message pump from running | `Application.Run()` runs continuously, pumping messages at all times |
+| `Application.DoEvents()` only pumps messages already in the queue at that instant | Message pump runs an infinite loop, processing messages as they arrive |
+| Work dispatched by enqueueing to `BlockingCollection<T>` | Work dispatched via `Form.BeginInvoke()` which posts a Windows message to the STA thread's queue |
+
+The key difference: `BeginInvoke` posts a window message that the message pump processes alongside COM callbacks. `BlockingCollection<T>` bypasses the message pump entirely.
+
+## Drawbacks of the STA Approach
+
+### Performance
+- **All COM calls serialize onto one thread.** Under load (batch reads of 100+ tags), operations queue up single-file. Current `Task.Run` approach allows MxAccess's internal marshaling to handle some concurrency.
+- **Double context switch per operation.** Caller → STA thread (invoke) → wait → back to caller. Adds roughly 0.1 to 1 ms per call. Negligible for single reads, noticeable for large batch operations.
+
+### Safety
+- **Single point of failure.** If the STA thread dies, all MxAccess operations stop. Recovery requires tearing down and recreating the thread + all COM objects.
+- **Deadlock risk.** If STA thread code synchronously waits on something that needs the STA thread (circular dependency), the message pump freezes. All waits must be async/non-blocking.
+- **Reentrancy.** While pumping messages, inbound COM callbacks can reenter your code during another COM call. Event handlers must be reentrant-safe.
+
+### Complexity
+- Every COM call needs `_marshalForm.BeginInvoke()` wrapping.
+- COM object affinity to STA thread is hard to enforce at compile time.
+- Unit tests need STA thread support or must use fakes.
+
+## Decision
+
+Fire-and-forget is the correct choice for now.
+Revisit when secured/verified writes are needed.
+
+## References
+
+- [.NET Matters: Handling Messages in Console Apps - Stephen Toub, MSDN Magazine 2007](https://learn.microsoft.com/en-us/archive/msdn-magazine/2007/june/net-matters-handling-messages-in-console-apps)
+- [How to: Support COM Interop by Displaying Each Windows Form on Its Own Thread - Microsoft Learn](https://learn.microsoft.com/en-us/dotnet/desktop/winforms/advanced/how-to-support-com-interop-by-displaying-each-windows-form-on-its-own-thread)
+- [.NET Windows Service needs STAThread - hirenppatel](https://hirenppatel.wordpress.com/2012/11/24/net-windows-service-needs-to-use-stathread-instead-of-mtathread/)
+- [Application.Run() In a Windows Service - PC Review](https://www.pcreview.co.uk/threads/application-run-in-a-windows-service.3087159/)
+- [Build a message pump for a Windows service? - CodeProject](https://www.codeproject.com/Messages/1365966/Build-a-message-pump-for-a-Windows-service.aspx)
+- MxAccess Toolkit User's Guide: Write() Method, OnWriteComplete Callback sections