docs(lmxproxy): add STA message pump gap analysis with implementation guide
Documents when the full STA+Application.Run() approach is needed (secured/verified writes), why our first attempt failed, the correct pattern using Form.BeginInvoke(), and tradeoffs vs fire-and-forget.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
167
lmxproxy/docs/sta_gap.md
Normal file
@@ -0,0 +1,167 @@
# STA Message Pump Gap: OnWriteComplete COM Callback

**Status**: Documented gap. Fire-and-forget workaround in place (deviation #7). Full fix deferred until secured/verified writes are needed.

## When This Matters

The current fire-and-forget write approach works for **supervisory writes** where:

- Security is handled at the LmxProxy API key level, not at the MxAccess attribute level
- Writes succeed synchronously (no secured/verified write requirements)
- Write confirmation is handled at the application level (read-back in `WriteBatchAndWait`)

This gap becomes a **blocking issue** if any of these scenarios arise:

- **Secured writes (MxAccess error 1012)**: The attribute requires ArchestrA user authentication. `OnWriteComplete` returns the error, and the caller must retry with `WriteSecured()`.
- **Verified writes (MxAccess error 1013)**: The attribute requires two-user verification. Same retry pattern.
- **Write failure detection**: MxAccess accepts the `Write()` call but can't complete it (e.g., downstream device failure). `OnWriteComplete` is the only notification of this; without it, the caller assumes success.
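
To make the scenarios above concrete, here is a minimal sketch of how a caller might decide its next step once `OnWriteComplete` is actually delivered. The names `WriteOutcome` and `WriteCompletion.Classify` are hypothetical, and treating code 0 as success is an assumption; only the error codes 1012 and 1013 come from the list above.

```csharp
using System;

// Hypothetical outcome categories; not part of the MxAccess API.
public enum WriteOutcome
{
    Succeeded,
    RetrySecured,   // error 1012: retry with WriteSecured()
    RetryVerified,  // error 1013: same retry pattern, with two-user verification
    Failed          // any other error, e.g. a downstream device failure
}

public static class WriteCompletion
{
    public const int SecuredWriteRequired = 1012;
    public const int VerifiedWriteRequired = 1013;

    // Maps the error code reported with OnWriteComplete to the caller's next step.
    // Assumption: a code of 0 means the write completed successfully.
    public static WriteOutcome Classify(int errorCode)
    {
        switch (errorCode)
        {
            case 0: return WriteOutcome.Succeeded;
            case SecuredWriteRequired: return WriteOutcome.RetrySecured;
            case VerifiedWriteRequired: return WriteOutcome.RetryVerified;
            default: return WriteOutcome.Failed;
        }
    }
}
```

With fire-and-forget, none of these branches can run, because the error code never reaches the caller.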

## Root Cause

The MxAccess documentation (Write() Method) states: *"Upon completion of the write, your program receives notification of the success/failure status through the OnWriteComplete() event"* and *"that item should not be taken off advise or removed from the internal tables until the OnWriteComplete() event is received."*

`OnWriteComplete` **should** fire after every `Write()` call. It doesn't in our service because:

- MxAccess is a COM component designed for Windows Forms apps with a UI message loop
- COM event callbacks are delivered via the Windows message pump
- Our Topshelf Windows service has no message pump: `Write()` is called from thread pool threads (`Task.Run`) with no message loop
- `OnDataChange` works because MxAccess fires it proactively on its own internal threads; `OnWriteComplete` is a response callback that needs message-pump-based marshaling
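
A small diagnostic can confirm which kind of thread `Task.Run` hands the `Write()` call to. This is a sketch only; `ApartmentProbe` is a hypothetical helper, not existing service code. On Windows it would be expected to report a thread-pool thread in the MTA, where no message loop is running.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical diagnostic helper; not part of the service.
public static class ApartmentProbe
{
    // Describes the thread that Task.Run selects for a work item.
    public static Task<string> DescribeTaskRunThreadAsync()
    {
        return Task.Run(() =>
        {
            Thread current = Thread.CurrentThread;
            return string.Format(
                "pool={0}, apartment={1}",
                current.IsThreadPoolThread,
                current.GetApartmentState());
        });
    }
}
```

Logging this string once at service startup would show whether COM calls are being issued from a thread that could ever pump messages.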

## Correct Solution: Dedicated STA Thread + `Application.Run()`

Based on research (Stephen Toub, MSDN Magazine 2007; Microsoft Learn COM interop docs; community patterns), the correct approach is a dedicated STA thread running a Windows Forms message pump via `Application.Run()`.

### Architecture

```
Service main thread (MTA)
  │
  ├── gRPC server threads (handle client RPCs)
  │     │
  │     └── Marshal COM calls via Form.BeginInvoke() ──┐
  │                                                    │
  └── Dedicated STA thread                             │
        │                                              │
        ├── Creates LMXProxyServerClass COM object     │
        ├── Wires event handlers (OnDataChange,        │
        │   OnWriteComplete, OperationComplete)        │
        ├── Runs Application.Run() ← continuous        │
        │   message pump                               │
        │                                              │
        └── Hidden Form receives BeginInvoke calls ◄───┘
              │
              ├── Executes COM operations (Read, Write,
              │   AddItem, AdviseSupervisory, etc.)
              │
              └── COM callbacks delivered via message pump
                  (OnWriteComplete, OnDataChange, etc.)
```

### Implementation Pattern

```csharp
// In MxAccessClient constructor or Start():
var initDone = new ManualResetEventSlim(false);
Exception initError = null;

_staThread = new Thread(() =>
{
    try
    {
        // 1. Create hidden form for marshaling
        _marshalForm = new Form();
        // Reading Handle forces HWND creation without showing the form.
        // (Control.CreateHandle() is protected, so it cannot be called from here.)
        IntPtr hwnd = _marshalForm.Handle;

        // 2. Create COM objects ON THIS THREAD
        _lmxProxy = new LMXProxyServerClass();
        _lmxProxy.OnDataChange += OnDataChange;
        _lmxProxy.OnWriteComplete += OnWriteComplete;
    }
    catch (Exception ex)
    {
        initError = ex; // rethrown on the calling thread below
    }
    finally
    {
        // 3. Signal that init has finished (successfully or not)
        initDone.Set();
    }
    if (initError != null) return;

    // 4. Run message pump (blocks until Application.ExitThread(), pumps COM callbacks)
    Application.Run();
});
_staThread.Name = "MxAccess-STA";
_staThread.IsBackground = true;
_staThread.SetApartmentState(ApartmentState.STA);
_staThread.Start();

initDone.Wait(); // wait for COM objects to be ready
if (initError != null)
    throw new InvalidOperationException("MxAccess STA thread failed to initialize.", initError);
```

### Dispatching Work to the STA Thread

```csharp
// All COM calls must go through the hidden form's invoke:
public Task<Vtq> ReadAsync(string address, CancellationToken ct)
{
    // Run continuations off the STA thread so awaiting callers never run on the message pump
    var tcs = new TaskCompletionSource<Vtq>(TaskCreationOptions.RunContinuationsAsynchronously);
    _marshalForm.BeginInvoke((Action)(() =>
    {
        // Skip the COM call if the caller gave up while the request was queued
        if (ct.IsCancellationRequested)
        {
            tcs.TrySetCanceled();
            return;
        }
        try
        {
            // COM call executes on STA thread
            int handle = _lmxProxy.AddItem(_connectionHandle, address);
            _lmxProxy.AdviseSupervisory(_connectionHandle, handle);
            // ... etc
            tcs.SetResult(vtq);
        }
        catch (Exception ex)
        {
            tcs.SetException(ex);
        }
    }));
    return tcs.Task;
}
```
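
The wrapping shown above repeats for every COM operation, so it could be factored into one helper. The following is a sketch, not the service's code: `StaInvoker` is a hypothetical name, and the "post to the STA thread" step is passed in as a delegate so the class does not depend on Windows Forms. In the service, that delegate would be assumed to be `action => _marshalForm.BeginInvoke(action)`.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper: one place for the BeginInvoke + TaskCompletionSource pattern.
public sealed class StaInvoker
{
    private readonly Action<Action> _post;

    // post queues a delegate for execution on the STA thread.
    public StaInvoker(Action<Action> post)
    {
        if (post == null) throw new ArgumentNullException("post");
        _post = post;
    }

    // Runs comCall on the STA thread and exposes its result or exception as a Task.
    public Task<T> InvokeAsync<T>(Func<T> comCall)
    {
        var tcs = new TaskCompletionSource<T>(TaskCreationOptions.RunContinuationsAsynchronously);
        _post(() =>
        {
            try { tcs.TrySetResult(comCall()); }
            catch (Exception ex) { tcs.TrySetException(ex); }
        });
        return tcs.Task;
    }
}
```

A read would then reduce to something like `invoker.InvokeAsync(() => /* COM calls */)`, and unit tests can substitute a delegate that runs the action inline.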

### Shutdown

```csharp
// To stop the message pump:
_marshalForm.BeginInvoke((Action)(() =>
{
    // Clean up COM objects on STA thread
    // ... UnAdvise, RemoveItem, Unregister ...
    Marshal.ReleaseComObject(_lmxProxy);
    Application.ExitThread(); // stops Application.Run()
}));
_staThread.Join(TimeSpan.FromSeconds(10));
```

### Why Our First Attempt Failed

Our original `StaDispatchThread` (Phase 2) used `BlockingCollection.Take()` to wait for work items, with `Application.DoEvents()` between items. This failed because:

| Our failed approach | Correct approach |
|---|---|
| `BlockingCollection.Take()` blocks the STA thread, preventing the message pump from running | `Application.Run()` runs continuously, pumping messages at all times |
| `Application.DoEvents()` only pumps messages already in the queue at that instant | Message pump runs an infinite loop, processing messages as they arrive |
| Work dispatched by enqueueing to `BlockingCollection` | Work dispatched via `Form.BeginInvoke()`, which posts a Windows message to the STA thread's queue |

The key difference: `BeginInvoke` posts a window message to the hidden form, so the message pump processes dispatched work alongside COM callbacks. `BlockingCollection` bypasses the message pump entirely.

## Drawbacks of the STA Approach

### Performance

- **All COM calls serialize onto one thread.** Under load (batch reads of 100+ tags), operations queue up single-file. The current `Task.Run` approach allows MxAccess's internal marshaling to handle some concurrency.
- **Double context switch per operation.** Caller → STA thread (invoke) → wait → back to caller. This adds roughly 0.1-1 ms per call: negligible for single reads, noticeable for large batch operations.

### Safety

- **Single point of failure.** If the STA thread dies, all MxAccess operations stop. Recovery requires tearing down and recreating the thread and all COM objects.
- **Deadlock risk.** If STA thread code synchronously waits on something that needs the STA thread (circular dependency), the message pump freezes. All waits must be async/non-blocking.
- **Reentrancy.** While pumping messages, inbound COM callbacks can reenter your code during another COM call. Event handlers must be reentrant-safe.
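
One way to keep event handlers reentrant-safe and non-blocking is to have them do nothing but record the callback and return, leaving the real processing to another thread. The sketch below assumes that shape; `CallbackInbox` and `WriteCompleted` are hypothetical names, and the two fields are illustrative rather than the actual `OnWriteComplete` signature.

```csharp
using System.Collections.Concurrent;

// Hypothetical record of one completed write; field names are illustrative only.
public struct WriteCompleted
{
    public int ItemHandle;
    public int ErrorCode;
}

public sealed class CallbackInbox
{
    private readonly ConcurrentQueue<WriteCompleted> _queue =
        new ConcurrentQueue<WriteCompleted>();

    // Called from the COM event handler on the STA thread:
    // no locks, no waits, and no further COM calls, so reentry is harmless.
    public void Record(int itemHandle, int errorCode)
    {
        _queue.Enqueue(new WriteCompleted { ItemHandle = itemHandle, ErrorCode = errorCode });
    }

    // Called from a worker thread to process completions outside the callback.
    public bool TryTake(out WriteCompleted completed)
    {
        return _queue.TryDequeue(out completed);
    }
}
```

Because `Record` never waits on anything, it also cannot contribute to the circular-wait deadlock described above.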

### Complexity

- Every COM call needs `_marshalForm.BeginInvoke()` wrapping.
- COM object affinity to the STA thread is hard to enforce at compile time.
- Unit tests need STA thread support or must use fakes.
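
Since thread affinity cannot be checked at compile time, a runtime guard is one possible substitute. This is a minimal sketch under that assumption; `StaAffinityGuard` is a hypothetical class that would be constructed on the STA thread and consulted before each COM call.

```csharp
using System;
using System.Threading;

// Hypothetical guard: remembers the owning thread and fails fast when used elsewhere.
public sealed class StaAffinityGuard
{
    private readonly int _ownerThreadId;

    // Construct on the STA thread, right after the COM objects are created.
    public StaAffinityGuard()
    {
        _ownerThreadId = Thread.CurrentThread.ManagedThreadId;
    }

    public bool IsOnOwnerThread
    {
        get { return Thread.CurrentThread.ManagedThreadId == _ownerThreadId; }
    }

    // Call at the top of every method that touches a COM object.
    public void VerifyAccess()
    {
        if (!IsOnOwnerThread)
            throw new InvalidOperationException(
                "MxAccess COM objects must only be used on the STA thread.");
    }
}
```

A misrouted call then fails immediately with a clear message instead of silently losing its callback.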

## Decision

Fire-and-forget is the correct choice for now. Revisit when secured/verified writes are needed.

## References

- [.NET Matters: Handling Messages in Console Apps (Stephen Toub, MSDN Magazine 2007)](https://learn.microsoft.com/en-us/archive/msdn-magazine/2007/june/net-matters-handling-messages-in-console-apps)
- [How to: Support COM Interop by Displaying Each Windows Form on Its Own Thread (Microsoft Learn)](https://learn.microsoft.com/en-us/dotnet/desktop/winforms/advanced/how-to-support-com-interop-by-displaying-each-windows-form-on-its-own-thread)
- [.NET Windows Service needs STAThread (hirenppatel)](https://hirenppatel.wordpress.com/2012/11/24/net-windows-service-needs-to-use-stathread-instead-of-mtathread/)
- [Application.Run() In a Windows Service (PC Review)](https://www.pcreview.co.uk/threads/application-run-in-a-windows-service.3087159/)
- [Build a message pump for a Windows service? (CodeProject)](https://www.codeproject.com/Messages/1365966/Build-a-message-pump-for-a-Windows-service.aspx)
- MxAccess Toolkit User's Guide: Write() Method, OnWriteComplete Callback sections