# MXAccess Worker Instance Detailed Design

## Purpose

An MXAccess worker instance is the compatibility boundary around one installed MXAccess COM object. It runs as a disposable .NET Framework 4.8 x86 process, owns one dedicated STA thread, pumps Windows/COM messages, executes MXAccess commands on that STA, and forwards MXAccess events back to the gateway.

The worker's job is not to make MXAccess nicer. Its job is to preserve direct MXAccess behavior while making that behavior available to modern clients through the gateway.

## Runtime

- Target runtime: .NET Framework 4.8.
- Language: C#.
- Platform target: x86 by default.
- Process lifetime: one worker per gateway session.
- Public network listeners: none.
- Gateway IPC: one named pipe with protobuf-framed messages.
- COM apartment: one dedicated STA thread.

Style guides:

- [C# Style Guide](./style-guides/CSharpStyleGuide.md)
- [Protobuf Style Guide](./style-guides/ProtobufStyleGuide.md)

## Build And Test

Build the SDK-style worker project with the .NET SDK MSBuild entry point. The project targets .NET Framework 4.8, but the SDK resolver comes from the .NET SDK installation:

```powershell
dotnet msbuild src\MxGateway.Worker\MxGateway.Worker.csproj /restore /p:Configuration=Debug /p:Platform=x86
```

`docs/toolchain-links.md` records the Visual Studio MSBuild executable for classic .NET Framework and COM interop builds:

```powershell
& "C:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\MSBuild\Current\Bin\MSBuild.exe" src\MxGateway.Worker\MxGateway.Worker.csproj /p:Configuration=Debug /p:Platform=x86
```

Run the worker tests with the same platform target:

```powershell
dotnet test src\MxGateway.Worker.Tests\MxGateway.Worker.Tests.csproj -p:Platform=x86
```

The only MXAccess interop reference belongs in `MxGateway.Worker`. Gateway and test projects may reference the worker project for metadata and scaffold tests, but they must not reference `ArchestrA.MXAccess.dll` directly.

## Responsibilities

The worker owns:

- connection to the gateway pipe,
- protocol hello and readiness reporting,
- STA thread creation and teardown,
- COM initialization on the STA,
- MXAccess COM object creation,
- MXAccess event sink wiring,
- command dispatch on the STA,
- MXAccess handle and advise state tracking,
- value/status/HRESULT capture,
- conversion to worker protobuf DTOs,
- event sequencing,
- heartbeat reporting,
- graceful shutdown.

The worker does not own:

- public gRPC API,
- client authentication,
- cross-session routing,
- worker process supervision,
- remote TLS,
- policy decisions for other sessions.

## Process Bootstrap

Expected command-line arguments:

```text
--session-id
--pipe-name
--protocol-version
```

Expected protected environment values:

```text
MXGATEWAY_WORKER_NONCE=
MXGATEWAY_WORKER_LOG_CONTEXT=
```

Startup sequence:

1. Parse command-line arguments.
2. Configure minimal logging.
3. Validate required values are present.
4. Connect to the gateway named pipe.
5. Exchange `WorkerHello` and `GatewayHello`.
6. Validate protocol version, session id, and nonce.
7. Start the STA runtime.
8. Create the MXAccess COM object on the STA.
9. Attach MXAccess event handlers on the STA.
10. Send `WorkerReady`.
11. Start pipe read, pipe write, heartbeat, and shutdown coordination loops.

If validation fails before MXAccess creation, exit quickly with a non-zero exit code. If MXAccess creation fails, send `WorkerFault` when possible and exit.

The bootstrap layer returns structured exit codes before it creates pipes, starts the STA, or touches MXAccess:

| Exit code | Name | Meaning |
|-----------|------|---------|
| `0` | `Success` | Required bootstrap options are valid. |
| `1` | `UnexpectedFailure` | A non-bootstrap exception reaches the process boundary. |
| `2` | `InvalidArguments` | Required arguments are missing or unknown arguments are present. |
| `3` | `InvalidProtocolVersion` | `--protocol-version` is not numeric or does not match the supported worker protocol. |
| `4` | `MissingNonce` | `MXGATEWAY_WORKER_NONCE` is absent or empty. |

Bootstrap logs use `WorkerConsoleLogger` key/value output. `WorkerLogRedactor` redacts fields whose names indicate nonce, secret, password, token, credential, or API key values before the message is written.

## Internal Components

```text
MxGateway.Worker
  Program
  Bootstrap
    WorkerOptions
    WorkerHost
  Ipc
    PipeClient
    FrameReader
    FrameWriter
    WorkerProtocol
  Sta
    StaRuntime
    StaCommandQueue
    MessagePump
    StaWatchdog
  MxAccess
    MxAccessSession
    MxAccessCommandDispatcher
    MxAccessEventSink
    MxAccessHandleRegistry
  Conversion
    VariantConverter
    SafeArrayConverter
    StatusProxyConverter
    HResultMapper
```

## Threading Model

```text
main thread
  -> parse args
  -> configure host
  -> coordinate shutdown

pipe reader thread/task
  -> read WorkerEnvelope frames
  -> validate protocol
  -> enqueue commands or control messages

pipe writer thread/task
  -> serialize WorkerEnvelope frames
  -> write replies, events, heartbeats, faults

STA thread
  -> CoInitializeEx(APARTMENTTHREADED)
  -> create MXAccess COM object
  -> attach event handlers
  -> pump Windows/COM messages
  -> execute queued commands
  -> detach events and release COM on shutdown

watchdog/heartbeat task
  -> observe STA responsiveness
  -> send heartbeat or fault
```

No MXAccess method may execute outside the STA thread. Do not use `Task.Run` around COM calls. Do not let event handlers perform pipe writes.

## STA Runtime

The STA runtime is the most important part of the worker.

Startup:

1. Create a dedicated `Thread`.
2. Set apartment state to `ApartmentState.STA`.
3. Start the thread.
4. Inside the thread, initialize COM.
5. Create the MXAccess COM object.
6. Attach event handlers.
7. Signal ready to the worker host.
8. Enter the message pump.

Shutdown:

1. Mark the command queue as completing.
2. Drain or reject pending commands according to shutdown mode.
3. Optionally issue MXAccess cleanup calls for active handles.
4. Detach event handlers.
5. Release COM references.
6. Uninitialize COM.
7. Exit the thread.

## Message Pump

The STA must pump Windows messages while also processing queued commands. A blocking queue that prevents message pumping is not acceptable.

Required loop shape:

```text
while not shutdown:
    while command queue has work:
        execute one command on STA

    MsgWaitForMultipleObjectsEx(
        command_event,
        timeout,
        QS_ALLINPUT,
        MWMO_INPUTAVAILABLE)

    while PeekMessage:
        TranslateMessage
        DispatchMessage
```

The command queue should signal a Win32 event or equivalent wait handle so the STA can wake without busy-waiting.

The loop should update a heartbeat timestamp after:

- successfully pumping messages,
- starting a command,
- finishing a command,
- processing an MXAccess event.

`StaRuntime` implements this runtime boundary in the worker. It starts one background thread named `MxGateway.Worker.STA`, sets it to `ApartmentState.STA`, initializes COM through `StaComApartmentInitializer`, and runs `StaMessagePump`. Commands are scheduled through `InvokeAsync`; the command queue signals an `AutoResetEvent` so `MsgWaitForMultipleObjectsEx` can wake the STA without busy-waiting. `LastActivityUtc` records pump, command, startup, and shutdown activity so the future heartbeat/watchdog can report whether the STA is still responsive. Shutdown marks the runtime as closing, wakes the pump, rejects new commands, cancels queued work, uninitializes COM on the STA, and waits for the thread to exit.

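
The required loop shape can be sketched in C# with P/Invoke declarations for the `user32.dll` functions named above. This is a minimal illustration, not the worker's actual `StaMessagePump`: the class name `StaPumpSketch`, the 250 ms wait timeout, and the `Action`-based queue are assumptions made for the example, and a real dispatcher would capture command exceptions into replies instead of letting them escape the loop.

```csharp
using System;
using System.Collections.Concurrent;
using System.Runtime.InteropServices;
using System.Threading;

// Hypothetical sketch of an STA pump that interleaves queued commands
// with Windows message dispatch. Windows-only.
public sealed class StaPumpSketch
{
    private const uint QS_ALLINPUT = 0x04FF;
    private const uint MWMO_INPUTAVAILABLE = 0x0004;
    private const uint PM_REMOVE = 0x0001;

    [StructLayout(LayoutKind.Sequential)]
    private struct MSG
    {
        public IntPtr hwnd;
        public uint message;
        public IntPtr wParam;
        public IntPtr lParam;
        public uint time;
        public int ptX;
        public int ptY;
    }

    [DllImport("user32.dll", SetLastError = true)]
    private static extern uint MsgWaitForMultipleObjectsEx(
        uint nCount, IntPtr[] pHandles, uint dwMilliseconds, uint dwWakeMask, uint dwFlags);

    [DllImport("user32.dll", CharSet = CharSet.Unicode)]
    [return: MarshalAs(UnmanagedType.Bool)]
    private static extern bool PeekMessage(
        out MSG msg, IntPtr hWnd, uint filterMin, uint filterMax, uint removeMsg);

    [DllImport("user32.dll")]
    [return: MarshalAs(UnmanagedType.Bool)]
    private static extern bool TranslateMessage(ref MSG msg);

    [DllImport("user32.dll", CharSet = CharSet.Unicode)]
    private static extern IntPtr DispatchMessage(ref MSG msg);

    private readonly ConcurrentQueue<Action> _commands = new ConcurrentQueue<Action>();
    private readonly AutoResetEvent _wake = new AutoResetEvent(false);
    private volatile bool _shutdown;
    private long _lastActivityTicks = DateTime.UtcNow.Ticks;

    // Read from any thread; Interlocked avoids torn 64-bit reads on x86.
    public DateTime LastActivityUtc =>
        new DateTime(Interlocked.Read(ref _lastActivityTicks), DateTimeKind.Utc);

    public void Enqueue(Action command)
    {
        _commands.Enqueue(command);
        _wake.Set(); // wakes MsgWaitForMultipleObjectsEx without busy-waiting
    }

    public void RequestShutdown()
    {
        _shutdown = true;
        _wake.Set();
    }

    private void Touch() => Interlocked.Exchange(ref _lastActivityTicks, DateTime.UtcNow.Ticks);

    // Must be called on the dedicated STA thread.
    public void Run()
    {
        var handles = new[] { _wake.SafeWaitHandle.DangerousGetHandle() };
        while (!_shutdown)
        {
            // Execute queued commands one at a time, in order.
            while (!_shutdown && _commands.TryDequeue(out var command))
            {
                Touch();      // command started
                command();
                Touch();      // command finished
            }

            // Sleep until the wake event fires, a message arrives, or the timeout elapses.
            MsgWaitForMultipleObjectsEx(1, handles, 250, QS_ALLINPUT, MWMO_INPUTAVAILABLE);

            // Drain pending Windows/COM messages so COM callbacks can be delivered.
            while (PeekMessage(out var msg, IntPtr.Zero, 0, 0, PM_REMOVE))
            {
                TranslateMessage(ref msg);
                DispatchMessage(ref msg);
            }
            Touch();          // pump pass completed
        }
    }
}
```

To try the sketch, create a `Thread` over `Run`, call `SetApartmentState(ApartmentState.STA)` before `Start`, and schedule work with `Enqueue`. Because the wake handle is an auto-reset event, a command enqueued between the drain loop and the wait still wakes the pump immediately.
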
## COM Creation

The MXAccess analysis source at `C:\Users\dohertj2\Desktop\mxaccess` identifies the installed COM target:

- interop assembly: `C:\Program Files (x86)\ArchestrA\Framework\Bin\ArchestrA.MXAccess.dll`
- assembly identity: `ArchestrA.MxAccess, Version=3.2.0.0, PublicKeyToken=23106a86e706d0ae`
- COM class: `ArchestrA.MxAccess.LMXProxyServerClass`
- CLSID: `{C30B52F5-2CB5-4760-AF0A-3A344A7EB5DC}`
- ProgID: `LMXProxy.LMXProxyServer.1`
- version-independent ProgID: `LMXProxy.LMXProxyServer`
- registered server: `C:\Program Files (x86)\ArchestrA\Framework\Bin\LmxProxy.dll`
- registry view: `HKCR\Wow6432Node\CLSID\{C30B52F5-2CB5-4760-AF0A-3A344A7EB5DC}`
- threading model: `Apartment`

The worker should reference the interop assembly and instantiate `LMXProxyServerClass` on the dedicated STA thread. Keep the ProgID and assembly path configurable for diagnostics, but this COM class is the v1 default.

`MxAccessStaSession` owns the initial COM creation path. It starts `StaRuntime`, creates `LMXProxyServerClass` through `MxAccessComObjectFactory` on the STA, attaches `MxAccessBaseEventSink`, and returns `WorkerReady` only after those steps succeed. `MxAccessSession` keeps the raw COM object private, records the STA managed thread id that created it, detaches the base event sink during disposal, and releases the COM reference on the STA. After creation, `MxAccessStaSession` owns a `StaCommandDispatcher` backed by `MxAccessCommandExecutor`; `DispatchAsync` queues contract commands back to the same STA instead of exposing the COM object to callers.

Creation rules:

- Create COM object only on the STA.
- Attach event handlers only on the STA.
- Keep the COM reference private to the STA runtime.
- Never marshal the raw COM object to pipe reader/writer threads.
- Capture COM creation HRESULT or exception details.

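
The last creation rule can be illustrated with a small C# sketch that wraps the creation call and records the HRESULT, exception type, attempted target, and creating thread id. The types `ComCreationResult` and `ComCreationSketch` are hypothetical, and the `factory` delegate stands in for `new LMXProxyServerClass()` so the example compiles without the interop assembly; the worker's actual `MxAccessComObjectFactory` may look different.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical result shape; the real worker maps failures to WorkerFault.
public sealed class ComCreationResult
{
    public object Instance { get; set; }
    public bool Succeeded { get; set; }
    public int? HResult { get; set; }
    public string ExceptionType { get; set; }
    public string Target { get; set; }
    public int CreatingThreadId { get; set; }
}

public static class ComCreationSketch
{
    // Call only from the dedicated STA thread. The caller keeps the returned
    // instance private to the STA and never hands it to pipe threads.
    public static ComCreationResult TryCreate(Func<object> factory, string target)
    {
        var result = new ComCreationResult
        {
            Target = target,
            CreatingThreadId = Environment.CurrentManagedThreadId
        };

        try
        {
            result.Instance = factory();
            result.Succeeded = result.Instance != null;
        }
        catch (Exception ex)
        {
            // COMException carries the failing HRESULT in HResult (also exposed as ErrorCode).
            result.HResult = ex.HResult;
            result.ExceptionType = ex.GetType().FullName;
        }

        return result;
    }
}
```

A failed result carries the same details the structured fault needs, so the caller can report the fault without ever sending `WorkerReady`.
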
If COM creation fails, the worker should send a structured fault with:

- fault category,
- exception type,
- HRESULT when available,
- COM class or ProgID attempted,
- worker process id,
- session id.

`WorkerPipeSession` maps startup exceptions from this path to `WorkerFaultCategory.MxaccessCreationFailed`, includes the captured HRESULT when the exception exposes one, and does not send `WorkerReady` after a failed COM creation attempt.

## Event Sink

The worker must subscribe to every public MXAccess event family:

- `OnDataChange`
- `OnWriteComplete`
- `OperationComplete`
- `OnBufferedDataChange`

Forward these event families only when the native MXAccess COM object raises them. Do not synthesize `OperationComplete` from write completion or command status. `OnBufferedDataChange` must be represented in the protocol now, but multi-sample payload conversion should remain capture-validated; preserve raw metadata whenever conversion is incomplete.

Event handling rules:

- Event handlers are expected to run on the STA.
- Assign a monotonic worker event sequence.
- Convert event args to `WorkerEvent`.
- Include value, quality, timestamp, handles, status arrays, and raw status details when available.
- Preserve raw event payload metadata for unsupported buffered or completion-only shapes.
- Enqueue to the outbound event queue.
- Return quickly to preserve message pumping.

If event conversion throws, catch it inside the event handler, enqueue a structured `WorkerFault` or diagnostic event, and keep the worker alive only if the fault policy allows it.

## Command Queue

The pipe reader converts `WorkerCommand` messages into `StaCommand` entries.

Each entry should include:

- correlation id,
- method name,
- method-specific request payload,
- enqueue timestamp,
- cancellation marker,
- reply completion path.

The STA command dispatcher:

1. Dequeues one command.
2. Checks whether shutdown has started.
3. Calls the matching MXAccess method.
4. Captures return values, out parameters, status arrays, and HRESULT.
5. Converts results to `WorkerCommandReply`.
6. Enqueues the reply to the pipe writer.

The STA should execute one command at a time. MXAccess command ordering must be preserved for one worker.

## Command Dispatch Surface

Phase 1 commands:

- `Register`
- `Unregister`
- `AddItem`
- `RemoveItem`

Phase 2 event commands:

- `Advise`
- `UnAdvise`
- `AdviseSupervisory`

Full surface:

- `AddItem2`
- `AddBufferedItem`
- `SetBufferedUpdateInterval`
- `Suspend`
- `Activate`
- `Write`
- `Write2`
- `WriteSecured`
- `WriteSecured2`
- `AuthenticateUser`
- `ArchestrAUserToId`

Diagnostics:

- `Ping`
- `GetSessionState`
- `GetWorkerInfo`
- `DrainEvents`
- `ShutdownWorker`

Implement method-specific dispatch instead of a generic string method invoker. Parity tests need stable command-specific request and reply shapes.

`MxAccessCommandExecutor` implements the first command pair:

- `Register` calls `LMXProxyServerClass.Register` with the requested client name and preserves the returned server handle in both `ReturnValue` and `RegisterReply.ServerHandle`.
- `Unregister` calls `LMXProxyServerClass.Unregister` with the requested server handle. The reply has no method-specific payload because the public MXAccess method returns `void`.

Both commands set `Hresult` to `0` only after the COM call returns normally. COM exceptions flow through `StaCommandDispatcher`, which captures the thrown HRESULT and converts the reply to `ProtocolStatusCode.MxaccessFailure`. `MxAccessStaSession.GetRegisteredServerHandlesAsync` returns an STA-read snapshot of tracked server handles for diagnostics and future cleanup logic.

`MxAccessCommandExecutor` also implements the item lifecycle commands:

- `AddItem` calls `LMXProxyServerClass.AddItem` with the requested server handle and item definition. It preserves the returned item handle in both `ReturnValue` and `AddItemReply.ItemHandle`.
- `AddItem2` calls `LMXProxyServerClass.AddItem2` with the requested server handle, item definition, and context string. The context string is passed to MXAccess exactly as received.
- `RemoveItem` calls `LMXProxyServerClass.RemoveItem` with the requested server handle and item handle. The reply has no method-specific payload because the public MXAccess method returns `void`.

The worker records item handles only after `AddItem` or `AddItem2` returns normally, and removes item handles only after `RemoveItem` returns normally. The registry does not prevalidate server or item handles, so invalid and cross-server handle behavior remains owned by MXAccess. COM exceptions continue through `StaCommandDispatcher`, which preserves the HRESULT and leaves diagnostic registry state unchanged for failed cleanup calls.

`MxAccessCommandExecutor` implements advice lifecycle commands on the same STA path:

- `Advise` calls `LMXProxyServerClass.Advise` with the requested server handle and item handle.
- `AdviseSupervisory` calls `LMXProxyServerClass.AdviseSupervisory` with the requested server handle and item handle. This remains a distinct command from plain `Advise` even though observed scalar captures share the same lower-level subscription body.
- `UnAdvise` calls `LMXProxyServerClass.UnAdvise` with the requested server handle and item handle.

The worker records plain and supervisory advice separately only after the COM call returns normally. Successful `UnAdvise` removes all tracked advice for the server and item pair because the public MXAccess cleanup method has no plain versus supervisory selector. Successful `RemoveItem` and `Unregister` also clear related advice state from the worker registry. Failed advice and cleanup calls leave registry state unchanged so diagnostics continue to reflect the last successful MXAccess-owned state transition.

## Handle Registry

The worker should track MXAccess state for diagnostics and cleanup, while still treating MXAccess as the authority.

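
The advice bookkeeping described for `Advise`, `AdviseSupervisory`, `UnAdvise`, `RemoveItem`, and `Unregister` can be sketched as a small C# class. This is an illustration only: `AdviceRegistrySketch`, `AdviceKinds`, and the use of `int` handles are assumptions, and the `comCall` delegate stands in for the real MXAccess method so that the "record only after the call returns normally" rule is visible.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

[Flags]
public enum AdviceKinds { None = 0, Plain = 1, Supervisory = 2 }

// Hypothetical diagnostic registry; MXAccess remains the authority on handle validity.
public sealed class AdviceRegistrySketch
{
    // Key: (server handle, item handle). Value: advice kinds that succeeded.
    private readonly Dictionary<Tuple<int, int>, AdviceKinds> _advice =
        new Dictionary<Tuple<int, int>, AdviceKinds>();

    public AdviceKinds Get(int serverHandle, int itemHandle)
    {
        AdviceKinds kinds;
        return _advice.TryGetValue(Tuple.Create(serverHandle, itemHandle), out kinds)
            ? kinds
            : AdviceKinds.None;
    }

    // Runs the COM call first; if it throws, registry state is left unchanged.
    public void Advise(Action comCall, int serverHandle, int itemHandle, AdviceKinds kind)
    {
        comCall();
        var key = Tuple.Create(serverHandle, itemHandle);
        _advice[key] = Get(serverHandle, itemHandle) | kind;
    }

    // UnAdvise has no plain versus supervisory selector, so it clears both kinds.
    // RemoveItem can use the same path because it clears the same pair.
    public void UnAdviseOrRemoveItem(Action comCall, int serverHandle, int itemHandle)
    {
        comCall();
        _advice.Remove(Tuple.Create(serverHandle, itemHandle));
    }

    // Unregister clears advice for every item under the server handle.
    public void Unregister(Action comCall, int serverHandle)
    {
        comCall();
        foreach (var key in _advice.Keys.Where(k => k.Item1 == serverHandle).ToList())
        {
            _advice.Remove(key);
        }
    }
}
```

The sketch never validates handles before calling MXAccess, so invalid and cross-server handle behavior still comes from the COM call itself.
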
Suggested tracked state:

- registered server handles,
- item handles,
- item names and context,
- server handle for each item,
- advise state,
- buffered item state,
- authenticated user ids if needed,
- last command touching each handle.

Rules:

- Do not invent handles.
- Do not rewrite handles returned by MXAccess.
- Record server handles only after `Register` succeeds.
- Remove server handles only after `Unregister` succeeds.
- Record item handles only after `AddItem` or `AddItem2` succeeds.
- Remove item handles only after `RemoveItem` succeeds.
- Record advice state only after `Advise` or `AdviseSupervisory` succeeds.
- Remove advice state only after `UnAdvise`, `RemoveItem`, or `Unregister` succeeds.
- Preserve invalid-handle behavior from MXAccess.
- Preserve cross-server handle behavior from MXAccess.
- Use registry state for cleanup and diagnostics, not semantic correction.

## Value Conversion

`VariantConverter` should convert COM values into the protobuf `MxValue` union.

Supported scalar projections:

- bool,
- int32,
- int64,
- float,
- double,
- string,
- timestamp,
- raw fallback.

Supported arrays:

- bool array,
- int32 array,
- float array,
- double array,
- string array,
- timestamp array,
- raw fallback.

Rules:

- Preserve null and empty values distinctly when MXAccess exposes a distinction.
- Preserve array rank and dimensions when available.
- Preserve original variant type metadata.
- If conversion is lossy, include the best typed value plus raw diagnostic metadata.
- Do not throw away values just because they are awkward.

Credential-bearing values must not be logged.

## Status And HRESULT Capture

`MXSTATUS_PROXY` arrays must be represented explicitly. Do not collapse status arrays into a single success flag.

For every command reply, capture:

- protocol success/failure,
- method name,
- correlation id,
- COM HRESULT if available,
- thrown exception HRESULT if available,
- MXAccess return value if any,
- method-specific out parameters,
- status array,
- diagnostic message safe for logs.

If a COM call throws, map the exception into a command reply instead of crashing the worker, unless the exception indicates process corruption or the configured policy says to fail the session.

## Cancellation

Worker cancellation is cooperative at the queue boundary.

Rules:

- If a `WorkerCancel` arrives before a command starts, mark the command canceled and reply or drop according to protocol policy.
- If a command is already executing on the STA, do not attempt to abort the COM call.
- When the COM call returns after gateway cancellation, send the reply only if the gateway still wants late replies; otherwise log and discard.
- Hard cancellation is process kill by the gateway.

## Outbound Queues

The worker should use bounded outbound queues for replies, events, heartbeats, and faults.

Priority order when writing:

1. faults,
2. command replies,
3. shutdown acknowledgements,
4. heartbeats,
5. events.

Event overflow policy defaults to fail-fast for parity testing. If the event queue fills:

1. Capture overflow metrics.
2. Send `WorkerFault` if possible.
3. Stop accepting new commands.
4. Let the gateway close or kill the worker.

Production coalescing may be added later, but it must be explicit and tested. Do not drop or coalesce events in v1.

## Heartbeat And Watchdog

The worker heartbeat should prove that:

- pipe writer is alive,
- worker host is alive,
- STA has recently pumped or completed work.

Heartbeat payload should include:

- worker process id,
- session id,
- current state,
- last STA activity timestamp,
- pending command count,
- outbound event queue depth,
- event sequence,
- current command correlation id if any.

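
The heartbeat payload fields listed above can be modeled as a plain C# object, with a helper that checks whether the STA has been active recently. This is a sketch under assumptions: the names `HeartbeatSnapshot` and `HeartbeatSketch.IsStaResponsive` are hypothetical, the real payload is a protobuf message, and the grace period is passed in because the document does not fix a value.

```csharp
using System;

// Hypothetical in-memory shape of the heartbeat fields.
public sealed class HeartbeatSnapshot
{
    public int WorkerProcessId { get; set; }
    public string SessionId { get; set; }
    public string State { get; set; }
    public DateTime LastStaActivityUtc { get; set; }
    public int PendingCommandCount { get; set; }
    public int OutboundEventQueueDepth { get; set; }
    public long EventSequence { get; set; }
    public string CurrentCommandCorrelationId { get; set; } // null when the STA is idle
}

public static class HeartbeatSketch
{
    // True when the STA pumped or completed work within the grace period.
    public static bool IsStaResponsive(HeartbeatSnapshot snapshot, DateTime nowUtc, TimeSpan grace)
    {
        return nowUtc - snapshot.LastStaActivityUtc <= grace;
    }
}
```

Taking `nowUtc` as a parameter rather than reading the clock inside the helper keeps the check deterministic in unit tests.
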
The STA watchdog should warn when:

- one command exceeds its expected duration,
- the STA has not pumped messages within the heartbeat grace period,
- event queue depth remains high.

The worker can report the problem, but the gateway owns the final kill decision.

## Shutdown

Graceful shutdown sequence:

1. Pipe reader receives `WorkerShutdown`.
2. Worker host marks shutdown requested.
3. Reject new commands.
4. Let current STA command finish if within timeout.
5. Optionally run MXAccess cleanup:
   - `UnAdvise`,
   - `RemoveItem`,
   - `Unregister`.
6. Detach event handlers.
7. Release COM object until reference count reaches zero when possible.
8. Stop pipe reader and writer.
9. Exit process with success code.

If shutdown wedges, the gateway kills the process. The worker should be written so process kill does not corrupt other sessions.

## Fault Handling

Worker fault categories:

- `InvalidArguments`
- `GatewayAuthenticationFailed`
- `ProtocolMismatch`
- `ProtocolViolation`
- `PipeDisconnected`
- `MxAccessCreationFailed`
- `MxAccessCommandFailed`
- `MxAccessEventConversionFailed`
- `StaHung`
- `QueueOverflow`
- `ShutdownTimeout`

Fault payload should include:

- category,
- session id,
- correlation id when command-specific,
- command method when command-specific,
- HRESULT when available,
- exception type when available,
- safe diagnostic message.

Do not include raw credentials or full secured-write values.

## Security

The worker should trust only the launching gateway after validating:

- expected session id,
- expected protocol version,
- nonce,
- pipe identity where available.

It should not expose any network listener. It should not accept commands from arbitrary local processes.

Credential-bearing commands must keep credential data out of:

- command line,
- logs,
- metrics labels,
- exception messages,
- crash dumps when avoidable.

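
Keeping credential data out of logs can be illustrated with a name-based redaction helper in the spirit of the `WorkerLogRedactor` described under Process Bootstrap. This is a hypothetical sketch, not that class: `LogRedactionSketch`, the `[redacted]` placeholder, and the substring-matching rule are assumptions chosen for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical key/value log formatter that hides sensitive field values.
public static class LogRedactionSketch
{
    public const string Placeholder = "[redacted]";

    private static readonly string[] SensitiveMarkers =
    {
        "nonce", "secret", "password", "token", "credential", "apikey"
    };

    public static bool IsSensitive(string fieldName)
    {
        if (string.IsNullOrEmpty(fieldName))
        {
            return false;
        }

        // Normalize "API_KEY", "api-key", and "ApiKey" to "apikey" before matching.
        var normalized = new string(fieldName.Where(c => char.IsLetterOrDigit(c)).ToArray())
            .ToLowerInvariant();
        return SensitiveMarkers.Any(marker => normalized.Contains(marker));
    }

    // Formats fields as "key=value" pairs, replacing sensitive values before output.
    public static string Format(IEnumerable<KeyValuePair<string, string>> fields)
    {
        return string.Join(" ", fields.Select(
            f => f.Key + "=" + (IsSensitive(f.Key) ? Placeholder : f.Value)));
    }
}
```

Name-based redaction only protects fields whose names signal sensitivity; credential values embedded in free-text messages or exception strings still need to be kept out at the call site.
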
## Observability

Worker logs should include:

- startup arguments except secrets,
- protocol version,
- gateway handshake result,
- MXAccess COM creation result,
- command start/end with correlation id,
- HRESULT/status summary,
- event family and sequence,
- queue overflow,
- STA watchdog warnings,
- shutdown path.

Metrics can be emitted through the gateway or exposed as worker heartbeat fields. The worker does not need its own public metrics endpoint.

## Testing Strategy

Worker tests that do not require installed MXAccess:

- frame reader/writer,
- protocol validation,
- command queue ordering,
- STA command scheduling with a fake COM object,
- message-pump wake behavior where practical,
- value conversion,
- status conversion,
- event conversion from fake event args,
- shutdown state transitions,
- queue overflow behavior.

Live MXAccess tests:

- COM creation on STA,
- `Register` and `Unregister`,
- `AddItem` and `RemoveItem`,
- `Advise` and one `OnDataChange`,
- write completion behavior,
- secured write behavior,
- buffered data-change behavior,
- invalid handle behavior,
- no synthesized `OperationComplete` when native MXAccess does not raise it,
- raw metadata preservation for buffered payloads that cannot yet be fully converted.

Live tests should be opt-in and clearly marked because they depend on installed MXAccess COM and provider state. The worker test suite uses `MXGATEWAY_RUN_LIVE_MXACCESS_TESTS=1` for these tests. `AddItem` uses `TestChildObject.TestInt` by default and accepts an override through `MXGATEWAY_LIVE_MXACCESS_ITEM`; `AddItem2` uses the captured parity fixture shape `AddItem2("TestInt", "TestChildObject")`.

## Initial Implementation Slice

The first worker slice should implement:

1. Argument parsing and pipe connection.
2. Protocol hello and nonce validation.
3. STA thread startup.
4. COM initialization and MXAccess object creation.
5. Message pump with command wake event.
6. `WorkerReady`.
7. Shutdown command.
8. `Register`, `AddItem`, and `Advise`.
9. Event sink for one `OnDataChange`.
10. Basic value/status conversion.
11. Event model coverage for `OperationComplete` and `OnBufferedDataChange` without synthesized events.
12. Fault reporting.

This slice proves the worker can preserve the core MXAccess requirements: single-process isolation, STA ownership, message pumping, command execution, and event delivery.