Initial MXAccess gateway design docs

Joseph Doherty
2026-04-26 14:04:40 -04:00
commit 1d8a6fe3db
2 changed files with 1223 additions and 0 deletions
@@ -0,0 +1,326 @@
# MXAccess Gateway Agent Guide
Repository: https://gitea.dohertylan.com/dohertj2/mxaccessgw
This project builds a gateway that gives modern clients full MXAccess parity
without requiring those clients to load MXAccess COM, run as x86, or own an STA
message pump. Treat the installed MXAccess COM component as the compatibility
baseline.
## Core Contract
Preserve MXAccess behavior first:
- public MXAccess command semantics,
- native MXAccess event families,
- STA/message-pump delivery behavior,
- installed-provider quirks,
- HRESULT/status/value marshaling,
- per-client isolation.
Do not simplify, normalize, or "fix" MXAccess behavior unless an explicit
non-parity mode is being implemented and tested. `MxAsbClient` and managed NMX
are future acceleration paths only; they do not define the parity contract.
## Architecture
The intended split is:
```text
client
-> gRPC over TCP
-> .NET 10 x64 gateway
-> session manager
-> per-session .NET Framework 4.8 x86 worker process
-> dedicated STA thread
-> MXAccess COM instance
-> Windows/COM message pump
-> command queue
-> event sink
```
The gateway must never instantiate or call MXAccess directly. All MXAccess COM
interaction belongs in the worker process on its dedicated STA thread.
The worker must not host public gRPC. Gateway-to-worker communication should use
a small local IPC protocol, with named pipes and protobuf-framed messages as the
default design.
## Runtime Targets
- Gateway: .NET 10, C#, x64 preferred, ASP.NET Core gRPC.
- Worker: .NET Framework 4.8, C#, x86 by default.
- Worker IPC: one bidirectional named pipe per worker.
- Worker process model: one external client session maps to one worker by
default.
## Expected Layout
Prefer this structure unless there is a strong reason to adjust it:
```text
src/MxGateway.Contracts/
  Protos/
    mxaccess_gateway.proto
    mxaccess_worker.proto
  Generated/
src/MxGateway.Server/
  Program.cs
  Sessions/
  Workers/
  Grpc/
  Metrics/
src/MxGateway.Worker/
  Program.cs
  Ipc/
  Sta/
  MxAccess/
  Conversion/
src/MxGateway.Tests/
  contract tests
  gateway session tests
  fake worker tests
src/MxGateway.Worker.Tests/
  value/status conversion tests
  STA queue tests
src/MxGateway.IntegrationTests/
  optional live MXAccess tests
```
The contracts project may multi-target, or the `.proto` files may be shared as
source inputs to both gateway and worker builds.
## Public API Shape
The external API should be session-oriented. Initial rollout should prefer
unary `OpenSession`, `CloseSession`, and `Invoke`, plus server-streaming
`StreamEvents`. Add a bidirectional `Session` stream after the command and event
model is stable.
Do not compress MXAccess into generic verbs too early. Use a command enum with
method-specific payloads so parity can be tested method by method.
Core MXAccess commands to represent:
- `Register`
- `Unregister`
- `AddItem`
- `AddItem2`
- `RemoveItem`
- `Advise`
- `UnAdvise`
- `AdviseSupervisory`
- `AddBufferedItem`
- `SetBufferedUpdateInterval`
- `Suspend`
- `Activate`
- `Write`
- `Write2`
- `WriteSecured`
- `WriteSecured2`
- `AuthenticateUser`
- `ArchestrAUserToId`
Diagnostics may include `Ping`, `GetSessionState`, `GetWorkerInfo`,
`DrainEvents`, and `ShutdownWorker`.
## Event Requirements
Represent every public MXAccess event family:
- `OnDataChange`
- `OnWriteComplete`
- `OperationComplete`
- `OnBufferedDataChange`
Preserve per-worker event order. The gateway must not reorder events emitted by
the same MXAccess instance.
Event DTOs should carry event family, session id, server handle, item handle,
value, quality, timestamp, `MXSTATUS_PROXY[]` equivalent, raw HRESULT/status
fields when available, event sequence, worker timestamp, and gateway receive
timestamp.
## Value And Status Rules
Use a protobuf value union that can represent COM `VARIANT` values and arrays.
When a value cannot be losslessly converted, preserve both the best typed
projection and enough raw diagnostic metadata to reproduce the case.
Represent `MXSTATUS_PROXY` explicitly. Do not collapse status arrays into a
single success flag.
Command replies should include protocol status, COM HRESULT if available,
MXAccess return values, method-specific out parameters, and status arrays where
the MXAccess method emits them.
## Worker Rules
Each worker owns:
- one process,
- one MXAccess session,
- one dedicated STA thread,
- one MXAccess COM object,
- one inbound command queue,
- one outbound event queue.
All MXAccess operations must run on the STA. A plain blocking queue is not
enough for the STA; the STA loop must pump Windows/COM messages and service
queued commands.
Do not block the STA on pipe writes, gRPC calls, or slow consumers. Event
handlers should convert event args, enqueue outbound events, and return to
pumping messages.
On graceful shutdown, reject new commands, optionally clean up active MXAccess
handles, detach events, release the COM object, uninitialize COM, and exit. If
graceful shutdown exceeds the configured timeout, the gateway may kill the
worker.
## IPC Rules
Default pipe name shape:
```text
mxaccess-gateway-{gatewayProcessId}-{sessionId}
```
Frame messages as:
```text
uint32 little-endian payload_length
payload_length bytes protobuf WorkerEnvelope
```
Every envelope should include protocol version, session id, monotonic sender
sequence, correlation id, and a typed body. Protocol version mismatch should
fail session creation.
Pipe security should be local-machine only, with ACLs restricted to the gateway
identity and launched worker identity. Prefer a per-session nonce handshake.
## Gateway Rules
The gateway is responsible for:
- public TCP/gRPC API,
- authn/authz when needed,
- session creation and teardown,
- worker launch and lifecycle management,
- command routing,
- event streaming,
- leases, heartbeats, timeouts, and quotas,
- worker kill/restart policy,
- metrics and structured logs.
The gRPC layer should stay thin: validate request, find session, call the
session worker client, map worker replies to public replies, and stream events.
Keep MXAccess-specific translation logic testable outside the gRPC handlers.
Gateway restart should not try to reattach old workers in the first version.
Terminate orphaned workers on startup if that behavior is implemented.
## Command, Timeout, And Cancellation Semantics
Command lifecycle:
```text
client gRPC command
gateway validates session and payload
gateway assigns correlation id
gateway writes WorkerCommand to pipe
worker queues command to STA
STA executes MXAccess method
worker captures return/out/status/HRESULT
worker sends WorkerCommandReply
gateway completes gRPC response
```
Canceling a gRPC call should stop waiting in the gateway, but it cannot safely
abort an in-flight COM call on the STA. Hard cancellation means killing the
worker process.
If a command wedges the STA beyond a configured grace period, the gateway should
kill the worker and fail the session.
## Backpressure Policy
Worker outbound events must use a bounded queue. For parity testing, prefer
fail-fast behavior over silent drops. Production coalescing or drop policies
must be explicit and observable.
The gateway should preserve per-session event order, apply backpressure from
slow gRPC streams, and disconnect or coalesce only according to an explicit
policy.
## Security And Logging
Use TLS for remote gRPC when crossing machine boundaries. Authentication may be
Windows auth, mTLS, or a deployment-specific token.
Commands that write, authenticate users, or alter runtime state need explicit
authorization design.
Never log passwords or raw credential values for `AuthenticateUser`,
`WriteSecured`, or related secured operations. Do not log full values by
default; make value logging opt-in and redacted.
## Testing Expectations
Use focused tests for:
- contract/protobuf compatibility,
- gateway session state and worker lifecycle,
- gateway behavior with a fake worker,
- worker value/status conversion,
- STA queue and message-pump behavior.
Live MXAccess integration tests are optional but should be isolated because they
depend on installed COM components and provider behavior.
Parity tests should compare direct MXAccess behavior against the gateway:
- return values,
- HRESULTs and exceptions,
- event sequence,
- value projection,
- quality/status arrays,
- invalid handle behavior,
- cross-server handle behavior,
- cleanup behavior.
Known important parity areas:
- `WriteSecured` may fail before a value-bearing NMX body is emitted.
- `WriteSecured2` can succeed in observed native paths.
- `OperationComplete` is distinct from write completion.
- `OnBufferedDataChange` has a distinct public event shape.
- Invalid handles and cross-server handles have specific exception/status
behavior.
- STA message pumping is required for event delivery.
## Implementation Priority
Build the smallest end-to-end slice first:
1. .NET 10 gateway starts.
2. Client calls `OpenSession`.
3. Gateway launches .NET Framework 4.8 x86 worker.
4. Worker creates STA and MXAccess COM object.
5. Client calls `Register`.
6. Client calls `AddItem`.
7. Client calls `Advise`.
8. Worker forwards one `OnDataChange` event to the gateway.
9. Gateway streams the event to the client.
10. Client calls `CloseSession`.
11. Gateway shuts down the worker.
That slice proves the high-risk requirements: process isolation, STA ownership,
message pumping, command routing, and event streaming.
@@ -0,0 +1,897 @@
# MXAccess Gateway Design
## Goal
Provide full MXAccess parity to modern clients without forcing those clients to
load MXAccess COM, run as x86, or own an STA message pump.
The gateway must preserve MXAccess behavior first:
- public MXAccess command semantics,
- native MXAccess event families,
- STA/message-pump delivery behavior,
- installed-provider quirks,
- HRESULT/status/value marshaling,
- per-client isolation.
`MxAsbClient` and the managed NMX client remain useful future acceleration
paths, but they should not define the parity contract. The installed MXAccess
COM component is the compatibility baseline.
## Architecture
Use a .NET 10 C# gateway for external clients and per-session .NET Framework
4.8 x86 C# worker processes for MXAccess.
```text
client
-> gRPC over TCP
-> .NET 10 x64 gateway
-> session manager
-> per-session .NET Framework 4.8 x86 worker process
-> dedicated STA thread
-> MXAccess COM instance
-> Windows/COM message pump
-> command queue
-> event sink
```
The worker does not host gRPC. The gateway talks to workers through a small
local IPC protocol. Named pipes with protobuf-framed messages are the default
transport.
## Process Split
### Gateway Process
Runtime:
- .NET 10
- C#
- x64 preferred
- ASP.NET Core gRPC server
Responsibilities:
- expose the public TCP/gRPC API,
- authenticate/authorize remote clients if needed,
- create one worker per client session,
- route commands to the owning worker,
- stream worker events to the owning client,
- enforce session leases, heartbeats, timeouts, and quotas,
- kill/restart workers when they hang or crash,
- collect metrics and structured logs,
- optionally route selected future operations to ASB or managed NMX only after
parity tests prove equivalent behavior.
The gateway must never instantiate or call MXAccess directly.
### Worker Process
Runtime:
- .NET Framework 4.8
- C#
- x86 build by default
Responsibilities:
- own one MXAccess COM instance,
- create and preserve one dedicated STA thread,
- pump Windows/COM messages on that STA thread,
- execute every MXAccess method call on that STA thread,
- subscribe to MXAccess COM events,
- convert command results and events into internal protobuf DTOs,
- send events back to the gateway over the worker pipe,
- shut down cleanly on request,
- terminate quickly when the gateway kills the process.
The worker should be disposable. If MXAccess leaks state, faults, or wedges the
STA, the gateway can kill the process without corrupting other clients.
## Why Not gRPC In The Worker
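.NET Framework 4.8 does not have the same first-class gRPC stack as .NET 10:
the ASP.NET Core gRPC server does not run on .NET Framework, and the older
native `Grpc.Core` library is in maintenance mode. For the worker, a custom
local protocol is simpler and more predictable: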
- named pipes are Windows-native,
- no HTTP/2 requirement,
- fewer dependencies in the x86 process,
- easier process lifetime control,
- easier framed binary protocol,
- sufficient throughput for command and event traffic.
The public API can still be modern gRPC because the gateway runs on .NET 10.
## Worker IPC
Default transport: one bidirectional named pipe per worker.
Pipe name:
```text
mxaccess-gateway-{gatewayProcessId}-{sessionId}
```
Message framing:
```text
uint32 little-endian payload_length
payload_length bytes protobuf WorkerEnvelope
uint32 little-endian payload_length
payload_length bytes protobuf WorkerEnvelope
...
```
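
The framing above can be sketched in C# as follows. This is a minimal illustration, not the project's implementation: `FrameCodec` and `MaxPayloadBytes` are hypothetical names, and the 16 MiB cap is an assumption because the design does not state a maximum frame size. The code avoids newer APIs so that it should compile for both the .NET 10 gateway and the .NET Framework 4.8 worker.

```csharp
using System;
using System.IO;

// Hypothetical helper. Only the frame layout (uint32 little-endian length,
// then that many payload bytes) comes from the design.
public static class FrameCodec
{
    // Assumed safety cap; the design does not specify a maximum frame size.
    public const int MaxPayloadBytes = 16 * 1024 * 1024;

    public static void WriteFrame(Stream stream, byte[] payload)
    {
        if (payload.Length > MaxPayloadBytes) throw new InvalidDataException("Frame too large.");
        uint length = (uint)payload.Length;
        var header = new byte[4];
        header[0] = (byte)length;          // least significant byte first
        header[1] = (byte)(length >> 8);
        header[2] = (byte)(length >> 16);
        header[3] = (byte)(length >> 24);
        stream.Write(header, 0, header.Length);
        stream.Write(payload, 0, payload.Length);
    }

    // Returns null when the peer closed the stream cleanly between frames.
    public static byte[] ReadFrame(Stream stream)
    {
        var header = new byte[4];
        int got = ReadFully(stream, header);
        if (got == 0) return null;
        if (got < header.Length) throw new EndOfStreamException("Truncated frame header.");
        uint length = (uint)header[0] | ((uint)header[1] << 8)
                    | ((uint)header[2] << 16) | ((uint)header[3] << 24);
        if (length > MaxPayloadBytes) throw new InvalidDataException("Frame too large.");
        var payload = new byte[length];
        if (ReadFully(stream, payload) < payload.Length)
            throw new EndOfStreamException("Truncated frame payload.");
        return payload;
    }

    // Stream.Read may return fewer bytes than requested, so loop until the
    // buffer is full or the stream ends.
    private static int ReadFully(Stream stream, byte[] buffer)
    {
        int total = 0;
        while (total < buffer.Length)
        {
            int read = stream.Read(buffer, total, buffer.Length - total);
            if (read == 0) break;
            total += read;
        }
        return total;
    }
}
```

The payload bytes would be a serialized `WorkerEnvelope`; serialization is left out here so the sketch does not depend on generated protobuf types.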
The gateway creates the pipe server, starts the worker with the pipe name as an
argument, then waits for the worker to connect and send `WorkerReady`.
Pipe security:
- local machine only,
- ACL restricted to the gateway identity and the launched worker identity,
- no anonymous access,
- optionally add a per-session random handshake nonce passed by command line or
inherited environment.
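
A possible gateway-side sketch of the pipe name, nonce, and pipe server creation, targeting the .NET 10 gateway. `WorkerPipeFactory` and its members are hypothetical names, and the 32-byte nonce length is an assumption. `PipeOptions.CurrentUserOnly` only matches the ACL rule above when the worker runs under the gateway identity; a separate worker account would need an explicit pipe ACL instead (for example through `NamedPipeServerStreamAcl.Create`).

```csharp
using System;
using System.IO.Pipes;
using System.Security.Cryptography;

// Hypothetical gateway-side helper; not part of the design itself.
public static class WorkerPipeFactory
{
    // Pipe name shape from the design: mxaccess-gateway-{gatewayProcessId}-{sessionId}
    public static string BuildPipeName(int gatewayProcessId, string sessionId)
    {
        return $"mxaccess-gateway-{gatewayProcessId}-{sessionId}";
    }

    // 32 random bytes rendered as hex. The nonce length is an assumption.
    public static string CreateNonce()
    {
        return Convert.ToHexString(RandomNumberGenerator.GetBytes(32));
    }

    public static NamedPipeServerStream CreateServer(string sessionId)
    {
        string name = BuildPipeName(Environment.ProcessId, sessionId);
        return new NamedPipeServerStream(
            name,
            PipeDirection.InOut,
            1, // one server instance: one bidirectional pipe per worker
            PipeTransmissionMode.Byte,
            // CurrentUserOnly assumes the worker runs as the gateway identity.
            PipeOptions.Asynchronous | PipeOptions.CurrentUserOnly);
    }
}
```

The gateway would then pass the pipe name (and, if used, the nonce) to the worker at launch and call `WaitForConnectionAsync` on the returned stream before expecting `WorkerReady`.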
### Worker Envelope
Every IPC message uses a common envelope:
```protobuf
message WorkerEnvelope {
  uint32 protocol_version = 1;
  string session_id = 2;
  uint64 sequence = 3;
  uint64 correlation_id = 4;
  oneof body {
    WorkerHello worker_hello = 10;
    GatewayHello gateway_hello = 11;
    WorkerReady worker_ready = 12;
    WorkerCommand command = 20;
    WorkerCommandReply command_reply = 21;
    WorkerEvent event = 22;
    WorkerHeartbeat heartbeat = 23;
    WorkerCancel cancel = 24;
    WorkerShutdown shutdown = 25;
    WorkerFault fault = 26;
  }
}
```
Rules:
- `sequence` is monotonic per sender.
- `correlation_id` links commands to replies.
- Events use their own correlation id or zero.
- Replies must preserve MXAccess HRESULT/status information even when the
command is also represented as a protocol-level failure.
- Protocol version mismatch fails session creation.
## Public gRPC API
The external API should be session-oriented. A bidirectional stream is the best
long-term shape because it naturally carries commands, replies, events,
heartbeats, and cancellation.
```protobuf
service MxAccessGateway {
  rpc OpenSession(OpenSessionRequest) returns (OpenSessionReply);
  rpc CloseSession(CloseSessionRequest) returns (CloseSessionReply);
  rpc Invoke(MxCommandRequest) returns (MxCommandReply);
  rpc StreamEvents(StreamEventsRequest) returns (stream MxEvent);
  rpc Session(stream ClientMessage) returns (stream ServerMessage);
}
```
Recommended rollout:
1. Implement unary `OpenSession`, `CloseSession`, and `Invoke`.
2. Implement server-streaming `StreamEvents`.
3. Add bidirectional `Session` after the command/event model is stable.
The unary plus event-stream shape is easier to debug initially. The
bidirectional stream can later reduce per-command overhead and improve
backpressure.
## Public MXAccess Command Surface
The gateway contract should mirror MXAccess concepts without leaking COM types.
Keep handles and statuses explicit.
Core commands:
- `Register`
- `Unregister`
- `AddItem`
- `AddItem2`
- `RemoveItem`
- `Advise`
- `UnAdvise`
- `AdviseSupervisory`
- `AddBufferedItem`
- `SetBufferedUpdateInterval`
- `Suspend`
- `Activate`
- `Write`
- `Write2`
- `WriteSecured`
- `WriteSecured2`
- `AuthenticateUser`
- `ArchestrAUserToId`
Optional diagnostics:
- `Ping`
- `GetSessionState`
- `GetWorkerInfo`
- `DrainEvents`
- `ShutdownWorker`
Do not compress MXAccess semantics into generic verbs too early. A command enum
with method-specific payloads is easier to test for parity.
## Event Surface
The gateway must represent every public MXAccess event family:
- `OnDataChange`
- `OnWriteComplete`
- `OperationComplete`
- `OnBufferedDataChange`
The event DTO should include:
- event family,
- session id,
- server handle,
- item handle,
- value when present,
- quality when present,
- timestamp when present,
- `MXSTATUS_PROXY[]` equivalent,
- raw HRESULT/status fields when available,
- event ordering sequence,
- worker timestamp,
- gateway receive timestamp.
Keep event order stable per worker. The gateway should not reorder events from
the same MXAccess instance.
## Value Model
Use a protobuf value union that can represent COM `VARIANT` values and arrays.
```protobuf
// Requires: import "google/protobuf/timestamp.proto";
message MxValue {
  oneof kind {
    bool bool_value = 1;
    int32 int32_value = 2;
    int64 int64_value = 3;
    float float_value = 4;
    double double_value = 5;
    string string_value = 6;
    google.protobuf.Timestamp timestamp_value = 7;
    MxArray array_value = 8;
    bytes raw_variant = 100;
  }
}
```
Array support should include at least:
- bool array,
- int32 array,
- float array,
- double array,
- string array,
- timestamp array,
- raw fallback.
For full parity, unknown or awkward COM values should be preserved as raw
metadata rather than dropped. If a value cannot be losslessly converted, the
worker should return both the best typed projection and enough diagnostic
metadata to reproduce the case.
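
As a rough illustration of the "typed projection plus lossless flag" idea, the worker could classify each boxed COM value before building the protobuf message. Everything in this sketch is an assumption: `MxValueKind`, `MxValueClassifier`, and the specific type mapping are hypothetical, and the real mapping has to be pinned by parity tests against the installed interop.

```csharp
using System;

// Hypothetical mirror of the MxValue oneof cases.
public enum MxValueKind { Bool, Int32, Int64, Float, Double, String, Timestamp, Array, RawVariant }

public static class MxValueClassifier
{
    // Picks the oneof case for a boxed value as marshaled by COM interop and
    // reports whether that projection keeps all information. The mapping is an
    // illustrative assumption, not confirmed MXAccess behavior.
    public static MxValueKind Classify(object comValue, out bool lossless)
    {
        lossless = true;
        switch (comValue)
        {
            case bool _: return MxValueKind.Bool;
            case sbyte _:
            case byte _:
            case short _:
            case ushort _:
            case int _: return MxValueKind.Int32;
            case uint _:
            case long _: return MxValueKind.Int64;
            case float _: return MxValueKind.Float;
            case double _: return MxValueKind.Double;
            case string _: return MxValueKind.String;
            case DateTime _: return MxValueKind.Timestamp;
            case Array _: return MxValueKind.Array;
            case decimal _:
                // A double cannot hold every decimal value exactly.
                lossless = false;
                return MxValueKind.Double;
            default:
                // null, ulong, COM object references, and anything unexpected
                // fall back to the raw representation.
                lossless = false;
                return MxValueKind.RawVariant;
        }
    }
}
```

When `lossless` is false, the worker would attach diagnostic metadata (for example the original runtime type name) alongside the typed projection, as described above.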
## Status Model
Represent `MXSTATUS_PROXY` explicitly:
```protobuf
message MxStatusProxy {
  int32 success = 1;
  uint32 category = 2;
  uint32 detail = 3;
  uint32 source = 4;
  uint32 raw_hresult = 5;
  string text = 6;
}
```
The exact field names should be adjusted to match the actual interop struct,
but the design principle is important: do not collapse status arrays into a
single success flag.
For command replies, return:
- protocol status,
- COM HRESULT if available,
- MXAccess return value if the method has one,
- method-specific out parameters,
- status array if the method emits one.
## STA Worker Thread Model
Each worker owns:
- one process,
- one MXAccess session,
- one dedicated STA thread,
- one MXAccess COM object,
- one inbound command queue,
- one outbound event queue.
All MXAccess operations run on the STA:
```text
pipe reader thread
  -> parse WorkerCommand
  -> enqueue StaCommand
  -> await task completion
  -> write WorkerCommandReply

STA thread
  -> CoInitializeEx(COINIT_APARTMENTTHREADED)
  -> create MXAccess COM object
  -> wire events
  -> run message pump
  -> execute queued commands between message dispatches

MXAccess event handler on STA
  -> convert event args to WorkerEvent
  -> enqueue outbound event

pipe writer thread
  -> dequeue replies/events
  -> write framed protobuf messages
```
Do not block the STA on pipe writes or gRPC calls. The STA should enqueue
results/events and return to pumping messages.
### Message Pump
The STA loop must pump Windows messages and service command work. A typical
shape:
```text
while not shutdown:
    while command queue has work:
        execute one command on STA
    MsgWaitForMultipleObjectsEx(
        command_event,
        timeout,
        QS_ALLINPUT,
        MWMO_INPUTAVAILABLE)
    while PeekMessage:
        TranslateMessage
        DispatchMessage
```
This is the critical piece for MXAccess event delivery. A plain blocking queue
on an STA thread is not enough if it prevents COM/window messages from being
pumped.
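
The loop above might look like this in the C# worker. It is a sketch under stated assumptions rather than the final design: `StaPump` and its members are hypothetical names, the 250 ms wait timeout is arbitrary, and commands are modeled as plain `Action` delegates. The P/Invoke signatures and constants follow the Win32 headers. Setting `ApartmentState.STA` before the thread starts makes the CLR initialize COM for a single-threaded apartment on that thread.

```csharp
using System;
using System.Collections.Concurrent;
using System.ComponentModel;
using System.Runtime.InteropServices;
using System.Threading;

// Hypothetical sketch of an STA loop that pumps messages and runs queued commands.
public sealed class StaPump
{
    private const uint QS_ALLINPUT = 0x04FF;
    private const uint MWMO_INPUTAVAILABLE = 0x0004;
    private const uint PM_REMOVE = 0x0001;
    private const uint WAIT_FAILED = 0xFFFFFFFF;

    [StructLayout(LayoutKind.Sequential)]
    private struct MSG
    {
        public IntPtr hwnd; public uint message; public IntPtr wParam; public IntPtr lParam;
        public uint time; public int ptX; public int ptY;
    }

    [DllImport("user32.dll", SetLastError = true)]
    private static extern uint MsgWaitForMultipleObjectsEx(
        uint nCount, IntPtr[] pHandles, uint dwMilliseconds, uint dwWakeMask, uint dwFlags);

    [DllImport("user32.dll", EntryPoint = "PeekMessageW")]
    private static extern bool PeekMessage(out MSG msg, IntPtr hWnd, uint min, uint max, uint remove);

    [DllImport("user32.dll")]
    private static extern bool TranslateMessage(ref MSG msg);

    [DllImport("user32.dll", EntryPoint = "DispatchMessageW")]
    private static extern IntPtr DispatchMessage(ref MSG msg);

    private readonly ConcurrentQueue<Action> _commands = new ConcurrentQueue<Action>();
    private readonly AutoResetEvent _commandEvent = new AutoResetEvent(false);
    private volatile bool _shutdown;

    public Thread Start()
    {
        var thread = new Thread(Run) { IsBackground = true, Name = "mxaccess-sta" };
        thread.SetApartmentState(ApartmentState.STA); // must be set before Start
        thread.Start();
        return thread;
    }

    // Safe to call from any thread, for example the pipe reader.
    public void Post(Action command) { _commands.Enqueue(command); _commandEvent.Set(); }

    public void Stop() { _shutdown = true; _commandEvent.Set(); }

    private void Run()
    {
        var handles = new[] { _commandEvent.SafeWaitHandle.DangerousGetHandle() };
        while (!_shutdown)
        {
            // Run queued commands on the STA.
            while (_commands.TryDequeue(out var command)) command();

            // Sleep until a command is posted, a message arrives, or the timeout elapses.
            uint result = MsgWaitForMultipleObjectsEx(1, handles, 250, QS_ALLINPUT, MWMO_INPUTAVAILABLE);
            if (result == WAIT_FAILED) throw new Win32Exception(Marshal.GetLastWin32Error());

            // Dispatch pending window/COM messages so event callbacks can be delivered.
            while (PeekMessage(out var msg, IntPtr.Zero, 0, 0, PM_REMOVE))
            {
                TranslateMessage(ref msg);
                DispatchMessage(ref msg);
            }
        }
    }
}
```

In a real worker the first posted command would create the MXAccess COM object and attach event handlers, so that every COM call and every callback stays on this one thread.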
### COM Lifetime
Worker startup:
1. set apartment state to STA,
2. initialize COM on the STA,
3. instantiate `LMXProxyServerClass` or the installed MXAccess interop class,
4. attach event handlers,
5. send `WorkerReady`.
Worker shutdown:
1. reject new commands,
2. optionally send `UnAdvise`/`RemoveItem`/`Unregister` for active handles,
3. detach event handlers,
4. release COM object until reference count reaches zero,
5. uninitialize COM,
6. exit process.
If graceful shutdown exceeds timeout, the gateway kills the worker.
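
On the gateway side, the "graceful first, then kill" rule could be sketched as below. `WorkerStopper` and `StopOrKill` are hypothetical names, and `requestShutdown` stands in for whatever sends the `WorkerShutdown` envelope over the pipe.

```csharp
using System;
using System.Diagnostics;

// Hypothetical gateway-side helper for ending a worker process.
public static class WorkerStopper
{
    // Returns true if the worker exited within the timeout after the shutdown
    // request, false if the gateway had to fall back to killing it.
    public static bool StopOrKill(Process worker, Action requestShutdown, TimeSpan timeout)
    {
        requestShutdown(); // for example, send a WorkerShutdown envelope

        if (worker.WaitForExit((int)timeout.TotalMilliseconds))
            return true;

        try
        {
            worker.Kill();
        }
        catch (InvalidOperationException)
        {
            // The process exited between the wait and the kill attempt.
        }
        worker.WaitForExit();
        return false;
    }
}
```

A `false` result is a natural point to record a "worker killed" metric and fail the session.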
## Session Model
One external client session maps to one worker process by default.
Session state in the gateway:
- session id,
- client identity,
- worker process id,
- pipe name,
- pipe connection,
- open time,
- last heartbeat,
- active stream subscribers,
- command timeout policy,
- event queue metrics.
Session state in the worker:
- MXAccess COM object,
- registered server handles,
- item handles,
- item definitions/context,
- advise state,
- buffered state,
- authenticated user ids if needed,
- event sequence number.
The gateway should treat worker state as authoritative for MXAccess handles.
It can keep a shadow state for diagnostics and cleanup, but should not invent
handles.
## Command Execution
Every command should follow the same lifecycle:
```text
client sends gRPC command
gateway validates session and payload
gateway assigns correlation id
gateway writes WorkerCommand to pipe
worker pipe reader enqueues command to STA
STA executes MXAccess method
worker captures return value/out params/status/HRESULT
worker sends WorkerCommandReply
gateway completes gRPC response
```
Timeouts:
- gateway command timeout bounds client waiting,
- worker command timeout marks the command as stuck,
- if the STA does not recover after a configurable grace period, kill the
worker and fail the session.
Cancellation:
- canceling the gRPC call should stop waiting in the gateway,
- it cannot safely abort an in-flight COM call on the STA,
- the worker should finish the COM call and discard or log the late reply if
the correlation was canceled,
- hard cancellation means killing the worker process.
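
One way to express these cancellation rules in the gateway is a table of pending commands keyed by correlation id. This is a sketch, not the project's code: `PendingCommands<TReply>` is a hypothetical type, and `TReply` stands in for the generated `WorkerCommandReply` class. Canceling only removes the waiter; a reply that arrives afterwards finds no entry and can be logged and dropped.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical gateway-side tracker for in-flight worker commands.
public sealed class PendingCommands<TReply>
{
    private readonly ConcurrentDictionary<ulong, TaskCompletionSource<TReply>> _pending =
        new ConcurrentDictionary<ulong, TaskCompletionSource<TReply>>();
    private long _nextCorrelationId;

    // Registers a command and returns its correlation id together with the
    // task that the gRPC handler awaits.
    public (ulong CorrelationId, Task<TReply> Reply) Register(CancellationToken ct)
    {
        ulong id = (ulong)Interlocked.Increment(ref _nextCorrelationId);
        var tcs = new TaskCompletionSource<TReply>(TaskCreationOptions.RunContinuationsAsynchronously);
        _pending[id] = tcs;

        // Cancellation only stops the gateway from waiting. The COM call, if
        // already started, still runs to completion on the worker's STA.
        var registration = ct.Register(() =>
        {
            if (_pending.TryRemove(id, out var removed)) removed.TrySetCanceled(ct);
        });

        // Drop the cancellation callback once the command has finished either way.
        tcs.Task.ContinueWith(_ => registration.Dispose(), TaskScheduler.Default);
        return (id, tcs.Task);
    }

    // Called by the pipe read loop. Returns false for a late reply whose
    // command was already canceled, so the caller can log and discard it.
    public bool TryComplete(ulong correlationId, TReply reply)
    {
        if (!_pending.TryRemove(correlationId, out var tcs)) return false;
        tcs.TrySetResult(reply);
        return true;
    }
}
```

A gateway command timeout fits the same shape: a `CancellationTokenSource` created with a timeout and linked to the gRPC call token bounds how long the client waits, while the worker-side watchdog decides whether the STA is actually stuck.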
## Event Delivery And Backpressure
Events flow from worker to gateway, then gateway to client streams.
Worker policy:
- bounded outbound event channel,
- never block MXAccess event handler on pipe writes,
- if the outbound channel is full, apply configured policy:
  - disconnect session,
  - drop oldest low-priority data-change events,
  - coalesce data changes by item handle,
  - or block briefly then fault.
For full parity testing, default should be fail-fast rather than silent drop.
For production high-rate telemetry, add explicit coalescing modes.
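
A minimal sketch of the fail-fast worker policy, assuming a single producer (the STA event handler) and a single consumer (the pipe writer thread). `OutboundEventQueue<TEvent>` is a hypothetical type built on `BlockingCollection<T>`, which is available on .NET Framework 4.8 without extra packages.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;

// Hypothetical bounded outbound queue with fail-fast overflow handling.
public sealed class OutboundEventQueue<TEvent>
{
    private readonly BlockingCollection<TEvent> _queue;
    private volatile bool _overflowed;

    public OutboundEventQueue(int capacity)
    {
        _queue = new BlockingCollection<TEvent>(capacity);
    }

    // True once an event could not be queued; the session should then be faulted.
    public bool Overflowed { get { return _overflowed; } }

    // Called from the MXAccess event handler on the STA. Never blocks.
    public bool TryEnqueue(TEvent item)
    {
        if (_queue.IsAddingCompleted) return false; // already faulted
        if (_queue.TryAdd(item)) return true;       // returns immediately when full

        _overflowed = true;
        _queue.CompleteAdding(); // the writer drains what is queued, then stops
        return false;
    }

    // Consumed by the pipe writer thread; preserves enqueue order.
    public IEnumerable<TEvent> Drain()
    {
        return _queue.GetConsumingEnumerable();
    }
}
```

After `Drain` completes with `Overflowed` set, the writer would send a `WorkerFault` so the gateway fails the session instead of silently losing events.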
Gateway policy:
- one event sequencer per session,
- preserve per-session event order,
- support multiple client event subscribers only if explicitly required,
- apply backpressure from slow gRPC streams,
- disconnect or coalesce according to client-selected mode.
## Isolation And Fault Handling
Failure cases:
- worker fails startup,
- worker pipe disconnects,
- worker heartbeat expires,
- worker process exits,
- STA command times out,
- MXAccess COM throws,
- MXAccess event handler throws,
- client disconnects,
- gateway shuts down.
Policy:
- worker startup failure fails `OpenSession`,
- worker crash emits terminal session fault to client,
- command exceptions return structured command fault with HRESULT if known,
- stale sessions are closed by lease timeout,
- stuck workers are killed by process id,
- gateway restart should not attempt to reattach old workers unless explicitly
designed; first version should terminate orphaned workers on startup.
Because each client owns one worker, a crash or leak affects only that session.
## Security
External gateway:
- use TLS for remote gRPC if crossing machine boundaries,
- authenticate clients with Windows auth, mTLS, or a deployment-specific token,
- authorize access to commands that can write, authenticate users, or alter
runtime state.
Internal worker IPC:
- local named pipes only,
- restrictive pipe ACL,
- per-session nonce handshake,
- worker validates gateway hello before creating MXAccess,
- gateway validates worker executable path and version,
- no secrets in command line when avoidable.
Credential-sensitive commands such as `AuthenticateUser` and `WriteSecured`
must not log passwords or raw credential values.
## Observability
Gateway metrics:
- sessions open,
- workers running,
- worker start latency,
- command latency by method,
- command failures by method/status,
- event rate by session/event type,
- event queue depth,
- worker memory/CPU,
- worker restarts/kills,
- gRPC stream disconnects.
Worker logs:
- startup/shutdown,
- MXAccess COM creation result,
- command start/end with correlation id,
- HRESULT/status summary,
- event family and sequence number,
- queue overflow,
- STA watchdog warnings.
Do not log full values by default. Make value logging opt-in and redacted where
credentials or secured writes are involved.
## Performance Strategy
First priority is parity. Performance comes from process isolation, batching,
and avoiding unnecessary cross-process round trips.
Baseline choices:
- long-lived worker per session,
- persistent pipe,
- protobuf binary framing,
- no gRPC inside worker,
- no COM calls outside STA,
- event streaming rather than event polling.
Optimizations after parity:
- batch commands where MXAccess semantics allow,
- batch events from worker to gateway while preserving order,
- optional data-change coalescing by item handle,
- memory-mapped payload slabs for very large arrays,
- shared schema for typed values to avoid raw COM marshaling at the gateway,
- gateway-side route to `MxAsbClient` for proven high-volume read/write
workloads only when caller opts into non-MXAccess-backed behavior or parity
tests prove equivalence.
## Project Layout
Suggested additions:
```text
src/MxGateway.Contracts/
  Protos/
    mxaccess_gateway.proto
    mxaccess_worker.proto
  Generated/
src/MxGateway.Server/
  Program.cs
  Sessions/
  Workers/
  Grpc/
  Metrics/
src/MxGateway.Worker/
  Program.cs
  Ipc/
  Sta/
  MxAccess/
  Conversion/
src/MxGateway.Tests/
  contract tests
  gateway session tests
  fake worker tests
src/MxGateway.Worker.Tests/
  value/status conversion tests
  STA queue tests
src/MxGateway.IntegrationTests/
  optional live MXAccess tests
```
Build outputs:
- gateway: .NET 10 x64,
- worker: .NET Framework 4.8 x86.
The contracts project can multi-target if needed, or the `.proto` files can be
shared as source inputs to both gateway and worker builds.
## Worker Implementation Plan
### Phase 1: Minimal Worker Harness
- Create .NET Framework 4.8 x86 worker executable.
- Parse pipe name/session id/nonce args.
- Connect to gateway named pipe.
- Exchange hello/ready messages.
- Start STA thread.
- Create MXAccess COM object on STA.
- Pump messages.
- Shut down cleanly.
Exit criteria:
- gateway can spawn worker,
- worker reports ready,
- worker exits on shutdown command,
- STA remains responsive.
### Phase 2: Command Queue
- Add command DTOs for `Register`, `Unregister`, `AddItem`, `RemoveItem`.
- Implement STA command dispatch.
- Return method result, HRESULT, and structured fault.
- Add command timeout handling in gateway.
Exit criteria:
- client can open a session and perform basic handle lifecycle through gRPC.
### Phase 3: Event Stream
- Wire MXAccess events in the worker.
- Convert `OnDataChange`, `OnWriteComplete`, `OperationComplete`, and
`OnBufferedDataChange` to protobuf events.
- Add event sequence numbers.
- Add gateway `StreamEvents`.
Exit criteria:
- advised item changes reach a .NET 10 client without the client owning an STA.
### Phase 4: Full Command Surface
Add remaining MXAccess methods:
- `Advise`
- `UnAdvise`
- `AdviseSupervisory`
- `AddItem2`
- `AddBufferedItem`
- `SetBufferedUpdateInterval`
- `Suspend`
- `Activate`
- `Write`
- `Write2`
- `WriteSecured`
- `WriteSecured2`
- `AuthenticateUser`
- `ArchestrAUserToId`
Exit criteria:
- gRPC command surface covers the installed MXAccess public method set.
### Phase 5: Parity Harness
- Reuse existing MXAccess trace harness scenarios.
- Run each scenario against direct MXAccess and against the gateway.
- Compare:
  - return values,
  - HRESULTs/exceptions,
  - event sequence,
  - value projection,
  - quality/status arrays,
  - invalid handle behavior,
  - cleanup behavior.
Exit criteria:
- documented parity matrix for all public methods and event families.
### Phase 6: Hardening
- Worker watchdog.
- Heartbeats.
- Process kill/restart.
- Bounded queues.
- Backpressure policy.
- TLS/auth on public gateway.
- Metrics.
- Structured logging.
- Installer/service packaging.
Exit criteria:
- gateway can run as a Windows service and recover from worker crashes.
## Gateway Implementation Plan
### Session Manager
Core operations:
- allocate session id,
- choose worker executable,
- create pipe name and nonce,
- start worker process,
- accept pipe connection,
- verify worker hello,
- track worker state,
- close or kill worker.
State machine:
```text
Creating
-> StartingWorker
-> WaitingForPipe
-> InitializingWorker
-> Ready
-> Closing
-> Closed
-> Faulted
```
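
The state list could be enforced with a small transition check. The sketch below makes two assumptions that the list does not spell out: the states from `Creating` through `Closed` form a linear path, and `Faulted` is reachable from any state that is not already terminal. `SessionState` and `SessionTransitions` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public enum SessionState
{
    Creating, StartingWorker, WaitingForPipe, InitializingWorker, Ready, Closing, Closed, Faulted
}

// Hypothetical transition rules; the linear path and the "fault from any live
// state" rule are assumptions layered on the state list.
public static class SessionTransitions
{
    private static readonly Dictionary<SessionState, SessionState> Next =
        new Dictionary<SessionState, SessionState>
        {
            [SessionState.Creating] = SessionState.StartingWorker,
            [SessionState.StartingWorker] = SessionState.WaitingForPipe,
            [SessionState.WaitingForPipe] = SessionState.InitializingWorker,
            [SessionState.InitializingWorker] = SessionState.Ready,
            [SessionState.Ready] = SessionState.Closing,
            [SessionState.Closing] = SessionState.Closed,
        };

    public static bool CanMove(SessionState from, SessionState to)
    {
        // Closed and Faulted are treated as terminal.
        if (from == SessionState.Closed || from == SessionState.Faulted) return false;

        // Any live state may fault (worker crash, pipe loss, STA timeout).
        if (to == SessionState.Faulted) return true;

        return Next.TryGetValue(from, out var next) && next == to;
    }
}
```

Whether a client may close a session that is still starting is left open here; allowing that would mean accepting `Closing` from the startup states as well.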
### Worker Client
Gateway-side worker client owns:
- pipe stream,
- read loop,
- write loop,
- pending command dictionary,
- event channel,
- heartbeat monitor,
- process handle.
It should expose:
```csharp
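// Hypothetical interface name; the design fixes only these members.
public interface IWorkerClient
{
    Task<WorkerCommandReply> InvokeAsync(WorkerCommand command, CancellationToken ct);
    IAsyncEnumerable<WorkerEvent> ReadEventsAsync(CancellationToken ct);
    Task ShutdownAsync(TimeSpan timeout);
    void Kill();
}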
```
### gRPC Layer
The gRPC layer should be thin:
- validate request,
- find session,
- call session worker client,
- map worker reply to public reply,
- stream events from session event channel.
Avoid embedding MXAccess-specific business logic in gRPC handlers. Keep the
translation code testable.
## C# Worker Versus C++ Worker
Start with a C# .NET Framework 4.8 x86 worker.
Reasons:
- fastest implementation path,
- easiest COM interop/event sink work,
- straightforward named-pipe/protobuf implementation,
- easier logging and diagnostics,
- easier parity iteration.
C++/CLI or native C++ remains an escape hatch if C# COM interop proves
insufficient. The pipe protocol should be language-neutral so a future C++
worker can replace the C# worker without changing gateway or clients.
Use C++ only if evidence shows:
- C# event sinks cannot reliably pump MXAccess events,
- COM `VARIANT`/`SAFEARRAY` conversion loses required data,
- throughput is bottlenecked by .NET COM marshaling,
- MXAccess requires ATL-style connection point behavior not reproducible from
C#.
## Compatibility Baseline
The gateway should preserve direct MXAccess behavior, including surprising cases.
Known important parity areas from existing captures:
- `WriteSecured` may fail before a value-bearing NMX body is emitted.
- `WriteSecured2` can succeed in observed native paths.
- `OperationComplete` is distinct from write completion.
- `OnBufferedDataChange` has a distinct public event shape.
- Invalid handles and cross-server handles have specific exception/status
behavior.
- STA message pumping is required for event delivery.
The gateway should not "fix" these behaviors unless the client explicitly opts
into a non-parity mode.
## Future Backend Routing
After the MXAccess-backed gateway is stable, it can optionally support other
backends behind the same public contract:
- `MxAsbClient` for high-volume basic read/write where poll-based subscription
semantics are acceptable or proven equivalent for a workload,
- managed NMX for native callback experiments and eventual MXAccess-free
replacement work,
- direct MXAccess worker as the default parity backend.
Routing must be explicit and observable:
- event/reply includes backend name,
- tests assert backend choice,
- no silent fallback that changes semantics.
Initial production mode should be:
```text
backend = mxaccess-worker
```
## Open Questions
- Exact installed MXAccess COM ProgID/class used by production should be pinned
from the existing trace harness.
- Whether one gRPC client connection maps to one session or whether sessions can
survive client reconnects.
- Whether event streams can have multiple subscribers per session.
- Required authentication model for remote clients.
- Whether worker process identity should be the gateway identity or a restricted
service account.
- Maximum supported event rate before coalescing is required.
- Whether command batching is needed for high-volume tag registration.
## Recommended Next Step
Build the smallest end-to-end slice:
1. .NET 10 gateway starts.
2. Client calls `OpenSession`.
3. Gateway launches .NET Framework 4.8 x86 worker.
4. Worker creates STA and MXAccess COM object.
5. Client calls `Register`.
6. Client calls `AddItem`.
7. Client calls `Advise`.
8. Worker forwards one `OnDataChange` event to the gateway.
9. Gateway streams the event to the client.
10. Client calls `CloseSession`.
11. Gateway shuts down the worker.
That slice proves the architecture's hardest requirements: process isolation,
STA ownership, message pumping, command routing, and event streaming.