mxaccessgw/gateway.md
2026-04-26 16:45:42 -04:00


MXAccess Gateway Design

Goal

Provide full MXAccess parity to modern clients without forcing those clients to load MXAccess COM, run as x86, or own an STA message pump.

The gateway must preserve MXAccess behavior first:

  • public MXAccess command semantics,
  • native MXAccess event families,
  • STA/message-pump delivery behavior,
  • installed-provider quirks,
  • HRESULT/status/value marshaling,
  • per-client isolation.

MxAsbClient and the managed NMX client remain useful future acceleration paths, but they should not define the parity contract. The installed MXAccess COM component is the compatibility baseline.

Architecture

Use a .NET 10 C# gateway for external clients and per-session .NET Framework 4.8 x86 C# worker processes for MXAccess.

client
  -> gRPC over TCP
    -> .NET 10 x64 gateway
      -> session manager
        -> per-session .NET Framework 4.8 x86 worker process
          -> dedicated STA thread
            -> MXAccess COM instance
            -> Windows/COM message pump
            -> command queue
            -> event sink

The worker does not host gRPC. The gateway talks to workers through a small local IPC protocol. Named pipes with protobuf-framed messages are the default transport.

Detailed follow-up docs:

  • docs/gateway-process-design.md covers the .NET 10 gateway process, session manager, worker supervision, gRPC API, event streaming, fault model, security, observability, and test strategy.
  • docs/WorkerFrameProtocol.md covers the gateway-side named-pipe frame reader/writer and WorkerEnvelope validation rules.
  • docs/WorkerProcessLauncher.md covers worker executable validation, process launch arguments, nonce handling, and startup cleanup behavior.
  • docs/mxaccess-worker-instance-design.md covers each .NET Framework 4.8 x86 MXAccess worker instance, including STA ownership, message pumping, COM lifetime, command dispatch, event sinks, conversion, and shutdown.
  • docs/design-decisions.md records current v1 choices, including API-key authentication in gateway-owned SQLite and the concrete installed MXAccess COM class details from C:\Users\dohertj2\Desktop\mxaccess.
  • docs/gateway-dashboard-design.md covers the Blazor Server and Bootstrap dashboard for live gateway/session/worker status.
  • docs/client-libraries-design.md covers shared design requirements for official gRPC client libraries, test CLIs, and tests for .NET C#, Go, Rust, Python, and Java.
  • docs/implementation-plan-index.md links the detailed implementation plans and recommended Gitea milestones/issues.

Implementation style guides:

  • StyleGuide.md covers project documentation.
  • docs/style-guides/CSharpStyleGuide.md covers gateway, worker, .NET client, and C# tests.
  • docs/style-guides/ProtobufStyleGuide.md covers public gRPC and worker IPC contracts.
  • docs/style-guides/GoStyleGuide.md covers the Go client.
  • docs/style-guides/RustStyleGuide.md covers the Rust client.
  • docs/style-guides/PythonStyleGuide.md covers the Python client.
  • docs/style-guides/JavaStyleGuide.md covers the Java client.

Process Split

Gateway Process

Runtime:

  • .NET 10
  • C#
  • x64 preferred
  • ASP.NET Core gRPC server

Responsibilities:

  • expose the public TCP/gRPC API,
  • authenticate/authorize remote clients if needed,
  • create one worker per client session,
  • route commands to the owning worker,
  • stream worker events to the owning client,
  • enforce session leases, heartbeats, timeouts, and quotas,
  • kill/restart workers when they hang or crash,
  • collect metrics and structured logs,
  • optionally route selected future operations to ASB or managed NMX only after parity tests prove equivalent behavior.

The gateway must never instantiate or call MXAccess directly.

The gateway observability foundation lives in MxGateway.Server.Diagnostics and MxGateway.Server.Metrics. Structured logging scopes carry session, worker, correlation, command, and client identity fields with redaction applied before values enter log state. GatewayMetrics exposes counters, gauges, and histograms through .NET Meter and a snapshot API that dashboard services can project without binding to a metrics exporter.

Worker Process

Runtime:

  • .NET Framework 4.8
  • C#
  • x86 build by default

Responsibilities:

  • own one MXAccess COM instance,
  • create and preserve one dedicated STA thread,
  • pump Windows/COM messages on that STA thread,
  • execute every MXAccess method call on that STA thread,
  • subscribe to MXAccess COM events,
  • convert command results and events into internal protobuf DTOs,
  • send events back to the gateway over the worker pipe,
  • shut down cleanly on request,
  • terminate quickly when the gateway kills the process.

The worker should be disposable. If MXAccess leaks state, faults, or wedges the STA, the gateway can kill the process without corrupting other clients.

Why Not gRPC In The Worker

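ASP.NET Core gRPC, the server stack the gateway uses, does not run on .NET Framework 4.8; the remaining option there is the native Grpc.Core package, which is in maintenance mode. For the worker, a custom local protocol is simpler and more predictable: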

  • named pipes are Windows-native,
  • no HTTP/2 requirement,
  • fewer dependencies in the x86 process,
  • easier process lifetime control,
  • easier framed binary protocol,
  • sufficient throughput for command and event traffic.

The public API can still be modern gRPC because the gateway runs on .NET 10.

Worker IPC

Default transport: one bidirectional named pipe per worker.

Pipe name:

mxaccess-gateway-{gatewayProcessId}-{sessionId}

Message framing:

uint32 little-endian payload_length
payload_length bytes protobuf WorkerEnvelope
uint32 little-endian payload_length
payload_length bytes protobuf WorkerEnvelope
...
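
The framing above can be exercised without a real pipe. The following is a minimal sketch in Python (the gateway and worker themselves are C#), assuming the payload is an already-serialized WorkerEnvelope. The names `write_frame`, `read_frame`, and `MAX_FRAME_BYTES` are illustrative, and the 16 MiB cap is an assumption rather than a documented limit.

```python
import io
import struct

# Hypothetical upper bound; the design does not state a frame size limit.
MAX_FRAME_BYTES = 16 * 1024 * 1024


def write_frame(stream, payload: bytes) -> None:
    """Write one frame: uint32 little-endian length, then the payload bytes."""
    if len(payload) > MAX_FRAME_BYTES:
        raise ValueError("payload exceeds frame limit")
    stream.write(struct.pack("<I", len(payload)))
    stream.write(payload)


def _read_exact(stream, count: int) -> bytes:
    """Read exactly count bytes, looping because a read may return fewer."""
    chunks = []
    remaining = count
    while remaining > 0:
        chunk = stream.read(remaining)
        if not chunk:
            raise EOFError("stream closed mid-frame")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def read_frame(stream):
    """Return the next payload, or None on a clean end of stream between frames."""
    first = stream.read(1)
    if not first:
        return None
    header = first + _read_exact(stream, 3)
    (length,) = struct.unpack("<I", header)
    if length > MAX_FRAME_BYTES:
        raise ValueError("declared length exceeds frame limit")
    return _read_exact(stream, length)
```

Rejecting an oversized declared length before allocating the buffer keeps a corrupted or hostile header from forcing a large allocation. A stream that ends inside a frame is reported as an error, while a stream that ends between frames is treated as a normal close.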

The gateway creates the pipe server, starts the worker with the pipe name as an argument, then waits for the worker to connect, complete the hello exchange, and send WorkerReady.

Pipe security:

  • local machine only,
  • ACL restricted to the gateway identity and the launched worker identity,
  • no anonymous access,
  • optionally add a per-session random handshake nonce passed by command line or inherited environment.
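
As an illustration of the pipe name and the optional nonce, the sketch below formats the name from the pattern in this section and generates and checks a per-session nonce. It is written in Python for brevity. The function names and the 32-byte nonce length are assumptions, not part of the design.

```python
import hmac
import secrets


def make_pipe_name(gateway_pid: int, session_id: str) -> str:
    """Format the pipe name using the pattern given in this section."""
    return f"mxaccess-gateway-{gateway_pid}-{session_id}"


def new_nonce() -> str:
    """Return 32 random bytes as hex text (the length is an assumption)."""
    return secrets.token_hex(32)


def nonce_matches(expected: str, presented: str) -> bool:
    """Compare in constant time so timing does not reveal partial matches."""
    return hmac.compare_digest(expected.encode("utf-8"), presented.encode("utf-8"))
```

The gateway would keep the expected nonce in session state and compare it with the value the worker presents during the hello exchange.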

Worker Envelope

Every IPC message uses a common envelope:

message WorkerEnvelope {
  uint32 protocol_version = 1;
  string session_id = 2;
  uint64 sequence = 3;
  uint64 correlation_id = 4;
  oneof body {
    WorkerHello worker_hello = 10;
    GatewayHello gateway_hello = 11;
    WorkerReady worker_ready = 12;
    WorkerCommand command = 20;
    WorkerCommandReply command_reply = 21;
    WorkerEvent event = 22;
    WorkerHeartbeat heartbeat = 23;
    WorkerCancel cancel = 24;
    WorkerShutdown shutdown = 25;
    WorkerFault fault = 26;
  }
}

Rules:

  • sequence is monotonic per sender.
  • correlation_id links commands to replies.
  • Events use their own correlation id or zero.
  • Replies must preserve MXAccess HRESULT/status information even when the command is also represented as a protocol-level failure.
  • Protocol version mismatch fails session creation.
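
A receiver can enforce the version and sequence rules with a small per-sender validator. The sketch below is in Python and makes three assumptions that the rules above do not state: "monotonic" is read as strictly increasing, the supported protocol version is 1, and an envelope for a different session is rejected. `Envelope`, `InboundValidator`, and `ProtocolError` are hypothetical names.

```python
from dataclasses import dataclass

SUPPORTED_PROTOCOL_VERSION = 1  # assumed value for illustration


class ProtocolError(Exception):
    """Raised when an envelope breaks one of the envelope rules."""


@dataclass
class Envelope:
    protocol_version: int
    session_id: str
    sequence: int
    correlation_id: int


class InboundValidator:
    """Checks envelopes received from a single sender."""

    def __init__(self, session_id: str):
        self._session_id = session_id
        self._last_sequence = 0

    def check(self, env: Envelope) -> None:
        if env.protocol_version != SUPPORTED_PROTOCOL_VERSION:
            raise ProtocolError("protocol version mismatch")
        # Assumption: one pipe carries exactly one session.
        if env.session_id != self._session_id:
            raise ProtocolError("envelope belongs to another session")
        # Assumption: monotonic means strictly increasing per sender.
        if env.sequence <= self._last_sequence:
            raise ProtocolError("sequence is not monotonic")
        self._last_sequence = env.sequence
```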

Public gRPC API

The external API should be session-oriented. A bidirectional stream is the best long-term shape because it naturally carries commands, replies, events, heartbeats, and cancellation.

service MxAccessGateway {
  rpc OpenSession(OpenSessionRequest) returns (OpenSessionReply);
  rpc CloseSession(CloseSessionRequest) returns (CloseSessionReply);
  rpc Invoke(MxCommandRequest) returns (MxCommandReply);
  rpc StreamEvents(StreamEventsRequest) returns (stream MxEvent);
  rpc Session(stream ClientMessage) returns (stream ServerMessage);
}

Recommended rollout:

  1. Implement unary OpenSession, CloseSession, and Invoke.
  2. Implement server-streaming StreamEvents.
  3. Add bidirectional Session after the command/event model is stable.

The unary plus event-stream shape is easier to debug initially. The bidirectional stream can later reduce per-command overhead and improve backpressure.

Public MXAccess Command Surface

The gateway contract should mirror MXAccess concepts without leaking COM types. Keep handles and statuses explicit.

Core commands:

  • Register
  • Unregister
  • AddItem
  • AddItem2
  • RemoveItem
  • Advise
  • UnAdvise
  • AdviseSupervisory
  • AddBufferedItem
  • SetBufferedUpdateInterval
  • Suspend
  • Activate
  • Write
  • Write2
  • WriteSecured
  • WriteSecured2
  • AuthenticateUser
  • ArchestrAUserToId

Optional diagnostics:

  • Ping
  • GetSessionState
  • GetWorkerInfo
  • DrainEvents
  • ShutdownWorker

Do not compress MXAccess semantics into generic verbs too early. A command enum with method-specific payloads is easier to test for parity.

Event Surface

The gateway must represent every public MXAccess event family:

  • OnDataChange
  • OnWriteComplete
  • OperationComplete
  • OnBufferedDataChange

The event DTO should include:

  • event family,
  • session id,
  • server handle,
  • item handle,
  • value when present,
  • quality when present,
  • timestamp when present,
  • MXSTATUS_PROXY[] equivalent,
  • raw HRESULT/status fields when available,
  • event ordering sequence,
  • worker timestamp,
  • gateway receive timestamp.

Keep event order stable per worker. The gateway should not reorder events from the same MXAccess instance.

Value Model

Use a protobuf value union that can represent COM VARIANT values and arrays.

message MxValue {
  oneof kind {
    bool bool_value = 1;
    int32 int32_value = 2;
    int64 int64_value = 3;
    float float_value = 4;
    double double_value = 5;
    string string_value = 6;
    Timestamp timestamp_value = 7;
    MxArray array_value = 8;
    bytes raw_variant = 100;
  }
}

Array support should include at least:

  • bool array,
  • int32 array,
  • float array,
  • double array,
  • string array,
  • timestamp array,
  • raw fallback.

For full parity, unknown or awkward COM values should be preserved as raw metadata rather than dropped. If a value cannot be losslessly converted, the worker should return both the best typed projection and enough diagnostic metadata to reproduce the case.
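
The conversion order matters: narrower typed cases are tried first, and anything unrecognized falls through to the raw case instead of being dropped. The Python sketch below illustrates that shape only. The real worker converts COM VARIANT values in C#, the `repr`-based raw encoding is a placeholder, and the array case is simplified to a list of tagged elements rather than the typed arrays listed above.

```python
from datetime import datetime, timezone

INT32_MIN, INT32_MAX = -(2**31), 2**31 - 1
INT64_MIN, INT64_MAX = -(2**63), 2**63 - 1


def _raw(value):
    # Placeholder raw encoding; a real worker would keep VARIANT bytes and metadata.
    return ("raw_variant", repr(value).encode("utf-8"))


def project_value(value):
    """Return (kind, payload), where kind is an MxValue oneof field name."""
    # bool is tested before int because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return ("bool_value", value)
    if isinstance(value, int):
        if INT32_MIN <= value <= INT32_MAX:
            return ("int32_value", value)
        if INT64_MIN <= value <= INT64_MAX:
            return ("int64_value", value)
        return _raw(value)
    if isinstance(value, float):
        return ("double_value", value)
    if isinstance(value, str):
        return ("string_value", value)
    if isinstance(value, datetime) and value.tzinfo is not None:
        return ("timestamp_value", value.astimezone(timezone.utc))
    if isinstance(value, (list, tuple)):
        return ("array_value", [project_value(v) for v in value])
    # Unknown or awkward shapes are preserved as raw data rather than dropped.
    return _raw(value)
```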

Status Model

Represent MXSTATUS_PROXY explicitly:

message MxStatusProxy {
  int32 success = 1;
  uint32 category = 2;
  uint32 detail = 3;
  uint32 source = 4;
  uint32 raw_hresult = 5;
  string text = 6;
}

The exact field names should be adjusted to match the actual interop struct, but the design principle is important: do not collapse status arrays into a single success flag.

For command replies, return:

  • protocol status,
  • COM HRESULT if available,
  • MXAccess return value if the method has one,
  • method-specific out parameters,
  • status array if the method emits one.
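
Because `raw_hresult` is declared as `uint32` while .NET surfaces HRESULT values as signed 32-bit integers, the conversion deserves one explicit step. The Python sketch below shows that step and the standard HRESULT layout (severity in bit 31, facility in bits 16 to 26, code in the low 16 bits). The function names are illustrative.

```python
def hresult_to_uint32(hresult: int) -> int:
    """Convert a signed 32-bit HRESULT to its raw unsigned bit pattern."""
    return hresult & 0xFFFFFFFF


def split_hresult(raw: int):
    """Split raw HRESULT bits into (is_failure, facility, code)."""
    is_failure = bool(raw & 0x80000000)  # severity bit 31
    facility = (raw >> 16) & 0x7FF       # facility field, bits 16..26
    code = raw & 0xFFFF                  # low 16 bits
    return is_failure, facility, code
```

For example, E_FAIL arrives from .NET as -2147467259 and maps to 0x80004005, which splits into a failure with facility 0 and code 0x4005.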

STA Worker Thread Model

Each worker owns:

  • one process,
  • one MXAccess session,
  • one dedicated STA thread,
  • one MXAccess COM object,
  • one inbound command queue,
  • one outbound event queue.

All MXAccess operations run on the STA:

pipe reader thread
  -> parse WorkerCommand
  -> enqueue StaCommand
  -> await task completion
  -> enqueue WorkerCommandReply for the pipe writer thread

STA thread
  -> CoInitializeEx(COINIT_APARTMENTTHREADED)
  -> create MXAccess COM object
  -> wire events
  -> run message pump
  -> execute queued commands between message dispatches

MXAccess event handler on STA
  -> convert event args to WorkerEvent
  -> enqueue outbound event

pipe writer thread
  -> dequeue replies/events
  -> write framed protobuf messages

Do not block the STA on pipe writes or gRPC calls. The STA should enqueue results/events and return to pumping messages.
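
The threading rule (every call runs on one dedicated thread, and callers only enqueue work and wait on a result) can be sketched independently of COM. The Python example below is a simplified model, not the worker implementation. It uses a timed queue wait where the real STA would wait with MsgWaitForMultipleObjectsEx, and the `pump` callback is a stand-in for the Windows message dispatch. All names are hypothetical.

```python
import queue
import threading
from concurrent.futures import Future


class DedicatedThreadDispatcher:
    """Runs every submitted call on one long-lived thread."""

    def __init__(self, pump=lambda: None, wait_seconds=0.05):
        self._commands = queue.Queue()
        self._pump = pump              # stand-in for message pumping
        self._wait = wait_seconds
        self._shutdown = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def submit(self, fn, *args) -> Future:
        """Queue a call; the returned future completes on the dedicated thread."""
        future = Future()
        self._commands.put((future, fn, args))
        return future

    def _run(self):
        while not self._shutdown.is_set():
            try:
                future, fn, args = self._commands.get(timeout=self._wait)
            except queue.Empty:
                self._pump()           # keep pumping while idle
                continue
            try:
                future.set_result(fn(*args))
            except Exception as exc:   # report the failure to the caller
                future.set_exception(exc)
            self._pump()               # pump between commands

    def close(self):
        """Stop the loop; commands still queued are left incomplete."""
        self._shutdown.set()
        self._thread.join()
```

Callers never touch the owned object directly, so the thread affinity rule holds by construction. The dispatcher thread only sets futures and returns to its loop, which mirrors the requirement that the STA must not block on pipe writes.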

Message Pump

The STA loop must pump Windows messages and service command work. A typical shape:

while not shutdown:
  while command queue has work:
    execute one command on STA

  # Returns when command_event is signaled, a message arrives, or the timeout elapses.
  MsgWaitForMultipleObjectsEx(
    1, [command_event],
    timeout,
    QS_ALLINPUT,
    MWMO_INPUTAVAILABLE)

  while PeekMessage(PM_REMOVE):
    TranslateMessage
    DispatchMessage

This is the critical piece for MXAccess event delivery. A plain blocking queue on an STA thread is not enough if it prevents COM/window messages from being pumped.

COM Lifetime

Worker startup:

  1. set apartment state to STA,
  2. initialize COM on the STA,
  3. instantiate LMXProxyServerClass or the installed MXAccess interop class,
  4. attach event handlers,
  5. send WorkerReady.

Worker shutdown:

  1. reject new commands,
  2. optionally send UnAdvise/RemoveItem/Unregister for active handles,
  3. detach event handlers,
  4. release COM object until reference count reaches zero,
  5. uninitialize COM,
  6. exit process.

If graceful shutdown exceeds timeout, the gateway kills the worker.

Session Model

One external client session maps to one worker process by default.

Session state in the gateway:

  • session id,
  • client identity,
  • worker process id,
  • pipe name,
  • pipe connection,
  • open time,
  • last heartbeat,
  • active stream subscribers,
  • command timeout policy,
  • event queue metrics.

Session state in the worker:

  • MXAccess COM object,
  • registered server handles,
  • item handles,
  • item definitions/context,
  • advise state,
  • buffered state,
  • authenticated user ids if needed,
  • event sequence number.

The gateway should treat worker state as authoritative for MXAccess handles. It can keep a shadow state for diagnostics and cleanup, but should not invent handles.

Command Execution

Every command should follow the same lifecycle:

client sends gRPC command
gateway validates session and payload
gateway assigns correlation id
gateway writes WorkerCommand to pipe
worker pipe reader enqueues command to STA
STA executes MXAccess method
worker captures return value/out params/status/HRESULT
worker sends WorkerCommandReply
gateway completes gRPC response

Timeouts:

  • gateway command timeout bounds client waiting,
  • worker command timeout marks the command as stuck,
  • if the STA does not recover after a configurable grace period, kill the worker and fail the session.

Cancellation:

  • canceling the gRPC call should stop waiting in the gateway,
  • it cannot safely abort an in-flight COM call on the STA,
  • the worker should finish the COM call and discard or log the late reply if the correlation was canceled,
  • hard cancellation means killing the worker process.
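
On the gateway side, the timeout and cancellation rules reduce to a table of pending correlation ids. The sketch below uses Python asyncio and hypothetical names to show the behavior described above: the caller stops waiting on timeout, and a reply that arrives afterwards is counted and discarded rather than delivered.

```python
import asyncio


class PendingCommands:
    """Tracks in-flight commands by correlation id."""

    def __init__(self):
        self._next_id = 1
        self._pending = {}
        self.late_replies = 0

    def begin(self):
        """Allocate a correlation id and a future for its reply."""
        correlation_id = self._next_id
        self._next_id += 1
        future = asyncio.get_running_loop().create_future()
        self._pending[correlation_id] = future
        return correlation_id, future

    def complete(self, correlation_id, reply) -> bool:
        """Deliver a reply; return False if nobody is waiting any more."""
        future = self._pending.pop(correlation_id, None)
        if future is None or future.done():
            self.late_replies += 1     # late reply: log or discard
            return False
        future.set_result(reply)
        return True

    async def wait(self, correlation_id, future, timeout):
        """Wait for the reply; the worker keeps running the call on timeout."""
        try:
            return await asyncio.wait_for(future, timeout)
        finally:
            self._pending.pop(correlation_id, None)
```

Note that a timeout here only ends the wait in the gateway. The command may still be executing on the STA, which is why a stuck worker is handled by killing the process rather than by cancelling the call.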

Event Delivery And Backpressure

Events flow from worker to gateway, then gateway to client streams.

Worker policy:

  • bounded outbound event channel,
  • never block MXAccess event handler on pipe writes,
  • if the outbound channel is full, apply configured policy:
    • disconnect session,
    • drop oldest low-priority data-change events,
    • coalesce data changes by item handle,
    • or block briefly then fault.

For full parity testing, the default should be fail-fast rather than a silent drop. For production high-rate telemetry, add explicit coalescing modes.
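
Two of the overflow policies can be sketched as small buffers. The Python example below is illustrative only: `FailFastBuffer` raises when the bounded channel is full, and `CoalescingBuffer` keeps just the newest data change per item handle. The class names are hypothetical, and moving a coalesced handle to the back of the order is one possible choice that the design does not specify.

```python
from collections import OrderedDict, deque


class EventQueueOverflow(Exception):
    """Raised in fail-fast mode; the caller would fault the session."""


class FailFastBuffer:
    """Bounded FIFO that refuses new events when full."""

    def __init__(self, capacity: int):
        self._capacity = capacity
        self._events = deque()

    def offer(self, event) -> None:
        if len(self._events) >= self._capacity:
            raise EventQueueOverflow("outbound event channel is full")
        self._events.append(event)

    def take(self):
        return self._events.popleft() if self._events else None


class CoalescingBuffer:
    """Keeps only the newest data change per item handle."""

    def __init__(self):
        self._latest = OrderedDict()

    def offer(self, item_handle, event) -> None:
        # Re-inserting moves the handle to the back of the delivery order.
        self._latest.pop(item_handle, None)
        self._latest[item_handle] = event

    def take(self):
        if not self._latest:
            return None
        return self._latest.popitem(last=False)
```

Coalescing changes what the client observes, so it belongs behind an explicit mode rather than in the parity default.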

Gateway policy:

  • one event sequencer per session,
  • preserve per-session event order,
  • support multiple client event subscribers only if explicitly required,
  • apply backpressure from slow gRPC streams,
  • disconnect or coalesce according to client-selected mode.

Isolation And Fault Handling

Failure cases:

  • worker fails startup,
  • worker pipe disconnects,
  • worker heartbeat expires,
  • worker process exits,
  • STA command times out,
  • MXAccess COM throws,
  • MXAccess event handler throws,
  • client disconnects,
  • gateway shuts down.

Policy:

  • worker startup failure fails OpenSession,
  • worker crash emits terminal session fault to client,
  • command exceptions return structured command fault with HRESULT if known,
  • stale sessions are closed by lease timeout,
  • stuck workers are killed by process id,
  • gateway restart should not attempt to reattach old workers unless reattachment is explicitly designed; the first version should terminate orphaned workers on startup.

Because each client session owns its own worker process, a crash or leak affects only that session.

Security

External gateway:

  • use TLS for remote gRPC if crossing machine boundaries,
  • authenticate clients with Windows auth, mTLS, or a deployment-specific token,
  • authorize access to commands that can write, authenticate users, or alter runtime state.

Internal worker IPC:

  • local named pipes only,
  • restrictive pipe ACL,
  • per-session nonce handshake,
  • worker validates gateway hello before creating MXAccess,
  • gateway validates worker executable path and version,
  • no secrets in command line when avoidable.

Credential-sensitive commands such as AuthenticateUser and WriteSecured must not log passwords or raw credential values.
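
One way to apply this rule is to redact by command name before any field reaches the log state. The Python sketch below is an assumption-heavy illustration: the field names (`password`, `credential`, `value`) are hypothetical, and including WriteSecured2 in the sensitive set is an assumption based on its similarity to WriteSecured.

```python
# Command names come from the command surface; WriteSecured2 is assumed sensitive.
SENSITIVE_COMMANDS = {"AuthenticateUser", "WriteSecured", "WriteSecured2"}

# Hypothetical payload field names used only for this example.
SENSITIVE_FIELDS = {"password", "credential", "value"}

REDACTED = "[redacted]"


def redact_for_log(command_name: str, fields: dict) -> dict:
    """Return a copy of fields that is safe to attach to a log scope."""
    if command_name not in SENSITIVE_COMMANDS:
        return dict(fields)
    return {
        key: (REDACTED if key.lower() in SENSITIVE_FIELDS else value)
        for key, value in fields.items()
    }
```

Returning a new dictionary keeps the original payload intact for execution while ensuring the logged copy never holds the secret.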

Observability

Gateway metrics:

  • sessions open,
  • workers running,
  • worker start latency,
  • command latency by method,
  • command failures by method/status,
  • event rate by session/event type,
  • event queue depth,
  • worker memory/CPU,
  • worker restarts/kills,
  • gRPC stream disconnects.

Worker logs:

  • startup/shutdown,
  • MXAccess COM creation result,
  • command start/end with correlation id,
  • HRESULT/status summary,
  • event family and sequence number,
  • queue overflow,
  • STA watchdog warnings.

Do not log full values by default. Make value logging opt-in and redacted where credentials or secured writes are involved.

Performance Strategy

First priority is parity. Performance comes from process isolation, batching, and avoiding unnecessary cross-process round trips.

Baseline choices:

  • long-lived worker per session,
  • persistent pipe,
  • protobuf binary framing,
  • no gRPC inside worker,
  • no COM calls outside STA,
  • event streaming rather than event polling.

Optimizations after parity:

  • batch commands where MXAccess semantics allow,
  • batch events from worker to gateway while preserving order,
  • optional data-change coalescing by item handle,
  • memory-mapped payload slabs for very large arrays,
  • shared schema for typed values to avoid raw COM marshaling at the gateway,
  • gateway-side route to MxAsbClient for proven high-volume read/write workloads only when caller opts into non-MXAccess-backed behavior or parity tests prove equivalence.

Project Layout

Suggested additions:

src/MxGateway.Contracts/
  Protos/
    mxaccess_gateway.proto
    mxaccess_worker.proto
  Generated/

src/MxGateway.Server/
  Program.cs
  Sessions/
  Workers/
  Grpc/
  Metrics/

src/MxGateway.Worker/
  Program.cs
  Ipc/
  Sta/
  MxAccess/
  Conversion/

src/MxGateway.Tests/
  contract tests
  gateway session tests
  fake worker tests

src/MxGateway.Worker.Tests/
  value/status conversion tests
  STA queue tests

src/MxGateway.IntegrationTests/
  optional live MXAccess tests

Build outputs:

  • gateway: .NET 10 x64,
  • worker: .NET Framework 4.8 x86.

The contracts project can multi-target if needed, or the .proto files can be shared as source inputs to both gateway and worker builds.

Worker Implementation Plan

Phase 1: Minimal Worker Harness

  • Create .NET Framework 4.8 x86 worker executable.
  • Parse pipe name/session id/nonce args.
  • Connect to gateway named pipe.
  • Exchange hello/ready messages.
  • Start STA thread.
  • Create MXAccess COM object on STA.
  • Pump messages.
  • Shut down cleanly.

Exit criteria:

  • gateway can spawn worker,
  • worker reports ready,
  • worker exits on shutdown command,
  • STA remains responsive.

Phase 2: Command Queue

  • Add command DTOs for Register, Unregister, AddItem, RemoveItem.
  • Implement STA command dispatch.
  • Return method result, HRESULT, and structured fault.
  • Add command timeout handling in gateway.

Exit criteria:

  • client can open a session and perform basic handle lifecycle through gRPC.

Phase 3: Event Stream

  • Wire MXAccess events in the worker.
  • Convert OnDataChange, OnWriteComplete, OperationComplete, and OnBufferedDataChange to protobuf events.
  • Add event sequence numbers.
  • Add gateway StreamEvents.

Exit criteria:

  • advised item changes reach a .NET 10 client without the client owning an STA.

Phase 4: Full Command Surface

Add remaining MXAccess methods:

  • Advise
  • UnAdvise
  • AdviseSupervisory
  • AddItem2
  • AddBufferedItem
  • SetBufferedUpdateInterval
  • Suspend
  • Activate
  • Write
  • Write2
  • WriteSecured
  • WriteSecured2
  • AuthenticateUser
  • ArchestrAUserToId

Exit criteria:

  • gRPC command surface covers the installed MXAccess public method set.

Phase 5: Parity Harness

  • Reuse existing MXAccess trace harness scenarios.
  • Run each scenario against direct MXAccess and against the gateway.
  • Compare:
    • return values,
    • HRESULTs/exceptions,
    • event sequence,
    • value projection,
    • quality/status arrays,
    • invalid handle behavior,
    • cleanup behavior.
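
The comparison step can be sketched as a field-by-field diff of two recorded traces. The Python example below assumes each trace is a list of dictionaries, one per scenario step. That record shape and the field names are hypothetical, since the existing trace harness format is not described here.

```python
# Hypothetical field names for one recorded scenario step.
DEFAULT_FIELDS = ("return_value", "hresult", "events", "statuses")


def compare_traces(direct, gateway, fields=DEFAULT_FIELDS):
    """Return a list of (step_index, field, direct_value, gateway_value) mismatches."""
    diffs = []
    if len(direct) != len(gateway):
        diffs.append((None, "step_count", len(direct), len(gateway)))
    for index, (d, g) in enumerate(zip(direct, gateway)):
        for field in fields:
            if d.get(field) != g.get(field):
                diffs.append((index, field, d.get(field), g.get(field)))
    return diffs
```

An empty result for a scenario is the evidence that would back one cell of the parity matrix.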

Exit criteria:

  • documented parity matrix for all public methods and event families.

Phase 6: Hardening

  • Worker watchdog.
  • Heartbeats.
  • Process kill/restart.
  • Bounded queues.
  • Backpressure policy.
  • TLS/auth on public gateway.
  • Metrics.
  • Structured logging.
  • Installer/service packaging.

Exit criteria:

  • gateway can run as a Windows service and recover from worker crashes.

Gateway Implementation Plan

Session Manager

Core operations:

  • allocate session id,
  • choose worker executable,
  • create pipe name and nonce,
  • start worker process,
  • accept pipe connection,
  • verify worker hello,
  • track worker state,
  • close or kill worker.

State machine:

Creating
  -> StartingWorker
  -> WaitingForPipe
  -> InitializingWorker
  -> Ready
  -> Closing
  -> Closed
  -> Faulted
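
The state list can be turned into an explicit transition check. The Python sketch below rests on two assumptions that the diagram does not spell out: Faulted is reachable from every non-terminal state, and otherwise a session only moves to the next state in the listed order. `SessionState` and `can_transition` are hypothetical names.

```python
from enum import Enum


class SessionState(Enum):
    CREATING = "Creating"
    STARTING_WORKER = "StartingWorker"
    WAITING_FOR_PIPE = "WaitingForPipe"
    INITIALIZING_WORKER = "InitializingWorker"
    READY = "Ready"
    CLOSING = "Closing"
    CLOSED = "Closed"
    FAULTED = "Faulted"


# Normal path in the order listed above.
_ORDER = [
    SessionState.CREATING,
    SessionState.STARTING_WORKER,
    SessionState.WAITING_FOR_PIPE,
    SessionState.INITIALIZING_WORKER,
    SessionState.READY,
    SessionState.CLOSING,
    SessionState.CLOSED,
]
_TERMINAL = {SessionState.CLOSED, SessionState.FAULTED}


def can_transition(current: SessionState, target: SessionState) -> bool:
    """Return True if the move is allowed under the assumptions stated above."""
    if current in _TERMINAL:
        return False
    if target is SessionState.FAULTED:
        return True  # assumption: any live state may fault
    return _ORDER.index(target) == _ORDER.index(current) + 1
```

Centralizing the check makes illegal moves, such as leaving Closed, fail loudly in tests instead of silently corrupting session state.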

Worker Client

Gateway-side worker client owns:

  • pipe stream,
  • read loop,
  • write loop,
  • pending command dictionary,
  • event channel,
  • heartbeat monitor,
  • process handle.

It should expose:

Task<WorkerCommandReply> InvokeAsync(WorkerCommand command, CancellationToken ct);
IAsyncEnumerable<WorkerEvent> ReadEventsAsync(CancellationToken ct);
Task ShutdownAsync(TimeSpan timeout);
void Kill();

gRPC Layer

The gRPC layer should be thin:

  • validate request,
  • find session,
  • call session worker client,
  • map worker reply to public reply,
  • stream events from session event channel.

Avoid embedding MXAccess-specific business logic in gRPC handlers. Keep the translation code testable.

C# Worker Versus C++ Worker

Start with a C# .NET Framework 4.8 x86 worker.

Reasons:

  • fastest implementation path,
  • easiest COM interop/event sink work,
  • straightforward named-pipe/protobuf implementation,
  • easier logging and diagnostics,
  • easier parity iteration.

C++/CLI or native C++ remains an escape hatch if C# COM interop proves insufficient. The pipe protocol should be language-neutral so a future C++ worker can replace the C# worker without changing gateway or clients.

Use C++ only if evidence shows:

  • C# event sinks cannot reliably pump MXAccess events,
  • COM VARIANT/SAFEARRAY conversion loses required data,
  • throughput is bottlenecked by .NET COM marshaling,
  • MXAccess requires ATL-style connection point behavior not reproducible from C#.

Compatibility Baseline

The proxy should preserve direct MXAccess behavior, including surprising cases.

Known important parity areas from existing captures:

  • WriteSecured may fail before a value-bearing NMX body is emitted.
  • WriteSecured2 can succeed in observed native paths.
  • OperationComplete is distinct from write completion.
  • OnBufferedDataChange has a distinct public event shape.
  • Invalid handles and cross-server handles have specific exception/status behavior.
  • STA message pumping is required for event delivery.

The gateway should not "fix" these behaviors unless the client explicitly opts into a non-parity mode.

Future Backend Routing

After the MXAccess-backed proxy is stable, the gateway can optionally support other backends behind the same public contract:

  • MxAsbClient for high-volume basic read/write where poll-based subscription semantics are acceptable or proven equivalent for a workload,
  • managed NMX for native callback experiments and eventual MXAccess-free replacement work,
  • direct MXAccess worker as the default parity backend.

Routing must be explicit and observable:

  • event/reply includes backend name,
  • tests assert backend choice,
  • no silent fallback that changes semantics.

Initial production mode should be:

backend = mxaccess-worker

Open Questions

Current v1 decisions are recorded in docs/design-decisions.md.

Resolved for v1:

  • MXAccess COM target is ArchestrA.MxAccess.LMXProxyServerClass / LMXProxy.LMXProxyServer.1 from the installed 32-bit LmxProxy.dll.
  • One OpenSession maps to one worker process; no reconnectable sessions.
  • One active event subscriber per session.
  • API key authentication with hashed keys in gateway-owned SQLite.
  • Basic Blazor Server dashboard with Bootstrap CSS/JS and real-time updates.
  • Workers run as the gateway service identity.
  • Event backpressure is fail-fast with bounded queues.
  • No public command batching.
  • OperationComplete is forwarded only when native MXAccess raises it.
  • OnBufferedDataChange is modeled now; multi-sample payload conversion remains capture-validated work.

Post-v1 revisit items:

  • production event-rate target and optional coalescing,
  • reconnectable sessions,
  • multi-subscriber event fan-out,
  • restricted worker process identity,
  • command batching for high-volume setup.

Next, build the smallest end-to-end slice:

  1. .NET 10 gateway starts.
  2. Client calls OpenSession.
  3. Gateway launches .NET Framework 4.8 x86 worker.
  4. Worker creates STA and MXAccess COM object.
  5. Client calls Register.
  6. Client calls AddItem.
  7. Client calls Advise.
  8. Worker forwards one OnDataChange event to the gateway.
  9. Gateway streams the event to the client.
  10. Client calls CloseSession.
  11. Gateway shuts down the worker.

That slice proves the architecture's hardest requirements: process isolation, STA ownership, message pumping, command routing, and event streaming.