# MXAccess Gateway Design

## Goal

Provide full MXAccess parity to modern clients without forcing those clients to
load MXAccess COM, run as x86, or own an STA message pump.

The gateway must preserve MXAccess behavior first:

- public MXAccess command semantics,
- native MXAccess event families,
- STA/message-pump delivery behavior,
- installed-provider quirks,
- HRESULT/status/value marshaling,
- per-client isolation.

`MxAsbClient` and the managed NMX client remain useful future acceleration
paths, but they should not define the parity contract. The installed MXAccess
COM component is the compatibility baseline.

## Architecture

Use a .NET 10 C# gateway for external clients and per-session .NET Framework
4.8 x86 C# worker processes for MXAccess.

```text
client
  -> gRPC over TCP
  -> .NET 10 x64 gateway
       -> session manager
       -> per-session .NET Framework 4.8 x86 worker process
            -> dedicated STA thread
            -> MXAccess COM instance
            -> Windows/COM message pump
            -> command queue
            -> event sink
```

The worker does not host gRPC. The gateway talks to workers through a small
local IPC protocol. Named pipes with protobuf-framed messages are the default
transport.

Detailed follow-up docs:

- `docs/gateway-process-design.md` covers the .NET 10 gateway process,
  session manager, worker supervision, gRPC API, event streaming, fault model,
  security, observability, and test strategy.
- `docs/WorkerFrameProtocol.md` covers the gateway-side named-pipe frame
  reader/writer and `WorkerEnvelope` validation rules.
- `docs/WorkerProcessLauncher.md` covers worker executable validation, process
  launch arguments, nonce handling, and startup cleanup behavior.
- `docs/mxaccess-worker-instance-design.md` covers each .NET Framework 4.8 x86
  MXAccess worker instance, including STA ownership, message pumping, COM
  lifetime, command dispatch, event sinks, conversion, and shutdown.
- `docs/design-decisions.md` records current v1 choices, including API-key
  authentication in gateway-owned SQLite and the concrete installed MXAccess
  COM class details from `C:\Users\dohertj2\Desktop\mxaccess`.
- `docs/gateway-dashboard-design.md` covers the Blazor Server and Bootstrap
  dashboard for live gateway/session/worker status.
- `docs/client-libraries-design.md` covers shared design requirements for
  official gRPC client libraries, test CLIs, and tests for .NET C#, Go, Rust,
  Python, and Java.
- `docs/implementation-plan-index.md` links the detailed implementation plans
  and recommended Gitea milestones/issues.

Implementation style guides:

- `StyleGuide.md` covers project documentation.
- `docs/style-guides/CSharpStyleGuide.md` covers gateway, worker, .NET client,
  and C# tests.
- `docs/style-guides/ProtobufStyleGuide.md` covers public gRPC and worker IPC
  contracts.
- `docs/style-guides/GoStyleGuide.md` covers the Go client.
- `docs/style-guides/RustStyleGuide.md` covers the Rust client.
- `docs/style-guides/PythonStyleGuide.md` covers the Python client.
- `docs/style-guides/JavaStyleGuide.md` covers the Java client.

## Process Split

### Gateway Process

Runtime:

- .NET 10
- C#
- x64 preferred
- ASP.NET Core gRPC server

Responsibilities:

- expose the public TCP/gRPC API,
- authenticate/authorize remote clients if needed,
- create one worker per client session,
- route commands to the owning worker,
- stream worker events to the owning client,
- enforce session leases, heartbeats, timeouts, and quotas,
- kill/restart workers when they hang or crash,
- collect metrics and structured logs,
- optionally route selected future operations to ASB or managed NMX only after
  parity tests prove equivalent behavior.

The gateway must never instantiate or call MXAccess directly.

The gateway observability foundation lives in `MxGateway.Server.Diagnostics`
and `MxGateway.Server.Metrics`. Structured logging scopes carry session,
worker, correlation, command, and client identity fields with redaction applied
before values enter log state. `GatewayMetrics` exposes counters, gauges, and
histograms through .NET `Meter` and a snapshot API that dashboard services can
project without binding to a metrics exporter.

`DashboardSnapshotService` projects sessions, workers, metrics, faults, and
effective configuration into immutable DTOs for read-only dashboard rendering.

Dashboard routes use the same API-key verifier as gRPC. `/dashboard/login`
accepts the API key in a form body, validates the configured `admin` scope,
and issues an HTTP-only secure cookie for subsequent dashboard requests.
`/dashboard/logout` clears that cookie. Login and logout posts validate
anti-forgery tokens, and API keys are never accepted through query strings.
`Dashboard:AllowAnonymousLocalhost` can bypass the cookie requirement for
loopback requests only when explicitly enabled.

### Worker Process

Runtime:

- .NET Framework 4.8
- C#
- x86 build by default

Responsibilities:

- own one MXAccess COM instance,
- create and preserve one dedicated STA thread,
- pump Windows/COM messages on that STA thread,
- execute every MXAccess method call on that STA thread,
- subscribe to MXAccess COM events,
- convert command results and events into internal protobuf DTOs,
- send events back to the gateway over the worker pipe,
- shut down cleanly on request,
- terminate quickly when the gateway kills the process.

The worker should be disposable. If MXAccess leaks state, faults, or wedges the
STA, the gateway can kill the process without corrupting other clients.

## Why Not gRPC In The Worker

.NET Framework 4.8 does not have the same first-class gRPC stack as .NET 10.
For the worker, a custom local protocol is simpler and more predictable:

- named pipes are Windows-native,
- no HTTP/2 requirement,
- fewer dependencies in the x86 process,
- easier process lifetime control,
- easier framed binary protocol,
- sufficient throughput for command and event traffic.

The public API can still be modern gRPC because the gateway runs on .NET 10.

## Worker IPC

Default transport: one bidirectional named pipe per worker.

Pipe name:

```text
mxaccess-gateway-{gatewayProcessId}-{sessionId}
```

Message framing:

```text
uint32 little-endian payload_length
payload_length bytes protobuf WorkerEnvelope
uint32 little-endian payload_length
payload_length bytes protobuf WorkerEnvelope
...
```

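
The framing above can be sketched in C# as follows. This is a minimal illustration rather than the project's implementation: `FrameCodec`, `WriteFrame`, `ReadFrame`, and the 16 MiB `MaxFrameBytes` cap are hypothetical, and the payload is treated as opaque bytes that would hold a serialized `WorkerEnvelope`.

```csharp
using System;
using System.IO;

// Hypothetical helper: length-prefixed framing over any Stream (for example a named pipe).
public static class FrameCodec
{
    // Assumed safety cap; the real limit is a design choice not stated in this document.
    public const int MaxFrameBytes = 16 * 1024 * 1024;

    public static void WriteFrame(Stream stream, byte[] payload)
    {
        if (payload.Length > MaxFrameBytes) throw new InvalidDataException("Frame too large.");
        uint length = (uint)payload.Length;
        var header = new byte[4];
        // uint32 little-endian payload_length
        header[0] = (byte)length;
        header[1] = (byte)(length >> 8);
        header[2] = (byte)(length >> 16);
        header[3] = (byte)(length >> 24);
        stream.Write(header, 0, 4);
        stream.Write(payload, 0, payload.Length);
        stream.Flush();
    }

    // Returns null on a clean end of stream at a frame boundary.
    public static byte[] ReadFrame(Stream stream)
    {
        var header = new byte[4];
        int got = ReadFully(stream, header, 4);
        if (got == 0) return null;
        if (got < 4) throw new EndOfStreamException("Truncated frame header.");
        uint length = (uint)header[0]
            | ((uint)header[1] << 8)
            | ((uint)header[2] << 16)
            | ((uint)header[3] << 24);
        if (length > MaxFrameBytes) throw new InvalidDataException("Frame too large.");
        var payload = new byte[length];
        if (ReadFully(stream, payload, (int)length) < (int)length)
            throw new EndOfStreamException("Truncated frame payload.");
        return payload;
    }

    // Stream.Read may return fewer bytes than requested, so loop until done or end of stream.
    private static int ReadFully(Stream stream, byte[] buffer, int count)
    {
        int total = 0;
        while (total < count)
        {
            int n = stream.Read(buffer, total, count - total);
            if (n == 0) break;
            total += n;
        }
        return total;
    }
}
```

Checking the declared length before allocating means a corrupt header cannot force a very large allocation in either process.
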
The gateway creates the pipe server, starts the worker with the pipe name as an
argument, then waits for the worker to connect and send `WorkerReady`.

Pipe security:

- local machine only,
- ACL restricted to the gateway identity and the launched worker identity,
- no anonymous access,
- optionally add a per-session random handshake nonce passed by command line or
  inherited environment.

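
As an illustration of the pipe name and the optional handshake nonce, the sketch below builds both values. `WorkerPipeNaming` and its members are hypothetical, the GUID session id format and the 32-byte nonce length are assumptions, and the pipe ACL itself is not shown.

```csharp
using System;
using System.Security.Cryptography;

// Hypothetical helper for per-session pipe names and handshake nonces.
public static class WorkerPipeNaming
{
    // Follows the mxaccess-gateway-{gatewayProcessId}-{sessionId} pattern.
    // Using a GUID in "N" format as the session id is an assumption.
    public static string CreatePipeName(int gatewayProcessId, Guid sessionId)
    {
        return "mxaccess-gateway-" + gatewayProcessId + "-" + sessionId.ToString("N");
    }

    // 32 random bytes from the OS cryptographic generator, Base64-encoded.
    public static string CreateNonce()
    {
        var bytes = new byte[32];
        using (var rng = RandomNumberGenerator.Create())
        {
            rng.GetBytes(bytes);
        }
        return Convert.ToBase64String(bytes);
    }

    // Compares every character so timing does not reveal the first mismatch position.
    public static bool NonceMatches(string expected, string presented)
    {
        if (expected == null || presented == null || expected.Length != presented.Length)
            return false;
        int diff = 0;
        for (int i = 0; i < expected.Length; i++)
        {
            diff |= expected[i] ^ presented[i];
        }
        return diff == 0;
    }
}
```
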
### Worker Envelope

Every IPC message uses a common envelope:

```protobuf
message WorkerEnvelope {
  uint32 protocol_version = 1;
  string session_id = 2;
  uint64 sequence = 3;
  uint64 correlation_id = 4;
  oneof body {
    WorkerHello worker_hello = 10;
    GatewayHello gateway_hello = 11;
    WorkerReady worker_ready = 12;
    WorkerCommand command = 20;
    WorkerCommandReply command_reply = 21;
    WorkerEvent event = 22;
    WorkerHeartbeat heartbeat = 23;
    WorkerCancel cancel = 24;
    WorkerShutdown shutdown = 25;
    WorkerFault fault = 26;
  }
}
```

Rules:

- `sequence` is monotonic per sender.
- `correlation_id` links commands to replies.
- Events use their own correlation id or zero.
- Replies must preserve MXAccess HRESULT/status information even when the
  command is also represented as a protocol-level failure.
- Protocol version mismatch fails session creation.

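
A receiver could enforce the version and sequence rules roughly as shown below. `EnvelopeHeader` is a hypothetical stand-in for the generated `WorkerEnvelope` type, and two details are assumptions rather than stated rules: "monotonic" is read as strictly increasing, and the session id is checked against the expected session.

```csharp
using System;

// Hypothetical, simplified stand-in for the header fields of WorkerEnvelope.
public sealed class EnvelopeHeader
{
    public uint ProtocolVersion;
    public string SessionId;
    public ulong Sequence;
    public ulong CorrelationId;
}

// Validates envelopes from one sender. Not thread-safe; call from the read loop only.
public sealed class EnvelopeValidator
{
    private readonly uint _expectedVersion;
    private readonly string _sessionId;
    private ulong _lastSequence;
    private bool _seenAny;

    public EnvelopeValidator(uint expectedVersion, string sessionId)
    {
        _expectedVersion = expectedVersion;
        _sessionId = sessionId;
    }

    // Returns null when the envelope is acceptable, otherwise a short reason.
    public string Validate(EnvelopeHeader header)
    {
        if (header.ProtocolVersion != _expectedVersion)
            return "protocol version mismatch";
        if (!string.Equals(header.SessionId, _sessionId, StringComparison.Ordinal))
            return "session id mismatch";
        // Assumption: sequence numbers must strictly increase per sender.
        if (_seenAny && header.Sequence <= _lastSequence)
            return "sequence not monotonic";
        _lastSequence = header.Sequence;
        _seenAny = true;
        return null;
    }
}
```
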
## Public gRPC API

The external API should be session-oriented. A bidirectional stream is the best
long-term shape because it naturally carries commands, replies, events,
heartbeats, and cancellation.

```protobuf
service MxAccessGateway {
  rpc OpenSession(OpenSessionRequest) returns (OpenSessionReply);
  rpc CloseSession(CloseSessionRequest) returns (CloseSessionReply);
  rpc Invoke(MxCommandRequest) returns (MxCommandReply);
  rpc StreamEvents(StreamEventsRequest) returns (stream MxEvent);
  rpc Session(stream ClientMessage) returns (stream ServerMessage);
}
```

Recommended rollout:

1. Implement unary `OpenSession`, `CloseSession`, and `Invoke`.
2. Implement server-streaming `StreamEvents`.
3. Add bidirectional `Session` after the command/event model is stable.

The unary plus event-stream shape is easier to debug initially. The
bidirectional stream can later reduce per-command overhead and improve
backpressure.

## Public MXAccess Command Surface

The gateway contract should mirror MXAccess concepts without leaking COM types.
Keep handles and statuses explicit.

Core commands:

- `Register`
- `Unregister`
- `AddItem`
- `AddItem2`
- `RemoveItem`
- `Advise`
- `UnAdvise`
- `AdviseSupervisory`
- `AddBufferedItem`
- `SetBufferedUpdateInterval`
- `Suspend`
- `Activate`
- `Write`
- `Write2`
- `WriteSecured`
- `WriteSecured2`
- `AuthenticateUser`
- `ArchestrAUserToId`

Optional diagnostics:

- `Ping`
- `GetSessionState`
- `GetWorkerInfo`
- `DrainEvents`
- `ShutdownWorker`

Do not compress MXAccess semantics into generic verbs too early. A command enum
with method-specific payloads is easier to test for parity.

## Event Surface

The gateway must represent every public MXAccess event family:

- `OnDataChange`
- `OnWriteComplete`
- `OperationComplete`
- `OnBufferedDataChange`

The event DTO should include:

- event family,
- session id,
- server handle,
- item handle,
- value when present,
- quality when present,
- timestamp when present,
- `MXSTATUS_PROXY[]` equivalent,
- raw HRESULT/status fields when available,
- event ordering sequence,
- worker timestamp,
- gateway receive timestamp.

Keep event order stable per worker. The gateway should not reorder events from
the same MXAccess instance.

## Value Model

Use a protobuf value union that can represent COM `VARIANT` values and arrays.

```protobuf
message MxValue {
  oneof kind {
    bool bool_value = 1;
    int32 int32_value = 2;
    int64 int64_value = 3;
    float float_value = 4;
    double double_value = 5;
    string string_value = 6;
    Timestamp timestamp_value = 7;
    MxArray array_value = 8;
    bytes raw_variant = 100;
  }
}
```

Array support should include at least:

- bool array,
- int32 array,
- float array,
- double array,
- string array,
- timestamp array,
- raw fallback.

For full parity, unknown or awkward COM values should be preserved as raw
metadata rather than dropped. If a value cannot be losslessly converted, the
worker should return both the best typed projection and enough diagnostic
metadata to reproduce the case.

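
One way to picture the typed projection with a raw fallback is the sketch below. It classifies the boxed CLR value that COM interop hands back for a `VARIANT`. `MxValueKind`, `MxValueSketch`, and `MxValueConverter` are hypothetical and stand in for the generated protobuf types; the choice to widen small integers to `Int32` and to send `decimal`, `ulong`, and unknown types to the raw fallback is an assumption.

```csharp
using System;
using System.Globalization;

public enum MxValueKind { Bool, Int32, Int64, Float, Double, String, Timestamp, Array, Raw }

// Hypothetical stand-in for the MxValue message: a kind tag, a value, and diagnostics.
public sealed class MxValueSketch
{
    public MxValueSketch(MxValueKind kind, object value, string sourceType)
    {
        Kind = kind;
        Value = value;
        SourceType = sourceType;
    }

    public MxValueKind Kind { get; }
    public object Value { get; }
    // CLR type name of the original value, kept so awkward cases can be reproduced.
    public string SourceType { get; }
}

public static class MxValueConverter
{
    public static MxValueSketch FromClr(object value)
    {
        string source = value == null ? "null" : value.GetType().FullName;
        switch (value)
        {
            case null:
                return new MxValueSketch(MxValueKind.Raw, null, source);
            case bool b:
                return new MxValueSketch(MxValueKind.Bool, b, source);
            case sbyte _:
            case byte _:
            case short _:
            case ushort _:
            case int _:
                // Lossless widening to 32-bit.
                return new MxValueSketch(MxValueKind.Int32,
                    Convert.ToInt32(value, CultureInfo.InvariantCulture), source);
            case uint _:
            case long _:
                return new MxValueSketch(MxValueKind.Int64,
                    Convert.ToInt64(value, CultureInfo.InvariantCulture), source);
            case float f:
                return new MxValueSketch(MxValueKind.Float, f, source);
            case double d:
                return new MxValueSketch(MxValueKind.Double, d, source);
            case string s:
                return new MxValueSketch(MxValueKind.String, s, source);
            case DateTime t:
                return new MxValueSketch(MxValueKind.Timestamp, t, source);
            case Array a:
                return new MxValueSketch(MxValueKind.Array, a, source);
            default:
                // Raw fallback: keep a textual form instead of dropping the value.
                return new MxValueSketch(MxValueKind.Raw,
                    Convert.ToString(value, CultureInfo.InvariantCulture), source);
        }
    }
}
```
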
## Status Model

Represent `MXSTATUS_PROXY` explicitly:

```protobuf
message MxStatusProxy {
  int32 success = 1;
  uint32 category = 2;
  uint32 detail = 3;
  uint32 source = 4;
  uint32 raw_hresult = 5;
  string text = 6;
}
```

The exact field names should be adjusted to match the actual interop struct,
but the design principle is important: do not collapse status arrays into a
single success flag.

For command replies, return:

- protocol status,
- COM HRESULT if available,
- MXAccess return value if the method has one,
- method-specific out parameters,
- status array if the method emits one.

## STA Worker Thread Model

Each worker owns:

- one process,
- one MXAccess session,
- one dedicated STA thread,
- one MXAccess COM object,
- one inbound command queue,
- one outbound event queue.

All MXAccess operations run on the STA:

```text
pipe reader thread
  -> parse WorkerCommand
  -> enqueue StaCommand
  -> await task completion
  -> write WorkerCommandReply

STA thread
  -> CoInitializeEx(COINIT_APARTMENTTHREADED)
  -> create MXAccess COM object
  -> wire events
  -> run message pump
  -> execute queued commands between message dispatches

MXAccess event handler on STA
  -> convert event args to WorkerEvent
  -> enqueue outbound event

pipe writer thread
  -> dequeue replies/events
  -> write framed protobuf messages
```

Do not block the STA on pipe writes or gRPC calls. The STA should enqueue
results/events and return to pumping messages.

### Message Pump

The STA loop must pump Windows messages and service command work. A typical
shape:

```text
while not shutdown:
    while command queue has work:
        execute one command on STA

    MsgWaitForMultipleObjectsEx(
        1, [command_event],
        timeout,
        QS_ALLINPUT,
        MWMO_INPUTAVAILABLE)

    while PeekMessage:
        TranslateMessage
        DispatchMessage
```

This is the critical piece for MXAccess event delivery. A plain blocking queue
on an STA thread is not enough if it prevents COM/window messages from being
pumped.

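
A C# sketch of this loop, assuming P/Invoke into `user32.dll`, might look like the following. `StaMessagePump`, `Post`, and `RequestShutdown` are hypothetical names, and the 100 ms idle timeout is an arbitrary choice. Unlike the pseudocode, this version runs at most one command per pass so that a busy command queue cannot starve message dispatch; that is a variation for illustration, not a stated requirement.

```csharp
using System;
using System.Collections.Concurrent;
using System.Runtime.InteropServices;
using System.Threading;

// Hypothetical pump: interleaves queued commands with Windows/COM message dispatch.
public sealed class StaMessagePump
{
    private const uint QS_ALLINPUT = 0x04FF;
    private const uint MWMO_INPUTAVAILABLE = 0x0004;
    private const uint PM_REMOVE = 0x0001;

    [StructLayout(LayoutKind.Sequential)]
    private struct POINT { public int X; public int Y; }

    [StructLayout(LayoutKind.Sequential)]
    private struct MSG
    {
        public IntPtr hwnd;
        public uint message;
        public IntPtr wParam;
        public IntPtr lParam;
        public uint time;
        public POINT pt;
    }

    [DllImport("user32.dll", SetLastError = true)]
    private static extern uint MsgWaitForMultipleObjectsEx(
        uint nCount, IntPtr[] pHandles, uint dwMilliseconds, uint dwWakeMask, uint dwFlags);

    [DllImport("user32.dll", CharSet = CharSet.Unicode)]
    [return: MarshalAs(UnmanagedType.Bool)]
    private static extern bool PeekMessage(
        out MSG lpMsg, IntPtr hWnd, uint wMsgFilterMin, uint wMsgFilterMax, uint wRemoveMsg);

    [DllImport("user32.dll")]
    [return: MarshalAs(UnmanagedType.Bool)]
    private static extern bool TranslateMessage(ref MSG lpMsg);

    [DllImport("user32.dll", CharSet = CharSet.Unicode)]
    private static extern IntPtr DispatchMessage(ref MSG lpMsg);

    private readonly ConcurrentQueue<Action> _commands = new ConcurrentQueue<Action>();
    private readonly AutoResetEvent _commandEvent = new AutoResetEvent(false);
    private volatile bool _shutdown;

    // Safe to call from any thread, for example the pipe reader.
    public void Post(Action command)
    {
        _commands.Enqueue(command);
        _commandEvent.Set();
    }

    public void RequestShutdown()
    {
        _shutdown = true;
        _commandEvent.Set();
    }

    // Must run on the dedicated STA thread that owns the COM object.
    public void Run()
    {
        var handles = new[] { _commandEvent.SafeWaitHandle.DangerousGetHandle() };
        while (!_shutdown)
        {
            // At most one command per pass, then return to message dispatch.
            if (_commands.TryDequeue(out var command)) command();

            // Do not sleep while more commands are waiting.
            uint timeout = _commands.IsEmpty ? 100u : 0u;
            MsgWaitForMultipleObjectsEx(1, handles, timeout, QS_ALLINPUT, MWMO_INPUTAVAILABLE);

            while (PeekMessage(out var msg, IntPtr.Zero, 0, 0, PM_REMOVE))
            {
                TranslateMessage(ref msg);
                DispatchMessage(ref msg);
            }
        }
    }
}
```

The thread that calls `Run` would be created with `SetApartmentState(ApartmentState.STA)` before it starts, and commands posted from the pipe reader then execute on that thread between message dispatches.
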
### COM Lifetime

Worker startup:

1. set apartment state to STA,
2. initialize COM on the STA,
3. instantiate `LMXProxyServerClass` or the installed MXAccess interop class,
4. attach event handlers,
5. send `WorkerReady`.

Worker shutdown:

1. reject new commands,
2. optionally send `UnAdvise`/`RemoveItem`/`Unregister` for active handles,
3. detach event handlers,
4. release COM object until reference count reaches zero,
5. uninitialize COM,
6. exit process.

If graceful shutdown exceeds timeout, the gateway kills the worker.

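
On the gateway side, the "graceful first, then kill" rule could be sketched as below. `WorkerShutdownHelper` and the `requestShutdown` callback (which would send `WorkerShutdown` over the pipe) are hypothetical; the sketch assumes the gateway holds a `System.Diagnostics.Process` for the worker and runs on a modern .NET runtime where `WaitForExitAsync` is available.

```csharp
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical gateway-side helper: ask the worker to exit, then kill it on timeout.
public static class WorkerShutdownHelper
{
    // Returns true when the worker exited on its own, false when it had to be killed.
    public static async Task<bool> ShutdownOrKillAsync(
        Process worker, Func<Task> requestShutdown, TimeSpan timeout)
    {
        if (worker.HasExited) return true;

        try
        {
            await requestShutdown().ConfigureAwait(false);
        }
        catch (Exception)
        {
            // The pipe may already be broken; fall through to the wait and kill path.
        }

        using (var cts = new CancellationTokenSource(timeout))
        {
            try
            {
                await worker.WaitForExitAsync(cts.Token).ConfigureAwait(false);
                return true;
            }
            catch (OperationCanceledException)
            {
                try
                {
                    worker.Kill(entireProcessTree: true);
                }
                catch (InvalidOperationException)
                {
                    // The process exited between the timeout and the kill.
                }
                await worker.WaitForExitAsync().ConfigureAwait(false);
                return false;
            }
        }
    }
}
```
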
## Session Model

One external client session maps to one worker process by default.

Session state in the gateway:

- session id,
- client identity,
- worker process id,
- pipe name,
- pipe connection,
- open time,
- last heartbeat,
- active stream subscribers,
- command timeout policy,
- event queue metrics.

Session state in the worker:

- MXAccess COM object,
- registered server handles,
- item handles,
- item definitions/context,
- advise state,
- buffered state,
- authenticated user ids if needed,
- event sequence number.

The gateway should treat worker state as authoritative for MXAccess handles.
It can keep a shadow state for diagnostics and cleanup, but should not invent
handles.

## Command Execution

Every command should follow the same lifecycle:

```text
client sends gRPC command
gateway validates session and payload
gateway assigns correlation id
gateway writes WorkerCommand to pipe
worker pipe reader enqueues command to STA
STA executes MXAccess method
worker captures return value/out params/status/HRESULT
worker sends WorkerCommandReply
gateway completes gRPC response
```

Timeouts:

- gateway command timeout bounds client waiting,
- worker command timeout marks the command as stuck,
- if the STA does not recover after a configurable grace period, kill the
  worker and fail the session.

Cancellation:

- canceling the gRPC call should stop waiting in the gateway,
- it cannot safely abort an in-flight COM call on the STA,
- the worker should finish the COM call and discard or log the late reply if
  the correlation was canceled,
- hard cancellation means killing the worker process.

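
The correlation and cancellation rules can be illustrated with a small gateway-side tracker. `PendingCommands<TReply>` is hypothetical, with `TReply` standing in for `WorkerCommandReply`. Cancelling only stops the gateway from waiting; if a reply for a cancelled correlation id still arrives, this sketch counts it as late instead of delivering it.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical tracker that links correlation ids to the tasks callers await.
public sealed class PendingCommands<TReply>
{
    private readonly ConcurrentDictionary<ulong, TaskCompletionSource<TReply>> _pending =
        new ConcurrentDictionary<ulong, TaskCompletionSource<TReply>>();
    private long _nextCorrelationId;
    private int _lateReplies;

    public int LateReplies => Volatile.Read(ref _lateReplies);

    // Registers a command; the caller writes the WorkerCommand with this id to the pipe.
    public (ulong CorrelationId, Task<TReply> Reply) Register(CancellationToken ct)
    {
        ulong id = (ulong)Interlocked.Increment(ref _nextCorrelationId);
        var tcs = new TaskCompletionSource<TReply>(
            TaskCreationOptions.RunContinuationsAsynchronously);
        _pending[id] = tcs;

        // Cancellation stops the wait only; the worker still finishes the COM call.
        CancellationTokenRegistration registration = ct.Register(() =>
        {
            if (_pending.TryRemove(id, out var removed)) removed.TrySetCanceled(ct);
        });
        // Release the registration once the command completes either way.
        tcs.Task.ContinueWith(_ => registration.Dispose(), TaskScheduler.Default);
        return (id, tcs.Task);
    }

    // Called by the pipe read loop. Returns false for a late or unknown reply.
    public bool Complete(ulong correlationId, TReply reply)
    {
        if (_pending.TryRemove(correlationId, out var tcs))
            return tcs.TrySetResult(reply);
        Interlocked.Increment(ref _lateReplies);
        return false;
    }
}
```
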
## Event Delivery And Backpressure

Events flow from worker to gateway, then gateway to client streams.

Worker policy:

- bounded outbound event channel,
- never block MXAccess event handler on pipe writes,
- if the outbound channel is full, apply configured policy:
  - disconnect session,
  - drop oldest low-priority data-change events,
  - coalesce data changes by item handle,
  - or block briefly then fault.

For full parity testing, the default should be fail-fast rather than silent
drop. For production high-rate telemetry, add explicit coalescing modes.

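
A fail-fast bounded queue could be sketched with `System.Threading.Channels`, as below. `OutboundEventQueue<TEvent>` is a hypothetical name, and using this library in the .NET Framework 4.8 worker assumes the `System.Threading.Channels` NuGet package is referenced. With `BoundedChannelFullMode.Wait`, `TryWrite` returns `false` when the channel is full instead of blocking, which is what lets the event handler fail fast.

```csharp
using System;
using System.Threading.Channels;

// Hypothetical outbound queue: never blocks the producer and faults on overflow.
public sealed class OutboundEventQueue<TEvent>
{
    private readonly Channel<TEvent> _channel;

    public OutboundEventQueue(int capacity)
    {
        _channel = Channel.CreateBounded<TEvent>(new BoundedChannelOptions(capacity)
        {
            // In Wait mode TryWrite reports "full" by returning false.
            FullMode = BoundedChannelFullMode.Wait,
            SingleReader = true,
        });
    }

    public bool Faulted { get; private set; }

    // The pipe writer thread drains this reader.
    public ChannelReader<TEvent> Reader => _channel.Reader;

    // Called from the MXAccess event handler on the STA; returns immediately.
    public bool TryEnqueue(TEvent evt)
    {
        if (_channel.Writer.TryWrite(evt)) return true;

        // Fail fast: mark the queue faulted and complete it with an error so the
        // reader observes the overflow after draining what was already queued.
        Faulted = true;
        _channel.Writer.TryComplete(
            new InvalidOperationException("Outbound event queue overflow."));
        return false;
    }
}
```
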
Gateway policy:

- one event sequencer per session,
- preserve per-session event order,
- support multiple client event subscribers only if explicitly required,
- apply backpressure from slow gRPC streams,
- disconnect or coalesce according to client-selected mode.

## Isolation And Fault Handling

Failure cases:

- worker fails startup,
- worker pipe disconnects,
- worker heartbeat expires,
- worker process exits,
- STA command times out,
- MXAccess COM throws,
- MXAccess event handler throws,
- client disconnects,
- gateway shuts down.

Policy:

- worker startup failure fails `OpenSession`,
- worker crash emits terminal session fault to client,
- command exceptions return structured command fault with HRESULT if known,
- stale sessions are closed by lease timeout,
- stuck workers are killed by process id,
- gateway restart should not attempt to reattach old workers unless explicitly
  designed; the first version should terminate orphaned workers on startup.

Because each client owns one worker, a crash or leak affects only that session.

## Security

External gateway:

- use TLS for remote gRPC if crossing machine boundaries,
- authenticate v1 gRPC clients with
  `authorization: Bearer mxgw_<key-id>_<secret>` API-key metadata,
- reject missing or invalid API keys with gRPC `Unauthenticated`,
- reject valid keys that lack the required session, invoke, event, metadata, or
  admin scope with gRPC `PermissionDenied`,
- authorize access to commands that can write, authenticate users, expose
  metadata, stream events, or alter runtime state.

Internal worker IPC:

- local named pipes only,
- restrictive pipe ACL,
- per-session nonce handshake,
- worker validates gateway hello before creating MXAccess,
- gateway validates worker executable path and version,
- no secrets in command line when avoidable.

Credential-sensitive commands such as `AuthenticateUser` and `WriteSecured`
must not log passwords or raw credential values.

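
Splitting the `authorization` metadata value into its parts might look like the sketch below. `ApiKeyHeader` is a hypothetical helper, and it assumes the key id never contains an underscore, so the first underscore after the `mxgw_` prefix separates key id from secret. Verifying the secret against the stored hash is out of scope here.

```csharp
using System;

// Hypothetical parser for "Bearer mxgw_<key-id>_<secret>".
public static class ApiKeyHeader
{
    private const string Scheme = "Bearer ";
    private const string Prefix = "mxgw_";

    public static bool TryParse(string authorization, out string keyId, out string secret)
    {
        keyId = null;
        secret = null;
        if (string.IsNullOrEmpty(authorization)) return false;
        if (!authorization.StartsWith(Scheme, StringComparison.OrdinalIgnoreCase)) return false;

        string token = authorization.Substring(Scheme.Length).Trim();
        if (!token.StartsWith(Prefix, StringComparison.Ordinal)) return false;

        // Assumption: the key id has no underscore, so split on the first one.
        string rest = token.Substring(Prefix.Length);
        int separator = rest.IndexOf('_');
        if (separator <= 0 || separator == rest.Length - 1) return false;

        keyId = rest.Substring(0, separator);
        secret = rest.Substring(separator + 1);
        return true;
    }
}
```

A caller would map a `false` result to gRPC `Unauthenticated` and should log at most the key id, never the secret.
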
## Observability

Gateway metrics:

- sessions open,
- workers running,
- worker start latency,
- command latency by method,
- command failures by method/status,
- event rate by session/event type,
- event queue depth,
- worker memory/CPU,
- worker restarts/kills,
- gRPC stream disconnects.

Worker logs:

- startup/shutdown,
- MXAccess COM creation result,
- command start/end with correlation id,
- HRESULT/status summary,
- event family and sequence number,
- queue overflow,
- STA watchdog warnings.

Do not log full values by default. Make value logging opt-in and redacted where
credentials or secured writes are involved.

## Performance Strategy

First priority is parity. Performance comes from process isolation, batching,
and avoiding unnecessary cross-process round trips.

Baseline choices:

- long-lived worker per session,
- persistent pipe,
- protobuf binary framing,
- no gRPC inside worker,
- no COM calls outside STA,
- event streaming rather than event polling.

Optimizations after parity:

- batch commands where MXAccess semantics allow,
- batch events from worker to gateway while preserving order,
- optional data-change coalescing by item handle,
- memory-mapped payload slabs for very large arrays,
- shared schema for typed values to avoid raw COM marshaling at the gateway,
- gateway-side route to `MxAsbClient` for proven high-volume read/write
  workloads only when caller opts into non-MXAccess-backed behavior or parity
  tests prove equivalence.

## Project Layout

Suggested additions:

```text
src/MxGateway.Contracts/
  Protos/
    mxaccess_gateway.proto
    mxaccess_worker.proto
  Generated/

src/MxGateway.Server/
  Program.cs
  Sessions/
  Workers/
  Grpc/
  Metrics/

src/MxGateway.Worker/
  Program.cs
  Ipc/
  Sta/
  MxAccess/
  Conversion/

src/MxGateway.Tests/
  contract tests
  gateway session tests
  fake worker tests

src/MxGateway.Worker.Tests/
  value/status conversion tests
  STA queue tests

src/MxGateway.IntegrationTests/
  optional live MXAccess tests
```

Build outputs:

- gateway: .NET 10 x64,
- worker: .NET Framework 4.8 x86.

The contracts project can multi-target if needed, or the `.proto` files can be
shared as source inputs to both gateway and worker builds.

## Worker Implementation Plan

### Phase 1: Minimal Worker Harness

- Create .NET Framework 4.8 x86 worker executable.
- Parse pipe name/session id/nonce args.
- Connect to gateway named pipe.
- Exchange hello/ready messages.
- Start STA thread.
- Create MXAccess COM object on STA.
- Pump messages.
- Shut down cleanly.

Exit criteria:

- gateway can spawn worker,
- worker reports ready,
- worker exits on shutdown command,
- STA remains responsive.

### Phase 2: Command Queue

- Add command DTOs for `Register`, `Unregister`, `AddItem`, `RemoveItem`.
- Implement STA command dispatch.
- Return method result, HRESULT, and structured fault.
- Add command timeout handling in gateway.

Exit criteria:

- client can open a session and perform basic handle lifecycle through gRPC.

### Phase 3: Event Stream

- Wire MXAccess events in the worker.
- Convert `OnDataChange`, `OnWriteComplete`, `OperationComplete`, and
  `OnBufferedDataChange` to protobuf events.
- Add event sequence numbers.
- Add gateway `StreamEvents`.

Exit criteria:

- advised item changes reach a .NET 10 client without the client owning an STA.

### Phase 4: Full Command Surface

Add remaining MXAccess methods:

- `Advise`
- `UnAdvise`
- `AdviseSupervisory`
- `AddItem2`
- `AddBufferedItem`
- `SetBufferedUpdateInterval`
- `Suspend`
- `Activate`
- `Write`
- `Write2`
- `WriteSecured`
- `WriteSecured2`
- `AuthenticateUser`
- `ArchestrAUserToId`

Exit criteria:

- gRPC command surface covers the installed MXAccess public method set.

### Phase 5: Parity Harness

- Reuse existing MXAccess trace harness scenarios.
- Run each scenario against direct MXAccess and against the gateway.
- Compare:
  - return values,
  - HRESULTs/exceptions,
  - event sequence,
  - value projection,
  - quality/status arrays,
  - invalid handle behavior,
  - cleanup behavior.

Exit criteria:

- documented parity matrix for all public methods and event families.

### Phase 6: Hardening

- Worker watchdog.
- Heartbeats.
- Process kill/restart.
- Bounded queues.
- Backpressure policy.
- TLS/auth on public gateway.
- Metrics.
- Structured logging.
- Installer/service packaging.

Exit criteria:

- gateway can run as a Windows service and recover from worker crashes.

## Gateway Implementation Plan

### Session Manager

Core operations:

- allocate session id,
- choose worker executable,
- create pipe name and nonce,
- start worker process,
- accept pipe connection,
- verify worker hello,
- track worker state,
- close or kill worker.

The gateway implementation keeps sessions in an in-memory `SessionRegistry`
keyed by session id. `SessionManager` owns the state machine, creates
per-session pipe names and nonces, starts the worker through the worker-client
factory, gates commands to `Ready` sessions, exposes lease-close hooks, and
cleans up workers during gateway shutdown.

State machine:

```text
Creating
  -> StartingWorker
  -> WaitingForPipe
  -> InitializingWorker
  -> Ready
  -> Closing
  -> Closed
  -> Faulted
```

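
The state list can be turned into an explicit transition table, sketched below. `SessionState` and `SessionTransitions` are hypothetical names, and the diagram does not spell out which states may fault: this sketch assumes `Faulted` is reachable from every non-terminal state and that `Closed` and `Faulted` are terminal.

```csharp
using System;
using System.Collections.Generic;

public enum SessionState
{
    Creating,
    StartingWorker,
    WaitingForPipe,
    InitializingWorker,
    Ready,
    Closing,
    Closed,
    Faulted,
}

// Hypothetical transition table for the session state machine.
public static class SessionTransitions
{
    private static readonly Dictionary<SessionState, SessionState[]> Allowed =
        new Dictionary<SessionState, SessionState[]>
        {
            { SessionState.Creating, new[] { SessionState.StartingWorker, SessionState.Faulted } },
            { SessionState.StartingWorker, new[] { SessionState.WaitingForPipe, SessionState.Faulted } },
            { SessionState.WaitingForPipe, new[] { SessionState.InitializingWorker, SessionState.Faulted } },
            { SessionState.InitializingWorker, new[] { SessionState.Ready, SessionState.Faulted } },
            { SessionState.Ready, new[] { SessionState.Closing, SessionState.Faulted } },
            { SessionState.Closing, new[] { SessionState.Closed, SessionState.Faulted } },
            // Assumed terminal states: no outgoing transitions.
            { SessionState.Closed, new SessionState[0] },
            { SessionState.Faulted, new SessionState[0] },
        };

    public static bool CanMove(SessionState from, SessionState to)
    {
        return Array.IndexOf(Allowed[from], to) >= 0;
    }

    // Commands are gated to Ready sessions.
    public static bool AcceptsCommands(SessionState state)
    {
        return state == SessionState.Ready;
    }
}
```
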
### Worker Client

Gateway-side worker client owns:

- pipe stream,
- read loop,
- write loop,
- pending command dictionary,
- event channel,
- heartbeat monitor,
- process handle.

It should expose:

```csharp
Task<WorkerCommandReply> InvokeAsync(WorkerCommand command, CancellationToken ct);
IAsyncEnumerable<WorkerEvent> ReadEventsAsync(CancellationToken ct);
Task ShutdownAsync(TimeSpan timeout);
void Kill();
```

### gRPC Layer

The gRPC layer should be thin:

- validate request,
- find session,
- call session worker client,
- map worker reply to public reply,
- stream events from session event channel.

Avoid embedding MXAccess-specific business logic in gRPC handlers. Keep the
translation code testable.

The gateway maps `MxAccessGateway` to `MxAccessGatewayService`. The service
implements `OpenSession`, `CloseSession`, `Invoke`, and `StreamEvents` by
validating public requests, delegating session work to `ISessionManager`, and
using explicit mapper code for public-to-worker commands, worker replies, and
events. Missing sessions and transport failures return gRPC status errors;
worker command replies preserve MXAccess HRESULT and status details in the
public reply.

## C# Worker Versus C++ Worker

Start with a C# .NET Framework 4.8 x86 worker.

Reasons:

- fastest implementation path,
- easiest COM interop/event sink work,
- straightforward named-pipe/protobuf implementation,
- easier logging and diagnostics,
- easier parity iteration.

C++/CLI or native C++ remains an escape hatch if C# COM interop proves
insufficient. The pipe protocol should be language-neutral so a future C++
worker can replace the C# worker without changing gateway or clients.

Use C++ only if evidence shows:

- C# event sinks cannot reliably pump MXAccess events,
- COM `VARIANT`/`SAFEARRAY` conversion loses required data,
- throughput is bottlenecked by .NET COM marshaling,
- MXAccess requires ATL-style connection point behavior not reproducible from
  C#.

## Compatibility Baseline

The proxy should preserve direct MXAccess behavior, including surprising cases.

Known important parity areas from existing captures:

- `WriteSecured` may fail before a value-bearing NMX body is emitted.
- `WriteSecured2` can succeed in observed native paths.
- `OperationComplete` is distinct from write completion.
- `OnBufferedDataChange` has a distinct public event shape.
- Invalid handles and cross-server handles have specific exception/status
  behavior.
- STA message pumping is required for event delivery.

The gateway should not "fix" these behaviors unless the client explicitly opts
into a non-parity mode.

## Future Backend Routing

After the MXAccess-backed proxy is stable, the gateway can optionally support
other backends behind the same public contract:

- `MxAsbClient` for high-volume basic read/write where poll-based subscription
  semantics are acceptable or proven equivalent for a workload,
- managed NMX for native callback experiments and eventual MXAccess-free
  replacement work,
- direct MXAccess worker as the default parity backend.

Routing must be explicit and observable:

- event/reply includes backend name,
- tests assert backend choice,
- no silent fallback that changes semantics.

Initial production mode should be:

```text
backend = mxaccess-worker
```

## Open Questions

Current v1 decisions are recorded in `docs/design-decisions.md`.

Resolved for v1:

- MXAccess COM target is `ArchestrA.MxAccess.LMXProxyServerClass` /
  `LMXProxy.LMXProxyServer.1` from the installed 32-bit `LmxProxy.dll`.
- One `OpenSession` maps to one worker process; no reconnectable sessions.
- One active event subscriber per session.
- API key authentication with hashed keys in gateway-owned SQLite.
- Basic Blazor Server dashboard with Bootstrap CSS/JS and real-time updates.
- Workers run as the gateway service identity.
- Event backpressure is fail-fast with bounded queues.
- No public command batching.
- `OperationComplete` is forwarded only when native MXAccess raises it.
- `OnBufferedDataChange` is modeled now; multi-sample payload conversion remains
  capture-validated work.

Post-v1 revisit items:

- production event-rate target and optional coalescing,
- reconnectable sessions,
- multi-subscriber event fan-out,
- restricted worker process identity,
- command batching for high-volume setup.

## Recommended Next Step

Build the smallest end-to-end slice:

1. .NET 10 gateway starts.
2. Client calls `OpenSession`.
3. Gateway launches .NET Framework 4.8 x86 worker.
4. Worker creates STA and MXAccess COM object.
5. Client calls `Register`.
6. Client calls `AddItem`.
7. Client calls `Advise`.
8. Worker forwards one `OnDataChange` event to the gateway.
9. Gateway streams the event to the client.
10. Client calls `CloseSession`.
11. Gateway shuts down the worker.

That slice proves the architecture's hardest requirements: process isolation,
STA ownership, message pumping, command routing, and event streaming.