Gateway Process Detailed Design

Purpose

The gateway process is the only public network-facing component. It exposes the modern API, owns session lifecycle, launches and supervises MXAccess worker processes, and moves commands and events between clients and the worker that owns each session.

The gateway must not instantiate MXAccess COM, import MXAccess interop types, or depend on an STA message pump. The installed MXAccess COM component is isolated behind the worker process boundary.

Runtime

  • Target runtime: .NET 10.
  • Language: C#.
  • Preferred process architecture: x64.
  • Hosting: ASP.NET Core gRPC.
  • Web UI: Blazor Server dashboard with Bootstrap CSS/JS.
  • Operating system: Windows.
  • Public transport: TCP gRPC.
  • Internal worker transport: named pipes with protobuf-framed messages.

Style guides:

Responsibilities

The gateway owns:

  • public gRPC service endpoints,
  • Blazor Server dashboard endpoints,
  • optional authentication and authorization,
  • session id allocation,
  • worker executable selection,
  • named-pipe server creation,
  • worker process launch,
  • gateway/worker handshake,
  • command correlation and timeout handling,
  • event fan-out to client streams,
  • session lease and heartbeat enforcement,
  • worker crash and hang detection,
  • metrics and structured logging,
  • graceful service shutdown.

The gateway does not own:

  • MXAccess COM object creation,
  • MXAccess method dispatch,
  • MXAccess event subscription,
  • MXAccess handle generation,
  • COM value conversion from native VARIANT values.

Those belong to the worker.

High-Level Components

MxGateway.Server
  Program / Host
  Configuration
  Grpc
    MxAccessGatewayService
    MxAccessGrpcRequestValidator
    MxAccessGrpcMapper
  Dashboard
    Pages
    Components
    DashboardSnapshotService
    DashboardAuthorization
  Sessions
    SessionManager
    GatewaySession
    SessionRegistry
    SessionLeaseMonitor
  Workers
    WorkerProcessLauncher
    WorkerClient
    WorkerPipeTransport
    WorkerProtocolReader
    WorkerProtocolWriter
    WorkerWatchdog
  Security
    ClientIdentityResolver
    CommandAuthorization
  Metrics
    GatewayMetrics
  Diagnostics
    HealthChecks

Public gRPC Surface

Start with unary commands plus an event stream:

service MxAccessGateway {
  rpc OpenSession(OpenSessionRequest) returns (OpenSessionReply);
  rpc CloseSession(CloseSessionRequest) returns (CloseSessionReply);
  rpc Invoke(MxCommandRequest) returns (MxCommandReply);
  rpc StreamEvents(StreamEventsRequest) returns (stream MxEvent);
}

MxAccessGatewayService implements these public RPCs in the gateway process. It validates public requests with MxAccessGrpcRequestValidator, delegates session lifecycle and command routing to ISessionManager, and maps worker command replies and events through MxAccessGrpcMapper. Session lookup, validation, and worker transport failures become gRPC status errors. MXAccess method replies that reached the worker remain MxCommandReply payloads so HRESULT values, status arrays, and method-specific reply fields survive transport boundaries.

Add this later only after the command and event model is stable:

rpc Session(stream ClientMessage) returns (stream ServerMessage);

OpenSession

OpenSession creates one gateway session and one worker process by default.

Inputs should include:

  • requested backend, defaulting to mxaccess-worker,
  • optional client session name,
  • optional client correlation id,
  • optional timeout policy,
  • optional event backpressure policy,
  • optional metadata discovery options.

Outputs should include:

  • session id,
  • backend name,
  • worker process id when available,
  • protocol version,
  • server capabilities,
  • default timeout values.

Behavior:

  1. Resolve and authorize the client identity.
  2. Allocate a session id.
  3. Build a pipe name and random handshake nonce.
  4. Create a named-pipe server with restrictive local ACLs.
  5. Launch the worker executable with session bootstrap data.
  6. Accept the pipe connection within startup timeout.
  7. Exchange GatewayHello and WorkerHello.
  8. Wait for WorkerReady.
  9. Register the session as ready.
  10. Return the session details.

If any step fails, clean up all resources. Kill the worker if it was launched and did not shut down on its own.

CloseSession

CloseSession attempts graceful shutdown and then enforces a kill timeout.

Behavior:

  1. Mark the session closing.
  2. Stop accepting new commands.
  3. Notify event streams of terminal session close.
  4. Send WorkerShutdown when the pipe is still connected.
  5. Wait for worker exit up to the configured timeout.
  6. Kill the worker process if it remains alive.
  7. Remove the session from the registry.

CloseSession should be idempotent. Closing an already closed session should return a successful close result with the final known state.

WorkerClient.ShutdownAsync sends WorkerShutdown, waits for the worker read, write, and heartbeat loops to stop, and waits for the launched worker process to exit, all within the same shutdown timeout. If either the pipe loops or the process exit exceeds that timeout, the close operation fails with ShutdownTimeout; GatewaySession then kills the worker process tree before surfacing the close failure.
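The graceful-then-kill sequence described above can be sketched as follows. This is a minimal illustration, not the gateway's actual code: WorkerShutdownSketch, ShutdownOrKillAsync, and the delegate parameters are hypothetical names, and the delegates stand in for the pipe write and process wait so the sketch stays self-contained.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: one timeout bounds both the shutdown message and the exit wait.
public static class WorkerShutdownSketch
{
    // Returns true when the worker exited within the timeout, false when it had to be killed.
    public static async Task<bool> ShutdownOrKillAsync(
        Func<CancellationToken, Task> sendShutdown,
        Func<CancellationToken, Task> waitForExit,
        Action killProcessTree,
        TimeSpan timeout)
    {
        using var cts = new CancellationTokenSource(timeout);
        try
        {
            await sendShutdown(cts.Token);   // e.g. write WorkerShutdown to the pipe
            await waitForExit(cts.Token);    // e.g. wait for the process to exit
            return true;
        }
        catch (OperationCanceledException)
        {
            // The shared timeout elapsed: enforce the kill step.
            killProcessTree();
            return false;
        }
    }
}
```

With a real System.Diagnostics.Process, the wait delegate could be process.WaitForExitAsync and the kill delegate could call process.Kill(entireProcessTree: true). The sketch does not handle a broken pipe during the shutdown write; a real implementation would likely treat that as another reason to kill.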

Invoke

Invoke forwards one MXAccess command to the worker that owns the session.

Behavior:

  1. Validate the session id.
  2. Check that the session state is Ready.
  3. Validate the method-specific payload.
  4. Authorize the command, especially writes and credential-bearing commands.
  5. Assign a gateway correlation id.
  6. Write WorkerCommand to the worker pipe.
  7. Await the correlated WorkerCommandReply.
  8. Map worker reply to public MxCommandReply.

Request cancellation stops waiting in the gateway. It does not abort an in-flight COM call. If the command must be hard-canceled, kill the worker and fault the session.

StreamEvents

StreamEvents streams events for one session.

Initial implementation allows one active stream subscriber per session. A second subscriber should be rejected with a clear session error. If multiple subscribers are later supported, they must have independent backpressure accounting and a clear fan-out policy.

Behavior:

  1. Validate session id and authorize event access.
  2. Attach the single active subscriber lease for the session.
  3. Read worker events into a bounded public stream queue.
  4. Send events in worker sequence order.
  5. Stop on client cancellation, session close, or session fault.
  6. Emit a terminal status when the session faults if gRPC status alone cannot preserve the required details.

EventStreamService owns subscriber tracking and public stream backpressure. The default policy allows one active subscriber per session. A second subscriber is rejected with EventSubscriberAlreadyActive. Stream cancellation releases the subscriber lease so a later stream can attach to the session.

The gateway must not reorder events from one worker. EventStreamService writes mapped events to a bounded first-in, first-out queue and faults the session with EventQueueOverflow if the queue fills. The gateway does not synthesize OperationComplete; it forwards that family only when the worker reports a native MXAccess OperationComplete event.
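The single-subscriber rule can be illustrated with a small lease gate. This is a sketch under assumptions: SingleSubscriberGate and its nested Lease are hypothetical names, and the exception message only borrows the EventSubscriberAlreadyActive name from the text above; the real EventStreamService may track subscribers differently.

```csharp
using System;
using System.Threading;

// Hypothetical gate: at most one active subscriber lease per session.
public sealed class SingleSubscriberGate
{
    private int _active; // 0 = free, 1 = held

    // Throws when a subscriber is already attached; otherwise returns a lease to dispose.
    public IDisposable Attach()
    {
        if (Interlocked.CompareExchange(ref _active, 1, 0) != 0)
            throw new InvalidOperationException("EventSubscriberAlreadyActive");
        return new Lease(this);
    }

    private sealed class Lease : IDisposable
    {
        private readonly SingleSubscriberGate _owner;
        private int _released;

        public Lease(SingleSubscriberGate owner) => _owner = owner;

        public void Dispose()
        {
            // Release only once, so a stale lease cannot free a later subscriber's slot.
            if (Interlocked.Exchange(ref _released, 1) == 0)
                Volatile.Write(ref _owner._active, 0);
        }
    }
}
```

Disposing the lease when the stream is cancelled frees the slot, so a later StreamEvents call can attach to the same session.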

Web Dashboard

The gateway hosts a basic Blazor Server dashboard for operators and developers. The dashboard is read-only for v1 and should show current gateway/session/worker state plus basic metrics.

Technology:

  • Blazor Server,
  • Bootstrap CSS,
  • Bootstrap JavaScript,
  • no MudBlazor,
  • no other Blazor client component libraries.

Suggested routes:

/dashboard
/dashboard/sessions
/dashboard/sessions/{sessionId}
/dashboard/workers
/dashboard/events
/dashboard/settings

Dashboard pages:

  • home: gateway status, uptime, session count, worker count, command rate, event rate, queue depth, recent faults,
  • sessions: active/recent session table,
  • session details: one session's worker, heartbeat, counters, queues, and fault summary,
  • workers: worker process table and heartbeat details,
  • events: aggregate event counters and rates,
  • settings: read-only effective configuration with secrets redacted.

Realtime updates should use Blazor Server component updates from a read-only snapshot service. Components should subscribe to snapshots and call StateHasChanged through InvokeAsync. Do not stream every MXAccess event to the dashboard; aggregate event rates and counters instead.

Suggested service shape:

public interface IDashboardSnapshotService
{
    DashboardSnapshot GetSnapshot();
    IAsyncEnumerable<DashboardSnapshot> WatchSnapshotsAsync(
        CancellationToken cancellationToken);
}

Default refresh policy:

  • immediate update on session create, close, or fault,
  • immediate update on worker fault,
  • periodic metrics refresh every 1 second,
  • event-rate windows updated every 1 second.

Dashboard access should require API-key-backed authentication with admin scope when enabled. A simple /dashboard/login form can validate an API key and issue an HTTP-only secure cookie for dashboard pages. Do not put API keys in query strings. Anonymous localhost access may exist only behind an explicit configuration option that defaults to false.

Session State Machine

Creating
  -> StartingWorker
  -> WaitingForPipe
  -> Handshaking
  -> InitializingWorker
  -> Ready
  -> Closing
  -> Closed

Any non-terminal state
  -> Faulted

Faulted
  -> Closed

State Rules

  • Creating: session id and in-memory state exist, but no worker has launched.
  • StartingWorker: worker process launch is in progress.
  • WaitingForPipe: gateway is waiting for the worker to connect to the pipe.
  • Handshaking: pipe is connected and protocol hello is being verified.
  • InitializingWorker: worker is connected but has not reported MXAccess ready.
  • Ready: commands and event streams may run.
  • Closing: graceful shutdown is in progress.
  • Closed: resources are released.
  • Faulted: a non-graceful terminal fault occurred and must be reported to callers before resources are released.

Only Ready sessions accept new commands.
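One way to enforce the diagram is a transition table consulted on every state change. The sketch below reads the diagram literally: a linear happy path, a Faulted edge from every state except Closed and Faulted, and Faulted leading only to Closed. SessionState and SessionTransitions are hypothetical names, and any extra edges (for example, closing a session that is still starting) would have to be added explicitly.

```csharp
using System.Collections.Generic;

public enum SessionState
{
    Creating, StartingWorker, WaitingForPipe, Handshaking,
    InitializingWorker, Ready, Closing, Closed, Faulted
}

public static class SessionTransitions
{
    // The single forward edge out of each state on the happy path, plus Faulted -> Closed.
    private static readonly Dictionary<SessionState, SessionState> Next = new()
    {
        [SessionState.Creating] = SessionState.StartingWorker,
        [SessionState.StartingWorker] = SessionState.WaitingForPipe,
        [SessionState.WaitingForPipe] = SessionState.Handshaking,
        [SessionState.Handshaking] = SessionState.InitializingWorker,
        [SessionState.InitializingWorker] = SessionState.Ready,
        [SessionState.Ready] = SessionState.Closing,
        [SessionState.Closing] = SessionState.Closed,
        [SessionState.Faulted] = SessionState.Closed,
    };

    public static bool IsAllowed(SessionState from, SessionState to)
    {
        if (from == SessionState.Closed) return false;          // Closed is final
        if (to == SessionState.Faulted) return from != SessionState.Faulted;
        return Next.TryGetValue(from, out var next) && next == to;
    }

    // Only Ready sessions accept new commands.
    public static bool AcceptsCommands(SessionState state) => state == SessionState.Ready;
}
```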

Session Model

Gateway session state should include:

  • session id,
  • client identity,
  • backend name,
  • worker process id,
  • worker executable path and version,
  • pipe name,
  • pipe connection state,
  • open time,
  • last client activity time,
  • last worker heartbeat time,
  • lease expiration,
  • command timeout policy,
  • startup timeout policy,
  • shutdown timeout policy,
  • event queue metrics,
  • active event stream count,
  • final fault if any.

The worker remains authoritative for MXAccess handles. The gateway may keep a shadow state for diagnostics, but it must not invent, rewrite, or recycle MXAccess handles.

SessionManager owns the current in-memory session registry. It allocates a session id, creates the worker pipe name and nonce, registers the session before worker startup, and removes the session if startup fails. A successful OpenSession attaches the ready IWorkerClient and transitions the session to Ready.

Only Ready sessions accept command and event operations. CloseSession is idempotent for sessions still known to the registry: the first close shuts down the worker, and later closes return the final Closed state. Lease handling is exposed as a session hook so a monitor can close expired sessions without embedding lease policy in the worker client. Gateway shutdown walks the registry, closes each known session, and kills a worker if graceful shutdown fails.

Worker Launch

The gateway should launch the worker using explicit configuration:

  • worker executable path,
  • worker working directory,
  • worker architecture requirement,
  • protocol version,
  • startup timeout,
  • environment variables,
  • optional restricted user identity.

Command-line arguments should include only non-secret bootstrap values:

--session-id <sessionId>
--pipe-name <pipeName>
--protocol-version <version>

Prefer passing the handshake nonce via inherited environment or another protected local mechanism instead of command line when possible.

Before launch, validate:

  • worker executable exists,
  • worker path is under the configured install directory,
  • worker file version or product version is acceptable,
  • worker executable targets the expected architecture (x86).

WorkerProcessLauncher implements the first validation layer now: it resolves the worker executable path, requires a .exe, validates the Windows Portable Executable header, and verifies the configured processor architecture. It passes only --session-id, --pipe-name, and --protocol-version on the command line. The per-session nonce is set through MXGATEWAY_WORKER_NONCE so the command line remains safe to log. Startup failures and startup timeouts kill and dispose the worker process and the pre-created pipe reservation before the session manager observes the failure.
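The split between loggable arguments and the environment-carried nonce might look like the sketch below. WorkerStartInfoSketch, Build, and IsUnderDirectory are hypothetical names; only the three flags and the MXGATEWAY_WORKER_NONCE variable come from the text above. The path check assumes case-insensitive comparison, which suits Windows paths.

```csharp
using System;
using System.Diagnostics;
using System.IO;

public static class WorkerStartInfoSketch
{
    public const string NonceVariable = "MXGATEWAY_WORKER_NONCE";

    // True when candidatePath resolves to a location inside installDirectory.
    public static bool IsUnderDirectory(string installDirectory, string candidatePath)
    {
        var root = Path.GetFullPath(installDirectory);
        if (!root.EndsWith(Path.DirectorySeparatorChar)) root += Path.DirectorySeparatorChar;
        return Path.GetFullPath(candidatePath).StartsWith(root, StringComparison.OrdinalIgnoreCase);
    }

    // Builds start info whose command line carries only non-secret bootstrap values.
    public static ProcessStartInfo Build(
        string workerPath, string sessionId, string pipeName, string protocolVersion, string nonce)
    {
        var info = new ProcessStartInfo(workerPath)
        {
            UseShellExecute = false, // required for custom environment variables
            CreateNoWindow = true,
        };
        info.ArgumentList.Add("--session-id");
        info.ArgumentList.Add(sessionId);
        info.ArgumentList.Add("--pipe-name");
        info.ArgumentList.Add(pipeName);
        info.ArgumentList.Add("--protocol-version");
        info.ArgumentList.Add(protocolVersion);
        // The nonce travels in the child environment so the command line stays safe to log.
        info.Environment[NonceVariable] = nonce;
        return info;
    }
}
```

Using ArgumentList rather than a single Arguments string leaves quoting to the runtime, which avoids surprises if a session id or pipe name ever contains spaces.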

Worker IPC

The gateway creates the pipe server before launching the worker.

Pipe name:

mxaccess-gateway-{gatewayProcessId}-{sessionId}
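As an illustration, the pipe name and the per-session nonce could be produced as below. SessionBootstrapSketch and its members are hypothetical names, and the 32-byte nonce length and hex encoding are assumptions; the document fixes only the pipe name format.

```csharp
using System;
using System.Security.Cryptography;

public static class SessionBootstrapSketch
{
    // Follows the documented pattern: mxaccess-gateway-{gatewayProcessId}-{sessionId}.
    public static string BuildPipeName(int gatewayProcessId, string sessionId)
        => $"mxaccess-gateway-{gatewayProcessId}-{sessionId}";

    // Cryptographically random nonce, hex-encoded (length is an assumption).
    public static string NewHandshakeNonce(int byteLength = 32)
        => Convert.ToHexString(RandomNumberGenerator.GetBytes(byteLength));
}
```

In the gateway, the process id would typically come from Environment.ProcessId.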

Message framing:

uint32 little-endian payload_length
payload_length bytes protobuf WorkerEnvelope

Recommended size limits:

  • default max message size: 16 MiB,
  • configurable upper bound for large arrays,
  • reject zero-length payloads,
  • reject payloads larger than configured maximum before allocation.
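The framing and size rules above can be sketched with a length-prefixed reader and writer. WorkerFraming and its method names are hypothetical; the payload is treated as opaque bytes here, where the real code would serialize and parse a protobuf WorkerEnvelope. The length check runs before the payload buffer is allocated, as the last rule requires.

```csharp
using System;
using System.Buffers.Binary;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

public static class WorkerFraming
{
    // Writes a uint32 little-endian length followed by the payload bytes.
    public static async Task WriteFrameAsync(
        Stream stream, ReadOnlyMemory<byte> payload, int maxBytes, CancellationToken ct = default)
    {
        if (payload.Length == 0 || payload.Length > maxBytes)
            throw new InvalidDataException($"Payload length {payload.Length} is out of range.");
        var header = new byte[4];
        BinaryPrimitives.WriteUInt32LittleEndian(header, (uint)payload.Length);
        await stream.WriteAsync(header, ct);
        await stream.WriteAsync(payload, ct);
    }

    // Reads one frame; rejects zero-length and oversized payloads before allocating.
    public static async Task<byte[]> ReadFrameAsync(
        Stream stream, int maxBytes, CancellationToken ct = default)
    {
        var header = new byte[4];
        await stream.ReadExactlyAsync(header, ct); // throws EndOfStreamException on a short read
        uint length = BinaryPrimitives.ReadUInt32LittleEndian(header);
        if (length == 0 || length > (uint)maxBytes)
            throw new InvalidDataException($"Frame length {length} is out of range.");
        var payload = new byte[(int)length];
        await stream.ReadExactlyAsync(payload, ct);
        return payload;
    }
}
```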

Envelope Rules

Every message uses WorkerEnvelope:

  • protocol_version must match a supported version.
  • session_id must match the pipe/session.
  • sequence is monotonic per sender.
  • correlation_id links commands and replies.
  • events use either zero or their own event correlation id.
  • protocol faults do not replace MXAccess HRESULT/status details.

The gateway should treat malformed frames, sequence regressions, and wrong session ids as protocol faults and close the session.
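A per-connection validator for these rules might look like the sketch below. EnvelopeHeader and EnvelopeValidator are hypothetical stand-ins for the generated protobuf type and the real validation code. The sketch assumes "monotonic" means strictly increasing, so a repeated sequence number is rejected along with a regression, and it reuses the ProtocolMismatch and ProtocolViolation category names from the fault model.

```csharp
using System;

// Hypothetical projection of the envelope fields that the rules mention.
public sealed record EnvelopeHeader(uint ProtocolVersion, string SessionId, ulong Sequence);

public sealed class EnvelopeValidator
{
    private readonly uint _supportedVersion;
    private readonly string _sessionId;
    private ulong _lastSequence;
    private bool _seenAny;

    public EnvelopeValidator(uint supportedVersion, string sessionId)
    {
        _supportedVersion = supportedVersion;
        _sessionId = sessionId;
    }

    // Returns true when the header is acceptable; otherwise sets a fault category name.
    public bool TryValidate(EnvelopeHeader header, out string fault)
    {
        fault = "";
        if (header.ProtocolVersion != _supportedVersion) { fault = "ProtocolMismatch"; return false; }
        if (!string.Equals(header.SessionId, _sessionId, StringComparison.Ordinal)) { fault = "ProtocolViolation"; return false; }
        if (_seenAny && header.Sequence <= _lastSequence) { fault = "ProtocolViolation"; return false; }
        _seenAny = true;
        _lastSequence = header.Sequence;
        return true;
    }
}
```

One validator instance per sender direction keeps the sequence tracking aligned with the "per sender" rule.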

WorkerClient Design

WorkerClient is the gateway-side object that owns one worker connection.

Current public shape:

public interface IWorkerClient : IAsyncDisposable
{
    string SessionId { get; }
    int? ProcessId { get; }
    WorkerClientState State { get; }
    DateTimeOffset LastHeartbeatAt { get; }

    Task StartAsync(CancellationToken cancellationToken);
    Task<WorkerCommandReply> InvokeAsync(
        WorkerCommand command,
        TimeSpan timeout,
        CancellationToken cancellationToken);
    IAsyncEnumerable<WorkerEvent> ReadEventsAsync(
        CancellationToken cancellationToken);
    Task ShutdownAsync(TimeSpan timeout, CancellationToken cancellationToken);
    void Kill(string reason);
}

Internally it owns:

  • process handle,
  • pipe stream,
  • read loop,
  • write loop,
  • outbound command/control channel serialized by the write loop,
  • bounded inbound event channel,
  • pending command dictionary keyed by correlation id,
  • heartbeat monitor,
  • terminal fault source.

StartAsync sends GatewayHello, verifies the WorkerHello protocol version and nonce, waits for WorkerReady, and only then exposes Ready state. The read loop starts after readiness so the handshake has a single owner for its ordered frames.

Read Loop

The read loop:

  1. Reads one frame.
  2. Parses WorkerEnvelope.
  3. Validates protocol fields.
  4. Dispatches by body type:
    • WorkerCommandReply: completes pending command.
    • WorkerEvent: enqueues event.
    • WorkerHeartbeat: updates heartbeat timestamp.
    • WorkerFault: faults session.
  5. Stops when pipe closes or cancellation is requested.

If the pipe closes while the session is not closing, fault the session.

Write Loop

The write loop serializes all writes to the pipe. No other code should write to the pipe directly.

It handles:

  • GatewayHello,
  • WorkerCommand,
  • WorkerCancel,
  • WorkerShutdown,
  • gateway heartbeat if used.

The write loop should fail the session if a pipe write fails outside normal shutdown.

During shutdown the worker client treats WorkerShutdownAck as the protocol close signal, but the process handle remains authoritative for process lifetime. The client waits for both the protocol close and process exit before reporting a clean shutdown to GatewaySession.

Command Correlation

Each command gets:

  • gateway correlation id,
  • method name,
  • start timestamp,
  • timeout deadline,
  • caller cancellation token,
  • reply completion source.

Pending command handling:

  • Add the pending entry before writing the command.
  • Remove it exactly once when reply, timeout, cancellation, or session fault occurs.
  • If a late reply arrives after cancellation or timeout, log it with the correlation id and discard it.
  • If the session faults, complete all pending commands with a structured fault.

Timeouts should not assume the COM call stopped. A timed-out command may still finish inside the worker.
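The "remove exactly once" rule maps naturally onto a concurrent dictionary, where TryRemove succeeds for only one caller. The sketch below is illustrative: PendingCommands and its members are hypothetical names, and TReply stands in for WorkerCommandReply.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public sealed class PendingCommands<TReply>
{
    private readonly ConcurrentDictionary<ulong, TaskCompletionSource<TReply>> _pending = new();

    // Call before writing the command so a fast reply always finds its entry.
    public Task<TReply> Add(ulong correlationId)
    {
        var tcs = new TaskCompletionSource<TReply>(TaskCreationOptions.RunContinuationsAsynchronously);
        if (!_pending.TryAdd(correlationId, tcs))
            throw new InvalidOperationException($"Duplicate correlation id {correlationId}.");
        return tcs.Task;
    }

    // Returns false for a late reply (entry already removed); the caller logs and discards it.
    public bool TryComplete(ulong correlationId, TReply reply)
        => _pending.TryRemove(correlationId, out var tcs) && tcs.TrySetResult(reply);

    // Used for timeout, cancellation, or a single-command fault.
    public bool TryFail(ulong correlationId, Exception error)
        => _pending.TryRemove(correlationId, out var tcs) && tcs.TrySetException(error);

    // Used when the session faults: every waiter sees the same structured fault.
    public void FailAll(Exception error)
    {
        foreach (var id in _pending.Keys) TryFail(id, error);
    }
}
```

Because TryRemove is the only path to completion, whichever of reply, timeout, cancellation, or session fault happens first wins, and the rest observe a missing entry.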

Fault Model

Fault categories:

  • StartupFailed
  • ProtocolMismatch
  • ProtocolViolation
  • PipeDisconnected
  • WorkerExited
  • HeartbeatExpired
  • CommandTimeout
  • WorkerFaulted
  • GatewayShutdown
  • AuthorizationFailed

Public replies should distinguish:

  • gRPC transport failure,
  • gateway/session failure,
  • worker protocol failure,
  • MXAccess method failure,
  • MXAccess HRESULT/status failure.

Do not hide an MXAccess HRESULT by returning only an RPC error. When MXAccess was reached and returned status, preserve that status in the command reply.

Heartbeats And Leases

Use separate concepts:

  • worker heartbeat: proves the worker process and pipe loop are alive,
  • session lease: proves the client still owns the session,
  • command timeout: bounds one command wait,
  • startup timeout: bounds worker creation,
  • shutdown timeout: bounds graceful stop.

Suggested defaults for early development:

  • startup timeout: 30 seconds,
  • worker heartbeat interval: 5 seconds,
  • heartbeat grace: 15 seconds,
  • default command timeout: 30 seconds,
  • graceful shutdown timeout: 10 seconds,
  • idle session lease: configurable, disabled in local development.

The exact values should be configurable.

Event Delivery

Events flow:

worker MXAccess event
  -> worker outbound event queue
  -> worker pipe writer
  -> gateway read loop
  -> worker client event queue
  -> EventStreamService bounded stream queue
  -> gRPC StreamEvents

The gateway should record:

  • worker event sequence,
  • gateway receive sequence,
  • worker timestamp,
  • gateway receive timestamp,
  • stream send timestamp if needed for diagnostics.

Default backpressure policy for parity testing should be fail-fast:

  1. If the worker client event queue fills, fault the worker client.
  2. If the public stream queue fills, fault the gateway session.
  3. Preserve the overflow details in logs and metrics.
  4. Do not silently drop data-change events.
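The fail-fast policy can be sketched with a bounded channel whose writer never blocks: a failed TryWrite is treated as overflow instead of waiting or dropping. FailFastEventQueue is a hypothetical name, and completing the channel with an exception is one possible way to surface the fault to the reader; the real EventStreamService may signal it differently.

```csharp
using System;
using System.Threading.Channels;

public sealed class FailFastEventQueue<T>
{
    private readonly Channel<T> _channel;

    public FailFastEventQueue(int capacity)
    {
        // FullMode.Wait makes TryWrite return false when full rather than dropping items.
        _channel = Channel.CreateBounded<T>(new BoundedChannelOptions(capacity)
        {
            FullMode = BoundedChannelFullMode.Wait,
            SingleReader = true,
        });
    }

    public ChannelReader<T> Reader => _channel.Reader;

    public long OverflowCount { get; private set; }

    // Returns false on overflow; the caller faults the worker client or session.
    public bool TryEnqueue(T item)
    {
        if (_channel.Writer.TryWrite(item)) return true;
        OverflowCount++;
        _channel.Writer.TryComplete(new InvalidOperationException("EventQueueOverflow"));
        return false;
    }
}
```

Items already queued remain readable in order after the overflow, so the subscriber can drain what was accepted before it observes the fault. OverflowCount is updated without synchronization, which assumes a single writer such as the read loop.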

Do not set a production event-rate target before measurement. GatewayMetrics records received event counts by family, queue depth, stream disconnects, and overflow counts. Later production modes may support explicit coalescing by item handle as an opt-in behavior.

The gateway should not synthesize OperationComplete from write completion, command replies, ASB completion queues, or completion-only status frames. Forward OperationComplete only when the worker reports the native MXAccess public event.

Security

Public API

Use API key authentication for v1. Store API keys in a gateway-owned SQLite database, but store only hashed key secrets. Clients should send keys in gRPC metadata using:

authorization: Bearer mxgw_<key-id>_<secret>

The gateway should split the key into a stable key id and secret component, load the key record by id, hash the presented secret, and compare using a constant-time comparison.

ApiKeyParser accepts only authorization: Bearer mxgw_<key-id>_<secret>. Malformed headers fail before any database lookup. The parsed raw secret is kept only long enough for ApiKeySecretHasher to compute an HMAC-SHA256 hash with the pepper that is looked up in application configuration under the name given by Authentication:PepperSecretName. The raw secret is not stored in the auth database, identity model, logs, or verification result.

ApiKeyVerifier loads the stored key record by key id, rejects revoked keys, hashes the presented secret, and compares the stored and presented hashes with CryptographicOperations.FixedTimeEquals. A successful verification returns an ApiKeyIdentity with key id, key prefix, display name, and scopes. Failure results distinguish malformed credentials, missing keys, revoked keys, missing pepper configuration, and hash mismatch for internal authorization handling.
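The parse, hash, and compare steps can be sketched as below. ApiKeySketch and its members are hypothetical names rather than the real ApiKeyParser or ApiKeySecretHasher. Two assumptions are made: the key id contains no underscore, so the first underscore after the mxgw_ prefix separates id from secret, and the pepper serves as the HMAC key.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class ApiKeySketch
{
    private const string Scheme = "Bearer ";
    private const string Prefix = "mxgw_";

    // Splits "Bearer mxgw_<key-id>_<secret>"; fails without touching any store.
    public static bool TryParse(string header, out string keyId, out string secret)
    {
        keyId = "";
        secret = "";
        if (header is null || !header.StartsWith(Scheme + Prefix, StringComparison.Ordinal)) return false;
        var rest = header.Substring(Scheme.Length + Prefix.Length);
        var split = rest.IndexOf('_');
        if (split <= 0 || split == rest.Length - 1) return false; // empty id or empty secret
        keyId = rest.Substring(0, split);
        secret = rest.Substring(split + 1);
        return true;
    }

    // HMAC-SHA256 of the secret, keyed by the pepper (assumption: pepper is the HMAC key).
    public static byte[] HashSecret(string secret, string pepper)
        => HMACSHA256.HashData(Encoding.UTF8.GetBytes(pepper), Encoding.UTF8.GetBytes(secret));

    // Constant-time comparison of the stored hash with the hash of the presented secret.
    public static bool Matches(byte[] storedHash, string presentedSecret, string pepper)
        => CryptographicOperations.FixedTimeEquals(storedHash, HashSecret(presentedSecret, pepper));
}
```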

GatewayGrpcAuthorizationInterceptor enforces this authentication model for public gRPC calls. Missing, malformed, revoked, unknown, or mismatched keys fail with Unauthenticated. Authenticated calls missing the scope required by the RPC fail with PermissionDenied. The interceptor applies to unary calls and server-streaming calls and stores the authenticated ApiKeyIdentity in IGatewayRequestIdentityAccessor for the duration of the request handler. Authentication:Mode set to Disabled bypasses API-key verification for local development only.

Dashboard authentication reuses the API-key verifier and scope model. The dashboard login endpoint accepts the key in a form post, checks admin scope when Dashboard:RequireAdminScope is enabled, and signs in with the MxGateway.Dashboard cookie scheme. The cookie is HTTP-only, secure, strict SameSite, and scoped with the __Host-MxGatewayDashboard name. Logout clears that cookie. Login and logout posts use anti-forgery validation, and dashboard API keys are not accepted in query strings. Dashboard:AllowAnonymousLocalhost allows only loopback requests to bypass the dashboard cookie requirement and defaults to true.

Recommended scopes:

  • session:open
  • session:close
  • invoke:read
  • invoke:write
  • invoke:secure
  • events:read
  • metadata:read
  • admin

If the gateway is exposed outside the local machine, use TLS. Do not log raw API keys or raw credential-bearing MXAccess values.

API key administration for v1 should be a local CLI/tool rather than a public gRPC admin API. It should initialize the auth database, create keys, list keys without secrets, revoke keys, rotate keys, and print raw secrets only once at creation.

MxGateway.Server exposes local API-key administration as an apikey subcommand before the web host starts:

MxGateway.Server apikey init-db --sqlite-path C:\ProgramData\MxGateway\gateway-auth.db
MxGateway.Server apikey create-key --key-id operator01 --display-name Operator --scopes session:open,events:read
MxGateway.Server apikey list-keys --json
MxGateway.Server apikey revoke-key --key-id operator01
MxGateway.Server apikey rotate-key --key-id operator01 --json

The subcommands accept --sqlite-path, --pepper, and --json. --pepper sets the local MxGateway:ApiKeyPepper configuration value for the command process; deployments should normally provide the pepper through the configured secret source. create-key and rotate-key print the full raw API key exactly once. list-keys never prints raw secrets or secret_hash values.

SQLite auth storage should use startup migrations with a schema_version table. Migrations should run inside transactions and fail startup if the database schema is newer than the running binary understands.

The v1 auth store uses Microsoft.Data.Sqlite and creates the schema_version, api_keys, and api_key_audit tables through SqliteAuthStoreMigrator. AuthStoreMigrationHostedService runs those migrations at gateway startup when API-key authentication and Authentication:RunMigrationsOnStartup are enabled. A database with a newer schema version fails startup instead of being modified by an older gateway binary.

IApiKeyStore reads stored key records and exposes an active-key lookup that excludes rows with revoked_utc set. Hash verification belongs to the API-key hashing layer, but the store preserves the secret_hash bytes, display name, scopes, timestamps, and revocation state needed by that layer.

IApiKeyAuditStore appends audit events to api_key_audit and returns recent events for diagnostics and future administrative tools. Audit records store key ids and event metadata only; they do not store raw API key secrets.

Commands requiring authorization:

  • writes,
  • secured writes,
  • authentication commands,
  • worker shutdown diagnostics,
  • metadata queries if they expose sensitive plant structure.

Current gRPC scope mapping:

  • OpenSession requires session:open.
  • CloseSession requires session:close.
  • StreamEvents and DrainEvents require events:read.
  • read-style MXAccess commands such as Register, AddItem, Advise, and Ping require invoke:read.
  • Write and Write2 require invoke:write.
  • WriteSecured, WriteSecured2, and AuthenticateUser require invoke:secure.
  • metadata commands such as ArchestrAUserToId, GetSessionState, and GetWorkerInfo require metadata:read.
  • ShutdownWorker requires admin.
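For Invoke, the mapping above could be expressed as a lookup from command method to scope. ScopeMapSketch is a hypothetical name, the method names are taken from the list as plain strings, and rejecting unknown methods (deny by default) is an assumption rather than documented behavior.

```csharp
using System;

public static class ScopeMapSketch
{
    // Maps an Invoke command method name to the scope the caller must hold.
    public static string RequiredScopeForInvoke(string method) => method switch
    {
        "Register" or "AddItem" or "Advise" or "Ping" => "invoke:read",
        "Write" or "Write2" => "invoke:write",
        "WriteSecured" or "WriteSecured2" or "AuthenticateUser" => "invoke:secure",
        "DrainEvents" => "events:read",
        "ArchestrAUserToId" or "GetSessionState" or "GetWorkerInfo" => "metadata:read",
        "ShutdownWorker" => "admin",
        // Assumption: unmapped methods are rejected instead of falling back to a default scope.
        _ => throw new ArgumentOutOfRangeException(nameof(method), method, "No scope mapping."),
    };
}
```

Keeping the mapping in one place makes it straightforward to test that every public command has an explicit scope.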

Worker IPC

Named pipes should be local only. Pipe ACLs should restrict access to:

  • the gateway process identity,
  • the launched worker identity,
  • administrators only when operationally required.

The worker must validate GatewayHello and the nonce before creating MXAccess.

Observability

Use structured logs with these fields where applicable:

  • session id,
  • client identity,
  • worker process id,
  • pipe name hash or suffix,
  • protocol version,
  • correlation id,
  • command method,
  • MXAccess HRESULT,
  • MXAccess status summary,
  • event family,
  • event sequence,
  • queue depth,
  • elapsed milliseconds.

Metrics:

  • open sessions,
  • workers running,
  • worker startup latency,
  • command latency by method,
  • command failures by method and category,
  • event rate by session and family,
  • event queue depth,
  • worker exits by reason,
  • worker kills,
  • heartbeat failures,
  • gRPC stream disconnects.

Do not log credential values or full tag values by default.

The gateway registers GatewayMetrics as the in-process metrics foundation. It emits .NET Meter instruments for collectors and keeps a GatewayMetricsSnapshot for dashboard projection. The snapshot exists because the dashboard needs current counters and queue depths without depending on a specific metrics exporter.

HTTP request handling uses UseGatewayRequestLoggingScope() to attach common structured log fields when request metadata is present:

  • SessionId,
  • ClientIdentity,
  • WorkerProcessId,
  • CorrelationId,
  • CommandMethod.

GatewayLogRedactor redacts API key secrets and command values before they are added to log state. Value logging remains opt-in and redacted by default so secured writes, authentication commands, and ordinary tag values do not leak through diagnostics.

Configuration

Suggested configuration shape:

{
  "MxGateway": {
    "Authentication": {
      "Mode": "ApiKey",
      "SqlitePath": "C:\\ProgramData\\MxGateway\\gateway-auth.db",
      "PepperSecretName": "MxGateway:ApiKeyPepper",
      "RunMigrationsOnStartup": true
    },
    "Worker": {
      "ExecutablePath": "src/MxGateway.Worker/bin/x86/Release/MxGateway.Worker.exe",
      "StartupTimeoutSeconds": 30,
      "StartupProbeRetryAttempts": 3,
      "StartupProbeRetryDelayMilliseconds": 250,
      "PipeConnectAttemptTimeoutMilliseconds": 2000,
      "ShutdownTimeoutSeconds": 10,
      "HeartbeatIntervalSeconds": 5,
      "HeartbeatGraceSeconds": 15,
      "MaxMessageBytes": 16777216
    },
    "Sessions": {
      "DefaultCommandTimeoutSeconds": 30,
      "MaxSessions": 64,
      "AllowMultipleEventSubscribers": false
    },
    "Events": {
      "QueueCapacity": 10000,
      "BackpressurePolicy": "FailFast"
    },
    "Dashboard": {
      "Enabled": true,
      "PathBase": "/dashboard",
      "RequireAdminScope": true,
      "AllowAnonymousLocalhost": true,
      "SnapshotIntervalMilliseconds": 1000,
      "RecentFaultLimit": 100,
      "RecentSessionLimit": 200,
      "ShowTagValues": false
    }
  }
}

Do not scatter connection or path constants through implementation code.

MxGateway.Server binds this section to GatewayOptions at startup and registers validation with ValidateOnStart(). Startup fails before the gateway begins serving traffic when required authentication settings are missing, timeouts or queue sizes are not positive, dashboard settings are malformed, or the configured worker protocol version does not match the contract version.

The gateway exposes read-only effective settings through IGatewayConfigurationProvider. This projection is for dashboard settings and diagnostics, so it redacts secret-related fields such as Authentication:PepperSecretName and does not include raw API keys or key material.

Galaxy Repository Metadata

Galaxy hierarchy and tag metadata can be discovered through SQL Server when needed for browse or diagnostics. The current notes live outside this repo at:

C:\Users\dohertj2\Desktop\lmxopcua\gr

Use SQL metadata as discovery data. It does not replace MXAccess-backed runtime behavior unless an explicit non-parity backend is designed.

Testing Strategy

Gateway tests should be able to run without installed MXAccess by using fake workers and fake transports.

Use FakeWorkerHarness for tests that need real gateway-to-worker framing, handshake, command, event, fault, or malformed-protocol behavior without loading MXAccess COM. See Gateway Testing for the harness scope and focused test commands.

Focused tests:

  • session state transitions,
  • gRPC API-key authentication for unary and streaming calls,
  • gRPC scope mapping for sessions, invokes, events, metadata, and admin commands,
  • worker startup failures,
  • protocol version mismatch,
  • malformed frame handling,
  • pending command completion,
  • command timeout and late reply handling,
  • worker crash handling,
  • event ordering,
  • event queue overflow,
  • CloseSession idempotency,
  • gRPC mapping for command replies and faults,
  • dashboard snapshot projection,
  • dashboard auth decisions,
  • dashboard redaction,
  • dashboard realtime subscription disposal.

Integration tests with the real worker should be separated from unit tests and clearly marked because they require Windows, .NET Framework worker output, and eventually installed MXAccess COM.

Initial Implementation Slice

The first gateway slice should implement:

  1. Host startup and configuration binding.
  2. SQLite auth database initialization and migrations.
  3. Local API-key administration CLI/tool.
  4. API-key authentication and scope checks.
  5. OpenSession.
  6. Worker process launch.
  7. Named-pipe handshake.
  8. Invoke for Register, AddItem, and Advise.
  9. StreamEvents with one subscriber per session.
  10. CloseSession.
  11. Worker crash and startup failure handling.
  12. Event-rate, queue-depth, and overflow metrics.
  13. Blazor Server dashboard with Bootstrap assets.
  14. Dashboard home, sessions, and workers pages.
  15. Dashboard realtime snapshot refresh.
  16. Dashboard API-key login with admin-scope check.
  17. Basic structured logs.

This proves the process model before the full command surface is implemented.