d692232191
EventsHub publisher (closes the v2.1 follow-up flagged in the previous commit)

EventStreamService now mirrors every MxEvent it forwards to a gRPC client
into the `EventsHub` group for the session. The fan-out goes through a new
singleton `IDashboardEventBroadcaster`:

* IDashboardEventBroadcaster: abstraction so EventStreamService doesn't
  take a direct dependency on SignalR.
* DashboardEventBroadcaster: singleton implementation that hands the
  SendAsync to IHubContext<EventsHub> as fire-and-forget. Errors are
  logged at debug and dropped so the source gRPC stream is never
  blocked.

EventStreamService now takes IDashboardEventBroadcaster as a ctor parameter
and calls Publish(sessionId, publicEvent) once per event after sequence
filtering, before the bounded queue write. Test fixtures and the live
integration harness pass NullDashboardEventBroadcaster.Instance so the
broadcaster is a no-op in unit tests.

SessionDetailsPage adds a "Recent events" panel:

* implements IAsyncDisposable
* opens a second HubConnection via DashboardHubConnectionFactory targeting
  /hubs/events
* calls SubscribeSession(SessionId) on Start
* renders the most recent 50 events in a small table (worker seq, family,
  server/item handle, alarm reference when the event is OnAlarmTransition)
* shows a live/offline conn-pill driven by HubConnection.Closed /
  Reconnected events

The dashboard mirror is intentionally passive: events appear only while a
gRPC client is also consuming that session's events. Documented as such in
the empty-state copy and in GatewayDashboardDesign.md.

Documentation refresh

Every doc that referenced the retired options (PathBase, RequireAdminScope,
RequiredGroup) and the old API-key-cookie auth flow is updated to describe
the new model:

* CLAUDE.md: Authentication section now explains LDAP bind +
  GroupToRole + HubToken bearer flow.
* gateway.md: Dashboard section: root-mounted routes, snapshot/alarms/
  events SignalR hubs, LDAP cookie + bearer scheme.
* docs/GatewayConfiguration.md: drop PathBase / RequireAdminScope rows,
  add GroupToRole row, append "Authorization policies" and "SignalR hubs"
  subsections describing the three policies and the /hubs/* endpoints.
* docs/GatewayDashboardDesign.md: hosting model (root mount, new
  endpoint layout), Realtime Updates rewritten as a hub table
  (DashboardSnapshotHub / AlarmsHub / EventsHub with producers, payloads,
  and routing), Authentication And Authorization rewritten around LDAP +
  role mapping + the hub bearer flow, Configuration block updated.
* docs/GatewayProcessDesign.md: security-section dashboard paragraph
  and the example config block both refreshed to LDAP/role auth.
* docs/ImplementationPlanGateway.md: dashboard-auth deliverable list
  updated (LDAP bind + GroupToRole + /hubs/token bearer mint replace the
  API-key login flow).
* docs/GatewayTesting.md: DashboardLdapLiveTests blurb describes the
  GroupToRole fixture (`{ GwAdmin: Admin }`) instead of the retired
  RequiredGroup default; success-path assertion explains the role-claim
  check.

Verification: 475 server tests, 275 worker tests (+ 9 dev-rig skips), 18
integration tests (live MxAccess + LDAP + Galaxy) all pass, including the
live worker smoke test fixture that now constructs EventStreamService with
the new broadcaster parameter.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

530 lines
12 KiB
Markdown
# Gateway Implementation Plan

This plan implements the .NET 10 gateway process first. It covers contracts,
configuration, API-key authentication, worker lifecycle, gRPC APIs, event
streaming, metrics, dashboard, tests, and operational hooks.

Primary designs:

- `docs/GatewayProcessDesign.md`
- `docs/GatewayDashboardDesign.md`
- `docs/DesignDecisions.md`
- `docs/ToolchainLinks.md`

## Milestone: gateway-foundation

Goal: create the solution, shared contracts, configuration model, logging, and
test scaffolding that all later work depends on.

### Issue: Scaffold Gateway Solution And Projects

Labels: `area:gateway`, `type:infra`, `priority:p0`

Deliverables:

- create `src/ZB.MOM.WW.MxGateway.slnx`,
- create `src/ZB.MOM.WW.MxGateway.Contracts`,
- create `src/ZB.MOM.WW.MxGateway.Server`,
- create `src/ZB.MOM.WW.MxGateway.Tests`,
- create `src/ZB.MOM.WW.MxGateway.IntegrationTests`,
- target `ZB.MOM.WW.MxGateway.Server` to `net10.0`,
- add shared C# build settings in `Directory.Build.props`,
- add baseline tests.

Acceptance criteria:

- `dotnet build src/ZB.MOM.WW.MxGateway.slnx` succeeds,
- `dotnet test src/ZB.MOM.WW.MxGateway.slnx` succeeds,
- gateway project does not reference MXAccess COM.

### Issue: Define Protobuf Contracts

Labels: `area:contracts`, `type:feature`, `priority:p0`

Deliverables:

- `src/ZB.MOM.WW.MxGateway.Contracts/Protos/mxaccess_gateway.proto`,
- `src/ZB.MOM.WW.MxGateway.Contracts/Protos/mxaccess_worker.proto`,
- `MxAccessGateway` service with `OpenSession`, `CloseSession`, `Invoke`, and
  `StreamEvents`,
- `WorkerEnvelope` and worker IPC messages,
- `MxValue`, `MxArray`, `MxStatusProxy`, `MxEvent`, and first-slice command
  payloads,
- generated C# code.

Acceptance criteria:

- generated code builds,
- worker envelopes include protocol version, session id, sequence, and
  correlation id,
- command replies preserve protocol status, HRESULT, return value, out params,
  and status arrays.

Tests:

- protobuf generation smoke,
- serialization round-trip for command, reply, event, value, and status.

### Issue: Add Gateway Configuration And Validation

Labels: `area:gateway`, `type:feature`, `priority:p0`

Deliverables:

- typed options for authentication, worker, sessions, events, dashboard, and
  protocol,
- startup validation,
- defaults matching design docs,
- redacted effective-configuration model.

Acceptance criteria:

- invalid worker path, invalid queue capacity, invalid auth config, and invalid
  dashboard config fail startup clearly,
- redacted config never includes API key pepper or raw secrets.

Tests:

- options binding,
- validation,
- redaction.

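The validation and redaction criteria above can be illustrated with a small sketch. It is written in Python purely for brevity (the gateway itself targets .NET), and the marker list, the `***` mask, and the `Worker:ExecutablePath` / `Events:QueueCapacity` key names are assumptions made for this example rather than the project's actual option names.

```python
from typing import Any

# Hypothetical marker list: key names treated as secret in this sketch.
SECRET_MARKERS = ("pepper", "secret", "password", "apikey", "token")
REDACTED = "***"


def _is_secret(key: str) -> bool:
    lowered = key.lower().replace("_", "").replace("-", "")
    return any(marker in lowered for marker in SECRET_MARKERS)


def redact(config: Any) -> Any:
    """Return a deep copy of config with secret-looking values masked."""
    if isinstance(config, dict):
        return {
            key: REDACTED if _is_secret(key) else redact(value)
            for key, value in config.items()
        }
    if isinstance(config, list):
        return [redact(item) for item in config]
    return config


def validate(config: dict) -> list[str]:
    """Collect every startup validation error instead of stopping at the first."""
    errors = []
    if not config.get("Worker", {}).get("ExecutablePath"):
        errors.append("Worker:ExecutablePath is required")
    if config.get("Events", {}).get("QueueCapacity", 0) <= 0:
        errors.append("Events:QueueCapacity must be positive")
    return errors
```

Collecting all errors before failing is one way to make a bad configuration "fail startup clearly"; masking by key name keeps the redacted model safe to show on a settings page even when new secret fields are added later.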
### Issue: Add Structured Logging And Metrics Foundation

Labels: `area:gateway`, `type:infra`, `priority:p0`

Deliverables:

- logging scopes for session id, worker process id, correlation id, command
  method, and client identity,
- counters/gauges/histograms for sessions, workers, commands, events, queues,
  and faults,
- redaction helpers.

Acceptance criteria:

- common logs include correlation fields,
- API keys and credential-bearing values are not logged,
- metrics can feed dashboard snapshots.

Tests:

- log redaction,
- metric update tests.

## Milestone: gateway-auth

Goal: implement API-key authentication backed by SQLite.

### Issue: Implement SQLite Auth Store And Migrations

Labels: `area:auth`, `type:feature`, `priority:p0`

Deliverables:

- SQLite schema for `schema_version`, `api_keys`, and `api_key_audit`,
- idempotent startup migrations,
- newer-schema startup block,
- key lookup and audit services.

Acceptance criteria:

- empty DB initializes,
- existing DB migrates,
- newer DB version blocks startup,
- revoked keys cannot authenticate.

Tests:

- temp SQLite migration tests,
- key lookup tests,
- revoked key tests.

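As a rough illustration of idempotent migrations with a newer-schema block, the sketch below uses Python's standard `sqlite3` module. Only the three table names come from the deliverables above; the column layout and the single-entry `MIGRATIONS` list are invented for the example.

```python
import sqlite3

# Hypothetical migration list for this sketch; index + 1 is the schema version.
MIGRATIONS = [
    """
    CREATE TABLE api_keys (
        key_id TEXT PRIMARY KEY,
        display_name TEXT NOT NULL,
        secret_hash BLOB NOT NULL,
        scopes TEXT NOT NULL,
        revoked_utc TEXT
    );
    CREATE TABLE api_key_audit (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        key_id TEXT NOT NULL,
        action TEXT NOT NULL,
        occurred_utc TEXT NOT NULL
    );
    """,
]


def migrate(conn: sqlite3.Connection) -> int:
    """Apply pending migrations; refuse to run against a newer schema."""
    conn.execute("CREATE TABLE IF NOT EXISTS schema_version (version INTEGER NOT NULL)")
    row = conn.execute("SELECT MAX(version) FROM schema_version").fetchone()
    current = row[0] or 0  # MAX() is NULL on an empty table
    if current > len(MIGRATIONS):
        raise RuntimeError(
            f"database schema {current} is newer than supported {len(MIGRATIONS)}"
        )
    for version in range(current + 1, len(MIGRATIONS) + 1):
        conn.executescript(MIGRATIONS[version - 1])
        conn.execute("INSERT INTO schema_version (version) VALUES (?)", (version,))
        conn.commit()
    return len(MIGRATIONS)
```

Calling `migrate` twice on the same database is a no-op the second time, which is what "idempotent startup migrations" asks for. The sketch does not make each step atomic; a production version would likely wrap the schema change and the version insert in one transaction.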
### Issue: Implement API Key Hashing And Verification

Labels: `area:auth`, `type:feature`, `priority:p0`

Deliverables:

- parse `mxgw_<key-id>_<secret>` format,
- HMAC-SHA256 with gateway-local pepper or accepted Argon2id dependency,
- constant-time hash comparison,
- key id/display name/scopes identity model.

Acceptance criteria:

- raw secrets are never stored,
- malformed keys fail unauthenticated,
- valid keys authenticate,
- revoked keys fail.

Tests:

- parse tests,
- hash verification,
- redaction,
- scope extraction.

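The parse, hash, and compare steps listed above can be sketched with Python's standard `hmac` and `hashlib` modules. This covers only the HMAC-SHA256 option, not Argon2id. It assumes the key id contains no underscore (so the first two underscores delimit the fields) and it leaves out the store lookup by key id and the revocation check; the function names are hypothetical.

```python
import hashlib
import hmac
from typing import NamedTuple, Optional


class ParsedKey(NamedTuple):
    key_id: str
    secret: str


def parse_api_key(raw: str) -> Optional[ParsedKey]:
    """Split mxgw_<key-id>_<secret>; return None for malformed input."""
    parts = raw.split("_", 2)
    if len(parts) != 3 or parts[0] != "mxgw" or not parts[1] or not parts[2]:
        return None
    return ParsedKey(key_id=parts[1], secret=parts[2])


def hash_secret(secret: str, pepper: bytes) -> bytes:
    """HMAC-SHA256 of the secret, keyed by the gateway-local pepper."""
    return hmac.new(pepper, secret.encode("utf-8"), hashlib.sha256).digest()


def verify(raw: str, stored_hash: bytes, pepper: bytes) -> bool:
    """Check a presented key against the stored hash in constant time."""
    parsed = parse_api_key(raw)
    if parsed is None:
        return False  # malformed keys fail as unauthenticated
    candidate = hash_secret(parsed.secret, pepper)
    return hmac.compare_digest(candidate, stored_hash)
```

Only the output of `hash_secret` would be persisted, so the raw secret never reaches the database. `hmac.compare_digest` avoids the early-exit timing behaviour of `==` on byte strings.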
### Issue: Implement Local API Key Admin CLI

Labels: `area:auth`, `type:feature`, `priority:p1`

Deliverables:

- local admin CLI or gateway subcommand,
- `init-db`,
- `create-key`,
- `list-keys`,
- `revoke-key`,
- `rotate-key`,
- JSON output option.

Acceptance criteria:

- created key can authenticate,
- listed keys never show raw secret,
- revoked key fails authentication,
- raw secret is printed exactly once on create/rotate.

Tests:

- CLI parser,
- temp DB command tests,
- JSON redaction.

### Issue: Add gRPC Authentication And Scope Authorization

Labels: `area:auth`, `area:gateway`, `type:feature`, `priority:p0`

Deliverables:

- gRPC auth middleware/interceptor,
- request identity context,
- scope checks for sessions, invoke, secure invoke, events, metadata, and
  admin actions.

Acceptance criteria:

- missing/invalid key returns unauthenticated,
- valid key with missing scope returns permission denied,
- auth applies to unary and streaming calls.

Tests:

- unary auth,
- streaming auth,
- scope mapping.

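The distinction between "unauthenticated" and "permission denied" in the criteria above comes down to a two-step decision. The following sketch shows that decision as a pure function; the scope strings and the method-to-scope table are placeholders invented for the example, not the gateway's real scope names.

```python
from enum import Enum
from typing import Optional


class Decision(Enum):
    OK = "ok"
    UNAUTHENTICATED = "unauthenticated"
    PERMISSION_DENIED = "permission_denied"


# Hypothetical scope names; the real mapping is defined by the gateway design.
REQUIRED_SCOPE = {
    "OpenSession": "sessions",
    "CloseSession": "sessions",
    "Invoke": "invoke",
    "StreamEvents": "events",
}


def authorize(method: str, scopes: Optional[frozenset]) -> Decision:
    """Decide a call; scopes is None when the key was missing or invalid."""
    if scopes is None:
        return Decision.UNAUTHENTICATED
    required = REQUIRED_SCOPE.get(method)
    if required is None or required not in scopes:
        # Unknown methods are denied by default rather than allowed.
        return Decision.PERMISSION_DENIED
    return Decision.OK
```

Keeping the decision independent of the transport makes it straightforward to apply the same check from both unary and streaming interceptors.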
## Milestone: gateway-sessions-ipc

Goal: create, supervise, and communicate with per-session workers.

### Issue: Implement Worker Frame Protocol

Labels: `area:gateway`, `area:contracts`, `type:feature`, `priority:p0`

Deliverables:

- little-endian uint32 length-prefixed frame reader/writer,
- max message size enforcement,
- protobuf envelope validation,
- protocol violation errors.

Acceptance criteria:

- valid frames round-trip,
- partial reads are handled,
- oversized frames fail before allocation,
- wrong protocol/session id is detected.

Tests:

- round-trip,
- partial read,
- malformed length,
- max size,
- wrong protocol/session.

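A minimal sketch of the framing described above, using Python's `struct` module with the `<I` format (little-endian unsigned 32-bit). The 4 MiB limit and the `ProtocolViolation` exception name are assumptions for this example; envelope validation is out of scope here.

```python
import io
import struct
from typing import BinaryIO

MAX_FRAME_BYTES = 4 * 1024 * 1024  # hypothetical limit for this sketch


class ProtocolViolation(Exception):
    pass


def write_frame(stream: BinaryIO, payload: bytes) -> None:
    if len(payload) > MAX_FRAME_BYTES:
        raise ProtocolViolation("payload exceeds max frame size")
    stream.write(struct.pack("<I", len(payload)))
    stream.write(payload)


def _read_exact(stream: BinaryIO, count: int) -> bytes:
    """Loop until count bytes arrive; a single read may return fewer."""
    chunks = []
    remaining = count
    while remaining > 0:
        chunk = stream.read(remaining)
        if not chunk:
            raise ProtocolViolation("stream closed mid-frame")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def read_frame(stream: BinaryIO) -> bytes:
    (length,) = struct.unpack("<I", _read_exact(stream, 4))
    if length > MAX_FRAME_BYTES:
        # Reject before allocating a buffer for the payload.
        raise ProtocolViolation(f"frame length {length} exceeds limit")
    return _read_exact(stream, length)


buffer = io.BytesIO()
write_frame(buffer, b"hello")
buffer.seek(0)
print(read_frame(buffer))  # b'hello'
```

Checking the declared length before reading the payload is what lets an oversized frame fail before allocation, and the `_read_exact` loop is the part that handles partial reads from a pipe.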
### Issue: Implement Worker Process Launcher

Labels: `area:gateway`, `type:feature`, `priority:p0`

Deliverables:

- worker executable validation,
- process launch with session id, pipe name, protocol version,
- nonce via environment,
- startup timeout handling,
- failed-startup cleanup.

Acceptance criteria:

- command line contains no secrets,
- nonce is not logged,
- failed startup kills worker and disposes pipe,
- process id is recorded.

Tests:

- fake worker success/failure,
- timeout kill,
- command-line redaction.

### Issue: Implement Gateway WorkerClient

Labels: `area:gateway`, `type:feature`, `priority:p0`

Deliverables:

- named-pipe server,
- `GatewayHello`/`WorkerHello` handshake,
- read loop,
- write loop,
- pending command dictionary,
- event channel,
- heartbeat tracking,
- terminal fault handling.

Acceptance criteria:

- worker ready establishes `Ready` state,
- command reply completes matching pending command,
- worker events enter channel in order,
- pipe disconnect faults session.

Tests:

- fake worker protocol,
- command correlation,
- late reply,
- pipe disconnect,
- heartbeat expiration.

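The pending command dictionary and its late-reply and disconnect behaviour can be sketched with `asyncio` futures. This is an illustration of the correlation pattern only; the class and method names are hypothetical, and the read loop, write loop, and heartbeat tracking are omitted.

```python
import asyncio


class PendingCommands:
    """Tracks in-flight commands by correlation id."""

    def __init__(self) -> None:
        self._pending: dict[str, asyncio.Future] = {}

    def register(self, correlation_id: str) -> asyncio.Future:
        """Record a command before it is written to the pipe."""
        future = asyncio.get_running_loop().create_future()
        self._pending[correlation_id] = future
        return future

    def complete(self, correlation_id: str, reply: object) -> bool:
        """Return False for a late or unknown reply so the caller can drop it."""
        future = self._pending.pop(correlation_id, None)
        if future is None or future.done():
            return False
        future.set_result(reply)
        return True

    def fault_all(self, error: Exception) -> None:
        """On pipe disconnect, fail every waiter instead of leaving it hanging."""
        pending, self._pending = self._pending, {}
        for future in pending.values():
            if not future.done():
                future.set_exception(error)
```

In this arrangement the read loop would call `complete` for each reply envelope and `fault_all` when the pipe closes, so callers awaiting a reply observe the terminal fault rather than a timeout.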
### Issue: Implement Session Manager And Registry

Labels: `area:gateway`, `type:feature`, `priority:p0`

Deliverables:

- session state machine,
- registry keyed by session id,
- `OpenSession` orchestration,
- `CloseSession` idempotency,
- lease hooks,
- gateway shutdown cleanup.

Acceptance criteria:

- only `Ready` sessions accept commands,
- close is idempotent,
- faulted sessions reject new commands,
- shutdown terminates workers.

Tests:

- state transitions,
- close idempotency,
- open failure cleanup,
- shutdown cleanup.

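The acceptance criteria above imply a small state machine. The sketch below is one possible shape: only the `Ready` state is named in this plan, so `STARTING`, `FAULTED`, and `CLOSED`, along with the method names, are assumptions made for illustration.

```python
from enum import Enum, auto


class SessionState(Enum):
    STARTING = auto()
    READY = auto()
    FAULTED = auto()
    CLOSED = auto()


class Session:
    def __init__(self, session_id: str) -> None:
        self.session_id = session_id
        self.state = SessionState.STARTING

    def mark_ready(self) -> None:
        # Only a starting session may become ready.
        if self.state is SessionState.STARTING:
            self.state = SessionState.READY

    def fault(self) -> None:
        # A closed session stays closed; anything else becomes faulted.
        if self.state is not SessionState.CLOSED:
            self.state = SessionState.FAULTED

    def ensure_accepts_commands(self) -> None:
        if self.state is not SessionState.READY:
            raise RuntimeError(f"session {self.session_id} is {self.state.name}")

    def close(self) -> bool:
        """Return True only for the call that performed the close."""
        if self.state is SessionState.CLOSED:
            return False
        self.state = SessionState.CLOSED
        return True
```

Returning a flag from `close` lets the caller run worker cleanup exactly once while still treating repeated `CloseSession` calls as successful.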
## Milestone: gateway-grpc-events-dashboard

Goal: expose the public API, stream events, and provide the dashboard.

### Issue: Implement Public gRPC Service

Labels: `area:gateway`, `type:feature`, `priority:p0`

Deliverables:

- `MxAccessGatewayService`,
- `OpenSession`,
- `CloseSession`,
- `Invoke`,
- `StreamEvents`,
- request validation,
- public-to-worker mappers.

Acceptance criteria:

- missing session fails clearly,
- method-specific payloads map correctly,
- HRESULT/status survives in replies,
- transport errors are separate from command replies.

Tests:

- service unit tests,
- mapper tests,
- validation tests,
- reply/error mapping.

### Issue: Implement Event Streaming And Backpressure

Labels: `area:gateway`, `type:feature`, `priority:p0`

Deliverables:

- one active subscriber per session,
- second-subscriber rejection,
- ordered event streaming,
- fail-fast queue overflow,
- terminal fault propagation,
- event-rate metrics.

Acceptance criteria:

- event order preserved,
- stream cancellation detaches subscriber,
- queue overflow faults session,
- `OperationComplete` is not synthesized by gateway.

Tests:

- order,
- single-subscriber enforcement,
- cancellation,
- overflow.

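The single-subscriber and fail-fast overflow rules above can be sketched with a bounded `queue.Queue`. The class and exception names are hypothetical, and the sketch is synchronous for brevity; the actual service would use whatever bounded channel the gateway's runtime provides.

```python
import queue


class SessionFaulted(Exception):
    pass


class EventStream:
    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            # queue.Queue treats maxsize <= 0 as unbounded, which would hide overflow.
            raise ValueError("capacity must be positive")
        self._queue: queue.Queue = queue.Queue(maxsize=capacity)
        self._has_subscriber = False
        self.faulted = False

    def attach(self) -> None:
        if self._has_subscriber:
            raise RuntimeError("session already has an active subscriber")
        self._has_subscriber = True

    def detach(self) -> None:
        # Called when the stream is cancelled or completes.
        self._has_subscriber = False

    def publish(self, event: object) -> None:
        if self.faulted:
            raise SessionFaulted("session already faulted")
        try:
            self._queue.put_nowait(event)  # never blocks the producer
        except queue.Full:
            self.faulted = True
            raise SessionFaulted("event queue overflow") from None

    def next_event(self) -> object:
        # FIFO order preserves the order in which events were published.
        return self._queue.get_nowait()
```

Faulting on overflow rather than dropping or blocking means a slow consumer cannot silently lose events or stall the worker read loop; the client learns about the problem through the terminal fault.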
### Issue: Implement Dashboard Snapshot Service

Labels: `area:dashboard`, `type:feature`, `priority:p1`

Deliverables:

- immutable dashboard snapshot DTOs,
- session summaries,
- worker summaries,
- metric summaries,
- fault summaries,
- `WatchSnapshotsAsync`.

Acceptance criteria:

- snapshot reads do not mutate session/worker state,
- secrets and credential values are redacted,
- subscribers dispose cleanly.

Tests:

- projection,
- redaction,
- subscription disposal,
- empty/active/faulted states.

### Issue: Implement Blazor Server Dashboard

Labels: `area:dashboard`, `type:feature`, `priority:p1`

Deliverables:

- Blazor Server hosting,
- Bootstrap CSS/JS assets,
- layout/nav,
- home page,
- sessions page,
- workers page,
- events page,
- settings page,
- real-time refresh.

Acceptance criteria:

- Bootstrap/local CSS only,
- no MudBlazor or other Blazor UI libraries,
- pages update without manual refresh,
- dashboard can be disabled by config.

Tests:

- snapshot service tests,
- component tests if bUnit is added,
- disabled-dashboard behavior.

### Issue: Implement Dashboard Authentication

Labels: `area:dashboard`, `area:auth`, `type:feature`, `priority:p1`

Deliverables:

- `/login` (root-mounted),
- LDAP bind against `MxGateway:Ldap`,
- LDAP-group → role mapping (`Admin` / `Viewer`) via
  `MxGateway:Dashboard:GroupToRole`,
- HTTP-only secure cookie (`__Host-MxGatewayDashboard`),
- `/hubs/token` bearer mint for SignalR connections,
- `/logout`,
- antiforgery protection,
- `MxGateway:Dashboard:AllowAnonymousLocalhost` loopback bypass
  (defaults to true for local development).

Acceptance criteria:

- unauthenticated access is denied/redirected,
- non-admin key is denied,
- admin key logs in,
- cookies use secure settings,
- API keys never appear in query strings or logs.

Tests:

- auth decisions,
- non-admin denial,
- cookie properties,
- redaction.

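The LDAP-group to role mapping in the deliverables above can be pictured as a lookup over the user's group memberships. In this sketch the "highest role wins" rule, the case-sensitive group match, and the `resolve_role` name are all assumptions; only the `Admin` / `Viewer` role names and the `{ GwAdmin: Admin }` style of mapping come from the project.

```python
from typing import Iterable, Mapping, Optional

# Assumed precedence for this sketch: Admin outranks Viewer.
ROLE_RANK = {"Viewer": 1, "Admin": 2}


def resolve_role(groups: Iterable[str], group_to_role: Mapping[str, str]) -> Optional[str]:
    """Return the highest role granted by any of the user's LDAP groups."""
    granted = [group_to_role[g] for g in groups if g in group_to_role]
    known = [role for role in granted if role in ROLE_RANK]
    # None means no mapped group: the user gets no dashboard role.
    return max(known, key=ROLE_RANK.__getitem__, default=None)
```

A user whose groups map to nothing receives `None`, which a login handler could treat as a denial after a successful bind.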
## Milestone: integration-and-parity

Goal: prove gateway behavior with fake workers before depending on live
MXAccess.

### Issue: Build Fake Worker Test Harness

Labels: `area:tests`, `area:gateway`, `type:test`, `priority:p0`

Deliverables:

- fake worker executable or in-process transport,
- scripted hello/ready/reply/event/fault behavior,
- malformed protocol scenarios,
- slow/hung worker scenarios.

Acceptance criteria:

- gateway tests do not require installed MXAccess,
- fake worker simulates startup success/failure,
- fake worker emits ordered events and faults.

### Issue: Gateway End-To-End Smoke With Fake Worker

Labels: `area:tests`, `area:gateway`, `type:test`, `priority:p0`

Deliverables:

- open session,
- invoke `Register`, `AddItem`, `Advise`,
- stream one event,
- close session,
- verify metrics/dashboard snapshot changed.

Acceptance criteria:

- smoke passes without live MXAccess,
- worker exits,
- artifacts stay in temp directories.

## Related Documentation

- [Implementation Plan Index](./ImplementationPlanIndex.md)
- [Gateway Process Detailed Design](./GatewayProcessDesign.md)
- [Gateway Configuration](./GatewayConfiguration.md)
- [Sessions](./Sessions.md)
- [gRPC](./Grpc.md)
- [Authentication](./Authentication.md)
- [Authorization](./Authorization.md)
- [Gateway Dashboard Detailed Design](./GatewayDashboardDesign.md)
- [Gateway Testing](./GatewayTesting.md)
- [Metrics](./Metrics.md)
- [Diagnostics](./Diagnostics.md)