From 3081b80efc0acc3be6c989197599a0307564e53d Mon Sep 17 00:00:00 2001 From: Joseph Doherty Date: Wed, 3 Jun 2026 15:35:46 -0400 Subject: [PATCH] docs(audit): cluster findings fragments (13 clusters, read-only verification) --- docs/audit/fragments/01-architecture.md | 443 ++++++++++++++++++ docs/audit/fragments/02-worker.md | 580 ++++++++++++++++++++++++ docs/audit/fragments/03-sessions.md | 380 ++++++++++++++++ docs/audit/fragments/04-auth.md | 437 ++++++++++++++++++ docs/audit/fragments/05-dashboard.md | 332 ++++++++++++++ docs/audit/fragments/06-config.md | 473 +++++++++++++++++++ docs/audit/fragments/07-contracts.md | 362 +++++++++++++++ docs/audit/fragments/08-galaxy.md | 521 +++++++++++++++++++++ docs/audit/fragments/09-alarms.md | 404 +++++++++++++++++ docs/audit/fragments/10-testing.md | 522 +++++++++++++++++++++ docs/audit/fragments/11-clients.md | 341 ++++++++++++++ docs/audit/fragments/12-styleguides.md | 226 +++++++++ docs/audit/fragments/13-history.md | 265 +++++++++++ 13 files changed, 5286 insertions(+) create mode 100644 docs/audit/fragments/01-architecture.md create mode 100644 docs/audit/fragments/02-worker.md create mode 100644 docs/audit/fragments/03-sessions.md create mode 100644 docs/audit/fragments/04-auth.md create mode 100644 docs/audit/fragments/05-dashboard.md create mode 100644 docs/audit/fragments/06-config.md create mode 100644 docs/audit/fragments/07-contracts.md create mode 100644 docs/audit/fragments/08-galaxy.md create mode 100644 docs/audit/fragments/09-alarms.md create mode 100644 docs/audit/fragments/10-testing.md create mode 100644 docs/audit/fragments/11-clients.md create mode 100644 docs/audit/fragments/12-styleguides.md create mode 100644 docs/audit/fragments/13-history.md diff --git a/docs/audit/fragments/01-architecture.md b/docs/audit/fragments/01-architecture.md new file mode 100644 index 0000000..baec702 --- /dev/null +++ b/docs/audit/fragments/01-architecture.md @@ -0,0 +1,443 @@ +# Cluster 01 — Architecture + +DOC: gateway.md +LINES: 737–769 +CLAIM: Project layout lists `src/MxGateway.Server`, `src/MxGateway.Worker`, `src/MxGateway.Contracts`, `src/MxGateway.Tests`, `src/MxGateway.Worker.Tests`, `src/MxGateway.IntegrationTests` as suggested path names. +CLAIM_TYPE: path +VERDICT: stale +EVIDENCE: src/ directory listing — actual project directories are `ZB.MOM.WW.MxGateway.Server`, `ZB.MOM.WW.MxGateway.Worker`, `ZB.MOM.WW.MxGateway.Contracts`, `ZB.MOM.WW.MxGateway.Tests`, `ZB.MOM.WW.MxGateway.Worker.Tests`, `ZB.MOM.WW.MxGateway.IntegrationTests` +CODE_AREA: arch.layout +SEVERITY: medium +PROPOSED_FIX: Replace all short project names in the layout block with the fully-qualified names (e.g. `src/ZB.MOM.WW.MxGateway.Server/`, `src/ZB.MOM.WW.MxGateway.Worker/`, etc.). + +--- + +DOC: gateway.md +LINES: 231–248 +CLAIM: `WorkerEnvelope` has `uint64 correlation_id = 4` and oneof body field numbers: `worker_hello=10, gateway_hello=11, worker_ready=12, command=20, command_reply=21, event=22, heartbeat=23, cancel=24, shutdown=25, fault=26`. +CLAIM_TYPE: rpc/proto +VERDICT: wrong +EVIDENCE: src/ZB.MOM.WW.MxGateway.Contracts/Protos/mxaccess_worker.proto:4,20–38 — actual proto has `string correlation_id = 4` (not uint64); body fields are `gateway_hello=10, worker_hello=11, worker_ready=12, worker_command=13, worker_command_reply=14, worker_cancel=15, worker_shutdown=16, worker_shutdown_ack=17, worker_event=18, worker_heartbeat=19, worker_fault=20`; field names also differ (e.g. `command` → `worker_command`, `event` → `worker_event`). +CODE_AREA: arch.ipc +SEVERITY: high +PROPOSED_FIX: Replace the WorkerEnvelope protobuf block in gateway.md with the actual proto content from `mxaccess_worker.proto`, including the correct field type for `correlation_id` (string), the correct field numbers, and the correct field names. Also add the missing `WorkerShutdownAck worker_shutdown_ack = 17` entry. + +--- + +DOC: gateway.md +LINES: 898–913 +CLAIM: Session state machine is `Creating -> StartingWorker -> WaitingForPipe -> InitializingWorker -> Ready -> Closing -> Closed -> Faulted`. +CLAIM_TYPE: behavior-rule +VERDICT: stale +EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Sessions/SessionWorkerClientFactory.cs:75 — code transitions to `SessionState.Handshaking` between `WaitingForPipe` and `InitializingWorker`; this state also appears in the generated proto enum (`MxaccessGateway.cs:726`, `SESSION_STATE_HANDSHAKING = 4`). +CODE_AREA: arch.session +SEVERITY: medium +PROPOSED_FIX: Add `-> Handshaking` between `WaitingForPipe` and `InitializingWorker` in the state machine diagram, and add a description: "`Handshaking`: pipe is connected and protocol hello is being verified." + +--- + +DOC: gateway.md +LINES: 119–121 +CLAIM: Blazor dashboard mounts at the host root and renders pages at `/`, `/sessions`, `/workers`, `/events`, `/galaxy`, `/alarms`, `/apikeys`, and `/settings`. +CLAIM_TYPE: path +VERDICT: stale +EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Dashboard/Components/Pages/BrowsePage.razor:1 — there is also a `/browse` page (`@page "/browse"`) that is not listed. `/login` is also present. +CODE_AREA: arch.layout +SEVERITY: low +PROPOSED_FIX: Add `/browse` (and `/login`) to the list of documented dashboard routes. + +--- + +DOC: gateway.md +LINES: 662–663 +CLAIM: Rejects valid keys lacking the required `session, invoke, event, metadata, or admin` scope with gRPC `PermissionDenied`. +CLAIM_TYPE: config-key +VERDICT: stale +EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Security/Authorization/GatewayScopes.cs:5–12 — actual scopes are `session:open`, `session:close`, `invoke:read`, `invoke:write`, `invoke:secure`, `events:read`, `metadata:read`, `admin`. The simplified short-form names (`session`, `invoke`, `event`) do not match the canonical scope strings. +CODE_AREA: arch.auth +SEVERITY: medium +PROPOSED_FIX: Replace the simplified scope names with the canonical forms: `session:open`, `session:close`, `invoke:read`, `invoke:write`, `invoke:secure`, `events:read`, `metadata:read`, `admin`. + +--- + +DOC: docs/DesignDecisions.md +LINES: 360–363 +CLAIM: "Dashboard access should require API-key-backed dashboard authentication with `admin` scope when enabled." +CLAIM_TYPE: behavior-rule +VERDICT: wrong +EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Dashboard/DashboardAuthenticator.cs:9 — dashboard authentication is LDAP-backed (bind + group-to-role mapping), not API-key-backed. This is also confirmed in `GatewayProcessDesign.md` lines 291–299 and `gateway.md` lines 147–156. +CODE_AREA: arch.auth +SEVERITY: high +PROPOSED_FIX: Replace "API-key-backed dashboard authentication with `admin` scope" with "LDAP-backed authentication with `GroupToRole` mapping to `Admin` or `Viewer` roles." Keep the note about `AllowAnonymousLocalhost` for local development. + +--- + +DOC: docs/GatewayProcessDesign.md +LINES: 249–255 +CLAIM: Dashboard suggested routes use a `/dashboard` prefix: `/dashboard`, `/dashboard/sessions`, `/dashboard/sessions/{sessionId}`, `/dashboard/workers`, `/dashboard/events`, `/dashboard/settings`. +CLAIM_TYPE: path +VERDICT: wrong +EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Dashboard/Components/Pages/ — actual Blazor pages are mounted at `/` (DashboardHome.razor), `/sessions` (SessionsPage.razor), `/sessions/{SessionId}` (SessionDetailsPage.razor), `/workers` (WorkersPage.razor), `/events` (EventsPage.razor), `/settings` (SettingsPage.razor), `/alarms` (AlarmsPage.razor), `/galaxy` (GalaxyPage.razor), `/browse` (BrowsePage.razor), `/apikeys` (ApiKeysPage.razor). None have a `/dashboard` prefix. +CODE_AREA: arch.layout +SEVERITY: high +PROPOSED_FIX: Replace the `/dashboard`-prefixed route table with the actual routes: `/`, `/sessions`, `/sessions/{sessionId}`, `/workers`, `/events`, `/alarms`, `/galaxy`, `/browse`, `/apikeys`, `/settings`. + +--- + +DOC: docs/GatewayProcessDesign.md +LINES: 689 +CLAIM: "`Dashboard:AllowAnonymousLocalhost` permits loopback requests to bypass the cookie requirement." +CLAIM_TYPE: config-key +VERDICT: stale +EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Configuration/DashboardOptions.cs:9 — property is `AllowAnonymousLocalhost` under `DashboardOptions`, which maps to `MxGateway:Dashboard:AllowAnonymousLocalhost`. The shorthand `Dashboard:AllowAnonymousLocalhost` omits the root `MxGateway:` prefix used throughout the project (also confirmed in GatewayProcessDesign.md line 298 which correctly uses `MxGateway:Dashboard:AllowAnonymousLocalhost`). +CODE_AREA: arch.config +SEVERITY: low +PROPOSED_FIX: Standardize to `MxGateway:Dashboard:AllowAnonymousLocalhost` (the form used in GatewayOptions / the configuration section name) everywhere this key is referenced. + +--- + +DOC: docs/GatewayProcessDesign.md +LINES: 854–855 +CLAIM: Worker `ExecutablePath` default is `src/ZB.MOM.WW.MxGateway.Worker/bin/x86/Release/ZB.MOM.WW.MxGateway.Worker.exe` (forward-slash path shown in JSON block). +CLAIM_TYPE: config-key +VERDICT: accurate +EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Configuration/WorkerOptions.cs:7 — actual default is `src\ZB.MOM.WW.MxGateway.Worker\bin\x86\Release\ZB.MOM.WW.MxGateway.Worker.exe` (backslashes on Windows). The path and filename match; only the separator style differs between the JSON doc sample and the C# literal. +CODE_AREA: arch.config +SEVERITY: low +PROPOSED_FIX: flag only + +--- + +DOC: docs/DesignDecisions.md +LINES: 36 +CLAIM: Interop assembly identity: `ArchestrA.MxAccess, Version=3.2.0.0, PublicKeyToken=23106a86e706d0ae`. +CLAIM_TYPE: version +VERDICT: unverifiable +EVIDENCE: src/ZB.MOM.WW.MxGateway.Worker/MxAccess/MxAccessInteropInfo.cs — the file records the assembly path and name (`ArchestrA.MxAccess`) but does not hard-code the version or public key token; `InteropAssemblyVersion` is read dynamically from the loaded assembly at runtime (`typeof(LMXProxyServerClass).Assembly.GetName().Version`). Cannot verify the exact version string without MXAccess installed. +CODE_AREA: arch.ipc +SEVERITY: low +PROPOSED_FIX: flag only + +--- + +DOC: docs/DesignDecisions.md +LINES: 36–48 +CLAIM: COM class `ArchestrA.MxAccess.LMXProxyServerClass`, CLSID `{C30B52F5-2CB5-4760-AF0A-3A344A7EB5DC}`, ProgID `LMXProxy.LMXProxyServer.1`, version-independent ProgID `LMXProxy.LMXProxyServer`, registered server `C:\Program Files (x86)\ArchestrA\Framework\Bin\LmxProxy.dll`, interop assembly `C:\Program Files (x86)\ArchestrA\Framework\Bin\ArchestrA.MXAccess.dll`. +CLAIM_TYPE: config-key +VERDICT: accurate +EVIDENCE: src/ZB.MOM.WW.MxGateway.Worker/MxAccess/MxAccessInteropInfo.cs:14,19,24,29–30,35–36,41 — `ComClassName = "ArchestrA.MxAccess.LMXProxyServerClass"`, `Clsid = "{C30B52F5-2CB5-4760-AF0A-3A344A7EB5DC}"`, `ProgId = "LMXProxy.LMXProxyServer.1"`, `VersionIndependentProgId = "LMXProxy.LMXProxyServer"`, `RegisteredServerPath = @"C:\Program Files (x86)\ArchestrA\Framework\Bin\LmxProxy.dll"`, `InteropAssemblyPath = @"C:\Program Files (x86)\ArchestrA\Framework\Bin\ArchestrA.MXAccess.dll"`. All match. +CODE_AREA: arch.ipc +SEVERITY: low +PROPOSED_FIX: flag only + +--- + +DOC: docs/DesignDecisions.md +LINES: 55 +CLAIM: Worker should reference `ArchestrA.MXAccess.dll` (upper-case MXAccess in filename). +CLAIM_TYPE: path +VERDICT: accurate +EVIDENCE: src/ZB.MOM.WW.MxGateway.Worker/ZB.MOM.WW.MxGateway.Worker.csproj:27 — `C:\Program Files (x86)\ArchestrA\Framework\Bin\ArchestrA.MXAccess.dll`. Matches. +CODE_AREA: arch.layout +SEVERITY: low +PROPOSED_FIX: flag only + +--- + +DOC: gateway.md +LINES: 88–94 +CLAIM: Gateway runtime is `.NET 10`, `C#`, `x64 preferred`, `ASP.NET Core gRPC server`. +CLAIM_TYPE: version +VERDICT: accurate +EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/ZB.MOM.WW.MxGateway.Server.csproj:4 — `net10.0`; no explicit `` is set (so the default is AnyCPU/x64-preferred on .NET 10). Grpc.AspNetCore is referenced. Matches. +CODE_AREA: arch.layout +SEVERITY: low +PROPOSED_FIX: flag only + +--- + +DOC: gateway.md +LINES: 162–165 +CLAIM: Worker runtime is `.NET Framework 4.8`, `C#`, `x86 build by default`. +CLAIM_TYPE: version +VERDICT: accurate +EVIDENCE: src/ZB.MOM.WW.MxGateway.Worker/ZB.MOM.WW.MxGateway.Worker.csproj:5–7 — `net48`, `x86`, `true`. Matches. +CODE_AREA: arch.layout +SEVERITY: low +PROPOSED_FIX: flag only + +--- + +DOC: gateway.md +LINES: 198–210 +CLAIM: Pipe name format is `mxaccess-gateway-{gatewayProcessId}-{sessionId}` and framing is `uint32 little-endian payload_length` followed by `payload_length bytes protobuf WorkerEnvelope`. +CLAIM_TYPE: rpc/proto +VERDICT: accurate +EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Sessions/SessionManager.cs:433 — `string pipeName = $"mxaccess-gateway-{Environment.ProcessId}-{sessionId}"`. Framing confirmed by `WorkerFrameReader.cs` and `WorkerFrameWriter.cs` in `src/ZB.MOM.WW.MxGateway.Server/Workers/`. +CODE_AREA: arch.ipc +SEVERITY: low +PROPOSED_FIX: flag only + +--- + +DOC: gateway.md +LINES: 108 +CLAIM: "The gateway must never instantiate or call MXAccess directly." +CLAIM_TYPE: behavior-rule +VERDICT: accurate +EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/ZB.MOM.WW.MxGateway.Server.csproj — no reference to `ArchestrA.MXAccess.dll`. MXAccess COM is only referenced in the Worker project csproj (line 26–29). +CODE_AREA: arch.layout +SEVERITY: low +PROPOSED_FIX: flag only + +--- + +DOC: gateway.md +LINES: 646–650 +CLAIM: Gateway restart does not reattach old workers; `OrphanWorkerCleanupHostedService` runs `OrphanWorkerTerminator` once on startup to kill leftover `ZB.MOM.WW.MxGateway.Worker.exe` processes. +CLAIM_TYPE: behavior-rule +VERDICT: accurate +EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Workers/OrphanWorkerCleanupHostedService.cs:7 — class exists and references `OrphanWorkerTerminator`. `OrphanWorkerTerminator.cs:19` is present. Worker executable name `ZB.MOM.WW.MxGateway.Worker.exe` confirmed in `IntegrationTestEnvironment.cs:66`. +CODE_AREA: arch.layout +SEVERITY: low +PROPOSED_FIX: flag only + +--- + +DOC: docs/GatewayProcessDesign.md +LINES: 420–428 +CLAIM: Pipe name format is `mxaccess-gateway-{gatewayProcessId}-{sessionId}` and framing is `uint32 little-endian payload_length` followed by `payload_length bytes protobuf WorkerEnvelope`. +CLAIM_TYPE: rpc/proto +VERDICT: accurate +EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Sessions/SessionManager.cs:433 — confirmed matching. +CODE_AREA: arch.ipc +SEVERITY: low +PROPOSED_FIX: flag only + +--- + +DOC: docs/GatewayProcessDesign.md +LINES: 459–475 +CLAIM: `IWorkerClient` has methods `StartAsync`, `InvokeAsync(WorkerCommand, TimeSpan, CancellationToken)`, `ReadEventsAsync(CancellationToken)`, `ShutdownAsync(TimeSpan, CancellationToken)`, `Kill(string)`. +CLAIM_TYPE: rpc/proto +VERDICT: accurate +EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Workers/IWorkerClient.cs:22,28–31,35,40,44 — all five methods are present with matching signatures. +CODE_AREA: arch.ipc +SEVERITY: low +PROPOSED_FIX: flag only + +--- + +DOC: docs/GatewayProcessDesign.md +LINES: 713–719 +CLAIM: API-key admin CLI subcommands are `init-db`, `create-key`, `list-keys`, `revoke-key`, `rotate-key` on `ZB.MOM.WW.MxGateway.Server apikey`. +CLAIM_TYPE: command +VERDICT: accurate +EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Security/Authentication/ApiKeyAdminCommandLineParser.cs:121–135 — all five subcommands are parsed. Matches. +CODE_AREA: arch.auth +SEVERITY: low +PROPOSED_FIX: flag only + +--- + +DOC: docs/GatewayProcessDesign.md +LINES: 408–410 +CLAIM: Nonce is passed via `MXGATEWAY_WORKER_NONCE` environment variable so the command line remains safe to log. +CLAIM_TYPE: config-key +VERDICT: accurate +EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Workers/WorkerProcessLauncher.cs:17–18 — `public const string WorkerNonceEnvironmentVariableName = "MXGATEWAY_WORKER_NONCE"`. Matches. +CODE_AREA: arch.ipc +SEVERITY: low +PROPOSED_FIX: flag only + +--- + +DOC: docs/GatewayProcessDesign.md +LINES: 223–229 +CLAIM: `EventStreamService` rejects a second subscriber with `EventSubscriberAlreadyActive`; faults the session with `EventQueueOverflow` if the queue fills. +CLAIM_TYPE: term +VERDICT: accurate +EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Sessions/SessionManagerErrorCode.cs:7–8 — enum values `EventSubscriberAlreadyActive` and `EventQueueOverflow` present. Also used at `MxAccessGatewayService.cs:929–930` and `EventStreamService.cs:150,160`. +CODE_AREA: arch.ipc +SEVERITY: low +PROPOSED_FIX: flag only + +--- + +DOC: docs/GatewayProcessDesign.md +LINES: 291–299 +CLAIM: Dashboard auth uses LDAP bind + role mapping (`MxGateway:Dashboard:GroupToRole`), issues HTTP-only secure cookie, allows `Dashboard:AllowAnonymousLocalhost` to default to `true`. +CLAIM_TYPE: behavior-rule +VERDICT: accurate +EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Dashboard/DashboardAuthenticator.cs:9 (LDAP-backed); `DashboardOptions.cs:9` (`AllowAnonymousLocalhost` defaults to `true`). Matches. +CODE_AREA: arch.auth +SEVERITY: low +PROPOSED_FIX: flag only + +--- + +DOC: docs/GatewayProcessDesign.md +LINES: 527–530 +CLAIM: "During shutdown the worker client treats `WorkerShutdownAck` as the protocol close signal." +CLAIM_TYPE: rpc/proto +VERDICT: accurate +EVIDENCE: src/ZB.MOM.WW.MxGateway.Contracts/Protos/mxaccess_worker.proto:34,80 — `WorkerShutdownAck` is field 17 in the oneof body and its message is defined at line 80. +CODE_AREA: arch.ipc +SEVERITY: low +PROPOSED_FIX: flag only + +--- + +DOC: gateway.md +LINES: 301–314 +CLAIM: Session state machine (in the "Session Manager" section): `Creating -> StartingWorker -> WaitingForPipe -> InitializingWorker -> Ready -> Closing -> Closed -> Faulted`. +CLAIM_TYPE: behavior-rule +VERDICT: stale +EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Sessions/SessionWorkerClientFactory.cs:75 — `session.TransitionTo(SessionState.Handshaking)` is called between `WaitingForPipe` and `InitializingWorker`. The `Handshaking` state also exists in the public `SessionState` proto enum (`MxaccessGateway.cs:726`). The state machine in gateway.md at this location (the Gateway Implementation Plan / Session Manager section) is missing the `Handshaking` state exactly as in the earlier reference at lines 898–913. +CODE_AREA: arch.session +SEVERITY: medium +PROPOSED_FIX: Add `-> Handshaking` between `WaitingForPipe` and `InitializingWorker` in both state machine diagrams in gateway.md. + +--- + +DOC: gateway.md +LINES: 1023–1025 +CLAIM: "MXAccess COM target is `ArchestrA.MxAccess.LMXProxyServerClass` / `LMXProxy.LMXProxyServer.1` from the installed 32-bit `LmxProxy.dll`." +CLAIM_TYPE: term +VERDICT: accurate +EVIDENCE: src/ZB.MOM.WW.MxGateway.Worker/MxAccess/MxAccessInteropInfo.cs:14,41 — `ComClassName = "ArchestrA.MxAccess.LMXProxyServerClass"`, `ProgId = "LMXProxy.LMXProxyServer.1"`, registered server `LmxProxy.dll`. Matches. +CODE_AREA: arch.ipc +SEVERITY: low +PROPOSED_FIX: flag only + +--- + +DOC: docs/GatewayProcessDesign.md +LINES: 62–93 +CLAIM: High-level component list references namespace `ZB.MOM.WW.MxGateway.Server` with sub-components including `GatewayMetrics` (under `Metrics`) and `HealthChecks` (under `Diagnostics`). +CLAIM_TYPE: term +VERDICT: accurate +EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Metrics/GatewayMetrics.cs:4 — `namespace ZB.MOM.WW.MxGateway.Server.Metrics`; src/ZB.MOM.WW.MxGateway.Server/Diagnostics/AuthStoreHealthCheck.cs:5 — `namespace ZB.MOM.WW.MxGateway.Server.Diagnostics`. Matches. +CODE_AREA: arch.layout +SEVERITY: low +PROPOSED_FIX: flag only + +--- + +DOC: gateway.md +LINES: 110–116 +CLAIM: Gateway observability foundation lives in `ZB.MOM.WW.MxGateway.Server.Diagnostics` and `ZB.MOM.WW.MxGateway.Server.Metrics`; `GatewayMetrics` exposes counters/gauges/histograms through .NET `Meter`; `DashboardSnapshotService` projects sessions/workers/metrics into immutable DTOs. +CLAIM_TYPE: term +VERDICT: accurate +EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Metrics/GatewayMetrics.cs:4; src/ZB.MOM.WW.MxGateway.Server/Dashboard/DashboardSnapshotService.cs:8. Both namespaces confirmed. Matches. +CODE_AREA: arch.layout +SEVERITY: low +PROPOSED_FIX: flag only + +--- + +DOC: gateway.md +LINES: 119–121 +CLAIM: SignalR hubs at `/hubs/{snapshot,alarms,events}` accept either the cookie or a 30-minute bearer minted at `/hubs/token`. +CLAIM_TYPE: path +VERDICT: accurate +EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Dashboard/DashboardEndpointRouteBuilderExtensions.cs:63–65,73 — `MapHub("/hubs/snapshot")`, `MapHub("/hubs/alarms")`, `MapHub("/hubs/events")`, `/hubs/token` endpoint mapped at line 73. Matches. +CODE_AREA: arch.layout +SEVERITY: low +PROPOSED_FIX: flag only + +--- + +DOC: gateway.md +LINES: 121–122 +CLAIM: "`/hubs/events` mirrors per-session `MxEvent` traffic from `EventStreamService` to clients subscribed to `session:{id}`." +CLAIM_TYPE: term +VERDICT: accurate +EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Dashboard/Hubs/EventsHub.cs:27 — `public static string GroupName(string sessionId) => $"session:{sessionId}"`. Matches. +CODE_AREA: arch.layout +SEVERITY: low +PROPOSED_FIX: flag only + +--- + +DOC: docs/GatewayProcessDesign.md +LINES: 864–893 +CLAIM: Configuration JSON block shows `MxGateway:Worker:ExecutablePath`, `MxGateway:Sessions:AllowMultipleEventSubscribers`, `MxGateway:Events:QueueCapacity`, `MxGateway:Protocol:WorkerProtocolVersion`, etc. +CLAIM_TYPE: config-key +VERDICT: accurate +EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Configuration/WorkerOptions.cs:6–7,13 — `ExecutablePath` and `RequiredArchitecture` match; `SessionOptions.cs` and `EventsOptions` confirm the other keys through bound configuration. +CODE_AREA: arch.config +SEVERITY: low +PROPOSED_FIX: flag only + +--- + +DOC: docs/DesignDecisions.md +LINES: 85–95 +CLAIM: The single-subscriber rule for `StreamEvents` no longer applies to alarms. `GatewayAlarmMonitor` owns one gateway-managed worker session, fans alarm state to any number of clients through session-less `StreamAlarms`. `AcknowledgeAlarm` is session-less and routes through the monitor. +CLAIM_TYPE: behavior-rule +VERDICT: accurate +EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Alarms/GatewayAlarmMonitor.cs:17 — class exists. `MxAccessGatewayService.cs:167` — `StreamAlarms` and `AcknowledgeAlarm` are session-less. Matches. +CODE_AREA: arch.session +SEVERITY: low +PROPOSED_FIX: flag only + +--- + +DOC: docs/DesignDecisions.md +LINES: 217–225 +CLAIM: Bulk commands are `AddItemBulk`, `AdviseItemBulk`, `RemoveItemBulk`, `UnAdviseItemBulk`, `SubscribeBulk`, `UnsubscribeBulk`, `WriteBulk`, `Write2Bulk`, `WriteSecuredBulk`, `WriteSecured2Bulk`, `ReadBulk`. Each runs single-item MXAccess COM calls sequentially on the STA; per-entry failures are non-throwing. +CLAIM_TYPE: rpc/proto +VERDICT: accurate +EVIDENCE: src/ZB.MOM.WW.MxGateway.Contracts/Protos/mxaccess_gateway.proto — all eleven bulk command kinds are present in the `MxCommandKind` enum and corresponding request/reply messages. Verified by cross-referencing `GatewayGrpcScopeResolver.cs:39` which maps `WriteBulk`, `Write2Bulk`, etc. +CODE_AREA: arch.ipc +SEVERITY: low +PROPOSED_FIX: flag only + +--- + +DOC: gateway.md +LINES: 129–130 +CLAIM: "`/browse` walks the `IGalaxyHierarchyCache` tree and reads subscribed tag values live through `IDashboardLiveDataService`." +CLAIM_TYPE: term +VERDICT: accurate +EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Dashboard/IDashboardBrowseService.cs — `IDashboardBrowseService` references `IGalaxyHierarchyCache`. `IDashboardLiveDataService.cs` exists in the same Dashboard directory. `/browse` page confirmed in `BrowsePage.razor:1`. +CODE_AREA: arch.layout +SEVERITY: low +PROPOSED_FIX: flag only + +--- + +DOC: gateway.md +LINES: 2–19 +CLAIM: Gateway preserves MXAccess behavior first, including public MXAccess command semantics, native MXAccess event families, STA/message-pump delivery behavior, HRESULT/status/value marshaling, and per-client isolation. "Installed MXAccess COM component is the compatibility baseline." +CLAIM_TYPE: behavior-rule +VERDICT: accurate +EVIDENCE: src/ZB.MOM.WW.MxGateway.Worker/MxAccess/MxAccessInteropInfo.cs (installs/references real COM interop); docs/DesignDecisions.md:26–28 — "target the installed MXAccess COM interop surface directly from the x86 worker." Consistent across all three docs. +CODE_AREA: arch.layout +SEVERITY: low +PROPOSED_FIX: flag only + +--- + +DOC: docs/GatewayProcessDesign.md +LINES: 100–105 +CLAIM: gRPC service surface at this stage is limited to `OpenSession`, `CloseSession`, `Invoke`, `StreamEvents` (with `Session(stream ClientMessage) returns (stream ServerMessage)` deferred). +CLAIM_TYPE: rpc/proto +VERDICT: accurate +EVIDENCE: src/ZB.MOM.WW.MxGateway.Contracts/Protos/mxaccess_gateway.proto — `MxAccessGateway` service defines `OpenSession`, `CloseSession`, `Invoke`, `StreamEvents`, and additional alarm/galaxy RPCs. The bidirectional `Session` RPC is not present in the current proto, consistent with the deferral noted in the doc. +CODE_AREA: arch.ipc +SEVERITY: low +PROPOSED_FIX: flag only + +--- + +DOC: gateway.md +LINES: 266–273 +CLAIM: Public gRPC service is `MxAccessGateway` with `OpenSession`, `CloseSession`, `Invoke`, `StreamEvents`, and deferred bidirectional `Session` RPC. +CLAIM_TYPE: rpc/proto +VERDICT: accurate +EVIDENCE: src/ZB.MOM.WW.MxGateway.Contracts/Protos/mxaccess_gateway.proto — confirmed. The `Session` bidirectional RPC is absent as expected for deferred rollout. +CODE_AREA: arch.ipc +SEVERITY: low +PROPOSED_FIX: flag only diff --git a/docs/audit/fragments/02-worker.md b/docs/audit/fragments/02-worker.md new file mode 100644 index 0000000..9d4ee18 --- /dev/null +++ b/docs/audit/fragments/02-worker.md @@ -0,0 +1,580 @@ +# Cluster 02 — Worker + +Auditor: automated prose-documentation audit +Docs audited: WorkerBootstrap.md, WorkerConversion.md, WorkerFrameProtocol.md, WorkerProcessLauncher.md, WorkerSta.md, MxAccessWorkerInstanceDesign.md +Code verified against: src/ZB.MOM.WW.MxGateway.Worker/**, src/ZB.MOM.WW.MxGateway.Contracts/Protos/mxaccess_worker.proto + +--- + +DOC: WorkerSta.md +LINES: 23-31 +CLAIM: `StaRuntime`'s constructor configures a background `Thread` named `ZB.MOM.WW.MxGateway.Worker.STA` and the code snippet shows `Name = "ZB.MOM.WW.MxGateway.Worker.STA"`. +CLAIM_TYPE: term +VERDICT: wrong +EVIDENCE: src/ZB.MOM.WW.MxGateway.Worker/Sta/StaRuntime.cs:61 — actual thread name is `"MxGateway.Worker.STA"` (no `ZB.MOM.WW.` prefix). +CODE_AREA: worker.sta +SEVERITY: medium +PROPOSED_FIX: Change every occurrence of `ZB.MOM.WW.MxGateway.Worker.STA` in WorkerSta.md (prose on line 23 and code snippet on line 29) to `MxGateway.Worker.STA`. + +--- + +DOC: MxAccessWorkerInstanceDesign.md +LINES: 254 +CLAIM: `StaRuntime` "starts one background thread named `ZB.MOM.WW.MxGateway.Worker.STA`". +CLAIM_TYPE: term +VERDICT: wrong +EVIDENCE: src/ZB.MOM.WW.MxGateway.Worker/Sta/StaRuntime.cs:61 — thread is named `"MxGateway.Worker.STA"`. +CODE_AREA: worker.sta +SEVERITY: medium +PROPOSED_FIX: Replace `ZB.MOM.WW.MxGateway.Worker.STA` with `MxGateway.Worker.STA` in the STA Runtime section. + +--- + +DOC: WorkerSta.md +LINES: 144 +CLAIM: "`InvokeAsync` rejects new work with `InvalidOperationException`" when shutdown is requested. +CLAIM_TYPE: term +VERDICT: wrong +EVIDENCE: src/ZB.MOM.WW.MxGateway.Worker/Sta/StaRuntime.cs:170 — actually throws `StaRuntimeShutdownException`. That class inherits from `InvalidOperationException` (StaRuntimeShutdownException.cs:16) but is a distinct type callers are expected to distinguish. +CODE_AREA: worker.sta +SEVERITY: medium +PROPOSED_FIX: Change "rejects new work with `InvalidOperationException`" to "rejects new work with `StaRuntimeShutdownException` (a subtype of `InvalidOperationException`)". The distinction matters because MxAccessStaSession uses it to separate graceful stop from programming errors (e.g., STA-affinity assertions). + +--- + +DOC: MxAccessWorkerInstanceDesign.md +LINES: 122 +CLAIM: Exit code `0` / `Success` meaning = "Required bootstrap options are valid." +CLAIM_TYPE: behavior-rule +VERDICT: wrong +EVIDENCE: src/ZB.MOM.WW.MxGateway.Worker/Bootstrap/WorkerExitCode.cs:5; WorkerBootstrap.md:113 states the authoritative meaning: "The pipe session ran to a clean close." The design-doc description conflates parse success with process-lifetime success. +CODE_AREA: worker.launcher +SEVERITY: high +PROPOSED_FIX: Update the Success row to: "`Success` | 0 | The pipe session ran to a clean close." Add a note that `WorkerBootstrapResult.Succeeded` is a parse-phase gate distinct from process exit code 0. + +--- + +DOC: MxAccessWorkerInstanceDesign.md +LINES: 119-128 +CLAIM: Exit code table lists only five codes (0–4). Codes 5 (`PipeConnectionFailed`) and 6 (`ProtocolViolation`) are absent. +CLAIM_TYPE: behavior-rule +VERDICT: stale +EVIDENCE: src/ZB.MOM.WW.MxGateway.Worker/Bootstrap/WorkerExitCode.cs:5-12 — enum has seven values (0–6); WorkerBootstrap.md:112-120 documents all seven. +CODE_AREA: worker.launcher +SEVERITY: high +PROPOSED_FIX: Add rows for `PipeConnectionFailed = 5` ("An `IOException` or `TimeoutException` escapes the pipe client") and `ProtocolViolation = 6` ("A `WorkerFrameProtocolException` escapes the pipe client") to the exit-code table. + +--- + +DOC: MxAccessWorkerInstanceDesign.md +LINES: 134-160 +CLAIM: Internal component tree lists class names including `WorkerHost`, `PipeClient`, `FrameReader`, `FrameWriter`, `WorkerProtocol`, `StaCommandQueue`, `MessagePump`, `StaWatchdog`, `MxAccessCommandDispatcher`, `SafeArrayConverter`, `StatusProxyConverter`, `HResultMapper`. +CLAIM_TYPE: term +VERDICT: stale +EVIDENCE: Actual source files in the worker project: +- `WorkerHost` does not exist; entry point is `WorkerApplication` (WorkerApplication.cs). +- `PipeClient` exists as `WorkerPipeClient` (Ipc/WorkerPipeClient.cs). +- `FrameReader`/`FrameWriter` exist as `WorkerFrameReader`/`WorkerFrameWriter` (Ipc/). +- `WorkerProtocol` does not exist; closest is `WorkerContractInfo` (Ipc/WorkerContractInfo.cs). +- `StaCommandQueue` does not exist; queue logic lives in `StaCommandDispatcher` (Sta/StaCommandDispatcher.cs). +- `MessagePump` exists as `StaMessagePump` (Sta/StaMessagePump.cs). +- `StaWatchdog` does not exist; watchdog logic lives in `WorkerPipeSession` (Ipc/WorkerPipeSession.cs). +- `MxAccessCommandDispatcher` does not exist; actual class is `MxAccessCommandExecutor` (MxAccess/MxAccessCommandExecutor.cs). +- `SafeArrayConverter` does not exist; SAFEARRAY conversion is part of `VariantConverter`. +- `StatusProxyConverter` does not exist; actual class is `MxStatusProxyConverter` (Conversion/MxStatusProxyConverter.cs). +- `HResultMapper` does not exist; actual class is `HResultConverter` (Conversion/HResultConverter.cs). +CODE_AREA: worker.sta +SEVERITY: high +PROPOSED_FIX: Rewrite the component tree to match actual class names. This section appears to be a design-phase placeholder that was never updated after implementation. + +--- + +DOC: WorkerBootstrap.md +LINES: 146 +CLAIM: "Standard error is used rather than standard output because the gateway side reads worker stdout for diagnostic capture only, while stderr is reserved for log output that does not interfere with any future stdout-based channel." +CLAIM_TYPE: behavior-rule +VERDICT: stale +EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Workers/WorkerProcessLauncher.cs:166-174 — `ProcessStartInfo` does not set `RedirectStandardOutput = true` or `RedirectStandardError = true`; the gateway currently reads neither stream. The stated reason (gateway reads stdout) is not implemented. +CODE_AREA: worker.launcher +SEVERITY: medium +PROPOSED_FIX: Replace the stdout-capture rationale with the accurate reason: "Environment variables of another process are not visible to other users, unlike command-line arguments; stdout/stderr redirect is not currently wired by the launcher." Alternatively, if stdout capture is a planned feature, label it as such. + +--- + +DOC: WorkerConversion.md +LINES: 178 +CLAIM: "`MapCategory` and `MapSource` translate the integer codes documented for `MXSTATUS_PROXY` (for example `0 = Ok`, `3 = CommunicationError`, `0 = RequestingLmx`, `5 = RespondingAutomationObject`)". +CLAIM_TYPE: term +VERDICT: accurate +EVIDENCE: src/ZB.MOM.WW.MxGateway.Worker/Conversion/MxStatusProxyConverter.cs:103-133 — `MapCategory(0)` → `MxStatusCategory.Ok`; `MapCategory(3)` → `MxStatusCategory.CommunicationError`; `MapSource(0)` → `MxStatusSource.RequestingLmx`; `MapSource(5)` → `MxStatusSource.RespondingAutomationObject`. +CODE_AREA: worker.convert +SEVERITY: low +PROPOSED_FIX: flag only + +--- + +DOC: WorkerConversion.md +LINES: 225 +CLAIM: "The mapping covers the engine-error range documented for MXAccess (16-50, 56-61, 541-542, 8017)." +CLAIM_TYPE: behavior-rule +VERDICT: stale +EVIDENCE: src/ZB.MOM.WW.MxGateway.Worker/Conversion/MxStatusDetailText.cs:7-48 — the dictionary has gaps within those ranges: keys 35, 45, 46 are absent from 16–50; keys 58, 59 are absent from 56–61. The doc implies contiguous ranges. +CODE_AREA: worker.convert +SEVERITY: low +PROPOSED_FIX: Replace the continuous-range description with "selected detail codes in the ranges 16–50, 56–61, 541–542, and 8017 (not all values in those ranges are populated)." + +--- + +DOC: WorkerBootstrap.md +LINES: 7-8 +CLAIM: "`WorkerApplication.Run` constructs the bootstrap dependencies (`EnvironmentVariableWorkerEnvironment`, `WorkerConsoleLogger` writing to `Console.Error`, and a `WorkerPipeClient`)". +CLAIM_TYPE: term +VERDICT: accurate +EVIDENCE: src/ZB.MOM.WW.MxGateway.Worker/WorkerApplication.cs:16-19. +CODE_AREA: worker.launcher +SEVERITY: low +PROPOSED_FIX: flag only + +--- + +DOC: WorkerBootstrap.md +LINES: 113-120 +CLAIM: Exit code table with seven rows 0–6. +CLAIM_TYPE: behavior-rule +VERDICT: accurate +EVIDENCE: src/ZB.MOM.WW.MxGateway.Worker/Bootstrap/WorkerExitCode.cs:5-12. +CODE_AREA: worker.launcher +SEVERITY: low +PROPOSED_FIX: flag only + +--- + +DOC: WorkerBootstrap.md +LINES: 181-193 +CLAIM: `WorkerLogRedactor` `SensitiveFieldNameParts` list (seven entries: nonce, secret, password, token, credential, apikey, api_key). +CLAIM_TYPE: behavior-rule +VERDICT: accurate +EVIDENCE: src/ZB.MOM.WW.MxGateway.Worker/Bootstrap/WorkerLogRedactor.cs:16-25. +CODE_AREA: worker.launcher +SEVERITY: low +PROPOSED_FIX: flag only + +--- + +DOC: WorkerBootstrap.md +LINES: 105 +CLAIM: "`Succeeded` is defined as `ExitCode == WorkerExitCode.Success` rather than as a separate flag, so the exit code and the success state cannot disagree." +CLAIM_TYPE: behavior-rule +VERDICT: accurate +EVIDENCE: src/ZB.MOM.WW.MxGateway.Worker/Bootstrap/WorkerBootstrapResult.cs:36. +CODE_AREA: worker.launcher +SEVERITY: low +PROPOSED_FIX: flag only + +--- + +DOC: WorkerFrameProtocol.md +LINES: 14-19 +CLAIM: Each frame starts with a four-byte little-endian unsigned payload length followed by the serialized `WorkerEnvelope` payload. Zero-length payloads and payloads larger than the configured maximum are rejected before allocating the payload buffer. The default maximum is 16 MiB. +CLAIM_TYPE: rpc/proto +VERDICT: accurate +EVIDENCE: src/ZB.MOM.WW.MxGateway.Worker/Ipc/WorkerFrameReader.cs:32-50; WorkerFrameProtocolOptions.cs:11 (`DefaultMaxMessageBytes = 16 * 1024 * 1024`). +CODE_AREA: worker.frameproto +SEVERITY: low +PROPOSED_FIX: flag only + +--- + +DOC: WorkerFrameProtocol.md +LINES: 22-34 +CLAIM: Envelope validation checks: `protocol_version` must match configured version; `session_id` must match owning session; envelope must contain one typed `body` value. Violations throw `WorkerFrameProtocolException` with a `WorkerFrameProtocolErrorCode`. +CLAIM_TYPE: rpc/proto +VERDICT: accurate +EVIDENCE: src/ZB.MOM.WW.MxGateway.Worker/Ipc/WorkerEnvelopeValidator.cs:16-36. +CODE_AREA: worker.frameproto +SEVERITY: low +PROPOSED_FIX: flag only + +--- + +DOC: WorkerFrameProtocol.md +LINES: 38-41 +CLAIM: "The frame protocol lives in `ZB.MOM.WW.MxGateway.Worker.Ipc` (`WorkerFrameReader`, `WorkerFrameWriter`, `WorkerFrameProtocolOptions`)". +CLAIM_TYPE: path +VERDICT: accurate +EVIDENCE: Namespaces in WorkerFrameReader.cs:9, WorkerFrameWriter.cs:8, WorkerFrameProtocolOptions.cs:6. +CODE_AREA: worker.frameproto +SEVERITY: low +PROPOSED_FIX: flag only + +--- + +DOC: WorkerFrameProtocol.md +LINES: 44-47 +CLAIM: Test file path is `src/ZB.MOM.WW.MxGateway.Worker.Tests/Ipc/WorkerFrameProtocolTests.cs`. +CLAIM_TYPE: path +VERDICT: accurate +EVIDENCE: File confirmed at that path. +CODE_AREA: worker.frameproto +SEVERITY: low +PROPOSED_FIX: flag only + +--- + +DOC: WorkerProcessLauncher.md +LINES: 18-25 +CLAIM: Launcher passes `SessionId`, `PipeName`, and `ProtocolVersion` as `--session-id`, `--pipe-name`, `--protocol-version` CLI arguments; nonce travels via `MXGATEWAY_WORKER_NONCE` environment variable; nonce is excluded from `WorkerProcessCommandLine`. +CLAIM_TYPE: command +VERDICT: accurate +EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Workers/WorkerProcessLauncher.cs:156-184. +CODE_AREA: worker.launcher +SEVERITY: low +PROPOSED_FIX: flag only + +--- + +DOC: WorkerProcessLauncher.md +LINES: 30-34 +CLAIM: Launcher validates that the configured worker path exists, has `.exe` extension, contains a valid Windows Portable Executable header, and matches `RequiredArchitecture`. +CLAIM_TYPE: behavior-rule +VERDICT: accurate +EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Workers/WorkerProcessLauncher.cs:189-220 calls `WorkerExecutableValidator.Validate`. +CODE_AREA: worker.launcher +SEVERITY: low +PROPOSED_FIX: flag only + +--- + +DOC: WorkerProcessLauncher.md +LINES: 35-45 +CLAIM: Default probe (`IWorkerStartupProbe`) "only verifies that the worker did not exit immediately." Retry policy configured by `WorkerOptions.StartupProbeRetryAttempts` and `WorkerOptions.StartupProbeRetryDelayMilliseconds`; counter recorded as `mxgateway.retries.attempted` with `area=worker_startup`. +CLAIM_TYPE: behavior-rule +VERDICT: accurate +EVIDENCE: WorkerProcessStartedProbe.cs:10-24 (exits check only); WorkerOptions.cs:18-22; GatewayMetrics.cs:70 (`mxgateway.retries.attempted`); WorkerProcessLauncher.cs:279 (area label `"worker_startup"`). +CODE_AREA: worker.launcher +SEVERITY: low +PROPOSED_FIX: flag only + +--- + +DOC: WorkerProcessLauncher.md +LINES: 48-55 +CLAIM: Launcher also passes `MXGATEWAY_WORKER_PIPE_CONNECT_ATTEMPT_TIMEOUT_MS` from `WorkerOptions.PipeConnectAttemptTimeoutMilliseconds`. On failure, kills the worker process tree, disposes the process handle, disposes the optional pipe reservation, records a worker kill metric, and reports `WorkerProcessLaunchException`. +CLAIM_TYPE: behavior-rule +VERDICT: accurate +EVIDENCE: WorkerProcessLauncher.cs:181-182, 253-267. +CODE_AREA: worker.launcher +SEVERITY: low +PROPOSED_FIX: flag only + +--- + +DOC: WorkerProcessLauncher.md +LINES: 60-64 +CLAIM: Test command: `dotnet test src/ZB.MOM.WW.MxGateway.Tests/ZB.MOM.WW.MxGateway.Tests.csproj --filter WorkerProcessLauncherTests`. +CLAIM_TYPE: command +VERDICT: accurate +EVIDENCE: Project file confirmed at `src/ZB.MOM.WW.MxGateway.Tests/ZB.MOM.WW.MxGateway.Tests.csproj`; test class `WorkerProcessLauncherTests` confirmed at `src/ZB.MOM.WW.MxGateway.Tests/Gateway/Workers/WorkerProcessLauncherTests.cs`. +CODE_AREA: worker.launcher +SEVERITY: low +PROPOSED_FIX: flag only + +--- + +DOC: WorkerSta.md +LINES: 14 +CLAIM: Type table shows `StaCommandDispatcher` as "Bounded asynchronous queue in front of `StaRuntime`…". +CLAIM_TYPE: term +VERDICT: stale +EVIDENCE: src/ZB.MOM.WW.MxGateway.Worker/Sta/StaCommandDispatcher.cs:15 — uses `Queue`, a plain synchronous non-concurrent `Queue` guarded by `lock(gate)`. There is no async channel or channel-based backpressure; `DrainAsync` is fire-and-forget but the queue itself is not an async queue. +CODE_AREA: worker.sta +SEVERITY: low +PROPOSED_FIX: Change "Bounded asynchronous queue" to "Bounded queue with an async drain loop" to avoid implying the underlying data structure is an async channel. + +--- + +DOC: WorkerSta.md +LINES: 56 +CLAIM: "`The idlePumpInterval` defaults to 50 ms so the pump still services Windows messages even when no commands are queued". +CLAIM_TYPE: config-key +VERDICT: accurate +EVIDENCE: src/ZB.MOM.WW.MxGateway.Worker/Sta/StaRuntime.cs:30 — `TimeSpan.FromMilliseconds(50)`. +CODE_AREA: worker.sta +SEVERITY: low +PROPOSED_FIX: flag only + +--- + +DOC: WorkerSta.md +LINES: 82-99 +CLAIM: `InvokeAsync` wraps the delegate in a `StaWorkItem`, enqueues it on a `ConcurrentQueue`, and signals `commandWakeEvent`. `StaWorkItem` uses an `Interlocked.CompareExchange` on `started` so exactly one of three outcomes happens. +CLAIM_TYPE: behavior-rule +VERDICT: accurate +EVIDENCE: StaRuntime.cs:12 (`ConcurrentQueue`); StaRuntime.cs:164-177; StaWorkItem.cs:31,47,57. +CODE_AREA: worker.sta +SEVERITY: low +PROPOSED_FIX: flag only + +--- + +DOC: WorkerSta.md +LINES: 141-148 +CLAIM: Shutdown sequence step 1: sets `shutdownRequested` under `gate`; step 2: signals `commandWakeEvent`; step 3: waits up to `timeout` on `stoppedEvent`, which the STA sets after leaving `ThreadMain`; step 4: drains the queue through `CancelQueuedCommands` calling `CancelBeforeExecution`. +CLAIM_TYPE: behavior-rule +VERDICT: stale +EVIDENCE: StaRuntime.cs:261-273 — `CancelQueuedCommands()` is called inside `ThreadMain`'s `finally` block *before* `stoppedEvent.Set()`, meaning the drain happens on the STA thread, not after `stoppedEvent` is observed by `Shutdown()`. `Shutdown()` calls `CancelQueuedCommands()` a *second* time after observing `stoppedEvent`, but the doc implies a single post-stop drain. +CODE_AREA: worker.sta +SEVERITY: medium +PROPOSED_FIX: Revise step 3 to note that `stoppedEvent` is set from within `ThreadMain`'s `finally` block (before the thread exits) after `CoUninitialize`. Revise step 4 to note the queue is drained *twice*: once by `ThreadMain` in its `finally` (to cancel items enqueued before shutdown) and once by `Shutdown()` after `stoppedEvent` (to cancel any items enqueued in the gap). + +--- + +DOC: WorkerSta.md +LINES: 149 +CLAIM: "`Dispose` calls `Shutdown` with a five-second budget and only disposes the wait handles when shutdown actually completed". +CLAIM_TYPE: behavior-rule +VERDICT: accurate +EVIDENCE: StaRuntime.cs:224-233 — `Shutdown(TimeSpan.FromSeconds(5))`; handles disposed only when `stopped` is true. +CODE_AREA: worker.sta +SEVERITY: low +PROPOSED_FIX: flag only + +--- + +DOC: WorkerSta.md +LINES: 108 +CLAIM: "when `commandQueue.Count` reaches `maxPendingCommands` (default `DefaultMaxPendingCommands = 128`) the dispatcher returns a synthetic `WorkerUnavailable` reply". +CLAIM_TYPE: config-key +VERDICT: accurate +EVIDENCE: StaCommandDispatcher.cs:11 (`DefaultMaxPendingCommands = 128`); lines 125-132 (count check and WorkerUnavailable reply). +CODE_AREA: worker.sta +SEVERITY: low +PROPOSED_FIX: flag only + +--- + +DOC: MxAccessWorkerInstanceDesign.md +LINES: 97 +CLAIM: Expected protected environment values include `MXGATEWAY_WORKER_LOG_CONTEXT=`. +CLAIM_TYPE: config-key +VERDICT: wrong +EVIDENCE: No occurrence of `MXGATEWAY_WORKER_LOG_CONTEXT` anywhere in `src/ZB.MOM.WW.MxGateway.Worker/**`. The only worker environment variable in code is `MXGATEWAY_WORKER_NONCE` (WorkerOptions.cs:7) and `MXGATEWAY_WORKER_PIPE_CONNECT_ATTEMPT_TIMEOUT_MS` (WorkerProcessLauncher.cs:22). +CODE_AREA: worker.launcher +SEVERITY: high +PROPOSED_FIX: Remove `MXGATEWAY_WORKER_LOG_CONTEXT` from the bootstrap environment table, or add a note that it is not yet implemented if it is intended for a future slice. + +--- + +DOC: MxAccessWorkerInstanceDesign.md +LINES: 86-99 +CLAIM: Bootstrap sequence lists `MXGATEWAY_WORKER_LOG_CONTEXT` as an optional protected environment value alongside `MXGATEWAY_WORKER_NONCE`. +CLAIM_TYPE: config-key +VERDICT: wrong +EVIDENCE: Same as above — `MXGATEWAY_WORKER_LOG_CONTEXT` is not read anywhere in the worker bootstrap code. +CODE_AREA: worker.launcher +SEVERITY: high +PROPOSED_FIX: flag only (same fix as prior entry). + +--- + +DOC: MxAccessWorkerInstanceDesign.md +LINES: 368-375 +CLAIM: "`MxAccessEventQueue` is the bounded outbound event queue for one worker session. It assigns the monotonic `WorkerSequence` and `WorkerTimestamp` when an event is accepted. The default capacity is `10000`. When the queue reaches capacity it records a `WorkerFaultCategory.QueueOverflow` fault and rejects further events." +CLAIM_TYPE: behavior-rule +VERDICT: stale +EVIDENCE: MxAccessEventQueue.cs:115-132 — `Enqueue` throws `MxAccessEventQueueOverflowException` in addition to recording the fault. Callers in `MxAccessBaseEventSink` catch this exception. The doc's phrase "rejects further events" omits the thrown exception, which callers must handle. +CODE_AREA: worker.sta +SEVERITY: low +PROPOSED_FIX: Add that `Enqueue` raises `MxAccessEventQueueOverflowException` on overflow, in addition to recording the fault, so that callers know to catch this exception rather than only observing the fault via `DrainFault()`. + +--- + +DOC: WorkerConversion.md +LINES: 1-262 (entire doc) +CLAIM: Documents `VariantConverter`, `HResultConverter`/`HResultConversion`, `MxStatusProxyConverter`, `MxStatusDetailText`, `MxStatusConversionException`. +CLAIM_TYPE: term +VERDICT: gap +EVIDENCE: src/ZB.MOM.WW.MxGateway.Worker/Conversion/VariantConverter.cs:129-177 — `ConvertToComValue(MxValue)` and `ConvertToComArray(MxArray)` are fully implemented methods that convert protobuf values back to CLR objects for COM write calls. These inverse-projection paths are nowhere mentioned in WorkerConversion.md, leaving integrators unaware of the write path. +CODE_AREA: worker.convert +SEVERITY: medium +PROPOSED_FIX: Add a section "Inverse projection for COM writes" describing `ConvertToComValue`, its dispatch on `MxValue.KindOneofCase`, the `ConvertToComArray` helper, and that raw or unset `MxValue` payloads throw `ArgumentException`. + +--- + +DOC: MxAccessWorkerInstanceDesign.md +LINES: 134-160 +CLAIM: Internal component tree for `MxAccess` subtree lists: `MxAccessSession`, `MxAccessCommandDispatcher`, `MxAccessEventSink`, `MxAccessHandleRegistry`. +CLAIM_TYPE: term +VERDICT: stale +EVIDENCE: Actual classes: `MxAccessSession` (internal session state), `MxAccessStaSession` (owner of the STA session lifecycle), `MxAccessCommandExecutor` (implements `IStaCommandExecutor`), `MxAccessBaseEventSink`/`MxAccessAlarmEventSink` (event sinks), `MxAccessHandleRegistry`. The class `MxAccessCommandDispatcher` does not exist. +CODE_AREA: worker.sta +SEVERITY: medium +PROPOSED_FIX: Update MxAccess subtree to reflect actual class names. Note that `MxAccessStaSession` owns `StaCommandDispatcher` (in the Sta namespace) and `MxAccessCommandExecutor`; they are separate concerns. + +--- + +DOC: MxAccessWorkerInstanceDesign.md +LINES: 134-160 (entire component tree) +CLAIM: No mention of the alarm subsystem. +CLAIM_TYPE: term +VERDICT: gap +EVIDENCE: src/ZB.MOM.WW.MxGateway.Worker/MxAccess/ contains a complete alarm subsystem: `AlarmCommandHandler.cs`, `AlarmDispatcher.cs`, `AlarmRecordTransitionMapper.cs`, `IAlarmCommandHandler.cs`, `IMxAccessAlarmConsumer.cs`, `MxAccessAlarmEventSink.cs`, `WnWrapAlarmConsumer.cs`, `MxAlarmSnapshot.cs`, `MxAlarmStateKind.cs`, `MxAlarmTransitionEvent.cs`. None of these appear in any of the six audited docs. `MxAccessStaSession.cs` shows an `alarmCommandHandlerFactory` parameter and an alarm poll loop (lines 14-312). +CODE_AREA: worker.sta +SEVERITY: high +PROPOSED_FIX: Add an "Alarm Subsystem" section to MxAccessWorkerInstanceDesign.md (or create docs/WorkerAlarms.md) covering: `IAlarmCommandHandler`/`AlarmCommandHandler`, the `WnWrapAlarmConsumer` STA-affinity requirement, the 500 ms alarm poll loop in `MxAccessStaSession.RunAlarmPollLoopAsync`, `AlarmDispatcher`, and the `MxAccessAlarmEventSink`. Update the event-sink list in the "Event Sink" section to include alarm events. + +--- + +DOC: MxAccessWorkerInstanceDesign.md +LINES: 336-338 +CLAIM: Event sink must subscribe to `OnDataChange`, `OnWriteComplete`, `OperationComplete`, `OnBufferedDataChange`. +CLAIM_TYPE: behavior-rule +VERDICT: gap +EVIDENCE: src/ZB.MOM.WW.MxGateway.Worker/MxAccess/MxAccessAlarmEventSink.cs exists alongside `MxAccessBaseEventSink.cs`, indicating a fifth event family (alarm events) is handled. The four-family list is incomplete. +CODE_AREA: worker.sta +SEVERITY: medium +PROPOSED_FIX: Add alarm events to the event sink subscription list and clarify that alarm events are handled via `MxAccessAlarmEventSink` on the same STA thread. + +--- + +DOC: WorkerConversion.md +LINES: 17-18 +CLAIM: "It accepts an optional `expectedDataType` so that an MXAccess attribute hint (for example `MxDataType.Time` for a 64-bit FILETIME) overrides the default CLR-driven projection." +CLAIM_TYPE: behavior-rule +VERDICT: accurate +EVIDENCE: VariantConverter.cs:262-291 (`ConvertInt64Scalar` checks `expectedDataType == MxDataType.Time && value is long`). +CODE_AREA: worker.convert +SEVERITY: low +PROPOSED_FIX: flag only + +--- + +DOC: WorkerConversion.md +LINES: 112-135 +CLAIM: "`HResultConverter.Convert` prefers `COMException.ErrorCode` over `Exception.HResult` because the runtime sometimes overwrites `Exception.HResult` while marshalling". +CLAIM_TYPE: behavior-rule +VERDICT: accurate +EVIDENCE: HResultConverter.cs:21-26. +CODE_AREA: worker.convert +SEVERITY: low +PROPOSED_FIX: flag only + +--- + +DOC: WorkerBootstrap.md +LINES: 48-54 +CLAIM: Three fields arrive on the command line (`--session-id`, `--pipe-name`, `--protocol-version`) and one via environment variable (`MXGATEWAY_WORKER_NONCE`). +CLAIM_TYPE: command +VERDICT: accurate +EVIDENCE: WorkerOptionsParser.cs:12-14, 78. +CODE_AREA: worker.launcher +SEVERITY: low +PROPOSED_FIX: flag only + +--- + +DOC: WorkerBootstrap.md +LINES: 155-159 +CLAIM: "`IWorkerLogger` exposes only `Information` and `Error`. There is no `Debug` or `Trace` level." +CLAIM_TYPE: behavior-rule +VERDICT: accurate +EVIDENCE: IWorkerLogger.cs:8-19. +CODE_AREA: worker.launcher +SEVERITY: low +PROPOSED_FIX: flag only + +--- + +DOC: WorkerSta.md +LINES: 34 +CLAIM: "`StaComApartmentInitializer.Initialize` calls `CoInitializeEx` with `COINIT_APARTMENTTHREADED` (`0x2`) and treats both `S_OK` and `S_FALSE` as success because `S_FALSE` indicates the apartment was already initialized on this thread." +CLAIM_TYPE: behavior-rule +VERDICT: accurate +EVIDENCE: StaComApartmentInitializer.cs:8-18. +CODE_AREA: worker.sta +SEVERITY: low +PROPOSED_FIX: flag only + +--- + +DOC: WorkerSta.md +LINES: 63-78 +CLAIM: "`StaMessagePump.WaitForWorkOrMessages` calls `MsgWaitForMultipleObjectsEx` with `QS_ALLINPUT` and `MWMO_INPUTAVAILABLE`. `PumpPendingMessages` drains the queue with `PM_REMOVE`." +CLAIM_TYPE: behavior-rule +VERDICT: accurate +EVIDENCE: StaMessagePump.cs:13-15 (`MwmoInputAvailable = 0x0004`, `PmRemove = 0x0001`, `QsAllInput = 0x04FF`); lines 31-36, 50-57. +CODE_AREA: worker.sta +SEVERITY: low +PROPOSED_FIX: flag only + +--- + +DOC: MxAccessWorkerInstanceDesign.md +LINES: 271-286 +CLAIM: COM details: interop assembly path, assembly identity (`ArchestrA.MxAccess, Version=3.2.0.0, PublicKeyToken=23106a86e706d0ae`), COM class `ArchestrA.MxAccess.LMXProxyServerClass`, CLSID `{C30B52F5-2CB5-4760-AF0A-3A344A7EB5DC}`, ProgID `LMXProxy.LMXProxyServer.1`, version-independent ProgID `LMXProxy.LMXProxyServer`, registered server `LmxProxy.dll`, threading model `Apartment`. +CLAIM_TYPE: path +VERDICT: accurate +EVIDENCE: src/ZB.MOM.WW.MxGateway.Worker/MxAccess/MxAccessInteropInfo.cs — ProgId, VersionIndependentProgId, Clsid, InteropAssemblyPath, RegisteredServerPath, ComClassName all match. Assembly identity and threading model are from MXAccess analysis sources and are unverifiable in this repo but consistent with design sources cited in CLAUDE.md. +CODE_AREA: worker.sta +SEVERITY: low +PROPOSED_FIX: flag only + +--- + +DOC: MxAccessWorkerInstanceDesign.md +LINES: 656-660 +CLAIM: "HeartbeatStuckCeiling (default 75 seconds = 5 × HeartbeatGrace)". +CLAIM_TYPE: config-key +VERDICT: accurate +EVIDENCE: WorkerPipeSessionOptions.cs:19 (`DefaultHeartbeatStuckCeiling = TimeSpan.FromSeconds(75)`); DefaultHeartbeatGrace = 15 s (line 11); 5 × 15 = 75. ✓ +CODE_AREA: worker.sta +SEVERITY: low +PROPOSED_FIX: flag only + +--- + +DOC: WorkerBootstrap.md +LINES: 5-6 +CLAIM: "The worker process is a short-lived child of the gateway." +CLAIM_TYPE: term +VERDICT: stale +EVIDENCE: No functional error, but "short-lived" is context-dependent; workers persist for the entire duration of a gateway session (which may be hours). Integrators might misread this as expecting sub-minute lifetimes. +CODE_AREA: worker.launcher +SEVERITY: low +PROPOSED_FIX: Replace "short-lived child" with "per-session child process" or "child process that lives for the duration of one gateway session." + +--- + +DOC: MxAccessWorkerInstanceDesign.md +LINES: 151 +CLAIM: Component tree lists `MxAccessSession` as a class under `MxAccess`. +CLAIM_TYPE: term +VERDICT: accurate +EVIDENCE: src/ZB.MOM.WW.MxGateway.Worker/MxAccess/MxAccessSession.cs exists. The tree is incomplete (missing `MxAccessStaSession`, alarm classes, etc.) but `MxAccessSession` itself is real. +CODE_AREA: worker.sta +SEVERITY: low +PROPOSED_FIX: flag only (incompleteness covered by the component-tree stale entry above). + +--- + +DOC: WorkerConversion.md +LINES: 18 +CLAIM: `VariantConverter` is in namespace `ZB.MOM.WW.MxGateway.Worker.Conversion`. +CLAIM_TYPE: path +VERDICT: accurate +EVIDENCE: VariantConverter.cs:8 (`namespace ZB.MOM.WW.MxGateway.Worker.Conversion;`). +CODE_AREA: worker.convert +SEVERITY: low +PROPOSED_FIX: flag only + +--- + +DOC: WorkerFrameProtocol.md +LINES: 49-53 +CLAIM: Build command `dotnet build src/ZB.MOM.WW.MxGateway.Worker/ZB.MOM.WW.MxGateway.Worker.csproj -p:Platform=x86`. +CLAIM_TYPE: command +VERDICT: accurate +EVIDENCE: Project file exists at that path. +CODE_AREA: worker.frameproto +SEVERITY: low +PROPOSED_FIX: flag only diff --git a/docs/audit/fragments/03-sessions.md b/docs/audit/fragments/03-sessions.md new file mode 100644 index 0000000..6504481 --- /dev/null +++ b/docs/audit/fragments/03-sessions.md @@ -0,0 +1,380 @@ +# Cluster 03 — Sessions/Runtime + +Auditor: automated (claude-sonnet-4-6) +Date: 2026-06-03 +Source doc: docs/Sessions.md +Verified against: src/ZB.MOM.WW.MxGateway.Server/Sessions/**, src/ZB.MOM.WW.MxGateway.Server/Workers/** + +--- + +DOC / LINES / 9 +CLAIM: "All four interfaces (`ISessionManager`, `ISessionRegistry`, `ISessionWorkerClientFactory`) plus `SessionShutdownHostedService` are wired as singletons by `SessionServiceCollectionExtensions.AddGatewaySessions`." +CLAIM_TYPE: term +VERDICT: wrong +EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Sessions/SessionServiceCollectionExtensions.cs:9-18 — only three interfaces exist (confirmed by `ls I*.cs` in Sessions/). The doc claims "four interfaces" but names only three. Additionally the DI registration also registers `SessionLeaseMonitorHostedService` as a hosted service, which is omitted from this sentence. +CODE_AREA: session.di +SEVERITY: medium +PROPOSED_FIX: Change "All four interfaces" to "All three interfaces". Separately note that two hosted services are registered: `SessionLeaseMonitorHostedService` and `SessionShutdownHostedService`. + +--- + +DOC / LINES / 265-276 +CLAIM: Code snippet for `AddGatewaySessions` shows only `SessionShutdownHostedService` registered; `SessionLeaseMonitorHostedService` is absent from the snippet. +CLAIM_TYPE: behavior-rule +VERDICT: stale +EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Sessions/SessionServiceCollectionExtensions.cs:14-15 — actual code registers both `AddHostedService()` and `AddHostedService()`. The snippet in the doc is missing the lease-monitor line. +CODE_AREA: session.di +SEVERITY: medium +PROPOSED_FIX: Add `services.AddHostedService();` to the code snippet (between the `ISessionManager` singleton line and the shutdown service line). + +--- + +DOC / LINES / 232-259 +CLAIM: The `ShutdownAsync` code snippet shown calls `session.KillWorker(GatewayShutdownReason)` and `await RemoveSessionAsync(session)` directly in the catch block. +CLAIM_TYPE: behavior-rule +VERDICT: stale +EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Sessions/SessionManager.cs:296-331 — the actual `ShutdownAsync` fallback calls `await KillWorkerAsync(session.SessionId, GatewayShutdownReason, cancellationToken)` (which routes through `KillWorkerWithCloseGateAsync` and then `RemoveSessionAsync`), not a direct `session.KillWorker` + `RemoveSessionAsync`. The old snippet predates the Server-045/Server-046 refactor that unified the kill path through `KillWorkerAsync`. +CODE_AREA: session.shutdown +SEVERITY: medium +PROPOSED_FIX: Replace the ShutdownAsync snippet with the current implementation, which checks `_registry.TryGet` then calls `KillWorkerAsync` (wrapped in its own try/catch) instead of directly calling `session.KillWorker` and `RemoveSessionAsync`. + +--- + +DOC / LINES / 55-59 +CLAIM: "`KillWorkerAsync` is the forceful path used by the dashboard's admin Kill button: it calls `GatewaySession.KillWorker` directly, which kills the worker process immediately with no graceful-shutdown attempt and transitions the session to `Closed`." +CLAIM_TYPE: behavior-rule +VERDICT: stale +EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Sessions/SessionManager.cs:216-264 — `KillWorkerAsync` now calls `session.KillWorkerWithCloseGateAsync` (not `GatewaySession.KillWorker` directly). The `KillWorkerWithCloseGateAsync` method acquires `_closeLock` before killing, serializing concurrent close/kill attempts (Server-045 fix). The old description of a direct `KillWorker` call is stale. +CODE_AREA: session.lifecycle +SEVERITY: medium +PROPOSED_FIX: Update description to state that `KillWorkerAsync` calls `session.KillWorkerWithCloseGateAsync`, which acquires the per-session close lock before killing the worker, so concurrent close and kill callers serialize. + +--- + +DOC / LINES / 59 +CLAIM: "Both paths converge on the same registry/metrics cleanup, so the open-session slot is released and `mxgateway.sessions.closed` is incremented either way." +CLAIM_TYPE: config-key +VERDICT: accurate +EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Metrics/GatewayMetrics.cs:59 — counter name `mxgateway.sessions.closed` confirmed. Both `CloseSessionCoreAsync` and `KillWorkerAsync` call `_metrics.SessionClosed()` and `RemoveSessionAsync` (which calls `ReleaseSessionSlot`). +CODE_AREA: session.metrics +SEVERITY: low +PROPOSED_FIX: flag only + +--- + +DOC / LINES / 60-72 +CLAIM: Code snippet for `EnsureSessionCapacity` throws `SessionManagerException` with `SessionLimitExceeded`; open requests that exceed the bound "throw ... rather than queuing". +CLAIM_TYPE: behavior-rule +VERDICT: accurate +EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Sessions/SessionManager.cs:388-396 — `_sessionSlots.Wait(0)` (zero timeout = non-blocking) confirms the no-queue, immediate-throw behavior. +CODE_AREA: session.lifecycle +SEVERITY: low +PROPOSED_FIX: flag only + +--- + +DOC / LINES / 61 +CLAIM: "Concurrency is bounded by a `SemaphoreSlim` initialized to `GatewayOptions.Sessions.MaxSessions`." +CLAIM_TYPE: config-key +VERDICT: accurate +EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Sessions/SessionManager.cs:53 — `new SemaphoreSlim(_options.Sessions.MaxSessions, _options.Sessions.MaxSessions)`. +CODE_AREA: session.lifecycle +SEVERITY: low +PROPOSED_FIX: flag only + +--- + +DOC / LINES / 75 +CLAIM: "three close-reason constants — `DefaultCloseReason` (`\"client-close\"`), `GatewayShutdownReason` (`\"gateway-shutdown\"`), and `LeaseExpiredReason` (`\"lease-expired\"`)" +CLAIM_TYPE: term +VERDICT: accurate +EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Sessions/SessionManager.cs:17-19 — all three constants confirmed with exact string values. +CODE_AREA: session.lifecycle +SEVERITY: low +PROPOSED_FIX: flag only + +--- + +DOC / LINES / 79-81 +CLAIM: "`SessionRegistry` is a thin wrapper over a `ConcurrentDictionary` keyed by session id with `StringComparer.Ordinal`." +CLAIM_TYPE: term +VERDICT: accurate +EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Sessions/SessionRegistry.cs:12 — `new ConcurrentDictionary(StringComparer.Ordinal)` confirmed. +CODE_AREA: session.registry +SEVERITY: low +PROPOSED_FIX: flag only + +--- + +DOC / LINES / 81 +CLAIM: "`ActiveCount` filters out sessions whose state is `Closed`" +CLAIM_TYPE: behavior-rule +VERDICT: accurate +EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Sessions/SessionRegistry.cs:22 — `_sessions.Values.Count(session => session.State is not SessionState.Closed)` confirmed. +CODE_AREA: session.registry +SEVERITY: low +PROPOSED_FIX: flag only + +--- + +DOC / LINES / 15-19 +CLAIM: "The session id is an opaque string in the form `session-{guid:N}` and the per-session pipe name is `mxaccess-gateway-{ProcessId}-{SessionId}`." +CLAIM_TYPE: term +VERDICT: accurate +EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Sessions/SessionManager.cs:433 (`pipeName = $"mxaccess-gateway-{Environment.ProcessId}-{sessionId}"`) and :479 (`$"session-{Guid.NewGuid():N}"`). +CODE_AREA: session.lifecycle +SEVERITY: low +PROPOSED_FIX: flag only + +--- + +DOC / LINES / 19 +CLAIM: "`SessionState` itself is the protobuf-generated enum from `ZB.MOM.WW.MxGateway.Contracts.Proto`" +CLAIM_TYPE: term +VERDICT: accurate +EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Sessions/GatewaySession.cs:1 — `using ZB.MOM.WW.MxGateway.Contracts.Proto;` and the state field is typed `SessionState`. +CODE_AREA: session.lifecycle +SEVERITY: low +PROPOSED_FIX: flag only + +--- + +DOC / LINES / 85-87 +CLAIM: "`SessionWorkerClientFactory.CreateAsync` … drives the session through the protobuf `SessionState` substates in order: `StartingWorker`, `WaitingForPipe`, `Handshaking`, `InitializingWorker`." +CLAIM_TYPE: behavior-rule +VERDICT: accurate +EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Sessions/SessionWorkerClientFactory.cs:60-105 — `TransitionTo(SessionState.StartingWorker)` → `TransitionTo(SessionState.WaitingForPipe)` → `TransitionTo(SessionState.Handshaking)` → `TransitionTo(SessionState.InitializingWorker)` in sequence. +CODE_AREA: session.lifecycle +SEVERITY: low +PROPOSED_FIX: flag only + +--- + +DOC / LINES / 87-98 +CLAIM: Startup timeout wrapped as `TimeoutException` with the exact catch pattern shown — `OperationCanceledException` where `startupCancellation.IsCancellationRequested` and `!cancellationToken.IsCancellationRequested`. +CLAIM_TYPE: behavior-rule +VERDICT: accurate +EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Sessions/SessionWorkerClientFactory.cs:145-153 — identical predicate confirmed. +CODE_AREA: session.lifecycle +SEVERITY: low +PROPOSED_FIX: flag only + +--- + +DOC / LINES / 100 +CLAIM: "The named pipe is created with `maxNumberOfServerInstances: 1`" +CLAIM_TYPE: behavior-rule +VERDICT: accurate +EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Sessions/SessionWorkerClientFactory.cs:166 — `maxNumberOfServerInstances: 1` confirmed. +CODE_AREA: session.lifecycle +SEVERITY: low +PROPOSED_FIX: flag only + +--- + +DOC / LINES / 104 +CLAIM: "`SessionShutdownHostedService` … catches `OperationCanceledException` triggered by the host shutdown timeout and logs a warning so that an over-running shutdown does not surface as an unhandled exception." +CLAIM_TYPE: behavior-rule +VERDICT: accurate +EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Sessions/SessionShutdownHostedService.cs:18-28 — exact catch confirmed. +CODE_AREA: session.shutdown +SEVERITY: low +PROPOSED_FIX: flag only + +--- + +DOC / LINES / 109-127 +CLAIM: `SessionOpenRequest` is a `sealed record` with fields `RequestedBackend`, `ClientSessionName`, `ClientCorrelationId`, `CommandTimeout`, and a `FromContract` factory. +CLAIM_TYPE: term +VERDICT: accurate +EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Sessions/SessionOpenRequest.cs:6-24 — confirmed. Note: the doc snippet includes a `ClientCorrelationId` field in the record definition, but the actual `SessionManager.CreateSession` derives `clientCorrelationId` internally rather than forwarding the field from the request. This is a minor mismatch between what the record holds vs. how it is used, but does not constitute an error in the doc's description of the record type itself. +CODE_AREA: session.lifecycle +SEVERITY: low +PROPOSED_FIX: flag only + +--- + +DOC / LINES / 134-139 +CLAIM: `SessionCloseResult` is a `sealed record` with `SessionId`, `FinalState`, `AlreadyClosed`. +CLAIM_TYPE: term +VERDICT: accurate +EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Sessions/SessionCloseResult.cs:5-8 — confirmed. +CODE_AREA: session.lifecycle +SEVERITY: low +PROPOSED_FIX: flag only + +--- + +DOC / LINES / 143 +CLAIM: "`SessionCloseStartedException` is `internal`" +CLAIM_TYPE: term +VERDICT: accurate +EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Sessions/SessionCloseStartedException.cs:3 — `internal sealed class SessionCloseStartedException` confirmed. +CODE_AREA: session.lifecycle +SEVERITY: low +PROPOSED_FIX: flag only + +--- + +DOC / LINES / 148-157 +CLAIM: Error code table for `SessionManagerException` — seven codes listed: `SessionNotFound`, `SessionNotReady`, `EventSubscriberAlreadyActive`, `EventQueueOverflow`, `SessionLimitExceeded`, `OpenFailed`, `CloseFailed`. +CLAIM_TYPE: term +VERDICT: accurate +EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Sessions/SessionManagerErrorCode.cs:1-12 — all seven members confirmed in order. +CODE_AREA: session.lifecycle +SEVERITY: low +PROPOSED_FIX: flag only + +--- + +DOC / LINES / 163-188 +CLAIM: Open failure rollback order: "fault, deregister, dispose, release slot, record metric, log, rethrow". +CLAIM_TYPE: behavior-rule +VERDICT: stale +EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Sessions/SessionManager.cs:97-123 — actual order is: MarkFaulted → TryRemove (deregister) → DisposeAsync → (conditionally) SessionRemoved metric if sessionOpenedRecorded → ReleaseSessionSlot → Fault metric → LogWarning → rethrow. The doc omits the `sessionOpenedRecorded` conditional `SessionRemoved()` call that was added in the Server-006 fix, making the described order incomplete. The doc text says "release slot, record metric" but the actual code calls `SessionRemoved` before `ReleaseSessionSlot` when `sessionOpenedRecorded` is true. +CODE_AREA: session.lifecycle +SEVERITY: medium +PROPOSED_FIX: Update the rollback description to note the conditional `SessionRemoved()` metric call that precedes `ReleaseSessionSlot` when `SessionOpened()` was already recorded (guards against mxgateway.sessions.open gauge leak on late failures such as auto-subscribe rejection). + +--- + +DOC / LINES / 193-195 +CLAIM: "`GatewaySession` also exposes typed bulk helpers (`AddItemBulkAsync`, `SubscribeBulkAsync`, etc.) that wrap `WorkerCommand` round-trips and translate non-`Ok` `ProtocolStatus` replies into `SessionManagerException` with `SessionNotReady`." +CLAIM_TYPE: term +VERDICT: accurate +EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Sessions/GatewaySession.cs:490, 590 (AddItemBulkAsync, SubscribeBulkAsync) and :1017-1023 (ProtocolStatusCode.Ok guard throwing SessionManagerException(SessionNotReady)). +CODE_AREA: session.lifecycle +SEVERITY: low +PROPOSED_FIX: flag only + +--- + +DOC / LINES / 195-197 +CLAIM: "Event streaming uses `AttachEventSubscriber` which returns a disposable lease. When `allowMultipleSubscribers` is false the second attach throws `EventSubscriberAlreadyActive`; this prevents two gRPC streams from racing on the same worker event channel. Active event subscribers keep the session lease from expiring until the stream is disposed." +CLAIM_TYPE: behavior-rule +VERDICT: accurate +EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Sessions/GatewaySession.cs:387-407 (AttachEventSubscriber guard and lease) and :373-380 (IsLeaseExpired checks `_activeEventSubscriberCount == 0`). +CODE_AREA: session.subscriber +SEVERITY: low +PROPOSED_FIX: flag only + +--- + +DOC / LINES / 197 +CLAIM: "Sessions open with `MxGateway:Sessions:DefaultLeaseSeconds` (default 1800)" +CLAIM_TYPE: config-key +VERDICT: accurate +EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Configuration/SessionOptions.cs:21 — `public int DefaultLeaseSeconds { get; init; } = 1800`. +CODE_AREA: session.lifecycle +SEVERITY: low +PROPOSED_FIX: flag only + +--- + +DOC / LINES / 197 +CLAIM: "`SessionLeaseMonitorHostedService` runs that sweep every `MxGateway:Sessions:LeaseSweepIntervalSeconds` seconds (default 30)." +CLAIM_TYPE: config-key +VERDICT: accurate +EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Configuration/SessionOptions.cs:24 — `public int LeaseSweepIntervalSeconds { get; init; } = 30`; src/ZB.MOM.WW.MxGateway.Server/Sessions/SessionLeaseMonitorHostedService.cs:19 — `TimeSpan.FromSeconds(Math.Max(1, options.Value.Sessions.LeaseSweepIntervalSeconds))`. +CODE_AREA: session.lifecycle +SEVERITY: low +PROPOSED_FIX: flag only + +--- + +DOC / LINES / 230 +CLAIM: "`GatewaySession.KillWorker` is the unconditional forced-close path used by shutdown when graceful close itself throws, and also by `SessionManager.KillWorkerAsync` — the explicit kill path that the dashboard's admin Kill button invokes." +CLAIM_TYPE: behavior-rule +VERDICT: stale +EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Sessions/SessionManager.cs:233 — `KillWorkerAsync` now calls `session.KillWorkerWithCloseGateAsync` (not `session.KillWorker`). The shutdown fallback (line 319) also routes through `KillWorkerAsync` rather than calling `session.KillWorker` + `RemoveSessionAsync` directly. `GatewaySession.KillWorker` is still present (line 874) but is no longer the entry point from `SessionManager.KillWorkerAsync`. +CODE_AREA: session.lifecycle +SEVERITY: medium +PROPOSED_FIX: Update to reflect that `SessionManager.KillWorkerAsync` delegates to `session.KillWorkerWithCloseGateAsync` (which serializes concurrent kill/close via `_closeLock` — Server-045 fix) and that `GatewaySession.KillWorker` is now only the internal terminal action inside `KillWorkerWithCloseGateAsync`. + +--- + +DOC / LINES / 230 +CLAIM: "`KillCount` increments while `ShutdownCount` does not" +CLAIM_TYPE: term +VERDICT: wrong +EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Metrics/GatewayMetrics.cs:56-79 — no metrics named `KillCount` or `ShutdownCount` exist. The actual worker-kill metric is `mxgateway.workers.killed` (counter). The doc invents non-existent metric names. +CODE_AREA: session.metrics +SEVERITY: high +PROPOSED_FIX: Replace "KillCount increments while ShutdownCount does not" with "the `mxgateway.workers.killed` counter is incremented (via `GatewayMetrics.WorkerKilled`) while the graceful-shutdown path does not increment it". + +--- + +DOC / LINES / 265 +CLAIM: "registers the four singletons and the hosted service" (singular "the hosted service") +CLAIM_TYPE: behavior-rule +VERDICT: wrong +EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Sessions/SessionServiceCollectionExtensions.cs:14-15 — two hosted services are registered: `SessionLeaseMonitorHostedService` and `SessionShutdownHostedService`. +CODE_AREA: session.di +SEVERITY: medium +PROPOSED_FIX: Change "registers the four singletons and the hosted service" to "registers the three singletons and two hosted services (`SessionLeaseMonitorHostedService`, `SessionShutdownHostedService`)". + +--- + +DOC / LINES / 279 +CLAIM: "Registering `SessionShutdownHostedService` last ensures it is constructed after `ISessionManager` and therefore drains sessions during host stop." +CLAIM_TYPE: behavior-rule +VERDICT: stale +EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Sessions/SessionServiceCollectionExtensions.cs:14-15 — `SessionLeaseMonitorHostedService` is now registered before `SessionShutdownHostedService`. The shutdown service is still last of the two hosted services, but the reasoning in the doc no longer fully applies because construction order of hosted services relative to singletons is governed by ASP.NET Core's DI container, not purely registration order. +CODE_AREA: session.di +SEVERITY: low +PROPOSED_FIX: Update to note that two hosted services are registered in order (lease monitor first, shutdown second) and that both depend on `ISessionManager` which is registered as a singleton. + +--- + +DOC / LINES / (none — gap) +CLAIM: (gap) `GatewaySession` holds an item registration dictionary (`_items`, keyed by `(ServerHandle, ItemHandle)`) tracking all successfully added/subscribed items. The session tracks and prunes these registrations via `TrackCommandReply`, `TryGetItemRegistration`, and the per-command `TrackItem`/`RemoveItems` helpers. This bookkeeping is undocumented. +CLAIM_TYPE: behavior-rule +VERDICT: gap +EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Sessions/GatewaySession.cs:17 (_items field), :425-481 (TrackCommandReply), :1059-1090 (TrackItem, TrackBulkItems, RemoveItems). src/ZB.MOM.WW.MxGateway.Server/Sessions/SessionItemRegistration.cs:3 (SessionItemRegistration record). +CODE_AREA: session.lifecycle +SEVERITY: low +PROPOSED_FIX: Add a subsection or paragraph noting that `GatewaySession` maintains an in-session item registry keyed by `(ServerHandle, ItemHandle)`, updated after successful `AddItem`, `AddItem2`, `AddBufferedItem`, `AddItemBulk`, `SubscribeBulk`, `RemoveItem`, `RemoveItemBulk`, and `UnsubscribeBulk` replies. + +--- + +DOC / LINES / (none — gap) +CLAIM: (gap) `SessionOptions` exposes `AllowMultipleEventSubscribers` (default `false`). Setting it `true` is **rejected at startup** by `GatewayOptionsValidator` with the message "AllowMultipleEventSubscribers is not supported until event fan-out is implemented." This validator-level enforcement of the v1 constraint is undocumented. +CLAIM_TYPE: config-key +VERDICT: gap +EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Configuration/SessionOptions.cs:29 and src/ZB.MOM.WW.MxGateway.Server/Configuration/GatewayOptionsValidator.cs:181-184. +CODE_AREA: session.subscriber +SEVERITY: medium +PROPOSED_FIX: Add a note to the "Run" section explaining that `MxGateway:Sessions:AllowMultipleEventSubscribers` exists but is actively refused by the validator in v1; operators who set it to `true` will see a startup validation failure, not a runtime error. + +--- + +DOC / LINES / (none — gap) +CLAIM: (gap) Gateway-restart orphan cleanup is performed by `OrphanWorkerCleanupHostedService` (wrapping `OrphanWorkerTerminator.TerminateOrphans`) on `StartAsync`, before the gateway accepts sessions. Cleanup is best-effort (a failure logs a warning but does not block startup). The `Sessions.md` doc does not mention this, yet it directly affects the "gateway restart does not reattach orphan workers" contract. +CLAIM_TYPE: behavior-rule +VERDICT: gap +EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Workers/OrphanWorkerCleanupHostedService.cs:7-30; src/ZB.MOM.WW.MxGateway.Server/Workers/OrphanWorkerTerminator.cs:49-95; src/ZB.MOM.WW.MxGateway.Server/Workers/WorkerServiceCollectionExtensions.cs:19. +CODE_AREA: session.orphan +SEVERITY: high +PROPOSED_FIX: Add a "Gateway Restart / Orphan Cleanup" section to Sessions.md (or cross-reference from Shutdown Coordination) noting that `OrphanWorkerCleanupHostedService` runs `OrphanWorkerTerminator.TerminateOrphans` on startup, kills any running worker executables matching the configured `MxGateway:Worker:ExecutablePath`, and that failures are non-fatal to startup. + +--- + +DOC / LINES / (none — gap) +CLAIM: (gap) `SessionOptions.MaxPendingCommandsPerSession` (default 128) is passed to `WorkerClientOptions.MaxPendingCommands` during session construction. This per-session command concurrency cap is not documented in Sessions.md. +CLAIM_TYPE: config-key +VERDICT: gap +EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Configuration/SessionOptions.cs:18; src/ZB.MOM.WW.MxGateway.Server/Sessions/SessionWorkerClientFactory.cs:92. +CODE_AREA: session.lifecycle +SEVERITY: low +PROPOSED_FIX: Add a note in the "Key Types — SessionManager" or "Run" section that each session is bounded to `MxGateway:Sessions:MaxPendingCommandsPerSession` (default 128) concurrent in-flight worker commands. + +--- + +DOC / LINES / (none — gap) +CLAIM: (gap) `GatewaySession` exposes a `KillWorkerWithCloseGateAsync` method that acquires `_closeLock` before killing, introduced to serialize concurrent close/kill callers (Server-045). This method is not mentioned; the doc describes only `KillWorker` as the unconditional kill path from `SessionManager`. +CLAIM_TYPE: term +VERDICT: gap +EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Sessions/GatewaySession.cs:896-917; src/ZB.MOM.WW.MxGateway.Server/Sessions/SessionManager.cs:233. +CODE_AREA: session.lifecycle +SEVERITY: low +PROPOSED_FIX: Mention `KillWorkerWithCloseGateAsync` in the "Close" section as the locked kill path now used by `SessionManager.KillWorkerAsync`, distinguishing it from the bare `KillWorker` still used as the internal terminal action. diff --git a/docs/audit/fragments/04-auth.md b/docs/audit/fragments/04-auth.md new file mode 100644 index 0000000..2670b9e --- /dev/null +++ b/docs/audit/fragments/04-auth.md @@ -0,0 +1,437 @@ +# Cluster 04 — Auth + +Auditor: Claude Code (claude-sonnet-4-6) +Date: 2026-06-03 +Docs audited: docs/Authentication.md, docs/Authorization.md, glauth.md +Code verified against: src/ZB.MOM.WW.MxGateway.Server/Security/** and Dashboard/** + +--- + +DOC / Authentication.md / LINES 253–271 +CLAIM / `AuthStoreServiceCollectionExtensions.AddSqliteAuthStore` wires services via direct `AddSingleton` calls for `IApiKeyParser`, `IApiKeySecretHasher`, `IApiKeyVerifier`, `IApiKeyStore`/`SqliteApiKeyStore`, `IApiKeyAdminStore`/`SqliteApiKeyAdminStore`, `IApiKeyAuditStore`/`SqliteApiKeyAuditStore`, `AuthSqliteConnectionFactory`, `IAuthStoreMigrator`/`SqliteAuthStoreMigrator`, `AuthStoreMigrationHostedService`. +CLAIM_TYPE / behavior-rule +VERDICT / stale +EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Security/Authentication/AuthStoreServiceCollectionExtensions.cs:67 — the shared library `ZB.MOM.WW.Auth.ApiKeys` is registered via `services.AddZbApiKeyAuth(effectiveConfig, AuthenticationSectionPath)`, which owns all of those types. The local method no longer registers them individually. The doc code block is a fabricated snapshot of pre-migration code that no longer matches any method in the codebase. +CODE_AREA / auth.apikeys +SEVERITY / high +PROPOSED_FIX / Replace the Registration section code block with the actual method body from AuthStoreServiceCollectionExtensions.cs (calls AddZbApiKeyAuth, then registers CanonicalForwardingApiKeyAuditStore, SqliteCanonicalAuditStore, IAuditWriter, ApiKeyAdminCommands, ApiKeyAdminCliRunner). Remove the statement that AddSqliteAuthStore "registers the migration hosted service" — the hosted service is registered by AddZbApiKeyAuth, not by local code. + +--- + +DOC / Authentication.md / LINES 53–68 +CLAIM / `ApiKeySecretHasher` (registered behind `IApiKeySecretHasher`) hashes secrets with `HMACSHA256` keyed by a server-side pepper. The pepper is resolved by `IConfiguration` lookup against `PepperSecretName`. `ApiKeyPepperUnavailableException` is thrown when the pepper is missing. +CLAIM_TYPE / behavior-rule +VERDICT / stale +EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Security/Authentication/AuthStoreServiceCollectionExtensions.cs:5–8 — these types (`ApiKeySecretHasher`, `IApiKeySecretHasher`, `ApiKeyPepperUnavailableException`) now live in the shared package `ZB.MOM.WW.Auth.ApiKeys` (PackageReference in .csproj line 11). The behavior is correct but the doc presents them as if they are local gateway types. The interceptor's return type is `ApiKeyVerification` not `ApiKeyVerificationResult` (AuthStoreServiceCollectionExtensions.cs context; GatewayGrpcAuthorizationInterceptor.cs:69). +CODE_AREA / auth.apikeys +SEVERITY / medium +PROPOSED_FIX / Clarify that `ApiKeySecretHasher`, `IApiKeySecretHasher`, and `ApiKeyPepperUnavailableException` are provided by the `ZB.MOM.WW.Auth.ApiKeys` shared library, not gateway-local types. Correct `ApiKeyVerificationResult` → `ApiKeyVerification` (the type returned by `IApiKeyVerifier.VerifyAsync` in the interceptor). + +--- + +DOC / Authentication.md / LINES 72–98 +CLAIM / `ApiKeyVerifier` (`IApiKeyVerifier`) step 5: "Compare hashes with `CryptographicOperations.FixedTimeEquals`." Step 6: "Record a `LastUsedUtc` timestamp via `MarkKeyUsedAsync` and return an `ApiKeyIdentity`." Code block shows `ApiKeyVerificationResult.Fail(ApiKeyVerificationFailure.SecretMismatch)` and `ApiKeyVerificationResult.Success(new ApiKeyIdentity(...))`. +CLAIM_TYPE / behavior-rule +VERDICT / stale +EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Security/Authorization/GatewayGrpcAuthorizationInterceptor.cs:69 — the interceptor receives `ApiKeyVerification verification`, not `ApiKeyVerificationResult`. These types are from the shared package `ZB.MOM.WW.Auth.ApiKeys` which was migrated to. The types, method signatures, and return types shown in the code block may have been renamed or restructured during the migration to the shared library; the gateway no longer owns or contains these implementations. +CODE_AREA / auth.apikeys +SEVERITY / medium +PROPOSED_FIX / Update type names to match the shared library (`ApiKeyVerification` instead of `ApiKeyVerificationResult`). Add note that `ApiKeyVerifier` is from `ZB.MOM.WW.Auth.ApiKeys`. Verify failure enum values against the shared library. + +--- + +DOC / Authentication.md / LINES 108–122 +CLAIM / "`AuthSqliteConnectionFactory` reads `GatewayOptions.Authentication.SqlitePath`" +CLAIM_TYPE / term +VERDICT / stale +EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Security/Authentication/AuthStoreServiceCollectionExtensions.cs:67 — `AuthSqliteConnectionFactory` is now registered by `AddZbApiKeyAuth` from the shared package. The doc implies it is a local type that reads the gateway's `GatewayOptions`, but it is actually from `ZB.MOM.WW.Auth.ApiKeys` and reads `ApiKeyOptions.SqlitePath` (bound from `MxGateway:Authentication` section). The behavior is equivalent but the doc is misleading about the type ownership. +CODE_AREA / auth.apikeys +SEVERITY / low +PROPOSED_FIX / Note that `AuthSqliteConnectionFactory` is from `ZB.MOM.WW.Auth.ApiKeys` and reads `ApiKeyOptions.SqlitePath` (bound via `MxGateway:Authentication:SqlitePath`). + +--- + +DOC / Authentication.md / LINES 126–133 +CLAIM / "`SqliteAuthSchema` declares table names and the current schema version as constants. Three tables are involved: `api_keys`, `api_key_audit`, `schema_version`." +CLAIM_TYPE / behavior-rule +VERDICT / stale +EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Security/Authentication/AuthStoreServiceCollectionExtensions.cs:69–74 — a new `audit_event` table now exists in the same SQLite file, written by `SqliteCanonicalAuditStore`. The `api_key_audit` table is left in place but nothing writes to it once the `CanonicalForwardingApiKeyAuditStore` adapter overrides the library's audit store. The doc says only three tables; there are now at minimum four. +CODE_AREA / auth.apikeys +SEVERITY / medium +PROPOSED_FIX / Add `audit_event` as a fourth table (from `SqliteCanonicalAuditStore`). Note that `api_key_audit` is retained by the schema but is no longer written to at runtime (the `CanonicalForwardingApiKeyAuditStore` adapter redirects all writes to `audit_event` via `IAuditWriter`). + +--- + +DOC / Authentication.md / LINES 134–153 +CLAIM / "`SqliteApiKeyStore` (`IApiKeyStore`) handles the two reads needed at request time: `FindByKeyIdAsync` and `FindActiveByKeyIdAsync`. `MarkKeyUsedAsync` updates `last_used_utc` only for non-revoked rows." Shows `ApiKeyRecordReader.Read` code block with column-ordinal reader. +CLAIM_TYPE / behavior-rule +VERDICT / stale +EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Security/Authentication/AuthStoreServiceCollectionExtensions.cs:67 — `SqliteApiKeyStore` is in the shared package `ZB.MOM.WW.Auth.ApiKeys`. The code block shown is from the package, not local gateway code. If the package's internal implementation has changed, the doc may be inaccurate. The doc presents this as if it is local gateway source. +CODE_AREA / auth.apikeys +SEVERITY / low +PROPOSED_FIX / Clarify that `SqliteApiKeyStore`, `ApiKeyRecord`, and `ApiKeyRecordReader` are in the shared `ZB.MOM.WW.Auth.ApiKeys` package and are not directly modifiable in this repository. Remove or label the code block as "from shared library." + +--- + +DOC / Authentication.md / LINES 156–164 +CLAIM / "`SqliteApiKeyAdminStore` (`IApiKeyAdminStore`) implements administrative mutations: `CreateAsync`, `RevokeAsync`, `RotateAsync`, `DeleteAsync`." +CLAIM_TYPE / term +VERDICT / stale +EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Security/Authentication/AuthStoreServiceCollectionExtensions.cs:67 — `SqliteApiKeyAdminStore` is in `ZB.MOM.WW.Auth.ApiKeys`. The gateway now wraps admin operations through `ApiKeyAdminCommands` (from the same package), not by injecting `IApiKeyAdminStore` directly in the CLI runner. `DashboardSnapshotService` and `DashboardApiKeyManagementService` do consume `IApiKeyAdminStore` directly, which is fine. +CODE_AREA / auth.apikeys +SEVERITY / low +PROPOSED_FIX / Note that `SqliteApiKeyAdminStore` is from the shared library. Note that the gateway CLI runner delegates through `ApiKeyAdminCommands` (shared library), not by calling `IApiKeyAdminStore` directly. + +--- + +DOC / Authentication.md / LINES 165–183 +CLAIM / "`SqliteAuthStoreMigrator` executes the migration inside a single transaction so a partial failure leaves the database untouched, refuses to start when the on-disk schema version is newer than the binary supports, and idempotently creates the v1 schema." "Operators who manage schema out-of-band can disable the hosted run and use the admin CLI's `init-db` command instead." +CLAIM_TYPE / behavior-rule +VERDICT / stale +EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Security/Authentication/AuthStoreServiceCollectionExtensions.cs:104 — `SqliteAuthStoreMigrator` is from `ZB.MOM.WW.Auth.ApiKeys` (resolved via `sp.GetRequiredService()`). The description of its behavior is likely still accurate but is presented as locally-owned code. `AuthStoreMigrationHostedService` is also from the shared package (registered by `AddZbApiKeyAuth`). The code block shown at lines 171–179 is from the package. +CODE_AREA / auth.apikeys +SEVERITY / low +PROPOSED_FIX / Clarify that `SqliteAuthStoreMigrator`, `IAuthStoreMigrator`, and `AuthStoreMigrationHostedService` are from the shared library. + +--- + +DOC / Authentication.md / LINES 187–208 +CLAIM / CLI subcommand table lists: `init-db`, `create-key`, `list-keys`, `revoke-key`, `rotate-key`. CLI example uses `mxgateway apikey create-key --key-id ops.alice --display-name "Alice (ops)" --scopes read,write`. +CLAIM_TYPE / command +VERDICT / wrong +EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Security/Authorization/GatewayScopes.cs:5–13 — `GatewayScopes.All` contains `session:open`, `session:close`, `invoke:read`, `invoke:write`, `invoke:secure`, `events:read`, `metadata:read`, `admin`. The values `read` and `write` are not in the scope catalog. `ApiKeyAdminCommandLineParser.ValidateScopes` at line 170–177 would reject `--scopes read,write` as unknown scopes. +CODE_AREA / auth.scopes +SEVERITY / high +PROPOSED_FIX / Replace `--scopes read,write` with valid scope strings, e.g. `--scopes invoke:read,invoke:write`. Update all CLI examples in Authentication.md to use canonical scope strings from `GatewayScopes.All`. + +--- + +DOC / Authentication.md / LINES 229–248 +CLAIM / "`ApiKeyScopeSerializer.Serialize` writes a JSON array sorted with `StringComparer.Ordinal`." Code block shown. +CLAIM_TYPE / behavior-rule +VERDICT / stale +EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Security/Authentication/AuthStoreServiceCollectionExtensions.cs:5 — `ApiKeyScopeSerializer` is from the shared `ZB.MOM.WW.Auth.ApiKeys` package. The behavior described is likely correct but is presented as local gateway code. +CODE_AREA / auth.apikeys +SEVERITY / low +PROPOSED_FIX / Note that `ApiKeyScopeSerializer` is in the shared `ZB.MOM.WW.Auth.ApiKeys` library. + +--- + +DOC / Authorization.md / LINES 107–113 +CLAIM / Scope resolver code block includes `TestConnectionRequest or GetLastDeployTimeRequest or DiscoverHierarchyRequest or WatchDeployEventsRequest => GatewayScopes.MetadataRead`. +CLAIM_TYPE / rpc/proto +VERDICT / stale +EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Security/Authorization/GatewayGrpcScopeResolver.cs:23–28 — the actual resolver also includes `BrowseChildrenRequest => GatewayScopes.MetadataRead` in the same arm. `BrowseChildrenRequest` was added (per docs/plans/2026-05-28-lazy-browse-implementation.md) but the code block in Authorization.md was not updated. +CODE_AREA / auth.scopes +SEVERITY / high +PROPOSED_FIX / Add `BrowseChildrenRequest` to the `MetadataRead` arm of the scope resolver code block. Update the scope catalog table at line 212 to include `GalaxyRepository.BrowseChildren` in the `MetadataRead` row. + +--- + +DOC / Authorization.md / LINE 212 +CLAIM / Scope catalog table row: `MetadataRead` / `metadata:read` / "`MxCommandKind.ArchestraUserToId`, `MxCommandKind.GetSessionState`, `MxCommandKind.GetWorkerInfo`, `GalaxyRepository.TestConnection`, `GalaxyRepository.GetLastDeployTime`, `GalaxyRepository.DiscoverHierarchy`, `GalaxyRepository.WatchDeployEvents`". +CLAIM_TYPE / rpc/proto +VERDICT / stale +EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Security/Authorization/GatewayGrpcScopeResolver.cs:27 — `BrowseChildrenRequest` is also mapped to `metadata:read` but is absent from the table. +CODE_AREA / auth.scopes +SEVERITY / high +PROPOSED_FIX / Add `GalaxyRepository.BrowseChildren` to the `MetadataRead` row of the scope catalog table. + +--- + +DOC / Authorization.md / LINES 260–270 +CLAIM / Registration code block for `AddGatewayGrpcAuthorization` shows three `AddSingleton` calls: `GatewayGrpcScopeResolver`, `IGatewayRequestIdentityAccessor`/`GatewayRequestIdentityAccessor`, `GatewayGrpcAuthorizationInterceptor`, then `AddGrpc`. +CLAIM_TYPE / behavior-rule +VERDICT / stale +EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Security/Authorization/GrpcAuthorizationServiceCollectionExtensions.cs:18–31 — the actual method also registers `IConstraintEnforcer`/`ConstraintEnforcer` as a singleton (line 20) and configures `GrpcServiceOptions` with `MaxReceiveMessageSize`/`MaxSendMessageSize` from `MxGateway:Protocol`. The doc code block omits both. +CODE_AREA / auth.scopes +SEVERITY / medium +PROPOSED_FIX / Update the Registration code block to include `services.AddSingleton()` and the `AddOptions` configuration block for message size limits. + +--- + +DOC / Authorization.md / LINE 273 +CLAIM / "none of the three classes hold per-request state on instance fields" +CLAIM_TYPE / behavior-rule +VERDICT / stale +EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Security/Authorization/GrpcAuthorizationServiceCollectionExtensions.cs:20 — there are now four singleton classes registered by `AddGatewayGrpcAuthorization` (`GatewayGrpcScopeResolver`, `GatewayRequestIdentityAccessor`, `GatewayGrpcAuthorizationInterceptor`, `ConstraintEnforcer`), not three. +CODE_AREA / auth.scopes +SEVERITY / low +PROPOSED_FIX / Update "three classes" to "four classes." + +--- + +DOC / glauth.md / LINES 63–66 +CLAIM / "`LdapOptions.RequiredGroup` defaults to `GwAdmin`, so the dashboard login and `DashboardLdapLiveTests` require `admin` to be a member of a `GwAdmin` group." +CLAIM_TYPE / config-key +VERDICT / wrong +EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Configuration/LdapOptions.cs — no `RequiredGroup` field exists on the gateway's `LdapOptions`. The gateway enforces group membership via `MxGateway:Dashboard:GroupToRole` (a dictionary mapping LDAP group names to dashboard roles) in `DashboardOptions`. Authorization succeeds if the user's LDAP groups map to at least one role — there is no `RequiredGroup` concept in the current architecture. +CODE_AREA / auth.ldap +SEVERITY / high +PROPOSED_FIX / Remove the sentence "`LdapOptions.RequiredGroup` defaults to `GwAdmin`." Replace with: the dashboard enforces that at least one of the user's LDAP groups appears in `MxGateway:Dashboard:GroupToRole` (e.g. `GwAdmin: Administrator`); a login with no matching group is rejected. `DashboardLdapLiveTests` seeds the role map with `GwAdmin -> Administrator`. + +--- + +DOC / glauth.md / LINES 181–182 +CLAIM / "the authenticator strips to `GwAdmin` and matches against `RequiredGroup`" +CLAIM_TYPE / behavior-rule +VERDICT / wrong +EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Dashboard/DashboardGroupRoleMapping.cs:35–48 — the shared `ILdapAuthService` strips the leading RDN value from each group DN, and the gateway's `DashboardGroupRoleMapper` looks up the short name in `GroupToRole`. There is no `RequiredGroup` property or concept anywhere in the codebase. +CODE_AREA / auth.ldap +SEVERITY / high +PROPOSED_FIX / Replace "matches against `RequiredGroup`" with "looks up the short RDN name (e.g. `GwAdmin`) in `MxGateway:Dashboard:GroupToRole`." + +--- + +DOC / glauth.md / LINES 113–136 +CLAIM / "Suggested mxgw configuration shape" YAML block uses config keys `useTls`, `allowInsecureLdap`, `userNameAttribute`. +CLAIM_TYPE / config-key +VERDICT / wrong +EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Configuration/LdapOptions.cs:49,52,64 — the current config keys (as bound by the shared `LdapOptions` and the gateway's shadow `LdapOptions`) are `Transport` (an enum: `None`/`Ldaps`/`StartTls`), `AllowInsecure` (bool), `UserNameAttribute` (string, default `"cn"` not `"uid"`). The YAML block uses stale camelCase key names from a pre-migration configuration shape. +CODE_AREA / auth.ldap +SEVERITY / high +PROPOSED_FIX / Update the YAML config example to use `Transport: None` (or `Ldaps`/`StartTls`) instead of `useTls: false`, `AllowInsecure: true` instead of `allowInsecureLdap: true`, `UserNameAttribute: "cn"` (gateway default; note GLAuth populates `cn` not `uid` per the gateway default). Rename the section header from `ldap:` to `MxGateway: Ldap:` to match the actual config path. + +--- + +DOC / glauth.md / LINE 128 +CLAIM / `userNameAttribute: "uid" # GLAuth populates this; AD uses sAMAccountName` +CLAIM_TYPE / config-key +VERDICT / wrong +EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Configuration/LdapOptions.cs:64 — the gateway `LdapOptions` default for `UserNameAttribute` is `"cn"`, not `"uid"`. GLAuth does populate both `uid` and `cn`, but the gateway ships `"cn"` as default. +CODE_AREA / auth.ldap +SEVERITY / medium +PROPOSED_FIX / Change example to `UserNameAttribute: "cn"` with a note that the gateway default is `cn`; to use `uid` instead set `MxGateway:Ldap:UserNameAttribute: uid`. + +--- + +DOC / glauth.md / LINES 261–269 +CLAIM / AD migration cheat-sheet uses field names `UseTls` and `AllowInsecureLdap`. +CLAIM_TYPE / config-key +VERDICT / wrong +EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Configuration/LdapOptions.cs:49,52 — these fields were renamed: `UseTls` → `Transport` (enum), `AllowInsecureLdap` → `AllowInsecure`. +CODE_AREA / auth.ldap +SEVERITY / high +PROPOSED_FIX / Update the AD migration table: rename `UseTls` row to `Transport` (GLAuth dev value: `None`, AD value: `Ldaps`); rename `AllowInsecureLdap` row to `AllowInsecure` (GLAuth dev: `true`, AD: `false`). + +--- + +DOC / CLAUDE.md / LINE 119 +CLAIM / "maps the user's LDAP groups to `Admin` or `Viewer` via `MxGateway:Dashboard:GroupToRole`, then issues an HTTP-only secure `__Host-MxGatewayDashboard` cookie" +CLAIM_TYPE / term +VERDICT / wrong +EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Dashboard/DashboardAuthenticationDefaults.cs:38 — the cookie name constant is `CookieName = "MxGatewayDashboard"` (no `__Host-` prefix). `__Host-` is a browser security prefix that requires `Path=/`, no `Domain`, and `Secure` — the code sets `Path = "/"` and `SecurePolicy = Always` by default, satisfying the requirements, but the actual cookie name in the constant and in `ZbCookieDefaults.Apply` is `MxGatewayDashboard`, not `__Host-MxGatewayDashboard`. Additionally, `Admin` should be `Administrator` (the renamed role value per `DashboardRoles.Admin = "Administrator"`). +CODE_AREA / auth.cookie +SEVERITY / high +PROPOSED_FIX / Change `__Host-MxGatewayDashboard` to `MxGatewayDashboard` in CLAUDE.md. Change `Admin` to `Administrator`. + +--- + +DOC / CLAUDE.md / LINE 119 +CLAIM / "maps the user's LDAP groups to `Admin` or `Viewer`" +CLAIM_TYPE / term +VERDICT / wrong +EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Dashboard/DashboardRoles.cs:14 — `DashboardRoles.Admin = "Administrator"` (not `"Admin"`). The role value was renamed in Task 1.7. CLAUDE.md was not updated. +CODE_AREA / auth.roles +SEVERITY / high +PROPOSED_FIX / Change `Admin` to `Administrator` in the CLAUDE.md authentication paragraph. + +--- + +DOC / CLAUDE.md / LINE 35 +CLAIM / `dotnet run --project src/MxGateway.Server/MxGateway.Server.csproj -- apikey create --display-name "dev" --scopes session,invoke,event,metadata,admin` +CLAIM_TYPE / command +VERDICT / wrong +EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Security/Authorization/GatewayScopes.cs:5–13 — canonical scopes are `session:open`, `session:close`, `invoke:read`, `invoke:write`, `invoke:secure`, `events:read`, `metadata:read`, `admin`. The shorthand values `session`, `invoke`, `event`, `metadata` are not recognized and would be rejected by `ApiKeyAdminCommandLineParser.ValidateScopes` as unknown scopes. Also, the subcommand is `create-key` not `create`. +CODE_AREA / auth.scopes +SEVERITY / high +PROPOSED_FIX / Replace the example with a valid invocation, e.g.: `dotnet run --project src/MxGateway.Server/MxGateway.Server.csproj -- apikey create-key --key-id dev --display-name "dev" --scopes session:open,session:close,invoke:read,invoke:write,events:read,metadata:read,admin` + +--- + +DOC / CLAUDE.md / LINE 117 +CLAIM / "Keys are stored hashed (with a peppered SHA) in a gateway-owned SQLite DB (default `C:\ProgramData\MxGateway\gateway-auth.db`). Scopes (`session`, `invoke`, `event`, `metadata`, `admin`) gate specific RPCs" +CLAIM_TYPE / config-key +VERDICT / wrong +EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Configuration/AuthenticationOptions.cs:9 — SQLite path default is correct. However, scope names `session`, `invoke`, `event`, `metadata` are not the canonical scope strings. Actual scopes are `session:open`, `session:close`, `invoke:read`, `invoke:write`, `invoke:secure`, `events:read`, `metadata:read`, `admin`. +CODE_AREA / auth.scopes +SEVERITY / high +PROPOSED_FIX / Replace the scope shorthand list with the full canonical scope strings from `GatewayScopes.All`. The SQLite path is accurate and should be kept. + +--- + +DOC / glauth.md / LINES 70–74 +CLAIM / "> **Dashboard role value (Task 1.7):** the LDAP `GwAdmin` group now maps to the canonical dashboard role **`Administrator`** (was `Admin`); `GwReader` maps to `Viewer`." +CLAIM_TYPE / term +VERDICT / accurate +EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Dashboard/DashboardRoles.cs:14 — `DashboardRoles.Admin = "Administrator"`, `DashboardRoles.Viewer = "Viewer"`. src/ZB.MOM.WW.MxGateway.Server/appsettings.json:63–64 confirms `"GwAdmin": "Administrator"`, `"GwReader": "Viewer"`. +CODE_AREA / auth.roles +SEVERITY / low +PROPOSED_FIX / flag only + +--- + +DOC / glauth.md / LINES 21–26 +CLAIM / Connection details: Protocol LDAP, Host `localhost`, Port `3893`, Base DN `dc=zb,dc=local`, Bind DN format `cn={username},dc=zb,dc=local`, Group OU `ou=,ou=groups,dc=zb,dc=local`. +CLAIM_TYPE / config-key +VERDICT / accurate +EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Configuration/LdapOptions.cs:36,39,55,58 — defaults: `Server=localhost`, `Port=3893`, `SearchBase=dc=zb,dc=local`, `ServiceAccountDn=cn=serviceaccount,dc=zb,dc=local`. +CODE_AREA / auth.ldap +SEVERITY / low +PROPOSED_FIX / flag only + +--- + +DOC / docs/Authentication.md / LINES 1–30 +CLAIM / Token format `mxgw__`, prefix `mxgw_`, parser is `ApiKeyParser` behind `IApiKeyParser`. +CLAIM_TYPE / term +VERDICT / accurate +EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Security/Authentication/AuthStoreServiceCollectionExtensions.cs:30,33 — `TokenPrefix = "mxgw"`, `PepperSecretName = "MxGateway:ApiKeyPepper"`. The token format claim is accurate; `IApiKeyParser`/`ApiKeyParser` are from the shared package but the behavior description matches. +CODE_AREA / auth.apikeys +SEVERITY / low +PROPOSED_FIX / flag only + +--- + +DOC / docs/Authentication.md / LINE 110 +CLAIM / "`AuthSqliteConnectionFactory` reads `GatewayOptions.Authentication.SqlitePath`" +CLAIM_TYPE / config-key +VERDICT / accurate +EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Configuration/AuthenticationOptions.cs:9 — `SqlitePath` default is `C:\ProgramData\MxGateway\gateway-auth.db`. The factory reads from `ApiKeyOptions.SqlitePath` which is bound from `MxGateway:Authentication:SqlitePath`, so the effective config key path matches `GatewayOptions.Authentication.SqlitePath`. +CODE_AREA / auth.apikeys +SEVERITY / low +PROPOSED_FIX / flag only + +--- + +DOC / docs/Authentication.md / LINES 189–208 +CLAIM / CLI subcommands: `init-db`, `create-key`, `list-keys`, `revoke-key`, `rotate-key`. +CLAIM_TYPE / command +VERDICT / accurate +EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Security/Authentication/ApiKeyAdminCommandKind.cs — enum has `InitDb`, `CreateKey`, `ListKeys`, `RevokeKey`, `RotateKey`. ApiKeyAdminCommandLineParser.cs maps these to exactly those string values. +CODE_AREA / auth.apikeys +SEVERITY / low +PROPOSED_FIX / flag only + +--- + +DOC / docs/Authentication.md / LINES 220–225 +CLAIM / "Every destructive dashboard action is gated by a confirmation dialog and emits its own audit event (`dashboard-create-key`, `dashboard-rotate-key`, `dashboard-revoke-key`, `dashboard-delete-key`)." +CLAIM_TYPE / behavior-rule +VERDICT / accurate +EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Dashboard/DashboardApiKeyManagementService.cs:69,201 — audit event strings `dashboard-create-key` and `dashboard-delete-key` confirmed in code. +CODE_AREA / auth.apikeys +SEVERITY / low +PROPOSED_FIX / flag only + +--- + +DOC / docs/Authorization.md / LINES 94–116 +CLAIM / Scope resolver switches on request type; `_ => GatewayScopes.Admin` fallback for unrecognized types. +CLAIM_TYPE / behavior-rule +VERDICT / accurate +EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Security/Authorization/GatewayGrpcScopeResolver.cs:13–29 — the pattern and fallback match exactly. +CODE_AREA / auth.scopes +SEVERITY / low +PROPOSED_FIX / flag only + +--- + +DOC / docs/Authorization.md / LINE 85 +CLAIM / "If `GatewayOptions.Authentication.Mode` is `AuthenticationMode.Disabled`, the helper returns `null` immediately. No identity is pushed onto the accessor and the continuation runs without scope enforcement. This matches the `AuthenticationMode` enum, which only defines `ApiKey` and `Disabled`." +CLAIM_TYPE / behavior-rule +VERDICT / accurate +EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Security/Authorization/GatewayGrpcAuthorizationInterceptor.cs:59 — confirmed. +CODE_AREA / auth.apikeys +SEVERITY / low +PROPOSED_FIX / flag only + +--- + +DOC / docs/Authorization.md / LINE 215 +CLAIM / "The `Admin` constant is also referenced by `DashboardAuthenticator` and `DashboardAuthorizationHandler` so that the dashboard and the gRPC layer agree on what 'admin' means." +CLAIM_TYPE / behavior-rule +VERDICT / stale +EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Dashboard/DashboardAuthenticator.cs — `DashboardAuthenticator` does not reference `GatewayScopes.Admin`. The `admin` gRPC scope and the `Administrator` dashboard role are separate concepts. The dashboard authorization policy uses `DashboardRoles.Admin = "Administrator"`, not `GatewayScopes.Admin = "admin"`. These are distinct and do not share a constant. +CODE_AREA / auth.roles +SEVERITY / medium +PROPOSED_FIX / Correct or remove the claim that `GatewayScopes.Admin` is referenced by `DashboardAuthenticator`. The dashboard and gRPC "admin" are deliberately separate concepts — the dashboard role is `Administrator` (a role claim value on the ClaimsPrincipal), while the gRPC scope is the literal string `"admin"` (a scope string on ApiKeyIdentity). + +--- + +DOC / docs/Authorization.md / LINE 116 +CLAIM / "`AcknowledgeAlarm` is treated as a write — it mutates alarm state, mirroring `MxCommandKind.Write*` — and `StreamAlarms` shares the alarm/event surface with `StreamEvents` and `MxCommandKind.DrainEvents`, so it carries `events:read`. Both alarm RPCs are session-less." +CLAIM_TYPE / behavior-rule +VERDICT / accurate +EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Security/Authorization/GatewayGrpcScopeResolver.cs:21,22 — `AcknowledgeAlarmRequest => GatewayScopes.InvokeWrite`, `StreamAlarmsRequest => GatewayScopes.EventsRead`. Confirmed. +CODE_AREA / auth.scopes +SEVERITY / low +PROPOSED_FIX / flag only + +--- + +DOC / docs/Authorization.md / LINES 205–215 +CLAIM / Scope catalog table — all scope strings and their `Required For` mappings. +CLAIM_TYPE / rpc/proto +VERDICT / stale +EVIDENCE / GatewayGrpcScopeResolver.cs:27 — `BrowseChildrenRequest` is missing from the `MetadataRead` row (already captured above). All other rows are accurate. +CODE_AREA / auth.scopes +SEVERITY / high +PROPOSED_FIX / (Same as finding above — add `GalaxyRepository.BrowseChildren` to `MetadataRead` row.) + +--- + +## GAP FINDINGS (auth behavior in code but undocumented) + +DOC / (none — gap) +CLAIM / `DashboardAuthenticationDefaults.CookieName` is the default cookie name `"MxGatewayDashboard"`, but `DashboardOptions.CookieName` allows a per-deployment override via `MxGateway:Dashboard:CookieName`. Auth docs do not mention this override. +CLAIM_TYPE / config-key +VERDICT / gap +EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Dashboard/DashboardServiceCollectionExtensions.cs:91–97, src/ZB.MOM.WW.MxGateway.Server/Configuration/DashboardOptions.cs:33. +CODE_AREA / auth.cookie +SEVERITY / medium +PROPOSED_FIX / Add documentation of `MxGateway:Dashboard:CookieName` override and when to use it (multiple gateway instances sharing a hostname). + +--- + +DOC / (none — gap) +CLAIM / The dashboard cookie idle timeout is 8 hours (set by `ZbCookieDefaults.Apply` with `idleTimeout: TimeSpan.FromHours(8)`). The hub bearer token expires in 30 minutes (`HubTokenService.TokenLifetime = TimeSpan.FromMinutes(30)`). Neither timeout is documented in Authentication.md. +CLAIM_TYPE / behavior-rule +VERDICT / gap +EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Dashboard/DashboardServiceCollectionExtensions.cs:66, src/ZB.MOM.WW.MxGateway.Server/Dashboard/HubTokenService.cs:29. +CODE_AREA / auth.hub +SEVERITY / medium +PROPOSED_FIX / Add a section in Authentication.md (or GatewayDashboardDesign.md) documenting the 8-hour dashboard cookie idle timeout and the 30-minute hub bearer token lifetime. + +--- + +DOC / (none — gap) +CLAIM / The `CanonicalForwardingApiKeyAuditStore` overrides the shared library's `IApiKeyAuditStore`. As a result, the `api_key_audit` table in the SQLite DB is written by the shared library's migration but is NOT written to at runtime — all audit records go to `audit_event` via `IAuditWriter`. This is operationally important for anyone reading the DB directly but is not documented. +CLAIM_TYPE / behavior-rule +VERDICT / gap +EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Security/Authentication/AuthStoreServiceCollectionExtensions.cs:85–94, src/ZB.MOM.WW.MxGateway.Server/Security/Audit/CanonicalForwardingApiKeyAuditStore.cs. +CODE_AREA / auth.apikeys +SEVERITY / medium +PROPOSED_FIX / Document in Authentication.md that `api_key_audit` exists in the schema but is unused at runtime; all audit events flow to `audit_event` via `IAuditWriter`/`SqliteCanonicalAuditStore`. + +--- + +DOC / (none — gap) +CLAIM / `DashboardOptions.RequireHttpsCookie` (default `true`) controls whether the dashboard cookie uses `SecurePolicy.Always` or `SameAsRequest`. Setting it `false` is required for plain-HTTP dev deployments. This config key is not mentioned in auth docs. +CLAIM_TYPE / config-key +VERDICT / gap +EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Configuration/DashboardOptions.cs:22, src/ZB.MOM.WW.MxGateway.Server/Dashboard/DashboardServiceCollectionExtensions.cs:87. +CODE_AREA / auth.cookie +SEVERITY / low +PROPOSED_FIX / Reference `MxGateway:Dashboard:RequireHttpsCookie` in the auth cookie documentation. + +--- + +DOC / (none — gap) +CLAIM / `ZbClaimTypes` and `ZbCookieDefaults` (from `ZB.MOM.WW.Auth.AspNetCore` package) are now used for claim and cookie setup. Authentication.md does not mention the shared library claim types (`zb:username`, `zb:displayname`) or that cookie hardening defaults come from `ZbCookieDefaults.Apply`. +CLAIM_TYPE / behavior-rule +VERDICT / gap +EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Dashboard/DashboardAuthenticator.cs:111–115, src/ZB.MOM.WW.MxGateway.Server/Dashboard/DashboardServiceCollectionExtensions.cs:66. +CODE_AREA / auth.cookie +SEVERITY / low +PROPOSED_FIX / Add a brief note in dashboard auth documentation about `ZbClaimTypes` (`zb:username`, `zb:displayname`, `zb:name`, `zb:role`) and `ZbCookieDefaults.Apply` providing cookie security defaults. diff --git a/docs/audit/fragments/05-dashboard.md b/docs/audit/fragments/05-dashboard.md new file mode 100644 index 0000000..7cde4cd --- /dev/null +++ b/docs/audit/fragments/05-dashboard.md @@ -0,0 +1,332 @@ +# Cluster 05 — Dashboard + +Audited docs: `docs/DashboardInterfaceDesign.md`, `docs/GatewayDashboardDesign.md` +Verified against: `src/ZB.MOM.WW.MxGateway.Server/Dashboard/**`, `src/ZB.MOM.WW.MxGateway.Server/wwwroot/**` +Audit date: 2026-06-03 + +--- + +DOC / DashboardInterfaceDesign.md / LINES / 39–57 +CLAIM / "The shell does not use a sidebar. A horizontal navigation bar is enough…" with a `
` / `