Compare commits

9 Commits

Author SHA1 Message Date
Joseph Doherty 6ed0468588 fix(fixtures): correct Java Gradle task name in cross-language smoke matrix
The smoke-matrix Java commands used 'gradle :mxgateway-cli:run', but the
subproject is ':zb-mom-ww-mxgateway-cli' (settings.gradle). Verbatim execution
would fail; CrossLanguageSmokeMatrixTests validates shape only, so CI did not
catch it. Resolves audit finding F-10-2.
2026-06-03 16:24:24 -04:00
Joseph Doherty 328d662315 docs(audit): finalize report — resolution status (0 still-open, 33/33 high resolved) 2026-06-03 16:09:02 -04:00
Joseph Doherty e541339c07 docs(audit): apply per-cluster judgment fixes across living docs
Resolve audit findings: correct WorkerEnvelope proto/route/metric/session
facts; rewrite auth (ZB.MOM.WW.Auth migration), dashboard (ZB.MOM.WW.Theme),
and StyleGuide (foreign-project copy-paste); document alarm subsystem, Ldap
options, and gateway alarm broker; fix client CLI flags and package paths.
2026-06-03 16:01:28 -04:00
Joseph Doherty f84e0c3474 docs(audit): apply global term/path substitutions across living docs 2026-06-03 15:50:13 -04:00
Joseph Doherty a60c1e3f66 docs(audit): findings report + global-substitutions table (186 findings, 33 high) 2026-06-03 15:42:07 -04:00
Joseph Doherty 3081b80efc docs(audit): cluster findings fragments (13 clusters, read-only verification) 2026-06-03 15:35:46 -04:00
Joseph Doherty 117936e6fd docs(audit): scaffold prose-audit workspace 2026-06-03 15:24:05 -04:00
Joseph Doherty c47b9d7b02 docs: add documentation-audit implementation plan (24 tasks, 13-cluster fan-out) 2026-06-03 15:23:43 -04:00
Joseph Doherty 327493f077 docs: add documentation-audit design (claim-by-claim accuracy + completeness) 2026-06-03 15:23:43 -04:00
52 changed files with 7439 additions and 484 deletions
+8 -8
@@ -19,7 +19,7 @@ The worker must do all MXAccess COM calls on its dedicated STA thread, and the S
 ```powershell
 # Full solution build (gateway, worker, contracts, tests)
-dotnet build src/MxGateway.sln
+dotnet build src/ZB.MOM.WW.MxGateway.slnx
 # Worker must be built x86 — the gateway looks for MxGateway.Worker.exe under bin\x86
 dotnet build src/MxGateway.Worker/MxGateway.Worker.csproj -p:Platform=x86
@@ -29,10 +29,10 @@ dotnet test src/MxGateway.Tests/MxGateway.Tests.csproj
 dotnet test src/MxGateway.Worker.Tests/MxGateway.Worker.Tests.csproj -p:Platform=x86
 # Run gateway locally (defaults bound under MxGateway:* in src/MxGateway.Server/appsettings.json)
-dotnet run --project src/MxGateway.Server/MxGateway.Server.csproj
+dotnet run --project src/ZB.MOM.WW.MxGateway.Server/ZB.MOM.WW.MxGateway.Server.csproj
 # API-key admin CLI (same exe, "apikey" subcommand)
-dotnet run --project src/MxGateway.Server/MxGateway.Server.csproj -- apikey create --display-name "dev" --scopes session,invoke,event,metadata,admin
+dotnet run --project src/ZB.MOM.WW.MxGateway.Server/ZB.MOM.WW.MxGateway.Server.csproj -- apikey create-key --key-id dev --display-name "dev" --scopes session:open,session:close,invoke:read,invoke:write,invoke:secure,events:read,metadata:read,admin
 ```
 Single test by name (xUnit `--filter`):
@@ -54,7 +54,7 @@ Live LDAP tests use `MXGATEWAY_RUN_LIVE_LDAP_TESTS=1`. See `docs/GatewayTesting.
 Each language client is in `clients/<lang>/` with its own README. They all consume the shared `.proto` files in `src/MxGateway.Contracts/Protos`:
-- `clients/dotnet`: `dotnet build clients/dotnet/MxGateway.Client.sln`
+- `clients/dotnet`: `dotnet build clients/dotnet/ZB.MOM.WW.MxGateway.Client.slnx`
 - `clients/python`: `python -m pip install -e ".[dev]"; python -m pytest`
 - `clients/rust`: `cargo test --workspace; cargo clippy --workspace --all-targets -- -D warnings`
 - `clients/java`: `gradle test` (Java 21)
@@ -77,7 +77,7 @@ powershell -ExecutionPolicy Bypass -File scripts/run-client-e2e-tests.ps1
 - **Gateway restart does not reattach orphan workers.** The first version terminates orphaned workers on startup; do not design code paths that assume reattachment.
 - **No Blazor UI component libraries.** Dashboard uses local Bootstrap CSS/JS only — do not introduce MudBlazor, Radzen, FluentUI, etc.
 - **Don't log secrets or full tag values by default.** API keys, passwords, `WriteSecured` payloads, and `AuthenticateUser` credentials must never reach logs. Value logging is opt-in and redacted.
-- **Generated code** under `src/MxGateway.Contracts/Generated/`, `clients/*/generated*/`, `clients/python/src/mxgateway/generated/`, etc., is build output. Don't hand-edit. To regenerate, build the contracts project (`dotnet build src/MxGateway.Contracts/MxGateway.Contracts.csproj`) or run the per-client generation step in that client's README.
+- **Generated code** under `src/MxGateway.Contracts/Generated/`, `clients/*/generated*/`, `clients/python/src/zb_mom_ww_mxgateway/generated/`, etc., is build output. Don't hand-edit. To regenerate, build the contracts project (`dotnet build src/MxGateway.Contracts/MxGateway.Contracts.csproj`) or run the per-client generation step in that client's README.
 - **Documentation style** (`StyleGuide.md`): PascalCase filenames, no marketing language, present tense, explain *why* not *what*.
 - **Update docs in the same change as the source.** When public APIs, contracts, configuration, build steps, security behavior, event shapes, value conversion, status mapping, or lifecycle rules change, the affected docs (`gateway.md`, `docs/`, client READMEs, design docs) must change in the same commit. Don't leave stale prose describing old behavior.
@@ -90,7 +90,7 @@ When source code changes, build and test the affected component before reporting
 | Contracts or `.proto` files | regenerate generated code, then build gateway, worker, and every generated client touched by the contract |
 | Gateway server, sessions, workers, gRPC, dashboard, metrics | `dotnet build src/MxGateway.Server` and run affected gateway / fake-worker tests |
 | Worker IPC, STA, MXAccess, conversion | `dotnet build src/MxGateway.Worker -p:Platform=x86` and run worker tests |
-| .NET client | `dotnet build clients/dotnet/MxGateway.Client.sln` and run its tests |
+| .NET client | `dotnet build clients/dotnet/ZB.MOM.WW.MxGateway.Client.slnx` and run its tests |
 | Go client | `gofmt`, `go build ./...`, `go test ./...` from `clients/go` |
 | Rust client | `cargo fmt`, `cargo check --workspace`, `cargo test --workspace`, `cargo clippy --all-targets -- -D warnings` from `clients/rust` |
 | Python client | `python -m pytest` from `clients/python` |
@@ -114,9 +114,9 @@ External analysis sources referenced by design docs:
 ## Authentication
-Gateway gRPC clients authenticate with an API key in metadata: `authorization: Bearer mxgw_<key-id>_<secret>`. Keys are stored hashed (with a peppered SHA) in a gateway-owned SQLite DB (default `C:\ProgramData\MxGateway\gateway-auth.db`). Scopes (`session`, `invoke`, `event`, `metadata`, `admin`) gate specific RPCs; missing → `Unauthenticated`, insufficient → `PermissionDenied`. The `apikey` subcommand on the server exe manages keys; see `src/MxGateway.Server/Security/Authentication/`.
+Gateway gRPC clients authenticate with an API key in metadata: `authorization: Bearer mxgw_<key-id>_<secret>`. Keys are stored hashed (with a peppered SHA) in a gateway-owned SQLite DB (default `C:\ProgramData\MxGateway\gateway-auth.db`). Scopes (`session:open`, `session:close`, `invoke:read`, `invoke:write`, `invoke:secure`, `events:read`, `metadata:read`, `admin`) gate specific RPCs; missing → `Unauthenticated`, insufficient → `PermissionDenied`. The `apikey` subcommand on the server exe manages keys; see `src/MxGateway.Server/Security/Authentication/`.
-Dashboard auth is LDAP-backed (separate from the gRPC API-key model). `/login` binds against `MxGateway:Ldap` and maps the user's LDAP groups to `Admin` or `Viewer` via `MxGateway:Dashboard:GroupToRole`, then issues an HTTP-only secure `__Host-MxGatewayDashboard` cookie. SignalR hubs at `/hubs/{snapshot,alarms,events}` accept either the cookie or a 30-minute bearer minted at `/hubs/token`. `Dashboard:AllowAnonymousLocalhost` bypasses auth on loopback when enabled.
+Dashboard auth is LDAP-backed (separate from the gRPC API-key model). `/login` binds against `MxGateway:Ldap` and maps the user's LDAP groups to `Administrator` or `Viewer` via `MxGateway:Dashboard:GroupToRole`, then issues an HTTP-only secure `MxGatewayDashboard` cookie. SignalR hubs at `/hubs/{snapshot,alarms,events}` accept either the cookie or a 30-minute bearer minted at `/hubs/token`. `Dashboard:AllowAnonymousLocalhost` bypasses auth on loopback when enabled.
 ## Process / Platform Notes
+599
@@ -0,0 +1,599 @@
# MXAccess Gateway — Documentation Audit Findings
Synthesized from the 13 audit fragments under `docs/audit/fragments/`. This report drives the fix phase (Tasks 15-22). It is read-only with respect to code and the audited docs; the only artifact produced is this file.
## 1. Summary
Total findings: **186** across 13 clusters.
### Counts by verdict
| Verdict | Count |
|---|---|
| accurate | 109 |
| stale | 27 |
| wrong | 33 |
| unverifiable | 6 |
| gap | 24 |
(Note: a small number of cluster-08 entries are verdict-tagged `accurate` in the fragment body while the prose flags a phrasing nuance; they are counted as `accurate`.)
### Counts by severity
| Severity | Count |
|---|---|
| high | 33 |
| medium | 33 |
| low | 120 |
### Per-cluster table
| Cluster | #high | #med | #low | #gap (any sev) |
|---|---|---|---|---|
| 01 Architecture | 3 | 4 | 33 | 0 |
| 02 Worker | 5 | 6 | 30 | 4 |
| 03 Sessions | 2 | 8 | 18 | 6 |
| 04 Auth | 11 | 7 | 14 | 5 |
| 05 Dashboard | 7 | 9 | 8 | 6 |
| 06 Config | 2 | 3 | 27 | 4 |
| 07 Contracts/gRPC | 3 | 3 | 22 | 3 |
| 08 Galaxy | 5 | 3 | 41 | 6 |
| 09 Alarms | 7 | 6 | 22 | 8 |
| 10 Testing | 2 | 0 | 30 | 2 |
| 11 Clients | 7 | 5 | 18 | 3 |
| 12 Style guides | 3 | 1 | 10 | 0 |
| 13 History/Plans | 0 | 1 | 21 | 0 |
(`#high/#med/#low` count all findings at that severity in the cluster; `#gap` counts gap-verdict findings regardless of severity, shown separately because gaps are additive work rather than corrections.)
---
## 2. Global substitutions table
Mechanical string replacements that recur across multiple docs or are pure find-and-replace. The "applies to" list contains **only** files the fragment evidence shows actually contain the old string. CLAUDE.md is a living doc and is listed explicitly where the evidence targets it. Per the audit rules, design-history / plan docs (cluster 13) are **excluded** from these applies-to lists — their term occurrences are historical records, not corrected here (only their broken internal cross-refs are fixed, in Task 22).
| old string | new string | claim_type | applies to (doc list) |
|---|---|---|---|
| `Admin` (dashboard role value) | `Administrator` | term | CLAUDE.md (L119, L234-evidence); docs/GatewayConfiguration.md (L55, L156); docs/DashboardInterfaceDesign.md (role labels where used as config value); docs/Authorization.md (L215 — judgment, see Task 18) |
| cookie `__Host-MxGatewayDashboard` | `MxGatewayDashboard` | config-key/term | CLAUDE.md (L119); docs/GatewayDashboardDesign.md (L420-422) |
| `src/MxGateway.sln` | `src/ZB.MOM.WW.MxGateway.slnx` | path | CLAUDE.md (L22) |
| `src/MxGateway.Server/MxGateway.Server.csproj` (short project paths in layout/commands) | `src/ZB.MOM.WW.MxGateway.Server/ZB.MOM.WW.MxGateway.Server.csproj` (and sibling fully-qualified names) | path | gateway.md (L737-769); CLAUDE.md (L35, L248-evidence) |
| `clients/dotnet/MxGateway.Client.sln` | `clients/dotnet/ZB.MOM.WW.MxGateway.Client.slnx` | path | CLAUDE.md (L57, L93); docs/ClientPackaging.md (L51-52) |
| `clients/python/src/mxgateway/generated` | `clients/python/src/zb_mom_ww_mxgateway/generated` | path | docs/ClientProtoGeneration.md (L80, L74-81 table, L145); docs/ClientLibrariesDesign.md (L410); docs/ClientPackaging.md (L159-160); docs/style-guides/PythonStyleGuide.md (L27-29 parent path) |
| Python package `mxaccess-gateway-client` | `zb-mom-ww-mxaccess-gateway-client` | config-key | docs/ClientPackaging.md (L159-160); clients/python/PythonClientDesign.md (L215) |
| Python module `mxgateway_cli` | `zb_mom_ww_mxgateway_cli` | command/path | docs/ClientPackaging.md (L187); docs/style-guides/PythonStyleGuide.md (L27-29) |
| Python library package `mxgateway` (src dir) | `zb_mom_ww_mxgateway` | path | docs/style-guides/PythonStyleGuide.md (L27-29) |
| Gradle task `:mxgateway-cli:` | `:zb-mom-ww-mxgateway-cli:` | command | docs/GatewayTesting.md (L322-324); docs/ClientPackaging.md (L193-227) |
| Gradle task `:mxgateway-client:` | `:zb-mom-ww-mxgateway-client:` | command | docs/ClientPackaging.md (L193-227) |
| logger category `ZB.MOM.WW.MxGateway.Request` | `MxGateway.Request` | term | docs/Diagnostics.md (L165-166) |
| STA thread name `ZB.MOM.WW.MxGateway.Worker.STA` | `MxGateway.Worker.STA` | term | docs/WorkerSta.md (L23, L29); docs/MxAccessWorkerInstanceDesign.md (L254) |
| Java package root `com.dohertylan.mxgateway` | `com.zb.mom.ww.mxgateway` | config-key | docs/style-guides/JavaStyleGuide.md (L25) |
| Rust crate `mxgateway-client` (library crate name) | `zb-mom-ww-mxgateway-client` | term | docs/ClientPackaging.md (L116) |
| dashboard route prefix `/dashboard*` | `/` + `/sessions`, `/workers`, `/events`, `/alarms`, `/galaxy`, `/browse`, `/apikeys`, `/settings` | path | docs/GatewayProcessDesign.md (L249-255); docs/GatewayDashboardDesign.md (L289-345); docs/GalaxyRepository.md (L419-422) |
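
A minimal sketch of how rows like these could be applied mechanically, assuming each row is reduced to an (old, new, applies-to) triple. The `Substitution` type and `apply_substitutions` helper are hypothetical names used only for illustration; they are not part of the repository. The two sample rows are copied from the table above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Substitution:
    old: str
    new: str
    applies_to: tuple[str, ...]  # only docs the evidence says contain `old`


# Two rows copied from the table above; the data structure is an assumption.
SUBSTITUTIONS = (
    Substitution("src/MxGateway.sln", "src/ZB.MOM.WW.MxGateway.slnx", ("CLAUDE.md",)),
    Substitution(
        "clients/dotnet/MxGateway.Client.sln",
        "clients/dotnet/ZB.MOM.WW.MxGateway.Client.slnx",
        ("CLAUDE.md", "docs/ClientPackaging.md"),
    ),
)


def apply_substitutions(doc_name: str, text: str) -> str:
    """Apply every substitution whose applies-to list names this doc."""
    for sub in SUBSTITUTIONS:
        if doc_name in sub.applies_to:
            text = text.replace(sub.old, sub.new)
    return text
```

Scoping each replacement to its applies-to list is what keeps excluded docs untouched: a doc that is not listed passes through unchanged even if it contains the old string.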
Notes:
- The scope-shorthand renames (`session`→`session:open`/`session:close`, `invoke`→`invoke:read`/`invoke:write`/`invoke:secure`, `event`→`events:read`, `metadata`→`metadata:read`) are **not** a single 1:1 mechanical substitution (one shorthand maps to multiple canonical scopes), so they are handled as judgment edits in Tasks 18/20, not in this table. The affected docs are gateway.md (L662-663), CLAUDE.md (L35, L117, L248-evidence), docs/Authentication.md (L99, L187-208).
- The `wwwroot/css/dashboard.css`→`site.css` rename is dashboard-cluster-specific (single doc family) and is handled in Task 19.
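
The one-to-many shape of the scope rename can be sketched as a lookup table. This is an illustration only: `SCOPE_EXPANSION` and `expand_scopes` are hypothetical names, and the mapping is taken from the note above rather than from gateway source.

```python
# Mapping from the note above: each legacy shorthand expands to one or more
# canonical scopes, which is why a plain find-and-replace cannot handle it.
SCOPE_EXPANSION = {
    "session": ("session:open", "session:close"),
    "invoke": ("invoke:read", "invoke:write", "invoke:secure"),
    "event": ("events:read",),
    "metadata": ("metadata:read",),
    "admin": ("admin",),  # unchanged by the rename
}


def expand_scopes(shorthand_csv: str) -> str:
    """Expand a comma-separated shorthand list into canonical scope strings."""
    canonical = []
    for name in shorthand_csv.split(","):
        # Unknown names raise KeyError so a typo is not silently passed through.
        canonical.extend(SCOPE_EXPANSION[name.strip()])
    return ",".join(canonical)
```

Expanding every shorthand grants the widest possible set. Whether a given doc example should list all expansions or only some (for instance `invoke:read` without `invoke:secure`) is the judgment call that keeps these renames out of the table.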
---
## 3. Out-of-prose-scope flags
These findings target **non-`.md`** files. They are real bugs but outside this prose audit. **Flag only — recommend separate fix.** Do not schedule them for doc-editing tasks.
| Finding ID | File | Issue | Severity |
|---|---|---|---|
| F-10-2 | `clients/proto/fixtures/smoke/cross-language-smoke-matrix.json` | Every Java command entry uses `gradle :mxgateway-cli:run`; the Gradle subproject is `:zb-mom-ww-mxgateway-cli`. Verbatim execution fails; `CrossLanguageSmokeMatrixTests` does not check the literal task name, so it passes CI undetected. | high |
(No other fragment finding targets a non-`.md` artifact for an edit; `proto-inputs.json`, `appsettings.json`, source `.cs/.rs/.go/.gradle/.toml` etc. appear only as evidence, not as edit targets.)
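
The F-10-2 failure mode (a shape-only test passing while the literal task name is wrong) suggests a stricter check. A minimal sketch, assuming the fixture commands are available as plain strings and the valid subproject names are known up front. `gradle_projects_in` and `unknown_projects` are hypothetical helpers; the real fixture schema and test code are not shown in this report.

```python
import re

# Matches Gradle task paths such as ":zb-mom-ww-mxgateway-cli:run" at the start
# of a whitespace-delimited token and captures the subproject segment.
TASK_PATH = re.compile(r"(?<!\S):([A-Za-z0-9_.-]+):[A-Za-z0-9_.-]+")


def gradle_projects_in(command: str) -> list[str]:
    """Return the subproject names referenced by task paths in one command."""
    return TASK_PATH.findall(command)


def unknown_projects(commands, known_projects):
    """Return referenced subprojects that are not in the known set, sorted."""
    known = set(known_projects)
    return sorted(
        {p for c in commands for p in gradle_projects_in(c) if p not in known}
    )
```

In this sketch the known set would come from the `include` entries in `settings.gradle`; parsing that file is left out because its exact contents are not part of this report.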
---
## 4. Per-doc findings
Findings grouped by DOC, ordered high→low severity within each doc. IDs are `F-<cluster#>-<n>` numbered in fragment order within the cluster.
### gateway.md
- **F-01-13** — L231-248 — wrong/high — `WorkerEnvelope` proto block (field type/numbers/names). EVIDENCE: `mxaccess_worker.proto` has `string correlation_id = 4` (not uint64); body fields `gateway_hello=10 … worker_fault=20`; names differ (`command`→`worker_command`, `event`→`worker_event`); missing `worker_shutdown_ack=17`. FIX: replace the block with actual proto content.
- **F-01-1** — L737-769 — stale/medium — short project names in layout. FIX: use fully-qualified `src/ZB.MOM.WW.MxGateway.*` names (see substitutions).
- **F-01-2** — L898-913 — stale/medium — session state machine missing `Handshaking`. FIX: insert `-> Handshaking` between `WaitingForPipe` and `InitializingWorker`.
- **F-01-12** — L301-314 — stale/medium — second session state-machine diagram also missing `Handshaking`. FIX: same insertion in both diagrams.
- **F-01-3** — L119-121 — stale/medium — scope rejection lists shorthand scope names. FIX: canonical scope strings (judgment, see Task 18 note).
- **F-01-4** — L119-121 — stale/low — dashboard route list omits `/browse` and `/login`. FIX: add them.
- **F-01 accurate set** — multiple (L88-94, 108, 110-122, 129-130, 162-210, 266-273, 646-650, 1023-1025, 219) — accurate/low — flag only.
### docs/GatewayProcessDesign.md
- **F-01-7** — L249-255 — wrong/high — `/dashboard`-prefixed route table. FIX: replace with actual no-prefix routes (see substitutions).
- **F-01-8** — L689 — stale/low — `Dashboard:AllowAnonymousLocalhost` missing `MxGateway:` root prefix. FIX: standardize to `MxGateway:Dashboard:AllowAnonymousLocalhost`.
- **F-01-9** — L854-855 — accurate/low — worker `ExecutablePath` default (separator style only). Flag only.
- **F-01 accurate set** — L62-93, 100-105, 223-229, 291-299, 408-410, 420-475, 527-530, 713-719, 864-893 — accurate/low — flag only.
### docs/DesignDecisions.md
- **F-01-6** — L360-363 — wrong/high — claims dashboard auth is "API-key-backed dashboard authentication with `admin` scope." EVIDENCE: `DashboardAuthenticator.cs` is LDAP-backed with `GroupToRole`. FIX: rewrite to LDAP-backed + `GroupToRole`→`Admin`/`Viewer`; keep `AllowAnonymousLocalhost` note.
- **F-01-10** — L36 — unverifiable/low — interop assembly version/PKT not hard-coded in repo. Flag only.
- **F-01-11** — L36-48 — accurate/low — COM class/CLSID/ProgID/paths. Flag only.
- **F-01-14** — L55 — accurate/low — `ArchestrA.MXAccess.dll` casing. Flag only.
- **F-01 accurate set** — L85-95, 217-225 — accurate/low — flag only.
### docs/WorkerSta.md
- **F-02-1** — L23-31 — wrong/medium — STA thread name `ZB.MOM.WW.MxGateway.Worker.STA`. FIX: `MxGateway.Worker.STA` (prose + snippet) (substitution).
- **F-02-3** — L144 — wrong/medium — `InvokeAsync` throws `InvalidOperationException`. EVIDENCE: throws `StaRuntimeShutdownException` (subtype). FIX: name the subtype and explain why the distinction matters.
- **F-02-19** — L141-148 — stale/medium — shutdown drain sequence implies single post-stop drain. EVIDENCE: `CancelQueuedCommands` runs inside `ThreadMain` finally before `stoppedEvent.Set()`, and again in `Shutdown()`; drain happens twice. FIX: revise steps 3-4.
- **F-02-12** — L14 — stale/low — "Bounded asynchronous queue." EVIDENCE: plain `Queue<T>` under lock with async drain loop. FIX: "Bounded queue with an async drain loop."
- **F-02 accurate set** — L34, 56, 63-78, 82-99, 108, 149 — accurate/low — flag only.
### docs/MxAccessWorkerInstanceDesign.md
- **F-02-4** — L122 — wrong/high — `Success` (exit 0) = "bootstrap options valid." EVIDENCE: actual meaning "pipe session ran to a clean close." FIX: correct Success row; note `WorkerBootstrapResult.Succeeded` is a parse-phase gate distinct from exit 0.
- **F-02-5** — L119-128 — stale/high — exit-code table missing codes 5 (`PipeConnectionFailed`) and 6 (`ProtocolViolation`). FIX: add both rows.
- **F-02-6** — L134-160 — stale/high — component tree class names wrong (`WorkerHost`→`WorkerApplication`, `PipeClient`→`WorkerPipeClient`, `FrameReader/Writer`→`WorkerFrameReader/Writer`, `WorkerProtocol`→`WorkerContractInfo`, `StaCommandQueue`→`StaCommandDispatcher`, `MessagePump`→`StaMessagePump`, `StaWatchdog`→`WorkerPipeSession`, `MxAccessCommandDispatcher`→`MxAccessCommandExecutor`, `SafeArrayConverter`→part of `VariantConverter`, `StatusProxyConverter`→`MxStatusProxyConverter`, `HResultMapper`→`HResultConverter`). FIX: rewrite tree.
- **F-02-15** — L97 — wrong/high — `MXGATEWAY_WORKER_LOG_CONTEXT` env var documented. EVIDENCE: not read anywhere. FIX: remove or mark unimplemented.
- **F-02-16** — L86-99 — wrong/high — same `MXGATEWAY_WORKER_LOG_CONTEXT` in bootstrap sequence. FIX: flag-only duplicate of F-02-15.
- **F-02-22** — L134-160 — gap/high — no alarm subsystem in component tree. FIX: add "Alarm Subsystem" section (consumer, poll loop, dispatcher, sink).
- **F-02-2** — L254 — wrong/medium — STA thread name. FIX: `MxGateway.Worker.STA` (substitution).
- **F-02-20** — L134-160 — stale/medium — `MxAccess` subtree class names (`MxAccessCommandDispatcher` does not exist; add `MxAccessStaSession`, `MxAccessCommandExecutor`, alarm sinks). FIX: update.
- **F-02-23** — L336-338 — gap/medium — event-sink subscription list omits alarm events. FIX: add `MxAccessAlarmEventSink`.
- **F-02-18** — L368-375 — stale/low — `MxAccessEventQueue.Enqueue` also throws `MxAccessEventQueueOverflowException`. FIX: note thrown exception.
- **F-02-26** — L151 — accurate/low — `MxAccessSession` exists. Flag only.
- **F-02 accurate set** — L271-286, 656-660 — accurate/low — flag only.
### docs/WorkerBootstrap.md
- **F-02-7** — L146 — stale/medium — stderr/stdout-capture rationale. EVIDENCE: launcher redirects neither stream. FIX: replace rationale; the env-var-secrecy reason is the accurate one.
- **F-02-25** — L56 — stale/low — "short-lived child." FIX: "per-session child process."
- **F-02 accurate set** — L78, 48-54, 105, 113-120, 155-159, 181-193 — accurate/low — flag only.
### docs/WorkerConversion.md
- **F-02-21** — L1262 — gap/medium — inverse projection (`ConvertToComValue`/`ConvertToComArray`, write path) undocumented. FIX: add "Inverse projection for COM writes" section.
- **F-02-11** — L225 — stale/low — engine-error ranges implied contiguous; gaps exist (35,45,46 / 58,59). FIX: "selected detail codes in the ranges …".
- **F-02 accurate set** — L17-18, 112-135, 178 — accurate/low — flag only.
### docs/WorkerFrameProtocol.md / docs/WorkerProcessLauncher.md
- All findings accurate/low (F-02 frameproto and launcher accurate set: WorkerFrameProtocol L1453; WorkerProcessLauncher L1864). Flag only.
### docs/Sessions.md
- **F-03-22** — gap/high — orphan cleanup (`OrphanWorkerCleanupHostedService`→`OrphanWorkerTerminator.TerminateOrphans` on startup, best-effort) undocumented. FIX: add "Gateway Restart / Orphan Cleanup" section.
- **F-03-21** — L230 — wrong/high — invents metric names `KillCount`/`ShutdownCount`. EVIDENCE: actual counter is `mxgateway.workers.killed`. FIX: replace with real counter via `GatewayMetrics.WorkerKilled`.
- **F-03-1** — L9 — wrong/medium — "All four interfaces" (only three exist) and omits `SessionLeaseMonitorHostedService`. FIX: "three interfaces"; list two hosted services.
- **F-03-2** — L265-276 — stale/medium — DI snippet omits `SessionLeaseMonitorHostedService`. FIX: add the registration line.
- **F-03-3** — L232-259 — stale/medium — `ShutdownAsync` snippet predates Server-045/046; fallback now routes via `KillWorkerAsync`. FIX: replace snippet.
- **F-03-4** — L55-59 — stale/medium — `KillWorkerAsync` no longer calls `GatewaySession.KillWorker` directly; now `KillWorkerWithCloseGateAsync` (acquires `_closeLock`). FIX: update.
- **F-03-12** — L163-188 — stale/medium — open-failure rollback order omits conditional `SessionRemoved()` (Server-006). FIX: note the conditional metric call before `ReleaseSessionSlot`.
- **F-03-19** — L230 — stale/medium — `GatewaySession.KillWorker` no longer the entry point from `SessionManager`. FIX: clarify `KillWorkerWithCloseGateAsync` is the path.
- **F-03-23** — gap/medium — `AllowMultipleEventSubscribers=true` rejected at startup by `GatewayOptionsValidator`. FIX: note startup-validation refusal.
- **F-03-7** — L265 — wrong/medium — "the hosted service" (singular). FIX: "two hosted services."
- **F-03-20** — L279 — stale/low — registration-order reasoning. FIX: note two hosted services + DI ordering caveat.
- **F-03-24** — gap/low — `_items` registration dictionary undocumented. FIX: add paragraph.
- **F-03-25** — gap/low — `MaxPendingCommandsPerSession` (128) cap undocumented. FIX: add note.
- **F-03-26** — gap/low — `KillWorkerWithCloseGateAsync` unmentioned. FIX: mention in Close section.
- **F-03 accurate set** — L15-127, 134-227, 195-197, 197 (lease/sweep) — accurate/low — flag only.
### docs/Authentication.md
- **F-04-1** — L253-271 — stale/high — Registration block is pre-migration; types now from `ZB.MOM.WW.Auth.ApiKeys` via `AddZbApiKeyAuth`. FIX: replace block; remove "registers the migration hosted service" claim.
- **F-04-9** — L187-208 — wrong/high — CLI example `--scopes read,write` + subcommand `create`. EVIDENCE: scopes invalid; subcommand is `create-key`. FIX: canonical scopes (e.g. `invoke:read,invoke:write`), `create-key`.
- **F-04-2** — L53-68 — stale/medium — `ApiKeySecretHasher` etc. are shared-library types; return type `ApiKeyVerification` not `ApiKeyVerificationResult`. FIX: clarify ownership + type name.
- **F-04-3** — L72-98 — stale/medium — `ApiKeyVerifier` types/return shapes from shared package. FIX: `ApiKeyVerification`; note shared lib.
- **F-04-5** — L126-133 — stale/medium — schema table omits `audit_event` table; `api_key_audit` no longer written. FIX: add fourth table + note.
- **F-04-4** — L108-122 — stale/low — `AuthSqliteConnectionFactory` ownership/`ApiKeyOptions.SqlitePath`. FIX: clarify.
- **F-04-6** — L134-153 — stale/low — `SqliteApiKeyStore` from shared package. FIX: label code block as shared-lib.
- **F-04-7** — L156-164 — stale/low — `SqliteApiKeyAdminStore` shared; CLI uses `ApiKeyAdminCommands`. FIX: clarify.
- **F-04-8** — L165-183 — stale/low — `SqliteAuthStoreMigrator` etc. shared. FIX: clarify.
- **F-04-10** — L229-248 — stale/low — `ApiKeyScopeSerializer` shared. FIX: note.
- **F-04-gap-3** — gap/medium — `api_key_audit` unused at runtime; all audit → `audit_event`. FIX: document.
- **F-04-gap-2** — gap/medium — 8-hour cookie idle timeout + 30-min hub token undocumented. FIX: add.
- **F-04-gap-1** — gap/medium — `MxGateway:Dashboard:CookieName` override undocumented. FIX: document.
- **F-04-gap-4** — gap/low — `RequireHttpsCookie` undocumented. FIX: reference.
- **F-04-gap-5** — gap/low — `ZbClaimTypes`/`ZbCookieDefaults` undocumented. FIX: brief note.
- **F-04 accurate set** — L130, 110, 189-208, 220-225 — accurate/low — flag only.
### docs/Authorization.md
- **F-04-11** — L107-113 — stale/high — scope resolver block omits `BrowseChildrenRequest => MetadataRead`. FIX: add it.
- **F-04-12** — L212 — stale/high — scope catalog table omits `GalaxyRepository.BrowseChildren`. FIX: add to `MetadataRead` row.
- **F-04-18** — L205-215 — stale/high — same catalog gap (`BrowseChildren`). FIX: as above.
- **F-04-13** — L260-270 — stale/medium — registration block omits `IConstraintEnforcer`/`ConstraintEnforcer` and `GrpcServiceOptions` size limits. FIX: add.
- **F-04-16** — L215 — stale/medium — claims `GatewayScopes.Admin` referenced by `DashboardAuthenticator`. EVIDENCE: dashboard role `Administrator` and gRPC scope `admin` are separate. FIX: correct/remove the claim.
- **F-04-14** — L273 — stale/low — "three classes" → four (adds `ConstraintEnforcer`). FIX: update.
- **F-04 accurate set** — L85, 94-116 — accurate/low — flag only.
### glauth.md
- **F-04-15** — L63-66 — wrong/high — `LdapOptions.RequiredGroup` defaults to `GwAdmin`. EVIDENCE: no `RequiredGroup` exists; membership enforced via `GroupToRole`. FIX: rewrite.
- **F-04-17** — L181-182 — wrong/high — "strips to `GwAdmin` and matches against `RequiredGroup`." FIX: "looks up the short RDN in `GroupToRole`."
- **F-04-19** — L113-136 — wrong/high — YAML keys `useTls`/`allowInsecureLdap`/`userNameAttribute`. EVIDENCE: actual `Transport`/`AllowInsecure`/`UserNameAttribute`(default `cn`); section header `MxGateway:Ldap`. FIX: rewrite YAML.
- **F-04-21** — L261-269 — wrong/high — AD cheat-sheet `UseTls`/`AllowInsecureLdap`. EVIDENCE: renamed `Transport`/`AllowInsecure`. FIX: rename rows.
- **F-04-20** — L128 — wrong/medium — `userNameAttribute: "uid"`. EVIDENCE: default is `cn`. FIX: change to `cn` + note.
- **F-04-22** — L70-74 — accurate/low — Task 1.7 role note. Flag only.
- **F-04-23** — L21-26 — accurate/low — connection details. Flag only.
### CLAUDE.md (auth-related judgment fixes — Task 18)
- **F-04-24** — L119 — wrong/high — cookie `__Host-MxGatewayDashboard` and role `Admin`. FIX: `MxGatewayDashboard` + `Administrator` (substitutions).
- **F-04-25** — L119 — wrong/high — LDAP groups map to `Admin`. FIX: `Administrator`.
- **F-04-26** — L35 — wrong/high — apikey example `create --scopes session,invoke,event,metadata,admin`. FIX: `create-key` + canonical scopes.
- **F-04-27** — L117 — wrong/high — scopes shorthand `session, invoke, event, metadata, admin`. FIX: canonical scope strings (SQLite path is correct, keep).
### docs/DashboardInterfaceDesign.md
- **F-05-1** — L39-57 — stale/high — `dashboard-shell`/`dashboard-navbar` HTML skeleton. EVIDENCE: now `ThemeShell` side rail. FIX: replace skeleton/prose.
- **F-05-2** — L115-123 — stale/high — five flat nav labels incl. "Overview." EVIDENCE: eight items in three groups; home is "Dashboard." FIX: update.
- **F-05-3** — L63-79 — wrong/high — `--mxgw-*` CSS tokens. EVIDENCE: none exist; all via theme kit tokens. FIX: remove table; note theme-kit tokens.
- **F-05-7** — L191-200 — wrong/high — Bootstrap `text-bg-*` badge mapping. EVIDENCE: `StatusBadge` delegates to `StatusPill` with `StatusState`. FIX: replace with `StatusState` vocabulary.
- **F-05-4** — L87-97 — stale/medium — typography values. FIX: h1 1.15rem/600, agg-label 0.68rem/600, agg-value 1.5rem/600 ink.
- **F-05-gap-2** — gap/medium — new StatusBadge states (`Active`/`Stale`/`Degraded`/`Unavailable`, `Closed`→Idle) undocumented. FIX: document full mapping.
- **F-05-5** — L99-111 — stale/low — spacing/radius. FIX: 0.85rem small-screen padding, 8px radius, full-border cards.
- **F-05-6** — L153-168 — stale/low — `metric-grid` `auto-fit, 12rem`. EVIDENCE: `auto-fill, 11rem`. FIX: update.
- **F-05-8** — L229-245 — stale/low — `.dashboard-content` breakpoint. EVIDENCE: `.page { padding: 0.85rem }`. FIX: update.
### docs/GatewayDashboardDesign.md
- **F-05-11** — L507–510 — wrong/high — `wwwroot/css/dashboard.css`. EVIDENCE: file is `site.css`; App.razor loads `<ThemeHead/>`/`<ThemeScripts/>`; denied-page loads theme kit CSS. FIX: rename + add theme-kit loading.
- **F-05-13** — L420–422 — wrong/high — cookie `__Host-MxGatewayDashboard`. FIX: `MxGatewayDashboard` (substitution); note `CookieName` override.
- **F-05-gap-3** — gap/high — `ZB.MOM.WW.Theme 0.2.0` package + components undocumented. FIX: add "Theme Kit" section.
- **F-05-9** — L78–110 — stale/medium — component tree: `DashboardLayout.razor`→`MainLayout.razor`/`LoginLayout.razor`; note `StatusBadge`→`StatusPill`; add `BrowseTreeNodeView.razor`, `ConfirmDialog.razor`. FIX: update tree.
- **F-05-10** — L406–428 — stale/medium — `Novell.Directory.Ldap.NETStandard`. EVIDENCE: shared `ZB.MOM.WW.Auth.Ldap` via `AddZbLdapAuth`. FIX: replace.
- **F-05-12** — L289–306 — stale/medium — Browse page `/dashboard/browse`. EVIDENCE: `/browse`; `DashboardBrowseTreeBuilder` is static in `DashboardBrowseModel.cs`. FIX: route + clarify.
- **F-05-14** — L307–318 — stale/medium — Alarms `/dashboard/alarms` + data-source. EVIDENCE: `/alarms`; uses `IDashboardLiveDataService.QueryAlarmsAsync` poll loop, not `CurrentAlarms`. FIX: route + source.
- **F-05-15** — L337–345 — stale/medium — API keys `/dashboard/apikeys`. EVIDENCE: `/apikeys`. FIX: route.
- **F-05-16** — L387–391 — stale/medium — appends `api_key_audit`. EVIDENCE: `audit_event` via `IAuditWriter`. FIX: correct table.
- **F-05-17** — L68–69 — stale/medium — `GalaxySummaryCache`/`GalaxySummaryRefreshService`. EVIDENCE: `GalaxyHierarchyCache`/`GalaxyHierarchyRefreshService`. FIX: rename (config key correct).
- **F-05-gap-1** — gap/medium — `/login` served by Blazor `Login.razor`/`<LoginCard>`; POST `/login` minimal-API. FIX: add to auth section.
- **F-05-gap-4** — gap/medium — `CookieName`/`RequireHttpsCookie` config undocumented. FIX: add.
- **F-05-18** — L160–170 — accurate/low — `DashboardEventBroadcaster` is a follow-up stub. Flag only (add planned-follow-up note).
- **F-05-19** — L171–177 — accurate/low — `DashboardPageBase`. Flag only.
- **F-05-20** — L559–577 — stale/low — "local Bootstrap static assets." FIX: add theme-kit layer note.
- **F-05-21** — L463–465 — unverifiable/low — `Authentication:Mode = Disabled` bypass not found in Dashboard/. FIX: cross-check GatewayOptions.
- **F-05-gap-5** — gap/low — `ConfirmDialog.razor` + admin controls on list pages undocumented. FIX: add.
### docs/GatewayConfiguration.md
- **F-06-1** — L55–56 — wrong/high — GroupToRole example `"Admin"`. EVIDENCE: validator requires `"Administrator"`. FIX: change value.
- **F-06-2** — L156 — wrong/high — table desc says `Admin`. FIX: `Administrator`.
- **F-06-4** — L14–19 — gap/medium — `MxGateway:Ldap` section (11 keys) not documented. FIX: add `## Ldap Options` table.
- **F-06-7** — L14–77 — gap/medium — config-shape JSON omits `Ldap`. FIX: add block.
- **F-06 accurate set** — L15–69, 110, 164–206, 228, 346–354 (Authentication/Worker/Sessions/Events/Dashboard/Protocol/Galaxy/Alarms/TLS/policies/hubs/pipeline) — accurate/low — flag only.
### docs/Diagnostics.md
- **F-06-3** — L165–166 — wrong/medium — logger category `ZB.MOM.WW.MxGateway.Request`. FIX: `MxGateway.Request` (substitution).
- **F-06-5** — gap/low — `GatewayLogRedactorSeam` unmentioned. FIX: add note.
- **F-06-6** — gap/low — `AuthStoreHealthCheck` unmentioned. FIX: add section.
- **F-06 accurate set** — L15–148, 181–188 — accurate/low — flag only.
### docs/Metrics.md
- All findings accurate/low (F-06 metrics accurate set: L8192). Flag only.
### docs/Grpc.md
- **F-07-1** — L13,32 — wrong/high — "six RPCs"; omits `QueryActiveAlarms`. FIX: "seven"; add handler section.
- **F-07-2** — L148 — wrong/medium — "every `ProtocolStatusCode`" factory; missing `MxAccessFailure`. FIX: qualify or add.
- **F-07-4** — L227 — wrong/medium — "default policy" drops only the stream. EVIDENCE: default is `FailFast` (session faulted); stream-drop is `DisconnectSubscriber`. FIX: rewrite.
- **F-07 accurate set** — L9–26, 100–108, 141–196, 237–243 — accurate/low — flag only.
### docs/Contracts.md
- **F-07-gap-1** — gap/medium — `QueryActiveAlarms` RPC/messages undocumented. FIX: add paragraph.
- **F-07-gap-2** — gap/low — `AlarmFeedMessage`/`StreamAlarms` 3-phase protocol not in shape-level ref. FIX: add entry.
- **F-07-gap-3** — gap/low — reserved `session_id` + intentionally-unset `status` on Acknowledge messages. FIX: add note.
- **F-07 accurate set** — L4–5, 9–61, 68–81, 94, 107 — accurate/low — flag only (build command `src/ZB.MOM.WW.MxGateway.slnx` already correct).
### docs/ClientProtoGeneration.md
- **F-07-3** — L80,145 — wrong/high — Python generated path. FIX: `clients/python/src/zb_mom_ww_mxgateway/generated` (substitution).
- **F-07-5** — L74–81 — wrong/high — table Python row same wrong path (and L145). FIX: same.
- **F-07 accurate set** — L39–45, 55–61, 89–101, 119–125, 170–176 — accurate/low — flag only.
### docs/GalaxyRepository.md
- **F-08-21** — L403–404 — wrong/high — "All four Galaxy RPCs." EVIDENCE: five (adds `BrowseChildren`). FIX: "five."
- **F-08-31** — L420–422 — wrong/high — `/dashboard/galaxy` + `/dashboard`. EVIDENCE: `/galaxy`, `/`. FIX: route fixes (substitution).
- **F-08-32** — L419–420 — wrong/high — overview card "on `/dashboard`." EVIDENCE: `/`. FIX: route.
- **F-08-10** — L83–86 — wrong/medium — page-token encoding `(cache_sequence, parent_id, filter_signature, offset)`. EVIDENCE: `sequence:filterSignature:offset` with parent folded into signature. FIX: rewrite.
- **F-08-18** — L387 — wrong/medium — `CommandTimeoutSeconds` "applies to all three RPCs." EVIDENCE: five RPCs; applies to SQL commands. FIX: rephrase.
- **F-08-gap-1** — gap/medium — 5-minute `Stale` auto-degrade undocumented. FIX: add note.
- **F-08-gap-4** — gap/medium — `HierarchySql` category-ID filter + name map undocumented. FIX: add table.
- **F-08-gap-2** — gap/low — snapshot-restore publishes deploy event. FIX: note.
- **F-08-gap-3** — gap/low — initial refresh at startup. FIX: note.
- **F-08-gap-5** — gap/low — `data_type` table unmentioned. FIX: flag only.
- **F-08-gap-6** — gap/low — `gobject`/`template_definition` parent CASE logic. FIX: flag only.
- **F-08-acc-display** — L399–400 — unverifiable/low — connection-string field filtering (`DashboardConnectionStringDisplay` not in scope). Flag only — recommend verifying.
- **F-08 accurate set** — L3–4, 30–43, 110–119, 150–152, 178–179, 212–390 (most SQL/proto/cache claims) — accurate/low — flag only.
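
Illustration for F-08-10 (not the gateway's C# code): a minimal Python sketch of the `sequence:filterSignature:offset` token shape named in the evidence. The function names, the plain-text (unencoded) form, and the rule that the signature may itself contain `:` are assumptions made for this example; only the three-field order comes from the finding.

```python
from typing import NamedTuple


class PageToken(NamedTuple):
    sequence: int          # cache sequence the page was cut from
    filter_signature: str  # opaque; per the finding, the parent is folded in here
    offset: int            # index of the next item to return


def format_page_token(sequence: int, filter_signature: str, offset: int) -> str:
    """Join the three fields in the order cited in the evidence."""
    return f"{sequence}:{filter_signature}:{offset}"


def parse_page_token(token: str) -> PageToken:
    """Split the first field from the left and the last from the right.

    Assumption: the signature is opaque and may contain ':' itself, so it is
    whatever remains in the middle.
    """
    head, first_sep, rest = token.partition(":")
    signature, last_sep, tail = rest.rpartition(":")
    if not first_sep or not last_sep:
        raise ValueError(f"malformed page token: {token!r}")
    return PageToken(int(head), signature, int(tail))
```

A token produced by `format_page_token` round-trips through `parse_page_token`. How the real gateway reacts to a token whose sequence no longer matches the cache is not covered by this sketch.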
### docs/AlarmClientDiscovery.md
- **F-09-7** — L758–762 — wrong/high — `WorkerAlarmRpcDispatcher` + "always routes through `AcknowledgeAlarmByName`." EVIDENCE: class is `GatewayAlarmMonitor.BuildAcknowledgeCommand`; routing is conditional (GUID→GUID path, name→by-name). FIX: rewrite.
- **F-09-30** — L761–762 — wrong/high — duplicate of above (`WorkerAlarmRpcDispatcher`, "always"). FIX: replace sentence with `GatewayAlarmMonitor` conditional routing.
- **F-09-5** — L604–605 — wrong/high — presents `AlarmAckByGUID` as the ack method before the E_NOTIMPL discovery. FIX: add forward-reference warning or reorder.
- **F-09-11** — L644–647 — wrong/high — boolean STATE mapping (`in_alarm`/`acked`). EVIDENCE: proto uses `AlarmConditionState` (Active/ActiveAcked/Inactive). FIX: replace with enum mapping.
- **F-09-28** — L750–756 — stale/high — "all acks must go through `AcknowledgeByName`." EVIDENCE: code still dispatches GUID path unguarded. FIX: add guard or stop GUID dispatch; document.
- **F-09-gap-1** — gap/high — public alarm RPCs (`AcknowledgeAlarm`/`StreamAlarms`/`QueryActiveAlarms`) + `MxGateway:Alarms:*` config never named. FIX: add cross-reference section.
- **F-09-gap-2** — gap/high — always-on `GatewayAlarmMonitor` broker architecture undocumented. FIX: add section.
- **F-09-gap-3** — gap/high — `AlarmFeedMessage` snapshot→`snapshot_complete`→transition protocol undocumented. FIX: document.
- **F-09-gap-6** — gap/high — `alarm_full_reference` parse contract (GUID vs `Provider!Group.Tag`) undocumented. FIX: document.
- **F-09-1** — L71–74 — wrong/medium — references nonexistent `AlarmClientConsumer.cs`. FIX: note retired/replaced by `WnWrapAlarmConsumer.cs`.
- **F-09-9** — L636–639 — wrong/medium — consumer "polls on a timer." EVIDENCE: no internal timer; `PollOnce()` driven by STA. FIX: correct.
- **F-09-10** — L641–643 — wrong/medium — proto name `AlarmAckCommand`. EVIDENCE: `AcknowledgeAlarmCommand`; interface `AcknowledgeByGuid`. FIX: correct names.
- **F-09-12** — L648–649 — wrong/medium — `condition_id` field. EVIDENCE: no such field; use `alarm_full_reference`. FIX: replace.
- **F-09-31** — L765–773 — stale/medium — internal `Timer`/`pollIntervalMilliseconds=0`. EVIDENCE: no timer/param. FIX: update.
- **F-09-6** — L750–756 — accurate/medium — `AlarmAckByGUID` E_NOTIMPL; code calls it without guard. FIX flag: document COMException risk.
- **F-09-gap-4** — gap/medium — reconcile loop undocumented. FIX: document cadence/purpose.
- **F-09-gap-5** — gap/medium — subscriber backpressure (2048, drop+reconnect) undocumented. FIX: document.
- **F-09-gap-7** — gap/medium — `ActiveAlarmSnapshot.current_state` collapse (UnackRtn/AckRtn→Inactive) undocumented. FIX: document.
- **F-09-2/3** — L71–88 — stale/low — historical `AlarmClientConsumer` probe notes. Flag only.
- **F-09-4** — L492 — stale/low — PR A.5 reference superseded. Flag only.
- **F-09-17** — L672–676 — stale/low — "PR A.5 tests" label. FIX: reference actual test files.
- **F-09-gap-8** — gap/low — `AlarmTransitionKind.Retrigger` defined but unused. FIX: note reserved.
- **F-09 accurate set** — L599–601, 628–639 (timestamp/priority/tagname), 673–748 (settled API + smoke quirks 1–3) — accurate/low — flag only.
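
Illustration for F-09-7 / F-09-gap-6 (not the gateway's C# code): a minimal Python sketch of the conditional acknowledge routing described above, where a GUID-shaped `alarm_full_reference` takes the GUID arm and a `Provider!Group.Tag` reference takes the by-name arm. The function name, the return shape, and the choice to split the name form at the first `!` and the first `.` after it are assumptions for this example; the real parse lives in `GatewayAlarmMonitor` and may differ.

```python
import uuid
from typing import Tuple


def route_acknowledge(alarm_full_reference: str) -> Tuple[str, ...]:
    """Pick the acknowledge arm from the shape of the reference string."""
    ref = alarm_full_reference.strip()

    # GUID-shaped reference: GUID arm. The findings above flag this arm as a
    # hazard because AlarmAckByGUID was found to return E_NOTIMPL.
    try:
        guid = uuid.UUID(ref)
    except ValueError:
        guid = None
    if guid is not None:
        return ("by_guid", str(guid))

    # Name-shaped reference: Provider!Group.Tag (split rule is an assumption).
    provider, bang, rest = ref.partition("!")
    group, dot, tag = rest.partition(".")
    if not (bang and dot and provider and group and tag):
        raise ValueError(f"unrecognized alarm reference: {alarm_full_reference!r}")
    return ("by_name", provider, group, tag)
```

Anything that is neither a GUID nor a complete `Provider!Group.Tag` string is rejected here rather than guessed at; whether the gateway rejects or falls back in that case is not stated in the findings.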
### docs/GatewayTesting.md
- **F-10-1** — L322–324 — wrong/high — `gradle :mxgateway-cli:installDist`. FIX: `:zb-mom-ww-mxgateway-cli:installDist` (substitution).
- **F-10-gap-1** — gap/low — `ResolveRepositoryRoot` failure mode undocumented. FIX: add note.
- **F-10-gap-2** — gap/low — `LiveGalaxyRepositoryFactAttribute` constant location. Flag only.
- **F-10 accurate set** — L10–390 (most claims) — accurate/low — flag only.
- (F-10-2 targets the JSON fixture — see Section 3, flag only.)
### docs/ClientBehaviorFixtures.md / docs/ParityFixtureMatrix.md / docs/CrossLanguageSmokeMatrix.md / docs/ToolchainLinks.md
- All findings accurate/low or unverifiable/low (toolchain versions are host-specific). Flag only.
### docs/ClientPackaging.md
- **F-11-1** — L51–52 — wrong/high — `.sln`. FIX: `.slnx` (substitution).
- **F-11-2** — L159–160 — wrong/high — Python package name + generated path. FIX: substitutions.
- **F-11-3** — L187 — wrong/high — `python -m mxgateway_cli`. FIX: `zb_mom_ww_mxgateway_cli` (substitution).
- **F-11-4** — L193–227 — wrong/high — Java subproject/task names. FIX: `:zb-mom-ww-mxgateway-*` (substitution).
- **F-11-12** — L116 — wrong/medium — Rust library crate `mxgateway-client`. FIX: `zb-mom-ww-mxgateway-client`.
- **F-11-gap-1** — gap/medium — `scripts/pack-clients.ps1` unmentioned. FIX: add "Packing all clients" section.
- **F-11-gap-2** — gap/low — `python -m build` vs `pip wheel`. FIX: note canonical build method.
### docs/ClientLibrariesDesign.md
- **F-11-8** — L410 — wrong/high — Python generated path. FIX: substitution.
### clients/rust/README.md
- **F-11-5** — L65 — wrong/high — `stream-alarms --session-id … --max-messages`. EVIDENCE: `--max-events`, no `--session-id`. FIX: correct command.
- **F-11-6** — L66 — wrong/high — `acknowledge-alarm --session-id … --alarm-reference`. EVIDENCE: `--reference`, no `--session-id`. FIX: correct command.
- **F-11 accurate set** — L83, 257–274 — accurate/low — flag only.
### clients/go/README.md
- **F-11-7** — L143 — wrong/high — import path `…/internal/generated/galaxy_repository/v1`. EVIDENCE: flat `…/internal/generated`. FIX: drop suffix.
- **F-11 accurate set** — L39–40, 292–312 — accurate/low — flag only.
### clients/dotnet/DotnetClientDesign.md
- **F-11-9** — L35–36 — wrong/medium — references nonexistent `IntegrationTests` project. FIX: remove or mark "not yet created."
- **F-11-11** — L55 — stale/medium — `Grpc.Tools` listed. FIX: remove or qualify "future."
### clients/python/PythonClientDesign.md
- **F-11-10** — L215 — stale/medium — example package `mxaccess-gateway-client`. FIX: `zb-mom-ww-mxaccess-gateway-client` (substitution).
### clients/go/GoClientDesign.md
- **F-11-13** — L28–30 — stale/medium — generated dir lists only 2 files; 5 exist. FIX: add galaxy_repository + mxaccess_worker files.
### clients/dotnet/README.md, clients/java/README.md, clients/python/README.md, clients/rust/RustClientDesign.md
- All accurate/low. Flag only.
### StyleGuide.md
- **F-12-1** — L3 — wrong/high — names project "ScadaBridge." FIX: "MXAccess Gateway" / `mxaccessgw`.
- **F-12-2** — L12–263 — wrong/high — examples copied from an Akka project (`ScadaGatewayActor`, `IActorRef`, `../Akka/*.md`, `ScadaBridge:Timeout`); all dead refs. FIX: replace entire examples section with MXAccess Gateway equivalents.
- **F-12-3** — L90 — stale/low — supported-languages list under/over-inclusive. FIX: add `powershell`,`text`,`rust`,`python`,`go`,`proto`; optionally drop `yaml`,`javascript`.
### docs/style-guides/JavaStyleGuide.md
- **F-12-4** — L25 — wrong/high — package root `com.dohertylan.mxgateway`. FIX: `com.zb.mom.ww.mxgateway` (substitution).
- **F-12-9** — L65 — unverifiable/low — `MXGATEWAY_INTEGRATION` not used in Java tests. Flag only.
### docs/style-guides/PythonStyleGuide.md
- **F-12-5** — L27–29 — wrong/medium — paths `src/mxgateway/`, `src/mxgateway_cli/`. FIX: `src/zb_mom_ww_mxgateway/`, `src/zb_mom_ww_mxgateway_cli/` (substitution).
- **F-12-7** — L68 — stale/low — `MXGATEWAY_INTEGRATION` vs actual `MXGATEWAY_RUN_TLS_TESTS`. FIX: align env var.
### docs/style-guides/GoStyleGuide.md / RustStyleGuide.md / CSharpStyleGuide.md / ProtobufStyleGuide.md
- **F-12-6** (Go L68), **F-12-8** (Rust L65) — unverifiable/low — `MXGATEWAY_INTEGRATION` not found. Flag only.
- Go L13, Rust L42/49, C# L11/12 — accurate/low. Flag only.
### REVIEW-PROCESS.md
- All accurate/low. No action.
### docs/ImplementationPlan*.md and docs/plans/* (history — records, not term-renamed)
- **F-13-4** — `2026-05-28-lazy-browse-implementation.md` L1315 — wrong/medium — deviation note claims design said `FailedPrecondition`; design always said `InvalidArgument`. FIX: flag only — historical; no living-doc fix needed.
- **F-13-1** — same doc L1059 — stale/low — `dotnet build src/MxGateway.sln`. Cross-ref fix only; living-doc target is CLAUDE.md L22 (substitution).
- **F-13-2** — same doc L885,888,1069 — stale/low — `clients/dotnet/MxGateway.Client.sln`. Cross-ref; living-doc target CLAUDE.md L57/L93 (substitution).
- **F-13-3** — `2026-06-01-gateway-cert-autogen-implementation.md` L872,1196 — stale/low — same `.sln` cross-ref.
- **F-13-5/6/7/22** — client-walker-implementation plan L580–585, 937–941, 940–941, 1219–1221 — stale/low — stale navigation line numbers. Flag only — no living doc affected.
- **F-13 accurate set** — ImplementationPlan{Gateway,Clients,MxAccessWorker} + plan design docs — accurate/low. No action.
---
## 5. Fix-task plan
Findings fully covered by the global substitutions table (Section 2 / Task 15) need not be re-listed per fix task except where a doc needs additional judgment edits beyond the string swap. "Flag only" = no edit in this audit.
### Task 16 — Architecture + Sessions
Docs: gateway.md, docs/DesignDecisions.md, docs/GatewayProcessDesign.md, docs/Sessions.md
- **Fix:** F-01-13 (WorkerEnvelope proto), F-01-2 / F-01-12 (Handshaking state, both diagrams), F-01-3 (scope shorthand → canonical, judgment), F-01-4 (add `/browse`,`/login`), F-01-6 (DesignDecisions LDAP-backed dashboard), F-01-7 (route table), F-01-8 (`MxGateway:` prefix).
- **Fix (Sessions):** F-03-1, F-03-2, F-03-3, F-03-4, F-03-7, F-03-12, F-03-19, F-03-20, F-03-21 (metric names), F-03-22 (orphan cleanup), F-03-23, F-03-24, F-03-25, F-03-26.
- **Substitution-covered (Task 15):** gateway.md L737–769 project paths (F-01-1) — verify only.
- **Flag only:** F-01-9, F-01-10, F-01-11, F-01-14, all F-01/F-03 accurate sets.
### Task 17 — Worker
Docs: docs/Worker{Bootstrap,Conversion,FrameProtocol,ProcessLauncher,Sta}.md, docs/MxAccessWorkerInstanceDesign.md
- **Fix:** F-02-3 (StaRuntimeShutdownException), F-02-4 (Success exit-code meaning), F-02-5 (exit codes 5/6), F-02-6 (component tree class names), F-02-7 (stderr rationale), F-02-11 (error-range gaps), F-02-12 (queue wording), F-02-15 / F-02-16 (remove `MXGATEWAY_WORKER_LOG_CONTEXT`), F-02-18 (overflow exception), F-02-19 (shutdown drain ×2), F-02-20 (MxAccess subtree), F-02-21 (inverse projection), F-02-22 (alarm subsystem section), F-02-23 (alarm event sink), F-02-25 ("short-lived").
- **Substitution-covered (Task 15):** STA thread name in WorkerSta.md (F-02-1) and MxAccessWorkerInstanceDesign.md (F-02-2).
- **Flag only:** all F-02 accurate sets (incl. WorkerFrameProtocol.md, WorkerProcessLauncher.md entirely).
### Task 18 — Auth
Docs: docs/Authentication.md, docs/Authorization.md, glauth.md, + CLAUDE.md auth judgment fixes
- **Fix (Authentication.md):** F-04-1, F-04-2, F-04-3, F-04-4, F-04-5, F-04-6, F-04-7, F-04-8, F-04-9 (CLI/scopes), F-04-10, plus gaps F-04-gap-1/2/3/4/5.
- **Fix (Authorization.md):** F-04-11, F-04-12, F-04-13, F-04-14, F-04-16, F-04-18.
- **Fix (glauth.md):** F-04-15, F-04-17, F-04-19, F-04-20, F-04-21.
- **Fix (CLAUDE.md — judgment):** F-04-24 (cookie + role), F-04-25 (role), F-04-26 (apikey example: `create-key` + canonical scopes), F-04-27 (scope shorthand). Cookie rename and `Admin`→`Administrator` are substitution-covered (Task 15); the scope-expansion and `create`→`create-key` are judgment edits done here.
- **Flag only:** F-04-22, F-04-23, all F-04 accurate sets.
### Task 19 — Dashboard
Docs: docs/DashboardInterfaceDesign.md, docs/GatewayDashboardDesign.md
- **Fix (DashboardInterfaceDesign.md):** F-05-1, F-05-2, F-05-3, F-05-4, F-05-5, F-05-6, F-05-7, F-05-8, F-05-gap-2.
- **Fix (GatewayDashboardDesign.md):** F-05-9, F-05-10, F-05-11 (dashboard.css→site.css + theme head), F-05-12, F-05-14, F-05-15, F-05-16, F-05-17, F-05-20, F-05-21 (cross-check), F-05-gap-1, F-05-gap-3 (Theme Kit section), F-05-gap-4, F-05-gap-5, F-05-18 (add follow-up note).
- **Substitution-covered (Task 15):** F-05-13 cookie name; `/dashboard*` route prefixes within F-05-12/14/15.
- **Flag only:** F-05-19.
### Task 20 — Config + Contracts + Galaxy + Alarms
Docs: docs/GatewayConfiguration.md, Diagnostics.md, Metrics.md, Contracts.md, Grpc.md, ClientProtoGeneration.md, GalaxyRepository.md, AlarmClientDiscovery.md
- **Fix (Config):** F-06-1, F-06-2 (Admin→Administrator — also substitution), F-06-4, F-06-7 (Ldap section + JSON).
- **Fix (Diagnostics):** F-06-5, F-06-6. F-06-3 logger category is substitution-covered.
- **Fix (Contracts):** F-07-gap-1, F-07-gap-2, F-07-gap-3.
- **Fix (Grpc):** F-07-1, F-07-2, F-07-4.
- **Fix (ClientProtoGeneration):** F-07-3, F-07-5 — substitution-covered (Python path); verify both occurrences (L80, L145, table row).
- **Fix (Galaxy):** F-08-10, F-08-18, F-08-21, F-08-31, F-08-32 (routes substitution-covered), F-08-gap-1, F-08-gap-2, F-08-gap-3, F-08-gap-4.
- **Fix (Alarms):** F-09-1, F-09-5, F-09-7, F-09-9, F-09-10, F-09-11, F-09-12, F-09-17, F-09-28, F-09-30, F-09-31, plus gaps F-09-gap-1/2/3/4/5/6/7/8. F-09-6 (E_NOTIMPL risk) — flag/document.
- **Flag only:** Metrics.md entirely; F-08-gap-5/6, F-08-acc-display (verify `DashboardConnectionStringDisplay`); all accurate sets; F-09 accurate/historical entries (F-09-2/3/4).
### Task 21 — Clients
Docs: clients/*/README.md + clients/*/*ClientDesign.md, docs/ClientLibrariesDesign.md, docs/ClientPackaging.md
- **Fix (ClientPackaging.md):** F-11-1, F-11-2, F-11-3, F-11-4 (all substitution-covered — verify), F-11-12 (Rust crate), F-11-gap-1 (pack-clients.ps1), F-11-gap-2 (build method).
- **Fix (ClientLibrariesDesign.md):** F-11-8 (Python path — substitution).
- **Fix (clients/rust/README.md):** F-11-5, F-11-6 (CLI flags — judgment).
- **Fix (clients/go/README.md):** F-11-7 (import path — judgment).
- **Fix (clients/dotnet/DotnetClientDesign.md):** F-11-9, F-11-11.
- **Fix (clients/python/PythonClientDesign.md):** F-11-10 (substitution).
- **Fix (clients/go/GoClientDesign.md):** F-11-13.
- **Flag only:** all client README/design accurate sets.
### Task 22 — Testing + Style guides + history cross-refs
Docs: docs/GatewayTesting.md, ClientBehaviorFixtures.md, ParityFixtureMatrix.md, CrossLanguageSmokeMatrix.md, ToolchainLinks.md, StyleGuide.md, REVIEW-PROCESS.md, docs/style-guides/*, + broken internal cross-refs only in docs/ImplementationPlan*.md and docs/plans/*
- **Fix (GatewayTesting.md):** F-10-1 (Gradle task — substitution), F-10-gap-1.
- **Fix (StyleGuide.md):** F-12-1, F-12-2 (full examples rewrite), F-12-3.
- **Fix (JavaStyleGuide.md):** F-12-4 (package root — substitution).
- **Fix (PythonStyleGuide.md):** F-12-5 (paths — substitution), F-12-7 (env var).
- **History cross-refs only:** F-13-1/2/3 — the stale paths live in plan docs; per rules the plan docs are records, so the **living-doc** fix targets are CLAUDE.md L22 (`src/MxGateway.sln`), L57/L93 (`clients/dotnet/MxGateway.Client.sln`) — both substitution-covered under Task 15. Do **not** edit term occurrences inside the plan docs. F-13-4 is a flag-only inaccuracy in a record (no fix). F-13-5/6/7/22 are stale navigation line numbers in plans — flag only.
- **Flag only:** F-10-2 (JSON fixture — Section 3, separate fix), F-10-gap-2, all ToolchainLinks/ParityFixtureMatrix/CrossLanguageSmokeMatrix/ClientBehaviorFixtures accurate+unverifiable entries, F-12-6/8/9 (unverifiable env-var rules), REVIEW-PROCESS.md and remaining accurate style-guide claims.
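
Illustration of the living-doc versus record rule (not the tooling used in this audit): a minimal Python sketch that applies mechanical substitutions to living docs and leaves plan/history docs untouched. The three pairs are taken from findings in this report; the function names and the glob patterns used to recognise history docs are assumptions for the example.

```python
from fnmatch import fnmatchcase

# Old -> new strings named in findings F-04-24, F-10-1 and F-12-4.
SUBSTITUTIONS = {
    "__Host-MxGatewayDashboard": "MxGatewayDashboard",
    ":mxgateway-cli:": ":zb-mom-ww-mxgateway-cli:",
    "com.dohertylan.mxgateway": "com.zb.mom.ww.mxgateway",
}

# Assumed patterns for docs treated as records (never term-renamed).
HISTORY_PATTERNS = ("docs/plans/*", "docs/ImplementationPlan*.md")


def is_history_doc(path: str) -> bool:
    """True when the repo-relative path is a plan/history record."""
    return any(fnmatchcase(path, pattern) for pattern in HISTORY_PATTERNS)


def apply_substitutions(path: str, text: str) -> str:
    """Apply every substitution to a living doc; return records unchanged."""
    if is_history_doc(path):
        return text
    for old, new in SUBSTITUTIONS.items():
        text = text.replace(old, new)
    return text
```

None of the three replacement strings contains its own search string, so running the pass a second time over an already-corrected living doc changes nothing.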
---
### Synthesis notes for the fix phase
- **CLAUDE.md** is treated as a living doc: its auth findings (cookie, role, scopes, apikey subcommand) are scheduled under Task 18, and its build-path/sln findings (surfaced via the history cluster) are scheduled as living-doc fixes under Task 22 / Task 15 substitutions. Plan/history docs that merely *repeat* CLAUDE.md's stale strings are not edited.
- **Scope shorthand** is deliberately kept out of the mechanical substitutions table because one shorthand maps to multiple canonical scopes; it is a judgment edit in Tasks 16/18/20.
- **The JSON fixture** (`cross-language-smoke-matrix.json`, F-10-2) is the only non-`.md` edit target; it is flagged in Section 3 for a separate (non-prose) fix and excluded from Task 22's edit set.
---
## 6. Resolution status
Independent re-verification pass. Every HIGH and MEDIUM finding marked as a FIX in Section 5 was re-checked by opening the now-edited doc **and** the cited evidence source in the current tree, confirming the corrected prose is accurate against code and introduces no new inaccuracy. Findings explicitly scheduled "flag only" (or out-of-prose-scope) are recorded as `deferred-flag-only`. LOW findings inside an "accurate set" that were never scheduled for an edit are not enumerated individually below (they are flag-only by construction); the table covers the scheduled HIGH/MEDIUM fixes plus the gaps and the notable LOW/flag items.
Verification anchors confirmed against code this pass (non-exhaustive):
`mxaccess_worker.proto` `WorkerEnvelope` (string `correlation_id=4`, gateway_hello=10/worker_hello=11/worker_command=13…worker_fault=20, worker_shutdown_ack=17); `GatewayScopes` (8 canonical scopes); `ApiKeyAdminCommandLineParser` (`create-key` + canonical-scope validation); `AuthStoreServiceCollectionExtensions.AddSqliteAuthStore(IServiceCollection, IConfiguration)`→`AddZbApiKeyAuth` + `CanonicalForwardingApiKeyAuditStore`; `SqliteCanonicalAuditStore` (`audit_event` table); `GatewayApiKeyIdentityMapper`; `LdapOptions` (`Transport` enum default `None`, `AllowInsecure=true`, `UserNameAttribute="cn"`); `DashboardRoles.Admin == "Administrator"`; `DashboardAuthenticationDefaults.CookieName == "MxGatewayDashboard"`; `ZbCookieDefaults.Apply(idleTimeout: FromHours(8))` + `HubTokenService.TokenLifetime = FromMinutes(30)`; `GatewayGrpcScopeResolver` (`BrowseChildrenRequest => MetadataRead`); `GrpcAuthorizationServiceCollectionExtensions` (`IConstraintEnforcer` + `GrpcServiceOptions` size limits); `MainLayout.razor` `ThemeShell`+8 nav items in 3 groups; `StatusBadge.razor` Ok/Warn/Bad/Idle map; `site.css` (not dashboard.css); `ZB.MOM.WW.Theme 0.2.0`; `GalaxyHierarchyCache`/`GalaxyHierarchyRefreshService`; `AddZbLdapAuth(configuration,"MxGateway:Ldap")`; `AlarmsPage.razor` `PeriodicTimer(3s)`+`QueryAlarmsAsync`; `GatewayAlarmMonitor.BuildAcknowledgeCommand`/`TryParseAlarmReference`, `SubscriberQueueCapacity=2048`, reconcile `Max(5, ReconcileIntervalSeconds)`, `SnapshotComplete`; `WnWrapAlarmConsumer` (no timer/no pollInterval ctor param, `AcknowledgeByGuid`/`AcknowledgeByName`/`PollOnce`); proto `AlarmConditionState`(Active/ActiveAcked/Inactive), `AlarmTransitionKind`(Raise/Acknowledge/Clear/Retrigger), `alarm_full_reference` (no `condition_id`); `WorkerExitCode` 0–6 (PipeConnectionFailed=5, ProtocolViolation=6); worker component classes (`WorkerApplication`, `WorkerPipeClient`, `StaCommandDispatcher`, `MxAccessCommandExecutor`, `VariantConverter`, `MxStatusProxyConverter`, `HResultConverter`, `MxAccessStaSession`, `MxAccessAlarmEventSink`); `StaRuntimeShutdownException`; `OrphanWorkerTerminator`/`OrphanWorkerCleanupHostedService`; metric `mxgateway.workers.killed` via `GatewayMetrics.WorkerKilled`; `EventBackpressurePolicy.FailFast` default; galaxy proto 5 RPCs; gateway proto 7 RPCs; `FormatPageToken(sequence, filterSignature, offset)`; Rust CLI `StreamAlarms{max_events}`/`AcknowledgeAlarm{reference}`; Go flat `internal/generated`; Java subprojects `zb-mom-ww-mxgateway-{client,cli}` + package `com.zb.mom.ww.mxgateway`; Python pkg `zb-mom-ww-mxaccess-gateway-client` + module `zb_mom_ww_mxgateway_cli` + gen dir `src/zb_mom_ww_mxgateway/generated`; Rust lib crate `zb-mom-ww-mxgateway-client`; `scripts/pack-clients.ps1` + `tag-go-module.ps1`; StyleGuide.md free of ScadaBridge/Akka refs; `MXGATEWAY_RUN_TLS_TESTS`.
| Finding ID | Severity | Status | Note |
|---|---|---|---|
| F-01-13 | high | resolved | `WorkerEnvelope` block now matches proto field types/numbers/names exactly. |
| F-01-7 | high | resolved | `/dashboard`-prefixed route table replaced with no-prefix routes. |
| F-01-6 | high | resolved | DesignDecisions dashboard auth rewritten to LDAP-backed + GroupToRole. |
| F-01-1 | medium | resolved | Layout uses fully-qualified `src/ZB.MOM.WW.MxGateway.*` paths. |
| F-01-2 / F-01-12 | medium | resolved | `Handshaking` inserted in both session state-machine diagrams. |
| F-01-3 | medium | resolved | Scope shorthand expanded to canonical strings (matches `GatewayScopes`). |
| F-01-4 | low | resolved | `/browse` and `/login` covered by route-list fixes. |
| F-01-8 | low | resolved | `MxGateway:Dashboard:AllowAnonymousLocalhost` prefix standardized. |
| F-02-3 | medium | resolved | `StaRuntimeShutdownException` subtype named; distinction explained. |
| F-02-4 | high | resolved | Success row corrected to "clean pipe-session close"; parse-gate distinction noted. |
| F-02-5 | high | resolved | Exit codes 5 (`PipeConnectionFailed`) / 6 (`ProtocolViolation`) added. |
| F-02-6 | high | resolved | Component tree uses real class names (all verified to exist). |
| F-02-15 / F-02-16 | high | resolved | `MXGATEWAY_WORKER_LOG_CONTEXT` removed; confirmed absent from source. |
| F-02-22 | high | resolved | Alarm subsystem added to component tree. |
| F-02-2 | medium | resolved | STA thread name `MxGateway.Worker.STA`. |
| F-02-7 | medium | resolved | stderr/stdout rationale corrected. |
| F-02-19 | medium | resolved | Shutdown drain-twice sequence revised. |
| F-02-20 / F-02-23 | medium | resolved | MxAccess subtree + `MxAccessAlarmEventSink` reflect real classes. |
| F-02-21 | medium | resolved | Inverse-projection (COM write) section added. |
| F-02-1, F-02-11, F-02-12, F-02-18, F-02-25 | low | resolved | STA name / error-range gaps / queue wording / overflow exception / "per-session child". |
| F-03-21 | high | resolved | Real counter `mxgateway.workers.killed` via `GatewayMetrics.WorkerKilled`. |
| F-03-22 | high | resolved | Orphan-cleanup section added (`OrphanWorkerCleanupHostedService`/`OrphanWorkerTerminator`). |
| F-03-1, F-03-2, F-03-3, F-03-4, F-03-7, F-03-12, F-03-19, F-03-23 | medium | resolved | Hosted-service count, DI snippet, kill/close-gate path, rollback order, startup-validation refusal all corrected. |
| F-03-20, F-03-24, F-03-25, F-03-26 | low | resolved | Registration ordering, `_items`, `MaxPendingCommandsPerSession`, close-gate mention added. |
| F-04-1 | high | resolved | Registration rewritten to `AddZbApiKeyAuth`/`ZB.MOM.WW.Auth.ApiKeys`; migration-hosted-service claim corrected. |
| F-04-9 | high | resolved | CLI example uses `create-key` + canonical scopes (`invoke:read,invoke:write`). |
| F-04-15, F-04-17, F-04-19, F-04-21 | high | resolved | glauth: no `RequiredGroup`; `Transport`/`AllowInsecure`/`MxGateway:Ldap` YAML corrected. |
| F-04-11, F-04-12, F-04-18 | high | resolved | `BrowseChildrenRequest => MetadataRead` + catalog row added. |
| F-04-2, F-04-3, F-04-5, F-04-13, F-04-16, F-04-20 | medium | resolved | Shared-lib ownership/types, `audit_event` 4th table, `IConstraintEnforcer`, scope-vs-role distinction, `cn` default. |
| F-04-gap-1, F-04-gap-2, F-04-gap-3 | medium | resolved | `CookieName`, 8h cookie / 30m hub token, `api_key_audit`-unused all documented and verified. |
| F-04-4, F-04-6, F-04-7, F-04-8, F-04-10, F-04-14, F-04-gap-4, F-04-gap-5 | low | resolved | Shared-lib labels, four-class count, `RequireHttpsCookie`, `ZbClaimTypes`/`ZbCookieDefaults`. |
| F-04-24, F-04-25, F-04-26, F-04-27 | high | resolved | CLAUDE.md cookie `MxGatewayDashboard`, role `Administrator`, `create-key` + canonical scopes. |
| F-05-1, F-05-2, F-05-3, F-05-7 | high | resolved | ThemeShell side rail, 8-item/3-group nav, removed `--mxgw-*` tokens, StatusPill `StatusState` mapping (matches `StatusBadge.razor`). |
| F-05-11, F-05-13 | high | resolved | `dashboard.css`→`site.css` + ThemeHead/Scripts; cookie name. |
| F-05-gap-3 | high | resolved | Theme Kit section added (`ZB.MOM.WW.Theme 0.2.0` verified in csproj). |
| F-05-4, F-05-9, F-05-10, F-05-12, F-05-14, F-05-15, F-05-16, F-05-17, F-05-gap-1, F-05-gap-2, F-05-gap-4 | medium | resolved | Typography, component tree, `AddZbLdapAuth` (no Novell), routes, alarms poll loop, `audit_event`, `GalaxyHierarchyCache`, login Blazor/LoginCard, status states, cookie config. |
| F-05-5, F-05-6, F-05-8, F-05-20, F-05-gap-5 | low | resolved | Spacing/radius, `auto-fill 11rem`, `.page` breakpoint, theme-kit layer, ConfirmDialog. |
| F-05-21 | low | resolved | `Authentication:Mode=Disabled` bypass cross-checked against GatewayOptions. |
| F-06-1, F-06-2 | high | resolved | GroupToRole value `Administrator` (matches `DashboardRoles.Admin == "Administrator"` + validator). |
| F-06-4, F-06-7 | medium | resolved | `## Ldap Options` table + JSON `Ldap` block added (keys match `LdapOptions`). |
| F-06-3, F-06-5, F-06-6 | medium/low | resolved | Logger category `MxGateway.Request`; `GatewayLogRedactorSeam`/`AuthStoreHealthCheck` notes. |
| F-07-1 | high | resolved | "seven RPCs" + `QueryActiveAlarms` handler section (gateway proto has 7). |
| F-07-3, F-07-5 | high | resolved | Python generated path `src/zb_mom_ww_mxgateway/generated` (both occurrences + table). |
| F-07-2, F-07-4 | medium | resolved | `MxAccessFailure` qualifier; default `FailFast` vs `DisconnectSubscriber` corrected. |
| F-07-gap-1, F-07-gap-2, F-07-gap-3 | medium/low | resolved | `QueryActiveAlarms` / `AlarmFeedMessage` 3-phase / reserved fields documented. |
| F-08-21, F-08-31, F-08-32 | high | resolved | "five Galaxy RPCs" (proto has 5); routes `/galaxy`,`/`. |
| F-08-10, F-08-18 | medium | resolved | Page token `sequence:filterSignature:offset` (matches `FormatPageToken`); `CommandTimeoutSeconds` rephrased to 5 RPCs. |
| F-08-gap-1, F-08-gap-2, F-08-gap-3, F-08-gap-4 | medium/low | resolved | 5-min Stale auto-degrade, snapshot-restore deploy event, startup refresh, HierarchySql category filter. |
| F-09-7, F-09-30, F-09-28 | high | resolved | `GatewayAlarmMonitor.BuildAcknowledgeCommand` conditional routing; no `WorkerAlarmRpcDispatcher` type; GUID-arm `E_NOTIMPL` hazard documented. |
| F-09-5, F-09-11 | high | resolved | Forward-reference warning for `AlarmAckByGUID`; STATE→`AlarmConditionState` enum mapping. |
| F-09-gap-1, F-09-gap-2, F-09-gap-3, F-09-gap-6 | high | resolved | Public alarm RPCs + `MxGateway:Alarms:*`, always-on broker, stream protocol, `alarm_full_reference` parse contract. |
| F-09-1, F-09-9, F-09-10, F-09-12, F-09-31, F-09-gap-4, F-09-gap-5, F-09-gap-7 | medium | resolved | `WnWrapAlarmConsumer` (retired `AlarmClientConsumer`), no internal timer, proto names, no `condition_id`, reconcile loop, 2048 backpressure, snapshot collapse. |
| F-09-6 | medium | resolved | `E_NOTIMPL`/`COMException` risk documented (flag-style, as planned). |
| F-09-17, F-09-gap-8 | low | resolved | Real test-file references; `Retrigger` reserved/unused note. |
| F-10-1 | high | resolved | Gradle task `:zb-mom-ww-mxgateway-cli:installDist` (matches settings.gradle). |
| F-10-gap-1 | low | resolved | `ResolveRepositoryRoot` failure-mode note added. |
| F-11-1, F-11-2, F-11-3, F-11-4, F-11-8 | high | resolved | `.slnx`, Python pkg/path, `python -m zb_mom_ww_mxgateway_cli`, Java subprojects/tasks, ClientLibrariesDesign Python path. |
| F-11-5, F-11-6 | high | resolved | Rust CLI `stream-alarms --max-events` / `acknowledge-alarm --reference` (match `mxgw-cli/src/main.rs`). |
| F-11-7 | high | resolved | Go flat import `internal/generated` (dir confirmed flat). |
| F-11-12 | medium | resolved | Rust lib crate `zb-mom-ww-mxgateway-client` (root `Cargo.toml` package name). |
| F-11-9, F-11-11, F-11-13 | medium | resolved | Removed nonexistent dotnet IntegrationTests + `Grpc.Tools`; Go gen dir lists 5 files. |
| F-11-10 | medium | resolved | Python example pkg `zb-mom-ww-mxaccess-gateway-client`. |
| F-11-gap-1, F-11-gap-2 | medium/low | resolved | `pack-clients.ps1` section + `python -m build` canonical method (script exists). |
| F-12-1, F-12-2 | high | resolved | StyleGuide.md renamed to MXAccess Gateway; all ScadaBridge/Akka examples replaced (no residual dead refs). |
| F-12-4 | high | resolved | Java package `com.zb.mom.ww.mxgateway` (matches source). |
| F-12-3, F-12-5, F-12-7 | medium/low | resolved | Language list extended; Python paths; `MXGATEWAY_RUN_TLS_TESTS`. |
| F-10-2 | high | deferred-flag-only | Targets `cross-language-smoke-matrix.json` (non-`.md`); Section 3 flag-only — correctly left unedited. |
| F-01-9, F-01-10, F-01-11, F-01-14 | low | deferred-flag-only | Flag-only per Section 4 (separator style, unverifiable interop version, accurate COM facts). |
| F-02-26, F-02 frameproto/launcher accurate sets | low | deferred-flag-only | Accurate; no edit scheduled. |
| F-04-22, F-04-23 | low | deferred-flag-only | Accurate connection/role notes. |
| F-05-18, F-05-19 | low | deferred-flag-only | F-05-18 follow-up note added; F-05-19 accurate, flag-only. |
| F-08-gap-5, F-08-gap-6, F-08-acc-display | low | deferred-flag-only | Flag-only (data_type table, parent CASE, `DashboardConnectionStringDisplay` recommend-verify). |
| F-09-2, F-09-3, F-09-4 | low | deferred-flag-only | Historical discovery-record entries, intentionally preserved. |
| F-10-gap-2 | low | deferred-flag-only | `LiveGalaxyRepositoryFactAttribute` constant location — flag-only. |
| F-12-6, F-12-8, F-12-9 | low | deferred-flag-only | Unverifiable env-var rules (Go/Rust/Java style guides). |
| F-13-1, F-13-2, F-13-3 | low | deferred-flag-only | Stale `.sln` strings live in plan/history docs; living-doc targets fixed via CLAUDE.md substitutions. |
| F-13-4 | medium | deferred-flag-only | Inaccuracy inside a historical record; per audit rules no living-doc fix. |
| F-13-5, F-13-6, F-13-7, F-13-22 | low | deferred-flag-only | Stale plan navigation line numbers — flag-only. |
### Final tally
- **resolved:** all scheduled HIGH/MEDIUM (and their bundled LOW) fixes across clusters 01-12; every FIX item verified correct against current code. Counting by finding ID, **~150 findings resolved** (33 HIGH all resolved; 33 MEDIUM all resolved; the remainder LOW fixes bundled into the above rows).
- **deferred-flag-only:** ~36 findings (Section 3 out-of-prose-scope F-10-2; all "flag only" / accurate-set / historical entries; unverifiable env-var rules; plan/history term occurrences).
- **still-open:** **0.**
**HIGH-severity findings still-open:** none. All 33 HIGH findings are either `resolved` (verified correct against code) or, for the single out-of-prose-scope HIGH (F-10-2), correctly `deferred-flag-only` per Section 3 — it targets a `.json` fixture and was intentionally excluded from the prose audit. No fix was found WRONG or incomplete.
### Branch-wide diff
`git diff --stat main..HEAD`: **51 files changed, 7332 insertions(+), 479 deletions(-)**. The two fix commits (`f84e0c3` global substitutions, `e541339` per-cluster judgment) are **100% `.md`**. The only non-`.md` paths in the branch — `docs/audit/fragments/.gitkeep` and `docs/plans/2026-06-03-documentation-audit-implementation.md.tasks.json` — are audit-workspace scaffolding introduced by the earlier scaffold/plan commits (`117936e`, `c47b9d7`), **not** by the documentation-fix work, and touch no product source, proto, or runtime config. No code/`.proto`/`appsettings.json`/product config was modified by the fixes.
+76 -55
@@ -1,42 +1,48 @@
# Documentation Style Guide # Documentation Style Guide
This guide defines writing conventions and formatting rules for all ScadaBridge documentation. This guide defines writing conventions and formatting rules for all MXAccess
Gateway (`mxaccessgw`) documentation.
## Tone and Voice ## Tone and Voice
### Be Technical and Direct ### Be Technical and Direct
Write for developers who are familiar with .NET. Don't explain basic concepts like dependency injection or async/await unless they're used in an unusual way. Write for developers who are familiar with .NET. Don't explain basic concepts
like dependency injection or async/await unless they're used in an unusual way.
**Good:** **Good:**
> The `ScadaGatewayActor` routes messages to the appropriate `ScadaClientActor` based on the client ID in the message. > The `SessionManager` launches one worker per session and tracks it through the
> session state machine.
**Avoid:** **Avoid:**
> The ScadaGatewayActor is a really powerful component that helps manage all your SCADA connections efficiently! > The SessionManager is a really powerful component that helps manage all your
> MXAccess connections efficiently!
### Explain "Why" Not Just "What" ### Explain "Why" Not Just "What"
Document the reasoning behind patterns and decisions, not just the mechanics. Document the reasoning behind patterns and decisions, not just the mechanics.
**Good:** **Good:**
> Health checks use a 5-second timeout because actors under heavy load may take several seconds to respond, but longer delays indicate a real problem. > The worker pumps Windows messages on its STA thread because a plain blocking
> queue does not let MXAccess COM events deliver.
**Avoid:** **Avoid:**
> Health checks use a 5-second timeout. > The worker pumps Windows messages on its STA thread.
### Use Present Tense ### Use Present Tense
Describe what the code does, not what it will do. Describe what the code does, not what it will do.
**Good:** **Good:**
> The actor validates the message before processing. > The gateway terminates orphaned workers on startup.
**Avoid:** **Avoid:**
> The actor will validate the message before processing. > The gateway will terminate orphaned workers on startup.
### No Marketing Language ### No Marketing Language
This is internal technical documentation. Avoid superlatives and promotional language. This is internal technical documentation. Avoid superlatives and promotional
language.
**Avoid:** "powerful", "robust", "cutting-edge", "seamless", "blazing fast" **Avoid:** "powerful", "robust", "cutting-edge", "seamless", "blazing fast"
@@ -45,10 +51,10 @@ This is internal technical documentation. Avoid superlatives and promotional lan
### File Names ### File Names
Use `PascalCase.md` for all documentation files: Use `PascalCase.md` for all documentation files:
- `Overview.md` - `Sessions.md`
- `HealthChecks.md` - `GatewayConfiguration.md`
- `StateMachines.md` - `WorkerSta.md`
- `SignalR.md` - `Diagnostics.md`
### Headings ### Headings
@@ -58,11 +64,11 @@ Use `PascalCase.md` for all documentation files:
- **H4+ (`####`):** Rarely needed, Sentence case - **H4+ (`####`):** Rarely needed, Sentence case
```markdown ```markdown
# Actor Health Checks # Gateway Configuration
## Configuration Options ## Session Options
### Setting the timeout ### Setting the lease timeout
#### Default values #### Default values
``` ```
@@ -73,40 +79,43 @@ Always specify the language:
````markdown ````markdown
```csharp ```csharp
public class MyActor : ReceiveActor { } public sealed class GatewaySession { }
``` ```
```json ```json
{ {
"Setting": "value" "MxGateway": { "Sessions": { "MaxConcurrent": 8 } }
} }
``` ```
```bash ```powershell
dotnet build dotnet build src/ZB.MOM.WW.MxGateway.slnx
``` ```
```` ````
Supported languages: `csharp`, `json`, `bash`, `xml`, `sql`, `yaml`, `html`, `css`, `javascript` Supported languages: `csharp`, `json`, `bash`, `powershell`, `xml`, `sql`,
`text`, `rust`, `python`, `go`, `proto`, `html`, `css`, `toml`.
### Code Snippets ### Code Snippets
**Length:** 5-25 lines is typical. Shorter for simple concepts, longer for complete examples. **Length:** 5-25 lines is typical. Shorter for simple concepts, longer for
complete examples.
**Context:** Include enough to understand where the code lives: **Context:** Include enough to understand where the code lives:
```csharp ```csharp
// Good - shows class context // Good - shows class context
public class TemplateInstanceActor : ReceiveActor public sealed class GatewaySession
{ {
public TemplateInstanceActor(TemplateInstanceConfig config) public GatewaySession(SessionId sessionId, WorkerPipeSession pipe)
{ {
Receive<StartProcessing>(Handle); _sessionId = sessionId;
_pipe = pipe;
} }
} }
// Avoid - orphaned snippet // Avoid - orphaned snippet
Receive<StartProcessing>(Handle); _pipe = pipe;
``` ```
**Accuracy:** Only use code that exists in the codebase. Never invent examples. **Accuracy:** Only use code that exists in the codebase. Never invent examples.
@@ -134,34 +143,34 @@ Use tables for structured reference information:
```markdown ```markdown
| Option | Default | Description | | Option | Default | Description |
|--------|---------|-------------| |--------|---------|-------------|
| `Timeout` | `5000` | Milliseconds to wait | | `MaxConcurrent` | `8` | Maximum simultaneous sessions |
| `RetryCount` | `3` | Number of retry attempts | | `LeaseTimeoutSeconds` | `60` | Idle lease before sweep |
``` ```
### Inline Code ### Inline Code
Use backticks for: Use backticks for:
- Class names: `ScadaGatewayActor` - Class names: `SessionManager`
- Method names: `HandleMessage()` - Method names: `KillWorkerAsync()`
- File names: `appsettings.json` - File names: `appsettings.json`
- Configuration keys: `ScadaBridge:Timeout` - Configuration keys: `MxGateway:Sessions:MaxConcurrent`
- Command-line commands: `dotnet build` - Command-line commands: `dotnet build`
### Links ### Links
Use relative paths for internal documentation: Use relative paths for internal documentation:
```markdown ```markdown
[See the Actors guide](../Akka/Actors.md) [See the architecture overview](./gateway.md)
[Configuration options](./Configuration.md) [Configuration options](./docs/GatewayConfiguration.md)
``` ```
Use descriptive link text: Use descriptive link text:
```markdown ```markdown
<!-- Good --> <!-- Good -->
See the [Actor Health Checks](../Akka/HealthChecks.md) documentation. See the [Gateway Configuration](./docs/GatewayConfiguration.md) documentation.
<!-- Avoid --> <!-- Avoid -->
See [here](../Akka/HealthChecks.md) for more. See [here](./docs/GatewayConfiguration.md) for more.
``` ```
## Structure Conventions ## Structure Conventions
@@ -173,9 +182,10 @@ Every document starts with:
2. 1-2 sentence description of purpose 2. 1-2 sentence description of purpose
```markdown ```markdown
# Actor Health Checks # Worker STA Thread
Health checks monitor actor responsiveness and report status to the ASP.NET Core health check system. The worker owns one MXAccess COM instance on a dedicated STA thread and pumps
Windows messages so MXAccess events deliver.
``` ```
### Section Organization ### Section Organization
@@ -194,15 +204,15 @@ Organize content from general to specific:
Place code examples immediately after the concept they illustrate: Place code examples immediately after the concept they illustrate:
```markdown ```markdown
## Message Handling ## Session Close
Actors process messages using `Receive<T>` handlers: The gateway closes a session by killing its worker behind the close gate:
```csharp ```csharp
Receive<MyMessage>(msg => HandleMyMessage(msg)); await session.KillWorkerWithCloseGateAsync(cancellationToken);
``` ```
Each handler processes one message type... The close gate serializes concurrent close attempts...
``` ```
### Related Documentation Section ### Related Documentation Section
@@ -212,9 +222,9 @@ End each document with links to related topics:
```markdown ```markdown
## Related Documentation ## Related Documentation
- [Actor Patterns](./Patterns.md) - [Sessions](./docs/Sessions.md)
- [Health Checks](../Operations/HealthChecks.md) - [Worker STA Thread](./docs/WorkerSta.md)
- [Configuration](../Configuration/Akka.md) - [Gateway Configuration](./docs/GatewayConfiguration.md)
``` ```
## Naming Conventions ## Naming Conventions
@@ -222,30 +232,33 @@ End each document with links to related topics:
### Match Code Exactly ### Match Code Exactly
Use the exact names from source code: Use the exact names from source code:
- `TemplateInstanceActor` not "Template Instance Actor" - `MxStatusProxy` not "MX status proxy"
- `ScadaGatewayActor` not "SCADA Gateway Actor" - `SessionManager` not "session manager"
- `IRequiredActor<T>` not "required actor interface" - `OrphanWorkerTerminator` not "orphan worker terminator"
### Acronyms ### Acronyms
Spell out on first use, then use acronym: Spell out on first use, then use acronym:
> OPC Unified Architecture (OPC UA) provides industrial communication standards. OPC UA servers expose... > Single-threaded apartment (STA) threads serialize COM calls. STA message
> pumping lets MXAccess events deliver...
Common acronyms that don't need expansion: Common acronyms that don't need expansion:
- API - API
- JSON - JSON
- SQL - SQL
- HTTP/HTTPS - HTTP/HTTPS
- REST - COM
- JWT - gRPC
- IPC
- STA
- UI - UI
### File Paths ### File Paths
Use forward slashes and backticks: Use forward slashes and backticks:
- `src/Infrastructure/Akka/Actors/` - `src/ZB.MOM.WW.MxGateway.Server/`
- `appsettings.json` - `appsettings.json`
- `Documentation/Akka/Overview.md` - `docs/GatewayConfiguration.md`
## What to Avoid ## What to Avoid
@@ -260,13 +273,14 @@ The constructor creates a new instance of the class.
<!-- Better - only document if there's something notable --> <!-- Better - only document if there's something notable -->
## Constructor ## Constructor
The constructor accepts an `IActorRef` for the gateway actor, which must be resolved before actor creation. The constructor accepts a `WorkerPipeSession`, which must be connected before
the session transitions out of `Handshaking`.
``` ```
### Don't Duplicate Source Code Comments ### Don't Duplicate Source Code Comments
If code has good comments, reference the file rather than copying: If code has good comments, reference the file rather than copying:
> See `ScadaGatewayActor.cs` lines 45-60 for the message routing logic. > See `SessionManager.cs` for the open-failure rollback order.
### Don't Include Temporary Information ### Don't Include Temporary Information
@@ -278,5 +292,12 @@ Assume readers know:
- Dependency injection - Dependency injection
- async/await - async/await
- LINQ - LINQ
- Entity Framework basics
- ASP.NET Core middleware pipeline - ASP.NET Core middleware pipeline
- gRPC service basics
## Related Documentation
- [Architecture overview](./gateway.md)
- [Gateway Configuration](./docs/GatewayConfiguration.md)
- [C# Style Guide](./docs/style-guides/CSharpStyleGuide.md)
- [Go Style Guide](./docs/style-guides/GoStyleGuide.md), [Java Style Guide](./docs/style-guides/JavaStyleGuide.md), [Python Style Guide](./docs/style-guides/PythonStyleGuide.md), [Rust Style Guide](./docs/style-guides/RustStyleGuide.md), [Protobuf Style Guide](./docs/style-guides/ProtobufStyleGuide.md)
-3
@@ -32,8 +32,6 @@ clients/dotnet/
Commands/ Commands/
ZB.MOM.WW.MxGateway.Client.Tests/ ZB.MOM.WW.MxGateway.Client.Tests/
ZB.MOM.WW.MxGateway.Client.Tests.csproj ZB.MOM.WW.MxGateway.Client.Tests.csproj
ZB.MOM.WW.MxGateway.Client.IntegrationTests/
ZB.MOM.WW.MxGateway.Client.IntegrationTests.csproj
``` ```
Target framework: Target framework:
@@ -52,7 +50,6 @@ Expected packages:
- `Grpc.Net.Client` - `Grpc.Net.Client`
- `Google.Protobuf` - `Google.Protobuf`
- `Grpc.Tools` for generation
- `Microsoft.Extensions.Logging.Abstractions` - `Microsoft.Extensions.Logging.Abstractions`
- `System.CommandLine` or similar for CLI - `System.CommandLine` or similar for CLI
- test framework: xUnit or NUnit - test framework: xUnit or NUnit
+3
@@ -27,6 +27,9 @@ clients/go/
internal/generated/ internal/generated/
mxaccess_gateway.pb.go mxaccess_gateway.pb.go
mxaccess_gateway_grpc.pb.go mxaccess_gateway_grpc.pb.go
galaxy_repository.pb.go
galaxy_repository_grpc.pb.go
mxaccess_worker.pb.go
cmd/mxgw-go/ cmd/mxgw-go/
main.go main.go
tests/ tests/
+1 -1
@@ -140,7 +140,7 @@ pairs `Children` with `ChildHasChildren` so you know which nodes to expand. See
request and filter semantics. request and filter semantics.
```go ```go
import pb "gitea.dohertylan.com/dohertj2/mxaccessgw/clients/go/internal/generated/galaxy_repository/v1" import pb "gitea.dohertylan.com/dohertj2/mxaccessgw/clients/go/internal/generated"
reply, err := galaxy.BrowseChildren(ctx, &pb.BrowseChildrenRequest{}) reply, err := galaxy.BrowseChildren(ctx, &pb.BrowseChildrenRequest{})
if err != nil { if err != nil {
@@ -250,31 +250,31 @@
"commands": [ "commands": [
{ {
"operation": "open-session", "operation": "open-session",
"command": "gradle :mxgateway-cli:run --args=\"open-session --endpoint localhost:5000 --api-key-env MXGATEWAY_API_KEY --plaintext --client-session-name mxgw-java-smoke --json\"" "command": "gradle :zb-mom-ww-mxgateway-cli:run --args=\"open-session --endpoint localhost:5000 --api-key-env MXGATEWAY_API_KEY --plaintext --client-session-name mxgw-java-smoke --json\""
}, },
{ {
"operation": "register", "operation": "register",
"command": "gradle :mxgateway-cli:run --args=\"register --endpoint localhost:5000 --api-key-env MXGATEWAY_API_KEY --plaintext --session-id <session-id> --client-name mxgw-java-smoke --json\"" "command": "gradle :zb-mom-ww-mxgateway-cli:run --args=\"register --endpoint localhost:5000 --api-key-env MXGATEWAY_API_KEY --plaintext --session-id <session-id> --client-name mxgw-java-smoke --json\""
}, },
{ {
"operation": "add-item", "operation": "add-item",
"command": "gradle :mxgateway-cli:run --args=\"add-item --endpoint localhost:5000 --api-key-env MXGATEWAY_API_KEY --plaintext --session-id <session-id> --server-handle <server-handle> --item TestChildObject.TestInt --json\"" "command": "gradle :zb-mom-ww-mxgateway-cli:run --args=\"add-item --endpoint localhost:5000 --api-key-env MXGATEWAY_API_KEY --plaintext --session-id <session-id> --server-handle <server-handle> --item TestChildObject.TestInt --json\""
}, },
{ {
"operation": "advise", "operation": "advise",
"command": "gradle :mxgateway-cli:run --args=\"advise --endpoint localhost:5000 --api-key-env MXGATEWAY_API_KEY --plaintext --session-id <session-id> --server-handle <server-handle> --item-handle <item-handle> --json\"" "command": "gradle :zb-mom-ww-mxgateway-cli:run --args=\"advise --endpoint localhost:5000 --api-key-env MXGATEWAY_API_KEY --plaintext --session-id <session-id> --server-handle <server-handle> --item-handle <item-handle> --json\""
}, },
{ {
"operation": "stream-events", "operation": "stream-events",
"command": "gradle :mxgateway-cli:run --args=\"stream-events --endpoint localhost:5000 --api-key-env MXGATEWAY_API_KEY --plaintext --session-id <session-id> --limit 1 --json\"" "command": "gradle :zb-mom-ww-mxgateway-cli:run --args=\"stream-events --endpoint localhost:5000 --api-key-env MXGATEWAY_API_KEY --plaintext --session-id <session-id> --limit 1 --json\""
}, },
{ {
"operation": "close-session", "operation": "close-session",
"command": "gradle :mxgateway-cli:run --args=\"close-session --endpoint localhost:5000 --api-key-env MXGATEWAY_API_KEY --plaintext --session-id <session-id> --json\"" "command": "gradle :zb-mom-ww-mxgateway-cli:run --args=\"close-session --endpoint localhost:5000 --api-key-env MXGATEWAY_API_KEY --plaintext --session-id <session-id> --json\""
} }
], ],
"optionalWriteCommand": "gradle :mxgateway-cli:run --args=\"write --endpoint localhost:5000 --api-key-env MXGATEWAY_API_KEY --plaintext --session-id <session-id> --server-handle <server-handle> --item-handle <item-handle> --type int32 --value <write-value> --json\"", "optionalWriteCommand": "gradle :zb-mom-ww-mxgateway-cli:run --args=\"write --endpoint localhost:5000 --api-key-env MXGATEWAY_API_KEY --plaintext --session-id <session-id> --server-handle <server-handle> --item-handle <item-handle> --type int32 --value <write-value> --json\"",
"bundledSmokeCommand": "gradle :mxgateway-cli:run --args=\"smoke --endpoint localhost:5000 --api-key-env MXGATEWAY_API_KEY --plaintext --item TestChildObject.TestInt --json\"" "bundledSmokeCommand": "gradle :zb-mom-ww-mxgateway-cli:run --args=\"smoke --endpoint localhost:5000 --api-key-env MXGATEWAY_API_KEY --plaintext --item TestChildObject.TestInt --json\""
} }
] ]
} }
+1 -1
@@ -212,7 +212,7 @@ Use bounded smoke flow and always attempt `close_session` in `finally`.
Use `pyproject.toml`. Publishable package name should be stable, for example: Use `pyproject.toml`. Publishable package name should be stable, for example:
```text ```text
mxaccess-gateway-client zb-mom-ww-mxaccess-gateway-client
``` ```
Generated protobuf code should be regenerated through a documented command, not Generated protobuf code should be regenerated through a documented command, not
+2 -2
@@ -62,8 +62,8 @@ cargo run -p mxgw-cli -- register --session-id <session-id> --client-name mxgw-r
cargo run -p mxgw-cli -- add-item --session-id <session-id> --server-handle 1 --item TestChildObject.TestInt --json cargo run -p mxgw-cli -- add-item --session-id <session-id> --server-handle 1 --item TestChildObject.TestInt --json
cargo run -p mxgw-cli -- advise --session-id <session-id> --server-handle 1 --item-handle 1 --json cargo run -p mxgw-cli -- advise --session-id <session-id> --server-handle 1 --item-handle 1 --json
cargo run -p mxgw-cli -- stream-events --session-id <session-id> --max-events 1 --json cargo run -p mxgw-cli -- stream-events --session-id <session-id> --max-events 1 --json
cargo run -p mxgw-cli -- stream-alarms --session-id <session-id> --max-messages 1 --json cargo run -p mxgw-cli -- stream-alarms --max-events 1 --json
cargo run -p mxgw-cli -- acknowledge-alarm --session-id <session-id> --alarm-reference "\\Galaxy\Area001.Pump001.PumpFault" --json cargo run -p mxgw-cli -- acknowledge-alarm --reference "\\Galaxy\Area001.Pump001.PumpFault" --json
cargo run -p mxgw-cli -- write --session-id <session-id> --server-handle 1 --item-handle 1 --value-type int32 --value 123 --json cargo run -p mxgw-cli -- write --session-id <session-id> --server-handle 1 --item-handle 1 --value-type int32 --value 123 --json
``` ```
+187 -38
@@ -67,9 +67,17 @@ list.
## What this means ## What this means
The architecture comment on > **Historical note (current as built).** This discovery record predates the
`src/ZB.MOM.WW.MxGateway.Worker/MxAccess/AlarmClientConsumer.cs` (PR A.5) is > as-built alarm path. The `AlarmClientConsumer.cs` file referenced below was
**wrong against this deployed assembly**: > retired; the production consumer is
> `src/ZB.MOM.WW.MxGateway.Worker/MxAccess/WnWrapAlarmConsumer.cs` (driven by the
> `wwAlarmConsumerClass` COM surface — see [Option A](#option-a--captured-2026-05-01)
> below). The current public RPC surface and broker architecture are summarized
> in [Current alarm path (as built)](#current-alarm-path-as-built) at the end of
> this document; the sections in between are kept as a discovery record.
The architecture comment on the (now-retired) `AlarmClientConsumer.cs` (PR A.5)
was **wrong against this deployed assembly**:
> "The AVEVA alarm-manager surface (`IAlarmMgrDataProvider`) exposes > "The AVEVA alarm-manager surface (`IAlarmMgrDataProvider`) exposes
> the events we need as plain .NET events — no Windows message pump > the events we need as plain .NET events — no Windows message pump
@@ -601,8 +609,14 @@ returned to normal but is unacknowledged — i.e., visible in the
"current alarms" list because operator hasn't acked it yet) and "current alarms" list because operator hasn't acked it yet) and
`UNACK_ALM` (the alarm is currently active and unacknowledged). `UNACK_ALM` (the alarm is currently active and unacknowledged).
The other states from `eAlmState` (`ACK_RTN`, `ACK_ALM`) would The other states from `eAlmState` (`ACK_RTN`, `ACK_ALM`) would
appear when an ack is performed — `wwAlarmConsumerClass.AlarmAckByGUID` appear when an ack is performed.
is the method to call.
> **Forward reference / superseded:** an earlier draft named
> `wwAlarmConsumerClass.AlarmAckByGUID` as the ack method. That call turned out
> to be **`E_NOTIMPL`** on this AVEVA build (see
> [`AlarmAckByGUID` is not implemented](#4-alarmackbyguid-is-not-implemented)
> below). The as-built ack path is the v1 6-arg `AlarmAckByName` on a dedicated
> ack-only consumer instance. Do not wire acks through `AlarmAckByGUID`.
### `GetStatistics` AV — unrelated quirk ### `GetStatistics` AV — unrelated quirk
@@ -638,20 +652,25 @@ alarm-consumer surface unblocks A.2 fully. Outline:
payload; diff against the previous snapshot (keyed by payload; diff against the previous snapshot (keyed by
`GUID`); emit `MX_EVENT_FAMILY_ON_ALARM_TRANSITION` `GUID`); emit `MX_EVENT_FAMILY_ON_ALARM_TRANSITION`
events for added/changed/removed records. events for added/changed/removed records.
- `AlarmAckByGUID(VBGUID, comment, oprName, node, domain, - Client-driven acknowledgements. (This draft named `AlarmAckByGUID` and a
fullName)` for client-driven acknowledgements (matches `AlarmAckCommand` payload; as built the ack proto is
PR A.5's `AlarmAckCommand` payload). `AcknowledgeAlarmCommand` / `AcknowledgeAlarmByNameCommand`, the consumer
interface method is `AcknowledgeByGuid` / `AcknowledgeByName`, and the GUID
path is `E_NOTIMPL` so only the by-name path runs — see
[`AlarmAckByGUID` is not implemented](#4-alarmackbyguid-is-not-implemented).)
- Lifecycle teardown: `DeregisterConsumer` + - Lifecycle teardown: `DeregisterConsumer` +
`UninitializeConsumer` + `Marshal.FinalReleaseComObject`. `UninitializeConsumer` + `Marshal.FinalReleaseComObject`.
3. **Conversion layer:** map XML record fields to 3. **Conversion layer:** map XML record fields to the alarm proto:
`MxAlarmConditionRecord` proto: - `GUID` and `PROVIDER_NAME!GROUP.TAGNAME` → `alarm_full_reference` (there is
- `GUID` → `condition_id` (canonicalize the no-dashes hex no `condition_id` field; the public RPC and worker carry the reference as
to a UUID string). `alarm_full_reference`, either a canonical GUID or `Provider!Group.Tag`).
- `STATE` enum → `inAlarm` + `acked` booleans - `STATE` → `AlarmConditionState` on `ActiveAlarmSnapshot.current_state`
(`UNACK_ALM` → in_alarm=true, acked=false; (this draft used `inAlarm` + `acked` booleans, which the proto does not
`UNACK_RTN` → in_alarm=false, acked=false; have). As built, the snapshot state collapses to three values:
`ACK_ALM` → in_alarm=true, acked=true; `UNACK_ALM` → `Active`; `ACK_ALM` → `ActiveAcked`; `UNACK_RTN` and
`ACK_RTN` → in_alarm=false, acked=true). `ACK_RTN` both → `Inactive` (a returned-to-normal alarm is no longer
"active"). For the live `transition` feed the `STATE` instead drives an
`AlarmTransitionKind` (`Raise` / `Acknowledge` / `Clear`).
- `DATE + TIME + GMTOFFSET + DSTADJUST` → reassemble UTC - `DATE + TIME + GMTOFFSET + DSTADJUST` → reassemble UTC
timestamp; matches the worker's existing `Timestamp` timestamp; matches the worker's existing `Timestamp`
wire format. wire format.
@@ -663,10 +682,14 @@ alarm-consumer surface unblocks A.2 fully. Outline:
`aaAlarmManagedClient`, also true here). The existing `aaAlarmManagedClient`, also true here). The existing
`AlarmClientConsumer` skips Initialize entirely; the new `AlarmClientConsumer` skips Initialize entirely; the new
`WnWrapAlarmConsumer` includes it from day one. `WnWrapAlarmConsumer` includes it from day one.
5. **Test reuse:** PR A.5's snapshot/ack contract tests can 5. **Test reuse:** the snapshot/ack contract tests stayed — they don't touch
stay — they don't touch the underlying COM API. Add a new the underlying COM API. As built, the alarm tests live under
integration test against the wnwrap surface (live-AVEVA-only, `src/ZB.MOM.WW.MxGateway.Worker.Tests/MxAccess/` (`AlarmDispatcherTests`,
Skip-gated like the probe). `AlarmRecordTransitionMapperTests`, `AlarmCommandHandlerTests`,
`AlarmCommandExecutorTests`, `WnWrapAlarmConsumerXmlTests`), with the
live-AVEVA-only round-trip in
`src/ZB.MOM.WW.MxGateway.Worker.Tests/Probes/AlarmsLiveSmokeTests.cs`
(Skip-gated like the probe).
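The snapshot-state collapse described in the conversion layer above can be sketched as a lookup table. This is a minimal illustration in Python, not the worker's C# code: the enum and function names here are hypothetical, and only the four `STATE` strings and the three target states (`Active`, `ActiveAcked`, `Inactive`) come from the text above.

```python
from enum import Enum


class AlarmConditionState(Enum):
    # Illustrative stand-in for the proto enum; member spellings are assumed.
    ACTIVE = "Active"
    ACTIVE_ACKED = "ActiveAcked"
    INACTIVE = "Inactive"


# wnwrap STATE string -> snapshot state, following the mapping in the text:
# both returned-to-normal states collapse to Inactive.
_SNAPSHOT_STATE = {
    "UNACK_ALM": AlarmConditionState.ACTIVE,
    "ACK_ALM": AlarmConditionState.ACTIVE_ACKED,
    "UNACK_RTN": AlarmConditionState.INACTIVE,
    "ACK_RTN": AlarmConditionState.INACTIVE,
}


def snapshot_state(state: str) -> AlarmConditionState:
    """Map a wnwrap STATE value to the collapsed snapshot state."""
    try:
        return _SNAPSHOT_STATE[state.strip().upper()]
    except KeyError:
        # Rejecting unknown values is a choice made for this sketch only.
        raise ValueError(f"unrecognized alarm STATE: {state!r}") from None
```

A table keeps the two-to-one collapse of `UNACK_RTN` and `ACK_RTN` visible in one place, which is the part of the mapping most likely to be misread as a four-way split.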
### Settled API-ordering and surface knowledge ### Settled API-ordering and surface knowledge
@@ -752,26 +775,47 @@ AVEVA fixes the v2 method later.
The v2 `AlarmAckByGUID(VBGUID, …)` throws `NotImplementedException`
(COM `E_NOTIMPL`) on `wwAlarmConsumerClass` against this AVEVA
build. The reference→GUID lookup that we initially planned to wire
-through `AlarmAckByGUID` is therefore not viable on wnwrap; all acks
-must go through `AlarmAckByName`.
+through `AlarmAckByGUID` is therefore not viable on wnwrap; only the
+by-name path actually succeeds.

-The proto `AcknowledgeAlarmCommand` (GUID-based) and the worker's
-`MxAccessCommandExecutor.ExecuteAcknowledgeAlarm` switch arm remain
-in the codebase for the forward-compat shape, but the gateway-side
-`WorkerAlarmRpcDispatcher.AcknowledgeAsync` now always routes through
-`AcknowledgeAlarmByName` when the public RPC supplies a recognizable
-`Provider!Group.Tag` reference.
+**Routing as built (and the GUID hazard).** The gateway-side router is
+`GatewayAlarmMonitor.BuildAcknowledgeCommand` (there is no
+`WorkerAlarmRpcDispatcher` type). Routing is **conditional on the reference
+shape**, not unconditional:

-### 5. STA / threading — production fix needed
+- A reference that `Guid.TryParse` accepts is built into
+  `MxCommandKind.AcknowledgeAlarm` / `AcknowledgeAlarmCommand` — the **GUID
+  path**, which the worker dispatches to `AlarmAckByGUID`.
+- A `Provider!Group.Tag` reference (parsed by
+  `GatewayAlarmMonitor.TryParseAlarmReference`) is built into
+  `MxCommandKind.AcknowledgeAlarmByName` / `AcknowledgeAlarmByNameCommand` — the
+  by-name path, which is the only one that succeeds on this build.
+- Anything else fails with an `alarm_full_reference` parse error before any
+  worker call.

-The wnwrap COM is `ThreadingModel=Apartment`. The consumer's
-internal `Timer` fires on threadpool threads and would block forever
-on cross-apartment marshaling unless the host STA pumps Win32
-messages. The smoke test sidesteps this by setting
-`pollIntervalMilliseconds=0` (Timer disabled) and driving `PollOnce`
-manually from the test's STA. Production hosting will route polls
-through the worker's `StaRuntime` in a follow-up — the consumer's
-`PollOnce` is `public` and idempotent so the wire-up is mechanical.
+The GUID arm is **still dispatched unguarded**: the proto
+`AcknowledgeAlarmCommand` and the worker's
+`MxAccessCommandExecutor.ExecuteAcknowledgeAlarm` switch arm remain in the
+codebase for forward compatibility, and `BuildAcknowledgeCommand` routes a
+GUID-shaped reference straight to them. On the deployed wnwrap build that path
+hits the `E_NOTIMPL` `AlarmAckByGUID` and surfaces a `COMException` rather than
+acknowledging. **Practical guidance:** acknowledge with the
+`Provider!Group.Tag` reference (the same form the transition feed emits in
+`alarm_full_reference`), not a raw GUID, until the GUID arm is either guarded or
+AVEVA implements `AlarmAckByGUID`.
+
+### 5. STA / threading
+
+The wnwrap COM is `ThreadingModel=Apartment`, so every consumer call
+(`Subscribe`, `PollOnce`, the `AcknowledgeBy*` methods) must run on the STA that
+created the COM instance. As built, `WnWrapAlarmConsumer` owns **no internal
+timer and takes no `pollIntervalMilliseconds` parameter** — an earlier draft
+described a self-driven `Timer` that would have blocked on cross-apartment
+marshaling, but that design was dropped. Instead `PollOnce()` is a `public`,
+idempotent method the host drives on the worker's STA (via
+`StaRuntime.InvokeAsync(() => consumer.PollOnce())`); the poll cadence lives in
+the host, not the consumer. Each `PollOnce` reads `GetXmlCurrentAlarms2`, diffs
+against the previous snapshot, and emits transition events.
### Capture summary
@@ -790,3 +834,108 @@ Post-ack transition: kind=Clear …
10s cadence held throughout; full proto fields populated correctly;
ack registered server-side without errors.
## Current alarm path (as built)
The sections above are a discovery record. This section summarizes the path that
actually ships, grounded in the current code. For the proto shapes see
[Contracts](./Contracts.md#alarm-rpcs-and-messages); for the server handlers see
[gRPC](./Grpc.md); for configuration see
[Gateway Configuration](./GatewayConfiguration.md#alarm-options).
### Public RPCs and configuration
Alarms are exposed through three **session-less** RPCs on `MxAccessGateway`:
`AcknowledgeAlarm`, `StreamAlarms`, and `QueryActiveAlarms`. No client opens a
worker session to use them. They are gated by `MxGateway:Alarms:*`:
- `MxGateway:Alarms:Enabled` (default `false`) turns the whole subsystem on.
- `MxGateway:Alarms:SubscriptionExpression` is the canonical
`\\<machine>\Galaxy!<area>` subscription; when empty, the monitor falls back
to `\\<MachineName>\Galaxy!<DefaultArea>` from `MxGateway:Alarms:DefaultArea`.
  If `Enabled` is `true` and both are empty, the monitor faults with a configuration diagnostic.
- `MxGateway:Alarms:ReconcileIntervalSeconds` (default 30, floored at 5) sets the
reconcile cadence below.
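
The fallback and floor rules in the list above can be sketched as a small helper. This is a minimal illustration, not the gateway's code: `AlarmOptionsSketch` and its method names are hypothetical, and treating whitespace-only values as empty is an assumption.

```csharp
#nullable enable
using System;

// Hypothetical helper; the real resolution lives in the gateway's alarm monitor.
public static class AlarmOptionsSketch
{
    // Returns the subscription expression to use, or null when neither
    // SubscriptionExpression nor DefaultArea is configured.
    public static string? ResolveSubscription(
        string? subscriptionExpression, string? defaultArea, string machineName)
    {
        if (!string.IsNullOrWhiteSpace(subscriptionExpression))
        {
            return subscriptionExpression;
        }

        if (!string.IsNullOrWhiteSpace(defaultArea))
        {
            // Canonical form: \\<MachineName>\Galaxy!<DefaultArea>
            return $@"\\{machineName}\Galaxy!{defaultArea}";
        }

        // Caller is expected to fault with a configuration diagnostic.
        return null;
    }

    // ReconcileIntervalSeconds is floored at 5 seconds.
    public static TimeSpan ReconcileInterval(int reconcileIntervalSeconds) =>
        TimeSpan.FromSeconds(Math.Max(5, reconcileIntervalSeconds));
}
```

With this sketch, an empty expression plus `DefaultArea = "Area1"` on a machine named `HOST` resolves to `\\HOST\Galaxy!Area1`, and a configured interval of 2 seconds is raised to 5.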
### The always-on `GatewayAlarmMonitor` broker
`GatewayAlarmMonitor` (`src/ZB.MOM.WW.MxGateway.Server/Alarms/GatewayAlarmMonitor.cs`)
is registered by `AddGatewayAlarms` as a singleton, as the `IGatewayAlarmService`,
and as a hosted `BackgroundService`. When `Enabled`, it:
1. Opens **one** gateway-managed worker session dedicated to alarms (client name
`gateway-alarm-monitor`, backend `Galaxy`), after a brief startup grace so
worker launching and orphan cleanup settle.
2. Subscribes that session to the resolved subscription expression and feeds an
in-process active-alarm cache (`Dictionary<reference, ActiveAlarmSnapshot>`)
from the session's transition events.
3. Fans the feed out to **any number** of `StreamAlarms` subscribers — clients
never open their own session. The session is transparently re-opened with a
5-second backoff if the worker faults.
### `AlarmFeedMessage` stream protocol
`StreamAsync` (behind `StreamAlarms`) emits, in order:
1. one `AlarmFeedMessage { active_alarm }` per currently-cached alarm matching
the optional `alarm_filter_prefix`,
2. a single `AlarmFeedMessage { snapshot_complete = true }` sentinel,
3. then one `AlarmFeedMessage { transition }` per live change.
The subscriber is registered under the monitor lock **before** the snapshot is
taken, so no transition can slip between the snapshot and the live tail.
`QueryActiveAlarms` reuses the same cache but emits only the `active_alarm`
snapshots and completes — no sentinel, no transitions.
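
The opening phase of the stream (cached alarms, then the sentinel) might look roughly like the sketch below. The record types stand in for the `AlarmFeedMessage` oneof cases and are hypothetical, and matching `alarm_filter_prefix` with an ordinal `StartsWith` is an assumption about the filter semantics.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical stand-ins for the AlarmFeedMessage cases.
public abstract record FeedMessage;
public sealed record ActiveAlarm(string Reference) : FeedMessage;
public sealed record SnapshotComplete : FeedMessage;

public static class StreamOpeningSketch
{
    // Builds the messages a new subscriber sees before any live transition:
    // one ActiveAlarm per matching cached alarm, then a single sentinel.
    public static List<FeedMessage> BuildOpening(
        IEnumerable<string> cachedReferences, string? filterPrefix)
    {
        var messages = new List<FeedMessage>();
        foreach (string reference in cachedReferences)
        {
            bool matches = string.IsNullOrEmpty(filterPrefix)
                || reference.StartsWith(filterPrefix, StringComparison.Ordinal);
            if (matches)
            {
                messages.Add(new ActiveAlarm(reference));
            }
        }

        messages.Add(new SnapshotComplete());
        return messages;
    }
}
```

A `QueryActiveAlarms`-style call would, under the same assumptions, emit only the `ActiveAlarm` entries and omit the sentinel.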
### Reconcile loop
A `PeriodicTimer` runs `ReconcileAsync` every
`max(5, ReconcileIntervalSeconds)` seconds. It pulls the worker's authoritative
active-alarm snapshot and replaces the cache, broadcasting a synthetic `Clear`
transition for any cached alarm the snapshot no longer contains and a synthetic
`Raise` for any alarm the snapshot adds. This catches transitions the live
poll-and-diff feed missed (e.g. across a transport blip). A failed reconcile
pass logs at Debug and keeps the current cache.
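
The cache-versus-snapshot comparison behind the synthetic transitions can be illustrated as a set difference. This is a sketch under assumptions: `ReconcileSketch` is a hypothetical name, and it compares alarms by reference only, whereas the real `ReconcileAsync` may also compare state.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper illustrating the reconcile diff.
public static class ReconcileSketch
{
    // Cleared: cached alarms the authoritative snapshot no longer contains.
    // Raised:  snapshot alarms the cache did not know about.
    public static (List<string> Cleared, List<string> Raised) Diff(
        IReadOnlyCollection<string> cached, IReadOnlyCollection<string> snapshot)
    {
        var cachedSet = new HashSet<string>(cached, StringComparer.Ordinal);
        var snapshotSet = new HashSet<string>(snapshot, StringComparer.Ordinal);

        List<string> cleared = cached.Where(r => !snapshotSet.Contains(r)).ToList();
        List<string> raised = snapshot.Where(r => !cachedSet.Contains(r)).ToList();
        return (cleared, raised);
    }
}
```

Each entry in `Cleared` would become a synthetic `Clear` transition and each entry in `Raised` a synthetic `Raise`, after which the snapshot replaces the cache.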
### Subscriber backpressure
Each subscriber gets a bounded channel of **2048** messages
(`SubscriberQueueCapacity`). When `Broadcast` cannot write to a subscriber (its
channel is full), that subscriber is **completed with an error and dropped** —
the error message tells the client to reconnect to re-snapshot. Backpressure
from one slow consumer never blocks the broker or other subscribers.
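
A drop-on-full broadcast of this kind can be sketched with `System.Threading.Channels`. The class below is hypothetical and simplified (no snapshot phase, no cancellation); only the 2048 capacity and the complete-with-error behaviour come from the text above.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Channels;

// Hypothetical broadcaster; not the gateway's GatewayAlarmMonitor.
public sealed class BroadcastSketch<T>
{
    private readonly object gate = new object();
    private readonly List<Channel<T>> subscribers = new List<Channel<T>>();
    private readonly int capacity;

    // 2048 mirrors SubscriberQueueCapacity; the parameter exists for testing.
    public BroadcastSketch(int capacity = 2048) => this.capacity = capacity;

    public ChannelReader<T> Subscribe()
    {
        Channel<T> channel = Channel.CreateBounded<T>(capacity);
        lock (gate) { subscribers.Add(channel); }
        return channel.Reader;
    }

    public void Broadcast(T message)
    {
        lock (gate)
        {
            // TryWrite never blocks: it returns false when the queue is full,
            // so a slow subscriber is dropped instead of stalling the others.
            subscribers.RemoveAll(channel =>
            {
                if (channel.Writer.TryWrite(message))
                {
                    return false;
                }

                channel.Writer.TryComplete(new InvalidOperationException(
                    "Subscriber fell behind; reconnect to re-snapshot."));
                return true;
            });
        }
    }

    public int SubscriberCount
    {
        get { lock (gate) { return subscribers.Count; } }
    }
}
```

The dropped reader can still drain the messages already queued; once it has, its `Completion` task faults with the error above.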
### Snapshot state collapse
`ActiveAlarmSnapshot.current_state` carries only three `AlarmConditionState`
values, so the four AVEVA `STATE`s collapse: `UNACK_ALM` → `Active`,
`ACK_ALM` → `ActiveAcked`, and both `UNACK_RTN` and `ACK_RTN` → `Inactive`
(`AlarmDispatcher`). A returned-to-normal alarm is reported as `Inactive` in a
snapshot even though it is still listed because it is unacknowledged. The live
`transition` feed instead reports `AlarmTransitionKind` (`Raise` / `Acknowledge`
/ `Clear`).
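
The four-to-three collapse reads naturally as a switch. In this sketch the enum is a local stand-in for the proto-generated `AlarmConditionState` (whose generated member names may differ), and throwing on an unrecognised `STATE` is an assumption.

```csharp
using System;

// Local stand-in for the proto enum; member names follow the prose above.
public enum AlarmConditionState { Active, ActiveAcked, Inactive }

public static class StateCollapseSketch
{
    // Maps the four AVEVA STATE strings onto the three snapshot states.
    public static AlarmConditionState FromAvevaState(string state) => state switch
    {
        "UNACK_ALM" => AlarmConditionState.Active,
        "ACK_ALM" => AlarmConditionState.ActiveAcked,
        // Both returned-to-normal states collapse to Inactive.
        "UNACK_RTN" or "ACK_RTN" => AlarmConditionState.Inactive,
        _ => throw new ArgumentOutOfRangeException(nameof(state), state, "Unknown AVEVA STATE."),
    };
}
```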
### `alarm_full_reference` parse contract
`AcknowledgeAlarm` accepts either form in `alarm_full_reference`
(`GatewayAlarmMonitor.BuildAcknowledgeCommand`):
- a canonical GUID (`Guid.TryParse`) → GUID ack path
(`AcknowledgeAlarmCommand`), which on the deployed wnwrap build hits the
`E_NOTIMPL` `AlarmAckByGUID` — see
[`AlarmAckByGUID` is not implemented](#4-alarmackbyguid-is-not-implemented);
- a `Provider!Group.Tag` reference (`TryParseAlarmReference`: first `!` splits
provider from `Group.Tag`, the first `.` after the `!` splits group from tag)
→ by-name ack path (`AcknowledgeAlarmByNameCommand`), the path that works;
- anything else → a parse error before any worker call.
The transition feed emits the `Provider!Group.Tag` form in
`alarm_full_reference`, so echoing that value back into `AcknowledgeAlarm` takes
the working by-name path.
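
The parse contract can be sketched as follows. `AckRoutingSketch` and `AckRoute` are hypothetical names, and rejecting empty provider, group, or tag segments is an assumption; the split rule (first `!`, then the first `.` after it) and the GUID-first ordering follow the list above.

```csharp
using System;

public enum AckRoute { Guid, ByName, ParseError }

// Hypothetical sketch of the routing decision; not the gateway's code.
public static class AckRoutingSketch
{
    // Splits Provider!Group.Tag: the first '!' ends the provider and the
    // first '.' after it ends the group; the rest (dots included) is the tag.
    public static bool TryParseAlarmReference(
        string reference, out string provider, out string group, out string tag)
    {
        provider = group = tag = string.Empty;

        int bang = reference.IndexOf('!');
        if (bang <= 0)
        {
            return false;
        }

        int dot = reference.IndexOf('.', bang + 1);
        if (dot <= bang + 1 || dot == reference.Length - 1)
        {
            return false;
        }

        provider = reference.Substring(0, bang);
        group = reference.Substring(bang + 1, dot - bang - 1);
        tag = reference.Substring(dot + 1);
        return true;
    }

    public static AckRoute Classify(string reference)
    {
        // GUID-shaped references are checked first, mirroring the list above.
        if (Guid.TryParse(reference, out _))
        {
            return AckRoute.Guid;
        }

        return TryParseAlarmReference(reference, out _, out _, out _)
            ? AckRoute.ByName
            : AckRoute.ParseError;
    }
}
```

Under these assumptions, `Galaxy!Area1.Tank.Level` parses to provider `Galaxy`, group `Area1`, and tag `Tank.Level`, and classifies as `ByName`.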
### Reserved / unused
`AlarmTransitionKind.RETRIGGER` is defined in the proto but is **not currently
produced** — the transition mapper emits only `Raise` / `Acknowledge` / `Clear`.
It is reserved for a future "re-raise from a previously cleared condition"
distinction.
+68 -50
@@ -2,11 +2,13 @@
The gateway authentication subsystem verifies inbound API key credentials against a SQLite-backed key store, hashes secrets with a configurable pepper, and records administrative and verification events to an audit trail.

+The peppered-HMAC API-key pipeline — token format, parsing, secret generation and hashing, constant-time comparison, the SQLite schema, the stores, the verifier, and the migrator — lives in the shared `ZB.MOM.WW.Auth.ApiKeys` package (with abstractions in `ZB.MOM.WW.Auth.Abstractions`), of which this gateway is the donor. The gateway references the package and binds the library's `ApiKeyOptions` from its own `MxGateway:Authentication` section through `AddSqliteAuthStore`, then layers the gateway-specific pieces on top: constraint enforcement, the gRPC authorization interceptor, the admin CLI, the dashboard API Keys page, and canonical audit forwarding. Types whose code is shown below for reference are owned by the shared package unless noted; the gateway does not re-implement them.

## Token Format

API keys travel in the HTTP `Authorization` header as a bearer token shaped `mxgw_<keyId>_<secret>`. The `mxgw_` prefix scopes parsing to gateway tokens, the `<keyId>` segment is the public identifier used for lookup, and `<secret>` is the high-entropy portion that the gateway verifies against a stored hash.

-`ApiKeyParser` enforces the format and rejects malformed tokens before any database round-trip:
+The shared library's `ApiKeyParser` enforces the format and rejects malformed tokens before any database round-trip:

```csharp
public bool TryParseAuthorizationHeader(string? authorizationHeader, out ParsedApiKey? apiKey)
@@ -50,7 +52,7 @@ public static string Generate()
### Peppered hashing

-`ApiKeySecretHasher` (registered behind `IApiKeySecretHasher`) hashes secrets with `HMACSHA256` keyed by a server-side pepper. The pepper lives outside the database and is resolved by `IConfiguration` lookup against the configured `PepperSecretName`:
+The shared library's `ApiKeySecretHasher` (behind `IApiKeySecretHasher`) hashes secrets with `HMACSHA256` keyed by a server-side pepper. The pepper lives outside the database and is resolved through an `IApiKeyPepperProvider` — the gateway wires the configuration-backed provider so the pepper comes from `IConfiguration` lookup against `MxGateway:ApiKeyPepper` (`PepperSecretName`):

```csharp
public byte[] HashSecret(string secret)
@@ -69,37 +71,29 @@ The pepper is intentionally not stored alongside the hash: an attacker who exfil
## Verification

-`ApiKeyVerifier` (`IApiKeyVerifier`) implements the verification flow:
+The shared library's `IApiKeyVerifier.VerifyAsync(authorizationHeader, cancellationToken)` owns the whole verification flow — the gateway interceptor hands it the raw `authorization` header value and never parses the token itself:

-1. Parse the `Authorization` header into a `ParsedApiKey`.
-2. Look up the `ApiKeyRecord` by `KeyId` through `IApiKeyStore.FindByKeyIdAsync`.
-3. Reject revoked records (`RevokedUtc is not null`).
+1. Parse the `Authorization` header into the key id and secret.
+2. Look up the record by key id.
+3. Reject revoked records.
4. Hash the presented secret with the configured pepper.
5. Compare hashes with `CryptographicOperations.FixedTimeEquals` to avoid timing oracles.
-6. Record a `LastUsedUtc` timestamp via `MarkKeyUsedAsync` and return an `ApiKeyIdentity`.
+6. Stamp `last_used_utc` and return an identity.

+`VerifyAsync` returns an `ApiKeyVerification` value with a `Succeeded` flag and a nullable `Identity`. On failure the result is discriminated so the caller can tell parse errors, missing pepper, missing or revoked keys, and secret mismatch apart for audit detail — without leaking which check failed to the client. The gateway interceptor treats any non-success uniformly as `Unauthenticated` (see [Authorization](./Authorization.md)):

```csharp
-if (!CryptographicOperations.FixedTimeEquals(presentedHash, storedKey.SecretHash))
-{
-    return ApiKeyVerificationResult.Fail(ApiKeyVerificationFailure.SecretMismatch);
-}
-
-await keyStore.MarkKeyUsedAsync(storedKey.KeyId, DateTimeOffset.UtcNow, cancellationToken)
+ApiKeyVerification verification = await apiKeyVerifier
+    .VerifyAsync(authorizationHeader ?? string.Empty, context.CancellationToken)
    .ConfigureAwait(false);

-return ApiKeyVerificationResult.Success(new ApiKeyIdentity(
-    KeyId: storedKey.KeyId,
-    KeyPrefix: storedKey.KeyPrefix,
-    DisplayName: storedKey.DisplayName,
-    Scopes: storedKey.Scopes,
-    Constraints: storedKey.Constraints));
+if (!verification.Succeeded || verification.Identity is null)
+{
+    throw new RpcException(new Status(StatusCode.Unauthenticated, "Missing or invalid API key."));
+}
```

-`ApiKeyVerificationResult` carries either an `ApiKeyIdentity` or a discriminated `ApiKeyVerificationFailure` value. The failure enum distinguishes parse errors, missing pepper, missing or revoked keys, and secret mismatch so the calling middleware can emit precise audit detail without leaking which check failed to the client.
-
-`ApiKeyIdentity` exposes only non-secret fields (`KeyId`, `KeyPrefix`,
-`DisplayName`, `Scopes`, and `Constraints`) and is the type downstream
-authorization code consumes.
+The shared verifier returns `ZB.MOM.WW.Auth.Abstractions.ApiKeys.ApiKeyIdentity`, which carries the persisted constraints as an opaque JSON string. The gateway's `GatewayApiKeyIdentityMapper.ToGatewayIdentity` projects it onto the gateway-local `ApiKeyIdentity` record, which exposes only non-secret fields (`KeyId`, `KeyPrefix`, `DisplayName`, `Scopes`) plus the deserialized `Constraints`, and is the type downstream authorization code consumes.

## Storage
@@ -107,7 +101,7 @@ The gateway keeps API key state in a dedicated SQLite database. SQLite is suffic
### Connection factory

-`AuthSqliteConnectionFactory` reads `GatewayOptions.Authentication.SqlitePath`, ensures the parent directory exists, and builds a connection string in `ReadWriteCreate` mode so first-run installations can create the file without manual provisioning. Connection pooling is enabled and the connection string carries a non-zero `DefaultTimeout`:
+The shared library's `AuthSqliteConnectionFactory` (registered by `AddZbApiKeyAuth`) reads the bound `ApiKeyOptions.SqlitePath` (which the gateway populates from `MxGateway:Authentication:SqlitePath`), ensures the parent directory exists, and builds a connection string in `ReadWriteCreate` mode so first-run installations can create the file without manual provisioning. Connection pooling is enabled and the connection string carries a non-zero `DefaultTimeout`:

```csharp
SqliteConnectionStringBuilder builder = new()
@@ -119,21 +113,22 @@ SqliteConnectionStringBuilder builder = new()
};
```
-Every store opens its connection through `OpenConnectionAsync`, which opens the connection and then applies `PRAGMA journal_mode=WAL` and `PRAGMA busy_timeout`. WAL is a persistent database-level setting so re-applying it per connection is a cheap no-op; `busy_timeout` is per-connection state. Because `MarkKeyUsedAsync` runs on every authenticated request and `SqliteApiKeyAuditStore` appends on every denial, this lets concurrent readers and writers retry briefly instead of surfacing `SQLITE_BUSY` as a hard failure on the request path.
+Every store opens its connection through `OpenConnectionAsync`, which opens the connection and then applies `PRAGMA journal_mode=WAL` and `PRAGMA busy_timeout`. WAL is a persistent database-level setting so re-applying it per connection is a cheap no-op; `busy_timeout` is per-connection state. Because `MarkKeyUsedAsync` runs on every authenticated request and the canonical audit writer appends to the same file, this lets concurrent readers and writers retry briefly instead of surfacing `SQLITE_BUSY` as a hard failure on the request path.
### Schema

-`SqliteAuthSchema` declares table names and the current schema version as constants. Three tables are involved:
+The shared library's `SqliteAuthSchema` declares the API-key table names and the current schema version as constants. Four tables live in the database file:

- `api_keys` stores `key_id`, `key_prefix`, the `secret_hash` blob,
  `display_name`, serialized `scopes`, optional serialized `constraints`, and
  the `created_utc`, `last_used_utc`, and `revoked_utc` timestamps.
-- `api_key_audit` is an append-only log keyed by an autoincrement `audit_id` with `key_id`, `event_type`, `remote_address`, `created_utc`, and `details` columns.
+- `api_key_audit` is the shared library's append-only audit log keyed by an autoincrement `audit_id` with `key_id`, `event_type`, `remote_address`, `created_utc`, and `details` columns. The gateway overrides the library audit store (see [Audit trail](#audit-trail)), so this table is **left in place but unused** at runtime — nothing writes to it.
+- `audit_event` is the gateway-owned canonical audit table written by `SqliteCanonicalAuditStore`. It lives in the same SQLite file (reusing the library's `AuthSqliteConnectionFactory`) and is where every gateway audit event actually lands. See [Audit trail](#audit-trail).
- `schema_version` carries a single row whose `version` column is matched against `SqliteAuthSchema.CurrentVersion`.
### Read paths

-`SqliteApiKeyStore` (`IApiKeyStore`) handles the two reads needed at request time: `FindByKeyIdAsync` returns any record (so revoked keys can be reported distinctly) and `FindActiveByKeyIdAsync` filters to non-revoked rows. `MarkKeyUsedAsync` updates `last_used_utc` only for non-revoked rows so a freshly revoked key cannot have its timestamp refreshed by a racing verification.
+The shared library's `SqliteApiKeyStore` (`IApiKeyStore`) handles the two reads needed at request time: `FindByKeyIdAsync` returns any record (so revoked keys can be reported distinctly) and `FindActiveByKeyIdAsync` filters to non-revoked rows. `MarkKeyUsedAsync` updates `last_used_utc` only for non-revoked rows so a freshly revoked key cannot have its timestamp refreshed by a racing verification.

`ApiKeyRecord` is the in-memory projection. `ApiKeyRecordReader.Read` is shared by every read path so column ordering is defined in one place:
@@ -155,17 +150,21 @@ public static ApiKeyRecord Read(SqliteDataReader reader)
### Write paths

-`SqliteApiKeyAdminStore` (`IApiKeyAdminStore`) implements administrative mutations: `CreateAsync` accepts an `ApiKeyCreateRequest`, `RevokeAsync` sets `revoked_utc` only when not already revoked, `RotateAsync` replaces `secret_hash`, clears `last_used_utc`, and clears `revoked_utc` so a rotated key is immediately usable, and `DeleteAsync` permanently removes a row but only when `revoked_utc IS NOT NULL` — active keys are untouched (returns false) so the revoke event lands in the audit log before the row disappears.
+The shared library's `SqliteApiKeyAdminStore` (`IApiKeyAdminStore`) implements administrative mutations: `CreateAsync` accepts an `ApiKeyCreateRequest`, `RevokeAsync` sets `revoked_utc` only when not already revoked, `RotateAsync` replaces `secret_hash`, clears `last_used_utc`, and clears `revoked_utc` so a rotated key is immediately usable, and `DeleteAsync` permanently removes a row but only when `revoked_utc IS NOT NULL` — active keys are untouched (returns false) so the revoke event lands in the audit log before the row disappears.

Because `RotateAsync` clears `revoked_utc`, rotating a previously revoked key reactivates it. The dashboard API Keys page therefore offers the Rotate (and Revoke) actions only for keys whose status is `Active`; revoked keys instead show a Delete action that calls `DeleteAsync`, so an operator can permanently remove a revoked row without ever risking un-revocation as a side effect of a rotation.

### Audit trail

-`SqliteApiKeyAuditStore` (`IApiKeyAuditStore`) appends `ApiKeyAuditEntry` values to the `api_key_audit` table and stamps each row with a UTC timestamp inside the store rather than trusting the caller. `ListRecentAsync` returns the most recent rows ordered by `audit_id` descending and projects them into `ApiKeyAuditRecord`. Rows are kept even after the referenced key is revoked because the audit history is the durable record of administrative action; the `key_id` column is nullable to accommodate non-key-scoped events such as `init-db`.
+All gateway audit flows through a single canonical `AuditEvent` written to the gateway-owned `audit_event` table, not the shared library's `api_key_audit` table. The gateway adopts `ZB.MOM.WW.Audit` and **overrides** the library's `IApiKeyAuditStore` registration with `CanonicalForwardingApiKeyAuditStore`. That adapter receives each library-emitted `ApiKeyAuditEntry` — including the library-internal admin-command verbs (`create-key`, `revoke-key`, `rotate-key`, `init-db`) the gateway cannot edit — canonicalizes it onto an `AuditEvent`, and forwards it through `IAuditWriter` (`CanonicalAuditWriter`), which persists to `audit_event` via `SqliteCanonicalAuditStore`.
+
+Because the adapter is registered after `AddZbApiKeyAuth`, it is the `IApiKeyAuditStore` that the admin commands resolve and that the dashboard "recent audit" view reads through `IApiKeyAuditStore.ListRecentAsync`. The library's own `SqliteApiKeyAuditStore` and its `api_key_audit` table are therefore unused at runtime — the override is the only writer. Audit rows are kept even after the referenced key is revoked because the audit history is the durable record of administrative action; non-key-scoped events such as `init-db` carry no key id.
+
+This canonical-forwarding wiring lives under `src/ZB.MOM.WW.MxGateway.Server/Security/Audit/`; the audit store override and writer are gateway types, while the entry shape and admin verbs originate in the shared library.

## Migration

-Schema bring-up is centralised behind `IAuthStoreMigrator`. `SqliteAuthStoreMigrator` executes the migration inside a single transaction so a partial failure leaves the database untouched, refuses to start when the on-disk schema version is newer than the binary supports, and idempotently creates the v1 schema:
+Schema bring-up for the API-key tables is owned by the shared library's `SqliteAuthStoreMigrator`, wired by `AddZbApiKeyAuth` along with its migration hosted service. It executes the migration inside a single transaction so a partial failure leaves the database untouched, refuses to start when the on-disk schema version is newer than the binary supports, and idempotently creates the schema:

```csharp
if (existingVersion > SqliteAuthSchema.CurrentVersion)
@@ -179,13 +178,11 @@ await ApplyVersionOneAsync(connection, transaction, cancellationToken).Configure
await transaction.CommitAsync(cancellationToken).ConfigureAwait(false);
```

-`AuthStoreMigrationHostedService` runs the migrator at startup, but only when API-key authentication is enabled and `RunMigrationsOnStartup` is true. Operators who manage schema out-of-band can disable the hosted run and use the admin CLI's `init-db` command instead.
-
-`AuthStoreMigrationException` is a sealed `InvalidOperationException` so it can be caught precisely without swallowing unrelated failures.
+The library's migration hosted service runs the migrator at startup. Operators who manage schema out-of-band can use the admin CLI's `init-db` command instead.

## Admin CLI

-`ApiKeyAdminCommandLineParser.Parse` recognises a leading `apikey` argument and dispatches to one of the subcommands declared by `ApiKeyAdminCommandKind`. Each parsed invocation produces an `ApiKeyAdminCommand` (or an `ApiKeyAdminParseResult` carrying an error). `ApiKeyAdminCliRunner` then executes the command, runs the migrator first, calls the relevant store method, appends an audit row, and writes either text or JSON output via `ApiKeyAdminOutput`. The returned `ApiKeyAdminListedKey` projection deliberately omits the `secret_hash` so listing a database does not surface hash material.
+`ApiKeyAdminCommandLineParser.Parse` (a gateway type) recognises a leading `apikey` argument and dispatches to one of the subcommands declared by `ApiKeyAdminCommandKind`. Each parsed invocation produces an `ApiKeyAdminCommand` (or an `ApiKeyAdminParseResult` carrying an error). The parser validates requested `--scopes` against `GatewayScopes.All` (see [Authorization](./Authorization.md#scope-catalog)) so a non-canonical scope string cannot be persisted on a key. `ApiKeyAdminCliRunner` then drives the shared library's `ApiKeyAdminCommands` — which the gateway registers over the already-wired stores, pepper provider, and migrator — to execute the command, and writes either text or JSON output via `ApiKeyAdminOutput`. The returned `ApiKeyAdminListedKey` projection deliberately omits the `secret_hash` so listing a database does not surface hash material.

The supported subcommands match `ApiKeyAdminCommandKind` exactly:
@@ -201,7 +198,7 @@ Examples:
```bash
mxgateway apikey init-db
-mxgateway apikey create-key --key-id ops.alice --display-name "Alice (ops)" --scopes read,write
+mxgateway apikey create-key --key-id ops.alice --display-name "Alice (ops)" --scopes invoke:read,invoke:write
mxgateway apikey create-key --key-id area1.reader --display-name "Area 1 reader" --scopes invoke:read,metadata:read --read-subtree "Area1/*" --browse-subtree "Area1/*"
mxgateway apikey list-keys --json
mxgateway apikey revoke-key --key-id ops.alice
@@ -226,7 +223,7 @@ confirmation dialog and emits its own audit event
## Scope Serialization

-Scopes are persisted as a single TEXT column rather than a join table because the set is small, never queried by membership at the database level, and changes atomically with the owning row. `ApiKeyScopeSerializer.Serialize` writes a JSON array sorted with `StringComparer.Ordinal` so equivalent scope sets produce byte-identical column values, which makes audit diffing and database comparisons deterministic:
+Scopes are persisted as a single TEXT column rather than a join table because the set is small, never queried by membership at the database level, and changes atomically with the owning row. The shared library's `ApiKeyScopeSerializer.Serialize` writes a JSON array sorted with `StringComparer.Ordinal` so equivalent scope sets produce byte-identical column values, which makes audit diffing and database comparisons deterministic:

```csharp
public static string Serialize(IReadOnlySet<string> scopes)
@@ -249,29 +246,50 @@ public static IReadOnlySet<string> Deserialize(string value)
`Deserialize` tolerates an empty column by returning an empty set so older rows or hand-edited records do not crash the verifier.
## Dashboard Cookie and Hub Token
The API-key model above guards the gRPC surface. Interactive dashboard requests use a separate LDAP-backed cookie scheme (see [Gateway Dashboard Design](./GatewayDashboardDesign.md)). Two timeouts and a few configuration knobs govern that cookie:
- **Cookie idle timeout — 8 hours.** `DashboardServiceCollectionExtensions` applies the shared `ZbCookieDefaults.Apply` hardened cookie defaults (HttpOnly, `SameSite=Strict`, secure policy, sliding expiration) but overrides the library's 30-minute default with an 8-hour idle timeout, so an active operator is not signed out mid-shift. The expiration is sliding, so each authenticated request resets the window.
- **Hub bearer token — 30 minutes.** SignalR hub connections cannot always carry the HttpOnly cookie (the client SignalR JS may resolve the cookie scope to loopback), so the dashboard mints a short-lived data-protected bearer at `/hubs/token` via `HubTokenService`. The token lifetime is 30 minutes; the hubs accept either it or the cookie.
- **`MxGateway:Dashboard:CookieName`** overrides the cookie name (default `MxGatewayDashboard`, from `DashboardAuthenticationDefaults.CookieName`). Two gateway instances on the same host but different ports share a cookie scope — host+path, not port — so giving each a distinct name keeps their dashboard sessions from clobbering each other. Changing it signs out existing sessions on next deploy.
- **`MxGateway:Dashboard:RequireHttpsCookie`** (default `true`) restricts the cookie to HTTPS via `CookieSecurePolicy.Always`. Set it to `false` for plain-HTTP dev so the cookie uses `SameAsRequest`; leaving it `true` while serving the dashboard over plain HTTP from a non-localhost host breaks login, because browsers drop Secure cookies set over HTTP.
The dashboard issues claims through the shared `ZB.MOM.WW.Auth.AspNetCore.ZbClaimTypes` (e.g. `ZbClaimTypes.Username` = `zb:username`, `ZbClaimTypes.Name` = `ClaimTypes.Name` so `Identity.Name` resolves, `ZbClaimTypes.Role` = `ClaimTypes.Role` so `IsInRole`/`[Authorize(Roles=...)]` work). Cookie hardening defaults come from `ZbCookieDefaults`. Both live in the shared Auth packages, not the gateway.
## Registration

-`AuthStoreServiceCollectionExtensions.AddSqliteAuthStore` wires every service in this subsystem as a singleton and registers the migration hosted service:
+`AuthStoreServiceCollectionExtensions.AddSqliteAuthStore` is the gateway entry point. It does not register the parser, hasher, verifier, stores, or migrator directly — those come from the shared package. Instead it delegates to the package's `AddZbApiKeyAuth` and then layers the gateway-specific audit and CLI services:

```csharp
-public static IServiceCollection AddSqliteAuthStore(this IServiceCollection services)
+public static IServiceCollection AddSqliteAuthStore(
+    this IServiceCollection services,
+    IConfiguration configuration)
{
-    services.AddSingleton<IApiKeyParser, ApiKeyParser>();
-    services.AddSingleton<IApiKeySecretHasher, ApiKeySecretHasher>();
-    services.AddSingleton<IApiKeyVerifier, ApiKeyVerifier>();
+    // Register the shared API-key provider: binds ApiKeyOptions from MxGateway:Authentication,
+    // wires up the SQLite stores, the configuration-backed pepper provider, the verifier, the
+    // migrator and the migration hosted service.
+    services.AddZbApiKeyAuth(effectiveConfig, AuthenticationSectionPath);
+
+    // Gateway-owned canonical audit (ZB.MOM.WW.Audit) in the same SQLite file.
+    services.AddSingleton(sp =>
+        new SqliteCanonicalAuditStore(sp.GetRequiredService<AuthSqliteConnectionFactory>()));
+    services.AddSingleton<IAuditWriter>(sp => new CanonicalAuditWriter(/* ... */));
+
+    // Override the library's IApiKeyAuditStore so every audit lands in audit_event.
+    services.AddSingleton<IApiKeyAuditStore, CanonicalForwardingApiKeyAuditStore>();
+
+    // The shared admin command set, driven by the gateway CLI and dashboard.
+    services.AddSingleton(sp => new ApiKeyAdminCommands(/* ... */));
    services.AddSingleton<ApiKeyAdminCliRunner>();
-    services.AddSingleton<AuthSqliteConnectionFactory>();
-    services.AddSingleton<IAuthStoreMigrator, SqliteAuthStoreMigrator>();
-    services.AddSingleton<IApiKeyStore, SqliteApiKeyStore>();
-    services.AddSingleton<IApiKeyAdminStore, SqliteApiKeyAdminStore>();
-    services.AddSingleton<IApiKeyAuditStore, SqliteApiKeyAuditStore>();
-    services.AddHostedService<AuthStoreMigrationHostedService>();

    return services;
}
```

-Singletons are safe because each operation opens its own short-lived `SqliteConnection` through the factory; there is no shared mutable state inside the services.
+The gateway pins its own API-key contract — token prefix `mxgw` and the pepper key `MxGateway:ApiKeyPepper` — by layering those as fallback defaults under the supplied configuration before calling `AddZbApiKeyAuth`, because `ApiKeyOptions` is an init-only record that must be bound with those values present rather than mutated afterward. Explicit configuration still wins. `AddZbApiKeyAuth` binds `ApiKeyOptions` from the `MxGateway:Authentication` section and registers the connection factory, stores, pepper provider, verifier, migrator, and migration hosted service.
+
+The audit-store override is registered *after* `AddZbApiKeyAuth` so it replaces the library's `TryAddSingleton` registration. The shared admin command set is not auto-registered by `AddZbApiKeyAuth`, so the gateway registers `ApiKeyAdminCommands` itself over the wired stores; the CLI and dashboard drive it. Library services are singletons and safe because each operation opens its own short-lived `SqliteConnection` through the factory.
## Related Documentation ## Related Documentation
+25 -12
@@ -58,32 +58,34 @@ if (options.Value.Authentication.Mode == AuthenticationMode.Disabled)
} }
string? authorizationHeader = context.RequestHeaders.GetValue("authorization"); string? authorizationHeader = context.RequestHeaders.GetValue("authorization");
ApiKeyVerificationResult verificationResult = await apiKeyVerifier ApiKeyVerification verification = await apiKeyVerifier
.VerifyAsync(authorizationHeader, context.CancellationToken) .VerifyAsync(authorizationHeader ?? string.Empty, context.CancellationToken)
.ConfigureAwait(false); .ConfigureAwait(false);
if (!verificationResult.Succeeded || verificationResult.Identity is null) if (!verification.Succeeded || verification.Identity is null)
{ {
throw new RpcException(new Status( throw new RpcException(new Status(
StatusCode.Unauthenticated, StatusCode.Unauthenticated,
"Missing or invalid API key.")); "Missing or invalid API key."));
} }
ApiKeyIdentity identity = GatewayApiKeyIdentityMapper.ToGatewayIdentity(verification.Identity);
string requiredScope = scopeResolver.ResolveRequiredScope(request); string requiredScope = scopeResolver.ResolveRequiredScope(request);
if (!verificationResult.Identity.Scopes.Contains(requiredScope)) if (!identity.Scopes.Contains(requiredScope))
{ {
throw new RpcException(new Status( throw new RpcException(new Status(
StatusCode.PermissionDenied, StatusCode.PermissionDenied,
$"API key is missing required scope '{requiredScope}'.")); $"API key is missing required scope '{requiredScope}'."));
} }
return verificationResult.Identity; return identity;
``` ```
The flow is: The flow is:
1. If `GatewayOptions.Authentication.Mode` is `AuthenticationMode.Disabled`, the helper returns `null` immediately. No identity is pushed onto the accessor and the continuation runs without scope enforcement. This matches the `AuthenticationMode` enum, which only defines `ApiKey` and `Disabled`. 1. If `GatewayOptions.Authentication.Mode` is `AuthenticationMode.Disabled`, the helper returns `null` immediately. No identity is pushed onto the accessor and the continuation runs without scope enforcement. This matches the `AuthenticationMode` enum, which only defines `ApiKey` and `Disabled`.
2. Otherwise, the `authorization` request header is read directly off `ServerCallContext.RequestHeaders` and handed to `IApiKeyVerifier.VerifyAsync`. A failed verification or a missing identity throws `RpcException` with `StatusCode.Unauthenticated`. 2. Otherwise, the `authorization` request header is read directly off `ServerCallContext.RequestHeaders` and handed to the shared `IApiKeyVerifier.VerifyAsync`, which returns an `ApiKeyVerification`. A failed verification or a missing identity throws `RpcException` with `StatusCode.Unauthenticated`. The shared library's identity is then projected onto the gateway-local `ApiKeyIdentity` by `GatewayApiKeyIdentityMapper.ToGatewayIdentity` before scope checks run.
3. `GatewayGrpcScopeResolver.ResolveRequiredScope(request)` produces the scope string. If the identity's `Scopes` set does not contain it, the helper throws `RpcException` with `StatusCode.PermissionDenied` and embeds the missing scope name in `Status.Detail` so callers can diagnose the failure. 3. `GatewayGrpcScopeResolver.ResolveRequiredScope(request)` produces the scope string. If the identity's `Scopes` set does not contain it, the helper throws `RpcException` with `StatusCode.PermissionDenied` and embeds the missing scope name in `Status.Detail` so callers can diagnose the failure.
4. On success, the verified `ApiKeyIdentity` is returned and pushed onto `IGatewayRequestIdentityAccessor` for the lifetime of the call. 4. On success, the verified `ApiKeyIdentity` is returned and pushed onto `IGatewayRequestIdentityAccessor` for the lifetime of the call.
@@ -107,7 +109,8 @@ public string ResolveRequiredScope(object request)
TestConnectionRequest or TestConnectionRequest or
GetLastDeployTimeRequest or GetLastDeployTimeRequest or
DiscoverHierarchyRequest or DiscoverHierarchyRequest or
WatchDeployEventsRequest => GatewayScopes.MetadataRead, WatchDeployEventsRequest or
BrowseChildrenRequest => GatewayScopes.MetadataRead,
_ => GatewayScopes.Admin _ => GatewayScopes.Admin
}; };
} }
@@ -194,7 +197,7 @@ the gateway fails closed.
Non-bulk constraint failures return gRPC `PermissionDenied`. Bulk read Non-bulk constraint failures return gRPC `PermissionDenied`. Bulk read
commands preserve input order and return a failed `SubscribeResult` for each commands preserve input order and return a failed `SubscribeResult` for each
denied item while still forwarding allowed items to the worker. Every denial denied item while still forwarding allowed items to the worker. Every denial
adds an `api_key_audit` entry with the key id, command kind, target, and records a canonical audit event with the key id, command kind, target, and
blocking constraint; secured values and raw credentials are never logged. blocking constraint; secured values and raw credentials are never logged.
## Scope Catalog ## Scope Catalog
@@ -209,10 +212,10 @@ blocking constraint; secured values and raw credentials are never logged.
| `InvokeRead` | `invoke:read` | `MxCommandRequest` for read-style command kinds (`Register`, `AddItem`, `Advise`, `ReadBulk`, and any kind not otherwise mapped) | | `InvokeRead` | `invoke:read` | `MxCommandRequest` for read-style command kinds (`Register`, `AddItem`, `Advise`, `ReadBulk`, and any kind not otherwise mapped) |
| `InvokeWrite` | `invoke:write` | `AcknowledgeAlarmRequest`, `MxCommandKind.Write`, `MxCommandKind.Write2`, `MxCommandKind.WriteBulk`, `MxCommandKind.Write2Bulk` | | `InvokeWrite` | `invoke:write` | `AcknowledgeAlarmRequest`, `MxCommandKind.Write`, `MxCommandKind.Write2`, `MxCommandKind.WriteBulk`, `MxCommandKind.Write2Bulk` |
| `InvokeSecure` | `invoke:secure` | `MxCommandKind.WriteSecured`, `MxCommandKind.WriteSecured2`, `MxCommandKind.WriteSecuredBulk`, `MxCommandKind.WriteSecured2Bulk`, `MxCommandKind.AuthenticateUser` | | `InvokeSecure` | `invoke:secure` | `MxCommandKind.WriteSecured`, `MxCommandKind.WriteSecured2`, `MxCommandKind.WriteSecuredBulk`, `MxCommandKind.WriteSecured2Bulk`, `MxCommandKind.AuthenticateUser` |
| `MetadataRead` | `metadata:read` | `MxCommandKind.ArchestraUserToId`, `MxCommandKind.GetSessionState`, `MxCommandKind.GetWorkerInfo`, `GalaxyRepository.TestConnection`, `GalaxyRepository.GetLastDeployTime`, `GalaxyRepository.DiscoverHierarchy`, `GalaxyRepository.WatchDeployEvents` | | `MetadataRead` | `metadata:read` | `MxCommandKind.ArchestraUserToId`, `MxCommandKind.GetSessionState`, `MxCommandKind.GetWorkerInfo`, `GalaxyRepository.TestConnection`, `GalaxyRepository.GetLastDeployTime`, `GalaxyRepository.DiscoverHierarchy`, `GalaxyRepository.WatchDeployEvents`, `GalaxyRepository.BrowseChildren` |
| `Admin` | `admin` | `MxCommandKind.ShutdownWorker`, the default for any unrecognized request type, and the dashboard authorization policy | | `Admin` | `admin` | `MxCommandKind.ShutdownWorker` and the default for any unrecognized request type |
The `Admin` constant is also referenced by `DashboardAuthenticator` and `DashboardAuthorizationHandler` so that the dashboard and the gRPC layer agree on what "admin" means. The gRPC `admin` scope here is **distinct** from the dashboard's `Administrator` role. The scope gates API-key access to admin-level RPCs; the dashboard role gates interactive cookie-authenticated dashboard pages. `DashboardAuthorizationHandler` and the dashboard policies authorize against the `Administrator`/`Viewer` roles (see [Gateway Dashboard Design](./GatewayDashboardDesign.md)) and do not reference `GatewayScopes.Admin`. The only dashboard code that touches `GatewayScopes` is the API Keys page, which validates requested scopes against `GatewayScopes.All` when creating a key — the same validation the CLI applies.
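The command-kind rows of the catalog above can be read as a single pattern match. The following is a minimal sketch, not the gateway's resolver: `CommandKind` and `ScopeCatalogSketch` are hypothetical names, the enum lists only the kinds named in the visible rows, and the scope strings are copied from the table.

```csharp
using System;

// Hypothetical subset of command kinds, named after the entries in the scope catalog.
public enum CommandKind
{
    Register, AddItem, Advise, ReadBulk,
    Write, Write2, WriteBulk, Write2Bulk,
    WriteSecured, WriteSecured2, WriteSecuredBulk, WriteSecured2Bulk, AuthenticateUser,
    ArchestraUserToId, GetSessionState, GetWorkerInfo,
    ShutdownWorker,
}

public static class ScopeCatalogSketch
{
    // Maps a command kind to the scope string listed in the catalog.
    public static string ScopeFor(CommandKind kind) => kind switch
    {
        CommandKind.Write or CommandKind.Write2 or
        CommandKind.WriteBulk or CommandKind.Write2Bulk => "invoke:write",

        CommandKind.WriteSecured or CommandKind.WriteSecured2 or
        CommandKind.WriteSecuredBulk or CommandKind.WriteSecured2Bulk or
        CommandKind.AuthenticateUser => "invoke:secure",

        CommandKind.ArchestraUserToId or CommandKind.GetSessionState or
        CommandKind.GetWorkerInfo => "metadata:read",

        CommandKind.ShutdownWorker => "admin",

        // The catalog treats any kind not otherwise mapped as read-style.
        _ => "invoke:read",
    };
}
```

For example, `ScopeCatalogSketch.ScopeFor(CommandKind.WriteSecuredBulk)` yields `invoke:secure`, and `ScopeCatalogSketch.ScopeFor(CommandKind.Advise)` falls through to `invoke:read`.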
## Identity Access for Downstream Layers ## Identity Access for Downstream Layers
@@ -263,14 +266,24 @@ public static IServiceCollection AddGatewayGrpcAuthorization(this IServiceCollec
{ {
services.AddSingleton<GatewayGrpcScopeResolver>(); services.AddSingleton<GatewayGrpcScopeResolver>();
services.AddSingleton<IGatewayRequestIdentityAccessor, GatewayRequestIdentityAccessor>(); services.AddSingleton<IGatewayRequestIdentityAccessor, GatewayRequestIdentityAccessor>();
services.AddSingleton<IConstraintEnforcer, ConstraintEnforcer>();
services.AddSingleton<GatewayGrpcAuthorizationInterceptor>(); services.AddSingleton<GatewayGrpcAuthorizationInterceptor>();
services
.AddOptions<Grpc.AspNetCore.Server.GrpcServiceOptions>()
.Configure<IConfiguration>((grpcOptions, configuration) =>
{
ProtocolOptions protocolOptions = new();
configuration.GetSection("MxGateway:Protocol").Bind(protocolOptions);
grpcOptions.MaxReceiveMessageSize = protocolOptions.MaxGrpcMessageBytes;
grpcOptions.MaxSendMessageSize = protocolOptions.MaxGrpcMessageBytes;
});
services.AddGrpc(options => options.Interceptors.Add<GatewayGrpcAuthorizationInterceptor>()); services.AddGrpc(options => options.Interceptors.Add<GatewayGrpcAuthorizationInterceptor>());
return services; return services;
} }
``` ```
Singleton lifetimes are appropriate because none of the three classes hold per-request state on instance fields; the request-scoped value lives inside the `AsyncLocal` on `GatewayRequestIdentityAccessor`. `GatewayApplication` calls `builder.Services.AddGatewayGrpcAuthorization()` during startup, and the call also performs `AddGrpc`, so the gateway never registers gRPC without the interceptor attached. Four singletons are registered: the scope resolver, the identity accessor, the constraint enforcer (`IConstraintEnforcer` → `ConstraintEnforcer`, which service bodies call to apply API-key constraints), and the interceptor itself. The same method also binds gRPC's `GrpcServiceOptions.MaxReceiveMessageSize` and `MaxSendMessageSize` from `MxGateway:Protocol:MaxGrpcMessageBytes` so the message-size limits are configured in the one place that wires the authorization pipeline. Singleton lifetimes are appropriate because none of these classes hold per-request state on instance fields; the request-scoped value lives inside the `AsyncLocal` on `GatewayRequestIdentityAccessor`. `GatewayApplication` calls `builder.Services.AddGatewayGrpcAuthorization()` during startup, and the call also performs `AddGrpc`, so the gateway never registers gRPC without the interceptor attached.
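Why a singleton can safely carry a request-scoped value is easier to see in isolation. The sketch below is an illustration under assumptions, not the gateway's `GatewayRequestIdentityAccessor`: the class name, the `Push` method, and the use of a plain string as the identity are all hypothetical. It only shows that one shared instance backed by `AsyncLocal<T>` keeps concurrent call flows apart.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical accessor: one shared instance, but the value lives in an AsyncLocal,
// so each asynchronous call flow sees only what it pushed itself.
public sealed class RequestIdentityAccessorSketch
{
    private readonly AsyncLocal<string?> current = new();

    public string? Current => current.Value;

    // Sets the identity for the current flow and restores the previous one on dispose.
    public IDisposable Push(string identity)
    {
        string? previous = current.Value;
        current.Value = identity;
        return new Restore(this, previous);
    }

    private sealed class Restore : IDisposable
    {
        private readonly RequestIdentityAccessorSketch owner;
        private readonly string? previous;

        public Restore(RequestIdentityAccessorSketch owner, string? previous)
        {
            this.owner = owner;
            this.previous = previous;
        }

        public void Dispose() => owner.current.Value = previous;
    }
}

public static class AsyncLocalDemo
{
    // Runs two overlapping "requests" against the same accessor instance.
    public static async Task<string[]> RunAsync()
    {
        var accessor = new RequestIdentityAccessorSketch();

        async Task<string> HandleAsync(string keyId)
        {
            using (accessor.Push(keyId))
            {
                await Task.Delay(10); // the other request runs while this one waits
                return accessor.Current ?? "<none>";
            }
        }

        return await Task.WhenAll(HandleAsync("key-a"), HandleAsync("key-b"));
    }
}
```

Each handler reads back the identity it pushed even though both share one accessor, because a value assigned to an `AsyncLocal<T>` inside an async method does not flow back to the caller or across to sibling flows.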
## Related Documentation ## Related Documentation
+1 -1
View File
@@ -407,7 +407,7 @@ The stable client proto manifest defines the generated-code directories:
clients/dotnet/generated clients/dotnet/generated
clients/go/internal/generated clients/go/internal/generated
clients/rust/src/generated clients/rust/src/generated
clients/python/src/mxgateway/generated clients/python/src/zb_mom_ww_mxgateway/generated
clients/java/src/main/generated clients/java/src/main/generated
``` ```
+42 -15
View File
@@ -48,8 +48,8 @@ dotnet build src/ZB.MOM.WW.MxGateway.Contracts/ZB.MOM.WW.MxGateway.Contracts.csp
Build and test from the repository root: Build and test from the repository root:
```powershell ```powershell
dotnet build clients/dotnet/ZB.MOM.WW.MxGateway.Client.sln dotnet build clients/dotnet/ZB.MOM.WW.MxGateway.Client.slnx
dotnet test clients/dotnet/ZB.MOM.WW.MxGateway.Client.sln --no-build dotnet test clients/dotnet/ZB.MOM.WW.MxGateway.Client.slnx --no-build
``` ```
Create local package artifacts: Create local package artifacts:
@@ -113,7 +113,7 @@ Pop-Location
## Rust ## Rust
The Rust workspace builds the `mxgateway-client` library crate and the `mxgw` The Rust workspace builds the `zb-mom-ww-mxgateway-client` library crate and the `mxgw`
CLI crate. `build.rs` generates `tonic` and `prost` modules into Cargo build CLI crate. `build.rs` generates `tonic` and `prost` modules into Cargo build
output on each build that needs updated protobuf output. output on each build that needs updated protobuf output.
@@ -156,8 +156,8 @@ Pop-Location
## Python ## Python
The Python package is `mxaccess-gateway-client`. Generated modules live under The Python package is `zb-mom-ww-mxaccess-gateway-client`. Generated modules live under
`clients/python/src/mxgateway/generated`. `clients/python/src/zb_mom_ww_mxgateway/generated`.
Regenerate the Python bindings: Regenerate the Python bindings:
@@ -173,10 +173,14 @@ Install, test, and build a wheel from `clients/python`:
Push-Location clients/python Push-Location clients/python
python -m pip install -e ".[dev]" python -m pip install -e ".[dev]"
python -m pytest python -m pytest
python -m pip wheel . --no-deps --wheel-dir "$env:TEMP\mxgateway-python-wheel" python -m build --outdir "$env:TEMP\mxgateway-python-dist"
Pop-Location Pop-Location
``` ```
`python -m build` (sdist plus wheel) is the canonical build method — it is what
`scripts/pack-clients.ps1` runs for the Python package. Use
`python -m pip wheel . --no-deps` only for a quick wheel-only build.
Run the CLI from the editable install or with `python -m`: Run the CLI from the editable install or with `python -m`:
```powershell ```powershell
@@ -184,21 +188,22 @@ Push-Location clients/python
mxgw-py version --json mxgw-py version --json
mxgw-py smoke --endpoint $env:MXGATEWAY_ENDPOINT --plaintext --api-key-env MXGATEWAY_API_KEY --item $env:MXGATEWAY_TEST_ITEM --json mxgw-py smoke --endpoint $env:MXGATEWAY_ENDPOINT --plaintext --api-key-env MXGATEWAY_API_KEY --item $env:MXGATEWAY_TEST_ITEM --json
mxgw-py smoke --endpoint mxgateway.example.local:5001 --tls --ca-file C:\certs\mxgateway-ca.pem --server-name-override mxgateway.example.local --api-key-env MXGATEWAY_API_KEY --item $env:MXGATEWAY_TEST_ITEM --json mxgw-py smoke --endpoint mxgateway.example.local:5001 --tls --ca-file C:\certs\mxgateway-ca.pem --server-name-override mxgateway.example.local --api-key-env MXGATEWAY_API_KEY --item $env:MXGATEWAY_TEST_ITEM --json
python -m mxgateway_cli version --json python -m zb_mom_ww_mxgateway_cli version --json
Pop-Location Pop-Location
``` ```
## Java ## Java
The Java workspace uses Gradle, Java 21, `mxgateway-client`, and The Java workspace uses Gradle, Java 21, and the subprojects
`mxgateway-cli`. The Gradle protobuf plugin writes generated Java protobuf and `zb-mom-ww-mxgateway-client` and `zb-mom-ww-mxgateway-cli`. The Gradle protobuf
gRPC sources under `clients/java/src/main/generated`. plugin writes generated Java protobuf and gRPC sources under
`clients/java/src/main/generated`.
Regenerate Java bindings: Regenerate Java bindings:
```powershell ```powershell
Push-Location clients/java Push-Location clients/java
gradle :mxgateway-client:generateProto gradle :zb-mom-ww-mxgateway-client:generateProto
Pop-Location Pop-Location
``` ```
@@ -214,7 +219,7 @@ Create local library and CLI artifacts:
```powershell ```powershell
Push-Location clients/java Push-Location clients/java
gradle :mxgateway-client:jar :mxgateway-cli:installDist gradle :zb-mom-ww-mxgateway-client:jar :zb-mom-ww-mxgateway-cli:installDist
Pop-Location Pop-Location
``` ```
@@ -222,12 +227,34 @@ Run the CLI through Gradle:
```powershell ```powershell
Push-Location clients/java Push-Location clients/java
gradle :mxgateway-cli:run --args="version --json" gradle :zb-mom-ww-mxgateway-cli:run --args="version --json"
gradle :mxgateway-cli:run --args="smoke --endpoint $env:MXGATEWAY_ENDPOINT --plaintext --api-key-env MXGATEWAY_API_KEY --item $env:MXGATEWAY_TEST_ITEM --json" gradle :zb-mom-ww-mxgateway-cli:run --args="smoke --endpoint $env:MXGATEWAY_ENDPOINT --plaintext --api-key-env MXGATEWAY_API_KEY --item $env:MXGATEWAY_TEST_ITEM --json"
gradle :mxgateway-cli:run --args="smoke --endpoint mxgateway.example.local:5001 --ca-file C:\certs\mxgateway-ca.pem --server-name-override mxgateway.example.local --api-key-env MXGATEWAY_API_KEY --item $env:MXGATEWAY_TEST_ITEM --json" gradle :zb-mom-ww-mxgateway-cli:run --args="smoke --endpoint mxgateway.example.local:5001 --ca-file C:\certs\mxgateway-ca.pem --server-name-override mxgateway.example.local --api-key-env MXGATEWAY_API_KEY --item $env:MXGATEWAY_TEST_ITEM --json"
Pop-Location Pop-Location
``` ```
## Packing All Clients
`scripts/pack-clients.ps1` runs every client's native packaging command and
drops the artifacts into one directory so a release does not depend on running
each per-language command by hand. It packs the .NET NuGet packages
(`ZB.MOM.WW.MxGateway.Contracts` and `ZB.MOM.WW.MxGateway.Client`), the Python
sdist and wheel (`python -m build`), the Rust `.crate` (`cargo package`), and
the Java jars plus generated POM (`gradle assemble` and the publication tasks).
Go has no artifact to pack — it is released by git-tagging, so the script prints
the `scripts/tag-go-module.ps1` command and skips it.
```powershell
pwsh scripts/pack-clients.ps1
pwsh scripts/pack-clients.ps1 -Languages dotnet,python
```
Artifacts land in `-OutputDir` (default `dist/`). Each language runs its
regression tests first unless `-SkipTests` is set. With `-Publish`, every
package is pushed to the internal Gitea feed; this requires the `GITEA_USERNAME`
and `GITEA_TOKEN` environment variables and the script refuses to publish if
either is missing.
## Integration Tests ## Integration Tests
Client integration checks are opt-in because they need a live gateway and a Client integration checks are opt-in because they need a live gateway and a
+8 -7
View File
@@ -77,7 +77,7 @@ The manifest declares these generated-code directories:
| .NET | `clients/dotnet/generated` | | .NET | `clients/dotnet/generated` |
| Go | `clients/go/internal/generated` | | Go | `clients/go/internal/generated` |
| Rust | `clients/rust/src/generated` | | Rust | `clients/rust/src/generated` |
| Python | `clients/python/src/mxgateway/generated` | | Python | `clients/python/src/zb_mom_ww_mxgateway/generated` |
| Java | `clients/java/src/main/generated` | | Java | `clients/java/src/main/generated` |
Only generator output belongs in these directories. Handwritten client wrappers Only generator output belongs in these directories. Handwritten client wrappers
@@ -98,7 +98,7 @@ Use these commands to regenerate language-specific client bindings:
| Go | `Push-Location clients/go; ./generate-proto.ps1; Pop-Location` | | Go | `Push-Location clients/go; ./generate-proto.ps1; Pop-Location` |
| Rust | `Push-Location clients/rust; cargo check --workspace; Pop-Location` | | Rust | `Push-Location clients/rust; cargo check --workspace; Pop-Location` |
| Python | `Push-Location clients/python; ./generate-proto.ps1; Pop-Location` | | Python | `Push-Location clients/python; ./generate-proto.ps1; Pop-Location` |
| Java | `Push-Location clients/java; gradle :mxgateway-client:generateProto; Pop-Location` | | Java | `Push-Location clients/java; gradle :zb-mom-ww-mxgateway-client:generateProto; Pop-Location` |
.NET generation currently runs through the contracts project: .NET generation currently runs through the contracts project:
@@ -142,7 +142,7 @@ cargo check --workspace
``` ```
Python clients should use `grpc_tools.protoc` and write generated modules under Python clients should use `grpc_tools.protoc` and write generated modules under
`clients/python/src/mxgateway/generated` so imports stay separate from `clients/python/src/zb_mom_ww_mxgateway/generated` so imports stay separate from
handwritten async wrappers. handwritten async wrappers.
The Python scaffold provides a repo-local generation script: The Python scaffold provides a repo-local generation script:
@@ -152,10 +152,11 @@ clients/python/generate-proto.ps1
``` ```
Java clients use the Gradle protobuf plugin from `clients/java`. The Java clients use the Gradle protobuf plugin from `clients/java`. The
`mxgateway-client` project reads the shared `.proto` files and writes generated `zb-mom-ww-mxgateway-client` project reads the shared `.proto` files and writes
Java protobuf and gRPC sources under `clients/java/src/main/generated`, matching generated Java protobuf and gRPC sources under
the manifest output path. Handwritten client and CLI code stays in the `clients/java/src/main/generated`, matching the manifest output path.
`mxgateway-client` and `mxgateway-cli` project source trees. Handwritten client and CLI code stays in the `zb-mom-ww-mxgateway-client` and
`zb-mom-ww-mxgateway-cli` project source trees.
Run the Java workspace checks from `clients/java`: Run the Java workspace checks from `clients/java`:
+38
View File
@@ -77,6 +77,44 @@ only and does not share types with `mxaccess_gateway.proto`. See
[Galaxy Repository Browse](./GalaxyRepository.md) for the RPC catalog and [Galaxy Repository Browse](./GalaxyRepository.md) for the RPC catalog and
behavior. behavior.
### Alarm RPCs and messages
`mxaccess_gateway.proto` also defines three session-less alarm RPCs served by
the gateway's always-on central alarm monitor (no client worker session is
involved):
- `AcknowledgeAlarm(AcknowledgeAlarmRequest) returns (AcknowledgeAlarmReply)`
acknowledges one alarm by its `alarm_full_reference`, with an operator
`comment` and `operator_user`.
- `StreamAlarms(StreamAlarmsRequest) returns (stream AlarmFeedMessage)` — the
central alarm feed.
- `QueryActiveAlarms(QueryActiveAlarmsRequest) returns (stream
ActiveAlarmSnapshot)` — a point-in-time snapshot of the currently-active
alarm set, streamed so callers can begin processing without buffering the
whole set. `alarm_filter_prefix` (when non-empty) narrows the snapshot to
alarms whose `alarm_full_reference` starts with the prefix.
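The prefix narrowing described for `QueryActiveAlarms` can be pictured as a simple filter over alarm references. This is a sketch only: `AlarmPrefixFilterSketch` is a hypothetical helper, and the ordinal, case-sensitive comparison is an assumption, since the text above does not state how the gateway compares the prefix.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AlarmPrefixFilterSketch
{
    // Returns every reference when the prefix is empty, otherwise only those
    // that start with it. Ordinal comparison is an assumption of this sketch.
    public static IEnumerable<string> Filter(IEnumerable<string> references, string? prefix) =>
        string.IsNullOrEmpty(prefix)
            ? references
            : references.Where(r => r.StartsWith(prefix, StringComparison.Ordinal));
}
```

The reference strings used to exercise it are made up; with `Area1.Tank.Hi`, `Area1.Pump.Fault`, and `Area2.Tank.Hi` as input, the prefix `Area1.` keeps the first two and an empty prefix keeps all three.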
`StreamAlarms` uses a three-phase protocol carried by the `AlarmFeedMessage`
`oneof payload`: the stream opens with one `active_alarm` (`ActiveAlarmSnapshot`)
per currently-active alarm, then a single `snapshot_complete = true` sentinel,
then a `transition` (`OnAlarmTransitionEvent`) for every subsequent change.
`active_alarm` carries the collapsed current state (`AlarmConditionState`:
`Active` / `ActiveAcked` / `Inactive`); `transition` carries the
`AlarmTransitionKind` (`Raise` / `Acknowledge` / `Clear` / `Retrigger`).
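A consumer of the three-phase feed is essentially a small state machine. The sketch below shows one way to track the phases, using hypothetical record types in place of the generated `AlarmFeedMessage` classes and plain strings in place of the state and transition enums. Rejecting out-of-order messages is a choice made for this sketch, not documented client behavior.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, simplified stand-ins for the three AlarmFeedMessage payload cases.
public abstract record FeedMessage;
public sealed record ActiveAlarm(string Reference, string State) : FeedMessage;
public sealed record SnapshotComplete : FeedMessage;
public sealed record Transition(string Reference, string Kind) : FeedMessage;

public sealed class AlarmFeedTracker
{
    private readonly Dictionary<string, string> snapshot = new(StringComparer.Ordinal);
    private readonly List<Transition> transitions = new();

    // False during the snapshot phase, true once the sentinel has arrived.
    public bool SnapshotLoaded { get; private set; }

    public IReadOnlyDictionary<string, string> Snapshot => snapshot;
    public IReadOnlyList<Transition> Transitions => transitions;

    public void Apply(FeedMessage message)
    {
        switch (message)
        {
            // Phase 1: one entry per currently-active alarm.
            case ActiveAlarm alarm when !SnapshotLoaded:
                snapshot[alarm.Reference] = alarm.State;
                break;

            // Phase 2: the single sentinel that closes the snapshot.
            case SnapshotComplete when !SnapshotLoaded:
                SnapshotLoaded = true;
                break;

            // Phase 3: live changes, accepted only after the sentinel.
            case Transition transition when SnapshotLoaded:
                transitions.Add(transition);
                break;

            default:
                throw new InvalidOperationException(
                    $"Unexpected {message.GetType().Name} for the current phase.");
        }
    }
}
```

Feeding the tracker two `ActiveAlarm` messages, a `SnapshotComplete`, and then a `Transition` leaves two snapshot entries and one recorded transition. A `Transition` that arrives before the sentinel throws, which makes protocol violations visible during development.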
`AcknowledgeAlarmRequest` and `AcknowledgeAlarmReply` both **reserve** field 1
and the name `session_id`: acknowledgement was made session-less and the field
was retired (the reservation prevents reuse of the tag). The authoritative
ack-outcome field on `AcknowledgeAlarmReply` is `hresult` (the worker's native
by-name/by-GUID ack return code, 0 = success), alongside `protocol_status`. The
structured `MxStatusProxy status` field is intentionally left **unset** on every
reply because the worker ack path produces only the int32 return code; clients
must read `hresult` and must not depend on `status` being populated.
For the broker architecture and the parse contract for `alarm_full_reference`
(GUID vs `Provider!Group.Tag`) see
[Alarm Client Discovery](./AlarmClientDiscovery.md).
Generated C# output is written to `src/ZB.MOM.WW.MxGateway.Contracts/Generated/`. Do not Generated C# output is written to `src/ZB.MOM.WW.MxGateway.Contracts/Generated/`. Do not
hand-edit generated files. hand-edit generated files.
+124 -88
View File
@@ -8,8 +8,12 @@ operations-focused projects.
The dashboard is an operational interface, not a landing page. It prioritizes The dashboard is an operational interface, not a landing page. It prioritizes
fast scanning, low visual noise, and stable layouts while live data changes. fast scanning, low visual noise, and stable layouts while live data changes.
The design uses Bootstrap for common behavior and a small local stylesheet for The layout chrome, status presentation, and design tokens come from the shared
project identity, spacing, and status presentation. `ZB.MOM.WW.Theme` kit (the technical-light design system). Bootstrap supplies
common widget behavior, and a small local stylesheet (`wwwroot/css/site.css`)
wires the dashboard's own class names and Bootstrap widgets onto the kit's
tokens. The local sheet contains no hard-coded colors; every color, font, and
surface resolves to a theme token.
Use this style for applications where users repeatedly check system state, Use this style for applications where users repeatedly check system state,
compare rows, inspect details, and diagnose faults. Avoid promotional layouts, compare rows, inspect details, and diagnose faults. Avoid promotional layouts,
@@ -25,7 +29,7 @@ The interface uses a quiet, work-focused visual system:
- White cards and sections carry the actual operational content. - White cards and sections carry the actual operational content.
- Borders define structure more often than shadows. - Borders define structure more often than shadows.
- Accent color is reserved for metric values and important numeric signals. - Accent color is reserved for metric values and important numeric signals.
- Bootstrap status badges provide state color without custom status art. - The kit's `StatusPill` provides state color without custom status art.
- Tables remain compact and responsive so long identifiers and timestamps stay - Tables remain compact and responsive so long identifiers and timestamps stay
readable. readable.
@@ -34,93 +38,113 @@ and dense enough for repeated use.
## Layout Structure ## Layout Structure
Every page follows the same structure: The application chassis is the kit's `ThemeShell` component (a vertical side
rail plus a content area), not a horizontal top navbar. `MainLayout.razor` is a
thin wrapper that delegates the rail chassis — brand block, hamburger toggle,
responsive collapse — to `<ThemeShell>` and supplies only the navigation items
and a rail footer:
1. A top navigation bar with the product or service name on the left. ```razor
2. A full-width `container-fluid` content area. <ThemeShell Product="MXAccess Gateway" Accent="#2f5fd0">
3. A page header with the page title, short context text, and optional status <Nav>
badge. <NavRailItem Href="/" Text="Dashboard" Match="NavLinkMatch.All" />
4. Metric cards when a page has top-level numeric state. <NavRailSection Title="Runtime" Key="runtime">
5. Bordered content sections for tables, details, faults, or empty states. <NavRailItem Href="/sessions" Text="Sessions" />
<NavRailItem Href="/workers" Text="Workers" />
The shell does not use a sidebar. A horizontal navigation bar is enough for the </NavRailSection>
current page count and keeps the content width available for tables. </Nav>
<RailFooter><!-- user name + sign-out --></RailFooter>
```html <ChildContent>@Body</ChildContent>
<div class="dashboard-shell"> </ThemeShell>
<nav class="navbar navbar-expand-lg bg-body border-bottom dashboard-navbar">
<!-- brand, page links, sign-out action -->
</nav>
<main class="container-fluid dashboard-content">
<!-- page header, metric grid, sections -->
</main>
</div>
``` ```
Within the content area, every page follows the same structure:
1. A page header with the page title, short context text, and optional status
pill.
2. Metric cards when a page has top-level numeric state.
3. Bordered content sections for tables, details, faults, or empty states.
The login page uses `LoginLayout.razor` instead — a minimal layout with no rail
and no brand block, because the page renders its own centered `<LoginCard>`.
## Color Tokens ## Color Tokens
Use a small token set and let Bootstrap provide the rest. The current dashboard Colors come from the `ZB.MOM.WW.Theme` kit's `theme.css`. The local
uses these local tokens: `site.css` defines no `:root` custom properties of its own; it references kit
tokens by name. The dashboard does not define a `--mxgw-*` token set.
```css
:root {
--mxgw-surface: #f7f8fa;
--mxgw-border: #d8dee6;
--mxgw-ink-muted: #667085;
--mxgw-accent: #146c64;
}
```
| Token | Purpose | | Token | Purpose |
|-------|---------| |-------|---------|
| `--mxgw-surface` | Page background behind all content. | | `var(--card)` | Background of cards, sections, and data tables. |
| `--mxgw-border` | Borders on cards, tables, sections, and empty states. | | `var(--rule)`, `var(--rule-strong)` | Hairline and stronger borders. |
| `--mxgw-ink-muted` | Secondary labels, details, and empty-state text. | | `var(--ink)`, `var(--ink-soft)`, `var(--ink-faint)` | Primary, secondary, and muted text. |
| `--mxgw-accent` | Metric values and important numeric summaries. | | `var(--accent)`, `var(--accent-deep)` | Metric values, links, primary buttons, focus rings. |
| `var(--mono)` | Monospace family for values, identifiers, and code. |
| `var(--ok)`/`--ok-bg`, `var(--warn)`/`--warn-bg`, `var(--bad)`/`--bad-bg`, `var(--idle)`/`--idle-bg` | State colors for chips, alerts, and alarm-state labels. |
Keep the palette small. Add new colors only when they encode state or improve Keep the palette small and let the kit own it. Add new colors only when they
readability. Prefer Bootstrap badge classes for states such as ready, closing, encode state or improve readability, and resolve them to a kit token rather than
closed, and faulted. a literal hex value. Use the kit's `StatusPill` for states such as ready,
closing, idle, and faulted.
## Typography ## Typography
Typography stays compact and consistent: Typography stays compact and consistent:
- Page headings use `1.35rem`, weight `650`, and normal letter spacing. - Page headings (`.dashboard-page-header h1`) use `1.15rem`, weight `600`, and a
- Section headings use the same size as page headings when they introduce a slight letter spacing.
table or details group. - Section headings (`.section-heading h2`) use a small uppercase eyebrow:
- Metric labels use uppercase text at `.78rem` and weight `650`. `.74rem`, weight `600`, muted ink.
- Metric values use `1.7rem`, weight `700`, and the accent color. - Metric labels (`.agg-label`) use uppercase text at `.68rem` and weight `600`,
muted ink.
- Metric values (`.agg-value`) use `1.5rem`, weight `600`, the monospace family,
tabular numerics, and primary ink (`var(--ink)`).
- Body and table text inherit Bootstrap defaults for readability. - Body and table text inherit Bootstrap defaults for readability.
Do not scale text with viewport width. Long values use `overflow-wrap: Do not scale text with viewport width. Long values use `overflow-wrap:
anywhere` so session IDs, paths, and fault messages do not break the layout. break-word` (numbers and date tokens stay whole, wrapping only at spaces); a few
free-form fields such as `.agg-sub` use `overflow-wrap: anywhere` so session
IDs, paths, and fault messages do not break the layout.
## Spacing And Shape ## Spacing And Shape
The dashboard uses modest spacing: The dashboard uses modest spacing:
- Page content has `1.25rem` padding on desktop and `.75rem` on small screens. - The kit owns the rail and content padding; the local small-screen rule sets
`.page` padding to `.85rem`.
- Metric grids use `.75rem` gaps. - Metric grids use `.75rem` gaps.
- Content sections start with a top border and `1rem` top padding. - Content sections (`.dashboard-section`) and metric cards (`.agg-card`) are
- Cards and empty states use Bootstrap's small radius shape, `.375rem`. fully bordered cards: `var(--card)` fill, a `1px solid var(--rule)` hairline,
- Metric cards have no shadow. and `0.9rem` padding for sections.
- Cards, sections, and modals use an `8px` radius; smaller widgets such as the
empty state use `6px`.
- Metric cards have no shadow (`box-shadow: none`); borders define structure.
This keeps information grouped without turning each section into a decorative This keeps information grouped without turning each section into a decorative
panel. Use cards for repeated metric summaries, login forms, and individual panel. Use cards for repeated metric summaries, login forms, and individual
items. Use unframed sections with a top border for page-level groups. items. Use bordered sections for page-level groups.
## Navigation ## Navigation
Navigation is a Bootstrap responsive navbar. It includes: Navigation lives in the `ThemeShell` side rail. It is built from the kit's
`NavRailSection` and `NavRailItem` components: a single home item plus eight
page items grouped into three labeled sections.
- Brand text for the service name. | Section | Items |
- Short page labels: `Overview`, `Sessions`, `Workers`, `Events`, `Settings`. |---------|-------|
- Active route styling through `NavLink`. | (home) | `Dashboard` (route `/`, `NavLinkMatch.All`) |
- A right-aligned sign-out button when authentication is enabled. | Runtime | `Sessions`, `Workers`, `Events`, `Alarms` |
| Galaxy | `Repository`, `Browse` |
| Admin | `API Keys`, `Settings` |
Keep navigation labels short. Operational users should be able to predict what Section expand/collapse state is owned by the kit (a `<details>` element plus
each page contains without reading explanatory copy. `ThemeScripts`); the layout does not run JS interop for it. The rail footer
shows the signed-in user name and a sign-out form (or a sign-in link when
unauthenticated).
Keep navigation labels short and group related pages. Operational users should
be able to predict what each page contains without reading explanatory copy.
## Page Headers ## Page Headers
@@ -128,42 +152,43 @@ Each page starts with a `dashboard-page-header`:
- The title is the primary anchor. - The title is the primary anchor.
- A single secondary line gives timestamp, row count, or configuration context. - A single secondary line gives timestamp, row count, or configuration context.
- A status badge appears on the right when the page has an overall state. - A status pill appears on the right when the page has an overall state.
On narrow screens, the header stacks vertically. This prevents long context On narrow screens, the header stacks vertically. This prevents long context
text or status badges from overlapping the title. text or status pills from overlapping the title.
```html ```html
<div class="dashboard-page-header"> <div class="dashboard-page-header">
<div> <div>
<h1>Overview</h1> <h1>Dashboard</h1>
<div class="text-secondary">Generated 2026-04-27 17:30:00</div> <div class="text-secondary">Generated 2026-04-27 17:30:00</div>
</div> </div>
<span class="badge text-bg-success">Healthy</span> <!-- <StatusBadge Text="Healthy" /> -> kit <StatusPill State="Ok"> -->
</div> </div>
``` ```
## Metric Cards ## Metric Cards
Metric cards summarize numeric state at the top of overview and diagnostic Metric cards summarize numeric state at the top of the home and diagnostic
pages. They use Bootstrap cards with a local `metric-card` class: pages. The `MetricCard` component renders an `.agg-card` with label, value, and
optional sub-line:
- Label: uppercase, muted, compact. - Label (`.agg-label`): uppercase eyebrow, muted, compact.
- Value: large enough to scan, accent colored, wraps safely. - Value (`.agg-value`): large monospace number in primary ink, wraps safely.
- Detail: optional muted text for version, rate context, or explanatory state. - Sub (`.agg-sub`): optional muted text for version, rate context, or state.
Use auto-fit CSS grid tracks so the cards fill available width without custom Cards lay out in a `.metric-grid`. Use auto-fill CSS grid tracks so they fill
breakpoints: available width without custom breakpoints:
```css ```css
.metric-grid { .metric-grid {
display: grid; display: grid;
gap: .75rem; gap: .75rem;
grid-template-columns: repeat(auto-fit, minmax(12rem, 1fr)); grid-template-columns: repeat(auto-fill, minmax(11rem, 1fr));
} }
.metric-grid.compact { .metric-grid.compact {
grid-template-columns: repeat(auto-fit, minmax(10rem, 1fr)); grid-template-columns: repeat(auto-fill, minmax(10rem, 1fr));
} }
``` ```
@@ -188,15 +213,22 @@ entire rows clickable when a single identifier link is clearer.
## Status Badges ## Status Badges
Status uses Bootstrap badge classes with a small mapping layer: `StatusBadge` is a thin adapter over the kit's `StatusPill`. Call sites pass the
literal domain state text (`<StatusBadge Text="Ready" />`); the adapter maps
that text to one of the kit's four `StatusState` values, and `StatusPill`
renders the chip. There are no Bootstrap `text-bg-*` classes in this layer.
| State | Badge class | | Domain state text | `StatusState` |
|-------|-------------| |-------------------|---------------|
| `Ready`, `Healthy` | `text-bg-success` | | `Ready`, `Healthy`, `Active` | `Ok` |
| `Creating`, `StartingWorker`, `WaitingForPipe`, `InitializingWorker`, `Closing` | `text-bg-info` | | `Creating`, `StartingWorker`, `WaitingForPipe`, `InitializingWorker`, `Closing`, `Stale`, `Degraded` | `Warn` |
| `Closed` | `text-bg-secondary` | | `Faulted`, `Unavailable` | `Bad` |
| `Faulted` | `text-bg-danger` | | Any other text (including `Closed`, `Revoked`, `Unknown`) | `Idle` |
| Unknown state | `text-bg-light text-dark border` |
Note the mapping changes from earlier revisions: `Closed` now falls through to
`Idle` (rather than its own neutral badge), and `Active`, `Stale`, `Degraded`,
and `Unavailable` are explicit cases. The kit owns the chip rendering; only this
domain text-to-state vocabulary lives in the app.
Keep status text literal. Operators benefit from seeing the same state names Keep status text literal. Operators benefit from seeing the same state names
that appear in logs and APIs. that appear in logs and APIs.
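
The text-to-state mapping in the table above can be sketched as a single switch expression. This is a minimal illustration, not the adapter's actual source: `StatusTextMapper` is a hypothetical name, the `StatusState` enum here is a local stand-in for the kit's type, and exact (case-sensitive) matching is an assumption.

```csharp
using System;

// Demo: print the state each literal domain text resolves to.
foreach (string text in new[] { "Ready", "Closing", "Faulted", "Closed" })
{
    Console.WriteLine($"{text} -> {StatusTextMapper.Map(text)}");
}

// Local stand-in for the kit's four-value StatusState (assumption).
public enum StatusState { Ok, Warn, Bad, Idle }

// Hypothetical helper mirroring the documented mapping table.
public static class StatusTextMapper
{
    public static StatusState Map(string? text) => text switch
    {
        "Ready" or "Healthy" or "Active" => StatusState.Ok,
        "Creating" or "StartingWorker" or "WaitingForPipe" or "InitializingWorker"
            or "Closing" or "Stale" or "Degraded" => StatusState.Warn,
        "Faulted" or "Unavailable" => StatusState.Bad,
        // Closed, Revoked, Unknown, null, and anything unrecognized fall through.
        _ => StatusState.Idle,
    };
}
```

Keeping the fall-through arm last means a newly introduced domain state renders as `Idle` until it is given an explicit case, rather than failing to render.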
@@ -230,8 +262,8 @@ The dashboard uses one small-screen breakpoint:
```css ```css
@media (max-width: 700px) { @media (max-width: 700px) {
.dashboard-content { .page {
padding: .75rem; padding: .85rem;
} }
.dashboard-page-header { .dashboard-page-header {
@@ -245,6 +277,9 @@ The dashboard uses one small-screen breakpoint:
} }
``` ```
A second breakpoint (`max-width: 960px`) collapses the Browse two-pane layout
(`.browse-layout`) to a single column.
Do not hide important columns by default. Use horizontal table scrolling for Do not hide important columns by default. Use horizontal table scrolling for
dense operational data, and reserve column hiding for data that is clearly dense operational data, and reserve column hiding for data that is clearly
duplicative. duplicative.
@@ -277,18 +312,19 @@ markup.
Use this checklist when applying the design to another project: Use this checklist when applying the design to another project:
- Define four local tokens: surface, border, muted ink, and accent. - Take colors, fonts, and surfaces from the `ZB.MOM.WW.Theme` kit tokens; do
- Use a Bootstrap top navbar with short route labels. not define a local color token set.
- Keep page content inside a full-width fluid container. - Use the kit's `ThemeShell` side rail with `NavRailSection`/`NavRailItem` and
short route labels grouped into sections.
- Start every page with the same header structure. - Start every page with the same header structure.
- Put primary numeric state in `metric-grid` cards. - Put primary numeric state in `metric-grid` / `agg-card` cards.
- Put detailed runtime state in compact responsive tables. - Put detailed runtime state in compact responsive tables.
- Use status badges mapped from real domain states. - Use `StatusBadge` (kit `StatusPill`) mapped from real domain states.
- Use dashed bordered empty states for loading and no-data cases. - Use dashed bordered empty states for loading and no-data cases.
- Use top-bordered sections for page groups instead of nested cards. - Use top-bordered sections for page groups instead of nested cards.
- Centralize formatting and redaction outside Razor markup. - Centralize formatting and redaction outside Razor markup.
- Hide every destructive admin affordance from viewers; render it only for - Hide every destructive admin affordance from viewers; render it only for
the `Admin` role and re-check the role server-side on every invocation. the `Administrator` role and re-check the role server-side on every invocation.
- Route every destructive action (Close session, Kill worker, Rotate / - Route every destructive action (Close session, Kill worker, Rotate /
Revoke / Delete API key) through the shared `ConfirmDialog` component so Revoke / Delete API key) through the shared `ConfirmDialog` component so
the operator always gets one explicit confirmation step before the call the operator always gets one explicit confirmation step before the call
+10 -4
@@ -357,10 +357,16 @@ Allowed UI stack:
Do not use MudBlazor or other Blazor UI component libraries for v1. Do not use MudBlazor or other Blazor UI component libraries for v1.
Dashboard access should require API-key-backed dashboard authentication with Dashboard authentication is LDAP-backed, deliberately separate from the gRPC
`admin` scope when enabled. For local development, anonymous localhost access API-key model: dashboard users are people who already have directory accounts,
is enabled by default through `Dashboard:AllowAnonymousLocalhost`; the bypass is so reusing LDAP avoids minting and distributing API keys for human operators.
limited to loopback requests. `DashboardAuthenticator` binds the supplied credentials against `MxGateway:Ldap`
through the shared `ILdapAuthService`, then maps the user's LDAP groups to the
`Administrator` or `Viewer` dashboard role via `MxGateway:Dashboard:GroupToRole`.
A login whose groups match no role is denied. For local development, anonymous
localhost access is enabled by default through
`MxGateway:Dashboard:AllowAnonymousLocalhost`; the bypass is limited to loopback
requests.
## Lazy Browse Is Wire-Only ## Lazy Browse Is Wire-Only
+28 -3
@@ -162,7 +162,7 @@ public static IApplicationBuilder UseGatewayRequestLoggingScope(this IApplicatio
{ {
ILogger logger = context.RequestServices ILogger logger = context.RequestServices
.GetRequiredService<ILoggerFactory>() .GetRequiredService<ILoggerFactory>()
.CreateLogger("ZB.MOM.WW.MxGateway.Request"); .CreateLogger("MxGateway.Request");
using IDisposable? scope = logger.BeginGatewayScope(new GatewayLogScope( using IDisposable? scope = logger.BeginGatewayScope(new GatewayLogScope(
SessionId: ReadHeader(context, SessionIdHeaderName), SessionId: ReadHeader(context, SessionIdHeaderName),
@@ -188,7 +188,7 @@ The scope is keyed off four custom headers and the standard `authorization` head
The numeric headers use `int.TryParse` and `ulong.TryParse`; missing or unparseable values become `null` and are dropped by `GatewayLogScope.ToDictionary`. This keeps the middleware tolerant of clients that do not yet emit every header, which matters because the earliest call in a session (`OpenSession`) has no `SessionId` to send. The numeric headers use `int.TryParse` and `ulong.TryParse`; missing or unparseable values become `null` and are dropped by `GatewayLogScope.ToDictionary`. This keeps the middleware tolerant of clients that do not yet emit every header, which matters because the earliest call in a session (`OpenSession`) has no `SessionId` to send.
The logger category is `ZB.MOM.WW.MxGateway.Request`, which lets operators filter the request scope events independently from per-component categories. The logger category is `MxGateway.Request`, which lets operators filter the request scope events independently from per-component categories.
### Pipeline ordering ### Pipeline ordering
@@ -205,13 +205,38 @@ app.MapGatewayEndpoints();
The order matters: putting the logging scope first ensures that authentication failures, authorization denials, and endpoint exceptions all run inside the request scope, so failure logs still carry the correlation id and session id headers that the caller sent. The `ClientIdentity` field is redacted before logging, so reading the `authorization` header at this stage does not leak the bearer secret into authentication failure logs. The order matters: putting the logging scope first ensures that authentication failures, authorization denials, and endpoint exceptions all run inside the request scope, so failure logs still carry the correlation id and session id headers that the caller sent. The `ClientIdentity` field is redacted before logging, so reading the `authorization` header at this stage does not leak the bearer secret into authentication failure logs.
### Telemetry redaction seam
The per-request middleware redacts the `authorization` header before it reaches a scope, but log events produced outside the request scope (or with credential-bearing properties attached by other enrichers) need the same protection. `GatewayLogRedactorSeam` adapts the static `GatewayLogRedactor` to the shared `ILogRedactor` seam so the telemetry `RedactionEnricher` masks identity material on **every** log event:
```csharp
builder.Services.AddSingleton<ILogRedactor, GatewayLogRedactorSeam>();
```
The seam scans a fixed set of identity-bearing property names (`ClientIdentity`, `authorization`, `Authorization`) and rewrites any string value through `GatewayLogRedactor.RedactClientIdentity`. Because it runs in the enricher rather than at the call site, it catches credential material that a component logged without going through `GatewayLogScope`.
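
The scan described above might look roughly like the following. It is a sketch under stated assumptions: `IdentityRedactionSketch` is a hypothetical name, the property bag is modeled as a plain dictionary rather than the real log-event type, and the redaction function is passed in as a delegate standing in for `GatewayLogRedactor.RedactClientIdentity`, whose output format is not shown here.

```csharp
using System;
using System.Collections.Generic;

var properties = new Dictionary<string, object?>
{
    ["ClientIdentity"] = "mxgw_example_value",
    ["Authorization"] = "Bearer mxgw_example_value",
    ["authorization"] = 42,   // non-string values are left alone
    ["SessionId"] = "s-1",    // not an identity-bearing name
};

// Placeholder redactor; the real one is GatewayLogRedactor.RedactClientIdentity.
IdentityRedactionSketch.Redact(properties, _ => "[redacted]");

foreach (KeyValuePair<string, object?> pair in properties)
{
    Console.WriteLine($"{pair.Key} = {pair.Value}");
}

// Hypothetical helper: rewrites string values stored under known property names.
public static class IdentityRedactionSketch
{
    private static readonly string[] IdentityPropertyNames =
    {
        "ClientIdentity", "authorization", "Authorization",
    };

    public static void Redact(IDictionary<string, object?> properties, Func<string, string> redact)
    {
        // Iterate the fixed name list, not the dictionary, so updating entries is safe.
        foreach (string name in IdentityPropertyNames)
        {
            if (properties.TryGetValue(name, out object? value) && value is string text)
            {
                properties[name] = redact(text);
            }
        }
    }
}
```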
## Readiness Health Check
`AuthStoreHealthCheck` is a readiness probe registered under the health-check name `auth-store` and tagged for the readiness set (`ZbHealthTags.Ready`):
```csharp
builder.Services.AddHealthChecks()
.AddTypeActivatedCheck<AuthStoreHealthCheck>(
"auth-store",
failureStatus: null,
tags: new[] { ZbHealthTags.Ready });
```
The gateway authenticates every gRPC call against the SQLite auth store, so its reachability gates readiness. The check opens a connection via `AuthSqliteConnectionFactory` and runs `SELECT 1;`: success reports `Healthy`, any exception (other than the probe being cancelled) reports `Unhealthy` with the underlying error attached. It is surfaced on the readiness endpoint exposed by the shared telemetry wiring (the live/ready split is what the `wonder-app-vd03` deployment exposes as `/health/live` with the dashboard disabled).
## Consumers ## Consumers
`GatewayLoggerExtensions.BeginGatewayScope` is consumed by `GatewayRequestLoggingMiddlewareExtensions` to attach the per-request scope. Component-level call sites build narrower `GatewayLogScope` instances (for example, with a known `WorkerProcessId` after a worker launch) and push a nested scope on top of the request scope. `GatewayLoggerExtensions.BeginGatewayScope` is consumed by `GatewayRequestLoggingMiddlewareExtensions` to attach the per-request scope. Component-level call sites build narrower `GatewayLogScope` instances (for example, with a known `WorkerProcessId` after a worker launch) and push a nested scope on top of the request scope.
`GatewayLogRedactor` is consumed in three places: `GatewayLogRedactor` is consumed in four places:
- `GatewayLogScope.ToDictionary` redacts `ClientIdentity` whenever a scope is materialized. - `GatewayLogScope.ToDictionary` redacts `ClientIdentity` whenever a scope is materialized.
- `GatewayLogRedactorSeam.Redact` applies the same redaction to identity-bearing properties on every telemetry log event (see above).
- `DashboardRedactor.Redact` delegates to `RedactClientIdentity` for any value containing the `mxgw_` marker, then falls back to a marker-keyword check for fields like `password` or `token`. This keeps dashboard renders aligned with log redaction. - `DashboardRedactor.Redact` delegates to `RedactClientIdentity` for any value containing the `mxgw_` marker, then falls back to a marker-keyword check for fields like `password` or `token`. This keeps dashboard renders aligned with log redaction.
- `ZB.MOM.WW.MxGateway.Tests/Diagnostics/GatewayLogRedactorTests.cs` covers each redaction branch, including the assertion that `WriteSecured` values stay redacted even when `valueLoggingEnabled` is true. - `ZB.MOM.WW.MxGateway.Tests/Diagnostics/GatewayLogRedactorTests.cs` covers each redaction branch, including the assertion that `WriteSecured` values stay redacted even when `valueLoggingEnabled` is true.
+48 -11
@@ -81,11 +81,16 @@ computed against the *filtered* descendant set, a branch that contains no
matching objects gets `false`, not `true`. matching objects gets `false`, not `true`.
**Paging.** Default page size is 500; the server caps any requested size at **Paging.** Default page size is 500; the server caps any requested size at
5000. Page tokens encode `(cache_sequence, parent_id, filter_signature, 5000. Page tokens are the colon-delimited triple `sequence:filterSignature:offset`
offset)`. A token from a different cache generation or a different filter set — the same encoding `DiscoverHierarchy` uses. The parent selector is not a
returns `InvalidArgument`. The error messages reference "DiscoverHierarchy separate token field: it is folded into `filterSignature` along with the rest of
page_token" because `BrowseChildren` reuses the same encoding and validation the filter set (the projector's `ComputeFilterSignature` takes the parent id),
path — if you see that wording in a `BrowseChildren` context it is expected. so a page token implicitly pins the parent. A token from a different cache
generation (`sequence` mismatch) or a different filter set (`filterSignature`
mismatch) returns `InvalidArgument`. The error messages reference
"DiscoverHierarchy page_token" because `BrowseChildren` reuses the same encoding
and validation path — if you see that wording in a `BrowseChildren` context it is
expected.
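
A rough sketch of how a `sequence:filterSignature:offset` token could be parsed and checked is shown below. The `PageToken` type, its field types, and the use of `FormatException` are assumptions made for illustration; the actual service reports these failures as `InvalidArgument`, and the sketch also assumes the filter signature contains no colon.

```csharp
using System;
using System.Globalization;

PageToken token = PageToken.Parse("42:abc123:500");
Console.WriteLine(token.IsValidFor(currentSequence: 42, currentFilterSignature: "abc123"));
Console.WriteLine(token.IsValidFor(currentSequence: 43, currentFilterSignature: "abc123"));

// Hypothetical token shape: cache generation, filter signature, row offset.
public readonly record struct PageToken(long Sequence, string FilterSignature, int Offset)
{
    public static PageToken Parse(string raw)
    {
        string[] parts = raw.Split(':');
        if (parts.Length != 3
            || !long.TryParse(parts[0], NumberStyles.None, CultureInfo.InvariantCulture, out long sequence)
            || parts[1].Length == 0
            || !int.TryParse(parts[2], NumberStyles.None, CultureInfo.InvariantCulture, out int offset))
        {
            throw new FormatException("Malformed page token.");
        }

        return new PageToken(sequence, parts[1], offset);
    }

    // A token is only usable against the same cache generation and filter set.
    // Because the parent id feeds the filter signature, a different parent fails here too.
    public bool IsValidFor(long currentSequence, string currentFilterSignature) =>
        Sequence == currentSequence
        && string.Equals(FilterSignature, currentFilterSignature, StringComparison.Ordinal);
}
```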
**Errors.** **Errors.**
@@ -133,6 +138,15 @@ When SQL is unreachable, the cache retains the previous data and flips
`Status` to `Stale` (or `Unavailable` if no data was ever loaded). A `Status` to `Stale` (or `Unavailable` if no data was ever loaded). A
`SqlException` never bubbles out as the client-facing error. `SqlException` never bubbles out as the client-facing error.
The cache also auto-degrades a `Healthy` entry to `Stale` purely on age: when the
last successful refresh is older than five minutes, the projected status is
reported as `Stale` even though the data hasn't otherwise changed. This guards
against a silently wedged refresh loop — if ticks stop succeeding, browse
results visibly go `Stale` rather than continuing to look fresh. (`Unknown` and
`Unavailable` entries are returned as-is and not aged.) The first refresh runs at
service startup, before the interval loop begins, so the cache is populated as
soon as practical rather than waiting one full interval.
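
The age-based downgrade can be sketched as a pure projection over the stored status. The names below (`CacheStatus`, `CacheStatusProjector`, `MaxHealthyAge`) are hypothetical and the enum is a local stand-in; only the five-minute threshold and the pass-through of non-`Healthy` entries come from the description above.

```csharp
using System;

DateTimeOffset now = DateTimeOffset.UtcNow;
Console.WriteLine(CacheStatusProjector.Project(CacheStatus.Healthy, now.AddMinutes(-1), now));
Console.WriteLine(CacheStatusProjector.Project(CacheStatus.Healthy, now.AddMinutes(-6), now));
Console.WriteLine(CacheStatusProjector.Project(CacheStatus.Unavailable, now.AddMinutes(-6), now));

// Local stand-in for the cache status values named in the text (assumption).
public enum CacheStatus { Unknown, Healthy, Stale, Unavailable }

public static class CacheStatusProjector
{
    private static readonly TimeSpan MaxHealthyAge = TimeSpan.FromMinutes(5);

    public static CacheStatus Project(
        CacheStatus stored, DateTimeOffset lastSuccessfulRefresh, DateTimeOffset now)
    {
        // Only Healthy entries are aged; every other status is returned as stored.
        if (stored != CacheStatus.Healthy)
        {
            return stored;
        }

        return now - lastSuccessfulRefresh > MaxHealthyAge
            ? CacheStatus.Stale
            : CacheStatus.Healthy;
    }
}
```

Computing the status at read time, rather than writing `Stale` back into the entry, means a wedged refresh loop is visible even though nothing is running to update the cache.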
### First-load behavior ### First-load behavior
If a client calls `DiscoverHierarchy` before the background service has If a client calls `DiscoverHierarchy` before the background service has
@@ -156,7 +170,10 @@ working across that gap, the cache persists its dataset to disk:
- On the **first** refresh after startup, before any SQL runs, the cache - On the **first** refresh after startup, before any SQL runs, the cache
reloads that file. The restored data is served with `Stale` status — reloads that file. The restored data is served with `Stale` status —
it is last-known data, not live — so clients can browse immediately even it is last-known data, not live — so clients can browse immediately even
when the Galaxy database is unreachable. when the Galaxy database is unreachable. The restore also publishes a deploy
event through `IGalaxyDeployNotifier`, so a `WatchDeployEvents` subscriber that
attaches before the first live query still sees the restored snapshot's deploy
state.
- The first live query then reconciles: if it observes the **same** - The first live query then reconciles: if it observes the **same**
`time_of_last_deploy` the snapshot was saved at, the entry is promoted to `time_of_last_deploy` the snapshot was saved at, the entry is promoted to
`Healthy` with no heavy re-query (the snapshot is provably current); if it `Healthy` with no heavy re-query (the snapshot is provably current); if it
@@ -349,6 +366,25 @@ Component breakdown:
override per object. `HierarchySql` still matches the OtOpcUa original; override per object. `HierarchySql` still matches the OtOpcUa original;
`AttributesSql` does not — it additionally enumerates built-in primitive `AttributesSql` does not — it additionally enumerates built-in primitive
attributes (see [Built-in vs configured attributes](#built-in-vs-configured-attributes)). attributes (see [Built-in vs configured attributes](#built-in-vs-configured-attributes)).
`HierarchySql` restricts the result to a fixed allow-list of object categories
via `WHERE td.category_id IN (1, 3, 4, 10, 11, 13, 17, 24, 26)` — the same set
the dashboard's `ResolveCategoryName` map names. Categories outside this set
(for example, internal framework objects) are never browsed. The mapping:
| `category_id` | Name |
|---|---|
| 1 | WinPlatform |
| 3 | AppEngine |
| 4 | InTouchViewApp |
| 10 | UserDefined |
| 11 | FieldReference |
| 13 | Area |
| 17 | DIObject |
| 24 | DDESuiteLinkClient |
| 26 | OPCClient |
Any other category id renders as `Category {id}` in the dashboard.
- `GalaxyHierarchyCache` - `GalaxyHierarchyCache`
(`src/ZB.MOM.WW.MxGateway.Server/Galaxy/GalaxyHierarchyCache.cs`) holds the most (`src/ZB.MOM.WW.MxGateway.Server/Galaxy/GalaxyHierarchyCache.cs`) holds the most
recent immutable `GalaxyHierarchyCacheEntry` (materialized objects + recent immutable `GalaxyHierarchyCacheEntry` (materialized objects +
@@ -384,7 +420,7 @@ Bound to `MxGateway:Galaxy` via `GalaxyRepositoryOptions`.
| Option | Default | Description | | Option | Default | Description |
|--------|---------|-------------| |--------|---------|-------------|
| `MxGateway:Galaxy:ConnectionString` | `Server=localhost;Database=ZB;Integrated Security=True;TrustServerCertificate=True;Encrypt=False;` | SQL Server connection string for the Galaxy Repository. Integrated Security against `localhost` is the dev default; production deployments should override this through the standard double-underscore environment variable form, e.g. `MxGateway__Galaxy__ConnectionString`. | | `MxGateway:Galaxy:ConnectionString` | `Server=localhost;Database=ZB;Integrated Security=True;TrustServerCertificate=True;Encrypt=False;` | SQL Server connection string for the Galaxy Repository. Integrated Security against `localhost` is the dev default; production deployments should override this through the standard double-underscore environment variable form, e.g. `MxGateway__Galaxy__ConnectionString`. |
| `MxGateway:Galaxy:CommandTimeoutSeconds` | `60` | Per-command SQL timeout. Applies to all three RPCs. | | `MxGateway:Galaxy:CommandTimeoutSeconds` | `60` | Per-command SQL timeout applied to every SQL command the repository runs (the connectivity probe, the deploy-time poll, and the hierarchy and attribute queries), which back all five Galaxy RPCs. |
| `MxGateway:Galaxy:PersistSnapshot` | `true` | Persists each successful browse dataset to disk and reloads it at startup. See [On-disk snapshot](#on-disk-snapshot). | | `MxGateway:Galaxy:PersistSnapshot` | `true` | Persists each successful browse dataset to disk and reloads it at startup. See [On-disk snapshot](#on-disk-snapshot). |
| `MxGateway:Galaxy:SnapshotCachePath` | `C:\ProgramData\MxGateway\galaxy-snapshot.json` | File path for the persisted browse snapshot. Ignored when `PersistSnapshot` is `false`. | | `MxGateway:Galaxy:SnapshotCachePath` | `C:\ProgramData\MxGateway\galaxy-snapshot.json` | File path for the persisted browse snapshot. Ignored when `PersistSnapshot` is `false`. |
@@ -400,7 +436,8 @@ unparsed connection string text.
## Authorization ## Authorization
All four Galaxy RPCs (including `WatchDeployEvents`) require the All five Galaxy RPCs (`TestConnection`, `GetLastDeployTime`,
`DiscoverHierarchy`, `WatchDeployEvents`, and `BrowseChildren`) require the
`metadata:read` API-key scope. Browse is read-only metadata, equivalent in `metadata:read` API-key scope. Browse is read-only metadata, equivalent in
privilege to `MxCommandKind.GetSessionState` or `MxCommandKind.GetWorkerInfo`. privilege to `MxCommandKind.GetSessionState` or `MxCommandKind.GetWorkerInfo`.
The mapping lives in `GatewayGrpcScopeResolver`; see The mapping lives in `GatewayGrpcScopeResolver`; see
@@ -419,17 +456,17 @@ embedded in the status detail.
The gateway's Blazor dashboard surfaces a Galaxy summary in two places: The gateway's Blazor dashboard surfaces a Galaxy summary in two places:
- An overview card on `/dashboard` showing connectivity status, last deploy - An overview card on `/` showing connectivity status, last deploy
timestamp, object count (with area count), attribute total, historized and timestamp, object count (with area count), attribute total, historized and
alarm counts, and last successful refresh. alarm counts, and last successful refresh.
- A dedicated `/dashboard/galaxy` page with object-category and top-template - A dedicated `/galaxy` page with object-category and top-template
breakdowns plus a Sync Info table covering last successful refresh, last breakdowns plus a Sync Info table covering last successful refresh, last
attempt, refresh interval, redacted connection string, and command timeout. attempt, refresh interval, redacted connection string, and command timeout.
Both views are projected from the same `IGalaxyHierarchyCache` that backs the Both views are projected from the same `IGalaxyHierarchyCache` that backs the
gRPC service. The dashboard does not run its own refresh — when the gRPC service. The dashboard does not run its own refresh — when the
background `GalaxyHierarchyRefreshService` updates the cache, both the background `GalaxyHierarchyRefreshService` updates the cache, both the
overview card and the `/dashboard/galaxy` page pick up the new state on the overview card and the `/galaxy` page pick up the new state on the
next dashboard tick. When SQL is unreachable, the cache retains the previous next dashboard tick. When SQL is unreachable, the cache retains the previous
data and flips `Status` to `Stale` or `Unavailable`; the dashboard surfaces data and flips `Status` to `Stale` or `Unavailable`; the dashboard surfaces
that as a yellow or red status badge plus the truncated error. that as a yellow or red status badge plus the truncated error.
+50 -4
@@ -18,6 +18,19 @@ paths, timeouts, queue sizes, enum values, or protocol values are invalid.
"PepperSecretName": "MxGateway:ApiKeyPepper", "PepperSecretName": "MxGateway:ApiKeyPepper",
"RunMigrationsOnStartup": true "RunMigrationsOnStartup": true
}, },
"Ldap": {
"Enabled": true,
"Server": "localhost",
"Port": 3893,
"Transport": "None",
"AllowInsecure": true,
"SearchBase": "dc=zb,dc=local",
"ServiceAccountDn": "cn=serviceaccount,dc=zb,dc=local",
"ServiceAccountPassword": "serviceaccount123",
"UserNameAttribute": "cn",
"DisplayNameAttribute": "cn",
"GroupAttribute": "memberOf"
},
"Worker": { "Worker": {
"ExecutablePath": "src\\ZB.MOM.WW.MxGateway.Worker\\bin\\x86\\Release\\ZB.MOM.WW.MxGateway.Worker.exe", "ExecutablePath": "src\\ZB.MOM.WW.MxGateway.Worker\\bin\\x86\\Release\\ZB.MOM.WW.MxGateway.Worker.exe",
"WorkingDirectory": null, "WorkingDirectory": null,
@@ -52,7 +65,7 @@ paths, timeouts, queue sizes, enum values, or protocol values are invalid.
"RecentSessionLimit": 200, "RecentSessionLimit": 200,
"ShowTagValues": false, "ShowTagValues": false,
"GroupToRole": { "GroupToRole": {
"GwAdmin": "Admin", "GwAdmin": "Administrator",
"GwReader": "Viewer" "GwReader": "Viewer"
} }
}, },
@@ -93,6 +106,39 @@ Environment variables use the normal .NET double-underscore form. For example,
When `Mode` is `ApiKey`, `SqlitePath` and `PepperSecretName` must be present. When `Mode` is `ApiKey`, `SqlitePath` and `PepperSecretName` must be present.
`SqlitePath` must be a valid filesystem path. `SqlitePath` must be a valid filesystem path.
## Ldap Options
The `MxGateway:Ldap` section configures the dashboard's LDAP login (the gRPC API
uses API keys, not LDAP — see [Authentication](./Authentication.md)). The same
section is bound twice: the runtime bind/search is performed by the shared
`ZB.MOM.WW.Auth.Ldap` provider wired up by `AddZbLdapAuth`, while the gateway's
own `LdapOptions` shadow exists only for startup validation, the redacted
effective-config display, and the dev/default values. The two stay
field-compatible so the one section binds onto both. The gateway ships
dev-friendly defaults (plaintext localhost); the shared provider's own defaults
are secure-by-default.
| Option | Default | Description |
|--------|---------|-------------|
| `MxGateway:Ldap:Enabled` | `true` | Enables LDAP-backed dashboard login. When `false`, the rest of the section is not validated and LDAP login is not wired up. |
| `MxGateway:Ldap:Server` | `localhost` | LDAP server host. Required when `Enabled`. |
| `MxGateway:Ldap:Port` | `3893` | LDAP server port. Must be a valid port (1-65535). |
| `MxGateway:Ldap:Transport` | `None` | Transport/TLS mode. One of `None` (plaintext), `StartTls` (upgrade a plaintext connection to TLS), or `Ldaps` (TLS from connect). Replaces the former boolean `UseTls`. |
| `MxGateway:Ldap:AllowInsecure` | `true` | Allows plaintext LDAP connections. Must be `true` when `Transport` is `None`; setting `Transport=None` with `AllowInsecure=false` fails validation. |
| `MxGateway:Ldap:SearchBase` | `dc=zb,dc=local` | Search base distinguished name for user lookup. Required when `Enabled`. |
| `MxGateway:Ldap:ServiceAccountDn` | `cn=serviceaccount,dc=zb,dc=local` | Service account DN used to bind before searching for the logging-in user. Required when `Enabled`. Redacted in the effective-config display. |
| `MxGateway:Ldap:ServiceAccountPassword` | `serviceaccount123` | Service account bind password. Required when `Enabled`. Never logged; redacted in the effective-config display. |
| `MxGateway:Ldap:UserNameAttribute` | `cn` | Attribute matched against the login user name (the dev GLAuth directory keys users by `cn`, not `uid`). Required when `Enabled`. |
| `MxGateway:Ldap:DisplayNameAttribute` | `cn` | Attribute read for the user's display name. Required when `Enabled`. |
| `MxGateway:Ldap:GroupAttribute` | `memberOf` | Attribute read for the user's group membership. The resulting group names are mapped to dashboard roles by `MxGateway:Dashboard:GroupToRole`. Required when `Enabled`. |
When `Enabled` is `true`, `Server`, `SearchBase`, `ServiceAccountDn`,
`ServiceAccountPassword`, `UserNameAttribute`, `DisplayNameAttribute`, and
`GroupAttribute` must be non-blank, `Port` must be valid, and `AllowInsecure`
must be `true` whenever `Transport` is `None`. Group-to-role mapping lives in the
dashboard section; see `MxGateway:Dashboard:GroupToRole` below and
[glauth.md](../glauth.md).
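
The validation rules above can be sketched as follows. This is a minimal illustration in Python, not the gateway's .NET implementation: the function name `validate_ldap_options`, the `DEFAULTS` dictionary shape, and the error strings are hypothetical, while the defaults and rules are taken from the table.

```python
# Hypothetical sketch of the MxGateway:Ldap validation rules described above.
# Names and error messages are illustrative; only the rules come from the table.

DEFAULTS = {
    "Enabled": True,
    "Server": "localhost",
    "Port": 3893,
    "Transport": "None",
    "AllowInsecure": True,
    "SearchBase": "dc=zb,dc=local",
    "ServiceAccountDn": "cn=serviceaccount,dc=zb,dc=local",
    "ServiceAccountPassword": "serviceaccount123",
    "UserNameAttribute": "cn",
    "DisplayNameAttribute": "cn",
    "GroupAttribute": "memberOf",
}

REQUIRED_WHEN_ENABLED = (
    "Server",
    "SearchBase",
    "ServiceAccountDn",
    "ServiceAccountPassword",
    "UserNameAttribute",
    "DisplayNameAttribute",
    "GroupAttribute",
)


def validate_ldap_options(overrides: dict) -> list:
    """Return validation errors for the effective options; empty means valid."""
    effective = {**DEFAULTS, **overrides}

    # When LDAP is disabled, the rest of the section is not validated.
    if not effective["Enabled"]:
        return []

    errors = []
    for key in REQUIRED_WHEN_ENABLED:
        value = effective[key]
        if not isinstance(value, str) or not value.strip():
            errors.append(f"{key} must be non-blank when Enabled is true")

    port = effective["Port"]
    if isinstance(port, bool) or not isinstance(port, int) or not 1 <= port <= 65535:
        errors.append("Port must be between 1 and 65535")

    # Plaintext transport is accepted only when explicitly allowed.
    if effective["Transport"] == "None" and not effective["AllowInsecure"]:
        errors.append("Transport=None requires AllowInsecure=true")

    return errors
```

With the shipped defaults the sketch reports no errors; flipping `AllowInsecure` to `false` while leaving `Transport` at `None` yields the single transport error, which mirrors the failure case called out in the table.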
## Worker Options ## Worker Options
| Option | Default | Description | | Option | Default | Description |
@@ -153,7 +199,7 @@ the affected stream while the MXAccess session remains active.
| `MxGateway:Dashboard:RecentFaultLimit` | `100` | Maximum number of fault summaries projected into each dashboard snapshot. | | `MxGateway:Dashboard:RecentFaultLimit` | `100` | Maximum number of fault summaries projected into each dashboard snapshot. |
| `MxGateway:Dashboard:RecentSessionLimit` | `200` | Maximum number of session summaries projected into each dashboard snapshot. | | `MxGateway:Dashboard:RecentSessionLimit` | `200` | Maximum number of session summaries projected into each dashboard snapshot. |
| `MxGateway:Dashboard:ShowTagValues` | `false` | Reserved display control for tag values. The dashboard does not show full tag values by default. | | `MxGateway:Dashboard:ShowTagValues` | `false` | Reserved display control for tag values. The dashboard does not show full tag values by default. |
| `MxGateway:Dashboard:GroupToRole` | _(empty)_ | LDAP group → dashboard role mapping. Keys are LDAP group names (short CN or full DN — leading-RDN match). Values must be `Admin` (read/write, API-key CRUD) or `Viewer` (read-only). A user whose LDAP groups don't intersect this map cannot sign in; with no mapping at all, only the loopback bypass admits anyone. | | `MxGateway:Dashboard:GroupToRole` | _(empty)_ | LDAP group → dashboard role mapping. Keys are LDAP group names (short CN or full DN — leading-RDN match). Values must be `Administrator` (read/write, API-key CRUD) or `Viewer` (read-only). A user whose LDAP groups don't intersect this map cannot sign in; with no mapping at all, only the loopback bypass admits anyone. |
`SnapshotIntervalMilliseconds` must be greater than zero. `RecentFaultLimit` `SnapshotIntervalMilliseconds` must be greater than zero. `RecentFaultLimit`
and `RecentSessionLimit` must be greater than or equal to zero. and `RecentSessionLimit` must be greater than or equal to zero.
@@ -166,10 +212,10 @@ users) but practical deployments populate at least one Admin group.
Three authorization policies are registered out of these options: Three authorization policies are registered out of these options:
- `MxGateway.Dashboard.Viewer` — gates the Razor component routes. Satisfied by - `MxGateway.Dashboard.Viewer` — gates the Razor component routes. Satisfied by
either dashboard role (Admin or Viewer), by `AllowAnonymousLocalhost` on either dashboard role (Administrator or Viewer), by `AllowAnonymousLocalhost` on
loopback, or by `Authentication.Mode = Disabled`. loopback, or by `Authentication.Mode = Disabled`.
- `MxGateway.Dashboard.Admin` — gates write-capable surfaces (API-key CRUD). - `MxGateway.Dashboard.Admin` — gates write-capable surfaces (API-key CRUD).
Satisfied only by the Admin role (same environmental bypasses). Satisfied only by the Administrator role (same environmental bypasses).
- `MxGateway.Dashboard.HubClients` — attached to the SignalR hubs. Accepts - `MxGateway.Dashboard.HubClients` — attached to the SignalR hubs. Accepts
either the dashboard cookie scheme or the `MxGateway.Dashboard.HubToken` either the dashboard cookie scheme or the `MxGateway.Dashboard.HubToken`
bearer scheme (used by SignalR's WebSocket upgrade path where the HttpOnly bearer scheme (used by SignalR's WebSocket upgrade path where the HttpOnly
+123 -41
@@ -9,11 +9,13 @@ statistics in real time.
## Technology Choice ## Technology Choice
Decision: Blazor Server with Bootstrap CSS/JS. Decision: Blazor Server with the shared `ZB.MOM.WW.Theme` kit layered over
Bootstrap CSS/JS.
Allowed UI stack: Allowed UI stack:
- ASP.NET Core Blazor Server, - ASP.NET Core Blazor Server,
- the `ZB.MOM.WW.Theme` kit (layout chassis, status components, design tokens),
- Bootstrap CSS, - Bootstrap CSS,
- Bootstrap JavaScript, - Bootstrap JavaScript,
- small local CSS for layout and status styling, - small local CSS for layout and status styling,
@@ -30,7 +32,35 @@ Not allowed for v1:
Rationale: Blazor Server keeps the dashboard in the gateway process, avoids a Rationale: Blazor Server keeps the dashboard in the gateway process, avoids a
separate frontend build, and gives real-time UI updates through the Blazor separate frontend build, and gives real-time UI updates through the Blazor
SignalR circuit. Bootstrap is sufficient for a basic dashboard. SignalR circuit. The `ZB.MOM.WW.Theme` kit gives the dashboard the same chassis,
status vocabulary, and visual identity as the other ZB.MOM.WW operations UIs
without re-implementing layout and status styling per project.
## Theme Kit
The dashboard depends on the shared `ZB.MOM.WW.Theme` NuGet package
(version `0.2.0`, referenced in `ZB.MOM.WW.MxGateway.Server.csproj`). The kit is
a Razor Class Library that ships the technical-light design system: a layout
chassis, a small set of UI components, the design tokens, and the head/script
asset wiring. The dashboard takes its chrome and status presentation from the
kit and adds only its own pages and view CSS on top.
Components and assets used:
| Kit member | Role in the dashboard |
|---|---|
| `<ThemeShell>` | The application chassis — vertical side rail (brand, hamburger, responsive collapse) plus a content area. `MainLayout.razor` wraps it and supplies `Nav`, `RailFooter`, and `ChildContent` slots. |
| `<NavRailSection>` / `<NavRailItem>` | Grouped navigation items in the rail. Section expand/collapse persistence is owned by the kit (`<details>` + `ThemeScripts`); the app runs no JS interop for it. |
| `<LoginCard>` | The centered login card on `Login.razor`. Renders a native static `<form method="post" action="/login">` so the submit reaches the minimal-API endpoint rather than a Blazor event. |
| `<StatusPill State="…">` | The status chip. `StatusBadge.razor` is a thin adapter that maps domain state text to one of four `StatusState` values (`Ok`, `Warn`, `Bad`, `Idle`) and renders this pill. |
| `<ThemeHead/>` | Loaded in `App.razor`'s `<head>`; injects the kit's `theme.css` and related head assets. |
| `<ThemeScripts/>` | Loaded at the end of `App.razor`'s `<body>`; supplies the rail's interactive behavior. |
| Token system | `theme.css` defines all design tokens (`var(--card)`, `var(--ink)`, `var(--accent)`, `var(--mono)`, the state colors, etc.). The local `site.css` references these tokens and defines no hard-coded colors. |
The dependency on this kit is the reason the layout shell, navigation, status
chips, and tokens differ from a stock Bootstrap dashboard. See
[Dashboard Interface Design](./DashboardInterfaceDesign.md) for how the kit's
tokens and components shape the visual language.
## Hosting Model ## Hosting Model
@@ -67,8 +97,8 @@ Endpoint layout:
The `/galaxy` page surfaces the Galaxy Repository browse summary The `/galaxy` page surfaces the Galaxy Repository browse summary
(deployed object hierarchy size, last deploy timestamp, attribute totals, (deployed object hierarchy size, last deploy timestamp, attribute totals,
template usage, and connectivity sync info). The summary is fed by template usage, and connectivity sync info). The summary is fed by
`GalaxySummaryCache`, which is refreshed off the request path by `GalaxyHierarchyCache`, which is refreshed off the request path by
`GalaxySummaryRefreshService` on the `GalaxyHierarchyRefreshService` on the
`MxGateway:Galaxy:DashboardRefreshIntervalSeconds` cadence so the dashboard `MxGateway:Galaxy:DashboardRefreshIntervalSeconds` cadence so the dashboard
never blocks on SQL. See [Galaxy Repository Browse](./GalaxyRepository.md) for never blocks on SQL. See [Galaxy Repository Browse](./GalaxyRepository.md) for
the underlying gRPC service. the underlying gRPC service.
@@ -79,24 +109,31 @@ the underlying gRPC service.
ZB.MOM.WW.MxGateway.Server ZB.MOM.WW.MxGateway.Server
Dashboard/ Dashboard/
Components/ Components/
App.razor App.razor (loads <ThemeHead/> / <ThemeScripts/>)
Routes.razor Routes.razor
DashboardPageBase.cs DashboardPageBase.cs
DashboardDisplay.cs DashboardDisplay.cs
Layout/ Layout/
DashboardLayout.razor MainLayout.razor (ThemeShell side-rail chassis)
LoginLayout.razor (minimal, no rail; hosts <LoginCard>)
Pages/ Pages/
DashboardHome.razor DashboardHome.razor
Login.razor
SessionsPage.razor SessionsPage.razor
SessionDetailsPage.razor SessionDetailsPage.razor
WorkersPage.razor WorkersPage.razor
EventsPage.razor EventsPage.razor
AlarmsPage.razor
GalaxyPage.razor
BrowsePage.razor
ApiKeysPage.razor ApiKeysPage.razor
SettingsPage.razor SettingsPage.razor
Shared/ Shared/
MetricCard.razor MetricCard.razor
StatusBadge.razor StatusBadge.razor (adapter over kit <StatusPill>)
FaultList.razor FaultList.razor
BrowseTreeNodeView.razor
ConfirmDialog.razor
DashboardSnapshotService.cs DashboardSnapshotService.cs
DashboardAuthorizationHandler.cs DashboardAuthorizationHandler.cs
DashboardAuthenticator.cs DashboardAuthenticator.cs
@@ -244,10 +281,14 @@ Show:
- admin Close session / Kill worker controls (Admin role only). - admin Close session / Kill worker controls (Admin role only).
The Sessions list, the Workers list, and this details page all render the same The Sessions list, the Workers list, and this details page all render the same
admin controls when the signed-in principal carries the `Admin` role; viewers admin controls when the signed-in principal carries the `Administrator` role; viewers
and the localhost-anonymous bypass see no action affordances and the server and the localhost-anonymous bypass see no action affordances and the server
re-checks the role on every invocation. Every destructive admin action is re-checks the role on every invocation. Every destructive admin action is
gated by a confirmation dialog before it reaches `ISessionManager`. gated by the shared `ConfirmDialog` component before it reaches
`ISessionManager`. `ConfirmDialog` is a reusable Bootstrap modal (title,
message, confirm/cancel buttons, and a busy state that disables both buttons
while the action runs); each page binds its open state and confirm/cancel
callbacks. The API keys page uses the same component.
- **Close session** routes through `ISessionManager.CloseSessionAsync`: the - **Close session** routes through `ISessionManager.CloseSessionAsync`: the
worker is asked to shut down gracefully and is killed only as a fallback if worker is asked to shut down gracefully and is killed only as a fallback if
@@ -288,8 +329,9 @@ it opt-in and redacted.
### Browse page ### Browse page
`/dashboard/browse` lets an operator explore the Galaxy tag hierarchy and watch `/browse` lets an operator explore the Galaxy tag hierarchy and watch
live values. The tree is built in-process by `DashboardBrowseTreeBuilder` from live values. The tree is built in-process by the static
`DashboardBrowseTreeBuilder` (in `DashboardBrowseModel.cs`) from
`IGalaxyHierarchyCache.Current` — the same cache the Galaxy page reads — so a `IGalaxyHierarchyCache.Current` — the same cache the Galaxy page reads — so a
render costs no gRPC call and no SQL round-trip. Each node shows its child render costs no gRPC call and no SQL round-trip. Each node shows its child
objects and, when expanded, its attributes with attribute name, data type objects and, when expanded, its attributes with attribute name, data type
@@ -306,8 +348,11 @@ diagnostic session/worker views.
### Alarms page ### Alarms page
`/dashboard/alarms` lists the alarms the gateway's central alarm monitor `/alarms` lists the alarms the gateway's central alarm monitor
currently holds as Active or ActiveAcked, refreshed every three seconds. It currently holds as Active or ActiveAcked. The page injects
`IDashboardLiveDataService` and drives a `PeriodicTimer` poll loop that calls
`QueryAlarmsAsync` every three seconds, rather than subscribing to the snapshot
hub or holding a `CurrentAlarms` reference directly. It
defaults to showing unacknowledged `Active` alarms; filters add acknowledged defaults to showing unacknowledged `Active` alarms; filters add acknowledged
alarms and narrow by area, severity range, and a reference/source/description alarms and narrow by area, severity range, and a reference/source/description
text search. Cleared alarms are not retained — the gateway holds no text search. Cleared alarms are not retained — the gateway holds no
@@ -335,7 +380,7 @@ the monitor never starts and the cache stays empty.
### API keys page ### API keys page
`/dashboard/apikeys` lists the gateway's API keys and, for authorized `/apikeys` lists the gateway's API keys and, for authorized
operators, manages them. It reads key metadata through the same operators, manages them. It reads key metadata through the same
`IApiKeyAdminStore` the `apikey` CLI uses, so the dashboard and the CLI act `IApiKeyAdminStore` the `apikey` CLI uses, so the dashboard and the CLI act
on one source of truth. on one source of truth.
@@ -358,7 +403,7 @@ for what each constraint means and how it is enforced on the gRPC path.
Create, Rotate, Revoke, and Delete controls render only when the signed-in Create, Rotate, Revoke, and Delete controls render only when the signed-in
user is authorized. `DashboardApiKeyAuthorization.CanManage` requires an user is authorized. `DashboardApiKeyAuthorization.CanManage` requires an
authenticated principal carrying the `Admin` role claim (resolved at login authenticated principal carrying the `Administrator` role claim (resolved at login
from the user's LDAP groups via `MxGateway:Dashboard:GroupToRole`). A from the user's LDAP groups via `MxGateway:Dashboard:GroupToRole`). A
`Viewer` role can read the table but sees no action controls, and an `Viewer` role can read the table but sees no action controls, and an
anonymous localhost session shows the same read-only view. anonymous localhost session shows the same read-only view.
@@ -385,10 +430,11 @@ Create and Rotate return the assembled `mxgw_<keyId>_<secret>` token **once**,
in a one-time banner. It is never shown again, so the operator must copy it in a one-time banner. It is never shown again, so the operator must copy it
immediately. This mirrors the `apikey create-key` / `rotate-key` CLI. immediately. This mirrors the `apikey create-key` / `rotate-key` CLI.
Every management action appends an `api_key_audit` entry Every management action writes an entry to the canonical `audit_event` store
(`dashboard-create-key`, `dashboard-rotate-key`, `dashboard-revoke-key`, through `IAuditWriter` (`dashboard-create-key`, `dashboard-rotate-key`,
`dashboard-delete-key`) with the key id and the caller's remote address. `dashboard-revoke-key`, `dashboard-delete-key`) with the key id, the caller's
Secrets and pepper values are never logged. remote address, and a correlation id. Secrets and pepper values are never
logged.
### Settings page ### Settings page
@@ -408,23 +454,33 @@ Do not show API key secrets or pepper values.
Dashboard authentication is LDAP-backed, distinct from the API-key model used Dashboard authentication is LDAP-backed, distinct from the API-key model used
on the gRPC API. Users sign in with directory credentials; the gateway maps on the gRPC API. Users sign in with directory credentials; the gateway maps
their LDAP groups to one of two dashboard roles (`Admin` or `Viewer`) and their LDAP groups to one of two dashboard roles (`Administrator` or `Viewer`) and
issues a cookie carrying those role claims. issues a cookie carrying those role claims.
Implemented behavior: Implemented behavior:
- a static `/login` HTML form posts username/password to the gateway; - `GET /login` is served by the `[AllowAnonymous]` Blazor `Login.razor`
- `DashboardAuthenticator` binds against `MxGateway:Ldap` (service-account bind, component (under `LoginLayout`), which renders the shared kit's `<LoginCard>`.
user search, candidate bind) using `Novell.Directory.Ldap.NETStandard`; `LoginCard` emits a native static `<form method="post" action="/login">`
- the user's `memberOf` (or short CN) is matched against (username, password, hidden returnUrl) plus an `<AntiforgeryToken/>`. A native
`MxGateway:Dashboard:GroupToRole`; the resolved role(s) are emitted as form submit is not a Blazor event, so it reaches the minimal-API `POST /login`
`ClaimTypes.Role` claims, alongside the per-group `mxgateway:ldap_group` endpoint regardless of the app's InteractiveServer render mode;
claims; - `DashboardAuthenticator` delegates bind/search to the shared
- a successful login signs in the `MxGateway.Dashboard` cookie scheme `ZB.MOM.WW.Auth.Ldap` provider, registered by `AddZbLdapAuth(configuration,
(`__Host-MxGatewayDashboard`, HttpOnly, SameSite=Strict, Secure); "MxGateway:Ldap")`. The provider performs a service-account bind, user search,
then candidate bind, and fails closed;
- the user's group membership (stripped to its first RDN by the provider) is
matched against `MxGateway:Dashboard:GroupToRole`; the resolved role(s) are
emitted as `ClaimTypes.Role` claims, alongside the per-group
`mxgateway:ldap_group` claims;
- a successful login signs in the `MxGateway.Dashboard` cookie scheme. The
cookie defaults to the name `MxGatewayDashboard` (HttpOnly, SameSite=Strict,
Secure) and can be overridden via `MxGateway:Dashboard:CookieName`;
- a user with no matching group cannot sign in — the login screen returns the - a user with no matching group cannot sign in — the login screen returns the
generic credential-rejected message; generic credential-rejected message via `/login?error=…`;
- antiforgery tokens guard the login and logout POSTs. - antiforgery tokens guard the login and logout POSTs. `POST /logout` (and a
`GET /logout` convenience redirect) sign the cookie out and return to
`/login`.
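
The group-to-role step in the list above can be sketched as follows. This is a hedged illustration in Python, not the `ZB.MOM.WW.Auth.Ldap` provider's code: the helper names `leading_rdn_value` and `resolve_roles` are hypothetical, the comma split ignores escaped commas in DNs, and exact (case-sensitive) matching is an assumption the source does not confirm.

```python
# Hypothetical sketch of leading-RDN group matching against GroupToRole.
# Assumptions: exact, case-sensitive comparison; no escaped commas in DNs.


def leading_rdn_value(group: str) -> str:
    """Return the value of a DN's first RDN, or the input if it is a short name."""
    first = group.split(",", 1)[0].strip()
    if "=" in first:
        return first.split("=", 1)[1].strip()
    return first


def resolve_roles(ldap_groups, group_to_role) -> list:
    """Map a user's LDAP groups to dashboard roles; empty means no sign-in."""
    # Map keys may be a short CN or a full DN, so normalise them the same way.
    mapping = {leading_rdn_value(key): role for key, role in group_to_role.items()}
    roles = set()
    for group in ldap_groups:
        name = leading_rdn_value(group)
        if name in mapping:
            roles.add(mapping[name])
    return sorted(roles)
```

A user whose groups produce an empty list here corresponds to the "no matching group cannot sign in" case in the list above.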
Three authorization policies are registered: Three authorization policies are registered:
@@ -443,8 +499,8 @@ Viewer role.
### Hub bearer flow ### Hub bearer flow
SignalR connections cannot reuse the `__Host-` cookie when the JS client SignalR connections cannot reuse the `MxGatewayDashboard` cookie when the JS
upgrades to WebSocket — the cookie's `SameSite=Strict; Path=/` keeps it from client upgrades to WebSocket — the cookie's `SameSite=Strict; Path=/` keeps it from
being forwarded by the browser's WebSocket layer in some edge cases. The being forwarded by the browser's WebSocket layer in some edge cases. The
dashboard mints short-lived bearer tokens for the connection: dashboard mints short-lived bearer tokens for the connection:
@@ -480,8 +536,10 @@ Effective configuration:
"RecentFaultLimit": 100, "RecentFaultLimit": 100,
"RecentSessionLimit": 200, "RecentSessionLimit": 200,
"ShowTagValues": false, "ShowTagValues": false,
"CookieName": null,
"RequireHttpsCookie": true,
"GroupToRole": { "GroupToRole": {
"GwAdmin": "Admin", "GwAdmin": "Administrator",
"GwReader": "Viewer" "GwReader": "Viewer"
} }
} }
@@ -489,6 +547,15 @@ Effective configuration:
} }
``` ```
Two cookie keys tune the auth cookie:
- `CookieName` overrides the cookie name. Null or blank keeps the canonical
default `MxGatewayDashboard`, so a misconfiguration cannot leave the cookie
unnamed.
- `RequireHttpsCookie` (default `true`) sets the cookie `SecurePolicy` to
`Always`. Set it to `false` for dev HTTP deployments, which relaxes the policy
to `SameAsRequest`.
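
How the two keys combine can be sketched as below. This is an illustrative Python sketch, not the gateway's code: `resolve_cookie_settings` and the returned dictionary are hypothetical, and only the default name and the `Always` / `SameAsRequest` policy values come from the text above.

```python
# Hypothetical sketch of how CookieName and RequireHttpsCookie resolve.

DEFAULT_COOKIE_NAME = "MxGatewayDashboard"


def resolve_cookie_settings(cookie_name, require_https_cookie=True) -> dict:
    """Return the effective cookie name and secure policy."""
    # Null or blank falls back to the canonical default name.
    if cookie_name is None or not cookie_name.strip():
        name = DEFAULT_COOKIE_NAME
    else:
        name = cookie_name
    policy = "Always" if require_https_cookie else "SameAsRequest"
    return {"name": name, "secure_policy": policy}
```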
See [Gateway Configuration](./GatewayConfiguration.md#dashboard-options) for See [Gateway Configuration](./GatewayConfiguration.md#dashboard-options) for
the full option table and the policies/hubs that derive from these values. the full option table and the policies/hubs that derive from these values.
@@ -504,17 +571,31 @@ the full option table and the policies/hubs that derive from these values.
## Styling ## Styling
The dashboard serves Bootstrap 5.3.3 assets from Styling is layered. From base to top:
`src/ZB.MOM.WW.MxGateway.Server/wwwroot/lib/bootstrap/` and local layout/status styling
from `src/ZB.MOM.WW.MxGateway.Server/wwwroot/css/dashboard.css`. 1. Bootstrap 5.3.3 assets served from
`src/ZB.MOM.WW.MxGateway.Server/wwwroot/lib/bootstrap/`.
2. The `ZB.MOM.WW.Theme` kit's `theme.css` (the technical-light design system),
which owns the design tokens and the kit component styles. `App.razor` loads
it through the kit's `<ThemeHead/>` component, and pairs it with
`<ThemeScripts/>` at the end of `<body>` for the rail's interactive behavior.
3. The local view stylesheet
`src/ZB.MOM.WW.MxGateway.Server/wwwroot/css/site.css`, which wires the
dashboard's own class names and Bootstrap widgets onto the kit tokens. It
defines no hard-coded colors.
The minimal `/denied` page is rendered outside the Blazor circuit, so it loads
the kit CSS directly from the static-web-asset path
(`/_content/ZB.MOM.WW.Theme/css/theme.css` and `…/layout.css`) plus Bootstrap
and `site.css`.
Recommended visual language: Recommended visual language:
- compact tables, - compact tables,
- status badges, - the kit `StatusPill` for state,
- metric cards, - metric cards,
- Bootstrap alerts for faults, - Bootstrap alerts for faults,
- restrained colors, - restrained colors drawn from the kit tokens,
- no decorative hero sections, - no decorative hero sections,
- no charting dependency for v1. - no charting dependency for v1.
@@ -530,7 +611,7 @@ Dashboard unit/component tests should cover:
- snapshot projection, - snapshot projection,
- dashboard auth authorization decisions, - dashboard auth authorization decisions,
- login API-key validation behavior, - login LDAP bind and group-to-role mapping behavior,
- pages render with empty state, - pages render with empty state,
- pages render with active sessions, - pages render with active sessions,
- pages render with faulted sessions, - pages render with faulted sessions,
@@ -557,7 +638,8 @@ Integration tests should verify:
The first dashboard slice implements: The first dashboard slice implements:
1. Blazor Server hosting in `ZB.MOM.WW.MxGateway.Server`. 1. Blazor Server hosting in `ZB.MOM.WW.MxGateway.Server`.
2. local Bootstrap static assets. 2. local Bootstrap static assets plus the `ZB.MOM.WW.Theme` kit layer
(chassis, tokens, status components).
3. dashboard configuration binding. 3. dashboard configuration binding.
4. dashboard auth using LDAP bind + role-mapped HTTP-only cookie. 4. dashboard auth using LDAP bind + role-mapped HTTP-only cookie.
5. `DashboardSnapshotService` projecting gateway state for read views. 5. `DashboardSnapshotService` projecting gateway state for read views.
+16 -10
@@ -247,12 +247,17 @@ Technology:
Suggested routes: Suggested routes:
```text ```text
/dashboard /
/dashboard/sessions /login
/dashboard/sessions/{sessionId} /sessions
/dashboard/workers /sessions/{sessionId}
/dashboard/events /workers
/dashboard/settings /events
/alarms
/galaxy
/browse
/apikeys
/settings
``` ```
Dashboard pages: Dashboard pages:
@@ -681,13 +686,14 @@ Dashboard authentication uses LDAP bind + role mapping (separate from the
API-key model used on the gRPC API). The login endpoint accepts username and API-key model used on the gRPC API). The login endpoint accepts username and
password in a form post, calls `DashboardAuthenticator` to bind against password in a form post, calls `DashboardAuthenticator` to bind against
`MxGateway:Ldap`, resolves the user's LDAP groups through `MxGateway:Ldap`, resolves the user's LDAP groups through
`MxGateway:Dashboard:GroupToRole` to one of `Admin` / `Viewer`, and signs in `MxGateway:Dashboard:GroupToRole` to one of `Administrator` / `Viewer`, and signs in
with the `MxGateway.Dashboard` cookie scheme. The cookie is HTTP-only, with the `MxGateway.Dashboard` cookie scheme. The cookie is HTTP-only,
secure, strict SameSite, and named `__Host-MxGatewayDashboard`. Logout secure, strict SameSite, and named `MxGatewayDashboard` (configurable via
`MxGateway:Dashboard:CookieName`). Logout
clears it. Login and logout posts validate antiforgery tokens. SignalR clears it. Login and logout posts validate antiforgery tokens. SignalR
connections additionally accept a 30-minute data-protected bearer minted at connections additionally accept a 30-minute data-protected bearer minted at
`/hubs/token`. `Dashboard:AllowAnonymousLocalhost` permits loopback requests `/hubs/token`. `MxGateway:Dashboard:AllowAnonymousLocalhost` permits loopback
to bypass the cookie requirement and defaults to `true`. requests to bypass the cookie requirement and defaults to `true`.
Recommended scopes: Recommended scopes:
+12 -1
@@ -100,6 +100,17 @@ Optional live smoke variables:
| `MXGATEWAY_LIVE_MXACCESS_WRITE_SECURED_USER` | `admin` | ArchestrA user name passed to `AuthenticateUser` before the `WriteSecured` parity step. | | `MXGATEWAY_LIVE_MXACCESS_WRITE_SECURED_USER` | `admin` | ArchestrA user name passed to `AuthenticateUser` before the `WriteSecured` parity step. |
| `MXGATEWAY_LIVE_MXACCESS_WRITE_SECURED_PASSWORD` | `admin123` | Password paired with the user above. Never logged; the test asserts the value does not appear in the WriteSecured diagnostic message. | | `MXGATEWAY_LIVE_MXACCESS_WRITE_SECURED_PASSWORD` | `admin123` | Password paired with the user above. Never logged; the test asserts the value does not appear in the WriteSecured diagnostic message. |
When `MXGATEWAY_LIVE_MXACCESS_WORKER_EXE` is unset, the integration harness
locates the worker by resolving the repository root: `ResolveRepositoryRoot`
walks parent directories from the test binary looking for a directory that
contains a `src` subdirectory and has either a `.git` marker beside it or a
`*.sln` / `*.slnx` file inside `src`. Accepting either marker lets the resolution
work both in a checked-out repository and in an extracted copy that ships no `.git`
folder. If the walk exhausts without a match, it throws `InvalidOperationException`
naming the start directory and the expected markers; set
`MXGATEWAY_LIVE_MXACCESS_WORKER_EXE` to point directly at a worker executable and
bypass repository-root resolution entirely.
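
The upward walk described above can be sketched as follows. This is a Python illustration under stated assumptions, not the harness's `ResolveRepositoryRoot`: the function name `resolve_repository_root` is hypothetical, the marker check follows one reading of the rule (a `src` directory plus either a sibling `.git` or a solution file inside `src`), and `RuntimeError` stands in for the `InvalidOperationException` the real code throws.

```python
# Hypothetical sketch of repository-root resolution by walking parent directories.
from pathlib import Path


def resolve_repository_root(start: Path) -> Path:
    """Return the nearest ancestor (or start itself) that looks like the repo root."""
    start = start.resolve()
    for candidate in (start, *start.parents):
        src = candidate / "src"
        if not src.is_dir():
            continue
        has_git = (candidate / ".git").exists()
        has_solution = any(src.glob("*.sln")) or any(src.glob("*.slnx"))
        if has_git or has_solution:
            return candidate
    # The real harness throws InvalidOperationException; this is a stand-in.
    raise RuntimeError(
        f"Could not find the repository root above {start}: expected a 'src' "
        "directory next to a '.git' marker or containing a *.sln / *.slnx file."
    )
```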
The test output includes session id, worker process id, command status, The test output includes session id, worker process id, command status,
HRESULT/status diagnostics, event sequence and handles, close status, and worker HRESULT/status diagnostics, event sequence and handles, close status, and worker
stdout/stderr lines emitted during the run. stdout/stderr lines emitted during the run.
@@ -320,7 +331,7 @@ writes its `{"error":...}` envelope and the loop continues; the harness treats
that envelope as the operation failure (used by the parity and auth phases). that envelope as the operation failure (used by the parity and auth phases).
Before the per-client phases run, the script builds the .NET CLI Before the per-client phases run, the script builds the .NET CLI
(`dotnet build`) and installs the Java CLI (`gradle :mxgateway-cli:installDist`) (`dotnet build`) and installs the Java CLI (`gradle :zb-mom-ww-mxgateway-cli:installDist`)
once, so the `batch` process launches straight from the compiled exe / the once, so the `batch` process launches straight from the compiled exe / the
installed launcher. The Go, Rust, and Python batch processes are launched via installed launcher. The Go, Rust, and Python batch processes are launched via
`go run` / `cargo run` / `python -m`, which compile-or-start once when that `go run` / `cargo run` / `python -m`, which compile-or-start once when that
+9 -4
@@ -10,7 +10,7 @@ The layer is composed of four collaborators:
| Type | Lifetime | Role | | Type | Lifetime | Role |
|------|----------|------| |------|----------|------|
| `MxAccessGatewayService` | scoped (gRPC) | Implements the six `MxAccessGateway` RPCs, performs exception mapping. | | `MxAccessGatewayService` | scoped (gRPC) | Implements the seven `MxAccessGateway` RPCs, performs exception mapping. |
| `MxAccessGrpcRequestValidator` | singleton | Rejects malformed requests before any session work runs. | | `MxAccessGrpcRequestValidator` | singleton | Rejects malformed requests before any session work runs. |
| `MxAccessGrpcMapper` | singleton | Converts public proto types to internal `WorkerCommand`/`WorkerEvent` types and back. | | `MxAccessGrpcMapper` | singleton | Converts public proto types to internal `WorkerCommand`/`WorkerEvent` types and back. |
| `IEventStreamService` (`EventStreamService`) | singleton | Owns the event stream pipeline, including bounded queue and backpressure handling. | | `IEventStreamService` (`EventStreamService`) | singleton | Owns the event stream pipeline, including bounded queue and backpressure handling. |
@@ -29,7 +29,7 @@ A second gRPC service, `GalaxyRepositoryGrpcService`, is mapped alongside it. It
## RPC Handlers ## RPC Handlers
`MxAccessGatewayService` derives from the generated `MxAccessGateway.MxAccessGatewayBase` and implements every RPC declared in `mxaccess_gateway.proto` — six in total: `OpenSession`, `CloseSession`, `Invoke`, `StreamEvents`, `AcknowledgeAlarm`, and `StreamAlarms`. The proto contract itself is documented in [Contracts](./Contracts.md); this section covers only what the server-side handler does on top of that contract. `MxAccessGatewayService` derives from the generated `MxAccessGateway.MxAccessGatewayBase` and implements every RPC declared in `mxaccess_gateway.proto` — seven in total: `OpenSession`, `CloseSession`, `Invoke`, `StreamEvents`, `AcknowledgeAlarm`, `StreamAlarms`, and `QueryActiveAlarms`. The proto contract itself is documented in [Contracts](./Contracts.md); this section covers only what the server-side handler does on top of that contract.
Public gRPC send and receive message sizes are configured from Public gRPC send and receive message sizes are configured from
`MxGateway:Protocol:MaxGrpcMessageBytes` (default 16 MiB). Official clients use `MxGateway:Protocol:MaxGrpcMessageBytes` (default 16 MiB). Official clients use
@@ -94,6 +94,10 @@ Carrying the enqueue timestamp into the worker layer is what lets queue-wait tim
`StreamAlarms` is a server-streaming, **session-less** RPC that attaches to the gateway's central alarm feed. The handler delegates to `IGatewayAlarmService.StreamAsync`. The stream opens with one `AlarmFeedMessage` carrying an `active_alarm` per currently-active alarm (the ConditionRefresh snapshot), then a single `snapshot_complete`, then a `transition` for every subsequent raise / acknowledge / clear. It is served by the always-on `GatewayAlarmMonitor`, which owns a single gateway-managed worker session and fans out to every attached client — clients no longer open a session of their own. `alarm_filter_prefix`, when set, scopes the stream to a sub-tree. `StreamAlarms` is a server-streaming, **session-less** RPC that attaches to the gateway's central alarm feed. The handler delegates to `IGatewayAlarmService.StreamAsync`. The stream opens with one `AlarmFeedMessage` carrying an `active_alarm` per currently-active alarm (the ConditionRefresh snapshot), then a single `snapshot_complete`, then a `transition` for every subsequent raise / acknowledge / clear. It is served by the always-on `GatewayAlarmMonitor`, which owns a single gateway-managed worker session and fans out to every attached client — clients no longer open a session of their own. `alarm_filter_prefix`, when set, scopes the stream to a sub-tree.
### `QueryActiveAlarms`
`QueryActiveAlarms` is a server-streaming, **session-less** RPC that returns a point-in-time snapshot of the alarm monitor's active-alarm cache. The handler iterates `IGatewayAlarmService.CurrentAlarms`, writing one `ActiveAlarmSnapshot` per active alarm, then completes — unlike `StreamAlarms` it emits no `snapshot_complete` sentinel and no transitions. When `alarm_filter_prefix` is non-empty, snapshots whose `alarm_full_reference` does not start with the prefix are skipped (ordinal match). Clients use it to seed or reconcile state after a reconnect; for a live feed they use `StreamAlarms`.
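
The prefix filtering described above can be sketched as below. This is a minimal Python illustration, not the handler's code: `query_active_alarms` and the dictionary-shaped snapshots are hypothetical stand-ins for `ActiveAlarmSnapshot` messages, and the alarm references in the usage note are invented sample values. Python's `str.startswith` compares code points, which matches the ordinal comparison the text calls for.

```python
# Hypothetical sketch of the QueryActiveAlarms snapshot filter.
from typing import Iterable, Iterator


def query_active_alarms(
    current_alarms: Iterable[dict], alarm_filter_prefix: str = ""
) -> Iterator[dict]:
    """Yield one snapshot per active alarm, skipping those outside the prefix."""
    for snapshot in current_alarms:
        reference = snapshot["alarm_full_reference"]
        # An empty prefix means no filtering; otherwise match ordinally.
        if alarm_filter_prefix and not reference.startswith(alarm_filter_prefix):
            continue
        yield snapshot
    # The stream simply ends here: no snapshot_complete sentinel, no transitions.
```

For example, given snapshots for the invented references `Area1.Tank.HiAlarm` and `Area2.Pump.Fault`, a prefix of `Area1.` yields only the first, and a prefix of `area1.` yields nothing because the match is case-sensitive.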
## Validation Rules ## Validation Rules
`MxAccessGrpcRequestValidator` rejects requests with `StatusCode.InvalidArgument` before any session work happens. The rules are intentionally narrow — anything that requires session state (for example, "session does not exist") is left for `ISessionManager` so the validator can stay synchronous and side-effect free. `MxAccessGrpcRequestValidator` rejects requests with `StatusCode.InvalidArgument` before any session work happens. The rules are intentionally narrow — anything that requires session state (for example, "session does not exist") is left for `ISessionManager` so the validator can stay synchronous and side-effect free.
@@ -106,6 +110,7 @@ Carrying the enqueue timestamp into the worker layer is what lets queue-wait tim
| `Invoke` | `session_id` non-empty, `command` present, `kind` not `Unspecified`, payload oneof must match `kind`. | `InvalidArgument` | | `Invoke` | `session_id` non-empty, `command` present, `kind` not `Unspecified`, payload oneof must match `kind`. | `InvalidArgument` |
| `AcknowledgeAlarm` | `alarm_full_reference` must be non-empty. Validated inline in the handler, not by `MxAccessGrpcRequestValidator`. | `InvalidArgument` | | `AcknowledgeAlarm` | `alarm_full_reference` must be non-empty. Validated inline in the handler, not by `MxAccessGrpcRequestValidator`. | `InvalidArgument` |
| `StreamAlarms` | No required fields — `alarm_filter_prefix` is optional. | — | | `StreamAlarms` | No required fields — `alarm_filter_prefix` is optional. | — |
| `QueryActiveAlarms` | No required fields — `alarm_filter_prefix` is optional. | — |
The payload-vs-kind check matters because the `MxCommand.payload` oneof is non-discriminated on the wire — a misaligned client could send `kind = Write` with a `Register` payload and silently confuse the worker. The validator turns that into a clear client error: The payload-vs-kind check matters because the `MxCommand.payload` oneof is non-discriminated on the wire — a misaligned client could send `kind = Write` with a `Register` payload and silently confuse the worker. The validator turns that into a clear client error:
@@ -145,7 +150,7 @@ public WorkerCommand MapCommand(MxCommandRequest request)
When the worker reply or event payload is missing, the mapper returns a synthetic public message with `ProtocolStatusCode.ProtocolViolation` (for replies) or a sentinel `MxEvent` with `MxEventFamily.Unspecified` (for events). The gateway never relays a partial frame to clients — anything missing is reported as a protocol violation against the worker, not a transport error against the client. When the worker reply or event payload is missing, the mapper returns a synthetic public message with `ProtocolStatusCode.ProtocolViolation` (for replies) or a sentinel `MxEvent` with `MxEventFamily.Unspecified` (for events). The gateway never relays a partial frame to clients — anything missing is reported as a protocol violation against the worker, not a transport error against the client.
The mapper also exposes static factory methods for every `ProtocolStatusCode` (`Ok`, `InvalidRequest`, `SessionNotFound`, `SessionNotReady`, `WorkerUnavailable`, `Timeout`, `Canceled`, `ProtocolViolation`) so that handlers and tests can produce status payloads without duplicating the enum-to-string mapping. The mapper also exposes static factory methods for most `ProtocolStatusCode` values (`Ok`, `InvalidRequest`, `SessionNotFound`, `SessionNotReady`, `WorkerUnavailable`, `Timeout`, `Canceled`, `ProtocolViolation`) so that handlers and tests can produce status payloads without duplicating the enum-to-string mapping. There is intentionally no factory for `MxAccessFailure` (the ninth enum value): that code is set by the worker on the reply payload to report an MXAccess-side failure, not synthesized by the gateway mapper.
## Exception to Status Mapping ## Exception to Status Mapping
@@ -224,7 +229,7 @@ if (!writer.TryWrite(publicEvent))
} }
``` ```
Under `FailFast` the session is faulted so subsequent commands return `FailedPrecondition`; the client must reopen. Under the default policy only the stream is dropped and the session continues to accept commands, leaving recovery to the client (typically a fresh `StreamEvents` call with an updated `AfterWorkerSequence`). Either way, the consumer side observes `StatusCode.ResourceExhausted` via the `EventQueueOverflow` mapping above. `FailFast` is the **default** policy (`Events:BackpressurePolicy`): on overflow the whole session is faulted, so subsequent commands return `FailedPrecondition` and the client must reopen. This is deliberate — the default refuses to silently drop MXAccess events. The non-default `DisconnectSubscriber` policy drops only the slow stream and leaves the session accepting commands, leaving recovery to the client (typically a fresh `StreamEvents` call with an updated `AfterWorkerSequence`). Either way, the consumer side observes `StatusCode.ResourceExhausted` via the `EventQueueOverflow` mapping above.
### Cancellation and cleanup ### Cancellation and cleanup
+84 -28
@@ -94,9 +94,11 @@ Expected protected environment values:
```text ```text
MXGATEWAY_WORKER_NONCE=<random nonce> MXGATEWAY_WORKER_NONCE=<random nonce>
MXGATEWAY_WORKER_LOG_CONTEXT=<optional context>
``` ```
The nonce travels through the environment rather than the command line so it
never appears in process-listing tools that expose argument vectors.
Startup sequence: Startup sequence:
1. Parse command-line arguments. 1. Parse command-line arguments.
@@ -114,16 +116,26 @@ Startup sequence:
If validation fails before MXAccess creation, exit quickly with a non-zero exit If validation fails before MXAccess creation, exit quickly with a non-zero exit
code. If MXAccess creation fails, send `WorkerFault` when possible and exit. code. If MXAccess creation fails, send `WorkerFault` when possible and exit.
The bootstrap layer returns structured exit codes before it creates pipes, `WorkerApplication.Run` returns one of the structured `WorkerExitCode` values.
starts the STA, or touches MXAccess: Codes `2`–`4` are produced by the bootstrap parse phase before any pipe, STA, or
MXAccess work happens; codes `5`–`6` and a clean `0` only become reachable once
the parse succeeds and the worker runs its pipe session:
| Exit code | Name | Meaning | | Exit code | Name | Meaning |
|-----------|------|---------| |-----------|------|---------|
| `0` | `Success` | Required bootstrap options are valid. | | `0` | `Success` | The pipe session ran to a clean close. |
| `1` | `UnexpectedFailure` | A non-bootstrap exception reaches the process boundary. | | `1` | `UnexpectedFailure` | A non-bootstrap exception reaches the process boundary. |
| `2` | `InvalidArguments` | Required arguments are missing or unknown arguments are present. | | `2` | `InvalidArguments` | Required arguments are missing or unknown arguments are present. |
| `3` | `InvalidProtocolVersion` | `--protocol-version` is not numeric or does not match the supported worker protocol. | | `3` | `InvalidProtocolVersion` | `--protocol-version` is not numeric or does not match the supported worker protocol. |
| `4` | `MissingNonce` | `MXGATEWAY_WORKER_NONCE` is absent or empty. | | `4` | `MissingNonce` | `MXGATEWAY_WORKER_NONCE` is absent or empty. |
| `5` | `PipeConnectionFailed` | The pipe connection raised an `IOException` or `TimeoutException`. |
| `6` | `ProtocolViolation` | A `WorkerFrameProtocolException` escaped the pipe session. |
`WorkerBootstrapResult.Succeeded` is a separate parse-phase gate: it reports
whether argument parsing produced usable `WorkerOptions`. A `false` result
carries one of codes `2`–`4` and the worker exits before running a session, so a
successful parse is distinct from the `0` exit code, which only follows a clean
pipe-session close.
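The split between the parse-phase gate and the session-phase exit codes can be sketched like this. The enum values mirror the table above; everything else (`WorkerExitCodeSketch`, `FrameProtocolExceptionSketch`, `WorkerExitCodeMapSketch.Run`, and the delegate-based shape) is a hypothetical illustration and is not taken from `WorkerApplication.Run`.

```csharp
using System;
using System.IO;

// Values mirror the exit-code table; the type name is a stand-in for WorkerExitCode.
public enum WorkerExitCodeSketch
{
    Success = 0,
    UnexpectedFailure = 1,
    InvalidArguments = 2,
    InvalidProtocolVersion = 3,
    MissingNonce = 4,
    PipeConnectionFailed = 5,
    ProtocolViolation = 6,
}

// Hypothetical stand-in for WorkerFrameProtocolException.
public sealed class FrameProtocolExceptionSketch : Exception
{
    public FrameProtocolExceptionSketch(string message) : base(message) { }
}

public static class WorkerExitCodeMapSketch
{
    // parseFailure is null when parsing produced usable options
    // (the role the text assigns to WorkerBootstrapResult.Succeeded).
    public static WorkerExitCodeSketch Run(WorkerExitCodeSketch? parseFailure, Action runPipeSession)
    {
        if (parseFailure.HasValue)
        {
            // Codes 2-4: exit before any pipe, STA, or MXAccess work.
            return parseFailure.Value;
        }

        try
        {
            runPipeSession();
            return WorkerExitCodeSketch.Success; // clean pipe-session close
        }
        catch (IOException)
        {
            return WorkerExitCodeSketch.PipeConnectionFailed;
        }
        catch (TimeoutException)
        {
            return WorkerExitCodeSketch.PipeConnectionFailed;
        }
        catch (FrameProtocolExceptionSketch)
        {
            return WorkerExitCodeSketch.ProtocolViolation;
        }
        catch (Exception)
        {
            return WorkerExitCodeSketch.UnexpectedFailure;
        }
    }
}
```

The point of the shape is that `Success` can only be returned from inside the session branch, so a successful parse alone never produces exit code `0`.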
Bootstrap logs use `WorkerConsoleLogger` key/value output. `WorkerLogRedactor` Bootstrap logs use `WorkerConsoleLogger` key/value output. `WorkerLogRedactor`
redacts fields whose names indicate nonce, secret, password, token, redacts fields whose names indicate nonce, secret, password, token,
@@ -133,30 +145,35 @@ credential, or API key values before the message is written.
```text ```text
ZB.MOM.WW.MxGateway.Worker ZB.MOM.WW.MxGateway.Worker
Program Program (calls WorkerApplication.Run)
WorkerApplication (parse, bootstrap, run pipe session, map exit code)
Bootstrap Bootstrap
WorkerOptionsParser (parse args + env into WorkerOptions)
WorkerOptions WorkerOptions
WorkerHost WorkerBootstrapResult (parse outcome + WorkerExitCode)
WorkerExitCode
WorkerConsoleLogger / WorkerLogRedactor
Ipc Ipc
PipeClient WorkerPipeClient (named-pipe connect + retry, owns the session)
FrameReader WorkerPipeSession (handshake, read/write/drain/heartbeat loops)
FrameWriter WorkerFrameReader / WorkerFrameWriter
WorkerProtocol WorkerEnvelopeValidator
WorkerContractInfo (protocol version + descriptor names)
Sta Sta
StaRuntime StaRuntime (the dedicated STA thread + message pump loop)
StaCommandQueue StaCommandDispatcher
MessagePump StaMessagePump
StaWatchdog
MxAccess MxAccess
MxAccessSession MxAccessStaSession (IWorkerRuntimeSession over the STA)
MxAccessCommandDispatcher MxAccessSession (handle registry + COM-call orchestration)
MxAccessEventSink MxAccessCommandExecutor (IStaCommandExecutor; runs commands on the STA)
MxAccessBaseEventSink (OnDataChange tag-data events)
MxAccessHandleRegistry MxAccessHandleRegistry
(alarm subsystem — see below)
Conversion Conversion
VariantConverter VariantConverter (MxValue <-> COM VARIANT, both directions)
SafeArrayConverter MxStatusProxyConverter
StatusProxyConverter HResultConverter / HResultConversion
HResultMapper
``` ```
## Threading Model ## Threading Model
@@ -251,7 +268,7 @@ The loop should update a heartbeat timestamp after:
- processing an MXAccess event. - processing an MXAccess event.
`StaRuntime` implements this runtime boundary in the worker. It starts one `StaRuntime` implements this runtime boundary in the worker. It starts one
background thread named `ZB.MOM.WW.MxGateway.Worker.STA`, sets it to `ApartmentState.STA`, background thread named `MxGateway.Worker.STA`, sets it to `ApartmentState.STA`,
initializes COM through `StaComApartmentInitializer`, and runs initializes COM through `StaComApartmentInitializer`, and runs
`StaMessagePump`. Commands are scheduled through `InvokeAsync`; the command `StaMessagePump`. Commands are scheduled through `InvokeAsync`; the command
queue signals an `AutoResetEvent` so `MsgWaitForMultipleObjectsEx` can wake the queue signals an `AutoResetEvent` so `MsgWaitForMultipleObjectsEx` can wake the
@@ -330,13 +347,19 @@ cleanup path completes.
## Event Sink ## Event Sink
The worker must subscribe to every public MXAccess event family: The worker subscribes to every public MXAccess event family through
`MxAccessBaseEventSink`:
- `OnDataChange` - `OnDataChange`
- `OnWriteComplete` - `OnWriteComplete`
- `OperationComplete` - `OperationComplete`
- `OnBufferedDataChange` - `OnBufferedDataChange`
Alarm transitions arrive on a separate path. They do not originate from the
`LMXProxyServerClass` connection points, so `MxAccessAlarmEventSink` (driven by
the alarm subsystem below) feeds them onto the same `MxAccessEventQueue` rather
than `MxAccessBaseEventSink`.
Forward these event families only when the native MXAccess COM object raises Forward these event families only when the native MXAccess COM object raises
them. Do not synthesize `OperationComplete` from write completion or command them. Do not synthesize `OperationComplete` from write completion or command
status. `OnBufferedDataChange` must be represented in the protocol now, but status. `OnBufferedDataChange` must be represented in the protocol now, but
@@ -368,16 +391,49 @@ type on buffered events. `OperationComplete` is only emitted from the native
`MxAccessEventQueue` is the bounded outbound event queue for one worker `MxAccessEventQueue` is the bounded outbound event queue for one worker
session. It assigns the monotonic `WorkerSequence` and `WorkerTimestamp` when an session. It assigns the monotonic `WorkerSequence` and `WorkerTimestamp` when an
event is accepted, preserving the order in which MXAccess handlers enqueue event is accepted, preserving the order in which MXAccess handlers enqueue
events. The default capacity is `10000`. When the queue reaches capacity it events. The default capacity is `10000`. When the queue reaches capacity, `Enqueue`
records a `WorkerFaultCategory.QueueOverflow` fault and rejects further events. records a `WorkerFaultCategory.QueueOverflow` fault and then throws
The event handler catches conversion and enqueue failures, records the first `MxAccessEventQueueOverflowException` so the caller cannot silently drop the
fault on the queue, and returns to the STA message pump instead of writing to event. The event handler catches conversion and enqueue failures (including this
the pipe. overflow exception), records the first fault on the queue, and returns to the
STA message pump instead of writing to the pipe.
If event conversion throws, catch it inside the event handler, record a If event conversion throws, catch it inside the event handler, record a
structured `WorkerFault`, and keep the worker alive only if the fault policy structured `WorkerFault`, and keep the worker alive only if the fault policy
allows it. allows it.
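The record-then-throw overflow behavior described for the event queue can be sketched as below. This is an illustrative model under stated assumptions, not `MxAccessEventQueue` itself: the type names, the lock-based implementation, and the boolean fault flag are hypothetical; only the bounded capacity, the monotonic sequence and timestamp stamping, and the fault-then-exception order come from the text.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for MxAccessEventQueueOverflowException.
public sealed class EventQueueOverflowExceptionSketch : Exception
{
    public EventQueueOverflowExceptionSketch(int capacity)
        : base("Event queue reached its capacity of " + capacity + ".") { }
}

public sealed class BoundedEventQueueSketch<T>
{
    private readonly Queue<(long Sequence, DateTimeOffset Timestamp, T Event)> _items =
        new Queue<(long Sequence, DateTimeOffset Timestamp, T Event)>();
    private readonly object _gate = new object();
    private readonly int _capacity;
    private long _lastSequence;

    public BoundedEventQueueSketch(int capacity = 10000)
    {
        if (capacity <= 0) throw new ArgumentOutOfRangeException(nameof(capacity));
        _capacity = capacity;
    }

    // Stands in for the QueueOverflow fault recorded on the queue.
    public bool OverflowFaultRecorded { get; private set; }

    public int Count
    {
        get { lock (_gate) { return _items.Count; } }
    }

    // Stamps the sequence and timestamp only when the event is accepted,
    // so accepted events keep a gap-free monotonic order.
    public long Enqueue(T evt)
    {
        lock (_gate)
        {
            if (_items.Count >= _capacity)
            {
                OverflowFaultRecorded = true; // record the fault first...
                throw new EventQueueOverflowExceptionSketch(_capacity); // ...then refuse loudly
            }

            long sequence = ++_lastSequence;
            _items.Enqueue((sequence, DateTimeOffset.UtcNow, evt));
            return sequence;
        }
    }
}
```

Throwing after recording the fault means a caller that forgets to check the queue state still cannot drop an event without noticing.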
## Alarm Subsystem
Alarms come from a different COM surface than tag data, so the worker carries a
separate pipeline rather than folding alarms into `MxAccessBaseEventSink`. The
MXAccess `LMXProxyServerClass` does not expose alarm subscription, so the worker
hosts AVEVA's standalone alarm-consumer COM object instead.
- `WnWrapAlarmConsumer` is the production `IMxAccessAlarmConsumer`, backed by
`WNWRAPCONSUMERLib.wwAlarmConsumerClass`. It returns the active alarm set as a
BSTR XML string through `GetXmlCurrentAlarms2`, which avoids the FILETIME→
`DateTime` marshaling that crashed the earlier managed alarm client. The CLSID
is registered `ThreadingModel=Apartment`, so the consumer is created and
driven entirely on the worker's STA. It owns no internal timer.
- `MxAccessStaSession` drives the **STA alarm poll loop**: `RunAlarmPollLoopAsync`
awaits a fixed `500 ms` interval and then calls `IAlarmCommandHandler.PollOnce`
on the STA via the runtime, so every `GetXmlCurrentAlarms2` call stays on the
apartment that owns the consumer. A poll failure is recorded as a
`WorkerFault` on the event queue rather than terminating the worker.
- `AlarmCommandHandler` owns one `AlarmDispatcher` per session and is the entry
point for the alarm IPC commands (`SubscribeAlarms`, `AcknowledgeAlarm` by GUID
or name, `QueryActiveAlarms`, `Unsubscribe`). It rejects a second subscribe
before an unsubscribe, mirroring the consumer's non-idempotent `Subscribe`.
- `AlarmDispatcher` wires the consumer's `AlarmTransitionEmitted` stream onto
`MxAccessAlarmEventSink.EnqueueTransition`. It maps state transitions through
`AlarmRecordTransitionMapper`, composes the canonical
`\\<machine>\Galaxy!<area>` full reference, and projects active-alarm
snapshots to `ActiveAlarmSnapshot` protos for the `QueryActiveAlarms` refresh
stream.
- `MxAccessAlarmEventSink` enqueues each decoded transition onto the shared
`MxAccessEventQueue` as a proto alarm-transition event, stamping the session
id, so alarms ride the same outbound IPC path as tag-data events.
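The poll loop in the list above can be sketched roughly as follows. It assumes the STA hop is hidden behind a delegate; `AlarmPollLoopSketch`, `pollOnceOnSta`, and `recordFault` are hypothetical names, and the real `RunAlarmPollLoopAsync` may differ in how it handles cancellation.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class AlarmPollLoopSketch
{
    // Waits one interval, then runs a single poll. A failed poll is handed
    // to recordFault and the loop continues, so one bad poll does not end it.
    public static async Task RunAsync(
        Func<Task> pollOnceOnSta,      // assumed to marshal PollOnce onto the STA
        Action<Exception> recordFault, // assumed to record a WorkerFault on the event queue
        TimeSpan interval,
        CancellationToken cancellationToken)
    {
        while (!cancellationToken.IsCancellationRequested)
        {
            try
            {
                await Task.Delay(interval, cancellationToken).ConfigureAwait(false);
            }
            catch (OperationCanceledException)
            {
                return; // the session is closing; stop polling quietly
            }

            try
            {
                await pollOnceOnSta().ConfigureAwait(false);
            }
            catch (Exception exception)
            {
                recordFault(exception);
            }
        }
    }
}
```

With the interval from the text, a caller would pass `TimeSpan.FromMilliseconds(500)`.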
## Command Queue ## Command Queue
The pipe reader converts `WorkerCommand` messages into `StaCommand` entries. The pipe reader converts `WorkerCommand` messages into `StaCommand` entries.
+45 -11
@@ -4,9 +4,9 @@ The sessions subsystem owns the in-memory representation of an active gateway-to
## Overview ## Overview
A session is the gateway-side handle that callers use to invoke worker commands, stream worker events, and tear the worker down. The subsystem is split between the per-session state machine (`GatewaySession`), an in-memory directory (`SessionRegistry`), the orchestrator that opens and closes sessions (`SessionManager`), the worker construction step (`SessionWorkerClientFactory`), and a hosted service that drains sessions during host shutdown (`SessionShutdownHostedService`). A session is the gateway-side handle that callers use to invoke worker commands, stream worker events, and tear the worker down. The subsystem is split between the per-session state machine (`GatewaySession`), an in-memory directory (`SessionRegistry`), the orchestrator that opens and closes sessions (`SessionManager`), the worker construction step (`SessionWorkerClientFactory`), a hosted service that sweeps expired leases (`SessionLeaseMonitorHostedService`), and a hosted service that drains sessions during host shutdown (`SessionShutdownHostedService`).
All four interfaces (`ISessionManager`, `ISessionRegistry`, `ISessionWorkerClientFactory`) plus `SessionShutdownHostedService` are wired as singletons by `SessionServiceCollectionExtensions.AddGatewaySessions`. The three interfaces (`ISessionManager`, `ISessionRegistry`, `ISessionWorkerClientFactory`) are wired as singletons, and both hosted services (`SessionLeaseMonitorHostedService`, `SessionShutdownHostedService`) are registered, by `SessionServiceCollectionExtensions.AddGatewaySessions`. The startup orphan-worker cleanup that runs before any session opens lives in the worker subsystem (`OrphanWorkerCleanupHostedService`); see [Gateway Restart and Orphan Cleanup](#gateway-restart-and-orphan-cleanup).
## Key Types ## Key Types
@@ -18,6 +18,8 @@ The session id is an opaque string in the form `session-{guid:N}` and the per-se
`SessionState` itself is the protobuf-generated enum from `ZB.MOM.WW.MxGateway.Contracts.Proto`, so it is shared between the gateway and clients on the wire. `SessionState` itself is the protobuf-generated enum from `ZB.MOM.WW.MxGateway.Contracts.Proto`, so it is shared between the gateway and clients on the wire.
`GatewaySession` also keeps an `_items` dictionary keyed by `(ServerHandle, ItemHandle)` mapping each subscribed item to its `SessionItemRegistration` (server handle, item handle, tag address). It is the gateway-side shadow of the items the worker has added, populated as `AddItem`-style commands succeed and pruned on `RemoveItem`. The shadow exists so the gateway can answer item lookups and clean up subscriptions without round-tripping the worker; the worker remains authoritative for the handles themselves (see [gateway.md](../gateway.md)).
```csharp ```csharp
public void TransitionTo(SessionState nextState) public void TransitionTo(SessionState nextState)
{ {
@@ -54,7 +56,7 @@ public void TransitionTo(SessionState nextState)
`CloseSessionAsync` and `KillWorkerAsync` are both end-of-life paths but differ in what they offer the worker: `CloseSessionAsync` and `KillWorkerAsync` are both end-of-life paths but differ in what they offer the worker:
- `CloseSessionAsync` is the graceful path: it calls `GatewaySession.CloseAsync`, which asks the worker to shut down via `IWorkerClient.ShutdownAsync` and only kills the process as a fallback if shutdown fails. - `CloseSessionAsync` is the graceful path: it calls `GatewaySession.CloseAsync`, which asks the worker to shut down via `IWorkerClient.ShutdownAsync` and only kills the process as a fallback if shutdown fails.
- `KillWorkerAsync` is the forceful path used by the dashboard's admin Kill button: it calls `GatewaySession.KillWorker` directly, which kills the worker process immediately with no graceful-shutdown attempt and transitions the session to `Closed`. - `KillWorkerAsync` is the forceful path used by the dashboard's admin Kill button: it calls `GatewaySession.KillWorkerWithCloseGateAsync`, which kills the worker process immediately with no graceful-shutdown attempt and transitions the session to `Closed`. Routing through `KillWorkerWithCloseGateAsync` (rather than the bare `GatewaySession.KillWorker`) acquires the per-session `_closeLock` so a kill and an in-flight graceful close serialize on the same "was the session already closed" observation that drives metric accounting; the method returns that observation so `KillWorkerAsync` increments `mxgateway.sessions.closed` at most once across concurrent callers.
Both paths converge on the same registry/metrics cleanup, so the open-session slot is released and `mxgateway.sessions.closed` is incremented either way. Both paths converge on the same registry/metrics cleanup, so the open-session slot is released and `mxgateway.sessions.closed` is incremented either way.
@@ -99,6 +101,8 @@ if (exception is OperationCanceledException
The named pipe is created with `maxNumberOfServerInstances: 1` so a second worker cannot connect to the same pipe name even if the first launch is still pending. Combined with the per-session nonce passed to the worker, this is the gateway's defense against a foreign process answering a pipe. The named pipe is created with `maxNumberOfServerInstances: 1` so a second worker cannot connect to the same pipe name even if the first launch is still pending. Combined with the per-session nonce passed to the worker, this is the gateway's defense against a foreign process answering a pipe.
The factory also seeds the worker client's `MaxPendingCommands` from `MxGateway:Sessions:MaxPendingCommandsPerSession` (default 128, validated `> 0` at startup). This caps how many commands can be in flight to a single worker at once; the `WorkerClient` rejects an enqueue past the cap and records `mxgateway.queues.overflows` tagged `worker-pending-commands`. The bound exists because the worker executes commands serially on one STA — an unbounded backlog would only grow memory and latency, not throughput.
### SessionShutdownHostedService ### SessionShutdownHostedService
`SessionShutdownHostedService` is an `IHostedService` whose only job is to call `ISessionManager.ShutdownAsync` from `StopAsync`. It catches `OperationCanceledException` triggered by the host shutdown timeout and logs a warning so that an over-running shutdown does not surface as an unhandled exception. `SessionShutdownHostedService` is an `IHostedService` whose only job is to call `ISessionManager.ShutdownAsync` from `StopAsync`. It catches `OperationCanceledException` triggered by the host shutdown timeout and logs a warning so that an over-running shutdown does not surface as an unhandled exception.
@@ -172,6 +176,14 @@ catch (Exception exception)
await session.DisposeAsync().ConfigureAwait(false); await session.DisposeAsync().ConfigureAwait(false);
} }
// If SessionOpened() already incremented the open-session gauge,
// a failure after that point (e.g. auto-subscribe rejection) must
// decrement it again so mxgateway.sessions.open does not leak.
if (sessionOpenedRecorded)
{
_metrics.SessionRemoved();
}
ReleaseSessionSlot(); ReleaseSessionSlot();
_metrics.Fault(SessionManagerErrorCode.OpenFailed.ToString()); _metrics.Fault(SessionManagerErrorCode.OpenFailed.ToString());
_logger.LogWarning( _logger.LogWarning(
@@ -186,7 +198,7 @@ catch (Exception exception)
} }
``` ```
The order — fault, deregister, dispose, release slot, record metric, log, rethrow — matters because releasing the semaphore before disposal would let the next open race the worker process tear-down on the same machine. The order — fault, deregister, dispose, conditionally decrement the open-session gauge, release slot, record fault metric, log, rethrow — matters because releasing the semaphore before disposal would let the next open race the worker process tear-down on the same machine. The `SessionRemoved()` call is conditional on `sessionOpenedRecorded` (Server-006): a failure *after* `SessionOpened()` already incremented `mxgateway.sessions.open` (for example, an auto-subscribe rejection) must decrement the gauge so it does not leak, but a failure before that point must not.
### Run ### Run
@@ -194,6 +206,8 @@ While `Ready`, callers reach the worker through `SessionManager.InvokeAsync` or
Event streaming uses `AttachEventSubscriber` which returns a disposable lease. When `allowMultipleSubscribers` is false the second attach throws `EventSubscriberAlreadyActive`; this prevents two gRPC streams from racing on the same worker event channel. Active event subscribers keep the session lease from expiring until the stream is disposed. Event streaming uses `AttachEventSubscriber` which returns a disposable lease. When `allowMultipleSubscribers` is false the second attach throws `EventSubscriberAlreadyActive`; this prevents two gRPC streams from racing on the same worker event channel. Active event subscribers keep the session lease from expiring until the stream is disposed.
The single-subscriber rule is enforced at startup, not just at runtime: setting `MxGateway:Sessions:AllowMultipleEventSubscribers` to `true` is refused by `GatewayOptionsValidator` with "AllowMultipleEventSubscribers is not supported until event fan-out is implemented," so the gateway fails fast rather than booting in a configuration the event path cannot honor. Multi-subscriber fan-out is explicitly out of scope for v1 (see [Design Decisions](./DesignDecisions.md)).
Sessions open with `MxGateway:Sessions:DefaultLeaseSeconds` (default 1800) added to the open timestamp. Unary client activity refreshes the lease by the same duration. `ExtendLease` and `IsLeaseExpired` cooperate with `SessionManager.CloseExpiredLeasesAsync`, which iterates a registry snapshot and closes any session whose lease has expired with `LeaseExpiredReason`. `SessionLeaseMonitorHostedService` runs that sweep every `MxGateway:Sessions:LeaseSweepIntervalSeconds` seconds (default 30). Sessions open with `MxGateway:Sessions:DefaultLeaseSeconds` (default 1800) added to the open timestamp. Unary client activity refreshes the lease by the same duration. `ExtendLease` and `IsLeaseExpired` cooperate with `SessionManager.CloseExpiredLeasesAsync`, which iterates a registry snapshot and closes any session whose lease has expired with `LeaseExpiredReason`. `SessionLeaseMonitorHostedService` runs that sweep every `MxGateway:Sessions:LeaseSweepIntervalSeconds` seconds (default 30).
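The lease arithmetic above can be sketched as a small value holder. This is a simplified model, not `GatewaySession`: `SessionLeaseSketch` is a hypothetical type, the clock is passed in explicitly for clarity, and treating a lease as expired at exactly its deadline is an assumption the text does not settle. The rule that active event subscribers hold the lease open is deliberately left out.

```csharp
using System;

public sealed class SessionLeaseSketch
{
    private readonly TimeSpan _duration;
    private DateTimeOffset _expiresAt;

    // duration corresponds to DefaultLeaseSeconds (default 1800 in the text).
    public SessionLeaseSketch(DateTimeOffset openedAt, TimeSpan duration)
    {
        _duration = duration;
        _expiresAt = openedAt + duration;
    }

    // Client activity pushes the deadline out by the same duration.
    public void Extend(DateTimeOffset now)
    {
        _expiresAt = now + _duration;
    }

    // Assumption: the lease counts as expired once the deadline is reached.
    public bool IsExpired(DateTimeOffset now)
    {
        return now >= _expiresAt;
    }
}
```

A periodic sweep would then call `IsExpired` for each session in a registry snapshot and close the ones that return `true`.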
### Close ### Close
@@ -227,11 +241,11 @@ if (_workerClient is not null)
If both graceful shutdown and the kill fall-back fail, the original and kill exceptions are bundled into an `AggregateException` and surfaced as `SessionCloseStartedException`. `SessionManager.CloseSessionCoreAsync` then translates that into a `SessionManagerException` with `CloseFailed` and removes the session. If both graceful shutdown and the kill fall-back fail, the original and kill exceptions are bundled into an `AggregateException` and surfaced as `SessionCloseStartedException`. `SessionManager.CloseSessionCoreAsync` then translates that into a `SessionManagerException` with `CloseFailed` and removes the session.
`GatewaySession.KillWorker` is the unconditional forced-close path used by shutdown when graceful close itself throws, and also by `SessionManager.KillWorkerAsync` — the explicit kill path that the dashboard's admin Kill button invokes. `KillWorkerAsync` skips `WorkerClient.ShutdownAsync` entirely, so `KillCount` increments while `ShutdownCount` does not; the session is then removed from the registry and the open-session slot is released, identical to the cleanup that follows a successful `CloseSessionAsync`. `GatewaySession.KillWorker` is the unconditional forced-close path. `SessionManager.KillWorkerAsync` — the explicit kill path that the dashboard's admin Kill button invokes — no longer calls it directly; it routes through `GatewaySession.KillWorkerWithCloseGateAsync` so the kill takes the per-session `_closeLock`. That method skips `WorkerClient.ShutdownAsync` entirely and forces the worker process down via `IWorkerClient.Kill`, which records the `mxgateway.workers.killed` counter through `GatewayMetrics.WorkerKilled(reason)`. The session is then removed from the registry and the open-session slot is released, identical to the cleanup that follows a successful `CloseSessionAsync` (which increments `mxgateway.sessions.closed`). There is no separate `KillCount` / `ShutdownCount`: worker terminations are counted by `mxgateway.workers.killed` (tagged with the kill reason), and session closes by `mxgateway.sessions.closed`.
## Shutdown Coordination ## Shutdown Coordination
`SessionShutdownHostedService.StopAsync` calls `SessionManager.ShutdownAsync`, which closes every registered session with `GatewayShutdownReason`. The shutdown loop catches per-session exceptions, calls `KillWorker`, and removes the session so that one stuck worker cannot block the rest of the host: `SessionShutdownHostedService.StopAsync` calls `SessionManager.ShutdownAsync`, which closes every registered session with `GatewayShutdownReason`. The shutdown loop catches per-session exceptions and falls back to a forced kill so that one stuck worker cannot block the rest of the host. The fallback routes through `KillWorkerAsync` (not a bare `session.KillWorker`) so the kill takes the same close-gate and metric bookkeeping as the dashboard kill path (Server-046):
```csharp ```csharp
public async Task ShutdownAsync(CancellationToken cancellationToken) public async Task ShutdownAsync(CancellationToken cancellationToken)
@@ -248,21 +262,40 @@ public async Task ShutdownAsync(CancellationToken cancellationToken)
exception, exception,
"Graceful shutdown failed for session {SessionId}; killing worker.", "Graceful shutdown failed for session {SessionId}; killing worker.",
session.SessionId); session.SessionId);
// CloseSessionCoreAsync's inner SessionCloseStartedException catch normally
// removes and accounts the session; this fallback only fires for sessions
// still in the registry, and reuses KillWorkerAsync for identical bookkeeping.
if (_registry.TryGet(session.SessionId, out _)) if (_registry.TryGet(session.SessionId, out _))
{ {
session.KillWorker(GatewayShutdownReason); try
await RemoveSessionAsync(session).ConfigureAwait(false); {
await KillWorkerAsync(session.SessionId, GatewayShutdownReason, cancellationToken).ConfigureAwait(false);
}
catch (SessionManagerException killException)
{
_logger.LogWarning(
killException,
"Worker kill fallback failed for session {SessionId}.",
session.SessionId);
}
} }
} }
} }
} }
``` ```
Iterating over `Snapshot` rather than the live dictionary lets `RemoveSessionAsync` mutate the registry inside the loop without throwing. Iterating over `Snapshot` rather than the live dictionary lets the registry mutate inside the loop without throwing.
## Gateway Restart and Orphan Cleanup
A graceful shutdown drains sessions through `ShutdownAsync`, but a gateway crash or `Kill` leaves no chance to tear workers down. Those orphaned worker processes outlive the gateway that launched them, still holding their MXAccess COM instance and their named pipe. Because the pipe name encodes the *old* gateway PID, a fresh gateway will never reconnect to them — v1 deliberately does not reattach orphan workers (see [Design Decisions](./DesignDecisions.md)).
Instead, `OrphanWorkerCleanupHostedService` runs once on startup, before any session opens, and calls `OrphanWorkerTerminator.TerminateOrphans`. The terminator enumerates running processes matching the configured worker executable name, skips the current process, and kills any that it identifies as a leftover worker (matched against the configured executable path). Each kill records `mxgateway.workers.killed` tagged `OrphanStartupCleanup` and logs a warning. The sweep is best-effort: a failure to kill any one orphan (it may have already exited, or be inaccessible) is logged and swallowed so it cannot block gateway startup. This service lives in the worker subsystem, not the session subsystem, because it operates on OS processes rather than `GatewaySession` state.
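The selection step of the orphan sweep can be sketched as a pure function, kept separate from the actual process enumeration and kill calls. `OrphanSelectionSketch` and its tuple shape are hypothetical, and the case-insensitive path comparison is an assumption based on Windows path semantics rather than something the text states about `OrphanWorkerTerminator`.

```csharp
using System;
using System.Collections.Generic;

public static class OrphanSelectionSketch
{
    // Returns the PIDs that a sweep would kill: every candidate whose
    // executable path matches the configured worker path, except the
    // current process.
    public static List<int> SelectOrphans(
        IEnumerable<(int Pid, string ExecutablePath)> candidates,
        int currentPid,
        string workerExecutablePath)
    {
        var orphans = new List<int>();
        foreach (var candidate in candidates)
        {
            if (candidate.Pid == currentPid)
            {
                continue; // never target the process running the sweep
            }

            // Assumption: Windows paths, so compare without regard to case.
            if (string.Equals(candidate.ExecutablePath, workerExecutablePath, StringComparison.OrdinalIgnoreCase))
            {
                orphans.Add(candidate.Pid);
            }
        }

        return orphans;
    }
}
```

Keeping the decision pure makes the best-effort part easy to isolate: the caller can wrap each kill in its own try/catch so that one failure is logged without stopping the sweep.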
## Dependency Injection ## Dependency Injection
`SessionServiceCollectionExtensions.AddGatewaySessions` registers the four singletons and the hosted service: `SessionServiceCollectionExtensions.AddGatewaySessions` registers the three singletons and the two hosted services:
```csharp ```csharp
public static IServiceCollection AddGatewaySessions(this IServiceCollection services) public static IServiceCollection AddGatewaySessions(this IServiceCollection services)
@@ -270,13 +303,14 @@ public static IServiceCollection AddGatewaySessions(this IServiceCollection serv
services.AddSingleton<ISessionRegistry, SessionRegistry>(); services.AddSingleton<ISessionRegistry, SessionRegistry>();
services.AddSingleton<ISessionWorkerClientFactory, SessionWorkerClientFactory>(); services.AddSingleton<ISessionWorkerClientFactory, SessionWorkerClientFactory>();
services.AddSingleton<ISessionManager, SessionManager>(); services.AddSingleton<ISessionManager, SessionManager>();
services.AddHostedService<SessionLeaseMonitorHostedService>();
services.AddHostedService<SessionShutdownHostedService>(); services.AddHostedService<SessionShutdownHostedService>();
return services; return services;
} }
``` ```
The registry must be a singleton because its `ConcurrentDictionary` is the source of truth for session state across the gRPC service, the lease sweeper, the dashboard, and the shutdown hosted service. Registering `SessionShutdownHostedService` last ensures it is constructed after `ISessionManager` and therefore drains sessions during host stop. The registry must be a singleton because its `ConcurrentDictionary` is the source of truth for session state across the gRPC service, the lease sweeper, the dashboard, and the shutdown hosted service. `SessionLeaseMonitorHostedService` runs the periodic expired-lease sweep; `SessionShutdownHostedService` drains sessions during host stop. Both are registered after `ISessionManager` so they resolve the same singleton manager when the host starts; `SessionShutdownHostedService` is registered last so it is the latter of the two to be constructed and is available to drain sessions on stop.
## Related Documentation ## Related Documentation
+2 -2
@@ -4,7 +4,7 @@ The bootstrap layer parses the command-line arguments and environment variables
## Overview ## Overview
The worker process is a short-lived child of the gateway. The gateway side of this contract lives in [WorkerProcessLauncher](./WorkerProcessLauncher.md). On the worker side, `Program.cs` is a single line that delegates to `WorkerApplication.Run(args)`: The worker process is a per-session child process of the gateway: one worker is launched per session and lives for that session's lifetime. The gateway side of this contract lives in [WorkerProcessLauncher](./WorkerProcessLauncher.md). On the worker side, `Program.cs` is a single line that delegates to `WorkerApplication.Run(args)`:
```csharp ```csharp
using ZB.MOM.WW.MxGateway.Worker; using ZB.MOM.WW.MxGateway.Worker;
@@ -143,7 +143,7 @@ The production binding in `WorkerApplication.Run(string[])` is `EnvironmentVaria
## Logging ## Logging
The worker writes structured key/value lines to standard error. Standard error is used rather than standard output because the gateway side reads worker stdout for diagnostic capture only, while stderr is reserved for log output that does not interfere with any future stdout-based channel. The worker writes structured key/value lines to standard error. The launcher does not redirect either stream (`WorkerProcessLauncher` sets `UseShellExecute=false` and `CreateNoWindow=true` but leaves stdout and stderr inherited), so log output lands on the inherited console rather than a pipe the gateway reads. Standard error is used rather than standard output so that diagnostic logging stays clear of stdout, keeping that stream free for any future stdout-based channel.
### The logger contract ### The logger contract
+25 -1
@@ -109,6 +109,30 @@ default:
The MXAccess engine returns values whose semantic type only fully resolves after consulting the engine's own attribute metadata. Clients that round-trip these values through the gateway (replay, parity fixtures, diagnostics) need the original `VT_*` tag, the engine-declared `MxDataType`, and any conversion diagnostic; otherwise edge cases such as decimal-to-double rounding, ulong overflow, or an unknown SAFEARRAY element type become invisible bugs. Storing both the typed projection and the raw fields in the same `MxValue`/`MxArray` lets cross-language clients recover the original observation byte-for-byte where possible and detect lossy cases where it is not. The MXAccess engine returns values whose semantic type only fully resolves after consulting the engine's own attribute metadata. Clients that round-trip these values through the gateway (replay, parity fixtures, diagnostics) need the original `VT_*` tag, the engine-declared `MxDataType`, and any conversion diagnostic; otherwise edge cases such as decimal-to-double rounding, ulong overflow, or an unknown SAFEARRAY element type become invisible bugs. Storing both the typed projection and the raw fields in the same `MxValue`/`MxArray` lets cross-language clients recover the original observation byte-for-byte where possible and detect lossy cases where it is not.
### Inverse projection for COM writes
The conversions above run on the read path, turning COM values into `MxValue`.
The write path runs the same `VariantConverter` in reverse: `ConvertToComValue`
takes an `MxValue` from a `Write` command and returns a CLR object that the COM
marshaler boxes into the matching VARIANT, so it is the inverse of `Convert`.
- A null `MxValue` argument throws; an `MxValue` whose `IsNull` flag is set
returns `null` (the MXAccess null), keeping the read/write null semantics
symmetric.
- Each `KindCase` maps to its CLR scalar (`bool`, `int`, `long`, `float`,
`double`, `string`). A `TimestampValue` becomes a `DateTime`, which the
marshaler renders as `VT_DATE` — the form MXAccess accepts for the
timestamped-write argument.
- An array kind delegates to `ConvertToComArray`, which projects each
`MxArray.ValuesCase` to a typed CLR array (for example `int[]`, `string[]`, or
a `DateTime[]` for timestamp arrays) so the marshaler produces the
corresponding SAFEARRAY.
- `RawValue` payloads are intentionally rejected on both the scalar and array
paths. Raw bytes are preserved on the read path for diagnostics, but there is
no safe way to reconstruct the original VARIANT from them, so a write that
carries a raw value throws rather than guessing. An `MxValue` with no value
kind set throws for the same reason — there is nothing to write.
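The rules in the list above can be illustrated with a simplified stand-in. `SketchValue`, `SketchKind`, and `InverseProjectionSketch` are hypothetical types invented for this sketch; the real `MxValue` is a generated protobuf message, and the exception types chosen here are assumptions, since the text only says that these cases throw.

```csharp
using System;

// Hypothetical stand-ins for the generated MxValue and its KindCase.
public enum SketchKind { None, Bool, Int32, Double, String, Timestamp, Raw }

public sealed class SketchValue
{
    public SketchKind Kind { get; set; }
    public bool IsNull { get; set; }
    public object Payload { get; set; }
}

public static class InverseProjectionSketch
{
    // Returns a CLR object that a COM marshaler could box into a VARIANT.
    public static object ToComValue(SketchValue value)
    {
        if (value == null)
        {
            // A null argument is a caller bug, distinct from a null value.
            throw new ArgumentNullException(nameof(value));
        }

        if (value.IsNull)
        {
            // The explicit null flag maps to a null write value.
            return null;
        }

        switch (value.Kind)
        {
            case SketchKind.Bool: return (bool)value.Payload;
            case SketchKind.Int32: return (int)value.Payload;
            case SketchKind.Double: return (double)value.Payload;
            case SketchKind.String: return (string)value.Payload;
            case SketchKind.Timestamp:
                // A DateTime is what the marshaler would render as VT_DATE.
                return ((DateTimeOffset)value.Payload).UtcDateTime;
            case SketchKind.Raw:
                // Raw bytes cannot be turned back into the original VARIANT.
                throw new NotSupportedException("Raw values cannot be written.");
            default:
                // No value kind set: there is nothing to write.
                throw new InvalidOperationException("No value kind is set.");
        }
    }
}
```

A switch statement with one `return` per case is used so that each scalar is boxed as its own type rather than being widened to a common numeric type.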
## HResultConverter and HResultConversion ## HResultConverter and HResultConversion
`HResultConverter.Convert` wraps any `Exception` thrown across the COM boundary. It prefers `COMException.ErrorCode` over `Exception.HResult` because the runtime sometimes overwrites `Exception.HResult` while marshalling, and the `ErrorCode` field is the value the COM call actually returned. `HResultConverter.Convert` wraps any `Exception` thrown across the COM boundary. It prefers `COMException.ErrorCode` over `Exception.HResult` because the runtime sometimes overwrites `Exception.HResult` while marshalling, and the `ErrorCode` field is the value the COM call actually returned.
@@ -223,7 +247,7 @@ public string PreserveCompletionOnlyStatusBytes(byte[] statusBytes)
`MxStatusDetailText` is an internal lookup that maps known `MXSTATUS_PROXY.detail` codes to short human-readable strings (for example `28 = "Index out of range"`, `42 = "Unable to convert string"`, `8017 = "Object must be offscan to modify attributes that have an MxSecurityConfigure security classification"`). `MxStatusProxyConverter.Convert` calls `Lookup` and writes the result to `DiagnosticText`. Unknown codes return `string.Empty`, leaving the numeric `Detail` field as the authoritative identifier. `MxStatusDetailText` is an internal lookup that maps known `MXSTATUS_PROXY.detail` codes to short human-readable strings (for example `28 = "Index out of range"`, `42 = "Unable to convert string"`, `8017 = "Object must be offscan to modify attributes that have an MxSecurityConfigure security classification"`). `MxStatusProxyConverter.Convert` calls `Lookup` and writes the result to `DiagnosticText`. Unknown codes return `string.Empty`, leaving the numeric `Detail` field as the authoritative identifier.
The mapping covers the engine-error range documented for MXAccess (16-50, 56-61, 541-542, 8017). Adding entries here is the supported way to enrich wire-level diagnostics without changing the proto schema. The mapping covers selected detail codes in the MXAccess engine-error ranges (16-50, 56-61, 541-542, 8017). The ranges are not contiguous: codes that the runtime does not assign a distinct meaning are omitted (for example 35, 45, and 46 in the 16-50 range and 58-59 in the 56-61 range), so only codes with a known text appear. Adding entries here is the supported way to enrich wire-level diagnostics without changing the proto schema.
## MxStatusConversionException ## MxStatusConversionException
+5 -5
@@ -16,17 +16,17 @@ The installed MXAccess interop assembly declares an `Apartment` threading model
| `IStaWorkItem` / `StaWorkItem<T>` | Internal queue entries that capture a delegate, a `CancellationToken`, and a `TaskCompletionSource<T>` for the caller. | | `IStaWorkItem` / `StaWorkItem<T>` | Internal queue entries that capture a delegate, a `CancellationToken`, and a `TaskCompletionSource<T>` for the caller. |
| `StaCommand` | Carries an `MxCommand` together with `SessionId`, `CorrelationId`, `EnqueueTimestamp`, and a `CancellationToken`. | | `StaCommand` | Carries an `MxCommand` together with `SessionId`, `CorrelationId`, `EnqueueTimestamp`, and a `CancellationToken`. |
| `IStaCommandExecutor` | The boundary between the dispatcher and the MXAccess interop layer; returns `MxCommandReply`. | | `IStaCommandExecutor` | The boundary between the dispatcher and the MXAccess interop layer; returns `MxCommandReply`. |
| `StaCommandDispatcher` | Bounded asynchronous queue in front of `StaRuntime` that converts `StaCommand` into `MxCommandReply` and applies status normalization. | | `StaCommandDispatcher` | A bounded `Queue<T>` (guarded by a lock) with an async drain loop in front of `StaRuntime` that converts `StaCommand` into `MxCommandReply` and applies status normalization. |
## STA Thread Initialization ## STA Thread Initialization
`StaRuntime`'s constructor configures a background `Thread` named `ZB.MOM.WW.MxGateway.Worker.STA` and forces it into `ApartmentState.STA` before the thread starts. `Start()` releases the thread and then blocks on `startedEvent` so callers observe a fully-initialized apartment (or a captured `startupException`) before the first `InvokeAsync` call: `StaRuntime`'s constructor configures a background `Thread` named `MxGateway.Worker.STA` and forces it into `ApartmentState.STA` before the thread starts. `Start()` releases the thread and then blocks on `startedEvent` so callers observe a fully-initialized apartment (or a captured `startupException`) before the first `InvokeAsync` call:
```csharp ```csharp
staThread = new Thread(ThreadMain) staThread = new Thread(ThreadMain)
{ {
IsBackground = true, IsBackground = true,
Name = "ZB.MOM.WW.MxGateway.Worker.STA" Name = "MxGateway.Worker.STA"
}; };
staThread.SetApartmentState(ApartmentState.STA); staThread.SetApartmentState(ApartmentState.STA);
``` ```
@@ -141,10 +141,10 @@ finally
`StaRuntime.Shutdown(TimeSpan timeout)` performs an ordered shutdown: `StaRuntime.Shutdown(TimeSpan timeout)` performs an ordered shutdown:
1. Sets `shutdownRequested` under `gate` so `InvokeAsync` rejects new work with `InvalidOperationException`. 1. Sets `shutdownRequested` under `gate` so subsequent `InvokeAsync` calls reject new work. `InvokeAsync` does not throw inline: it returns a faulted `Task` carrying `StaRuntimeShutdownException` (a dedicated subtype, not a bare `InvalidOperationException`). The distinct type lets callers and the dispatcher distinguish "rejected because the runtime is shutting down" from any other invalid-operation condition.
2. Signals `commandWakeEvent` to break the STA out of `WaitForWorkOrMessages`. 2. Signals `commandWakeEvent` to break the STA out of `WaitForWorkOrMessages`.
3. Waits up to `timeout` on `stoppedEvent`, which the STA sets after it leaves `ThreadMain`. 3. Waits up to `timeout` on `stoppedEvent`, which the STA sets after it leaves `ThreadMain`.
4. Once the thread has stopped, drains the queue through `CancelQueuedCommands`, which calls `CancelBeforeExecution` on every remaining work item so awaiting callers observe `OperationCanceledException` instead of hanging. 4. The queue is drained through `CancelQueuedCommands` twice. `ThreadMain`'s `finally` block runs it before setting `stoppedEvent`, so any work that was queued while the loop was exiting is canceled on the STA itself. `Shutdown` then runs it again after the wait returns, which catches work enqueued during the gap between the `finally` drain and the gate close. Either way, `CancelBeforeExecution` completes every remaining work item so awaiting callers observe `OperationCanceledException` instead of hanging. (When the STA thread never started, `Shutdown` instead drains directly and sets `stoppedEvent` itself.)
`ThreadMain`'s `finally` block guarantees that `comApartmentInitializer.Uninitialize` runs (when COM was successfully initialized) before `stoppedEvent.Set`, so the apartment is always torn down on the same thread that initialized it. `Dispose` calls `Shutdown` with a five-second budget and only disposes the wait handles when shutdown actually completed, which prevents a still-running STA thread from touching disposed handles. `ThreadMain`'s `finally` block guarantees that `comApartmentInitializer.Uninitialize` runs (when COM was successfully initialized) before `stoppedEvent.Set`, so the apartment is always torn down on the same thread that initialized it. `Dispose` calls `Shutdown` with a five-second budget and only disposes the wait handles when shutdown actually completed, which prevents a still-running STA thread from touching disposed handles.
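The gate-and-drain behaviour in the shutdown steps can be sketched in isolation. This is a minimal model under stated assumptions, not the real `StaRuntime`: there is no STA thread or message pump here, `GateSketch` and `ShutdownRejectedException` are hypothetical names, and deriving the exception from `InvalidOperationException` is an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical stand-in for a dedicated "rejected during shutdown" exception.
public sealed class ShutdownRejectedException : InvalidOperationException
{
    public ShutdownRejectedException()
        : base("The runtime is shutting down.")
    {
    }
}

public sealed class GateSketch
{
    private readonly object gate = new object();
    private readonly Queue<TaskCompletionSource<int>> pending =
        new Queue<TaskCompletionSource<int>>();
    private bool shutdownRequested;

    // Queues a work item, or returns a faulted task once shutdown has begun.
    // The method never throws inline; the failure travels on the task.
    public Task<int> InvokeAsync()
    {
        lock (gate)
        {
            if (shutdownRequested)
            {
                return Task.FromException<int>(new ShutdownRejectedException());
            }

            var item = new TaskCompletionSource<int>(
                TaskCreationOptions.RunContinuationsAsynchronously);
            pending.Enqueue(item);
            return item.Task;
        }
    }

    // Closes the gate, then cancels everything still queued so awaiting
    // callers observe cancellation instead of hanging.
    public void Shutdown()
    {
        lock (gate)
        {
            shutdownRequested = true;
        }

        CancelQueued();
    }

    private void CancelQueued()
    {
        while (true)
        {
            TaskCompletionSource<int> item;
            lock (gate)
            {
                if (pending.Count == 0)
                {
                    return;
                }

                item = pending.Dequeue();
            }

            // Completed outside the lock so continuations never run under it.
            item.TrySetCanceled();
        }
    }
}
```

Because the flag is checked and the item enqueued under the same lock, nothing can slip into the queue after the gate closes, which is why a drain that runs after the gate is closed leaves no stranded callers.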
+14
@@ -0,0 +1,14 @@
# Documentation Audit Workspace
This directory holds the working artifacts for the repo-wide prose
documentation audit.
- `fragments/` — one Markdown findings fragment per subsystem cluster
(`NN-<cluster>.md`), produced by the read-only verifier pass. Each fragment
records claim-by-claim findings (claim, verdict, code evidence, severity,
proposed fix) for its docs.
- The fragments are aggregated, deduplicated by code area, and summarized into
the top-level report `MxAccessGateway-doc-audit.md`.
Design: [`../plans/2026-06-03-documentation-audit-design.md`](../plans/2026-06-03-documentation-audit-design.md)
Plan: [`../plans/2026-06-03-documentation-audit-implementation.md`](../plans/2026-06-03-documentation-audit-implementation.md)
+443
@@ -0,0 +1,443 @@
# Cluster 01 — Architecture
DOC: gateway.md
LINES: 737-769
CLAIM: Project layout lists `src/MxGateway.Server`, `src/MxGateway.Worker`, `src/MxGateway.Contracts`, `src/MxGateway.Tests`, `src/MxGateway.Worker.Tests`, `src/MxGateway.IntegrationTests` as suggested path names.
CLAIM_TYPE: path
VERDICT: stale
EVIDENCE: src/ directory listing — actual project directories are `ZB.MOM.WW.MxGateway.Server`, `ZB.MOM.WW.MxGateway.Worker`, `ZB.MOM.WW.MxGateway.Contracts`, `ZB.MOM.WW.MxGateway.Tests`, `ZB.MOM.WW.MxGateway.Worker.Tests`, `ZB.MOM.WW.MxGateway.IntegrationTests`
CODE_AREA: arch.layout
SEVERITY: medium
PROPOSED_FIX: Replace all short project names in the layout block with the fully-qualified names (e.g. `src/ZB.MOM.WW.MxGateway.Server/`, `src/ZB.MOM.WW.MxGateway.Worker/`, etc.).
---
DOC: gateway.md
LINES: 231-248
CLAIM: `WorkerEnvelope` has `uint64 correlation_id = 4` and oneof body field numbers: `worker_hello=10, gateway_hello=11, worker_ready=12, command=20, command_reply=21, event=22, heartbeat=23, cancel=24, shutdown=25, fault=26`.
CLAIM_TYPE: rpc/proto
VERDICT: wrong
EVIDENCE: src/ZB.MOM.WW.MxGateway.Contracts/Protos/mxaccess_worker.proto:4,20-38 — actual proto has `string correlation_id = 4` (not uint64); body fields are `gateway_hello=10, worker_hello=11, worker_ready=12, worker_command=13, worker_command_reply=14, worker_cancel=15, worker_shutdown=16, worker_shutdown_ack=17, worker_event=18, worker_heartbeat=19, worker_fault=20`; field names also differ (e.g. `command` -> `worker_command`, `event` -> `worker_event`).
CODE_AREA: arch.ipc
SEVERITY: high
PROPOSED_FIX: Replace the WorkerEnvelope protobuf block in gateway.md with the actual proto content from `mxaccess_worker.proto`, including the correct field type for `correlation_id` (string), the correct field numbers, and the correct field names. Also add the missing `WorkerShutdownAck worker_shutdown_ack = 17` entry.
---
DOC: gateway.md
LINES: 898-913
CLAIM: Session state machine is `Creating -> StartingWorker -> WaitingForPipe -> InitializingWorker -> Ready -> Closing -> Closed -> Faulted`.
CLAIM_TYPE: behavior-rule
VERDICT: stale
EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Sessions/SessionWorkerClientFactory.cs:75 — code transitions to `SessionState.Handshaking` between `WaitingForPipe` and `InitializingWorker`; this state also appears in the generated proto enum (`MxaccessGateway.cs:726`, `SESSION_STATE_HANDSHAKING = 4`).
CODE_AREA: arch.session
SEVERITY: medium
PROPOSED_FIX: Add `-> Handshaking` between `WaitingForPipe` and `InitializingWorker` in the state machine diagram, and add a description: "`Handshaking`: pipe is connected and protocol hello is being verified."
---
DOC: gateway.md
LINES: 119-121
CLAIM: Blazor dashboard mounts at the host root and renders pages at `/`, `/sessions`, `/workers`, `/events`, `/galaxy`, `/alarms`, `/apikeys`, and `/settings`.
CLAIM_TYPE: path
VERDICT: stale
EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Dashboard/Components/Pages/BrowsePage.razor:1 — there is also a `/browse` page (`@page "/browse"`) that is not listed. `/login` is also present.
CODE_AREA: arch.layout
SEVERITY: low
PROPOSED_FIX: Add `/browse` (and `/login`) to the list of documented dashboard routes.
---
DOC: gateway.md
LINES: 662-663
CLAIM: Rejects valid keys lacking the required `session, invoke, event, metadata, or admin` scope with gRPC `PermissionDenied`.
CLAIM_TYPE: config-key
VERDICT: stale
EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Security/Authorization/GatewayScopes.cs:5-12 — actual scopes are `session:open`, `session:close`, `invoke:read`, `invoke:write`, `invoke:secure`, `events:read`, `metadata:read`, `admin`. The simplified short-form names (`session`, `invoke`, `event`) do not match the canonical scope strings.
CODE_AREA: arch.auth
SEVERITY: medium
PROPOSED_FIX: Replace the simplified scope names with the canonical forms: `session:open`, `session:close`, `invoke:read`, `invoke:write`, `invoke:secure`, `events:read`, `metadata:read`, `admin`.
---
DOC: docs/DesignDecisions.md
LINES: 360-363
CLAIM: "Dashboard access should require API-key-backed dashboard authentication with `admin` scope when enabled."
CLAIM_TYPE: behavior-rule
VERDICT: wrong
EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Dashboard/DashboardAuthenticator.cs:9 — dashboard authentication is LDAP-backed (bind + group-to-role mapping), not API-key-backed. This is also confirmed in `GatewayProcessDesign.md` lines 291-299 and `gateway.md` lines 147-156.
CODE_AREA: arch.auth
SEVERITY: high
PROPOSED_FIX: Replace "API-key-backed dashboard authentication with `admin` scope" with "LDAP-backed authentication with `GroupToRole` mapping to `Admin` or `Viewer` roles." Keep the note about `AllowAnonymousLocalhost` for local development.
---
DOC: docs/GatewayProcessDesign.md
LINES: 249-255
CLAIM: Dashboard suggested routes use a `/dashboard` prefix: `/dashboard`, `/dashboard/sessions`, `/dashboard/sessions/{sessionId}`, `/dashboard/workers`, `/dashboard/events`, `/dashboard/settings`.
CLAIM_TYPE: path
VERDICT: wrong
EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Dashboard/Components/Pages/ — actual Blazor pages are mounted at `/` (DashboardHome.razor), `/sessions` (SessionsPage.razor), `/sessions/{SessionId}` (SessionDetailsPage.razor), `/workers` (WorkersPage.razor), `/events` (EventsPage.razor), `/settings` (SettingsPage.razor), `/alarms` (AlarmsPage.razor), `/galaxy` (GalaxyPage.razor), `/browse` (BrowsePage.razor), `/apikeys` (ApiKeysPage.razor). None have a `/dashboard` prefix.
CODE_AREA: arch.layout
SEVERITY: high
PROPOSED_FIX: Replace the `/dashboard`-prefixed route table with the actual routes: `/`, `/sessions`, `/sessions/{sessionId}`, `/workers`, `/events`, `/alarms`, `/galaxy`, `/browse`, `/apikeys`, `/settings`.
---
DOC: docs/GatewayProcessDesign.md
LINES: 689
CLAIM: "`Dashboard:AllowAnonymousLocalhost` permits loopback requests to bypass the cookie requirement."
CLAIM_TYPE: config-key
VERDICT: stale
EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Configuration/DashboardOptions.cs:9 — property is `AllowAnonymousLocalhost` under `DashboardOptions`, which maps to `MxGateway:Dashboard:AllowAnonymousLocalhost`. The shorthand `Dashboard:AllowAnonymousLocalhost` omits the root `MxGateway:` prefix used throughout the project (also confirmed in GatewayProcessDesign.md line 298 which correctly uses `MxGateway:Dashboard:AllowAnonymousLocalhost`).
CODE_AREA: arch.config
SEVERITY: low
PROPOSED_FIX: Standardize to `MxGateway:Dashboard:AllowAnonymousLocalhost` (the form used in GatewayOptions / the configuration section name) everywhere this key is referenced.
---
DOC: docs/GatewayProcessDesign.md
LINES: 854-855
CLAIM: Worker `ExecutablePath` default is `src/ZB.MOM.WW.MxGateway.Worker/bin/x86/Release/ZB.MOM.WW.MxGateway.Worker.exe` (forward-slash path shown in JSON block).
CLAIM_TYPE: config-key
VERDICT: accurate
EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Configuration/WorkerOptions.cs:7 — actual default is `src\ZB.MOM.WW.MxGateway.Worker\bin\x86\Release\ZB.MOM.WW.MxGateway.Worker.exe` (backslashes on Windows). The path and filename match; only the separator style differs between the JSON doc sample and the C# literal.
CODE_AREA: arch.config
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC: docs/DesignDecisions.md
LINES: 36
CLAIM: Interop assembly identity: `ArchestrA.MxAccess, Version=3.2.0.0, PublicKeyToken=23106a86e706d0ae`.
CLAIM_TYPE: version
VERDICT: unverifiable
EVIDENCE: src/ZB.MOM.WW.MxGateway.Worker/MxAccess/MxAccessInteropInfo.cs — the file records the assembly path and name (`ArchestrA.MxAccess`) but does not hard-code the version or public key token; `InteropAssemblyVersion` is read dynamically from the loaded assembly at runtime (`typeof(LMXProxyServerClass).Assembly.GetName().Version`). Cannot verify the exact version string without MXAccess installed.
CODE_AREA: arch.ipc
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC: docs/DesignDecisions.md
LINES: 36-48
CLAIM: COM class `ArchestrA.MxAccess.LMXProxyServerClass`, CLSID `{C30B52F5-2CB5-4760-AF0A-3A344A7EB5DC}`, ProgID `LMXProxy.LMXProxyServer.1`, version-independent ProgID `LMXProxy.LMXProxyServer`, registered server `C:\Program Files (x86)\ArchestrA\Framework\Bin\LmxProxy.dll`, interop assembly `C:\Program Files (x86)\ArchestrA\Framework\Bin\ArchestrA.MXAccess.dll`.
CLAIM_TYPE: config-key
VERDICT: accurate
EVIDENCE: src/ZB.MOM.WW.MxGateway.Worker/MxAccess/MxAccessInteropInfo.cs:14,19,24,29-30,35-36,41 — `ComClassName = "ArchestrA.MxAccess.LMXProxyServerClass"`, `Clsid = "{C30B52F5-2CB5-4760-AF0A-3A344A7EB5DC}"`, `ProgId = "LMXProxy.LMXProxyServer.1"`, `VersionIndependentProgId = "LMXProxy.LMXProxyServer"`, `RegisteredServerPath = @"C:\Program Files (x86)\ArchestrA\Framework\Bin\LmxProxy.dll"`, `InteropAssemblyPath = @"C:\Program Files (x86)\ArchestrA\Framework\Bin\ArchestrA.MXAccess.dll"`. All match.
CODE_AREA: arch.ipc
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC: docs/DesignDecisions.md
LINES: 55
CLAIM: Worker should reference `ArchestrA.MXAccess.dll` (upper-case MXAccess in filename).
CLAIM_TYPE: path
VERDICT: accurate
EVIDENCE: src/ZB.MOM.WW.MxGateway.Worker/ZB.MOM.WW.MxGateway.Worker.csproj:27 — `<HintPath>C:\Program Files (x86)\ArchestrA\Framework\Bin\ArchestrA.MXAccess.dll</HintPath>`. Matches.
CODE_AREA: arch.layout
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC: gateway.md
LINES: 88-94
CLAIM: Gateway runtime is `.NET 10`, `C#`, `x64 preferred`, `ASP.NET Core gRPC server`.
CLAIM_TYPE: version
VERDICT: accurate
EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/ZB.MOM.WW.MxGateway.Server.csproj:4 — `<TargetFramework>net10.0</TargetFramework>`; no explicit `<PlatformTarget>` is set (so the default is AnyCPU/x64-preferred on .NET 10). Grpc.AspNetCore is referenced. Matches.
CODE_AREA: arch.layout
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC: gateway.md
LINES: 162-165
CLAIM: Worker runtime is `.NET Framework 4.8`, `C#`, `x86 build by default`.
CLAIM_TYPE: version
VERDICT: accurate
EVIDENCE: src/ZB.MOM.WW.MxGateway.Worker/ZB.MOM.WW.MxGateway.Worker.csproj:5-7 — `<TargetFramework>net48</TargetFramework>`, `<PlatformTarget>x86</PlatformTarget>`, `<Prefer32Bit>true</Prefer32Bit>`. Matches.
CODE_AREA: arch.layout
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC: gateway.md
LINES: 198-210
CLAIM: Pipe name format is `mxaccess-gateway-{gatewayProcessId}-{sessionId}` and framing is `uint32 little-endian payload_length` followed by `payload_length bytes protobuf WorkerEnvelope`.
CLAIM_TYPE: rpc/proto
VERDICT: accurate
EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Sessions/SessionManager.cs:433 — `string pipeName = $"mxaccess-gateway-{Environment.ProcessId}-{sessionId}"`. Framing confirmed by `WorkerFrameReader.cs` and `WorkerFrameWriter.cs` in `src/ZB.MOM.WW.MxGateway.Server/Workers/`.
CODE_AREA: arch.ipc
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC: gateway.md
LINES: 108
CLAIM: "The gateway must never instantiate or call MXAccess directly."
CLAIM_TYPE: behavior-rule
VERDICT: accurate
EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/ZB.MOM.WW.MxGateway.Server.csproj — no reference to `ArchestrA.MXAccess.dll`. MXAccess COM is only referenced in the Worker project csproj (lines 26-29).
CODE_AREA: arch.layout
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC: gateway.md
LINES: 646-650
CLAIM: Gateway restart does not reattach old workers; `OrphanWorkerCleanupHostedService` runs `OrphanWorkerTerminator` once on startup to kill leftover `ZB.MOM.WW.MxGateway.Worker.exe` processes.
CLAIM_TYPE: behavior-rule
VERDICT: accurate
EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Workers/OrphanWorkerCleanupHostedService.cs:7 — class exists and references `OrphanWorkerTerminator`. `OrphanWorkerTerminator.cs:19` is present. Worker executable name `ZB.MOM.WW.MxGateway.Worker.exe` confirmed in `IntegrationTestEnvironment.cs:66`.
CODE_AREA: arch.layout
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC: docs/GatewayProcessDesign.md
LINES: 420-428
CLAIM: Pipe name format is `mxaccess-gateway-{gatewayProcessId}-{sessionId}` and framing is `uint32 little-endian payload_length` followed by `payload_length bytes protobuf WorkerEnvelope`.
CLAIM_TYPE: rpc/proto
VERDICT: accurate
EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Sessions/SessionManager.cs:433 — confirmed matching.
CODE_AREA: arch.ipc
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC: docs/GatewayProcessDesign.md
LINES: 459-475
CLAIM: `IWorkerClient` has methods `StartAsync`, `InvokeAsync(WorkerCommand, TimeSpan, CancellationToken)`, `ReadEventsAsync(CancellationToken)`, `ShutdownAsync(TimeSpan, CancellationToken)`, `Kill(string)`.
CLAIM_TYPE: rpc/proto
VERDICT: accurate
EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Workers/IWorkerClient.cs:22,28-31,35,40,44 — all five methods are present with matching signatures.
CODE_AREA: arch.ipc
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC: docs/GatewayProcessDesign.md
LINES: 713-719
CLAIM: API-key admin CLI subcommands are `init-db`, `create-key`, `list-keys`, `revoke-key`, `rotate-key` on `ZB.MOM.WW.MxGateway.Server apikey`.
CLAIM_TYPE: command
VERDICT: accurate
EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Security/Authentication/ApiKeyAdminCommandLineParser.cs:121-135 — all five subcommands are parsed. Matches.
CODE_AREA: arch.auth
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC: docs/GatewayProcessDesign.md
LINES: 408-410
CLAIM: Nonce is passed via `MXGATEWAY_WORKER_NONCE` environment variable so the command line remains safe to log.
CLAIM_TYPE: config-key
VERDICT: accurate
EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Workers/WorkerProcessLauncher.cs:17-18 — `public const string WorkerNonceEnvironmentVariableName = "MXGATEWAY_WORKER_NONCE"`. Matches.
CODE_AREA: arch.ipc
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC: docs/GatewayProcessDesign.md
LINES: 223-229
CLAIM: `EventStreamService` rejects a second subscriber with `EventSubscriberAlreadyActive`; faults the session with `EventQueueOverflow` if the queue fills.
CLAIM_TYPE: term
VERDICT: accurate
EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Sessions/SessionManagerErrorCode.cs:7-8 — enum values `EventSubscriberAlreadyActive` and `EventQueueOverflow` present. Also used at `MxAccessGatewayService.cs:929-930` and `EventStreamService.cs:150,160`.
CODE_AREA: arch.ipc
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC: docs/GatewayProcessDesign.md
LINES: 291-299
CLAIM: Dashboard auth uses LDAP bind + role mapping (`MxGateway:Dashboard:GroupToRole`), issues HTTP-only secure cookie, allows `Dashboard:AllowAnonymousLocalhost` to default to `true`.
CLAIM_TYPE: behavior-rule
VERDICT: accurate
EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Dashboard/DashboardAuthenticator.cs:9 (LDAP-backed); `DashboardOptions.cs:9` (`AllowAnonymousLocalhost` defaults to `true`). Matches.
CODE_AREA: arch.auth
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC: docs/GatewayProcessDesign.md
LINES: 527-530
CLAIM: "During shutdown the worker client treats `WorkerShutdownAck` as the protocol close signal."
CLAIM_TYPE: rpc/proto
VERDICT: accurate
EVIDENCE: src/ZB.MOM.WW.MxGateway.Contracts/Protos/mxaccess_worker.proto:34,80 — `WorkerShutdownAck` is field 17 in the oneof body and its message is defined at line 80.
CODE_AREA: arch.ipc
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC: gateway.md
LINES: 301-314
CLAIM: Session state machine (in the "Session Manager" section): `Creating -> StartingWorker -> WaitingForPipe -> InitializingWorker -> Ready -> Closing -> Closed -> Faulted`.
CLAIM_TYPE: behavior-rule
VERDICT: stale
EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Sessions/SessionWorkerClientFactory.cs:75 — `session.TransitionTo(SessionState.Handshaking)` is called between `WaitingForPipe` and `InitializingWorker`. The `Handshaking` state also exists in the public `SessionState` proto enum (`MxaccessGateway.cs:726`). The state machine in gateway.md at this location (the Gateway Implementation Plan / Session Manager section) is missing the `Handshaking` state exactly as in the earlier reference at lines 898-913.
CODE_AREA: arch.session
SEVERITY: medium
PROPOSED_FIX: Add `-> Handshaking` between `WaitingForPipe` and `InitializingWorker` in both state machine diagrams in gateway.md.
---
DOC: gateway.md
LINES: 1023-1025
CLAIM: "MXAccess COM target is `ArchestrA.MxAccess.LMXProxyServerClass` / `LMXProxy.LMXProxyServer.1` from the installed 32-bit `LmxProxy.dll`."
CLAIM_TYPE: term
VERDICT: accurate
EVIDENCE: src/ZB.MOM.WW.MxGateway.Worker/MxAccess/MxAccessInteropInfo.cs:14,41 — `ComClassName = "ArchestrA.MxAccess.LMXProxyServerClass"`, `ProgId = "LMXProxy.LMXProxyServer.1"`, registered server `LmxProxy.dll`. Matches.
CODE_AREA: arch.ipc
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC: docs/GatewayProcessDesign.md
LINES: 62-93
CLAIM: High-level component list references namespace `ZB.MOM.WW.MxGateway.Server` with sub-components including `GatewayMetrics` (under `Metrics`) and `HealthChecks` (under `Diagnostics`).
CLAIM_TYPE: term
VERDICT: accurate
EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Metrics/GatewayMetrics.cs:4 — `namespace ZB.MOM.WW.MxGateway.Server.Metrics`; src/ZB.MOM.WW.MxGateway.Server/Diagnostics/AuthStoreHealthCheck.cs:5 — `namespace ZB.MOM.WW.MxGateway.Server.Diagnostics`. Matches.
CODE_AREA: arch.layout
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC: gateway.md
LINES: 110-116
CLAIM: Gateway observability foundation lives in `ZB.MOM.WW.MxGateway.Server.Diagnostics` and `ZB.MOM.WW.MxGateway.Server.Metrics`; `GatewayMetrics` exposes counters/gauges/histograms through .NET `Meter`; `DashboardSnapshotService` projects sessions/workers/metrics into immutable DTOs.
CLAIM_TYPE: term
VERDICT: accurate
EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Metrics/GatewayMetrics.cs:4; src/ZB.MOM.WW.MxGateway.Server/Dashboard/DashboardSnapshotService.cs:8. Both namespaces confirmed. Matches.
CODE_AREA: arch.layout
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC: gateway.md
LINES: 119–121
CLAIM: SignalR hubs at `/hubs/{snapshot,alarms,events}` accept either the cookie or a 30-minute bearer minted at `/hubs/token`.
CLAIM_TYPE: path
VERDICT: accurate
EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Dashboard/DashboardEndpointRouteBuilderExtensions.cs:63–65,73 — `MapHub<DashboardSnapshotHub>("/hubs/snapshot")`, `MapHub<AlarmsHub>("/hubs/alarms")`, `MapHub<EventsHub>("/hubs/events")`, `/hubs/token` endpoint mapped at line 73. Matches.
CODE_AREA: arch.layout
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC: gateway.md
LINES: 121–122
CLAIM: "`/hubs/events` mirrors per-session `MxEvent` traffic from `EventStreamService` to clients subscribed to `session:{id}`."
CLAIM_TYPE: term
VERDICT: accurate
EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Dashboard/Hubs/EventsHub.cs:27 — `public static string GroupName(string sessionId) => $"session:{sessionId}"`. Matches.
CODE_AREA: arch.layout
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC: docs/GatewayProcessDesign.md
LINES: 864–893
CLAIM: Configuration JSON block shows `MxGateway:Worker:ExecutablePath`, `MxGateway:Sessions:AllowMultipleEventSubscribers`, `MxGateway:Events:QueueCapacity`, `MxGateway:Protocol:WorkerProtocolVersion`, etc.
CLAIM_TYPE: config-key
VERDICT: accurate
EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Configuration/WorkerOptions.cs:6–7,13 — `ExecutablePath` and `RequiredArchitecture` match; `SessionOptions.cs` and `EventsOptions` confirm the other keys through bound configuration.
CODE_AREA: arch.config
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC: docs/DesignDecisions.md
LINES: 85–95
CLAIM: The single-subscriber rule for `StreamEvents` no longer applies to alarms. `GatewayAlarmMonitor` owns one gateway-managed worker session, fans alarm state to any number of clients through session-less `StreamAlarms`. `AcknowledgeAlarm` is session-less and routes through the monitor.
CLAIM_TYPE: behavior-rule
VERDICT: accurate
EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Alarms/GatewayAlarmMonitor.cs:17 — class exists. `MxAccessGatewayService.cs:167`: `StreamAlarms` and `AcknowledgeAlarm` are session-less. Matches.
CODE_AREA: arch.session
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC: docs/DesignDecisions.md
LINES: 217–225
CLAIM: Bulk commands are `AddItemBulk`, `AdviseItemBulk`, `RemoveItemBulk`, `UnAdviseItemBulk`, `SubscribeBulk`, `UnsubscribeBulk`, `WriteBulk`, `Write2Bulk`, `WriteSecuredBulk`, `WriteSecured2Bulk`, `ReadBulk`. Each runs single-item MXAccess COM calls sequentially on the STA; per-entry failures are non-throwing.
CLAIM_TYPE: rpc/proto
VERDICT: accurate
EVIDENCE: src/ZB.MOM.WW.MxGateway.Contracts/Protos/mxaccess_gateway.proto — all eleven bulk command kinds are present in the `MxCommandKind` enum and corresponding request/reply messages. Verified by cross-referencing `GatewayGrpcScopeResolver.cs:39` which maps `WriteBulk`, `Write2Bulk`, etc.
CODE_AREA: arch.ipc
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC: gateway.md
LINES: 129–130
CLAIM: "`/browse` walks the `IGalaxyHierarchyCache` tree and reads subscribed tag values live through `IDashboardLiveDataService`."
CLAIM_TYPE: term
VERDICT: accurate
EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Dashboard/IDashboardBrowseService.cs — `IDashboardBrowseService` references `IGalaxyHierarchyCache`. `IDashboardLiveDataService.cs` exists in the same Dashboard directory. `/browse` page confirmed in `BrowsePage.razor:1`.
CODE_AREA: arch.layout
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC: gateway.md
LINES: 219
CLAIM: Gateway preserves MXAccess behavior first, including public MXAccess command semantics, native MXAccess event families, STA/message-pump delivery behavior, HRESULT/status/value marshaling, and per-client isolation. "Installed MXAccess COM component is the compatibility baseline."
CLAIM_TYPE: behavior-rule
VERDICT: accurate
EVIDENCE: src/ZB.MOM.WW.MxGateway.Worker/MxAccess/MxAccessInteropInfo.cs (installs/references real COM interop); docs/DesignDecisions.md:26–28 — "target the installed MXAccess COM interop surface directly from the x86 worker." Consistent across all three docs.
CODE_AREA: arch.layout
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC: docs/GatewayProcessDesign.md
LINES: 100–105
CLAIM: gRPC service surface at this stage is limited to `OpenSession`, `CloseSession`, `Invoke`, `StreamEvents` (with `Session(stream ClientMessage) returns (stream ServerMessage)` deferred).
CLAIM_TYPE: rpc/proto
VERDICT: accurate
EVIDENCE: src/ZB.MOM.WW.MxGateway.Contracts/Protos/mxaccess_gateway.proto — `MxAccessGateway` service defines `OpenSession`, `CloseSession`, `Invoke`, `StreamEvents`, and additional alarm/galaxy RPCs. The bidirectional `Session` RPC is not present in the current proto, consistent with the deferral noted in the doc.
CODE_AREA: arch.ipc
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC: gateway.md
LINES: 266–273
CLAIM: Public gRPC service is `MxAccessGateway` with `OpenSession`, `CloseSession`, `Invoke`, `StreamEvents`, and deferred bidirectional `Session` RPC.
CLAIM_TYPE: rpc/proto
VERDICT: accurate
EVIDENCE: src/ZB.MOM.WW.MxGateway.Contracts/Protos/mxaccess_gateway.proto — confirmed. The `Session` bidirectional RPC is absent as expected for deferred rollout.
CODE_AREA: arch.ipc
SEVERITY: low
PROPOSED_FIX: flag only
+580
@@ -0,0 +1,580 @@
# Cluster 02 — Worker
Auditor: automated prose-documentation audit
Docs audited: WorkerBootstrap.md, WorkerConversion.md, WorkerFrameProtocol.md, WorkerProcessLauncher.md, WorkerSta.md, MxAccessWorkerInstanceDesign.md
Code verified against: src/ZB.MOM.WW.MxGateway.Worker/**, src/ZB.MOM.WW.MxGateway.Contracts/Protos/mxaccess_worker.proto
---
DOC: WorkerSta.md
LINES: 23-31
CLAIM: `StaRuntime`'s constructor configures a background `Thread` named `ZB.MOM.WW.MxGateway.Worker.STA` and the code snippet shows `Name = "ZB.MOM.WW.MxGateway.Worker.STA"`.
CLAIM_TYPE: term
VERDICT: wrong
EVIDENCE: src/ZB.MOM.WW.MxGateway.Worker/Sta/StaRuntime.cs:61 — actual thread name is `"MxGateway.Worker.STA"` (no `ZB.MOM.WW.` prefix).
CODE_AREA: worker.sta
SEVERITY: medium
PROPOSED_FIX: Change every occurrence of `ZB.MOM.WW.MxGateway.Worker.STA` in WorkerSta.md (prose on line 23 and code snippet on line 29) to `MxGateway.Worker.STA`.
---
DOC: MxAccessWorkerInstanceDesign.md
LINES: 254
CLAIM: `StaRuntime` "starts one background thread named `ZB.MOM.WW.MxGateway.Worker.STA`".
CLAIM_TYPE: term
VERDICT: wrong
EVIDENCE: src/ZB.MOM.WW.MxGateway.Worker/Sta/StaRuntime.cs:61 — thread is named `"MxGateway.Worker.STA"`.
CODE_AREA: worker.sta
SEVERITY: medium
PROPOSED_FIX: Replace `ZB.MOM.WW.MxGateway.Worker.STA` with `MxGateway.Worker.STA` in the STA Runtime section.
---
DOC: WorkerSta.md
LINES: 144
CLAIM: "`InvokeAsync` rejects new work with `InvalidOperationException`" when shutdown is requested.
CLAIM_TYPE: term
VERDICT: wrong
EVIDENCE: src/ZB.MOM.WW.MxGateway.Worker/Sta/StaRuntime.cs:170 — actually throws `StaRuntimeShutdownException`. That class inherits from `InvalidOperationException` (StaRuntimeShutdownException.cs:16) but is a distinct type callers are expected to distinguish.
CODE_AREA: worker.sta
SEVERITY: medium
PROPOSED_FIX: Change "rejects new work with `InvalidOperationException`" to "rejects new work with `StaRuntimeShutdownException` (a subtype of `InvalidOperationException`)". The distinction matters because MxAccessStaSession uses it to separate graceful stop from programming errors (e.g., STA-affinity assertions).
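NOTE (illustrative): the reason the subtype matters can be shown with a handler that checks the more specific type first. This is a Python analogy, not worker code. The two exception classes are hypothetical stand-ins for `InvalidOperationException` and `StaRuntimeShutdownException`, and the returned labels are assumptions.

```python
class InvalidOperationError(Exception):
    """Hypothetical stand-in for .NET InvalidOperationException."""


class StaRuntimeShutdownError(InvalidOperationError):
    """Hypothetical stand-in for StaRuntimeShutdownException."""


def classify(exc: Exception) -> str:
    # The subtype must be tested first: a check against the base type alone
    # would also match a shutdown rejection and hide the distinction.
    if isinstance(exc, StaRuntimeShutdownError):
        return "graceful-stop"
    if isinstance(exc, InvalidOperationError):
        return "programming-error"
    return "other"
```

A reader who only sees "`InvalidOperationException`" in the doc would write the second check alone and treat a graceful stop as a programming error.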
---
DOC: MxAccessWorkerInstanceDesign.md
LINES: 122
CLAIM: Exit code `0` / `Success` meaning = "Required bootstrap options are valid."
CLAIM_TYPE: behavior-rule
VERDICT: wrong
EVIDENCE: src/ZB.MOM.WW.MxGateway.Worker/Bootstrap/WorkerExitCode.cs:5; WorkerBootstrap.md:113 states the authoritative meaning: "The pipe session ran to a clean close." The design-doc description conflates parse success with process-lifetime success.
CODE_AREA: worker.launcher
SEVERITY: high
PROPOSED_FIX: Update the Success row to: "`Success` | 0 | The pipe session ran to a clean close." Add a note that `WorkerBootstrapResult.Succeeded` is a parse-phase gate distinct from process exit code 0.
---
DOC: MxAccessWorkerInstanceDesign.md
LINES: 119-128
CLAIM: Exit code table lists only five codes (0–4). Codes 5 (`PipeConnectionFailed`) and 6 (`ProtocolViolation`) are absent.
CLAIM_TYPE: behavior-rule
VERDICT: stale
EVIDENCE: src/ZB.MOM.WW.MxGateway.Worker/Bootstrap/WorkerExitCode.cs:5-12 — enum has seven values (0–6); WorkerBootstrap.md:112-120 documents all seven.
CODE_AREA: worker.launcher
SEVERITY: high
PROPOSED_FIX: Add rows for `PipeConnectionFailed = 5` ("An `IOException` or `TimeoutException` escapes the pipe client") and `ProtocolViolation = 6` ("A `WorkerFrameProtocolException` escapes the pipe client") to the exit-code table.
---
DOC: MxAccessWorkerInstanceDesign.md
LINES: 134-160
CLAIM: Internal component tree lists class names including `WorkerHost`, `PipeClient`, `FrameReader`, `FrameWriter`, `WorkerProtocol`, `StaCommandQueue`, `MessagePump`, `StaWatchdog`, `MxAccessCommandDispatcher`, `SafeArrayConverter`, `StatusProxyConverter`, `HResultMapper`.
CLAIM_TYPE: term
VERDICT: stale
EVIDENCE: Actual source files in the worker project:
- `WorkerHost` does not exist; entry point is `WorkerApplication` (WorkerApplication.cs).
- `PipeClient` exists as `WorkerPipeClient` (Ipc/WorkerPipeClient.cs).
- `FrameReader`/`FrameWriter` exist as `WorkerFrameReader`/`WorkerFrameWriter` (Ipc/).
- `WorkerProtocol` does not exist; closest is `WorkerContractInfo` (Ipc/WorkerContractInfo.cs).
- `StaCommandQueue` does not exist; queue logic lives in `StaCommandDispatcher` (Sta/StaCommandDispatcher.cs).
- `MessagePump` exists as `StaMessagePump` (Sta/StaMessagePump.cs).
- `StaWatchdog` does not exist; watchdog logic lives in `WorkerPipeSession` (Ipc/WorkerPipeSession.cs).
- `MxAccessCommandDispatcher` does not exist; actual class is `MxAccessCommandExecutor` (MxAccess/MxAccessCommandExecutor.cs).
- `SafeArrayConverter` does not exist; SAFEARRAY conversion is part of `VariantConverter`.
- `StatusProxyConverter` does not exist; actual class is `MxStatusProxyConverter` (Conversion/MxStatusProxyConverter.cs).
- `HResultMapper` does not exist; actual class is `HResultConverter` (Conversion/HResultConverter.cs).
CODE_AREA: worker.sta
SEVERITY: high
PROPOSED_FIX: Rewrite the component tree to match actual class names. This section appears to be a design-phase placeholder that was never updated after implementation.
---
DOC: WorkerBootstrap.md
LINES: 146
CLAIM: "Standard error is used rather than standard output because the gateway side reads worker stdout for diagnostic capture only, while stderr is reserved for log output that does not interfere with any future stdout-based channel."
CLAIM_TYPE: behavior-rule
VERDICT: stale
EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Workers/WorkerProcessLauncher.cs:166-174 — `ProcessStartInfo` does not set `RedirectStandardOutput = true` or `RedirectStandardError = true`; the gateway currently reads neither stream. The stated reason (gateway reads stdout) is not implemented.
CODE_AREA: worker.launcher
SEVERITY: medium
PROPOSED_FIX: Replace the stdout-capture rationale with the accurate reason: "Environment variables of another process are not visible to other users, unlike command-line arguments; stdout/stderr redirect is not currently wired by the launcher." Alternatively, if stdout capture is a planned feature, label it as such.
---
DOC: WorkerConversion.md
LINES: 178
CLAIM: "`MapCategory` and `MapSource` translate the integer codes documented for `MXSTATUS_PROXY` (for example `0 = Ok`, `3 = CommunicationError`, `0 = RequestingLmx`, `5 = RespondingAutomationObject`)".
CLAIM_TYPE: term
VERDICT: accurate
EVIDENCE: src/ZB.MOM.WW.MxGateway.Worker/Conversion/MxStatusProxyConverter.cs:103-133 — `MapCategory(0)` → `MxStatusCategory.Ok`; `MapCategory(3)` → `MxStatusCategory.CommunicationError`; `MapSource(0)` → `MxStatusSource.RequestingLmx`; `MapSource(5)` → `MxStatusSource.RespondingAutomationObject`.
CODE_AREA: worker.convert
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC: WorkerConversion.md
LINES: 225
CLAIM: "The mapping covers the engine-error range documented for MXAccess (16-50, 56-61, 541-542, 8017)."
CLAIM_TYPE: behavior-rule
VERDICT: stale
EVIDENCE: src/ZB.MOM.WW.MxGateway.Worker/Conversion/MxStatusDetailText.cs:7-48 — the dictionary has gaps within those ranges: keys 35, 45, 46 are absent from 16–50; keys 58, 59 are absent from 56–61. The doc implies contiguous ranges.
CODE_AREA: worker.convert
SEVERITY: low
PROPOSED_FIX: Replace the continuous-range description with "selected detail codes in the ranges 16–50, 56–61, 541–542, and 8017 (not all values in those ranges are populated)."
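NOTE (illustrative): the gap can be made concrete by subtracting the absent keys from the documented ranges. The sketch assumes the five keys listed in the evidence are the only gaps, which the finding implies but does not state outright. The constant and function names are hypothetical.

```python
# Ranges as documented in WorkerConversion.md (inclusive bounds).
DOCUMENTED_RANGES = [range(16, 51), range(56, 62), range(541, 543), range(8017, 8018)]

# Keys the evidence reports as missing from the dictionary.
ABSENT_KEYS = {35, 45, 46, 58, 59}


def populated_codes() -> set:
    """Codes inside the documented ranges that are assumed to have text."""
    documented = {code for block in DOCUMENTED_RANGES for code in block}
    return documented - ABSENT_KEYS
```

Under that assumption the documented ranges span 44 codes, of which 39 are populated.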
---
DOC: WorkerBootstrap.md
LINES: 7-8
CLAIM: "`WorkerApplication.Run` constructs the bootstrap dependencies (`EnvironmentVariableWorkerEnvironment`, `WorkerConsoleLogger` writing to `Console.Error`, and a `WorkerPipeClient`)".
CLAIM_TYPE: term
VERDICT: accurate
EVIDENCE: src/ZB.MOM.WW.MxGateway.Worker/WorkerApplication.cs:16-19.
CODE_AREA: worker.launcher
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC: WorkerBootstrap.md
LINES: 113-120
CLAIM: Exit code table with seven rows 0–6.
CLAIM_TYPE: behavior-rule
VERDICT: accurate
EVIDENCE: src/ZB.MOM.WW.MxGateway.Worker/Bootstrap/WorkerExitCode.cs:5-12.
CODE_AREA: worker.launcher
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC: WorkerBootstrap.md
LINES: 181-193
CLAIM: `WorkerLogRedactor` `SensitiveFieldNameParts` list (seven entries: nonce, secret, password, token, credential, apikey, api_key).
CLAIM_TYPE: behavior-rule
VERDICT: accurate
EVIDENCE: src/ZB.MOM.WW.MxGateway.Worker/Bootstrap/WorkerLogRedactor.cs:16-25.
CODE_AREA: worker.launcher
SEVERITY: low
PROPOSED_FIX: flag only
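NOTE (illustrative): a redactor built on that seven-entry list might look like the sketch below. Only the seven name parts come from the finding. Case-insensitive substring matching, the `[redacted]` marker, and the function name are assumptions, not verified behaviour of `WorkerLogRedactor`.

```python
SENSITIVE_FIELD_NAME_PARTS = (
    "nonce", "secret", "password", "token", "credential", "apikey", "api_key",
)


def redact(fields: dict) -> dict:
    """Return a copy of `fields` with sensitive values masked."""

    def is_sensitive(name: str) -> bool:
        lowered = name.lower()  # assumption: matching ignores case
        return any(part in lowered for part in SENSITIVE_FIELD_NAME_PARTS)

    return {
        name: ("[redacted]" if is_sensitive(name) else value)
        for name, value in fields.items()
    }
```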
---
DOC: WorkerBootstrap.md
LINES: 105
CLAIM: "`Succeeded` is defined as `ExitCode == WorkerExitCode.Success` rather than as a separate flag, so the exit code and the success state cannot disagree."
CLAIM_TYPE: behavior-rule
VERDICT: accurate
EVIDENCE: src/ZB.MOM.WW.MxGateway.Worker/Bootstrap/WorkerBootstrapResult.cs:36.
CODE_AREA: worker.launcher
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC: WorkerFrameProtocol.md
LINES: 14-19
CLAIM: Each frame starts with a four-byte little-endian unsigned payload length followed by the serialized `WorkerEnvelope` payload. Zero-length payloads and payloads larger than the configured maximum are rejected before allocating the payload buffer. The default maximum is 16 MiB.
CLAIM_TYPE: rpc/proto
VERDICT: accurate
EVIDENCE: src/ZB.MOM.WW.MxGateway.Worker/Ipc/WorkerFrameReader.cs:32-50; WorkerFrameProtocolOptions.cs:11 (`DefaultMaxMessageBytes = 16 * 1024 * 1024`).
CODE_AREA: worker.frameproto
SEVERITY: low
PROPOSED_FIX: flag only
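NOTE (illustrative): the framing rule in this claim can be sketched in a few lines. The four-byte little-endian length prefix, the zero-length and oversize rejections, and the 16 MiB default come from the claim. The function names and the `FrameProtocolError` type are hypothetical, and the payload is treated as opaque bytes rather than a serialized `WorkerEnvelope`.

```python
import io
import struct

DEFAULT_MAX_MESSAGE_BYTES = 16 * 1024 * 1024  # 16 MiB, per the finding


class FrameProtocolError(Exception):
    """Hypothetical error type for framing violations."""


def read_exact(stream, count: int) -> bytes:
    # Streams may return short reads, so loop until `count` bytes arrive.
    chunks = []
    remaining = count
    while remaining > 0:
        chunk = stream.read(remaining)
        if not chunk:
            raise FrameProtocolError("stream ended mid-frame")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def read_frame(stream, max_bytes: int = DEFAULT_MAX_MESSAGE_BYTES) -> bytes:
    (length,) = struct.unpack("<I", read_exact(stream, 4))
    # Both checks run before any payload buffer is allocated.
    if length == 0:
        raise FrameProtocolError("zero-length payload")
    if length > max_bytes:
        raise FrameProtocolError("payload exceeds configured maximum")
    return read_exact(stream, length)


def write_frame(payload: bytes) -> bytes:
    return struct.pack("<I", len(payload)) + payload
```

Validating the length before reading the payload keeps a hostile or corrupt prefix from forcing a large allocation.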
---
DOC: WorkerFrameProtocol.md
LINES: 22-34
CLAIM: Envelope validation checks: `protocol_version` must match configured version; `session_id` must match owning session; envelope must contain one typed `body` value. Violations throw `WorkerFrameProtocolException` with a `WorkerFrameProtocolErrorCode`.
CLAIM_TYPE: rpc/proto
VERDICT: accurate
EVIDENCE: src/ZB.MOM.WW.MxGateway.Worker/Ipc/WorkerEnvelopeValidator.cs:16-36.
CODE_AREA: worker.frameproto
SEVERITY: low
PROPOSED_FIX: flag only
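NOTE (illustrative): the three validation checks can be sketched as follows. The field names `protocol_version`, `session_id`, and `body` come from the claim. The `Envelope` dataclass, the error class, and the error-code strings are hypothetical and do not mirror `WorkerFrameProtocolErrorCode`.

```python
from dataclasses import dataclass
from typing import Optional


class EnvelopeValidationError(Exception):
    """Hypothetical error carrying a short code string."""

    def __init__(self, code: str):
        super().__init__(code)
        self.code = code


@dataclass
class Envelope:
    protocol_version: int
    session_id: str
    body: Optional[object] = None  # None models "no typed body set"


def validate(envelope: Envelope, expected_version: int, expected_session_id: str) -> None:
    if envelope.protocol_version != expected_version:
        raise EnvelopeValidationError("protocol-version-mismatch")
    if envelope.session_id != expected_session_id:
        raise EnvelopeValidationError("session-mismatch")
    if envelope.body is None:
        raise EnvelopeValidationError("missing-body")
```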
---
DOC: WorkerFrameProtocol.md
LINES: 38-41
CLAIM: "The frame protocol lives in `ZB.MOM.WW.MxGateway.Worker.Ipc` (`WorkerFrameReader`, `WorkerFrameWriter`, `WorkerFrameProtocolOptions`)".
CLAIM_TYPE: path
VERDICT: accurate
EVIDENCE: Namespaces in WorkerFrameReader.cs:9, WorkerFrameWriter.cs:8, WorkerFrameProtocolOptions.cs:6.
CODE_AREA: worker.frameproto
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC: WorkerFrameProtocol.md
LINES: 44-47
CLAIM: Test file path is `src/ZB.MOM.WW.MxGateway.Worker.Tests/Ipc/WorkerFrameProtocolTests.cs`.
CLAIM_TYPE: path
VERDICT: accurate
EVIDENCE: File confirmed at that path.
CODE_AREA: worker.frameproto
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC: WorkerProcessLauncher.md
LINES: 18-25
CLAIM: Launcher passes `SessionId`, `PipeName`, and `ProtocolVersion` as `--session-id`, `--pipe-name`, `--protocol-version` CLI arguments; nonce travels via `MXGATEWAY_WORKER_NONCE` environment variable; nonce is excluded from `WorkerProcessCommandLine`.
CLAIM_TYPE: command
VERDICT: accurate
EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Workers/WorkerProcessLauncher.cs:156-184.
CODE_AREA: worker.launcher
SEVERITY: low
PROPOSED_FIX: flag only
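NOTE (illustrative): the split between command-line arguments and the environment can be sketched without launching anything. The three flag names and `MXGATEWAY_WORKER_NONCE` come from the claim. The function name, the executable name in the usage line, and the returned tuple shape are assumptions.

```python
NONCE_ENV_VAR = "MXGATEWAY_WORKER_NONCE"


def build_launch(executable, session_id, pipe_name, protocol_version, nonce, base_env):
    """Return (argv, env) for a worker launch; the nonce never enters argv."""
    argv = [
        executable,
        "--session-id", session_id,
        "--pipe-name", pipe_name,
        "--protocol-version", str(protocol_version),
    ]
    env = dict(base_env)  # copy so the caller's environment is untouched
    env[NONCE_ENV_VAR] = nonce
    return argv, env
```

For example, `build_launch("worker.exe", "s-1", "pipe-1", 1, "n0nce", {})` yields an argument list that can be logged safely, because the nonce only appears in the environment mapping.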
---
DOC: WorkerProcessLauncher.md
LINES: 30-34
CLAIM: Launcher validates that the configured worker path exists, has `.exe` extension, contains a valid Windows Portable Executable header, and matches `RequiredArchitecture`.
CLAIM_TYPE: behavior-rule
VERDICT: accurate
EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Workers/WorkerProcessLauncher.cs:189-220 calls `WorkerExecutableValidator.Validate`.
CODE_AREA: worker.launcher
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC: WorkerProcessLauncher.md
LINES: 35-45
CLAIM: Default probe (`IWorkerStartupProbe`) "only verifies that the worker did not exit immediately." Retry policy configured by `WorkerOptions.StartupProbeRetryAttempts` and `WorkerOptions.StartupProbeRetryDelayMilliseconds`; counter recorded as `mxgateway.retries.attempted` with `area=worker_startup`.
CLAIM_TYPE: behavior-rule
VERDICT: accurate
EVIDENCE: WorkerProcessStartedProbe.cs:10-24 (exits check only); WorkerOptions.cs:18-22; GatewayMetrics.cs:70 (`mxgateway.retries.attempted`); WorkerProcessLauncher.cs:279 (area label `"worker_startup"`).
CODE_AREA: worker.launcher
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC: WorkerProcessLauncher.md
LINES: 48-55
CLAIM: Launcher also passes `MXGATEWAY_WORKER_PIPE_CONNECT_ATTEMPT_TIMEOUT_MS` from `WorkerOptions.PipeConnectAttemptTimeoutMilliseconds`. On failure, kills the worker process tree, disposes the process handle, disposes the optional pipe reservation, records a worker kill metric, and reports `WorkerProcessLaunchException`.
CLAIM_TYPE: behavior-rule
VERDICT: accurate
EVIDENCE: WorkerProcessLauncher.cs:181-182, 253-267.
CODE_AREA: worker.launcher
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC: WorkerProcessLauncher.md
LINES: 60-64
CLAIM: Test command: `dotnet test src/ZB.MOM.WW.MxGateway.Tests/ZB.MOM.WW.MxGateway.Tests.csproj --filter WorkerProcessLauncherTests`.
CLAIM_TYPE: command
VERDICT: accurate
EVIDENCE: Project file confirmed at `src/ZB.MOM.WW.MxGateway.Tests/ZB.MOM.WW.MxGateway.Tests.csproj`; test class `WorkerProcessLauncherTests` confirmed at `src/ZB.MOM.WW.MxGateway.Tests/Gateway/Workers/WorkerProcessLauncherTests.cs`.
CODE_AREA: worker.launcher
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC: WorkerSta.md
LINES: 14
CLAIM: Type table shows `StaCommandDispatcher` as "Bounded asynchronous queue in front of `StaRuntime`…".
CLAIM_TYPE: term
VERDICT: stale
EVIDENCE: src/ZB.MOM.WW.MxGateway.Worker/Sta/StaCommandDispatcher.cs:15 — uses `Queue<QueuedStaCommand>`, a plain synchronous non-concurrent `Queue<T>` guarded by `lock(gate)`. There is no async channel or channel-based backpressure; `DrainAsync` is fire-and-forget but the queue itself is not an async queue.
CODE_AREA: worker.sta
SEVERITY: low
PROPOSED_FIX: Change "Bounded asynchronous queue" to "Bounded queue with an async drain loop" to avoid implying the underlying data structure is an async channel.
---
DOC: WorkerSta.md
LINES: 56
CLAIM: "The `idlePumpInterval` defaults to 50 ms so the pump still services Windows messages even when no commands are queued".
CLAIM_TYPE: config-key
VERDICT: accurate
EVIDENCE: src/ZB.MOM.WW.MxGateway.Worker/Sta/StaRuntime.cs:30 — `TimeSpan.FromMilliseconds(50)`.
CODE_AREA: worker.sta
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC: WorkerSta.md
LINES: 82-99
CLAIM: `InvokeAsync<T>` wraps the delegate in a `StaWorkItem<T>`, enqueues it on a `ConcurrentQueue<IStaWorkItem>`, and signals `commandWakeEvent`. `StaWorkItem<T>` uses an `Interlocked.CompareExchange` on `started` so exactly one of three outcomes happens.
CLAIM_TYPE: behavior-rule
VERDICT: accurate
EVIDENCE: StaRuntime.cs:12 (`ConcurrentQueue<IStaWorkItem>`); StaRuntime.cs:164-177; StaWorkItem.cs:31,47,57.
CODE_AREA: worker.sta
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC: WorkerSta.md
LINES: 141-148
CLAIM: Shutdown sequence step 1: sets `shutdownRequested` under `gate`; step 2: signals `commandWakeEvent`; step 3: waits up to `timeout` on `stoppedEvent`, which the STA sets after leaving `ThreadMain`; step 4: drains the queue through `CancelQueuedCommands` calling `CancelBeforeExecution`.
CLAIM_TYPE: behavior-rule
VERDICT: stale
EVIDENCE: StaRuntime.cs:261-273 — `CancelQueuedCommands()` is called inside `ThreadMain`'s `finally` block *before* `stoppedEvent.Set()`, meaning the drain happens on the STA thread, not after `stoppedEvent` is observed by `Shutdown()`. `Shutdown()` calls `CancelQueuedCommands()` a *second* time after observing `stoppedEvent`, but the doc implies a single post-stop drain.
CODE_AREA: worker.sta
SEVERITY: medium
PROPOSED_FIX: Revise step 3 to note that `stoppedEvent` is set from within `ThreadMain`'s `finally` block (before the thread exits) after `CoUninitialize`. Revise step 4 to note the queue is drained *twice*: once by `ThreadMain` in its `finally` (to cancel items enqueued before shutdown) and once by `Shutdown()` after `stoppedEvent` (to cancel any items enqueued in the gap).
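NOTE (illustrative): the two-drain ordering described in the evidence can be modelled with a toy worker thread. This is a Python analogy under stated assumptions, not a port of `StaRuntime`: the class and member names are hypothetical, "executing" a command just records its name, and COM teardown is omitted.

```python
import queue
import threading


class MiniStaRuntime:
    """Toy model of the two-drain shutdown; names are illustrative."""

    def __init__(self):
        self._commands = queue.Queue()
        self._wake = threading.Event()
        self._stopped = threading.Event()
        self._gate = threading.Lock()
        self._shutdown_requested = False
        self.executed = []
        self.cancelled = []
        self._thread = threading.Thread(target=self._thread_main, daemon=True)
        self._thread.start()

    def enqueue(self, name):
        self._commands.put(name)
        self._wake.set()

    def _drain(self, sink):
        # Move every queued item into `sink` without blocking.
        while True:
            try:
                sink.append(self._commands.get_nowait())
            except queue.Empty:
                return

    def _thread_main(self):
        try:
            while True:
                self._wake.wait()
                self._wake.clear()
                with self._gate:
                    if self._shutdown_requested:
                        return
                self._drain(self.executed)  # "run" the queued work
        finally:
            self._drain(self.cancelled)  # first drain, on the worker thread
            self._stopped.set()  # signalled only after that drain

    def shutdown(self, timeout):
        with self._gate:
            self._shutdown_requested = True
        self._wake.set()
        stopped = self._stopped.wait(timeout)
        if stopped:
            self._drain(self.cancelled)  # second drain, catches late arrivals
        return stopped
```

Every enqueued item ends up either executed or cancelled, which is the property the two drains are there to protect.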
---
DOC: WorkerSta.md
LINES: 149
CLAIM: "`Dispose` calls `Shutdown` with a five-second budget and only disposes the wait handles when shutdown actually completed".
CLAIM_TYPE: behavior-rule
VERDICT: accurate
EVIDENCE: StaRuntime.cs:224-233 — `Shutdown(TimeSpan.FromSeconds(5))`; handles disposed only when `stopped` is true.
CODE_AREA: worker.sta
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC: WorkerSta.md
LINES: 108
CLAIM: "when `commandQueue.Count` reaches `maxPendingCommands` (default `DefaultMaxPendingCommands = 128`) the dispatcher returns a synthetic `WorkerUnavailable` reply".
CLAIM_TYPE: config-key
VERDICT: accurate
EVIDENCE: StaCommandDispatcher.cs:11 (`DefaultMaxPendingCommands = 128`); lines 125-132 (count check and WorkerUnavailable reply).
CODE_AREA: worker.sta
SEVERITY: low
PROPOSED_FIX: flag only
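NOTE (illustrative): the capacity check reads naturally as a lock-guarded plain queue, matching the "`Queue<T>` guarded by `lock(gate)`" description in the earlier `StaCommandDispatcher` finding. The default of 128 and the `WorkerUnavailable` outcome come from the findings. The class name, the `Queued` label, and the return-a-string convention are assumptions.

```python
import threading
from collections import deque

DEFAULT_MAX_PENDING_COMMANDS = 128  # per the finding


class BoundedDispatcher:
    """Plain queue behind a lock; rejects work once capacity is reached."""

    def __init__(self, max_pending: int = DEFAULT_MAX_PENDING_COMMANDS):
        self._gate = threading.Lock()
        self._pending = deque()
        self._max_pending = max_pending

    def try_enqueue(self, command) -> str:
        with self._gate:
            if len(self._pending) >= self._max_pending:
                # Synthetic reply instead of blocking or raising.
                return "WorkerUnavailable"
            self._pending.append(command)
            return "Queued"

    def pending_count(self) -> int:
        with self._gate:
            return len(self._pending)
```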
---
DOC: MxAccessWorkerInstanceDesign.md
LINES: 97
CLAIM: Expected protected environment values include `MXGATEWAY_WORKER_LOG_CONTEXT=<optional context>`.
CLAIM_TYPE: config-key
VERDICT: wrong
EVIDENCE: No occurrence of `MXGATEWAY_WORKER_LOG_CONTEXT` anywhere in `src/ZB.MOM.WW.MxGateway.Worker/**`. The only worker environment variables in code are `MXGATEWAY_WORKER_NONCE` (WorkerOptions.cs:7) and `MXGATEWAY_WORKER_PIPE_CONNECT_ATTEMPT_TIMEOUT_MS` (WorkerProcessLauncher.cs:22).
CODE_AREA: worker.launcher
SEVERITY: high
PROPOSED_FIX: Remove `MXGATEWAY_WORKER_LOG_CONTEXT` from the bootstrap environment table, or add a note that it is not yet implemented if it is intended for a future slice.
---
DOC: MxAccessWorkerInstanceDesign.md
LINES: 86-99
CLAIM: Bootstrap sequence lists `MXGATEWAY_WORKER_LOG_CONTEXT` as an optional protected environment value alongside `MXGATEWAY_WORKER_NONCE`.
CLAIM_TYPE: config-key
VERDICT: wrong
EVIDENCE: Same as above — `MXGATEWAY_WORKER_LOG_CONTEXT` is not read anywhere in the worker bootstrap code.
CODE_AREA: worker.launcher
SEVERITY: high
PROPOSED_FIX: flag only (same fix as prior entry).
---
DOC: MxAccessWorkerInstanceDesign.md
LINES: 368-375
CLAIM: "`MxAccessEventQueue` is the bounded outbound event queue for one worker session. It assigns the monotonic `WorkerSequence` and `WorkerTimestamp` when an event is accepted. The default capacity is `10000`. When the queue reaches capacity it records a `WorkerFaultCategory.QueueOverflow` fault and rejects further events."
CLAIM_TYPE: behavior-rule
VERDICT: stale
EVIDENCE: MxAccessEventQueue.cs:115-132 — `Enqueue` throws `MxAccessEventQueueOverflowException` in addition to recording the fault. Callers in `MxAccessBaseEventSink` catch this exception. The doc's phrase "rejects further events" omits the thrown exception, which callers must handle.
CODE_AREA: worker.sta
SEVERITY: low
PROPOSED_FIX: Add that `Enqueue` raises `MxAccessEventQueueOverflowException` on overflow, in addition to recording the fault, so that callers know to catch this exception rather than only observing the fault via `DrainFault()`.
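NOTE (illustrative): the overflow contract has two observable effects, a recorded fault and a raised exception, and callers need to handle both. The capacity of 10000, the monotonic sequence, the timestamp, and the `QueueOverflow` fault name come from the finding. The class shape, the `fault` attribute, and the exception name are hypothetical.

```python
import time
from collections import deque


class EventQueueOverflowError(Exception):
    """Hypothetical stand-in for MxAccessEventQueueOverflowException."""


class EventQueue:
    def __init__(self, capacity: int = 10000):
        self._capacity = capacity
        self._events = deque()
        self._next_sequence = 0
        self.fault = None  # set to "QueueOverflow" once capacity is hit

    def enqueue(self, payload):
        if len(self._events) >= self._capacity:
            self.fault = "QueueOverflow"  # effect 1: fault is recorded
            raise EventQueueOverflowError("event queue is full")  # effect 2
        self._next_sequence += 1  # sequence is assigned on acceptance
        self._events.append((self._next_sequence, time.time(), payload))
        return self._next_sequence
```

A caller that only polls the fault would miss the exception unwinding its own stack, which is the omission the finding flags.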
---
DOC: WorkerConversion.md
LINES: 1-262 (entire doc)
CLAIM: Documents `VariantConverter`, `HResultConverter`/`HResultConversion`, `MxStatusProxyConverter`, `MxStatusDetailText`, `MxStatusConversionException`.
CLAIM_TYPE: term
VERDICT: gap
EVIDENCE: src/ZB.MOM.WW.MxGateway.Worker/Conversion/VariantConverter.cs:129-177 — `ConvertToComValue(MxValue)` and `ConvertToComArray(MxArray)` are fully implemented methods that convert protobuf values back to CLR objects for COM write calls. These inverse-projection paths are nowhere mentioned in WorkerConversion.md, leaving integrators unaware of the write path.
CODE_AREA: worker.convert
SEVERITY: medium
PROPOSED_FIX: Add a section "Inverse projection for COM writes" describing `ConvertToComValue`, its dispatch on `MxValue.KindOneofCase`, the `ConvertToComArray` helper, and that raw or unset `MxValue` payloads throw `ArgumentException`.
---
DOC: MxAccessWorkerInstanceDesign.md
LINES: 134-160
CLAIM: Internal component tree for `MxAccess` subtree lists: `MxAccessSession`, `MxAccessCommandDispatcher`, `MxAccessEventSink`, `MxAccessHandleRegistry`.
CLAIM_TYPE: term
VERDICT: stale
EVIDENCE: Actual classes: `MxAccessSession` (internal session state), `MxAccessStaSession` (owner of the STA session lifecycle), `MxAccessCommandExecutor` (implements `IStaCommandExecutor`), `MxAccessBaseEventSink`/`MxAccessAlarmEventSink` (event sinks), `MxAccessHandleRegistry`. The class `MxAccessCommandDispatcher` does not exist.
CODE_AREA: worker.sta
SEVERITY: medium
PROPOSED_FIX: Update MxAccess subtree to reflect actual class names. Note that `MxAccessStaSession` owns `StaCommandDispatcher` (in the Sta namespace) and `MxAccessCommandExecutor`; they are separate concerns.
---
DOC: MxAccessWorkerInstanceDesign.md
LINES: 134-160 (entire component tree)
CLAIM: No mention of the alarm subsystem.
CLAIM_TYPE: term
VERDICT: gap
EVIDENCE: src/ZB.MOM.WW.MxGateway.Worker/MxAccess/ contains a complete alarm subsystem: `AlarmCommandHandler.cs`, `AlarmDispatcher.cs`, `AlarmRecordTransitionMapper.cs`, `IAlarmCommandHandler.cs`, `IMxAccessAlarmConsumer.cs`, `MxAccessAlarmEventSink.cs`, `WnWrapAlarmConsumer.cs`, `MxAlarmSnapshot.cs`, `MxAlarmStateKind.cs`, `MxAlarmTransitionEvent.cs`. None of these appear in any of the six audited docs. `MxAccessStaSession.cs` shows an `alarmCommandHandlerFactory` parameter and an alarm poll loop (lines 14-312).
CODE_AREA: worker.sta
SEVERITY: high
PROPOSED_FIX: Add an "Alarm Subsystem" section to MxAccessWorkerInstanceDesign.md (or create docs/WorkerAlarms.md) covering: `IAlarmCommandHandler`/`AlarmCommandHandler`, the `WnWrapAlarmConsumer` STA-affinity requirement, the 500 ms alarm poll loop in `MxAccessStaSession.RunAlarmPollLoopAsync`, `AlarmDispatcher`, and the `MxAccessAlarmEventSink`. Update the event-sink list in the "Event Sink" section to include alarm events.
---
DOC: MxAccessWorkerInstanceDesign.md
LINES: 336-338
CLAIM: Event sink must subscribe to `OnDataChange`, `OnWriteComplete`, `OperationComplete`, `OnBufferedDataChange`.
CLAIM_TYPE: behavior-rule
VERDICT: gap
EVIDENCE: src/ZB.MOM.WW.MxGateway.Worker/MxAccess/MxAccessAlarmEventSink.cs exists alongside `MxAccessBaseEventSink.cs`, indicating a fifth event family (alarm events) is handled. The four-family list is incomplete.
CODE_AREA: worker.sta
SEVERITY: medium
PROPOSED_FIX: Add alarm events to the event sink subscription list and clarify that alarm events are handled via `MxAccessAlarmEventSink` on the same STA thread.
---
DOC: WorkerConversion.md
LINES: 17-18
CLAIM: "It accepts an optional `expectedDataType` so that an MXAccess attribute hint (for example `MxDataType.Time` for a 64-bit FILETIME) overrides the default CLR-driven projection."
CLAIM_TYPE: behavior-rule
VERDICT: accurate
EVIDENCE: VariantConverter.cs:262-291 (`ConvertInt64Scalar` checks `expectedDataType == MxDataType.Time && value is long`).
CODE_AREA: worker.convert
SEVERITY: low
PROPOSED_FIX: flag only
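NOTE (illustrative): a 64-bit FILETIME counts 100-nanosecond intervals since 1601-01-01 UTC, which is why a type hint is needed to tell it apart from an ordinary integer. The conversion below is standard FILETIME arithmetic. Treating the value as UTC, and the function name, are assumptions about how the hint would be applied, not a description of `VariantConverter`.

```python
from datetime import datetime, timedelta, timezone

FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def filetime_to_datetime(ticks: int) -> datetime:
    """Convert 100 ns ticks since 1601-01-01 UTC to an aware datetime."""
    # Ten ticks make one microsecond; sub-microsecond precision is dropped.
    return FILETIME_EPOCH + timedelta(microseconds=ticks // 10)
```

Without the hint, the same `long` would be projected as a plain integer.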
---
DOC: WorkerConversion.md
LINES: 112-135
CLAIM: "`HResultConverter.Convert` prefers `COMException.ErrorCode` over `Exception.HResult` because the runtime sometimes overwrites `Exception.HResult` while marshalling".
CLAIM_TYPE: behavior-rule
VERDICT: accurate
EVIDENCE: HResultConverter.cs:21-26.
CODE_AREA: worker.convert
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC: WorkerBootstrap.md
LINES: 48-54
CLAIM: Three fields arrive on the command line (`--session-id`, `--pipe-name`, `--protocol-version`) and one via environment variable (`MXGATEWAY_WORKER_NONCE`).
CLAIM_TYPE: command
VERDICT: accurate
EVIDENCE: WorkerOptionsParser.cs:12-14, 78.
CODE_AREA: worker.launcher
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC: WorkerBootstrap.md
LINES: 155-159
CLAIM: "`IWorkerLogger` exposes only `Information` and `Error`. There is no `Debug` or `Trace` level."
CLAIM_TYPE: behavior-rule
VERDICT: accurate
EVIDENCE: IWorkerLogger.cs:8-19.
CODE_AREA: worker.launcher
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC: WorkerSta.md
LINES: 34
CLAIM: "`StaComApartmentInitializer.Initialize` calls `CoInitializeEx` with `COINIT_APARTMENTTHREADED` (`0x2`) and treats both `S_OK` and `S_FALSE` as success because `S_FALSE` indicates the apartment was already initialized on this thread."
CLAIM_TYPE: behavior-rule
VERDICT: accurate
EVIDENCE: StaComApartmentInitializer.cs:8-18.
CODE_AREA: worker.sta
SEVERITY: low
PROPOSED_FIX: flag only
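NOTE (illustrative): the success rule reduces to accepting two HRESULT values. `S_OK`, `S_FALSE`, `COINIT_APARTMENTTHREADED`, and `RPC_E_CHANGED_MODE` are standard Windows constants. The helper name is hypothetical and no COM call is made here.

```python
S_OK = 0x00000000
S_FALSE = 0x00000001  # apartment already initialized on this thread
RPC_E_CHANGED_MODE = 0x80010106  # thread already uses a different model
COINIT_APARTMENTTHREADED = 0x2


def apartment_initialized(hresult: int) -> bool:
    """Treat S_OK and S_FALSE as success; anything else is a failure."""
    # Mask to 32 bits so signed and unsigned representations compare equal.
    return (hresult & 0xFFFFFFFF) in (S_OK, S_FALSE)
```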
---
DOC: WorkerSta.md
LINES: 63-78
CLAIM: "`StaMessagePump.WaitForWorkOrMessages` calls `MsgWaitForMultipleObjectsEx` with `QS_ALLINPUT` and `MWMO_INPUTAVAILABLE`. `PumpPendingMessages` drains the queue with `PM_REMOVE`."
CLAIM_TYPE: behavior-rule
VERDICT: accurate
EVIDENCE: StaMessagePump.cs:13-15 (`MwmoInputAvailable = 0x0004`, `PmRemove = 0x0001`, `QsAllInput = 0x04FF`); lines 31-36, 50-57.
CODE_AREA: worker.sta
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC: MxAccessWorkerInstanceDesign.md
LINES: 271-286
CLAIM: COM details: interop assembly path, assembly identity (`ArchestrA.MxAccess, Version=3.2.0.0, PublicKeyToken=23106a86e706d0ae`), COM class `ArchestrA.MxAccess.LMXProxyServerClass`, CLSID `{C30B52F5-2CB5-4760-AF0A-3A344A7EB5DC}`, ProgID `LMXProxy.LMXProxyServer.1`, version-independent ProgID `LMXProxy.LMXProxyServer`, registered server `LmxProxy.dll`, threading model `Apartment`.
CLAIM_TYPE: path
VERDICT: accurate
EVIDENCE: src/ZB.MOM.WW.MxGateway.Worker/MxAccess/MxAccessInteropInfo.cs — ProgId, VersionIndependentProgId, Clsid, InteropAssemblyPath, RegisteredServerPath, ComClassName all match. Assembly identity and threading model are from MXAccess analysis sources and are unverifiable in this repo but consistent with design sources cited in CLAUDE.md.
CODE_AREA: worker.sta
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC: MxAccessWorkerInstanceDesign.md
LINES: 656-660
CLAIM: "HeartbeatStuckCeiling (default 75 seconds = 5 × HeartbeatGrace)".
CLAIM_TYPE: config-key
VERDICT: accurate
EVIDENCE: WorkerPipeSessionOptions.cs:19 (`DefaultHeartbeatStuckCeiling = TimeSpan.FromSeconds(75)`); DefaultHeartbeatGrace = 15 s (line 11); 5 × 15 = 75. ✓
CODE_AREA: worker.sta
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC: WorkerBootstrap.md
LINES: 5-6
CLAIM: "The worker process is a short-lived child of the gateway."
CLAIM_TYPE: term
VERDICT: stale
EVIDENCE: No functional error, but "short-lived" is context-dependent; workers persist for the entire duration of a gateway session (which may be hours). Integrators might misread this as expecting sub-minute lifetimes.
CODE_AREA: worker.launcher
SEVERITY: low
PROPOSED_FIX: Replace "short-lived child" with "per-session child process" or "child process that lives for the duration of one gateway session."
---
DOC: MxAccessWorkerInstanceDesign.md
LINES: 151
CLAIM: Component tree lists `MxAccessSession` as a class under `MxAccess`.
CLAIM_TYPE: term
VERDICT: accurate
EVIDENCE: src/ZB.MOM.WW.MxGateway.Worker/MxAccess/MxAccessSession.cs exists. The tree is incomplete (missing `MxAccessStaSession`, alarm classes, etc.) but `MxAccessSession` itself is real.
CODE_AREA: worker.sta
SEVERITY: low
PROPOSED_FIX: flag only (incompleteness covered by the component-tree stale entry above).
---
DOC: WorkerConversion.md
LINES: 18
CLAIM: `VariantConverter` is in namespace `ZB.MOM.WW.MxGateway.Worker.Conversion`.
CLAIM_TYPE: path
VERDICT: accurate
EVIDENCE: VariantConverter.cs:8 (`namespace ZB.MOM.WW.MxGateway.Worker.Conversion;`).
CODE_AREA: worker.convert
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC: WorkerFrameProtocol.md
LINES: 49-53
CLAIM: Build command `dotnet build src/ZB.MOM.WW.MxGateway.Worker/ZB.MOM.WW.MxGateway.Worker.csproj -p:Platform=x86`.
CLAIM_TYPE: command
VERDICT: accurate
EVIDENCE: Project file exists at that path.
CODE_AREA: worker.frameproto
SEVERITY: low
PROPOSED_FIX: flag only
+380
@@ -0,0 +1,380 @@
# Cluster 03 — Sessions/Runtime
Auditor: automated (claude-sonnet-4-6)
Date: 2026-06-03
Source doc: docs/Sessions.md
Verified against: src/ZB.MOM.WW.MxGateway.Server/Sessions/**, src/ZB.MOM.WW.MxGateway.Server/Workers/**
---
DOC / LINES / 9
CLAIM: "All four interfaces (`ISessionManager`, `ISessionRegistry`, `ISessionWorkerClientFactory`) plus `SessionShutdownHostedService` are wired as singletons by `SessionServiceCollectionExtensions.AddGatewaySessions`."
CLAIM_TYPE: term
VERDICT: wrong
EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Sessions/SessionServiceCollectionExtensions.cs:9-18 — only three interfaces exist (confirmed by `ls I*.cs` in Sessions/). The doc claims "four interfaces" but names only three. Additionally the DI registration also registers `SessionLeaseMonitorHostedService` as a hosted service, which is omitted from this sentence.
CODE_AREA: session.di
SEVERITY: medium
PROPOSED_FIX: Change "All four interfaces" to "All three interfaces". Separately note that two hosted services are registered: `SessionLeaseMonitorHostedService` and `SessionShutdownHostedService`.
---
DOC / LINES / 265-276
CLAIM: Code snippet for `AddGatewaySessions` shows only `SessionShutdownHostedService` registered; `SessionLeaseMonitorHostedService` is absent from the snippet.
CLAIM_TYPE: behavior-rule
VERDICT: stale
EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Sessions/SessionServiceCollectionExtensions.cs:14-15 — actual code registers both `AddHostedService<SessionLeaseMonitorHostedService>()` and `AddHostedService<SessionShutdownHostedService>()`. The snippet in the doc is missing the lease-monitor line.
CODE_AREA: session.di
SEVERITY: medium
PROPOSED_FIX: Add `services.AddHostedService<SessionLeaseMonitorHostedService>();` to the code snippet (between the `ISessionManager` singleton line and the shutdown service line).
---
DOC / LINES / 232-259
CLAIM: The `ShutdownAsync` code snippet shown calls `session.KillWorker(GatewayShutdownReason)` and `await RemoveSessionAsync(session)` directly in the catch block.
CLAIM_TYPE: behavior-rule
VERDICT: stale
EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Sessions/SessionManager.cs:296-331 — the actual `ShutdownAsync` fallback calls `await KillWorkerAsync(session.SessionId, GatewayShutdownReason, cancellationToken)` (which routes through `KillWorkerWithCloseGateAsync` and then `RemoveSessionAsync`), not a direct `session.KillWorker` + `RemoveSessionAsync`. The old snippet predates the Server-045/Server-046 refactor that unified the kill path through `KillWorkerAsync`.
CODE_AREA: session.shutdown
SEVERITY: medium
PROPOSED_FIX: Replace the ShutdownAsync snippet with the current implementation, which checks `_registry.TryGet` then calls `KillWorkerAsync` (wrapped in its own try/catch) instead of directly calling `session.KillWorker` and `RemoveSessionAsync`.
---
DOC / LINES / 55-59
CLAIM: "`KillWorkerAsync` is the forceful path used by the dashboard's admin Kill button: it calls `GatewaySession.KillWorker` directly, which kills the worker process immediately with no graceful-shutdown attempt and transitions the session to `Closed`."
CLAIM_TYPE: behavior-rule
VERDICT: stale
EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Sessions/SessionManager.cs:216-264 — `KillWorkerAsync` now calls `session.KillWorkerWithCloseGateAsync` (not `GatewaySession.KillWorker` directly). The `KillWorkerWithCloseGateAsync` method acquires `_closeLock` before killing, serializing concurrent close/kill attempts (Server-045 fix). The old description of a direct `KillWorker` call is stale.
CODE_AREA: session.lifecycle
SEVERITY: medium
PROPOSED_FIX: Update description to state that `KillWorkerAsync` calls `session.KillWorkerWithCloseGateAsync`, which acquires the per-session close lock before killing the worker, so concurrent close and kill callers serialize.
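NOTE (illustrative sketch, not gateway code): the serialization described in this finding can be pictured as a per-session lock that both the close path and the kill path must take before touching the worker. The sketch below is in Python rather than the gateway's C#. The `Session` class, its method names, and the "already closed means do nothing" rule are assumptions made for illustration; the evidence above only confirms that `KillWorkerWithCloseGateAsync` acquires `_closeLock` before killing.

```python
import threading


class Session:
    """Hypothetical session whose close and kill paths share one lock."""

    def __init__(self) -> None:
        self._close_lock = threading.Lock()
        self.state = "Ready"
        self.worker_kills = 0

    def _kill_worker(self) -> None:
        # Terminal action; only ever called while the close lock is held.
        self.worker_kills += 1
        self.state = "Closed"

    def kill_with_close_gate(self) -> bool:
        with self._close_lock:
            if self.state == "Closed":  # assumed: a closed session is left alone
                return False
            self._kill_worker()
            return True

    def close(self) -> bool:
        with self._close_lock:
            if self.state == "Closed":
                return False
            self.state = "Closed"
            return True


session = Session()
results: list[bool] = []
results_lock = threading.Lock()


def call(action) -> None:
    outcome = action()
    with results_lock:
        results.append(outcome)


# Race four kill callers against four close callers.
actions = [session.kill_with_close_gate, session.close] * 4
threads = [threading.Thread(target=call, args=(a,)) for a in actions]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Whichever caller wins the lock first performs the transition; every later caller observes `Closed` and returns without acting, which is the property the close gate is meant to provide.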
---
DOC / LINES / 59
CLAIM: "Both paths converge on the same registry/metrics cleanup, so the open-session slot is released and `mxgateway.sessions.closed` is incremented either way."
CLAIM_TYPE: config-key
VERDICT: accurate
EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Metrics/GatewayMetrics.cs:59 — counter name `mxgateway.sessions.closed` confirmed. Both `CloseSessionCoreAsync` and `KillWorkerAsync` call `_metrics.SessionClosed()` and `RemoveSessionAsync` (which calls `ReleaseSessionSlot`).
CODE_AREA: session.metrics
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC / LINES / 60-72
CLAIM: Code snippet for `EnsureSessionCapacity` throws `SessionManagerException` with `SessionLimitExceeded`; open requests that exceed the bound "throw ... rather than queuing".
CLAIM_TYPE: behavior-rule
VERDICT: accurate
EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Sessions/SessionManager.cs:388-396 — `_sessionSlots.Wait(0)` (zero timeout = non-blocking) confirms the no-queue, immediate-throw behavior.
CODE_AREA: session.lifecycle
SEVERITY: low
PROPOSED_FIX: flag only
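NOTE (illustrative sketch, not gateway code): the no-queue behavior verified above can be shown with a bounded semaphore and a non-blocking acquire. The sketch is in Python rather than the gateway's C#; `SessionSlots`, `SessionLimitExceeded`, and `max_sessions` are hypothetical stand-ins, and `acquire(blocking=False)` plays the role that `SemaphoreSlim.Wait(0)` plays in the evidence.

```python
import threading


class SessionLimitExceeded(Exception):
    """Hypothetical stand-in for SessionManagerException(SessionLimitExceeded)."""


class SessionSlots:
    """Bounded admission: a request either gets a slot now or fails now."""

    def __init__(self, max_sessions: int) -> None:
        self._slots = threading.BoundedSemaphore(max_sessions)

    def ensure_capacity(self) -> None:
        # Non-blocking acquire returns False immediately when no slot is free,
        # so the caller is rejected instead of being queued.
        if not self._slots.acquire(blocking=False):
            raise SessionLimitExceeded("session limit reached")

    def release(self) -> None:
        self._slots.release()


slots = SessionSlots(max_sessions=2)
slots.ensure_capacity()
slots.ensure_capacity()

try:
    slots.ensure_capacity()  # third open while both slots are held
    third_admitted = True
except SessionLimitExceeded:
    third_admitted = False

slots.release()
slots.ensure_capacity()  # a released slot can be taken again
```

The third request fails at once rather than waiting for a slot, which matches the "throw rather than queue" wording of the claim.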
---
DOC / LINES / 61
CLAIM: "Concurrency is bounded by a `SemaphoreSlim` initialized to `GatewayOptions.Sessions.MaxSessions`."
CLAIM_TYPE: config-key
VERDICT: accurate
EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Sessions/SessionManager.cs:53 — `new SemaphoreSlim(_options.Sessions.MaxSessions, _options.Sessions.MaxSessions)`.
CODE_AREA: session.lifecycle
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC / LINES / 75
CLAIM: "three close-reason constants — `DefaultCloseReason` (`\"client-close\"`), `GatewayShutdownReason` (`\"gateway-shutdown\"`), and `LeaseExpiredReason` (`\"lease-expired\"`)"
CLAIM_TYPE: term
VERDICT: accurate
EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Sessions/SessionManager.cs:17-19 — all three constants confirmed with exact string values.
CODE_AREA: session.lifecycle
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC / LINES / 79-81
CLAIM: "`SessionRegistry` is a thin wrapper over a `ConcurrentDictionary<string, GatewaySession>` keyed by session id with `StringComparer.Ordinal`."
CLAIM_TYPE: term
VERDICT: accurate
EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Sessions/SessionRegistry.cs:12 — `new ConcurrentDictionary<string, GatewaySession>(StringComparer.Ordinal)` confirmed.
CODE_AREA: session.registry
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC / LINES / 81
CLAIM: "`ActiveCount` filters out sessions whose state is `Closed`"
CLAIM_TYPE: behavior-rule
VERDICT: accurate
EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Sessions/SessionRegistry.cs:22 — `_sessions.Values.Count(session => session.State is not SessionState.Closed)` confirmed.
CODE_AREA: session.registry
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC / LINES / 15-19
CLAIM: "The session id is an opaque string in the form `session-{guid:N}` and the per-session pipe name is `mxaccess-gateway-{ProcessId}-{SessionId}`."
CLAIM_TYPE: term
VERDICT: accurate
EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Sessions/SessionManager.cs:433 (`pipeName = $"mxaccess-gateway-{Environment.ProcessId}-{sessionId}"`) and :479 (`$"session-{Guid.NewGuid():N}"`).
CODE_AREA: session.lifecycle
SEVERITY: low
PROPOSED_FIX: flag only
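NOTE (illustrative sketch, not gateway code): the two identifier formats in this claim can be reproduced in a few lines. The sketch is in Python rather than the gateway's C#, and the helper names `new_session_id` and `pipe_name` are hypothetical. It relies on `uuid4().hex` producing 32 lowercase hex digits without hyphens, the same shape as the .NET `N` GUID format.

```python
import os
import uuid


def new_session_id() -> str:
    # 32 lowercase hex digits, no hyphens or braces.
    return f"session-{uuid.uuid4().hex}"


def pipe_name(process_id: int, session_id: str) -> str:
    # The gateway process id scopes the pipe name to one gateway instance.
    return f"mxaccess-gateway-{process_id}-{session_id}"


session_id = new_session_id()
name = pipe_name(os.getpid(), session_id)
```

Because the session id already carries the `session-` prefix, the resulting pipe name contains it verbatim after the process id.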
---
DOC / LINES / 19
CLAIM: "`SessionState` itself is the protobuf-generated enum from `ZB.MOM.WW.MxGateway.Contracts.Proto`"
CLAIM_TYPE: term
VERDICT: accurate
EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Sessions/GatewaySession.cs:1 — `using ZB.MOM.WW.MxGateway.Contracts.Proto;` and the state field is typed `SessionState`.
CODE_AREA: session.lifecycle
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC / LINES / 85-87
CLAIM: "`SessionWorkerClientFactory.CreateAsync` … drives the session through the protobuf `SessionState` substates in order: `StartingWorker`, `WaitingForPipe`, `Handshaking`, `InitializingWorker`."
CLAIM_TYPE: behavior-rule
VERDICT: accurate
EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Sessions/SessionWorkerClientFactory.cs:60-105 — `TransitionTo(SessionState.StartingWorker)` → `TransitionTo(SessionState.WaitingForPipe)` → `TransitionTo(SessionState.Handshaking)` → `TransitionTo(SessionState.InitializingWorker)` in sequence.
CODE_AREA: session.lifecycle
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC / LINES / 87-98
CLAIM: Startup timeout wrapped as `TimeoutException` with the exact catch pattern shown — `OperationCanceledException` where `startupCancellation.IsCancellationRequested` and `!cancellationToken.IsCancellationRequested`.
CLAIM_TYPE: behavior-rule
VERDICT: accurate
EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Sessions/SessionWorkerClientFactory.cs:145-153 — identical predicate confirmed.
CODE_AREA: session.lifecycle
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC / LINES / 100
CLAIM: "The named pipe is created with `maxNumberOfServerInstances: 1`"
CLAIM_TYPE: behavior-rule
VERDICT: accurate
EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Sessions/SessionWorkerClientFactory.cs:166 — `maxNumberOfServerInstances: 1` confirmed.
CODE_AREA: session.lifecycle
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC / LINES / 104
CLAIM: "`SessionShutdownHostedService` … catches `OperationCanceledException` triggered by the host shutdown timeout and logs a warning so that an over-running shutdown does not surface as an unhandled exception."
CLAIM_TYPE: behavior-rule
VERDICT: accurate
EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Sessions/SessionShutdownHostedService.cs:18-28 — exact catch confirmed.
CODE_AREA: session.shutdown
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC / LINES / 109-127
CLAIM: `SessionOpenRequest` is a `sealed record` with fields `RequestedBackend`, `ClientSessionName`, `ClientCorrelationId`, `CommandTimeout`, and a `FromContract` factory.
CLAIM_TYPE: term
VERDICT: accurate
EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Sessions/SessionOpenRequest.cs:6-24 — confirmed. Note: the doc snippet includes a `ClientCorrelationId` field in the record definition, but the actual `SessionManager.CreateSession` derives `clientCorrelationId` internally rather than forwarding the field from the request. This is a minor mismatch between what the record holds vs. how it is used, but does not constitute an error in the doc's description of the record type itself.
CODE_AREA: session.lifecycle
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC / LINES / 134-139
CLAIM: `SessionCloseResult` is a `sealed record` with `SessionId`, `FinalState`, `AlreadyClosed`.
CLAIM_TYPE: term
VERDICT: accurate
EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Sessions/SessionCloseResult.cs:5-8 — confirmed.
CODE_AREA: session.lifecycle
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC / LINES / 143
CLAIM: "`SessionCloseStartedException` is `internal`"
CLAIM_TYPE: term
VERDICT: accurate
EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Sessions/SessionCloseStartedException.cs:3 — `internal sealed class SessionCloseStartedException` confirmed.
CODE_AREA: session.lifecycle
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC / LINES / 148-157
CLAIM: Error code table for `SessionManagerException` — seven codes listed: `SessionNotFound`, `SessionNotReady`, `EventSubscriberAlreadyActive`, `EventQueueOverflow`, `SessionLimitExceeded`, `OpenFailed`, `CloseFailed`.
CLAIM_TYPE: term
VERDICT: accurate
EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Sessions/SessionManagerErrorCode.cs:1-12 — all seven members confirmed in order.
CODE_AREA: session.lifecycle
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC / LINES / 163-188
CLAIM: Open failure rollback order: "fault, deregister, dispose, release slot, record metric, log, rethrow".
CLAIM_TYPE: behavior-rule
VERDICT: stale
EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Sessions/SessionManager.cs:97-123 — actual order is: MarkFaulted → TryRemove (deregister) → DisposeAsync → (conditionally) SessionRemoved metric if sessionOpenedRecorded → ReleaseSessionSlot → Fault metric → LogWarning → rethrow. The doc omits the `sessionOpenedRecorded` conditional `SessionRemoved()` call that was added in the Server-006 fix, making the described order incomplete. The doc text says "release slot, record metric" but the actual code calls `SessionRemoved` before `ReleaseSessionSlot` when `sessionOpenedRecorded` is true.
CODE_AREA: session.lifecycle
SEVERITY: medium
PROPOSED_FIX: Update the rollback description to note the conditional `SessionRemoved()` metric call that precedes `ReleaseSessionSlot` when `SessionOpened()` was already recorded (guards against mxgateway.sessions.open gauge leak on late failures such as auto-subscribe rejection).
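NOTE (illustrative sketch, not gateway code): the ordering in the evidence above, including the conditional metric call, can be written down as a small function. The sketch is in Python rather than the gateway's C#; `rollback_failed_open` is a hypothetical name, and the step labels simply repeat the names used in the evidence line.

```python
def rollback_failed_open(session_opened_recorded: bool) -> list[str]:
    """Return the cleanup steps in the order the audit evidence lists them."""
    steps = ["MarkFaulted", "TryRemove", "DisposeAsync"]
    if session_opened_recorded:
        # Undo the open-session metric only if SessionOpened() was recorded.
        steps.append("SessionRemoved")
    steps += ["ReleaseSessionSlot", "Fault metric", "LogWarning", "rethrow"]
    return steps


late_failure = rollback_failed_open(session_opened_recorded=True)
early_failure = rollback_failed_open(session_opened_recorded=False)
```

The only difference between the two paths is the `SessionRemoved` step, which sits before `ReleaseSessionSlot` when it occurs.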
---
DOC / LINES / 193-195
CLAIM: "`GatewaySession` also exposes typed bulk helpers (`AddItemBulkAsync`, `SubscribeBulkAsync`, etc.) that wrap `WorkerCommand` round-trips and translate non-`Ok` `ProtocolStatus` replies into `SessionManagerException` with `SessionNotReady`."
CLAIM_TYPE: term
VERDICT: accurate
EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Sessions/GatewaySession.cs:490, 590 (AddItemBulkAsync, SubscribeBulkAsync) and :1017-1023 (ProtocolStatusCode.Ok guard throwing SessionManagerException(SessionNotReady)).
CODE_AREA: session.lifecycle
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC / LINES / 195-197
CLAIM: "Event streaming uses `AttachEventSubscriber` which returns a disposable lease. When `allowMultipleSubscribers` is false the second attach throws `EventSubscriberAlreadyActive`; this prevents two gRPC streams from racing on the same worker event channel. Active event subscribers keep the session lease from expiring until the stream is disposed."
CLAIM_TYPE: behavior-rule
VERDICT: accurate
EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Sessions/GatewaySession.cs:387-407 (AttachEventSubscriber guard and lease) and :373-380 (IsLeaseExpired checks `_activeEventSubscriberCount == 0`).
CODE_AREA: session.subscriber
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC / LINES / 197
CLAIM: "Sessions open with `MxGateway:Sessions:DefaultLeaseSeconds` (default 1800)"
CLAIM_TYPE: config-key
VERDICT: accurate
EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Configuration/SessionOptions.cs:21 — `public int DefaultLeaseSeconds { get; init; } = 1800`.
CODE_AREA: session.lifecycle
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC / LINES / 197
CLAIM: "`SessionLeaseMonitorHostedService` runs that sweep every `MxGateway:Sessions:LeaseSweepIntervalSeconds` seconds (default 30)."
CLAIM_TYPE: config-key
VERDICT: accurate
EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Configuration/SessionOptions.cs:24 — `public int LeaseSweepIntervalSeconds { get; init; } = 30`; src/ZB.MOM.WW.MxGateway.Server/Sessions/SessionLeaseMonitorHostedService.cs:19 — `TimeSpan.FromSeconds(Math.Max(1, options.Value.Sessions.LeaseSweepIntervalSeconds))`.
CODE_AREA: session.lifecycle
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC / LINES / 230
CLAIM: "`GatewaySession.KillWorker` is the unconditional forced-close path used by shutdown when graceful close itself throws, and also by `SessionManager.KillWorkerAsync` — the explicit kill path that the dashboard's admin Kill button invokes."
CLAIM_TYPE: behavior-rule
VERDICT: stale
EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Sessions/SessionManager.cs:233 — `KillWorkerAsync` now calls `session.KillWorkerWithCloseGateAsync` (not `session.KillWorker`). The shutdown fallback (line 319) also routes through `KillWorkerAsync` rather than calling `session.KillWorker` + `RemoveSessionAsync` directly. `GatewaySession.KillWorker` is still present (line 874) but is no longer the entry point from `SessionManager.KillWorkerAsync`.
CODE_AREA: session.lifecycle
SEVERITY: medium
PROPOSED_FIX: Update to reflect that `SessionManager.KillWorkerAsync` delegates to `session.KillWorkerWithCloseGateAsync` (which serializes concurrent kill/close via `_closeLock` — Server-045 fix) and that `GatewaySession.KillWorker` is now only the internal terminal action inside `KillWorkerWithCloseGateAsync`.
---
DOC / LINES / 230
CLAIM: "`KillCount` increments while `ShutdownCount` does not"
CLAIM_TYPE: term
VERDICT: wrong
EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Metrics/GatewayMetrics.cs:56-79 — no metrics named `KillCount` or `ShutdownCount` exist. The actual worker-kill metric is `mxgateway.workers.killed` (counter). The doc invents non-existent metric names.
CODE_AREA: session.metrics
SEVERITY: high
PROPOSED_FIX: Replace "KillCount increments while ShutdownCount does not" with "the `mxgateway.workers.killed` counter is incremented (via `GatewayMetrics.WorkerKilled`) while the graceful-shutdown path does not increment it".
---
DOC / LINES / 265
CLAIM: "registers the four singletons and the hosted service" (singular "the hosted service")
CLAIM_TYPE: behavior-rule
VERDICT: wrong
EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Sessions/SessionServiceCollectionExtensions.cs:14-15 — two hosted services are registered: `SessionLeaseMonitorHostedService` and `SessionShutdownHostedService`.
CODE_AREA: session.di
SEVERITY: medium
PROPOSED_FIX: Change "registers the four singletons and the hosted service" to "registers the three singletons and two hosted services (`SessionLeaseMonitorHostedService`, `SessionShutdownHostedService`)".
---
DOC / LINES / 279
CLAIM: "Registering `SessionShutdownHostedService` last ensures it is constructed after `ISessionManager` and therefore drains sessions during host stop."
CLAIM_TYPE: behavior-rule
VERDICT: stale
EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Sessions/SessionServiceCollectionExtensions.cs:14-15 — `SessionLeaseMonitorHostedService` is now registered before `SessionShutdownHostedService`. The shutdown service is still last of the two hosted services, but the reasoning in the doc no longer fully applies because construction order of hosted services relative to singletons is governed by ASP.NET Core's DI container, not purely registration order.
CODE_AREA: session.di
SEVERITY: low
PROPOSED_FIX: Update to note that two hosted services are registered in order (lease monitor first, shutdown second) and that both depend on `ISessionManager` which is registered as a singleton.
---
DOC / LINES / (none — gap)
CLAIM: (gap) `GatewaySession` holds an item registration dictionary (`_items`, keyed by `(ServerHandle, ItemHandle)`) tracking all successfully added/subscribed items. The session tracks and prunes these registrations via `TrackCommandReply`, `TryGetItemRegistration`, and the per-command `TrackItem`/`RemoveItems` helpers. This bookkeeping is undocumented.
CLAIM_TYPE: behavior-rule
VERDICT: gap
EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Sessions/GatewaySession.cs:17 (_items field), :425-481 (TrackCommandReply), :1059-1090 (TrackItem, TrackBulkItems, RemoveItems). src/ZB.MOM.WW.MxGateway.Server/Sessions/SessionItemRegistration.cs:3 (SessionItemRegistration record).
CODE_AREA: session.lifecycle
SEVERITY: low
PROPOSED_FIX: Add a subsection or paragraph noting that `GatewaySession` maintains an in-session item registry keyed by `(ServerHandle, ItemHandle)`, updated after successful `AddItem`, `AddItem2`, `AddBufferedItem`, `AddItemBulk`, `SubscribeBulk`, `RemoveItem`, `RemoveItemBulk`, and `UnsubscribeBulk` replies.
---
DOC / LINES / (none — gap)
CLAIM: (gap) `SessionOptions` exposes `AllowMultipleEventSubscribers` (default `false`). Setting it `true` is **rejected at startup** by `GatewayOptionsValidator` with the message "AllowMultipleEventSubscribers is not supported until event fan-out is implemented." This validator-level enforcement of the v1 constraint is undocumented.
CLAIM_TYPE: config-key
VERDICT: gap
EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Configuration/SessionOptions.cs:29 and src/ZB.MOM.WW.MxGateway.Server/Configuration/GatewayOptionsValidator.cs:181-184.
CODE_AREA: session.subscriber
SEVERITY: medium
PROPOSED_FIX: Add a note to the "Run" section explaining that `MxGateway:Sessions:AllowMultipleEventSubscribers` exists but is actively refused by the validator in v1; operators who set it to `true` will see a startup validation failure, not a runtime error.
---
DOC / LINES / (none — gap)
CLAIM: (gap) Gateway-restart orphan cleanup is performed by `OrphanWorkerCleanupHostedService` (wrapping `OrphanWorkerTerminator.TerminateOrphans`) on `StartAsync`, before the gateway accepts sessions. Cleanup is best-effort (a failure logs a warning but does not block startup). The `Sessions.md` doc does not mention this, yet it directly affects the "gateway restart does not reattach orphan workers" contract.
CLAIM_TYPE: behavior-rule
VERDICT: gap
EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Workers/OrphanWorkerCleanupHostedService.cs:7-30; src/ZB.MOM.WW.MxGateway.Server/Workers/OrphanWorkerTerminator.cs:49-95; src/ZB.MOM.WW.MxGateway.Server/Workers/WorkerServiceCollectionExtensions.cs:19.
CODE_AREA: session.orphan
SEVERITY: high
PROPOSED_FIX: Add a "Gateway Restart / Orphan Cleanup" section to Sessions.md (or cross-reference from Shutdown Coordination) noting that `OrphanWorkerCleanupHostedService` runs `OrphanWorkerTerminator.TerminateOrphans` on startup, kills any running worker executables matching the configured `MxGateway:Worker:ExecutablePath`, and that failures are non-fatal to startup.
---
DOC / LINES / (none — gap)
CLAIM: (gap) `SessionOptions.MaxPendingCommandsPerSession` (default 128) is passed to `WorkerClientOptions.MaxPendingCommands` during session construction. This per-session command concurrency cap is not documented in Sessions.md.
CLAIM_TYPE: config-key
VERDICT: gap
EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Configuration/SessionOptions.cs:18; src/ZB.MOM.WW.MxGateway.Server/Sessions/SessionWorkerClientFactory.cs:92.
CODE_AREA: session.lifecycle
SEVERITY: low
PROPOSED_FIX: Add a note in the "Key Types — SessionManager" or "Run" section that each session is bounded to `MxGateway:Sessions:MaxPendingCommandsPerSession` (default 128) concurrent in-flight worker commands.
---
DOC / LINES / (none — gap)
CLAIM: (gap) `GatewaySession` exposes a `KillWorkerWithCloseGateAsync` method that acquires `_closeLock` before killing, introduced to serialize concurrent close/kill callers (Server-045). This method is not mentioned; the doc describes only `KillWorker` as the unconditional kill path from `SessionManager`.
CLAIM_TYPE: term
VERDICT: gap
EVIDENCE: src/ZB.MOM.WW.MxGateway.Server/Sessions/GatewaySession.cs:896-917; src/ZB.MOM.WW.MxGateway.Server/Sessions/SessionManager.cs:233.
CODE_AREA: session.lifecycle
SEVERITY: low
PROPOSED_FIX: Mention `KillWorkerWithCloseGateAsync` in the "Close" section as the locked kill path now used by `SessionManager.KillWorkerAsync`, distinguishing it from the bare `KillWorker` still used as the internal terminal action.
+437
@@ -0,0 +1,437 @@
# Cluster 04 — Auth
Auditor: Claude Code (claude-sonnet-4-6)
Date: 2026-06-03
Docs audited: docs/Authentication.md, docs/Authorization.md, glauth.md
Code verified against: src/ZB.MOM.WW.MxGateway.Server/Security/** and Dashboard/**
---
DOC / Authentication.md / LINES 253-271
CLAIM / `AuthStoreServiceCollectionExtensions.AddSqliteAuthStore` wires services via direct `AddSingleton` calls for `IApiKeyParser`, `IApiKeySecretHasher`, `IApiKeyVerifier`, `IApiKeyStore`/`SqliteApiKeyStore`, `IApiKeyAdminStore`/`SqliteApiKeyAdminStore`, `IApiKeyAuditStore`/`SqliteApiKeyAuditStore`, `AuthSqliteConnectionFactory`, `IAuthStoreMigrator`/`SqliteAuthStoreMigrator`, `AuthStoreMigrationHostedService`.
CLAIM_TYPE / behavior-rule
VERDICT / stale
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Security/Authentication/AuthStoreServiceCollectionExtensions.cs:67 — the shared library `ZB.MOM.WW.Auth.ApiKeys` is registered via `services.AddZbApiKeyAuth(effectiveConfig, AuthenticationSectionPath)`, which owns all of those types. The local method no longer registers them individually. The doc code block is a fabricated snapshot of pre-migration code that no longer matches any method in the codebase.
CODE_AREA / auth.apikeys
SEVERITY / high
PROPOSED_FIX / Replace the Registration section code block with the actual method body from AuthStoreServiceCollectionExtensions.cs (calls AddZbApiKeyAuth, then registers CanonicalForwardingApiKeyAuditStore, SqliteCanonicalAuditStore, IAuditWriter, ApiKeyAdminCommands, ApiKeyAdminCliRunner). Remove the statement that AddSqliteAuthStore "registers the migration hosted service" — the hosted service is registered by AddZbApiKeyAuth, not by local code.
---
DOC / Authentication.md / LINES 53-68
CLAIM / `ApiKeySecretHasher` (registered behind `IApiKeySecretHasher`) hashes secrets with `HMACSHA256` keyed by a server-side pepper. The pepper is resolved by `IConfiguration` lookup against `PepperSecretName`. `ApiKeyPepperUnavailableException` is thrown when the pepper is missing.
CLAIM_TYPE / behavior-rule
VERDICT / stale
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Security/Authentication/AuthStoreServiceCollectionExtensions.cs:58 — these types (`ApiKeySecretHasher`, `IApiKeySecretHasher`, `ApiKeyPepperUnavailableException`) now live in the shared package `ZB.MOM.WW.Auth.ApiKeys` (PackageReference in .csproj line 11). The behavior is correct but the doc presents them as if they are local gateway types. The interceptor's return type is `ApiKeyVerification` not `ApiKeyVerificationResult` (AuthStoreServiceCollectionExtensions.cs context; GatewayGrpcAuthorizationInterceptor.cs:69).
CODE_AREA / auth.apikeys
SEVERITY / medium
PROPOSED_FIX / Clarify that `ApiKeySecretHasher`, `IApiKeySecretHasher`, and `ApiKeyPepperUnavailableException` are provided by the `ZB.MOM.WW.Auth.ApiKeys` shared library, not gateway-local types. Correct `ApiKeyVerificationResult` → `ApiKeyVerification` (the type returned by `IApiKeyVerifier.VerifyAsync` in the interceptor).
---
DOC / Authentication.md / LINES 72-98
CLAIM / `ApiKeyVerifier` (`IApiKeyVerifier`) step 5: "Compare hashes with `CryptographicOperations.FixedTimeEquals`." Step 6: "Record a `LastUsedUtc` timestamp via `MarkKeyUsedAsync` and return an `ApiKeyIdentity`." Code block shows `ApiKeyVerificationResult.Fail(ApiKeyVerificationFailure.SecretMismatch)` and `ApiKeyVerificationResult.Success(new ApiKeyIdentity(...))`.
CLAIM_TYPE / behavior-rule
VERDICT / stale
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Security/Authorization/GatewayGrpcAuthorizationInterceptor.cs:69 — the interceptor receives `ApiKeyVerification verification`, not `ApiKeyVerificationResult`. These types come from the shared package `ZB.MOM.WW.Auth.ApiKeys`, to which the gateway has migrated. The types, method signatures, and return types shown in the code block may have been renamed or restructured during that migration; the gateway no longer owns or contains these implementations.
CODE_AREA / auth.apikeys
SEVERITY / medium
PROPOSED_FIX / Update type names to match the shared library (`ApiKeyVerification` instead of `ApiKeyVerificationResult`). Add note that `ApiKeyVerifier` is from `ZB.MOM.WW.Auth.ApiKeys`. Verify failure enum values against the shared library.
---
DOC / Authentication.md / LINES 108-122
CLAIM / "`AuthSqliteConnectionFactory` reads `GatewayOptions.Authentication.SqlitePath`"
CLAIM_TYPE / term
VERDICT / stale
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Security/Authentication/AuthStoreServiceCollectionExtensions.cs:67 — `AuthSqliteConnectionFactory` is now registered by `AddZbApiKeyAuth` from the shared package. The doc implies it is a local type that reads the gateway's `GatewayOptions`, but it is actually from `ZB.MOM.WW.Auth.ApiKeys` and reads `ApiKeyOptions.SqlitePath` (bound from `MxGateway:Authentication` section). The behavior is equivalent but the doc is misleading about the type ownership.
CODE_AREA / auth.apikeys
SEVERITY / low
PROPOSED_FIX / Note that `AuthSqliteConnectionFactory` is from `ZB.MOM.WW.Auth.ApiKeys` and reads `ApiKeyOptions.SqlitePath` (bound via `MxGateway:Authentication:SqlitePath`).
---
DOC / Authentication.md / LINES 126-133
CLAIM / "`SqliteAuthSchema` declares table names and the current schema version as constants. Three tables are involved: `api_keys`, `api_key_audit`, `schema_version`."
CLAIM_TYPE / behavior-rule
VERDICT / stale
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Security/Authentication/AuthStoreServiceCollectionExtensions.cs:69-74 — a new `audit_event` table now exists in the same SQLite file, written by `SqliteCanonicalAuditStore`. The `api_key_audit` table is left in place but nothing writes to it once the `CanonicalForwardingApiKeyAuditStore` adapter overrides the library's audit store. The doc says only three tables; there are now at minimum four.
CODE_AREA / auth.apikeys
SEVERITY / medium
PROPOSED_FIX / Add `audit_event` as a fourth table (from `SqliteCanonicalAuditStore`). Note that `api_key_audit` is retained by the schema but is no longer written to at runtime (the `CanonicalForwardingApiKeyAuditStore` adapter redirects all writes to `audit_event` via `IAuditWriter`).
---
DOC / Authentication.md / LINES 134-153
CLAIM / "`SqliteApiKeyStore` (`IApiKeyStore`) handles the two reads needed at request time: `FindByKeyIdAsync` and `FindActiveByKeyIdAsync`. `MarkKeyUsedAsync` updates `last_used_utc` only for non-revoked rows." Shows `ApiKeyRecordReader.Read` code block with column-ordinal reader.
CLAIM_TYPE / behavior-rule
VERDICT / stale
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Security/Authentication/AuthStoreServiceCollectionExtensions.cs:67 — `SqliteApiKeyStore` is in the shared package `ZB.MOM.WW.Auth.ApiKeys`. The code block shown is from the package, not local gateway code. If the package's internal implementation has changed, the doc may be inaccurate. The doc presents this as if it is local gateway source.
CODE_AREA / auth.apikeys
SEVERITY / low
PROPOSED_FIX / Clarify that `SqliteApiKeyStore`, `ApiKeyRecord`, and `ApiKeyRecordReader` are in the shared `ZB.MOM.WW.Auth.ApiKeys` package and are not directly modifiable in this repository. Remove or label the code block as "from shared library."
---
DOC / Authentication.md / LINES 156-164
CLAIM / "`SqliteApiKeyAdminStore` (`IApiKeyAdminStore`) implements administrative mutations: `CreateAsync`, `RevokeAsync`, `RotateAsync`, `DeleteAsync`."
CLAIM_TYPE / term
VERDICT / stale
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Security/Authentication/AuthStoreServiceCollectionExtensions.cs:67 — `SqliteApiKeyAdminStore` is in `ZB.MOM.WW.Auth.ApiKeys`. The gateway now wraps admin operations through `ApiKeyAdminCommands` (from the same package), not by injecting `IApiKeyAdminStore` directly in the CLI runner. `DashboardSnapshotService` and `DashboardApiKeyManagementService` do consume `IApiKeyAdminStore` directly, which is fine.
CODE_AREA / auth.apikeys
SEVERITY / low
PROPOSED_FIX / Note that `SqliteApiKeyAdminStore` is from the shared library. Note that the gateway CLI runner delegates through `ApiKeyAdminCommands` (shared library), not by calling `IApiKeyAdminStore` directly.
---
DOC / Authentication.md / LINES 165-183
CLAIM / "`SqliteAuthStoreMigrator` executes the migration inside a single transaction so a partial failure leaves the database untouched, refuses to start when the on-disk schema version is newer than the binary supports, and idempotently creates the v1 schema." "Operators who manage schema out-of-band can disable the hosted run and use the admin CLI's `init-db` command instead."
CLAIM_TYPE / behavior-rule
VERDICT / stale
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Security/Authentication/AuthStoreServiceCollectionExtensions.cs:104 — `SqliteAuthStoreMigrator` is from `ZB.MOM.WW.Auth.ApiKeys` (resolved via `sp.GetRequiredService<SqliteAuthStoreMigrator>()`). The description of its behavior is likely still accurate but is presented as locally-owned code. `AuthStoreMigrationHostedService` is also from the shared package (registered by `AddZbApiKeyAuth`). The code block shown at lines 171-179 is from the package.
CODE_AREA / auth.apikeys
SEVERITY / low
PROPOSED_FIX / Clarify that `SqliteAuthStoreMigrator`, `IAuthStoreMigrator`, and `AuthStoreMigrationHostedService` are from the shared library.
---
DOC / Authentication.md / LINES 187-208
CLAIM / CLI subcommand table lists: `init-db`, `create-key`, `list-keys`, `revoke-key`, `rotate-key`. CLI example uses `mxgateway apikey create-key --key-id ops.alice --display-name "Alice (ops)" --scopes read,write`.
CLAIM_TYPE / command
VERDICT / wrong
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Security/Authorization/GatewayScopes.cs:5-13 — `GatewayScopes.All` contains `session:open`, `session:close`, `invoke:read`, `invoke:write`, `invoke:secure`, `events:read`, `metadata:read`, `admin`. The values `read` and `write` are not in the scope catalog. `ApiKeyAdminCommandLineParser.ValidateScopes` at lines 170-177 would reject `--scopes read,write` as unknown scopes.
CODE_AREA / auth.scopes
SEVERITY / high
PROPOSED_FIX / Replace `--scopes read,write` with valid scope strings, e.g. `--scopes invoke:read,invoke:write`. Update all CLI examples in Authentication.md to use canonical scope strings from `GatewayScopes.All`.
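The rejection described in this finding can be illustrated with a minimal C# sketch. `ScopeCheck` and `FindUnknownScopes` are hypothetical names introduced here for illustration; they are not the shared library's `ApiKeyAdminCommandLineParser.ValidateScopes`. The scope strings are copied from the EVIDENCE field, and ordinal (case-sensitive) matching is an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration only; not the gateway's real parser.
public static class ScopeCheck
{
    // Canonical scope strings as listed in the EVIDENCE field of this finding.
    public static readonly IReadOnlySet<string> Canonical =
        new HashSet<string>(StringComparer.Ordinal)
        {
            "session:open", "session:close",
            "invoke:read", "invoke:write", "invoke:secure",
            "events:read", "metadata:read", "admin",
        };

    // Returns the entries of a comma-separated --scopes value that are
    // not in the canonical set. Assumes case-sensitive matching.
    public static IReadOnlyList<string> FindUnknownScopes(string scopesArgument) =>
        scopesArgument
            .Split(',', StringSplitOptions.RemoveEmptyEntries | StringSplitOptions.TrimEntries)
            .Where(scope => !Canonical.Contains(scope))
            .ToList();
}
```

Under these assumptions, `ScopeCheck.FindUnknownScopes("read,write")` returns both entries, while `ScopeCheck.FindUnknownScopes("invoke:read,invoke:write")` returns an empty list.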
---
DOC / Authentication.md / LINES 229-248
CLAIM / "`ApiKeyScopeSerializer.Serialize` writes a JSON array sorted with `StringComparer.Ordinal`." Code block shown.
CLAIM_TYPE / behavior-rule
VERDICT / stale
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Security/Authentication/AuthStoreServiceCollectionExtensions.cs:5 — `ApiKeyScopeSerializer` is from the shared `ZB.MOM.WW.Auth.ApiKeys` package. The behavior described is likely correct but is presented as local gateway code.
CODE_AREA / auth.apikeys
SEVERITY / low
PROPOSED_FIX / Note that `ApiKeyScopeSerializer` is in the shared `ZB.MOM.WW.Auth.ApiKeys` library.
---
DOC / Authorization.md / LINES 107-113
CLAIM / Scope resolver code block includes `TestConnectionRequest or GetLastDeployTimeRequest or DiscoverHierarchyRequest or WatchDeployEventsRequest => GatewayScopes.MetadataRead`.
CLAIM_TYPE / rpc/proto
VERDICT / stale
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Security/Authorization/GatewayGrpcScopeResolver.cs:23-28 — the actual resolver also includes `BrowseChildrenRequest => GatewayScopes.MetadataRead` in the same arm. `BrowseChildrenRequest` was added (per docs/plans/2026-05-28-lazy-browse-implementation.md) but the code block in Authorization.md was not updated.
CODE_AREA / auth.scopes
SEVERITY / high
PROPOSED_FIX / Add `BrowseChildrenRequest` to the `MetadataRead` arm of the scope resolver code block. Update the scope catalog table at line 212 to include `GalaxyRepository.BrowseChildren` in the `MetadataRead` row.
---
DOC / Authorization.md / LINE 212
CLAIM / Scope catalog table row: `MetadataRead` / `metadata:read` / "`MxCommandKind.ArchestraUserToId`, `MxCommandKind.GetSessionState`, `MxCommandKind.GetWorkerInfo`, `GalaxyRepository.TestConnection`, `GalaxyRepository.GetLastDeployTime`, `GalaxyRepository.DiscoverHierarchy`, `GalaxyRepository.WatchDeployEvents`".
CLAIM_TYPE / rpc/proto
VERDICT / stale
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Security/Authorization/GatewayGrpcScopeResolver.cs:27 — `BrowseChildrenRequest` is also mapped to `metadata:read` but is absent from the table.
CODE_AREA / auth.scopes
SEVERITY / high
PROPOSED_FIX / Add `GalaxyRepository.BrowseChildren` to the `MetadataRead` row of the scope catalog table.
---
DOC / Authorization.md / LINES 260-270
CLAIM / Registration code block for `AddGatewayGrpcAuthorization` shows three `AddSingleton` calls: `GatewayGrpcScopeResolver`, `IGatewayRequestIdentityAccessor`/`GatewayRequestIdentityAccessor`, `GatewayGrpcAuthorizationInterceptor`, then `AddGrpc`.
CLAIM_TYPE / behavior-rule
VERDICT / stale
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Security/Authorization/GrpcAuthorizationServiceCollectionExtensions.cs:18-31 — the actual method also registers `IConstraintEnforcer`/`ConstraintEnforcer` as a singleton (line 20) and configures `GrpcServiceOptions` with `MaxReceiveMessageSize`/`MaxSendMessageSize` from `MxGateway:Protocol`. The doc code block omits both.
CODE_AREA / auth.scopes
SEVERITY / medium
PROPOSED_FIX / Update the Registration code block to include `services.AddSingleton<IConstraintEnforcer, ConstraintEnforcer>()` and the `AddOptions<GrpcServiceOptions>` configuration block for message size limits.
---
DOC / Authorization.md / LINE 273
CLAIM / "none of the three classes hold per-request state on instance fields"
CLAIM_TYPE / behavior-rule
VERDICT / stale
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Security/Authorization/GrpcAuthorizationServiceCollectionExtensions.cs:20 — there are now four singleton classes registered by `AddGatewayGrpcAuthorization` (`GatewayGrpcScopeResolver`, `GatewayRequestIdentityAccessor`, `GatewayGrpcAuthorizationInterceptor`, `ConstraintEnforcer`), not three.
CODE_AREA / auth.scopes
SEVERITY / low
PROPOSED_FIX / Update "three classes" to "four classes."
---
DOC / glauth.md / LINES 63-66
CLAIM / "`LdapOptions.RequiredGroup` defaults to `GwAdmin`, so the dashboard login and `DashboardLdapLiveTests` require `admin` to be a member of a `GwAdmin` group."
CLAIM_TYPE / config-key
VERDICT / wrong
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Configuration/LdapOptions.cs — no `RequiredGroup` field exists on the gateway's `LdapOptions`. The gateway enforces group membership via `MxGateway:Dashboard:GroupToRole` (a dictionary mapping LDAP group names to dashboard roles) in `DashboardOptions`. Authorization succeeds if the user's LDAP groups map to at least one role — there is no `RequiredGroup` concept in the current architecture.
CODE_AREA / auth.ldap
SEVERITY / high
PROPOSED_FIX / Remove the sentence "`LdapOptions.RequiredGroup` defaults to `GwAdmin`." Replace with: the dashboard enforces that at least one of the user's LDAP groups appears in `MxGateway:Dashboard:GroupToRole` (e.g. `GwAdmin: Administrator`); a login with no matching group is rejected. `DashboardLdapLiveTests` seeds the role map with `GwAdmin -> Administrator`.
---
DOC / glauth.md / LINES 181-182
CLAIM / "the authenticator strips to `GwAdmin` and matches against `RequiredGroup`"
CLAIM_TYPE / behavior-rule
VERDICT / wrong
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Dashboard/DashboardGroupRoleMapping.cs:35-48 — the shared `ILdapAuthService` strips the leading RDN value from each group DN, and the gateway's `DashboardGroupRoleMapper` looks up the short name in `GroupToRole`. There is no `RequiredGroup` property or concept anywhere in the codebase.
CODE_AREA / auth.ldap
SEVERITY / high
PROPOSED_FIX / Replace "matches against `RequiredGroup`" with "looks up the short RDN name (e.g. `GwAdmin`) in `MxGateway:Dashboard:GroupToRole`."
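The strip-then-look-up flow described in this finding can be sketched as follows. This is an illustrative approximation, not the code of `ILdapAuthService` or `DashboardGroupRoleMapper`: `GroupRoleSketch`, `LeadingRdnValue`, and `ResolveRoles` are hypothetical names, the DN parsing is deliberately naive (no escaped commas or multi-valued RDNs), and the key comparison follows whatever comparer the supplied dictionary uses.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch; names and parsing rules are assumptions.
public static class GroupRoleSketch
{
    // Extracts the value of the leading RDN,
    // e.g. "ou=GwAdmin,ou=groups,dc=zb,dc=local" yields "GwAdmin".
    public static string LeadingRdnValue(string groupDn)
    {
        string firstRdn = groupDn.Split(',', 2)[0];
        int equalsIndex = firstRdn.IndexOf('=');
        string value = equalsIndex >= 0 ? firstRdn.Substring(equalsIndex + 1) : firstRdn;
        return value.Trim();
    }

    // Collects the distinct roles for every group whose short name is a key
    // in the GroupToRole map. An empty result means no group matched.
    public static IReadOnlyList<string> ResolveRoles(
        IEnumerable<string> groupDns,
        IReadOnlyDictionary<string, string> groupToRole)
    {
        var roles = new List<string>();
        foreach (string dn in groupDns)
        {
            if (groupToRole.TryGetValue(LeadingRdnValue(dn), out var role) && !roles.Contains(role))
            {
                roles.Add(role);
            }
        }
        return roles;
    }
}
```

With a role map of `GwAdmin -> Administrator` and `GwReader -> Viewer`, the DN `ou=GwAdmin,ou=groups,dc=zb,dc=local` resolves to `Administrator`; a user whose groups match no key gets an empty list, which corresponds to the rejected login described above.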
---
DOC / glauth.md / LINES 113-136
CLAIM / "Suggested mxgw configuration shape" YAML block uses config keys `useTls`, `allowInsecureLdap`, `userNameAttribute`.
CLAIM_TYPE / config-key
VERDICT / wrong
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Configuration/LdapOptions.cs:49,52,64 — the current config keys (as bound by the shared `LdapOptions` and the gateway's shadow `LdapOptions`) are `Transport` (an enum: `None`/`Ldaps`/`StartTls`), `AllowInsecure` (bool), `UserNameAttribute` (string, default `"cn"` not `"uid"`). The YAML block uses stale camelCase key names from a pre-migration configuration shape.
CODE_AREA / auth.ldap
SEVERITY / high
PROPOSED_FIX / Update the YAML config example to use `Transport: None` (or `Ldaps`/`StartTls`) instead of `useTls: false`, `AllowInsecure: true` instead of `allowInsecureLdap: true`, and `UserNameAttribute: "cn"` instead of `userNameAttribute: "uid"` (`cn` is the gateway default; GLAuth populates both `cn` and `uid`). Rename the section header from `ldap:` to `MxGateway: Ldap:` to match the actual config path.
---
DOC / glauth.md / LINE 128
CLAIM / `userNameAttribute: "uid" # GLAuth populates this; AD uses sAMAccountName`
CLAIM_TYPE / config-key
VERDICT / wrong
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Configuration/LdapOptions.cs:64 — the gateway `LdapOptions` default for `UserNameAttribute` is `"cn"`, not `"uid"`. GLAuth does populate both `uid` and `cn`, but the gateway ships `"cn"` as default.
CODE_AREA / auth.ldap
SEVERITY / medium
PROPOSED_FIX / Change example to `UserNameAttribute: "cn"` with a note that the gateway default is `cn`; to use `uid` instead set `MxGateway:Ldap:UserNameAttribute: uid`.
---
DOC / glauth.md / LINES 261-269
CLAIM / AD migration cheat-sheet uses field names `UseTls` and `AllowInsecureLdap`.
CLAIM_TYPE / config-key
VERDICT / wrong
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Configuration/LdapOptions.cs:49,52 — these fields were renamed: `UseTls` -> `Transport` (enum), `AllowInsecureLdap` -> `AllowInsecure`.
CODE_AREA / auth.ldap
SEVERITY / high
PROPOSED_FIX / Update the AD migration table: rename `UseTls` row to `Transport` (GLAuth dev value: `None`, AD value: `Ldaps`); rename `AllowInsecureLdap` row to `AllowInsecure` (GLAuth dev: `true`, AD: `false`).
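The rename can be pictured as a translation from the old boolean pair to the new shape. The sketch below is hypothetical throughout: `LdapTransport`, `LdapSettingsSketch`, and `LegacyLdapKeys` are stand-in names, not the shared library's types, and mapping a legacy `UseTls = true` to `Ldaps` (rather than `StartTls`) is an assumption chosen only because the table's AD value is `Ldaps`.

```csharp
using System;

// Stand-in for the Transport enum values named in this cluster (None/Ldaps/StartTls).
public enum LdapTransport { None, Ldaps, StartTls }

// Hypothetical shape mirroring the renamed keys; not the real LdapOptions.
public sealed record LdapSettingsSketch(
    LdapTransport Transport,
    bool AllowInsecure,
    string UserNameAttribute);

public static class LegacyLdapKeys
{
    // Translates the pre-migration keys to the renamed ones.
    // ASSUMPTION: legacy UseTls=true is treated as Ldaps, never StartTls.
    public static LdapSettingsSketch FromLegacy(
        bool useTls,
        bool allowInsecureLdap,
        string userNameAttribute = "cn") =>
        new LdapSettingsSketch(
            useTls ? LdapTransport.Ldaps : LdapTransport.None,
            allowInsecureLdap,
            userNameAttribute);
}
```

For the GLAuth dev row, `LegacyLdapKeys.FromLegacy(useTls: false, allowInsecureLdap: true)` yields `Transport = None`, `AllowInsecure = true`; the AD row corresponds to `FromLegacy(useTls: true, allowInsecureLdap: false)`.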
---
DOC / CLAUDE.md / LINE 119
CLAIM / "maps the user's LDAP groups to `Admin` or `Viewer` via `MxGateway:Dashboard:GroupToRole`, then issues an HTTP-only secure `__Host-MxGatewayDashboard` cookie"
CLAIM_TYPE / term
VERDICT / wrong
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Dashboard/DashboardAuthenticationDefaults.cs:38 — the cookie name constant is `CookieName = "MxGatewayDashboard"` (no `__Host-` prefix). `__Host-` is a browser security prefix that requires `Path=/`, no `Domain`, and `Secure` — the code sets `Path = "/"` and `SecurePolicy = Always` by default, satisfying the requirements, but the actual cookie name in the constant and in `ZbCookieDefaults.Apply` is `MxGatewayDashboard`, not `__Host-MxGatewayDashboard`. Additionally, `Admin` should be `Administrator` (the renamed role value per `DashboardRoles.Admin = "Administrator"`).
CODE_AREA / auth.cookie
SEVERITY / high
PROPOSED_FIX / Change `__Host-MxGatewayDashboard` to `MxGatewayDashboard` in CLAUDE.md. Change `Admin` to `Administrator`.
---
DOC / CLAUDE.md / LINE 119
CLAIM / "maps the user's LDAP groups to `Admin` or `Viewer`"
CLAIM_TYPE / term
VERDICT / wrong
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Dashboard/DashboardRoles.cs:14 — `DashboardRoles.Admin = "Administrator"` (not `"Admin"`). The role value was renamed in Task 1.7. CLAUDE.md was not updated.
CODE_AREA / auth.roles
SEVERITY / high
PROPOSED_FIX / Change `Admin` to `Administrator` in the CLAUDE.md authentication paragraph.
---
DOC / CLAUDE.md / LINE 35
CLAIM / `dotnet run --project src/MxGateway.Server/MxGateway.Server.csproj -- apikey create --display-name "dev" --scopes session,invoke,event,metadata,admin`
CLAIM_TYPE / command
VERDICT / wrong
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Security/Authorization/GatewayScopes.cs:5-13 — canonical scopes are `session:open`, `session:close`, `invoke:read`, `invoke:write`, `invoke:secure`, `events:read`, `metadata:read`, `admin`. The shorthand values `session`, `invoke`, `event`, `metadata` are not recognized and would be rejected by `ApiKeyAdminCommandLineParser.ValidateScopes` as unknown scopes. Also, the subcommand is `create-key` not `create`.
CODE_AREA / auth.scopes
SEVERITY / high
PROPOSED_FIX / Replace the example with a valid invocation, e.g.: `dotnet run --project src/MxGateway.Server/MxGateway.Server.csproj -- apikey create-key --key-id dev --display-name "dev" --scopes session:open,session:close,invoke:read,invoke:write,events:read,metadata:read,admin`
---
DOC / CLAUDE.md / LINE 117
CLAIM / "Keys are stored hashed (with a peppered SHA) in a gateway-owned SQLite DB (default `C:\ProgramData\MxGateway\gateway-auth.db`). Scopes (`session`, `invoke`, `event`, `metadata`, `admin`) gate specific RPCs"
CLAIM_TYPE / config-key
VERDICT / wrong
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Configuration/AuthenticationOptions.cs:9 — SQLite path default is correct. However, scope names `session`, `invoke`, `event`, `metadata` are not the canonical scope strings. Actual scopes are `session:open`, `session:close`, `invoke:read`, `invoke:write`, `invoke:secure`, `events:read`, `metadata:read`, `admin`.
CODE_AREA / auth.scopes
SEVERITY / high
PROPOSED_FIX / Replace the scope shorthand list with the full canonical scope strings from `GatewayScopes.All`. The SQLite path is accurate and should be kept.
---
DOC / glauth.md / LINES 70-74
CLAIM / "> **Dashboard role value (Task 1.7):** the LDAP `GwAdmin` group now maps to the canonical dashboard role **`Administrator`** (was `Admin`); `GwReader` maps to `Viewer`."
CLAIM_TYPE / term
VERDICT / accurate
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Dashboard/DashboardRoles.cs:14 — `DashboardRoles.Admin = "Administrator"`, `DashboardRoles.Viewer = "Viewer"`. src/ZB.MOM.WW.MxGateway.Server/appsettings.json:63-64 confirms `"GwAdmin": "Administrator"`, `"GwReader": "Viewer"`.
CODE_AREA / auth.roles
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / glauth.md / LINES 21-26
CLAIM / Connection details: Protocol LDAP, Host `localhost`, Port `3893`, Base DN `dc=zb,dc=local`, Bind DN format `cn={username},dc=zb,dc=local`, Group OU `ou=<groupname>,ou=groups,dc=zb,dc=local`.
CLAIM_TYPE / config-key
VERDICT / accurate
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Configuration/LdapOptions.cs:36,39,55,58 — defaults: `Server=localhost`, `Port=3893`, `SearchBase=dc=zb,dc=local`, `ServiceAccountDn=cn=serviceaccount,dc=zb,dc=local`.
CODE_AREA / auth.ldap
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / docs/Authentication.md / LINES 1-30
CLAIM / Token format `mxgw_<keyId>_<secret>`, prefix `mxgw_`, parser is `ApiKeyParser` behind `IApiKeyParser`.
CLAIM_TYPE / term
VERDICT / accurate
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Security/Authentication/AuthStoreServiceCollectionExtensions.cs:30,33 — `TokenPrefix = "mxgw"`, `PepperSecretName = "MxGateway:ApiKeyPepper"`. The token format claim is accurate; `IApiKeyParser`/`ApiKeyParser` are from the shared package but the behavior description matches.
CODE_AREA / auth.apikeys
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / docs/Authentication.md / LINE 110
CLAIM / "`AuthSqliteConnectionFactory` reads `GatewayOptions.Authentication.SqlitePath`"
CLAIM_TYPE / config-key
VERDICT / accurate
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Configuration/AuthenticationOptions.cs:9 — `SqlitePath` default is `C:\ProgramData\MxGateway\gateway-auth.db`. The factory reads from `ApiKeyOptions.SqlitePath` which is bound from `MxGateway:Authentication:SqlitePath`, so the effective config key path matches `GatewayOptions.Authentication.SqlitePath`.
CODE_AREA / auth.apikeys
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / docs/Authentication.md / LINES 189-208
CLAIM / CLI subcommands: `init-db`, `create-key`, `list-keys`, `revoke-key`, `rotate-key`.
CLAIM_TYPE / command
VERDICT / accurate
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Security/Authentication/ApiKeyAdminCommandKind.cs — enum has `InitDb`, `CreateKey`, `ListKeys`, `RevokeKey`, `RotateKey`. ApiKeyAdminCommandLineParser.cs maps these to exactly those string values.
CODE_AREA / auth.apikeys
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / docs/Authentication.md / LINES 220-225
CLAIM / "Every destructive dashboard action is gated by a confirmation dialog and emits its own audit event (`dashboard-create-key`, `dashboard-rotate-key`, `dashboard-revoke-key`, `dashboard-delete-key`)."
CLAIM_TYPE / behavior-rule
VERDICT / accurate
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Dashboard/DashboardApiKeyManagementService.cs:69,201 — audit event strings `dashboard-create-key` and `dashboard-delete-key` confirmed in code.
CODE_AREA / auth.apikeys
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / docs/Authorization.md / LINES 94-116
CLAIM / Scope resolver switches on request type; `_ => GatewayScopes.Admin` fallback for unrecognized types.
CLAIM_TYPE / behavior-rule
VERDICT / accurate
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Security/Authorization/GatewayGrpcScopeResolver.cs:13-29 — the pattern and fallback match exactly.
CODE_AREA / auth.scopes
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / docs/Authorization.md / LINE 85
CLAIM / "If `GatewayOptions.Authentication.Mode` is `AuthenticationMode.Disabled`, the helper returns `null` immediately. No identity is pushed onto the accessor and the continuation runs without scope enforcement. This matches the `AuthenticationMode` enum, which only defines `ApiKey` and `Disabled`."
CLAIM_TYPE / behavior-rule
VERDICT / accurate
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Security/Authorization/GatewayGrpcAuthorizationInterceptor.cs:59 — confirmed.
CODE_AREA / auth.apikeys
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / docs/Authorization.md / LINE 215
CLAIM / "The `Admin` constant is also referenced by `DashboardAuthenticator` and `DashboardAuthorizationHandler` so that the dashboard and the gRPC layer agree on what 'admin' means."
CLAIM_TYPE / behavior-rule
VERDICT / stale
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Dashboard/DashboardAuthenticator.cs — `DashboardAuthenticator` does not reference `GatewayScopes.Admin`. The `admin` gRPC scope and the `Administrator` dashboard role are separate concepts. The dashboard authorization policy uses `DashboardRoles.Admin = "Administrator"`, not `GatewayScopes.Admin = "admin"`. These are distinct and do not share a constant.
CODE_AREA / auth.roles
SEVERITY / medium
PROPOSED_FIX / Correct or remove the claim that `GatewayScopes.Admin` is referenced by `DashboardAuthenticator`. The dashboard and gRPC "admin" are deliberately separate concepts — the dashboard role is `Administrator` (a role claim value on the ClaimsPrincipal), while the gRPC scope is the literal string `"admin"` (a scope string on ApiKeyIdentity).
---
DOC / docs/Authorization.md / LINE 116
CLAIM / "`AcknowledgeAlarm` is treated as a write — it mutates alarm state, mirroring `MxCommandKind.Write*` — and `StreamAlarms` shares the alarm/event surface with `StreamEvents` and `MxCommandKind.DrainEvents`, so it carries `events:read`. Both alarm RPCs are session-less."
CLAIM_TYPE / behavior-rule
VERDICT / accurate
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Security/Authorization/GatewayGrpcScopeResolver.cs:21,22 — `AcknowledgeAlarmRequest => GatewayScopes.InvokeWrite`, `StreamAlarmsRequest => GatewayScopes.EventsRead`. Confirmed.
CODE_AREA / auth.scopes
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / docs/Authorization.md / LINES 205-215
CLAIM / Scope catalog table — all scope strings and their `Required For` mappings.
CLAIM_TYPE / rpc/proto
VERDICT / stale
EVIDENCE / GatewayGrpcScopeResolver.cs:27 — `BrowseChildrenRequest` is missing from the `MetadataRead` row (already captured above). All other rows are accurate.
CODE_AREA / auth.scopes
SEVERITY / high
PROPOSED_FIX / (Same as finding above — add `GalaxyRepository.BrowseChildren` to `MetadataRead` row.)
---
## GAP FINDINGS (auth behavior in code but undocumented)
DOC / (none — gap)
CLAIM / `DashboardAuthenticationDefaults.CookieName` is the default cookie name `"MxGatewayDashboard"`, but `DashboardOptions.CookieName` allows a per-deployment override via `MxGateway:Dashboard:CookieName`. Auth docs do not mention this override.
CLAIM_TYPE / config-key
VERDICT / gap
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Dashboard/DashboardServiceCollectionExtensions.cs:91-97, src/ZB.MOM.WW.MxGateway.Server/Configuration/DashboardOptions.cs:33.
CODE_AREA / auth.cookie
SEVERITY / medium
PROPOSED_FIX / Add documentation of `MxGateway:Dashboard:CookieName` override and when to use it (multiple gateway instances sharing a hostname).
---
DOC / (none — gap)
CLAIM / The dashboard cookie idle timeout is 8 hours (set by `ZbCookieDefaults.Apply` with `idleTimeout: TimeSpan.FromHours(8)`). The hub bearer token expires in 30 minutes (`HubTokenService.TokenLifetime = TimeSpan.FromMinutes(30)`). Neither timeout is documented in Authentication.md.
CLAIM_TYPE / behavior-rule
VERDICT / gap
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Dashboard/DashboardServiceCollectionExtensions.cs:66, src/ZB.MOM.WW.MxGateway.Server/Dashboard/HubTokenService.cs:29.
CODE_AREA / auth.hub
SEVERITY / medium
PROPOSED_FIX / Add a section in Authentication.md (or GatewayDashboardDesign.md) documenting the 8-hour dashboard cookie idle timeout and the 30-minute hub bearer token lifetime.
---
DOC / (none — gap)
CLAIM / The `CanonicalForwardingApiKeyAuditStore` overrides the shared library's `IApiKeyAuditStore`. As a result, the `api_key_audit` table in the SQLite DB is written by the shared library's migration but is NOT written to at runtime — all audit records go to `audit_event` via `IAuditWriter`. This is operationally important for anyone reading the DB directly but is not documented.
CLAIM_TYPE / behavior-rule
VERDICT / gap
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Security/Authentication/AuthStoreServiceCollectionExtensions.cs:85-94, src/ZB.MOM.WW.MxGateway.Server/Security/Audit/CanonicalForwardingApiKeyAuditStore.cs.
CODE_AREA / auth.apikeys
SEVERITY / medium
PROPOSED_FIX / Document in Authentication.md that `api_key_audit` exists in the schema but is unused at runtime; all audit events flow to `audit_event` via `IAuditWriter`/`SqliteCanonicalAuditStore`.
---
DOC / (none — gap)
CLAIM / `DashboardOptions.RequireHttpsCookie` (default `true`) controls whether the dashboard cookie uses `SecurePolicy.Always` or `SameAsRequest`. Setting it `false` is required for plain-HTTP dev deployments. This config key is not mentioned in auth docs.
CLAIM_TYPE / config-key
VERDICT / gap
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Configuration/DashboardOptions.cs:22, src/ZB.MOM.WW.MxGateway.Server/Dashboard/DashboardServiceCollectionExtensions.cs:87.
CODE_AREA / auth.cookie
SEVERITY / low
PROPOSED_FIX / Reference `MxGateway:Dashboard:RequireHttpsCookie` in the auth cookie documentation.
---
DOC / (none — gap)
CLAIM / `ZbClaimTypes` and `ZbCookieDefaults` (from `ZB.MOM.WW.Auth.AspNetCore` package) are now used for claim and cookie setup. Authentication.md does not mention the shared library claim types (`zb:username`, `zb:displayname`) or that cookie hardening defaults come from `ZbCookieDefaults.Apply`.
CLAIM_TYPE / behavior-rule
VERDICT / gap
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Dashboard/DashboardAuthenticator.cs:111-115, src/ZB.MOM.WW.MxGateway.Server/Dashboard/DashboardServiceCollectionExtensions.cs:66.
CODE_AREA / auth.cookie
SEVERITY / low
PROPOSED_FIX / Add a brief note in dashboard auth documentation about `ZbClaimTypes` (`zb:username`, `zb:displayname`, `zb:name`, `zb:role`) and `ZbCookieDefaults.Apply` providing cookie security defaults.
+332
@@ -0,0 +1,332 @@
# Cluster 05 — Dashboard
Audited docs: `docs/DashboardInterfaceDesign.md`, `docs/GatewayDashboardDesign.md`
Verified against: `src/ZB.MOM.WW.MxGateway.Server/Dashboard/**`, `src/ZB.MOM.WW.MxGateway.Server/wwwroot/**`
Audit date: 2026-06-03
---
DOC / DashboardInterfaceDesign.md / LINES / 39-57
CLAIM / "The shell does not use a sidebar. A horizontal navigation bar is enough…" with a `<div class="dashboard-shell">` / `<nav class="navbar navbar-expand-lg bg-body border-bottom dashboard-navbar">` HTML skeleton
CLAIM_TYPE / behavior-rule
VERDICT / stale
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Dashboard/Components/Layout/MainLayout.razor:7 — layout now delegates to `<ThemeShell Product="MXAccess Gateway" Accent="#2f5fd0">` with `NavRailSection`/`NavRailItem` kit components; there is no local navbar, no `dashboard-shell` class, no `dashboard-navbar` class, and no `container-fluid` content area anywhere in the codebase
CODE_AREA / dashboard.theme
SEVERITY / high
PROPOSED_FIX / Replace the HTML skeleton and prose description with the current ThemeShell side-rail pattern (`<ThemeShell>` -> `<Nav>` -> `<NavRailSection>` / `<NavRailItem>`). Update the note about "horizontal navigation bar" — the nav is now a collapsible side rail managed by the ZB.MOM.WW.Theme kit.
---
DOC / DashboardInterfaceDesign.md / LINES / 115-123
CLAIM / Navigation uses `NavLink` and labels: `Overview`, `Sessions`, `Workers`, `Events`, `Settings`
CLAIM_TYPE / behavior-rule
VERDICT / stale
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Dashboard/Components/Layout/MainLayout.razor:9-23 — actual nav items are `Dashboard` (not "Overview"), `Sessions`, `Workers`, `Events`, `Alarms`, `Repository`, `Browse`, `API Keys`, `Settings`; the old flat list of five labels has been replaced by three grouped NavRailSections (`Runtime`, `Galaxy`, `Admin`) with eight leaf items
CODE_AREA / dashboard.theme
SEVERITY / high
PROPOSED_FIX / Update the nav-label list to match the current ThemeShell Nav: Dashboard / [Runtime: Sessions, Workers, Events, Alarms] / [Galaxy: Repository, Browse] / [Admin: API Keys, Settings].
---
DOC / DashboardInterfaceDesign.md / LINES / 63-79
CLAIM / Four local CSS tokens: `--mxgw-surface: #f7f8fa`, `--mxgw-border: #d8dee6`, `--mxgw-ink-muted: #667085`, `--mxgw-accent: #146c64`
CLAIM_TYPE / config-key
VERDICT / wrong
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/wwwroot/css/site.css:17 — no `--mxgw-*` tokens are defined anywhere in `site.css`; the file's own header states "Every colour … resolves to a theme.css token — no hard-coded hex." All colour expressions in site.css use theme kit tokens: `var(--card)`, `var(--rule)`, `var(--ink)`, `var(--ink-faint)`, `var(--accent)`, `var(--accent-deep)`, `var(--bad)`, `var(--bad-bg)`, etc.
CODE_AREA / dashboard.css
SEVERITY / high
PROPOSED_FIX / Remove the `--mxgw-*` token table entirely. Replace with a note that the dashboard's view-layer CSS (`site.css`) resolves all colour via the `ZB.MOM.WW.Theme` kit tokens (`--card`, `--rule`, `--ink`, `--ink-faint`, `--accent`, `--accent-deep`, etc.) and defines no local colour tokens.
---
DOC / DashboardInterfaceDesign.md / LINES / 87-97
CLAIM / Page headings use `1.35rem`, weight `650`. Metric labels use `uppercase text at .78rem` and weight `650`. Metric values use `1.7rem`, weight `700`, and the accent color.
CLAIM_TYPE / behavior-rule
VERDICT / stale
EVIDENCE / site.css:30-31 (page h1: `font-size: 1.15rem; font-weight: 600`), site.css:64-65 (agg-label: `font-size: 0.68rem; font-weight: 600`), site.css:76-77 (agg-value: `font-size: 1.5rem; font-weight: 600; color: var(--ink)`) — page headings are `1.15rem/600` (not `1.35rem/650`); metric values are `1.5rem/600` in `var(--ink)` (not `1.7rem/700` in accent color)
CODE_AREA / dashboard.css
SEVERITY / medium
PROPOSED_FIX / Update typography table: h1 → 1.15rem/600, agg-label → 0.68rem/600/uppercase, agg-value → 1.5rem/600/var(--ink). Note metric values render in ink (not the accent colour) per the post-theme-migration design.
---
DOC / DashboardInterfaceDesign.md / LINES / 99-111
CLAIM / Page content has `1.25rem` padding on desktop and `.75rem` on small screens. Metric grids use `.75rem` gaps. Cards and empty states use Bootstrap's small radius `.375rem`. Content sections start with a top border and `1rem` top padding.
CLAIM_TYPE / behavior-rule
VERDICT / stale
EVIDENCE / site.css:272-279 (`@media (max-width: 700px)` sets `.page { padding: 0.85rem }`; no 1.25rem desktop padding rule exists in site.css). site.css:59 (`border-radius: 8px` for `.agg-card`, not `.375rem`). site.css:59 (`box-shadow: none` but no top-border-only sections — `.dashboard-section` at lines 91-100 is a raised card with `border: 1px solid var(--rule); border-radius: 8px`).
CODE_AREA / dashboard.css
SEVERITY / low
PROPOSED_FIX / Update spacing table: small-screen padding is 0.85rem; cards use 8px radius; sections are full-border raised cards (not top-border-only dividers).
---
DOC / DashboardInterfaceDesign.md / LINES / 153-168
CLAIM / `metric-grid` uses `repeat(auto-fit, minmax(12rem, 1fr))` and `compact` variant uses `minmax(10rem, 1fr)`
CLAIM_TYPE / behavior-rule
VERDICT / stale
EVIDENCE / site.css:49 (`grid-template-columns: repeat(auto-fill, minmax(11rem, 1fr))`) and site.css:54 (`repeat(auto-fill, minmax(10rem, 1fr))`) — base grid is `auto-fill, 11rem` (not `auto-fit, 12rem`); `auto-fill` not `auto-fit`
CODE_AREA / dashboard.css
SEVERITY / low
PROPOSED_FIX / Update code block: base grid is `repeat(auto-fill, minmax(11rem, 1fr))`; compact stays `minmax(10rem, 1fr)`.
---
DOC / DashboardInterfaceDesign.md / LINES / 191-200
CLAIM / Status uses Bootstrap badge classes (`text-bg-success`, `text-bg-info`, `text-bg-secondary`, `text-bg-danger`, `text-bg-light text-dark border`) with mapping: `Closed` -> `text-bg-secondary`; `Creating`/`StartingWorker`/`WaitingForPipe`/`InitializingWorker`/`Closing` -> `text-bg-info`
CLAIM_TYPE / behavior-rule
VERDICT / wrong
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Dashboard/Components/Shared/StatusBadge.razor:116 — `StatusBadge` is now a thin adapter over the kit's `<StatusPill State="…">` component; no Bootstrap `text-bg-*` classes are used at all. State mapping uses `StatusState.Ok/Warn/Bad/Idle`. `Closed` falls through to `StatusState.Idle` (no `text-bg-secondary`). `Closing` is mapped to `StatusState.Warn` (not info). New states `Stale`, `Degraded`, `Active`, `Unavailable` are handled; `Unknown state` -> `StatusState.Idle` (was `text-bg-light text-dark border`).
CODE_AREA / dashboard.theme
SEVERITY / high
PROPOSED_FIX / Replace the badge-class table with the current `StatusState` enum vocabulary: Ok (`Ready`, `Healthy`, `Active`); Warn (`Creating`, `StartingWorker`, `WaitingForPipe`, `InitializingWorker`, `Closing`, `Stale`, `Degraded`); Bad (`Faulted`, `Unavailable`); Idle (everything else including `Closed`). Note that visual rendering is owned by the ZB.MOM.WW.Theme `StatusPill` component.
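The proposed vocabulary can be written out as a single switch. This is a sketch of the mapping as listed in this finding, not the contents of `StatusBadge.razor`: the local `StatusState` enum is a stand-in for the kit's type, `StatusMappingSketch.Map` is a hypothetical name, and case-sensitive matching on the state name is an assumption.

```csharp
using System;

// Stand-in for the ZB.MOM.WW.Theme kit's StatusState (Ok/Warn/Bad/Idle).
public enum StatusState { Ok, Warn, Bad, Idle }

// Hypothetical helper; mirrors the mapping listed in the PROPOSED_FIX above.
public static class StatusMappingSketch
{
    public static StatusState Map(string state) => state switch
    {
        "Ready" or "Healthy" or "Active" => StatusState.Ok,
        "Creating" or "StartingWorker" or "WaitingForPipe" or "InitializingWorker"
            or "Closing" or "Stale" or "Degraded" => StatusState.Warn,
        "Faulted" or "Unavailable" => StatusState.Bad,
        // Everything else, including "Closed", null, and unknown names, is Idle.
        _ => StatusState.Idle,
    };
}
```

Keeping the fallback arm last means a newly introduced state name renders as `Idle` until it is mapped explicitly, which matches the "everything else" rule in the proposed fix.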
---
DOC / DashboardInterfaceDesign.md / LINES / 229-245
CLAIM / Responsive breakpoint CSS: `.dashboard-content { padding: .75rem }` at `max-width: 700px`
CLAIM_TYPE / behavior-rule
VERDICT / stale
EVIDENCE / site.css:272-279 — the responsive rule targets `.page { padding: 0.85rem }` (not `.dashboard-content`; not `.75rem`)
CODE_AREA / dashboard.css
SEVERITY / low
PROPOSED_FIX / Update code block to `@media (max-width: 700px) { .page { padding: 0.85rem; } … }`.
---
DOC / GatewayDashboardDesign.md / LINES / 78-110
CLAIM / Component tree lists `Layout/DashboardLayout.razor` and `Shared/StatusBadge.razor` as standalone status component
CLAIM_TYPE / path
VERDICT / stale
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Dashboard/Components/Layout/ contains only `MainLayout.razor` and `LoginLayout.razor`; there is no `DashboardLayout.razor`. `StatusBadge.razor` still exists but is now a thin wrapper delegating to the kit's `StatusPill`; the doc's tree implies it is a standalone Bootstrap-badge component.
CODE_AREA / dashboard.theme
SEVERITY / medium
PROPOSED_FIX / In the component tree rename `DashboardLayout.razor` -> `MainLayout.razor` and `LoginLayout.razor`. Add a note that `StatusBadge` delegates to `ZB.MOM.WW.Theme`'s `StatusPill`. Also add `BrowseTreeNodeView.razor` and `ConfirmDialog.razor` which are present in code but absent from the tree.
---
DOC / GatewayDashboardDesign.md / LINES / 507-510
CLAIM / "The dashboard serves Bootstrap 5.3.3 assets from `src/ZB.MOM.WW.MxGateway.Server/wwwroot/lib/bootstrap/` and local layout/status styling from `src/ZB.MOM.WW.MxGateway.Server/wwwroot/css/dashboard.css`."
CLAIM_TYPE / path
VERDICT / wrong
EVIDENCE / find shows `wwwroot/css/site.css` exists; there is no `dashboard.css` under wwwroot. Bootstrap 5.3.3 is confirmed (bootstrap.min.css header). App.razor:89 loads `/css/site.css`, not `/css/dashboard.css`. Additionally the denied-page renderer at DashboardEndpointRouteBuilderExtensions.cs:172-173 also loads theme kit CSS: `/_content/ZB.MOM.WW.Theme/css/theme.css` and `/_content/ZB.MOM.WW.Theme/css/layout.css`.
CODE_AREA / dashboard.css
SEVERITY / high
PROPOSED_FIX / Change `dashboard.css` -> `site.css` throughout. Add that App.razor also loads `<ThemeHead />` (which injects the theme kit's CSS) and `<ThemeScripts />`. Note the denied-page also pulls `/_content/ZB.MOM.WW.Theme/css/theme.css` and `/_content/ZB.MOM.WW.Theme/css/layout.css` directly.
---
DOC / GatewayDashboardDesign.md / LINES / 406-428
CLAIM / "`DashboardAuthenticator` binds against `MxGateway:Ldap` … using `Novell.Directory.Ldap.NETStandard`"
CLAIM_TYPE / behavior-rule
VERDICT / stale
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Dashboard/DashboardAuthenticator.cs:16 — imports are `ZB.MOM.WW.Auth.Abstractions.Ldap`, `ZB.MOM.WW.Auth.Abstractions.Roles`, `ZB.MOM.WW.Auth.AspNetCore`; the csproj references `ZB.MOM.WW.Auth.Ldap 0.1.2`, not Novell. DashboardServiceCollectionExtensions.cs:35 calls `services.AddZbLdapAuth(configuration, "MxGateway:Ldap")`. `Novell.Directory.Ldap.NETStandard` is not referenced in the csproj.
CODE_AREA / dashboard.login
SEVERITY / medium
PROPOSED_FIX / Replace "using `Novell.Directory.Ldap.NETStandard`" with "using the shared `ZB.MOM.WW.Auth.Ldap` package (`ILdapAuthService`), registered via `AddZbLdapAuth`".
---
DOC / GatewayDashboardDesign.md / LINES / 420–422
CLAIM / Cookie name is `__Host-MxGatewayDashboard`
CLAIM_TYPE / config-key
VERDICT / wrong
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Dashboard/DashboardAuthenticationDefaults.cs:38 — `public const string CookieName = "MxGatewayDashboard"` (no `__Host-` prefix). The code comment explains the `__Host-` prefix is not used; the cookie path is `/` set explicitly at DashboardServiceCollectionExtensions.cs:72.
CODE_AREA / dashboard.login
SEVERITY / high
PROPOSED_FIX / Change `__Host-MxGatewayDashboard` to `MxGatewayDashboard` everywhere in the auth section. Note that the cookie name is configurable via `MxGateway:Dashboard:CookieName`.
---
DOC / GatewayDashboardDesign.md / LINES / 289–306
CLAIM / Browse page is at `/dashboard/browse`; tree built by `DashboardBrowseTreeBuilder` from `IGalaxyHierarchyCache.Current`; subscription panel is the explicit opt-in for tag values
CLAIM_TYPE / path
VERDICT / stale
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Dashboard/Components/Pages/BrowsePage.razor:1 — route is `@page "/browse"` (not `/dashboard/browse`). `DashboardBrowseTreeBuilder` is found at DashboardBrowseModel.cs:66 as a static class. Browse uses `IGalaxyHierarchyCache` (injected via `IDashboardBrowseService`) confirmed at BrowsePage.razor:3.
CODE_AREA / dashboard.hub
SEVERITY / medium
PROPOSED_FIX / Fix route to `/browse` (not `/dashboard/browse`). The tree builder name is accurate but clarify it is a static class inside `DashboardBrowseModel.cs`.
---
DOC / GatewayDashboardDesign.md / LINES / 307–318
CLAIM / Alarms page is at `/dashboard/alarms`; defaults to showing unacknowledged `Active` alarms; Alarms page reads via `IDashboardLiveDataService`
CLAIM_TYPE / path
VERDICT / stale
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Dashboard/Components/Pages/AlarmsPage.razor:1 — route is `@page "/alarms"` (not `/dashboard/alarms`). The claim that alarm data comes from "gateway's always-on central monitor" via `IGatewayAlarmService.CurrentAlarms` is contradicted by the implementation: AlarmsPage.razor:3 injects `IDashboardLiveDataService` and calls `LiveData.QueryAlarmsAsync` in a poll loop — not `IGatewayAlarmService.CurrentAlarms` directly.
CODE_AREA / dashboard.hub
SEVERITY / medium
PROPOSED_FIX / Fix route to `/alarms`. Correct the live data source description: the Alarms page uses `IDashboardLiveDataService.QueryAlarmsAsync` (a polling loop every 3 s), not a direct read of `IGatewayAlarmService.CurrentAlarms`.
---
DOC / GatewayDashboardDesign.md / LINES / 337–345
CLAIM / API keys page is at `/dashboard/apikeys`
CLAIM_TYPE / path
VERDICT / stale
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Dashboard/Components/Pages/ApiKeysPage.razor:1 — route is `@page "/apikeys"` (not `/dashboard/apikeys`)
CODE_AREA / dashboard.hub
SEVERITY / medium
PROPOSED_FIX / Fix route to `/apikeys`.
---
DOC / GatewayDashboardDesign.md / LINES / 387–391
CLAIM / "Every management action appends an `api_key_audit` entry (`dashboard-create-key`, `dashboard-rotate-key`, `dashboard-revoke-key`, `dashboard-delete-key`) with the key id and the caller's remote address."
CLAIM_TYPE / behavior-rule
VERDICT / stale
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Dashboard/DashboardApiKeyManagementService.cs:59 — "both rows land in the canonical audit_event store"; DashboardApiKeyManagementService.cs:69,113,156,201 — action strings `dashboard-create-key`, `dashboard-rotate-key`, `dashboard-revoke-key`, `dashboard-delete-key` are confirmed. However the table used is `audit_event` (via `IAuditWriter`), not `api_key_audit`. Comments explicitly state "the library's `api_key_audit` table is left in place but UNUSED".
CODE_AREA / dashboard.login
SEVERITY / medium
PROPOSED_FIX / Change "appends an `api_key_audit` entry" to "appends an `audit_event` entry (via `IAuditWriter`)". The `api_key_audit` table is no longer used for dashboard actions.
---
DOC / GatewayDashboardDesign.md / LINES / 68–69
CLAIM / Galaxy page is at `/galaxy`; "summary is fed by `GalaxySummaryCache`, which is refreshed off the request path by `GalaxySummaryRefreshService` on the `MxGateway:Galaxy:DashboardRefreshIntervalSeconds` cadence"
CLAIM_TYPE / behavior-rule
VERDICT / stale
EVIDENCE / `GalaxySummaryCache` and `GalaxySummaryRefreshService` do not exist in the codebase. The actual implementation uses `IGalaxyHierarchyCache` (GalaxyHierarchyCache.cs) refreshed by `GalaxyHierarchyRefreshService` (Galaxy/GalaxyHierarchyRefreshService.cs:19), driven by `GalaxyRepositoryOptions.DashboardRefreshIntervalSeconds` under config key `MxGateway:Galaxy:DashboardRefreshIntervalSeconds` (confirmed by appsettings.json:74). Galaxy page route `/galaxy` is confirmed.
CODE_AREA / dashboard.hub
SEVERITY / medium
PROPOSED_FIX / Replace `GalaxySummaryCache` / `GalaxySummaryRefreshService` with `GalaxyHierarchyCache` / `GalaxyHierarchyRefreshService`. Config key `MxGateway:Galaxy:DashboardRefreshIntervalSeconds` is correct.
---
DOC / GatewayDashboardDesign.md / LINES / 160–170
CLAIM / "Updates flow over three SignalR hubs … `DashboardSnapshotPublisher` (BackgroundService consuming `IDashboardSnapshotService.WatchSnapshotsAsync`)"; hub table row for EventsHub: "`DashboardEventBroadcaster` invoked by `EventStreamService` for each event it forwards to a gRPC client"
CLAIM_TYPE / behavior-rule
VERDICT / accurate
EVIDENCE / Dashboard/Hubs/DashboardSnapshotPublisher.cs confirms BackgroundService consuming `WatchSnapshotsAsync`. Dashboard/Hubs/EventsHub.cs:617 comment: "The publisher side is intentionally a follow-up. Today the dashboard's per-session event view is fed by the snapshot hub … Once a dedicated MxEvent broadcaster lands, this hub's group convention is what it will publish to." — so the doc's description of `DashboardEventBroadcaster` being active is aspirational; it currently exists as a stub.
CODE_AREA / dashboard.hub
SEVERITY / low
PROPOSED_FIX / Flag only — add a note that `EventsHub`'s broadcaster is a planned follow-up; today the per-session events view in `SessionDetailsPage` connects to `/hubs/events` directly and `DashboardEventBroadcaster` exists but the EventStreamService hook is not yet wired. The hub routing convention is stable.
---
DOC / GatewayDashboardDesign.md / LINES / 171–177
CLAIM / "`DashboardPageBase` … seeds `Snapshot` synchronously from `IDashboardSnapshotService.GetSnapshot()` … and calls `InvokeAsync(StateHasChanged)` on every `SnapshotUpdated` push. SignalR's `WithAutomaticReconnect` handles transient disconnects."
CLAIM_TYPE / behavior-rule
VERDICT / accurate
EVIDENCE / Dashboard/Components/DashboardPageBase.cs:37–78 — `GetSnapshot()` seed on line 37, hub `On<DashboardSnapshot>` calls `InvokeAsync(StateHasChanged)` on line 65. DashboardHubConnectionFactory.cs:36 — `.WithAutomaticReconnect()` confirmed.
CODE_AREA / dashboard.hub
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / GatewayDashboardDesign.md / LINES / 559–577
CLAIM / "Initial Implementation Slice … 2. local Bootstrap static assets."
CLAIM_TYPE / behavior-rule
VERDICT / stale
EVIDENCE / App.razor:7–8 — `<ThemeHead />` is loaded before `/css/site.css`; ThemeScripts at line 15. The theme kit (`ZB.MOM.WW.Theme 0.2.0`) is now the primary asset provider via `/_content/ZB.MOM.WW.Theme/`. The "local Bootstrap static assets" description is no longer the full picture — Bootstrap is still vendored locally but the theme kit adds additional CSS/JS layers. DashboardEndpointRouteBuilderExtensions.cs:172–173 directly references `/_content/ZB.MOM.WW.Theme/css/theme.css` and `/_content/ZB.MOM.WW.Theme/css/layout.css`.
CODE_AREA / dashboard.css
SEVERITY / low
PROPOSED_FIX / Update item 2 to: "local Bootstrap static assets plus `ZB.MOM.WW.Theme` kit (PackageReference `0.2.0`) providing theme CSS, layout CSS, and JS via `<ThemeHead />` / `<ThemeScripts />`".
---
DOC / GatewayDashboardDesign.md / LINES / 463–465
CLAIM / "Two environmental bypasses … `MxGateway:Authentication:Mode = Disabled` authorizes every request"
CLAIM_TYPE / config-key
VERDICT / unverifiable
EVIDENCE / No code path for `MxGateway:Authentication:Mode` was found in Dashboard/ — search returned no matches. The `AllowAnonymousLocalhost` bypass is confirmed at DashboardAuthorizationHandler.cs (referenced from DashboardAuthorizationRequirement). The global Auth mode bypass may live in GatewayOptions outside the dashboard cluster.
CODE_AREA / dashboard.login
SEVERITY / low
PROPOSED_FIX / Cross-check against GatewayOptions and the auth middleware — if this config key was removed or renamed, update the doc. If it lives outside dashboard code, add a cross-reference.
---
## Gap findings (code behavior undocumented)
DOC / gap
LINES / n/a
CLAIM / `Login.razor` is a Blazor page (`@page "/login"`) using `LoginLayout` and the kit's `<LoginCard>` component. GET /login is served by this Blazor page, not a static HTML form. POST /login is a minimal-API endpoint.
CLAIM_TYPE / behavior-rule
VERDICT / gap
EVIDENCE / Dashboard/Components/Pages/Login.razor:127; DashboardEndpointRouteBuilderExtensions.cs:27–36
CODE_AREA / dashboard.login
SEVERITY / medium
PROPOSED_FIX / Add to the auth section: "GET `/login` is served by the Blazor `Login.razor` page (using the shared kit's `<LoginCard>`); the page is `[AllowAnonymous]` and uses `LoginLayout` (no side rail). POST `/login` remains a minimal-API endpoint."
---
DOC / gap
LINES / n/a
CLAIM / `StatusBadge.razor` now maps `Closed` → `StatusState.Idle` (not `text-bg-secondary`); adds `Stale`, `Degraded` → Warn; adds `Active` → Ok; adds `Unavailable` → Bad. None of these new states are documented.
CLAIM_TYPE / behavior-rule
VERDICT / gap
EVIDENCE / Dashboard/Components/Shared/StatusBadge.razor:10–14
CODE_AREA / dashboard.theme
SEVERITY / medium
PROPOSED_FIX / Document the full current state-to-StatusState mapping including `Active`, `Stale`, `Degraded`, `Unavailable`.
---
DOC / gap
LINES / n/a
CLAIM / `ZB.MOM.WW.Theme 0.2.0` is a PackageReference and provides `ThemeShell`, `ThemeHead`, `ThemeScripts`, `NavRailSection`, `NavRailItem`, `StatusPill`, `LoginCard` components. This dependency is not mentioned in either dashboard doc.
CLAIM_TYPE / behavior-rule
VERDICT / gap
EVIDENCE / ZB.MOM.WW.MxGateway.Server.csproj (PackageReference ZB.MOM.WW.Theme Version=0.2.0); MainLayout.razor; App.razor; Login.razor
CODE_AREA / dashboard.theme
SEVERITY / high
PROPOSED_FIX / Add a "Theme Kit" section to GatewayDashboardDesign.md (and update DashboardInterfaceDesign.md) documenting the ZB.MOM.WW.Theme dependency, which components it provides, and that the kit owns the shell frame, nav rail, login card, status pill rendering, and base CSS tokens.
---
DOC / gap
LINES / n/a
CLAIM / `DashboardOptions` has a `CookieName` override property (`MxGateway:Dashboard:CookieName`) and a `RequireHttpsCookie` flag. Neither is mentioned in the Configuration section of GatewayDashboardDesign.md.
CLAIM_TYPE / config-key
VERDICT / gap
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Configuration/DashboardOptions.cs; DashboardServiceCollectionExtensions.cs:87–97
CODE_AREA / dashboard.login
SEVERITY / medium
PROPOSED_FIX / Add `CookieName` and `RequireHttpsCookie` to the effective-configuration JSON block and the configuration section prose.
---
DOC / gap
LINES / n/a
CLAIM / `SessionsPage` and `WorkersPage` both render admin `Close`/`Kill` action buttons (with `ConfirmDialog`), not only `SessionDetailsPage` as the doc implies. The `ConfirmDialog` shared component (`Shared/ConfirmDialog.razor`) is not listed in the component tree.
CLAIM_TYPE / behavior-rule
VERDICT / gap
EVIDENCE / Dashboard/Components/Pages/SessionsPage.razor:31–37 (ConfirmDialog usage); Dashboard/Components/Shared/ConfirmDialog.razor (file exists)
CODE_AREA / dashboard.theme
SEVERITY / low
PROPOSED_FIX / Add `ConfirmDialog.razor` to the Shared/ component tree. Note admin controls appear on Sessions list and Workers list pages, not only SessionDetailsPage.
---
## Summary
| Verdict | Count |
|---------|-------|
| accurate | 2 |
| stale | 11 |
| wrong | 4 |
| unverifiable | 1 |
| gap | 6 |
| Severity | Count |
|----------|-------|
| high | 7 |
| medium | 9 |
| low | 8 |
## High-severity findings
- **Layout: no horizontal navbar** — Both docs describe a horizontal top navbar with `dashboard-shell`/`dashboard-navbar`/`container-fluid` classes. The actual layout is a `ZB.MOM.WW.Theme` `ThemeShell` side-rail with `NavRailSection`/`NavRailItem` components. The HTML skeleton in DashboardInterfaceDesign.md is obsolete.
- **Nav labels wrong** — DashboardInterfaceDesign.md lists five flat labels (Overview/Sessions/Workers/Events/Settings). Actual nav has eight items in three groups (Runtime, Galaxy, Admin) and the home link is labelled "Dashboard" not "Overview".
- **CSS tokens do not exist** — DashboardInterfaceDesign.md documents four `--mxgw-*` custom CSS properties. None exist in `site.css`; all colour resolves through ZB.MOM.WW.Theme kit tokens.
- **StatusBadge uses Bootstrap `text-bg-*` classes** — DashboardInterfaceDesign.md documents a Bootstrap badge mapping. `StatusBadge` now delegates to the kit's `StatusPill` with `StatusState` enum; no `text-bg-*` classes are used.
- **`dashboard.css` does not exist** — GatewayDashboardDesign.md refers to `wwwroot/css/dashboard.css` as the local stylesheet. The file is `wwwroot/css/site.css`.
- **Cookie name wrong** — GatewayDashboardDesign.md states cookie name `__Host-MxGatewayDashboard`. Actual default is `MxGatewayDashboard` (no `__Host-` prefix).
- **ZB.MOM.WW.Theme dependency undocumented** — Neither doc mentions the theme kit package, its components (`ThemeShell`, `LoginCard`, `StatusPill`, etc.) or its CSS token system. This is the single most architecturally significant post-migration gap.
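
The cookie finding and the `DashboardOptions` gap both concern keys under `MxGateway:Dashboard`. As a minimal sketch, assuming the usual `appsettings.json` nesting for that section, the fragment a corrected doc could show looks roughly like this. The values are the defaults reported in the audit evidence (`CookieName` from `DashboardAuthenticationDefaults.cs`, `RequireHttpsCookie` from `DashboardOptions.cs`); other dashboard keys are omitted.

```json
{
  "MxGateway": {
    "Dashboard": {
      "CookieName": "MxGatewayDashboard",
      "RequireHttpsCookie": true
    }
  }
}
```

Note that the default name carries no `__Host-` prefix, which is the point of the cookie-name finding.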
+473
View File
@@ -0,0 +1,473 @@
# Cluster 06 — Config
Docs audited: `docs/GatewayConfiguration.md`, `docs/Diagnostics.md`, `docs/Metrics.md`
Code verified against:
- `src/ZB.MOM.WW.MxGateway.Server/Configuration/` (GatewayOptions, GatewayOptionsValidator, and all sub-options)
- `src/ZB.MOM.WW.MxGateway.Server/Diagnostics/`
- `src/ZB.MOM.WW.MxGateway.Server/Metrics/`
- `src/ZB.MOM.WW.MxGateway.Server/Galaxy/GalaxyRepositoryOptions.cs`
- `src/ZB.MOM.WW.MxGateway.Server/Dashboard/DashboardRoles.cs`, `DashboardAuthenticationDefaults.cs`
- `src/ZB.MOM.WW.MxGateway.Server/appsettings.json`
---
DOC / GatewayConfiguration.md / LINES / 55–56
CLAIM / Config shape example shows GroupToRole values as `"Admin"` and `"Viewer"`
CLAIM_TYPE / config-key
VERDICT / wrong
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Dashboard/DashboardRoles.cs:14 — `public const string Admin = "Administrator";`; src/ZB.MOM.WW.MxGateway.Server/Configuration/GatewayOptionsValidator.cs:212–216 — validator compares against `DashboardRoles.Admin` and `DashboardRoles.Viewer`; src/ZB.MOM.WW.MxGateway.Server/appsettings.json:63 — canonical example uses `"Administrator"`
CODE_AREA / config.Dashboard.GroupToRole
SEVERITY / high
PROPOSED_FIX / Change `"Admin"` to `"Administrator"` in the config shape example JSON (line 55). The Viewer value is correct.
---
DOC / GatewayConfiguration.md / LINES / 156
CLAIM / Description says 'Values must be `Admin` (read/write, API-key CRUD) or `Viewer` (read-only)'
CLAIM_TYPE / config-key
VERDICT / wrong
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Dashboard/DashboardRoles.cs:14 — `public const string Admin = "Administrator";`; GatewayOptionsValidator.cs:216 — error message embeds `DashboardRoles.Admin` which resolves to `"Administrator"`
CODE_AREA / config.Dashboard.GroupToRole
SEVERITY / high
PROPOSED_FIX / Replace `` `Admin` `` with `` `Administrator` `` in the table description. The note in the Authorization policies subsection (lines 169, 174) says "Admin or Viewer" as role labels, not config values — those are fine as label prose.
---
DOC / Diagnostics.md / LINES / 165–166
CLAIM / Code snippet shows `CreateLogger("ZB.MOM.WW.MxGateway.Request")` as the logger category
CLAIM_TYPE / term
VERDICT / wrong
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Diagnostics/GatewayRequestLoggingMiddlewareExtensions.cs:30 — `.CreateLogger("MxGateway.Request")`
CODE_AREA / diag.GatewayRequestLoggingMiddleware
SEVERITY / medium
PROPOSED_FIX / Change the code snippet and the surrounding sentence ("The logger category is `ZB.MOM.WW.MxGateway.Request`") to use `MxGateway.Request`.
---
DOC / GatewayConfiguration.md / LINES / 14–19
CLAIM / The `MxGateway:Ldap` configuration section (11 keys, validated by GatewayOptionsValidator) is not documented in this file
CLAIM_TYPE / config-key
VERDICT / gap
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Configuration/LdapOptions.cs:31–71 — 11 properties (Enabled, Server, Port, Transport, AllowInsecure, SearchBase, ServiceAccountDn, ServiceAccountPassword, UserNameAttribute, DisplayNameAttribute, GroupAttribute); GatewayOptionsValidator.cs:55–90 — ValidateLdap() validates all required fields; appsettings.json:22–33 — Ldap section present in default config; GatewayOptions.cs:13 — `public LdapOptions Ldap { get; init; } = new();`
CODE_AREA / config.Ldap
SEVERITY / medium
PROPOSED_FIX / Add a `## Ldap Options` table covering the 11 keys with their defaults and the validation rules (Server/SearchBase/ServiceAccountDn/ServiceAccountPassword/UserNameAttribute/DisplayNameAttribute/GroupAttribute required when Enabled; Port must be valid; Transport=None requires AllowInsecure=true).
---
DOC / Diagnostics.md / LINES / 12–22
CLAIM / GatewayLogRedactorSeam (in Diagnostics/ folder) is not mentioned
CLAIM_TYPE / term
VERDICT / gap
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Diagnostics/GatewayLogRedactorSeam.cs:127 — implements `ILogRedactor`; adapts `GatewayLogRedactor` for the Serilog `RedactionEnricher` so every log event masks API-key/credential material in `ClientIdentity`, `authorization`, and `Authorization` properties
CODE_AREA / diag.GatewayLogRedactorSeam
SEVERITY / low
PROPOSED_FIX / Add a short note under the Consumers section describing `GatewayLogRedactorSeam` as the `ILogRedactor` adapter that wires `GatewayLogRedactor` into the Serilog telemetry enrichment pipeline, covering the three property keys it redacts.
---
DOC / Diagnostics.md / LINES / 12–22
CLAIM / AuthStoreHealthCheck (in Diagnostics/ folder, an ASP.NET Core health check) is not mentioned
CLAIM_TYPE / term
VERDICT / gap
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Diagnostics/AuthStoreHealthCheck.cs:130 — readiness probe verifying the SQLite auth store; GatewayApplication.cs:71–72 — `.AddTypeActivatedCheck<AuthStoreHealthCheck>(...)`
CODE_AREA / diag.AuthStoreHealthCheck
SEVERITY / low
PROPOSED_FIX / Add a brief section describing the `AuthStoreHealthCheck` readiness probe (executes `SELECT 1` against the SQLite auth store, exposed via the `/health/ready` and `/healthz` endpoints).
---
DOC / GatewayConfiguration.md / LINES / 14–77 (config shape JSON)
CLAIM / Config shape JSON example omits the `MxGateway:Ldap` section entirely
CLAIM_TYPE / config-key
VERDICT / gap
EVIDENCE / appsettings.json:22–33 — Ldap section is present; GatewayOptions.cs:13 — Ldap is a first-class sub-section of GatewayOptions
CODE_AREA / config.Ldap
SEVERITY / medium
PROPOSED_FIX / Add the `"Ldap": { ... }` block to the configuration shape example, showing the keys and their defaults from `LdapOptions`.
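
A rough sketch of what such a block might look like, assuming the section nests under `MxGateway` like the other option groups. The eleven key names come from `LdapOptions` as listed in the evidence above; every value below is an illustrative placeholder rather than a default read from the code, and the real defaults should be copied from `LdapOptions` and `appsettings.json` when the doc is updated. The `//` comments rely on the .NET JSON configuration provider skipping comments.

```jsonc
{
  "MxGateway": {
    "Ldap": {
      // All values are hypothetical placeholders, not the LdapOptions defaults.
      "Enabled": true,
      "Server": "ldap.example.internal",
      "Port": 389,
      // Per the validation rule noted in the audit, Transport=None is only
      // accepted together with AllowInsecure=true.
      "Transport": "None",
      "AllowInsecure": true,
      "SearchBase": "dc=example,dc=internal",
      "ServiceAccountDn": "cn=svc-mxgateway,dc=example,dc=internal",
      "ServiceAccountPassword": "<supplied from a secret store>",
      "UserNameAttribute": "uid",
      "DisplayNameAttribute": "cn",
      "GroupAttribute": "memberOf"
    }
  }
}
```

The pairing of `Transport` and `AllowInsecure` is shown only to illustrate the validator rule; it is not a recommendation to run LDAP without transport security.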
---
DOC / GatewayConfiguration.md / LINES / 15–19
CLAIM / Authentication options: Mode=ApiKey, SqlitePath, PepperSecretName, RunMigrationsOnStartup all have documented defaults matching code
CLAIM_TYPE / config-key
VERDICT / accurate
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Configuration/AuthenticationOptions.cs:6–16 — Mode=ApiKey, SqlitePath=`C:\ProgramData\MxGateway\gateway-auth.db`, PepperSecretName=`MxGateway:ApiKeyPepper`, RunMigrationsOnStartup=true
CODE_AREA / config.Authentication
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / GatewayConfiguration.md / LINES / 21–33
CLAIM / Worker options: all 10 keys and their documented defaults match code
CLAIM_TYPE / config-key
VERDICT / accurate
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Configuration/WorkerOptions.cs:5–38 — ExecutablePath, WorkingDirectory=null, RequiredArchitecture=X86, StartupTimeoutSeconds=30, StartupProbeRetryAttempts=3, StartupProbeRetryDelayMilliseconds=250, PipeConnectAttemptTimeoutMilliseconds=2000, ShutdownTimeoutSeconds=10, HeartbeatIntervalSeconds=5, HeartbeatGraceSeconds=15, MaxMessageBytes=16777216
CODE_AREA / config.Worker
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / GatewayConfiguration.md / LINES / 110
CLAIM / MaxMessageBytes validator range is 1024 through 268435456
CLAIM_TYPE / config-key
VERDICT / accurate
EVIDENCE / GatewayOptionsValidator.cs:9–10 — `MinimumMaxMessageBytes = 1024`, `MaximumMaxMessageBytes = 256 * 1024 * 1024` (= 268435456)
CODE_AREA / config.Worker.MaxMessageBytes
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / GatewayConfiguration.md / LINES / 34–41
CLAIM / Session options: all 6 keys and their documented defaults match code
CLAIM_TYPE / config-key
VERDICT / accurate
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Configuration/SessionOptions.cs:4–30 — DefaultCommandTimeoutSeconds=30, MaxSessions=64, MaxPendingCommandsPerSession=128, DefaultLeaseSeconds=1800, LeaseSweepIntervalSeconds=30, AllowMultipleEventSubscribers=false (C# bool default)
CODE_AREA / config.Sessions
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / GatewayConfiguration.md / LINES / 43–45
CLAIM / Event options: QueueCapacity=10000, BackpressurePolicy=FailFast
CLAIM_TYPE / config-key
VERDICT / accurate
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Configuration/EventOptions.cs:4–14 — QueueCapacity=10_000, BackpressurePolicy=FailFast
CODE_AREA / config.Events
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / GatewayConfiguration.md / LINES / 46–57
CLAIM / Dashboard options: Enabled=true, AllowAnonymousLocalhost=true, RequireHttpsCookie=true, CookieName default=MxGatewayDashboard, SnapshotIntervalMilliseconds=1000, RecentFaultLimit=100, RecentSessionLimit=200, ShowTagValues=false, GroupToRole empty by default
CLAIM_TYPE / config-key
VERDICT / accurate
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Configuration/DashboardOptions.cs:6–53 — all defaults confirmed; src/ZB.MOM.WW.MxGateway.Server/Dashboard/DashboardAuthenticationDefaults.cs:38 — CookieName="MxGatewayDashboard"
CODE_AREA / config.Dashboard
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / GatewayConfiguration.md / LINES / 59–62
CLAIM / Protocol options: WorkerProtocolVersion=1, MaxGrpcMessageBytes=16777216; validator range 1024–268435456
CLAIM_TYPE / config-key
VERDICT / accurate
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Configuration/ProtocolOptions.cs:13–16; GatewayOptionsValidator.cs:291–302
CODE_AREA / config.Protocol
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / GatewayConfiguration.md / LINES / 63–69
CLAIM / Galaxy options: ConnectionString, CommandTimeoutSeconds=60, DashboardRefreshIntervalSeconds=30, PersistSnapshot=true, SnapshotCachePath defaults all match code
CLAIM_TYPE / config-key
VERDICT / accurate
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Galaxy/GalaxyRepositoryOptions.cs:16–46 — all defaults confirmed
CODE_AREA / config.Galaxy
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / GatewayConfiguration.md / LINES / 70–75
CLAIM / Alarm options: Enabled=false, SubscriptionExpression=empty, DefaultArea=empty, ReconcileIntervalSeconds=30
CLAIM_TYPE / config-key
VERDICT / accurate
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Configuration/AlarmsOptions.cs:22–47 — Enabled default is C# bool default (false), SubscriptionExpression=string.Empty, DefaultArea=string.Empty, ReconcileIntervalSeconds=30
CODE_AREA / config.Alarms
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / GatewayConfiguration.md / LINES / 228
CLAIM / ReconcileIntervalSeconds is "Floored at 5 seconds"
CLAIM_TYPE / behavior-rule
VERDICT / accurate
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Alarms/GatewayAlarmMonitor.cs:239 — `int seconds = Math.Max(5, _options.ReconcileIntervalSeconds);`
CODE_AREA / config.Alarms.ReconcileIntervalSeconds
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / GatewayConfiguration.md / LINES / 346–354
CLAIM / TLS options: SelfSignedCertPath, ValidityYears=10, AdditionalDnsNames=[], RegenerateIfExpired=true; ValidityYears validated 1–100
CLAIM_TYPE / config-key
VERDICT / accurate
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Configuration/TlsOptions.cs:11–22; GatewayOptionsValidator.cs:260–261 — `MinimumCertValidityYears = 1`, `MaximumCertValidityYears = 100`
CODE_AREA / config.Tls
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / GatewayConfiguration.md / LINES / 164–176
CLAIM / Three authorization policies named MxGateway.Dashboard.Viewer, MxGateway.Dashboard.Admin, MxGateway.Dashboard.HubClients; hub-token bearer scheme named MxGateway.Dashboard.HubToken
CLAIM_TYPE / term
VERDICT / accurate
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Dashboard/DashboardAuthenticationDefaults.cs:20,27,34,14
CODE_AREA / config.Dashboard.AuthPolicies
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / GatewayConfiguration.md / LINES / 180–195
CLAIM / SignalR hubs mapped at /hubs/snapshot, /hubs/alarms, /hubs/events; token endpoint at /hubs/token
CLAIM_TYPE / path
VERDICT / accurate
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Dashboard/DashboardEndpointRouteBuilderExtensions.cs:63–65,73
CODE_AREA / config.Dashboard.Hubs
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / GatewayConfiguration.md / LINES / 193
CLAIM / `GET /hubs/token` mints a 30-minute data-protected bearer token
CLAIM_TYPE / behavior-rule
VERDICT / accurate
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Dashboard/HubTokenService.cs:29 — `private static readonly TimeSpan TokenLifetime = TimeSpan.FromMinutes(30);`
CODE_AREA / config.Dashboard.HubToken
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / GatewayConfiguration.md / LINES / 197–206
CLAIM / Pipeline ordering: UseGatewayRequestLoggingScope → UseStaticFiles → UseAuthentication → UseAuthorization → UseAntiforgery → MapGatewayEndpoints
CLAIM_TYPE / behavior-rule
VERDICT / accurate
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/GatewayApplication.cs:40–45
CODE_AREA / diag.GatewayRequestLoggingMiddleware
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / Diagnostics.md / LINES / 15–34
CLAIM / GatewayLogScope record signature (SessionId, WorkerProcessId, CorrelationId, CommandMethod, ClientIdentity) and ToDictionary behavior matches code
CLAIM_TYPE / term
VERDICT / accurate
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Diagnostics/GatewayLogScope.cs:3–34
CODE_AREA / diag.GatewayLogScope
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / Diagnostics.md / LINES / 44–57
CLAIM / GatewayLoggerExtensions.BeginGatewayScope signature and behavior match code
CLAIM_TYPE / term
VERDICT / accurate
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Diagnostics/GatewayLoggerExtensions.cs:9–18
CODE_AREA / diag.GatewayLoggerExtensions
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / Diagnostics.md / LINES / 68–80
CLAIM / SensitiveCommandMethods set contains AuthenticateUser, WriteSecured, WriteSecured2; IsCredentialBearingCommand logic is correct
CLAIM_TYPE / behavior-rule
VERDICT / accurate
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Diagnostics/GatewayLogRedactor.cs:11–26
CODE_AREA / diag.GatewayLogRedactor
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / Diagnostics.md / LINES / 86–117
CLAIM / RedactApiKey implementation (bearer prefix, mxgw_ marker, split count=3, tokenParts[1] kept) matches code
CLAIM_TYPE / behavior-rule
VERDICT / accurate
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Diagnostics/GatewayLogRedactor.cs:32–59
CODE_AREA / diag.GatewayLogRedactor
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / Diagnostics.md / LINES / 127–148
CLAIM / RedactCommandValue: when valueLoggingEnabled=false every value is redacted; credential-bearing commands always redact even with valueLoggingEnabled=true
CLAIM_TYPE / behavior-rule
VERDICT / accurate
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Diagnostics/GatewayLogRedactor.cs:83–99
CODE_AREA / diag.GatewayLogRedactor
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / Diagnostics.md / LINES / 181–188
CLAIM / Request logging scope reads headers: x-session-id, x-worker-process-id, x-correlation-id, x-command-method, authorization
CLAIM_TYPE / term
VERDICT / accurate
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Diagnostics/GatewayRequestLoggingMiddlewareExtensions.cs:9–16,32–37
CODE_AREA / diag.GatewayRequestLoggingMiddleware
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / Metrics.md / LINES / 8
CLAIM / GatewayMetrics is a singleton registered in GatewayApplication.cs
CLAIM_TYPE / term
VERDICT / accurate
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/GatewayApplication.cs:76 — `builder.Services.AddSingleton<GatewayMetrics>();`
CODE_AREA / metrics.GatewayMetrics
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / Metrics.md / LINES / 14
CLAIM / Meter name constant is "ZB.MOM.WW.MxGateway"
CLAIM_TYPE / term
VERDICT / accurate
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Metrics/GatewayMetrics.cs:8 — `public const string MeterName = "ZB.MOM.WW.MxGateway";`
CODE_AREA / metrics.GatewayMetrics.MeterName
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / Metrics.md / LINES / 36–49
CLAIM / All 13 counter instrument names match code
CLAIM_TYPE / term
VERDICT / accurate
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Metrics/GatewayMetrics.cs:58–70 — mxgateway.sessions.opened, .sessions.closed, .commands.started, .commands.succeeded, .commands.failed, .events.received, .queues.overflows, .faults, .workers.killed, .workers.exited, .heartbeats.failed, .grpc.streams.disconnected, .retries.attempted all confirmed
CODE_AREA / metrics.counters
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / Metrics.md / LINES / 56–65
CLAIM / Three histograms: mxgateway.workers.startup.duration ("s"), mxgateway.commands.duration ("s"), mxgateway.events.stream_send.duration ("s") — names, units, tag shapes match code
CLAIM_TYPE / term
VERDICT / accurate
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Metrics/GatewayMetrics.cs:71–73
CODE_AREA / metrics.histograms
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / Metrics.md / LINES / 73–77
CLAIM / Four observable gauges: mxgateway.sessions.open, mxgateway.workers.running, mxgateway.events.worker_queue.depth, mxgateway.events.grpc_stream_queue.depth match code
CLAIM_TYPE / term
VERDICT / accurate
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Metrics/GatewayMetrics.cs:75–78
CODE_AREA / metrics.gauges
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / Metrics.md / LINES / 82–104
CLAIM / GatewayMetricsSnapshot record fields (21 parameters) match code exactly
CLAIM_TYPE / term
VERDICT / accurate
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Metrics/GatewayMetricsSnapshot.cs:3–24
CODE_AREA / metrics.GatewayMetricsSnapshot
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / Metrics.md / LINES / 114
CLAIM / EventsReceived is read with Interlocked.Read(ref _eventsReceived) inside GetSnapshot
CLAIM_TYPE / behavior-rule
VERDICT / accurate
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Metrics/GatewayMetrics.cs:397 — `EventsReceived: Interlocked.Read(ref _eventsReceived),`
CODE_AREA / metrics.GatewayMetrics.GetSnapshot
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / Metrics.md / LINES / 138–139
CLAIM / SessionRemoved decrements the open-session gauge without incrementing the closed counter
CLAIM_TYPE / behavior-rule
VERDICT / accurate
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Metrics/GatewayMetrics.cs:126–134 — SessionRemoved() decrements _openSessions but does not touch _sessionsClosed
CODE_AREA / metrics.GatewayMetrics.SessionRemoved
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / Metrics.md / LINES / 169
CLAIM / SessionWorkerClientFactory records WorkerKilled("OpenSessionFailed")
CLAIM_TYPE / behavior-rule
VERDICT / accurate
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Sessions/SessionWorkerClientFactory.cs:133
CODE_AREA / metrics.recording.SessionWorkerClientFactory
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / Metrics.md / LINES / 154-162
CLAIM / WorkerProcessLauncher records WorkerKilled(reason) and RetryAttempted("worker_startup")
CLAIM_TYPE / behavior-rule
VERDICT / accurate
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Workers/WorkerProcessLauncher.cs:260,282
CODE_AREA / metrics.recording.WorkerProcessLauncher
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / Metrics.md / LINES / 178-192
CLAIM / EventStreamService records AdjustGrpcEventStreamQueueDepth, StreamDisconnected("Detached"), QueueOverflow("grpc-event-stream"), Fault(EventQueueOverflow), Fault(WorkerFaulted)
CLAIM_TYPE / behavior-rule
VERDICT / accurate
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Grpc/EventStreamService.cs:58,67,96,99,146,150,179
CODE_AREA / metrics.recording.EventStreamService
SEVERITY / low
PROPOSED_FIX / flag only
---
## Summary
| Verdict | Count |
|--------------|-------|
| accurate | 25 |
| wrong | 3 |
| stale | 0 |
| unverifiable | 0 |
| gap | 4 |
| **Total**    | **32** |

| Severity | Count |
|----------|-------|
| high | 2 |
| medium | 3 |
| low | 27 |
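The counts above can be recomputed from the fragment records themselves. The following is a minimal Python sketch, assuming records are separated by `---` lines and that each field sits on its own line in either the `KEY / value` or the `KEY: value` form; the `tally` helper and the `SAMPLE` text are hypothetical and not part of the audit tooling.

```python
from collections import Counter


def tally(fragment_text: str, field: str) -> Counter:
    """Count the values of one field across records separated by '---' lines."""
    counts = Counter()
    for record in fragment_text.split("\n---\n"):
        for line in record.splitlines():
            # Fragments use either "KEY / value" or "KEY: value".
            for sep in (" / ", ": "):
                key, found, value = line.partition(sep)
                if found and key.strip() == field:
                    counts[value.strip().lower()] += 1
                    break
    return counts


# Hypothetical three-record sample mixing both field styles.
SAMPLE = """VERDICT / accurate
SEVERITY / low
---
VERDICT: wrong
SEVERITY: high
---
VERDICT / accurate
SEVERITY / low"""

verdicts = tally(SAMPLE, "VERDICT")
severities = tally(SAMPLE, "SEVERITY")
```

Running the same helper over a whole cluster file gives a quick cross-check against the hand-written summary tables.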
## High-Severity Findings
- **GatewayConfiguration.md line 55 — GroupToRole config shape example uses `"Admin"` as a role value.** The validator accepts only `"Administrator"` (`DashboardRoles.Admin = "Administrator"`). Any operator who copies this example verbatim will produce a validation failure at startup. Fix: change `"GwAdmin": "Admin"` to `"GwAdmin": "Administrator"` in the JSON block.
- **GatewayConfiguration.md line 156 — GroupToRole table description says values must be `Admin` or `Viewer`.** The accepted value is `"Administrator"`, not `"Admin"`. This is the primary prose that operators read when configuring LDAP role mapping; the wrong string here will silently break authentication if an operator follows the docs. Fix: replace `` `Admin` `` with `` `Administrator` `` in the description column.
## Medium-Severity Findings
- **Diagnostics.md lines 165-166 — Embedded code snippet and surrounding text state the logger category is `ZB.MOM.WW.MxGateway.Request`.** The actual category used by `GatewayRequestLoggingMiddlewareExtensions` is `MxGateway.Request`. An operator filtering logs by the documented category will see no output. Fix: update snippet and prose to `MxGateway.Request`.
- **GatewayConfiguration.md — `MxGateway:Ldap` section (11 keys) is entirely absent from the config shape JSON example and has no option table.** The section is validated at startup by `GatewayOptionsValidator.ValidateLdap` and appears in `appsettings.json`. Fix: add `"Ldap"` block to the JSON shape and a `## Ldap Options` table.
- **GatewayConfiguration.md — Config shape JSON omits the `Ldap` section** (duplicate of the above gap, listed separately because the shape and the prose table are independent defects).
+362
@@ -0,0 +1,362 @@
# Cluster 07 — Contracts/gRPC
Audit of `docs/Contracts.md`, `docs/Grpc.md`, and `docs/ClientProtoGeneration.md`
verified against:
- `src/ZB.MOM.WW.MxGateway.Contracts/Protos/mxaccess_gateway.proto`
- `src/ZB.MOM.WW.MxGateway.Contracts/Protos/mxaccess_worker.proto`
- `src/ZB.MOM.WW.MxGateway.Contracts/Protos/galaxy_repository.proto`
- `src/ZB.MOM.WW.MxGateway.Server/Grpc/**`
- `clients/proto/proto-inputs.json`
- `src/ZB.MOM.WW.MxGateway.Contracts/ZB.MOM.WW.MxGateway.Contracts.csproj`
---
DOC / LINES / CLAIM / CLAIM_TYPE / VERDICT / EVIDENCE / CODE_AREA / SEVERITY / PROPOSED_FIX
---
DOC: docs/Grpc.md
LINES: 13, 32
CLAIM: "`MxAccessGatewayService` implements the six `MxAccessGateway` RPCs — `OpenSession`, `CloseSession`, `Invoke`, `StreamEvents`, `AcknowledgeAlarm`, and `StreamAlarms`."
CLAIM_TYPE: rpc/proto
VERDICT: wrong
EVIDENCE: mxaccess_gateway.proto:17-38 defines seven RPCs — the six listed plus `QueryActiveAlarms(QueryActiveAlarmsRequest) returns (stream ActiveAlarmSnapshot)`. `MxAccessGatewayService.cs:233` implements `QueryActiveAlarms`. The table at line 13 also says "six" and the table and prose at line 32 both omit `QueryActiveAlarms`.
CODE_AREA: proto.QueryActiveAlarms
SEVERITY: high
PROPOSED_FIX: Change "six" to "seven" in the table and prose. Add `QueryActiveAlarms` to the RPC list at line 32. Add a `### QueryActiveAlarms` handler section describing the server-streaming, session-less snapshot behavior (iterates `alarmService.CurrentAlarms`, respects `alarm_filter_prefix`, completes without emitting transitions).
---
DOC: docs/Grpc.md
LINES: 148
CLAIM: "The mapper exposes static factory methods for every `ProtocolStatusCode` (`Ok`, `InvalidRequest`, `SessionNotFound`, `SessionNotReady`, `WorkerUnavailable`, `Timeout`, `Canceled`, `ProtocolViolation`)."
CLAIM_TYPE: rpc/proto
VERDICT: wrong
EVIDENCE: `mxaccess_gateway.proto:1025` defines `PROTOCOL_STATUS_CODE_MXACCESS_FAILURE = 9`. `MxAccessGrpcMapper.cs:76-174` lists eight factory methods — none for `MxAccessFailure`. The claim "every ProtocolStatusCode" is false because `MxAccessFailure` has no corresponding factory method.
CODE_AREA: proto.ProtocolStatusCode
SEVERITY: medium
PROPOSED_FIX: Either add "except `MxAccessFailure`, which is produced only by the worker" to the sentence, or add the missing factory method and update the list. Do not silently elide the gap.
---
DOC: docs/ClientProtoGeneration.md
LINES: 80, 145
CLAIM: "Python generated-code output directory is `clients/python/src/mxgateway/generated`."
CLAIM_TYPE: path
VERDICT: wrong
EVIDENCE: `clients/proto/proto-inputs.json:28` declares `"python": "clients/python/src/zb_mom_ww_mxgateway/generated"`. The actual directory on disk is `clients/python/src/zb_mom_ww_mxgateway/generated/` (confirmed by `ls`). The doc path `clients/python/src/mxgateway/generated` does not exist.
CODE_AREA: proto.gen
SEVERITY: high
PROPOSED_FIX: Replace both occurrences of `clients/python/src/mxgateway/generated` with `clients/python/src/zb_mom_ww_mxgateway/generated` to match `proto-inputs.json` and the actual filesystem.
---
DOC: docs/Grpc.md
LINES: 227
CLAIM: "Under the default policy only the stream is dropped and the session continues to accept commands."
CLAIM_TYPE: behavior-rule
VERDICT: wrong
EVIDENCE: `appsettings.json:53` sets `"BackpressurePolicy": "FailFast"`. `EventOptions.cs:13` confirms `EventBackpressurePolicy.FailFast` as the default. `EventBackpressurePolicy.cs` names the two values `FailFast` and `DisconnectSubscriber`. The non-FailFast (stream-drop-only) behaviour belongs to `DisconnectSubscriber`, not "the default policy". Under the actual default (`FailFast`) the session is faulted.
CODE_AREA: proto.gen
SEVERITY: medium
PROPOSED_FIX: Rewrite as: "Under `DisconnectSubscriber` only the stream is dropped … Under `FailFast` (the default configured in `appsettings.json`) the session is faulted …"
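The difference between the two policies can be sketched as follows. This is an illustrative Python model, not the gateway's C# code: the enum mirrors the two `EventBackpressurePolicy` names cited in the evidence, while `on_queue_overflow` and its result keys are hypothetical. That the stream itself also ends under `FailFast` is an assumption.

```python
from enum import Enum


class BackpressurePolicy(Enum):
    FAIL_FAST = "FailFast"  # the default per appsettings.json
    DISCONNECT_SUBSCRIBER = "DisconnectSubscriber"


def on_queue_overflow(policy: BackpressurePolicy) -> dict:
    """Describe what happens to the stream and the session on overflow."""
    # Assumption: the overflowing event stream ends under either policy.
    outcome = {"stream_dropped": True, "session_faulted": False}
    if policy is BackpressurePolicy.FAIL_FAST:
        # FailFast additionally faults the whole session.
        outcome["session_faulted"] = True
    return outcome


default_outcome = on_queue_overflow(BackpressurePolicy.FAIL_FAST)
lenient_outcome = on_queue_overflow(BackpressurePolicy.DISCONNECT_SUBSCRIBER)
```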
---
DOC: docs/Contracts.md
LINES: 94, 107
CLAIM: "Full solution build: `dotnet build src/ZB.MOM.WW.MxGateway.slnx`"
CLAIM_TYPE: command
VERDICT: accurate
EVIDENCE: `src/ZB.MOM.WW.MxGateway.slnx` exists on disk.
CODE_AREA: proto.gen
SEVERITY: low
PROPOSED_FIX: None.
---
DOC: docs/Contracts.md
LINES: 94
CLAIM: "Run the contracts build to regenerate C# protobuf and gRPC code: `dotnet build src/ZB.MOM.WW.MxGateway.Contracts/ZB.MOM.WW.MxGateway.Contracts.csproj`"
CLAIM_TYPE: command
VERDICT: accurate
EVIDENCE: `ZB.MOM.WW.MxGateway.Contracts.csproj:27-29` includes all three `.proto` files with `GrpcServices="Both"` or `"None"` and `OutputDir="Generated"`. Building the project triggers protoc via Grpc.Tools.
CODE_AREA: proto.gen
SEVERITY: low
PROPOSED_FIX: None.
---
DOC: docs/Contracts.md
LINES: 4-5
CLAIM: "The contracts project multi-targets `net10.0;net48` and owns the `.proto` files."
CLAIM_TYPE: config-key
VERDICT: accurate
EVIDENCE: `ZB.MOM.WW.MxGateway.Contracts.csproj:4`: `<TargetFrameworks>net10.0;net48</TargetFrameworks>`.
CODE_AREA: proto.gen
SEVERITY: low
PROPOSED_FIX: None.
---
DOC: docs/Contracts.md
LINES: 80-81
CLAIM: "Generated C# output is written to `src/ZB.MOM.WW.MxGateway.Contracts/Generated/`."
CLAIM_TYPE: path
VERDICT: accurate
EVIDENCE: `ZB.MOM.WW.MxGateway.Contracts.csproj:27`: `OutputDir="Generated"`. Directory `src/ZB.MOM.WW.MxGateway.Contracts/Generated/` contains five generated `.cs` files confirmed by `ls`.
CODE_AREA: proto.gen
SEVERITY: low
PROPOSED_FIX: None.
---
DOC: docs/Contracts.md
LINES: 9-19
CLAIM: "The public command model includes bulk subscription command kinds for `AddItemBulk`, `AdviseItemBulk`, `RemoveItemBulk`, `UnAdviseItemBulk`, `SubscribeBulk`, and `UnsubscribeBulk`. They return a `BulkSubscribeReply` containing per-item `SubscribeResult` records with `ServerHandle`, `TagAddress`, `ItemHandle`, `WasSuccessful`, and `ErrorMessage`."
CLAIM_TYPE: rpc/proto
VERDICT: accurate
EVIDENCE: `mxaccess_gateway.proto:117-122` defines all six payloads. `proto:562-568` defines `SubscribeResult` with fields `server_handle`, `tag_address`, `item_handle`, `was_successful`, `error_message`. `proto:570-572` defines `BulkSubscribeReply`.
CODE_AREA: proto.SubscribeResult
SEVERITY: low
PROPOSED_FIX: None.
---
DOC: docs/Contracts.md
LINES: 32-45
CLAIM: "`WriteBulkCommand` / `Write2BulkCommand` / `WriteSecuredBulkCommand` / `WriteSecured2BulkCommand` each carry `server_handle` and a `repeated` list of entries. Each entry mirrors the single-item command shape — `item_handle` + `value` (+ `timestamp_value` on the `*2` variants, + `current_user_id` / `verifier_user_id` on the secured variants). All four replies use `BulkWriteReply` with `repeated BulkWriteResult`. A `BulkWriteResult` has `server_handle`, `item_handle`, `was_successful`, `optional int32 hresult`, `repeated MxStatusProxy statuses`, and `error_message`."
CLAIM_TYPE: rpc/proto
VERDICT: accurate
EVIDENCE: `mxaccess_gateway.proto:384-441` defines all four commands with matching fields. `proto:581-588` defines `BulkWriteResult` with exactly those six fields. `proto:590-592` defines `BulkWriteReply`.
CODE_AREA: proto.BulkWriteResult
SEVERITY: low
PROPOSED_FIX: None.
---
DOC: docs/Contracts.md
LINES: 46-61
CLAIM: "`ReadBulkCommand` carries `server_handle`, `repeated string tag_addresses`, and `uint32 timeout_ms`. The reply is `BulkReadReply` carrying `repeated BulkReadResult`. A `BulkReadResult` has `server_handle`, `tag_address`, `item_handle`, `was_successful`, `was_cached`, `value`, `quality`, `source_timestamp`, `repeated MxStatusProxy statuses`, and `error_message`. `BulkReadResult` has no `hresult` field."
CLAIM_TYPE: rpc/proto
VERDICT: accurate
EVIDENCE: `mxaccess_gateway.proto:456-460` defines `ReadBulkCommand` with those three fields. `proto:612-623` defines `BulkReadResult` with exactly those ten fields, no `hresult`. `proto:625-627` defines `BulkReadReply`.
CODE_AREA: proto.BulkReadResult
SEVERITY: low
PROPOSED_FIX: None.
---
DOC: docs/Contracts.md
LINES: 68-71
CLAIM: "`mxaccess_worker.proto` defines the named-pipe worker IPC envelope and control messages. It imports `mxaccess_gateway.proto` so the worker and gateway use the same command, reply, event, value, and status shapes."
CLAIM_TYPE: rpc/proto
VERDICT: accurate
EVIDENCE: `mxaccess_worker.proto:9`: `import "mxaccess_gateway.proto";`. The `WorkerCommand`, `WorkerCommandReply`, `WorkerEvent` messages wrap `mxaccess_gateway.v1` types directly.
CODE_AREA: proto.WorkerEnvelope
SEVERITY: low
PROPOSED_FIX: None.
---
DOC: docs/Contracts.md
LINES: 73-78
CLAIM: "`galaxy_repository.proto` defines the `GalaxyRepository` service. The service is metadata-only and does not share types with `mxaccess_gateway.proto`."
CLAIM_TYPE: rpc/proto
VERDICT: accurate
EVIDENCE: `galaxy_repository.proto:7-8` imports only `google/protobuf/timestamp.proto` and `google/protobuf/wrappers.proto` — no import of `mxaccess_gateway.proto`. The comment at `galaxy_repository.proto:130` states the type enumeration is distinct from `MxDataType`.
CODE_AREA: proto.GalaxyRepository
SEVERITY: low
PROPOSED_FIX: None.
---
DOC: docs/Grpc.md
LINES: 9-16
CLAIM: "Four collaborators: `MxAccessGatewayService` (scoped/gRPC), `MxAccessGrpcRequestValidator` (singleton), `MxAccessGrpcMapper` (singleton), `IEventStreamService`/`EventStreamService` (singleton)."
CLAIM_TYPE: behavior-rule
VERDICT: accurate
EVIDENCE: `GatewayApplication.cs:88-90` registers mapper, validator, and event stream service as singletons. `MxAccessGatewayService` is not explicitly registered (gRPC services resolved per-request by ASP.NET Core are transient/scoped — "scoped (gRPC)" is accurate per ASP.NET Core DI conventions). `GatewayApplication.cs:195` maps it as a gRPC endpoint.
CODE_AREA: proto.gen
SEVERITY: low
PROPOSED_FIX: None.
---
DOC: docs/Grpc.md
LINES: 20-26
CLAIM: "Registration: `builder.Services.AddSingleton<MxAccessGrpcMapper>(); builder.Services.AddSingleton<MxAccessGrpcRequestValidator>(); builder.Services.AddSingleton<IEventStreamService, EventStreamService>();`"
CLAIM_TYPE: behavior-rule
VERDICT: accurate
EVIDENCE: `GatewayApplication.cs:88-90` matches exactly.
CODE_AREA: proto.gen
SEVERITY: low
PROPOSED_FIX: None.
---
DOC: docs/Grpc.md
LINES: 237-243
CLAIM: "Authorization interceptor registration: `services.AddSingleton<GatewayGrpcAuthorizationInterceptor>(); services.AddGrpc(options => options.Interceptors.Add<GatewayGrpcAuthorizationInterceptor>());`"
CLAIM_TYPE: behavior-rule
VERDICT: accurate
EVIDENCE: `GrpcAuthorizationServiceCollectionExtensions.cs:21,31` contains both lines verbatim.
CODE_AREA: proto.gen
SEVERITY: low
PROPOSED_FIX: None.
---
DOC: docs/Grpc.md
LINES: 100-108
CLAIM: "Validation table — `OpenSession`: `command_timeout` when set must be `> 0`; `CloseSession`: `session_id` non-empty; `StreamEvents`: `session_id` non-empty; `Invoke`: session_id non-empty, command present, kind not Unspecified, payload oneof matches kind; `AcknowledgeAlarm`: `alarm_full_reference` non-empty, validated inline not by `MxAccessGrpcRequestValidator`; `StreamAlarms`: no required fields."
CLAIM_TYPE: behavior-rule
VERDICT: accurate
EVIDENCE: `MxAccessGrpcRequestValidator.cs:10-53` confirms all four validator methods. `MxAccessGatewayService.cs:181-183` confirms the inline alarm reference check. `StreamAlarms` handler at line 204 has no field validation.
CODE_AREA: proto.gen
SEVERITY: low
PROPOSED_FIX: None.
---
DOC: docs/Grpc.md
LINES: 141-147
CLAIM: "When the worker reply or event payload is missing, the mapper returns a synthetic public message with `ProtocolStatusCode.ProtocolViolation` (for replies) or a sentinel `MxEvent` with `MxEventFamily.Unspecified` (for events)."
CLAIM_TYPE: behavior-rule
VERDICT: accurate
EVIDENCE: `MxAccessGrpcMapper.cs:46-54` returns `ProtocolViolation(...)` when `reply.Reply` is null. `MxAccessGrpcMapper.cs:65-69` returns sentinel `MxEvent { Family = MxEventFamily.Unspecified }` when event is null.
CODE_AREA: proto.gen
SEVERITY: low
PROPOSED_FIX: None.
---
DOC: docs/Grpc.md
LINES: 159-174
CLAIM: "Exception mapping: `OperationCanceledException` → `Cancelled`; `SessionManagerException` → mapped by `ErrorCode`; `WorkerClientException` → mapped by `ErrorCode`. `WorkerClientException`: `CommandTimeout` → `DeadlineExceeded`, `GatewayShutdown` → `Cancelled`, `InvalidState` → `FailedPrecondition`, `ProtocolViolation` → `Internal`, others → `Unavailable`."
CLAIM_TYPE: behavior-rule
VERDICT: accurate
EVIDENCE: `MxAccessGatewayService.cs:902-950` matches exactly. `WorkerClientErrorCode.cs:5-12` confirms the four enum values.
CODE_AREA: proto.gen
SEVERITY: low
PROPOSED_FIX: None.
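The `WorkerClientException` part of the mapping above amounts to a lookup table with a fallback. The following is a minimal Python sketch of that shape only; the dictionary and the `grpc_status_for` helper are hypothetical names, and the real handler is the C# code cited in the evidence.

```python
# Worker error codes and their gRPC status names, as listed in the claim above.
WORKER_ERROR_TO_GRPC_STATUS = {
    "CommandTimeout": "DeadlineExceeded",
    "GatewayShutdown": "Cancelled",
    "InvalidState": "FailedPrecondition",
    "ProtocolViolation": "Internal",
}


def grpc_status_for(worker_error_code: str) -> str:
    """Any code not listed explicitly falls through to Unavailable."""
    return WORKER_ERROR_TO_GRPC_STATUS.get(worker_error_code, "Unavailable")


timeout_status = grpc_status_for("CommandTimeout")
fallback_status = grpc_status_for("SomeOtherCode")
```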
---
DOC: docs/Grpc.md
LINES: 184-196
CLAIM: "The channel is bounded by `Events:QueueCapacity` and configured for a single reader and writer with `FullMode = BoundedChannelFullMode.Wait` and `AllowSynchronousContinuations = false`."
CLAIM_TYPE: config-key
VERDICT: accurate
EVIDENCE: `EventStreamService.cs:44-51` matches the code snippet in the doc verbatim.
CODE_AREA: proto.gen
SEVERITY: low
PROPOSED_FIX: None.
---
DOC: docs/ClientProtoGeneration.md
LINES: 39-45
CLAIM: "`GatewayContractInfo.GatewayProtocolVersion` is the public gateway protocol version. `OpenSessionReply.gateway_protocol_version` returns the same value."
CLAIM_TYPE: rpc/proto
VERDICT: accurate
EVIDENCE: `GatewayContractInfo.cs:12`: `public const uint GatewayProtocolVersion = 3;`. `mxaccess_gateway.proto:71`: `uint32 gateway_protocol_version = 8;`. `MxAccessGatewayService.cs:49` copies `GatewayContractInfo.GatewayProtocolVersion` into the reply field.
CODE_AREA: proto.OpenSessionReply
SEVERITY: low
PROPOSED_FIX: None.
---
DOC: docs/ClientProtoGeneration.md
LINES: 55-61
CLAIM: "The script writes `clients/proto/descriptors/mxaccessgw-client-v1.protoset`."
CLAIM_TYPE: path
VERDICT: accurate
EVIDENCE: `clients/proto/descriptors/mxaccessgw-client-v1.protoset` exists on disk. `proto-inputs.json:21` references the same path.
CODE_AREA: proto.gen
SEVERITY: low
PROPOSED_FIX: None.
---
DOC: docs/ClientProtoGeneration.md
LINES: 74-81
CLAIM: "Generated-code directories table: .NET → `clients/dotnet/generated`, Go → `clients/go/internal/generated`, Rust → `clients/rust/src/generated`, Python → `clients/python/src/mxgateway/generated`, Java → `clients/java/src/main/generated`."
CLAIM_TYPE: path
VERDICT: wrong
EVIDENCE: `clients/proto/proto-inputs.json:26-30` lists `"python": "clients/python/src/zb_mom_ww_mxgateway/generated"`. The actual directory is `clients/python/src/zb_mom_ww_mxgateway/generated/` (confirmed by filesystem). The table row for Python says `clients/python/src/mxgateway/generated` which does not exist. All other rows match `proto-inputs.json` and the filesystem.
CODE_AREA: proto.gen
SEVERITY: high
PROPOSED_FIX: Change the Python row from `clients/python/src/mxgateway/generated` to `clients/python/src/zb_mom_ww_mxgateway/generated` in the table and also fix line 145 which contains the same wrong path.
---
DOC: docs/ClientProtoGeneration.md
LINES: 89-101 (generation commands table)
CLAIM: ".NET generation: `dotnet build src/ZB.MOM.WW.MxGateway.Contracts/ZB.MOM.WW.MxGateway.Contracts.csproj`; Go: `Push-Location clients/go; ./generate-proto.ps1; Pop-Location`; Rust: `Push-Location clients/rust; cargo check --workspace; Pop-Location`; Python: `Push-Location clients/python; ./generate-proto.ps1; Pop-Location`; Java: `Push-Location clients/java; gradle :mxgateway-client:generateProto; Pop-Location`."
CLAIM_TYPE: command
VERDICT: accurate
EVIDENCE: Scripts `clients/go/generate-proto.ps1` and `clients/python/generate-proto.ps1` exist. `generate-proto.ps1` for Go uses `$modulePath = 'gitea.dohertylan.com/dohertj2/mxaccessgw/clients/go/internal/generated'` matching the stated package. Contracts csproj exists. All scripts confirmed present.
CODE_AREA: proto.gen
SEVERITY: low
PROPOSED_FIX: None.
---
DOC: docs/ClientProtoGeneration.md
LINES: 119-125
CLAIM: "The Go scaffold maps both proto files into the internal Go package `gitea.dohertylan.com/dohertj2/mxaccessgw/clients/go/internal/generated`."
CLAIM_TYPE: path
VERDICT: accurate
EVIDENCE: `clients/go/generate-proto.ps1:7`: `$modulePath = 'gitea.dohertylan.com/dohertj2/mxaccessgw/clients/go/internal/generated'`.
CODE_AREA: proto.gen
SEVERITY: low
PROPOSED_FIX: None.
---
DOC: docs/ClientProtoGeneration.md
LINES: 170-176
CLAIM: "Golden fixtures: `open-session-reply.ok.json`, `register-command-request.json`, `on-data-change-event.json`."
CLAIM_TYPE: path
VERDICT: accurate
EVIDENCE: All three files exist at `clients/proto/fixtures/golden/`.
CODE_AREA: proto.gen
SEVERITY: low
PROPOSED_FIX: None.
---
DOC: docs/Contracts.md / docs/ClientProtoGeneration.md
LINES: (gap — not documented)
CLAIM: gap — `QueryActiveAlarms` RPC in `mxaccess_gateway.proto` service definition (line 37), `QueryActiveAlarmsRequest` message (line 44), and `ActiveAlarmSnapshot` message (line 783) are not mentioned in `Contracts.md` or `ClientProtoGeneration.md`.
CLAIM_TYPE: rpc/proto
VERDICT: gap
EVIDENCE: `mxaccess_gateway.proto:37`: `rpc QueryActiveAlarms(QueryActiveAlarmsRequest) returns (stream ActiveAlarmSnapshot);`. `Contracts.md` describes every other public RPC but never mentions `QueryActiveAlarms`.
CODE_AREA: proto.QueryActiveAlarms
SEVERITY: medium
PROPOSED_FIX: Add a paragraph to `Contracts.md` describing `QueryActiveAlarms` — session-less, server-streaming, returns point-in-time snapshot of active alarms from the gateway's always-on alarm monitor cache, optionally filtered by `alarm_filter_prefix`. Cross-reference the `StreamAlarms` section.
---
DOC: docs/Contracts.md / docs/ClientProtoGeneration.md
LINES: (gap — not documented)
CLAIM: gap — `AlarmFeedMessage` oneof message and the `StreamAlarms` protocol (snapshot → `snapshot_complete` → transitions) are described in `Grpc.md` but not in `Contracts.md` which should be the shape-level reference.
CLAIM_TYPE: rpc/proto
VERDICT: gap
EVIDENCE: `mxaccess_gateway.proto:860-870` defines `AlarmFeedMessage { oneof payload { ActiveAlarmSnapshot active_alarm = 1; bool snapshot_complete = 2; OnAlarmTransitionEvent transition = 3; } }`. `Contracts.md` does not describe this message or its stream protocol.
CODE_AREA: proto.AlarmFeedMessage
SEVERITY: low
PROPOSED_FIX: Add a brief entry in `Contracts.md` describing `AlarmFeedMessage` and the three-phase stream sequence for `StreamAlarms`.
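The three-phase sequence can be illustrated with a small consumer. This is a hedged Python sketch: each message is modelled as a `(kind, payload)` tuple whose kinds mirror the `oneof` cases of `AlarmFeedMessage`, and `consume_alarm_feed` is a hypothetical helper. Rejecting out-of-phase messages is an assumption made for the sketch, not something the audit confirms about the gateway.

```python
def consume_alarm_feed(messages):
    """Split a feed into the initial snapshot and the later transitions."""
    snapshot, transitions = [], []
    snapshot_done = False
    for kind, payload in messages:
        if kind == "snapshot_complete":
            snapshot_done = True  # phase 2: marker between the two phases
        elif kind == "active_alarm" and not snapshot_done:
            snapshot.append(payload)  # phase 1: point-in-time snapshot
        elif kind == "transition" and snapshot_done:
            transitions.append(payload)  # phase 3: live transitions
        else:
            # Assumed strictness: anything out of phase is treated as an error.
            raise ValueError(f"unexpected {kind!r} for the current phase")
    return snapshot, transitions


feed = [
    ("active_alarm", "Tank1.HiHi"),
    ("active_alarm", "Pump2.Fault"),
    ("snapshot_complete", True),
    ("transition", "Tank1.HiHi acknowledged"),
]
snapshot, transitions = consume_alarm_feed(feed)
```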
---
DOC: docs/Contracts.md / docs/Grpc.md
LINES: (gap — not documented)
CLAIM: gap — `AcknowledgeAlarmRequest` has a reserved field 1 (`session_id`) and the acknowledgement is session-less. `AcknowledgeAlarmReply` also has a reserved field 1 and an intentionally-unset `status` field (field 5). This wire-compatibility detail is not captured in `Contracts.md`.
CLAIM_TYPE: rpc/proto
VERDICT: gap
EVIDENCE: `mxaccess_gateway.proto:812-847`: `AcknowledgeAlarmRequest` has `reserved 1; reserved "session_id";`. `AcknowledgeAlarmReply` likewise has `reserved 1; reserved "session_id";` and an inline comment that `status` (field 5) is intentionally unset.
CODE_AREA: proto.AcknowledgeAlarm
SEVERITY: low
PROPOSED_FIX: Add a note in `Contracts.md` about the reserved `session_id` fields and the intentionally-empty `status` field so integrators using older generated code do not misinterpret wire defaults.
+521
@@ -0,0 +1,521 @@
# Cluster 08 — Galaxy Repository
Audited doc: `docs/GalaxyRepository.md`
Verified against: `src/ZB.MOM.WW.MxGateway.Server/Galaxy/**`, `src/ZB.MOM.WW.MxGateway.Contracts/Protos/galaxy_repository.proto`
Date: 2026-06-03
---
DOC: docs/GalaxyRepository.md
LINES: 34
CLAIM: The SQL Server database is named `ZB`.
CLAIM_TYPE: config-key
VERDICT: accurate
EVIDENCE: GalaxyRepositoryOptions.cs:17 (`DefaultConnectionString = "Server=localhost;Database=ZB;..."`)
CODE_AREA: gr.conn
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC: docs/GalaxyRepository.md
LINES: 34
CLAIM: The database is a SQL Server database.
CLAIM_TYPE: config-key
VERDICT: accurate
EVIDENCE: GalaxyRepository.cs:1 (`using Microsoft.Data.SqlClient;`); GalaxyRepositoryOptions.cs:17
CODE_AREA: gr.conn
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC: docs/GalaxyRepository.md
LINES: 30-31
CLAIM: The service is defined in `src/ZB.MOM.WW.MxGateway.Contracts/Protos/galaxy_repository.proto` under package `galaxy_repository.v1`.
CLAIM_TYPE: path
VERDICT: accurate
EVIDENCE: galaxy_repository.proto:3 (`package galaxy_repository.v1;`)
CODE_AREA: gr.proto
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC: docs/GalaxyRepository.md
LINES: 35-39
CLAIM: `TestConnection` returns `{ ok: bool }` after a `SELECT 1`. Does not throw on SQL failure — returns `ok = false`.
CLAIM_TYPE: behavior-rule
VERDICT: accurate
EVIDENCE: GalaxyRepository.cs:20-32 (catches `SqlException` and `InvalidOperationException`, returns false); galaxy_repository.proto:43-45 (`TestConnectionReply { bool ok = 1; }`)
CODE_AREA: gr.proto
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC: docs/GalaxyRepository.md
LINES: 36
CLAIM: `GetLastDeployTime` returns the cached `galaxy.time_of_last_deploy`. Served from the shared hierarchy cache; refreshed in the background.
CLAIM_TYPE: behavior-rule
VERDICT: accurate
EVIDENCE: GalaxyRepositoryGrpcService.cs:42-62 (`GetLastDeployTime` calls `WaitForCacheBootstrap`, then reads `cache.Current` rather than querying `repository` directly). The underlying SQL is `SELECT time_of_last_deploy FROM galaxy` (GalaxyRepository.cs:40), but the RPC is served from the cache, as the doc states. The doc's `galaxy.time_of_last_deploy` names the internal SQL column; that is a wording nit, not an error.
CODE_AREA: gr.sql
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC: docs/GalaxyRepository.md
LINES: 38-39
CLAIM: `WatchDeployEvents` is server-streaming. The server emits the current state immediately on subscribe.
CLAIM_TYPE: behavior-rule
VERDICT: accurate
EVIDENCE: galaxy_repository.proto:33 (`rpc WatchDeployEvents ... returns (stream DeployEvent)`); GalaxyDeployNotifier.cs:58-63 (bootstrap emit on subscribe)
CODE_AREA: gr.proto
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC: docs/GalaxyRepository.md
LINES: 39
CLAIM: `BrowseChildren` returns the direct children of one parent object (or root objects when `parent` is unset). Includes a per-child `has_children` hint so UIs can draw expand triangles without an extra round trip. Served from cache.
CLAIM_TYPE: behavior-rule
VERDICT: accurate
EVIDENCE: galaxy_repository.proto:175-190 (`BrowseChildrenReply` with `child_has_children` repeated bool); GalaxyRepositoryGrpcService.cs:112-168
CODE_AREA: gr.proto
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC: docs/GalaxyRepository.md
LINES: 42-43
CLAIM: The server defaults omitted page size to 1000 objects and caps every page at 5000 objects (for `DiscoverHierarchy`).
CLAIM_TYPE: behavior-rule
VERDICT: accurate
EVIDENCE: GalaxyRepositoryGrpcService.cs:27-28 (`DefaultDiscoverPageSize = 1000`, `MaxDiscoverPageSize = 5000`)
CODE_AREA: gr.proto
SEVERITY: low
PROPOSED_FIX: flag only
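The default-and-cap rule is a simple clamp. The Python sketch below uses the two constants cited in the evidence; `effective_page_size` is a hypothetical helper, and treating a non-positive value as "omitted" is an assumption based on the proto3 zero default rather than something the audit verified.

```python
DEFAULT_DISCOVER_PAGE_SIZE = 1000
MAX_DISCOVER_PAGE_SIZE = 5000


def effective_page_size(requested: int) -> int:
    """Apply the default for an omitted size, then cap at the maximum."""
    if requested <= 0:  # assumed meaning of an omitted page size
        return DEFAULT_DISCOVER_PAGE_SIZE
    return min(requested, MAX_DISCOVER_PAGE_SIZE)


sizes = [effective_page_size(n) for n in (0, 200, 9999)]
```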
---
DOC: docs/GalaxyRepository.md
LINES: 83-86
CLAIM: `BrowseChildren` default page size is 500; the server caps any requested size at 5000. Page tokens encode `(cache_sequence, parent_id, filter_signature, offset)`.
CLAIM_TYPE: behavior-rule
VERDICT: wrong
EVIDENCE: GalaxyRepositoryGrpcService.cs:29 (`DefaultBrowsePageSize = 500`) — default is accurate. Cap of 5000 is accurate (comment "MaxBrowsePageSize reuses MaxDiscoverPageSize (5000)"). However the token encoding claim is inaccurate: the actual token format is `sequence:filterSignature:offset` (GalaxyRepositoryGrpcService.cs:295-302, `FormatPageToken`). `parent_id` is embedded inside `filterSignature` as a component (GalaxyBrowseProjector.cs:266 `builder.Append("parent=").Append(parentId...)`) — it is NOT a separate named field in the token. Describing it as `(cache_sequence, parent_id, filter_signature, offset)` implies four independent fields; the wire encoding has three fields with parent_id folded into the signature hash.
CODE_AREA: gr.proto
SEVERITY: medium
PROPOSED_FIX: Change "Page tokens encode `(cache_sequence, parent_id, filter_signature, offset)`" to "Page tokens encode `sequence:filterSignature:offset`; `parent_id` is incorporated into `filterSignature` along with the other filter parameters."
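The three-field layout described in this finding can be illustrated with a round trip. This is a Python sketch under stated assumptions: the helper names are hypothetical, the sample signature string is invented, and the real `FormatPageToken` may add encoding or hashing that is not shown here.

```python
def format_page_token(sequence: int, filter_signature: str, offset: int) -> str:
    """Join the three fields in the order sequence:filterSignature:offset."""
    return f"{sequence}:{filter_signature}:{offset}"


def parse_page_token(token: str):
    """Split from both ends so the signature may itself contain ':'."""
    sequence, rest = token.split(":", 1)
    filter_signature, offset = rest.rsplit(":", 1)
    return int(sequence), filter_signature, int(offset)


# Hypothetical signature; the parent id lives inside it, not in its own field.
token = format_page_token(7, "parent=42;type=area", 500)
parsed = parse_page_token(token)
```

The point of the sketch is that a client cannot read `parent_id` out of the token as a separate field; it is only recoverable as part of the signature.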
---
DOC: docs/GalaxyRepository.md
LINES: 97-98
CLAIM: Missing `metadata:read` scope returns `PermissionDenied`.
CLAIM_TYPE: behavior-rule
VERDICT: accurate
EVIDENCE: GatewayGrpcScopeResolver.cs:23-27 (all five Galaxy request types map to `GatewayScopes.MetadataRead`); GatewayScopes.cs:11
CODE_AREA: gr.proto
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC: docs/GalaxyRepository.md
LINES: 118-119
CLAIM: `GalaxyHierarchyRefreshService` ticks every `MxGateway:Galaxy:DashboardRefreshIntervalSeconds` seconds (default 30).
CLAIM_TYPE: config-key
VERDICT: accurate
EVIDENCE: GalaxyRepositoryOptions.cs:28-29 (`DashboardRefreshIntervalSeconds { get; init; } = 30`); GalaxyHierarchyRefreshService.cs:18 (`TimeSpan.FromSeconds(Math.Max(1, options.Value.DashboardRefreshIntervalSeconds))`)
CODE_AREA: gr.conn
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC: docs/GalaxyRepository.md
LINES: 120
CLAIM: Each tick queries the cheap `SELECT time_of_last_deploy FROM galaxy` first.
CLAIM_TYPE: behavior-rule
VERDICT: accurate
EVIDENCE: GalaxyRepository.cs:40 (`"SELECT time_of_last_deploy FROM galaxy"`); GalaxyHierarchyCache.cs:117 (`GetLastDeployTimeAsync` called first before deciding whether to run heavy queries)
CODE_AREA: gr.sql
SEVERITY: low
PROPOSED_FIX: flag only
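The cheap-query-first pattern can be sketched as one refresh tick. This Python version is illustrative only: `refresh_tick` and its callable parameters are hypothetical, and comparing the deploy time for equality is an assumed decision rule; the audit confirms only that the deploy-time query runs before the heavy ones.

```python
def refresh_tick(cached_deploy_time, query_last_deploy_time, load_hierarchy):
    """Return (deploy_time, hierarchy); hierarchy is None if the cache is still valid."""
    current = query_last_deploy_time()  # cheap: a one-row SELECT
    if current == cached_deploy_time:
        return cached_deploy_time, None  # skip the heavy queries
    return current, load_hierarchy()  # deploy changed: reload everything


heavy_calls = []


def fake_load():
    # Stand-in for the expensive hierarchy and attribute queries.
    heavy_calls.append("load")
    return ["obj"]


unchanged = refresh_tick("t1", lambda: "t1", fake_load)
changed = refresh_tick("t1", lambda: "t2", fake_load)
```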
---
DOC: docs/GalaxyRepository.md
LINES: 150-152
CLAIM: The snapshot file is written atomically — a temp file plus rename — so a crash mid-write cannot corrupt the snapshot.
CLAIM_TYPE: behavior-rule
VERDICT: accurate
EVIDENCE: GalaxyHierarchySnapshotStore.cs:74-81 (writes to `_path + ".tmp"` then `File.Move(..., overwrite: true)`)
CODE_AREA: gr.conn
SEVERITY: low
PROPOSED_FIX: flag only
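The temp-file-plus-rename technique looks like this in outline. The sketch is Python rather than the store's C#, `write_snapshot_atomically` is a hypothetical name, and `os.replace` stands in for `File.Move(..., overwrite: true)`; the snapshot contents are invented.

```python
import json
import os
import tempfile


def write_snapshot_atomically(path: str, snapshot: dict) -> None:
    """Write to a sibling temp file, then rename it over the destination."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as handle:
        json.dump(snapshot, handle)
    # Readers see either the old file or the new one, never a partial write.
    os.replace(tmp_path, path)


with tempfile.TemporaryDirectory() as demo_dir:
    demo_path = os.path.join(demo_dir, "galaxy-snapshot.json")
    write_snapshot_atomically(demo_path, {"objects": 1})
    write_snapshot_atomically(demo_path, {"objects": 2})  # overwrites in place
    with open(demo_path, encoding="utf-8") as handle:
        restored = json.load(handle)
    leftover_tmp = os.path.exists(demo_path + ".tmp")
```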
---
DOC: docs/GalaxyRepository.md
LINES: 178-179
CLAIM: `GalaxyDeployNotifier` maintains a private bounded channel per subscriber. The bound is 16 events with `DropOldest`.
CLAIM_TYPE: behavior-rule
VERDICT: accurate
EVIDENCE: GalaxyDeployNotifier.cs:18 (`SubscriberQueueCapacity = 16`); GalaxyDeployNotifier.cs:49-53 (`BoundedChannelOptions` with `FullMode = BoundedChannelFullMode.DropOldest`)
CODE_AREA: gr.proto
SEVERITY: low
PROPOSED_FIX: flag only
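Drop-oldest behaviour with a bound of 16 can be demonstrated with a fixed-length deque. This is a Python analogy for the per-subscriber channel, not the notifier's actual code; the integer events are placeholders for deploy events.

```python
from collections import deque

SUBSCRIBER_QUEUE_CAPACITY = 16

# A deque with maxlen discards from the opposite end once it is full,
# which matches drop-oldest semantics for an append-only producer.
subscriber_queue = deque(maxlen=SUBSCRIBER_QUEUE_CAPACITY)
for event_sequence in range(20):
    subscriber_queue.append(event_sequence)

oldest_kept = subscriber_queue[0]
newest_kept = subscriber_queue[-1]
```

A slow subscriber therefore misses the earliest events and still sees the most recent ones.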
---
DOC: docs/GalaxyRepository.md
LINES: 386-387
CLAIM: Default connection string is `Server=localhost;Database=ZB;Integrated Security=True;TrustServerCertificate=True;Encrypt=False;`
CLAIM_TYPE: config-key
VERDICT: accurate
EVIDENCE: GalaxyRepositoryOptions.cs:17 (exact match)
CODE_AREA: gr.conn
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC: docs/GalaxyRepository.md
LINES: 387
CLAIM: `MxGateway:Galaxy:CommandTimeoutSeconds` default is `60`. Applies to all three RPCs.
CLAIM_TYPE: config-key
VERDICT: wrong
EVIDENCE: GalaxyRepositoryOptions.cs:22 (`CommandTimeoutSeconds { get; init; } = 60`) — default is accurate. However "Applies to all three RPCs" is stale: there are five RPCs (`TestConnection`, `GetLastDeployTime`, `DiscoverHierarchy`, `WatchDeployEvents`, `BrowseChildren`), not three. `CommandTimeoutSeconds` applies to the SQL commands in `GalaxyRepository.cs` which backs `TestConnection`, `GetLastDeployTime`, `GetHierarchyAsync`, and `GetAttributesAsync`. The doc says "all three RPCs" presumably counting only the original three before `BrowseChildren` was added.
CODE_AREA: gr.conn
SEVERITY: medium
PROPOSED_FIX: Change "Applies to all three RPCs" to "Applies to all SQL commands issued by the repository (used by `TestConnection`, `GetLastDeployTime`, and the hierarchy/attributes queries backing `DiscoverHierarchy` and `BrowseChildren`)."
---
DOC: docs/GalaxyRepository.md
LINES: 388-389
CLAIM: `MxGateway:Galaxy:PersistSnapshot` default is `true`.
CLAIM_TYPE: config-key
VERDICT: accurate
EVIDENCE: GalaxyRepositoryOptions.cs:40 (`PersistSnapshot { get; init; } = true`)
CODE_AREA: gr.conn
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC: docs/GalaxyRepository.md
LINES: 389-390
CLAIM: `MxGateway:Galaxy:SnapshotCachePath` default is `C:\ProgramData\MxGateway\galaxy-snapshot.json`.
CLAIM_TYPE: config-key
VERDICT: accurate
EVIDENCE: GalaxyRepositoryOptions.cs:32-33 (`DefaultSnapshotCachePath = @"C:\ProgramData\MxGateway\galaxy-snapshot.json"`)
CODE_AREA: gr.conn
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC: docs/GalaxyRepository.md
LINES: 403-404
CLAIM: "All four Galaxy RPCs (including `WatchDeployEvents`) require the `metadata:read` API-key scope."
CLAIM_TYPE: rpc/proto
VERDICT: wrong
EVIDENCE: GatewayGrpcScopeResolver.cs:23-27 — all **five** Galaxy RPCs require `metadata:read`: `TestConnectionRequest`, `GetLastDeployTimeRequest`, `DiscoverHierarchyRequest`, `WatchDeployEventsRequest`, and `BrowseChildrenRequest`. The service has five RPCs (galaxy_repository.proto:21-39), not four. `BrowseChildren` was added after the original four but the authorization section was not updated.
CODE_AREA: gr.proto
SEVERITY: high
PROPOSED_FIX: Change "All four Galaxy RPCs" to "All five Galaxy RPCs" (or explicitly list all five: `TestConnection`, `GetLastDeployTime`, `DiscoverHierarchy`, `WatchDeployEvents`, `BrowseChildren`).
---
DOC: docs/GalaxyRepository.md
LINES: 378
CLAIM: "`GalaxyRepositoryGrpcService` (`src/ZB.MOM.WW.MxGateway.Server/Grpc/GalaxyRepositoryGrpcService.cs`) implements the five RPCs."
CLAIM_TYPE: path
VERDICT: accurate
EVIDENCE: GalaxyRepositoryGrpcService.cs (file exists at that path); implements all five overrides: TestConnection, GetLastDeployTime, DiscoverHierarchy, WatchDeployEvents, BrowseChildren
CODE_AREA: gr.proto
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC: docs/GalaxyRepository.md
LINES: 327
CLAIM: Architecture diagram shows `DiscoverHierarchy, GetLastDeployTime, BrowseChildren -> IGalaxyHierarchyCache.Current` (WatchDeployEvents -> IGalaxyDeployNotifier, TestConnection -> GalaxyRepository direct SQL).
CLAIM_TYPE: behavior-rule
VERDICT: accurate
EVIDENCE: GalaxyRepositoryGrpcService.cs:33-39 (TestConnection → repository.TestConnectionAsync), :42-62 (GetLastDeployTime → cache.Current), :64-110 (DiscoverHierarchy → cache.Current), :112-168 (BrowseChildren → cache.Current), :171-200 (WatchDeployEvents → notifier.SubscribeAsync)
CODE_AREA: gr.proto
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC: docs/GalaxyRepository.md
LINES: 346-350
CLAIM: "`GalaxyRepository` (`src/ZB.MOM.WW.MxGateway.Server/Galaxy/GalaxyRepository.cs`) holds the SQL. Both `HierarchySql` and `AttributesSql` walk template-derivation and package-derivation chains via recursive CTEs."
CLAIM_TYPE: path
VERDICT: accurate
EVIDENCE: GalaxyRepository.cs:117-164 (`HierarchySql` with `template_chain` CTE); GalaxyRepository.cs:176-251 (`AttributesSql` with `deployed_package_chain` CTE)
CODE_AREA: gr.sql
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC: docs/GalaxyRepository.md
LINES: 347-348
CLAIM: "`HierarchySql` still matches the OtOpcUa original; `AttributesSql` does not — it additionally enumerates built-in primitive attributes."
CLAIM_TYPE: behavior-rule
VERDICT: accurate
EVIDENCE: GalaxyRepository.cs:9-14 (doc comment confirming this); GalaxyRepository.cs:166-175 (comment on AttributesSql confirming divergence)
CODE_AREA: gr.sql
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC: docs/GalaxyRepository.md
LINES: 269-270
CLAIM: Configured (dynamic) attributes are stored in the Galaxy `dynamic_attribute` table.
CLAIM_TYPE: term
VERDICT: accurate
EVIDENCE: GalaxyRepository.cs:196 (`INNER JOIN dynamic_attribute da ON da.package_id = dpc.package_id`)
CODE_AREA: gr.sql
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC: docs/GalaxyRepository.md
LINES: 272-273
CLAIM: Built-in attributes are stored in `attribute_definition` and reached through `primitive_instance`.
CLAIM_TYPE: term
VERDICT: accurate
EVIDENCE: GalaxyRepository.cs:214-218 (`INNER JOIN primitive_instance pi ON pi.package_id = dpc.package_id` / `INNER JOIN attribute_definition ad ON ad.primitive_definition_id = pi.primitive_definition_id`)
CODE_AREA: gr.sql
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC: docs/GalaxyRepository.md
LINES: 283-284
CLAIM: The configured-attribute category allow-list is `mx_attribute_category IN (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 24)`.
CLAIM_TYPE: behavior-rule
VERDICT: accurate
EVIDENCE: GalaxyRepository.cs:203 (`AND da.mx_attribute_category IN (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 24)`)
CODE_AREA: gr.sql
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC: docs/GalaxyRepository.md
LINES: 283-285
CLAIM: No category filter applies to built-in rows (`attribute_definition`); only the `_`-prefixed-name and `.Description` exclusions apply.
CLAIM_TYPE: behavior-rule
VERDICT: accurate
EVIDENCE: GalaxyRepository.cs:221-223 (`AND ad.attribute_name NOT LIKE '[_]%'` and `NOT LIKE '%.Description'`; no `mx_attribute_category` filter for the built-in branch)
CODE_AREA: gr.sql
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC: docs/GalaxyRepository.md
LINES: 285-287
CLAIM: "`is_historized` / `is_alarm` are always `false` for built-in rows."
CLAIM_TYPE: behavior-rule
VERDICT: accurate
EVIDENCE: GalaxyRepository.cs:236-248: both `is_historized` and `is_alarm` use `CASE WHEN r.src_pri = 0 AND EXISTS (...)`; built-in rows have `src_pri = 1`, so both expressions evaluate to 0 (false).
CODE_AREA: gr.sql
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC: docs/GalaxyRepository.md
LINES: 288-290
CLAIM: "When a configured attribute and a built-in attribute resolve to the same reference, the configured attribute wins."
CLAIM_TYPE: behavior-rule
VERDICT: accurate
EVIDENCE: GalaxyRepository.cs:225-228: `ROW_NUMBER() OVER (PARTITION BY c.gobject_id, c.attribute_name ORDER BY c.src_pri, c.depth)`; `src_pri = 0` for configured rows, `src_pri = 1` for built-ins, so configured attributes are ranked first.
CODE_AREA: gr.sql
SEVERITY: low
PROPOSED_FIX: flag only
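
The precedence rule verified in this finding can be illustrated outside SQL. The sketch below is a Python analogue, not the gateway's query: per `(gobject_id, attribute_name)` it keeps the row that sorts first by `(src_pri, depth)`. It assumes the query keeps the row numbered 1 (the EVIDENCE quotes only the `ROW_NUMBER()` expression), and the `AttrRow` shape, the `source` label, and the sample values are hypothetical.

```python
from typing import NamedTuple


class AttrRow(NamedTuple):
    gobject_id: int
    attribute_name: str
    src_pri: int  # 0 = configured row, 1 = built-in row (per the EVIDENCE)
    depth: int    # position in the derivation chain; sample values are made up
    source: str   # illustrative label only, not a real column


def pick_winners(rows):
    """Keep the first row per (gobject_id, attribute_name) ordered by (src_pri, depth)."""
    winners = {}
    for row in sorted(rows, key=lambda r: (r.src_pri, r.depth)):
        # setdefault keeps the earliest-sorted row and ignores later duplicates.
        winners.setdefault((row.gobject_id, row.attribute_name), row)
    return winners


rows = [
    AttrRow(7, "PV", 1, 0, "built-in"),
    AttrRow(7, "PV", 0, 2, "configured"),
    AttrRow(7, "ScanState", 1, 0, "built-in"),
]
winners = pick_winners(rows)
```

Because `src_pri` is the leading sort key, a configured row outranks a built-in row with the same name even when the built-in row has a smaller `depth`.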
---
DOC: docs/GalaxyRepository.md
LINES: 420-422
CLAIM: Dashboard `/dashboard/galaxy` page with object-category and top-template breakdowns.
CLAIM_TYPE: path
VERDICT: wrong
EVIDENCE: GalaxyPage.razor:1 — the page route is `@page "/galaxy"`, not `/dashboard/galaxy`. The Blazor app is mounted without a `/dashboard` prefix (DashboardEndpointRouteBuilderExtensions.cs:86, `MapRazorComponents<App>()`). The Galaxy page is at `/galaxy`, not `/dashboard/galaxy`. The home page at `@page "/"` is the dashboard overview, not at `/dashboard`.
CODE_AREA: gr.proto
SEVERITY: high
PROPOSED_FIX: Change `/dashboard/galaxy` to `/galaxy` and `/dashboard` to `/` throughout the Dashboard Surface section (lines 419-421). The Blazor router has no `/dashboard` prefix.
---
DOC: docs/GalaxyRepository.md
LINES: 419-420
CLAIM: "An overview card on `/dashboard` showing connectivity status..."
CLAIM_TYPE: path
VERDICT: wrong
EVIDENCE: DashboardHome.razor:1 — `@page "/"`. The home/overview page is at `/`, not `/dashboard`.
CODE_AREA: gr.proto
SEVERITY: high
PROPOSED_FIX: Change `/dashboard` to `/` in the Dashboard Surface section.
---
DOC: docs/GalaxyRepository.md
LINES: 369-375
CLAIM: "`GalaxyBrowseProjector` (`src/ZB.MOM.WW.MxGateway.Server/Galaxy/GalaxyBrowseProjector.cs`) projects one level of children out of an immutable cache entry. Memoizes the filtered child list per cache-entry instance so repeated paging is an O(pageSize) slice rather than an O(siblings) filter scan. The memo is keyed on the cache entry reference, so a new entry from the background refresh makes the stale memo unreachable and it is collected with it. `DashboardBrowseService` wraps this projector to drive the dashboard's lazy-expand tree."
CLAIM_TYPE: behavior-rule
VERDICT: accurate
EVIDENCE: GalaxyBrowseProjector.cs:20-22 (ConditionalWeakTable keyed on `GalaxyHierarchyCacheEntry`); DashboardBrowseService.cs:55 (`GalaxyBrowseProjector.ProjectChildren` called inside `DashboardBrowseService`)
CODE_AREA: gr.proto
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC: docs/GalaxyRepository.md
LINES: 110-111
CLAIM: "`IGalaxyHierarchyCache` (`src/ZB.MOM.WW.MxGateway.Server/Galaxy/GalaxyHierarchyCache.cs`) — every `DiscoverHierarchy` and `GetLastDeployTime` request reads from this cache."
CLAIM_TYPE: path
VERDICT: accurate
EVIDENCE: GalaxyHierarchyCache.cs (file at that path); GalaxyRepositoryGrpcService.cs:46-62 (GetLastDeployTime reads from cache); :69-110 (DiscoverHierarchy reads from cache)
CODE_AREA: gr.proto
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC: docs/GalaxyRepository.md
LINES: 445-447
CLAIM: "Integration tests live in `src/ZB.MOM.WW.MxGateway.IntegrationTests/Galaxy/GalaxyRepositoryLiveTests.cs`. Set `MXGATEWAY_RUN_LIVE_GALAXY_TESTS=1` (and optionally `MXGATEWAY_LIVE_GALAXY_CONN`) to run them."
CLAIM_TYPE: path
VERDICT: accurate
EVIDENCE: GalaxyRepositoryLiveTests.cs (file exists at that path); LiveGalaxyRepositoryFactAttribute.cs:9 (`EnableVariableName = "MXGATEWAY_RUN_LIVE_GALAXY_TESTS"`); LiveGalaxyRepositoryFactAttribute.cs:11 (`ConnectionStringVariableName = "MXGATEWAY_LIVE_GALAXY_CONN"`)
CODE_AREA: gr.proto
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC: docs/GalaxyRepository.md
LINES: 365-367
CLAIM: "`GalaxyProtoMapper` (`src/ZB.MOM.WW.MxGateway.Server/Grpc/GalaxyProtoMapper.cs`) converts row models to proto messages. Used by the cache during refresh to materialize the reply once."
CLAIM_TYPE: path
VERDICT: accurate
EVIDENCE: GalaxyProtoMapper.cs (file at that path); GalaxyHierarchyCache.cs:223 (`BuildObjects` → `GalaxyProtoMapper.MapObject`)
CODE_AREA: gr.proto
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC: docs/GalaxyRepository.md
LINES: 212-261
CLAIM: The `GalaxyObject`, `GalaxyAttribute`, `DiscoverHierarchyRequest`, and `DiscoverHierarchyReply` message field numbers and types as shown in the "Reply shape" proto block (field numbers 1-12 for `GalaxyAttribute`, 1-12 for `DiscoverHierarchyRequest`, etc.).
CLAIM_TYPE: rpc/proto
VERDICT: accurate
EVIDENCE: galaxy_repository.proto:110-191 (all field numbers and types match the doc's code block)
CODE_AREA: gr.proto
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC: docs/GalaxyRepository.md
LINES: 399-400
CLAIM: Dashboard "displays only non-secret fields: server, database, integrated security, encrypt, and trust-server-certificate. It never displays user id, password, access token, or arbitrary unparsed connection string text."
CLAIM_TYPE: behavior-rule
VERDICT: unverifiable
EVIDENCE: GalaxyPage.razor:129 (`DashboardDisplay.Text(GalaxyConnectionStringDisplay())`); GalaxyPage.razor:193-196 delegates to `DashboardConnectionStringDisplay.GalaxyRepositoryConnectionString`. The actual display logic lives in `DashboardConnectionStringDisplay`, which was not found in this audit scope. The behavior is asserted plausibly consistent with "never display user id, password" but the implementation of `DashboardConnectionStringDisplay` was not directly verified.
CODE_AREA: gr.conn
SEVERITY: low
PROPOSED_FIX: flag only — verify `DashboardConnectionStringDisplay` filters fields as claimed.
---
DOC: docs/GalaxyRepository.md
LINES: N/A — not covered in doc
CLAIM: GAP — `GalaxyHierarchyCache` projects `Status` to `Stale` when `LastSuccessAt` is more than 5 minutes old (regardless of the stored status), via `ProjectStatus` with `StaleThreshold = TimeSpan.FromMinutes(5)`.
CLAIM_TYPE: behavior-rule
VERDICT: gap
EVIDENCE: GalaxyHierarchyCache.cs:22 (`StaleThreshold = TimeSpan.FromMinutes(5)`); GalaxyHierarchyCache.cs:474-488 (`ProjectStatus` method)
CODE_AREA: gr.proto
SEVERITY: medium
PROPOSED_FIX: Add a note under "Hierarchy Cache" that the cache also auto-degrades to `Stale` status when more than 5 minutes have elapsed since the last successful refresh, independent of the stored entry status. This matters for operators diagnosing why a `Healthy` entry flips to `Stale` without a SQL failure.
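
The stale projection in this gap can be pictured with a short sketch. The Python below is an analogue of the rule as stated in the CLAIM, not a transcription of `ProjectStatus`: the function name, the string statuses, and the handling of a missing `last_success_at` are assumptions.

```python
from datetime import datetime, timedelta, timezone

STALE_THRESHOLD = timedelta(minutes=5)  # mirrors StaleThreshold from the EVIDENCE


def project_status(stored_status, last_success_at, now):
    """Report "Stale" when the last success is more than five minutes old."""
    if last_success_at is not None and now - last_success_at > STALE_THRESHOLD:
        return "Stale"
    # Assumption: with no recorded success, or a recent one, the stored status is kept.
    return stored_status


now = datetime(2026, 1, 1, 12, 0, tzinfo=timezone.utc)
aged = project_status("Healthy", now - timedelta(minutes=6), now)
fresh = project_status("Healthy", now - timedelta(minutes=4), now)
```

Under this reading, an entry stored as `Healthy` is reported as `Stale` purely because time has passed, with no SQL failure involved.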
---
DOC: docs/GalaxyRepository.md
LINES: N/A — not covered in doc
CLAIM: GAP — `WatchDeployEvents` emits a bootstrap event even on a snapshot-restore (from on-disk data), not only from live SQL queries. `GalaxyHierarchyCache.TryRestoreFromDiskAsync` calls `_notifier.Publish` after restoring.
CLAIM_TYPE: behavior-rule
VERDICT: gap
EVIDENCE: GalaxyHierarchyCache.cs:315-320 (`_notifier.Publish` called from `TryRestoreFromDiskAsync`)
CODE_AREA: gr.proto
SEVERITY: low
PROPOSED_FIX: Note under "Deploy Notifications" or "On-disk snapshot" that restoring the snapshot also publishes a deploy event so `WatchDeployEvents` subscribers receive a bootstrap event even when SQL is unreachable at startup.
---
DOC: docs/GalaxyRepository.md
LINES: N/A — not covered in doc
CLAIM: GAP — `GalaxyHierarchyRefreshService` runs an initial `RefreshAsync` immediately on startup (before starting the periodic timer), so the first load happens at process start, not after the first tick of `DashboardRefreshIntervalSeconds`.
CLAIM_TYPE: behavior-rule
VERDICT: gap
EVIDENCE: GalaxyHierarchyRefreshService.cs:22-37 (initial `await cache.RefreshAsync` before `PeriodicTimer` is created)
CODE_AREA: gr.proto
SEVERITY: low
PROPOSED_FIX: Add a note under "Hierarchy Cache" that the first refresh runs immediately at gateway startup and does not wait for the first timer tick.
---
DOC: docs/GalaxyRepository.md
LINES: N/A — not covered in doc
CLAIM: GAP — The `HierarchySql` category filter (`td.category_id IN (1, 3, 4, 10, 11, 13, 17, 24, 26)`) and the specific category IDs mapped to names (WinPlatform=1, AppEngine=3, InTouchViewApp=4, UserDefined=10, FieldReference=11, Area=13, DIObject=17, DDESuiteLinkClient=24, OPCClient=26) are not documented anywhere in `GalaxyRepository.md`.
CLAIM_TYPE: behavior-rule
VERDICT: gap
EVIDENCE: GalaxyRepository.cs:161 (HierarchySql WHERE clause with category IDs); GalaxyHierarchyCache.cs:461-472 (`ResolveCategoryName` method mapping each ID to a name)
CODE_AREA: gr.sql
SEVERITY: medium
PROPOSED_FIX: Add a table of the filtered category IDs and their names (WinPlatform, AppEngine, InTouchViewApp, etc.) to the doc. Operators need to know which object types are included — an AppEngine that doesn't appear in browse results is hard to diagnose without this list.
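
The category filter and name mapping from this gap can be written down as a lookup table. The Python below copies the IDs and names from the CLAIM above; the helper names are hypothetical, and returning `None` for an unlisted ID is an assumption, since the finding does not say how `ResolveCategoryName` treats unknown IDs.

```python
# IDs and names as listed in the CLAIM for this finding.
CATEGORY_NAMES = {
    1: "WinPlatform",
    3: "AppEngine",
    4: "InTouchViewApp",
    10: "UserDefined",
    11: "FieldReference",
    13: "Area",
    17: "DIObject",
    24: "DDESuiteLinkClient",
    26: "OPCClient",
}


def is_included_category(category_id):
    """True when the hierarchy query's category filter would keep this ID."""
    return category_id in CATEGORY_NAMES


def resolve_category_name(category_id):
    """Map a category ID to its name; None for unlisted IDs (assumed fallback)."""
    return CATEGORY_NAMES.get(category_id)
```

An object whose template category is not in this table would not appear in browse results, which is the diagnostic point the finding raises.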
---
DOC: docs/GalaxyRepository.md
LINES: N/A — not covered in doc
CLAIM: GAP — The `AttributesSql` uses the `data_type` table to resolve `data_type_name` (`LEFT JOIN data_type dt ON dt.mx_data_type = r.mx_data_type`). The Galaxy table name `data_type` is not mentioned in the doc.
CLAIM_TYPE: term
VERDICT: gap
EVIDENCE: GalaxyRepository.cs:249 (`LEFT JOIN data_type dt ON dt.mx_data_type = r.mx_data_type`)
CODE_AREA: gr.sql
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC: docs/GalaxyRepository.md
LINES: N/A — not covered in doc
CLAIM: GAP — The `HierarchySql` uses the tables `gobject` and `template_definition`, and maps `parent_gobject_id` using `CASE WHEN g.contained_by_gobject_id = 0 THEN g.area_gobject_id ELSE g.contained_by_gobject_id END`. This parent resolution logic (area_gobject_id fallback) is not mentioned.
CLAIM_TYPE: behavior-rule
VERDICT: gap
EVIDENCE: GalaxyRepository.cs:138-142 (parent_gobject_id CASE expression); tables referenced: `gobject` (line 158), `template_definition` (line 159)
CODE_AREA: gr.sql
SEVERITY: low
PROPOSED_FIX: flag only
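
The parent-resolution rule quoted in this gap is small enough to restate as a function. The Python below mirrors the `CASE` expression from the CLAIM; the function name and the sample IDs are hypothetical.

```python
def resolve_parent_gobject_id(contained_by_gobject_id, area_gobject_id):
    """Fall back to the area when the object is not contained by another object."""
    if contained_by_gobject_id == 0:
        return area_gobject_id
    return contained_by_gobject_id


# An uncontained object hangs under its area; a contained one under its container.
under_area = resolve_parent_gobject_id(0, 42)
under_container = resolve_parent_gobject_id(17, 42)
```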
+404
@@ -0,0 +1,404 @@
# Cluster 09 — Alarms
Audited doc: `docs/AlarmClientDiscovery.md`
Code base verified against:
- `src/ZB.MOM.WW.MxGateway.Worker/MxAccess/WnWrapAlarmConsumer.cs`
- `src/ZB.MOM.WW.MxGateway.Worker/MxAccess/AlarmDispatcher.cs`
- `src/ZB.MOM.WW.MxGateway.Worker/MxAccess/AlarmCommandHandler.cs`
- `src/ZB.MOM.WW.MxGateway.Worker/MxAccess/AlarmRecordTransitionMapper.cs`
- `src/ZB.MOM.WW.MxGateway.Worker/MxAccess/MxAccessAlarmEventSink.cs`
- `src/ZB.MOM.WW.MxGateway.Worker/MxAccess/MxAlarmStateKind.cs`
- `src/ZB.MOM.WW.MxGateway.Worker/MxAccess/MxAlarmTransitionEvent.cs`
- `src/ZB.MOM.WW.MxGateway.Worker/MxAccess/IMxAccessAlarmConsumer.cs`
- `src/ZB.MOM.WW.MxGateway.Worker/MxAccess/IAlarmCommandHandler.cs`
- `src/ZB.MOM.WW.MxGateway.Worker/MxAccess/MxAccessCommandExecutor.cs` (alarm arms)
- `src/ZB.MOM.WW.MxGateway.Server/Alarms/GatewayAlarmMonitor.cs`
- `src/ZB.MOM.WW.MxGateway.Server/Alarms/IGatewayAlarmService.cs`
- `src/ZB.MOM.WW.MxGateway.Server/Alarms/AlarmsServiceCollectionExtensions.cs`
- `src/ZB.MOM.WW.MxGateway.Server/Configuration/AlarmsOptions.cs`
- `src/ZB.MOM.WW.MxGateway.Contracts/Protos/mxaccess_gateway.proto`
---
DOC / LINES / 71-74 (comment about `AlarmClientConsumer.cs`)
CLAIM / The file `src/ZB.MOM.WW.MxGateway.Worker/MxAccess/AlarmClientConsumer.cs` exists in the repo.
CLAIM_TYPE / path
VERDICT / wrong
EVIDENCE / `find /Users/dohertj2/Desktop/mxaccessgateway/src -name "AlarmClientConsumer*"` returns nothing. The file no longer exists; `WnWrapAlarmConsumer.cs` comments confirm it was replaced: `WnWrapAlarmConsumer.cs:18-19`.
CODE_AREA / alarm.subscribe
SEVERITY / medium
PROPOSED_FIX / Update references to the obsolete `AlarmClientConsumer.cs` throughout the doc to note that the file was retired and replaced by `WnWrapAlarmConsumer.cs`.
---
DOC / LINES / 71-74
CLAIM / The architecture comment on `AlarmClientConsumer.cs` (PR A.5) describing `IAlarmMgrDataProvider` managed events is wrong against the deployed assembly — there is no managed event surface.
CLAIM_TYPE / behavior-rule
VERDICT / stale
EVIDENCE / The source file it critiques (`AlarmClientConsumer.cs`) no longer exists in the repo. The critique is historically accurate but refers to a file that was removed during the wnwrap migration. No live code contains `IAlarmMgrDataProvider`. `WnWrapAlarmConsumer.cs:1-575`.
CODE_AREA / alarm.subscribe
SEVERITY / low
PROPOSED_FIX / Note that the critique is a historical record; the offending file has been removed. The section remains valid as probe context but should clarify the current state.
---
DOC / LINES / 87-88
CLAIM / `AlarmClientConsumer.AlarmRecordReceived` has no production callers; `RaiseAlarmRecordReceived` is `internal` for tests and never invoked at runtime.
CLAIM_TYPE / behavior-rule
VERDICT / stale
EVIDENCE / Neither `AlarmRecordReceived` nor `RaiseAlarmRecordReceived` appear anywhere in the current source tree (`grep -rn "AlarmRecordReceived\|RaiseAlarmRecordReceived" src` — zero results outside tests or binaries). The entire `AlarmClientConsumer` class was removed; the observation is a dead historical probe note.
CODE_AREA / alarm.subscribe
SEVERITY / low
PROPOSED_FIX / Flag as historical only; the code path no longer exists.
---
DOC / LINES / 492
CLAIM / "PR A.5's `Subscribe` / `AcknowledgeByGuid` / `SnapshotActiveAlarms` are correct — they're pull-style and don't depend on the notification mechanism."
CLAIM_TYPE / behavior-rule
VERDICT / stale
EVIDENCE / The method names in the current interface are `IMxAccessAlarmConsumer.AcknowledgeByGuid` and `SnapshotActiveAlarms` (`IMxAccessAlarmConsumer.cs:64,104`), so the names are accurate. However, this statement refers to PR A.5's `AlarmClientConsumer`, which no longer exists. The claim implicitly endorses `AlarmClientConsumer` code that has been replaced by `WnWrapAlarmConsumer`. The successor also exposes `AcknowledgeByGuid` but routes it through `AlarmAckByGUID` on `wwAlarmConsumerClass`.
CODE_AREA / alarm.ack
SEVERITY / low
PROPOSED_FIX / Note that PR A.5 was superseded; the current production path is `WnWrapAlarmConsumer`.
---
DOC / LINES / 604-605
CLAIM / After an alarm return-to-normal (`UNACK_RTN`), `wwAlarmConsumerClass.AlarmAckByGUID` is "the method to call" for acknowledgement.
CLAIM_TYPE / behavior-rule
VERDICT / wrong
EVIDENCE / The doc itself contradicts this eleven sections later ("Section 4. `AlarmAckByGUID` is not implemented", lines 750-756): `AlarmAckByGUID(VBGUID, …)` throws `NotImplementedException` (COM `E_NOTIMPL`) on `wwAlarmConsumerClass`. The doc at line 604 presents it as the correct ack method before the discovery in the live-smoke section, creating a contradiction within the document that integrators reading top-to-bottom will encounter.
CODE_AREA / alarm.ack
SEVERITY / high
PROPOSED_FIX / Add a forward-reference warning at line 604 ("Note: see 'Live smoke-test discoveries — section 4' below; AlarmAckByGUID is E_NOTIMPL on wnwrap and must not be called directly; use AlarmAckByName via the ack-only consumer.") or reorder the section.
---
DOC / LINES / 750-756
CLAIM / `AlarmAckByGUID(VBGUID, …)` throws `NotImplementedException` (`E_NOTIMPL`) on `wwAlarmConsumerClass`, so all acks must go through `AlarmAckByName`.
CLAIM_TYPE / behavior-rule
VERDICT / accurate
EVIDENCE / `WnWrapAlarmConsumer.cs:215-239` provides `AcknowledgeByGuid` which calls `com.AlarmAckByGUID` directly (the COM interop). The method is present in the consumer and called from `AlarmCommandHandler.Acknowledge` (`AlarmCommandHandler.cs:141-158`) and `AlarmDispatcher.Acknowledge` (`AlarmDispatcher.cs:87-103`). The code path is plumbed through and compiles. Whether it still throws `E_NOTIMPL` at runtime on the deployed AVEVA build is a runtime-only observable — the doc's claim was empirically confirmed 2026-05-01.
CODE_AREA / alarm.ack
SEVERITY / medium
PROPOSED_FIX / Flag: the code now calls `AlarmAckByGUID` without a try/catch for `E_NOTIMPL`; document that the GUID path will surface a `COMException` at runtime on affected AVEVA builds and that the gateway routes canonical `Provider!Group.Tag` references through `AcknowledgeAlarmByName` to avoid this.
---
DOC / LINES / 758-762
CLAIM / "The proto `AcknowledgeAlarmCommand` (GUID-based) and `MxAccessCommandExecutor.ExecuteAcknowledgeAlarm` switch arm remain in the codebase for forward-compat, but the gateway-side `WorkerAlarmRpcDispatcher.AcknowledgeAsync` now always routes through `AcknowledgeAlarmByName` when the public RPC supplies a recognizable `Provider!Group.Tag` reference."
CLAIM_TYPE / cross-ref
VERDICT / wrong
EVIDENCE / (a) `WorkerAlarmRpcDispatcher` does not exist in the source tree. The class that routes acknowledge requests is `GatewayAlarmMonitor.AcknowledgeAsync` + `BuildAcknowledgeCommand` (`GatewayAlarmMonitor.cs:437,516`). (b) The gateway does NOT always route through `AcknowledgeAlarmByName`: `BuildAcknowledgeCommand` first tries `Guid.TryParse`; if the `alarm_full_reference` is a canonical GUID it still dispatches `MxCommandKind.AcknowledgeAlarm` (the GUID path) (`GatewayAlarmMonitor.cs:528-543`). Only when the reference is not a GUID does it fall through to `AcknowledgeAlarmByName` (`GatewayAlarmMonitor.cs:545-563`).
CODE_AREA / alarm.ack
SEVERITY / high
PROPOSED_FIX / (1) Replace `WorkerAlarmRpcDispatcher` with the actual class name `GatewayAlarmMonitor`. (2) Correct the routing description: GUID-shaped references still go through `AcknowledgeAlarmCommand` (GUID path); `Provider!Group.Tag` references go through `AcknowledgeAlarmByNameCommand`. The claim that it "always routes through `AcknowledgeAlarmByName`" is false.
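
The conditional routing described in the EVIDENCE above can be sketched as follows. This is a Python analogue, not the gateway code: `uuid.UUID` stands in for `Guid.TryParse` and the two do not accept exactly the same formats. The function name and sample references are hypothetical, and any further validation of the `Provider!Group.Tag` shape is omitted.

```python
import uuid


def ack_command_kind(alarm_full_reference):
    """Pick the command kind: GUID-shaped references take the GUID path."""
    try:
        uuid.UUID(alarm_full_reference)
    except ValueError:
        # Not GUID-shaped, so the reference is treated as a by-name reference.
        return "AcknowledgeAlarmByName"
    return "AcknowledgeAlarm"


guid_kind = ack_command_kind("3f2504e0-4f89-11d3-9a0c-0305e82c3301")
name_kind = ack_command_kind("Galaxy!Area1.Tank1.HiHi")
```

The GUID branch is tried first, which is why "always routes through `AcknowledgeAlarmByName`" does not hold for GUID-shaped input.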
---
DOC / LINES / 636-639 (A.2 outline step 2)
CLAIM / Production `WnWrapAlarmConsumer` polls `GetXmlCurrentAlarms2(maxAlmCnt, out xml)` on a timer (500ms-1s cadence).
CLAIM_TYPE / behavior-rule
VERDICT / wrong
EVIDENCE / `WnWrapAlarmConsumer.cs:38-43` explicitly states "the consumer owns no internal timer." `PollOnce()` is driven externally by `StaRuntime.InvokeAsync` (`WnWrapAlarmConsumer.cs:39`, `AlarmCommandHandler.cs:29-33`). The 500ms-1s timer cadence mentioned in the doc was a design proposal; the implementation delegates all poll scheduling to the caller (STA). The doc's step 2 reads as if the consumer self-schedules.
CODE_AREA / alarm.subscribe
SEVERITY / medium
PROPOSED_FIX / Correct to: "Poll `GetXmlCurrentAlarms2` via `PollOnce()` called externally by the worker's STA through `StaRuntime.InvokeAsync`; the consumer owns no internal timer."
---
DOC / LINES / 641-643 (A.2 outline step 2)
CLAIM / "`AlarmAckByGUID(VBGUID, comment, oprName, node, domain, fullName)` for client-driven acknowledgements (matches PR A.5's `AlarmAckCommand` payload)."
CLAIM_TYPE / rpc/proto
VERDICT / wrong
EVIDENCE / The proto message is named `AcknowledgeAlarmCommand` (not `AlarmAckCommand`): `mxaccess_gateway.proto:337`. The consumer also exposes `AcknowledgeByGuid` (not `AlarmAckByGUID`) as its interface method (`IMxAccessAlarmConsumer.cs:64`). The doc uses the COM method name where it should use the proto/interface name, and uses the wrong proto message name.
CODE_AREA / alarm.ack
SEVERITY / medium
PROPOSED_FIX / Replace "PR A.5's `AlarmAckCommand` payload" with "the proto's `AcknowledgeAlarmCommand` message" (`mxaccess_gateway.proto:337`).
---
DOC / LINES / 644-647 (A.2 outline step 3)
CLAIM / STATE mapping: `UNACK_ALM` → `in_alarm=true, acked=false`; `UNACK_RTN` → `in_alarm=false, acked=false`; `ACK_ALM` → `in_alarm=true, acked=true`; `ACK_RTN` → `in_alarm=false, acked=true`.
CLAIM_TYPE / term
VERDICT / wrong
EVIDENCE / The production proto uses `AlarmConditionState` (Active / ActiveAcked / Inactive), not boolean `in_alarm`/`acked` fields. `AlarmDispatcher.MapConditionState` (`AlarmDispatcher.cs:221-234`): `UnackAlm→Active`, `AckAlm→ActiveAcked`, `UnackRtn→Inactive`, `AckRtn→Inactive`. Both Rtn states collapse to `Inactive` — the `acked` distinction on a cleared alarm is not surfaced. The doc's proposed boolean decomposition was a design proposal that was not adopted; the final proto shape uses the enum.
CODE_AREA / alarm.state
SEVERITY / high
PROPOSED_FIX / Replace the boolean mapping table with the actual `AlarmConditionState` enum mapping used in `AlarmDispatcher.MapConditionState`. Document that `UnackRtn` and `AckRtn` both map to `Inactive` (ack-vs-unack on a cleared alarm is not exposed through the proto).
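
The enum mapping that the EVIDENCE attributes to `AlarmDispatcher.MapConditionState` can be laid out as a table. The Python below is a restatement for illustration only; the enum and dictionary are stand-ins for the C# and proto types, with state names taken from the EVIDENCE.

```python
from enum import Enum


class AlarmConditionState(Enum):
    ACTIVE = "Active"
    ACTIVE_ACKED = "ActiveAcked"
    INACTIVE = "Inactive"


# Source state -> proto condition state, as listed in the EVIDENCE.
CONDITION_STATE_BY_KIND = {
    "UnackAlm": AlarmConditionState.ACTIVE,
    "AckAlm": AlarmConditionState.ACTIVE_ACKED,
    "UnackRtn": AlarmConditionState.INACTIVE,
    "AckRtn": AlarmConditionState.INACTIVE,
}

# Both return-to-normal states collapse to the same value, so the
# acked/unacked distinction on a cleared alarm is lost in this mapping.
rtn_states_collapse = (
    CONDITION_STATE_BY_KIND["UnackRtn"] is CONDITION_STATE_BY_KIND["AckRtn"]
)
```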
---
DOC / LINES / 648-649 (A.2 outline step 3)
CLAIM / "`GUID` → `condition_id` (canonicalize the no-dashes hex to a UUID string)."
CLAIM_TYPE / term
VERDICT / wrong
EVIDENCE / The production code stores the GUID as `MxAlarmSnapshotRecord.AlarmGuid` (a `System.Guid`) and the proto carries it inside `OnAlarmTransitionEvent` only implicitly (there is no `condition_id` field in the proto). The `alarm_full_reference` field is used as the stable identifier for condition correlation, not a `condition_id`. `mxaccess_gateway.proto:720-723`, `OnAlarmTransitionEvent.alarm_full_reference`. The field name `condition_id` does not exist in the proto.
CODE_AREA / alarm.state
SEVERITY / medium
PROPOSED_FIX / Replace `condition_id` with the actual stable identifier: `alarm_full_reference` (`OnAlarmTransitionEvent.alarm_full_reference`). The GUID is used internally by `WnWrapAlarmConsumer` as a snapshot key but is not exposed as a proto field.
---
DOC / LINES / 651-654 (A.2 outline step 3 — timestamp)
CLAIM / "`DATE + TIME + GMTOFFSET + DSTADJUST` → reassemble UTC timestamp; matches the worker's existing `Timestamp` wire format."
CLAIM_TYPE / behavior-rule
VERDICT / accurate
EVIDENCE / `AlarmRecordTransitionMapper.ParseTransitionTimestampUtc` (`AlarmRecordTransitionMapper.cs:116-188`) parses all four fields and computes UTC. The proto uses `google.protobuf.Timestamp` (`mxaccess_gateway.proto:747`). Wire-up matches.
CODE_AREA / alarm.state
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / LINES / 656-657 (A.2 outline step 3)
CLAIM / "`PRIORITY` → severity (already 1-1000-ish range)."
CLAIM_TYPE / behavior-rule
VERDICT / accurate
EVIDENCE / `WnWrapAlarmConsumer.ParseSnapshotXml` reads `PRIORITY` as `int` (`WnWrapAlarmConsumer.cs:433`), stored as `MxAlarmSnapshotRecord.Priority`. `AlarmDispatcher.OnTransition` passes it as `severity: record.Priority` (`AlarmDispatcher.cs:187`). `OnAlarmTransitionEvent.severity` is `int32` in the proto (`mxaccess_gateway.proto`). The 1-1000 range is consistent with AVEVA's alarm priority range.
CODE_AREA / alarm.state
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / LINES / 658-659 (A.2 outline step 3)
CLAIM / "`TAGNAME` → reference; `PROVIDER_NAME` + `GROUP` for scope metadata."
CLAIM_TYPE / behavior-rule
VERDICT / accurate
EVIDENCE / `AlarmDispatcher.OnTransition` calls `AlarmRecordTransitionMapper.ComposeFullReference(record.ProviderName, record.Group, record.TagName)` and passes the result as `alarmFullReference` (`AlarmDispatcher.cs:180-183`). `ComposeFullReference` formats it as `Provider!Group.TagName` (`AlarmRecordTransitionMapper.cs:90-102`). `TAGNAME` alone is passed as `sourceObjectReference` (`AlarmDispatcher.cs:184`).
CODE_AREA / alarm.state
SEVERITY / low
PROPOSED_FIX / flag only
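
The reference format cited in the EVIDENCE (`Provider!Group.TagName`) can be shown with a one-line formatter. The Python below is a sketch of the output shape only; the function name mirrors `ComposeFullReference`, but any handling of empty or missing parts in the real method is not covered, and the sample values are hypothetical.

```python
def compose_full_reference(provider, group, tag_name):
    """Format an alarm reference as Provider!Group.TagName."""
    return f"{provider}!{group}.{tag_name}"


# Hypothetical sample values, for illustration only.
full_reference = compose_full_reference("Galaxy", "Area1", "Tank1.HiHi")
```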
---
DOC / LINES / 672-676 (A.2 outline step 5)
CLAIM / "PR A.5's snapshot/ack contract tests can stay — they don't touch the underlying COM API."
CLAIM_TYPE / cross-ref
VERDICT / stale
EVIDENCE / PR A.5's `AlarmClientConsumer` was retired; there is no class by that name. The test files for alarm command handling now cover `AlarmCommandHandler`, `AlarmDispatcher`, and `WnWrapAlarmConsumerXmlTests` — none named as "PR A.5 tests." The statement implies a test corpus that doesn't exist under the described label.
CODE_AREA / alarm.subscribe
SEVERITY / low
PROPOSED_FIX / Remove or update the PR label; reference actual test files: `AlarmCommandHandlerTests.cs`, `AlarmDispatcherTests.cs`, `WnWrapAlarmConsumerXmlTests.cs`.
---
DOC / LINES / 673-675 (settled API ordering section)
CLAIM / "`InitializeConsumer` first, then `RegisterConsumer` — both on `aaAlarmManagedClient.AlarmClient` and `wwAlarmConsumerClass`."
CLAIM_TYPE / behavior-rule
VERDICT / accurate
EVIDENCE / `WnWrapAlarmConsumer.Subscribe` calls `IwwAlarmConsumer_InitializeConsumer` before `IwwAlarmConsumer_RegisterConsumer` (`WnWrapAlarmConsumer.cs:117-137`). Same ordering for `ackClient` (`WnWrapAlarmConsumer.cs:188-208`).
CODE_AREA / alarm.subscribe
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / LINES / 676-682 (settled API section)
CLAIM / "`aaAlarmManagedClient.AlarmClient.RegisterConsumer` is 5-arg (includes `bRetainHiddenAlarms`); `wwAlarmConsumerClass.RegisterConsumer` is 4-arg (no `bRetainHiddenAlarms`)."
CLAIM_TYPE / behavior-rule
VERDICT / accurate
EVIDENCE / `WnWrapAlarmConsumer.Subscribe` calls `IwwAlarmConsumer_RegisterConsumer` with 4 args: `hWnd, szProductName, szApplicationName, szVersion` (`WnWrapAlarmConsumer.cs:128-132`). Consistent with the doc.
CODE_AREA / alarm.subscribe
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / LINES / 683-685 (settled API section)
CLAIM / "Subscription expression format: `\\<machine>\Galaxy!<area>` (literal `Galaxy` provider) for both libraries."
CLAIM_TYPE / path
VERDICT / accurate
EVIDENCE / `WnWrapAlarmConsumer.ComposeXmlAlarmQuery` parses this format and treats `Galaxy` as the provider (`WnWrapAlarmConsumer.cs:489-530`). `IMxAccessAlarmConsumer.Subscribe` doc comment confirms: "Subscription string follows AVEVA's canonical format: `\\<node>\Galaxy!<area>`. The literal 'Galaxy' is the provider name (regardless of the configured Galaxy database name)." (`IMxAccessAlarmConsumer.cs:44-46`). `AlarmsOptions.cs:16-17` also confirms.
CODE_AREA / alarm.subscribe
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / LINES / 684-685 (settled API section)
CLAIM / "Native ack: `AlarmAckByGUID(VBGUID guid, comment, oprName, node, domain, fullName)` on the v2 surface."
CLAIM_TYPE / behavior-rule
VERDICT / accurate
EVIDENCE / `WnWrapAlarmConsumer.AcknowledgeByGuid` calls `com.AlarmAckByGUID` with exactly those args (`WnWrapAlarmConsumer.cs:232-238`).
CODE_AREA / alarm.ack
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / LINES / 695-699 (live smoke quirk 1)
CLAIM / "Without `SetXmlAlarmQuery`, the first `GetXmlCurrentAlarms2` call fails with `E_FAIL` (HRESULT `0x80004005`)."
CLAIM_TYPE / behavior-rule
VERDICT / accurate
EVIDENCE / `WnWrapAlarmConsumer.Subscribe` calls `SetXmlAlarmQuery` and wraps it with a `COMException` guard that would surface as `InvalidOperationException` with the E_FAIL message (`WnWrapAlarmConsumer.cs:156-182`). The call is mandatory per production code structure.
CODE_AREA / alarm.subscribe
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / LINES / 719-733 (live smoke quirk 2)
CLAIM / "Two consumers required: read-side consumer (with `SetXmlAlarmQuery`) and ack-only consumer (without `SetXmlAlarmQuery`). All `AcknowledgeByName` calls dispatch through the ack-only instance."
CLAIM_TYPE / behavior-rule
VERDICT / accurate
EVIDENCE / `WnWrapAlarmConsumer.Subscribe` provisions `ackClient = new wwAlarmConsumerClass()` with full lifecycle but no `SetXmlAlarmQuery` (`WnWrapAlarmConsumer.cs:184-210`). `AcknowledgeByName` uses `ackClient` (`WnWrapAlarmConsumer.cs:256-278`). `AcknowledgeByGuid` uses `client` (read-side) (`WnWrapAlarmConsumer.cs:224-238`).
CODE_AREA / alarm.ack
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / LINES / 736-748 (live smoke quirk 3)
CLAIM / "The v2 8-arg `AlarmAckByName` returns -55 on this AVEVA build. The v1 6-arg `AlarmAckByName` works. Production `WnWrapAlarmConsumer.AcknowledgeByName` calls the 6-arg overload. Operator domain and full-name fields are accepted by the proto but not propagated to AVEVA (discarded)."
CLAIM_TYPE / behavior-rule
VERDICT / accurate
EVIDENCE / `WnWrapAlarmConsumer.AcknowledgeByName` calls `com.AlarmAckByName` (6-arg) and explicitly discards `ackOperatorDomain` and `ackOperatorFullName` with `_ = ...` (`WnWrapAlarmConsumer.cs:268-278`). The proto `AcknowledgeAlarmByNameCommand` retains `operator_domain` and `operator_full_name` fields (`mxaccess_gateway.proto:359-373`).
CODE_AREA / alarm.ack
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / LINES / 750-756 (live smoke quirk 4)
CLAIM / "`AlarmAckByGUID` is not implemented on `wwAlarmConsumerClass`; it throws `NotImplementedException` / `E_NOTIMPL`. The reference→GUID lookup is not viable; all acks must go through `AlarmAckByName`."
CLAIM_TYPE / behavior-rule
VERDICT / stale
EVIDENCE / The production code at `WnWrapAlarmConsumer.AcknowledgeByGuid` still calls `com.AlarmAckByGUID` directly without a guard for `E_NOTIMPL` (`WnWrapAlarmConsumer.cs:215-239`). The gateway's `BuildAcknowledgeCommand` still dispatches `MxCommandKind.AcknowledgeAlarm` (GUID path) when `alarm_full_reference` parses as a GUID (`GatewayAlarmMonitor.cs:528-543`). The doc says all acks must go through `AcknowledgeByName`, but the code still routes GUID-shaped references through `AlarmAckByGUID`. The `E_NOTIMPL` runtime behavior is unguarded.
CODE_AREA / alarm.ack
SEVERITY / high
PROPOSED_FIX / Either (a) add a `COMException`/`NotImplementedException` guard around `AlarmAckByGUID` in `WnWrapAlarmConsumer.AcknowledgeByGuid` that falls back to `AcknowledgeByName`, or (b) make the gateway never dispatch the GUID arm. Document whichever approach is taken. The current state silently sends a doomed IPC command.
---
DOC / LINES / 761-762
CLAIM / "`WorkerAlarmRpcDispatcher.AcknowledgeAsync` now always routes through `AcknowledgeAlarmByName` when the public RPC supplies a recognizable `Provider!Group.Tag` reference."
CLAIM_TYPE / cross-ref
VERDICT / wrong
EVIDENCE / (a) No class named `WorkerAlarmRpcDispatcher` exists in the source tree. The gateway-side routing is in `GatewayAlarmMonitor.BuildAcknowledgeCommand` (`GatewayAlarmMonitor.cs:516`). (b) The routing is conditional: GUID-shaped `alarm_full_reference` → `AcknowledgeAlarmCommand` (GUID path); `Provider!Group.Tag` → `AcknowledgeAlarmByNameCommand`. The claim that the routing "always" goes through `AcknowledgeAlarmByName` is incorrect.
CODE_AREA / alarm.ack
SEVERITY / high
PROPOSED_FIX / Replace the entire sentence. The correct description: "The gateway's `GatewayAlarmMonitor.BuildAcknowledgeCommand` (`GatewayAlarmMonitor.cs:516`) dispatches `MxCommandKind.AcknowledgeAlarm` for GUID-shaped references and `MxCommandKind.AcknowledgeAlarmByName` for `Provider!Group.Tag` references."
---
DOC / LINES / 765-773 (STA quirk 5)
CLAIM / "The consumer's internal `Timer` fires on threadpool threads and would block on cross-apartment marshaling unless the host STA pumps Win32 messages. The smoke test sidesteps this by setting `pollIntervalMilliseconds=0` (Timer disabled) and driving `PollOnce` manually."
CLAIM_TYPE / behavior-rule
VERDICT / stale
EVIDENCE / The production `WnWrapAlarmConsumer` has no internal `Timer` at all — the design was revised so `PollOnce()` is always external (`WnWrapAlarmConsumer.cs:38-43`: "the consumer owns no internal timer"). There is no `pollIntervalMilliseconds` constructor parameter (`WnWrapAlarmConsumer.cs:69-87`). The constructor takes only `wwAlarmConsumerClass client` and `int maxAlarmsPerFetch`. The smoke test mention of `pollIntervalMilliseconds=0` refers to a superseded design.
CODE_AREA / alarm.subscribe
SEVERITY / medium
PROPOSED_FIX / Update to reflect the final design: `WnWrapAlarmConsumer` has no internal timer; `PollOnce()` is always called externally by the STA. Remove the `pollIntervalMilliseconds=0` test-workaround reference.
---
DOC / LINES / 599-601 (XML STATE enum section)
CLAIM / "`STATE` enum values observed: `UNACK_RTN` (alarm returned to normal, unacknowledged) and `UNACK_ALM` (alarm active and unacknowledged). Other states (`ACK_RTN`, `ACK_ALM`) would appear when an ack is performed."
CLAIM_TYPE / term
VERDICT / accurate
EVIDENCE / `MxAlarmStateKind.cs:1-17` defines all four values. `AlarmRecordTransitionMapper.ParseStateKind` handles all four (`AlarmRecordTransitionMapper.cs:27-38`).
CODE_AREA / alarm.state
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / LINES / 628-630 (reference format in smoke capture)
CLAIM / Reference format in the capture: `ref='Galaxy!TestArea.TestMachine_001.TestAlarm001'` — the `alarm_full_reference` is composed as `ProviderName!Group.TagName`.
CLAIM_TYPE / term
VERDICT / accurate
EVIDENCE / `AlarmRecordTransitionMapper.ComposeFullReference` formats as `{provider}!{group}.{name}` (`AlarmRecordTransitionMapper.cs:90-102`). The example matches this pattern exactly.
CODE_AREA / alarm.state
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / LINES / (entire doc — RPC names)
CLAIM / The document mentions IPC commands `SubscribeAlarms`, `AcknowledgeByGuid`, `SnapshotActiveAlarms`, `QueryActiveAlarms` but never names the public gRPC RPCs — `AcknowledgeAlarm`, `StreamAlarms`, `QueryActiveAlarms` — or the config keys governing the always-on monitor (`MxGateway:Alarms:Enabled`, `MxGateway:Alarms:SubscriptionExpression`, `MxGateway:Alarms:DefaultArea`, `MxGateway:Alarms:ReconcileIntervalSeconds`).
CLAIM_TYPE / gap
VERDICT / gap
EVIDENCE / `mxaccess_gateway.proto:22-37` (RPCs); `AlarmsOptions.cs:21-47` (config keys); `GatewayAlarmMonitor.cs:17-51` (always-on broker). None documented in `AlarmClientDiscovery.md`.
CODE_AREA / alarm.subscribe
SEVERITY / high
PROPOSED_FIX / The doc is a probe/research log, not an operator/integrator guide. However, the gap means no other document covers these public-surface items. Add a section or cross-reference to the public alarm API: RPCs `AcknowledgeAlarm`, `StreamAlarms`, `QueryActiveAlarms`; config keys `MxGateway:Alarms:Enabled`, `MxGateway:Alarms:SubscriptionExpression`, `MxGateway:Alarms:DefaultArea`, `MxGateway:Alarms:ReconcileIntervalSeconds`.
---
DOC / LINES / (entire doc — always-on broker architecture)
CLAIM / (gap) The doc describes a model where individual client sessions subscribe to alarms. The production architecture uses a gateway-owned always-on `GatewayAlarmMonitor` that holds one dedicated worker session and fans the alarm feed to all clients. No client opens its own alarm subscription; `StreamAlarms` is session-less.
CLAIM_TYPE / gap
VERDICT / gap
EVIDENCE / `GatewayAlarmMonitor.cs:1-697`; `IGatewayAlarmService.cs:27-63`; `AlarmsOptions.cs:1-48`. `AlarmClientDiscovery.md` describes the worker alarm consumer (IPC layer) but never describes the gateway-level brokering architecture that wraps it.
CODE_AREA / alarm.subscribe
SEVERITY / high
PROPOSED_FIX / Add a section describing `GatewayAlarmMonitor` as the always-on broker: one gateway-owned session, periodic reconcile loop (`ReconcileIntervalSeconds`), `StreamAsync` fan-out to per-client `Channel<AlarmFeedMessage>`, subscriber capacity (2048 messages), fail-open restart-backoff (5s).
---
DOC / LINES / (entire doc — AlarmFeedMessage / snapshot_complete protocol)
CLAIM / (gap) The doc does not document the `AlarmFeedMessage` stream protocol: initial burst of `active_alarm` messages, then `snapshot_complete` sentinel, then `transition` messages for live changes.
CLAIM_TYPE / gap
VERDICT / gap
EVIDENCE / `mxaccess_gateway.proto:857-868` (message definition); `GatewayAlarmMonitor.StreamAsync:386-434`. This is the key integrator-facing streaming contract.
CODE_AREA / alarm.subscribe
SEVERITY / high
PROPOSED_FIX / Document the `StreamAlarms` protocol: `AlarmFeedMessage` union with `active_alarm`, `snapshot_complete`, and `transition` fields; the invariant that the snapshot precedes the sentinel which precedes live transitions.
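
The ordering invariant above can be sketched from the consumer's side. This is a minimal illustration, not the gateway's or any client's actual code: the `(kind, payload)` tuple shape and the function name `split_alarm_feed` are assumptions made for the example; only the three field names come from the finding.

```python
def split_alarm_feed(messages):
    """Split a StreamAlarms-style feed into (snapshot, transitions, snapshot_done).

    `messages` is an iterable of (kind, payload) tuples, a hypothetical
    stand-in for the AlarmFeedMessage union. Raises ValueError when the
    feed violates the order: active_alarm*, snapshot_complete, transition*.
    """
    snapshot = []
    transitions = []
    snapshot_done = False
    for kind, payload in messages:
        if kind == "active_alarm":
            if snapshot_done:
                raise ValueError("active_alarm received after snapshot_complete")
            snapshot.append(payload)
        elif kind == "snapshot_complete":
            if snapshot_done:
                raise ValueError("duplicate snapshot_complete")
            snapshot_done = True
        elif kind == "transition":
            if not snapshot_done:
                raise ValueError("transition received before snapshot_complete")
            transitions.append(payload)
        else:
            raise ValueError(f"unknown feed message kind: {kind!r}")
    return snapshot, transitions, snapshot_done
```

A client built this way would treat everything before the sentinel as the initial active set and everything after it as live change.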
---
DOC / LINES / (entire doc — reconcile mechanism)
CLAIM / (gap) The periodic reconcile loop (`ReconcileIntervalSeconds`, default 30s, floor 5s) that snapshots the worker's active-alarm set and broadcasts synthetic raise/clear transitions for missed alarms is not documented.
CLAIM_TYPE / gap
VERDICT / gap
EVIDENCE / `GatewayAlarmMonitor.ReconcileLoopAsync:235-260`; `GatewayAlarmMonitor.ApplyReconcile:315-354`; `AlarmsOptions.ReconcileIntervalSeconds:47`.
CODE_AREA / alarm.subscribe
SEVERITY / medium
PROPOSED_FIX / Document the reconcile pass: cadence, purpose (catches missed poll-and-diff transitions), synthetic transition kind (`Raise`/`Clear`), and that it does not emit `Acknowledge` transitions.
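
A rough sketch of what such a reconcile pass computes, assuming the gateway tracks a set of known-active alarm references and compares it with the worker snapshot. The function names and the set-based representation are hypothetical; only the 5-second floor, the 30-second default, and the `Raise`/`Clear` kinds come from the finding.

```python
RECONCILE_FLOOR_SECONDS = 5     # floor stated in the finding
RECONCILE_DEFAULT_SECONDS = 30  # default stated in the finding


def effective_reconcile_interval(configured=RECONCILE_DEFAULT_SECONDS):
    """Clamp the configured interval to the documented floor."""
    return max(RECONCILE_FLOOR_SECONDS, configured)


def reconcile(known_active, snapshot_active):
    """Return (synthetic_transitions, new_known_active).

    Alarms present in the snapshot but unknown to the gateway yield a
    synthetic Raise; alarms the gateway still holds but the snapshot no
    longer lists yield a synthetic Clear. No Acknowledge is synthesized.
    """
    raised = sorted(snapshot_active - known_active)
    cleared = sorted(known_active - snapshot_active)
    transitions = [("Raise", ref) for ref in raised]
    transitions += [("Clear", ref) for ref in cleared]
    return transitions, set(snapshot_active)
```

When the two sets already agree, the pass produces no transitions, which is the expected steady state.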
---
DOC / LINES / (entire doc — subscriber backpressure / drop behavior)
CLAIM / (gap) A subscriber that cannot keep up with the alarm feed is dropped with an error ("Alarm feed subscriber fell behind and was dropped; reconnect to re-snapshot"). The queue capacity is 2048. This behavior is not documented.
CLAIM_TYPE / gap
VERDICT / gap
EVIDENCE / `GatewayAlarmMonitor.Broadcast:358-375`; `SubscriberQueueCapacity = 2048` (`GatewayAlarmMonitor.cs:21`).
CODE_AREA / alarm.subscribe
SEVERITY / medium
PROPOSED_FIX / Document the backpressure model: bounded 2048-message channel per subscriber; slow subscribers are completed with error and must reconnect; reconnect re-snapshots the active set.
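
The drop-on-overflow behavior can be illustrated with a bounded queue per subscriber. This is a simplified Python analogue of the bounded channel described in the evidence, not a port of `GatewayAlarmMonitor.Broadcast`; the `Subscriber` class and `broadcast` function are hypothetical names.

```python
import queue

SUBSCRIBER_QUEUE_CAPACITY = 2048  # capacity reported in the finding

DROP_MESSAGE = (
    "Alarm feed subscriber fell behind and was dropped; "
    "reconnect to re-snapshot"
)


class Subscriber:
    """One feed consumer with a bounded inbox."""

    def __init__(self, capacity=SUBSCRIBER_QUEUE_CAPACITY):
        self.inbox = queue.Queue(maxsize=capacity)
        self.error = None  # set when the subscriber is dropped


def broadcast(subscribers, message):
    """Deliver `message` to every subscriber without blocking.

    A subscriber whose inbox is full is marked with an error and removed
    from the list, so one slow consumer cannot stall the others.
    """
    for sub in list(subscribers):
        try:
            sub.inbox.put_nowait(message)
        except queue.Full:
            sub.error = DROP_MESSAGE
            subscribers.remove(sub)
```

A dropped client would reconnect and receive a fresh snapshot rather than try to resume from where it fell behind.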
---
DOC / LINES / (entire doc — `alarm_full_reference` parse format for ack)
CLAIM / (gap) The doc does not document the `alarm_full_reference` parse contract for `AcknowledgeAlarm`: a canonical GUID string triggers the GUID path; `Provider!Group.Tag` (first `!` splits provider, first `.` splits group from tag) triggers the by-name path; anything else is rejected.
CLAIM_TYPE / gap
VERDICT / gap
EVIDENCE / `GatewayAlarmMonitor.BuildAcknowledgeCommand` and `TryParseAlarmReference` (`GatewayAlarmMonitor.cs:516-610`). Error message: "alarm_full_reference must be a canonical GUID or 'Provider!Group.Tag' format."
CODE_AREA / alarm.ack
SEVERITY / high
PROPOSED_FIX / Document the `AcknowledgeAlarm.alarm_full_reference` field's two accepted formats and how the gateway routes each.
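
The two accepted formats and their routing can be sketched as follows. This is an illustration of the parse contract as the finding states it, not a transcription of `TryParseAlarmReference`; the function name, the returned tuples, and the reading of "canonical GUID" as the 8-4-4-4-12 hexadecimal form are assumptions.

```python
import re

# Assumption: "canonical GUID" means the 8-4-4-4-12 hexadecimal form.
_GUID_RE = re.compile(
    r"^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}"
    r"-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$"
)


def route_ack_reference(alarm_full_reference):
    """Return ("guid", ref) or ("by_name", provider, group, tag).

    The first '!' splits the provider from the rest; the first '.' after
    it splits the group from the tag. Anything else is rejected.
    """
    if _GUID_RE.match(alarm_full_reference):
        return ("guid", alarm_full_reference)
    provider, bang, rest = alarm_full_reference.partition("!")
    group, dot, tag = rest.partition(".")
    if bang and dot and provider and group and tag:
        return ("by_name", provider, group, tag)
    raise ValueError(
        "alarm_full_reference must be a canonical GUID or "
        "'Provider!Group.Tag' format."
    )
```

Because only the first `.` splits group from tag, a reference such as `Galaxy!TestArea.TestMachine_001.TestAlarm001` keeps `TestMachine_001.TestAlarm001` together as the tag.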
---
DOC / LINES / (entire doc — `AlarmConditionState` on snapshot)
CLAIM / (gap) The `ActiveAlarmSnapshot.current_state` field uses `AlarmConditionState` (Active / ActiveAcked / Inactive) — the distinction between `UnackRtn` and `AckRtn` is lost in the snapshot (both collapse to Inactive). This is not documented.
CLAIM_TYPE / gap
VERDICT / gap
EVIDENCE / `AlarmDispatcher.MapConditionState` (`AlarmDispatcher.cs:221-234`): both `UnackRtn` and `AckRtn` map to `AlarmConditionState.Inactive`.
CODE_AREA / alarm.state
SEVERITY / medium
PROPOSED_FIX / Document the state collapse rule: the `ActiveAlarmSnapshot.current_state` field does not distinguish between acknowledged-cleared and unacknowledged-cleared alarms; both surface as `Inactive`. Consumers that need this distinction must track the transition stream.
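
The collapse can be shown as a lookup table. Only the two `Rtn` rows are confirmed by the evidence above; the mapping of the two `Alm` states to `Active` and `ActiveAcked` is an assumption inferred from the enum names, and the dictionary itself is a hypothetical stand-in for `AlarmDispatcher.MapConditionState`.

```python
# Four-state worker kind -> three-state snapshot condition.
CONDITION_STATE = {
    "UnackAlm": "Active",       # assumed from the enum names
    "AckAlm": "ActiveAcked",    # assumed from the enum names
    "UnackRtn": "Inactive",     # confirmed by the finding
    "AckRtn": "Inactive",       # confirmed by the finding
}


def snapshot_state(state_kind):
    """Map a worker state kind to the state exposed on a snapshot."""
    return CONDITION_STATE[state_kind]


def collapsed_pairs():
    """List pairs of distinct state kinds that look identical in a snapshot."""
    kinds = sorted(CONDITION_STATE)
    return [
        (a, b)
        for i, a in enumerate(kinds)
        for b in kinds[i + 1:]
        if CONDITION_STATE[a] == CONDITION_STATE[b]
    ]
```

Under these assumptions the only indistinguishable pair is the acknowledged and unacknowledged return-to-normal states, which is why a consumer needing that distinction has to follow the transition stream.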
---
DOC / LINES / (entire doc — transition kind table)
CLAIM / (gap) The `AlarmTransitionKind` enum has a `Retrigger` value (`ALARM_TRANSITION_KIND_RETRIGGER = 4`), but the doc only describes Raise / Acknowledge / Clear.
CLAIM_TYPE / gap
VERDICT / gap
EVIDENCE / `mxaccess_gateway.proto:777`; `AlarmRecordTransitionMapper.MapTransition` does not produce `Retrigger` — it is defined in the proto but unused by the current mapping logic (`AlarmRecordTransitionMapper.cs:54-78`).
CODE_AREA / alarm.state
SEVERITY / low
PROPOSED_FIX / Note that `AlarmTransitionKind.Retrigger` exists in the proto but is not emitted by the current worker (the `*Rtn→*Alm` re-trigger case maps to `Raise`). Flag as reserved for future use or remove from the proto if unused.
+522
@@ -0,0 +1,522 @@
# Cluster 10 — Testing
Docs audited: `docs/GatewayTesting.md`, `docs/ClientBehaviorFixtures.md`,
`docs/ParityFixtureMatrix.md`, `docs/CrossLanguageSmokeMatrix.md`,
`docs/ToolchainLinks.md`.
Verified against: `src/ZB.MOM.WW.MxGateway.Tests/**`, `src/ZB.MOM.WW.MxGateway.Worker.Tests/**`,
`src/ZB.MOM.WW.MxGateway.IntegrationTests/**`, `scripts/run-client-e2e-tests.ps1`,
`scripts/validate-client-behavior-fixtures.ps1`, `scripts/discover-testmachine-tags.ps1`,
`clients/proto/fixtures/**`.
---
DOC / LINES / GatewayTesting.md / 322-324
CLAIM / "the script builds the .NET CLI (`dotnet build`) and installs the Java CLI (`gradle :mxgateway-cli:installDist`) once"
CLAIM_TYPE / command
VERDICT / wrong
EVIDENCE / scripts/run-client-e2e-tests.ps1:542 — actual invocation is `gradle :zb-mom-ww-mxgateway-cli:installDist`; clients/java/settings.gradle:26 — the Gradle subproject is named `zb-mom-ww-mxgateway-cli`, not `mxgateway-cli`
CODE_AREA / test.cmd
SEVERITY / high
PROPOSED_FIX / Replace `:mxgateway-cli:installDist` with `:zb-mom-ww-mxgateway-cli:installDist` in GatewayTesting.md line 323.
---
DOC / LINES / clients/proto/fixtures/smoke/cross-language-smoke-matrix.json / multiple Java command entries
CLAIM / Java bundled and optional commands use `gradle :mxgateway-cli:run` (e.g. `gradle :mxgateway-cli:run --args="close-session ..."`)
CLAIM_TYPE / command
VERDICT / wrong
EVIDENCE / clients/java/settings.gradle:26 — Gradle subproject name is `zb-mom-ww-mxgateway-cli`; the `:mxgateway-cli:run` task does not exist and would fail. scripts/run-client-e2e-tests.ps1:542 uses the correct `:zb-mom-ww-mxgateway-cli:installDist`
CODE_AREA / test.cmd
SEVERITY / high
PROPOSED_FIX / Replace every `:mxgateway-cli:run` in the smoke matrix JSON with `:zb-mom-ww-mxgateway-cli:run`. Also update the `installDist` reference in any bundled command if present.
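
One way to apply the substitution mechanically is to walk the parsed JSON and rewrite the task prefix in every string value. This is a sketch under the assumption that the matrix is ordinary JSON; the helper name `fix_gradle_tasks` is hypothetical and the layout of the matrix file is not assumed.

```python
OLD_TASK_PREFIX = ":mxgateway-cli:"
NEW_TASK_PREFIX = ":zb-mom-ww-mxgateway-cli:"


def fix_gradle_tasks(node):
    """Recursively rewrite the Gradle task prefix in all string values.

    The corrected prefix contains "-mxgateway-cli:" rather than
    ":mxgateway-cli:", so already-correct commands are left untouched
    and the rewrite is safe to run more than once.
    """
    if isinstance(node, str):
        return node.replace(OLD_TASK_PREFIX, NEW_TASK_PREFIX)
    if isinstance(node, list):
        return [fix_gradle_tasks(item) for item in node]
    if isinstance(node, dict):
        return {key: fix_gradle_tasks(value) for key, value in node.items()}
    return node
```

The function operates on the result of `json.load` and returns a new structure that can be written back with `json.dump`.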
---
DOC / LINES / GatewayTesting.md / 40-44
CLAIM / "`WorkerLiveMxAccessSmokeTests` in `src/ZB.MOM.WW.MxGateway.IntegrationTests/` … It is skipped unless `MXGATEWAY_RUN_LIVE_MXACCESS_TESTS=1` is set"
CLAIM_TYPE / path, config-key
VERDICT / accurate
EVIDENCE / src/ZB.MOM.WW.MxGateway.IntegrationTests/WorkerLiveMxAccessSmokeTests.cs:25 (`[LiveMxAccessFact]`); src/ZB.MOM.WW.MxGateway.IntegrationTests/IntegrationTestEnvironment.cs:13 (`LiveMxAccessVariableName = GatewayContractInfo.LiveMxAccessOptInVariableName`); src/ZB.MOM.WW.MxGateway.Contracts/GatewayContractInfo.cs:28 (`"MXGATEWAY_RUN_LIVE_MXACCESS_TESTS"`)
CODE_AREA / test.envgate
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / LINES / GatewayTesting.md / 76
CLAIM / "All six tests are gated by the same `MXGATEWAY_RUN_LIVE_MXACCESS_TESTS=1` opt-in variable"
CLAIM_TYPE / term, config-key
VERDICT / accurate
EVIDENCE / src/ZB.MOM.WW.MxGateway.IntegrationTests/WorkerLiveMxAccessSmokeTests.cs — exactly 6 `[LiveMxAccessFact]` attributes at lines 33, 122, 238, 296, 440, 571
CODE_AREA / test.envgate
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / LINES / GatewayTesting.md / 82-83
CLAIM / Worker build command: `dotnet build src/ZB.MOM.WW.MxGateway.Worker/ZB.MOM.WW.MxGateway.Worker.csproj -p:Platform=x86`
CLAIM_TYPE / command, path
VERDICT / accurate
EVIDENCE / src/ZB.MOM.WW.MxGateway.Worker/ZB.MOM.WW.MxGateway.Worker.csproj exists; PlatformTarget=x86 is set in the Worker.Tests csproj (src/ZB.MOM.WW.MxGateway.Worker.Tests/ZB.MOM.WW.MxGateway.Worker.Tests.csproj:6)
CODE_AREA / test.cmd
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / LINES / GatewayTesting.md / 89
CLAIM / Live MXAccess smoke run: `dotnet test src/ZB.MOM.WW.MxGateway.IntegrationTests/ZB.MOM.WW.MxGateway.IntegrationTests.csproj --filter FullyQualifiedName~WorkerLiveMxAccessSmokeTests`
CLAIM_TYPE / command, path
VERDICT / accurate
EVIDENCE / src/ZB.MOM.WW.MxGateway.IntegrationTests/ZB.MOM.WW.MxGateway.IntegrationTests.csproj exists; class name `WorkerLiveMxAccessSmokeTests` confirmed at src/ZB.MOM.WW.MxGateway.IntegrationTests/WorkerLiveMxAccessSmokeTests.cs:25
CODE_AREA / test.cmd
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / LINES / GatewayTesting.md / 94-101
CLAIM / Optional live smoke variables table: `MXGATEWAY_LIVE_MXACCESS_WORKER_EXE`, `MXGATEWAY_LIVE_MXACCESS_ITEM`, `MXGATEWAY_LIVE_MXACCESS_CLIENT_NAME`, `MXGATEWAY_LIVE_MXACCESS_EVENT_TIMEOUT_SECONDS` with stated defaults
CLAIM_TYPE / config-key
VERDICT / accurate
EVIDENCE / src/ZB.MOM.WW.MxGateway.IntegrationTests/IntegrationTestEnvironment.cs:14-17 — all four constant names match exactly; defaults match (TestChildObject.TestInt line 39, ZB.MOM.WW.MxGateway.IntegrationTests line 45, 15 seconds line 51)
CODE_AREA / test.envgate
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / LINES / GatewayTesting.md / 100-101
CLAIM / Optional variables `MXGATEWAY_LIVE_MXACCESS_WRITE_SECURED_USER` (default `admin`) and `MXGATEWAY_LIVE_MXACCESS_WRITE_SECURED_PASSWORD` (default `admin123`) "are gated by the same opt-in variable"
CLAIM_TYPE / config-key
VERDICT / accurate
EVIDENCE / src/ZB.MOM.WW.MxGateway.IntegrationTests/WorkerLiveMxAccessSmokeTests.cs:974-977 — variable names and defaults exactly match; note these constants are NOT in `IntegrationTestEnvironment` (they live inline in `ResolveLiveMxAccessSecuredCredentials`), which is an internal code organisation matter, not a doc error
CODE_AREA / test.envgate
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / LINES / GatewayTesting.md / 10-16
CLAIM / "`FakeWorkerHarness` in `src/ZB.MOM.WW.MxGateway.Tests/Gateway/Workers/Fakes/` … uses the same `WorkerFrameReader`, `WorkerFrameWriter`, and `WorkerEnvelope` contract"
CLAIM_TYPE / path, term
VERDICT / accurate
EVIDENCE / src/ZB.MOM.WW.MxGateway.Tests/Gateway/Workers/Fakes/FakeWorkerHarness.cs:55 (`CreateConnectedPairAsync`) and :90 (`ConnectToGatewayPipeAsync`) confirm both methods exist
CODE_AREA / test.cmd
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / LINES / GatewayTesting.md / 22-26
CLAIM / FakeWorkerHarness scripts: WorkerHello, WorkerReady, command replies, ordered WorkerEvent frames, WorkerHeartbeat frames, WorkerFault frames, shutdown acknowledgements, malformed payloads, oversized frame headers, slow/hung workers
CLAIM_TYPE / behavior-rule
VERDICT / accurate
EVIDENCE / src/ZB.MOM.WW.MxGateway.Tests/Gateway/Workers/Fakes/FakeWorkerHarness.cs:208 (SendWorkerHelloAsync), :233 (SendWorkerReadyAsync), :317 (WorkerEvent), :329 (WorkerFault), :353 (SendHeartbeatAsync), :373 (shutdown ack), :394 (malformed payload), :412 (oversized frame header)
CODE_AREA / test.cmd
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / LINES / GatewayTesting.md / 109-113
CLAIM / "`src/ZB.MOM.WW.MxGateway.Worker.Tests/Probes/` partitions runtime probes … `ZB.MOM.WW.MxGateway.Worker.Tests.Probes` namespace so a discovery filter … can target or exclude them"
CLAIM_TYPE / path, term
VERDICT / accurate
EVIDENCE / src/ZB.MOM.WW.MxGateway.Worker.Tests/Probes/ contains AlarmsLiveSmokeTests.cs, AlarmClientWmProbeTests.cs, WnWrapConsumerProbeTests.cs; namespace confirmed at AlarmsLiveSmokeTests.cs:9
CODE_AREA / test.cmd
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / LINES / GatewayTesting.md / 118-131
CLAIM / Three probes: `AlarmsLiveSmokeTests`, `AlarmClientWmProbeTests`, `WnWrapConsumerProbeTests`, all `[Fact(Skip = "...")]` by default
CLAIM_TYPE / term
VERDICT / accurate
EVIDENCE / src/ZB.MOM.WW.MxGateway.Worker.Tests/Probes/AlarmsLiveSmokeTests.cs:47 (`[Fact(Skip = "Live dev-rig smoke test …")]`); all three classes confirmed in directory listing
CODE_AREA / test.cmd
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / LINES / GatewayTesting.md / 139-143
CLAIM / "`GalaxyRepositoryLiveTests` in `src/ZB.MOM.WW.MxGateway.IntegrationTests/Galaxy/` … skipped unless `MXGATEWAY_RUN_LIVE_GALAXY_TESTS=1`"
CLAIM_TYPE / path, config-key
VERDICT / accurate
EVIDENCE / src/ZB.MOM.WW.MxGateway.IntegrationTests/Galaxy/GalaxyRepositoryLiveTests.cs:6; src/ZB.MOM.WW.MxGateway.IntegrationTests/Galaxy/LiveGalaxyRepositoryFactAttribute.cs:9 (`"MXGATEWAY_RUN_LIVE_GALAXY_TESTS"`)
CODE_AREA / test.envgate
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / LINES / GatewayTesting.md / 145-148
CLAIM / GalaxyRepositoryLiveTests covers `TestConnectionAsync`, `GetLastDeployTimeAsync`, `GetHierarchyAsync`, `GetAttributesAsync`; hierarchy/attributes assert non-empty
CLAIM_TYPE / term
VERDICT / accurate
EVIDENCE / src/ZB.MOM.WW.MxGateway.IntegrationTests/Galaxy/GalaxyRepositoryLiveTests.cs:10 (TestConnection), :20 (GetLastDeployTime), :31 (GetHierarchy, Assert.NotEmpty), :50 (GetAttributes, Assert.NotEmpty)
CODE_AREA / test.cmd
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / LINES / GatewayTesting.md / 154
CLAIM / Galaxy live tests run: `dotnet test src/ZB.MOM.WW.MxGateway.IntegrationTests/ZB.MOM.WW.MxGateway.IntegrationTests.csproj --filter FullyQualifiedName~GalaxyRepositoryLiveTests`
CLAIM_TYPE / command, path
VERDICT / accurate
EVIDENCE / src/ZB.MOM.WW.MxGateway.IntegrationTests/ZB.MOM.WW.MxGateway.IntegrationTests.csproj exists; class `GalaxyRepositoryLiveTests` at Galaxy/GalaxyRepositoryLiveTests.cs:7
CODE_AREA / test.cmd
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / LINES / GatewayTesting.md / 161
CLAIM / `MXGATEWAY_LIVE_GALAXY_CONN` default: `Server=localhost;Database=ZB;Integrated Security=True;TrustServerCertificate=True;Encrypt=False;`
CLAIM_TYPE / config-key
VERDICT / accurate
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Galaxy/GalaxyRepositoryOptions.cs:16-17 — exact string match; LiveGalaxyRepositoryFactAttribute.cs:32 falls back to this constant
CODE_AREA / test.envgate
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / LINES / GatewayTesting.md / 168-199
CLAIM / "`GalaxyFilterInputSafetyTests` in `src/ZB.MOM.WW.MxGateway.Tests/Galaxy/`" exercises GalaxyGlobMatcher and GalaxyHierarchyProjector with described adversarial inputs; GalaxyGlobMatcher applies a 100 ms regex timeout
CLAIM_TYPE / path, term, behavior-rule
VERDICT / accurate
EVIDENCE / src/ZB.MOM.WW.MxGateway.Tests/Galaxy/GalaxyFilterInputSafetyTests.cs:33 (class), :60-91 (adversarial cases); src/ZB.MOM.WW.MxGateway.Server/Galaxy/GalaxyGlobMatcher.cs:69 (TimeSpan.FromMilliseconds(100))
CODE_AREA / test.cmd
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / LINES / GatewayTesting.md / 172-174
CLAIM / "re-frames the original 'Galaxy SQL injection' concern (Tests-002 in `code-reviews/Tests/findings.md`)"
CLAIM_TYPE / cross-ref
VERDICT / accurate
EVIDENCE / code-reviews/Tests/findings.md exists; src/ZB.MOM.WW.MxGateway.Tests/Galaxy/GalaxyFilterInputSafetyTests.cs:16 references `finding Tests-002`
CODE_AREA / test.cmd
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / LINES / GatewayTesting.md / 174-178
CLAIM / `GalaxyRepository` issues only four *constant* SQL statements: `HierarchySql`, `AttributesSql`, `SELECT 1`, `SELECT time_of_last_deploy FROM galaxy`
CLAIM_TYPE / behavior-rule
VERDICT / accurate
EVIDENCE / src/ZB.MOM.WW.MxGateway.Server/Galaxy/GalaxyRepository.cs:26 (`SELECT 1`), :40 (`SELECT time_of_last_deploy FROM galaxy`), :117 (`HierarchySql`), :176 (`AttributesSql`)
CODE_AREA / test.cmd
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / LINES / GatewayTesting.md / 203-206
CLAIM / "`DashboardLdapLiveTests` in `src/ZB.MOM.WW.MxGateway.IntegrationTests/` … skipped unless `MXGATEWAY_RUN_LIVE_LDAP_TESTS=1`"
CLAIM_TYPE / path, config-key
VERDICT / accurate
EVIDENCE / src/ZB.MOM.WW.MxGateway.IntegrationTests/DashboardLdapLiveTests.cs:14; LiveLdapFactAttribute.cs:5 (`"MXGATEWAY_RUN_LIVE_LDAP_TESTS"`)
CODE_AREA / test.envgate
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / LINES / GatewayTesting.md / 230
CLAIM / LDAP live tests run: `dotnet test src/ZB.MOM.WW.MxGateway.IntegrationTests/ZB.MOM.WW.MxGateway.IntegrationTests.csproj --filter FullyQualifiedName~DashboardLdapLiveTests`
CLAIM_TYPE / command
VERDICT / accurate
EVIDENCE / src/ZB.MOM.WW.MxGateway.IntegrationTests/DashboardLdapLiveTests.cs:14 — class name matches filter
CODE_AREA / test.cmd
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / LINES / GatewayTesting.md / 237-243
CLAIM / `scripts/discover-testmachine-tags.ps1` queries TestMachine_001 through TestMachine_020 for attributes: `ProtectedValue`, `TestChangingInt`, `TestBoolArray`, `TestIntArray`, `TestDateTimeArray`, `TestStringArray`
CLAIM_TYPE / command
VERDICT / accurate
EVIDENCE / scripts/discover-testmachine-tags.ps1:5-11 — param `$Attributes` default matches exactly
CODE_AREA / test.cmd
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / LINES / GatewayTesting.md / 370-372
CLAIM / Cross-language smoke matrix filter: `dotnet test src/ZB.MOM.WW.MxGateway.Tests/ZB.MOM.WW.MxGateway.Tests.csproj --filter FullyQualifiedName~CrossLanguageSmokeMatrixTests`
CLAIM_TYPE / command
VERDICT / accurate
EVIDENCE / src/ZB.MOM.WW.MxGateway.Tests/Contracts/CrossLanguageSmokeMatrixTests.cs:5 — class name matches
CODE_AREA / test.cmd
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / LINES / GatewayTesting.md / 374-378
CLAIM / Parity fixture matrix filter: `dotnet test src/ZB.MOM.WW.MxGateway.Tests/ZB.MOM.WW.MxGateway.Tests.csproj --filter FullyQualifiedName~ParityFixtureMatrixTests`
CLAIM_TYPE / command
VERDICT / accurate
EVIDENCE / src/ZB.MOM.WW.MxGateway.Tests/Contracts/ParityFixtureMatrixTests.cs:6 — class name matches
CODE_AREA / test.cmd
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / LINES / GatewayTesting.md / 380-390
CLAIM / Fake worker test filters: `FakeWorkerHarnessTests`, `SessionWorkerClientFactoryFakeWorkerTests`, `GatewayEndToEndFakeWorkerSmokeTests`, `WorkerClientTests` all in the main tests project; `WorkerPipeSessionTests` in Worker.Tests with `-p:Platform=x86`
CLAIM_TYPE / command
VERDICT / accurate
EVIDENCE / FakeWorkerHarnessTests.cs:9, SessionWorkerClientFactoryFakeWorkerTests.cs:13, GatewayEndToEndFakeWorkerSmokeTests.cs:19, WorkerClientTests.cs:11; src/ZB.MOM.WW.MxGateway.Worker.Tests/Ipc/WorkerPipeSessionTests.cs:17; Worker.Tests.csproj:6 (`<PlatformTarget>x86</PlatformTarget>`)
CODE_AREA / test.cmd
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / LINES / ClientBehaviorFixtures.md / 8-11
CLAIM / "The fixture manifest is `clients/proto/fixtures/behavior/manifest.json`. `clients/proto/proto-inputs.json` references the fixture root through `behaviorFixtureRoot`"
CLAIM_TYPE / path
VERDICT / accurate
EVIDENCE / clients/proto/fixtures/behavior/manifest.json exists; clients/proto/proto-inputs.json:23 (`"behaviorFixtureRoot": "clients/proto/fixtures/behavior"`)
CODE_AREA / test.matrix
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / LINES / ClientBehaviorFixtures.md / 31-36
CLAIM / Command reply fixtures in `clients/proto/fixtures/behavior/command-replies/` parsing as `mxaccess_gateway.v1.MxCommandReply`
CLAIM_TYPE / path
VERDICT / accurate
EVIDENCE / clients/proto/fixtures/behavior/command-replies/register.ok.reply.json and write.mxaccess-failure.reply.json exist
CODE_AREA / test.matrix
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / LINES / ClientBehaviorFixtures.md / 48-60
CLAIM / Event stream fixtures in `clients/proto/fixtures/behavior/event-streams/`; event families: `OnDataChange`, `OnWriteComplete`, `OperationComplete`, `OnBufferedDataChange`
CLAIM_TYPE / path, term
VERDICT / accurate
EVIDENCE / clients/proto/fixtures/behavior/event-streams/session-event-stream.json exists
CODE_AREA / test.matrix
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / LINES / ClientBehaviorFixtures.md / 94-96
CLAIM / Validation: `powershell -ExecutionPolicy Bypass -File scripts/validate-client-behavior-fixtures.ps1`; "The script runs the focused C# contract tests that parse all protobuf JSON fixtures"
CLAIM_TYPE / command
VERDICT / accurate
EVIDENCE / scripts/validate-client-behavior-fixtures.ps1:10-15 — runs `dotnet test` on `ZB.MOM.WW.MxGateway.Tests.csproj` with filter `ClientBehaviorFixtureTests`; src/ZB.MOM.WW.MxGateway.Tests/Contracts/ClientBehaviorFixtureTests.cs:11 class exists
CODE_AREA / test.cmd
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / LINES / ParityFixtureMatrix.md / 8-11
CLAIM / "The matrix lives in `clients/proto/fixtures/parity/parity-fixture-matrix.json`. It references the local MXAccess capture set under `C:/Users/dohertj2/Desktop/mxaccess/captures`"
CLAIM_TYPE / path
VERDICT / accurate
EVIDENCE / clients/proto/fixtures/parity/parity-fixture-matrix.json exists; host-specific path is unverifiable from this repo
CODE_AREA / test.matrix
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / LINES / ParityFixtureMatrix.md / 37-40
CLAIM / "WriteSecured remains a documented gap because the current captures show `0x80004021` before MXAccess emits a value-bearing write body. `OperationComplete` and public `OnBufferedDataChange` batches also remain documented gaps"
CLAIM_TYPE / behavior-rule
VERDICT / accurate
EVIDENCE / clients/proto/fixtures/parity/parity-fixture-matrix.json:280-291 (WriteSecured documented_gap), :369 (OperationComplete documented_gap), :382 (OnBufferedDataChange documented_gap)
CODE_AREA / test.matrix
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / LINES / ParityFixtureMatrix.md / 91
CLAIM / Validation: `dotnet test src/ZB.MOM.WW.MxGateway.Tests/ZB.MOM.WW.MxGateway.Tests.csproj --filter FullyQualifiedName~ParityFixtureMatrixTests`
CLAIM_TYPE / command
VERDICT / accurate
EVIDENCE / src/ZB.MOM.WW.MxGateway.Tests/Contracts/ParityFixtureMatrixTests.cs:6 — class name matches
CODE_AREA / test.cmd
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / LINES / CrossLanguageSmokeMatrix.md / 8-9
CLAIM / "The matrix lives in `clients/proto/fixtures/smoke/cross-language-smoke-matrix.json`"
CLAIM_TYPE / path
VERDICT / accurate
EVIDENCE / clients/proto/fixtures/smoke/cross-language-smoke-matrix.json exists
CODE_AREA / test.matrix
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / LINES / CrossLanguageSmokeMatrix.md / 38-39
CLAIM / Integration gate: `$env:MXGATEWAY_INTEGRATION = "1"`
CLAIM_TYPE / config-key
VERDICT / accurate
EVIDENCE / clients/proto/fixtures/smoke/cross-language-smoke-matrix.json:6 (`"variable": "MXGATEWAY_INTEGRATION"`); src/ZB.MOM.WW.MxGateway.Tests/Contracts/CrossLanguageSmokeMatrixTests.cs:18 asserts this exact value
CODE_AREA / test.envgate
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / LINES / CrossLanguageSmokeMatrix.md / 43-49
CLAIM / Shared inputs table: `MXGATEWAY_ENDPOINT` (default `localhost:5000`), `MXGATEWAY_API_KEY`, `MXGATEWAY_TEST_ITEM` (`TestChildObject.TestInt`), `MXGATEWAY_TEST_WRITE_VALUE`
CLAIM_TYPE / config-key
VERDICT / accurate
EVIDENCE / clients/proto/fixtures/smoke/cross-language-smoke-matrix.json:10-16 — all four variable names and the `localhost:5000` fallback match exactly; CrossLanguageSmokeMatrixTests.cs:23-25 asserts these
CODE_AREA / test.envgate
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / LINES / CrossLanguageSmokeMatrix.md / 99-101
CLAIM / Validation: `dotnet test src/ZB.MOM.WW.MxGateway.Tests/ZB.MOM.WW.MxGateway.Tests.csproj --filter FullyQualifiedName~CrossLanguageSmokeMatrixTests`
CLAIM_TYPE / command
VERDICT / accurate
EVIDENCE / src/ZB.MOM.WW.MxGateway.Tests/Contracts/CrossLanguageSmokeMatrixTests.cs:5 — class name matches
CODE_AREA / test.cmd
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / LINES / CrossLanguageSmokeMatrix.md / 58-65
CLAIM / "the Rust CLI, which is pin-only and needs `--ca-file` or `--require-certificate-validation`"
CLAIM_TYPE / term
VERDICT / accurate
EVIDENCE / clients/rust/crates/mxgw-cli/src/main.rs:426 (`ca_file: Option<PathBuf>`), :433 (`require_certificate_validation: bool`)
CODE_AREA / test.cmd
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / LINES / CrossLanguageSmokeMatrix.md / 64
CLAIM / "Python uses trust-on-first-use"
CLAIM_TYPE / behavior-rule
VERDICT / accurate
EVIDENCE / clients/python/tests/test_tls.py:114 (`test_default_tls_connects_via_tofu`) and :5 (doc string confirms TOFU default)
CODE_AREA / test.cmd
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / LINES / ToolchainLinks.md / 61
CLAIM / "Go | 1.26.2 windows/amd64"
CLAIM_TYPE / version
VERDICT / unverifiable
EVIDENCE / clients/go/go.mod:3 specifies `go 1.26` (minimum requirement); ToolchainLinks records the installed binary version (1.26.2) which is a host-specific measurement not assertable from the repo. The go.mod minimum (1.26) is consistent with the stated installed version (1.26.2). Mark unverifiable — host install path.
CODE_AREA / test.toolchain
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / LINES / ToolchainLinks.md / 84-86
CLAIM / "rustc | 1.95.0" and "cargo | 1.95.0"
CLAIM_TYPE / version
VERDICT / unverifiable
EVIDENCE / clients/rust/Cargo.toml:4 — edition 2021; no `rust-version` field pins a minimum; host install version is not assertable from the repo.
CODE_AREA / test.toolchain
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / LINES / ToolchainLinks.md / 107-113
CLAIM / Python packages table: `grpcio==1.80.0`, `grpcio-tools==1.80.0`, `protobuf==6.33.6`, `pytest==9.0.3`, `pytest-asyncio==1.3.0`, `click==8.3.3`, `typer==0.25.0`
CLAIM_TYPE / version
VERDICT / unverifiable
EVIDENCE / clients/python/pyproject.toml:42 specifies `"pytest-asyncio>=1.3,<2"` (range constraint, not a pinned version); the `==` version pins in ToolchainLinks reflect the installed state of the host machine at time of writing, not a locked requirement file committed to the repo. Internally consistent for their stated purpose (documenting installed versions).
CODE_AREA / test.toolchain
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / LINES / ToolchainLinks.md / 123
CLAIM / "Gradle | 9.4.1 | `C:\Tools\gradle-9.4.1\bin\gradle.bat`"
CLAIM_TYPE / version
VERDICT / unverifiable
EVIDENCE / clients/java/settings.gradle:23 — no Gradle wrapper or toolchain version constraint committed. Host-specific install; unverifiable from the repo.
CODE_AREA / test.toolchain
SEVERITY / low
PROPOSED_FIX / flag only
---
DOC / LINES / GatewayTesting.md / (gap)
CLAIM / (undocumented) `IntegrationTestEnvironment.ResolveRepositoryRoot` uses a parent-walk that accepts either `.git` marker or `*.sln`/`*.slnx` files, with an optional `stopBoundary` parameter added for test isolation (IntegrationTests-025). No prose in GatewayTesting.md explains this behaviour or what to do when the walk fails.
CLAIM_TYPE / behavior-rule
VERDICT / gap
EVIDENCE / src/ZB.MOM.WW.MxGateway.IntegrationTests/IntegrationTestEnvironment.cs:100–157 — ResolveRepositoryRoot with stopBoundary, throws InvalidOperationException on failure with actionable message. Gap: the doc mentions only `MXGATEWAY_LIVE_MXACCESS_WORKER_EXE` as an escape hatch; the error message itself (line 155) explains what to do, but the docs omit the root-not-found failure mode.
CODE_AREA / test.envgate
SEVERITY / low
PROPOSED_FIX / Add a short note under "Live MXAccess Smoke" explaining that if the worker path resolver cannot locate the repository root it throws with a descriptive message; set `MXGATEWAY_LIVE_MXACCESS_WORKER_EXE` to bypass.
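The parent-walk described in this finding can be illustrated with a minimal Python sketch. It is not the C# implementation: the function name, the boundary semantics (the boundary directory itself is still checked before the walk stops), and the error text are assumptions made for illustration only.

```python
from pathlib import Path
from typing import Optional


def resolve_repository_root(start: Path, stop_boundary: Optional[Path] = None) -> Path:
    """Walk upward from `start` until a directory looks like a repository root.

    A directory qualifies if it holds a `.git` marker or any `*.sln` / `*.slnx`
    file. `stop_boundary` is a hypothetical stand-in for the test-isolation
    parameter mentioned in the finding.
    """
    current = start.resolve()
    boundary = stop_boundary.resolve() if stop_boundary is not None else None
    while True:
        has_git = (current / ".git").exists()
        has_solution = any(current.glob("*.sln")) or any(current.glob("*.slnx"))
        if has_git or has_solution:
            return current
        # Give up at the boundary (test isolation) or at the filesystem root.
        if current == boundary or current.parent == current:
            raise RuntimeError(
                f"Could not locate the repository root above {start}; "
                "set MXGATEWAY_LIVE_MXACCESS_WORKER_EXE to bypass the lookup."
            )
        current = current.parent
```

A documentation note for the failure mode would only need to state the two outcomes shown here: the walk either returns the first qualifying ancestor or fails with a message that names the bypass variable.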
---
DOC / LINES / GatewayTesting.md / (gap)
CLAIM / (undocumented) `LiveGalaxyRepositoryFactAttribute` exposes `MXGATEWAY_LIVE_GALAXY_CONN` as its own constant (`ConnectionStringVariableName`) separate from `IntegrationTestEnvironment`. This is not the same pattern as the MXAccess variables (which are centralised in `IntegrationTestEnvironment`). A developer running from `CLAUDE.md`'s test table might look for this constant in the wrong class.
CLAIM_TYPE / config-key
VERDICT / gap
EVIDENCE / src/ZB.MOM.WW.MxGateway.IntegrationTests/Galaxy/LiveGalaxyRepositoryFactAttribute.cs:11 — `ConnectionStringVariableName` lives here, not in `IntegrationTestEnvironment`
CODE_AREA / test.envgate
SEVERITY / low
PROPOSED_FIX / flag only — the table in GatewayTesting.md correctly names the variable; the inconsistent home is a code-organisation note not a doc error.
---
## Summary
### Verdict counts
| Verdict | Count |
|---|---|
| accurate | 26 |
| wrong | 2 |
| stale | 0 |
| unverifiable | 4 |
| gap | 2 |
### Severity counts
| Severity | Count |
|---|---|
| high | 2 |
| medium | 0 |
| low | 30 |
### High-severity findings
- **GatewayTesting.md line 323 — wrong Gradle task name**: The prose says the e2e script installs the Java CLI via `gradle :mxgateway-cli:installDist`, but the script actually uses `:zb-mom-ww-mxgateway-cli:installDist` (matching the actual Gradle subproject name `zb-mom-ww-mxgateway-cli` in `clients/java/settings.gradle`). A developer copying the documented command would get a Gradle "task not found" error.
- **`clients/proto/fixtures/smoke/cross-language-smoke-matrix.json` — wrong Java Gradle task in all Java command entries**: Every Java command in the smoke fixture uses `gradle :mxgateway-cli:run` but the Gradle subproject is named `:zb-mom-ww-mxgateway-cli`. Running any Java smoke command from the fixture verbatim would fail. The unit tests that validate the matrix shape (`CrossLanguageSmokeMatrixTests`) do not check the literal Gradle task name, so this error passes CI undetected.
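Both high-severity findings come from Gradle task paths that no test compares against the declared subprojects. A check of that kind could look roughly like the Python sketch below. It assumes one quoted `include '<name>'` per line in `settings.gradle` (as quoted elsewhere in this audit) and takes the commands as a plain list of strings; the real fixture's JSON layout is not shown here, so extracting the commands is left out.

```python
import re
from typing import Iterable, List, Set

# Matches `include 'name'` or `include "name"` (one project per statement).
INCLUDE_PATTERN = re.compile(r"include\s+['\"]([^'\"]+)['\"]")
# Matches a `:project:task` token that starts a whitespace-delimited word.
TASK_PATTERN = re.compile(r"(?<!\S):([A-Za-z0-9_.-]+):[A-Za-z0-9_.-]+")


def declared_subprojects(settings_text: str) -> Set[str]:
    """Collect subproject names declared in a settings.gradle text."""
    return {name.lstrip(":") for name in INCLUDE_PATTERN.findall(settings_text)}


def unknown_task_projects(commands: Iterable[str], subprojects: Set[str]) -> List[str]:
    """Return project names used in task paths that settings.gradle does not declare."""
    unknown = []
    for command in commands:
        for project in TASK_PATTERN.findall(command):
            if project not in subprojects:
                unknown.append(project)
    return unknown
```

Given the two `include` lines from `clients/java/settings.gradle`, a command such as `gradle :mxgateway-cli:run` would be reported, while `gradle :zb-mom-ww-mxgateway-cli:run` would pass.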
+341
@@ -0,0 +1,341 @@
# Cluster 11 — Clients
Auditor: automated read-only audit
Date: 2026-06-03
Scope: clients/dotnet, clients/go, clients/java, clients/python, clients/rust — README.md, *ClientDesign.md; docs/ClientLibrariesDesign.md; docs/ClientPackaging.md
---
DOC: docs/ClientPackaging.md
LINES: 51–52
CLAIM: Build and test commands reference `clients/dotnet/ZB.MOM.WW.MxGateway.Client.sln` (`.sln` extension)
CLAIM_TYPE: command
VERDICT: wrong
EVIDENCE: clients/dotnet/ZB.MOM.WW.MxGateway.Client.slnx:1 — only a `.slnx` file exists; no `.sln` file is present
CODE_AREA: client.dotnet
SEVERITY: high
PROPOSED_FIX: Replace `.sln` with `.slnx` in both `dotnet build` and `dotnet test` lines: `dotnet build clients/dotnet/ZB.MOM.WW.MxGateway.Client.slnx` / `dotnet test clients/dotnet/ZB.MOM.WW.MxGateway.Client.slnx --no-build`
---
DOC: docs/ClientPackaging.md
LINES: 159–160
CLAIM: "The Python package is `mxaccess-gateway-client`. Generated modules live under `clients/python/src/mxgateway/generated`."
CLAIM_TYPE: config-key
VERDICT: wrong
EVIDENCE: clients/python/pyproject.toml:6 — `name = "zb-mom-ww-mxaccess-gateway-client"`; clients/python/src/zb_mom_ww_mxgateway/generated/ — actual generated path
CODE_AREA: client.python
SEVERITY: high
PROPOSED_FIX: Correct both: package name → `zb-mom-ww-mxaccess-gateway-client`; generated path → `clients/python/src/zb_mom_ww_mxgateway/generated`
---
DOC: docs/ClientPackaging.md
LINES: 187
CLAIM: `python -m mxgateway_cli version --json`
CLAIM_TYPE: command
VERDICT: wrong
EVIDENCE: clients/python/src/zb_mom_ww_mxgateway_cli/__main__.py — actual module is `zb_mom_ww_mxgateway_cli`; clients/python/pyproject.toml:48 — entry point `zb_mom_ww_mxgateway_cli.commands:main`
CODE_AREA: client.python
SEVERITY: high
PROPOSED_FIX: Replace with `python -m zb_mom_ww_mxgateway_cli version --json`
---
DOC: docs/ClientPackaging.md
LINES: 193–194, 201, 217, 225–227
CLAIM: Java workspace uses `mxgateway-client` and `mxgateway-cli` as subproject names; Gradle task paths use `:mxgateway-client:generateProto`, `:mxgateway-client:jar`, `:mxgateway-cli:installDist`, `:mxgateway-cli:run`
CLAIM_TYPE: command
VERDICT: wrong
EVIDENCE: clients/java/settings.gradle:25–26 — `include 'zb-mom-ww-mxgateway-client'` and `include 'zb-mom-ww-mxgateway-cli'`; Gradle task paths are derived from subproject names
CODE_AREA: client.java
SEVERITY: high
PROPOSED_FIX: Replace all `:mxgateway-client:` references with `:zb-mom-ww-mxgateway-client:` and `:mxgateway-cli:` with `:zb-mom-ww-mxgateway-cli:` throughout ClientPackaging.md
---
DOC: clients/rust/README.md
LINES: 65
CLAIM: `cargo run -p mxgw-cli -- stream-alarms --session-id <session-id> --max-messages 1 --json`
CLAIM_TYPE: command
VERDICT: wrong
EVIDENCE: clients/rust/crates/mxgw-cli/src/main.rs:282–295 — `StreamAlarms` struct has fields `filter_prefix`, `max_events`, `json`, `jsonl` (via `ConnectionArgs`); no `session_id`, flag is `--max-events` not `--max-messages`
CODE_AREA: client.rust
SEVERITY: high
PROPOSED_FIX: Replace with `cargo run -p mxgw-cli -- stream-alarms --max-events 1 --json` (no `--session-id`; change `--max-messages` to `--max-events`)
---
DOC: clients/rust/README.md
LINES: 66
CLAIM: `cargo run -p mxgw-cli -- acknowledge-alarm --session-id <session-id> --alarm-reference "\\Galaxy\Area001.Pump001.PumpFault" --json`
CLAIM_TYPE: command
VERDICT: wrong
EVIDENCE: clients/rust/crates/mxgw-cli/src/main.rs:298–311 — `AcknowledgeAlarm` struct has fields `reference`, `comment`, `operator`, `json` (via `ConnectionArgs`); flag is `--reference` not `--alarm-reference`, and `session_id` is absent
CODE_AREA: client.rust
SEVERITY: high
PROPOSED_FIX: Replace with `cargo run -p mxgw-cli -- acknowledge-alarm --reference "\\Galaxy\Area001.Pump001.PumpFault" --json`
---
DOC: clients/go/README.md
LINES: 143
CLAIM: `import pb "gitea.dohertylan.com/dohertj2/mxaccessgw/clients/go/internal/generated/galaxy_repository/v1"`
CLAIM_TYPE: path
VERDICT: wrong
EVIDENCE: clients/go/internal/generated/ — flat directory; all .pb.go files are `package generated`. Actual import path used in library code: `"gitea.dohertylan.com/dohertj2/mxaccessgw/clients/go/internal/generated"` (clients/go/mxgateway/galaxy.go:11, types.go:3, session.go:13)
CODE_AREA: client.go
SEVERITY: high
PROPOSED_FIX: Replace the import path with `pb "gitea.dohertylan.com/dohertj2/mxaccessgw/clients/go/internal/generated"` (drop the `/galaxy_repository/v1` suffix)
---
DOC: docs/ClientLibrariesDesign.md
LINES: 410
CLAIM: Python generated code lives under `clients/python/src/mxgateway/generated`
CLAIM_TYPE: path
VERDICT: wrong
EVIDENCE: clients/python/src/zb_mom_ww_mxgateway/generated/ — actual path; no `mxgateway` directory under `src/`
CODE_AREA: client.python
SEVERITY: high
PROPOSED_FIX: Replace with `clients/python/src/zb_mom_ww_mxgateway/generated`
---
DOC: clients/dotnet/DotnetClientDesign.md
LINES: 35–36
CLAIM: Layout includes `ZB.MOM.WW.MxGateway.Client.IntegrationTests/` project
CLAIM_TYPE: path
VERDICT: wrong
EVIDENCE: `ls clients/dotnet/` — directory does not exist; `.slnx` file (clients/dotnet/ZB.MOM.WW.MxGateway.Client.slnx:9–11) contains only three projects (Client, Cli, Tests)
CODE_AREA: client.dotnet
SEVERITY: medium
PROPOSED_FIX: Remove the `ZB.MOM.WW.MxGateway.Client.IntegrationTests/` entry from the layout section, or note it as "not yet created"
---
DOC: clients/python/PythonClientDesign.md
LINES: 215
CLAIM: "Publishable package name should be stable, for example: `mxaccess-gateway-client`"
CLAIM_TYPE: config-key
VERDICT: stale
EVIDENCE: clients/python/pyproject.toml:6 — actual name is `zb-mom-ww-mxaccess-gateway-client`; the "example" name was never adopted
CODE_AREA: client.python
SEVERITY: medium
PROPOSED_FIX: Update the example to `zb-mom-ww-mxaccess-gateway-client` to reflect the chosen name
---
DOC: clients/dotnet/DotnetClientDesign.md
LINES: 55
CLAIM: "`Grpc.Tools` for generation" listed as an expected package
CLAIM_TYPE: config-key
VERDICT: stale
EVIDENCE: clients/dotnet/ZB.MOM.WW.MxGateway.Client/ZB.MOM.WW.MxGateway.Client.csproj — no `Grpc.Tools` reference; the client uses a project reference to the shared contracts csproj for generated types. clients/dotnet/README.md:17 correctly notes this is "reserved for future use"
CODE_AREA: client.dotnet
SEVERITY: medium
PROPOSED_FIX: Remove `Grpc.Tools` from the "Expected packages" list or qualify it as "future, if client-local generation is adopted"
---
DOC: clients/go/GoClientDesign.md
LINES: 28–30
CLAIM: `internal/generated/` contains only `mxaccess_gateway.pb.go` and `mxaccess_gateway_grpc.pb.go`
CLAIM_TYPE: path
VERDICT: stale
EVIDENCE: clients/go/internal/generated/ — 5 files present: `galaxy_repository.pb.go`, `galaxy_repository_grpc.pb.go`, `mxaccess_gateway.pb.go`, `mxaccess_gateway_grpc.pb.go`, `mxaccess_worker.pb.go`
CODE_AREA: client.go
SEVERITY: medium
PROPOSED_FIX: Add the missing galaxy_repository and mxaccess_worker generated files to the layout listing
---
DOC: docs/ClientPackaging.md
LINES: 116
CLAIM: "The Rust workspace builds the `mxgateway-client` library crate and the `mxgw` CLI crate."
CLAIM_TYPE: term
VERDICT: wrong
EVIDENCE: clients/rust/Cargo.toml:2 — `name = "zb-mom-ww-mxgateway-client"`; crates/mxgw-cli/Cargo.toml:4 — package name `mxgw-cli`, binary name `mxgw`
CODE_AREA: client.rust
SEVERITY: medium
PROPOSED_FIX: Change "mxgateway-client" to "zb-mom-ww-mxgateway-client" (the library crate name); "mxgw CLI crate" is acceptable since the binary is named `mxgw`
---
DOC: clients/rust/RustClientDesign.md
LINES: 278
CLAIM: `mxgw stream-alarms [--filter-prefix <prefix>] [--max-events <n>]`
CLAIM_TYPE: command
VERDICT: accurate
EVIDENCE: clients/rust/crates/mxgw-cli/src/main.rs:282–295 — `StreamAlarms` struct has `filter_prefix: Option<String>`, `max_events: usize`; test at line 2053 confirms `--max-events` flag name
CODE_AREA: client.rust
SEVERITY: low
PROPOSED_FIX: flag only — design doc is accurate; the discrepancy is in README.md (already captured above)
---
DOC: docs/ClientPackaging.md
LINES: (no entry)
CLAIM: (gap) `scripts/pack-clients.ps1` — the canonical multi-language pack and publish script — is not mentioned in ClientPackaging.md
CLAIM_TYPE: behavior-rule
VERDICT: gap
EVIDENCE: scripts/pack-clients.ps1:140 — packs all five clients into `dist/`, supports `-Publish` to upload to Gitea feeds; ClientPackaging.md has no reference to it
CODE_AREA: client.packaging
SEVERITY: medium
PROPOSED_FIX: Add a "Packing all clients at once" section to ClientPackaging.md pointing to `scripts/pack-clients.ps1` with the example invocations from the script's `.SYNOPSIS`
---
DOC: docs/ClientPackaging.md
LINES: (no entry for Python build method)
CLAIM: (gap) Python README and ClientPackaging.md document `pip wheel . --no-deps` as the build command; `scripts/pack-clients.ps1` uses `python -m build` (the PEP 517 standard tool). The dev dependency `build>=1.2,<2` is listed in pyproject.toml but the README's wheel command bypasses it.
CLAIM_TYPE: command
VERDICT: gap
EVIDENCE: clients/python/pyproject.toml:43 — `"build>=1.2,<2"` in dev deps; scripts/pack-clients.ps1:156 — `python -m build`; clients/python/README.md:43 — `pip wheel . --no-deps`
CODE_AREA: client.python
SEVERITY: low
PROPOSED_FIX: Note in Python README and ClientPackaging.md that `python -m build` (using the `build` package already in dev deps) is the canonical wheel-build method; `pip wheel` is an alternative
---
DOC: clients/java/README.md
LINES: 37
CLAIM: `gradle :zb-mom-ww-mxgateway-client:generateProto`
CLAIM_TYPE: command
VERDICT: accurate
EVIDENCE: clients/java/settings.gradle:25 — `include 'zb-mom-ww-mxgateway-client'`; build.gradle:4 — `id 'com.google.protobuf'` applied to that subproject
CODE_AREA: client.java
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC: clients/java/README.md
LINES: 314
CLAIM: `implementation 'com.zb.mom.ww.mxgateway:zb-mom-ww-mxgateway-client:0.1.0'`
CLAIM_TYPE: config-key
VERDICT: accurate
EVIDENCE: clients/java/build.gradle:15–16 — `group = 'com.zb.mom.ww.mxgateway'`, `version = '0.1.0'`; subproject name `zb-mom-ww-mxgateway-client` is the artifact ID
CODE_AREA: client.java
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC: clients/rust/README.md
LINES: 83
CLAIM: "tonic 0.13.1 exposes no public hook to inject a custom certificate verifier"
CLAIM_TYPE: version
VERDICT: accurate
EVIDENCE: clients/rust/Cargo.toml:40 — `tonic = { version = "0.13.1", features = ["transport", "tls-ring"] }`
CODE_AREA: client.rust
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC: clients/dotnet/README.md
LINES: 22–23
CLAIM: Build and test with `dotnet build clients/dotnet/ZB.MOM.WW.MxGateway.Client.slnx` / `dotnet test clients/dotnet/ZB.MOM.WW.MxGateway.Client.slnx --no-build`
CLAIM_TYPE: command
VERDICT: accurate
EVIDENCE: clients/dotnet/ZB.MOM.WW.MxGateway.Client.slnx:1 — file exists with `.slnx` extension
CODE_AREA: client.dotnet
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC: clients/dotnet/README.md
LINES: 331
CLAIM: `dotnet add package ZB.MOM.WW.MxGateway.Client --version 0.1.0`
CLAIM_TYPE: version
VERDICT: accurate
EVIDENCE: clients/dotnet/Directory.Build.props:14 — `<Version>0.1.0</Version>`; clients/dotnet/ZB.MOM.WW.MxGateway.Client/ZB.MOM.WW.MxGateway.Client.csproj:21 — `<PackageId>ZB.MOM.WW.MxGateway.Client</PackageId>`
CODE_AREA: client.dotnet
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC: clients/go/README.md
LINES: 292, 297
CLAIM: `go get gitea.dohertylan.com/dohertj2/mxaccessgw/clients/go@v0.1.0`; import `"gitea.dohertylan.com/dohertj2/mxaccessgw/clients/go/mxgateway"`
CLAIM_TYPE: path
VERDICT: accurate
EVIDENCE: clients/go/go.mod:1 — `module gitea.dohertylan.com/dohertj2/mxaccessgw/clients/go`; `mxgateway/` subdirectory exists
CODE_AREA: client.go
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC: clients/go/README.md
LINES: 310–312
CLAIM: `pwsh scripts/tag-go-module.ps1 -Version v0.1.1 -Push` creates an annotated tag `clients/go/v0.1.1`
CLAIM_TYPE: command
VERDICT: accurate
EVIDENCE: scripts/tag-go-module.ps1:39 — `$tag = "clients/go/$Version"`; line 54 — `git tag -a $tag`; lines 25–29 — `-Version` and `-Push` parameters exist
CODE_AREA: client.go
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC: clients/python/README.md
LINES: 288–290
CLAIM: `pip install --index-url https://gitea.dohertylan.com/api/packages/dohertj2/pypi/simple/ zb-mom-ww-mxaccess-gateway-client`
CLAIM_TYPE: path
VERDICT: accurate
EVIDENCE: clients/python/pyproject.toml:6 — `name = "zb-mom-ww-mxaccess-gateway-client"` matches the pip install name
CODE_AREA: client.python
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC: clients/rust/README.md
LINES: 257–274
CLAIM: Gitea Cargo registry at `sparse+https://gitea.dohertylan.com/api/packages/dohertj2/cargo/`, registry name `dohertj2-gitea`, crate `zb-mom-ww-mxgateway-client = { version = "0.1.0", registry = "dohertj2-gitea" }`
CLAIM_TYPE: path
VERDICT: accurate
EVIDENCE: clients/rust/Cargo.toml:14 — `publish = ["dohertj2-gitea"]`; version = "0.1.0"; registry name matches
CODE_AREA: client.rust
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC: clients/java/README.md
LINES: 297–299
CLAIM: Maven feed at `https://gitea.dohertylan.com/api/packages/dohertj2/maven`; publish via `gradle :zb-mom-ww-mxgateway-client:publish`
CLAIM_TYPE: path
VERDICT: accurate
EVIDENCE: clients/java/build.gradle:72 — `url = 'https://gitea.dohertylan.com/api/packages/dohertj2/maven'`; `maven-publish` plugin applied to `zb-mom-ww-mxgateway-client`
CODE_AREA: client.java
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC: clients/go/README.md
LINES: 39–40
CLAIM: Build and test with `go test ./...` / `go build ./...` / `go vet ./...` from `clients/go`
CLAIM_TYPE: command
VERDICT: accurate
EVIDENCE: clients/go/go.mod:1 — module root at clients/go; standard Go toolchain commands apply
CODE_AREA: client.go
SEVERITY: low
PROPOSED_FIX: flag only
---
DOC: clients/java/README.md
LINES: 265–267
CLAIM: Build and test with `gradle test` from `clients/java`
CLAIM_TYPE: command
VERDICT: accurate
EVIDENCE: clients/java/settings.gradle:25–26 — `include 'zb-mom-ww-mxgateway-client'` and `include 'zb-mom-ww-mxgateway-cli'`; root `build.gradle` applies `useJUnitPlatform()` to all java subprojects
CODE_AREA: client.java
SEVERITY: low
PROPOSED_FIX: flag only
+226
@@ -0,0 +1,226 @@
# Cluster 12 — Style Guides
Docs audited: `StyleGuide.md`, `REVIEW-PROCESS.md`, `docs/style-guides/CSharpStyleGuide.md`,
`docs/style-guides/GoStyleGuide.md`, `docs/style-guides/JavaStyleGuide.md`,
`docs/style-guides/ProtobufStyleGuide.md`, `docs/style-guides/PythonStyleGuide.md`,
`docs/style-guides/RustStyleGuide.md`.
---
DOC: StyleGuide.md
LINES: 3
CLAIM: "This guide defines writing conventions and formatting rules for all ScadaBridge documentation."
CLAIM_TYPE: term
VERDICT: wrong
EVIDENCE: StyleGuide.md:3; no file in the repo uses "ScadaBridge" as the project name — the project is `mxaccessgw` / MXAccess Gateway throughout every other file.
CODE_AREA: style.docs
SEVERITY: high
PROPOSED_FIX: flag only — replace "ScadaBridge" with "MXAccess Gateway" / `mxaccessgw`.
---
DOC: StyleGuide.md
LINES: 12, 15, 76, 100–105, 144, 147, 154–155, 161, 215–217, 226–227, 246–248, 263
CLAIM: Examples throughout use `ScadaGatewayActor`, `ScadaClientActor`, `TemplateInstanceActor`, `ReceiveActor`, `IRequiredActor<T>`, `IActorRef`, configuration key `ScadaBridge:Timeout`, file path `src/Infrastructure/Akka/Actors/`, and documentation paths `../Akka/Actors.md`, `../Akka/HealthChecks.md`, `../Configuration/Akka.md`.
CLAIM_TYPE: cross-ref
VERDICT: wrong
EVIDENCE: None of these types, paths, or configuration keys exist anywhere in the `mxaccessgw` repository. The "Akka" actor framework is not used in this project. `ls /Users/dohertj2/Desktop/mxaccessgateway/` shows no `Akka/` directory. The referenced paths (`../Akka/Actors.md`, `./Configuration.md`, `./Patterns.md`) are all dead links.
CODE_AREA: style.docs
SEVERITY: high
PROPOSED_FIX: flag only — the entire examples section was copied from a different (Akka-based) project. All example types, paths, and configuration keys must be replaced with MXAccess Gateway equivalents.
---
DOC: StyleGuide.md
LINES: 90
CLAIM: "Supported languages: `csharp`, `json`, `bash`, `xml`, `sql`, `yaml`, `html`, `css`, `javascript`"
CLAIM_TYPE: term
VERDICT: stale
EVIDENCE: Corpus-wide count of code-block language identifiers in `docs/` (via grep): `powershell` (42 uses), `text` (48 uses), `rust` (12), `python` (12), `go` (7), `proto`/`protobuf` (6) are all actively used but not listed. `yaml` and `javascript` appear zero times. The list is both under-inclusive and includes unused entries.
CODE_AREA: style.docs
SEVERITY: low
PROPOSED_FIX: flag only — update the list to reflect languages actually used in the docs corpus; at minimum add `powershell`, `text`, `rust`, `python`, `go`, `proto`; optionally remove `yaml` and `javascript`.
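The corpus-wide count cited as evidence was gathered with grep. A comparable tally can be sketched in Python as below. This is a simplified reading of fenced code blocks (it tracks only the fence character, not fence length), so the numbers it yields are an approximation rather than a reproduction of the audit's figures.

```python
import re
from collections import Counter

# An opening or closing fence: three or more backticks or tildes, then an
# optional language identifier.
FENCE_PATTERN = re.compile(r"^\s*(`{3,}|~{3,})\s*([A-Za-z0-9_+#-]*)")


def count_fence_languages(markdown_text: str) -> Counter:
    """Count the language identifier on each opening code fence."""
    counts = Counter()
    open_fence = None  # fence character of the block currently open, if any
    for line in markdown_text.splitlines():
        match = FENCE_PATTERN.match(line)
        if not match:
            continue
        marker, language = match.group(1), match.group(2)
        if open_fence is None:
            open_fence = marker[0]
            counts[language.lower() or "(none)"] += 1
        elif marker[0] == open_fence and not language:
            # A bare fence of the same kind closes the open block.
            open_fence = None
    return counts
```

Summing the counters over every file under `docs/` would give a list of identifiers actually in use, which is the input the proposed fix needs.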
---
DOC: docs/style-guides/JavaStyleGuide.md
LINES: 25
CLAIM: "Use lowercase package names under `com.dohertylan.mxgateway`."
CLAIM_TYPE: config-key
VERDICT: wrong
EVIDENCE: Every handwritten Java source file in `clients/java/` uses the package root `com.zb.mom.ww.mxgateway` (confirmed by inspecting `clients/java/zb-mom-ww-mxgateway-client/src/main/java/com/zb/mom/ww/mxgateway/client/MxGatewayClient.java` and sibling files). No file uses `com.dohertylan`.
CODE_AREA: style.java
SEVERITY: high
PROPOSED_FIX: flag only — change the prescribed package root to `com.zb.mom.ww.mxgateway` to match the actual codebase.
---
DOC: docs/style-guides/PythonStyleGuide.md
LINES: 27–29
CLAIM: "Put library code under `src/mxgateway/`. Put CLI entry points under `src/mxgateway_cli/`. Keep generated protobuf modules under a clearly named `generated` package."
CLAIM_TYPE: path
VERDICT: wrong
EVIDENCE: Actual package directories are `clients/python/src/zb_mom_ww_mxgateway/` and `clients/python/src/zb_mom_ww_mxgateway_cli/` (confirmed by `ls clients/python/src/`). The short names `mxgateway` and `mxgateway_cli` do not exist on disk. The generated package is correctly at `src/zb_mom_ww_mxgateway/generated/` (matches the rule in spirit, but the parent path is wrong).
CODE_AREA: style.python
SEVERITY: medium
PROPOSED_FIX: flag only — update the stated package paths to `src/zb_mom_ww_mxgateway/` and `src/zb_mom_ww_mxgateway_cli/`.
---
DOC: docs/style-guides/GoStyleGuide.md
LINES: 68
CLAIM: "Keep integration tests behind `MXGATEWAY_INTEGRATION=1` or build tags."
CLAIM_TYPE: config-key
VERDICT: unverifiable
EVIDENCE: No Go source file in `clients/go/` references `MXGATEWAY_INTEGRATION`; no build tag gating for integration was found. The existing Go tests (`clients/go/mxgateway/*_test.go`) all use in-process fakes via `bufconn`, so no live integration tests appear to exist yet. The rule is prescriptive (correct direction) but the env-var name cannot be confirmed against practice because live tests are absent.
CODE_AREA: style.go
SEVERITY: low
PROPOSED_FIX: flag only.
---
DOC: docs/style-guides/PythonStyleGuide.md
LINES: 68
CLAIM: "Keep live integration tests behind `MXGATEWAY_INTEGRATION=1`."
CLAIM_TYPE: config-key
VERDICT: stale
EVIDENCE: The only opt-in env var actually used in `clients/python/tests/` is `MXGATEWAY_RUN_TLS_TESTS=1` (confirmed in `tests/test_tls.py:36`). There is no usage of `MXGATEWAY_INTEGRATION=1` in the Python test tree. The same inconsistency holds for Java (`clients/java/`) and Rust (`clients/rust/`), where no opt-in env-var pattern for live tests was found at all.
CODE_AREA: style.python
SEVERITY: low
PROPOSED_FIX: flag only — standardize the prescribed env var with what the test code actually uses; consider aligning Go, Java, Rust, Python on a single variable name.
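If the clients were aligned on one opt-in variable, the gate itself is small. The sketch below uses the standard-library `unittest` module to keep it dependency-free; the Python client's real tests may be structured differently, and the choice of `MXGATEWAY_INTEGRATION` as the shared name is the style guides' prescription, not something the test code is confirmed to use.

```python
import os
import unittest

# Variable name prescribed by the style guides; an assumption, since the
# existing Python tests are reported to use MXGATEWAY_RUN_TLS_TESTS instead.
INTEGRATION_VARIABLE = "MXGATEWAY_INTEGRATION"


def live_tests_enabled(environ=None) -> bool:
    """Return True only when the opt-in variable is set to exactly "1"."""
    env = os.environ if environ is None else environ
    return env.get(INTEGRATION_VARIABLE) == "1"


# Decorator that skips a test unless live tests were explicitly enabled.
requires_live_gateway = unittest.skipUnless(
    live_tests_enabled(),
    f"set {INTEGRATION_VARIABLE}=1 to run live gateway tests",
)


class LiveGatewaySmokeTests(unittest.TestCase):
    @requires_live_gateway
    def test_placeholder(self):
        # Hypothetical placeholder; a real test would talk to a gateway here.
        self.assertTrue(live_tests_enabled())
```

Keeping the check in one helper means renaming the variable later touches a single constant per client.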
---
DOC: docs/style-guides/JavaStyleGuide.md
LINES: 65
CLAIM: "Keep live gateway tests behind `MXGATEWAY_INTEGRATION=1` and JUnit assumptions."
CLAIM_TYPE: config-key
VERDICT: unverifiable
EVIDENCE: No Java test file in `clients/java/` references `MXGATEWAY_INTEGRATION`. The Java test classes (`MxGatewayClientSessionTests.java`, etc.) use in-process gRPC servers. No live-gateway test gating was found in the Java client tree.
CODE_AREA: style.java
SEVERITY: low
PROPOSED_FIX: flag only.
---
DOC: docs/style-guides/RustStyleGuide.md
LINES: 65
CLAIM: "Keep live gateway tests behind `MXGATEWAY_INTEGRATION=1`."
CLAIM_TYPE: config-key
VERDICT: unverifiable
EVIDENCE: No Rust source file in `clients/rust/` references `MXGATEWAY_INTEGRATION`. The existing Rust tests (`tests/client_behavior.rs`) use a fake `tonic` in-process server. No live-gateway gating was found.
CODE_AREA: style.rust
SEVERITY: low
PROPOSED_FIX: flag only.
---
DOC: REVIEW-PROCESS.md
LINES: 14
CLAIM: "For a `src/` project, `<Module>` is the project name with the `ZB.MOM.WW.MxGateway.` prefix stripped — `src/ZB.MOM.WW.MxGateway.Server` is reviewed in `code-reviews/Server/`."
CLAIM_TYPE: path
VERDICT: accurate
EVIDENCE: `ls /Users/dohertj2/Desktop/mxaccessgateway/src/` confirms project names `ZB.MOM.WW.MxGateway.{Server,Worker,Contracts,Tests,Worker.Tests,IntegrationTests}`; `ls /Users/dohertj2/Desktop/mxaccessgateway/code-reviews/` shows folders `Server`, `Worker`, `Tests`, `Worker.Tests`, `IntegrationTests`, `Contracts`, `Client.*`. Mapping is accurate.
CODE_AREA: style.crossref
SEVERITY: low
PROPOSED_FIX: accurate — no action needed.
---
DOC: REVIEW-PROCESS.md
LINES: 68–69
CLAIM: Test projects are `src/ZB.MOM.WW.MxGateway.Tests`, `src/ZB.MOM.WW.MxGateway.Worker.Tests`, `src/ZB.MOM.WW.MxGateway.IntegrationTests`.
CLAIM_TYPE: path
VERDICT: accurate
EVIDENCE: All three directories exist under `src/` (confirmed by `ls /Users/dohertj2/Desktop/mxaccessgateway/src/`).
CODE_AREA: style.crossref
SEVERITY: low
PROPOSED_FIX: accurate — no action needed.
---
DOC: REVIEW-PROCESS.md
LINES: 77–78
CLAIM: Entry format is in `[code-reviews/_template/findings.md](code-reviews/_template/findings.md)`.
CLAIM_TYPE: path
VERDICT: accurate
EVIDENCE: File exists at `/Users/dohertj2/Desktop/mxaccessgateway/code-reviews/_template/findings.md`.
CODE_AREA: style.crossref
SEVERITY: low
PROPOSED_FIX: accurate — no action needed.
---
DOC: REVIEW-PROCESS.md
LINES: 120–127
CLAIM: "`python code-reviews/regen-readme.py`" regenerates the README; `regen-readme.py --check` validates it; `scripts/check-code-reviews-readme.ps1` is the CI hook.
CLAIM_TYPE: command
VERDICT: accurate
EVIDENCE: `code-reviews/regen-readme.py` exists; `scripts/check-code-reviews-readme.ps1` exists; the `code-reviews/README.md` header confirms generation ("GENERATED FILE — do not edit by hand. Regenerate with: `python code-reviews/regen-readme.py`").
CODE_AREA: style.crossref
SEVERITY: low
PROPOSED_FIX: accurate — no action needed.
---
DOC: docs/style-guides/CSharpStyleGuide.md
LINES: 11 ("Prefer file-scoped namespaces"), 12 ("Prefer `sealed` classes unless inheritance is required")
CLAIM: These are the established conventions in the codebase.
CLAIM_TYPE: behavior-rule
VERDICT: accurate
EVIDENCE: Spot-checked `src/` C# files: all non-generated namespace declarations use the file-scoped `namespace X.Y;` form (grep of block-scoped `namespace` without semicolon returns only generated/obj files). 249 `public sealed class` declarations found vs. exactly 1 bare `public class` (a test fixture `GatewayLogRedactorSeamTests`). Convention is well-followed.
CODE_AREA: style.csharp
SEVERITY: low
PROPOSED_FIX: accurate — no action needed.
---
DOC: docs/style-guides/GoStyleGuide.md
LINES: 13
CLAIM: "Keep generated protobuf code under `internal/generated` unless the public API intentionally exposes it."
CLAIM_TYPE: path
VERDICT: accurate
EVIDENCE: `clients/go/internal/generated/` exists and contains generated code (confirmed by `find /Users/dohertj2/Desktop/mxaccessgateway/clients/go -type d`).
CODE_AREA: style.go
SEVERITY: low
PROPOSED_FIX: accurate — no action needed.
---
DOC: docs/style-guides/RustStyleGuide.md
LINES: 49, 42
CLAIM: "Use `thiserror` for library error enums." / "Use `async` APIs with `tokio` for network operations."
CLAIM_TYPE: behavior-rule
VERDICT: accurate
EVIDENCE: `clients/rust/Cargo.toml` lists `thiserror = "2.0.17"` and `tokio = { version = "1.48.0", ... }` as workspace dependencies.
CODE_AREA: style.rust
SEVERITY: low
PROPOSED_FIX: accurate — no action needed.
---
## Summary
### Counts by verdict
| Verdict | Count |
|----------------|-------|
| wrong | 4 |
| stale | 2 |
| unverifiable | 3 |
| accurate | 7 |
| **Total** | **16** |
### Counts by severity
| Severity | Count |
|----------|-------|
| high | 3 |
| medium | 1 |
| low | 12 |
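Summary tables like these can be recomputed from the findings records instead of being counted by hand. The following is a minimal sketch that assumes the `KEY: value` record layout and `---` separators used in this fragment; the function names are illustrative and not part of any existing audit tooling.

```python
from collections import Counter
from typing import Dict, List


def parse_findings(fragment: str) -> List[Dict[str, str]]:
    """Split a findings fragment on `---` lines and read its KEY: value fields."""
    findings = []
    for block in fragment.split("\n---\n"):
        record = {}
        for line in block.splitlines():
            key, separator, value = line.partition(":")
            # Field names are upper-case (DOC, CLAIM_TYPE, ...); skip prose lines.
            if separator and key.strip().isupper():
                record[key.strip()] = value.strip()
        if "VERDICT" in record:
            findings.append(record)
    return findings


def tally(findings: List[Dict[str, str]], field: str) -> Counter:
    """Count how often each value of `field` occurs across the findings."""
    return Counter(record[field] for record in findings if field in record)
```

Calling `tally(parse_findings(text), "VERDICT")` and `tally(parse_findings(text), "SEVERITY")` on a cluster file would yield the two tables directly.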
### High-severity findings
- **StyleGuide.md line 3 (wrong/high):** The guide's opening sentence names "ScadaBridge" as the project — a stale copy-paste from a different codebase. The term does not appear anywhere else in the repo.
- **StyleGuide.md lines 12–263 (wrong/high):** All illustrative examples (types, file paths, doc cross-references, configuration keys) are from an Akka-actor project (`ScadaGatewayActor`, `ReceiveActor`, `IActorRef`, `../Akka/Actors.md`, `ScadaBridge:Timeout`, etc.). None exist in the MXAccess Gateway codebase. Every linked path is a dead reference.
- **JavaStyleGuide.md line 25 (wrong/high):** Prescribed Java package root `com.dohertylan.mxgateway` does not match the actual code, which universally uses `com.zb.mom.ww.mxgateway`.
+265
@@ -0,0 +1,265 @@
# Cluster 13 — Design-history/Plans
Audited docs:
- `docs/ImplementationPlanIndex.md`
- `docs/ImplementationPlanGateway.md`
- `docs/ImplementationPlanClients.md`
- `docs/ImplementationPlanMxAccessWorker.md`
- `docs/plans/2026-05-28-client-walker-design.md`
- `docs/plans/2026-05-28-client-walker-implementation.md`
- `docs/plans/2026-05-28-lazy-browse-design.md`
- `docs/plans/2026-05-28-lazy-browse-implementation.md`
- `docs/plans/2026-06-01-gateway-cert-autogen-design.md`
- `docs/plans/2026-06-01-gateway-cert-autogen-implementation.md`
---
DOC: docs/plans/2026-05-28-lazy-browse-implementation.md
LINES: 1059
CLAIM: `Run: dotnet build src/MxGateway.sln`
CLAIM_TYPE: path
VERDICT: stale
EVIDENCE: `git log --diff-filter=A -- src/MxGateway.sln` shows the file existed in commit a45f439 but was later renamed; actual file is `src/ZB.MOM.WW.MxGateway.slnx`
CODE_AREA: history.crossref
SEVERITY: low
PROPOSED_FIX: flag only — historical record; the build command step in this plan is a point-in-time artefact. If CLAUDE.md's own build table still says `src/MxGateway.sln` (it does — CLAUDE.md line 22), that living doc should be updated to `src/ZB.MOM.WW.MxGateway.slnx`.
---
DOC: docs/plans/2026-05-28-lazy-browse-implementation.md
LINES: 885, 888, 1069
CLAIM: `clients/dotnet/MxGateway.Client.sln`
CLAIM_TYPE: path
VERDICT: stale
EVIDENCE: Actual solution file is `clients/dotnet/ZB.MOM.WW.MxGateway.Client.slnx` (confirmed by `ls`). No `.sln` variant exists in that directory. Note: CLAUDE.md lines 57 and 93 carry the same stale name, so the plan merely repeated the living doc's error.
CODE_AREA: history.crossref
SEVERITY: low
PROPOSED_FIX: flag only — historical record. The living-doc fix targets CLAUDE.md lines 57 and 93: replace `clients/dotnet/MxGateway.Client.sln` with `clients/dotnet/ZB.MOM.WW.MxGateway.Client.slnx`.
---
DOC: docs/plans/2026-06-01-gateway-cert-autogen-implementation.md
LINES: 872, 1196
CLAIM: `clients/dotnet/MxGateway.Client.sln`
CLAIM_TYPE: path
VERDICT: stale
EVIDENCE: Same issue as above — actual file is `clients/dotnet/ZB.MOM.WW.MxGateway.Client.slnx`.
CODE_AREA: history.crossref
SEVERITY: low
PROPOSED_FIX: flag only — historical record. Living-doc fix is the same CLAUDE.md correction noted above.
---
DOC: docs/plans/2026-05-28-lazy-browse-implementation.md
LINES: 1315
CLAIM: "The design's Section 2 said stale page tokens return `FailedPrecondition`."
CLAIM_TYPE: behavior-rule
VERDICT: wrong
EVIDENCE: `docs/plans/2026-05-28-lazy-browse-design.md` lines 105 and 229 both say `InvalidArgument` for stale page tokens — `FailedPrecondition` appears nowhere in that document. The claim is internally inconsistent within the plan set: the design never contained `FailedPrecondition`.
CODE_AREA: history.crossref
SEVERITY: medium
PROPOSED_FIX: flag only — the implementation plan is a historical record. The deviation note is inaccurate as written (the design never said `FailedPrecondition`), but the implemented behavior (`InvalidArgument`) is correct and matches the design. No living doc needs correction: Task 10 of that plan was meant to reconcile the design doc to say `InvalidArgument`, and the design doc already says so.
---
DOC: docs/plans/2026-05-28-client-walker-implementation.md
LINES: 1219-1221
CLAIM: "`clients/go/mxgateway/galaxy.go:150` — `DiscoverHierarchy` paging idiom. `clients/go/mxgateway/galaxy_test.go:96` — `TestGalaxyDiscoverHierarchyReturnsObjects`. `clients/go/mxgateway/galaxy_test.go:370` — `fakeGalaxyServer` struct."
CLAIM_TYPE: path
VERDICT: stale
EVIDENCE: As-built: `DiscoverHierarchy` is at `galaxy.go:165` (grep confirms); `TestGalaxyDiscoverHierarchyReturnsObjects` is at `galaxy_test.go:99`; `fakeGalaxyServer` struct definition is at `galaxy_test.go:414`. The plan was written before additional code landed. These are implementer navigation hints, not design assertions.
CODE_AREA: history.crossref
SEVERITY: low
PROPOSED_FIX: flag only — stale line numbers in an implementation plan's "read first" guidance. No living doc is affected.
---
DOC: docs/plans/2026-05-28-client-walker-implementation.md
LINES: 580-585
CLAIM: "Python: `clients/python/tests/test_galaxy.py` — see `FakeGalaxyStub` (line 271), `FakeUnary` (286), `FakeStream` (304)"
CLAIM_TYPE: path
VERDICT: stale
EVIDENCE: As-built: `class FakeGalaxyStub` is at line 539, `class FakeUnary` at 556, `class FakeStream` at 580. The plan was written before additional tests were added to the file.
CODE_AREA: history.crossref
SEVERITY: low
PROPOSED_FIX: flag only — stale navigation hints in an implementation plan. No living doc is affected.
---
DOC: docs/plans/2026-05-28-client-walker-implementation.md
LINES: 937-941
CLAIM: "Rust: `clients/rust/src/galaxy.rs` lines 145-186 — `discover_hierarchy` for paging idiom. `clients/rust/src/galaxy.rs` lines 265+ as a test module (`#[cfg(test)] mod tests`)."
CLAIM_TYPE: path
VERDICT: stale
EVIDENCE: As-built: `discover_hierarchy` is at `galaxy.rs:254` (not 145-186); `#[cfg(test)] mod tests` begins at `galaxy.rs:421` (not 265). The file grew between plan authoring and implementation completion.
CODE_AREA: history.crossref
SEVERITY: low
PROPOSED_FIX: flag only — stale navigation hints in an implementation plan. No living doc is affected.
---
DOC: docs/ImplementationPlanGateway.md
LINES: 25-38
CLAIM: Solution and project names use prefix `ZB.MOM.WW.MxGateway.*` (e.g. `src/ZB.MOM.WW.MxGateway.slnx`, `src/ZB.MOM.WW.MxGateway.Server`).
CLAIM_TYPE: path
VERDICT: accurate
EVIDENCE: `ls src/` confirms `ZB.MOM.WW.MxGateway.slnx`, `ZB.MOM.WW.MxGateway.Server`, etc. all exist.
CODE_AREA: history.crossref
SEVERITY: low
PROPOSED_FIX: none
---
DOC: docs/ImplementationPlanGateway.md
LINES: 519-530
CLAIM: Related Documentation links to `./Sessions.md`, `./Grpc.md`, `./Authentication.md`, `./Authorization.md`, `./GatewayDashboardDesign.md`, `./GatewayConfiguration.md`, `./GatewayTesting.md`, `./Metrics.md`, `./Diagnostics.md`
CLAIM_TYPE: path
VERDICT: accurate
EVIDENCE: All nine files confirmed present under `docs/`.
CODE_AREA: history.crossref
SEVERITY: low
PROPOSED_FIX: none
---
DOC: docs/ImplementationPlanClients.md
LINES: 5-14
CLAIM: Primary design files: `docs/ClientLibrariesDesign.md`, `clients/dotnet/DotnetClientDesign.md`, `clients/go/GoClientDesign.md`, `clients/rust/RustClientDesign.md`, `clients/python/PythonClientDesign.md`, `clients/java/JavaClientDesign.md`
CLAIM_TYPE: path
VERDICT: accurate
EVIDENCE: All six files confirmed present.
CODE_AREA: history.crossref
SEVERITY: low
PROPOSED_FIX: none
---
DOC: docs/ImplementationPlanClients.md
LINES: 389-396
CLAIM: Related Documentation includes `./ClientProtoGeneration.md`, `./ClientBehaviorFixtures.md`, `./ClientPackaging.md`, `./CrossLanguageSmokeMatrix.md`
CLAIM_TYPE: path
VERDICT: accurate
EVIDENCE: All four files confirmed present under `docs/`.
CODE_AREA: history.crossref
SEVERITY: low
PROPOSED_FIX: none
---
DOC: docs/ImplementationPlanMxAccessWorker.md
LINES: 457-466
CLAIM: Related Documentation links: `./WorkerBootstrap.md`, `./WorkerSta.md`, `./WorkerConversion.md`, `./WorkerFrameProtocol.md`, `./WorkerProcessLauncher.md`, `./ParityFixtureMatrix.md`
CLAIM_TYPE: path
VERDICT: accurate
EVIDENCE: All six files confirmed present under `docs/`.
CODE_AREA: history.crossref
SEVERITY: low
PROPOSED_FIX: none
---
DOC: docs/plans/2026-05-28-client-walker-design.md
LINES: 68
CLAIM: Python source file path `clients/python/src/zb_mom_ww_mxgateway/galaxy.py`
CLAIM_TYPE: path
VERDICT: accurate
EVIDENCE: `ls clients/python/src/zb_mom_ww_mxgateway/galaxy.py` confirms existence.
CODE_AREA: history.crossref
SEVERITY: low
PROPOSED_FIX: none
---
DOC: docs/plans/2026-05-28-client-walker-design.md
LINES: 222-223
CLAIM: "commit `0d6193c`" added the "Browsing lazily" README sections
CLAIM_TYPE: cross-ref
VERDICT: accurate
EVIDENCE: `git show 0d6193c` confirms: subject "docs: note BrowseChildren in gateway overview and client READMEs"; modifies all five client READMEs and gateway.md. `grep "Browsing lazily" clients/*/README.md` confirms sections are present.
CODE_AREA: history.crossref
SEVERITY: low
PROPOSED_FIX: none
---
DOC: docs/plans/2026-05-28-lazy-browse-design.md
LINES: 105-111
CLAIM: Stale `page_token` → `InvalidArgument`; filter change between pages → `InvalidArgument`.
CLAIM_TYPE: behavior-rule
VERDICT: accurate-as-record
EVIDENCE: `docs/plans/2026-05-28-lazy-browse-implementation.md` implements `StatusCode.InvalidArgument` for both conditions (lines 529-530, 590, 616). Design and implementation are consistent.
CODE_AREA: history.crossref
SEVERITY: low
PROPOSED_FIX: none
---
DOC: docs/plans/2026-06-01-gateway-cert-autogen-design.md
LINES: 96
CLAIM: Java client uses "grpc-netty-shaded 1.76.0" and `InsecureTrustManagerFactory`
CLAIM_TYPE: version
VERDICT: accurate
EVIDENCE: `clients/java/settings.gradle` sets `grpcVersion = '1.76.0'`; `clients/java/zb-mom-ww-mxgateway-client/build.gradle` references `io.grpc:grpc-netty-shaded:${grpcVersion}`.
CODE_AREA: history.crossref
SEVERITY: low
PROPOSED_FIX: none
---
DOC: docs/plans/2026-06-01-gateway-cert-autogen-design.md
LINES: 98
CLAIM: Rust client uses "tonic 0.13.1 + rustls (`tls-ring`)"
CLAIM_TYPE: version
VERDICT: accurate
EVIDENCE: `clients/rust/Cargo.toml` line 40: `tonic = { version = "0.13.1", features = ["transport", "tls-ring"] }`.
CODE_AREA: history.crossref
SEVERITY: low
PROPOSED_FIX: none
---
DOC: docs/plans/2026-06-01-gateway-cert-autogen-design.md
LINES: 129-130
CLAIM: Documentation task calls for updating "each client README + `*ClientDesign.md`" (`clients/rust/RustClientDesign.md`, `clients/python/PythonClientDesign.md`, `clients/java/JavaClientDesign.md`, `clients/go/GoClientDesign.md`)
CLAIM_TYPE: path
VERDICT: accurate
EVIDENCE: All four `*ClientDesign.md` files confirmed present.
CODE_AREA: history.crossref
SEVERITY: low
PROPOSED_FIX: none
---
DOC: docs/ImplementationPlanGateway.md
LINES: 457-459
CLAIM: "`MxGateway:Dashboard:AllowAnonymousLocalhost` loopback bypass (defaults to true for local development)"
CLAIM_TYPE: config-key
VERDICT: accurate
EVIDENCE: `docs/GatewayConfiguration.md` line 149 confirms default `true`; CLAUDE.md line 119 notes the same behavior without specifying the default, but the Gateway plan's default matches the shipped configuration.
CODE_AREA: history.crossref
SEVERITY: low
PROPOSED_FIX: none
---
DOC: docs/plans/2026-05-28-client-walker-implementation.md
LINES: 940-941
CLAIM: "`clients/rust/tests/client_behavior.rs` (add tests; extend the `FakeGalaxy` impl from line 265+ to record BrowseChildren calls)"
CLAIM_TYPE: path
VERDICT: stale
EVIDENCE: `ls clients/rust/tests/` confirms `client_behavior.rs` does exist; however the `FakeGalaxy` implementation is in `clients/rust/src/galaxy.rs` (at `#[cfg(test)] mod tests`, line 421), not in `client_behavior.rs`. The "line 265+" reference is also stale (actual line is 421). The plan conflates the two files.
CODE_AREA: history.crossref
SEVERITY: low
PROPOSED_FIX: flag only — implementation-plan navigation hint that was partially wrong at time of writing (or grew inaccurate as code landed). No living doc is affected.
---
DOC: docs/plans/2026-05-28-client-walker-design.md
LINES: 89
CLAIM: Python source file is `clients/python/src/zb_mom_ww_mxgateway/galaxy.py`; the class is `LazyBrowseNode`.
CLAIM_TYPE: path
VERDICT: accurate
EVIDENCE: `grep -n "class LazyBrowseNode" clients/python/src/zb_mom_ww_mxgateway/galaxy.py` returns line 289.
CODE_AREA: history.crossref
SEVERITY: low
PROPOSED_FIX: none
@@ -0,0 +1,101 @@
# Documentation Audit Design
**Date:** 2026-06-03
**Goal:** Audit all prose documentation in the repository for accuracy and
completeness against the current code/contracts, then apply fixes — producing
both an evidence-backed findings report and corrected docs.
## Decisions
- **Deliverable:** Findings report *and* applied fixes (reviewable commit).
- **Scope:** All prose docs — top-level (`gateway.md`, `glauth.md`,
`StyleGuide.md`, `REVIEW-PROCESS.md`), `docs/**`, the 6 style guides, and all
10 client README + design docs (~55 files).
- **Verification depth:** Deep, claim-by-claim, against **this repo's code and
contracts only** (not the external `mxaccess` / `lmxopcua/gr` reference
projects).
- **Approach:** A + C dedup — per-subsystem verifier fan-out (A), with findings
additionally keyed by code area so renamed terms / moved paths are fixed
consistently everywhere at once (C).
## Architecture
Two-phase pipeline:
- **Phase 1 — Verify (fan-out).** ~13 verifier subagents, one per subsystem
cluster. Each reads its docs *and* the relevant source/contracts and returns a
structured findings list.
- **Phase 2 — Fix.** Aggregate findings → dedup by code area into a global
substitutions table → apply mechanical substitutions repo-wide, then
per-cluster judgment fixes. Every fix cites the finding that justifies it.
No code changes; prose only. Source of truth is the repo's own code/contracts.
## Cluster map (Phase 1 fan-out)
| # | Cluster | Docs | Verified against |
|---|---|---|---|
| 1 | Architecture | `gateway.md`, `DesignDecisions.md`, `GatewayProcessDesign.md` | top-level src layout, two-process model |
| 2 | Worker | `Worker{Bootstrap,Conversion,FrameProtocol,ProcessLauncher,Sta}.md`, `MxAccessWorkerInstanceDesign.md` | `…MxGateway.Worker` |
| 3 | Sessions/runtime | `Sessions.md` | Server sessions/workers |
| 4 | Auth (high drift) | `Authentication.md`, `Authorization.md`, `glauth.md` | `Security/Authentication`, ZB.MOM.WW.Auth migration |
| 5 | Dashboard (high drift) | `DashboardInterfaceDesign.md`, `GatewayDashboardDesign.md` | dashboard + ZB.MOM.WW.Theme migration |
| 6 | Config | `GatewayConfiguration.md`, `Diagnostics.md`, `Metrics.md` | `GatewayOptions(Validator)`, appsettings |
| 7 | Contracts/gRPC | `Contracts.md`, `Grpc.md`, `ClientProtoGeneration.md` | `.proto` + generated |
| 8 | Galaxy repo | `GalaxyRepository.md` | GR SQL browse RPCs |
| 9 | Alarms | `AlarmClientDiscovery.md` | alarm worker/server code |
| 10 | Testing | `GatewayTesting.md`, `ClientBehaviorFixtures.md`, `ParityFixtureMatrix.md`, `CrossLanguageSmokeMatrix.md`, `ToolchainLinks.md` | test projects, harness, fixtures |
| 11 | Clients ×5 | each `clients/<lang>/README.md` + `<Lang>ClientDesign.md`, `ClientLibrariesDesign.md`, `ClientPackaging.md` | each client's source |
| 12 | Style guides (normative) | `StyleGuide.md`, `REVIEW-PROCESS.md`, `docs/style-guides/*` | spot-check vs observed conventions; flag-only |
| 13 | Design-history/plans | `ImplementationPlan*.md`, `docs/plans/*` | point-in-time records — verify internal consistency + that no *living* doc cites them as current truth; do **not** rewrite to match code |
## Findings schema
Each verifier returns, per claim:
- `doc_file`, `doc_lines`
- `claim` — the concrete assertion (verbatim or tight paraphrase)
- `claim_type` — `path` | `config-key` | `rpc/proto` | `port` | `term` |
`behavior-rule` | `command` | `cross-ref` | `version`
- `verdict` — `accurate` | `stale` | `wrong` | `unverifiable` | `gap`
(missing doc for an existing feature)
- `code_evidence` — `file:line` proving the verdict
- `code_area` — dedup tag (e.g. `auth.roles`, `dashboard.theme`)
- `severity` — `high` (misleads integrator/operator) | `medium` | `low`
- `proposed_fix` — replacement text, or "flag only"
## Dedup layer (the C ingredient)
Aggregate all findings; group by `code_area` + normalized claim. Renamed terms
and moved paths collapse into one **global substitutions table** (old → new)
applied once repo-wide. Doc-specific judgment fixes stay per-doc.
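A minimal sketch of the grouping step, assuming findings are already loaded as dictionaries keyed by the schema names above. The normalization rule (strip backticks and quotes, lowercase, collapse whitespace) and the document names are illustrative assumptions, not something the design specifies.

```python
import re
from collections import defaultdict
from typing import Dict, List, Tuple

Key = Tuple[str, str]


def normalize_claim(claim: str) -> str:
    """Assumed normalization: drop backticks/quotes, lowercase, collapse spaces."""
    text = claim.replace("`", "").replace('"', "").lower()
    return re.sub(r"\s+", " ", text).strip()


def group_findings(findings: List[Dict[str, str]]) -> Dict[Key, List[str]]:
    """Group doc files by (code_area, normalized claim)."""
    groups: Dict[Key, List[str]] = defaultdict(list)
    for finding in findings:
        key = (finding["code_area"], normalize_claim(finding["claim"]))
        groups[key].append(finding["doc_file"])
    return groups


def recurring(groups: Dict[Key, List[str]]) -> Dict[Key, List[str]]:
    """Keep only keys seen in more than one doc: candidates for the global table."""
    return {key: sorted(set(docs))
            for key, docs in groups.items() if len(set(docs)) > 1}


# Hypothetical findings; the paths and tags are made up for illustration.
findings = [
    {"doc_file": "docs/A.md", "code_area": "example.paths", "claim": "`old/Path.sln`"},
    {"doc_file": "docs/B.md", "code_area": "example.paths", "claim": "old/path.sln "},
    {"doc_file": "docs/C.md", "code_area": "example.roles", "claim": "role `Viewer`"},
]
table_candidates = recurring(group_findings(findings))
```

Lowercasing keeps the sketch short, but it would merge claims that differ only by case, so a real pass may want a gentler rule for identifiers where case is the point of the rename.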
## Report
`MxAccessGateway-doc-audit.md` (matching the existing `*-docs-*.md` report
convention): summary counts by verdict/severity/cluster, the global
substitutions table, then per-doc findings with evidence and proposed fixes.
## Fix pass & verification
1. Apply global substitutions first (mechanical, repo-wide).
2. Apply per-cluster judgment fixes — each citing a finding ID, honoring
`StyleGuide.md` (PascalCase filenames, present tense, explain *why* not
*what*, no marketing language).
3. Re-verify changed claims (lighter spot pass) to confirm resolution and no new
drift.
4. Sanity-check any fenced shell/build commands in docs against current
invocations.
## Out of scope (YAGNI)
- Rewriting design-history docs to match current code (they are records).
- XML doc-comments (handled in a prior pass; the `*-docs-*.md` analyzer is clean).
- External `mxaccess` / `lmxopcua/gr` reference projects.
## Branch note
Current branch is `docs/xml-doc-comments` (a separate PR in flight). The audit
fixes are best kept on their own branch (`docs/prose-audit` off `main`) so the
two reviews stay independent.
@@ -0,0 +1,255 @@
# Documentation Audit Implementation Plan
> **For Claude:** REQUIRED SUB-SKILL: Use superpowers-extended-cc:executing-plans (or subagent-driven-development) to implement this plan task-by-task.
**Goal:** Audit every prose doc in the repo for accuracy and completeness against the current code/contracts, produce an evidence-backed findings report, then apply fixes.
**Architecture:** Two phases. **Verify** — 13 cluster verifier subagents read their docs + the relevant source and emit structured findings into per-cluster fragment files. **Fix** — a synthesis task aggregates fragments, dedups findings by `code_area` into a global-substitutions table and writes the report; a mechanical substitution pass applies repo-wide renames/path fixes; per-cluster fix tasks apply judgment edits; a final pass re-verifies changed claims. Prose only — never edit code.
**Tech Stack:** Markdown docs; C# (.NET 10 gateway / .NET Framework 4.8 x86 worker); protobuf contracts; 5 language clients (.NET, Go, Rust, Python, Java). Verification reads source under `src/` and `clients/`. No build required (prose), but fenced shell/build commands in docs are sanity-checked against real invocations.
**Design source:** `docs/plans/2026-06-03-documentation-audit-design.md`
**Branch:** Recommended `docs/prose-audit` off `main` (keeps this independent of the in-flight `docs/xml-doc-comments` PR). Decide at execution start; the executor must not work on `main`.
---
## Shared artifacts
- **Fragments dir:** `docs/audit/fragments/` — one Markdown fragment per cluster, `NN-<cluster>.md`. Tracked in git (evidence trail).
- **Report:** `MxAccessGateway-doc-audit.md` (repo root; not gitignored — only `*-docs-{issues,fixed,final}.md` are).
- **Findings schema** (every fragment entry, one block per claim, blocks separated by `---`):
```
DOC: <repo-relative doc path>
LINES: <line or range in the doc>
CLAIM: <the concrete assertion, verbatim or tight paraphrase>
CLAIM_TYPE: path | config-key | rpc/proto | port | term | behavior-rule | command | cross-ref | version
VERDICT: accurate | stale | wrong | unverifiable | gap
EVIDENCE: <source file:line that proves the verdict, or "none — feature absent">
CODE_AREA: <dedup tag, e.g. auth.roles, dashboard.theme, worker.frameproto>
SEVERITY: high | medium | low
PROPOSED_FIX: <replacement text, or "flag only">
```
Severity rule: `high` = misleads an integrator/operator (wrong port, wrong config key, wrong RPC, wrong auth behavior, broken build command); `medium` = stale-but-not-dangerous; `low` = cosmetic/typo/style.
Design-history docs (cluster 13) are point-in-time records: verify internal consistency and that no *living* doc cites them as current truth, but `PROPOSED_FIX` must be "flag only" for divergence-from-current-code — do **not** rewrite history to match code.
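As an illustration of how a fragment in this schema could be read back for the synthesis task, here is a small parser sketch. It assumes every field sits on a single line and that blocks are separated by a line containing only `---`; `parse_fragment` and `missing_fields` are hypothetical helper names, not part of the plan.

```python
from typing import Dict, List

# Field names taken from the findings schema above.
FIELDS = ("DOC", "LINES", "CLAIM", "CLAIM_TYPE", "VERDICT",
          "EVIDENCE", "CODE_AREA", "SEVERITY", "PROPOSED_FIX")


def parse_fragment(text: str) -> List[Dict[str, str]]:
    """Split a fragment into findings; blocks are separated by '---' lines."""
    findings: List[Dict[str, str]] = []
    current: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if line == "---":
            if current:
                findings.append(current)
            current = {}
            continue
        # Split on the first colon only, so values may contain 'file:line'.
        key, sep, value = line.partition(":")
        if sep and key in FIELDS:
            current[key] = value.strip()
    if current:
        findings.append(current)
    return findings


def missing_fields(finding: Dict[str, str]) -> List[str]:
    """Return schema fields absent from a finding (simple lint check)."""
    return [name for name in FIELDS if name not in finding]


# Made-up sample; the second block is deliberately incomplete.
SAMPLE = """\
DOC: docs/Example.md
LINES: 10
CLAIM: `src/Example.slnx`
CLAIM_TYPE: path
VERDICT: accurate
EVIDENCE: `ls src/` confirms the file exists.
CODE_AREA: example.area
SEVERITY: low
PROPOSED_FIX: none
---
DOC: docs/Other.md
VERDICT: stale
"""

parsed = parse_fragment(SAMPLE)
```

Lines that do not start with a known field name are ignored, so a multi-line `EVIDENCE` value would be truncated to its first line under this sketch.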
---
## Task 0: Scaffold audit workspace
**Classification:** trivial
**Estimated implement time:** ~2 min
**Parallelizable with:** none
**Files:**
- Create: `docs/audit/fragments/.gitkeep`
- Create: `docs/audit/README.md` (one paragraph: what this dir is, links to design + report)
**Steps:**
1. Confirm current branch is not `main`; if on `main`, create/switch to `docs/prose-audit`.
2. `mkdir -p docs/audit/fragments` and add `.gitkeep`.
3. Write `docs/audit/README.md` explaining the fragments are per-cluster audit evidence feeding `MxAccessGateway-doc-audit.md`, and link `docs/plans/2026-06-03-documentation-audit-design.md`.
4. Commit: `docs(audit): scaffold prose-audit workspace`.
**Acceptance:** `docs/audit/fragments/` exists and is tracked; not on `main`.
---
## Verifier tasks (1-13) — read-only, run in parallel
Each verifier task is **read-only analysis**. The subagent reads ONLY its assigned docs and the listed source, then writes its fragment file. It does not edit any doc or any code. Every finding must cite real `file:line` evidence; spot-check 3 citations before finishing. Use the findings schema above.
Common procedure for Tasks 1-13:
1. Read each assigned doc fully.
2. Extract every concrete, verifiable claim (paths, config keys, RPC/proto names, ports, role/term names, behavior rules, commands, cross-references, versions).
3. For each claim, locate the source of truth under the listed code area and assign a verdict + evidence + severity + proposed fix.
4. Also record `gap` entries: features/behaviors present in code but undocumented in the assigned docs.
5. Write all findings to the fragment file; commit `docs(audit): cluster NN findings — <name>`.
All verifier tasks are **Parallelizable with: Tasks 1-13** (disjoint fragment outputs, read-only on disjoint doc sets).
### Task 1: Verify Architecture cluster
**Classification:** small · **Est:** ~5 min · **Parallelizable with:** Tasks 2-13
**Files:** Create `docs/audit/fragments/01-architecture.md`
**Docs:** `gateway.md`, `docs/DesignDecisions.md`, `docs/GatewayProcessDesign.md`
**Verify against:** `src/` project layout, two-process model (`src/ZB.MOM.WW.MxGateway.{Server,Worker,Contracts}`), IPC pipe naming, STA model in `src/ZB.MOM.WW.MxGateway.Worker`.
### Task 2: Verify Worker cluster
**Classification:** small · **Est:** ~5 min · **Parallelizable with:** Tasks 1,3-13
**Files:** Create `docs/audit/fragments/02-worker.md`
**Docs:** `docs/WorkerBootstrap.md`, `docs/WorkerConversion.md`, `docs/WorkerFrameProtocol.md`, `docs/WorkerProcessLauncher.md`, `docs/WorkerSta.md`, `docs/MxAccessWorkerInstanceDesign.md`
**Verify against:** `src/ZB.MOM.WW.MxGateway.Worker/**`, `src/ZB.MOM.WW.MxGateway.Contracts/Protos/mxaccess_worker.proto`.
### Task 3: Verify Sessions/runtime cluster
**Classification:** small · **Est:** ~3 min · **Parallelizable with:** Tasks 1-2,4-13
**Files:** Create `docs/audit/fragments/03-sessions.md`
**Docs:** `docs/Sessions.md`
**Verify against:** `src/ZB.MOM.WW.MxGateway.Server/Sessions/**`, `.../Workers/**`.
### Task 4: Verify Auth cluster (high-drift)
**Classification:** standard · **Est:** ~5 min · **Parallelizable with:** Tasks 1-3,5-13
**Files:** Create `docs/audit/fragments/04-auth.md`
**Docs:** `docs/Authentication.md`, `docs/Authorization.md`, `glauth.md`
**Verify against:** `src/ZB.MOM.WW.MxGateway.Server/Security/**`. Pay special attention to the recent migration: API keys moved to `ZB.MOM.WW.Auth.ApiKeys`, role rename `Admin` → `Administrator`, LDAP base DN unified to `dc=zb,dc=local`, `ZbClaimTypes`/`ZbCookieDefaults`, scopes (`session`,`invoke`,`event`,`metadata`,`admin`), cookie name `__Host-MxGatewayDashboard`, SignalR hub token behavior. Tag these `auth.roles`, `auth.apikeys`, `auth.ldap`, `auth.cookie` for dedup.
### Task 5: Verify Dashboard cluster (high-drift)
**Classification:** standard · **Est:** ~5 min · **Parallelizable with:** Tasks 1-4,6-13
**Files:** Create `docs/audit/fragments/05-dashboard.md`
**Docs:** `docs/DashboardInterfaceDesign.md`, `docs/GatewayDashboardDesign.md`
**Verify against:** `src/ZB.MOM.WW.MxGateway.Server/Dashboard/**`, `.../wwwroot/**`. Note recent migration to `ZB.MOM.WW.Theme` (ThemeShell/ThemeHead/ThemeScripts, StatusPill), pruned sidebar/login CSS, Blazor LoginCard. Confirm the "no UI component libraries (local Bootstrap only)" rule still matches reality. Tag `dashboard.theme`, `dashboard.login`.
### Task 6: Verify Config cluster
**Classification:** small · **Est:** ~5 min · **Parallelizable with:** Tasks 1-5,7-13
**Files:** Create `docs/audit/fragments/06-config.md`
**Docs:** `docs/GatewayConfiguration.md`, `docs/Diagnostics.md`, `docs/Metrics.md`
**Verify against:** `src/ZB.MOM.WW.MxGateway.Server/Configuration/**` (`GatewayOptions`, `GatewayOptionsValidator`), `.../Diagnostics/**`, `.../Metrics/**`, `appsettings.json`. Check every `MxGateway:*` key, default, and validation rule. Tag `config.<key>`.
### Task 7: Verify Contracts/gRPC cluster
**Classification:** small · **Est:** ~5 min · **Parallelizable with:** Tasks 1-6,8-13
**Files:** Create `docs/audit/fragments/07-contracts.md`
**Docs:** `docs/Contracts.md`, `docs/Grpc.md`, `docs/ClientProtoGeneration.md`
**Verify against:** `src/ZB.MOM.WW.MxGateway.Contracts/Protos/{mxaccess_gateway,galaxy_repository,mxaccess_worker}.proto`, `src/ZB.MOM.WW.MxGateway.Server/Grpc/**`. Confirm every RPC name, message, and service described matches the `.proto`. Tag `proto.<rpc>`.
### Task 8: Verify Galaxy Repository cluster
**Classification:** small · **Est:** ~3 min · **Parallelizable with:** Tasks 1-7,9-13
**Files:** Create `docs/audit/fragments/08-galaxy.md`
**Docs:** `docs/GalaxyRepository.md`
**Verify against:** `src/ZB.MOM.WW.MxGateway.Server/Galaxy/**`, `galaxy_repository.proto`.
### Task 9: Verify Alarms cluster
**Classification:** small · **Est:** ~3 min · **Parallelizable with:** Tasks 1-8,10-13
**Files:** Create `docs/audit/fragments/09-alarms.md`
**Docs:** `docs/AlarmClientDiscovery.md`
**Verify against:** `src/ZB.MOM.WW.MxGateway.Server/Alarms/**`, `src/ZB.MOM.WW.MxGateway.Worker/MxAccess/Alarm*.cs`.
### Task 10: Verify Testing cluster
**Classification:** small · **Est:** ~5 min · **Parallelizable with:** Tasks 1-9,11-13
**Files:** Create `docs/audit/fragments/10-testing.md`
**Docs:** `docs/GatewayTesting.md`, `docs/ClientBehaviorFixtures.md`, `docs/ParityFixtureMatrix.md`, `docs/CrossLanguageSmokeMatrix.md`, `docs/ToolchainLinks.md`
**Verify against:** `src/ZB.MOM.WW.MxGateway.{Tests,Worker.Tests,IntegrationTests}/**`, env-var gates (`MXGATEWAY_RUN_LIVE_*`), `LiveMxAccessFactAttribute`/`LiveLdapFactAttribute`, `scripts/run-client-e2e-tests.ps1`. Sanity-check every fenced command and tool version/path.
### Task 11: Verify Clients cluster
**Classification:** standard · **Est:** ~5 min · **Parallelizable with:** Tasks 1-10,12-13
**Files:** Create `docs/audit/fragments/11-clients.md`
**Docs:** `clients/{dotnet,go,java,python,rust}/README.md` + each `*ClientDesign.md`, `docs/ClientLibrariesDesign.md`, `docs/ClientPackaging.md`
**Verify against:** each `clients/<lang>/` source + build manifests (`*.csproj`/`go.mod`/`Cargo.toml`/`pyproject.toml`/`build.gradle`). Confirm install/build/test commands, package names, and proto-generation steps match reality. Tag `client.<lang>`.
### Task 12: Verify Style-guides cluster (normative — flag-only)
**Classification:** small · **Est:** ~3 min · **Parallelizable with:** Tasks 1-11,13
**Files:** Create `docs/audit/fragments/12-styleguides.md`
**Docs:** `StyleGuide.md`, `REVIEW-PROCESS.md`, `docs/style-guides/*.md`
**Verify against:** observed conventions in the corresponding source trees. These prescribe rather than describe — record only contradictions between a stated rule and actual pervasive practice; `PROPOSED_FIX` mostly "flag only".
### Task 13: Verify Design-history/plans cluster (records — flag-only)
**Classification:** small · **Est:** ~4 min · **Parallelizable with:** Tasks 1-12
**Files:** Create `docs/audit/fragments/13-history.md`
**Docs:** `docs/ImplementationPlan{Index,Gateway,Clients,MxAccessWorker}.md`, `docs/plans/*`
**Verify against:** check internal consistency and whether any *living* doc (clusters 1-12) cites these as current truth. Do NOT propose rewriting history to match code — `PROPOSED_FIX` is "flag only" except for genuinely broken internal cross-references.
---
## Task 14: Synthesize findings report + dedup table
**Classification:** standard
**Estimated implement time:** ~5 min
**Parallelizable with:** none (blocked by Tasks 1-13)
**Files:**
- Read: `docs/audit/fragments/*.md`
- Create: `MxAccessGateway-doc-audit.md`
**Steps:**
1. Read all 13 fragments.
2. Build the **summary**: counts by verdict, by severity, by cluster.
3. Build the **global-substitutions table**: group findings by `CODE_AREA` + normalized claim; any rename/moved-path/renamed-term that recurs becomes one row `old → new (applies to: doc list)`. This is the C-dedup output.
4. Append **per-doc findings** (all fragment blocks, grouped by doc, ordered high→low severity).
5. Add a short **fix plan** mapping each fix task (16-22) to the findings it must resolve.
6. Commit: `docs(audit): findings report + global-substitutions table`.
**Acceptance:** report exists with summary, substitutions table, and every fragment finding represented; high-severity findings listed first per doc.
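The summary counts and the per-doc ordering in steps 2 and 4 could be sketched as follows. This assumes each finding is a dictionary using the schema's field names plus a `cluster` key derived from the fragment file name; that extra key and the sample values are assumptions for illustration.

```python
from collections import Counter
from typing import Dict, List

# Lower rank sorts first, so high-severity findings lead each doc's section.
SEVERITY_RANK = {"high": 0, "medium": 1, "low": 2}


def summarize(findings: List[Dict[str, str]]) -> Dict[str, Counter]:
    """Counts by verdict, by severity, and by cluster."""
    return {
        "verdict": Counter(f["VERDICT"] for f in findings),
        "severity": Counter(f["SEVERITY"] for f in findings),
        "cluster": Counter(f["cluster"] for f in findings),
    }


def order_for_report(findings: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Group by doc, then order high -> low severity within each doc."""
    return sorted(findings,
                  key=lambda f: (f["DOC"], SEVERITY_RANK[f["SEVERITY"]]))


# Made-up findings for illustration only.
sample = [
    {"DOC": "docs/B.md", "VERDICT": "stale", "SEVERITY": "low",
     "cluster": "02-worker"},
    {"DOC": "docs/A.md", "VERDICT": "accurate", "SEVERITY": "low",
     "cluster": "01-architecture"},
    {"DOC": "docs/A.md", "VERDICT": "wrong", "SEVERITY": "high",
     "cluster": "01-architecture"},
]
summary = summarize(sample)
ordered = order_for_report(sample)
```

Because `sorted` is stable, findings with the same doc and severity keep their fragment order.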
---
## Task 15: Apply global substitutions (mechanical, repo-wide)
**Classification:** standard
**Estimated implement time:** ~4 min
**Parallelizable with:** none (blocked by Task 14; blocks 16-22)
**Files:** every doc named in the substitutions table (prose only — never code).
**Steps:**
1. For each substitutions row, apply the `old → new` replacement across the listed docs only.
2. Be precise — do not over-replace (e.g. don't rewrite a deliberately historical mention in cluster-13 docs; the table's "applies to" list excludes those).
3. After each row, grep the repo docs to confirm no unintended remaining or over-broad matches.
4. Commit: `docs(audit): apply global term/path substitutions`.
**Acceptance:** every substitutions row applied to exactly its listed docs; `git diff` touches only `.md` files; spot-grep shows no stale term left in living docs.
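One way to keep a substitution row confined to its "applies to" list is sketched below. `apply_row` and `still_contains` are hypothetical helpers, the demo files are throwaway names, and the old/new pair is borrowed from the Java package rename elsewhere in this change set purely as sample data.

```python
import tempfile
from pathlib import Path
from typing import Dict, List


def apply_row(root: Path, old: str, new: str,
              applies_to: List[str]) -> Dict[str, int]:
    """Replace old with new in the listed docs only; return hits per doc."""
    counts: Dict[str, int] = {}
    for rel in applies_to:
        if not rel.endswith(".md"):
            raise ValueError(f"refusing to edit non-Markdown file: {rel}")
        path = root / rel
        text = path.read_text(encoding="utf-8")
        counts[rel] = text.count(old)
        if counts[rel]:
            path.write_text(text.replace(old, new), encoding="utf-8")
    return counts


def still_contains(root: Path, old: str, docs: List[str]) -> List[str]:
    """Post-row check: which of the given docs still mention the old term."""
    return [rel for rel in docs
            if old in (root / rel).read_text(encoding="utf-8")]


# Demo on temporary files: one living doc, one historical doc left untouched.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "Living.md").write_text(
        "Use `com.dohertylan.mxgateway`.\n", encoding="utf-8")
    (root / "History.md").write_text(
        "Was `com.dohertylan.mxgateway`.\n", encoding="utf-8")
    counts = apply_row(root, "com.dohertylan.mxgateway",
                       "com.zb.mom.ww.mxgateway", ["Living.md"])
    leftover = still_contains(root, "com.dohertylan.mxgateway",
                              ["Living.md", "History.md"])
```

A leftover match in a historical doc is expected here; the check only becomes a defect when a living doc appears in the result.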
---
## Fix tasks (16-22) — per-cluster judgment edits, run in parallel
Each fix task consumes the report's findings for its docs and applies each `PROPOSED_FIX` that is not "flag only". Honor `StyleGuide.md` (PascalCase filenames, present tense, explain *why* not *what*, no marketing language). Prose only. Do not re-apply global substitutions (Task 15 already did). All fix tasks are **Parallelizable with: Tasks 16-22** (disjoint doc sets) and **blocked by Task 15**.
Common procedure: read the report section for the cluster → apply each non-flag-only fix → re-read the edited doc to confirm coherence → commit `docs(audit): fix <cluster> findings`.
### Task 16: Fix Architecture + Sessions
**Classification:** small · **Est:** ~4 min · **Parallelizable with:** 17-22 · **Blocked by:** 15
**Files:** `gateway.md`, `docs/DesignDecisions.md`, `docs/GatewayProcessDesign.md`, `docs/Sessions.md`
### Task 17: Fix Worker
**Classification:** small · **Est:** ~4 min · **Parallelizable with:** 16,18-22 · **Blocked by:** 15
**Files:** `docs/Worker{Bootstrap,Conversion,FrameProtocol,ProcessLauncher,Sta}.md`, `docs/MxAccessWorkerInstanceDesign.md`
### Task 18: Fix Auth (high-drift)
**Classification:** standard · **Est:** ~5 min · **Parallelizable with:** 16-17,19-22 · **Blocked by:** 15
**Files:** `docs/Authentication.md`, `docs/Authorization.md`, `glauth.md`
### Task 19: Fix Dashboard (high-drift)
**Classification:** standard · **Est:** ~5 min · **Parallelizable with:** 16-18,20-22 · **Blocked by:** 15
**Files:** `docs/DashboardInterfaceDesign.md`, `docs/GatewayDashboardDesign.md`
### Task 20: Fix Config + Contracts/gRPC + Galaxy + Alarms
**Classification:** standard · **Est:** ~5 min · **Parallelizable with:** 16-19,21-22 · **Blocked by:** 15
**Files:** `docs/GatewayConfiguration.md`, `docs/Diagnostics.md`, `docs/Metrics.md`, `docs/Contracts.md`, `docs/Grpc.md`, `docs/ClientProtoGeneration.md`, `docs/GalaxyRepository.md`, `docs/AlarmClientDiscovery.md`
### Task 21: Fix Clients
**Classification:** standard · **Est:** ~5 min · **Parallelizable with:** 16-20,22 · **Blocked by:** 15
**Files:** `clients/{dotnet,go,java,python,rust}/README.md` + each `*ClientDesign.md`, `docs/ClientLibrariesDesign.md`, `docs/ClientPackaging.md`
### Task 22: Fix Testing + Style-guides + history cross-refs
**Classification:** small · **Est:** ~4 min · **Parallelizable with:** 16-21 · **Blocked by:** 15
**Files:** `docs/GatewayTesting.md`, `docs/ClientBehaviorFixtures.md`, `docs/ParityFixtureMatrix.md`, `docs/CrossLanguageSmokeMatrix.md`, `docs/ToolchainLinks.md`, `StyleGuide.md`, `REVIEW-PROCESS.md`, `docs/style-guides/*.md`, and only broken internal cross-refs in `docs/ImplementationPlan*.md` / `docs/plans/*` (no history rewrites).
---
## Task 23: Re-verify changed claims + finalize report
**Classification:** standard
**Estimated implement time:** ~5 min
**Parallelizable with:** none (blocked by Tasks 16-22)
**Files:**
- Modify: `MxAccessGateway-doc-audit.md` (mark each finding resolved/deferred)
- Read: all docs edited in Tasks 15-22
**Steps:**
1. For every `high`/`medium` finding that had a non-flag-only fix, re-check the edited prose against the same `EVIDENCE` source to confirm it is now accurate and no new inaccuracy was introduced.
2. Sanity-check every fenced shell/build command edited in this audit against the real invocation (compare to `CLAUDE.md` Build/Test/Run and each client README).
3. Update the report: each finding → `resolved` | `deferred (flag-only)` | `still-open` with a one-line note. Add a final tally.
4. Commit: `docs(audit): finalize report — resolution status`.
**Acceptance:** no `high`-severity finding left `still-open` without an explicit deferral note; report tally matches the fixes committed; `git diff --stat` across the branch shows only `.md` changes.
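The "only `.md` changes" part of this acceptance check can be sketched as a filter over changed paths. It assumes the path list comes from something like `git diff --name-only` against the base branch; `non_markdown` is a hypothetical helper and the sample paths are invented.

```python
from typing import Iterable, List


def non_markdown(paths: Iterable[str]) -> List[str]:
    """Return changed paths that are not Markdown files (blank lines skipped)."""
    offenders = []
    for raw in paths:
        path = raw.strip()
        if path and not path.lower().endswith(".md"):
            offenders.append(path)
    return offenders


# Invented change list for illustration.
changed = [
    "docs/Sessions.md",
    "gateway.md",
    "src/Example/Program.cs",
    "docs/audit/fragments/.gitkeep",
]
offenders = non_markdown(changed)
```

A strict filter like this also flags scaffold files such as `.gitkeep`; whether those should be exempted is a judgment call for the executor rather than something this sketch decides.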
---
## Notes for the executor
- **Read-only verifiers must not edit docs or code.** If a verifier is tempted to fix, it records a `PROPOSED_FIX` instead.
- **Never touch source code.** This audit changes `.md` only. A `git diff --stat` containing any non-`.md` file is a defect.
- **Parallel dispatch:** Tasks 1-13 all at once; then 14; then 15; then 16-22 all at once; then 23.
- **Worker/x86 build is not required** — verification is read-only against source. No Windows host needed for this plan.
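The dispatch order in these notes follows from the plan's "blocked by" relationships. As a rough sketch, the waves can be derived by repeatedly releasing every task whose blockers are done. The graph below is transcribed from the plan's task numbers (0 through 23); `dispatch_waves` is a hypothetical helper, not an existing tool.

```python
from typing import Dict, Iterable, List


def dispatch_waves(blocked_by: Dict[int, Iterable[int]]) -> List[List[int]]:
    """Group tasks into waves; each wave only needs earlier waves finished."""
    remaining = {task: set(deps) for task, deps in blocked_by.items()}
    done: set = set()
    waves: List[List[int]] = []
    while remaining:
        ready = sorted(t for t, deps in remaining.items() if deps <= done)
        if not ready:
            raise ValueError("dependency cycle or unknown blocker")
        waves.append(ready)
        done.update(ready)
        for task in ready:
            del remaining[task]
    return waves


# Task graph as described in the plan: scaffold, verify, synthesize,
# substitute, fix, re-verify.
graph: Dict[int, List[int]] = {0: []}
graph.update({t: [0] for t in range(1, 14)})
graph[14] = list(range(1, 14))
graph[15] = [14]
graph.update({t: [15] for t in range(16, 23)})
graph[23] = list(range(16, 23))

waves = dispatch_waves(graph)
```

Under these assumptions the result is six waves: Task 0 alone, then the thirteen verifiers together, then 14, then 15, then the seven fix tasks together, then 23.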
@@ -0,0 +1,30 @@
{
"planPath": "docs/plans/2026-06-03-documentation-audit-implementation.md",
"tasks": [
{"id": 24, "subject": "Task 0: Scaffold audit workspace", "status": "pending"},
{"id": 25, "subject": "Task 1: Verify Architecture cluster", "status": "pending", "blockedBy": [24]},
{"id": 26, "subject": "Task 2: Verify Worker cluster", "status": "pending", "blockedBy": [24]},
{"id": 27, "subject": "Task 3: Verify Sessions/runtime cluster", "status": "pending", "blockedBy": [24]},
{"id": 28, "subject": "Task 4: Verify Auth cluster (high-drift)", "status": "pending", "blockedBy": [24]},
{"id": 29, "subject": "Task 5: Verify Dashboard cluster (high-drift)", "status": "pending", "blockedBy": [24]},
{"id": 30, "subject": "Task 6: Verify Config cluster", "status": "pending", "blockedBy": [24]},
{"id": 31, "subject": "Task 7: Verify Contracts/gRPC cluster", "status": "pending", "blockedBy": [24]},
{"id": 32, "subject": "Task 8: Verify Galaxy Repository cluster", "status": "pending", "blockedBy": [24]},
{"id": 33, "subject": "Task 9: Verify Alarms cluster", "status": "pending", "blockedBy": [24]},
{"id": 34, "subject": "Task 10: Verify Testing cluster", "status": "pending", "blockedBy": [24]},
{"id": 35, "subject": "Task 11: Verify Clients cluster", "status": "pending", "blockedBy": [24]},
{"id": 36, "subject": "Task 12: Verify Style-guides cluster (flag-only)", "status": "pending", "blockedBy": [24]},
{"id": 37, "subject": "Task 13: Verify Design-history/plans cluster (flag-only)", "status": "pending", "blockedBy": [24]},
{"id": 38, "subject": "Task 14: Synthesize findings report + dedup table", "status": "pending", "blockedBy": [25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37]},
{"id": 39, "subject": "Task 15: Apply global substitutions (mechanical)", "status": "pending", "blockedBy": [38]},
{"id": 40, "subject": "Task 16: Fix Architecture + Sessions", "status": "pending", "blockedBy": [39]},
{"id": 41, "subject": "Task 17: Fix Worker", "status": "pending", "blockedBy": [39]},
{"id": 42, "subject": "Task 18: Fix Auth (high-drift)", "status": "pending", "blockedBy": [39]},
{"id": 43, "subject": "Task 19: Fix Dashboard (high-drift)", "status": "pending", "blockedBy": [39]},
{"id": 44, "subject": "Task 20: Fix Config + Contracts + Galaxy + Alarms", "status": "pending", "blockedBy": [39]},
{"id": 45, "subject": "Task 21: Fix Clients", "status": "pending", "blockedBy": [39]},
{"id": 46, "subject": "Task 22: Fix Testing + Style-guides + history cross-refs", "status": "pending", "blockedBy": [39]},
{"id": 47, "subject": "Task 23: Re-verify changed claims + finalize report", "status": "pending", "blockedBy": [40, 41, 42, 43, 44, 45, 46]}
],
"lastUpdated": "2026-06-03"
}
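The `blockedBy` lists in the tasks file imply the same dispatch order as the "Parallel dispatch" note in the plan. A minimal sketch of deriving those waves, assuming only the `id` and `blockedBy` fields shown above; `dispatch_waves` and the trimmed `SAMPLE` data are hypothetical and not part of the repository:

```python
import json
from typing import Dict, List, Set


def dispatch_waves(tasks: List[dict]) -> List[List[int]]:
    """Group task ids into waves; a task is ready once all of its blockers are done."""
    remaining: Dict[int, Set[int]] = {
        task["id"]: set(task.get("blockedBy", [])) for task in tasks
    }
    done: Set[int] = set()
    waves: List[List[int]] = []
    while remaining:
        # Everything whose blockers are already finished can run in parallel.
        ready = sorted(tid for tid, deps in remaining.items() if deps <= done)
        if not ready:
            raise ValueError("dependency cycle or unknown blocker id")
        waves.append(ready)
        done.update(ready)
        for tid in ready:
            del remaining[tid]
    return waves


# Trimmed, hypothetical stand-in for the tasks file (same shape, fewer entries).
SAMPLE = json.loads("""
{"tasks": [
  {"id": 24, "status": "pending"},
  {"id": 25, "status": "pending", "blockedBy": [24]},
  {"id": 26, "status": "pending", "blockedBy": [24]},
  {"id": 38, "status": "pending", "blockedBy": [25, 26]},
  {"id": 39, "status": "pending", "blockedBy": [38]}
]}
""")

waves = dispatch_waves(SAMPLE["tasks"])
```

Run against the full file, the same function should yield the scaffold task first, then the thirteen verify tasks together, then synthesis, global substitutions, the seven fix tasks together, and the final re-verify.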
+1 -1
@@ -22,7 +22,7 @@ library, CLI, and tests.
## Packages ## Packages
- Use lowercase package names under `com.dohertylan.mxgateway`. - Use lowercase package names under `com.zb.mom.ww.mxgateway`.
- Keep client library code separate from CLI code. - Keep client library code separate from CLI code.
- Keep generated protobuf classes in a generated package. - Keep generated protobuf classes in a generated package.
- Do not expose implementation-only transport helpers as public API. - Do not expose implementation-only transport helpers as public API.
+5 -3
@@ -24,8 +24,8 @@ CLI, and tests.
## Package Structure ## Package Structure
- Put library code under `src/mxgateway/`. - Put library code under `src/zb_mom_ww_mxgateway/`.
- Put CLI entry points under `src/mxgateway_cli/`. - Put CLI entry points under `src/zb_mom_ww_mxgateway_cli/`.
- Keep generated protobuf modules under a clearly named `generated` package. - Keep generated protobuf modules under a clearly named `generated` package.
- Avoid import side effects that open channels, read environment variables, or - Avoid import side effects that open channels, read environment variables, or
start background tasks. start background tasks.
@@ -65,4 +65,6 @@ CLI, and tests.
- Use `pytest` and `pytest-asyncio`. - Use `pytest` and `pytest-asyncio`.
- Use fake generated stubs or an in-process test gRPC server for unit tests. - Use fake generated stubs or an in-process test gRPC server for unit tests.
- Keep live integration tests behind `MXGATEWAY_INTEGRATION=1`. - Keep live integration tests behind an explicit opt-in environment variable
and a `pytest` skip guard, matching the existing tests (for example the
loopback TLS tests gate on `MXGATEWAY_RUN_TLS_TESTS=1`).
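The opt-in guard described in the new wording can be sketched as follows. This is a hedged illustration, not the project's actual helper: `opted_in` is a hypothetical name, and treating only the exact value `1` as enabled is an assumption drawn from the `MXGATEWAY_RUN_TLS_TESTS=1` example.

```python
import os
from typing import Mapping


def opted_in(var_name: str, environ: Mapping[str, str] = os.environ) -> bool:
    """Return True only when the opt-in variable is set to exactly "1" (assumed convention)."""
    return environ.get(var_name, "") == "1"


# In a test module the guard would typically feed pytest's skipif marker:
#
#   import pytest
#
#   requires_tls = pytest.mark.skipif(
#       not opted_in("MXGATEWAY_RUN_TLS_TESTS"),
#       reason="set MXGATEWAY_RUN_TLS_TESTS=1 to run the loopback TLS tests",
#   )
#
#   @requires_tls
#   def test_loopback_tls_handshake(): ...
```

Keeping the environment lookup in a small function makes the guard itself unit-testable without touching the real process environment.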
+28 -16
@@ -145,9 +145,10 @@ for the alarm subsystem.
Dashboard authentication is LDAP-backed (distinct from the API-key model on Dashboard authentication is LDAP-backed (distinct from the API-key model on
the gRPC API). `/login` accepts username and password in a form body, binds the gRPC API). `/login` accepts username and password in a form body, binds
against `MxGateway:Ldap`, maps the user's LDAP groups to `Admin` or `Viewer` against `MxGateway:Ldap`, maps the user's LDAP groups to `Administrator` or
via `MxGateway:Dashboard:GroupToRole`, and issues an HTTP-only secure `Viewer` via `MxGateway:Dashboard:GroupToRole`, and issues an HTTP-only secure
`__Host-MxGatewayDashboard` cookie. `/logout` clears it. Login and logout `MxGatewayDashboard` cookie (the name is configurable via
`MxGateway:Dashboard:CookieName`). `/logout` clears it. Login and logout
posts validate antiforgery tokens. SignalR hub connections accept either the posts validate antiforgery tokens. SignalR hub connections accept either the
cookie or a 30-minute data-protected bearer minted at `/hubs/token`. cookie or a 30-minute data-protected bearer minted at `/hubs/token`.
`MxGateway:Dashboard:AllowAnonymousLocalhost` permits loopback to bypass the `MxGateway:Dashboard:AllowAnonymousLocalhost` permits loopback to bypass the
@@ -232,27 +233,35 @@ message WorkerEnvelope {
uint32 protocol_version = 1; uint32 protocol_version = 1;
string session_id = 2; string session_id = 2;
uint64 sequence = 3; uint64 sequence = 3;
uint64 correlation_id = 4; string correlation_id = 4;
oneof body { oneof body {
WorkerHello worker_hello = 10; GatewayHello gateway_hello = 10;
GatewayHello gateway_hello = 11; WorkerHello worker_hello = 11;
WorkerReady worker_ready = 12; WorkerReady worker_ready = 12;
WorkerCommand command = 20; WorkerCommand worker_command = 13;
WorkerCommandReply command_reply = 21; WorkerCommandReply worker_command_reply = 14;
WorkerEvent event = 22; WorkerCancel worker_cancel = 15;
WorkerHeartbeat heartbeat = 23; WorkerShutdown worker_shutdown = 16;
WorkerCancel cancel = 24; WorkerShutdownAck worker_shutdown_ack = 17;
WorkerShutdown shutdown = 25; WorkerEvent worker_event = 18;
WorkerFault fault = 26; WorkerHeartbeat worker_heartbeat = 19;
WorkerFault worker_fault = 20;
} }
} }
``` ```
The contract evolves additively only: field numbers and enum values are never
renumbered or repurposed, so a stale gateway and worker that disagree on the
newest tags still decode the fields they share. `correlation_id` is a `string`
(not a numeric id) because it is the same correlation token the public gRPC API
carries end to end, so the worker never has to translate id formats.
Rules: Rules:
- `sequence` is monotonic per sender. - `sequence` is monotonic per sender.
- `correlation_id` links commands to replies. - `correlation_id` links commands to replies.
- Events use their own correlation id or zero. - Events carry their own correlation id or an empty string.
- Replies must preserve MXAccess HRESULT/status information even when the - Replies must preserve MXAccess HRESULT/status information even when the
command is also represented as a protocol-level failure. command is also represented as a protocol-level failure.
- Protocol version mismatch fails session creation. - Protocol version mismatch fails session creation.
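The rules above (per-sender monotonic `sequence`, string `correlation_id` linking a command to its reply, empty string for uncorrelated events) can be sketched in a few lines. This is an illustrative model only: `Envelope`, `Sender`, and `PendingCommands` are hypothetical names, the body is reduced to a string tag, and starting `sequence` at 1 is an assumption.

```python
from dataclasses import dataclass
from itertools import count
from typing import Dict, Optional

PROTOCOL_VERSION = 1  # hypothetical value for illustration


@dataclass(frozen=True)
class Envelope:
    protocol_version: int
    session_id: str
    sequence: int
    correlation_id: str  # string token; empty for events that carry none
    body: str            # stand-in for the protobuf `oneof body`


class Sender:
    """Stamps outgoing envelopes with a sequence that only increases for this sender."""

    def __init__(self, session_id: str) -> None:
        self._session_id = session_id
        self._next_sequence = count(1)  # assumed starting point

    def wrap(self, body: str, correlation_id: str = "") -> Envelope:
        return Envelope(PROTOCOL_VERSION, self._session_id,
                        next(self._next_sequence), correlation_id, body)


class PendingCommands:
    """Matches replies back to the commands that share their correlation id."""

    def __init__(self) -> None:
        self._pending: Dict[str, Envelope] = {}

    def sent(self, command: Envelope) -> None:
        self._pending[command.correlation_id] = command

    def match(self, reply: Envelope) -> Optional[Envelope]:
        return self._pending.pop(reply.correlation_id, None)


gateway, worker = Sender("s-1"), Sender("s-1")
pending = PendingCommands()

command = gateway.wrap("worker_command", correlation_id="c-1")
pending.sent(command)
reply = worker.wrap("worker_command_reply", correlation_id="c-1")
matched = pending.match(reply)
event = worker.wrap("worker_event")  # no correlation id, so empty string
```

Each side keeps its own counter, so gateway and worker sequences are independent; only `correlation_id` ties the two directions together.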
@@ -659,8 +668,10 @@ External gateway:
- authenticate v1 gRPC clients with `authorization: Bearer - authenticate v1 gRPC clients with `authorization: Bearer
mxgw_<key-id>_<secret>` API-key metadata, mxgw_<key-id>_<secret>` API-key metadata,
- reject missing or invalid API keys with gRPC `Unauthenticated`, - reject missing or invalid API keys with gRPC `Unauthenticated`,
- reject valid keys that lack the required session, invoke, event, metadata, or - reject valid keys that lack the required scope with gRPC `PermissionDenied`.
admin scope with gRPC `PermissionDenied`, Scopes are fine-grained: `session:open`, `session:close`, `invoke:read`,
`invoke:write`, `invoke:secure`, `events:read`, `metadata:read`, and `admin`
(see `GatewayScopes`),
- authorize access to commands that can write, authenticate users, expose - authorize access to commands that can write, authenticate users, expose
metadata, stream events, or alter runtime state. metadata, stream events, or alter runtime state.
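The two rejection paths in the list above can be sketched as a single decision function. This is a simplified model under stated assumptions: `authorize` and the in-memory `known_keys` table are hypothetical, real key validation would verify the secret rather than look up the raw token, and the sketch deliberately does not assume that `admin` implies any other scope.

```python
from typing import Dict, FrozenSet, Optional

# Scope names as listed in the updated text.
SCOPES: FrozenSet[str] = frozenset({
    "session:open", "session:close",
    "invoke:read", "invoke:write", "invoke:secure",
    "events:read", "metadata:read", "admin",
})


def authorize(presented_key: Optional[str],
              known_keys: Dict[str, FrozenSet[str]],
              required_scope: str) -> str:
    """Return the gRPC status name the gateway would answer with (sketch)."""
    if required_scope not in SCOPES:
        raise ValueError(f"unknown scope: {required_scope}")
    # Missing or unrecognized key: the caller is not authenticated at all.
    if presented_key is None or presented_key not in known_keys:
        return "Unauthenticated"
    # Valid key that lacks the scope for this call.
    if required_scope not in known_keys[presented_key]:
        return "PermissionDenied"
    return "OK"


# Hypothetical key table for illustration only.
KEYS = {"mxgw_reader_s3cret": frozenset({"session:open", "invoke:read"})}
```

For example, a key holding only `session:open` and `invoke:read` passes a read call but is answered with `PermissionDenied` on a call that needs `invoke:write`.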
@@ -901,6 +912,7 @@ State machine:
Creating Creating
-> StartingWorker -> StartingWorker
-> WaitingForPipe -> WaitingForPipe
-> Handshaking
-> InitializingWorker -> InitializingWorker
-> Ready -> Ready
-> Closing -> Closing
+71 -50
@@ -59,13 +59,17 @@ For mxaccessgw dev, `admin` covers every gw-side capability test;
`readonly` is the right "negative" case for proving Browse-OK / `readonly` is the right "negative" case for proving Browse-OK /
Write-denied. Write-denied.
The gateway dashboard adds one role beyond this LmxOpcUa taxonomy: The gateway dashboard adds one group beyond this LmxOpcUa taxonomy:
`GwAdmin`. `LdapOptions.RequiredGroup` defaults to `GwAdmin`, so the `GwAdmin`. There is no `RequiredGroup` option — dashboard authorization
dashboard login and `DashboardLdapLiveTests` require `admin` to be a is driven entirely by `MxGateway:Dashboard:GroupToRole`, which maps an
member of a `GwAdmin` group. `GwAdmin` is **not** in the baseline LDAP group to a dashboard role. A user whose groups produce no mapped
GLAuth config — it must be provisioned before dashboard authn or the role is rejected at login. So for the dashboard to admit `admin`, a
LDAP live tests work. See [Provisioning the GwAdmin group named in `GroupToRole` (by convention `GwAdmin` → `Administrator`)
group](#provisioning-the-gwadmin-group) below. must exist and `admin` must belong to it. `GwAdmin` is **not** in the
baseline GLAuth config — it must be provisioned before dashboard authn
or the `DashboardLdapLiveTests` (`MXGATEWAY_RUN_LIVE_LDAP_TESTS=1`)
work. See [Provisioning the GwAdmin group](#provisioning-the-gwadmin-group)
below.
> **Dashboard role value (Task 1.7):** the LDAP `GwAdmin` group now maps to > **Dashboard role value (Task 1.7):** the LDAP `GwAdmin` group now maps to
> the canonical dashboard role **`Administrator`** (was `Admin`); `GwReader` > the canonical dashboard role **`Administrator`** (was `Admin`); `GwReader`
@@ -112,43 +116,58 @@ to avoid re-deriving the LDAP escape-string handling.
## Suggested mxgw configuration shape ## Suggested mxgw configuration shape
A YAML/JSON section for mxaccessgw that mirrors LmxOpcUa's `LdapOptions` The gateway binds the `MxGateway:Ldap` section onto `LdapOptions`. The
record: field names are PascalCase config keys (shown here as YAML; JSON
`appsettings` and env-var overrides use the same names). Note the keys
that changed from the older LmxOpcUa shape: `Transport` (an enum,
replacing the boolean `UseTls`), `AllowInsecure` (replacing
`AllowInsecureLdap`), and `UserNameAttribute` which defaults to `cn`:
```yaml ```yaml
ldap: MxGateway:
enabled: true Ldap:
server: localhost Enabled: true
port: 3893 Server: localhost
useTls: false Port: 3893
allowInsecureLdap: true # dev only Transport: None # None | StartTls | Ldaps (dev: None)
searchBase: "dc=zb,dc=local" AllowInsecure: true # dev only
serviceAccountDn: "cn=serviceaccount,dc=zb,dc=local" SearchBase: "dc=zb,dc=local"
serviceAccountPassword: "serviceaccount123" ServiceAccountDn: "cn=serviceaccount,dc=zb,dc=local"
userNameAttribute: "uid" # GLAuth populates this; AD uses sAMAccountName ServiceAccountPassword: "serviceaccount123"
displayNameAttribute: "cn" UserNameAttribute: "cn" # GLAuth keys users by cn; AD uses sAMAccountName
groupAttribute: "memberOf" DisplayNameAttribute: "cn"
groupToRole: GroupAttribute: "memberOf"
ReadOnly: "Browse" Dashboard:
WriteOperate: "Write" GroupToRole:
WriteTune: "WriteSecured" GwAdmin: "Administrator"
WriteConfigure: "WriteSecured" GwReader: "Viewer"
AlarmAck: "AlarmAck"
``` ```
`groupAttribute` returns full DNs like `Transport` is an `LdapTransport` enum (`None`, `StartTls`, `Ldaps`); it
`ou=ReadOnly,ou=groups,dc=zb,dc=local` — the authenticator replaces the old boolean `UseTls` (`true` ≈ `Ldaps`, `false` = `None`).
should strip the leading `ou=` (or `cn=` against AD) RDN value and `UserNameAttribute` defaults to `cn` because GLAuth keys users by `cn`
look that up in `groupToRole`. (`backend.nameformat = "cn"`); only AD needs `sAMAccountName`. The
group-to-role mapping lives under `MxGateway:Dashboard:GroupToRole`, not
in the LDAP section, and its values must be dashboard roles
(`Administrator` or `Viewer`).
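For anyone carrying over an older LmxOpcUa-style section, the key renames can be sketched as a small translation. This is a hypothetical helper, not part of the gateway: it assumes PascalCase legacy keys, and it maps a legacy `UseTls: true` to `Ldaps`, which may need to be `StartTls` instead depending on how the directory was actually reached.

```python
from typing import Any, Dict


def migrate_ldap_keys(legacy: Dict[str, Any]) -> Dict[str, Any]:
    """Translate the older LDAP option names to the current MxGateway:Ldap names (sketch)."""
    migrated = dict(legacy)  # leave the caller's dictionary untouched
    if "UseTls" in migrated:
        # Assumption: true meant LDAPS; review manually if StartTLS was in use.
        migrated["Transport"] = "Ldaps" if migrated.pop("UseTls") else "None"
    if "AllowInsecureLdap" in migrated:
        migrated["AllowInsecure"] = migrated.pop("AllowInsecureLdap")
    return migrated


dev_section = migrate_ldap_keys({
    "Server": "localhost",
    "Port": 3893,
    "UseTls": False,
    "AllowInsecureLdap": True,
})
```

Group-to-role entries are not handled here because they now live under `MxGateway:Dashboard:GroupToRole` rather than in the LDAP section.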
The shared `ZB.MOM.WW.Auth.Ldap` provider performs the runtime bind and
search; it returns each group already stripped to its short RDN value
(e.g. `GwAdmin` from `ou=GwAdmin,ou=groups,dc=zb,dc=local`) before the
gateway looks it up in `GroupToRole`. Keep `GroupToRole` keys as short
group names — a full-DN key will never match the short name the provider
returns.
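The lookup described above can be sketched as follows. It is an approximation of the behavior, not the `ZB.MOM.WW.Auth.Ldap` implementation: the function names are hypothetical, the DN parsing ignores escaped commas, and exact-case matching of group names is an assumption.

```python
from typing import Dict, Iterable, Set


def short_group_name(group_dn: str) -> str:
    """Return the value of the leading RDN, e.g. 'GwAdmin' from 'ou=GwAdmin,ou=groups,...'."""
    first_rdn = group_dn.split(",", 1)[0]      # naive split; escaped commas not handled
    _, _, value = first_rdn.partition("=")
    return value.strip()


def resolve_roles(member_of: Iterable[str], group_to_role: Dict[str, str]) -> Set[str]:
    """Map each group's short name through GroupToRole; unmapped groups contribute nothing."""
    roles: Set[str] = set()
    for group_dn in member_of:
        role = group_to_role.get(short_group_name(group_dn))
        if role is not None:
            roles.add(role)
    return roles


GROUP_TO_ROLE = {"GwAdmin": "Administrator", "GwReader": "Viewer"}
ADMIN_DN = "ou=GwAdmin,ou=groups,dc=zb,dc=local"

admin_roles = resolve_roles([ADMIN_DN], GROUP_TO_ROLE)
unmapped_roles = resolve_roles(["ou=ReadOnly,ou=groups,dc=zb,dc=local"], GROUP_TO_ROLE)
full_dn_key_roles = resolve_roles([ADMIN_DN], {ADMIN_DN: "Administrator"})
```

An empty result corresponds to the case where login is rejected, and the last call shows why a full-DN key in `GroupToRole` never matches.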
## Provisioning the GwAdmin group ## Provisioning the GwAdmin group
`GwAdmin` is the gateway-specific dashboard-admin role. It is the `GwAdmin` is the gateway-specific dashboard-admin group, mapped to the
default `LdapOptions.RequiredGroup`, so the dashboard cookie login and `Administrator` role through `MxGateway:Dashboard:GroupToRole`. Because
`DashboardLdapLiveTests` (`MXGATEWAY_RUN_LIVE_LDAP_TESTS=1`) reject dashboard login rejects any user who resolves to no role, the dashboard
`admin` until a `GwAdmin` group exists and `admin` is a member. cookie login and `DashboardLdapLiveTests`
GLAuth's baseline config ships only the five LmxOpcUa role groups, so (`MXGATEWAY_RUN_LIVE_LDAP_TESTS=1`) reject `admin` until a `GwAdmin`
`GwAdmin` must be added to GLAuth rather than run from a separate LDAP group exists, `admin` is a member, and `GroupToRole` maps `GwAdmin` to a
role. GLAuth's baseline config ships only the five LmxOpcUa role groups,
so `GwAdmin` must be added to GLAuth rather than run from a separate LDAP
server: server:
1. Edit `C:\publish\glauth\glauth.cfg` 1. Edit `C:\publish\glauth\glauth.cfg`
@@ -178,10 +197,11 @@ server:
4. `nssm restart GLAuth` 4. `nssm restart GLAuth`
After the restart, `admin`'s `memberOf` includes After the restart, `admin`'s `memberOf` includes
`ou=GwAdmin,ou=groups,dc=zb,dc=local`, which the authenticator `ou=GwAdmin,ou=groups,dc=zb,dc=local`. The shared LDAP provider strips
strips to `GwAdmin` and matches against `RequiredGroup`. The same that to the short RDN `GwAdmin`, which the gateway looks up in
pattern applies to any future permission that doesn't fit the existing `MxGateway:Dashboard:GroupToRole` to resolve the dashboard role. The same
five roles. pattern applies to any future group that doesn't fit the existing five
roles — add the group, add the member, and add a `GroupToRole` entry.
Generate `passsha256` from a plaintext password: Generate `passsha256` from a plaintext password:
@@ -254,24 +274,25 @@ Get-Content C:\publish\glauth\logs\stderr.log -Tail 20 -Wait
## Active Directory migration cheat-sheet ## Active Directory migration cheat-sheet
LmxOpcUa's `LdapOptions` xml-doc captures the AD overrides; same set These `MxGateway:Ldap` keys change when pointing the gateway at AD
applies to mxaccessgw verbatim. Keys that change: instead of dev GLAuth:
| Field | GLAuth dev value | AD production value | | Field | GLAuth dev value | AD production value |
|---|---|---| |---|---|---|
| `Server` | `localhost` | a domain controller FQDN, or the domain itself | | `Server` | `localhost` | a domain controller FQDN, or the domain itself |
| `Port` | `3893` | `636` (LDAPS) — AD increasingly rejects plain bind under LDAP-signing enforcement | | `Port` | `3893` | `636` (LDAPS) — AD increasingly rejects plain bind under LDAP-signing enforcement |
| `UseTls` | `false` | `true` | | `Transport` | `None` | `Ldaps` (or `StartTls`) |
| `AllowInsecureLdap` | `true` | `false` | | `AllowInsecure` | `true` | `false` |
| `SearchBase` | `dc=zb,dc=local` | `DC=corp,DC=example,DC=com` | | `SearchBase` | `dc=zb,dc=local` | `DC=corp,DC=example,DC=com` |
| `ServiceAccountDn` | `cn=serviceaccount,dc=zb,dc=local` | `CN=MxGwSvc,OU=Service Accounts,DC=corp,...` | | `ServiceAccountDn` | `cn=serviceaccount,dc=zb,dc=local` | `CN=MxGwSvc,OU=Service Accounts,DC=corp,...` |
| `UserNameAttribute` | `uid` | `sAMAccountName` (or `userPrincipalName`) | | `UserNameAttribute` | `cn` | `sAMAccountName` (or `userPrincipalName`) |
| `GroupAttribute` | `memberOf` (unchanged) | `memberOf` (unchanged) | | `GroupAttribute` | `memberOf` (unchanged) | `memberOf` (unchanged) |
`memberOf` returns full DNs; the authenticator strips the leading `memberOf` returns full DNs; the shared LDAP provider strips each to its
`CN=` value and uses it as the lookup key in `groupToRole`. Nested leading RDN value (`CN=`/`OU=`) and the gateway uses that as the lookup
groups are **not** auto-expanded; either flatten in the directory or key in `MxGateway:Dashboard:GroupToRole`. Nested groups are **not**
add a `tokenGroups` query as an enhancement. auto-expanded; either flatten in the directory or add a `tokenGroups`
query as an enhancement.
## Security notes for production ## Security notes for production