docs(audit): current-state MxAccessGateway

Joseph Doherty
2026-06-01 06:55:07 -04:00
parent e498bb7c5a
commit 02cc687556
@@ -0,0 +1,118 @@
# Audit — current state: MxAccessGateway (`mxaccessgw`)
Repo: `~/Desktop/MxAccessGateway` (Gitea `mxaccessgw`). Stack: .NET 10 gateway (x64) + x86/net48 worker.
Audit lives entirely in the **gateway** (.NET 10); the worker records nothing.
All paths relative to repo root; audit code under `src/ZB.MOM.WW.MxGateway.Server/Security/Authentication/`. Verified 2026-06-01.
This is the **narrowest** of the three implementations: a single SQLite-backed append-only log scoped
to **API-key lifecycle and constraint denials**. There is no general-purpose audit abstraction, no
separate redaction seam, and no CorrelationId. Read-back exists but has no production consumer today.
## How it works today
The audit log is one seam, `IApiKeyAuditStore`
(`src/ZB.MOM.WW.MxGateway.Server/Security/Authentication/IApiKeyAuditStore.cs:6`), with exactly two
operations: `AppendAsync(ApiKeyAuditEntry, ...)` (`IApiKeyAuditStore.cs:14`) and
`ListRecentAsync(int count, ...)` (`IApiKeyAuditStore.cs:22`). Single implementation,
`SqliteApiKeyAuditStore` (`SqliteApiKeyAuditStore.cs:5`), registered as a singleton in
`AuthStoreServiceCollectionExtensions.cs:23` alongside the rest of the auth stores.
- **Append-side shape:** callers pass `ApiKeyAuditEntry(string? KeyId, string EventType, string? RemoteAddress, string? Details)`
(`ApiKeyAuditEntry.cs:3`). The store sets the timestamp itself — `AppendAsync` writes
`created_utc = DateTimeOffset.UtcNow.ToString("O")` (`SqliteApiKeyAuditStore.cs:20`), so the caller
cannot supply the time and there is **no idempotency/event key** (the only identity is the DB
`AUTOINCREMENT` rowid).
- **Read-side shape:** `ListRecentAsync` returns `ApiKeyAuditRecord(long AuditId, string? KeyId, string EventType, string? RemoteAddress, DateTimeOffset CreatedUtc, string? Details)`
(`ApiKeyAuditRecord.cs:3`), ordered `audit_id DESC LIMIT $count` (`SqliteApiKeyAuditStore.cs:38-42`),
returning `[]` for `count <= 0` (`SqliteApiKeyAuditStore.cs:29-32`).
- **Storage:** SQLite, the same gateway-owned auth DB (`AuthSqliteConnectionFactory`, WAL; default
`C:\ProgramData\MxGateway\gateway-auth.db`). Table `api_key_audit` is created by
`SqliteAuthStoreMigrator.cs:95-102` with columns `audit_id INTEGER PRIMARY KEY AUTOINCREMENT, key_id TEXT NULL,
event_type TEXT NOT NULL, remote_address TEXT NULL, created_utc TEXT NOT NULL, details TEXT NULL`,
plus index `ix_api_key_audit_key_id_created_utc` (`SqliteAuthStoreMigrator.cs:107-108`). Table name
constant `SqliteAuthSchema.ApiKeyAuditTable = "api_key_audit"` (`SqliteAuthSchema.cs:11`). The log is
append-only: there is no update/delete/prune path.
- **Producers (three, all in the gateway):**
- **Admin CLI** `ApiKeyAdminCliRunner` — its private `AppendAuditAsync` (`ApiKeyAdminCliRunner.cs:153`)
always passes `RemoteAddress: null` (`ApiKeyAdminCliRunner.cs:163`). Event types:
`"init-db"` (`:48`), `"create-key"` (`:74`), `"list-keys"` (`:83`),
`"revoke-key"` with details `revoked`/`not-found-or-already-revoked` (`:102`),
`"rotate-key"` with details `rotated`/`not-found` (`:121`).
- **Dashboard** `DashboardApiKeyManagementService` — its `AppendAuditAsync` (`:197`) captures
`RemoteAddress: httpContextAccessor.HttpContext?.Connection.RemoteIpAddress?.ToString()` (`:207`).
Event types: `"dashboard-create-key"` (`:62`), revoke (`:101`, details
`revoked`/`not-found-or-already-revoked`), rotate (`:143`, details `rotated`/`not-found`),
delete (`:185`, details `deleted`/`not-found-or-active`).
- **Constraint denials** `ConstraintEnforcer.RecordDenialAsync` (`ConstraintEnforcer.cs:117`) writes
`EventType: "constraint-denied"`, `RemoteAddress: null`, and `Details:
$"{commandKind}: {target}: {failure.ConstraintName}: {failure.Message}"` (`ConstraintEnforcer.cs:124-129`).
This is the only "denial" event in the log.
- **No authn events.** The verifier (`ApiKeyVerifier`) and the gRPC authorization interceptor
(`GatewayGrpcAuthorizationInterceptor`) do **not** write to the audit store — authentication
success/failure and `Unauthenticated`/`PermissionDenied` outcomes are surfaced as gRPC statuses and
(per policy) discriminated for logging, but are not persisted as audit rows. So in practice the log
records **key lifecycle (CLI + dashboard) + constraint denials**, not per-request authn outcomes.
- **No separate redaction seam — scrubbing is structural, in the store/entry shape.** There is no
redactor, scrubber, sanitizer, or masking helper. Safety comes from *what the entry type can carry*:
`ApiKeyAuditEntry` has no field for a secret, and every caller passes only a `KeyId` (the public
key identifier, never the secret), an event-type literal, and short hand-built `Details` strings —
the secret/pepper never enters the audit path. This aligns with the repo policy that "API keys,
passwords, `WriteSecured` payloads, and `AuthenticateUser` credentials must never reach logs"
(`CLAUDE.md:79`). Net: redaction is by construction, not a pluggable seam.
- **Read-back has no production consumer.** `ListRecentAsync` is called only by tests
(`SqliteAuthStoreTests`, `ApiKeyAdminCliRunnerTests`). The dashboard `ApiKeysPage.razor` mentions the
audit log only in a delete-confirmation string (`ApiKeysPage.razor:321`) — it does **not** render it.
There is no UI or RPC that surfaces audit history today.
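
The append/read contract described in the bullets above can be sketched roughly as follows. This is a minimal illustration, not the repository's code: the two record shapes are copied from the signatures quoted above, but the interface's return types and `CancellationToken` parameters are assumptions (the audit elides them as `...`), and `InMemoryApiKeyAuditStore` is a hypothetical stand-in for `SqliteApiKeyAuditStore` that only mimics the described behaviour (store-assigned id and timestamp, newest-first reads, empty result for `count <= 0`).

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Shapes as quoted in the audit. Note the entry has no field that could carry a secret.
public sealed record ApiKeyAuditEntry(string? KeyId, string EventType, string? RemoteAddress, string? Details);

public sealed record ApiKeyAuditRecord(
    long AuditId, string? KeyId, string EventType, string? RemoteAddress,
    DateTimeOffset CreatedUtc, string? Details);

// Assumed signatures: only the method names and first parameters are stated in the audit.
public interface IApiKeyAuditStore
{
    Task AppendAsync(ApiKeyAuditEntry entry, CancellationToken cancellationToken = default);
    Task<IReadOnlyList<ApiKeyAuditRecord>> ListRecentAsync(int count, CancellationToken cancellationToken = default);
}

// Hypothetical in-memory stand-in (NOT the SQLite implementation).
public sealed class InMemoryApiKeyAuditStore : IApiKeyAuditStore
{
    private readonly object _gate = new();
    private readonly List<ApiKeyAuditRecord> _rows = new();
    private long _nextId;

    public Task AppendAsync(ApiKeyAuditEntry entry, CancellationToken cancellationToken = default)
    {
        lock (_gate)
        {
            // The store, not the caller, assigns identity and time.
            _rows.Add(new ApiKeyAuditRecord(
                ++_nextId, entry.KeyId, entry.EventType, entry.RemoteAddress,
                DateTimeOffset.UtcNow, entry.Details));
        }
        return Task.CompletedTask;
    }

    public Task<IReadOnlyList<ApiKeyAuditRecord>> ListRecentAsync(int count, CancellationToken cancellationToken = default)
    {
        if (count <= 0)
        {
            return Task.FromResult<IReadOnlyList<ApiKeyAuditRecord>>(Array.Empty<ApiKeyAuditRecord>());
        }

        lock (_gate)
        {
            // In-memory equivalent of ORDER BY audit_id DESC LIMIT $count.
            IReadOnlyList<ApiKeyAuditRecord> result =
                _rows.OrderByDescending(r => r.AuditId).Take(count).ToList();
            return Task.FromResult(result);
        }
    }
}
```

Because there is no update, delete, or prune member on the interface, the append-only property is visible in the seam itself rather than enforced by convention.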
## Mapping to the canonical record
Target: `ZB.MOM.WW.Audit`'s `AuditEvent { Guid EventId; DateTimeOffset OccurredAtUtc; string Actor;
string Action; AuditOutcome Outcome; string? Category; string? Target; string? SourceNode;
Guid? CorrelationId; string? DetailsJson; }` with `AuditOutcome ∈ { Success, Failure, Denied }`.
| `AuditEvent` field | Source today | Mapping note |
|---|---|---|
| `EventId` (Guid, required) | — none — | **Must be generated** at write time. `ApiKeyAuditRecord` has only the autoincrement `AuditId` (`ApiKeyAuditRecord.cs:4`); no idempotency key exists. |
| `OccurredAtUtc` (required) | `CreatedUtc` (`ApiKeyAuditRecord.cs:8`), set as `DateTimeOffset.UtcNow` in the store (`SqliteApiKeyAuditStore.cs:20`) | Direct. Note: time is store-assigned today, not caller-supplied. |
| `Actor` (required) | `KeyId` (`ApiKeyAuditRecord.cs:5`) | Nullable today (`init-db`/`list-keys` pass `null`); the canonical `Actor` is required, so a fallback (e.g. `"system"`/`"cli"`) is needed for keyless events. |
| `Action` (required) | `EventType` (`ApiKeyAuditRecord.cs:6`) | Direct. Vocabulary: `init-db`, `create-key`, `dashboard-create-key`, `list-keys`, `revoke-key`, `rotate-key`, delete, `constraint-denied`. |
| `Outcome` (required) | derived | `constraint-denied` → `Denied`; everything else → `Success` (no `Failure` events are emitted today). |
| `Category` | — none — | Constant `"ApiKey"`. |
| `Target` | — none as a field — | No structured target. (`ConstraintEnforcer` does embed `commandKind`/`target` inside `Details` text, but there is no dedicated column.) |
| `SourceNode` | `RemoteAddress` (`ApiKeyAuditRecord.cs:7`) | Direct; populated only on the dashboard path (`DashboardApiKeyManagementService.cs:207`), `null` on CLI/constraint paths. |
| `CorrelationId` | — none — | Not captured today. |
| `DetailsJson` | `Details` (`ApiKeyAuditRecord.cs:9`) | Today this is a **plain string**, not JSON; either store as-is in `DetailsJson` or wrap as a small JSON object. |
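
As a rough illustration of the table, the sketch below maps an existing `ApiKeyAuditRecord` row onto the canonical shape. Several things here are assumptions rather than facts from either codebase: `AuditEvent` is modelled as a positional record only for brevity, `ApiKeyAuditMapping` and `FallbackActor` are hypothetical names, `"system"` is just one of the fallback candidates mentioned above, and wrapping `Details` as `{"details": ...}` is one of the two options in the `DetailsJson` row.

```csharp
#nullable enable
using System;
using System.Text.Json;

public enum AuditOutcome { Success, Failure, Denied }

// Canonical fields as listed above; modelled as a record purely for illustration.
public sealed record AuditEvent(
    Guid EventId, DateTimeOffset OccurredAtUtc, string Actor, string Action,
    AuditOutcome Outcome, string? Category, string? Target, string? SourceNode,
    Guid? CorrelationId, string? DetailsJson);

public sealed record ApiKeyAuditRecord(
    long AuditId, string? KeyId, string EventType, string? RemoteAddress,
    DateTimeOffset CreatedUtc, string? Details);

public static class ApiKeyAuditMapping
{
    // Hypothetical fallback for keyless events such as init-db and list-keys.
    public const string FallbackActor = "system";

    public static AuditEvent ToAuditEvent(ApiKeyAuditRecord record) => new(
        EventId: Guid.NewGuid(),                    // no idempotency key exists today, so generate one
        OccurredAtUtc: record.CreatedUtc,
        Actor: record.KeyId ?? FallbackActor,
        Action: record.EventType,
        Outcome: record.EventType == "constraint-denied" ? AuditOutcome.Denied : AuditOutcome.Success,
        Category: "ApiKey",
        Target: null,                               // no structured target is captured today
        SourceNode: record.RemoteAddress,
        CorrelationId: null,                        // not captured today
        DetailsJson: record.Details is null
            ? null
            : JsonSerializer.Serialize(new { details = record.Details }));
}
```

With this sketch, a `revoke-key` row whose `Details` is `revoked` would carry `{"details":"revoked"}` in `DetailsJson`; storing the plain string unchanged would be the simpler alternative if the shared schema does not require valid JSON.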
---
## Adoption plan → `ZB.MOM.WW.Audit`
**Effort: LOW.** The seam is tiny (one interface, two methods, one record pair) and the data already
maps cleanly onto `AuditEvent`. Concretely:
1. **Adapter, not rewrite.** Map `IApiKeyAuditStore` → the shared `IAuditWriter`, and
`ApiKeyAuditEntry`/`ApiKeyAuditRecord` → `AuditEvent`, using the table above: generate a new
`EventId` Guid per write; `KeyId → Actor` (with a `"system"` fallback for null); `EventType → Action`;
`CreatedUtc → OccurredAtUtc`; `RemoteAddress → SourceNode`; `constraint-denied → Outcome.Denied`,
else `Success`; constant `Category = "ApiKey"`; `Details → DetailsJson`. The three producers
(`ApiKeyAdminCliRunner`, `DashboardApiKeyManagementService`, `ConstraintEnforcer`) keep their call
sites — only the injected type changes.
2. **Redaction stays by-construction.** No separate redactor needs porting; just preserve the rule that
callers never put secrets in `DetailsJson` (mirrors `CLAUDE.md:79`). The shared writer can keep its
own redaction policy as a defence-in-depth layer.
3. **Read-back is free to drop or defer.** `ListRecentAsync` has no production consumer, so the adapter
need not implement a shared query API on day one; only the tests (`SqliteAuthStoreTests`, `ApiKeyAdminCliRunnerTests`) exercise it.
4. **No new dimensions required.** `CorrelationId` and a structured `Target` are absent today and are
*not* in scope to add as part of adoption (descriptive parity only); the canonical record simply
leaves them `null`.
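
One way the adapter in steps 1 to 4 might look is sketched below. The shared library's real API is not shown in this audit, so `IAuditWriter.WriteAsync` is an assumed signature, and `AuditWriterApiKeyAuditStore` and `RecordingAuditWriter` are hypothetical names introduced only for this example. Throwing from `ListRecentAsync` is likewise just one way to express "deferred"; the existing tests that call it would need adjusting.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Text.Json;
using System.Threading;
using System.Threading.Tasks;

public enum AuditOutcome { Success, Failure, Denied }

public sealed record AuditEvent(
    Guid EventId, DateTimeOffset OccurredAtUtc, string Actor, string Action,
    AuditOutcome Outcome, string? Category, string? Target, string? SourceNode,
    Guid? CorrelationId, string? DetailsJson);

public sealed record ApiKeyAuditEntry(string? KeyId, string EventType, string? RemoteAddress, string? Details);

public sealed record ApiKeyAuditRecord(
    long AuditId, string? KeyId, string EventType, string? RemoteAddress,
    DateTimeOffset CreatedUtc, string? Details);

// Assumed signatures: the audit names these interfaces but does not show their members in full.
public interface IAuditWriter
{
    Task WriteAsync(AuditEvent auditEvent, CancellationToken cancellationToken = default);
}

public interface IApiKeyAuditStore
{
    Task AppendAsync(ApiKeyAuditEntry entry, CancellationToken cancellationToken = default);
    Task<IReadOnlyList<ApiKeyAuditRecord>> ListRecentAsync(int count, CancellationToken cancellationToken = default);
}

// Hypothetical adapter: producers keep calling IApiKeyAuditStore, only the injected type changes.
public sealed class AuditWriterApiKeyAuditStore : IApiKeyAuditStore
{
    private readonly IAuditWriter _writer;

    public AuditWriterApiKeyAuditStore(IAuditWriter writer) => _writer = writer;

    public Task AppendAsync(ApiKeyAuditEntry entry, CancellationToken cancellationToken = default)
    {
        var auditEvent = new AuditEvent(
            EventId: Guid.NewGuid(),
            OccurredAtUtc: DateTimeOffset.UtcNow,   // still assigned at write time, not by the caller
            Actor: entry.KeyId ?? "system",
            Action: entry.EventType,
            Outcome: entry.EventType == "constraint-denied" ? AuditOutcome.Denied : AuditOutcome.Success,
            Category: "ApiKey",
            Target: null,                           // left null per step 4
            SourceNode: entry.RemoteAddress,
            CorrelationId: null,                    // left null per step 4
            DetailsJson: entry.Details is null ? null : JsonSerializer.Serialize(new { details = entry.Details }));
        return _writer.WriteAsync(auditEvent, cancellationToken);
    }

    // Read-back deferred (step 3): fail loudly rather than return misleading data.
    public Task<IReadOnlyList<ApiKeyAuditRecord>> ListRecentAsync(int count, CancellationToken cancellationToken = default)
        => throw new NotSupportedException("Read-back is not implemented by the shared-writer adapter.");
}

// Hypothetical in-memory writer, used here only to exercise the adapter.
public sealed class RecordingAuditWriter : IAuditWriter
{
    public List<AuditEvent> Events { get; } = new();

    public Task WriteAsync(AuditEvent auditEvent, CancellationToken cancellationToken = default)
    {
        Events.Add(auditEvent);
        return Task.CompletedTask;
    }
}
```

Under these assumptions, the only wiring change would be the singleton registration in `AuthStoreServiceCollectionExtensions.cs`, for example `services.AddSingleton<IApiKeyAuditStore, AuditWriterApiKeyAuditStore>()`, which is also the line most likely to conflict with the logging migration discussed next.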
**Coordination risk — sequence against the health/observability work.** A parallel session is actively
editing **this same repo** (`mxaccessgw`) for the MEL → Serilog logging migration
(`ZB.MOM.WW.Health` + `ZB.MOM.WW.Telemetry` normalization). Because audit adoption here also touches the
gateway's `Security/Authentication/` wiring (DI registration in `AuthStoreServiceCollectionExtensions.cs`,
and the three producer call sites), the two efforts can collide on the same files and on logging-pipeline
DI. **Do not start MxGateway audit adoption until the Serilog migration in this repo has landed (or is
explicitly fenced off)**, and confirm with the orchestrator that the logging session is not mid-flight in
`Security/` before opening a PR. The audit and logging seams are conceptually independent (audit = durable
SQLite record of who-did-what; logging = operational telemetry), but they share the gateway's startup/DI
surface, so they must be merged in a defined order rather than in parallel.