From 16540b3001e133b73a539ac51a05477fe74d84c8 Mon Sep 17 00:00:00 2001
From: Joseph Doherty
Date: Mon, 1 Jun 2026 06:32:39 -0400
Subject: [PATCH] docs: design for audit normalization component + ZB.MOM.WW.Audit

---
 .../2026-06-01-audit-component-design.md      | 246 ++++++++++++++++++
 1 file changed, 246 insertions(+)
 create mode 100644 docs/plans/2026-06-01-audit-component-design.md

diff --git a/docs/plans/2026-06-01-audit-component-design.md b/docs/plans/2026-06-01-audit-component-design.md
new file mode 100644
index 0000000..557e208
--- /dev/null
+++ b/docs/plans/2026-06-01-audit-component-design.md
@@ -0,0 +1,246 @@
+# Design — Audit normalization component + `ZB.MOM.WW.Audit` shared library
+
+Date: 2026-06-01
+Status: **Approved design** (brainstorm output). Implementation plan follows separately
+via the writing-plans workflow.
+
+This design adds **Audit** as the next entry in the [component-normalization](../../components/README.md)
+program, following the exact arc already used for **Auth** (`ZB.MOM.WW.Auth`), **UI-Theme**
+(`ZB.MOM.WW.Theme`), and the in-flight **Health/Telemetry** pair: normalize the concern under
+`components/`, then build a thin, tested, packed shared library in this repo. It is the #3-ranked
+candidate in [`upcoming.md`](../../upcoming.md) (Audit — "ties back to Auth").
+
+## Scope decisions (locked during brainstorm)
+
+1. **Audit only — logging is out of scope.** The parallel health/observability session already owns
+   structured logging: its `ZB.MOM.WW.Telemetry.Serilog` package holds the shared Serilog bootstrap,
+   `SiteId`/`NodeRole`/`Host` enrichers, `trace_id`/`span_id` correlation, an `ILogRedactor` seam, OTel
+   log export, **and** the MxGateway MEL→Serilog migration. That is the `upcoming.md` Tier-2 "Logging"
+   candidate. This session does **not** create a second Serilog owner.
+2. **Deliverable depth = docs + a thin built library.** Matches the house arc (Auth/Theme/Health/
+   Telemetry were all docs *then* a tested + packed lib) and `components/README.md`'s "extract only what
+   is genuinely common." **No sister-repo adoption this round** — adoption is deferred to `GAPS.md`,
+   exactly where Auth/Theme/Health sit today.
+3. **Canonical record shape = required core + optional common + JSON extension bag.** No project's
+   domain enums leak into the shared type; `Actor` stays a plain string (no hard dependency on
+   `ZB.MOM.WW.Auth`).
+4. **Redaction seam = aligned-but-independent.** Audit defines its own `IAuditRedactor` (over
+   `AuditEvent`), shaped + named to mirror Telemetry's `ILogRedactor` so a future `ZB.MOM.WW.Hosting`
+   aggregator wires both with one mental model — but **no cross-package dependency**; audit stays thin.
+5. **Packaging = single package `ZB.MOM.WW.Audit`.** The shared core (record + seams + tiny helpers)
+   has **zero heavy dependencies** — Akka/EF/SQLite/Serilog are per-project *transport*, which stays
+   per-project. Auth split into 4 / Health into 3 only because they had heavy, independently-optional
+   impls; audit has none. A future heavy shared sink (EF/Akka) would become an opt-in satellite then —
+   YAGNI now.
+
+## The unifying hinge — audit closes the loop on Auth
+
+Every audit row's `Actor` is the *who* — which is precisely the identity the `ZB.MOM.WW.Auth` component
+already normalizes (LDAP/GLAuth principal, API-key name). Audit is the write-side counterpart of Auth's
+read-side identity: Auth answers "who is this and what may they do," audit records "who did what." The
+spec ties them by stating `Actor` SHOULD be the `ZB.MOM.WW.Auth` principal at adoption time (kept as a
+string in the contract so the library carries no dependency on Auth).
+
+## Repo layout
+
+```
+scadaproj/
+├─ components/
+│  └─ audit/                                 NEW normalization component (docs)
+│     ├─ README.md                           overview + per-project status table (links into the docs below)
+│     ├─ spec/SPEC.md                        the ONE normalized target (Section 0: normalized vs left-per-project)
+│     ├─ spec/EVENT-MODEL.md                 canonical record + Outcome + per-project mapping reference
+│     │                                      (mirrors auth CANONICAL-ROLES.md / theme DESIGN-TOKENS.md)
+│     ├─ shared-contract/ZB.MOM.WW.Audit.md  proposed public API on paper
+│     ├─ current-state/{otopcua,mxaccessgw,scadabridge}/CURRENT-STATE.md   code-verified, file:line
+│     └─ GAPS.md                             per-project deltas + adoption/extraction backlog
+├─ ZB.MOM.WW.Audit/                          NEW built library (nested git repo, .NET 10) → 1 nupkg @ 0.1.0
+└─ docs/plans/
+   ├─ 2026-06-01-audit-component-design.md          (this design)
+   └─ 2026-06-01-zb-mom-ww-audit-shared-library.md  (impl plan — from writing-plans)
+```
+
+Index updates (same discipline as prior components): add the `audit/` row to `components/README.md`,
+the Component-normalization table in [`CLAUDE.md`](../../CLAUDE.md), and check off **Audit (#3)** in
+[`upcoming.md`](../../upcoming.md).
+
+## Code-verified current state (2026-06-01 scan)
+
+| | OtOpcUa | MxGateway | ScadaBridge |
+|---|---|---|---|
+| Record | `AuditEvent` (8 fields) | `ApiKeyAuditRecord` (6 fields) | `AuditEvent` (~25 fields) |
+| Writer seam | Akka `tell` → singleton | `IApiKeyAuditStore.AppendAsync` | **`IAuditWriter.WriteAsync`** (best-effort) |
+| Redaction seam | none | scrubs in store | **`IAuditPayloadFilter.Apply`** (truncate + redact, never throws) |
+| Transport | Akka cluster broadcast → `AuditWriterActor` (batch 500 / 5s, 2-layer dedup) | SQLite append + list-recent | Site SQLite hot-path → Central MS SQL ingest/reconcile/purge/partition-maintenance + hash-chain verify |
+| Storage | `ConfigAuditLog` EF entity (filtered-unique `EventId` index) | SQLite table | partitioned SQL Server `datetime2`, EF + migrations |
+| Domain vocab | `EventType` strings (DraftCreated/Published/OpcUaAccessDenied/…) | API-key event types | `Channel`/`Kind`/`Status`/`ForwardState` enums |
+| Scope | config writes + authz checks | API-key auth events/denials only | full who-did-what across site + central (CLI + UI + export) |
+
+**Key finding:** ScadaBridge is **already at the target** — it has `IAuditWriter` (best-effort, "failures
+must NEVER abort the user-facing action") + `IAuditPayloadFilter` (pure, never-throws, over-redacts on
+failure) with contracts near-identical to what we extract. The library essentially *lifts ScadaBridge's
+seams*, renaming the filter to the `ILogRedactor`-aligned `IAuditRedactor`. OtOpcUa's fire-and-forget
+Akka `tell` is morally the same best-effort writer; MxGateway's `IApiKeyAuditStore` is a specialized,
+narrower writer. The genuinely common core is the *who/what/when/outcome/target/correlation + details*
+record plus those two seams; **transport and storage diverge wildly and stay per-project.**
+
+Key refs:
+- OtOpcUa `src/Core/ZB.MOM.WW.OtOpcUa.Commons/Messages/Audit/AuditEvent.cs`;
+  `src/Core/ZB.MOM.WW.OtOpcUa.Configuration/Entities/ConfigAuditLog.cs`;
+  `src/Server/ZB.MOM.WW.OtOpcUa.ControlPlane/Audit/AuditWriterActor.cs`.
+- MxGateway `src/ZB.MOM.WW.MxGateway.Server/Security/Authentication/{IApiKeyAuditStore,ApiKeyAuditRecord,ApiKeyAuditEntry,SqliteApiKeyAuditStore}.cs`.
+- ScadaBridge `src/ZB.MOM.WW.ScadaBridge.Commons/Entities/Audit/AuditEvent.cs`;
+  `src/ZB.MOM.WW.ScadaBridge.Commons/Interfaces/Services/IAuditWriter.cs`;
+  `src/ZB.MOM.WW.ScadaBridge.AuditLog/Payload/IAuditPayloadFilter.cs` (+ the whole
+  `ZB.MOM.WW.ScadaBridge.AuditLog/` Site+Central pipeline).
+
+## Library design — `ZB.MOM.WW.Audit` (1 package, BCL-only)
+
+### Canonical record + outcome
+
+```csharp
+namespace ZB.MOM.WW.Audit;
+
+public sealed record AuditEvent
+{
+    // REQUIRED core — who / what / when / outcome
+    public required Guid EventId { get; init; }                 // idempotency key
+    public required DateTimeOffset OccurredAtUtc { get; init; } // UTC (see note)
+    public required string Actor { get; init; }                 // who — = ZB.MOM.WW.Auth principal at adoption
+    public required string Action { get; init; }                // what — verb/event-type string
+    public required AuditOutcome Outcome { get; init; }         // Success | Failure | Denied
+
+    // OPTIONAL common
+    public string? Category { get; init; }      // subsystem/grouping
+    public string? Target { get; init; }        // on-what (resource/method/connection)
+    public string? SourceNode { get; init; }    // emitting node
+    public Guid? CorrelationId { get; init; }   // join to originating request/workflow
+
+    // EXTENSION — everything project-specific, as JSON
+    public string? DetailsJson { get; init; }
+}
+
+public enum AuditOutcome { Success, Failure, Denied }
+```
+
+**Timestamp choice:** `DateTimeOffset` — unambiguous UTC (MxGateway already uses it). ScadaBridge/OtOpcUa
+store UTC-forced `DateTime` and convert at their mapping boundary. (Swappable to UTC `DateTime` if the
+team prefers to match the storage majority; flagged as the one open detail.)
+
+**Why `Outcome` is in the required core:** denials/failures are genuinely common — OtOpcUa
+`OpcUaAccessDenied`, MxGateway API-key denials, ScadaBridge `InboundAuthFailure` + `AuditStatus`. A
+3-value `Success | Failure | Denied` enum normalizes them without importing any app's full taxonomy.
+
+### Two seams
+
+```csharp
+// Lifts ScadaBridge's IAuditWriter: best-effort, MUST swallow internal failures, NEVER throw to caller.
+public interface IAuditWriter
+{
+    Task WriteAsync(AuditEvent evt, CancellationToken ct = default);
+}
+
+// Mirrors Telemetry's ILogRedactor shape (aligned-but-independent). Pure function; MUST NOT throw
+// (over-redact on internal failure). Generalizes ScadaBridge's IAuditPayloadFilter.
+public interface IAuditRedactor
+{
+    AuditEvent Apply(AuditEvent rawEvent);
+}
+```
+
+### Thin shipped helpers (the only concrete types)
+
+- `NullAuditRedactor` — identity; the default when no policy is configured.
+- `TruncatingAuditRedactor` — caps `DetailsJson`/`Target` to a configured max + sets a truncation
+  marker; never throws (over-redacts on failure). Generalizes ScadaBridge's truncation half. The
+  secret-field *policy* (which fields/commands are sensitive) stays per-project via composition.
+- `NoOpAuditWriter` — discards (tests / disabled audit).
+- `CompositeAuditWriter` — fans out to N writers; one writer throwing does not stop the others
+  (holds the best-effort contract).
+- `RedactingAuditWriter` — decorator: `Apply` the redactor, then delegate to an inner `IAuditWriter`.
+  Generalizes ScadaBridge's "filter between event construction and the writer chain."
+- `services.AddZbAudit(...)` — DI extension wiring redactor + decorator; `Null`/`NoOp` by default.
+
+### How each repo maps onto the canonical record
+
+| Canonical | OtOpcUa | MxGateway | ScadaBridge |
+|---|---|---|---|
+| `EventId` / `OccurredAtUtc` | `EventId` / `OccurredAtUtc` | new Guid / `CreatedUtc` | `EventId` / `OccurredAtUtc` |
+| `Actor` / `Action` | `Actor` / `Action` | `KeyId` / `EventType` | `Actor` / `Kind`(+`Channel`)→str |
+| `Outcome` | derive from action | denial → `Denied` | `Status` → `Outcome` |
+| `Category` / `Target` / `SourceNode` | `Category` / — / `SourceNode` | `"ApiKey"` / — / `RemoteAddress` | `Channel` / `Target` / `SourceNode` |
+| `CorrelationId` / `DetailsJson` | `CorrelationId` / `DetailsJson` | — / `Details` | `CorrelationId` / Request+Response+Error+Extra → JSON |
+
+### Stays per-project (explicitly NOT in the library)
+
+- **Transport/storage:** OtOpcUa Akka broadcast + `AuditWriterActor` + `ConfigAuditLog`; ScadaBridge
+  Site-SQLite hot-path + Central MS-SQL ingest/reconcile/purge/partition-maintenance + hash-chain;
+  MxGateway SQLite `IApiKeyAuditStore`.
+- **Domain vocab:** `Channel`/`Kind`/`Status`/`ForwardState` enums, OtOpcUa `EventType` strings — these
+  map into `Action`/`Category`/`DetailsJson`.
+- **Query / CLI / UI / export** surfaces.
+- Each app's redaction **policy** (which fields are secret) — only the *seam* is shared.
+
+## Normalization component docs
+
+Follows `components/README.md`'s six-part layout (matching auth + ui-theme). `spec/SPEC.md` opens with a
+Section 0 stating normalized vs. left-per-project explicitly. `spec/EVENT-MODEL.md` is the reference doc
+(canonical record + `Outcome` + the mapping table), mirroring auth's `CANONICAL-ROLES.md` / theme's
+`DESIGN-TOKENS.md`. Three `current-state//CURRENT-STATE.md` at full code-verified depth, each
+ending in an Adoption plan. `GAPS.md` turns deltas into a prioritized backlog. Registers at status
+**Draft** (`Draft → Reviewed → Adopting → Converged`).
+
+## Testing & verification
+
+Every type ships tests (mirrors auth's 172 / theme's 32; a thin lib lands ~40–60; `dotnet test` from the
+library root):
+
+- **Record/enum** — required fields enforced; value-equality; `OccurredAtUtc` round-trips as UTC;
+  `DetailsJson` passthrough; `AuditOutcome` values.
+- **`NullAuditRedactor`** — identity (input returned unchanged).
+- **`TruncatingAuditRedactor`** — caps `DetailsJson`/`Target` + sets truncation marker; **never throws**
+  on malformed input (over-redacts on internal failure — the seam's hard contract).
+- **`NoOpAuditWriter`** — discards, completes.
+- **`CompositeAuditWriter`** — fans out to all inner writers; one writer throwing does not stop the rest.
+- **`RedactingAuditWriter`** — passes the *redacted* (not raw) event to the inner writer; never throws to
+  caller.
+- **`AddZbAudit`** — resolves `IAuditWriter`/`IAuditRedactor` with `NoOp`/`Null` defaults; decorator
+  composition wires.
+
+**Verification gates (evidence, not assertions):** `dotnet test` green + `dotnet pack` → **1 nupkg @
+0.1.0**.
+
+## GAPS.md / adoption backlog (deferred — adoption lives here)
+
+- Per-project divergences vs `SPEC.md`; each `current-state` ends in an Adoption plan.
+- **ScadaBridge** — already at the target; adoption is "align, don't replace" (rename
+  `IAuditPayloadFilter`→`IAuditRedactor`; its `IAuditWriter` already matches). Large surface, but mostly
+  naming/contract alignment, no behaviour change. Risk: high blast radius, so low priority.
+- **MxGateway** — map `IApiKeyAuditStore`/`ApiKeyAuditRecord` → `IAuditWriter`/`AuditEvent`. Low effort,
+  but **coordinate** — the parallel session is already editing this repo (MEL→Serilog).
+- **OtOpcUa** — `AuditEvent` → canonical record; `AuditWriterActor : IAuditWriter`; `ConfigAuditLog`
+  mapping. Medium effort.
+- Cross-cutting items: `Outcome` normalization across all three; `Actor` = `ZB.MOM.WW.Auth` principal
+  (closes the loop on Auth); `IAuditRedactor` naming aligned with Telemetry's `ILogRedactor`.
+
+## Build order
+
+```
+1. components/audit/ docs (spec first — drives the API):
+   current-state ×3 → SPEC + EVENT-MODEL → shared-contract → GAPS
+2. ZB.MOM.WW.Audit library (record + enum + 2 seams + helpers + AddZbAudit), tests, pack @ 0.1.0
+3. Index/registry updates (components/README, CLAUDE.md, upcoming.md #3) + GAPS cross-check
+   (no adoption step — deferred to GAPS)
+```
+
+## Implementation tasks (native task IDs)
+
+- #7 Author `components/audit/` normalization docs (current-state ×3 + SPEC + EVENT-MODEL +
+  shared-contract + GAPS)
+- #8 Build `ZB.MOM.WW.Audit` library (1 package: record + enum + 2 seams + helpers + `AddZbAudit`;
+  tests; `dotnet pack` @ 0.1.0) — blocked by #7 (spec drives the API)
+- #9 Index/registry updates (`components/README.md`, `CLAUDE.md`, `upcoming.md` #3) + GAPS cross-check —
+  blocked by #8
+
+Adoption into the three apps is intentionally **not** a task here — it is the `GAPS.md` follow-on,
+identical to where Auth/Theme/Health adoption sits today.
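
Reviewer sketch (not part of the patch): the "Thin shipped helpers" section describes three behaviours that are easy to get subtly wrong, so a minimal illustration may help when task #8 starts. It restates a trimmed `AuditEvent` and the two seams from the design so it compiles on its own. Everything else is an assumption made for illustration: the constructor shapes, the `[truncated]` marker text, the sequential fan-out order, and the hypothetical `DelegateAuditWriter` demo helper. None of these is the library's decided API.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Contract restated (trimmed) from the design so the sketch is self-contained.
public enum AuditOutcome { Success, Failure, Denied }

public sealed record AuditEvent
{
    public required Guid EventId { get; init; }
    public required DateTimeOffset OccurredAtUtc { get; init; }
    public required string Actor { get; init; }
    public required string Action { get; init; }
    public required AuditOutcome Outcome { get; init; }
    public string? Target { get; init; }
    public string? DetailsJson { get; init; }
}

public interface IAuditWriter { Task WriteAsync(AuditEvent evt, CancellationToken ct = default); }
public interface IAuditRedactor { AuditEvent Apply(AuditEvent rawEvent); }

// Caps DetailsJson/Target. Marker text and constructor parameter are assumptions.
public sealed class TruncatingAuditRedactor : IAuditRedactor
{
    private const string Marker = "[truncated]";
    private readonly int _maxLength;

    public TruncatingAuditRedactor(int maxLength) => _maxLength = Math.Max(0, maxLength);

    public AuditEvent Apply(AuditEvent rawEvent)
    {
        try
        {
            return rawEvent with { DetailsJson = Cap(rawEvent.DetailsJson), Target = Cap(rawEvent.Target) };
        }
        catch
        {
            // Over-redact rather than throw: drop both free-form fields.
            return rawEvent with { DetailsJson = null, Target = null };
        }
    }

    // Note: a capped DetailsJson is no longer guaranteed to be valid JSON.
    private string? Cap(string? value) =>
        value is null || value.Length <= _maxLength ? value : value.Substring(0, _maxLength) + Marker;
}

// Decorator: redact first, then hand only the redacted event to the inner writer.
public sealed class RedactingAuditWriter : IAuditWriter
{
    private readonly IAuditRedactor _redactor;
    private readonly IAuditWriter _inner;

    public RedactingAuditWriter(IAuditRedactor redactor, IAuditWriter inner)
    {
        _redactor = redactor;
        _inner = inner;
    }

    public async Task WriteAsync(AuditEvent evt, CancellationToken ct = default)
    {
        try { await _inner.WriteAsync(_redactor.Apply(evt), ct).ConfigureAwait(false); }
        catch { /* best-effort: on failure nothing raw is written and the caller is unaffected */ }
    }
}

// Fan-out: every writer is offered the event; one failing writer does not stop the rest.
public sealed class CompositeAuditWriter : IAuditWriter
{
    private readonly IReadOnlyList<IAuditWriter> _writers;

    public CompositeAuditWriter(IEnumerable<IAuditWriter> writers) => _writers = writers.ToList();

    public async Task WriteAsync(AuditEvent evt, CancellationToken ct = default)
    {
        foreach (var writer in _writers)
        {
            try { await writer.WriteAsync(evt, ct).ConfigureAwait(false); }
            catch { /* swallow and continue with the next writer */ }
        }
    }
}

// Hypothetical demo/test helper only: adapts a delegate to IAuditWriter.
public sealed class DelegateAuditWriter : IAuditWriter
{
    private readonly Func<AuditEvent, Task> _write;
    public DelegateAuditWriter(Func<AuditEvent, Task> write) => _write = write;
    public Task WriteAsync(AuditEvent evt, CancellationToken ct = default) => _write(evt);
}
```

Two choices in this sketch are worth deciding explicitly in `SPEC.md`. The composite awaits writers one at a time, which keeps ordering simple but lets a slow sink delay the others; a `Task.WhenAll` variant with per-writer exception handling would be the alternative. The redacting decorator drops the event if the redactor itself throws, on the assumption that losing one row is preferable to persisting an unredacted one.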