# Design — Audit normalization component + `ZB.MOM.WW.Audit` shared library

Date: 2026-06-01
Status: **Approved design** (brainstorm output). Implementation plan follows separately
via the writing-plans workflow.

This design adds **Audit** as the next entry in the [component-normalization](../../components/README.md)
program, following the exact arc already used for **Auth** (`ZB.MOM.WW.Auth`), **UI-Theme**
(`ZB.MOM.WW.Theme`), and the in-flight **Health/Telemetry** pair: normalize the concern under
`components/`, then build a thin, tested, packed shared library in this repo. It is the #3-ranked
candidate in [`upcoming.md`](../../upcoming.md) (Audit — "ties back to Auth").

## Scope decisions (locked during brainstorm)

1. **Audit only — logging is out of scope.** The parallel health/observability session already owns
   structured logging: its `ZB.MOM.WW.Telemetry.Serilog` package holds the shared Serilog bootstrap,
   `SiteId`/`NodeRole`/`Host` enrichers, `trace_id`/`span_id` correlation, an `ILogRedactor` seam, OTel
   log export, **and** the MxGateway MEL→Serilog migration. That is the `upcoming.md` Tier-2 "Logging"
   candidate. This session does **not** create a second Serilog owner.
2. **Deliverable depth = docs + a thin built library.** Matches the house arc (Auth/Theme/Health/
   Telemetry were all docs *then* a tested + packed lib) and `components/README.md`'s "extract only what
   is genuinely common." **No sister-repo adoption this round** — adoption is deferred to `GAPS.md`,
   exactly where Auth/Theme/Health sit today.
3. **Canonical record shape = required core + optional common + JSON extension bag.** No project's
   domain enums leak into the shared type; `Actor` stays a plain string (no hard dependency on
   `ZB.MOM.WW.Auth`).
4. **Redaction seam = aligned-but-independent.** Audit defines its own `IAuditRedactor` (over
   `AuditEvent`), shaped + named to mirror Telemetry's `ILogRedactor` so a future `ZB.MOM.WW.Hosting`
   aggregator wires both with one mental model — but **no cross-package dependency**; audit stays thin.
5. **Packaging = single package `ZB.MOM.WW.Audit`.** The shared core (record + seams + tiny helpers)
   has **zero heavy dependencies** — Akka/EF/SQLite/Serilog are per-project *transport*, which stays
   per-project. Auth split into 4 / Health into 3 only because they had heavy, independently-optional
   impls; audit has none. A future heavy shared sink (EF/Akka) would become an opt-in satellite then —
   YAGNI now.

## The unifying hinge — audit closes the loop on Auth

Every audit row's `Actor` is the *who* — which is precisely the identity the `ZB.MOM.WW.Auth` component
already normalizes (LDAP/GLAuth principal, API-key name). Audit is the write-side counterpart of Auth's
read-side identity: Auth answers "who is this and what may they do," audit records "who did what." The
spec ties them by stating `Actor` SHOULD be the `ZB.MOM.WW.Auth` principal at adoption time (kept as a
string in the contract so the library carries no dependency on Auth).

## Repo layout

```
scadaproj/
├─ components/
│  └─ audit/                                   NEW normalization component (docs)
│     ├─ README.md                             overview + per-project status table (links into the docs below)
│     ├─ spec/SPEC.md                          the ONE normalized target (Section 0: normalized vs left-per-project)
│     ├─ spec/EVENT-MODEL.md                   canonical record + Outcome + per-project mapping reference
│     │                                        (mirrors auth CANONICAL-ROLES.md / theme DESIGN-TOKENS.md)
│     ├─ shared-contract/ZB.MOM.WW.Audit.md    proposed public API on paper
│     ├─ current-state/{otopcua,mxaccessgw,scadabridge}/CURRENT-STATE.md   code-verified, file:line
│     └─ GAPS.md                               per-project deltas + adoption/extraction backlog
├─ ZB.MOM.WW.Audit/                            NEW built library (nested git repo, .NET 10) → 1 nupkg @ 0.1.0
└─ docs/plans/
   ├─ 2026-06-01-audit-component-design.md            (this design)
   └─ 2026-06-01-zb-mom-ww-audit-shared-library.md    (impl plan — from writing-plans)
```

Index updates (same discipline as prior components): add the `audit/` row to `components/README.md`,
the Component-normalization table in [`CLAUDE.md`](../../CLAUDE.md), and check off **Audit (#3)** in
[`upcoming.md`](../../upcoming.md).

## Code-verified current state (2026-06-01 scan)

| | OtOpcUa | MxGateway | ScadaBridge |
|---|---|---|---|
| Record | `AuditEvent` (8 fields) | `ApiKeyAuditRecord` (6 fields) | `AuditEvent` (~25 fields) |
| Writer seam | Akka `tell` → singleton | `IApiKeyAuditStore.AppendAsync` | **`IAuditWriter.WriteAsync`** (best-effort) |
| Redaction seam | none | scrubs in store | **`IAuditPayloadFilter.Apply`** (truncate + redact, never throws) |
| Transport | Akka cluster broadcast → `AuditWriterActor` (batch 500 / 5s, 2-layer dedup) | SQLite append + list-recent | Site SQLite hot-path → Central MS SQL ingest/reconcile/purge/partition-maintenance + hash-chain verify |
| Storage | `ConfigAuditLog` EF entity (filtered-unique `EventId` index) | SQLite table | partitioned SQL Server `datetime2`, EF + migrations |
| Domain vocab | `EventType` strings (DraftCreated/Published/OpcUaAccessDenied/…) | API-key event types | `Channel`/`Kind`/`Status`/`ForwardState` enums |
| Scope | config writes + authz checks | API-key auth events/denials only | full who-did-what across site + central (CLI + UI + export) |

**Key finding:** ScadaBridge is **already at the target** — it has `IAuditWriter` (best-effort, "failures
must NEVER abort the user-facing action") + `IAuditPayloadFilter` (pure, never-throws, over-redacts on
failure) with contracts near-identical to what we extract. The library essentially *lifts ScadaBridge's
seams*, renaming the filter to the `ILogRedactor`-aligned `IAuditRedactor`. OtOpcUa's fire-and-forget
Akka `tell` is morally the same best-effort writer; MxGateway's `IApiKeyAuditStore` is a specialized,
narrower writer. The genuinely common core is the *who/what/when/outcome/target/correlation + details*
record plus those two seams; **transport and storage diverge wildly and stay per-project.**

Key refs:

- OtOpcUa `src/Core/ZB.MOM.WW.OtOpcUa.Commons/Messages/Audit/AuditEvent.cs`;
  `src/Core/ZB.MOM.WW.OtOpcUa.Configuration/Entities/ConfigAuditLog.cs`;
  `src/Server/ZB.MOM.WW.OtOpcUa.ControlPlane/Audit/AuditWriterActor.cs`.
- MxGateway `src/ZB.MOM.WW.MxGateway.Server/Security/Authentication/{IApiKeyAuditStore,ApiKeyAuditRecord,ApiKeyAuditEntry,SqliteApiKeyAuditStore}.cs`.
- ScadaBridge `src/ZB.MOM.WW.ScadaBridge.Commons/Entities/Audit/AuditEvent.cs`;
  `src/ZB.MOM.WW.ScadaBridge.Commons/Interfaces/Services/IAuditWriter.cs`;
  `src/ZB.MOM.WW.ScadaBridge.AuditLog/Payload/IAuditPayloadFilter.cs` (+ the whole
  `ZB.MOM.WW.ScadaBridge.AuditLog/` Site+Central pipeline).

## Library design — `ZB.MOM.WW.Audit` (1 package, BCL-only)

### Canonical record + outcome

```csharp
namespace ZB.MOM.WW.Audit;

public sealed record AuditEvent
{
    // REQUIRED core — who / what / when / outcome
    public required Guid EventId { get; init; }                   // idempotency key
    public required DateTimeOffset OccurredAtUtc { get; init; }   // UTC (see note)
    public required string Actor { get; init; }                   // who — = ZB.MOM.WW.Auth principal at adoption
    public required string Action { get; init; }                  // what — verb/event-type string
    public required AuditOutcome Outcome { get; init; }           // Success | Failure | Denied

    // OPTIONAL common
    public string? Category { get; init; }                        // subsystem/grouping
    public string? Target { get; init; }                          // on-what (resource/method/connection)
    public string? SourceNode { get; init; }                      // emitting node
    public Guid? CorrelationId { get; init; }                     // join to originating request/workflow

    // EXTENSION — everything project-specific, as JSON
    public string? DetailsJson { get; init; }
}

public enum AuditOutcome { Success, Failure, Denied }
```

**Timestamp choice:** `DateTimeOffset`, because it carries an explicit offset and so pins the instant
unambiguously; values are expected to be UTC (MxGateway already uses the type). ScadaBridge/OtOpcUa
store UTC-forced `DateTime` and convert at their mapping boundary. (Swappable to UTC `DateTime` if the
team prefers to match the storage majority; flagged as the one open detail.)

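
The conversion at the mapping boundary might look like the sketch below. `AuditTimestamps.ToUtcOffset` is a hypothetical helper name, not part of the proposed API, and the sketch assumes the stored `DateTime` really is UTC wall-clock time regardless of what its `Kind` flag says.

```csharp
using System;

// Hypothetical mapping-boundary helper; name and placement are assumptions.
public static class AuditTimestamps
{
    // Treats a stored "UTC-forced" DateTime as UTC, whatever its Kind flag,
    // and returns the same clock reading as a DateTimeOffset with offset zero.
    public static DateTimeOffset ToUtcOffset(DateTime storedUtc)
    {
        DateTime utc = DateTime.SpecifyKind(storedUtc, DateTimeKind.Utc);
        return new DateTimeOffset(utc);
    }
}
```

If a project ever stores local time in that column, this helper would silently mislabel it, so the UTC assumption belongs in each project's `CURRENT-STATE.md`.
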

**Why `Outcome` is in the required core:** denials/failures are genuinely common — OtOpcUa
`OpcUaAccessDenied`, MxGateway API-key denials, ScadaBridge `InboundAuthFailure` + `AuditStatus`. A
3-value `Success | Failure | Denied` enum normalizes them without importing any app's full taxonomy.

### Two seams

```csharp
// Lifts ScadaBridge's IAuditWriter: best-effort, MUST swallow internal failures, NEVER throw to caller.
public interface IAuditWriter
{
    Task WriteAsync(AuditEvent evt, CancellationToken ct = default);
}

// Mirrors Telemetry's ILogRedactor shape (aligned-but-independent). Pure function; MUST NOT throw
// (over-redact on internal failure). Generalizes ScadaBridge's IAuditPayloadFilter.
public interface IAuditRedactor
{
    AuditEvent Apply(AuditEvent rawEvent);
}
```
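
To make the best-effort contract concrete, here is a minimal sketch of a writer that honours it. `BestEffortAuditWriter`, its delegate parameter, and the failure hook are hypothetical illustrations, not types the design commits to; the record and interface are trimmed copies of the ones above so the sketch compiles on its own.

```csharp
#nullable enable
using System;
using System.Threading;
using System.Threading.Tasks;

// Trimmed copies of the proposed contract, repeated for self-containment.
public enum AuditOutcome { Success, Failure, Denied }

public sealed record AuditEvent
{
    public required Guid EventId { get; init; }
    public required DateTimeOffset OccurredAtUtc { get; init; }
    public required string Actor { get; init; }
    public required string Action { get; init; }
    public required AuditOutcome Outcome { get; init; }
    public string? DetailsJson { get; init; }
}

public interface IAuditWriter
{
    Task WriteAsync(AuditEvent evt, CancellationToken ct = default);
}

// Hypothetical writer: wraps any persistence delegate and enforces best-effort.
public sealed class BestEffortAuditWriter : IAuditWriter
{
    private readonly Func<AuditEvent, CancellationToken, Task> _persist;
    private readonly Action<Exception>? _onFailure;

    public BestEffortAuditWriter(
        Func<AuditEvent, CancellationToken, Task> persist,
        Action<Exception>? onFailure = null)
    {
        _persist = persist ?? throw new ArgumentNullException(nameof(persist));
        _onFailure = onFailure;
    }

    public async Task WriteAsync(AuditEvent evt, CancellationToken ct = default)
    {
        try
        {
            await _persist(evt, ct).ConfigureAwait(false);
        }
        catch (Exception ex)
        {
            // Swallow everything, cancellation included: an audit failure
            // must never abort the user-facing action.
            try { _onFailure?.Invoke(ex); } catch { /* the hook is best-effort too */ }
        }
    }
}
```

The optional hook gives a project somewhere to count or log dropped events without letting that bookkeeping break the contract.
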

### Thin shipped helpers (the only concrete types)

- `NullAuditRedactor` — identity; the default when no policy is configured.
- `TruncatingAuditRedactor` — caps `DetailsJson`/`Target` to a configured max + sets a truncation
  marker; never throws (over-redacts on failure). Generalizes ScadaBridge's truncation half. The
  secret-field *policy* (which fields/commands are sensitive) stays per-project via composition.
- `NoOpAuditWriter` — discards (tests / disabled audit).
- `CompositeAuditWriter` — fans out to N writers; one writer throwing does not stop the others
  (holds the best-effort contract).
- `RedactingAuditWriter` — decorator: `Apply` the redactor, then delegate to an inner `IAuditWriter`.
  Generalizes ScadaBridge's "filter between event construction and the writer chain."
- `services.AddZbAudit(...)` — DI extension wiring redactor + decorator; `Null`/`NoOp` by default.
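
As a rough illustration of two of these helpers, the sketch below pairs a truncating redactor with the redacting decorator. The marker text, the single shared length cap, and the constructor shapes are assumptions made for the example; the real signatures are left to `shared-contract/ZB.MOM.WW.Audit.md`. The contract types are trimmed copies so the block stands alone.

```csharp
#nullable enable
using System;
using System.Threading;
using System.Threading.Tasks;

public enum AuditOutcome { Success, Failure, Denied }

public sealed record AuditEvent
{
    public required Guid EventId { get; init; }
    public required DateTimeOffset OccurredAtUtc { get; init; }
    public required string Actor { get; init; }
    public required string Action { get; init; }
    public required AuditOutcome Outcome { get; init; }
    public string? Target { get; init; }
    public string? DetailsJson { get; init; }
}

public interface IAuditWriter { Task WriteAsync(AuditEvent evt, CancellationToken ct = default); }
public interface IAuditRedactor { AuditEvent Apply(AuditEvent rawEvent); }

public sealed class TruncatingAuditRedactor : IAuditRedactor
{
    public const string Marker = "...[truncated]"; // hypothetical marker format
    private readonly int _maxLength;

    public TruncatingAuditRedactor(int maxLength) => _maxLength = Math.Max(0, maxLength);

    // Assumes a non-null event. Nothing in Cap is expected to throw; the catch
    // only shows where over-redaction would go if a richer policy could fail.
    public AuditEvent Apply(AuditEvent rawEvent)
    {
        try
        {
            return rawEvent with
            {
                Target = Cap(rawEvent.Target),
                DetailsJson = Cap(rawEvent.DetailsJson),
            };
        }
        catch
        {
            return rawEvent with { Target = null, DetailsJson = Marker };
        }
    }

    // Keeps the first _maxLength characters and appends the marker after them.
    private string? Cap(string? value) =>
        value is null || value.Length <= _maxLength
            ? value
            : value.Substring(0, _maxLength) + Marker;
}

public sealed class RedactingAuditWriter : IAuditWriter
{
    private readonly IAuditRedactor _redactor;
    private readonly IAuditWriter _inner;

    public RedactingAuditWriter(IAuditRedactor redactor, IAuditWriter inner)
    {
        _redactor = redactor;
        _inner = inner;
    }

    public async Task WriteAsync(AuditEvent evt, CancellationToken ct = default)
    {
        try
        {
            // Redact first, so the inner writer never sees the raw event.
            await _inner.WriteAsync(_redactor.Apply(evt), ct).ConfigureAwait(false);
        }
        catch
        {
            // Best-effort: a faulty redactor or inner writer never reaches the caller,
            // and in the redactor case nothing unredacted is written.
        }
    }
}
```

One consequence of cutting `DetailsJson` as a plain string is that a truncated value is no longer valid JSON; whether consumers should tolerate that or the marker should be a JSON value is a detail for the spec to settle.
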

### How each repo maps onto the canonical record

| Canonical | OtOpcUa | MxGateway | ScadaBridge |
|---|---|---|---|
| `EventId` / `OccurredAtUtc` | `EventId` / `OccurredAtUtc` | new Guid / `CreatedUtc` | `EventId` / `OccurredAtUtc` |
| `Actor` / `Action` | `Actor` / `Action` | `KeyId` / `EventType` | `Actor` / `Kind`(+`Channel`)→str |
| `Outcome` | derive from action | denial → `Denied` | `Status` → `Outcome` |
| `Category` / `Target` / `SourceNode` | `Category` / — / `SourceNode` | `"ApiKey"` / — / `RemoteAddress` | `Channel` / `Target` / `SourceNode` |
| `CorrelationId` / `DetailsJson` | `CorrelationId` / `DetailsJson` | — / `Details` | `CorrelationId` / Request+Response+Error+Extra → JSON |
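
The MxGateway column can be read as the mapping function sketched below. `ApiKeyAuditRecordStandIn` is a hypothetical stand-in that carries only the fields named in the table, with guessed types; it is not the real `ApiKeyAuditRecord`. The denial test (event type contains "Denied"), the `Success` fallback, and wrapping free-text `Details` in a JSON object are likewise assumptions.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Text.Json;

public enum AuditOutcome { Success, Failure, Denied }

public sealed record AuditEvent
{
    public required Guid EventId { get; init; }
    public required DateTimeOffset OccurredAtUtc { get; init; }
    public required string Actor { get; init; }
    public required string Action { get; init; }
    public required AuditOutcome Outcome { get; init; }
    public string? Category { get; init; }
    public string? Target { get; init; }
    public string? SourceNode { get; init; }
    public Guid? CorrelationId { get; init; }
    public string? DetailsJson { get; init; }
}

// Hypothetical stand-in: only the fields the mapping table names, types assumed.
public sealed record ApiKeyAuditRecordStandIn(
    string KeyId,
    string EventType,
    string? RemoteAddress,
    DateTimeOffset CreatedUtc,
    string? Details);

public static class MxGatewayAuditMapping
{
    public static AuditEvent ToAuditEvent(ApiKeyAuditRecordStandIn record) => new()
    {
        EventId = Guid.NewGuid(),                            // "new Guid" in the table
        OccurredAtUtc = record.CreatedUtc.ToUniversalTime(), // normalize to offset zero
        Actor = record.KeyId,
        Action = record.EventType,
        // Assumed rule: any event type containing "Denied" is a denial.
        Outcome = record.EventType.Contains("Denied", StringComparison.OrdinalIgnoreCase)
            ? AuditOutcome.Denied
            : AuditOutcome.Success,
        Category = "ApiKey",
        SourceNode = record.RemoteAddress,
        // Target and CorrelationId have no MxGateway source and stay null.
        DetailsJson = record.Details is null
            ? null
            : JsonSerializer.Serialize(new Dictionary<string, string> { ["details"] = record.Details }),
    };
}
```

Because `EventId` is generated at mapping time in this sketch, mapping the same stored record twice yields two different idempotency keys, so an adopter would probably want to map once, at write time.
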

### Stays per-project (explicitly NOT in the library)

- **Transport/storage:** OtOpcUa Akka broadcast + `AuditWriterActor` + `ConfigAuditLog`; ScadaBridge
  Site-SQLite hot-path + Central MS-SQL ingest/reconcile/purge/partition-maintenance + hash-chain;
  MxGateway SQLite `IApiKeyAuditStore`.
- **Domain vocab:** `Channel`/`Kind`/`Status`/`ForwardState` enums, OtOpcUa `EventType` strings — these
  map into `Action`/`Category`/`DetailsJson`.
- **Query / CLI / UI / export** surfaces.
- Each app's redaction **policy** (which fields are secret) — only the *seam* is shared.

## Normalization component docs

Follows `components/README.md`'s six-part layout (matching auth + ui-theme). `spec/SPEC.md` opens with a
Section 0 stating normalized vs. left-per-project explicitly. `spec/EVENT-MODEL.md` is the reference doc
(canonical record + `Outcome` + the mapping table), mirroring auth's `CANONICAL-ROLES.md` / theme's
`DESIGN-TOKENS.md`. Three `current-state/<project>/CURRENT-STATE.md` at full code-verified depth, each
ending in an Adoption plan. `GAPS.md` turns deltas into a prioritized backlog. Registers at status
**Draft** (`Draft → Reviewed → Adopting → Converged`).

## Testing & verification

Every type ships with tests (mirrors auth's 172 / theme's 32; a thin lib lands ~40–60; `dotnet test` from
the library root):

- **Record/enum** — required fields enforced; value-equality; `OccurredAtUtc` round-trips as UTC;
  `DetailsJson` passthrough; `AuditOutcome` values.
- **`NullAuditRedactor`** — identity (input returned unchanged).
- **`TruncatingAuditRedactor`** — caps `DetailsJson`/`Target` + sets truncation marker; **never throws**
  on malformed input (over-redacts on internal failure — the seam's hard contract).
- **`NoOpAuditWriter`** — discards, completes.
- **`CompositeAuditWriter`** — fans out to all inner writers; one writer throwing does not stop the rest.
- **`RedactingAuditWriter`** — passes the *redacted* (not raw) event to the inner writer; never throws to
  caller.
- **`AddZbAudit`** — resolves `IAuditWriter`/`IAuditRedactor` with `NoOp`/`Null` defaults; the redactor
  decorator composition is wired.
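
The `CompositeAuditWriter` case can be sketched as follows. The design does not name a test framework, so the check is a plain method rather than a framework test; `RecordingAuditWriter`, `ThrowingAuditWriter`, and `CompositeAuditWriterCheck` are hypothetical test doubles, and the composite's constructor shape is an assumption.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public enum AuditOutcome { Success, Failure, Denied }

public sealed record AuditEvent
{
    public required Guid EventId { get; init; }
    public required DateTimeOffset OccurredAtUtc { get; init; }
    public required string Actor { get; init; }
    public required string Action { get; init; }
    public required AuditOutcome Outcome { get; init; }
}

public interface IAuditWriter { Task WriteAsync(AuditEvent evt, CancellationToken ct = default); }

public sealed class CompositeAuditWriter : IAuditWriter
{
    private readonly IReadOnlyList<IAuditWriter> _writers;

    public CompositeAuditWriter(IEnumerable<IAuditWriter> writers) => _writers = writers.ToArray();

    public async Task WriteAsync(AuditEvent evt, CancellationToken ct = default)
    {
        foreach (var writer in _writers)
        {
            // Each writer is isolated: a failure in one does not stop the rest.
            try { await writer.WriteAsync(evt, ct).ConfigureAwait(false); }
            catch { /* best-effort fan-out */ }
        }
    }
}

// Test double: remembers every event it receives.
public sealed class RecordingAuditWriter : IAuditWriter
{
    public List<AuditEvent> Written { get; } = new();

    public Task WriteAsync(AuditEvent evt, CancellationToken ct = default)
    {
        Written.Add(evt);
        return Task.CompletedTask;
    }
}

// Test double: always fails, simulating a broken sink.
public sealed class ThrowingAuditWriter : IAuditWriter
{
    public Task WriteAsync(AuditEvent evt, CancellationToken ct = default) =>
        throw new InvalidOperationException("simulated sink failure");
}

public static class CompositeAuditWriterCheck
{
    // Returns how many healthy writers received the event despite the faulty one.
    public static async Task<int> RunAsync()
    {
        var first = new RecordingAuditWriter();
        var last = new RecordingAuditWriter();
        var composite = new CompositeAuditWriter(
            new IAuditWriter[] { first, new ThrowingAuditWriter(), last });

        await composite.WriteAsync(new AuditEvent
        {
            EventId = Guid.NewGuid(),
            OccurredAtUtc = DateTimeOffset.UtcNow,
            Actor = "alice",
            Action = "Published",
            Outcome = AuditOutcome.Success,
        });

        return first.Written.Count + last.Written.Count;
    }
}
```

Placing the failing writer in the middle checks both halves of the claim: the writer before it still ran, and the writer after it was not skipped.
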

**Verification gates (evidence, not assertions):** `dotnet test` green + `dotnet pack` → **1 nupkg @
0.1.0**.

## GAPS.md / adoption backlog (deferred — adoption lives here)

- Per-project divergences vs `SPEC.md`; each `current-state` ends in an Adoption plan.
- **ScadaBridge** — already at the target; adoption is "align, don't replace" (rename
  `IAuditPayloadFilter`→`IAuditRedactor`; its `IAuditWriter` already matches). Large surface, but mostly
  naming/contract alignment, no behaviour change. Risk: high blast radius, so low priority.
- **MxGateway** — map `IApiKeyAuditStore`/`ApiKeyAuditRecord` → `IAuditWriter`/`AuditEvent`. Low effort,
  but **coordinate** — the parallel session is already editing this repo (MEL→Serilog).
- **OtOpcUa** — `AuditEvent` → canonical record; `AuditWriterActor : IAuditWriter`; `ConfigAuditLog`
  mapping. Medium effort.
- Cross-cutting items: `Outcome` normalization across all three; `Actor` = `ZB.MOM.WW.Auth` principal
  (closes the loop on Auth); `IAuditRedactor` naming aligned with Telemetry's `ILogRedactor`.

## Build order

```
1. components/audit/ docs (spec first — drives the API):
     current-state ×3 → SPEC + EVENT-MODEL → shared-contract → GAPS
2. ZB.MOM.WW.Audit library (record + enum + 2 seams + helpers + AddZbAudit), tests, pack @ 0.1.0
3. Index/registry updates (components/README, CLAUDE.md, upcoming.md #3) + GAPS cross-check
   (no adoption step — deferred to GAPS)
```

## Implementation tasks (native task IDs)

- #7 Author `components/audit/` normalization docs (current-state ×3 + SPEC + EVENT-MODEL +
  shared-contract + GAPS)
- #8 Build `ZB.MOM.WW.Audit` library (1 package: record + enum + 2 seams + helpers + `AddZbAudit`;
  tests; `dotnet pack` @ 0.1.0) — blocked by #7 (spec drives the API)
- #9 Index/registry updates (`components/README.md`, `CLAUDE.md`, `upcoming.md` #3) + GAPS cross-check —
  blocked by #8

Adoption into the three apps is intentionally **not** a task here — it is the `GAPS.md` follow-on,
identical to where Auth/Theme/Health adoption sits today.