# Audit — normalized target spec
Status: **Draft**. The single design the sister projects converge on. Derived from the three
code-verified current-state docs (`../current-state/`) and the locked design
(`../../../docs/plans/2026-06-01-audit-component-design.md`). The goal is a *path to shared code*
(`../shared-contract/ZB.MOM.WW.Audit.md`), so each normalized section maps to a shared library seam.
## 0. Normalized vs left-per-project
**Normalized here** (the shared `ZB.MOM.WW.Audit` library):
- **The canonical `AuditEvent` record** — required core (`EventId`, `OccurredAtUtc`, `Actor`,
`Action`, `Outcome`) + optional common (`Category`, `Target`, `SourceNode`, `CorrelationId`) +
the `DetailsJson` extension bag. The full field-by-field reference is [`EVENT-MODEL.md`](EVENT-MODEL.md).
- **`AuditOutcome`** — the 3-value `Success | Failure | Denied` enum (§3). This is a *new*
normalized field every app derives; see [`EVENT-MODEL.md`](EVENT-MODEL.md) for the per-app derivation.
- **The two seams** — `IAuditWriter` (best-effort, never throws to caller, §1) and `IAuditRedactor`
(pure, never throws, over-redacts on failure, §2).
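
As a rough picture of the shape described above, the record and enum might look like the following sketch. The field names come from the list; the field types (`Guid`, `DateTimeOffset`, nullable `string`) and the split between constructor parameters and `init` properties are assumptions, and [`EVENT-MODEL.md`](EVENT-MODEL.md) remains the authoritative reference.

```csharp
#nullable enable
using System;

// Sketch only: types and nullability are assumptions, not the library's definition.
public enum AuditOutcome { Success, Failure, Denied }

public sealed record AuditEvent(
    // Required core: every event must carry these five values.
    Guid EventId,
    DateTimeOffset OccurredAtUtc,
    string Actor,
    string Action,
    AuditOutcome Outcome)
{
    // Optional common fields: null when an app has nothing to report.
    public string? Category { get; init; }
    public string? Target { get; init; }
    public string? SourceNode { get; init; }
    public string? CorrelationId { get; init; }

    // Extension bag: app-specific detail serialized as a JSON string.
    public string? DetailsJson { get; init; }
}
```

Declaring the type as a `record` gives value equality and `with` copies, which is what the redactor seam in §2 relies on.
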

**Explicitly NOT normalized** (domain-specific / divergent — keep per project):
- **Transport & storage** — OtOpcUa's Akka cluster-broadcast → singleton `AuditWriterActor` (batch
500 / 5 s, two-layer dedup) over `ConfigAuditLog`; MxGateway's SQLite `IApiKeyAuditStore` append +
list-recent; ScadaBridge's site-SQLite hot-path → central MS SQL ingest / reconcile / purge /
partition-maintenance / hash-chain pipeline. The shared core carries no Akka / EF / SQLite /
Serilog dependency; its only non-BCL dependency is `Microsoft.Extensions.DependencyInjection.Abstractions`
(for `AddZbAudit`).
- **Domain vocabulary** — ScadaBridge's `Channel` / `Kind` / `Status` / `ForwardState` enums and
OtOpcUa's `EventType` strings (`DraftCreated`, `Published`, `OpcUaAccessDenied`, …). These map
*into* `Action` / `Category` / `Outcome` / `DetailsJson`; they do not leak into the shared type.
- **Query / CLI / UI / export** surfaces (OtOpcUa `ClusterAudit.razor`; ScadaBridge `export` /
`verify-chain` CLI + Blazor audit pages; MxGateway's unused `ListRecentAsync`).
- **Each app's redaction *policy*** — *which* fields/commands/payloads are sensitive. Only the
`IAuditRedactor` *seam* is shared; the `Default` / `Safe` filter behaviour stays per-project.
> **Scope of the producer path.** OtOpcUa has **two producers** writing the same `ConfigAuditLog`
> table — the structured Akka `AuditEvent` path *and* older SQL stored procedures that `INSERT`
> directly (`SUSER_SNAME()`, bare `EventType`, NULL `EventId`). Normalization targets the
> **structured producer path** (the one that builds an `AuditEvent`), not every SQL insert; the SP
> path stays per-project and is a reconcile item, not an extraction item (`../GAPS.md`).
## 1. The writer contract — `IAuditWriter` (best-effort)
```csharp
public interface IAuditWriter
{
Task WriteAsync(AuditEvent evt, CancellationToken ct = default);
}
```
Audit is a side-channel, never on the critical path. The hard rule:
- **`WriteAsync` MUST NOT throw to the caller.** An implementation swallows/logs its own internal
failures; a failed write **must never abort the user-facing action** it is recording. (ScadaBridge's
seam already states this almost word-for-word: "Failures must NEVER abort the user-facing action.")
- Idempotency is carried by `EventId`, so retries and at-least-once transports are safe (OtOpcUa's
filtered-unique `EventId` index and ScadaBridge's first-write-wins are both honoured by this key).
- Delivery is at-most-once *as a contract* — a writer MAY drop on failure (OtOpcUa drops a failed
batch; ScadaBridge's ring-buffer fallback drops oldest). Durability is a per-project transport
decision, not part of this seam.
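
The three rules above can be sketched as a small wrapper. `BestEffortAuditWriter`, its delegate-based sink, and the in-memory `EventId` set are hypothetical illustrations, not shipped helpers; a real transport would dedup in its own store and bound the memory it uses.

```csharp
#nullable enable
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

public enum AuditOutcome { Success, Failure, Denied }

// Trimmed stand-in for the canonical record (required core only).
public sealed record AuditEvent(
    Guid EventId, DateTimeOffset OccurredAtUtc, string Actor, string Action, AuditOutcome Outcome);

public interface IAuditWriter
{
    Task WriteAsync(AuditEvent evt, CancellationToken ct = default);
}

// Hypothetical wrapper that enforces the contract around an arbitrary sink.
public sealed class BestEffortAuditWriter : IAuditWriter
{
    private readonly Func<AuditEvent, CancellationToken, Task> _sink;
    private readonly Action<Exception> _onError;
    // Unbounded for brevity; a real writer would cap or expire this set.
    private readonly ConcurrentDictionary<Guid, byte> _seen = new();

    public BestEffortAuditWriter(
        Func<AuditEvent, CancellationToken, Task> sink, Action<Exception>? onError = null)
    {
        _sink = sink;
        _onError = onError ?? (_ => { });
    }

    public async Task WriteAsync(AuditEvent evt, CancellationToken ct = default)
    {
        try
        {
            // First write wins: a retried or redelivered EventId is ignored.
            if (!_seen.TryAdd(evt.EventId, 0)) return;
            await _sink(evt, ct).ConfigureAwait(false);
        }
        catch (Exception ex)
        {
            // Swallow: the event is dropped and the failure is reported out of band.
            try { _onError(ex); } catch { /* the error hook must not throw either */ }
        }
    }
}
```

A caller can `await writer.WriteAsync(evt)` inline without a `try` block, because every failure, including a synchronous throw from the sink, ends inside the wrapper.
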

Shipped helpers (the only concrete writers): `NoOpAuditWriter` (discards — tests / disabled audit),
`CompositeAuditWriter` (fans out to N writers; **one writer throwing does not stop the others**), and
`RedactingAuditWriter` (decorator: applies the redactor, then delegates to an inner writer).
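
To illustrate the fan-out and decorator behaviour just described, here is a minimal sketch. The `...Sketch` classes and their constructor shapes are assumptions made for illustration and are not the library's implementation; `InMemoryWriter` and `DropDetailsRedactor` are hypothetical test doubles.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public enum AuditOutcome { Success, Failure, Denied }
public sealed record AuditEvent(
    Guid EventId, DateTimeOffset OccurredAtUtc, string Actor, string Action, AuditOutcome Outcome)
{
    public string? DetailsJson { get; init; }
}

public interface IAuditWriter { Task WriteAsync(AuditEvent evt, CancellationToken ct = default); }
public interface IAuditRedactor { AuditEvent Apply(AuditEvent rawEvent); }

// Fans out to every writer; a failure in one is isolated from the rest.
public sealed class CompositeWriterSketch : IAuditWriter
{
    private readonly IReadOnlyList<IAuditWriter> _writers;
    public CompositeWriterSketch(params IAuditWriter[] writers) => _writers = writers;

    public async Task WriteAsync(AuditEvent evt, CancellationToken ct = default)
    {
        foreach (var writer in _writers)
        {
            try { await writer.WriteAsync(evt, ct).ConfigureAwait(false); }
            catch (Exception) { /* a misbehaving writer must not stop the others */ }
        }
    }
}

// Decorator: redact first, then delegate to the inner writer.
public sealed class RedactingWriterSketch : IAuditWriter
{
    private readonly IAuditRedactor _redactor;
    private readonly IAuditWriter _inner;
    public RedactingWriterSketch(IAuditRedactor redactor, IAuditWriter inner)
    {
        _redactor = redactor;
        _inner = inner;
    }

    public Task WriteAsync(AuditEvent evt, CancellationToken ct = default)
        => _inner.WriteAsync(_redactor.Apply(evt), ct);
}

// Test double: records events, or throws to simulate a broken writer.
public sealed class InMemoryWriter : IAuditWriter
{
    private readonly bool _fail;
    public List<AuditEvent> Written { get; } = new();
    public InMemoryWriter(bool fail = false) => _fail = fail;

    public Task WriteAsync(AuditEvent evt, CancellationToken ct = default)
    {
        if (_fail) throw new InvalidOperationException("simulated failure");
        Written.Add(evt);
        return Task.CompletedTask;
    }
}

// Test double: a trivial policy that drops the extension bag.
public sealed class DropDetailsRedactor : IAuditRedactor
{
    public AuditEvent Apply(AuditEvent rawEvent) => rawEvent with { DetailsJson = null };
}
```

Composing `new CompositeWriterSketch(new InMemoryWriter(fail: true), new RedactingWriterSketch(new DropDetailsRedactor(), sink))` still delivers a redacted event to `sink`, even though the first writer throws.
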
## 2. The redactor contract — `IAuditRedactor` (never throws)
```csharp
public interface IAuditRedactor
{
AuditEvent Apply(AuditEvent rawEvent);
}
```
A pure projection from a raw event to a safe one, applied between event construction and the writer
chain. The hard rule:
- **`Apply` MUST NOT throw.** On any internal failure it **over-redacts** (returns a strictly safer
event) rather than propagating — a redactor that throws would either crash the audit path or leak
the unredacted event. (ScadaBridge's `SafeDefaultAuditPayloadFilter` is the reference: header-only
redaction, over-redacts on parse failure.)
- It is a **pure function** returning a filtered *copy* (via `with`); it does not mutate the input or
perform I/O.

The seam is **aligned-but-independent** with Telemetry's `ILogRedactor` — same shape and naming
discipline so a future `ZB.MOM.WW.Hosting` aggregator wires both with one mental model — but there is
**no cross-package dependency**. Shipped helpers: `NullAuditRedactor` (identity — the default when no
policy is configured) and `TruncatingAuditRedactor` (caps `DetailsJson` / `Target` to a configured
max + sets a truncation marker; never throws). The *secret-field policy* (which fields/commands are
sensitive) stays per-project via composition.
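
A minimal sketch of the two redactor behaviours described in this section, under stated assumptions: `OverRedactingRedactor` and `TruncatingRedactorSketch` are hypothetical names, the `[truncated]` suffix and the `{"redacted":true}` placeholder are invented markers, and both classes assume a non-null input event. Note that cutting `DetailsJson` at a character limit leaves it invalid JSON; the shipped helper may handle that differently.

```csharp
#nullable enable
using System;

public enum AuditOutcome { Success, Failure, Denied }
public sealed record AuditEvent(
    Guid EventId, DateTimeOffset OccurredAtUtc, string Actor, string Action, AuditOutcome Outcome)
{
    public string? Target { get; init; }
    public string? DetailsJson { get; init; }
}

public interface IAuditRedactor { AuditEvent Apply(AuditEvent rawEvent); }

// Wraps a per-project policy; if the policy fails, fall back to a strictly safer event.
public sealed class OverRedactingRedactor : IAuditRedactor
{
    private readonly Func<AuditEvent, AuditEvent> _policy;
    public OverRedactingRedactor(Func<AuditEvent, AuditEvent> policy) => _policy = policy;

    public AuditEvent Apply(AuditEvent rawEvent)
    {
        try { return _policy(rawEvent); }
        catch (Exception)
        {
            // Keep the required core, drop every free-form field.
            return rawEvent with { Target = null, DetailsJson = "{\"redacted\":true}" };
        }
    }
}

// Caps Target and DetailsJson at a configured length and appends a marker.
public sealed class TruncatingRedactorSketch : IAuditRedactor
{
    private const string Marker = "[truncated]";
    private readonly int _max;
    public TruncatingRedactorSketch(int max) => _max = Math.Max(0, max);

    public AuditEvent Apply(AuditEvent rawEvent)
        => rawEvent with { Target = Cap(rawEvent.Target), DetailsJson = Cap(rawEvent.DetailsJson) };

    private string? Cap(string? value)
        => value is null || value.Length <= _max ? value : value.Substring(0, _max) + Marker;
}
```

Both classes return a `with` copy, so the raw event handed in by the producer is never modified.
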
## 3. `AuditOutcome` — the new normalized field
`Outcome` is in the **required core**, but **no app stores it today** — each encodes outcome
implicitly and must **derive** it at adoption (this is the one genuinely new field):
- **OtOpcUa** — derived from the `EventType` vocabulary (`OpcUaAccessDenied` /
`CrossClusterNamespaceAttempt` → `Denied`; config-write verbs → `Success`).
- **MxGateway** — `constraint-denied` → `Denied`; key-lifecycle events → `Success`.
- **ScadaBridge** — `AuditStatus` → `Outcome` (`Delivered` → `Success`; `Failed` / `Parked` /
`Discarded` → `Failure`; `InboundAuthFailure` kind → `Denied`).

The three values normalize denials and failures across the family without importing any app's full
taxonomy. The enum definition and the complete state-by-state mapping live in [`EVENT-MODEL.md`](EVENT-MODEL.md).
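
The derivations listed above could be written as small mapping functions. This sketch covers only the values this section names and returns `null` for anything else, deferring to `EVENT-MODEL.md`; the string-typed parameters, the `OutcomeDerivation` class, and the rule that the `InboundAuthFailure` kind takes precedence over status are assumptions. MxGateway is omitted because the key-lifecycle event names are not given here.

```csharp
#nullable enable
using System;

public enum AuditOutcome { Success, Failure, Denied }

// Hypothetical helper: maps per-app vocabulary onto the shared three-value enum.
public static class OutcomeDerivation
{
    // OtOpcUa: derive from the EventType string.
    public static AuditOutcome? FromOtOpcUaEventType(string eventType) => eventType switch
    {
        "OpcUaAccessDenied" or "CrossClusterNamespaceAttempt" => AuditOutcome.Denied,
        "DraftCreated" or "Published" => AuditOutcome.Success, // config-write verbs
        _ => null, // not named in this section
    };

    // ScadaBridge: the kind is checked first (assumption), then the status.
    public static AuditOutcome? FromScadaBridge(string kind, string status)
    {
        if (kind == "InboundAuthFailure") return AuditOutcome.Denied;
        return status switch
        {
            "Delivered" => AuditOutcome.Success,
            "Failed" or "Parked" or "Discarded" => AuditOutcome.Failure,
            _ => null, // not named in this section
        };
    }
}
```

Returning `null` rather than guessing keeps an unmapped value visible at adoption time instead of silently recording it as a success.
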
## 4. The hinge — audit closes the loop on Auth
Every audit row's `Actor` is the *who*, which is exactly the identity the **Auth** component already
normalizes (LDAP/GLAuth principal, API-key name). Auth is the read side ("who is this and what may
they do"); audit is the write side ("who did what"). The spec ties them by stating:
- **`Actor` SHOULD be the `ZB.MOM.WW.Auth` principal** at adoption time.
- But `Actor` is **kept as a plain `string`** in the contract, so the library carries **no dependency
on `ZB.MOM.WW.Auth`**. (MxGateway's keyless events — `init-db` / `list-keys` — supply a `"system"` /
`"cli"` fallback rather than leaving the required field empty.)

This mirrors Auth's own decision to keep audit *read* inside `OBSERVE` and audit *export* inside
`ADMINISTER` rather than minting a separate auditor role: the two components share a vocabulary, not a
dependency.
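
Because `Actor` is a plain string, the fallback rule needs nothing from the Auth package. A minimal sketch, assuming the caller has already extracted a principal name (or has none); `ActorResolver` and its members are hypothetical:

```csharp
#nullable enable
using System;

// Hypothetical helper: picks the Actor string for an event.
public static class ActorResolver
{
    public const string SystemActor = "system";
    public const string CliActor = "cli";

    // Use the authenticated principal's name when there is one; otherwise the fallback,
    // so the required Actor field is never left empty.
    public static string Resolve(string? principalName, string fallback = SystemActor)
        => string.IsNullOrWhiteSpace(principalName) ? fallback : principalName!;
}
```

For a keyless command such as `init-db`, a caller would pass `null` and get `"system"` (or `"cli"` if it supplies `ActorResolver.CliActor` as the fallback).
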
## 5. ScadaBridge is already at the target
ScadaBridge already ships **both** seams: an `IAuditWriter` whose best-effort contract matches
almost word-for-word (§1), and an `IAuditPayloadFilter` that *is* the canonical `IAuditRedactor` under a different
name (identical `AuditEvent Apply(AuditEvent)` signature, pure / never-throws / over-redacts). The
library essentially **lifts ScadaBridge's seams**.

The one real (non-naming) decision is the **writer's record type**: the canonical `IAuditWriter` is
typed on the 10-field `AuditEvent`; ScadaBridge's writer is typed on its ~25-field record.
> **Resolution (recommended):** share the **interface *name* + the `AuditOutcome` enum**, not the
> record schema. ScadaBridge keeps its rich ~25-field record as its **storage shape** (its whole
> transport / partition / forwarding / reconciliation layer is built on the extra columns), and maps
> to the canonical 10-field record **only at cross-app reporting boundaries**. This is the
> minimal-coupling option — share the contract, not the schema — and avoids making the shared seam
> generic over the event type. ScadaBridge therefore converges by **renaming one interface** and
> adopting `AuditOutcome`, with no transport / storage / CLI / UI change.
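
One way to picture the recommended resolution is a one-way projection applied only at the reporting boundary. Everything here is illustrative: `RichAuditRecord` is a heavily trimmed, hypothetical stand-in for ScadaBridge's ~25-field record, and the choice of `Kind` as `Action`, `Channel` as `Category`, and the remaining columns as `DetailsJson` is an assumption; the actual mapping is defined in `EVENT-MODEL.md`.

```csharp
#nullable enable
using System;
using System.Text.Json;

public enum AuditOutcome { Success, Failure, Denied }

public sealed record AuditEvent(
    Guid EventId, DateTimeOffset OccurredAtUtc, string Actor, string Action, AuditOutcome Outcome)
{
    public string? Category { get; init; }
    public string? DetailsJson { get; init; }
}

// Hypothetical storage shape: stays inside the project and is never shared.
public sealed record RichAuditRecord(
    Guid EventId, DateTimeOffset OccurredAtUtc, string Actor,
    string Channel, string Kind, string Status, string ForwardState);

public static class ReportingBoundary
{
    // Projects the rich record onto the canonical one; the outcome is derived by the caller.
    public static AuditEvent ToCanonical(RichAuditRecord r, AuditOutcome outcome)
    {
        // Columns with no canonical home travel in the extension bag.
        var details = JsonSerializer.Serialize(
            new { status = r.Status, forwardState = r.ForwardState });

        return new AuditEvent(r.EventId, r.OccurredAtUtc, r.Actor, Action: r.Kind, Outcome: outcome)
        {
            Category = r.Channel,
            DetailsJson = details,
        };
    }
}
```

The projection is lossy by design: storage, forwarding, and reconciliation keep using the rich record, and only cross-app reports see the 10-field shape.
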
## 6. Acceptance (what "converged" means)
A project is converged when: (a) its structured audit-producer path constructs the canonical
`AuditEvent` (with `Outcome` derived per §3) and persists via an implementation of `IAuditWriter`;
(b) any redaction runs through an `IAuditRedactor`; and (c) `Actor` carries the `ZB.MOM.WW.Auth` principal
where one exists (string fallback otherwise). All three must hold with the project's transport, storage,
domain vocabulary, query surfaces, and redaction *policy* unchanged. Per-project deltas and the adoption backlog are in
[`../GAPS.md`](../GAPS.md); the proposed library API is [`../shared-contract/ZB.MOM.WW.Audit.md`](../shared-contract/ZB.MOM.WW.Audit.md).