docs(audit): spec + event-model
@@ -0,0 +1,94 @@
# Canonical event model (standardized)

Status: **Standardized**. The org-wide audit record + outcome enum every sister project maps onto.
This is the reference companion to [`SPEC.md`](SPEC.md) (mirroring auth's `CANONICAL-ROLES.md` /
theme's `DESIGN-TOKENS.md`): the field-by-field canonical record, the `AuditOutcome` definition and
the app states that map onto each value, and the full per-project mapping table. The shared library
defines exactly this record; each project **projects its native record onto it** at the seam.

## The canonical record

```csharp
namespace ZB.MOM.WW.Audit;

public sealed record AuditEvent
{
    // REQUIRED core — who / what / when / outcome
    public required Guid EventId { get; init; }                 // idempotency key
    public required DateTimeOffset OccurredAtUtc { get; init; } // normalized to UTC
    public required string Actor { get; init; }                 // who — = ZB.MOM.WW.Auth principal at adoption
    public required string Action { get; init; }                // what — verb / event-type string
    public required AuditOutcome Outcome { get; init; }         // Success | Failure | Denied

    // OPTIONAL common
    public string? Category { get; init; }                      // subsystem / grouping bucket
    public string? Target { get; init; }                        // on-what (resource / method / connection)
    public string? SourceNode { get; init; }                    // emitting logical node / host
    public Guid? CorrelationId { get; init; }                   // join to originating request / workflow

    // EXTENSION — everything project-specific, as JSON
    public string? DetailsJson { get; init; }
}

public enum AuditOutcome { Success, Failure, Denied }
```

### Field-by-field

| Field | Req? | Type | Meaning | Notes |
|---|:-:|---|---|---|
| `EventId` | yes | `Guid` | Idempotency key | Backs at-least-once transports: OtOpcUa's filtered-unique `EventId` index, ScadaBridge's first-write-wins. MxGateway has none today → **generate at write time**. |
| `OccurredAtUtc` | yes | `DateTimeOffset` | When it happened, UTC | MxGateway already uses `DateTimeOffset`. OtOpcUa / ScadaBridge store UTC-forced `DateTime` and widen at the mapping boundary. |
| `Actor` | yes | `string` | Who acted | SHOULD be the `ZB.MOM.WW.Auth` principal ([`SPEC.md`](SPEC.md) §4). Kept a `string` (no Auth dependency). Keyless events use a `"system"` / `"cli"` fallback rather than empty. |
| `Action` | yes | `string` | What was done (verb / event-type) | Carries each app's domain verb: OtOpcUa `EventType`, MxGateway `EventType`, ScadaBridge `{Channel}.{Kind}`. |
| `Outcome` | yes | `AuditOutcome` | Success / Failure / Denied | **New normalized field — no app stores it today; each derives it** (see below). |
| `Category` | no | `string?` | Coarse subsystem / grouping | OtOpcUa `Category` (`"Config"`); MxGateway constant `"ApiKey"`; ScadaBridge `Channel`. |
| `Target` | no | `string?` | The object acted on | ScadaBridge `Target` (direct). OtOpcUa / MxGateway have no dedicated field → null or fold into `DetailsJson`. |
| `SourceNode` | no | `string?` | Emitting logical node / host | OtOpcUa `SourceNode` (a logical node name, **not** an OPC UA NodeId); ScadaBridge `SourceNode`; MxGateway `RemoteAddress`. |
| `CorrelationId` | no | `Guid?` | Join to originating request / workflow | OtOpcUa / ScadaBridge direct; MxGateway has none today (left null). |
| `DetailsJson` | no | `string?` | Extension bag — all project-specific data | Must be valid JSON where stored (OtOpcUa enforces this with a CHECK constraint). Absorbs each app's surplus columns. |
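
Two of the bridges named in the notes column (widening a UTC-forced `DateTime`, and getting plain text into a JSON-valid `DetailsJson`) can be sketched as follows. This is a minimal illustration under stated assumptions, not the shared library's API: `AuditBoundary`, `WidenUtc`, `ToDetailsJson`, and the `"text"` wrapper key are hypothetical names, and the sketch assumes the project-side mapping code may use `System.Text.Json`.

```csharp
#nullable enable
using System;
using System.Text.Json;

// Hypothetical project-side helpers; none of these names come from the library.
public static class AuditBoundary
{
    // Widen a UTC-forced DateTime (the OtOpcUa / ScadaBridge storage shape)
    // to a DateTimeOffset with a zero offset. The clock value is kept as-is;
    // only the kind is stamped as UTC.
    public static DateTimeOffset WidenUtc(DateTime utc) =>
        new DateTimeOffset(DateTime.SpecifyKind(utc, DateTimeKind.Utc), TimeSpan.Zero);

    // Return the input unchanged when it already parses as JSON; otherwise
    // wrap the plain text in a small JSON object (assumed key: "text").
    public static string? ToDetailsJson(string? details)
    {
        if (string.IsNullOrWhiteSpace(details)) return null;
        try
        {
            using var doc = JsonDocument.Parse(details);
            return details;
        }
        catch (JsonException)
        {
            return JsonSerializer.Serialize(new { text = details });
        }
    }
}
```

Wrapping instead of rejecting keeps a plain-string `Details` value usable in a store that enforces valid JSON; a project could equally choose to store the string as-is where no such constraint exists.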

## `AuditOutcome` — definition and app-state mapping

Three values, deliberately minimal — enough to normalize denials and failures without importing any
app's full taxonomy. `Outcome` is **derived** at each emit site (no app persists it today; OtOpcUa
encodes it implicitly in `EventType`, MxGateway in the event-type literal, ScadaBridge in `Status`):

| `AuditOutcome` | Meaning | OtOpcUa (`EventType`) | MxGateway (event type) | ScadaBridge (`AuditStatus` / `AuditKind`) |
|---|---|---|---|---|
| **`Success`** | The action completed | config-write verbs — `DraftCreated`, `DraftEdited`, `Published`, `RolledBack`, `NodeApplied`, `CredentialAdded`, `ClusterCreated`, `NodeAdded`, `ExternalIdReleased`, … | key-lifecycle — `init-db`, `create-key`, `list-keys`, `revoke-key`, `rotate-key` + all `dashboard-*` | `Status = Delivered` |
| **`Failure`** | The action was attempted and failed | *(none today — a failed actor flush is dropped, not recorded as an event)* | *(none emitted today)* | `Status ∈ { Failed, Parked, Discarded }` |
| **`Denied`** | The action was rejected by authorization / policy | `OpcUaAccessDenied`, `CrossClusterNamespaceAttempt` | `constraint-denied` | `Kind = InboundAuthFailure` |

Notes:

- **OtOpcUa has no `Failure` source.** Its vocabulary only distinguishes success-verbs from
  access-denials; an internal write failure is dropped (best-effort), not emitted as an event. So
  OtOpcUa produces only `Success` / `Denied` until/unless it adds failure events.
- **MxGateway emits only `Success` / `Denied`** today (no failure events; authentication
  success/failure is surfaced as gRPC status, not persisted — see its current-state doc).
- **ScadaBridge in-flight states** (`Submitted` / `Forwarded` / `Attempted`) are not terminal; when
  projecting to a single `Outcome` they collapse to the last-known terminal state. `Skipped` is not a
  user-facing outcome and is excluded from the canonical projection.
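
The derivations in the table and notes above can be sketched as three small functions. This is a hedged illustration only: `OutcomeDerivation` and its method names are hypothetical, ScadaBridge's `Status` / `Kind` enums are represented as strings to keep the sketch self-contained, and letting `InboundAuthFailure` take precedence over `Status` is an assumption the source does not spell out.

```csharp
#nullable enable
using System;

public enum AuditOutcome { Success, Failure, Denied }

// Hypothetical helpers; the real emit sites live in each project.
public static class OutcomeDerivation
{
    // OtOpcUa: the two denial event types map to Denied; every other
    // event type in today's vocabulary is a config-write success verb.
    public static AuditOutcome FromOtOpcUaEventType(string eventType) =>
        (eventType is "OpcUaAccessDenied" or "CrossClusterNamespaceAttempt")
            ? AuditOutcome.Denied
            : AuditOutcome.Success;

    // MxGateway: only constraint-denied is a denial; nothing maps to Failure today.
    public static AuditOutcome FromMxGatewayEventType(string eventType) =>
        eventType == "constraint-denied" ? AuditOutcome.Denied : AuditOutcome.Success;

    // ScadaBridge: returns null for in-flight states and Skipped, so the caller
    // can fall back to the last-known terminal state or leave the row out.
    public static AuditOutcome? FromScadaBridge(string kind, string status)
    {
        if (kind == "InboundAuthFailure") return AuditOutcome.Denied; // assumed precedence
        return status switch
        {
            "Delivered" => AuditOutcome.Success,
            "Failed" or "Parked" or "Discarded" => AuditOutcome.Failure,
            _ => (AuditOutcome?)null,
        };
    }
}
```

Returning `null` for non-terminal states keeps the "collapse to the last-known terminal state" decision with the caller, which is the only place that knows the row's history.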

## Per-project mapping table (canonical ← native record)

Consolidated from the three current-state docs. "Direct" = field exists with the same role; the
right-hand notes flag the type bridges and synthesized fields.

| Canonical field | OtOpcUa `AuditEvent` (8 fields) | MxGateway `ApiKeyAuditRecord` (6 fields) | ScadaBridge `AuditEvent` (~25 fields) |
|---|---|---|---|
| `EventId` | `EventId` — direct (idempotency key) | **generate** new `Guid` (only `AuditId` rowid exists) | `EventId` — direct |
| `OccurredAtUtc` | `OccurredAtUtc` (`DateTime` UTC) → widen | `CreatedUtc` (store-assigned `DateTimeOffset`) — direct | `OccurredAtUtc` (`DateTime` UTC-forced) → widen |
| `Actor` | `Actor` — direct | `KeyId` (nullable → `"system"`/`"cli"` fallback) | `Actor` (nullable on system rows) |
| `Action` | `Action` (persisted as `"{Category}:{Action}"`) | `EventType` — direct | `{Channel}.{Kind}` (e.g. `ApiOutbound.ApiCall`) |
| `Outcome` | **derive** from `EventType` | **derive**: `constraint-denied`→`Denied`, else `Success` | **derive** from `Status` (+`InboundAuthFailure`→`Denied`) |
| `Category` | `Category` (`"Config"`) | constant `"ApiKey"` | `Channel` |
| `Target` | — none — (null or via `DetailsJson`) | — none — (`commandKind`/`target` embedded in `Details` text) | `Target` — direct |
| `SourceNode` | `SourceNode` (logical node, `NodeId.Value`) | `RemoteAddress` (dashboard path only) | `SourceNode` — direct |
| `CorrelationId` | `CorrelationId` (`CorrelationId.Value`) — direct | — none — | `CorrelationId` — direct |
| `DetailsJson` | `DetailsJson` — direct (also `ClusterId`/`GenerationId` on the SP path) | `Details` (plain string → store as-is or wrap) | the ~15 rich/plumbing fields (`ExecutionId`, `SourceSiteId`, `HttpStatus`, `DurationMs`, `ErrorMessage`, `RequestSummary`, `ResponseSummary`, `PayloadTruncated`, `Extra`, `ForwardState`, …) serialize here |

The canonical record is a **lossy projection**: it is sufficient for cross-project reporting, but each
project keeps its native record as the storage shape — ScadaBridge especially, whose partitioned SQL
schema, forwarding state, and reconciliation depend on the extra columns ([`SPEC.md`](SPEC.md) §5).
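
As one worked instance of the projection, the MxGateway column of the mapping table could look like the sketch below. The six field names of `ApiKeyAuditRecord` are taken from the table, but their types, the positional-record shape, and the `MxGatewayProjection.ToCanonical` helper are assumptions made for illustration. It also assumes C# 11 or later for `required` members and picks `"system"` as the fallback actor.

```csharp
#nullable enable
using System;

public enum AuditOutcome { Success, Failure, Denied }

// Canonical record, repeated so the sketch compiles on its own.
public sealed record AuditEvent
{
    public required Guid EventId { get; init; }
    public required DateTimeOffset OccurredAtUtc { get; init; }
    public required string Actor { get; init; }
    public required string Action { get; init; }
    public required AuditOutcome Outcome { get; init; }
    public string? Category { get; init; }
    public string? Target { get; init; }
    public string? SourceNode { get; init; }
    public Guid? CorrelationId { get; init; }
    public string? DetailsJson { get; init; }
}

// ASSUMED native shape: field names from the mapping table, types guessed.
public sealed record ApiKeyAuditRecord(
    long AuditId, DateTimeOffset CreatedUtc, string? KeyId,
    string EventType, string? RemoteAddress, string? Details);

public static class MxGatewayProjection
{
    public static AuditEvent ToCanonical(ApiKeyAuditRecord native) => new()
    {
        EventId = Guid.NewGuid(),                          // no native id: generated at write time
        OccurredAtUtc = native.CreatedUtc.ToUniversalTime(),
        Actor = string.IsNullOrEmpty(native.KeyId) ? "system" : native.KeyId,
        Action = native.EventType,
        Outcome = native.EventType == "constraint-denied"
            ? AuditOutcome.Denied
            : AuditOutcome.Success,
        Category = "ApiKey",                               // constant per the table
        Target = null,                                     // no dedicated native field
        SourceNode = native.RemoteAddress,                 // populated on the dashboard path only
        CorrelationId = null,                              // none today
        DetailsJson = native.Details,                      // stored as-is; wrapping is the alternative
    };
}
```

Because `EventId` is generated rather than read, it should be produced once per native record at write time and then reused; calling the projection twice would otherwise yield two different idempotency keys for the same event.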
@@ -0,0 +1,145 @@
# Audit — normalized target spec

Status: **Draft**. The single design the sister projects converge on. Derived from the three
code-verified current-state docs (`../current-state/`) and the locked design
(`../../../docs/plans/2026-06-01-audit-component-design.md`). Goal is *path to shared code*
(`../shared-contract/ZB.MOM.WW.Audit.md`), so each normalized section maps to a shared library seam.

## 0. Normalized vs left-per-project

**Normalized here** (the shared `ZB.MOM.WW.Audit` library):

- **The canonical `AuditEvent` record** — required core (`EventId`, `OccurredAtUtc`, `Actor`,
  `Action`, `Outcome`) + optional common (`Category`, `Target`, `SourceNode`, `CorrelationId`) +
  the `DetailsJson` extension bag. The full field-by-field reference is [`EVENT-MODEL.md`](EVENT-MODEL.md).
- **`AuditOutcome`** — the 3-value `Success | Failure | Denied` enum (§3). This is a *new*
  normalized field every app derives; see [`EVENT-MODEL.md`](EVENT-MODEL.md) for the per-app derivation.
- **The two seams** — `IAuditWriter` (best-effort, never throws to caller, §1) and `IAuditRedactor`
  (pure, never throws, over-redacts on failure, §2).

**Explicitly NOT normalized** (domain-specific / divergent — keep per project):

- **Transport & storage** — OtOpcUa's Akka cluster-broadcast → singleton `AuditWriterActor` (batch
  500 / 5 s, two-layer dedup) over `ConfigAuditLog`; MxGateway's SQLite `IApiKeyAuditStore` append +
  list-recent; ScadaBridge's site-SQLite hot-path → central MS SQL ingest / reconcile / purge /
  partition-maintenance / hash-chain pipeline. The shared core is **BCL-only** and carries no Akka /
  EF / SQLite / Serilog dependency.
- **Domain vocabulary** — ScadaBridge's `Channel` / `Kind` / `Status` / `ForwardState` enums and
  OtOpcUa's `EventType` strings (`DraftCreated`, `Published`, `OpcUaAccessDenied`, …). These map
  *into* `Action` / `Category` / `Outcome` / `DetailsJson`; they do not leak into the shared type.
- **Query / CLI / UI / export** surfaces (OtOpcUa `ClusterAudit.razor`; ScadaBridge `export` /
  `verify-chain` CLI + Blazor audit pages; MxGateway's unused `ListRecentAsync`).
- **Each app's redaction *policy*** — *which* fields/commands/payloads are sensitive. Only the
  `IAuditRedactor` *seam* is shared; the `Default` / `Safe` filter behaviour stays per-project.

> **Scope of the producer path.** OtOpcUa has **two producers** writing the same `ConfigAuditLog`
> table — the structured Akka `AuditEvent` path *and* older SQL stored procedures that `INSERT`
> directly (`SUSER_SNAME()`, bare `EventType`, NULL `EventId`). Normalization targets the
> **structured producer path** (the one that builds an `AuditEvent`), not every SQL insert; the SP
> path stays per-project and is a reconcile item, not an extraction item (`../GAPS.md`).

## 1. The writer contract — `IAuditWriter` (best-effort)

```csharp
public interface IAuditWriter
{
    Task WriteAsync(AuditEvent evt, CancellationToken ct = default);
}
```

Audit is a side-channel, never on the critical path. The hard rule:

- **`WriteAsync` MUST NOT throw to the caller.** An implementation swallows/logs its own internal
  failures; a failed write **must never abort the user-facing action** it is recording. (ScadaBridge's
  seam already states this almost word-for-word: "Failures must NEVER abort the user-facing action.")
- Idempotency is carried by `EventId`, so retries and at-least-once transports are safe (OtOpcUa's
  filtered-unique `EventId` index and ScadaBridge's first-write-wins are both honoured by this key).
- Delivery is at-most-once *as a contract* — a writer MAY drop on failure (OtOpcUa drops a failed
  batch; ScadaBridge's ring-buffer fallback drops oldest). Durability is a per-project transport
  decision, not part of this seam.

Shipped helpers (the only concrete writers): `NoOpAuditWriter` (discards — tests / disabled audit),
`CompositeAuditWriter` (fans out to N writers; **one writer throwing does not stop the others**), and
`RedactingAuditWriter` (decorator: applies the redactor, then delegates to an inner writer).
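
The fan-out rule ("one writer throwing does not stop the others") together with the never-throws-to-caller rule could be implemented roughly as below. This is a sketch, not the shipped `CompositeAuditWriter`: the `CompositeAuditWriterSketch` and `DelegateAuditWriter` classes are hypothetical, the `AuditEvent` here is a trimmed stand-in for the canonical record, and logging through `System.Diagnostics.Trace` is just a BCL-only placeholder.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

// Trimmed stand-in for the canonical record, so the sketch compiles on its own.
public sealed record AuditEvent(Guid EventId, string Actor, string Action);

public interface IAuditWriter
{
    Task WriteAsync(AuditEvent evt, CancellationToken ct = default);
}

// Hypothetical test double that forwards to a delegate. It does NOT honour
// the never-throws rule itself, which is what lets it exercise the composite.
public sealed class DelegateAuditWriter : IAuditWriter
{
    private readonly Func<AuditEvent, Task> _write;
    public DelegateAuditWriter(Func<AuditEvent, Task> write) => _write = write;
    public Task WriteAsync(AuditEvent evt, CancellationToken ct = default) => _write(evt);
}

// Fan-out writer: each inner writer is isolated in its own try/catch, so a
// failure neither stops the remaining writers nor reaches the caller.
public sealed class CompositeAuditWriterSketch : IAuditWriter
{
    private readonly IReadOnlyList<IAuditWriter> _writers;
    public CompositeAuditWriterSketch(params IAuditWriter[] writers) => _writers = writers;

    public async Task WriteAsync(AuditEvent evt, CancellationToken ct = default)
    {
        foreach (var writer in _writers)
        {
            try
            {
                await writer.WriteAsync(evt, ct).ConfigureAwait(false);
            }
            catch (Exception ex)
            {
                // Swallow and log: the event may be dropped for this writer,
                // but the user-facing action is never aborted.
                Trace.TraceWarning("Audit writer {0} failed: {1}", writer.GetType().Name, ex.Message);
            }
        }
    }
}
```

Catching `Exception` broadly is deliberate in this sketch: it also swallows cancellation, which trades a clean cancel signal for the guarantee that the audit path cannot fail the caller.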

## 2. The redactor contract — `IAuditRedactor` (never throws)

```csharp
public interface IAuditRedactor
{
    AuditEvent Apply(AuditEvent rawEvent);
}
```

A pure projection from a raw event to a safe one, applied between event construction and the writer
chain. The hard rule:

- **`Apply` MUST NOT throw.** On any internal failure it **over-redacts** (returns a strictly safer
  event) rather than propagating — a redactor that throws would either crash the audit path or leak
  the unredacted event. (ScadaBridge's `SafeDefaultAuditPayloadFilter` is the reference: header-only
  redaction, over-redacts on parse failure.)
- It is a **pure function** returning a filtered *copy* (via `with`); it does not mutate the input or
  perform I/O.

The seam is **aligned-but-independent** with Telemetry's `ILogRedactor` — same shape and naming
discipline so a future `ZB.MOM.WW.Hosting` aggregator wires both with one mental model — but there is
**no cross-package dependency**. Shipped helpers: `NullAuditRedactor` (identity — the default when no
policy is configured) and `TruncatingAuditRedactor` (caps `DetailsJson` / `Target` to a configured
max + sets a truncation marker; never throws). The *secret-field policy* (which fields/commands are
sensitive) stays per-project via composition.
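
One way to get the over-redact-on-failure behaviour while keeping the policy per-project is a small wrapper that runs a project-supplied policy and falls back to a stripped copy if that policy throws. The sketch below is illustrative only: `OverRedactingRedactorSketch` is a hypothetical name, the `AuditEvent` is a trimmed stand-in for the canonical record, and treating "null out `Target` and `DetailsJson`" as the strictly safer event is an assumption about which fields can carry sensitive data.

```csharp
#nullable enable
using System;

// Trimmed stand-in for the canonical record, so the sketch compiles on its own.
public sealed record AuditEvent(
    Guid EventId, string Actor, string Action,
    string? Target = null, string? DetailsJson = null);

public interface IAuditRedactor
{
    AuditEvent Apply(AuditEvent rawEvent);
}

// Wraps a per-project policy. The policy is expected to be pure and to return
// a copy (via `with`); if it throws, the wrapper over-redacts instead of
// propagating the exception or passing the raw event through.
public sealed class OverRedactingRedactorSketch : IAuditRedactor
{
    private readonly Func<AuditEvent, AuditEvent> _policy;
    public OverRedactingRedactorSketch(Func<AuditEvent, AuditEvent> policy) => _policy = policy;

    // Assumes a non-null event, as the non-nullable signature promises.
    public AuditEvent Apply(AuditEvent rawEvent)
    {
        try
        {
            return _policy(rawEvent);
        }
        catch (Exception)
        {
            // Over-redact: keep who / what, drop the free-form fields.
            return rawEvent with { Target = null, DetailsJson = null };
        }
    }
}
```

A policy is then just a function such as `e => e with { DetailsJson = Mask(e.DetailsJson) }`, where `Mask` stands for whatever project-specific masking applies; the input event is left untouched in both the success and the failure path.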

## 3. `AuditOutcome` — the new normalized field

`Outcome` is in the **required core**, but **no app stores it today** — each encodes outcome
implicitly and must **derive** it at adoption (this is the one genuinely new field):

- **OtOpcUa** — derived from the `EventType` vocabulary (`OpcUaAccessDenied` /
  `CrossClusterNamespaceAttempt` → `Denied`; config-write verbs → `Success`).
- **MxGateway** — `constraint-denied` → `Denied`; key-lifecycle events → `Success`.
- **ScadaBridge** — `AuditStatus` → `Outcome` (`Delivered` → `Success`; `Failed` / `Parked` /
  `Discarded` → `Failure`; `InboundAuthFailure` kind → `Denied`).

The three values normalize denials and failures across the family without importing any app's full
taxonomy. The enum definition and the complete state-by-state mapping live in [`EVENT-MODEL.md`](EVENT-MODEL.md).

## 4. The hinge — audit closes the loop on Auth

Every audit row's `Actor` is the *who*, which is exactly the identity the **Auth** component already
normalizes (LDAP/GLAuth principal, API-key name). Auth is the read side ("who is this and what may
they do"); audit is the write side ("who did what"). The spec ties them by stating:

- **`Actor` SHOULD be the `ZB.MOM.WW.Auth` principal** at adoption time.
- But `Actor` is **kept as a plain `string`** in the contract, so the library carries **no dependency
  on `ZB.MOM.WW.Auth`**. (MxGateway's keyless events — `init-db` / `list-keys` — supply a `"system"` /
  `"cli"` fallback rather than leaving the required field empty.)

This mirrors Auth's own decision to keep audit *read* inside `OBSERVE` and audit *export* inside
`ADMINISTER` rather than minting a separate auditor role: the two components share a vocabulary, not a
dependency.

## 5. ScadaBridge is already at the target

ScadaBridge already ships **both** seams: an `IAuditWriter` whose best-effort contract matches
word-for-word, and an `IAuditPayloadFilter` that *is* the canonical `IAuditRedactor` under a different
name (identical `AuditEvent Apply(AuditEvent)` signature, pure / never-throws / over-redacts). The
library essentially **lifts ScadaBridge's seams**.

The one real (non-naming) decision is the **writer's record type**: the canonical `IAuditWriter` is
typed on the 10-field `AuditEvent` (5 required + 4 optional + `DetailsJson`); ScadaBridge's writer is
typed on its ~25-field record.

> **Resolution (recommended):** share the **interface *name* + the `AuditOutcome` enum**, not the
> record schema. ScadaBridge keeps its rich ~25-field record as its **storage shape** (its whole
> transport / partition / forwarding / reconciliation layer is built on the extra columns), and maps
> to the canonical 10-field record **only at cross-app reporting boundaries**. This is the
> minimal-coupling option — share the contract, not the schema — and avoids making the shared seam
> generic over the event type. ScadaBridge therefore converges by **renaming one interface** and
> adopting `AuditOutcome`, with no transport / storage / CLI / UI change.

## 6. Acceptance (what "converged" means)

A project is converged when: (a) its structured audit-producer path constructs the canonical
`AuditEvent` (with `Outcome` derived per §3) and persists via an implementation of `IAuditWriter`;
(b) any redaction runs through an `IAuditRedactor`; and (c) `Actor` carries the `ZB.MOM.WW.Auth`
principal where one exists (string fallback otherwise). Its transport, storage, domain vocabulary,
query surfaces, and redaction *policy* stay unchanged. Per-project deltas and the adoption backlog are in
[`../GAPS.md`](../GAPS.md); the proposed library API is [`../shared-contract/ZB.MOM.WW.Audit.md`](../shared-contract/ZB.MOM.WW.Audit.md).