diff --git a/components/audit/GAPS.md b/components/audit/GAPS.md
new file mode 100644
index 0000000..cb82e30
--- /dev/null
+++ b/components/audit/GAPS.md
@@ -0,0 +1,114 @@
+# Audit - gaps & adoption backlog
+
+Divergence of each project from [`spec/SPEC.md`](spec/SPEC.md), and the ordered backlog to
+reach the shared `ZB.MOM.WW.Audit` library. Status legend: ⛔ gap · 🟡 partial · ✅ matches.
+
+> **Adoption is deferred this round.** The library is being designed (shared contract in
+> [`shared-contract/ZB.MOM.WW.Audit.md`](shared-contract/ZB.MOM.WW.Audit.md)) but is not yet
+> wired into any app - exactly where `ZB.MOM.WW.Auth` and `ZB.MOM.WW.Theme` sit today.
+> The items below are the follow-on work; each lands as a separate PR per project.
+
+## Divergence vs spec
+
+### §1 Canonical record (`AuditEvent`)
+
+| Canonical field | OtOpcUa | MxAccessGateway | ScadaBridge |
+|---|---|---|---|
+| `EventId` (Guid, required) | ✅ - idempotency key; buffer key + filtered-unique DB index | ⛔ - no event key; only an `AUTOINCREMENT` rowid (`AuditId`) | ✅ - direct |
+| `OccurredAtUtc` (DateTimeOffset, required) | 🟡 - `DateTime` UTC; widen at mapping boundary | 🟡 - `DateTimeOffset` but store-assigned (not caller-supplied); direct after widening | 🟡 - `DateTime` UTC-forced; widen at mapping boundary |
+| `Actor` (string, required) | ✅ - direct (`AuditEvent.Actor` → `ConfigAuditLog.Principal`) | 🟡 - `KeyId` nullable; keyless events (`init-db`/`list-keys`) need a `"system"`/`"cli"` fallback | 🟡 - nullable on system-originated rows; fallback needed |
+| `Action` (string, required) | 🟡 - `Action` field exists, but persisted as `"{Category}:{Action}"` composite in `EventType`; canonical keeps them separate | ✅ - `EventType` literal direct | 🟡 - derived as `{Channel}.{Kind}` (e.g. `ApiOutbound.ApiCall`) |
+| `Outcome` (AuditOutcome, required) | ⛔ **NEW** - derived from `EventType` vocabulary; not stored today | ⛔ **NEW** - derived: `constraint-denied`→`Denied`, else `Success` | ⛔ **NEW** - derived from `Status` (+`InboundAuthFailure` Kind→`Denied`) |
+| `Category` (string?) | ✅ - `AuditEvent.Category` (e.g. `"Config"`) | ⛔ - no field; constant `"ApiKey"` at mapping | ✅ - `Channel` |
+| `Target` (string?) | ⛔ - no dedicated field; closest is `DetailsJson` | ⛔ - embedded in `Details` text (`commandKind`/`target`) | ✅ - direct |
+| `SourceNode` (string?) | ✅ - `SourceNode` (logical cluster node / host name, NOT an OPC UA NodeId) | 🟡 - `RemoteAddress`; dashboard path only (null on CLI/constraint paths) | ✅ - direct |
+| `CorrelationId` (Guid?) | ✅ - direct (`CorrelationId.Value`) | ⛔ - not captured today; left null | ✅ - direct |
+| `DetailsJson` (string?) | ✅ - direct (JSON CHECK constraint enforced) | 🟡 - `Details` is a plain string, not JSON; wrap or store as-is | 🟡 - ~15 rich/plumbing fields serialize here at the cross-project reporting boundary |
+
+### §2 `IAuditWriter` seam
+
+| | OtOpcUa | MxAccessGateway | ScadaBridge |
+|---|---|---|---|
+| Named seam | ⛔ - no `IAuditWriter`; `AuditWriterActor` is the sink, consumed directly via Akka messaging | ⛔ - `IApiKeyAuditStore` (narrow, two-method) is the seam; no general `IAuditWriter` | ✅ - `IAuditWriter` with `WriteAsync(AuditEvent, CancellationToken)` signature; "failures must NEVER abort the user-facing action" contract; best-effort |
+| Best-effort / never throws | 🟡 - the actor drops a failed flush (best-effort), but the seam is not a typed interface a caller can inject independently | ⛔ - no contract; `AppendAsync` may propagate | ✅ |
+| Record type at the seam | 🟡 - OtOpcUa's own `AuditEvent` (8 fields, with Commons value-types `NodeId`/`CorrelationId`) | ⛔ - `ApiKeyAuditEntry` (4 fields) | 🟡 - ScadaBridge's ~25-field `AuditEvent` (rich record; adoption = keep own record, adopt canonical interface name + `AuditOutcome`) |
+
+### §3 `IAuditRedactor` seam
+
+| | OtOpcUa | MxAccessGateway | ScadaBridge |
+|---|---|---|---|
+| Named seam | ⛔ - no redactor; no payload filtering today | ⛔ - no redactor; safety by construction (entry type cannot carry a secret) | ✅ - `IAuditPayloadFilter` (`AuditEvent Apply(AuditEvent)`, pure/never-throws/over-redacts); **only the name differs** from canonical `IAuditRedactor` |
+| Over-redacts on failure | ⛔ - n/a | ⛔ - n/a | ✅ - `SafeDefaultAuditPayloadFilter` is the reference |
+
+### §4 `AuditOutcome` - the new normalized field
+
+`Outcome` is a **genuinely new field** across all three projects. No app stores it today;
+each encodes it implicitly. All three must derive and emit it at adoption:
+
+→ **Gap O1 (OtOpcUa):** derive from `EventType` vocabulary - `OpcUaAccessDenied` /
+`CrossClusterNamespaceAttempt` → `Denied`; config-write verbs → `Success`. No `Failure`
+value exists in OtOpcUa's vocabulary today (failed flushes are dropped, not emitted), so
+OtOpcUa will produce only `Success` / `Denied` until/unless failure events are added.
+
+→ **Gap O2 (MxGateway):** derive - `constraint-denied` → `Denied`; all others → `Success`.
+No `Failure` events are emitted today.
+
+→ **Gap O3 (ScadaBridge):** derive from `AuditStatus` - `Delivered` → `Success`;
+`Failed` / `Parked` / `Discarded` → `Failure`; `Kind = InboundAuthFailure` → `Denied`.
+In-flight states (`Submitted` / `Forwarded` / `Attempted`) collapse to the last-known
+terminal state when projecting; `Skipped` is excluded from the canonical projection.
+
+### §5 `Actor` → Auth principal
+
+At adoption, every emit site should supply the `ZB.MOM.WW.Auth` principal as `Actor` (string).
+The library carries no Auth dependency - `Actor` is a plain `string` - but the
+handshake with Auth is the semantic goal (closes the loop).
+
+→ **Gap P1 (all 3):** at adoption, update emit sites to populate `Actor` from the Auth
+principal (LDAP user / API-key name). Auth adoption (#8 in `components/auth/GAPS.md`) is a
+prerequisite for the full story; until then, use the existing actor string.
+
+### §6 OtOpcUa two-producer problem
+
+OtOpcUa has **two writers to `ConfigAuditLog`**: the structured Akka `AuditEvent` path AND
+older SQL stored procedures that `INSERT` directly (bare `EventType`, NULL `EventId` /
+`CorrelationId`, populated `ClusterId` / `GenerationId`). Normalization targets the
+structured path only; the SP path stays per-project.
+
+→ **Gap Q1 (OtOpcUa):** decide at adoption whether to route SP events through the actor
+or leave them non-idempotent. Also: the `ClusterId`-filter / actor-never-sets-`ClusterId`
+mismatch (Admin UI `ClusterAudit.razor` filters by `ClusterId`, but the actor path sets
+`NodeId` not `ClusterId`, so structured rows are invisible to the cluster view). Fix when
+normalizing the query surface.
+
+## Adoption backlog (ordered)
+
+| # | Item | Projects | Priority | Effort | Risk | Notes |
+|---|---|---|---|---|---|---|
+| 1 | **OtOpcUa:** rename `AuditWriterActor` → implements `IAuditWriter`; replace `Commons/Messages/Audit/AuditEvent.cs` with canonical record; add `Outcome` derivation at every emit site (Gap O1) | OtOpcUa | Med | M | Med | Actor internals (batching / dedup / flush triggers) stay bespoke; only the seam type and record change. Commons value-types `NodeId`/`CorrelationId` bridged at construction. |
+| 2 | **MxGateway:** map `IApiKeyAuditStore` / `ApiKeyAuditEntry` / `ApiKeyAuditRecord` → `IAuditWriter` / `AuditEvent`; generate `EventId` per write; add `"system"`/`"cli"` Actor fallback; constant `Category = "ApiKey"`; `constraint-denied`→`Outcome.Denied` (Gaps O2, record gaps) | MxGateway | Low | S | Med | ⚠ **COORDINATE** - a parallel session is editing this repo for the MEL→Serilog migration (Health/Telemetry normalization). Do NOT start until the Serilog session has landed (or is explicitly fenced off); the two efforts share `Security/Authentication/` DI wiring. |
+| 3 | **ScadaBridge:** rename `IAuditPayloadFilter` → `IAuditRedactor` (or alias during transition); adopt canonical `AuditOutcome` enum (Gap O3); confirm writer contract matches (already byte-for-byte) | ScadaBridge | Low | S | High | **"Align, don't replace."** Blast radius is HIGH - `IAuditPayloadFilter` is used across the entire pipeline (site, central, wiring). Rename + alias only; no transport/storage/record change. `DefaultAuditPayloadFilter` / `SafeDefaultAuditPayloadFilter` implementations unchanged. |
+| 4 | **All:** populate `Actor` from `ZB.MOM.WW.Auth` principal at emit sites (Gap P1) | All 3 | Low | S | Low | **Prerequisite:** Auth adoption per `components/auth/GAPS.md` #8. Until Auth is adopted, leave the existing actor string as-is. |
+| 5 | **OtOpcUa:** reconcile two-producer problem - decide SP path routing + fix `ClusterId`-filter / actor mismatch in `ClusterAudit.razor` (Gap Q1) | OtOpcUa | Low | S | Low | Normalization does not unify the SP path; this is a reconcile item to decide and document. The mismatch means structured `AuditEvent` rows are currently invisible to the cluster-scoped view. |
+| 6 | **MxGateway:** add `CorrelationId` capture at constraint denial + dashboard paths; structured `Target` from `Details` text (currently embedded as a plain string in `ConstraintEnforcer`) | MxGateway | Low | S | Low | Nice-to-have parity; not required for adoption. `CorrelationId` and `Target` canonical fields left null until this is done. |
+
+**Sequencing:** #3 (ScadaBridge rename) is lowest-risk and self-contained - do it first (or
+last, depending on blast-radius appetite). #1 (OtOpcUa) is medium effort but independent; it
+can start once the shared library is built. #2 (MxGateway) is the smallest code change but
+has the highest **coordination dependency** - gate it on the Serilog migration landing first.
+#4 (Actor→Auth) is blocked on Auth adoption and is the last to close. #5 and #6 are cleanup
+items with no bearing on shared-library adoption.
+
+Each adoption lands as an opt-in version bump per project behind the seam; the shared library
+is consumed but the bespoke transport/storage/UI for each project is not touched.
+
+## Decisions still open
+
+- ScadaBridge `IAuditPayloadFilter` → `IAuditRedactor`: outright rename vs. transitional alias
+  (both are valid; alias reduces blast radius in the short term).
+- MxGateway `Details` plain string → `DetailsJson`: store as-is or wrap in a JSON object at
+  the mapping boundary.
+- `AuditOutcome` column in OtOpcUa storage: add a new `Outcome` column to `ConfigAuditLog`
+  or fold into `DetailsJson` / derive at read time (schema change vs. runtime cost).
+- OtOpcUa SP path: route through the actor path (unified producer) or leave as a bespoke
+  secondary writer with its own column conventions (separate reconcile effort).
diff --git a/components/audit/README.md b/components/audit/README.md
new file mode 100644
index 0000000..9c7d9b0
--- /dev/null
+++ b/components/audit/README.md
@@ -0,0 +1,72 @@
+# Audit (who-did-what)
+
+Status: **Draft**. Normalized component - path to shared code.
+Goal: converge the three sister projects onto a canonical `AuditEvent` record + `AuditOutcome` enum + two thin seams
+(`IAuditWriter`, `IAuditRedactor`), proposed as the `ZB.MOM.WW.Audit` library, while each
+project keeps its own transport, storage, domain vocabulary, and redaction policy.
+
+- The one target: [`spec/SPEC.md`](spec/SPEC.md)
+- Canonical event model + field reference: [`spec/EVENT-MODEL.md`](spec/EVENT-MODEL.md)
+- The proposed shared library: [`shared-contract/ZB.MOM.WW.Audit.md`](shared-contract/ZB.MOM.WW.Audit.md)
+- Divergences + backlog: [`GAPS.md`](GAPS.md)
+- Current state, per project: [`current-state/`](current-state/)
+
+## Why audit is a strong normalization candidate
+
+All three projects record a structured who-did-what trail with an actor identity, an action
+verb, and a timestamp. Two (OtOpcUa + ScadaBridge) already have a named `AuditEvent` record
+with an `EventId` idempotency key, `Actor`, and `CorrelationId`. ScadaBridge already ships
+**both** canonical seams under slightly different names (`IAuditWriter` is byte-for-byte the
+spec; `IAuditPayloadFilter` is the canonical `IAuditRedactor`). OtOpcUa's record is almost
+field-for-field aligned. MxGateway has a narrow API-key-lifecycle log that maps cleanly.
+
+The one new field across all three is `AuditOutcome` - no project stores it explicitly today;
+each encodes it implicitly and derives it at adoption. This is the bulk of the per-project
+work. Transport, storage, domain vocabulary, and redaction policy are **not** unified - each
+project keeps its own bespoke implementation behind the seam.
+
+**Audit closes the loop on Auth.** Every audit row's `Actor` is exactly the identity that the
+`ZB.MOM.WW.Auth` component normalizes (LDAP/GLAuth principal, API-key name). The library keeps
+`Actor` as a plain `string` (no Auth dependency), but at adoption each emit site supplies the
+Auth principal.
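Reviewer sketch (not part of the patch): the implicit-to-explicit `AuditOutcome` derivation described above can be written as one small mapping per project. This is a minimal illustration in Python rather than the projects' own C#. The function names (`derive_otopcua`, `derive_mxgateway`, `derive_scadabridge`) are hypothetical, the string values come from the per-project rules in `GAPS.md`, and three behaviors are assumptions of the sketch: stripping a `Category:` prefix from OtOpcUa's `EventType`, letting `Kind` take precedence over `Status` for ScadaBridge, and returning `None` for in-flight or skipped rows so the caller can resolve or exclude them.

```python
from enum import Enum
from typing import Optional


class AuditOutcome(Enum):
    SUCCESS = "Success"
    FAILURE = "Failure"
    DENIED = "Denied"


# Vocabulary taken from the per-project derivation rules in GAPS.md.
OTOPCUA_DENIED = {"OpcUaAccessDenied", "CrossClusterNamespaceAttempt"}
SCADABRIDGE_FAILED = {"Failed", "Parked", "Discarded"}
SCADABRIDGE_NON_TERMINAL = {"Submitted", "Forwarded", "Attempted", "Skipped"}


def derive_otopcua(event_type: str) -> AuditOutcome:
    """Denial event types map to Denied; everything else is treated as Success."""
    # Assumption: the persisted value may be "{Category}:{Action}" or a bare action.
    action = event_type.rsplit(":", 1)[-1]
    return AuditOutcome.DENIED if action in OTOPCUA_DENIED else AuditOutcome.SUCCESS


def derive_mxgateway(event_type: str) -> AuditOutcome:
    """Only `constraint-denied` is a denial; no Failure events exist today."""
    if event_type == "constraint-denied":
        return AuditOutcome.DENIED
    return AuditOutcome.SUCCESS


def derive_scadabridge(status: str, kind: str) -> Optional[AuditOutcome]:
    """Project Status/Kind onto an outcome; None means no terminal outcome here."""
    # Assumption: Kind takes precedence over Status.
    if kind == "InboundAuthFailure":
        return AuditOutcome.DENIED
    if status == "Delivered":
        return AuditOutcome.SUCCESS
    if status in SCADABRIDGE_FAILED:
        return AuditOutcome.FAILURE
    if status in SCADABRIDGE_NON_TERMINAL:
        return None  # Caller collapses in-flight rows or excludes Skipped ones.
    raise ValueError(f"unknown status: {status!r}")
```

Keeping each derivation a pure function of fields the project already stores fits the "derive at adoption" framing: nothing about transport or storage needs to change to emit the new field.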
+
+**`IAuditRedactor` naming is aligned with Telemetry's `ILogRedactor`** - same shape and naming
+discipline so a future `ZB.MOM.WW.Hosting` aggregator wires both redactors with one mental
+model - but there is no cross-package dependency between the two libraries.
+
+## Status by project
+
+| Project | Audit today | Seams present | `AuditOutcome` | Adoption status |
+|---|---|---|---|---|
+| **OtOpcUa** | Akka cluster-broadcast `AuditEvent` → cluster-singleton `AuditWriterActor` (batch 500/5 s, two-layer dedup) over EF `ConfigAuditLog` (SQL Server). Also a legacy SQL stored-procedure write path (bare `EventType`, NULL `EventId`). Admin UI page `ClusterAudit.razor`. | No named `IAuditWriter` seam; no redactor seam. | Not stored - encoded in `EventType` strings (`OpcUaAccessDenied`/`CrossClusterNamespaceAttempt` → `Denied`; config-write verbs → `Success`). | Not started |
+| **MxAccessGateway** | Single SQLite-backed `IApiKeyAuditStore` / `ApiKeyAuditEntry` - key lifecycle (CLI + dashboard) + constraint denials only. No authn events persisted; no production read consumer. | Narrow custom seam (`IApiKeyAuditStore`); no general `IAuditWriter`; redaction is by-construction (secret never enters the record type). | Not stored - derived: `constraint-denied` → `Denied`; all others → `Success`. | Not started |
+| **ScadaBridge** | Full pipeline: site SQLite hot-path (`SqliteAuditWriter` + ring-buffer fallback) → Akka `ClusterClient` forwarder → central MS SQL (ingest / reconcile / purge / partition maintenance). Rich ~25-field `AuditEvent` record. CLI `export`/`verify-chain`; Blazor audit UI. | ✅ `IAuditWriter` (matches canonical contract word-for-word); ✅ `IAuditPayloadFilter` (= canonical `IAuditRedactor`, identical signature, pure/never-throws/over-redacts). | Not stored explicitly - derived from `Status` (`Delivered`→`Success`; `Failed`/`Parked`/`Discarded`→`Failure`; `Kind = InboundAuthFailure`→`Denied`). | Not started (align, don't replace) |
+
+See each project's `current-state//CURRENT-STATE.md` for code-verified detail and
+adoption plan:
+
+- [`current-state/otopcua/CURRENT-STATE.md`](current-state/otopcua/CURRENT-STATE.md)
+- [`current-state/mxaccessgw/CURRENT-STATE.md`](current-state/mxaccessgw/CURRENT-STATE.md)
+- [`current-state/scadabridge/CURRENT-STATE.md`](current-state/scadabridge/CURRENT-STATE.md)
+
+## Normalized vs. left per-project
+
+**Normalized (the shared `ZB.MOM.WW.Audit` library):** the canonical `AuditEvent` record
+(5 required fields + 4 optional common + `DetailsJson` extension bag); the `AuditOutcome`
+enum (`Success | Failure | Denied`); the `IAuditWriter` seam (best-effort, never throws to
+caller); the `IAuditRedactor` seam (pure, never throws, over-redacts on failure); shipped
+helpers (`NoOpAuditWriter`, `CompositeAuditWriter`, `RedactingAuditWriter`,
+`NullAuditRedactor`, `TruncatingAuditRedactor`). Library is BCL-only - no Akka / EF / SQLite
+/ Serilog dependency.
+
+**Left per-project (each project keeps these behind the seam):** transport and storage (Akka
+singleton + EF/SQL Server; SQLite; site-SQLite + central MS SQL + forwarding/reconcile
+pipeline); domain vocabulary (`EventType` strings / API-key event-type literals / `Channel` +
+`Kind` + `Status` enums); query, CLI, and UI surfaces (`ClusterAudit.razor`; `ListRecentAsync`;
+`export` / `verify-chain`; Blazor audit pages); redaction *policy* (which fields/payloads are
+sensitive - only the `IAuditRedactor` *seam* is shared).
+
+> **Adoption is deferred this round.** The `ZB.MOM.WW.Audit` library is being designed and
+> the shared contract defined, but none of the three apps wire it in yet - exactly where
+> `ZB.MOM.WW.Auth` and `ZB.MOM.WW.Theme` sit today. The per-project adoption backlog is in
+> [`GAPS.md`](GAPS.md).
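Reviewer sketch (not part of the patch): the two seam contracts in the "Normalized" paragraph, a writer that never throws to the caller and a redactor that over-redacts on failure, can be illustrated as below. It is written in Python for brevity and is not the `ZB.MOM.WW.Audit` API. The names `safe_redact` and `redacting_writer` are hypothetical (the latter loosely modeled on the `RedactingAuditWriter` helper named above), the writer is synchronous where the documented `WriteAsync` is asynchronous, and treating "over-redact" as "drop `details_json` entirely" is an assumption.

```python
from dataclasses import dataclass, replace
from datetime import datetime, timezone
from typing import Callable, Optional
from uuid import UUID, uuid4


@dataclass(frozen=True)
class AuditEvent:
    # Five required fields.
    event_id: UUID
    occurred_at_utc: datetime
    actor: str
    action: str
    outcome: str  # "Success" | "Failure" | "Denied"
    # Four optional common fields plus the extension bag.
    category: Optional[str] = None
    target: Optional[str] = None
    source_node: Optional[str] = None
    correlation_id: Optional[UUID] = None
    details_json: Optional[str] = None


Redactor = Callable[[AuditEvent], AuditEvent]
Writer = Callable[[AuditEvent], None]


def safe_redact(redactor: Redactor, event: AuditEvent) -> AuditEvent:
    """Apply the redactor; if it raises, over-redact instead of leaking."""
    try:
        return redactor(event)
    except Exception:
        # Assumption: over-redaction means dropping the whole details payload.
        return replace(event, details_json=None)


def redacting_writer(inner: Writer, redactor: Redactor) -> Writer:
    """Wrap a sink so it redacts first and never raises to the caller."""

    def write(event: AuditEvent) -> None:
        try:
            inner(safe_redact(redactor, event))
        except Exception:
            # Best-effort: an audit failure must not abort the user-facing action.
            pass

    return write
```

A usage example under the same assumptions: `redacting_writer(rows.append, lambda e: e)` yields a writer that stores events in a list named `rows`; swapping in a sink that raises leaves the caller unaffected.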