scadaproj/components/audit/GAPS.md
2026-06-01 07:08:31 -04:00


# Audit — gaps & adoption backlog
Divergence of each project from [`spec/SPEC.md`](spec/SPEC.md), and the ordered backlog to
reach the shared `ZB.MOM.WW.Audit` library. Status legend: ⛔ gap · 🟡 partial · ✅ matches.
> **Adoption is deferred this round.** The library is being designed (shared contract in
> [`shared-contract/ZB.MOM.WW.Audit.md`](shared-contract/ZB.MOM.WW.Audit.md)) but is not yet
> wired into any app — exactly where `ZB.MOM.WW.Auth` and `ZB.MOM.WW.Theme` sit today.
> The items below are the follow-on work; each lands as a separate PR per project.
## Divergence vs spec
### §1 Canonical record (`AuditEvent`)
| Canonical field | OtOpcUa | MxAccessGateway | ScadaBridge |
|---|---|---|---|
| `EventId` (Guid, required) | ✅ — idempotency key; buffer key + filtered-unique DB index | ⛔ — no event key; only an `AUTOINCREMENT` rowid (`AuditId`) | ✅ — direct |
| `OccurredAtUtc` (DateTimeOffset, required) | 🟡 — `DateTime` UTC; widen at mapping boundary | 🟡 — `DateTimeOffset` but store-assigned (not caller-supplied); direct after widening | 🟡 — `DateTime` UTC-forced; widen at mapping boundary |
| `Actor` (string, required) | ✅ — direct (`AuditEvent.Actor` → `ConfigAuditLog.Principal`) | 🟡 — `KeyId` nullable; keyless events (`init-db`/`list-keys`) need a `"system"`/`"cli"` fallback | 🟡 — nullable on system-originated rows; fallback needed |
| `Action` (string, required) | 🟡 — `Action` field exists, but persisted as `"{Category}:{Action}"` composite in `EventType`; canonical keeps them separate | ✅ — `EventType` literal direct | 🟡 — derived as `{Channel}.{Kind}` (e.g. `ApiOutbound.ApiCall`) |
| `Outcome` (AuditOutcome, required) | ⛔ **NEW** — derived from `EventType` vocabulary; not stored today | ⛔ **NEW** — derived: `constraint-denied` → `Denied`, else `Success` | ⛔ **NEW** — derived from `Status` (+`InboundAuthFailure` Kind → `Denied`) |
| `Category` (string?) | ✅ — `AuditEvent.Category` (e.g. `"Config"`) | ⛔ — no field; constant `"ApiKey"` at mapping | ✅ — `Channel` |
| `Target` (string?) | ⛔ — no dedicated field; closest is `DetailsJson` | ⛔ — embedded in `Details` text (`commandKind`/`target`) | ✅ — direct |
| `SourceNode` (string?) | ✅ — `SourceNode` (logical cluster node / host name, NOT an OPC UA NodeId) | 🟡 — `RemoteAddress`; dashboard path only (null on CLI/constraint paths) | ✅ — direct |
| `CorrelationId` (Guid?) | ✅ — direct (`CorrelationId.Value`) | ⛔ — not captured today; left null | ✅ — direct |
| `DetailsJson` (string?) | ✅ — direct (JSON CHECK constraint enforced) | 🟡 — `Details` is a plain string, not JSON; wrap or store as-is | 🟡 — ~15 rich/plumbing fields serialize here at the cross-project reporting boundary |
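The field table above can be read as a record shape. The following is a minimal sketch, assuming the canonical `AuditEvent` is a positional record and that `AuditOutcome` has exactly the three values this document mentions; the authoritative definition lives in the shared contract, not here. `AuditMapping.WidenUtc` is a hypothetical helper illustrating the "widen at mapping boundary" note for `DateTime` UTC columns.

```csharp
#nullable enable
using System;

// Assumed enum members: only Success / Failure / Denied are named in this document.
public enum AuditOutcome { Success, Failure, Denied }

// Sketch of the canonical record inferred from the field table (not the published contract).
public sealed record AuditEvent(
    Guid EventId,
    DateTimeOffset OccurredAtUtc,
    string Actor,
    string Action,
    AuditOutcome Outcome,
    string? Category = null,
    string? Target = null,
    string? SourceNode = null,
    Guid? CorrelationId = null,
    string? DetailsJson = null);

public static class AuditMapping
{
    // Widens a DateTime that is already known to hold UTC into a DateTimeOffset.
    // SpecifyKind matters: the DateTimeOffset(DateTime) constructor treats
    // Kind=Unspecified as local time and would attach the machine's offset.
    public static DateTimeOffset WidenUtc(DateTime utc) =>
        new DateTimeOffset(DateTime.SpecifyKind(utc, DateTimeKind.Utc));
}
```

The optional fields default to `null`, so a project that cannot yet supply `Target` or `CorrelationId` can still construct a valid record.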
### §2 `IAuditWriter` seam
| | OtOpcUa | MxAccessGateway | ScadaBridge |
|---|---|---|---|
| Named seam | ⛔ — no `IAuditWriter`; `AuditWriterActor` is the sink, consumed directly via Akka messaging | ⛔ — `IApiKeyAuditStore` (narrow, two-method) is the seam; no general `IAuditWriter` | ✅ — `IAuditWriter` with `WriteAsync(AuditEvent, CancellationToken)` signature; "failures must NEVER abort the user-facing action" contract; best-effort |
| Best-effort / never throws | 🟡 — the actor drops a failed flush (best-effort), but the seam is not a typed interface a caller can inject independently | ⛔ — no contract; `AppendAsync` may propagate | ✅ |
| Record type at the seam | 🟡 — OtOpcUa's own `AuditEvent` (8 fields, with Commons value-types `NodeId`/`CorrelationId`) | ⛔ — `ApiKeyAuditEntry` (4 fields) | 🟡 — ScadaBridge's ~25-field `AuditEvent` (rich record; adoption = keep own record, adopt canonical interface name + `AuditOutcome`) |
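One way to give OtOpcUa and MxAccessGateway the "never throws" property without touching their sinks is a decorator around the seam. This is a sketch under stated assumptions: the `Task` return type is assumed (the source only gives the parameter list `WriteAsync(AuditEvent, CancellationToken)`), `BestEffortAuditWriter` is a hypothetical name, and the `AuditEvent` here is a trimmed placeholder.

```csharp
#nullable enable
using System;
using System.Threading;
using System.Threading.Tasks;

// Placeholder record; the canonical AuditEvent carries more fields.
public sealed record AuditEvent(Guid EventId, string Actor, string Action);

public interface IAuditWriter
{
    // Parameter list as described for ScadaBridge; the return type is an assumption.
    Task WriteAsync(AuditEvent evt, CancellationToken ct);
}

// Hypothetical decorator: wraps any writer so a failure never reaches the caller.
public sealed class BestEffortAuditWriter : IAuditWriter
{
    private readonly IAuditWriter _inner;
    private readonly Action<Exception> _onFailure;

    public BestEffortAuditWriter(IAuditWriter inner, Action<Exception>? onFailure = null)
    {
        _inner = inner;
        _onFailure = onFailure ?? (_ => { });
    }

    public async Task WriteAsync(AuditEvent evt, CancellationToken ct)
    {
        try
        {
            await _inner.WriteAsync(evt, ct).ConfigureAwait(false);
        }
        catch (Exception ex)
        {
            // Swallow everything, including cancellation: an audit failure
            // must not abort the user-facing action.
            try { _onFailure(ex); } catch { /* failure reporting must not throw either */ }
        }
    }
}
```

Whether cancellation should also be swallowed is a design choice the spec would need to settle; the sketch takes the strictest reading of "never throws".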
### §3 `IAuditRedactor` seam
| | OtOpcUa | MxAccessGateway | ScadaBridge |
|---|---|---|---|
| Named seam | ⛔ — no redactor; no payload filtering today | ⛔ — no redactor; safety by construction (entry type cannot carry a secret) | ✅ — `IAuditPayloadFilter` (`AuditEvent Apply(AuditEvent)`, pure/never-throws/over-redacts); **only the name differs** from canonical `IAuditRedactor` |
| Over-redacts on failure | ⛔ — n/a | ⛔ — n/a | ✅ — `SafeDefaultAuditPayloadFilter` is the reference |
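The "over-redacts on failure" property can be illustrated with a small wrapper. This is not how `SafeDefaultAuditPayloadFilter` is implemented (the source does not show it); it is a sketch of the behaviour, assuming the canonical interface keeps the `AuditEvent Apply(AuditEvent)` shape. `OverRedactingRedactor` and the trimmed `AuditEvent` are hypothetical.

```csharp
#nullable enable
using System;

// Placeholder record with only the fields this sketch needs.
public sealed record AuditEvent(Guid EventId, string Actor, string Action, string? DetailsJson);

public interface IAuditRedactor
{
    // Same shape as ScadaBridge's IAuditPayloadFilter.Apply.
    AuditEvent Apply(AuditEvent evt);
}

// Hypothetical wrapper: if the inner redactor throws, drop the whole payload
// rather than let unredacted details through.
public sealed class OverRedactingRedactor : IAuditRedactor
{
    private readonly IAuditRedactor _inner;

    public OverRedactingRedactor(IAuditRedactor inner) => _inner = inner;

    public AuditEvent Apply(AuditEvent evt)
    {
        try
        {
            return _inner.Apply(evt);
        }
        catch (Exception)
        {
            // Fail closed: keep the envelope, discard the details.
            return evt with { DetailsJson = null };
        }
    }
}
```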
### §4 `AuditOutcome` — the new normalized field
`Outcome` is a **genuinely new field** across all three projects. No app stores it today;
each encodes it implicitly. All three must derive and emit it at adoption:

**Gap O1 (OtOpcUa):** derive from the `EventType` vocabulary: `OpcUaAccessDenied` /
`CrossClusterNamespaceAttempt` → `Denied`; config-write verbs → `Success`. No `Failure`
value exists in OtOpcUa's vocabulary today (failed flushes are dropped, not emitted), so
OtOpcUa will produce only `Success` / `Denied` until/unless failure events are added.
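Gap O1 amounts to a lookup over the event vocabulary. A minimal sketch, assuming the persisted `EventType` may be either a bare action or the `"{Category}:{Action}"` composite described in §1; `OtOpcUaOutcome` is a hypothetical helper and the enum members are assumed from the values named in this document.

```csharp
using System;
using System.Collections.Generic;

public enum AuditOutcome { Success, Failure, Denied }

public static class OtOpcUaOutcome
{
    // The two denial events named in Gap O1.
    private static readonly HashSet<string> DeniedActions = new(StringComparer.Ordinal)
    {
        "OpcUaAccessDenied",
        "CrossClusterNamespaceAttempt",
    };

    public static AuditOutcome FromEventType(string eventType)
    {
        // Accept both "Action" and "Category:Action" by taking the part after the last colon.
        int sep = eventType.LastIndexOf(':');
        string action = sep >= 0 ? eventType[(sep + 1)..] : eventType;
        return DeniedActions.Contains(action) ? AuditOutcome.Denied : AuditOutcome.Success;
    }
}
```

Defaulting unknown event types to `Success` mirrors the gap description, but it also means a future failure event would be misreported until the lookup is extended.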
**Gap O2 (MxGateway):** derive: `constraint-denied` → `Denied`; all others → `Success`.
No `Failure` events are emitted today.
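Combined with the MxAccessGateway column in §1, Gap O2 suggests a mapping along these lines. This is a sketch only: the field names on `ApiKeyAuditEntry` are assumed (the source says the entry has four fields but does not list them), `"system"` is one of the two fallback literals mentioned, and `Details` is passed through as-is because wrapping it is still an open decision.

```csharp
#nullable enable
using System;

public enum AuditOutcome { Success, Failure, Denied }

// Trimmed canonical record with only the fields this mapping fills.
public sealed record AuditEvent(
    Guid EventId, DateTimeOffset OccurredAtUtc, string Actor, string Action,
    AuditOutcome Outcome, string? Category, string? SourceNode, string? DetailsJson);

// Assumed field names; not taken from the MxGateway source.
public sealed record ApiKeyAuditEntry(string? KeyId, string EventType, string? RemoteAddress, string? Details);

public static class MxGatewayMapping
{
    public static AuditEvent ToCanonical(ApiKeyAuditEntry e, DateTimeOffset occurredAtUtc) =>
        new AuditEvent(
            EventId: Guid.NewGuid(),                                   // generated per write
            OccurredAtUtc: occurredAtUtc,                              // store-assigned today
            Actor: string.IsNullOrEmpty(e.KeyId) ? "system" : e.KeyId, // keyless fallback
            Action: e.EventType,
            Outcome: e.EventType == "constraint-denied" ? AuditOutcome.Denied : AuditOutcome.Success,
            Category: "ApiKey",                                        // constant at mapping
            SourceNode: e.RemoteAddress,
            DetailsJson: e.Details);                                   // plain string, unwrapped
}
```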
**Gap O3 (ScadaBridge):** derive from `AuditStatus`: `Delivered` → `Success`;
`Failed` / `Parked` / `Discarded` → `Failure`; `Kind = InboundAuthFailure` → `Denied`.
In-flight states (`Submitted` / `Forwarded` / `Attempted`) collapse to the last-known
terminal state when projecting; `Skipped` is excluded from the canonical projection.
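The ScadaBridge rules can be sketched as a single projection function. Several details are assumptions: `Kind` is modelled as a string, the `InboundAuthFailure` check is given precedence over `Status`, and an in-flight row with no known terminal state returns `null` (the source does not say what happens in that case). `ScadaBridgeOutcome` is a hypothetical helper.

```csharp
#nullable enable
public enum AuditOutcome { Success, Failure, Denied }

// Status names as listed in Gap O3; the declaration order is arbitrary.
public enum AuditStatus { Submitted, Forwarded, Attempted, Delivered, Failed, Parked, Discarded, Skipped }

public static class ScadaBridgeOutcome
{
    // Returns null when the row should not appear in the canonical projection.
    public static AuditOutcome? Derive(AuditStatus status, string kind, AuditOutcome? lastKnownTerminal = null)
    {
        if (kind == "InboundAuthFailure")
        {
            return AuditOutcome.Denied;
        }

        return status switch
        {
            AuditStatus.Delivered => AuditOutcome.Success,
            AuditStatus.Failed or AuditStatus.Parked or AuditStatus.Discarded => AuditOutcome.Failure,
            AuditStatus.Skipped => null,   // excluded from the projection
            _ => lastKnownTerminal,        // Submitted / Forwarded / Attempted
        };
    }
}
```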
### §5 `Actor` → Auth principal
At adoption, every emit site should supply the `ZB.MOM.WW.Auth` principal as `Actor`
(string). The library carries no Auth dependency (`Actor` is a plain `string`), but the
semantic goal is that `Actor` names the same identity that Auth authenticated.

**Gap P1 (all 3):** at adoption, update emit sites to populate `Actor` from the Auth
principal (LDAP user / API-key name). Auth adoption (#8 in `components/auth/GAPS.md`) is a
prerequisite for the full story; until then, use the existing actor string.
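Because `Actor` stays a plain string, the transition can be expressed as a simple precedence rule. A hypothetical helper, assuming the Auth principal is surfaced to the emit site as a name string (how `ZB.MOM.WW.Auth` actually exposes it is not specified here) and that `"system"` is the agreed fallback:

```csharp
#nullable enable
public static class ActorResolver
{
    // Precedence: Auth principal name, then the project's existing actor string,
    // then a fixed fallback for system-originated events.
    public static string Resolve(string? authPrincipalName, string? legacyActor, string fallback = "system")
    {
        if (!string.IsNullOrWhiteSpace(authPrincipalName)) return authPrincipalName;
        if (!string.IsNullOrWhiteSpace(legacyActor)) return legacyActor;
        return fallback;
    }
}
```

Until Auth is adopted, callers would pass `null` for the first argument and the behaviour matches today's actor string.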
### §6 OtOpcUa two-producer problem
OtOpcUa has **two writers to `ConfigAuditLog`**: the structured Akka `AuditEvent` path AND
older SQL stored procedures that `INSERT` directly (bare `EventType`, NULL `EventId` /
`CorrelationId`, populated `ClusterId` / `GenerationId`). Normalization targets the
structured path only; the SP path stays per-project.

**Gap Q1 (OtOpcUa):** decide at adoption whether to route SP events through the actor
or leave them non-idempotent. Also resolve the `ClusterId` filter mismatch: the Admin UI
`ClusterAudit.razor` filters by `ClusterId`, but the actor path sets `NodeId`, not
`ClusterId`, so structured rows are invisible to the cluster view. Fix this when
normalizing the query surface.
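The visibility problem in Gap Q1 can be reproduced in a few lines. The row shape, event names, and IDs below are made up for illustration; only the pattern (SP rows carry `ClusterId`, actor rows carry `NodeId`) comes from the description above.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row shape with only the columns the paragraph mentions.
public sealed record ConfigAuditLogRow(Guid? EventId, string EventType, string? ClusterId, string? NodeId);

public static class ClusterViewDemo
{
    // Stand-in for the cluster-scoped filter used by the Admin UI.
    public static List<ConfigAuditLogRow> ForCluster(IEnumerable<ConfigAuditLogRow> rows, string clusterId) =>
        rows.Where(r => r.ClusterId == clusterId).ToList();

    public static List<ConfigAuditLogRow> Sample() => new List<ConfigAuditLogRow>
    {
        // Stored-procedure path: bare EventType, NULL EventId, ClusterId populated.
        new ConfigAuditLogRow(null, "NodeAdded", "cluster-a", null),
        // Structured actor path: EventId and NodeId set, ClusterId never set.
        new ConfigAuditLogRow(Guid.NewGuid(), "Config:NodeAdded", null, "node-1"),
    };
}
```

Calling `ClusterViewDemo.ForCluster(ClusterViewDemo.Sample(), "cluster-a")` returns only the stored-procedure row; the structured row is filtered out because its `ClusterId` is null.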
## Adoption backlog (ordered)
| # | Item | Projects | Priority | Effort | Risk | Notes |
|---|---|---|---|---|---|---|
| 1 | **OtOpcUa:** make `AuditWriterActor` implement `IAuditWriter`; replace `Commons/Messages/Audit/AuditEvent.cs` with the canonical record; add `Outcome` derivation at every emit site (Gap O1) | OtOpcUa | Med | M | Med | Actor internals (batching / dedup / flush triggers) stay bespoke; only the seam type and record change. Commons value-types `NodeId`/`CorrelationId` bridged at construction. |
| 2 | **MxGateway:** map `IApiKeyAuditStore` / `ApiKeyAuditEntry` / `ApiKeyAuditRecord` → `IAuditWriter` / `AuditEvent`; generate `EventId` per write; add `"system"`/`"cli"` Actor fallback; constant `Category = "ApiKey"`; `constraint-denied` → `Outcome.Denied` (Gaps O2, record gaps) | MxGateway | Low | S | Med | ⚠ **COORDINATE** — a parallel session is editing this repo for the MEL→Serilog migration (Health/Telemetry normalization). Do NOT start until the Serilog session has landed (or is explicitly fenced off); the two efforts share `Security/Authentication/` DI wiring. |
| 3 | **ScadaBridge:** rename `IAuditPayloadFilter` → `IAuditRedactor` (or alias during transition); adopt canonical `AuditOutcome` enum (Gap O3); confirm writer contract matches (already byte-for-byte) | ScadaBridge | Low | S | High | **"Align, don't replace."** Blast radius is HIGH — `IAuditPayloadFilter` is used across the entire pipeline (site, central, wiring). Rename + alias only; no transport/storage/record change. `DefaultAuditPayloadFilter` / `SafeDefaultAuditPayloadFilter` implementations unchanged. |
| 4 | **All:** populate `Actor` from `ZB.MOM.WW.Auth` principal at emit sites (Gap P1) | All 3 | Low | S | Low | **Prerequisite:** Auth adoption per `components/auth/GAPS.md` #8. Until Auth is adopted, leave the existing actor string as-is. |
| 5 | **OtOpcUa:** reconcile two-producer problem — decide SP path routing + fix `ClusterId`-filter / actor mismatch in `ClusterAudit.razor` (Gap Q1) | OtOpcUa | Low | S | Low | Normalization does not unify the SP path; this is a reconcile item to decide and document. The mismatch means structured `AuditEvent` rows are currently invisible to the cluster-scoped view. |
| 6 | **MxGateway:** add `CorrelationId` capture at constraint denial + dashboard paths; structured `Target` from `Details` text (currently embedded as a plain string in `ConstraintEnforcer`) | MxGateway | Low | S | Low | Nice-to-have parity; not required for adoption. `CorrelationId` and `Target` canonical fields left null until this is done. |

**Sequencing:** item #3 (ScadaBridge rename) changes no behavior and is self-contained, but
it has the widest blast radius: do it first or last, depending on blast-radius appetite.
Item #1 (OtOpcUa) is medium effort but independent; it can start once the shared library is
built. Item #2 (MxGateway) is the smallest code change but has the highest **coordination
dependency**: gate it on the Serilog migration landing first. Item #4 (Actor→Auth) is blocked
on Auth adoption and is the last to close. Items #5 and #6 are cleanup with no bearing on
shared-library adoption.

Each adoption lands as an opt-in version bump per project behind the seam; the shared library
is consumed but the bespoke transport/storage/UI for each project is not touched.
## Decisions still open
- ScadaBridge `IAuditPayloadFilter` → `IAuditRedactor`: outright rename vs. transitional alias
(both are valid; alias reduces blast radius in the short term).
- MxGateway `Details` plain string → `DetailsJson`: store as-is or wrap in a JSON object at
the mapping boundary.
- `AuditOutcome` column in OtOpcUa storage: add a new `Outcome` column to `ConfigAuditLog`
or fold into `DetailsJson` / derive at read time (schema change vs. runtime cost).
- OtOpcUa SP path: route through the actor path (unified producer) or leave as a bespoke
secondary writer with its own column conventions (separate reconcile effort).
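For the MxGateway `Details` decision, the "wrap" option could look like the sketch below. It uses `System.Text.Json`; the `text` property name is an assumption for illustration and is not part of the spec.

```csharp
#nullable enable
using System.Text.Json;

public static class DetailsWrapping
{
    // Wraps a plain-text Details value in a JSON object so that consumers
    // expecting JSON in DetailsJson always receive a parseable document.
    public static string? ToDetailsJson(string? details) =>
        details is null ? null : JsonSerializer.Serialize(new { text = details });
}
```

For example, `DetailsWrapping.ToDetailsJson("hello")` yields `{"text":"hello"}`. Storing the string as-is avoids the extra envelope but leaves `DetailsJson` holding non-JSON content for this project.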