Audit — gaps & adoption backlog
Divergence of each project from spec/SPEC.md, and the ordered backlog to
reach the shared ZB.MOM.WW.Audit library. Status legend: ⛔ gap · 🟡 partial · ✅ matches.
✅ ADOPTED 2026-06-02 (local-only), DEEP. The backlog (#1–#6) was implemented across all three apps on each repo's feat/adopt-zb-audit branch (stacked on feat/adopt-zb-auth): committed and spec/code-reviewed, then merged to each repo's local default (main/master) and pushed to origin (gitea) on 2026-06-03 (in sync). The user chose DEEP adopt: the canonical 9-field AuditEvent is the record everywhere (domain fields ride in DetailsJson), so the §1 "keep own record" framing below was superseded.

- OtOpcUa: canonical record + AuditWriterActor : IAuditWriter + Outcome col/migration + ClusterAudit fix.
- MxGateway: canonical SQLite audit_event store + IAuditWriter + IApiKeyAuditStore→canonical adapter.
- ScadaBridge: a full audit-subsystem re-architecture (codec + site audit_event/audit_forward_state sidecar + central partitioned-table collapse to 10 canonical + persisted computed cols, MSSQL-verified).

§5 (Actor→Auth principal) is wired via a per-app IAuditActorAccessor (Phase 3). The Task 2.0 gate found this doc's pre-adoption framing was partly stale (MxGateway's store had moved into the lib; OtOpcUa's structured path was dormant; ScadaBridge's filter was typed to its own record). Detail: docs/plans/2026-06-02-auth-audit-normalization-phase2-deep.md + …-scadabridge-audit-rearch.md. The ⛔/🟡 cells below describe the PRE-adoption divergence (kept for history).
Divergence vs spec
§1 Canonical record (AuditEvent)
| Canonical field | OtOpcUa | MxAccessGateway | ScadaBridge |
|---|---|---|---|
| EventId (Guid, required) | ✅ — idempotency key; buffer key + filtered-unique DB index | ⛔ — no event key; only an AUTOINCREMENT rowid (AuditId) | ✅ — direct |
| OccurredAtUtc (DateTimeOffset, required) | 🟡 — DateTime UTC; widen at mapping boundary | 🟡 — DateTimeOffset but store-assigned (not caller-supplied); direct after widening | 🟡 — DateTime UTC-forced; widen at mapping boundary |
| Actor (string, required) | ✅ — direct (AuditEvent.Actor → ConfigAuditLog.Principal) | 🟡 — KeyId nullable; keyless events (init-db/list-keys) need a "system"/"cli" fallback | 🟡 — nullable on system-originated rows; fallback needed |
| Action (string, required) | 🟡 — Action field exists, but persisted as "{Category}:{Action}" composite in EventType; canonical keeps them separate | ✅ — EventType literal direct | 🟡 — derived as {Channel}.{Kind} (e.g. ApiOutbound.ApiCall) |
| Outcome (AuditOutcome, required) | ⛔ NEW — derived from EventType vocabulary; not stored today | ⛔ NEW — derived: constraint-denied→Denied, else Success | ⛔ NEW — derived from Status (+InboundAuthFailure Kind→Denied) |
| Category (string?) | ✅ — AuditEvent.Category (e.g. "Config") | ⛔ — no field; constant "ApiKey" at mapping | ✅ — Channel |
| Target (string?) | ⛔ — no dedicated field; closest is DetailsJson | ⛔ — embedded in Details text (commandKind/target) | ✅ — direct |
| SourceNode (string?) | ✅ — SourceNode (logical cluster node / host name, NOT an OPC UA NodeId) | 🟡 — RemoteAddress; dashboard path only (null on CLI/constraint paths) | ✅ — direct |
| CorrelationId (Guid?) | ✅ — direct (CorrelationId.Value) | ⛔ — not captured today; left null | ✅ — direct |
| DetailsJson (string?) | ✅ — direct (JSON CHECK constraint enforced) | 🟡 — Details is a plain string, not JSON; wrap or store as-is | 🟡 — ~15 rich/plumbing fields serialize here at the cross-project reporting boundary |
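
To make the MxAccessGateway column concrete, here is a minimal sketch of the mapping the table implies, written in Python rather than the projects' C# only to keep it short and runnable. The `AuditEvent` shape follows the canonical fields listed above; the four fields of `ApiKeyAuditEntry`, the `"constraint-denied"` literal, and the function names are assumptions made for illustration, not the real types.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum
from typing import Optional
import uuid


class AuditOutcome(Enum):
    SUCCESS = "Success"
    DENIED = "Denied"
    FAILURE = "Failure"


@dataclass(frozen=True)
class AuditEvent:
    """Canonical record, one attribute per row of the table above."""
    event_id: uuid.UUID
    occurred_at_utc: datetime
    actor: str
    action: str
    outcome: AuditOutcome
    category: Optional[str] = None
    target: Optional[str] = None
    source_node: Optional[str] = None
    correlation_id: Optional[uuid.UUID] = None
    details_json: Optional[str] = None


@dataclass(frozen=True)
class ApiKeyAuditEntry:
    """Hypothetical stand-in for MxGateway's narrow entry type."""
    key_id: Optional[str]
    event_type: str
    remote_address: Optional[str]
    details: Optional[str]


def widen_utc(value: datetime) -> datetime:
    """Treat a naive timestamp as UTC; normalize an aware one to UTC."""
    if value.tzinfo is None:
        return value.replace(tzinfo=timezone.utc)
    return value.astimezone(timezone.utc)


def from_api_key_entry(entry: ApiKeyAuditEntry, occurred_at: datetime,
                       fallback_actor: str = "system") -> AuditEvent:
    # "constraint-denied" is an assumed literal for the denied event type.
    denied = entry.event_type == "constraint-denied"
    return AuditEvent(
        event_id=uuid.uuid4(),                    # generated per write
        occurred_at_utc=widen_utc(occurred_at),
        actor=entry.key_id or fallback_actor,     # keyless events fall back
        action=entry.event_type,                  # EventType literal, direct
        outcome=AuditOutcome.DENIED if denied else AuditOutcome.SUCCESS,
        category="ApiKey",                        # constant at mapping
        source_node=entry.remote_address,         # may be None on CLI paths
        details_json=entry.details,               # stored as-is in this sketch
    )
```

Target and CorrelationId stay unset here, matching the "left null" cells in the table.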
§2 IAuditWriter seam
| | OtOpcUa | MxAccessGateway | ScadaBridge |
|---|---|---|---|
| Named seam | ⛔ — no IAuditWriter; AuditWriterActor is the sink, consumed directly via Akka messaging | ⛔ — IApiKeyAuditStore (narrow, two-method) is the seam; no general IAuditWriter | ✅ — IAuditWriter with WriteAsync(AuditEvent, CancellationToken) signature; "failures must NEVER abort the user-facing action" contract; best-effort |
| Best-effort / never throws | 🟡 — the actor drops a failed flush (best-effort), but the seam is not a typed interface a caller can inject independently | ⛔ — no contract; AppendAsync may propagate | ✅ |
| Record type at the seam | 🟡 — OtOpcUa's own AuditEvent (8 fields, with Commons value-types NodeId/CorrelationId) | ⛔ — ApiKeyAuditEntry (4 fields) | 🟡 — ScadaBridge's ~25-field AuditEvent (rich record; adoption = keep own record, adopt canonical interface name + AuditOutcome) |
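
The "best-effort / never throws" row can be illustrated with a small decorator around any writer. This is a Python sketch under assumptions: `AuditWriter`, `BestEffortAuditWriter`, and the `dropped` counter are hypothetical names, and the real seam is the C# `IAuditWriter.WriteAsync(AuditEvent, CancellationToken)` described above.

```python
import asyncio
import logging
from typing import Protocol

log = logging.getLogger("audit")


class AuditWriter(Protocol):
    """Hypothetical Python analogue of the IAuditWriter seam."""
    async def write(self, event: object) -> None: ...


class BestEffortAuditWriter:
    """Wraps another writer so a failed write never aborts the caller."""

    def __init__(self, inner: AuditWriter) -> None:
        self._inner = inner
        self.dropped = 0  # count of events lost to writer failures

    async def write(self, event: object) -> None:
        try:
            await self._inner.write(event)
        except Exception:
            # Cancellation is a BaseException in asyncio, so it still
            # propagates; every ordinary failure is logged and swallowed.
            self.dropped += 1
            log.warning("audit write dropped", exc_info=True)
```

A wrapper like this would give a store whose append method may throw the same contract ScadaBridge already states, without changing the store itself.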
§3 IAuditRedactor seam
| | OtOpcUa | MxAccessGateway | ScadaBridge |
|---|---|---|---|
| Named seam | ⛔ — no redactor; no payload filtering today | ⛔ — no redactor; safety by construction (entry type cannot carry a secret) | ✅ — IAuditPayloadFilter (AuditEvent Apply(AuditEvent), pure/never-throws/over-redacts); only the name differs from canonical IAuditRedactor |
| Over-redacts on failure | ⛔ — n/a | ⛔ — n/a | ✅ — SafeDefaultAuditPayloadFilter is the reference |
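
As an illustration of "pure / never throws / over-redacts", the following Python sketch scrubs a details payload and falls back to dropping the whole payload if anything goes wrong. It is not the SafeDefaultAuditPayloadFilter implementation: the key list, the `[REDACTED]` marker, and the fallback document are assumptions, and the real filter operates on a whole AuditEvent rather than one string.

```python
import json
from typing import Any, Optional

REDACTED = "[REDACTED]"
# Hypothetical deny-list; a real redactor would take this from configuration.
SENSITIVE_KEYS = frozenset({"password", "secret", "token", "apikey"})


def _scrub(value: Any) -> Any:
    """Recursively replace values stored under sensitive keys."""
    if isinstance(value, dict):
        return {
            key: REDACTED if key.lower() in SENSITIVE_KEYS else _scrub(item)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [_scrub(item) for item in value]
    return value


def redact_details(details_json: Optional[str]) -> Optional[str]:
    """Never raises; on any failure the entire payload is withheld."""
    if details_json is None:
        return None
    try:
        return json.dumps(_scrub(json.loads(details_json)))
    except Exception:
        # Over-redact: unparseable input might still contain a secret.
        return json.dumps({"redacted": True})
```

The design choice is that a failure loses audit detail rather than risking a leaked secret.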
§4 AuditOutcome — the new normalized field
Outcome is a genuinely new field across all three projects. No app stores it today;
each encodes it implicitly. All three must derive and emit it at adoption:
→ Gap O1 (OtOpcUa): derive from EventType vocabulary — OpcUaAccessDenied /
CrossClusterNamespaceAttempt → Denied; config-write verbs → Success. No Failure
value exists in OtOpcUa's vocabulary today (failed flushes are dropped, not emitted), so
OtOpcUa will produce only Success / Denied until/unless failure events are added.
→ Gap O2 (MxGateway): derive — constraint-denied → Denied; all others → Success.
No Failure events are emitted today.
→ Gap O3 (ScadaBridge): derive from AuditStatus — Delivered → Success;
Failed / Parked / Discarded → Failure; Kind = InboundAuthFailure → Denied.
In-flight states (Submitted / Forwarded / Attempted) collapse to the last-known
terminal state when projecting; Skipped is excluded from the canonical projection.
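
The three derivations above can be sketched as plain lookup functions. This Python sketch makes several assumptions: any OtOpcUa event type outside the two denied names is treated as Success, `"constraint-denied"` is an assumed literal, the InboundAuthFailure kind is checked before the status, and "collapse to the last-known terminal state" is read as scanning a status history from newest to oldest.

```python
from enum import Enum
from typing import Optional, Sequence


class AuditOutcome(Enum):
    SUCCESS = "Success"
    DENIED = "Denied"
    FAILURE = "Failure"


OTOPCUA_DENIED = frozenset({"OpcUaAccessDenied", "CrossClusterNamespaceAttempt"})

SCADABRIDGE_TERMINAL = {
    "Delivered": AuditOutcome.SUCCESS,
    "Failed": AuditOutcome.FAILURE,
    "Parked": AuditOutcome.FAILURE,
    "Discarded": AuditOutcome.FAILURE,
}


def otopcua_outcome(event_type: str) -> AuditOutcome:
    """Gap O1: only Success / Denied are reachable today."""
    if event_type in OTOPCUA_DENIED:
        return AuditOutcome.DENIED
    return AuditOutcome.SUCCESS


def mxgateway_outcome(event_type: str) -> AuditOutcome:
    """Gap O2: constraint denials are Denied, everything else Success."""
    if event_type == "constraint-denied":
        return AuditOutcome.DENIED
    return AuditOutcome.SUCCESS


def scadabridge_outcome(kind: str,
                        status_history: Sequence[str]) -> Optional[AuditOutcome]:
    """Gap O3: returns None when the row has no canonical projection yet."""
    if kind == "InboundAuthFailure":
        return AuditOutcome.DENIED
    for status in reversed(status_history):
        if status == "Skipped":
            return None  # excluded from the canonical projection
        if status in SCADABRIDGE_TERMINAL:
            return SCADABRIDGE_TERMINAL[status]
        # Submitted / Forwarded / Attempted: keep looking further back.
    return None
```

For example, a history of Failed followed by Attempted projects as Failure, because the retry in flight has not yet reached a terminal state.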
§5 Actor → Auth principal
At adoption, every emit site should supply the ZB.MOM.WW.Auth principal as Actor
(string). The library carries no Auth dependency (Actor is a plain string), but the
semantic goal is that the audited actor is the identity Auth authenticated, which closes the loop between the two libraries.
→ Gap P1 (all 3): at adoption, update emit sites to populate Actor from the Auth
principal (LDAP user / API-key name). Auth adoption (#8 in components/auth/GAPS.md) is a
prerequisite for the full story; until then, use the existing actor string.
§6 OtOpcUa two-producer problem
OtOpcUa has two writers to ConfigAuditLog: the structured Akka AuditEvent path AND
older SQL stored procedures that INSERT directly (bare EventType, NULL EventId /
CorrelationId, populated ClusterId / GenerationId). Normalization targets the
structured path only; the SP path stays per-project.
→ Gap Q1 (OtOpcUa): decide at adoption whether to route SP events through the actor
or leave them non-idempotent. Also resolve the ClusterId mismatch: the Admin UI
ClusterAudit.razor filters by ClusterId, but the actor path sets NodeId, not
ClusterId, so structured rows are invisible to the cluster view. Fix this when
normalizing the query surface.
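
What "non-idempotent" means for the SP path can be shown with a filtered unique index: rows that carry an EventId are deduplicated, rows with a NULL EventId are not. The Python sketch below uses an in-memory SQLite database purely as a stand-in; the table layout and function names are assumptions, not the real ConfigAuditLog schema.

```python
import sqlite3
import uuid


def open_log() -> sqlite3.Connection:
    """Create a toy audit table with a filtered (partial) unique index."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE ConfigAuditLog ("
        " AuditId INTEGER PRIMARY KEY AUTOINCREMENT,"
        " EventId TEXT NULL,"
        " EventType TEXT NOT NULL)"
    )
    # Uniqueness applies only to rows that actually have an EventId.
    db.execute(
        "CREATE UNIQUE INDEX UX_ConfigAuditLog_EventId"
        " ON ConfigAuditLog (EventId) WHERE EventId IS NOT NULL"
    )
    return db


def write_structured(db: sqlite3.Connection, event_id: uuid.UUID,
                     event_type: str) -> bool:
    """Structured path: a replayed EventId is rejected. True if stored."""
    try:
        db.execute(
            "INSERT INTO ConfigAuditLog (EventId, EventType) VALUES (?, ?)",
            (str(event_id), event_type),
        )
        return True
    except sqlite3.IntegrityError:
        return False


def write_legacy(db: sqlite3.Connection, event_type: str) -> None:
    """SP-style path: NULL EventId, so every call adds a row."""
    db.execute(
        "INSERT INTO ConfigAuditLog (EventId, EventType) VALUES (NULL, ?)",
        (event_type,),
    )
```

Replaying a structured write with the same EventId stores one row, while two identical legacy writes store two, which is the trade-off Gap Q1 asks to decide on.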
Adoption backlog (ordered)
| # | Item | Projects | Priority | Effort | Risk | Notes |
|---|---|---|---|---|---|---|
| 1 | OtOpcUa: rename AuditWriterActor → implements IAuditWriter; replace Commons/Messages/Audit/AuditEvent.cs with canonical record; add Outcome derivation at every emit site (Gap O1) | OtOpcUa | Med | M | Med | Actor internals (batching / dedup / flush triggers) stay bespoke; only the seam type and record change. Commons value-types NodeId/CorrelationId bridged at construction. |
| 2 | MxGateway: map IApiKeyAuditStore / ApiKeyAuditEntry / ApiKeyAuditRecord → IAuditWriter / AuditEvent; generate EventId per write; add "system"/"cli" Actor fallback; constant Category = "ApiKey"; constraint-denied→Outcome.Denied (Gaps O2, record gaps) | MxGateway | Low | S | Med | ⚠ COORDINATE — a parallel session is editing this repo for the MEL→Serilog migration (Health/Telemetry normalization). Do NOT start until the Serilog session has landed (or is explicitly fenced off); the two efforts share Security/Authentication/ DI wiring. |
| 3 | ScadaBridge: rename IAuditPayloadFilter → IAuditRedactor (or alias during transition); adopt canonical AuditOutcome enum (Gap O3); confirm writer contract matches (already byte-for-byte) | ScadaBridge | Low | S | High | "Align, don't replace." Blast radius is HIGH — IAuditPayloadFilter is used across the entire pipeline (site, central, wiring). Rename + alias only; no transport/storage/record change. DefaultAuditPayloadFilter / SafeDefaultAuditPayloadFilter implementations unchanged. |
| 4 | All: populate Actor from ZB.MOM.WW.Auth principal at emit sites (Gap P1) | All 3 | Low | S | Low | Prerequisite: Auth adoption per components/auth/GAPS.md #8. Until Auth is adopted, leave the existing actor string as-is. |
| 5 | OtOpcUa: reconcile two-producer problem — decide SP path routing + fix ClusterId-filter / actor mismatch in ClusterAudit.razor (Gap Q1) | OtOpcUa | Low | S | Low | Normalization does not unify the SP path; this is a reconcile item to decide and document. The mismatch means structured AuditEvent rows are currently invisible to the cluster-scoped view. |
| 6 | MxGateway: add CorrelationId capture at constraint denial + dashboard paths; structured Target from Details text (currently embedded as a plain string in ConstraintEnforcer) | MxGateway | Low | S | Low | Nice-to-have parity; not required for adoption. CorrelationId and Target canonical fields left null until this is done. |
Sequencing: #3 (ScadaBridge rename) is the smallest conceptual change and is self-contained, but it touches the whole pipeline (the table rates its risk High), so do it first or last depending on blast-radius appetite. #1 (OtOpcUa) is medium effort but independent; it can start once the shared library is built. #2 (MxGateway) is the smallest code change but has the highest coordination dependency; gate it on the Serilog migration landing first. #4 (Actor→Auth) is blocked on Auth adoption and is the last to close. #5 and #6 are cleanup items with no bearing on shared-library adoption.
Each adoption lands as an opt-in version bump per project behind the seam; the shared library is consumed but the bespoke transport/storage/UI for each project is not touched.
Decisions still open
- ScadaBridge IAuditPayloadFilter → IAuditRedactor: outright rename vs. transitional alias (both are valid; alias reduces blast radius in the short term).
- MxGateway Details plain string → DetailsJson: store as-is or wrap in a JSON object at the mapping boundary.
- AuditOutcome column in OtOpcUa storage: add a new Outcome column to ConfigAuditLog or fold into DetailsJson / derive at read time (schema change vs. runtime cost).
- OtOpcUa SP path: route through the actor path (unified producer) or leave as a bespoke secondary writer with its own column conventions (separate reconcile effort).
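
For the MxGateway Details decision in the list above, the two options differ by only a few lines at the mapping boundary. The Python sketch below shows both behind one flag; the function name and the `"text"` wrapper key are assumptions chosen for illustration.

```python
import json
from typing import Optional


def to_details_json(details: Optional[str], wrap: bool = True) -> Optional[str]:
    """Map a free-text Details value onto the canonical DetailsJson field."""
    if details is None:
        return None
    try:
        parsed = json.loads(details)
        if isinstance(parsed, (dict, list)):
            return details  # already a JSON object or array; keep it
    except ValueError:
        pass  # plain text, handled below
    if wrap:
        # Option 1: always emit valid JSON by wrapping the text.
        return json.dumps({"text": details})
    # Option 2: store as-is and accept that the field may not be JSON.
    return details
```

Wrapping keeps the field parseable for any consumer that enforces a JSON check, at the cost of one extra level of nesting for readers of the original text.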