ScadaBridge audit re-architecture (Task 2.5, DEEP full 9-col) — decomposition
Companion to 2026-06-02-auth-audit-normalization-phase2-deep.md. User chose Full re-arch (pure 9-col storage)
for ScadaBridge audit. Architect design pass (read-only, verified on feat/adopt-zb-audit) produced this. The full
audit record becomes the library 9-field ZB.MOM.WW.Audit.AuditEvent; ~15 domain fields relocate into DetailsJson;
ScadaBridge consumes the library IAuditWriter/IAuditRedactor/AuditOutcome. This is the program's largest task.
Key resolutions (from the design)
- Forwarding state machine (the crux) → resolved cleanly. It lives only in site SQLite; the central MS SQL `AuditLog` table is append-only (DENY UPDATE/DELETE; central rows leave `ForwardState` null; reconciliation is pure idempotent-insert with in-memory cursors), and the gRPC `AuditEventDtoMapper` already drops `ForwardState`/`IngestedAtUtc` on the wire. So central needs NO forwarding columns (pure 9-col). On the site, add a sidecar `audit_forward_state` table keyed by `EventId` (`ForwardState`, `OccurredAtUtc`, precomputed `IsCachedKind`, optional `AttemptCount`/`LastAttemptUtc`): `MarkForwarded`/`MarkReconciled` UPDATE the sidecar; `ReadPending*` JOIN it; the canonical `audit_event` table is write-once. Precomputing `IsCachedKind` keeps the drain hot path off JSON parsing (strictly faster than today's `Kind NOT IN(...)`).
- Central storage migration → new table + copy (in-place collapse infeasible: partition-aligned indexes + `SwitchOutPartitionAsync` hard-codes a byte-identical staging column list). New 10-col table on the SAME `ps_AuditLog_Month(OccurredAtUtc)` scheme; per-partition data copy projecting old typed columns into `DetailsJson` (`FOR JSON PATH`); rename + role re-grant (append-only preserved). Partitioning preserved (`OccurredAtUtc` stays).
- Reporting queryability → persisted computed columns for hot filters. `Category` (=Channel) + canonical `Outcome`/`Target`/`Actor`/`SourceNode`/`CorrelationId` cover most filters directly. Add PERSISTED computed columns `Kind`/`Status`/`SourceSiteId`/`ExecutionId`/`ParentExecutionId` (`JSON_VALUE(DetailsJson,'$.x')`) + partition-aligned indexes so the existing index semantics + the `GetExecutionTreeAsync` recursive CTE survive without a JSON perf cliff.
- Redactor → `ScadaBridgeAuditRedactor : IAuditRedactor` on the canonical record: parse `DetailsJson` once, redact + byte-safe-truncate `requestSummary`/`responseSummary`/`errorDetail`/`extra` in the JSON tree, cap on canonical `Category`/`Outcome` (replacing the typed `Channel`/`Status` reads), set `payloadTruncated`, re-serialize. Add a fast-path that skips JSON parse when nothing to redact. `SafeDefault` → `SafeDefaultAuditRedactor`. Re-baseline the perf hot-path budgets (JSON parse/rewrite is ~2–4× the typed-field path).
- Canonical field mapping: `Action = "{Channel}.{Kind}"`; `Category = Channel`; `Target`/`SourceNode`/`CorrelationId`/`Actor`/`OccurredAtUtc` direct (DateTime→DateTimeOffset UTC). `Outcome`: `Kind==InboundAuthFailure` → `Denied` (checked first); `Status==Delivered` → `Success`; `Status∈{Failed,Parked,Discarded}` → `Failure`; in-flight/`Skipped` → `Success`.
- `DetailsJson` schema (camelCase, stable): channel, kind, status, executionId, parentExecutionId, sourceSiteId, sourceInstanceId, sourceScript, httpStatus, durationMs, errorMessage, errorDetail, requestSummary, responseSummary, payloadTruncated, extra, ingestedAtUtc. One shared `AuditDetailsCodec` (Commons) with deterministic options is MANDATORY: the canonical record uses value-equality + consumers dedup on it, so key-order/whitespace drift would break dedup. (`forwardState` is NOT in `DetailsJson`; it's site-sidecar only.)
- Commons takes the `ZB.MOM.WW.Audit` package ref (the record lives in Commons; the package is a leaf canonical-types pkg, only dep `Microsoft.Extensions.DependencyInjection.Abstractions`). Acceptable.
- gRPC proto kept UNCHANGED: the wire `AuditEventDto` stays 24-field internally; `AuditEventDtoMapper` projects to/from `DetailsJson`. Avoids a proto/codegen rev + a site/central version-skew handshake. (A proto collapse is a separate later task.)
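
The canonical field mapping above can be sketched as follows. This is a minimal illustration in Python rather than the project's .NET code; the function names and the `ApiOutbound` channel value are hypothetical, and only the rule order (auth failure first, then `Delivered`, then the failure statuses, everything else `Success`) comes from this note.

```python
# Illustrative sketch only: names are hypothetical, not the ScadaBridge API.
FAILURE_STATUSES = frozenset({"Failed", "Parked", "Discarded"})


def project_outcome(kind: str, status: str) -> str:
    """Map the domain (Kind, Status) pair to a canonical outcome string."""
    if kind == "InboundAuthFailure":
        return "Denied"   # checked first, regardless of status
    if status == "Delivered":
        return "Success"
    if status in FAILURE_STATUSES:
        return "Failure"
    return "Success"      # in-flight and Skipped statuses


def build_action(channel: str, kind: str) -> str:
    """Action = "{Channel}.{Kind}"."""
    return f"{channel}.{kind}"


def build_category(channel: str) -> str:
    """Category is the channel, unchanged."""
    return channel


if __name__ == "__main__":
    # "ApiOutbound" is an assumed channel name used only for the demo.
    print(build_action("ApiOutbound", "ApiCallCached"))
    print(project_outcome("ApiCallCached", "Parked"))
```

Because `Action` fuses two values into one string, filtering is better done on `Category` and the separate kind value than by splitting `Action`, which matches the lossiness note under the top risks.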
Staged decomposition (C1–C7)
| Stage | Scope | Green? | Class | Risk |
|---|---|---|---|---|
| C1 | Commons: add ZB.MOM.WW.Audit ref; new pure types AuditDetails record + AuditDetailsCodec (deterministic) + Status/Kind→AuditOutcome projection + Action/Category builders. No existing type changes. | yes | small | trivial |
| C2 | ScadaBridgeAuditRedactor/SafeDefaultAuditRedactor : IAuditRedactor (canonical record, parse/rewrite DetailsJson, fast-path); additive, old IAuditPayloadFilter still wired; unit-tested in isolation. | yes | standard | low |
| C3 | ATOMIC CUT: swap the record everywhere. Commons.Entities.Audit.AuditEvent → ZB.MOM.WW.Audit.AuditEvent across ~40 src files + tests: emitters build canonical (domain→DetailsJson via codec); seams (IAuditWriter/ICentralAuditWriter/ISiteAuditQueue/IAuditLogRepository/AuditLogQueryFilter) re-type; AuditEventDtoMapper DTO↔canonical (proto unchanged); switch redactor wiring IAuditPayloadFilter→IAuditRedactor. | boundaries only | high-risk | HIGHEST |
| C4 | Site SQLite two-table forwarding: SqliteAuditWriter → audit_event + audit_forward_state; retarget MarkForwarded/MarkReconciled/ReadPending*/GetBacklogStats/MapRow to JOIN+sidecar; precompute IsCachedKind. Telemetry/Reconciliation actors unchanged (seam stable). Site SQLite is ephemeral (7-day) → in-place schema reset, no data migration. | yes | high-risk | HIGH |
| C5 | ATOMIC CUT: central migration. EF CollapseAuditLogToCanonical: new 10-col table on the partition scheme + per-partition data copy (old cols→DetailsJson) + persisted computed cols/indexes + rename + role re-grant; update AuditLogRepository.InsertIfNotExistsAsync + SwitchOutPartitionAsync staging list; regen ModelSnapshot. Maintenance-window; verify row-count + JSON spot-check. | boundaries only | high-risk | HIGHEST |
| C6 | Reporting/UI/export retarget: QueryAsync/GetKpiSnapshotAsync/GetExecutionTreeAsync predicates→canonical/computed cols; AuditLogExportService+AuditEndpoints CSV + CentralUI Audit components + CLI parse DetailsJson for display. | yes | standard | med |
| C7 | Tests + perf re-baseline + cleanup: rewrite PayloadFilterContractTests/redaction/HotPathLatencyTests to canonical+JSON + new budget; delete dead Commons.Entities.Audit.AuditEvent, 4 audit enums (or relocate behind codec), IAuditPayloadFilter/Default/SafeDefault, obsolete AddColumnIfMissing. | yes | standard | low |
Atomic cuts: only C3 (shared record type changes for all callers at once) and C5's data-copy half cannot stay green continuously. All other stages are green at completion.
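
The C4 two-table site layout can be sketched with an in-memory SQLite database. This is a rough illustration under stated assumptions, not the `SqliteAuditWriter` implementation: the `audit_event` table is trimmed to a few columns, the `Pending`/`Forwarded`/`Reconciled` state names are assumed, and the Python function names are hypothetical. Only the table names, the `EventId` key, the write-once canonical table, the JOIN-based pending read, and the cached-kind set come from this note.

```python
import sqlite3

# Kinds listed in this note as the cached-kind drain split.
CACHED_KINDS = frozenset(
    {"CachedSubmit", "ApiCallCached", "DbWriteCached", "CachedResolve"}
)

# Trimmed schema: the real audit_event table carries the full canonical record.
SCHEMA = """
CREATE TABLE audit_event (
    EventId       TEXT PRIMARY KEY,
    OccurredAtUtc TEXT NOT NULL,
    Action        TEXT NOT NULL,
    DetailsJson   TEXT
);
CREATE TABLE audit_forward_state (
    EventId       TEXT PRIMARY KEY
                  REFERENCES audit_event(EventId) ON DELETE CASCADE,
    ForwardState  TEXT NOT NULL,      -- assumed values: Pending/Forwarded/Reconciled
    OccurredAtUtc TEXT NOT NULL,
    IsCachedKind  INTEGER NOT NULL    -- precomputed at write time
);
"""


def open_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
    conn.executescript(SCHEMA)
    return conn


def write_event(conn, event_id, occurred_at, action, kind, details_json):
    """Insert the write-once canonical row plus its mutable sidecar row."""
    with conn:
        conn.execute(
            "INSERT INTO audit_event VALUES (?, ?, ?, ?)",
            (event_id, occurred_at, action, details_json),
        )
        conn.execute(
            "INSERT INTO audit_forward_state VALUES (?, 'Pending', ?, ?)",
            (event_id, occurred_at, int(kind in CACHED_KINDS)),
        )


def mark_forwarded(conn, event_ids):
    """Promote Pending rows only, so a Reconciled row is never demoted."""
    with conn:
        conn.executemany(
            "UPDATE audit_forward_state SET ForwardState = 'Forwarded' "
            "WHERE EventId = ? AND ForwardState = 'Pending'",
            [(event_id,) for event_id in event_ids],
        )


def mark_reconciled(conn, event_id):
    with conn:
        conn.execute(
            "UPDATE audit_forward_state SET ForwardState = 'Reconciled' "
            "WHERE EventId = ?",
            (event_id,),
        )


def read_pending(conn, cached, limit=100):
    """Read pending events for one drain lane without parsing DetailsJson."""
    return conn.execute(
        "SELECT e.EventId, e.Action, e.DetailsJson "
        "FROM audit_forward_state s "
        "JOIN audit_event e ON e.EventId = s.EventId "
        "WHERE s.ForwardState = 'Pending' AND s.IsCachedKind = ? "
        "ORDER BY s.OccurredAtUtc LIMIT ?",
        (int(cached), limit),
    ).fetchall()
```

Keeping all mutable forwarding state in the sidecar means the canonical row is never updated after insert, and the drain query filters on an integer column instead of inspecting the JSON payload.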
Top risks (carry into execution)
- C5 partition + `SwitchOutPartitionAsync` + persisted computed columns: staging table must carry identical computed defs for SWITCH; add a SWITCH round-trip integration test before C5 ships. Documented fallback: if too brittle, keep `Kind`/`Status` as 2 real non-canonical columns on the central table (pragmatic, not pure-9-col); decide at C5 implementation if blocked.
- DetailsJson determinism: single `AuditDetailsCodec` (C1) is load-bearing for value-equality/dedup, not cosmetic.
- Redactor perf: budgets move; add the no-op fast-path + empirically re-baseline in C7.
- gRPC: keep the proto unchanged (mapper-internal projection); do NOT couple a wire change to this storage cut.
- `Action=Channel.Kind` lossiness: mitigated by `Category` (=channel) + persisted computed `Kind`; ScadaBridge-internal filtering uses those, not `Action` parsing.
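
To illustrate why the codec's determinism matters for dedup, here is a minimal Python sketch of a deterministic encoder. It is not the `AuditDetailsCodec` itself: sorting keys, dropping null fields, and using compact separators are assumptions chosen to make the output stable, and the function names are hypothetical. The allowed field names are the ones listed in the `DetailsJson` schema in this note.

```python
import json

# Field names from the DetailsJson schema in this note.
ALLOWED_FIELDS = frozenset({
    "channel", "kind", "status", "executionId", "parentExecutionId",
    "sourceSiteId", "sourceInstanceId", "sourceScript", "httpStatus",
    "durationMs", "errorMessage", "errorDetail", "requestSummary",
    "responseSummary", "payloadTruncated", "extra", "ingestedAtUtc",
})


def encode_details(details: dict) -> str:
    """Serialize details so equal content always yields identical text.

    Assumed policy (not confirmed by the source): reject unknown fields,
    omit None values, sort keys, and emit no insignificant whitespace.
    """
    unknown = set(details) - ALLOWED_FIELDS
    if unknown:
        raise ValueError(f"unknown detail fields: {sorted(unknown)}")
    compact = {key: value for key, value in details.items() if value is not None}
    return json.dumps(compact, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False)


def decode_details(text: str) -> dict:
    return json.loads(text)


if __name__ == "__main__":
    first = encode_details({"kind": "ApiCallCached", "status": "Delivered"})
    second = encode_details({"status": "Delivered", "kind": "ApiCallCached"})
    print(first == second)
```

If two emitters serialized the same details with different key order or whitespace, the resulting records would compare unequal as strings and a value-equality dedup would treat them as distinct events, which is the drift the single shared codec is meant to rule out.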
Delivery: feat/adopt-zb-audit (stacked on auth), local-only. Each stage = one implementer + classification review chain; full ScadaBridge suite at C3/C4/C5/C7.
Stage status (live)
- ✅ C1 DONE `3d77dc0` (code ✅): `AuditDetails` + deterministic `AuditDetailsCodec` (pinned byte-exact) + `AuditOutcomeProjector` + `AuditFieldBuilders` + Commons→`ZB.MOM.WW.Audit` ref; 56 tests.
- ✅ C2 DONE `adfb4d3` + fix `5aaf9e2` (spec ✅, code ✅ after fix): `ScadaBridgeAuditRedactor`/`SafeDefaultAuditRedactor : IAuditRedactor` on the canonical record; redaction primitives extracted into shared `AuditRedactionPrimitives`/`AuditRegexCache` (old filter delegates, behaviour-preserved); cap-selection reads `d.Status` (faithful to legacy `IsErrorStatus`); fast-path + never-throws; review-fix hardened `OverRedact` to scrub ALL free-text fields + marker alignment + outer-catch never-leak test. 61 redaction + 44 payload + 88 commons-audit green.
- ✅ C3 DONE `db707bb` + fix `c27b2c3` (spec ✅, code ✅; independently re-verified build 0/0 + AuditLog 241/Communication 201). Atomic record swap across all seams/emitters/gRPC DTO/redactor-wiring (127 files); `ScadaBridgeAuditEventFactory` single emit point; `AuditRowProjection` Decompose/Recompose transitional 24-col shim (lossless round-trip verified); proto unchanged; old `IAuditPayloadFilter` classes deleted (C7 pulled forward). Fix: safe enum-parse fallback in `MapRow` + `FromDto`.
- ✅ C4 DONE `946d3e2` + fix `1737d15` (spec ✅, code ✅; independently re-verified diff scope = writer+tests only, build 0/0, AuditLog 249/1-preexisting). Site SQLite → `audit_event` (canonical) + `audit_forward_state` sidecar; forwarding marks/reads on the sidecar via JOIN; `IsCachedKind` = {`CachedSubmit`, `ApiCallCached`, `DbWriteCached`, `CachedResolve`} precomputed drain split; old `AuditLog` table dropped (ephemeral reset). Fix: `PRAGMA foreign_keys=ON` + `MarkForwarded` no-demote guard.
- ✅ C5 DONE `68a6bd1` (spec ✅, code ✅; a LIVE SQL Server was available so the migration + SWITCH were fully exercised; independently re-verified build 0/0 + ConfigurationDatabase 248/248). Central `dbo.AuditLog` collapsed to 10 canonical cols + 6 computed cols (5 PERSISTED + `IngestedAtUtc` non-persisted) on the preserved `ps_AuditLog_Month` scheme; `CollapseAuditLogToCanonical` new-table-and-copy migration (`FOR JSON PATH` projection, byte-verified round-trip; Down = documented one-way); repo writes/reads canonical directly; `SwitchOutPartition` staging matches the computed-col defs; append-only roles re-granted. C3 central shim retired. Forced deviations (all sound): `IngestedAtUtc` non-persisted, execution-id indexes unfiltered, provider-aware `OnModelCreating` strips `JSON_VALUE` for SQLite. Deferred to C7: a dedicated migration-projection test + the stale `CreatesFiveNamedIndexes` test name.
- ✅ C6 SUBSUMED (no commit): reporting/UI/export/CLI retarget was already completed by the C3 record-swap (`AuditEventView`/`AuditExportRow` shims decode every domain field from `DetailsJson`) + the C5 repo-query retarget. Read-only explorer verdict: all consumer surfaces canonical-complete; the only flagged items (ExecutionId/ParentExecutionId not in CSV; SourceNodes not parsed in export `ParseFilter`) are PRE-rearch omissions, not regressions. CentralUI 595/595, ManagementService 125/125 confirm.
- ✅ C7 DONE `635461c` + doc-fix `bc0e5bf` (review ✅; independently re-verified build 0/0, PerformanceTests 10/10, ConfigurationDatabase 251/251 incl. the 3 new migration-projection tests PASSING on live MSSQL, zero dead crefs). Perf hot-path re-baselined (canonical JSON redactor measured ~14µs/2µs, faster than the old typed walk; budgets 200/30/5µs + fast-path `Assert.Same`); `CollapseAuditLogToCanonicalMigrationTests` (seed→migrate→assert Action/Category/Outcome/Actor-null/DetailsJson-round-trip + 5 persisted computed cols); index test → `CreatesNineNamedIndexes`; 26 dead `<see cref>` across 13 files cleaned; doc-fix corrected the "six persisted" wording (5 persisted + `IngestedAtUtc` non-persisted).
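
The redactor shape that C2 and C7 refer to (parse once, redact and byte-safe-truncate the free-text fields, set `payloadTruncated`, re-serialize, with a fast path that returns the input untouched) can be sketched as below. This is an assumption-heavy Python illustration, not `ScadaBridgeAuditRedactor`: the secret pattern, the single 64-byte cap, and the function names are all hypothetical, and the real redactor selects its cap per record rather than using one constant.

```python
import json
import re

# Free-text fields named in this note as redaction targets.
REDACTABLE = ("requestSummary", "responseSummary", "errorDetail", "extra")
# Hypothetical secret pattern and cap, chosen only for the demo.
SECRET = re.compile(r"(?i)(password|token)=\S+")
MAX_BYTES = 64


def truncate_utf8(text: str, max_bytes: int):
    """Cut text to at most max_bytes of UTF-8 without splitting a character."""
    raw = text.encode("utf-8")
    if len(raw) <= max_bytes:
        return text, False
    # errors="ignore" drops a trailing partial multi-byte sequence.
    return raw[:max_bytes].decode("utf-8", errors="ignore"), True


def redact_details_json(details_json: str) -> str:
    # Fast path: no redactable field present, so skip the JSON parse and
    # return the very same string object.
    if not details_json or not any(f'"{name}"' in details_json for name in REDACTABLE):
        return details_json

    details = json.loads(details_json)
    truncated = False
    for name in REDACTABLE:
        value = details.get(name)
        if not isinstance(value, str):
            continue  # this sketch only handles string-valued fields
        value = SECRET.sub(lambda m: m.group(1) + "=***", value)
        value, cut = truncate_utf8(value, MAX_BYTES)
        truncated = truncated or cut
        details[name] = value
    if truncated:
        details["payloadTruncated"] = True
    return json.dumps(details, separators=(",", ":"), ensure_ascii=False)
```

Returning the identical object on the fast path is what makes a reference-equality check (the `Assert.Same` mentioned for C7) a cheap way to prove that no parse or rewrite happened.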
✅ TASK 2.5 COMPLETE — ScadaBridge audit FULL re-architecture to pure 9-col canonical (2026-06-02)
All of C1–C7 done, each spec+code reviewed, on feat/adopt-zb-audit (local-only, never pushed). ScadaBridge's audit subsystem now: the canonical ZB.MOM.WW.Audit.AuditEvent record everywhere (domain fields in DetailsJson via the deterministic AuditDetailsCodec); the library IAuditRedactor/AuditOutcome consumed; site SQLite = audit_event (canonical) + audit_forward_state sidecar (forwarding decoupled, IsCachedKind drain split); central dbo.AuditLog collapsed to 10 canonical cols + persisted computed cols on the preserved partition scheme (CollapseAuditLogToCanonical migration, MSSQL-verified); UI/export/CLI canonical-complete via AuditEventView/AuditExportRow. The gRPC proto was intentionally left unchanged (mapper-internal projection). This was the program's single largest task.