# M5 — Audit Hardening (T3–T8) Implementation Plan

> **For Claude:** executed via superpowers-extended-cc:subagent-driven-development in this session.

**Goal:** Ship six independent audit-log hardening items (per-channel retention, ParentExecutionId tag-cascade, SourceNode backfill, per-node stuck KPIs, structured response-capture increments, CLI `audit tree`) without an AuditLog schema change.

**Architecture:** Each item extends an existing seam identified in the survey. No new infra dependency (T1 hash-chain + T2 Parquet stay deferred to v1.x). Design: `docs/plans/2026-06-16-m5-audit-hardening-design.md`.

**Tech Stack:** C#/.NET 10, EF Core (MS SQL), Akka.NET, Blazor Server, System.CommandLine, xUnit.

**Conventions:** targeted builds/tests per task (`dotnet build `, `dotnet test --filter`); full-solution build only at integration (M5.7). Implementers do NOT create worktrees (already in `worktree-m5-audit-hardening`) and commit with pathspec form `git commit -m "..." -- ` (retry on index.lock). Append-only invariant holds for writer/ingest paths; the only sanctioned mutations are T3's purge-role channel delete and T5's purge-role sentinel UPDATE, both reflected in the M2.10 CI-guard allow-list.

---

# Wave A — leverage-existing-infra (parallel; disjoint projects)

### Task M5.1 (T8): CLI `audit tree` + tree endpoint

**Classification:** standard · **~5 min** · **Parallelizable with:** M5.2, M5.3

**Files:**
- Modify: `src/ZB.MOM.WW.ScadaBridge.ManagementService/AuditEndpoints.cs` (`MapAuditAPI`, ~line 97) — add `GET /api/audit/tree?executionId=` → `IAuditLogRepository.GetExecutionTreeAsync(executionId)` → JSON `ExecutionTreeNode[]`; 400 on missing/invalid guid, empty array when no rows.
- Create: `src/ZB.MOM.WW.ScadaBridge.CLI/Commands/AuditTreeHelpers.cs` — render `ExecutionTreeNode[]` as an indented ASCII tree (table) and as raw JSON (`--format json`), mirroring `AuditQueryHelpers`/`AuditExportHelpers`.
- Modify: `src/ZB.MOM.WW.ScadaBridge.CLI/Commands/AuditCommands.cs` (`Build`, ~line 28) — add `BuildTree()`: `audit tree --execution-id [--format table|json]`, calls the new endpoint via the existing `ManagementHttpClient` pattern.
- Test: ManagementService tests for the endpoint (multi-level tree + not-found); CLI tests for `AuditTreeHelpers` rendering.

**AC:** `audit tree --execution-id ` prints the execution tree (root→children, indented); `--format json` emits the node array; the server walk reuses the existing `GetExecutionTreeAsync` (no new SQL). No schema change.

### Task M5.2 (T6): Per-node stuck-count KPIs

**Classification:** standard · **~5 min** · **Parallelizable with:** M5.1, M5.3

**Files:**
- Modify: `NotificationOutboxRepository` — add `ComputePerNodeKpisAsync` (group by `SourceNode`) parallel to `ComputePerSiteKpisAsync`.
- Modify: `src/ZB.MOM.WW.ScadaBridge.SiteCallAudit/...Repository` — same `ComputePerNodeKpisAsync`.
- Modify: `NotificationOutboxActor.cs` (~line 1054) + `SiteCallAuditActor.cs` (~line 781) — add a `PerNode…KpiRequest`/`Response` message pair (in Commons messages) and a `Receive<>`/handler each.
- Modify: CentralUI `AuditKpiTiles.razor` / `SiteCallKpiTiles.razor` (or the per-site KPI panel) — add an additive per-node breakdown.
- Test: repository per-node grouping returns correct stuck/parked/queue-depth counts; actor message round-trip.

**AC:** per-node stuck/parked counts available + surfaced; `SourceNode` already on both tables (no migration). Per-site KPIs unchanged.

### Task M5.3 (T7): Structured response-capture increments

**Classification:** standard · **~5 min** · **Parallelizable with:** M5.1, M5.2

**Files:**
- Modify: `src/ZB.MOM.WW.ScadaBridge.AuditLog/...AuditWriteMiddleware.cs` (`EmitInboundAudit`, ~line 246) — capture inbound **request headers** into the existing `Extra` JSON (through the existing header redactor; auth headers redacted by default).
- Modify: `src/ZB.MOM.WW.ScadaBridge.AuditLog/Central/AuditCentralHealthSnapshot.cs` — add an `AuditInboundCeilingHits` counter (+ its interface), incremented from the middleware when an inbound row truncates (`requestTruncated || responseTruncated`).
- Modify: `src/ZB.MOM.WW.ScadaBridge.AuditLog/Configuration/PerTargetRedactionOverride.cs` — add a `SkipBodyCapture` flag; honor it in the capture pipeline (suppress body, keep headers + metadata + the row).
- Test: request headers land in `Extra` and are redacted; ceiling-hit increments the counter; `SkipBodyCapture` suppresses body but still writes the row.

**AC:** no schema change (uses `Extra` JSON + health snapshot); existing redaction behavior preserved.

---

# Wave B — actor model + maintenance (parallel; T5 after M5.1's CLI edits)

### Task M5.4 (T4): ParentExecutionId tag-cascade

**Classification:** high-risk (actor model + correlation) · **~5 min** · **Parallelizable with:** M5.5 (and M5.6)

**Files:**
- Modify: `src/ZB.MOM.WW.ScadaBridge.SiteRuntime/Actors/AlarmActor.cs` (`SpawnAlarmExecutionActor`, ~line 578) + `AlarmExecutionActor.cs` (ctor, ~line 90) — thread a `Guid? parentExecutionId` so alarm-triggered scripts chain to the firing context; pass it into the `ScriptRuntimeContext` (currently `null`).
- Modify: `src/ZB.MOM.WW.ScadaBridge.SiteRuntime/Scripts/ScriptRuntimeContext.cs` (`CallScript` ~line 394, `CallShared`) — pass **the current run's `_executionId`** (not the inherited `_parentExecutionId`) as the child invocation's `ParentExecutionId`, forming a true multi-level tree.
- Test (`tests/.../SiteRuntime.Tests/`): an alarm-triggered script row carries the expected parent; a 2-level nested `CallScript` (A→B→C) is walkable via `GetExecutionTreeAsync` (or assert the emitted `ParentExecutionId` chain).

**AC:** alarm/trigger-spawned and nested-call runs form a correct execution tree; top-level timer/expression-trigger runs stay roots; no regression to the inbound-API→routed-script path.
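
The cascade rule in M5.4 can be illustrated with a small standalone sketch. It is not the real `ScriptRuntimeContext`: `SketchRuntimeContext`, `ExecutionRow`, and `PrintTree` are hypothetical stand-ins, and only the field names `_executionId` and `_parentExecutionId` are taken from the task text. Under those assumptions, it shows why a nested call has to receive the caller's own id for A→B→C to come out as three levels.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the audit table; the real rows live in AuditLog.
var audit = new List<ExecutionRow>();

// A is a top-level run (e.g. timer-triggered), so it has no parent and stays a root.
var a = new SketchRuntimeContext("A", null, audit);
var b = a.CallScript("B");   // B's parent is A's own execution id
_ = b.CallScript("C");       // C's parent is B's own execution id, not A's

PrintTree(audit, null, 0);

// Walks the rows from the roots down, indenting two spaces per level.
static void PrintTree(List<ExecutionRow> rows, Guid? parent, int depth)
{
    foreach (var row in rows.Where(r => r.ParentExecutionId == parent))
    {
        Console.WriteLine($"{new string(' ', depth * 2)}{row.Script}");
        PrintTree(rows, row.ExecutionId, depth + 1);
    }
}

// Illustrative row shape: only the two correlation ids plus a label.
public sealed record ExecutionRow(Guid ExecutionId, Guid? ParentExecutionId, string Script);

// Illustrative context: records one row per run and spawns child runs.
public sealed class SketchRuntimeContext
{
    private readonly Guid _executionId = Guid.NewGuid();
    private readonly Guid? _parentExecutionId;
    private readonly List<ExecutionRow> _audit;

    public SketchRuntimeContext(string script, Guid? parentExecutionId, List<ExecutionRow> audit)
    {
        _parentExecutionId = parentExecutionId;
        _audit = audit;
        _audit.Add(new ExecutionRow(_executionId, _parentExecutionId, script));
    }

    public Guid ExecutionId => _executionId;

    // Key point: the child receives this run's own id. Forwarding the inherited
    // _parentExecutionId instead would attach every nested call to the same
    // ancestor (or leave it a root), losing the intermediate levels.
    public SketchRuntimeContext CallScript(string script) =>
        new SketchRuntimeContext(script, _executionId, _audit);
}
```

Run as a single-file console program, the sketch prints A, B, and C at increasing indentation. Swapping `_executionId` for `_parentExecutionId` in `CallScript` turns all three into roots here, which is the flat shape the task is meant to eliminate.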
### Task M5.5 (T3): Per-channel retention overrides

**Classification:** high-risk (purge/deletion + CI guard) · **~5 min** · **Parallelizable with:** M5.4, M5.6

**Files:**
- Modify: `src/ZB.MOM.WW.ScadaBridge.AuditLog/Configuration/AuditLogOptions.cs` — add `Dictionary PerChannelRetentionDays` (keyed by `Action`/channel name); validate in `AuditLogOptionsValidator.cs` (each override in `[30, global]`, shorter-than-global only).
- Modify: `src/ZB.MOM.WW.ScadaBridge.AuditLog/Central/AuditLogPurgeActor.cs` (`HandlePurgeTickAsync`, ~line 135) — after the global partition switch-out, for each channel with a shorter override, run a **bounded batched DELETE** (`WHERE Action=@channel AND OccurredAtUtc<@threshold`) via the purge/maintenance path.
- Modify: the M2.10 CI grep-guard script — add an allow-list entry for the purge actor's single audited DELETE call site (do NOT blanket-exempt; the guard must still reject all other UPDATE/DELETE on AuditLog).
- Test: a channel with a shorter override is purged earlier than global; un-overridden channels follow global; the CI guard still fails on a stray DELETE elsewhere.

**AC:** per-channel retention works without violating writer-role append-only; the guard remains effective.

### Task M5.6 (T5): SourceNode sentinel backfill + runbook

**Classification:** small · **~4 min** · **Parallelizable with:** M5.4, M5.5 · **Depends on:** M5.1 (shares `AuditCommands.cs`)

**Files:**
- Create: a one-shot maintenance backfill (purge/maintenance path) that sets `SourceNode` to a configurable sentinel (default `"unknown"`) on `NULL` rows within a bounded `OccurredAtUtc` range; idempotent.
- Modify: `src/ZB.MOM.WW.ScadaBridge.CLI/Commands/AuditCommands.cs` — add `audit backfill-source-node [--sentinel ] [--before ]` invoking it (after M5.1's `audit tree` is in, to avoid a concurrent edit to this file).
- Modify/Create: a runbook note (`deploy/.../RUNBOOK.md` or the AuditLog component doc) documenting that `ExecutionId`/`ParentExecutionId` are computed from `DetailsJson` and CANNOT be backfilled under append-only (pre-feature rows stay NULL) — no false precision.
- Test: backfill sets the sentinel only on NULL rows in range, is idempotent, and does not touch non-NULL rows.

**AC:** SourceNode backfill is sanctioned maintenance (CI-guard allow-listed if it does UPDATE); the computed-id limitation is documented, not coded.

---

# Wave C — integration + docs

### Task M5.7: Integration verification + docs

**Classification:** high-risk (final integration reviewer) · **~5 min** · **Depends on:** M5.1–M5.6

**Steps:**
1. `dotnet build ZB.MOM.WW.ScadaBridge.slnx` (full solution).
2. Targeted tests across AuditLog, ManagementService, CLI, NotificationOutbox/SiteCallAudit, SiteRuntime, CentralUI; run the CI grep-guard to confirm it still blocks stray UPDATE/DELETE.
3. Docs: `docs/requirements/Component-AuditLog.md` (per-channel retention, per-node KPIs, response-capture increments, tag-cascade, `audit tree`), `Component-CLI.md` + CLI README (`audit tree`, `audit backfill-source-node`), CLAUDE.md audit notes (per-channel retention; tag-cascade now beyond inbound; per-node KPIs), and the runbook computed-id limitation.
4. Commit; final integration review of the whole `1b7600f..HEAD` diff.

**AC:** full build green; all targeted suites + CI guard green; docs reflect the six shipped items; no doc claims a deferred item shipped (T1/T2 remain deferred).

---

## Native tasks & dependencies

Sub-tasks created as native tasks under umbrella #16 (M5). Edges: M5.6 ⟵ M5.1 (shared CLI file); M5.7 ⟵ M5.1–M5.6. Waves: A = {M5.1, M5.2, M5.3} parallel; B = {M5.4, M5.5, M5.6} parallel (M5.6 after M5.1); C = M5.7.