scadalink-design/docs/plans/2026-05-20-centralized-audit-log.md
Joseph Doherty acb160ecce docs(audit): fix plan reference to existing CachedCallTelemetry message
Task 2's spec reviewer flagged that the plan used a non-existent name
'CachedOperationTelemetry' when describing the additively-evolved cached
telemetry message. The existing message is 'CachedCallTelemetry'; renaming
would violate Commons REQ-COM-5a (additive-only). Plan now reflects the
in-place additive evolution and warns against rename.
2026-05-20 07:53:23 -04:00


Centralized Audit Log Implementation Plan

For Claude: REQUIRED SUB-SKILL: Use superpowers-extended-cc:executing-plans to implement this plan task-by-task.

Repo nature: Design-documentation only. No code, no tests. Each task is a documentation change. "Verify" = re-read the diff + grep for stale cross-references. Commit after each task.

Goal: Document the new #23 Audit Log component and propagate its cross-references across every affected component design, the README, HighLevelReqs, and CLAUDE.md — exactly as specified in alog.md (committed fec0bb1).

Architecture: Layered, append-only AuditLog table at central, alongside existing Notifications (#21) and SiteCalls (#22) operational stores. Site SQLite writes on the hot path; gRPC telemetry forwards to central; site purge requires ForwardState ∈ {Forwarded, Reconciled}. Cached calls send a single telemetry packet that drives both the immutable AuditLog insert and the operational SiteCalls upsert. Central-originated events (Inbound API, Notification dispatch attempts) write directly. Monthly partitioning at central, 365-day default retention.

Tech Stack: Markdown only. No code in v1 of this plan.

Spec: /Users/dohertj2/Desktop/scadalink-design/alog.md (see commit fec0bb1). All task content below cites sections of that file.


Task 0: Prepare branch

Files:

  • None — git operation only.

Step 1: Confirm working tree state

Run: git status --short Expected: three unstaged infra/ modifications (unrelated; leave them alone), nothing else.

Step 2: Create feature branch off main

Run: git switch -c feature/audit-log-docs Expected: switched to a new branch.

Step 3: Verify branch

Run: git rev-parse --abbrev-ref HEAD Expected: feature/audit-log-docs.

No commit at this task — just branch prep.


Task 1: Author Component-AuditLog.md

Files:

  • Create: docs/requirements/Component-AuditLog.md

Step 1: Read context

Read alog.md §1–§16. Read the structural style of docs/requirements/Component-SiteCallAudit.md and docs/requirements/Component-NotificationOutbox.md, and mirror their section ordering (Purpose / Location / Responsibilities / Tables / Lifecycle / Ingest & Idempotency / Reconciliation / Retention & Purge / KPIs / Configuration / Dependencies / Interactions).

Step 2: Write the skeleton

Create the file with these top-level headings (verbatim, in order):

# Component: Audit Log

## Purpose
## Location
## Responsibilities
## Scope — the script trust boundary
## The `AuditLog` Table (central)
## The Site-Local `AuditLog` (SQLite)
## Ingestion Paths
## Cached Operations — Combined Telemetry
## Payload Capture Policy
## Failure Handling & Idempotency
## Retention & Purge
## Security & Tamper-Evidence
## KPIs
## Configuration
## Dependencies
## Interactions
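
Once the file exists, the skeleton above can be checked mechanically. The following is a minimal sketch, assuming the headings are written exactly as listed; the helper name `missing_or_misordered` is hypothetical and not part of the plan.

```python
# Headings copied from the skeleton above, in the required order.
# "\u2014" is the long dash that appears in two of the headings.
REQUIRED_HEADINGS = [
    "# Component: Audit Log",
    "## Purpose",
    "## Location",
    "## Responsibilities",
    "## Scope \u2014 the script trust boundary",
    "## The `AuditLog` Table (central)",
    "## The Site-Local `AuditLog` (SQLite)",
    "## Ingestion Paths",
    "## Cached Operations \u2014 Combined Telemetry",
    "## Payload Capture Policy",
    "## Failure Handling & Idempotency",
    "## Retention & Purge",
    "## Security & Tamper-Evidence",
    "## KPIs",
    "## Configuration",
    "## Dependencies",
    "## Interactions",
]


def missing_or_misordered(markdown_text, required=REQUIRED_HEADINGS):
    """Return the required headings that are absent or appear out of order."""
    lines = [line.rstrip() for line in markdown_text.splitlines()]
    problems = []
    cursor = 0  # index just past the last heading matched in order
    for heading in required:
        try:
            cursor = lines.index(heading, cursor) + 1
        except ValueError:
            problems.append(heading)
    return problems
```

An empty result means every heading is present in the expected order; anything returned is a heading to fix before moving on to Step 3.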

Step 3: Fill Purpose

Two-paragraph version of alog.md §1. Lead sentence: "Provides a single, append-only, forensic + operational record of every integration action initiated by, or terminating in, a script — across outbound API, outbound DB, notifications, and inbound API." Second paragraph: not a dispatcher, observes Notification Outbox (#21) and Site Call Audit (#22), adds coverage where they are silent.

Step 4: Fill Location

Central cluster + site cluster. Central: AuditLog table in MS SQL plus three singleton actors on the active central node — AuditLogIngestActor (telemetry receiver), SiteAuditReconciliationActor, AuditLogPurgeActor. Sites: AuditLog SQLite database file alongside the S&F buffer plus SiteAuditTelemetryActor singleton on the active site node. Registered as component #23 in the Host role configuration.

Step 5: Fill Responsibilities

Bullet list mirroring alog.md §1–§3 commitments. Six bullets:

  • Accept site-local hot-path audit writes from script-trust-boundary call paths.
  • Forward site audit rows to central via gRPC telemetry with at-least-once + idempotency on EventId.
  • Run periodic reconciliation pulls per site to self-heal missed telemetry.
  • Accept central-originated audit writes (Inbound API, Notification dispatch attempts).
  • Compute point-in-time KPIs (global + per-site) from the central AuditLog table.
  • Purge expired rows by monthly partition switch.

Step 6: Fill Scope — the script trust boundary

Reproduce the table from alog.md §2 verbatim (the six rows). Add the "Out of scope" bullet list. Add the DB-reads note.

Step 7: Fill The AuditLog Table (central)

Reproduce the column table from alog.md §4. Then the index list. Then the Kind-per-channel table (with the inbound API simplification — only Completed).

Step 8: Fill The Site-Local AuditLog (SQLite)

State same schema as central minus IngestedAtUtc, plus ForwardState (Pending | Forwarded | Reconciled). Reproduce the hard purge invariant from alog.md §4 verbatim:

A row is eligible for purge only when both OccurredAtUtc < retention threshold AND ForwardState IN ('Forwarded', 'Reconciled'). Pending rows are never purged.

Mention the SiteAuditBacklog health metric.
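
The purge invariant can be illustrated with a small SQLite sketch. This is an assumption-laden illustration, not the component's implementation: the three-column table is a cut-down stand-in for the real schema in alog.md §4, timestamps are assumed to be ISO-8601 UTC strings, and `purge_site_audit` is a hypothetical name.

```python
import sqlite3


def purge_site_audit(conn, threshold_utc):
    """Delete only rows that are both expired and already held by central."""
    cur = conn.execute(
        "DELETE FROM AuditLog "
        "WHERE OccurredAtUtc < ? "
        "AND ForwardState IN ('Forwarded', 'Reconciled')",
        (threshold_utc,),
    )
    conn.commit()
    return cur.rowcount


# Demo with a reduced, assumed schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE AuditLog ("
    "EventId TEXT PRIMARY KEY, OccurredAtUtc TEXT NOT NULL, ForwardState TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO AuditLog VALUES (?, ?, ?)",
    [
        ("a", "2026-01-01T00:00:00Z", "Forwarded"),   # old and forwarded: purged
        ("b", "2026-01-01T00:00:00Z", "Pending"),     # old but Pending: kept
        ("c", "2026-01-01T00:00:00Z", "Reconciled"),  # old and reconciled: purged
        ("d", "2026-05-19T00:00:00Z", "Forwarded"),   # forwarded but recent: kept
    ],
)
deleted = purge_site_audit(conn, "2026-05-13T00:00:00Z")
remaining = [r[0] for r in conn.execute("SELECT EventId FROM AuditLog ORDER BY EventId")]
```

Row `b` survives even though it is past the threshold, which is the behaviour the invariant exists to guarantee during a central outage.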

Step 9: Fill Ingestion Paths

Subsections mirroring alog.md §6.1, §6.2, §6.3, §6.4. Keep them concise: the full pseudo-code lives in alog.md; the component doc captures only the contract.

Step 10: Fill Cached Operations — Combined Telemetry

Capture alog.md §6.5 — site is source of truth, one telemetry packet carries both the audit row and the SiteCalls operational update; central ingest performs both writes in a single transaction.
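
The dual write described above can be sketched as follows. This is illustrative only: central uses MS SQL, so SQLite here is a stand-in, the column sets are reduced, and `ingest_cached_packet` plus the packet dictionary shape are assumptions rather than the real message contract.

```python
import sqlite3


def ingest_cached_packet(conn, packet):
    """Apply one cached-call packet: audit insert plus SiteCalls upsert, atomically."""
    with conn:  # commits on success, rolls back if either statement raises
        # Immutable audit row: a redelivered EventId is silently ignored.
        conn.execute(
            "INSERT OR IGNORE INTO AuditLog (EventId, Kind, Status) VALUES (?, ?, ?)",
            (packet["EventId"], packet["Kind"], packet["Status"]),
        )
        # Operational row: latest state wins, keyed on TrackedOperationId.
        conn.execute(
            "INSERT INTO SiteCalls (TrackedOperationId, Status) VALUES (?, ?) "
            "ON CONFLICT(TrackedOperationId) DO UPDATE SET Status = excluded.Status",
            (packet["TrackedOperationId"], packet["Status"]),
        )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE AuditLog (EventId TEXT PRIMARY KEY, Kind TEXT, Status TEXT)")
conn.execute("CREATE TABLE SiteCalls (TrackedOperationId TEXT PRIMARY KEY, Status TEXT)")

packets = [
    {"EventId": "e1", "TrackedOperationId": "op1", "Kind": "CachedEnqueued", "Status": "Enqueued"},
    {"EventId": "e2", "TrackedOperationId": "op1", "Kind": "CachedAttempt", "Status": "Retrying"},
    {"EventId": "e2", "TrackedOperationId": "op1", "Kind": "CachedAttempt", "Status": "Retrying"},  # redelivery
]
for p in packets:
    ingest_cached_packet(conn, p)

audit_rows = conn.execute("SELECT COUNT(*) FROM AuditLog").fetchone()[0]
site_call_status = conn.execute(
    "SELECT Status FROM SiteCalls WHERE TrackedOperationId = 'op1'"
).fetchone()[0]
```

The redelivered packet adds no second audit row, while the single SiteCalls row simply reflects the most recent status.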

Step 11: Fill Payload Capture Policy

Compress alog.md §8 into 8–12 lines: defaults (8 KB / 64 KB on error), header redaction, body-redactor regex hook, SQL captures values by default with per-connection opt-out, never-captured list (API keys, LDAP creds, secrets), safety-net over-redacts on misconfiguration.
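
A minimal sketch of the two default behaviours, body truncation and header redaction, may help when writing this section. It assumes 1 KB = 1024 bytes; the header names in the redact list and the `[REDACTED]` marker are hypothetical examples, not values taken from alog.md.

```python
DEFAULT_CAP_BYTES = 8 * 1024   # default cap (assumes 1 KB = 1024 bytes)
ERROR_CAP_BYTES = 64 * 1024    # larger cap applied when the call failed

# Hypothetical entries; the real list comes from the HeaderRedactList option.
HEADER_REDACT_LIST = {"authorization", "x-api-key"}


def capture_body(body: bytes, is_error: bool) -> bytes:
    """Truncate a request/response body to the cap that applies."""
    cap = ERROR_CAP_BYTES if is_error else DEFAULT_CAP_BYTES
    return body[:cap]


def redact_headers(headers: dict) -> dict:
    """Replace the values of listed headers, matching names case-insensitively."""
    return {
        name: ("[REDACTED]" if name.lower() in HEADER_REDACT_LIST else value)
        for name, value in headers.items()
    }


big = b"x" * 100_000
ok_len = len(capture_body(big, is_error=False))
err_len = len(capture_body(big, is_error=True))
safe = redact_headers({"Authorization": "Bearer abc", "Accept": "*/*"})
```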

Step 12: Fill Failure Handling & Idempotency

Compress alog.md §9: EventId is the PK and dedup key; never-fail-the-action principle; ring buffer for transient SQLite write failures; reconciliation as fallback when telemetry actor wedges; central-direct-write failure handling.
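
The never-fail-the-action principle combined with a ring buffer could look roughly like the sketch below. Everything here is an assumption for illustration: the class name, the capacity, and the choice to drop the oldest buffered event when the ring is full are not specified in this plan.

```python
from collections import deque


class BufferedAuditWriter:
    """Wraps a row-writing callable so that audit failures never reach the caller."""

    def __init__(self, write_row, capacity=1024):
        self._write_row = write_row
        self._ring = deque(maxlen=capacity)  # oldest entry is dropped when full
        self.write_failures = 0              # would feed a failure-count metric

    def write(self, event) -> None:
        """Never raises: a failed append is buffered and counted instead."""
        try:
            self._flush()            # retry earlier failures first, in order
            self._write_row(event)
        except Exception:
            self.write_failures += 1
            self._ring.append(event)

    def _flush(self) -> None:
        while self._ring:
            self._write_row(self._ring[0])
            self._ring.popleft()     # remove only after a successful write


# Demo: a store that fails twice, then recovers.
store = []
state = {"fail": True}


def flaky_write(event):
    if state["fail"]:
        raise OSError("database is locked")
    store.append(event)


writer = BufferedAuditWriter(flaky_write, capacity=8)
writer.write("e1")
writer.write("e2")
failures_during_outage = writer.write_failures
state["fail"] = False
writer.write("e3")
```

After recovery the buffered events are written before the new one, so ordering is preserved as long as the ring did not overflow.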

Step 13: Fill Retention & Purge

Compress alog.md §12: 365-day default central retention; monthly partition switch; no row-level deletes at central; site 7-day default; site purge respects ForwardState.
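
To make the interaction between a day-based retention setting and month-sized partitions concrete, here is a small sketch. It assumes a partition is switched out only once its entire month is older than the retention window; that boundary rule and the function names are assumptions, so check them against alog.md §12.

```python
from datetime import date, timedelta


def next_month(d: date) -> date:
    """First day of the month after d."""
    return date(d.year + (d.month == 12), d.month % 12 + 1, 1)


def expired_partitions(partition_months, today, retention_days=365):
    """Return partitions (given as first-of-month dates) wholly past retention."""
    cutoff = today - timedelta(days=retention_days)
    return [m for m in partition_months if next_month(m) <= cutoff]


partitions = [date(2025, 3, 1), date(2025, 4, 1), date(2025, 5, 1), date(2025, 6, 1)]
to_switch_out = expired_partitions(partitions, today=date(2026, 5, 20))
```

With a cutoff of 2025-05-20, the May 2025 partition still holds in-retention rows and is kept; only March and April qualify.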

Step 14: Fill Security & Tamper-Evidence

Compress alog.md §11: dedicated scadalink_audit_writer (INSERT+SELECT) and scadalink_audit_purger (partition-switch only) DB roles; CI grep guard against UPDATE/DELETE of AuditLog; Audit + OperationalAudit + AuditExport permissions; hash-chain tamper evidence deferred to v1.x.

Step 15: Fill KPIs

List the five KPIs from alog.md §14: Volume, Error rate, Backlog, Top inbound callers, Top outbound 5xx. Note that Notification Outbox and Site Call Audit KPIs are unaffected.

Step 16: Fill Configuration

Show the AuditLog appsettings.json shape from alog.md §8.4. Include DefaultCapBytes, ErrorCapBytes, HeaderRedactList, GlobalBodyRedactors, PerTargetOverrides, and RetentionDays (global only in v1).

Step 17: Fill Dependencies

Cross-references to:

  • Commons (#16): AuditEvent, IAuditWriter, ICentralAuditWriter, AuditChannel, AuditKind, AuditStatus types and interfaces.
  • Configuration Database (#17): AuditLog table schema, partition function/scheme, DB roles, retention options.
  • Cluster Infrastructure (#13) — singleton placement and supervision (AuditLogIngestActor, SiteAuditTelemetryActor, SiteAuditReconciliationActor, AuditLogPurgeActor).
  • Communication (#5) — gRPC telemetry message types added to the existing site-stream proto additively.
  • Site Runtime (#3) — script trust boundary touchpoints invoke IAuditWriter.
  • Host (#15) — registers the new component under the central + site roles.

Step 18: Fill Interactions

Edges to:

  • External System Gateway (#7) — emits ApiOutbound.SyncCall rows; for CachedCall emits combined telemetry (audit + operational).
  • Site Runtime (#3) / Database layer — emits DbOutbound.SyncWrite, DbOutbound.SyncRead, and cached variants similarly.
  • Inbound API (#14) — emits ApiInbound.Completed rows from request middleware.
  • Notification Outbox (#21) — site-emitted Notification.Enqueued flows via audit telemetry; central dispatcher writes Notification.Attempt and Notification.Terminal rows directly via ICentralAuditWriter.
  • Site Call Audit (#22) — shares the cached-call telemetry packet; central ingest of that packet performs both AuditLog insert and SiteCalls upsert in one transaction.
  • Central UI (#9) — new Audit nav group + Audit Log page; drill-in links from Notifications, Site Calls, External Systems, Inbound API key, Sites, Instances detail pages.
  • Health Monitoring (#11) — three new tiles (Volume, Error rate, Backlog) plus new metrics (SiteAuditBacklog, SiteAuditWriteFailures, SiteAuditTelemetryStalled, CentralAuditWriteFailures, AuditRedactionFailure).
  • CLI (#19): scadalink audit query|export|verify-chain commands.

Step 19: Verify

Run: grep -n "Component-AuditLog.md\|#23" docs/requirements/Component-AuditLog.md Expected: at least one hit, showing that the file identifies itself as component #23.

Run: wc -l docs/requirements/Component-AuditLog.md Expected: ~250–400 lines (sanity check; not exact).

Step 20: Commit

git add docs/requirements/Component-AuditLog.md
git commit -m "docs(audit): add Component-AuditLog (#23) design document"

Task 2: Update Component-Commons.md

Files:

  • Modify: docs/requirements/Component-Commons.md

Step 1: Read existing structure

Read the file to find the right sections — likely "Types", "Interfaces", "Messages", "Entities". Note which subsections audit-related additions belong in.

Step 2: Add to Types/

Under the Types section, add:

  • AuditChannel enum: ApiOutbound | DbOutbound | Notification | ApiInbound.
  • AuditKind enum: union of channel-specific values from alog.md §4 table.
  • AuditStatus enum: Success | TransientFailure | PermanentFailure | Enqueued | Retrying | Delivered | Parked | Discarded.
  • AuditEvent POCO record carrying every column from alog.md §4 (central schema), plus a ForwardState for site SQLite.
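
For readers who want the value sets at a glance, the two fully specified enums can be sketched as below. The real types are Commons types in the project's own language; this Python rendering, the member naming, and the helper `is_terminal_notification` are illustrative assumptions. AuditKind is omitted because its values are defined by the alog.md §4 table rather than by this plan.

```python
from enum import Enum


class AuditChannel(Enum):
    API_OUTBOUND = "ApiOutbound"
    DB_OUTBOUND = "DbOutbound"
    NOTIFICATION = "Notification"
    API_INBOUND = "ApiInbound"


class AuditStatus(Enum):
    SUCCESS = "Success"
    TRANSIENT_FAILURE = "TransientFailure"
    PERMANENT_FAILURE = "PermanentFailure"
    ENQUEUED = "Enqueued"
    RETRYING = "Retrying"
    DELIVERED = "Delivered"
    PARKED = "Parked"
    DISCARDED = "Discarded"


# Terminal notification statuses as named in Task 8 of this plan.
TERMINAL_NOTIFICATION_STATUSES = frozenset(
    {AuditStatus.DELIVERED, AuditStatus.PARKED, AuditStatus.DISCARDED}
)


def is_terminal_notification(status: AuditStatus) -> bool:
    """True when a notification has reached a status with no further attempts."""
    return status in TERMINAL_NOTIFICATION_STATUSES
```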

Step 3: Add to Interfaces/

  • IAuditWriter — site-local hot-path interface: Task WriteAsync(AuditEvent evt, CancellationToken ct). Implementation lives in Audit Log (#23) component.
  • ICentralAuditWriter — central direct-write interface: Task WriteAsync(AuditEvent evt, CancellationToken ct) with insert-if-not-exists semantics on EventId.

Step 4: Add to Messages/

  • AuditTelemetryEnvelope — gRPC message wrapping a batch of AuditEvent rows for telemetry forwarding.
  • CachedCallTelemetry — the existing SiteCalls telemetry message, additively extended in place to also carry AuditEvent content alongside the operational SiteCalls upsert fields. Do NOT rename; per Component-Commons.md REQ-COM-5a, message renames are breaking changes. Extend the existing entry's description.

Step 5: Verify

Run: grep -n "AuditEvent\|IAuditWriter\|AuditChannel" docs/requirements/Component-Commons.md Expected: all three identifiers appear in the right sections.

Step 6: Commit

git add docs/requirements/Component-Commons.md
git commit -m "docs(audit): register AuditEvent, IAuditWriter, AuditTelemetry types in Commons"

Task 3: Update Component-ConfigurationDatabase.md

Files:

  • Modify: docs/requirements/Component-ConfigurationDatabase.md

Step 1: Read existing structure

Find the "Tables" and "Roles" / "Permissions" / "Migrations" sections.

Step 2: Add AuditLog table description

Under Tables, add a new subsection mirroring how Notifications and SiteCalls are documented. Include:

  • Full column list from alog.md §4 (central table).
  • Index list from alog.md §4.
  • Monthly partitioning: partition function pf_AuditLog_Month, scheme ps_AuditLog_Month, filegroup-per-month rollover.
  • PK on EventId for idempotency.

Step 3: Add AuditLog DB roles

Under Roles/Permissions, add scadalink_audit_writer (INSERT+SELECT only) and scadalink_audit_purger (partition-switch only). Note the CI grep guard against UPDATE … AuditLog / DELETE … AuditLog.

Step 4: Add AuditLog migration note

Under Migrations, note that the initial migration creates the partition function/scheme and the table aligned to the scheme; partition-maintenance job is owned by the Audit Log component, not the Configuration DB.

Step 5: Add retention config note

Mention AuditLog:RetentionDays (global only in v1) as an Audit Log options key consumed by the purge actor.

Step 6: Verify cross-reference

Run: grep -n "AuditLog\|Audit Log" docs/requirements/Component-ConfigurationDatabase.md Expected: new table appears in the Tables section, roles in Roles section.

Step 7: Commit

git add docs/requirements/Component-ConfigurationDatabase.md
git commit -m "docs(audit): add AuditLog table, partitioning, and DB roles to Config DB"

Task 4: Update Component-ClusterInfrastructure.md

Files:

  • Modify: docs/requirements/Component-ClusterInfrastructure.md

Step 1: Read singleton-placement section

Find where Notification Outbox / Site Call Audit singletons are documented (active-central placement model).

Step 2: Register central singletons

Add to the central-singleton list:

  • AuditLogIngestActor — receives gRPC telemetry batches, performs insert-if-not-exists on EventId; for cached telemetry, performs both AuditLog insert and SiteCalls upsert in one transaction.
  • SiteAuditReconciliationActor — periodic per-site pull, default every 5 minutes.
  • AuditLogPurgeActor — daily partition-switch purge.
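
How a reconciliation pull self-heals missed telemetry can be sketched in a few lines. This is a simplified model under stated assumptions: it examines only rows still marked Pending at the site, represents central's table as a set of EventId values, and uses the hypothetical name `reconcile_site`; the actual protocol is defined in alog.md.

```python
def reconcile_site(site_rows, central_event_ids):
    """Copy rows central has not seen and mark examined site rows Reconciled."""
    healed = []
    for row in site_rows:
        if row["ForwardState"] != "Pending":
            continue  # already acknowledged by telemetry or an earlier pull
        if row["EventId"] not in central_event_ids:
            central_event_ids.add(row["EventId"])  # insert-if-not-exists on EventId
            healed.append(row["EventId"])
        row["ForwardState"] = "Reconciled"
    return healed


site_rows = [
    {"EventId": "e1", "ForwardState": "Forwarded"},
    {"EventId": "e2", "ForwardState": "Pending"},  # central has it; the ack was lost
    {"EventId": "e3", "ForwardState": "Pending"},  # central never received it
]
central_ids = {"e1", "e2"}
healed = reconcile_site(site_rows, central_ids)
```

Only `e3` is actually copied, but both Pending rows become eligible for site purge afterwards because central is now known to hold them.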

Step 3: Register site singletons

Add to the site-singleton list:

  • SiteAuditTelemetryActor — drains the local AuditLog SQLite's Pending rows to central in batches; short interval (5s) when busy, longer (30s) when idle.
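
The busy/idle drain cadence can be sketched as below. The 5 s and 30 s values come from the bullet above; treating "busy" as "the last drain forwarded at least one row", the batch size of 100, and the function names are assumptions made for illustration.

```python
BUSY_INTERVAL_S = 5    # short interval while rows are flowing
IDLE_INTERVAL_S = 30   # longer interval when nothing was pending


def next_drain_delay(rows_forwarded: int) -> int:
    """Pick the delay before the next drain based on the last batch size."""
    return BUSY_INTERVAL_S if rows_forwarded > 0 else IDLE_INTERVAL_S


def drain_once(rows, send_batch, batch_size=100):
    """Forward one batch of Pending rows and return the delay until the next drain."""
    batch = [r for r in rows if r["ForwardState"] == "Pending"][:batch_size]
    if batch:
        send_batch(batch)  # if this raises, the rows stay Pending for the next cycle
        for r in batch:
            r["ForwardState"] = "Forwarded"
    return next_drain_delay(len(batch))


rows = [{"EventId": i, "ForwardState": "Pending"} for i in range(150)]
sent = []
delays = [drain_once(rows, sent.append) for _ in range(3)]
```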

Step 4: Note dedicated dispatcher

Add a one-liner: SiteAuditTelemetryActor runs on a dedicated dispatcher so it doesn't compete with the script blocking-I/O dispatcher (per alog.md §6.2).

Step 5: Verify

Run: grep -n "AuditLogIngestActor\|SiteAuditTelemetryActor\|AuditLogPurgeActor\|SiteAuditReconciliationActor" docs/requirements/Component-ClusterInfrastructure.md Expected: all four singletons listed.

Step 6: Commit

git add docs/requirements/Component-ClusterInfrastructure.md
git commit -m "docs(audit): register AuditLog singletons in Cluster Infrastructure"

Task 5: Update Component-SiteRuntime.md

Files:

  • Modify: docs/requirements/Component-SiteRuntime.md

Step 1: Find script-trust-boundary section

Locate the section listing what scripts can/cannot do and how their boundary-crossing calls are mediated.

Step 2: Note audit hook

Add: "Every script-trust-boundary call (External System Gateway, Database layer, Notify) emits an AuditEvent to IAuditWriter (site-local SQLite append). Hot path; never fails the calling action; failures logged via the SiteAuditWriteFailures health metric (see Health Monitoring #11)."

Step 3: Note site SQLite footprint

Find the section discussing site storage (SQLite for deployed configs, S&F buffer, event log, operation tracking). Add the AuditLog SQLite database file as a peer with the 7-day-purge-respecting-ForwardState invariant; cross-reference to Component-AuditLog.md.

Step 4: Verify

Run: grep -n "IAuditWriter\|AuditLog\|Audit Log" docs/requirements/Component-SiteRuntime.md Expected: hook documented, SQLite file mentioned.

Step 5: Commit

git add docs/requirements/Component-SiteRuntime.md
git commit -m "docs(audit): note IAuditWriter hook and site SQLite in Site Runtime"

Task 6: Update Component-ExternalSystemGateway.md

Files:

  • Modify: docs/requirements/Component-ExternalSystemGateway.md

Step 1: Find Call/CachedCall sections

Locate the dual-call-modes documentation.

Step 2: Note audit emission on sync calls

Under ExternalSystem.Call, add: "Emits an ApiOutbound.SyncCall row to IAuditWriter at call completion (success or failure). Payload captured per the Audit Log policy (#23 §Payload Capture Policy). Audit-write failure never aborts the script."

Step 3: Note audit emission on cached calls

Under ExternalSystem.CachedCall, add: "Each lifecycle transition (CachedEnqueued, CachedAttempt, CachedTerminal) emits an audit row via the combined cached-operation telemetry packet — one packet carries both the audit row and the SiteCalls upsert (see Audit Log #23 §Cached Operations and Site Call Audit #22)."

Step 4: Note audit emission on DB writes

Under Database.Connection() (synchronous), add: "Script-initiated Execute/ExecuteScalar calls emit DbOutbound.SyncWrite rows; ExecuteReader emits DbOutbound.SyncRead. SQL parameter values are captured by default; per-connection redaction opt-in via the Audit Log configuration (#23 §Payload Capture Policy §8.2)."

Step 5: Note audit emission on cached DB writes

Under Database.CachedWrite, add: same combined-telemetry pattern as cached external calls.

Step 6: Verify

Run: grep -n "AuditLog\|Audit Log\|ApiOutbound\|DbOutbound\|IAuditWriter" docs/requirements/Component-ExternalSystemGateway.md Expected: hooks documented in all four call-mode subsections.

Step 7: Commit

git add docs/requirements/Component-ExternalSystemGateway.md
git commit -m "docs(audit): emit AuditLog rows from External System Gateway call paths"

Task 7: Update Component-SiteCallAudit.md

Files:

  • Modify: docs/requirements/Component-SiteCallAudit.md

Step 1: Find Ingest & Idempotency section

Locate the "Ingest & Idempotency" section (around line 69 in current file).

Step 2: Note combined telemetry

Add a new paragraph: "From v1.x onward, the cached-operation telemetry packet additively carries the AuditEvent content alongside the existing operational fields. Central's AuditLogIngestActor (Audit Log #23) performs both the immutable AuditLog insert and the SiteCalls upsert in a single transaction. Idempotency keys remain EventId (for AuditLog) and TrackedOperationId (for SiteCalls)."

Step 3: Cross-reference Audit Log

Find the Dependencies / Interactions sections (typically near the end). Add an edge to Audit Log (#23) noting the shared telemetry packet and dual-write ingest.

Step 4: Verify

Run: grep -n "Audit Log\|AuditLog\|AuditEvent\|#23" docs/requirements/Component-SiteCallAudit.md Expected: combined-telemetry paragraph + Dependencies edge present.

Step 5: Commit

git add docs/requirements/Component-SiteCallAudit.md
git commit -m "docs(audit): note shared cached-operation telemetry with Audit Log"

Task 8: Update Component-NotificationOutbox.md

Files:

  • Modify: docs/requirements/Component-NotificationOutbox.md

Step 1: Find dispatcher section

Locate the section describing the central dispatcher's delivery attempt loop.

Step 2: Note central direct-write of attempt/terminal

Add: "Each delivery attempt writes a Notification.Attempt row to the AuditLog via ICentralAuditWriter; transition to a terminal status (Delivered / Parked / Discarded) writes a Notification.Terminal row. Audit writes are direct (no telemetry — the dispatcher runs at central). The site-emitted Notification.Enqueued row arrives via the standard audit telemetry channel."

Step 3: Cross-reference Audit Log

Add to Dependencies / Interactions: edge to Audit Log (#23) noting central direct-write of dispatch lifecycle events.

Step 4: Note status independence

Add a clarifying sentence: "The operational Notifications table remains the source of truth for the dispatcher and for Retry/Discard actions; the AuditLog rows are immutable shadows."

Step 5: Verify

Run: grep -n "Audit Log\|ICentralAuditWriter\|Notification.Attempt\|#23" docs/requirements/Component-NotificationOutbox.md Expected: dispatcher hook + Dependencies edge present.

Step 6: Commit

git add docs/requirements/Component-NotificationOutbox.md
git commit -m "docs(audit): central direct-write of notification dispatch events to AuditLog"

Task 9: Update Component-InboundAPI.md

Files:

  • Modify: docs/requirements/Component-InboundAPI.md

Step 1: Find request-completion / logging section

Locate the section describing how requests are processed and what gets logged today (today: failures only, per the brainstorm exploration).

Step 2: Replace failures-only stance

Edit the "failures-only logging" claim so it now reads: "Every request (success or failure) emits one ApiInbound.Completed row to ICentralAuditWriter from request middleware before the HTTP response is flushed. The row captures the API key name (never the key material), remote IP, user-agent, response status, duration, and truncated request/response bodies per the Audit Log capture policy (#23 §Payload Capture Policy)."

Step 3: Cross-reference Audit Log

Add Dependencies edge to Audit Log (#23).

Step 4: Note non-blocking semantics

Add: "Middleware audit-write failures are logged and metricked (see Health Monitoring #11) but never affect the HTTP response."

Step 5: Verify

Run: grep -n "Audit Log\|ApiInbound\|ICentralAuditWriter\|#23" docs/requirements/Component-InboundAPI.md Expected: middleware hook + Dependencies edge present.

Step 6: Commit

git add docs/requirements/Component-InboundAPI.md
git commit -m "docs(audit): emit ApiInbound.Completed audit row per request"

Task 10: Update Component-CentralUI.md

Files:

  • Modify: docs/requirements/Component-CentralUI.md

Step 1: Find navigation / page list

Locate the section enumerating top-level nav groups and pages.

Step 2: Add Audit nav group

Add a new top-level group Audit with one page in v1:

  • Audit Log — global query/filter/drilldown over the central AuditLog table.

Document the filter bar and results grid columns from alog.md §10.1.

Step 3: Add drill-in links

In the existing Notifications, Site Calls, External Systems, Inbound API Keys, Sites, and Instances detail-page documentation, add a "View audit history" / "Recent activity" / "Audit feed" entry that opens the Audit Log page pre-filtered (per alog.md §10.2).

Step 4: Add Health dashboard tiles

In the Health dashboard documentation, add three tiles under a new "Audit" KPI group: Audit volume, Audit error rate, Audit backlog (per alog.md §10.3 / §14).

Step 5: Note UI rules already covered

No new framework choices — sticks to Blazor Server + Bootstrap + custom components per the existing project rules (per memory note feedback_central_ui.md).

Step 6: Verify

Run: grep -n "Audit Log\|Audit nav\|Audit feed\|Audit volume\|#23" docs/requirements/Component-CentralUI.md Expected: nav group, page, drill-ins, tiles all documented.

Step 7: Commit

git add docs/requirements/Component-CentralUI.md
git commit -m "docs(audit): add Audit nav group, Audit Log page, drill-ins, and KPI tiles to Central UI"

Task 11: Update Component-HealthMonitoring.md

Files:

  • Modify: docs/requirements/Component-HealthMonitoring.md

Step 1: Find metrics list

Locate where existing site + central metrics are enumerated.

Step 2: Add new site metrics

  • SiteAuditBacklog — count of Pending rows in site-local AuditLog plus oldest-pending-age plus on-disk bytes. Threshold drives a Health dashboard warning on the affected site tile.
  • SiteAuditWriteFailures — count of failed hot-path appends since last report.
  • SiteAuditTelemetryStalled — boolean flag set when reconciliation reports a non-draining backlog over two cycles.
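
One possible reading of "non-draining backlog over two cycles" is sketched below. The interpretation (a non-zero backlog that fails to shrink on two consecutive reconciliation reports) and the class name `StallDetector` are assumptions; the authoritative definition belongs in alog.md.

```python
class StallDetector:
    """Raises the stalled flag after N consecutive non-draining backlog reports."""

    def __init__(self, cycles=2):
        self.cycles = cycles
        self._previous = None      # backlog seen on the previous report
        self._non_draining = 0     # consecutive reports without progress

    def observe(self, backlog: int) -> bool:
        non_draining = (
            backlog > 0
            and self._previous is not None
            and backlog >= self._previous
        )
        self._non_draining = self._non_draining + 1 if non_draining else 0
        self._previous = backlog
        return self._non_draining >= self.cycles


detector = StallDetector()
flags = [detector.observe(b) for b in [10, 10, 12, 5, 0]]
```

The flag clears as soon as the backlog shrinks again, so a brief stall does not leave the site tile in a warning state.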

Step 3: Add new central metrics

  • CentralAuditWriteFailures — central direct-write failures (Inbound API middleware, Notification Outbox dispatcher).
  • AuditRedactionFailure — payload redactor errors (over-redacted, safety-net hit).

Step 4: Add new tiles

Three new dashboard tiles under an "Audit" group: Audit volume, Audit error rate, Audit backlog.

Step 5: Cross-reference Audit Log

Dependencies edge to Audit Log (#23).

Step 6: Verify

Run: grep -n "SiteAuditBacklog\|SiteAuditWriteFailures\|SiteAuditTelemetryStalled\|CentralAuditWriteFailures\|AuditRedactionFailure\|Audit volume" docs/requirements/Component-HealthMonitoring.md Expected: all five metrics + three tiles listed.

Step 7: Commit

git add docs/requirements/Component-HealthMonitoring.md
git commit -m "docs(audit): add Audit Log health metrics and dashboard tiles"

Task 12: Update Component-CLI.md

Files:

  • Modify: docs/requirements/Component-CLI.md

Step 1: Find command-group list

Locate the section enumerating top-level CLI command groups.

Step 2: Add scadalink audit group

Three subcommands per alog.md §15.1:

  • audit query --site <s> --since <t> --kind <k> [...] — UI-equivalent filter set.
  • audit export --since <t> --until <t> --format csv|jsonl|parquet --output <path> — server-side streaming export.
  • audit verify-chain --month <YYYY-MM> — hash-chain verification (no-op in v1; available once §11.4 ships).

Note: requires OperationalAudit + AuditExport permissions (Security & Auth #10).

Step 3: Cross-reference Audit Log and Management Service

Dependencies edges to Audit Log (#23) and Management Service (#18) (the CLI hits central via the existing HTTP Management API).

Step 4: Verify

Run: grep -n "scadalink audit\|audit query\|audit export\|audit verify-chain\|#23" docs/requirements/Component-CLI.md Expected: command group documented with all three subcommands.

Step 5: Commit

git add docs/requirements/Component-CLI.md
git commit -m "docs(audit): add scadalink audit command group to CLI"

Task 13: Update README.md

Files:

  • Modify: README.md

Step 1: Find component table

Locate the markdown table containing rows #1–#22 (currently around lines 36–58).

Step 2: Add row #23

Append a row after Site Call Audit:

| 23 | Audit Log | [docs/requirements/Component-AuditLog.md](docs/requirements/Component-AuditLog.md) | New central append-only AuditLog spanning every script-trust-boundary action (outbound API sync+cached, outbound DB sync+cached, notifications, inbound API). Site-local SQLite hot-path append + gRPC telemetry + central reconciliation; combined telemetry packet with Site Call Audit; central direct-write for Notification Outbox dispatch + Inbound API middleware; monthly partitioning, 365-day default retention. |

Step 3: Update architecture diagram (logical)

In the architecture diagram, add an AuditLog box under the central cluster's "Audit Log" / observability cluster (parallel to Notification Outbox and Site Call Audit). Add a thin arrow from each affected component into it.

Step 4: Verify

Run: grep -n "Audit Log\|Component-AuditLog.md\|| 23 |" README.md Expected: new row + diagram entry present.

Step 5: Commit

git add README.md
git commit -m "docs(audit): register Audit Log (#23) in the README component table"

Task 14: Update docs/requirements/HighLevelReqs.md

Files:

  • Modify: docs/requirements/HighLevelReqs.md

Step 1: Find functional-area sections

Locate the section that currently contains requirements for Notification Outbox and Site Call Audit (likely under "Observability" or "Audit & Reporting").

Step 2: Add Audit Log requirements section

Add a new subsection "Centralized Audit Log" with numbered requirements covering:

  • AL-1: Append-only central record of every script-trust-boundary action.
  • AL-2: One row per lifecycle event for cached calls and notifications.
  • AL-3: Site-local hot-path append; gRPC telemetry to central; idempotent on EventId.
  • AL-4: Reconciliation pull self-heals missed telemetry.
  • AL-5: Payload metadata + truncated bodies (8 KB default, 64 KB on errors).
  • AL-6: Headers redacted by default; SQL parameter values captured by default; per-target redaction opt-in.
  • AL-7: Audit-write failure never aborts the user-facing action.
  • AL-8: 365-day default central retention; monthly partition switch purge.
  • AL-9: Site SQLite purge requires ForwardState ∈ {Forwarded, Reconciled}; central outage cannot cause audit loss at sites.
  • AL-10: Central UI Audit Log page with cross-channel filter and drill-ins from existing operational pages.
  • AL-11: Append-only enforced via DB roles; tamper-evidence hash chain deferred to v1.x.
  • AL-12: CLI scadalink audit command group.

Step 3: Cross-reference Audit Log component

Add a "See Component-AuditLog.md (#23)" pointer at the top of the subsection.

Step 4: Verify

Run: grep -n "AL-1\|AL-12\|Centralized Audit Log\|Component-AuditLog.md" docs/requirements/HighLevelReqs.md Expected: section header and all twelve requirements present.

Step 5: Commit

git add docs/requirements/HighLevelReqs.md
git commit -m "docs(audit): add Centralized Audit Log requirements (AL-1..AL-12) to HighLevelReqs"

Task 15: Update CLAUDE.md

Files:

  • Modify: CLAUDE.md

Step 1: Update Current Component List

Change the heading from ## Current Component List (22 components) to ## Current Component List (23 components). Append a new line at the end of the numbered list:

23. Audit Log — Central append-only AuditLog table spanning every script-trust-boundary action (outbound API sync+cached, outbound DB sync+cached, notifications, inbound API). Site SQLite hot-path + gRPC telemetry + reconciliation; combined telemetry with Site Call Audit; central direct-write for Notification Outbox dispatch + Inbound API; monthly partitioning, 365-day retention.

Step 2: Add Key Design Decisions block

In the Key Design Decisions section, add a new subsection ### Centralized Audit Log with bulleted decisions mirroring alog.md §1–§15 highlights:

  • Layered design — append-only AuditLog alongside operational Notifications (#21) and SiteCalls (#22), not replacing them.
  • Scope = script trust boundary; framework traffic explicitly excluded.
  • One row per lifecycle event; cached calls produce 4+ rows per operation.
  • Site SQLite hot-path first; gRPC telemetry to central; idempotent on EventId; reconciliation pull as fallback.
  • Cached operations: site emits, one telemetry packet carries audit + operational state; central writes both in one transaction.
  • Payload cap 8 KB default / 64 KB on errors; headers redacted by default; SQL parameter values captured by default; per-target redaction opt-in.
  • Audit-write failure never aborts the user-facing action.
  • 365-day central retention with monthly partition-switch purge; 7-day site SQLite with hard ForwardState invariant.
  • Append-only enforced via DB roles; hash-chain tamper evidence and Parquet archival deferred to v1.x.
  • New top-level Audit nav group + Audit Log page + drill-ins from Notifications / Site Calls / External Systems / Inbound API Keys / Sites / Instances.

Step 3: Verify

Run: grep -n "Centralized Audit Log\|Audit Log\|23 components\|23\\. Audit Log" CLAUDE.md Expected: count updated, list extended, Key Design Decisions block present.

Step 4: Commit

git add CLAUDE.md
git commit -m "docs(audit): register Audit Log (#23) in CLAUDE.md component list and key decisions"

Task 16: Final cross-reference verification

Files:

  • None — verification only.

Step 1: Grep for stale references

Run: grep -rn "22 components\|Currently 22\|22\\. Site Call Audit\\s*$" docs/ README.md CLAUDE.md Expected: no hits — all updated to 23.

Step 2: Grep for orphan references

Run: grep -rn "Component-AuditLog.md" docs/ README.md CLAUDE.md Expected: hits in README, CLAUDE.md, and each affected component doc. Confirm the file exists at the referenced path.

Step 3: Verify all eleven affected component docs cross-reference Audit Log

Run: for f in docs/requirements/Component-{ExternalSystemGateway,InboundAPI,NotificationOutbox,SiteCallAudit,SiteRuntime,Commons,CentralUI,ConfigurationDatabase,ClusterInfrastructure,HealthMonitoring,CLI}.md; do echo "--- $f"; grep -c "Audit Log\|AuditLog\|#23" "$f"; done Expected: each file shows count ≥ 1.

Step 4: Verify alog.md still matches the design canonically

Run: git diff fec0bb1 -- alog.md Expected: no diff — alog.md is unchanged from the validated commit.

Step 5: Skim the new file once more end-to-end

Read: docs/requirements/Component-AuditLog.md. Verify section ordering, completeness, no contradictions with alog.md.

Step 6: Review the commit graph

Run: git log --oneline feature/audit-log-docs ^main Expected: 15 commits, one each for Tasks 1–15 (Task 0 makes no commit).

Step 7: Final commit (only if any fix-ups needed)

If grep finds any issue, fix it and commit with docs(audit): cross-reference cleanup. Otherwise no commit at this task.


Task 17: Merge to main (optional, on user request only)

Files:

  • None — git operation only.

Step 1: Confirm with user

Per CLAUDE.md and harness policy, do not push or merge to main without explicit user instruction. This task documents the option but does not execute automatically.

Step 2: If user requests merge

git switch main
git merge --no-ff feature/audit-log-docs -m "Merge feature/audit-log-docs: centralized audit log design"

Step 3: If user requests push

git push origin main

(or push the feature branch instead — operator's call).


Execution Notes

  • Tasks 2–14 are mostly independent of each other once Task 1 is done. They are suitable for parallel execution via the subagent-driven-development sub-skill: one fresh subagent per task, with review between commits.
  • Tasks 15 and 16 must run last (Task 15 is the CLAUDE.md rollup; Task 16 is verification).
  • Task 0 must run first (branch prep).
  • Total: 18 tasks (Tasks 0–17), ~15 commits, ~250–400 lines of new prose in Component-AuditLog.md plus smaller per-component additions.
  • Spec is alog.md (commit fec0bb1); every task cites the relevant section.