docs(audit): add SourceNode column to AuditLog/Notifications/SiteCalls design + plan

- Adds SourceNode varchar(64) NULL to AuditLog, Notifications, and SiteCalls
  tables with role-name semantics: node-a/node-b for site rows (qualified by
  SourceSiteId), central-a/central-b for central direct-write rows.
- New IX_AuditLog_Node_Occurred (SourceNode, OccurredAtUtc) index.
- Reframes CLAUDE.md from documentation-only to implementation project.
- Adds docs/plans/2026-05-23-audit-source-node.md + tasks.json companion.
Joseph Doherty
2026-05-23 15:34:44 -04:00
parent e3345a0fc1
commit 9e5e32d0f2
6 changed files with 986 additions and 11 deletions


@@ -86,6 +86,7 @@ row per lifecycle event across all channels.
| `ExecutionId` | `uniqueidentifier` NULL | The originating script execution / inbound request — the universal per-run correlation value; distinct from `CorrelationId`, which is the per-operation lifecycle id. Stamped on *every* audit row emitted by one execution. |
| `ParentExecutionId` | `uniqueidentifier` NULL | The `ExecutionId` of the execution that *spawned* this run — the cross-execution correlation pointer. Set on every row of an inbound-API-routed site script run (= the inbound request's `ExecutionId`); NULL for a top-level run (inbound, tag-change / timer-triggered, un-bridged). |
| `SourceSiteId` | `varchar(64)` NULL | NULL for central-originated events. |
| `SourceNode` | `varchar(64)` NULL | The cluster node on which the event was emitted — `node-a` / `node-b` for site rows (qualified by `SourceSiteId`), `central-a` / `central-b` for central-originated rows. Nullable so reconciled rows from a node that has since been retired don't block ingest. |
| `SourceInstanceId` | `varchar(128)` NULL | Instance whose script initiated the action (when applicable). |
| `SourceScript` | `varchar(128)` NULL | Script name within the instance. |
| `Actor` | `varchar(128)` NULL | Inbound API: API key name. Outbound: script identity. Central: system user. |
@@ -104,6 +105,7 @@ row per lifecycle event across all channels.
- `IX_AuditLog_OccurredAtUtc` — primary time-range index for global scans.
- `IX_AuditLog_Site_Occurred (SourceSiteId, OccurredAtUtc)` — per-site filters.
- `IX_AuditLog_Node_Occurred (SourceNode, OccurredAtUtc)` — per-node filters ("everything `central-a` did in window X", or pinning a misbehaving site node).
- `IX_AuditLog_CorrelationId (CorrelationId)` — drilldown from a single operation.
- `IX_AuditLog_Execution (ExecutionId)` — drilldown to every action of one script execution / inbound request.
- `IX_AuditLog_ParentExecution (ParentExecutionId)` — cross-execution drilldown: the downward leg of the execution-tree walk seeks on it (`ParentExecutionId = ancestor.ExecutionId`), and it backs the `parentExecutionId` filter.
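The per-node filter that `IX_AuditLog_Node_Occurred` is meant to serve can be sketched as below. This is a minimal illustration only: an in-memory SQLite table stands in for the central store, only four columns are modelled, and the helper `events_for_node`, the site id `site-01`, and the sample rows are all hypothetical.

```python
import sqlite3

# In-memory SQLite stands in for the central store (an assumption for
# illustration); only the columns needed for the per-node filter are modelled.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE AuditLog (
        EventId       TEXT PRIMARY KEY,
        OccurredAtUtc TEXT NOT NULL,
        SourceSiteId  TEXT NULL,
        SourceNode    TEXT NULL
    );
    CREATE INDEX IX_AuditLog_Node_Occurred
        ON AuditLog (SourceNode, OccurredAtUtc);
""")

# Hypothetical sample rows: central rows carry a NULL SourceSiteId,
# the site row is qualified by its SourceSiteId.
rows = [
    ("e1", "2026-05-23T10:00:00Z", None, "central-a"),
    ("e2", "2026-05-23T11:00:00Z", None, "central-b"),
    ("e3", "2026-05-23T12:00:00Z", "site-01", "node-a"),
    ("e4", "2026-05-23T13:00:00Z", None, "central-a"),
    ("e5", "2026-05-24T09:00:00Z", None, "central-a"),
]
conn.executemany("INSERT INTO AuditLog VALUES (?, ?, ?, ?)", rows)


def events_for_node(conn, node, start_utc, end_utc):
    """Return EventIds emitted by `node` in the half-open window [start, end)."""
    cur = conn.execute(
        "SELECT EventId FROM AuditLog "
        "WHERE SourceNode = ? AND OccurredAtUtc >= ? AND OccurredAtUtc < ? "
        "ORDER BY OccurredAtUtc",
        (node, start_utc, end_utc),
    )
    return [r[0] for r in cur.fetchall()]


# "Everything central-a did on 2026-05-23"
central_a_events = events_for_node(
    conn, "central-a", "2026-05-23T00:00:00Z", "2026-05-24T00:00:00Z"
)
```

Because site role names repeat across sites, a query that pins a misbehaving site node would presumably also filter on `SourceSiteId`.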
@@ -172,7 +174,9 @@ generalises to it with no schema change once that spawn point is threaded.
A SQLite database file on each site node, alongside the Store-and-Forward
buffer. Same schema as central minus `IngestedAtUtc` (irrelevant at the source),
plus a `ForwardState` column with values `Pending | Forwarded | Reconciled` that
-drives the telemetry loop and reconciliation pull.
+drives the telemetry loop and reconciliation pull. `SourceNode` is stamped by the
+writing node itself (`node-a` / `node-b`) at append time and travels with the row
+through telemetry and reconciliation unchanged.
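A minimal sketch of append-time stamping on a site node, assuming a much-reduced schema. The names `LOCAL_NODE`, `open_site_store`, `append_event`, and `take_pending` are hypothetical, and marking rows `Forwarded` at hand-off time is an assumption about when that transition happens, not something the design states.

```python
import sqlite3
import uuid

LOCAL_NODE = "node-a"  # hypothetical: the writing node's own role name


def open_site_store():
    """Create an in-memory stand-in for the site SQLite audit store."""
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE AuditLog (
            EventId       TEXT PRIMARY KEY,
            OccurredAtUtc TEXT NOT NULL,
            SourceSiteId  TEXT NULL,
            SourceNode    TEXT NULL,
            ForwardState  TEXT NOT NULL DEFAULT 'Pending'
                CHECK (ForwardState IN ('Pending', 'Forwarded', 'Reconciled'))
        )
    """)
    return conn


def append_event(conn, site_id, occurred_at_utc):
    """Append one row, stamping SourceNode with the local node's role name."""
    event_id = str(uuid.uuid4())
    conn.execute(
        "INSERT INTO AuditLog (EventId, OccurredAtUtc, SourceSiteId, SourceNode) "
        "VALUES (?, ?, ?, ?)",
        (event_id, occurred_at_utc, site_id, LOCAL_NODE),
    )
    return event_id


def take_pending(conn):
    """Read pending rows for telemetry and mark them Forwarded.

    SourceNode is read back exactly as it was stamped; nothing rewrites it.
    """
    rows = conn.execute(
        "SELECT EventId, OccurredAtUtc, SourceSiteId, SourceNode "
        "FROM AuditLog WHERE ForwardState = 'Pending' ORDER BY OccurredAtUtc"
    ).fetchall()
    conn.executemany(
        "UPDATE AuditLog SET ForwardState = 'Forwarded' WHERE EventId = ?",
        [(r[0],) for r in rows],
    )
    return rows


site = open_site_store()
eid = append_event(site, "site-01", "2026-05-23T12:00:00Z")  # hypothetical site id
batch = take_pending(site)
```

The point of the sketch is that the forwarding path only copies `SourceNode`; the value is decided once, by the node that wrote the row.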
**Site SQLite retention rule (hard invariant):**
@@ -233,7 +237,9 @@ instead. The Notification Outbox dispatcher writes
`Notification.NotifyDeliver` with `Status=Attempted` per delivery attempt and
`Notification.NotifyDeliver` with `Status=Delivered`/`Parked`/`Discarded` on
terminal status. Central direct-writes use the same insert-if-not-exists
-semantics keyed on `EventId`.
+semantics keyed on `EventId`. `SourceSiteId` is NULL on all central direct-write
+rows; `SourceNode` is stamped to the local central node's role name
+(`central-a` / `central-b`).
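The central direct-write rule can be sketched as follows, assuming a reduced table and using SQLite's `INSERT OR IGNORE` as a stand-in for whatever insert-if-not-exists mechanism the central store actually uses. The `Kind` column name and the helpers `open_central_store` and `direct_write` are hypothetical.

```python
import sqlite3


def open_central_store():
    """Create an in-memory stand-in for the central audit table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE AuditLog (
            EventId      TEXT PRIMARY KEY,
            Kind         TEXT NOT NULL,   -- hypothetical column name
            Status       TEXT NOT NULL,
            SourceSiteId TEXT NULL,
            SourceNode   TEXT NULL
        )
    """)
    return conn


def direct_write(conn, event_id, kind, status, local_node):
    """Insert-if-not-exists keyed on EventId.

    SourceSiteId is always NULL for central direct-writes; SourceNode is the
    local central node's role name. Returns True if a row was inserted and
    False if the EventId already existed.
    """
    cur = conn.execute(
        "INSERT OR IGNORE INTO AuditLog "
        "(EventId, Kind, Status, SourceSiteId, SourceNode) "
        "VALUES (?, ?, ?, NULL, ?)",
        (event_id, kind, status, local_node),
    )
    return cur.rowcount == 1


central = open_central_store()
first = direct_write(central, "evt-1", "Notification.NotifyDeliver", "Attempted", "central-a")
# A retried write with the same EventId is ignored, even from the other node.
second = direct_write(central, "evt-1", "Notification.NotifyDeliver", "Attempted", "central-b")
stored = central.execute(
    "SELECT SourceSiteId, SourceNode FROM AuditLog WHERE EventId = 'evt-1'"
).fetchone()
```

In this sketch the first writer wins, so the stored `SourceNode` remains that of the node that originally emitted the event.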
## Cached Operations — Combined Telemetry