File-based, encrypted bundle export/import via the Central UI for
promoting templates, system artifacts, and central-only configuration
across environments. Site-scoped artifacts are excluded. Conflicts are
resolved per artifact; import is config-only (the user redeploys via the
existing Deployments page). Per-entity audit rows are correlated by
BundleImportId.
Tidy-ups flagged by code review on the T6/T7/T8 migration bundle:
- Add `.IsUnicode(false)` to the three SourceNode EF property mappings to
match every other ASCII varchar column on the same entities. Physical
column was already `varchar(64)` because `HasColumnType` wins, but the EF
model metadata flag was inconsistent.
- Add `unicode: false` to the three AddColumn<string> calls in the migrations
+ their Designer snapshots so the historical snapshots match the model.
- Update the model snapshot to carry IsUnicode(false) on each SourceNode entry.
- Document the SELECT-list invariant on SiteCallAuditRepository.QueryAsync:
EF Core's FromSqlInterpolated requires the result set to include every
mapped column of the entity, so future SiteCall columns must be added to
the list too.
- Amend plan Task 6 Step 2 to document the partition-aligned raw-SQL index
recipe and the staging-table sync requirement.
- Adds SourceNode varchar(64) NULL to AuditLog, Notifications, and SiteCalls
tables with role-name semantics: node-a/node-b for site rows (qualified by
SourceSiteId), central-a/central-b for central direct-write rows.
- New IX_AuditLog_Node_Occurred (SourceNode, OccurredAtUtc) index.
- Reframes CLAUDE.md from documentation-only to implementation project.
- Adds docs/plans/2026-05-23-audit-source-node.md + tasks.json companion.
The design doc claimed (in two places) that InboundAuthFailure rows
were excluded from the inbound full-body carve-out — but the actual
implementation gates the carve-out on Channel == ApiInbound, NOT Kind.
Every audit row the InboundAPI middleware emits (whether
Kind = InboundRequest or Kind = InboundAuthFailure) carries
Channel = ApiInbound, so both Kinds receive the inbound ceiling. That
is the intended behaviour: an auth-failure row's request body is
exactly the body the operator wants to see in full when investigating
a rejected request.
Update both occurrences (Decision block + Not in Scope block) to say
the carve-out applies to all Channel = ApiInbound rows regardless of
Kind. Pure documentation change — no code drift.
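
A minimal sketch of the gating rule described above, in Python for
illustration only. The function name and the plain-string values are
hypothetical stand-ins for the real enums; the point is that the decision
reads Channel alone and Kind never enters the check.

```python
def applies_inbound_carve_out(channel: str, kind: str) -> bool:
    """Return True when a row receives the inbound full-body ceiling.

    Assumption: channel and kind are plain strings here; the real code
    uses enums. `kind` is accepted only to show that it is ignored.
    """
    return channel == "ApiInbound"
```

Both InboundRequest and InboundAuthFailure rows pass this check because
both carry Channel = ApiInbound.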
Plan companion to the 2026-05-23 design doc. Seven tasks (#0 prep, #1-3
implementation TDD, #4-5 doc updates, #6 final sweep). Progress is
tracked in .tasks.json for resumability.
Carve-out from Payload Capture Policy: ApiInbound rows capture
RequestSummary and ResponseSummary in full up to a configurable 1 MB
per-body ceiling (AuditLog:InboundMaxBytes), instead of the global 8 KB /
64 KB caps. No schema change; existing redaction (headers + per-target
body redactors) still applies before persistence.
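
A rough sketch of how the ceilings could be selected and applied, in
Python for illustration only. Assumptions not stated in the commit: the
8 KB figure applies to ordinary rows and the 64 KB figure to error rows,
sizes are counted in UTF-8 bytes with 1 KB = 1024 bytes, and all function
names are hypothetical.

```python
DEFAULT_CAP = 8 * 1024            # assumed: global cap for ordinary rows
ERROR_ROW_CAP = 64 * 1024         # assumed: global cap for error rows
INBOUND_MAX_BYTES = 1024 * 1024   # stand-in for AuditLog:InboundMaxBytes


def cap_for(channel, is_error_row, inbound_max_bytes=INBOUND_MAX_BYTES):
    """Pick the per-body byte ceiling for a row."""
    if channel == "ApiInbound":
        return inbound_max_bytes  # carve-out: inbound rows use their own ceiling
    return ERROR_ROW_CAP if is_error_row else DEFAULT_CAP


def truncate_utf8(body, max_bytes):
    """Cut body to at most max_bytes of UTF-8 without splitting a character."""
    raw = body.encode("utf-8")
    if len(raw) <= max_bytes:
        return body
    # errors="ignore" drops a trailing partial multi-byte sequence.
    return raw[:max_bytes].decode("utf-8", errors="ignore")
```

In this sketch redaction is expected to have run already, so truncation
only ever sees the redacted body.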
The appsettings example used AuthMode 'None', which the delivery code
(MailKitSmtpClientWrapper) rejects — only Basic and OAuth2 are valid.
Switch to a working Basic config with Credentials and TlsMode None, and
document that Server must be the container name scadalink-smtp when the
Notification Service runs inside the docker cluster.
M7 head records M6 realities:
- IAuditCentralHealthSnapshot exists; M7 dashboard reads it.
- SiteHealthReport.SiteAuditBacklog ready for per-site tiles.
- IAuditLogRepository.QueryAsync is the page's data source.
- Pre-existing AuditLog.razor rename to ConfigurationAuditLog.razor
needs verification.
- OperationalAudit + AuditExport permission strings need to exist.
- Real gRPC pull client still deferred; doesn't gate M7.
6 bundles: proto+site handler, reconciliation actor, purge actor with
drop-and-rebuild around UX index, partition maintenance, four health
metrics, integration tests. M5 realities baked in.
M6 head records M5 realities:
- IOptionsMonitor hot-reload pattern verified; M6 retention config can
reuse.
- AuditRedactionFailure counter site-only in M5; M6 wires central side.
- Filter integration is at 3 writer entry points; purge actor doesn't
emit so no filter integration needed.
- SwitchOutPartitionAsync drop-and-rebuild dance required (M1 reality
+ M6-T4 already documents it).
- M6 should land the real ISiteStreamAuditClient (Option A) so push
telemetry leaves NoOp behind.
4 bundles: filter+truncation, redactors (header/body/SQL-param), wire
into all emission paths + health metric, config+perf+safety-net.
Vocabulary translation locked: error-row cap (64 KB) on Status NOT IN
(Delivered, Submitted, Forwarded). Filter integration point in each
writer (FallbackAuditWriter, CentralAuditWriter, AuditLogIngestActor)
BEFORE storage call.
M5 head records M4 realities:
- AuditingDbConnection/Command/DataReader decorators need filter plug-in
at WriteAsync emission point.
- CentralAuditWriter + FallbackAuditWriter are both filter integration
points for the direct-write + chained-write paths.
- InboundAPI middleware RequestSummary populated, ResponseSummary=null
pending response-body buffering decision in M5.
- UseWhen(/api/) path-scoped middleware gives natural per-target
redaction hook.
- Error-row cap raised on Status IN (Failed, Parked, Discarded,
Attempted, Skipped) per M1 vocab reconciliation.
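
The Status IN list above is the complement of (Delivered, Submitted,
Forwarded) within the eight M1 AuditStatus values. A small Python sketch
of that set arithmetic, for illustration only; the names are hypothetical
and the statuses are modelled as plain strings rather than the real enum.

```python
ALL_STATUSES = frozenset({
    "Submitted", "Forwarded", "Attempted", "Delivered",
    "Failed", "Parked", "Discarded", "Skipped",
})
NON_ERROR_STATUSES = frozenset({"Delivered", "Submitted", "Forwarded"})

# Rows whose status is outside the non-error set get the raised cap.
ERROR_ROW_STATUSES = ALL_STATUSES - NON_ERROR_STATUSES


def gets_error_row_cap(status):
    """Return True when a row with this status uses the error-row cap."""
    if status not in ALL_STATUSES:
        raise ValueError(f"unknown status: {status}")
    return status in ERROR_ROW_STATUSES
```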
5 bundles: DB sync emissions, NotificationOutbox central, site Notify.Send,
Inbound API middleware, integration tests. M3-reality vocab baked in
(DbWrite/NotifyDeliver/NotifySend/InboundRequest/InboundAuthFailure).
M4 head now records M3 realities:
- Vocabulary translation table from pre-M1 spec strings to M1-aligned
enum values (DbWrite vs SyncWrite/SyncRead; NotifyDeliver vs
Notification.Attempt/Terminal; InboundRequest/InboundAuthFailure vs
ApiInbound.Completed; Failed vs PermanentFailure).
- Mapper consolidation: 4 DTO mappers exist; extract single helper
before M4 adds more channels.
- OnCachedTelemetryWithoutDualWriteAsync test-mode fallback may be
deprecated in M4.
- Site SQLite drain for OperationTrackingStore: only dual-write
transaction writes central today; plan drain if M4 needs in-flight
tracking visibility.
- SiteCallAuditActor wired but unused on M3 hot path; M4/M6 natural
first direct caller.
M3 head now records M2 realities:
- enum vocabulary (M1-aligned) drives CachedSubmit/ApiCallCached/etc.
- NoOpSiteStreamAuditClient stays until M6; M3 e2e tests reuse Bundle H's
DirectActorSiteStreamAuditClient (extract to Integration/Infrastructure/).
- Mapper duplication note (gRPC handler inlines DTO->entity decoding;
consider moving AuditEventMapper to Commons in M3).
- AuditIngestAskTimeout=30s hardcoded; M3 may expose via options.
- CachedCallTelemetry message MUST be created from scratch (additive
per Commons REQ-COM-5a; never renamed CachedOperationTelemetry).
- Central dual-write AuditLog + SiteCalls in one tx; reuse Bundle A
duplicate-key swallow pattern for CachedCallId.
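
The duplicate-key swallow pattern mentioned in the last bullet can be
sketched as follows. This is an illustration in Python against an
in-memory SQLite table, not the project's SQL Server code; the table,
column, and function names are hypothetical.

```python
import sqlite3


def insert_if_not_exists(conn, cached_call_id, payload):
    """Insert a row; treat a duplicate key as 'already recorded'.

    Returns True when this call inserted the row, False when another
    writer got there first.
    """
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO site_calls (cached_call_id, payload) VALUES (?, ?)",
                (cached_call_id, payload),
            )
        return True
    except sqlite3.IntegrityError:
        # Duplicate key: the row already exists, so swallow the error.
        return False


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE site_calls (cached_call_id TEXT PRIMARY KEY, payload TEXT)"
)
```

Letting the unique key arbitrate avoids the check-then-insert race, since
two concurrent writers cannot both succeed.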
- M2 head: honor M1 vocabulary (ApiCall/Delivered), harden InsertIfNotExistsAsync
(race window — first concurrent writer arrives in M2), add keyset-tiebreaker
test (Bundle D reviewer's deferred recommendation), reuse MsSqlMigrationFixture
+ Xunit.SkippableFact pattern.
- M6-T4 (AuditLogPurgeActor): replace M1's NotSupportedException stub with the
drop-and-rebuild dance for the non-aligned UX_AuditLog_EventId unique index;
acknowledge the small outage window during partition SWITCH.
- M6-T5 (partition maintenance): note M1 ships 24 monthly boundaries (Jan 2026 -
Dec 2027); service rolls the function forward via SPLIT RANGE.
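
For illustration, the 24 monthly boundaries can be enumerated with a few
lines of Python. This assumes each boundary falls on the first day of its
month; the actual boundary values and the RANGE LEFT/RIGHT choice are not
stated here, and the function name is hypothetical.

```python
from datetime import date


def monthly_boundaries(start_year, start_month, count):
    """Return `count` consecutive first-of-month dates."""
    out = []
    year, month = start_year, start_month
    for _ in range(count):
        out.append(date(year, month, 1))
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return out


boundaries = monthly_boundaries(2026, 1, 24)
```

Rolling forward then amounts to appending the next first-of-month value
once the last boundary comes into range.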
The M1 implementation (Bundle A) committed concrete AuditChannel /
AuditKind / AuditStatus enums that reflect CLAUDE.md's locked
cached-call lifecycle decisions. The older alog.md and
Component-AuditLog.md narratives still used pre-M1 vocabulary
(Success / TransientFailure / PermanentFailure / Enqueued / Retrying /
SyncCall / CachedEnqueued / Attempt / Terminal / Completed). This
commit reconciles both docs to the M1 vocabulary:
AuditChannel : ApiOutbound, DbOutbound, Notification, ApiInbound
AuditKind (10): ApiCall, ApiCallCached, DbWrite, DbWriteCached,
NotifySend, NotifyDeliver, InboundRequest,
InboundAuthFailure, CachedSubmit, CachedResolve
AuditStatus(8): Submitted, Forwarded, Attempted, Delivered, Failed,
Parked, Discarded, Skipped
Updates:
- Status column description + worked examples use the new 8 values.
- Kind table flattened from per-channel groupings to a single flat
list of the 10 discriminators (no more SyncCall / Cached* /
Attempt / Terminal / Completed).
- Cached-call lifecycle examples rewritten to the
CachedSubmit -> Forwarded -> Attempted... -> CachedResolve shape.
- Notification lifecycle examples rewritten to
NotifySend(Submitted) -> NotifyDeliver(Attempted) ->
NotifyDeliver(Delivered/Parked/Discarded).
- Inbound API examples split into InboundRequest (success path) and
InboundAuthFailure (401 path).
- 'Errors only' UI toggle, audit-error-rate KPI, and payload-cap
decision (#6 in §16) all switched from 'non-Success' to
Status IN ('Failed', 'Parked', 'Discarded').
- Per-site event-rate table in §13.1 renamed to the new kinds.
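
A small Python sketch of the error predicate used by the 'Errors only'
toggle and the audit-error-rate KPI. It assumes the KPI is a simple ratio
of error rows to all rows in the sampled window, which is not spelled out
in this commit; the names are hypothetical.

```python
ERROR_STATUSES = frozenset({"Failed", "Parked", "Discarded"})


def is_error(status):
    """Predicate behind Status IN ('Failed', 'Parked', 'Discarded')."""
    return status in ERROR_STATUSES


def audit_error_rate(statuses):
    """Fraction of rows that count as errors; 0.0 for an empty window."""
    statuses = list(statuses)
    if not statuses:
        return 0.0
    return sum(1 for s in statuses if is_error(s)) / len(statuses)
```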
Pure design correction; no operational behavior change. Per the
goal-prompt invariant #6, alog.md may change when a design correction
is committed before the affected code change — this commit is that
correction, landed ahead of the M1 merge so the merge order reads
design-first, code-second.
No code, test, or infra file changes.
Bundles A-F per cadence memory. Brainstorm decisions locked:
infra/mssql test harness, single AuditEvent record (nullable IngestedAtUtc
+ ForwardState), PRIMARY filegroup, explicit index names.
Per user request: every milestone now carries bite-sized TDD tasks
(write failing test -> run failing -> implement -> run passing -> commit),
matching M1's density. Each task lists exact file paths, numbered steps,
and a commit message.
Task counts per milestone:
- M1 Foundation: 11
- M2 Site pipeline (sync-only): 12
- M3 Cached operations + dual-write (inlines #22 + cached-call tracking): 18
- M4 Remaining boundary emission: 12
- M5 Payload + redaction policy: 10
- M6 Reconciliation, purge, partition maintenance, metrics: 12
- M7 Central UI: 16
- M8 CLI: 9
Total: ~100 bite-sized tasks.
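
The total can be checked from the per-milestone counts above with a
throwaway Python snippet (the dictionary name is illustrative):

```python
TASKS_PER_MILESTONE = {
    "M1": 11, "M2": 12, "M3": 18, "M4": 12,
    "M5": 10, "M6": 12, "M7": 16, "M8": 9,
}

total_tasks = sum(TASKS_PER_MILESTONE.values())
```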
The roadmap remains the contract; per-milestone execution still goes
through brainstorm -> writing-plans -> subagent-driven-development to
produce a milestone-specific .tasks.json. Tasks in this roadmap will
shift slightly as M1 reveals codebase realities; treat them as the
intended shape rather than immutable IDs.
Roadmap covering Audit Log (#23) code implementation across 8 milestones
(M1 Foundation → M8 CLI). Reflects the actual state of the codebase —
all 22 prior components have source + tests, but Site Call Audit (#22)
and cached-call tracking are design-only despite being on main; their
minimum surface is inlined into M3.
M1 is laid out at full TDD-level task detail (11 bite-sized tasks).
M2–M8 are at milestone-shape detail (goals, files, task headlines,
acceptance criteria, risk callouts). Per-milestone bite-sized plans
will be generated by brainstorm + writing-plans when each milestone is
about to execute; most of 80 task cards locked now would be stale by
M5 as M1 reveals codebase realities.
Critical path: M1 → M2 → (M3 ∥ M4 ∥ M5) → M6 → (M7 ∥ M8).
Spec: docs/requirements/Component-AuditLog.md + alog.md (commit
fec0bb1).
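
The critical path above can be read as a dependency graph. A minimal
Python sketch, assuming each parallel group depends on everything in the
stage before it, groups the milestones into stages that may run
concurrently. It uses the standard-library graphlib module (Python 3.9+);
the `stages` helper is hypothetical.

```python
from graphlib import TopologicalSorter

# milestone -> set of milestones it depends on
DEPENDS_ON = {
    "M1": set(),
    "M2": {"M1"},
    "M3": {"M2"}, "M4": {"M2"}, "M5": {"M2"},
    "M6": {"M3", "M4", "M5"},
    "M7": {"M6"}, "M8": {"M6"},
}


def stages(depends_on):
    """Group nodes into successive stages of mutually independent work."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()
    result = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything unblocked right now
        result.append(ready)
        sorter.done(*ready)
    return result
```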
Final cross-bundle reviewer identified 7 inconsistencies that the per-bundle
reviewers couldn't see; all fixed in one logical commit.
Critical:
- HighLevelReqs AL-3: drop 'then upsert-on-newer-status'. AuditLog is
strictly append-only; upsert is correct for SiteCalls/Notifications but
wrong for the immutable AuditLog shadow.
- Component-AuditLog Error rate KPI: align with HealthMonitoring's
exclusion list (Success/Delivered/Enqueued) rather than just non-Success;
otherwise every Delivered notification or Enqueued cached call would be
counted as an error.
Important:
- Component-AuditLog line 154: ISiteAuditWriter -> IAuditWriter (canonical
name per Commons and the rest of this doc).
- Component-AuditLog Central direct-write paragraph: convert remaining
slash notation (ApiInbound/Completed, Notification/Attempt,
Notification/Terminal) to dot notation used everywhere else.
- Component-ClusterInfrastructure: scope SiteCallAuditActor to
reconciliation + KPIs + Retry/Discard relay; cached-telemetry ingest is
AuditLogIngestActor's role per Combined Telemetry contract.
- Component-CentralUI Audit Log page: state the OperationalAudit read
permission and the read-vs-export split (matching CLI doc).
- Component-NotificationOutbox: add never-fail-the-action invariant for
dispatcher audit writes.
Minor:
- Component-InboundAPI: 'Non-blocking semantics' was ambiguous (could be
read as async); reword to 'Fail-soft' — the write is still synchronous
before flush, but failures are caught and don't change the response.
- Component-CLI: realign audit-query/audit-export flags to actually match
the Central UI Audit Log filter set (channel, kind, status, site,
instance, target, actor, correlation-id, errors-only); drop --user and
--entity-id which are IAuditService concepts, not Audit Log columns.
- Component-AuditLog KPI tile names: 'Volume/Error rate/Backlog' ->
'Audit volume/Audit error rate/Audit backlog' (matches Central UI and
Health Monitoring); drop the two orphan KPIs (Top inbound callers, Top
outbound 5xx) that were never surfaced anywhere.
- Component-AuditLog Interactions: re-attribute DbOutbound emissions to
ESG (where Database.* lives) with a note that Site Runtime is the API
surface for scripts.
- HighLevelReqs AL-12: drop 'and reconciliation operations' (CLI has no
reconcile command; reconciliation is an internal self-healing pull).
Add note that verify-chain becomes operational once AL-11's hash chain
ships.
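
The 'Fail-soft' wording adopted for Component-InboundAPI, and the
never-fail-the-action invariant added to Component-NotificationOutbox,
describe the same shape: the audit write runs synchronously, but an
audit failure is caught and never alters the outcome. A minimal Python
sketch of that shape, with hypothetical names throughout:

```python
def respond_with_audit(build_response, write_audit, on_audit_error):
    """Build a response, audit it synchronously, and never let audit fail it."""
    response = build_response()
    try:
        # Synchronous: the write completes (or fails) before we return.
        write_audit(response)
    except Exception as exc:
        # Fail-soft: record the failure, leave the response untouched.
        on_audit_error(exc)
    return response
```

A caller would pass a callback that records the failure, for example one
that increments a health counter, so that swallowed audit errors remain
visible.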