- AddColumnIfMissing is now shared by ExecutionId and ParentExecutionId;
drop the ExecutionId-specific tag.
- AuditLogRepository.GetExecutionTreeAsync doc no longer hardcodes the
MAXRECURSION literal; reference the ExecutionChainMaxDepth const instead.
Adds three KPI tiles to the central Health dashboard for the Audit channel:
volume (rows in the last hour), error rate (Failed/Parked/Discarded over
total), and backlog (sum of SiteAuditBacklog.PendingCount across all sites).
Repo + service:
- IAuditLogRepository.GetKpiSnapshotAsync(window, nowUtc) — single aggregate
SELECT over the trailing window returning total + error counts; nowUtc is
optional for production callers and pinned by integration tests against the
shared MSSQL fixture so the global counts are deterministic.
- AuditLogQueryService.GetKpiSnapshotAsync() — composes the repo aggregate
with a sum of SiteAuditBacklog.PendingCount read from ICentralHealthAggregator.
- AuditLogKpiSnapshot record in Commons/Types/.
UI:
- New AuditKpiTiles Blazor component (Components/Health/) — three Bootstrap
card-tiles, click navigates to /audit/log with the matching pre-filter.
- Health.razor wires the tiles in alongside the existing Notification Outbox
KPIs; LoadAuditKpis() runs on every 10s refresh tick and degrades to em
dashes + inline error if the query fails.
- AuditLogPage extended to parse ?status= so the error-rate tile drill-in
(?status=Failed) auto-loads the grid.
Tests:
- AuditLogRepositoryTests: GetKpiSnapshotAsync mixed-status + empty-window
cases against the MSSQL migration fixture.
- AuditLogQueryServiceTests: forwarding + backlog composition; sites with
null SiteAuditBacklog contribute zero.
- AuditKpiTilesTests: 9 bUnit tests covering tile render, error-rate maths
with safe zero-events handling, em-dash unavailable path, click-through
navigation, and warning/danger border thresholds.
- HealthPageTests: new Renders_AuditKpiTiles_WithValues plus IAuditLogQueryService
stub registration in the constructor so existing outbox tests still pass.
- AuditLogPageScaffoldTests: ?status=Failed auto-load + unknown status drop.
Central singleton (M6-T4 Bundle C) that drives the daily AuditLog partition
purge. On a configurable timer (default 24 hours) the actor:
1. Queries IAuditLogRepository.GetPartitionBoundariesOlderThanAsync for
monthly boundaries whose latest OccurredAtUtc is older than
DateTime.UtcNow - AuditLogOptions.RetentionDays.
2. For each eligible boundary calls SwitchOutPartitionAsync, which runs
the drop-and-rebuild dance around UX_AuditLog_EventId.
3. Publishes AuditLogPurgedEvent(boundary, rowsDeleted, durationMs) on
the actor-system EventStream so the Bundle E central health collector
and ops surfaces can subscribe without coupling to this actor.
Co-changes:
* SwitchOutPartitionAsync returns long (rows deleted) — sampled BEFORE the
switch via COUNT_BIG over the per-partition filter so the count
reflects what the switch removed, not a post-purge scan of a table that
no longer exists. All stub implementations updated.
* AuditLogPurgeOptions: IntervalHours (default 24), IntervalOverride for
tests, Interval property resolving either.
* AuditLogPurgedEvent: record with MonthBoundary, RowsDeleted, DurationMs.
Behavior:
* Continue-on-error per boundary — one partition that throws does NOT
abandon the rest of the tick.
* DI scope opened per tick (IAuditLogRepository is a SCOPED EF Core
service); mirrors SiteAuditReconciliationActor and AuditLogIngestActor.
* SupervisorStrategy Resume keeps the singleton alive across leaked
exceptions.
* EventStream capture BEFORE the first await — Context is unsafe after
await in async receive handlers (same pattern as Sender-capture in
AuditLogIngestActor.OnIngestAsync).
Tests:
* Tick_Fires_OnDailyInterval — visible timer side effect.
* Tick_OldPartitions_SwitchedOut — both seeded boundaries purged.
* Tick_NewerPartitions_Untouched — empty enumerator → no switches.
* Tick_PublishesPurgedEvent_WithRowCount — AuditLogPurgedEvent carries
RowsDeleted and DurationMs.
* Tick_SwitchThrows_OtherPartitionsStillProcessed — continue-on-error.
* Threshold_UsesAuditLogOptionsRetentionDays — non-default 30-day window
computed from UtcNow - RetentionDays.
* EndToEnd_RealPartition_RowsRemoved_PurgedEventPublished — TestKit +
MsSqlMigrationFixture: real partitioned table, Jan-2026 row purged,
Apr-2026 row kept, AuditLogPurgedEvent observed via probe.
Replaces M1's NotSupportedException stub with the production drop-DROP-INDEX
→ CREATE-staging → SWITCH PARTITION → DROP-staging → CREATE-INDEX dance
documented in alog.md §4. UX_AuditLog_EventId is intentionally non-aligned
with ps_AuditLog_Month so single-column EventId uniqueness can be enforced
cheaply for InsertIfNotExistsAsync; SQL Server rejects ALTER TABLE SWITCH
while a non-aligned unique index is present, so the implementation drops
it, switches the partition data into a GUID-suffixed staging table on
[PRIMARY], drops staging (discarding the rows), and rebuilds the unique
index — all inside an explicit transaction with a CATCH that guarantees
the unique index is rebuilt regardless of failure point.
Also adds GetPartitionBoundariesOlderThanAsync to IAuditLogRepository: a
CROSS APPLY over sys.partition_range_values + per-partition MAX(OccurredAtUtc)
to enumerate retention-eligible months for the M6 purge actor (next commit).
Tests verify:
* Old partition's rows are removed; other months untouched
* UX_AuditLog_EventId is rebuilt after a successful switch
* InsertIfNotExistsAsync's first-write-wins idempotency still holds after switch
* On engineered SWITCH failure (inbound FK from a probe table), SqlException
propagates AND UX_AuditLog_EventId is still present (CATCH branch ran)
* GetPartitionBoundariesOlderThanAsync returns only boundaries whose partition's
MAX(OccurredAtUtc) is strictly older than the threshold; empty partitions
excluded
Bundle B3 of Audit Log #23 M3: data-access layer for the central SiteCalls
table introduced in B1+B2. UpsertAsync is insert-if-not-exists then
monotonic-status update so out-of-order telemetry, duplicate gRPC packets,
and reconciliation pulls all converge on the same row without rolling
state backward.
- src/ScadaLink.Commons/Interfaces/Repositories/ISiteCallAuditRepository.cs:
UpsertAsync (monotonic), GetAsync, QueryAsync, PurgeTerminalAsync.
- src/ScadaLink.Commons/Types/Audit/SiteCallQueryFilter.cs +
SiteCallPaging.cs: filter (Channel/SourceSite/Status/Target/time range)
and keyset paging cursor on (CreatedAtUtc DESC, TrackedOperationId DESC),
mirrored on M1's AuditLog* equivalents.
- src/ScadaLink.ConfigurationDatabase/Repositories/SiteCallAuditRepository.cs:
raw-SQL InsertIfNotExists + conditional UPDATE with inline CASE rank
compare (Submitted=0, Forwarded=1, Attempted/Skipped=2, terminal=3 —
terminal statuses are mutually exclusive so e.g. Delivered cannot
overwrite Parked). Duplicate-key violations (SQL 2601/2627) are
swallowed at Debug, identical to AuditLogRepository's race-fix.
QueryAsync uses FromSqlInterpolated because EF Core 10 cannot translate
string.Compare against the value-converted TrackedOperationId column
inside an expression tree.
- ServiceCollectionExtensions wires the repository (scoped, after
IAuditLogRepository).
- 12 integration tests in tests/ScadaLink.ConfigurationDatabase.Tests/
Repositories/ (MsSqlMigrationFixture + [SkippableFact]): fresh insert,
monotonic advance, older-status no-op, same-status no-op,
terminal-over-terminal no-op, 50-way concurrent-insert race produces
exactly one row, Get known/unknown, filter by site, keyset paging no
overlap, purge terminal-and-old, purge keeps non-terminal-and-recent.
Bundle B2 of Audit Log #23 M3: EF-generated migration that creates the
SiteCalls operational-state table on [PRIMARY], with the simple clustered
PK on TrackedOperationId and the two named indexes the entity config
declares.
No partition function / scheme / DB-role restriction — SiteCalls holds
mutable operational state (insert-once + monotonic-status update at the
repo layer), unlike the partitioned append-only AuditLog table from M1.
- Migration: 20260520180431_AddSiteCallsTable.cs (auto-generated;
EF emitted CREATE TABLE + 2 indexes without customisation needed).
- Model snapshot updated alongside.
- Integration test: tests/ScadaLink.ConfigurationDatabase.Tests/Migrations/
AddSiteCallsTableMigrationTests.cs. Uses the existing MsSqlMigrationFixture
with [SkippableFact] + Skip.IfNot(fixture.Available). Asserts table +
twelve columns + PK on TrackedOperationId + both named indexes.
Bundle B1 of Audit Log #23 M3: introduces the SiteCall entity + EF mapping
for the central SiteCalls operational-state table. One row per
TrackedOperationId, mirrored from sites via best-effort telemetry then
periodic reconciliation; eventually-consistent mirror, not a dispatcher.
- src/ScadaLink.Commons/Entities/Audit/SiteCall.cs: append-once record
with required TrackedOperationId/Channel/Target/SourceSite/Status,
monotonic status update at the repo layer.
- src/ScadaLink.ConfigurationDatabase/Configurations/SiteCallEntityTypeConfiguration.cs:
table SiteCalls, PK on TrackedOperationId (stored as varchar(36) via
value conversion through the canonical 'D'-format GUID string —
matches the wire shape used by gRPC + SQLite columns), two named
indexes (IX_SiteCalls_Source_Created, IX_SiteCalls_Status_Updated).
- ScadaLinkDbContext: DbSet<SiteCall> SiteCalls in the existing Audit
section, after AuditLogs.
- Tests in tests/ScadaLink.ConfigurationDatabase.Tests/Configurations/:
table name, PK, value-conversion shape, index presence + ordering.
Two concurrent sessions can both pass the IF NOT EXISTS check and then both attempt the INSERT against UX_AuditLog_EventId; the loser surfaced as SqlException 2601 (or 2627 for PK violations) and aborted the audit write. First-write-wins idempotency is the documented contract, so the race outcome is semantically a no-op — catch the two duplicate-key error numbers and log at Debug, let any other SqlException bubble.
Tests:
- InsertIfNotExistsAsync_ConcurrentDuplicateInserts_ProduceExactlyOneRow: 50 parallel inserters with the same EventId end with exactly one row and no escaped exceptions.
- QueryAsync_Keyset_SameOccurredAtUtc_TiebreaksOnEventId: four rows sharing the same OccurredAtUtc page deterministically through the (OccurredAtUtc, EventId) keyset cursor — exercises the e.OccurredAtUtc == after && e.EventId.CompareTo(afterId) < 0 branch and verifies EF Core 10's Guid.CompareTo translation against SQL Server uniqueidentifier order (deferred Bundle D reviewer recommendation).
AuditLogRepository now takes an optional ILogger<AuditLogRepository> (NullLogger default, mirrors InboundApiRepository); DI registration unchanged.
EF Core implementation of IAuditLogRepository:
- InsertIfNotExistsAsync: single IF NOT EXISTS ... INSERT via
ExecuteSqlInterpolatedAsync, bypasses the change tracker. Enum
values converted to string in C# (columns are varchar(32) via
HasConversion<string>).
- QueryAsync: AsNoTracking, predicate-per-non-null-filter, keyset
paging on (OccurredAtUtc DESC, EventId DESC) — EF Core 10
translates Guid.CompareTo to a uniqueidentifier < comparison
natively (verified against MSSQL 2022).
- SwitchOutPartitionAsync: throws NotSupportedException naming M6;
the non-aligned UX_AuditLog_EventId unique index blocks
ALTER TABLE SWITCH PARTITION until the drop-and-rebuild dance
ships with the purge actor.
DI: AddScoped<IAuditLogRepository, AuditLogRepository>() added after
the NotificationOutboxRepository registration; existing DI smoke test
extended with an IAuditLogRepository assertion.
Integration tests (8 new) use the Bundle C MsSqlMigrationFixture and
scope by a per-test SourceSiteId guid so they neither collide nor
require cleanup.
Bundle D of the Audit Log #23 M1 Foundation plan.
Bundle C of the #23 M1 foundation. Creates the centralized AuditLog table
with the partition function, partition scheme, partition-aligned
non-clustered indexes, and the two access-control roles documented in
alog.md §4.
Schema:
- pf_AuditLog_Month: RANGE RIGHT, 24 monthly boundaries (Jan 2026 – Dec 2027).
- ps_AuditLog_Month: ALL TO ([PRIMARY]) — dev/test parity.
- dbo.AuditLog: created via raw SQL ON ps_AuditLog_Month(OccurredAtUtc).
Composite clustered PK {EventId, OccurredAtUtc} (partition column must be
part of the clustered key). 22 columns matching the EF AuditEvent model.
- 5 reconciliation/query non-clustered indexes from alog.md §4
(Channel_Status_Occurred, CorrelationId filtered, OccurredAtUtc,
Site_Occurred, Target_Occurred filtered) — all partition-aligned.
- UX_AuditLog_EventId: non-aligned UNIQUE on EventId alone (preserves
InsertIfNotExistsAsync idempotency from M1-T8). Non-aligned because
partition-aligned unique indexes require the partition column in the key,
which would weaken to composite uniqueness; the purge story (M2/M3)
rebuilds this index around partition switches.
Access control:
- scadalink_audit_writer: GRANT INSERT + GRANT SELECT, DENY UPDATE + DENY DELETE
on AuditLog. The explicit DENY guarantees later db_datawriter membership
cannot quietly re-enable mutation.
- scadalink_audit_purger: GRANT SELECT on AuditLog, GRANT ALTER on SCHEMA::dbo
(enables ALTER PARTITION FUNCTION SWITCH and SWITCH PARTITION).
Both role definitions are idempotent (IF DATABASE_PRINCIPAL_ID IS NULL).
Down() drops in reverse dependency order with IF EXISTS guards.
Integration tests (tests/ScadaLink.ConfigurationDatabase.Tests/Migrations/):
- MsSqlMigrationFixture: connects to the running infra/mssql container (or
the SCADALINK_MSSQL_TEST_CONN override), creates a unique per-fixture
database, applies the migrations, drops the DB on dispose. Marks itself
Available=false when MSSQL is unreachable so tests early-return cleanly
on CI without the dev container.
- AddAuditLogTableMigrationTests: 8 tests covering table existence,
partition function/scheme, partition-aligned PK, the 5 named indexes,
both roles' grants, and a smoke test that a writer-role user receives
SqlException with "permission" on UPDATE AuditLog.
ConfigurationDatabase tests: 142 passing -> 150 passing (8 new integration
tests). Full solution builds clean.
Package: tests project locally overrides Microsoft.Data.SqlClient to 6.1.1
(EF SqlServer 10.0.7 needs >= 6.1.1; central package version is pinned at
6.0.2 for the production ExternalSystemGateway).
Bundle C migration aligns AuditLog to ps_AuditLog_Month(OccurredAtUtc).
A partitioned table's clustered key must include the partition column, so
the PK becomes the composite {EventId, OccurredAtUtc} — a divergence from
Bundle B's single-column PK that needs reconciling in the EF mapping
before the migration is generated.
EventId remains the idempotency key for AuditLogRepository.InsertIfNotExistsAsync
(M1-T8), so a dedicated unique index UX_AuditLog_EventId preserves the
single-column uniqueness constraint.
Tests updated:
- Configure_MapsToAuditLogTable_WithCompositePrimaryKey (replaces the
WithEventIdAsPrimaryKey assertion) verifies {EventId, OccurredAtUtc}.
- Configure_DeclaresUniqueIndex_OnEventIdAlone_ForIdempotencyLookups
asserts the new UX_AuditLog_EventId is unique and on EventId alone.
- Configure_ExpectedIndexes_WithCorrectNames now expects six index names
(the original five plus UX_AuditLog_EventId).
A composition-derived template now stores its contained name — the
composition slot's InstanceName (e.g. "Pump"), unique only within its
owner — instead of the dotted global path ("Motor Controller.Pump").
The qualified hierarchical name is computed on read.
- TemplateNaming.QualifiedName: walks the OwnerCompositionId chain to
build the dotted path; null-safe, cycle-guarded.
- TemplateConfiguration: the unique index on Template.Name becomes
filtered (WHERE IsDerived = 0) — base templates stay globally unique;
derived templates' uniqueness is the existing (TemplateId,
InstanceName) index on TemplateComposition.
- Migration ContainedDerivedTemplateNames: rewrites derived rows to the
contained name; Down rebuilds the dotted names via a recursive CTE
before restoring the global index.
- TemplateService: composition create/rename store the contained name;
the dotted-name collision pre-checks and cascade-rename are removed
(a slot rename no longer touches nested derived templates).
- TemplateEdit: title shows the contained name; the qualified path is a
breadcrumb subtitle; "composed inside" uses the owner's qualified name.
TDD: 4 TemplateNaming tests + updated composition tests. TemplateEngine
293, ConfigurationDatabase 114, CentralUI 316 green. Migration applied to
the dev cluster and verified in the browser (Motor Controller.Pump now
titled "Pump"; nested Motor Controller.Pump.TempSensor resolves).
Design: docs/plans/2026-05-18-contained-template-names-design.md
Inbound-API bearer credentials are no longer persisted in plaintext. ApiKey now
holds a KeyHash (peppered HMAC-SHA256); the key is shown once at creation and
only its hash is stored. Lookup and validation hash the presented candidate.
Cross-module: Commons (ApiKey, ApiKeyHasher), ConfigurationDatabase (mapping +
HashApiKeyValue migration), InboundAPI (ApiKeyValidator), ManagementService
(key creation), CentralUI (ApiKeys.razor). Existing keys must be re-issued.
Adds IsInherited/LockedInDerived to the TemplateAlarm entity (mirroring the
attribute/script override model), an EF migration, base-alarm copy-on-derive,
inherited-alarm flattening skip, and LockedInDerived override-rejection validation.
Move all package versions into Directory.Packages.props so every project
resolves a single consistent version. Consolidates the Roslyn packages
(Microsoft.CodeAnalysis.CSharp.Scripting/Workspaces) onto 5.0.0, which
resolves the pre-existing NU1608 version-skew error in the test projects.
Deleting an instance only undeployed it from the site and set the state
to NotDeployed, leaving an orphan record that could never be removed —
the state-transition matrix rejected delete from NotDeployed.
Delete now removes the instance record entirely (deployment history,
snapshot, attribute/alarm overrides, and connection bindings go with
it), and is permitted from any state.
Adds a new HiLo alarm trigger type with four configurable setpoints
(LoLo / Lo / Hi / HiHi). Each setpoint carries an optional priority,
deadband (for hysteresis), and operator message. The site runtime emits
AlarmStateChanged with an AlarmLevel field so consumers can differentiate
warning vs critical bands.
Plumbing:
- new AlarmLevel enum + AlarmStateChanged.Level/Message init properties
- AlarmTriggerEditor (Blazor) gets a HiLo render with severity tinting
- AlarmTriggerConfigCodec extracted from the editor for testability
- sitestream.proto carries level + message over gRPC
- SemanticValidator enforces numeric attribute, setpoint ordering,
non-negative deadband
- on-trigger scripts get an Alarm global (Name/Level/Priority/Message)
so notification routing can branch by severity
- per-instance InstanceAlarmOverride entity + EF migration + flattening
step + CLI commands; HiLo overrides merge setpoint-by-setpoint, binary
types whole-replace
- DebugView shows a Level badge + per-band message tooltip
- App.razor auto-reloads on permanent Blazor circuit failure
- docker/regen-proto.sh automates the proto regen workflow (the linux/arm64
protoc segfault means generated files are checked in for now)
Replace raw-JSON text inputs with rich UI: script parameter/return types use
a JSON Schema builder (SchemaBuilder + JsonSchemaShapeParser, with a migration
to convert existing definitions); alarm trigger config uses a type-aware
editor with a flattened attribute picker (AlarmTriggerEditor). AlarmActor
gains optional direction (rising/falling/either) on RateOfChange triggers.
EF migration MigrateCompositionsToDerived. Aborts with a clear error if
any '<parent>.<slot>' derived name would collide with an existing
template. Otherwise it cursor-walks every TemplateComposition that still
points at a non-derived template:
1. Insert a derived Template (name "<parent>.<slot>",
ParentTemplateId=base, IsDerived=1, OwnerCompositionId=composition).
2. Copy base attributes / scripts into the derived row with
IsInherited=1, LockedInDerived=0.
3. Repoint TemplateComposition.ComposedTemplateId at the new derived.
Idempotent: only touches compositions whose target is IsDerived=0, so
re-runs and freshly-created Phase 2 compositions are skipped.
Down() reverses by repointing compositions back to derived.ParentTemplateId
and dropping all derived templates (with cascade copy rows).
Phase 1 of the design at
docs/plans/2026-05-12-derive-on-compose-design.md.
Additive schema only — no behavior changes. Existing data and code
paths continue to work; subsequent phases will start writing the
new fields.
Template gains:
IsDerived true when this row was auto-created to back
a composition slot
OwnerCompositionId back-ref to the owning TemplateComposition
(plain int, not an EF nav property — managed
by TemplateService for cascade-delete)
TemplateAttribute / TemplateScript each gain:
IsInherited row copied from base and not yet overridden;
changes to the base flow downward
LockedInDerived on a base, blocks derived from overriding;
enforced at the service layer in later phases
EF Core migration AddDerivedTemplateFields adds four columns:
Templates.IsDerived bit NOT NULL DEFAULT 0
Templates.OwnerCompositionId int NULL
TemplateAttributes.IsInherited bit NOT NULL DEFAULT 0
TemplateAttributes.LockedInDerived bit NOT NULL DEFAULT 0
TemplateScripts.IsInherited bit NOT NULL DEFAULT 0
TemplateScripts.LockedInDerived bit NOT NULL DEFAULT 0
Existing rows get the defaults. Tests across SiteRuntime / TemplateEngine
/ CentralUI suites stay green (129 / 199 / 159).
Next: phase 2 — wire AddCompositionAsync to derive on compose for
new compositions. Old data still flows the direct-reference path
until phase 3's migration script.
Two caveats from the script-scope rollout addressed:
1. ITemplateEngineRepository.GetTemplatesComposingAsync — a scoped
query that returns only the templates referencing a given template
via Compositions, eager-loaded with their Attributes / Scripts /
Compositions. Replaces the GetAllTemplatesAsync + filter pattern
in TemplateEdit so the Monaco metadata fetch doesn't pull the
entire template catalog to find one parent.
2. Multi-parent picker. The previous implementation suppressed Parent
assistance entirely when more than one template composes the open
one. Now TemplateEdit collects every parent into _editorParents
and renders a small `select` above the script editor when there
are >1, letting the user choose which parent's metadata drives
Parent.Attributes / Parent.CallScript completion + diagnostics.
Single-parent templates skip the picker (no UI change). Zero
parents (root template) hide the picker and surface no Parent
assistance.
Browser-verified on the Sensor Module template (composed by both Pump
and Variable Speed Motor): picker shows both options, switching
updates the editor's parent metadata immediately via the existing
GetContext callback.
Test counts unchanged (159 / 199); the new repo method is exercised
end-to-end by the parent-picker browser path.