fix(configdb): replace SwitchOutPartitionAsync stub with drop-and-rebuild dance (#23 M6)
Replaces M1's NotSupportedException stub with the production DROP-INDEX → CREATE-staging → SWITCH PARTITION → DROP-staging → CREATE-INDEX dance documented in alog.md §4.

UX_AuditLog_EventId is intentionally non-aligned with ps_AuditLog_Month so single-column EventId uniqueness can be enforced cheaply for InsertIfNotExistsAsync. SQL Server rejects ALTER TABLE SWITCH while a non-aligned unique index is present, so the implementation drops it, switches the partition data into a GUID-suffixed staging table on [PRIMARY], drops staging (discarding the rows), and rebuilds the unique index, all inside an explicit transaction with a CATCH that guarantees the unique index is rebuilt regardless of failure point.

Also adds GetPartitionBoundariesOlderThanAsync to IAuditLogRepository: a CROSS APPLY over sys.partition_range_values + per-partition MAX(OccurredAtUtc) to enumerate retention-eligible months for the M6 purge actor (next commit).

Tests verify:
  * Old partition's rows are removed; other months untouched
  * UX_AuditLog_EventId is rebuilt after a successful switch
  * InsertIfNotExistsAsync's first-write-wins idempotency still holds after switch
  * On engineered SWITCH failure (inbound FK from a probe table), SqlException propagates AND UX_AuditLog_EventId is still present (CATCH branch ran)
  * GetPartitionBoundariesOlderThanAsync returns only boundaries whose partition's MAX(OccurredAtUtc) is strictly older than the threshold; empty partitions excluded
@@ -45,12 +45,43 @@ public interface IAuditLogRepository
 
     /// <summary>
     /// Switches out (purges) the monthly partition whose lower bound is
-    /// <paramref name="monthBoundary"/>. The honest M1 implementation throws
-    /// <see cref="NotSupportedException"/>: the <c>UX_AuditLog_EventId</c> unique
-    /// index is non-partition-aligned (lives on <c>[PRIMARY]</c>, not on
-    /// <c>ps_AuditLog_Month</c>), so SQL Server rejects
-    /// <c>ALTER TABLE … SWITCH PARTITION</c> until the drop-and-rebuild dance
-    /// shipped by the M6 purge actor is in place.
+    /// <paramref name="monthBoundary"/>.
     /// </summary>
+    /// <remarks>
+    /// <para>
+    /// <b>Drop-and-rebuild dance.</b> <c>UX_AuditLog_EventId</c> is intentionally
+    /// non-partition-aligned (it lives on <c>[PRIMARY]</c> so single-column
+    /// EventId uniqueness — required by <see cref="InsertIfNotExistsAsync"/> —
+    /// can be enforced cheaply). SQL Server rejects
+    /// <c>ALTER TABLE … SWITCH PARTITION</c> while a non-aligned unique index
+    /// is present, so the M6 implementation drops the index, creates a staging
+    /// table with byte-identical schema, switches the partition's data into
+    /// staging, drops staging (discarding the rows), and rebuilds the unique
+    /// index. The CATCH branch guarantees the index is rebuilt even on partial
+    /// failure so the table never returns to live traffic without its
+    /// idempotency-supporting index.
+    /// </para>
+    /// <para>
+    /// <b>Outage window.</b> The dance briefly removes the unique index, so
+    /// concurrent <see cref="InsertIfNotExistsAsync"/> calls during the switch
+    /// could in principle race past the IF NOT EXISTS check without the index
+    /// catching the duplicate. This is acceptable for the daily purge cadence
+    /// — the inserts that the IF NOT EXISTS check guards are themselves rare
+    /// enough that a sub-second collision window is operationally negligible,
+    /// and the composite PK still rejects same-(EventId, OccurredAtUtc) rows.
+    /// </para>
+    /// </remarks>
     Task SwitchOutPartitionAsync(DateTime monthBoundary, CancellationToken ct = default);
+
+    /// <summary>
+    /// Returns the set of <c>pf_AuditLog_Month</c> partition lower-bound
+    /// boundaries whose partitions contain only rows with
+    /// <see cref="AuditEvent.OccurredAtUtc"/> strictly older than
+    /// <paramref name="threshold"/>. Boundaries whose partition is empty are
+    /// excluded (a no-op switch is wasted work). Used by the M6 purge actor
+    /// to enumerate retention-eligible months on every tick.
+    /// </summary>
+    Task<IReadOnlyList<DateTime>> GetPartitionBoundariesOlderThanAsync(
+        DateTime threshold,
+        CancellationToken ct = default);
 }
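Reviewer note: the selection rule documented for GetPartitionBoundariesOlderThanAsync (strictly older than the threshold, empty partitions excluded) can be illustrated with a small in-memory sketch. This is not the repository's implementation, which per the commit message runs as a CROSS APPLY over sys.partition_range_values in SQL. The names PartitionStats, RetentionFilter, and BoundariesOlderThan are hypothetical and exist only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

// Usage: three hypothetical monthly partitions checked against a 2024-03-01 threshold.
var partitions = new[]
{
    // January: newest row is older than the threshold, so it is eligible.
    new PartitionStats(new DateTime(2024, 1, 1), new DateTime(2024, 1, 31, 23, 59, 0)),
    // February: empty partition (no MAX value), so it is excluded.
    new PartitionStats(new DateTime(2024, 2, 1), null),
    // March: newest row equals the threshold, so "strictly older" excludes it.
    new PartitionStats(new DateTime(2024, 3, 1), new DateTime(2024, 3, 1)),
};

var eligible = RetentionFilter.BoundariesOlderThan(partitions, new DateTime(2024, 3, 1));
Console.WriteLine(string.Join(", ",
    eligible.Select(b => b.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture))));
// prints "2024-01-01"

// Hypothetical model of one partition: its lower-bound boundary and the
// MAX(OccurredAtUtc) of its rows (null when the partition holds no rows).
public sealed record PartitionStats(DateTime LowerBound, DateTime? MaxOccurredAtUtc);

public static class RetentionFilter
{
    // Keeps boundaries whose newest row is strictly older than the threshold.
    // Empty partitions (null MAX) never match the pattern, so they drop out.
    public static IReadOnlyList<DateTime> BoundariesOlderThan(
        IEnumerable<PartitionStats> partitions, DateTime threshold) =>
        partitions
            .Where(p => p.MaxOccurredAtUtc is DateTime max && max < threshold)
            .Select(p => p.LowerBound)
            .OrderBy(b => b)
            .ToList();
}
```

Comparing on the per-partition maximum rather than on the boundary itself means a partition is only reported once every row in it has aged past the threshold, which matches the "contain only rows ... strictly older" wording in the summary.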