Files
ScadaBridge/src/ScadaLink.Commons/Interfaces/Repositories/IAuditLogRepository.cs
T
Joseph Doherty 6069a20e0f fix(configdb): replace SwitchOutPartitionAsync stub with drop-and-rebuild dance (#23 M6)
Replaces M1's NotSupportedException stub with the production drop-DROP-INDEX
→ CREATE-staging → SWITCH PARTITION → DROP-staging → CREATE-INDEX dance
documented in alog.md §4. UX_AuditLog_EventId is intentionally non-aligned
with ps_AuditLog_Month so single-column EventId uniqueness can be enforced
cheaply for InsertIfNotExistsAsync; SQL Server rejects ALTER TABLE SWITCH
while a non-aligned unique index is present, so the implementation drops
it, switches the partition data into a GUID-suffixed staging table on
[PRIMARY], drops staging (discarding the rows), and rebuilds the unique
index — all inside an explicit transaction with a CATCH that guarantees
the unique index is rebuilt regardless of failure point.

Also adds GetPartitionBoundariesOlderThanAsync to IAuditLogRepository: a
CROSS APPLY over sys.partition_range_values + per-partition MAX(OccurredAtUtc)
to enumerate retention-eligible months for the M6 purge actor (next commit).

Tests verify:
* Old partition's rows are removed; other months untouched
* UX_AuditLog_EventId is rebuilt after a successful switch
* InsertIfNotExistsAsync's first-write-wins idempotency still holds after switch
* On engineered SWITCH failure (inbound FK from a probe table), SqlException
  propagates AND UX_AuditLog_EventId is still present (CATCH branch ran)
* GetPartitionBoundariesOlderThanAsync returns only boundaries whose partition's
  MAX(OccurredAtUtc) is strictly older than the threshold; empty partitions
  excluded
2026-05-20 18:20:55 -04:00

88 lines
4.2 KiB
C#

using ScadaLink.Commons.Entities.Audit;
using ScadaLink.Commons.Types.Audit;
namespace ScadaLink.Commons.Interfaces.Repositories;
/// <summary>
/// Append-only data access for the central <c>AuditLog</c> table (Audit Log #23).
/// </summary>
/// <remarks>
/// <para>
/// The append-only invariant is enforced both at the SQL level (the
/// <c>scadalink_audit_writer</c> role has only INSERT + SELECT — UPDATE and DELETE
/// are not granted) and at the API level: this interface deliberately exposes no
/// Update and no single-row Delete. Bulk purge is performed exclusively via
/// monthly partition switch-out (<see cref="SwitchOutPartitionAsync"/>).
/// </para>
/// <para>
/// Ingest is idempotent on <c>EventId</c>: <see cref="InsertIfNotExistsAsync"/> is
/// first-write-wins, so retrying telemetry and reconciliation pulls can both feed
/// the same writer without producing duplicates.
/// </para>
/// </remarks>
public interface IAuditLogRepository
{
/// <summary>
/// Inserts <paramref name="evt"/> if no row with the same
/// <see cref="AuditEvent.EventId"/> exists; otherwise silently leaves the
/// stored row untouched (first-write-wins). Bypasses the EF change tracker
/// so the row never enters a tracked state.
/// </summary>
Task InsertIfNotExistsAsync(AuditEvent evt, CancellationToken ct = default);
/// <summary>
/// Returns up to <see cref="AuditLogPaging.PageSize"/> rows matching
/// <paramref name="filter"/>, ordered by <c>(OccurredAtUtc DESC, EventId DESC)</c>.
/// Use keyset paging by passing the last returned row's
/// <c>OccurredAtUtc</c> + <c>EventId</c> back via
/// <see cref="AuditLogPaging.AfterOccurredAtUtc"/> +
/// <see cref="AuditLogPaging.AfterEventId"/> to fetch the next page.
/// </summary>
Task<IReadOnlyList<AuditEvent>> QueryAsync(
AuditLogQueryFilter filter,
AuditLogPaging paging,
CancellationToken ct = default);
/// <summary>
/// Switches out (purges) the monthly partition whose lower bound is
/// <paramref name="monthBoundary"/>.
/// </summary>
/// <remarks>
/// <para>
/// <b>Drop-and-rebuild dance.</b> <c>UX_AuditLog_EventId</c> is intentionally
/// non-partition-aligned (it lives on <c>[PRIMARY]</c> so single-column
/// EventId uniqueness — required by <see cref="InsertIfNotExistsAsync"/> —
/// can be enforced cheaply). SQL Server rejects
/// <c>ALTER TABLE … SWITCH PARTITION</c> while a non-aligned unique index
/// is present, so the M6 implementation drops the index, creates a staging
/// table with byte-identical schema, switches the partition's data into
/// staging, drops staging (discarding the rows), and rebuilds the unique
/// index. The CATCH branch guarantees the index is rebuilt even on partial
/// failure so the table never returns to live traffic without its
/// idempotency-supporting index.
/// </para>
/// <para>
/// <b>Outage window.</b> The dance briefly removes the unique index, so
/// concurrent <see cref="InsertIfNotExistsAsync"/> calls during the switch
/// could in principle race past the IF NOT EXISTS check without the index
/// catching the duplicate. This is acceptable for the daily purge cadence
/// — the inserts that the IF NOT EXISTS check guards are themselves rare
/// enough that a sub-second collision window is operationally negligible,
/// and the composite PK still rejects same-(EventId, OccurredAtUtc) rows.
/// </para>
/// </remarks>
Task SwitchOutPartitionAsync(DateTime monthBoundary, CancellationToken ct = default);
/// <summary>
/// Returns the set of <c>pf_AuditLog_Month</c> partition lower-bound
/// boundaries whose partitions contain only rows with
/// <see cref="AuditEvent.OccurredAtUtc"/> strictly older than
/// <paramref name="threshold"/>. Boundaries whose partition is empty are
/// excluded (a no-op switch is wasted work). Used by the M6 purge actor
/// to enumerate retention-eligible months on every tick.
/// </summary>
Task<IReadOnlyList<DateTime>> GetPartitionBoundariesOlderThanAsync(
DateTime threshold,
CancellationToken ct = default);
}