Merge branch 'feature/audit-log-m2-site-sync-pipeline': Audit Log #23 M2 Site Pipeline (sync-only)

M2 ships the first end-to-end audit emission. A script-initiated
ExternalSystem.Call() produces one ApiOutbound/ApiCall row in the central
AuditLog table via site SQLite hot-path + gRPC telemetry push + central
ingest actor. Audit-write failures NEVER abort the script.

Shipped (13 commits):
- Race-fix + tiebreaker: InsertIfNotExistsAsync swallows duplicate-key
  races (SqlException 2601/2627); same-OccurredAt keyset test added.
- Site SQLite writer: SqliteAuditWriter (Channel<T> + background batch
  inserter, sub-ms enqueue) + RingBufferFallback (1024-cap drop-oldest)
  + FallbackAuditWriter composing primary+ring+failure counter.
- gRPC layer: IngestAuditEvents unary RPC + AuditEventDto on
  sitestream.proto; AuditEventMapper for AuditEvent <-> Dto round-trip
  (ForwardState site-only, IngestedAtUtc central-only).
- Actors: SiteAuditTelemetryActor (per-site, dedicated dispatcher,
  drain loop with 5s busy / 30s idle cadence); AuditLogIngestActor
  (central singleton, scope-per-message via IServiceProvider ctor for
  scoped repository, idempotent acks).
- Host wiring: cluster singleton + proxy on central, per-site actor
  bound to audit-telemetry-dispatcher (ForkJoin, 2 threads). NoOp
  ISiteStreamAuditClient registered as production default; real
  site->central gRPC client deferred to M6 (orthogonal to M3).
- ESG emission: ScriptRuntimeContext.ExternalSystem.Call wraps
  ExternalSystemClient.CallAsync; emits one ApiOutbound/ApiCall row per
  call with provenance from context (SourceSiteId/Instance/Script).
  Three nested fail-safe layers ensure audit failure never aborts script.
- Health metric: SiteAuditWriteFailures counter + Interlocked.Increment,
  exposed in SiteHealthReport; HealthMetricsAuditWriteFailureCounter
  bridge swaps the NoOp default when both AddHealthMonitoring + AddAuditLog
  are registered.
- E2E: component-level test using TestKit + MsSqlMigrationFixture +
  DirectActorSiteStreamAuditClient stub. Verifies push, retry, and
  duplicate-collapse in <15s.

Tests: full solution dotnet test ScadaLink.slnx green (one isolated
SiteRuntime sandbox-timeout flake is pre-existing and not M2-related).
~80 net new tests across Commons.Tests / ConfigDb.Tests / Communication.Tests /
HealthMonitoring.Tests / AuditLog.Tests / SiteRuntime.Tests / Host.Tests.

Strict invariants honored: infra/* never touched on any branch commit;
no push to origin; explicit git add throughout; alog.md unchanged
(vocabulary correction from M1 stands).
This commit is contained in:
Joseph Doherty
2026-05-20 13:40:59 -04:00
51 changed files with 6595 additions and 65 deletions


@@ -0,0 +1,408 @@
# Audit Log #23 — M2 Site Pipeline (sync-only) Implementation Plan
> **For Claude:** REQUIRED SUB-SKILL: Use superpowers-extended-cc:subagent-driven-development (bundled cadence per `feedback_subagent_cadence`).
**Goal:** First end-to-end audit emission. A script-initiated `ExternalSystem.Call()` produces exactly one `ApiOutbound`/`ApiCall` row in the central `AuditLog` table via site SQLite hot-path + gRPC push telemetry + central ingest actor. Audit-write failures NEVER abort the script.
**Architecture (decisions locked):**
- Provenance: **Wrap CallAsync in ScriptRuntimeContext** — IExternalSystemClient.CallAsync signature unchanged; ScriptRuntimeContext.ExternalSystem.Call captures instance/script/site and emits the AuditEvent via IAuditWriter.
- Direction: **Push primary** — SiteAuditTelemetryActor batches Pending rows and pushes via a new `IngestAuditEvents` unary gRPC RPC on `sitestream.proto`. Pull (reconciliation) deferred to M6.
- E2E: **Component-level test** via TestKit + MSSQL fixture; stubbed gRPC client forwards directly to the central ingest actor. No expansion of `ScadaLinkWebApplicationFactory`.
- Site writer: **Mirror SiteEventLogger**: `Channel<PendingAuditEvent>` + background writer Task for sub-ms enqueue durability.
**M1 realities baked in:**
- Enum vocabulary: `AuditKind.ApiCall` for sync API call; `AuditStatus.Delivered` for success, `AuditStatus.Failed` for HTTP non-2xx (permanent OR transient → both Failed for a sync call; cached path differs in M3). The "Status=Success/TransientFailure/PermanentFailure" wording in the roadmap is stale and must be replaced with the new vocabulary.
- `AuditLogRepository.InsertIfNotExistsAsync` race window — M2 is the first concurrent writer; harden it before AuditLogIngestActor lands.
- Keyset tiebreaker test gap from Bundle D — add a same-OccurredAt test in M2.
- `MsSqlMigrationFixture` reusable as-is; promoted to `[CollectionDefinition]`-shared if multiple test classes need it (defer until actually needed).
- `Xunit.SkippableFact` + `Skip.IfNot(_fixture.Available, _fixture.SkipReason)` for any MSSQL-dependent tests.
- `ScadaLink.AuditLog/Site/` and `ScadaLink.AuditLog/Central/` and `ScadaLink.AuditLog/Telemetry/` subfolders. DI extension `AddAuditLog` is the registration point.
**Tech stack additions:**
- `Microsoft.Data.Sqlite 10.0.7` (pinned).
- `Akka.TestKit.Xunit2 1.5.62` (pinned).
- `Grpc.Tools` already configured in `ScadaLink.Communication.csproj`.
---
## Bundles
- **Bundle A — Repo race-fix + tiebreaker test** (M1 realities catch-up).
- **Bundle B — Site SQLite writer + fallback** (M2-T1, T2, T3, T4).
- **Bundle C — gRPC proto + mapper** (M2-T5, T6).
- **Bundle D — Telemetry actor + ingest actor + gRPC handler** (M2-T7, T8).
- **Bundle E — Host wiring** (M2-T9).
- **Bundle F — ESG emission via ScriptRuntimeContext wrapper** (M2-T10).
- **Bundle G — Health metric SiteAuditWriteFailures** (M2-T11).
- **Bundle H — Component-level integration test** (M2-T12).
Final cross-bundle reviewer pass, then merge + roadmap update.
---
## Bundle A — Repo race-fix + keyset tiebreaker test
### Task A1: Harden `InsertIfNotExistsAsync` against duplicate-key race
**Files:**
- Modify: `src/ScadaLink.ConfigurationDatabase/Repositories/AuditLogRepository.cs:30-60` — wrap the `ExecuteSqlInterpolatedAsync` call in a `try/catch Microsoft.Data.SqlClient.SqlException` that swallows error numbers 2601 and 2627 (unique-index violation on `UX_AuditLog_EventId`) and logs at Debug. Other SqlExceptions rethrow.
- Modify: `tests/ScadaLink.ConfigurationDatabase.Tests/Repositories/AuditLogRepositoryTests.cs` — add:
- `InsertIfNotExistsAsync_ConcurrentDuplicateInserts_ProduceExactlyOneRow` — fire 50 parallel `InsertIfNotExistsAsync` calls with the same `EventId`, assert row count = 1 and no exception escapes.
- `QueryAsync_Keyset_SameOccurredAtUtc_TiebreaksOnEventId` — Bundle D reviewer's deferred recommendation. Insert 4 rows with identical OccurredAtUtc but distinct EventIds; page through them with PageSize=2; assert no overlap, correct count, and that the second page's first row's EventId is strictly less than the first page's last row's EventId.
**Steps:**
1. Write failing concurrency test.
2. Run: expect SqlException 2601/2627 OR identical-row-count violation.
3. Add try/catch in the repo.
4. Run: pass.
5. Write failing keyset-tiebreaker test.
6. Run: depending on EF Core 10's Guid.CompareTo translation, this may already pass — confirm.
7. If passing, the test locks in the behavior; commit anyway.
8. Commit: `fix(configdb): InsertIfNotExistsAsync swallows duplicate-key races + add keyset tiebreaker test (#23)`.
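The keyset tiebreaker the second test locks in can be sketched in memory as follows. This is a minimal illustration, not the repository's query: `Row` and `KeysetSketch` are hypothetical names, and the descending `(OccurredAtUtc, EventId)` order is inferred from the assertion that the second page starts at a strictly smaller EventId.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for an audit row; only the two keyset columns matter here.
public sealed record Row(DateTime OccurredAtUtc, Guid EventId);

public static class KeysetSketch
{
    // Returns the next page in (OccurredAtUtc DESC, EventId DESC) order,
    // starting strictly after the cursor row (a null cursor means the first page).
    public static List<Row> Page(IEnumerable<Row> rows, Row? after, int pageSize)
    {
        return rows
            .Where(r => after is null
                || r.OccurredAtUtc < after.OccurredAtUtc
                // Tiebreaker: same timestamp, so fall back to the EventId ordering.
                || (r.OccurredAtUtc == after.OccurredAtUtc
                    && r.EventId.CompareTo(after.EventId) < 0))
            .OrderByDescending(r => r.OccurredAtUtc)
            .ThenByDescending(r => r.EventId)
            .Take(pageSize)
            .ToList();
    }
}
```

Without the EventId clause, rows sharing a timestamp with the cursor would be either skipped or repeated. Note that the in-memory ordering uses `Guid.CompareTo`, which is not the ordering SQL Server applies to `uniqueidentifier`, so the real test should assert against whatever order the database produces.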
**Bundle A acceptance:** All ConfigurationDatabase.Tests still green; 2 new tests pass.
---
## Bundle B — Site SQLite writer + fallback (mirror SiteEventLogger pattern)
### Task B1: `SqliteAuditWriter` — schema + connection bootstrap
**Files:**
- Create: `src/ScadaLink.AuditLog/Site/SqliteAuditWriter.cs` — implements `IAuditWriter` per Bundle A's signature (single `Task WriteAsync(AuditEvent evt, CancellationToken ct = default)`). Constructor takes `IOptions<SqliteAuditWriterOptions>` + `ILogger`. Single `SqliteConnection` opened at construction (`Data Source={path};Cache=Shared`). Sync `_writeLock` Monitor-pattern (mirrors `SiteEventLogger.cs:32`). Inline `InitializeSchema()` runs `PRAGMA auto_vacuum = INCREMENTAL` + `CREATE TABLE IF NOT EXISTS AuditLog (...)`.
- Create: `src/ScadaLink.AuditLog/Site/SqliteAuditWriterOptions.cs`: `string DatabasePath = "auditlog.db"`, `int ChannelCapacity = 4096` (bounded; drop-oldest applies in Bundle B-T3 ring overflow, but the writer's pending channel is bounded as a safety net), `int BatchSize = 256`, `int FlushIntervalMs = 50`.
- Create: `tests/ScadaLink.AuditLog.Tests/Site/SqliteAuditWriterSchemaTests.cs`.
**Schema (19 event columns + ForwardState = 20 site columns; IngestedAtUtc is central-only):**
```sql
CREATE TABLE IF NOT EXISTS AuditLog (
EventId TEXT NOT NULL,
OccurredAtUtc TEXT NOT NULL,
Channel TEXT NOT NULL,
Kind TEXT NOT NULL,
CorrelationId TEXT NULL,
SourceSiteId TEXT NULL,
SourceInstanceId TEXT NULL,
SourceScript TEXT NULL,
Actor TEXT NULL,
Target TEXT NULL,
Status TEXT NOT NULL,
HttpStatus INTEGER NULL,
DurationMs INTEGER NULL,
ErrorMessage TEXT NULL,
ErrorDetail TEXT NULL,
RequestSummary TEXT NULL,
ResponseSummary TEXT NULL,
PayloadTruncated INTEGER NOT NULL,
Extra TEXT NULL,
ForwardState TEXT NOT NULL,
PRIMARY KEY (EventId)
);
CREATE INDEX IF NOT EXISTS IX_SiteAuditLog_ForwardState_Occurred
ON AuditLog (ForwardState, OccurredAtUtc);
```
**Tests:**
1. `Opens_Creates_AuditLog_Table_With_All_Columns_And_PK`
2. `Opens_Creates_IX_ForwardState_Occurred_Index`
3. `PRAGMA_auto_vacuum_Is_INCREMENTAL`
**Steps:**
1. Failing test asserts table + PK + 20 columns + index via `PRAGMA table_info(AuditLog)` + `PRAGMA index_list(AuditLog)`.
2. Implement constructor + InitializeSchema with inline SQL.
3. Run: pass.
4. Commit: `feat(auditlog): SqliteAuditWriter schema bootstrap (#23)`.
### Task B2: `SqliteAuditWriter` — Channel<T> + background writer for hot-path
**Files:**
- Modify: `src/ScadaLink.AuditLog/Site/SqliteAuditWriter.cs` — add `Channel<PendingAuditEvent> _writeQueue` (bounded BoundedChannelFullMode.Wait, default capacity 4096), background `Task ProcessWriteQueueAsync()` launched in constructor. `WriteAsync` enqueues + returns the pending's `TaskCompletionSource`. The loop reads up to `BatchSize`, opens a transaction, INSERTs all events, commits, completes the TCS for each.
- Pattern mirrors `src/ScadaLink.SiteEventLogging/SiteEventLogger.cs:135-173`.
- Test: `tests/ScadaLink.AuditLog.Tests/Site/SqliteAuditWriterWriteTests.cs`.
**Tests:**
1. `WriteAsync_FreshEvent_PersistsWithForwardStatePending` — write one event, query SQLite, assert row has `ForwardState='Pending'`.
2. `WriteAsync_Concurrent_1000Calls_All_Persist_NoExceptions` — fire 1000 parallel WriteAsync, assert row count = 1000 and zero exceptions surface.
3. `WriteAsync_LatencyP99_LessThan_5ms_For_Enqueue`: time only the enqueue with a stopwatch around `Channel.Writer.TryWrite` (there is no `TryWriteAsync` on `ChannelWriter<T>`) and assert p99 < 5 ms; separately assert that the returned TCS Task completes after awaiting, once the batch has been persisted.
4. `WriteAsync_DuplicateEventId_FirstWriteWins_NoException` — insert same EventId twice, assert one row only and no exception (the PRIMARY KEY violation is caught/swallowed in the writer loop).
**Steps:**
1. Failing tests for 1, 2, 4.
2. Implement Channel + background loop + transactional batch INSERT.
3. Run: pass.
4. Commit: `feat(auditlog): SqliteAuditWriter Channel-based hot-path write (#23)`.
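The enqueue-then-batch shape described in Task B2 might look roughly like the sketch below. It is written under assumptions: `AuditEvent` and `PendingAuditEvent` are reduced stand-ins, `BatchingWriterSketch` is a hypothetical name, and the `persistBatch` delegate stands in for the transactional SQLite INSERT, which is not shown.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Channels;
using System.Threading.Tasks;

// Hypothetical stand-ins: the real AuditEvent carries many more fields.
public sealed record AuditEvent(Guid EventId);
public sealed record PendingAuditEvent(AuditEvent Event, TaskCompletionSource Completion);

public sealed class BatchingWriterSketch
{
    private readonly Channel<PendingAuditEvent> _queue;
    private readonly int _batchSize;
    private readonly Action<IReadOnlyList<AuditEvent>> _persistBatch; // stands in for the transactional INSERT

    public Task Completion { get; }

    public BatchingWriterSketch(int capacity, int batchSize, Action<IReadOnlyList<AuditEvent>> persistBatch)
    {
        _batchSize = batchSize;
        _persistBatch = persistBatch;
        _queue = Channel.CreateBounded<PendingAuditEvent>(new BoundedChannelOptions(capacity)
        {
            FullMode = BoundedChannelFullMode.Wait, // callers wait when the queue is full
            SingleReader = true,
        });
        Completion = Task.Run(ProcessQueueAsync);
    }

    // Enqueue is cheap; the returned task completes once the batch holding the event is persisted.
    public async Task WriteAsync(AuditEvent evt, CancellationToken ct = default)
    {
        var pending = new PendingAuditEvent(evt,
            new TaskCompletionSource(TaskCreationOptions.RunContinuationsAsynchronously));
        await _queue.Writer.WriteAsync(pending, ct).ConfigureAwait(false);
        await pending.Completion.Task.ConfigureAwait(false);
    }

    public void Complete() => _queue.Writer.Complete();

    private async Task ProcessQueueAsync()
    {
        var batch = new List<PendingAuditEvent>(_batchSize);
        while (await _queue.Reader.WaitToReadAsync().ConfigureAwait(false))
        {
            batch.Clear();
            while (batch.Count < _batchSize && _queue.Reader.TryRead(out var item))
                batch.Add(item);
            try
            {
                _persistBatch(batch.ConvertAll(p => p.Event));
                foreach (var p in batch) p.Completion.TrySetResult();
            }
            catch (Exception ex)
            {
                // Surface the failure to every caller in the batch; the loop keeps running.
                foreach (var p in batch) p.Completion.TrySetException(ex);
            }
        }
    }
}
```

`RunContinuationsAsynchronously` keeps caller continuations off the writer loop, so a slow caller cannot stall the next batch.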
### Task B3: `RingBufferFallback`
**Files:**
- Create: `src/ScadaLink.AuditLog/Site/RingBufferFallback.cs`: `Channel<AuditEvent>` bounded at 1024 with `BoundedChannelFullMode.DropOldest`. Exposes `bool TryEnqueue(AuditEvent)`, `IAsyncEnumerable<AuditEvent> DrainAsync(CancellationToken)`, and an event `RingBufferOverflowed` (callback for the health counter).
- Test: `tests/ScadaLink.AuditLog.Tests/Site/RingBufferFallbackTests.cs`.
**Tests:**
1. `Enqueue_1025_Into_1024Cap_Ring_DropsOldest_AndRaisesOverflow`: invoke 1025 enqueues, assert `RingBufferOverflowed` fires exactly once, and the surviving 1024 are the latest.
2. `DrainAsync_Yields_FIFO_Then_Completes_When_Empty`.
**Steps:**
1. Failing tests.
2. Implement using `Channel.CreateBounded<AuditEvent>(new BoundedChannelOptions(1024) { FullMode = BoundedChannelFullMode.DropOldest })`.
3. Run: pass.
4. Commit: `feat(auditlog): RingBufferFallback with drop-oldest overflow (#23)`.
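One way to surface the overflow signal is the `itemDropped` callback overload of `Channel.CreateBounded`, sketched below. `RingBufferSketch` is a hypothetical name, and the synchronous `DrainAvailable` is a simplification of the `DrainAsync` the plan calls for.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Channels;

public sealed class RingBufferSketch<T>
{
    private readonly Channel<T> _channel;

    // Raised once per dropped item; the plan wires this to the health counter.
    public event Action? Overflowed;

    public RingBufferSketch(int capacity)
    {
        _channel = Channel.CreateBounded<T>(
            new BoundedChannelOptions(capacity) { FullMode = BoundedChannelFullMode.DropOldest },
            itemDropped: _ => Overflowed?.Invoke());
    }

    // In DropOldest mode TryWrite succeeds unless the channel has been completed.
    public bool TryEnqueue(T item) => _channel.Writer.TryWrite(item);

    // Synchronous drain of whatever is currently buffered, oldest first.
    public List<T> DrainAvailable()
    {
        var items = new List<T>();
        while (_channel.Reader.TryRead(out var item)) items.Add(item);
        return items;
    }
}
```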
### Task B4: `FallbackAuditWriter` — compose primary + ring
**Files:**
- Create: `src/ScadaLink.AuditLog/Site/FallbackAuditWriter.cs` — implements `IAuditWriter`. Constructor takes the primary `SqliteAuditWriter` + `RingBufferFallback` + `IAuditWriteFailureCounter` (lightweight DI'd interface, Bundle G implements it as `SiteAuditWriteFailures` counter on health metrics). On primary success: returns. On primary throw: increments counter, enqueues into ring (DropOldest), returns success. On the NEXT successful primary call (success after a failure window), drains the ring back through the primary.
- Test: `tests/ScadaLink.AuditLog.Tests/Site/FallbackAuditWriterTests.cs`.
**Tests:**
1. `WriteAsync_PrimaryThrows_EventLandsInRing_CallReturnsSuccess`.
2. `WriteAsync_PrimaryRecovers_RingDrains_InFIFOOrder_OnNextWrite`.
3. `WriteAsync_PrimaryAlwaysSucceeds_Ring_StaysEmpty`.
**Steps:**
1. Failing tests.
2. Implement; mock the primary with a `Func<AuditEvent, Task>` flip-switch failure.
3. Run: pass.
4. Commit: `feat(auditlog): FallbackAuditWriter compose SQLite + ring (#23)`.
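The composition in Task B4 could be sketched as below. Assumptions: `FallbackWriterSketch` is a hypothetical name, a plain `Queue<T>` stands in for `RingBufferFallback`, and the sketch assumes one caller at a time (the real writer would need its own synchronization).

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public sealed class FallbackWriterSketch<T>
{
    private readonly Func<T, Task> _primary;
    private readonly Queue<T> _ring = new(); // stand-in for RingBufferFallback
    private readonly int _ringCapacity;
    private int _failures;

    public int Failures => Volatile.Read(ref _failures);

    public FallbackWriterSketch(Func<T, Task> primary, int ringCapacity)
    {
        _primary = primary;
        _ringCapacity = ringCapacity;
    }

    public async Task WriteAsync(T evt)
    {
        try
        {
            await _primary(evt).ConfigureAwait(false);
        }
        catch (Exception)
        {
            Interlocked.Increment(ref _failures);
            if (_ring.Count == _ringCapacity) _ring.Dequeue(); // drop oldest
            _ring.Enqueue(evt);
            return; // the caller still sees success
        }

        // Primary is healthy again: replay buffered events in FIFO order.
        while (_ring.Count > 0)
        {
            try { await _primary(_ring.Peek()).ConfigureAwait(false); }
            catch (Exception) { return; } // failed again; keep the rest buffered
            _ring.Dequeue();
        }
    }
}
```

With this shape the event that triggers recovery is persisted before the buffered ones, so persisted order can differ from occurrence order; ordering by `OccurredAtUtc` at query time keeps that harmless.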
**Bundle B acceptance:** 4 tasks merged. `ScadaLink.AuditLog.Tests` adds ~12+ tests. No regressions.
---
## Bundle C — gRPC proto + mapper
### Task C1: Extend `sitestream.proto` with `IngestAuditEvents`
**Files:**
- Modify: `src/ScadaLink.Communication/Protos/sitestream.proto` — add the messages and unary RPC. Use `google.protobuf.Timestamp` for `OccurredAtUtc`; encode enums as `string` (matches the EF mapping).
Proposed addition:
```proto
message AuditEventDto {
string event_id = 1;
google.protobuf.Timestamp occurred_at_utc = 2;
string channel = 3;
string kind = 4;
string correlation_id = 5; // empty string when null
string source_site_id = 6;
string source_instance_id = 7;
string source_script = 8;
string actor = 9;
string target = 10;
string status = 11;
google.protobuf.Int32Value http_status = 12;
google.protobuf.Int32Value duration_ms = 13;
string error_message = 14;
string error_detail = 15;
string request_summary = 16;
string response_summary = 17;
bool payload_truncated = 18;
string extra = 19;
}
message AuditEventBatch { repeated AuditEventDto events = 1; }
message IngestAck { repeated string accepted_event_ids = 1; }
service SiteStreamService {
// existing rpcs...
rpc IngestAuditEvents(AuditEventBatch) returns (IngestAck);
}
```
(Use `google.protobuf.Int32Value` to encode nullable ints; empty string semantics for nullable text fields.)
- Test: `tests/ScadaLink.Communication.Tests/Protos/AuditEventProtoTests.cs`.
**Steps:**
1. Edit proto + rebuild (`dotnet build src/ScadaLink.Communication/`).
2. Failing test round-trips an `AuditEventDto` through `ToByteArray()` and `Parser.ParseFrom()`; asserts all populated fields survive.
3. Run: pass.
4. Commit: `feat(comms): IngestAuditEvents RPC + AuditEventDto proto (#23)`.
### Task C2: `AuditEvent` ↔ `AuditEventDto` mapper
**Files:**
- Create: `src/ScadaLink.AuditLog/Telemetry/AuditEventMapper.cs` — static `ToDto(AuditEvent)` and `FromDto(AuditEventDto)`. Handles nullable→empty-string, Timestamp↔DateTime UTC, enum↔string. ForwardState NOT carried in the proto (site-local only; central never sees it).
- Test: `tests/ScadaLink.AuditLog.Tests/Telemetry/AuditEventMapperTests.cs`.
**Tests:**
1. `Roundtrip_FullyPopulated_PreservesAllFields`.
2. `Roundtrip_AllNullableFieldsNull_ProducesEmptyDtoFields`.
3. `FromDto_EmptyOptionalString_BecomesNullProperty`.
4. `ToDto_Sets_OccurredAtUtc_As_UtcTimestamp` — Round-trip with `DateTimeKind.Utc` preserved.
**Steps:**
1. Failing tests.
2. Implement.
3. Run: pass.
4. Commit: `feat(auditlog): AuditEvent ↔ proto mapper (#23)`.
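The null and UTC conventions the mapper has to honor can be illustrated without the generated proto types. This is a sketch only: `WireConventionsSketch` and its members are hypothetical helper names, and treating an `Unspecified` kind as already-UTC is an assumption. The UTC normalization matters because `Timestamp.FromDateTime` in Google.Protobuf rejects non-UTC values.

```csharp
using System;

public static class WireConventionsSketch
{
    // proto3 strings cannot be null, so null is encoded as the empty string.
    public static string ToWire(string? value) => value ?? string.Empty;

    // The reverse mapping is lossy: a genuinely empty string also comes back as null.
    public static string? FromWire(string value) => string.IsNullOrEmpty(value) ? null : value;

    // Normalizes to DateTimeKind.Utc before conversion to a protobuf Timestamp.
    // Assumption: an Unspecified kind already holds a UTC value.
    public static DateTime EnsureUtc(DateTime value) => value.Kind switch
    {
        DateTimeKind.Utc => value,
        DateTimeKind.Local => value.ToUniversalTime(),
        _ => DateTime.SpecifyKind(value, DateTimeKind.Utc),
    };
}
```

The lossy empty-string case is why the round-trip tests above distinguish "fully populated" from "all nullable fields null".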
**Bundle C acceptance:** Communication.Tests + AuditLog.Tests still green; proto rebuilds cleanly.
---
## Bundle D — SiteAuditTelemetryActor + AuditLogIngestActor + gRPC handler
### Task D1: `SiteAuditTelemetryActor` — drain loop
**Files:**
- Create: `src/ScadaLink.AuditLog/Site/Telemetry/SiteAuditTelemetryActor.cs`: `ReceiveActor`. On `Drain`: queries `SqliteAuditWriter.ReadPendingAsync(BatchSize)`, calls `IngestAuditEventsAsync(batch)` on the gRPC client, and on ack flips the returned EventIds to `Forwarded` via `SqliteAuditWriter.MarkForwardedAsync(eventIds)`. Re-schedules the `Drain` self-tick: 5s if ≥1 row drained, 30s otherwise. On gRPC error: re-schedule 5s; rows stay Pending.
- Modify: `src/ScadaLink.AuditLog/Site/SqliteAuditWriter.cs` — add `ReadPendingAsync(int limit, CancellationToken)` returning `IReadOnlyList<AuditEvent>` (with ForwardState=Pending), and `MarkForwardedAsync(IReadOnlyList<Guid> eventIds, CancellationToken)`.
- Create: `src/ScadaLink.AuditLog/Site/Telemetry/SiteAuditTelemetryOptions.cs`: `BatchSize=256`, `BusyIntervalSeconds=5`, `IdleIntervalSeconds=30`.
- Test: `tests/ScadaLink.AuditLog.Tests/Site/Telemetry/SiteAuditTelemetryActorTests.cs` using `TestKit` + NSubstitute-mocked gRPC client.
**Tests:**
1. `Drain_With_50PendingRows_Sends_OneBatch_Of_50`.
2. `Drain_Ack_Flips_Rows_To_Forwarded`.
3. `Drain_GrpcThrows_Rows_StayPending_NextTick_Retries`.
4. `Drain_Cadence_5s_AfterNonZero_30s_AfterZero` (via `TestScheduler`).
**Steps:**
1. Failing tests.
2. Implement.
3. Run: pass.
4. Commit: `feat(auditlog): SiteAuditTelemetryActor drain loop (#23)`.
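Stripped of Akka scheduling, a single drain tick reduces to the sketch below. `DrainSketch` and its delegate parameters are hypothetical stand-ins for the SQLite reads and the gRPC client, and treating "any pending rows were found" as the busy case is one reading of the cadence rule above.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class DrainSketch
{
    public static readonly TimeSpan Busy = TimeSpan.FromSeconds(5);
    public static readonly TimeSpan Idle = TimeSpan.FromSeconds(30);

    // One drain tick: read pending ids, push them, mark the acked ones forwarded,
    // and return the delay until the next tick.
    public static async Task<TimeSpan> DrainOnceAsync(
        Func<int, Task<IReadOnlyList<Guid>>> readPending,
        Func<IReadOnlyList<Guid>, Task<IReadOnlyList<Guid>>> push,
        Func<IReadOnlyList<Guid>, Task> markForwarded,
        int batchSize)
    {
        try
        {
            var pending = await readPending(batchSize).ConfigureAwait(false);
            if (pending.Count == 0) return Idle;

            var acked = await push(pending).ConfigureAwait(false);
            // Only acked rows flip to Forwarded; anything else stays Pending.
            if (acked.Count > 0) await markForwarded(acked).ConfigureAwait(false);
            return Busy;
        }
        catch (Exception)
        {
            return Busy; // push failed: rows stay Pending, retry on the short interval
        }
    }
}
```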
### Task D2: `AuditLogIngestActor` + gRPC server handler
**Files:**
- Create: `src/ScadaLink.AuditLog/Central/AuditLogIngestActor.cs`: `ReceiveActor` accepting `IngestAuditEventsCommand(IReadOnlyList<AuditEvent> events, IActorRef replyTo)`. For each event, calls `IAuditLogRepository.InsertIfNotExistsAsync` (which now swallows duplicates per Bundle A). Sets `IngestedAtUtc = DateTime.UtcNow` before insert (this is the central-side timestamp). Replies with `IngestAck(acceptedEventIds)`; by spec "accepted" includes already-existed rows (idempotent semantics).
- Create: `src/ScadaLink.AuditLog/Central/IngestAuditEventsCommand.cs` (Akka message).
- Create: `src/ScadaLink.AuditLog/Central/IngestAck.cs` (Akka reply).
- Modify: `src/ScadaLink.Communication/SiteStreamGrpc/SiteStreamGrpcServer.cs` — implement `public override async Task<IngestAck> IngestAuditEvents(AuditEventBatch request, ServerCallContext context)` — Ask the central `AuditLogIngestActor` proxy with the deserialized batch, await reply, return.
- Modify: `src/ScadaLink.Communication/SiteStreamGrpc/SiteStreamGrpcServer.cs` — add a setter `SetAuditIngestActor(IActorRef)` mirroring how `SetNotificationOutbox` is wired (per recon: Notification Outbox proxy is handed in via `commService?.SetNotificationOutbox(outboxProxy)`).
- Test: `tests/ScadaLink.AuditLog.Tests/Central/AuditLogIngestActorTests.cs`.
- Test: `tests/ScadaLink.Communication.Tests/SiteStreamIngestAuditEventsTests.cs`.
**Tests:**
1. `Receive_BatchOf5_Calls_Repo_5Times_Acks_All`.
2. `Receive_BatchWith_AlreadyExistingEvent_AcksAll_NoDoubleInsert` (idempotent).
3. `Receive_RepoThrowsTransient_Replies_AckExcludingFailedEventIds_LogsError` (partial-failure semantics — what gets acked is what was persisted).
4. `Receive_Sets_IngestedAtUtc_Before_Insert`.
5. `gRPC_Handler_Routes_To_Actor_Returns_Reply`.
**Steps:**
1. Failing tests.
2. Implement actor + gRPC handler.
3. Run: pass.
4. Commit: `feat(auditlog): AuditLogIngestActor + gRPC handler (#23)`.
**Bundle D acceptance:** New actor + gRPC handler tests all green.
---
## Bundle E — Host wiring (central singleton + site actor + dispatcher)
### Task E1: Register `AuditLogIngestActor` + `SiteAuditTelemetryActor` + dispatcher
**Files:**
- Modify: `src/ScadaLink.Host/Actors/AkkaHostedService.cs` — mirror the Notification Outbox pattern (recon report's exact lines 272-295):
- Central role: `AuditLogIngestActor` as `ClusterSingletonManager` (singleton name `"audit-log-ingest"`) + `ClusterSingletonProxy` (`"audit-log-ingest-proxy"`). Hand the proxy to `SiteStreamGrpcServer.SetAuditIngestActor(proxy)`.
- Site role: `SiteAuditTelemetryActor` as a per-site actor (`actorSystem.ActorOf(Props.Create(...)`), bound to the dedicated dispatcher (below).
- Modify: HOCON in `src/ScadaLink.Host/Configuration/` (the existing akka config file) — add:
```
audit-telemetry-dispatcher {
type = ForkJoinDispatcher
throughput = 100
dedicated-thread-pool { thread-count = 2 }
}
```
Apply `.WithDispatcher("audit-telemetry-dispatcher")` to `SiteAuditTelemetryActor`'s Props.
- Modify: `src/ScadaLink.AuditLog/ServiceCollectionExtensions.cs:AddAuditLog` — register the SqliteAuditWriter+RingBufferFallback+FallbackAuditWriter chain and the actor factories.
- Test: `tests/ScadaLink.Host.Tests/AkkaHostedServiceAuditWiringTests.cs`.
**Tests:**
1. `Central_Host_Starts_With_AuditLogIngest_Singleton_Healthy`.
2. `Site_Host_Starts_With_SiteAuditTelemetry_Bound_To_DedicatedDispatcher`.
3. `AuditWriter_Resolves_From_DI_To_FallbackAuditWriter`.
**Steps:**
1. Failing tests against current host (which doesn't wire audit).
2. Implement wiring.
3. Run: pass.
4. Commit: `feat(host): register Audit Log #23 singletons with dedicated dispatcher`.
**Bundle E acceptance:** Host.Tests still green; 3 new tests pass.
---
## Bundle F — ESG audit emission via ScriptRuntimeContext wrapper
### Task F1: Wrap `ExternalSystem.Call` in `ScriptRuntimeContext` to emit audit
**Files:**
- Modify: `src/ScadaLink.SiteRuntime/Scripts/ScriptRuntimeContext.cs` — find the existing `ExternalSystem.Call` method (or add one if scripts call through a dynamic API surface). Inside, after `_externalSystemClient.CallAsync(...)` returns OR throws, build the `AuditEvent` (channel=`ApiOutbound`, kind=`ApiCall`, status=`Delivered` for success, `Failed` for HTTP non-2xx or exception, populate `Target=$"{systemName}.{methodName}"`, `SourceSiteId={siteId}`, `SourceInstanceId={instanceName}`, `SourceScript={sourceScript}`, `DurationMs={stopwatch}`, `HttpStatus`, `ErrorMessage`). Call `_auditWriter.WriteAsync(evt)` inside a try/catch that swallows + logs at Warning + increments `SiteAuditWriteFailures` (via the same counter Bundle G defines). Re-throw the original ExternalSystem exception (if any) so the script sees its original error path unchanged.
- Modify: `src/ScadaLink.SiteRuntime/Scripts/ScriptRuntimeContext.cs` constructor — inject `IAuditWriter`.
- Modify: `src/ScadaLink.SiteRuntime/Actors/ScriptExecutionActor.cs` — resolve and pass `IAuditWriter` into the ScriptRuntimeContext.
- Test: `tests/ScadaLink.SiteRuntime.Tests/Scripts/ExternalSystemCallAuditEmissionTests.cs`.
**Tests:**
1. `Call_Success_EmitsOneEvent_Channel_ApiOutbound_Kind_ApiCall_Status_Delivered`.
2. `Call_HTTP500_EmitsEvent_Status_Failed_HttpStatus_500_ErrorMessage_Set`.
3. `Call_HTTP400_EmitsEvent_Status_Failed_HttpStatus_400`.
4. `Call_ClientThrows_NetworkError_EmitsEvent_Status_Failed_ErrorMessage_SetFromException`.
5. `AuditWriter_Throws_Script_Call_Returns_Original_Result_Unchanged_Audit_Failure_Counter_Incremented`.
6. `Provenance_Populated_FromContext` — SourceInstanceId, SourceScript, SourceSiteId all match the ScriptRuntimeContext's values.
**Steps:**
1. Failing tests.
2. Implement wrapper + provenance threading.
3. Run: pass.
4. Commit: `feat(siteruntime): ExternalSystem.Call emits Audit Log #23 event on every sync call`.
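The control flow of the wrapper in Task F1 (emit on both success and failure, swallow audit errors, rethrow the original) can be sketched as follows. `AuditRecord` and `AuditedCallSketch` are hypothetical reduced stand-ins, and the HTTP non-2xx branch is left out because it depends on the real result type.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

// Hypothetical, reduced audit payload for illustration only.
public sealed record AuditRecord(string Target, string Status, long DurationMs, string? ErrorMessage);

public static class AuditedCallSketch
{
    public static async Task<TResult> CallAsync<TResult>(
        string target,
        Func<Task<TResult>> call,
        Func<AuditRecord, Task> writeAudit,
        Action onAuditFailure)
    {
        var stopwatch = Stopwatch.StartNew();
        var status = "Delivered";
        string? error = null;
        try
        {
            return await call().ConfigureAwait(false);
        }
        catch (Exception ex)
        {
            status = "Failed";
            error = ex.Message;
            throw; // the script sees its original exception unchanged
        }
        finally
        {
            try
            {
                await writeAudit(new AuditRecord(target, status, stopwatch.ElapsedMilliseconds, error))
                    .ConfigureAwait(false);
            }
            catch (Exception)
            {
                // Swallow: an audit failure must never change the call's outcome.
                onAuditFailure();
            }
        }
    }
}
```

Placing the emission in `finally` gives exactly one audit record per call on either path, and the inner try/catch is what keeps a failing writer from replacing the script's own result or exception.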
**Bundle F acceptance:** SiteRuntime.Tests still green; 6 new tests.
---
## Bundle G — Health metric `SiteAuditWriteFailures`
### Task G1: Counter + DI surface
**Files:**
- Create: `src/ScadaLink.AuditLog/Site/IAuditWriteFailureCounter.cs` — `void Increment();`. Bundle B's `FallbackAuditWriter` already takes this.
- Modify: `src/ScadaLink.HealthMonitoring/SiteHealthCollector.cs` — add `int _siteAuditWriteFailures` field + `IncrementSiteAuditWriteFailures()` method using `Interlocked.Increment`. Expose via a snapshot read.
- Modify: `src/ScadaLink.HealthMonitoring/SiteHealthState.cs` — add `SiteAuditWriteFailures` property to the report payload.
- Implementation: a small adapter class `HealthMetricsAuditWriteFailureCounter : IAuditWriteFailureCounter` registered in DI that bridges to `ISiteHealthCollector.IncrementSiteAuditWriteFailures()`.
- Test: `tests/ScadaLink.HealthMonitoring.Tests/SiteAuditWriteFailuresMetricTests.cs`.
**Tests:**
1. `Increment_Three_Times_Counter_Reports_3`.
2. `Report_Payload_Includes_SiteAuditWriteFailures`.
**Steps:**
1. Failing tests.
2. Implement counter + adapter + DI registration.
3. Run: pass.
4. Commit: `feat(health): SiteAuditWriteFailures counter (#23)`.
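A minimal sketch of the counter behind `IAuditWriteFailureCounter` is shown below. `FailureCounterSketch` and `Snapshot` are hypothetical names; the real bridge forwards to the health collector instead of holding the field itself.

```csharp
using System.Threading;
using System.Threading.Tasks;

public interface IAuditWriteFailureCounter
{
    void Increment();
}

public sealed class FailureCounterSketch : IAuditWriteFailureCounter
{
    private int _count;

    // Lock-free increment; safe to call from any writer thread.
    public void Increment() => Interlocked.Increment(ref _count);

    // Point-in-time read for the health report.
    public int Snapshot() => Volatile.Read(ref _count);
}
```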
**Bundle G acceptance:** HealthMonitoring.Tests still green; 2 new tests.
---
## Bundle H — Component-level integration test
### Task H1: End-to-end via TestKit + MSSQL fixture
**Files:**
- Create: `tests/ScadaLink.AuditLog.Tests/Integration/SyncCallEmissionEndToEndTests.cs` — uses `MsSqlMigrationFixture` (the M1 reusable fixture; depend on `Xunit.SkippableFact`):
- Brings up `SqliteAuditWriter` against `:memory:`.
- Brings up `SiteAuditTelemetryActor` via TestKit.
- Brings up `AuditLogIngestActor` via TestKit, configured with the MSSQL `IAuditLogRepository` from M1.
- Stubs the gRPC client by overriding the actor's gRPC dependency with a direct `IActorRef`-backed mock that forwards `IngestAuditEvents` directly to the central actor.
- Writes one `AuditEvent` via the FallbackAuditWriter.
- Drives a `Drain` tick on the telemetry actor.
- Asserts the row appears in the MS SQL `AuditLog` table within 5 seconds via `IAuditLogRepository.QueryAsync`.
**Steps:**
1. Failing test (telemetry not yet wired).
2. Wire the components together via the test harness.
3. Run: pass.
4. Commit: `test(auditlog): end-to-end sync-call emission via TestKit + MSSQL fixture (#23)`.
**Bundle H acceptance:** New test passes when MSSQL container is up; skips cleanly when down.
---
## Final cross-bundle review
After Bundles A-H, dispatch a final reviewer agent with the same template as M1's. Acceptance gate: full `dotnet test ScadaLink.slnx` green. Then merge `--no-ff` with summary; update M3-M8 with M2 realities; status paragraph; proceed to M3.


@@ -0,0 +1,153 @@
using Akka.Actor;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Logging;
using ScadaLink.Commons.Entities.Audit;
using ScadaLink.Commons.Interfaces.Repositories;
using ScadaLink.Commons.Messages.Audit;
namespace ScadaLink.AuditLog.Central;
/// <summary>
/// Central-side singleton (per Bundle E wiring) that ingests batches of
/// <see cref="AuditEvent"/> rows pushed from sites via the
/// <c>IngestAuditEvents</c> gRPC RPC. Each row is stamped with the central-side
/// <see cref="AuditEvent.IngestedAtUtc"/> and inserted idempotently via
/// <see cref="IAuditLogRepository.InsertIfNotExistsAsync"/> — duplicates are
/// silently swallowed (first-write-wins per Bundle A's hardening).
/// </summary>
/// <remarks>
/// <para>
/// Idempotency is the contract: a row that already exists at central counts
/// as "accepted" for the purposes of the reply, because the storage state is
/// consistent and the site is free to flip its local row to <c>Forwarded</c>.
/// </para>
/// <para>
/// Per Bundle D's brief, audit-write failures must NEVER abort the user-facing
/// action. The actor wraps each repository call in its own try/catch so a
/// single bad row cannot cause the rest of the batch to be lost; the actor's
/// <see cref="SupervisorStrategy"/> uses <c>Resume</c> so a thrown exception
/// inside <c>ReceiveAsync</c> does not restart the actor (which would also
/// reset any in-flight state).
/// </para>
/// <para>
/// Two constructors exist for a deliberate reason: Bundle D's tests inject a
/// concrete <see cref="IAuditLogRepository"/> against a per-test MSSQL fixture
/// (the only way to verify the IngestedAtUtc stamp + duplicate-key idempotency
/// end to end), while Bundle E's host wiring registers the actor as a cluster
/// singleton and must therefore resolve the repository (a scoped EF
/// Core service) from a fresh DI scope per message. This mirrors the Notification
/// Outbox actor's pattern.
/// </para>
/// </remarks>
public class AuditLogIngestActor : ReceiveActor
{
private readonly IServiceProvider? _serviceProvider;
private readonly IAuditLogRepository? _injectedRepository;
private readonly ILogger<AuditLogIngestActor> _logger;
/// <summary>
/// Test-mode constructor — injects a concrete repository instance whose
/// lifetime exceeds the test, so the actor reuses the same instance across
/// every message. Used by Bundle D's MSSQL-backed TestKit fixture.
/// </summary>
public AuditLogIngestActor(
IAuditLogRepository repository,
ILogger<AuditLogIngestActor> logger)
{
ArgumentNullException.ThrowIfNull(repository);
ArgumentNullException.ThrowIfNull(logger);
_injectedRepository = repository;
_logger = logger;
ReceiveAsync<IngestAuditEventsCommand>(OnIngestAsync);
}
/// <summary>
/// Production constructor — resolves <see cref="IAuditLogRepository"/> from
/// a fresh DI scope per message because the repository is a scoped EF Core
/// service registered by <c>AddConfigurationDatabase</c>. The actor itself
/// is a long-lived cluster singleton, so it cannot hold a scope across
/// messages.
/// </summary>
public AuditLogIngestActor(
IServiceProvider serviceProvider,
ILogger<AuditLogIngestActor> logger)
{
ArgumentNullException.ThrowIfNull(serviceProvider);
ArgumentNullException.ThrowIfNull(logger);
_serviceProvider = serviceProvider;
_logger = logger;
ReceiveAsync<IngestAuditEventsCommand>(OnIngestAsync);
}
/// <summary>
/// Audit-write failures are best-effort by design (see alog.md §13): a
/// thrown exception in the ingest pipeline must not crash the actor.
/// Resume keeps the actor's state intact so the next batch is processed
/// against the same repository instance.
/// </summary>
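protected override SupervisorStrategy SupervisorStrategy()
{
// Resume on every exception, as documented above; the default decider would
// restart instead. Note that this strategy governs this actor's children.
return new OneForOneStrategy(maxNrOfRetries: 0, withinTimeRange: TimeSpan.Zero, decider:
Decider.From(_ => Directive.Resume));
}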
private async Task OnIngestAsync(IngestAuditEventsCommand cmd)
{
// Sender is captured before the first await — Akka resets Sender
// between message dispatches, so a post-await Tell would go to
// DeadLetters.
var replyTo = Sender;
var nowUtc = DateTime.UtcNow;
var accepted = new List<Guid>(cmd.Events.Count);
// Resolve the repository for the whole batch — one DbContext per
// message, mirroring NotificationOutboxActor. The injected-repository
// mode (Bundle D tests) skips the scope entirely.
IServiceScope? scope = null;
IAuditLogRepository repository;
if (_injectedRepository is not null)
{
repository = _injectedRepository;
}
else
{
scope = _serviceProvider!.CreateScope();
repository = scope.ServiceProvider.GetRequiredService<IAuditLogRepository>();
}
try
{
foreach (var evt in cmd.Events)
{
try
{
// Stamp IngestedAtUtc here, not at the site. Bundle A's
// repository hardening already swallows duplicate-key races,
// so the same id arriving twice (site retry, reconciliation)
// is a silent no-op.
var ingested = evt with { IngestedAtUtc = nowUtc };
await repository.InsertIfNotExistsAsync(ingested).ConfigureAwait(false);
accepted.Add(evt.EventId);
}
catch (Exception ex)
{
// Per-row catch — one bad row never sinks the whole batch.
// The row stays Pending at the site; the next drain retries.
_logger.LogError(ex,
"Failed to persist audit event {EventId} during batch ingest; row will be retried by the site.",
evt.EventId);
}
}
}
finally
{
scope?.Dispose();
}
replyTo.Tell(new IngestAuditEventsReply(accepted));
}
}


@@ -8,7 +8,12 @@
</PropertyGroup>
<ItemGroup>
<!-- Bundle D D1: SiteAuditTelemetryActor + (D2) AuditLogIngestActor live
in this project, so Akka is an explicit dependency. -->
<PackageReference Include="Akka" />
<PackageReference Include="Microsoft.Data.Sqlite" />
<PackageReference Include="Microsoft.Extensions.DependencyInjection.Abstractions" />
<PackageReference Include="Microsoft.Extensions.Logging.Abstractions" />
<PackageReference Include="Microsoft.Extensions.Options" />
<PackageReference Include="Microsoft.Extensions.Options.ConfigurationExtensions" />
</ItemGroup>
@@ -19,6 +24,8 @@
IAuditLogRepository is registered by ScadaLink.ConfigurationDatabase; the project
reference is documented here so M2 writers + telemetry actors can depend on it. -->
<ProjectReference Include="../ScadaLink.ConfigurationDatabase/ScadaLink.ConfigurationDatabase.csproj" />
<!-- Communication carries the IngestAuditEvents proto + DTOs (#23 M2 site sync). -->
<ProjectReference Include="../ScadaLink.Communication/ScadaLink.Communication.csproj" />
</ItemGroup>
<ItemGroup>

@@ -1,44 +1,139 @@
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.DependencyInjection.Extensions;
using Microsoft.Extensions.Logging;
using Microsoft.Extensions.Options;
using ScadaLink.AuditLog.Configuration;
using ScadaLink.AuditLog.Site;
using ScadaLink.AuditLog.Site.Telemetry;
using ScadaLink.Commons.Interfaces.Services;
namespace ScadaLink.AuditLog;
/// <summary>
/// Composition root for the Audit Log (#23) component.
/// </summary>
/// <remarks>
/// <para>
/// M1 registered <see cref="AuditLogOptions"/> + the validator. M2 Bundle E
/// extends the surface with the site-side writer chain
/// (<see cref="SqliteAuditWriter"/> + <see cref="RingBufferFallback"/> +
/// <see cref="FallbackAuditWriter"/>) and the telemetry collaborators
/// (<see cref="ISiteAuditQueue"/>, <see cref="ISiteStreamAuditClient"/>,
/// <see cref="IAuditWriteFailureCounter"/>, <see cref="SiteAuditTelemetryOptions"/>,
/// <see cref="SqliteAuditWriterOptions"/>).
/// </para>
/// <para>
/// Audit Log (#23) sits alongside Notification Outbox (#21) and Site Call
/// Audit (#22). <c>IAuditLogRepository</c> is registered by
/// <c>ScadaLink.ConfigurationDatabase.ServiceCollectionExtensions.AddConfigurationDatabase</c>,
/// so the caller (the Host on the central node) must also call that.
/// </para>
/// </remarks>
public static class ServiceCollectionExtensions
{
/// <summary>Configuration section bound to <see cref="AuditLogOptions"/>.</summary>
public const string ConfigSectionName = "AuditLog";
/// <summary>Configuration section bound to <see cref="SqliteAuditWriterOptions"/>.</summary>
public const string SiteWriterSectionName = "AuditLog:SiteWriter";
/// <summary>Configuration section bound to <see cref="SiteAuditTelemetryOptions"/>.</summary>
public const string SiteTelemetrySectionName = "AuditLog:SiteTelemetry";
/// <summary>
/// Registers the Audit Log (#23) component services: options, the site
/// SQLite writer chain (primary + ring fallback + failure-counter sink),
/// and the site→central telemetry collaborators. Idempotent re-registration
/// is not supported; call this exactly once per <see cref="IServiceCollection"/>.
/// </summary>
public static IServiceCollection AddAuditLog(this IServiceCollection services, IConfiguration config)
{
ArgumentNullException.ThrowIfNull(services);
ArgumentNullException.ThrowIfNull(config);
// M1: top-level AuditLogOptions + validator (redaction policy, payload caps, etc.).
services.AddOptions<AuditLogOptions>()
.Bind(config.GetSection(ConfigSectionName))
.ValidateOnStart();
services.AddSingleton<IValidateOptions<AuditLogOptions>, AuditLogOptionsValidator>();
// M2 Bundle E: site writer + telemetry options bindings.
// BindConfiguration is not used because the configuration root supplied
// by the caller may not be the application root — we go through the
// section explicitly so a partial IConfiguration (e.g. a test stub
// anchored on the AuditLog section's parent) still works.
services.AddOptions<SqliteAuditWriterOptions>()
.Bind(config.GetSection(SiteWriterSectionName));
services.AddOptions<SiteAuditTelemetryOptions>()
.Bind(config.GetSection(SiteTelemetrySectionName));
// SqliteAuditWriter is a singleton with a single owned SqliteConnection
// and a background writer Task; multiple instances would race on the
// same file. Registered concretely so the ISiteAuditQueue + IAuditWriter
// forwards below resolve to the same instance — the actor must observe
// the writes made via the hot-path interface.
services.AddSingleton<SqliteAuditWriter>();
services.AddSingleton<ISiteAuditQueue>(sp => sp.GetRequiredService<SqliteAuditWriter>());
// RingBufferFallback: drop-oldest in-memory ring used by
// FallbackAuditWriter when the primary SQLite writer throws. Default
// capacity is fine for M2 (1024).
services.AddSingleton<RingBufferFallback>();
// IAuditWriteFailureCounter: NoOp default. Bundle G overrides this
// binding with the real Site Health Monitoring counter. The
// FallbackAuditWriter factory below resolves it lazily, at first use.
services.AddSingleton<IAuditWriteFailureCounter, NoOpAuditWriteFailureCounter>();
// The script-thread surface is FallbackAuditWriter (primary + ring +
// counter), not the raw SqliteAuditWriter — primary failures must NEVER
// abort the user-facing action.
services.AddSingleton<IAuditWriter>(sp => new FallbackAuditWriter(
primary: sp.GetRequiredService<SqliteAuditWriter>(),
ring: sp.GetRequiredService<RingBufferFallback>(),
failureCounter: sp.GetRequiredService<IAuditWriteFailureCounter>(),
logger: sp.GetRequiredService<ILogger<FallbackAuditWriter>>()));
// ISiteStreamAuditClient: NoOp default. M6's reconciliation work brings
// the real gRPC-backed implementation (no site→central gRPC channel
// exists today — sites talk to central via Akka ClusterClient only).
// Bundle H's integration test substitutes a stub directly into the
// SiteAuditTelemetryActor's Props.Create call.
services.AddSingleton<ISiteStreamAuditClient, NoOpSiteStreamAuditClient>();
return services;
}
/// <summary>
/// Audit Log (#23) M2 Bundle G — swap the default
/// <see cref="NoOpAuditWriteFailureCounter"/> registration for the real
/// <see cref="HealthMetricsAuditWriteFailureCounter"/> bridge so the
/// FallbackAuditWriter primary-failure counter surfaces in the site health
/// report payload as <c>SiteHealthReport.SiteAuditWriteFailures</c>.
/// </summary>
/// <remarks>
/// <para>
/// Must be called AFTER both <see cref="AddAuditLog"/> (registers the
/// NoOp default this method replaces) and
/// <c>ScadaLink.HealthMonitoring.ServiceCollectionExtensions.AddHealthMonitoring</c>
/// or <c>AddSiteHealthMonitoring</c> (registers the
/// <see cref="ISiteHealthCollector"/> the bridge depends on). Resolving
/// <see cref="IAuditWriteFailureCounter"/> without the latter throws
/// <see cref="InvalidOperationException"/> at <c>GetRequiredService</c>
/// time — by design, since a silent NoOp would mask a misconfiguration.
/// </para>
/// <para>
/// Idempotent — calling twice replaces the descriptor each time without
/// piling up registrations.
/// </para>
/// </remarks>
public static IServiceCollection AddAuditLogHealthMetricsBridge(this IServiceCollection services)
{
ArgumentNullException.ThrowIfNull(services);
services.Replace(
ServiceDescriptor.Singleton<IAuditWriteFailureCounter, HealthMetricsAuditWriteFailureCounter>());
return services;
}
}

@@ -0,0 +1,125 @@
using Microsoft.Extensions.Logging;
using ScadaLink.Commons.Entities.Audit;
using ScadaLink.Commons.Interfaces.Services;
namespace ScadaLink.AuditLog.Site;
/// <summary>
/// Composes the primary <see cref="SqliteAuditWriter"/> with a drop-oldest
/// <see cref="RingBufferFallback"/>. Audit writes are best-effort by contract
/// (see <see cref="IAuditWriter"/>) — a primary failure must NEVER bubble out
/// to the calling script. Failed events are stashed in the ring; on the next
/// successful primary write the ring is drained back through the primary in
/// FIFO order.
/// </summary>
/// <remarks>
/// <para>
/// Each primary failure increments <see cref="IAuditWriteFailureCounter"/> so
/// Site Health Monitoring can surface a sustained outage as
/// <c>SiteAuditWriteFailures</c> (Bundle G).
/// </para>
/// <para>
/// Errors raised by the ring drain on recovery are logged and swallowed, and
/// the affected event is re-enqueued, so we don't loop the failure mode: the
/// trigger event itself succeeded, and retrying the drain on the NEXT
/// successful write is the recovery path.
/// </para>
/// </remarks>
public sealed class FallbackAuditWriter : IAuditWriter
{
private readonly IAuditWriter _primary;
private readonly RingBufferFallback _ring;
private readonly IAuditWriteFailureCounter _failureCounter;
private readonly ILogger<FallbackAuditWriter> _logger;
private readonly SemaphoreSlim _drainGate = new(1, 1);
public FallbackAuditWriter(
IAuditWriter primary,
RingBufferFallback ring,
IAuditWriteFailureCounter failureCounter,
ILogger<FallbackAuditWriter> logger)
{
_primary = primary ?? throw new ArgumentNullException(nameof(primary));
_ring = ring ?? throw new ArgumentNullException(nameof(ring));
_failureCounter = failureCounter ?? throw new ArgumentNullException(nameof(failureCounter));
_logger = logger ?? throw new ArgumentNullException(nameof(logger));
}
public async Task WriteAsync(AuditEvent evt, CancellationToken ct = default)
{
ArgumentNullException.ThrowIfNull(evt);
try
{
await _primary.WriteAsync(evt, ct).ConfigureAwait(false);
}
catch (Exception ex)
{
// Primary down: record the failure, stash in the ring, return
// success to the caller. Audit-write failures NEVER abort the
// user-facing action (alog.md §7). DO NOT attempt the ring drain
// here — primary is throwing, draining would just scramble FIFO
// order across re-enqueues.
_failureCounter.Increment();
_logger.LogWarning(ex,
"Primary audit writer threw; routing EventId {EventId} to drop-oldest ring.",
evt.EventId);
_ring.TryEnqueue(evt);
return;
}
// Primary succeeded — opportunistically drain anything that piled up
// in the ring during the outage. Best-effort: a failure during the
// drain re-enqueues the popped event and is logged; the next
// successful write will retry. Drain order in the audit log is
// therefore: <triggering event>, <backlog FIFO>.
if (_ring.Count > 0)
{
await TryDrainRingAsync(ct).ConfigureAwait(false);
}
}
private async Task TryDrainRingAsync(CancellationToken ct)
{
// Serialise drains so two concurrent recoveries don't double-replay.
if (!await _drainGate.WaitAsync(0, ct).ConfigureAwait(false))
{
return;
}
try
{
// Pull only what is currently buffered; do NOT wait for new events.
// We iterate with a snapshot of Count so we never starve under
// concurrent enqueues.
var pending = _ring.Count;
for (var i = 0; i < pending; i++)
{
if (!_ring.TryDequeue(out var queued))
{
break;
}
try
{
await _primary.WriteAsync(queued, ct).ConfigureAwait(false);
}
catch (Exception ex)
{
// Primary fell over again. Putting the event back at the head
// of the queue is impossible with Channel<T>, so route it to the
// tail (drop-oldest preserves the most-recent picture).
_failureCounter.Increment();
_logger.LogWarning(ex,
"Ring drain re-throw on EventId {EventId}; re-enqueuing.",
queued.EventId);
_ring.TryEnqueue(queued);
break;
}
}
}
finally
{
_drainGate.Release();
}
}
}

@@ -0,0 +1,33 @@
using ScadaLink.HealthMonitoring;
namespace ScadaLink.AuditLog.Site;
/// <summary>
/// Audit Log (#23) M2 Bundle G — bridges <see cref="IAuditWriteFailureCounter"/>
/// (incremented by <see cref="FallbackAuditWriter"/> every time the primary
/// SQLite writer throws) into <see cref="ISiteHealthCollector"/> so the count
/// surfaces in the site health report payload as
/// <c>SiteHealthReport.SiteAuditWriteFailures</c>.
/// </summary>
/// <remarks>
/// <para>
/// Registered by <see cref="ServiceCollectionExtensions.AddAuditLogHealthMetricsBridge"/>;
/// callers must register <c>AddHealthMonitoring()</c> first so
/// <see cref="ISiteHealthCollector"/> resolves. The default <see cref="ServiceCollectionExtensions.AddAuditLog"/>
/// registration keeps <see cref="NoOpAuditWriteFailureCounter"/> for nodes
/// where Site Health Monitoring is not wired (the silent-sink contract — audit
/// write failures must NEVER abort the user-facing action, alog.md §7).
/// </para>
/// </remarks>
public sealed class HealthMetricsAuditWriteFailureCounter : IAuditWriteFailureCounter
{
private readonly ISiteHealthCollector _collector;
public HealthMetricsAuditWriteFailureCounter(ISiteHealthCollector collector)
{
_collector = collector ?? throw new ArgumentNullException(nameof(collector));
}
/// <inheritdoc/>
public void Increment() => _collector.IncrementSiteAuditWriteFailures();
}

@@ -0,0 +1,14 @@
namespace ScadaLink.AuditLog.Site;
/// <summary>
/// Lightweight counter sink invoked by <see cref="FallbackAuditWriter"/> every
/// time the primary <see cref="SqliteAuditWriter"/> throws on an audit write.
/// Bundle G (M2-T11) implements this as a thread-safe Interlocked counter
/// bridged into the Site Health Monitoring report payload as
/// <c>SiteAuditWriteFailures</c>.
/// </summary>
public interface IAuditWriteFailureCounter
{
/// <summary>Increment the audit-write failure counter by one.</summary>
void Increment();
}

@@ -0,0 +1,25 @@
namespace ScadaLink.AuditLog.Site;
/// <summary>
/// Default <see cref="IAuditWriteFailureCounter"/> registered by
/// <see cref="ScadaLink.AuditLog.ServiceCollectionExtensions.AddAuditLog"/> on
/// every node. Bundle G replaces this binding with a real counter that bridges
/// into the Site Health Monitoring report payload as
/// <c>SiteAuditWriteFailures</c>. Until then,
/// <see cref="FallbackAuditWriter"/> emits to a silent sink rather than
/// throwing a <see cref="NullReferenceException"/> on a null collaborator.
/// </summary>
/// <remarks>
/// Audit-write failures must NEVER abort the user-facing action (alog.md §7),
/// so the counter is best-effort by contract. A NoOp default is the correct
/// safe fallback while the health metric is being wired in.
/// </remarks>
public sealed class NoOpAuditWriteFailureCounter : IAuditWriteFailureCounter
{
/// <inheritdoc/>
public void Increment()
{
// Intentionally empty. Bundle G overrides this binding with the real
// counter once Site Health Monitoring is wired.
}
}

@@ -0,0 +1,115 @@
using System.Runtime.CompilerServices;
using System.Threading.Channels;
using ScadaLink.Commons.Entities.Audit;
namespace ScadaLink.AuditLog.Site;
/// <summary>
/// Drop-oldest in-memory ring buffer used by <see cref="FallbackAuditWriter"/>
/// when the primary SQLite writer is throwing. Capacity is fixed at construction
/// (default 1024). When full, the oldest event is silently dropped to make room
/// for the newest — preserving the most recent picture of activity in the face
/// of an extended SQLite outage — and <see cref="RingBufferOverflowed"/> is
/// raised so a health counter can record the loss.
/// </summary>
/// <remarks>
/// <para>
/// Backed by a <see cref="Channel{T}"/> with
/// <see cref="BoundedChannelFullMode.DropOldest"/>. No drop callback is
/// registered on the channel, so this class reads <c>Reader.Count</c>
/// immediately before each enqueue: if the ring was already at capacity, the
/// write displaced exactly one event. With concurrent writers the check is a
/// best-effort signal, not an exact drop count.
/// </para>
/// <para>
/// Per the M2 plan: the ring is the absolute-last-resort buffer for the
/// hot-path; it is NOT a substitute for the bounded
/// <see cref="SqliteAuditWriter"/> write queue.
/// </para>
/// </remarks>
public sealed class RingBufferFallback
{
private readonly Channel<AuditEvent> _channel;
private readonly int _capacity;
/// <summary>
/// Raised once each time a drop-oldest overflow occurs. Hooked by
/// <see cref="FallbackAuditWriter"/>'s health counter wiring.
/// </summary>
public event Action? RingBufferOverflowed;
public RingBufferFallback(int capacity = 1024)
{
if (capacity <= 0)
{
throw new ArgumentOutOfRangeException(nameof(capacity), "capacity must be > 0.");
}
_capacity = capacity;
_channel = Channel.CreateBounded<AuditEvent>(new BoundedChannelOptions(capacity)
{
FullMode = BoundedChannelFullMode.DropOldest,
SingleReader = true,
SingleWriter = false,
});
}
/// <summary>Current event count in the ring (for diagnostics/tests).</summary>
public int Count => _channel.Reader.Count;
/// <summary>
/// Try to enqueue an event. Returns <see langword="true"/> on success (even
/// when an overflow caused an older event to be dropped); returns
/// <see langword="false"/> only when the ring has been
/// <see cref="Complete"/>-d.
/// </summary>
public bool TryEnqueue(AuditEvent evt)
{
ArgumentNullException.ThrowIfNull(evt);
// DropOldest TryWrite always succeeds unless the channel is completed.
// Detect overflow from the count read before the write: if the ring was
// already at capacity, exactly one event was dropped to make room for evt.
var beforeCount = _channel.Reader.Count;
if (!_channel.Writer.TryWrite(evt))
{
return false;
}
if (beforeCount >= _capacity)
{
// The new event displaced an existing one.
RingBufferOverflowed?.Invoke();
}
return true;
}
/// <summary>
/// Drain the ring in FIFO order. Yields available events immediately and
/// then completes when the channel is empty AND <see cref="Complete"/> has
/// been called. Callers that only want to drain what's currently buffered
/// must call <see cref="Complete"/> first.
/// </summary>
public async IAsyncEnumerable<AuditEvent> DrainAsync(
[EnumeratorCancellation] CancellationToken cancellationToken)
{
await foreach (var evt in _channel.Reader.ReadAllAsync(cancellationToken).ConfigureAwait(false))
{
yield return evt;
}
}
/// <summary>
/// Non-blocking single-item dequeue used by the
/// <see cref="FallbackAuditWriter"/> recovery path. Returns
/// <see langword="false"/> when the ring is empty.
/// </summary>
public bool TryDequeue(out AuditEvent evt) => _channel.Reader.TryRead(out evt!);
/// <summary>
/// Mark the ring as no-more-writes. <see cref="DrainAsync"/> will yield the
/// remaining events and then complete.
/// </summary>
public void Complete() => _channel.Writer.TryComplete();
}
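
As an alternative to inferring drops from `Reader.Count`, `Channel.CreateBounded` has an overload that accepts an item-dropped callback. The sketch below assumes the project targets .NET 6 or later, where that overload is available. `DropCountingRing<T>` is a hypothetical stand-in used only for illustration; it is not the `RingBufferFallback` type from this change.

```csharp
using System;
using System.Threading;
using System.Threading.Channels;

// Hypothetical stand-in for RingBufferFallback, not the type from this change.
public sealed class DropCountingRing<T>
{
    private readonly Channel<T> _channel;
    private long _dropped;

    public DropCountingRing(int capacity)
    {
        _channel = Channel.CreateBounded<T>(
            new BoundedChannelOptions(capacity)
            {
                FullMode = BoundedChannelFullMode.DropOldest,
                SingleReader = true,
                SingleWriter = false,
            },
            // The channel invokes this callback for each item it evicts,
            // so no before/after Count comparison is needed.
            _ => Interlocked.Increment(ref _dropped));
    }

    /// <summary>Number of items evicted by drop-oldest so far.</summary>
    public long Dropped => Interlocked.Read(ref _dropped);

    public int Count => _channel.Reader.Count;

    public bool TryEnqueue(T item) => _channel.Writer.TryWrite(item);

    public bool TryDequeue(out T item) => _channel.Reader.TryRead(out item!);
}
```

Because the callback is raised by the channel itself, the drop count does not depend on what other writers do between a count read and the write.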

@@ -0,0 +1,479 @@
using System.Threading.Channels;
using Microsoft.Data.Sqlite;
using Microsoft.Extensions.Logging;
using Microsoft.Extensions.Options;
using ScadaLink.AuditLog.Site.Telemetry;
using ScadaLink.Commons.Entities.Audit;
using ScadaLink.Commons.Interfaces.Services;
using ScadaLink.Commons.Types.Enums;
namespace ScadaLink.AuditLog.Site;
/// <summary>
/// Site-side SQLite hot-path writer for Audit Log (#23) events. Mirrors the
/// <see cref="ScadaLink.SiteEventLogging.SiteEventLogger"/> design — a single
/// owned <see cref="SqliteConnection"/> serialised behind a write lock, fed by a
/// bounded <see cref="Channel{T}"/> drained on a dedicated background writer
/// task — so script-thread callers never block on disk I/O.
/// </summary>
/// <remarks>
/// <para>
/// The schema is bootstrapped in the constructor (Bundle B-T1). The
/// Channel-based <see cref="WriteAsync"/> hot-path + Bundle D
/// <see cref="ReadPendingAsync"/> / <see cref="MarkForwardedAsync"/> support
/// surface are wired in Bundle B-T2.
/// </para>
/// <para>
/// Site rows always carry <see cref="AuditForwardState.Pending"/> on first
/// insert; the central row-shape's <c>IngestedAtUtc</c> column does NOT live in
/// the site SQLite schema — central stamps it on ingest.
/// </para>
/// </remarks>
public class SqliteAuditWriter : IAuditWriter, ISiteAuditQueue, IAsyncDisposable, IDisposable
{
// Microsoft.Data.Sqlite reports a generic SQLITE_CONSTRAINT (error code 19)
// on a PRIMARY KEY violation; the extended subcode 1555 (SQLITE_CONSTRAINT_PRIMARYKEY)
// is exposed via SqliteException.SqliteExtendedErrorCode but isn't reliably
// surfaced across all SQLite builds. We treat any constraint error on insert
// as a duplicate-EventId race and swallow it (first-write-wins). The PRIMARY
// KEY on EventId is the only constraint this writer can trip, because every
// NOT NULL column is always bound to a non-null value, so this scope is precise.
private const int SqliteErrorConstraint = 19;
private readonly SqliteConnection _connection;
private readonly SqliteAuditWriterOptions _options;
private readonly ILogger<SqliteAuditWriter> _logger;
private readonly object _writeLock = new();
private readonly Channel<PendingAuditEvent> _writeQueue;
private readonly Task _writerLoop;
private bool _disposed;
public SqliteAuditWriter(
IOptions<SqliteAuditWriterOptions> options,
ILogger<SqliteAuditWriter> logger,
string? connectionStringOverride = null)
{
ArgumentNullException.ThrowIfNull(options);
ArgumentNullException.ThrowIfNull(logger);
_options = options.Value;
_logger = logger;
var connectionString = connectionStringOverride
?? $"Data Source={_options.DatabasePath};Cache=Shared";
_connection = new SqliteConnection(connectionString);
_connection.Open();
InitializeSchema();
_writeQueue = Channel.CreateBounded<PendingAuditEvent>(
new BoundedChannelOptions(_options.ChannelCapacity)
{
// The hot-path enqueue must back-pressure if the background
// writer falls behind; a higher-level fallback (Bundle B-T4)
// handles truly catastrophic primary failure with a drop-oldest
// ring buffer.
FullMode = BoundedChannelFullMode.Wait,
SingleReader = true,
SingleWriter = false,
});
_writerLoop = Task.Run(ProcessWriteQueueAsync);
}
private void InitializeSchema()
{
// auto_vacuum must be set before any table is created for it to take
// effect on a fresh database. INCREMENTAL lets a future
// `PRAGMA incremental_vacuum` shrink the file after the 7-day retention
// purge — see alog.md §10.
using (var pragmaCmd = _connection.CreateCommand())
{
pragmaCmd.CommandText = "PRAGMA auto_vacuum = INCREMENTAL";
pragmaCmd.ExecuteNonQuery();
}
using var cmd = _connection.CreateCommand();
cmd.CommandText = """
CREATE TABLE IF NOT EXISTS AuditLog (
EventId TEXT NOT NULL,
OccurredAtUtc TEXT NOT NULL,
Channel TEXT NOT NULL,
Kind TEXT NOT NULL,
CorrelationId TEXT NULL,
SourceSiteId TEXT NULL,
SourceInstanceId TEXT NULL,
SourceScript TEXT NULL,
Actor TEXT NULL,
Target TEXT NULL,
Status TEXT NOT NULL,
HttpStatus INTEGER NULL,
DurationMs INTEGER NULL,
ErrorMessage TEXT NULL,
ErrorDetail TEXT NULL,
RequestSummary TEXT NULL,
ResponseSummary TEXT NULL,
PayloadTruncated INTEGER NOT NULL,
Extra TEXT NULL,
ForwardState TEXT NOT NULL,
PRIMARY KEY (EventId)
);
CREATE INDEX IF NOT EXISTS IX_SiteAuditLog_ForwardState_Occurred
ON AuditLog (ForwardState, OccurredAtUtc);
""";
cmd.ExecuteNonQuery();
}
/// <summary>
/// Enqueue an event for durable persistence. The returned <see cref="Task"/>
/// completes once the event has been INSERTed (or, in the duplicate-EventId
/// case, recognised as already present); it faults only if the writer loop
/// itself collapses. The enqueue side never blocks on disk I/O — it only
/// awaits the bounded channel's back-pressure when the writer is briefly
/// behind.
/// </summary>
public Task WriteAsync(AuditEvent evt, CancellationToken ct = default)
{
ArgumentNullException.ThrowIfNull(evt);
// Site rows always carry a non-null ForwardState; central rows leave it
// null. Force Pending on enqueue so callers can pass a bare AuditEvent
// without thinking about site-vs-central provenance.
var siteEvt = evt.ForwardState is null
? evt with { ForwardState = AuditForwardState.Pending }
: evt;
var pending = new PendingAuditEvent(siteEvt);
// CreateBounded(FullMode=Wait) means WriteAsync will await room rather
// than throw when full — exactly the hot-path back-pressure semantics
// we want.
if (!_writeQueue.Writer.TryWrite(pending))
{
// The writer is either completed (audit writer disposed) or the
// channel is at capacity. Fall back to the async path which honours
// the FullMode=Wait policy.
return WriteSlowPathAsync(pending, ct);
}
return pending.Completion.Task;
}
private async Task WriteSlowPathAsync(PendingAuditEvent pending, CancellationToken ct)
{
try
{
await _writeQueue.Writer.WriteAsync(pending, ct).ConfigureAwait(false);
}
catch (ChannelClosedException)
{
pending.Completion.TrySetException(
new ObjectDisposedException(nameof(SqliteAuditWriter),
"Event could not be recorded: the audit writer has been disposed."));
}
await pending.Completion.Task.ConfigureAwait(false);
}
private async Task ProcessWriteQueueAsync()
{
var batch = new List<PendingAuditEvent>(_options.BatchSize);
// ReadAllAsync completes when the channel is marked complete (Dispose).
await foreach (var first in _writeQueue.Reader.ReadAllAsync().ConfigureAwait(false))
{
batch.Clear();
batch.Add(first);
// Pull additional ready events up to BatchSize. TryRead is non-
// blocking and lets us amortise the transaction overhead across a
// burst of concurrent enqueues.
while (batch.Count < _options.BatchSize &&
_writeQueue.Reader.TryRead(out var next))
{
batch.Add(next);
}
FlushBatch(batch);
}
}
private void FlushBatch(IReadOnlyList<PendingAuditEvent> batch)
{
lock (_writeLock)
{
if (_disposed)
{
foreach (var pending in batch)
{
pending.Completion.TrySetException(
new ObjectDisposedException(nameof(SqliteAuditWriter),
"Event could not be recorded: the audit writer was disposed before the write completed."));
}
return;
}
using var transaction = _connection.BeginTransaction();
try
{
using var cmd = _connection.CreateCommand();
cmd.Transaction = transaction;
cmd.CommandText = """
INSERT INTO AuditLog (
EventId, OccurredAtUtc, Channel, Kind, CorrelationId,
SourceSiteId, SourceInstanceId, SourceScript, Actor, Target,
Status, HttpStatus, DurationMs, ErrorMessage, ErrorDetail,
RequestSummary, ResponseSummary, PayloadTruncated, Extra, ForwardState
) VALUES (
$EventId, $OccurredAtUtc, $Channel, $Kind, $CorrelationId,
$SourceSiteId, $SourceInstanceId, $SourceScript, $Actor, $Target,
$Status, $HttpStatus, $DurationMs, $ErrorMessage, $ErrorDetail,
$RequestSummary, $ResponseSummary, $PayloadTruncated, $Extra, $ForwardState
);
""";
var pEventId = cmd.Parameters.Add("$EventId", SqliteType.Text);
var pOccurredAt = cmd.Parameters.Add("$OccurredAtUtc", SqliteType.Text);
var pChannel = cmd.Parameters.Add("$Channel", SqliteType.Text);
var pKind = cmd.Parameters.Add("$Kind", SqliteType.Text);
var pCorrelationId = cmd.Parameters.Add("$CorrelationId", SqliteType.Text);
var pSourceSiteId = cmd.Parameters.Add("$SourceSiteId", SqliteType.Text);
var pSourceInstanceId = cmd.Parameters.Add("$SourceInstanceId", SqliteType.Text);
var pSourceScript = cmd.Parameters.Add("$SourceScript", SqliteType.Text);
var pActor = cmd.Parameters.Add("$Actor", SqliteType.Text);
var pTarget = cmd.Parameters.Add("$Target", SqliteType.Text);
var pStatus = cmd.Parameters.Add("$Status", SqliteType.Text);
var pHttpStatus = cmd.Parameters.Add("$HttpStatus", SqliteType.Integer);
var pDurationMs = cmd.Parameters.Add("$DurationMs", SqliteType.Integer);
var pErrorMessage = cmd.Parameters.Add("$ErrorMessage", SqliteType.Text);
var pErrorDetail = cmd.Parameters.Add("$ErrorDetail", SqliteType.Text);
var pRequestSummary = cmd.Parameters.Add("$RequestSummary", SqliteType.Text);
var pResponseSummary = cmd.Parameters.Add("$ResponseSummary", SqliteType.Text);
var pPayloadTruncated = cmd.Parameters.Add("$PayloadTruncated", SqliteType.Integer);
var pExtra = cmd.Parameters.Add("$Extra", SqliteType.Text);
var pForwardState = cmd.Parameters.Add("$ForwardState", SqliteType.Text);
foreach (var pending in batch)
{
var e = pending.Event;
pEventId.Value = e.EventId.ToString();
pOccurredAt.Value = e.OccurredAtUtc.ToString("o");
pChannel.Value = e.Channel.ToString();
pKind.Value = e.Kind.ToString();
pCorrelationId.Value = (object?)e.CorrelationId?.ToString() ?? DBNull.Value;
pSourceSiteId.Value = (object?)e.SourceSiteId ?? DBNull.Value;
pSourceInstanceId.Value = (object?)e.SourceInstanceId ?? DBNull.Value;
pSourceScript.Value = (object?)e.SourceScript ?? DBNull.Value;
pActor.Value = (object?)e.Actor ?? DBNull.Value;
pTarget.Value = (object?)e.Target ?? DBNull.Value;
pStatus.Value = e.Status.ToString();
pHttpStatus.Value = (object?)e.HttpStatus ?? DBNull.Value;
pDurationMs.Value = (object?)e.DurationMs ?? DBNull.Value;
pErrorMessage.Value = (object?)e.ErrorMessage ?? DBNull.Value;
pErrorDetail.Value = (object?)e.ErrorDetail ?? DBNull.Value;
pRequestSummary.Value = (object?)e.RequestSummary ?? DBNull.Value;
pResponseSummary.Value = (object?)e.ResponseSummary ?? DBNull.Value;
pPayloadTruncated.Value = e.PayloadTruncated ? 1 : 0;
pExtra.Value = (object?)e.Extra ?? DBNull.Value;
pForwardState.Value = (e.ForwardState ?? AuditForwardState.Pending).ToString();
try
{
cmd.ExecuteNonQuery();
}
catch (SqliteException ex) when (ex.SqliteErrorCode == SqliteErrorConstraint)
{
// Duplicate EventId: first-write-wins (alog.md §11).
// Treat as success: the lifecycle event is durably
// recorded under the first writer's payload.
_logger.LogDebug(ex,
"Duplicate EventId {EventId} swallowed by SqliteAuditWriter",
e.EventId);
}
}
transaction.Commit();
// Signal success only after the commit: a failure on a later row
// rolls the whole batch back, so no row may be reported as written
// before the transaction is durable.
foreach (var pending in batch)
{
pending.Completion.TrySetResult();
}
}
catch (Exception ex)
{
transaction.Rollback();
_logger.LogError(ex, "SqliteAuditWriter batch insert failed; faulting {Count} pending events", batch.Count);
foreach (var pending in batch)
{
pending.Completion.TrySetException(ex);
}
}
}
}
/// <summary>
/// Returns up to <paramref name="limit"/> rows in <see cref="AuditForwardState.Pending"/>,
/// oldest <see cref="AuditEvent.OccurredAtUtc"/> first, with <see cref="AuditEvent.EventId"/>
/// as the deterministic tiebreaker. Called by Bundle D's site telemetry
/// actor to build a batch for the gRPC push.
/// </summary>
public Task<IReadOnlyList<AuditEvent>> ReadPendingAsync(int limit, CancellationToken ct = default)
{
if (limit <= 0)
{
throw new ArgumentOutOfRangeException(nameof(limit), "limit must be > 0.");
}
// SqliteConnection is not thread-safe so we go through the same write
// lock the batch INSERTer uses. The actor caller is single-threaded,
// so contention is bounded.
lock (_writeLock)
{
ObjectDisposedException.ThrowIf(_disposed, this);
using var cmd = _connection.CreateCommand();
cmd.CommandText = """
SELECT EventId, OccurredAtUtc, Channel, Kind, CorrelationId,
SourceSiteId, SourceInstanceId, SourceScript, Actor, Target,
Status, HttpStatus, DurationMs, ErrorMessage, ErrorDetail,
RequestSummary, ResponseSummary, PayloadTruncated, Extra, ForwardState
FROM AuditLog
WHERE ForwardState = $pending
ORDER BY OccurredAtUtc ASC, EventId ASC
LIMIT $limit;
""";
cmd.Parameters.AddWithValue("$pending", AuditForwardState.Pending.ToString());
cmd.Parameters.AddWithValue("$limit", limit);
var rows = new List<AuditEvent>(Math.Min(limit, 256));
using var reader = cmd.ExecuteReader();
while (reader.Read())
{
rows.Add(MapRow(reader));
}
return Task.FromResult<IReadOnlyList<AuditEvent>>(rows);
}
}
/// <summary>
/// Flips the supplied EventIds from <see cref="AuditForwardState.Pending"/> to
/// <see cref="AuditForwardState.Forwarded"/> in a single UPDATE. Non-existent
/// or already-forwarded ids are no-ops.
/// </summary>
public Task MarkForwardedAsync(IReadOnlyList<Guid> eventIds, CancellationToken ct = default)
{
ArgumentNullException.ThrowIfNull(eventIds);
if (eventIds.Count == 0)
{
return Task.CompletedTask;
}
lock (_writeLock)
{
ObjectDisposedException.ThrowIf(_disposed, this);
using var cmd = _connection.CreateCommand();
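// Build a single IN (...) parameter list so we issue one UPDATE per
// batch. Each id is bound as its own parameter, so no caller data is
// ever concatenated into the SQL text. Callers must keep the batch under
// SQLite's bound-parameter limit (SQLITE_MAX_VARIABLE_NUMBER: 999 by
// default before SQLite 3.32.0, 32766 from 3.32.0 on); the telemetry
// actor's default BatchSize of 256 is well inside it.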
var sb = new System.Text.StringBuilder();
sb.Append("UPDATE AuditLog SET ForwardState = $forwarded WHERE EventId IN (");
for (int i = 0; i < eventIds.Count; i++)
{
if (i > 0) sb.Append(',');
var p = $"$id{i}";
sb.Append(p);
cmd.Parameters.AddWithValue(p, eventIds[i].ToString());
}
sb.Append(");");
cmd.CommandText = sb.ToString();
cmd.Parameters.AddWithValue("$forwarded", AuditForwardState.Forwarded.ToString());
cmd.ExecuteNonQuery();
return Task.CompletedTask;
}
}
private static AuditEvent MapRow(SqliteDataReader reader)
{
return new AuditEvent
{
EventId = Guid.Parse(reader.GetString(0)),
OccurredAtUtc = DateTime.Parse(reader.GetString(1),
System.Globalization.CultureInfo.InvariantCulture,
System.Globalization.DateTimeStyles.RoundtripKind),
Channel = Enum.Parse<AuditChannel>(reader.GetString(2)),
Kind = Enum.Parse<AuditKind>(reader.GetString(3)),
CorrelationId = reader.IsDBNull(4) ? null : Guid.Parse(reader.GetString(4)),
SourceSiteId = reader.IsDBNull(5) ? null : reader.GetString(5),
SourceInstanceId = reader.IsDBNull(6) ? null : reader.GetString(6),
SourceScript = reader.IsDBNull(7) ? null : reader.GetString(7),
Actor = reader.IsDBNull(8) ? null : reader.GetString(8),
Target = reader.IsDBNull(9) ? null : reader.GetString(9),
Status = Enum.Parse<AuditStatus>(reader.GetString(10)),
HttpStatus = reader.IsDBNull(11) ? null : reader.GetInt32(11),
DurationMs = reader.IsDBNull(12) ? null : reader.GetInt32(12),
ErrorMessage = reader.IsDBNull(13) ? null : reader.GetString(13),
ErrorDetail = reader.IsDBNull(14) ? null : reader.GetString(14),
RequestSummary = reader.IsDBNull(15) ? null : reader.GetString(15),
ResponseSummary = reader.IsDBNull(16) ? null : reader.GetString(16),
PayloadTruncated = reader.GetInt32(17) != 0,
Extra = reader.IsDBNull(18) ? null : reader.GetString(18),
ForwardState = Enum.Parse<AuditForwardState>(reader.GetString(19)),
};
}
public void Dispose()
{
DisposeAsync().AsTask().GetAwaiter().GetResult();
}
public async ValueTask DisposeAsync()
{
Task? writerLoop;
lock (_writeLock)
{
if (_disposed) return;
// Stop accepting new events. _disposed is NOT set yet: the writer
// loop must still be able to drain what is already queued. It is
// set in the second lock below, after which any FlushBatch that
// still runs faults its pending events instead of touching the
// closed connection.
_writeQueue.Writer.TryComplete();
writerLoop = _writerLoop;
}
// Wait outside the lock — the loop reacquires it for each batch.
try
{
if (writerLoop is not null)
{
await writerLoop.WaitAsync(TimeSpan.FromSeconds(5)).ConfigureAwait(false);
}
}
catch (TimeoutException)
{
_logger.LogWarning("SqliteAuditWriter writer loop did not drain within 5s of dispose.");
}
catch (Exception ex)
{
// The loop's per-batch try/catch already routed individual failures
// to pending TCSes; a top-level fault here is unexpected.
_logger.LogError(ex, "SqliteAuditWriter writer loop faulted during dispose.");
}
lock (_writeLock)
{
if (_disposed) return;
_disposed = true;
_connection.Dispose();
}
}
/// <summary>An audit event awaiting persistence by the background writer.</summary>
private sealed class PendingAuditEvent
{
public PendingAuditEvent(AuditEvent evt)
{
Event = evt;
Completion = new TaskCompletionSource(TaskCreationOptions.RunContinuationsAsynchronously);
}
public AuditEvent Event { get; }
public TaskCompletionSource Completion { get; }
}
}
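
Review note: the IN-list that MarkForwardedAsync assembles inline can be reasoned about as a pure function of the id count, which makes the generated SQL easy to assert on without a SQLite connection. The following is a minimal sketch under that assumption; `ForwardedUpdateSketch` and `Build` are hypothetical names used for illustration and are not part of this change.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper (not in the PR): reproduces the UPDATE text that
// MarkForwardedAsync builds, plus the placeholder names to bind.
public static class ForwardedUpdateSketch
{
    public static (string Sql, IReadOnlyList<string> ParameterNames) Build(int idCount)
    {
        if (idCount <= 0)
        {
            throw new ArgumentOutOfRangeException(nameof(idCount), "idCount must be > 0.");
        }

        var names = new List<string>(idCount);
        var sb = new StringBuilder("UPDATE AuditLog SET ForwardState = $forwarded WHERE EventId IN (");
        for (int i = 0; i < idCount; i++)
        {
            if (i > 0) sb.Append(',');
            var name = $"$id{i}"; // one placeholder per id, never the id value itself
            names.Add(name);
            sb.Append(name);
        }
        sb.Append(");");
        return (sb.ToString(), names);
    }
}
```

Only placeholder names enter the SQL text; the id values would still be bound through the command's parameter collection, as the method above does.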

@@ -0,0 +1,27 @@
namespace ScadaLink.AuditLog.Site;
/// <summary>
/// Options for the site-side SQLite hot-path audit writer.
/// Mirrors the ScadaLink.SiteEventLogging pattern: a single SQLite connection
/// fed by a background writer task draining a bounded
/// <see cref="System.Threading.Channels.Channel{T}"/> so script-thread enqueues
/// never block on disk I/O.
/// </summary>
public sealed class SqliteAuditWriterOptions
{
/// <summary>SQLite database path (or in-memory URI for tests).</summary>
public string DatabasePath { get; set; } = "auditlog.db";
/// <summary>
/// Capacity of the bounded write queue. Set high enough that ordinary
/// script bursts never fill it; <see cref="System.Threading.Channels.BoundedChannelFullMode.Wait"/>
/// applies when the writer falls behind.
/// </summary>
public int ChannelCapacity { get; set; } = 4096;
/// <summary>Max number of pending events the writer drains in one transaction.</summary>
public int BatchSize { get; set; } = 256;
/// <summary>Soft flush interval the writer enforces when fewer than BatchSize events are queued.</summary>
public int FlushIntervalMs { get; set; } = 50;
}
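
Review note: the interaction between ChannelCapacity, BoundedChannelFullMode.Wait and BatchSize can be seen in a few lines of plain System.Threading.Channels code. This is a sketch, not the writer's implementation: `BoundedQueueSketch` is a hypothetical name, `int` stands in for the pending-event type, and the tiny capacity is chosen only to make the full-queue case visible.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Channels;

// Hypothetical stand-in for the writer's queue (int replaces the event type).
public static class BoundedQueueSketch
{
    public static Channel<int> Create(int capacity) =>
        Channel.CreateBounded<int>(new BoundedChannelOptions(capacity)
        {
            // Wait mode: TryWrite reports false when full, WriteAsync would wait.
            FullMode = BoundedChannelFullMode.Wait,
            SingleReader = true,
        });

    // Takes up to batchSize items that are already queued, without waiting for more.
    public static List<int> DrainBatch(ChannelReader<int> reader, int batchSize)
    {
        var batch = new List<int>(batchSize);
        while (batch.Count < batchSize && reader.TryRead(out var item))
        {
            batch.Add(item);
        }
        return batch;
    }
}
```

With a capacity of 2, a third TryWrite is rejected until a drain frees space, which is the back-pressure the ChannelCapacity summary describes.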

@@ -0,0 +1,34 @@
using ScadaLink.Commons.Entities.Audit;
namespace ScadaLink.AuditLog.Site.Telemetry;
/// <summary>
/// Site-local audit-log queue surface consumed by <see cref="SiteAuditTelemetryActor"/>.
/// Extracted from <see cref="SqliteAuditWriter"/> so the telemetry actor can be
/// unit-tested against a stub without touching SQLite. <see cref="SqliteAuditWriter"/>
/// implements this interface; production wiring injects the same instance.
/// </summary>
/// <remarks>
/// Only the two methods the drain loop needs are exposed — the hot-path
/// <c>WriteAsync</c> stays on <see cref="Commons.Interfaces.Services.IAuditWriter"/>
/// (script-thread surface), separated by concern from the
/// telemetry-actor surface so each side can be mocked independently.
/// </remarks>
public interface ISiteAuditQueue
{
/// <summary>
/// Returns up to <paramref name="limit"/> rows currently in
/// <see cref="ScadaLink.Commons.Types.Enums.AuditForwardState.Pending"/>,
/// oldest first. Idempotent — repeated calls before
/// <see cref="MarkForwardedAsync"/> will yield the same rows again.
/// </summary>
Task<IReadOnlyList<AuditEvent>> ReadPendingAsync(int limit, CancellationToken ct = default);
/// <summary>
/// Flips the supplied EventIds from
/// <see cref="ScadaLink.Commons.Types.Enums.AuditForwardState.Pending"/> to
/// <see cref="ScadaLink.Commons.Types.Enums.AuditForwardState.Forwarded"/>.
/// Non-existent or already-forwarded ids are silent no-ops.
/// </summary>
Task MarkForwardedAsync(IReadOnlyList<Guid> eventIds, CancellationToken ct = default);
}
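
Review note: the contract documented on ISiteAuditQueue (repeatable reads, oldest first with EventId as tiebreaker, silent no-op for unknown ids) can be pinned down with an in-memory model. The sketch below is an assumption-laden illustration: `InMemoryAuditQueueSketch` is hypothetical, it tracks only the id and timestamp, and it orders ids by their string form because the SQLite table stores EventId as text.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory model of the queue contract (not the PR's stub).
public sealed class InMemoryAuditQueueSketch
{
    private readonly Dictionary<Guid, (DateTime OccurredAtUtc, bool Forwarded)> _rows = new();

    // First-write-wins: a duplicate EventId leaves the existing row untouched.
    public void Add(Guid eventId, DateTime occurredAtUtc) =>
        _rows.TryAdd(eventId, (occurredAtUtc, false));

    // Oldest first, EventId text as the deterministic tiebreaker.
    public IReadOnlyList<Guid> ReadPending(int limit) =>
        _rows.Where(r => !r.Value.Forwarded)
             .OrderBy(r => r.Value.OccurredAtUtc)
             .ThenBy(r => r.Key.ToString(), StringComparer.Ordinal)
             .Take(limit)
             .Select(r => r.Key)
             .ToList();

    // Unknown or already-forwarded ids are silent no-ops.
    public void MarkForwarded(IEnumerable<Guid> eventIds)
    {
        foreach (var id in eventIds)
        {
            if (_rows.TryGetValue(id, out var row))
            {
                _rows[id] = (row.OccurredAtUtc, true);
            }
        }
    }
}
```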

@@ -0,0 +1,23 @@
using ScadaLink.Communication.Grpc;
namespace ScadaLink.AuditLog.Site.Telemetry;
/// <summary>
/// Mockable abstraction over the central site-stream gRPC client surface that
/// <see cref="SiteAuditTelemetryActor"/> uses to push <see cref="AuditEventBatch"/>
/// payloads. The production implementation (added in Bundle E host wiring)
/// wraps the auto-generated <c>SiteStreamService.SiteStreamServiceClient</c>;
/// unit tests substitute via NSubstitute against this interface so the actor
/// never needs a live gRPC channel.
/// </summary>
public interface ISiteStreamAuditClient
{
/// <summary>
/// Pushes <paramref name="batch"/> to the central <c>IngestAuditEvents</c>
/// RPC. The returned <see cref="IngestAck"/> carries the
/// <c>accepted_event_ids</c> the actor will flip to
/// <see cref="ScadaLink.Commons.Types.Enums.AuditForwardState.Forwarded"/>
/// in the site SQLite queue.
/// </summary>
Task<IngestAck> IngestAuditEventsAsync(AuditEventBatch batch, CancellationToken ct);
}

@@ -0,0 +1,41 @@
using ScadaLink.Communication.Grpc;
namespace ScadaLink.AuditLog.Site.Telemetry;
/// <summary>
/// Default <see cref="ISiteStreamAuditClient"/> registered by
/// <see cref="ScadaLink.AuditLog.ServiceCollectionExtensions.AddAuditLog"/>.
/// Ships with M2 site-sync-pipeline wiring; the real gRPC-backed
/// implementation is deferred to M6 reconciliation, where a site→central gRPC
/// channel will be introduced (no such channel exists today — sites talk to
/// central exclusively via Akka ClusterClient, while the gRPC SiteStreamService
/// is hosted on the SITE side for central→site streaming).
/// </summary>
/// <remarks>
/// <para>
/// Returns an empty <see cref="IngestAck"/> so the
/// <see cref="SiteAuditTelemetryActor"/> doesn't flip any rows to
/// <c>Forwarded</c> when this NoOp is in effect — Bundle H's integration test
/// substitutes a stub client that routes directly to the central
/// <c>AuditLogIngestActor</c> in-process. Production wiring (M6) will replace
/// this binding with a real client.
/// </para>
/// <para>
/// Audit-write paths are best-effort by contract: a NoOp client keeps the
/// host running cleanly and is consistent with "audit-write failures never
/// abort the user-facing action".
/// </para>
/// </remarks>
public sealed class NoOpSiteStreamAuditClient : ISiteStreamAuditClient
{
/// <inheritdoc/>
public Task<IngestAck> IngestAuditEventsAsync(AuditEventBatch batch, CancellationToken ct)
{
ArgumentNullException.ThrowIfNull(batch);
// Empty ack: no EventIds will be flipped to Forwarded, so rows stay
// Pending until M6's real client (or a Bundle H test stub) takes over.
// A fresh instance is returned per call because protobuf messages are
// mutable; a shared static ack could be modified by one caller and then
// observed by another.
return Task.FromResult(new IngestAck());
}
}

@@ -0,0 +1,179 @@
using Akka.Actor;
using Microsoft.Extensions.Logging;
using Microsoft.Extensions.Options;
using ScadaLink.AuditLog.Telemetry;
using ScadaLink.Commons.Entities.Audit;
using ScadaLink.Communication.Grpc;
namespace ScadaLink.AuditLog.Site.Telemetry;
/// <summary>
/// Site-side actor that drains the local SQLite audit queue and pushes Pending
/// rows to central via the <c>IngestAuditEvents</c> gRPC RPC. On a successful
/// ack the matching EventIds flip to
/// <see cref="ScadaLink.Commons.Types.Enums.AuditForwardState.Forwarded"/>; on
/// a gRPC failure the rows stay Pending and the next drain retries.
/// </summary>
/// <remarks>
/// <para>
/// The drain self-tick is a private <c>Drain</c> message scheduled via the
/// actor system scheduler. The cadence is options-driven: <c>BusyIntervalSeconds</c>
/// when the previous drain found rows (or faulted — we want quick recovery),
/// <c>IdleIntervalSeconds</c> when the queue was empty.
/// </para>
/// <para>
/// Both collaborators are injected as interfaces (<see cref="ISiteAuditQueue"/>
/// and <see cref="ISiteStreamAuditClient"/>) so unit tests substitute with
/// NSubstitute and never touch real SQLite or gRPC.
/// </para>
/// <para>
/// Per Bundle D's brief, audit-write paths must be fail-safe: an exception
/// thrown inside the actor MUST NOT crash it. The Drain handler therefore
/// wraps the whole pipeline in a top-level try/catch that logs and
/// re-schedules. The actor's <see cref="SupervisorStrategy"/> is left at
/// <see cref="Akka.Actor.SupervisorStrategy.DefaultStrategy"/>, but that
/// strategy only governs child actors and this actor has none, so the catch
/// is the only protection that matters here.
/// </para>
/// </remarks>
public class SiteAuditTelemetryActor : ReceiveActor
{
private readonly ISiteAuditQueue _queue;
private readonly ISiteStreamAuditClient _client;
private readonly SiteAuditTelemetryOptions _options;
private readonly ILogger<SiteAuditTelemetryActor> _logger;
private ICancelable? _pendingTick;
public SiteAuditTelemetryActor(
ISiteAuditQueue queue,
ISiteStreamAuditClient client,
IOptions<SiteAuditTelemetryOptions> options,
ILogger<SiteAuditTelemetryActor> logger)
{
ArgumentNullException.ThrowIfNull(queue);
ArgumentNullException.ThrowIfNull(client);
ArgumentNullException.ThrowIfNull(options);
ArgumentNullException.ThrowIfNull(logger);
_queue = queue;
_client = client;
_options = options.Value;
_logger = logger;
ReceiveAsync<Drain>(_ => OnDrainAsync());
}
protected override void PreStart()
{
base.PreStart();
// Initial tick fires on the busy interval so the actor starts polling
// soon after host startup. A subsequent empty drain will move to the
// idle interval naturally.
ScheduleNext(TimeSpan.FromSeconds(_options.BusyIntervalSeconds));
}
protected override void PostStop()
{
_pendingTick?.Cancel();
base.PostStop();
}
private async Task OnDrainAsync()
{
// No ConfigureAwait(false) in this handler: the continuations must resume
// on the actor's task scheduler, because ScheduleNext touches Context and
// Self, which are only valid while running inside the actor context.
var nextDelay = TimeSpan.FromSeconds(_options.BusyIntervalSeconds);
try
{
var pending = await _queue.ReadPendingAsync(_options.BatchSize, CancellationToken.None);
if (pending.Count == 0)
{
// No rows: settle into the idle cadence until a later drain
// finds Pending rows and moves us back to the busy cadence.
nextDelay = TimeSpan.FromSeconds(_options.IdleIntervalSeconds);
return;
}
var batch = BuildBatch(pending);
IngestAck ack;
try
{
ack = await _client.IngestAuditEventsAsync(batch, CancellationToken.None);
}
catch (Exception ex)
{
// gRPC fault: leave the rows in Pending so the next drain
// retries. Bundle D's brief: "On gRPC exception (any), log
// Warning, schedule next Drain in BusyIntervalSeconds."
_logger.LogWarning(ex,
"IngestAuditEvents push failed for {Count} pending events; will retry next drain.",
pending.Count);
return;
}
var acceptedIds = ParseAcceptedIds(ack);
if (acceptedIds.Count > 0)
{
await _queue.MarkForwardedAsync(acceptedIds, CancellationToken.None);
}
}
catch (Exception ex)
{
// Catch-all so a SQLite hiccup or mapper bug never crashes the
// actor. The next tick is still scheduled in the finally block.
_logger.LogError(ex, "Unexpected error during audit-log telemetry drain.");
}
finally
{
ScheduleNext(nextDelay);
}
}
private static AuditEventBatch BuildBatch(IReadOnlyList<AuditEvent> events)
{
var batch = new AuditEventBatch();
foreach (var e in events)
{
batch.Events.Add(AuditEventMapper.ToDto(e));
}
return batch;
}
private static IReadOnlyList<Guid> ParseAcceptedIds(IngestAck ack)
{
if (ack.AcceptedEventIds.Count == 0)
{
return Array.Empty<Guid>();
}
var list = new List<Guid>(ack.AcceptedEventIds.Count);
foreach (var raw in ack.AcceptedEventIds)
{
if (Guid.TryParse(raw, out var id))
{
list.Add(id);
}
// Malformed ids are ignored — central should never emit them, but
// we refuse to crash the actor over a bad string.
}
return list;
}
private void ScheduleNext(TimeSpan delay)
{
_pendingTick?.Cancel();
_pendingTick = Context.System.Scheduler.ScheduleTellOnceCancelable(
delay,
Self,
Drain.Instance,
Self);
}
/// <summary>Self-tick message that triggers a drain cycle.</summary>
private sealed class Drain
{
public static readonly Drain Instance = new();
private Drain() { }
}
}

@@ -0,0 +1,28 @@
namespace ScadaLink.AuditLog.Site.Telemetry;
/// <summary>
/// Tuning knobs for the site-side <see cref="SiteAuditTelemetryActor"/> drain
/// loop. Defaults mirror Bundle D's plan: drain every 5 s while rows are
/// flowing (busy), every 30 s when the queue is empty (idle).
/// </summary>
public sealed class SiteAuditTelemetryOptions
{
/// <summary>
/// Maximum number of <see cref="ScadaLink.Commons.Entities.Audit.AuditEvent"/>
/// rows read from the site SQLite queue and pushed in a single gRPC batch.
/// </summary>
public int BatchSize { get; set; } = 256;
/// <summary>
/// Delay between drains when the previous drain found at least one Pending
/// row OR the previous push faulted. Re-drain quickly to keep telemetry
/// flowing and to retry transient gRPC errors.
/// </summary>
public int BusyIntervalSeconds { get; set; } = 5;
/// <summary>
/// Delay between drains when the previous drain found no Pending rows.
/// Longer interval avoids hammering an idle SQLite + gRPC channel.
/// </summary>
public int IdleIntervalSeconds { get; set; } = 30;
}
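
Review note: the busy/idle rule these options feed can be stated as a pure function, which is handy when checking the cadence in a unit test without an actor system. This is a sketch only; `DrainCadenceSketch` and `NextDelay` are hypothetical names, and the default arguments simply repeat the option defaults above.

```csharp
using System;

// Hypothetical pure form of the drain cadence rule (not part of the PR).
public static class DrainCadenceSketch
{
    public static TimeSpan NextDelay(
        int pendingCount,
        bool faulted,
        int busySeconds = 5,
        int idleSeconds = 30)
    {
        // Only a clean read that found nothing backs off to the idle interval;
        // rows found, or any fault, keep the short busy interval.
        bool idle = !faulted && pendingCount == 0;
        return TimeSpan.FromSeconds(idle ? idleSeconds : busySeconds);
    }
}
```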

@@ -0,0 +1,112 @@
using ScadaLink.Commons.Entities.Audit;
using ScadaLink.Commons.Types.Enums;
using ScadaLink.Communication.Grpc;
using Timestamp = Google.Protobuf.WellKnownTypes.Timestamp;
namespace ScadaLink.AuditLog.Telemetry;
/// <summary>
/// Bridges Audit Log (#23) rows between the in-process <see cref="AuditEvent"/> record
/// and the wire-format <see cref="AuditEventDto"/> exchanged over the
/// <c>IngestAuditEvents</c> RPC.
/// </summary>
/// <remarks>
/// <para><b>Lossy by design:</b> the proto contract intentionally omits two fields.</para>
/// <list type="bullet">
/// <item><see cref="AuditEvent.ForwardState"/> — site-local SQLite state, never travels.</item>
/// <item><see cref="AuditEvent.IngestedAtUtc"/> — central-set at ingest time, not at the site.</item>
/// </list>
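/// <para>
/// String nullability convention: the proto declares its strings without
/// <c>optional</c>, so they carry no presence information and nullable .NET
/// strings travel as empty strings on the wire. The round trip is therefore
/// lossy for a deliberately empty string, which comes back as null. Nullable
/// integers use the <c>Int32Value</c> wrapper so they preserve true null semantics.
/// </para>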
/// </remarks>
public static class AuditEventMapper
{
/// <summary>
/// Projects an <see cref="AuditEvent"/> into its wire-format DTO. Null reference
/// fields collapse to empty strings; null integer fields leave the wrapper unset.
/// </summary>
public static AuditEventDto ToDto(AuditEvent evt)
{
ArgumentNullException.ThrowIfNull(evt);
var dto = new AuditEventDto
{
EventId = evt.EventId.ToString(),
OccurredAtUtc = Timestamp.FromDateTime(EnsureUtc(evt.OccurredAtUtc)),
Channel = evt.Channel.ToString(),
Kind = evt.Kind.ToString(),
CorrelationId = evt.CorrelationId?.ToString() ?? string.Empty,
SourceSiteId = evt.SourceSiteId ?? string.Empty,
SourceInstanceId = evt.SourceInstanceId ?? string.Empty,
SourceScript = evt.SourceScript ?? string.Empty,
Actor = evt.Actor ?? string.Empty,
Target = evt.Target ?? string.Empty,
Status = evt.Status.ToString(),
ErrorMessage = evt.ErrorMessage ?? string.Empty,
ErrorDetail = evt.ErrorDetail ?? string.Empty,
RequestSummary = evt.RequestSummary ?? string.Empty,
ResponseSummary = evt.ResponseSummary ?? string.Empty,
PayloadTruncated = evt.PayloadTruncated,
Extra = evt.Extra ?? string.Empty
};
if (evt.HttpStatus.HasValue)
{
dto.HttpStatus = evt.HttpStatus.Value;
}
if (evt.DurationMs.HasValue)
{
dto.DurationMs = evt.DurationMs.Value;
}
return dto;
}
/// <summary>
/// Reconstructs an <see cref="AuditEvent"/> from its wire-format DTO. Empty strings
/// rehydrate as null reference values; absent integer wrappers stay null.
/// <see cref="AuditEvent.ForwardState"/> and <see cref="AuditEvent.IngestedAtUtc"/>
/// are intentionally left null — the central ingest actor sets the latter.
/// </summary>
public static AuditEvent FromDto(AuditEventDto dto)
{
ArgumentNullException.ThrowIfNull(dto);
return new AuditEvent
{
EventId = Guid.Parse(dto.EventId),
OccurredAtUtc = DateTime.SpecifyKind(dto.OccurredAtUtc.ToDateTime(), DateTimeKind.Utc),
IngestedAtUtc = null,
Channel = Enum.Parse<AuditChannel>(dto.Channel),
Kind = Enum.Parse<AuditKind>(dto.Kind),
CorrelationId = NullIfEmpty(dto.CorrelationId) is { } cid ? Guid.Parse(cid) : null,
SourceSiteId = NullIfEmpty(dto.SourceSiteId),
SourceInstanceId = NullIfEmpty(dto.SourceInstanceId),
SourceScript = NullIfEmpty(dto.SourceScript),
Actor = NullIfEmpty(dto.Actor),
Target = NullIfEmpty(dto.Target),
Status = Enum.Parse<AuditStatus>(dto.Status),
HttpStatus = dto.HttpStatus,
DurationMs = dto.DurationMs,
ErrorMessage = NullIfEmpty(dto.ErrorMessage),
ErrorDetail = NullIfEmpty(dto.ErrorDetail),
RequestSummary = NullIfEmpty(dto.RequestSummary),
ResponseSummary = NullIfEmpty(dto.ResponseSummary),
PayloadTruncated = dto.PayloadTruncated,
Extra = NullIfEmpty(dto.Extra),
ForwardState = null
};
}
private static string? NullIfEmpty(string? value) =>
string.IsNullOrEmpty(value) ? null : value;
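private static DateTime EnsureUtc(DateTime value) => value.Kind switch
{
DateTimeKind.Utc => value,
DateTimeKind.Local => value.ToUniversalTime(),
// Unspecified: OccurredAtUtc is UTC by contract, so stamp the kind without
// shifting the clock value (ToUniversalTime would treat it as local time).
_ => DateTime.SpecifyKind(value, DateTimeKind.Utc),
};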
}
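
Review note: the empty-string convention used by ToDto and FromDto is easiest to judge when isolated. The sketch below assumes nothing beyond the two expressions the mapper already uses (`?? string.Empty` and `IsNullOrEmpty`); `WireNullabilitySketch`, `ToWire` and `FromWire` are hypothetical names.

```csharp
#nullable enable
using System;

// Hypothetical isolation of the mapper's string convention (not in the PR).
public static class WireNullabilitySketch
{
    // Site side: null collapses to the empty string before it goes on the wire.
    public static string ToWire(string? value) => value ?? string.Empty;

    // Central side: the empty string is rehydrated as null.
    public static string? FromWire(string? value) =>
        string.IsNullOrEmpty(value) ? null : value;
}
```

A null and a deliberately empty string are indistinguishable after the round trip; both arrive as null. That is acceptable as long as no audit field assigns meaning to an empty value.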

@@ -0,0 +1,20 @@
using ScadaLink.Commons.Entities.Audit;
namespace ScadaLink.Commons.Messages.Audit;
/// <summary>
/// Akka message sent to the central <c>AuditLogIngestActor</c> (Audit Log #23,
/// M2 site-sync pipeline) carrying a batch of <see cref="AuditEvent"/> rows
/// decoded by the <c>SiteStreamGrpcServer</c> from a site's
/// <c>IngestAuditEvents</c> gRPC RPC. The actor stamps
/// <see cref="AuditEvent.IngestedAtUtc"/> and writes the rows idempotently to
/// the central <c>AuditLog</c> table.
/// </summary>
/// <remarks>
/// Lives in <c>ScadaLink.Commons</c> rather than <c>ScadaLink.AuditLog</c>
/// because the gRPC server in <c>ScadaLink.Communication</c> needs to construct
/// it, and <c>ScadaLink.AuditLog</c> already references
/// <c>ScadaLink.Communication</c> (the proto DTOs live there). Putting the
/// message in Commons avoids a project-reference cycle.
/// </remarks>
public sealed record IngestAuditEventsCommand(IReadOnlyList<AuditEvent> Events);

@@ -0,0 +1,11 @@
namespace ScadaLink.Commons.Messages.Audit;
/// <summary>
/// Reply from the central <c>AuditLogIngestActor</c> for an
/// <see cref="IngestAuditEventsCommand"/>. <see cref="AcceptedEventIds"/> lists
/// every row the actor considers durably persisted at central — including ids
/// that were already present before the call (first-write-wins idempotency).
/// The gRPC handler echoes these ids back over the wire as the <c>IngestAck</c>
/// the site uses to flip rows to <c>Forwarded</c>.
/// </summary>
public sealed record IngestAuditEventsReply(IReadOnlyList<Guid> AcceptedEventIds);
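
Review note: the reply semantics described here (every durably persisted id is acknowledged, including ids that were already present) can be modelled with a dictionary. This is a sketch under stated assumptions: `IdempotentIngestSketch` is a hypothetical in-memory stand-in for the central table, and a plain string stands in for the event payload.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical model of first-write-wins ingest (not the central actor).
public sealed class IdempotentIngestSketch
{
    private readonly Dictionary<Guid, string> _store = new();

    // Returns every id that is durably present after the call, whether it
    // was inserted now or by an earlier call.
    public IReadOnlyList<Guid> Ingest(IEnumerable<(Guid EventId, string Payload)> events)
    {
        var accepted = new List<Guid>();
        foreach (var (eventId, payload) in events)
        {
            _store.TryAdd(eventId, payload); // duplicate: keep the first payload
            accepted.Add(eventId);
        }
        return accepted;
    }

    public string? PayloadOf(Guid eventId) =>
        _store.TryGetValue(eventId, out var payload) ? payload : null;
}
```

Acknowledging a duplicate lets a site whose earlier ack was lost still flip the row to Forwarded on retry instead of resending it forever.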

@@ -20,7 +20,12 @@ public record SiteHealthReport(
IReadOnlyDictionary<string, string>? DataConnectionEndpoints = null,
IReadOnlyDictionary<string, TagQualityCounts>? DataConnectionTagQuality = null,
int ParkedMessageCount = 0,
IReadOnlyList<NodeStatus>? ClusterNodes = null,
// Audit Log (#23) M2 Bundle G: per-interval count of FallbackAuditWriter
// primary failures (SQLite throws routed to the drop-oldest ring). Surfaces
// a sustained audit-write outage on /monitoring/health. Defaults to 0 so
// existing producers / tests that don't construct the field stay valid.
int SiteAuditWriteFailures = 0);
/// <summary>
/// Broadcast wrapper used between central nodes to keep per-node

@@ -4,6 +4,9 @@ using Akka.Actor;
using Grpc.Core;
using Microsoft.Extensions.Logging;
using Microsoft.Extensions.Options;
using ScadaLink.Commons.Entities.Audit;
using ScadaLink.Commons.Messages.Audit;
using ScadaLink.Commons.Types.Enums;
using GrpcStatus = Grpc.Core.Status;
namespace ScadaLink.Communication.Grpc;
@@ -23,6 +26,15 @@ public class SiteStreamGrpcServer : SiteStreamService.SiteStreamServiceBase
private readonly TimeSpan _maxStreamLifetime;
private volatile bool _ready;
private long _actorCounter;
// Audit Log (#23 M2): central-side ingest actor proxy. Set by the host
// after the cluster singleton starts (see Bundle E wiring). When null the
// IngestAuditEvents RPC replies with an empty IngestAck so sites can
// safely retry — wiring-incomplete is treated as transient, never fatal.
private IActorRef? _auditIngestActor;
// Per Bundle D's brief — Ask timeout is 30 s. The ingest actor's repo
// calls are sub-100 ms in steady state; a generous timeout absorbs a slow
// MSSQL connection without surfacing as a gRPC failure on a healthy site.
private static readonly TimeSpan AuditIngestAskTimeout = TimeSpan.FromSeconds(30);
/// <summary>
/// Test-only constructor, kept <c>internal</c> so the DI container sees a
@@ -76,6 +88,19 @@ public class SiteStreamGrpcServer : SiteStreamService.SiteStreamServiceBase
_ready = true;
}
/// <summary>
/// Hands the central-side <c>AuditLogIngestActor</c> proxy to the gRPC
/// server so the <see cref="IngestAuditEvents"/> RPC can route incoming
/// site batches. Audit Log (#23) M2 wiring point — mirrors the way
/// <c>CommunicationService.SetNotificationOutbox</c> takes the Notification
/// Outbox singleton proxy. Bundle E supplies the actor after the cluster
/// singleton starts.
/// </summary>
public void SetAuditIngestActor(IActorRef proxy)
{
_auditIngestActor = proxy;
}
/// <summary>
/// Number of currently active streaming subscriptions. Exposed for diagnostics.
/// </summary>
@@ -168,6 +193,114 @@ public class SiteStreamGrpcServer : SiteStreamService.SiteStreamServiceBase
}
}
/// <summary>
/// Audit Log (#23) M2 site→central push RPC. Decodes a site batch into
/// <see cref="AuditEvent"/> rows, Asks the central <c>AuditLogIngestActor</c>
/// proxy to persist them, and echoes the accepted EventIds back so the site
/// can flip its local rows to <c>Forwarded</c>.
/// </summary>
/// <remarks>
/// <para>
/// The DTO→entity conversion is inlined here (rather than calling the
/// AuditLog mapper) to avoid a project-reference cycle:
/// <c>ScadaLink.AuditLog</c> already references
/// <c>ScadaLink.Communication</c>, so the gRPC server cannot reach back
/// into AuditLog for its mapper. The shape mirrors
/// <c>AuditEventMapper.FromDto</c> in <c>ScadaLink.AuditLog.Telemetry</c>;
/// the two must evolve together.
/// </para>
/// <para>
/// When <see cref="_auditIngestActor"/> is not yet wired (host startup
/// race window), the RPC returns an empty <see cref="IngestAck"/> rather
/// than failing — the site treats the missing ack as a transient outcome
/// and retries on the next drain, which is the desired idempotent
/// behaviour.
/// </para>
/// </remarks>
public override async Task<IngestAck> IngestAuditEvents(
AuditEventBatch request,
ServerCallContext context)
{
// Empty batch is a no-op; reply immediately so the client moves on.
if (request.Events.Count == 0)
{
return new IngestAck();
}
var actor = _auditIngestActor;
if (actor is null)
{
// Wiring incomplete (host startup race). Sites treat an empty
// ack as "nothing was acked, leave rows Pending, retry next
// drain" — exactly the right behaviour during host bring-up.
_logger.LogWarning(
"IngestAuditEvents received {Count} events before SetAuditIngestActor was called; returning empty ack.",
request.Events.Count);
return new IngestAck();
}
// Inlined FromDto. Keep in sync with AuditEventMapper.FromDto in
// ScadaLink.AuditLog.Telemetry — there is no shared mapper because
// doing so would create a project-reference cycle (AuditLog → Communication).
var entities = new List<AuditEvent>(request.Events.Count);
foreach (var dto in request.Events)
{
entities.Add(new AuditEvent
{
EventId = Guid.Parse(dto.EventId),
OccurredAtUtc = DateTime.SpecifyKind(dto.OccurredAtUtc.ToDateTime(), DateTimeKind.Utc),
IngestedAtUtc = null,
Channel = Enum.Parse<AuditChannel>(dto.Channel),
Kind = Enum.Parse<AuditKind>(dto.Kind),
CorrelationId = string.IsNullOrEmpty(dto.CorrelationId) ? null : Guid.Parse(dto.CorrelationId),
SourceSiteId = NullIfEmpty(dto.SourceSiteId),
SourceInstanceId = NullIfEmpty(dto.SourceInstanceId),
SourceScript = NullIfEmpty(dto.SourceScript),
Actor = NullIfEmpty(dto.Actor),
Target = NullIfEmpty(dto.Target),
Status = Enum.Parse<AuditStatus>(dto.Status),
HttpStatus = dto.HttpStatus,
DurationMs = dto.DurationMs,
ErrorMessage = NullIfEmpty(dto.ErrorMessage),
ErrorDetail = NullIfEmpty(dto.ErrorDetail),
RequestSummary = NullIfEmpty(dto.RequestSummary),
ResponseSummary = NullIfEmpty(dto.ResponseSummary),
PayloadTruncated = dto.PayloadTruncated,
Extra = NullIfEmpty(dto.Extra),
ForwardState = null,
});
}
var cmd = new IngestAuditEventsCommand(entities);
IngestAuditEventsReply reply;
try
{
reply = await actor.Ask<IngestAuditEventsReply>(
cmd, AuditIngestAskTimeout, context.CancellationToken);
}
catch (Exception ex)
{
// Audit ingest is best-effort; failing this RPC at the gRPC layer
// would surface as a transport error and force the site to retry
// (which it would do anyway). Logging + an empty ack keeps the
// semantics consistent with the "wiring incomplete" path above.
_logger.LogError(ex,
"AuditLogIngestActor Ask failed for batch of {Count} events; returning empty ack.",
request.Events.Count);
return new IngestAck();
}
var ack = new IngestAck();
foreach (var id in reply.AcceptedEventIds)
{
ack.AcceptedEventIds.Add(id.ToString());
}
return ack;
}
private static string? NullIfEmpty(string? value) =>
string.IsNullOrEmpty(value) ? null : value;
/// <summary>
/// Tracks a single active stream so cleanup only removes its own entry.
/// </summary>

@@ -3,9 +3,11 @@ option csharp_namespace = "ScadaLink.Communication.Grpc";
package sitestream;
import "google/protobuf/timestamp.proto";
import "google/protobuf/wrappers.proto"; // Int32Value
service SiteStreamService {
rpc SubscribeInstance(InstanceStreamRequest) returns (stream SiteStreamEvent);
rpc IngestAuditEvents(AuditEventBatch) returns (IngestAck);
}
message InstanceStreamRequest {
@@ -63,3 +65,31 @@ message AlarmStateUpdate {
AlarmLevelEnum level = 6; // ALARM_LEVEL_NONE for binary trigger types; set by HiLo.
string message = 7; // Optional per-band operator message; empty when unset.
}
// Audit Log (#23) telemetry: single lifecycle event ferried from a site SQLite
// hot-path row to central via IngestAuditEvents. Mirrors AuditEvent (Commons)
// minus the site-local ForwardState and the central IngestedAtUtc (set on ingest).
message AuditEventDto {
string event_id = 1;
google.protobuf.Timestamp occurred_at_utc = 2;
string channel = 3;
string kind = 4;
string correlation_id = 5; // empty string represents null
string source_site_id = 6;
string source_instance_id = 7;
string source_script = 8;
string actor = 9;
string target = 10;
string status = 11;
google.protobuf.Int32Value http_status = 12; // null when absent
google.protobuf.Int32Value duration_ms = 13;
string error_message = 14;
string error_detail = 15;
string request_summary = 16;
string response_summary = 17;
bool payload_truncated = 18;
string extra = 19;
}
message AuditEventBatch { repeated AuditEventDto events = 1; }
message IngestAck { repeated string accepted_event_ids = 1; }
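
Review note: because accepted_event_ids travels as repeated string, both ends depend on Guid text round-tripping and on the receiver tolerating junk. The sketch below shows that pairing with BCL calls only; `AckParsingSketch` is a hypothetical name, and the tolerant loop mirrors the Guid.TryParse approach used by the site actor in this PR.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration of the string-id wire convention (not in the PR).
public static class AckParsingSketch
{
    // Sender side: default Guid.ToString() yields the hyphenated "D" format.
    public static string ToWire(Guid id) => id.ToString();

    // Receiver side: keep well-formed ids, skip anything that does not parse.
    public static List<Guid> ParseAcceptedIds(IEnumerable<string> rawIds)
    {
        var parsed = new List<Guid>();
        foreach (var raw in rawIds)
        {
            if (Guid.TryParse(raw, out var id))
            {
                parsed.Add(id);
            }
        }
        return parsed;
    }
}
```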

File diff suppressed because it is too large

@@ -49,6 +49,10 @@ namespace ScadaLink.Communication.Grpc {
static readonly grpc::Marshaller<global::ScadaLink.Communication.Grpc.InstanceStreamRequest> __Marshaller_sitestream_InstanceStreamRequest = grpc::Marshallers.Create(__Helper_SerializeMessage, context => __Helper_DeserializeMessage(context, global::ScadaLink.Communication.Grpc.InstanceStreamRequest.Parser));
[global::System.CodeDom.Compiler.GeneratedCode("grpc_csharp_plugin", null)]
static readonly grpc::Marshaller<global::ScadaLink.Communication.Grpc.SiteStreamEvent> __Marshaller_sitestream_SiteStreamEvent = grpc::Marshallers.Create(__Helper_SerializeMessage, context => __Helper_DeserializeMessage(context, global::ScadaLink.Communication.Grpc.SiteStreamEvent.Parser));
[global::System.CodeDom.Compiler.GeneratedCode("grpc_csharp_plugin", null)]
static readonly grpc::Marshaller<global::ScadaLink.Communication.Grpc.AuditEventBatch> __Marshaller_sitestream_AuditEventBatch = grpc::Marshallers.Create(__Helper_SerializeMessage, context => __Helper_DeserializeMessage(context, global::ScadaLink.Communication.Grpc.AuditEventBatch.Parser));
[global::System.CodeDom.Compiler.GeneratedCode("grpc_csharp_plugin", null)]
static readonly grpc::Marshaller<global::ScadaLink.Communication.Grpc.IngestAck> __Marshaller_sitestream_IngestAck = grpc::Marshallers.Create(__Helper_SerializeMessage, context => __Helper_DeserializeMessage(context, global::ScadaLink.Communication.Grpc.IngestAck.Parser));
[global::System.CodeDom.Compiler.GeneratedCode("grpc_csharp_plugin", null)]
static readonly grpc::Method<global::ScadaLink.Communication.Grpc.InstanceStreamRequest, global::ScadaLink.Communication.Grpc.SiteStreamEvent> __Method_SubscribeInstance = new grpc::Method<global::ScadaLink.Communication.Grpc.InstanceStreamRequest, global::ScadaLink.Communication.Grpc.SiteStreamEvent>(
@@ -58,6 +62,14 @@ namespace ScadaLink.Communication.Grpc {
__Marshaller_sitestream_InstanceStreamRequest,
__Marshaller_sitestream_SiteStreamEvent);
[global::System.CodeDom.Compiler.GeneratedCode("grpc_csharp_plugin", null)]
static readonly grpc::Method<global::ScadaLink.Communication.Grpc.AuditEventBatch, global::ScadaLink.Communication.Grpc.IngestAck> __Method_IngestAuditEvents = new grpc::Method<global::ScadaLink.Communication.Grpc.AuditEventBatch, global::ScadaLink.Communication.Grpc.IngestAck>(
grpc::MethodType.Unary,
__ServiceName,
"IngestAuditEvents",
__Marshaller_sitestream_AuditEventBatch,
__Marshaller_sitestream_IngestAck);
/// <summary>Service descriptor</summary>
public static global::Google.Protobuf.Reflection.ServiceDescriptor Descriptor
{
@@ -74,6 +86,12 @@ namespace ScadaLink.Communication.Grpc {
throw new grpc::RpcException(new grpc::Status(grpc::StatusCode.Unimplemented, ""));
}
[global::System.CodeDom.Compiler.GeneratedCode("grpc_csharp_plugin", null)]
public virtual global::System.Threading.Tasks.Task<global::ScadaLink.Communication.Grpc.IngestAck> IngestAuditEvents(global::ScadaLink.Communication.Grpc.AuditEventBatch request, grpc::ServerCallContext context)
{
throw new grpc::RpcException(new grpc::Status(grpc::StatusCode.Unimplemented, ""));
}
}
/// <summary>Client for SiteStreamService</summary>
@@ -113,6 +131,26 @@ namespace ScadaLink.Communication.Grpc {
{
return CallInvoker.AsyncServerStreamingCall(__Method_SubscribeInstance, null, options, request);
}
[global::System.CodeDom.Compiler.GeneratedCode("grpc_csharp_plugin", null)]
public virtual global::ScadaLink.Communication.Grpc.IngestAck IngestAuditEvents(global::ScadaLink.Communication.Grpc.AuditEventBatch request, grpc::Metadata headers = null, global::System.DateTime? deadline = null, global::System.Threading.CancellationToken cancellationToken = default(global::System.Threading.CancellationToken))
{
return IngestAuditEvents(request, new grpc::CallOptions(headers, deadline, cancellationToken));
}
[global::System.CodeDom.Compiler.GeneratedCode("grpc_csharp_plugin", null)]
public virtual global::ScadaLink.Communication.Grpc.IngestAck IngestAuditEvents(global::ScadaLink.Communication.Grpc.AuditEventBatch request, grpc::CallOptions options)
{
return CallInvoker.BlockingUnaryCall(__Method_IngestAuditEvents, null, options, request);
}
[global::System.CodeDom.Compiler.GeneratedCode("grpc_csharp_plugin", null)]
public virtual grpc::AsyncUnaryCall<global::ScadaLink.Communication.Grpc.IngestAck> IngestAuditEventsAsync(global::ScadaLink.Communication.Grpc.AuditEventBatch request, grpc::Metadata headers = null, global::System.DateTime? deadline = null, global::System.Threading.CancellationToken cancellationToken = default(global::System.Threading.CancellationToken))
{
return IngestAuditEventsAsync(request, new grpc::CallOptions(headers, deadline, cancellationToken));
}
[global::System.CodeDom.Compiler.GeneratedCode("grpc_csharp_plugin", null)]
public virtual grpc::AsyncUnaryCall<global::ScadaLink.Communication.Grpc.IngestAck> IngestAuditEventsAsync(global::ScadaLink.Communication.Grpc.AuditEventBatch request, grpc::CallOptions options)
{
return CallInvoker.AsyncUnaryCall(__Method_IngestAuditEvents, null, options, request);
}
/// <summary>Creates a new instance of client from given <c>ClientBaseConfiguration</c>.</summary>
[global::System.CodeDom.Compiler.GeneratedCode("grpc_csharp_plugin", null)]
protected override SiteStreamServiceClient NewInstance(ClientBaseConfiguration configuration)
@@ -127,7 +165,8 @@ namespace ScadaLink.Communication.Grpc {
public static grpc::ServerServiceDefinition BindService(SiteStreamServiceBase serviceImpl)
{
return grpc::ServerServiceDefinition.CreateBuilder()
-.AddMethod(__Method_SubscribeInstance, serviceImpl.SubscribeInstance).Build();
+.AddMethod(__Method_SubscribeInstance, serviceImpl.SubscribeInstance)
+.AddMethod(__Method_IngestAuditEvents, serviceImpl.IngestAuditEvents).Build();
}
/// <summary>Register service method with a service binder with or without implementation. Useful when customizing the service binding logic.
@@ -138,6 +177,7 @@ namespace ScadaLink.Communication.Grpc {
public static void BindService(grpc::ServiceBinderBase serviceBinder, SiteStreamServiceBase serviceImpl)
{
serviceBinder.AddMethod(__Method_SubscribeInstance, serviceImpl == null ? null : new grpc::ServerStreamingServerMethod<global::ScadaLink.Communication.Grpc.InstanceStreamRequest, global::ScadaLink.Communication.Grpc.SiteStreamEvent>(serviceImpl.SubscribeInstance));
serviceBinder.AddMethod(__Method_IngestAuditEvents, serviceImpl == null ? null : new grpc::UnaryServerMethod<global::ScadaLink.Communication.Grpc.AuditEventBatch, global::ScadaLink.Communication.Grpc.IngestAck>(serviceImpl.IngestAuditEvents));
}
}


@@ -1,4 +1,7 @@
using Microsoft.Data.SqlClient;
using Microsoft.EntityFrameworkCore;
using Microsoft.Extensions.Logging;
using Microsoft.Extensions.Logging.Abstractions;
using ScadaLink.Commons.Entities.Audit;
using ScadaLink.Commons.Interfaces.Repositories;
using ScadaLink.Commons.Types.Audit;
@@ -12,11 +15,22 @@ namespace ScadaLink.ConfigurationDatabase.Repositories;
/// </summary>
public class AuditLogRepository : IAuditLogRepository
{
-private readonly ScadaLinkDbContext _context;
+// SQL Server error numbers for duplicate-key violations on
// UX_AuditLog_EventId. 2601 is a unique-index violation; 2627 is a
// primary-key/unique-constraint violation. The IF NOT EXISTS … INSERT
// pattern has a check-then-act race window — two sessions can both pass
// the EXISTS check and then both attempt the INSERT — and the loser
// surfaces as one of these errors. Idempotency demands we swallow them.
private const int SqlErrorUniqueIndexViolation = 2601;
private const int SqlErrorPrimaryKeyViolation = 2627;
-public AuditLogRepository(ScadaLinkDbContext context)
+private readonly ScadaLinkDbContext _context;
+private readonly ILogger<AuditLogRepository> _logger;
+public AuditLogRepository(ScadaLinkDbContext context, ILogger<AuditLogRepository>? logger = null)
{
_context = context ?? throw new ArgumentNullException(nameof(context));
_logger = logger ?? NullLogger<AuditLogRepository>.Instance;
}
/// <summary>
@@ -44,6 +58,8 @@ public class AuditLogRepository : IAuditLogRepository
// FormattableString interpolation parameterises every value (no concatenation),
// so this is safe against injection even for the string columns.
try
{
await _context.Database.ExecuteSqlInterpolatedAsync(
$@"IF NOT EXISTS (SELECT 1 FROM dbo.AuditLog WHERE EventId = {evt.EventId})
INSERT INTO dbo.AuditLog
@@ -58,6 +74,23 @@ VALUES
{evt.ResponseSummary}, {evt.PayloadTruncated}, {evt.Extra}, {forwardState});",
ct);
}
catch (SqlException ex) when (
ex.Number == SqlErrorUniqueIndexViolation
|| ex.Number == SqlErrorPrimaryKeyViolation)
{
// Two concurrent sessions both passed the IF NOT EXISTS check and
// both attempted the INSERT — the loser raises 2601/2627 against
// UX_AuditLog_EventId. First-write-wins idempotency is already the
// documented contract for this method, so the race outcome is
// semantically a no-op. Swallow at Debug; other SqlExceptions
// bubble.
_logger.LogDebug(
ex,
"InsertIfNotExistsAsync swallowed duplicate-key violation (error {SqlErrorNumber}) for EventId {EventId}; treating as no-op.",
ex.Number,
evt.EventId);
}
}
/// <summary>
/// Builds an <c>AsNoTracking</c> queryable over <see cref="AuditEvent"/>, applies

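The exception filter in `InsertIfNotExistsAsync` treats a lost check-then-act race as a no-op while letting every other SQL error propagate. A minimal sketch of that pattern is below. Because `SqlException` has no public constructor, the sketch uses a hypothetical stand-in, `FakeSqlException`; `IdempotentInsert` and `TryInsert` are likewise illustrative names, not part of the repository.

```csharp
using System;

// Hypothetical stand-in for SqlException, which cannot be constructed directly.
public sealed class FakeSqlException : Exception
{
    public int Number { get; }
    public FakeSqlException(int number) : base($"SQL error {number}") => Number = number;
}

public static class IdempotentInsert
{
    private const int UniqueIndexViolation = 2601;
    private const int PrimaryKeyViolation = 2627;

    // Runs the insert and reports whether this caller wrote the row.
    // A duplicate-key error means another writer won the race, so it is
    // reported as false; any other error propagates to the caller.
    public static bool TryInsert(Action insert)
    {
        try
        {
            insert();
            return true;
        }
        catch (FakeSqlException ex) when (
            ex.Number == UniqueIndexViolation || ex.Number == PrimaryKeyViolation)
        {
            return false;
        }
    }
}
```

Using a `when` filter rather than catching and rethrowing keeps the original stack trace intact for the errors that are not swallowed.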

@@ -12,6 +12,13 @@ public interface ISiteHealthCollector
void IncrementScriptError();
void IncrementAlarmError();
void IncrementDeadLetter();
/// <summary>
/// Audit Log (#23) Bundle G — increment the per-interval count of
/// <c>FallbackAuditWriter</c> primary failures. Bridged from the
/// <c>IAuditWriteFailureCounter</c> binding registered via
/// <c>AddAuditLogHealthMetricsBridge()</c>.
/// </summary>
void IncrementSiteAuditWriteFailures();
void UpdateConnectionHealth(string connectionName, ConnectionHealth health);
void RemoveConnection(string connectionName);
void UpdateTagResolution(string connectionName, int totalSubscribed, int successfullyResolved);


@@ -13,6 +13,7 @@ public class SiteHealthCollector : ISiteHealthCollector
private int _scriptErrorCount;
private int _alarmErrorCount;
private int _deadLetterCount;
private int _siteAuditWriteFailures;
private readonly ConcurrentDictionary<string, ConnectionHealth> _connectionStatuses = new();
private readonly ConcurrentDictionary<string, TagResolutionStatus> _tagResolutionCounts = new();
private readonly ConcurrentDictionary<string, string> _connectionEndpoints = new();
@@ -61,6 +62,18 @@ public class SiteHealthCollector : ISiteHealthCollector
Interlocked.Increment(ref _deadLetterCount);
}
/// <summary>
/// Audit Log (#23) Bundle G — increment the per-interval count of
/// <c>FallbackAuditWriter</c> primary failures. Bridged from the
/// <c>IAuditWriteFailureCounter</c> binding registered via
/// <c>AddAuditLogHealthMetricsBridge()</c>; reset every interval together
/// with the other per-interval counters.
/// </summary>
public void IncrementSiteAuditWriteFailures()
{
Interlocked.Increment(ref _siteAuditWriteFailures);
}
/// <summary>
/// Update the health status for a named data connection.
/// Called by DCL when connection state changes.
@@ -144,6 +157,7 @@ public class SiteHealthCollector : ISiteHealthCollector
var scriptErrors = Interlocked.Exchange(ref _scriptErrorCount, 0);
var alarmErrors = Interlocked.Exchange(ref _alarmErrorCount, 0);
var deadLetters = Interlocked.Exchange(ref _deadLetterCount, 0);
var siteAuditWriteFailures = Interlocked.Exchange(ref _siteAuditWriteFailures, 0);
// Snapshot current connection and tag resolution state
var connectionStatuses = new Dictionary<string, ConnectionHealth>(_connectionStatuses);
@@ -175,6 +189,7 @@ public class SiteHealthCollector : ISiteHealthCollector
DataConnectionEndpoints: connectionEndpoints,
DataConnectionTagQuality: tagQuality,
ParkedMessageCount: Interlocked.CompareExchange(ref _parkedMessageCount, 0, 0),
-ClusterNodes: _clusterNodes?.ToList());
+ClusterNodes: _clusterNodes?.ToList(),
+SiteAuditWriteFailures: siteAuditWriteFailures);
}
}

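The collector pairs `Interlocked.Increment` on the write side with `Interlocked.Exchange(ref field, 0)` when the report is built, so each failure is counted in exactly one interval. A minimal sketch of that snapshot-and-reset pattern follows; `IntervalCounter` and its members are hypothetical names, not types from `ScadaLink.HealthMonitoring`.

```csharp
using System.Threading;

// Hypothetical per-interval counter mirroring the Increment / Exchange pairing.
public sealed class IntervalCounter
{
    private int _count;

    // Safe to call from any thread; no lock required.
    public void Increment() => Interlocked.Increment(ref _count);

    // Atomically returns the count accumulated so far and resets it to zero,
    // so an increment racing with the snapshot lands in the next interval
    // instead of being lost.
    public int TakeSnapshot() => Interlocked.Exchange(ref _count, 0);
}
```

Reading the field and then writing zero as two separate steps would open a window in which a concurrent increment is silently dropped; the single atomic exchange closes that window.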

@@ -128,6 +128,13 @@ public class AkkaHostedService : IHostedService
var rolesStr = string.Join(",", roles.Select(QuoteHocon));
return $@"
audit-telemetry-dispatcher {{
type = ForkJoinDispatcher
throughput = 100
dedicated-thread-pool {{
thread-count = 2
}}
}}
akka {{
extensions = [
""Akka.Cluster.Tools.PublishSubscribe.DistributedPubSubExtensionProvider, Akka.Cluster.Tools""
@@ -294,6 +301,47 @@ akka {{
commService?.SetNotificationOutbox(outboxProxy);
_logger.LogInformation("NotificationOutbox singleton created and registered with CentralCommunicationActor");
// Audit Log (#23) — central singleton mirrors the Notification Outbox
// pattern. The IngestAuditEvents gRPC handler lives on SiteStreamGrpcServer
// (Communication.Grpc); a central node hosting that server (M6 reconciliation
// path) hands the proxy in via SetAuditIngestActor below. When the gRPC
// server is not registered (current central topology), the host still
// brings the singleton up so a Bundle H in-process test (or a future
// direct caller) can Ask the proxy without further wiring.
// IAuditLogRepository is a SCOPED EF Core service, so the singleton
// actor takes the root IServiceProvider and creates a fresh scope per
// message (mirroring NotificationOutboxActor). Pre-resolving the
// repository here would attempt to take a scoped service from the
// root and fail under DI scope validation.
var auditIngestLogger = _serviceProvider.GetRequiredService<ILoggerFactory>()
.CreateLogger<ScadaLink.AuditLog.Central.AuditLogIngestActor>();
var auditIngestSingletonProps = ClusterSingletonManager.Props(
singletonProps: Props.Create(() => new ScadaLink.AuditLog.Central.AuditLogIngestActor(
_serviceProvider,
auditIngestLogger)),
terminationMessage: PoisonPill.Instance,
settings: ClusterSingletonManagerSettings.Create(_actorSystem!)
.WithSingletonName("audit-log-ingest"));
_actorSystem!.ActorOf(auditIngestSingletonProps, "audit-log-ingest-singleton");
var auditIngestProxyProps = ClusterSingletonProxy.Props(
singletonManagerPath: "/user/audit-log-ingest-singleton",
settings: ClusterSingletonProxySettings.Create(_actorSystem)
.WithSingletonName("audit-log-ingest"));
var auditIngestProxy = _actorSystem.ActorOf(auditIngestProxyProps, "audit-log-ingest-proxy");
// Hand the proxy to the SiteStreamGrpcServer (if registered on this node)
// so the IngestAuditEvents RPC routes incoming site batches to the singleton.
// The gRPC server is currently only registered on Site nodes; on a central
// node this resolves to null and the wiring is a no-op until M6 (which
// brings central-hosted gRPC + a real site→central client).
var grpcServer = _serviceProvider.GetService<ScadaLink.Communication.Grpc.SiteStreamGrpcServer>();
grpcServer?.SetAuditIngestActor(auditIngestProxy);
_logger.LogInformation(
"AuditLogIngestActor singleton created (gRPC server bound: {GrpcBound})",
grpcServer is not null);
_logger.LogInformation("Central actors registered. CentralCommunicationActor created.");
}
@@ -504,6 +552,41 @@ akka {{
contacts.Count, _nodeOptions.SiteId);
}
// Audit Log (#23) — site-side telemetry actor that drains the SQLite
// Pending queue and pushes to central via IngestAuditEvents. Not a
// cluster singleton: each site is its own cluster, and the actor reads
// node-local SQLite (no replication). The Props are bound to the
// dedicated audit-telemetry-dispatcher (defined in BuildHocon) so a
// batch SQLite read + gRPC push never contend with the default
// dispatcher used by hot-path actors.
//
// Per Bundle E's brief: the SiteAuditTelemetryActor takes its
// collaborators through its constructor, so we resolve them from DI
// and pass them in via Props.Create rather than relying on a future
// FactoryProvider. This also lets the M6 follow-up swap the
// NoOpSiteStreamAuditClient registration for the real gRPC client
// without touching this site wiring.
var siteAuditOptions = _serviceProvider
.GetRequiredService<IOptions<ScadaLink.AuditLog.Site.Telemetry.SiteAuditTelemetryOptions>>();
var siteAuditQueue = _serviceProvider
.GetRequiredService<ScadaLink.AuditLog.Site.Telemetry.ISiteAuditQueue>();
var siteAuditClient = _serviceProvider
.GetRequiredService<ScadaLink.AuditLog.Site.Telemetry.ISiteStreamAuditClient>();
var siteAuditLogger = _serviceProvider.GetRequiredService<ILoggerFactory>()
.CreateLogger<ScadaLink.AuditLog.Site.Telemetry.SiteAuditTelemetryActor>();
var siteAuditTelemetryProps = Props.Create(() =>
new ScadaLink.AuditLog.Site.Telemetry.SiteAuditTelemetryActor(
siteAuditQueue,
siteAuditClient,
siteAuditOptions,
siteAuditLogger))
.WithDispatcher("audit-telemetry-dispatcher");
_actorSystem.ActorOf(siteAuditTelemetryProps, "site-audit-telemetry");
_logger.LogInformation(
"SiteAuditTelemetryActor created (dispatcher=audit-telemetry-dispatcher, client={ClientType})",
siteAuditClient.GetType().Name);
// Gate gRPC subscriptions until the actor system and SiteStreamManager are
// initialized (REQ-HOST-7).
//

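The central wiring comment explains that the singleton actor receives the root `IServiceProvider` and opens a fresh scope per message, because `IAuditLogRepository` is a scoped service. The sketch below illustrates that idea outside Akka, assuming the `Microsoft.Extensions.DependencyInjection` package is referenced. `ScopedRepository`, `SingletonHandler`, and `ScopeDemo` are hypothetical stand-ins for the repository and the actor.

```csharp
using System;
using Microsoft.Extensions.DependencyInjection;

// Hypothetical scoped dependency standing in for IAuditLogRepository.
public sealed class ScopedRepository
{
    public Guid InstanceId { get; } = Guid.NewGuid();
}

// Hypothetical long-lived handler standing in for the singleton actor.
public sealed class SingletonHandler
{
    private readonly IServiceProvider _root;

    public SingletonHandler(IServiceProvider root) => _root = root;

    // One scope per message: the scoped service is created inside the scope
    // and disposed with it when the handler returns.
    public Guid Handle()
    {
        using var scope = _root.CreateScope();
        var repository = scope.ServiceProvider.GetRequiredService<ScopedRepository>();
        return repository.InstanceId;
    }
}

public static class ScopeDemo
{
    // ValidateScopes = true makes resolving a scoped service from the root
    // provider throw instead of silently creating a long-lived instance.
    public static ServiceProvider BuildProvider() =>
        new ServiceCollection()
            .AddScoped<ScopedRepository>()
            .BuildServiceProvider(new ServiceProviderOptions { ValidateScopes = true });
}
```

With scope validation on, calling `GetRequiredService<ScopedRepository>()` directly on the root provider throws `InvalidOperationException`. That is the failure the wiring comment refers to when it warns against pre-resolving the repository.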

@@ -1,5 +1,6 @@
using HealthChecks.UI.Client;
using Microsoft.AspNetCore.Diagnostics.HealthChecks;
using ScadaLink.AuditLog;
using ScadaLink.CentralUI;
using ScadaLink.ClusterInfrastructure;
using ScadaLink.Communication;
@@ -77,6 +78,10 @@ try
// AddNotificationService() SMTP machinery above. AddNotificationOutbox binds
// NotificationOutboxOptions via BindConfiguration, so no explicit Configure is needed.
builder.Services.AddNotificationOutbox();
// Audit Log (#23) — central node owns the AuditLogIngestActor singleton +
// IAuditLogRepository. The site writer chain is still registered (lazy
// singletons) but is never resolved on a central node.
builder.Services.AddAuditLog(builder.Configuration);
builder.Services.AddTemplateEngine();
builder.Services.AddDeploymentManager();
builder.Services.AddSecurity();


@@ -38,6 +38,7 @@
<ProjectReference Include="../ScadaLink.ExternalSystemGateway/ScadaLink.ExternalSystemGateway.csproj" />
<ProjectReference Include="../ScadaLink.NotificationService/ScadaLink.NotificationService.csproj" />
<ProjectReference Include="../ScadaLink.NotificationOutbox/ScadaLink.NotificationOutbox.csproj" />
<ProjectReference Include="../ScadaLink.AuditLog/ScadaLink.AuditLog.csproj" />
<ProjectReference Include="../ScadaLink.CentralUI/ScadaLink.CentralUI.csproj" />
<ProjectReference Include="../ScadaLink.Security/ScadaLink.Security.csproj" />
<ProjectReference Include="../ScadaLink.HealthMonitoring/ScadaLink.HealthMonitoring.csproj" />


@@ -1,3 +1,4 @@
using ScadaLink.AuditLog;
using ScadaLink.ClusterInfrastructure;
using ScadaLink.Communication;
using ScadaLink.DataConnectionLayer;
@@ -44,6 +45,19 @@ public static class SiteServiceRegistration
services.AddStoreAndForward();
services.AddSiteEventLogging();
// Audit Log (#23) — site-side hot-path writer + telemetry collaborators.
// The SiteAuditTelemetryActor itself is registered by AkkaHostedService
// in the site-role block; this call wires every DI dependency it (and
// ScriptRuntimeContext, when Bundle F lands) reaches for.
services.AddAuditLog(config);
// Audit Log (#23) M2 Bundle G — bridge FallbackAuditWriter primary
// failures into the site health report payload as
// SiteAuditWriteFailures. Must come AFTER both AddSiteHealthMonitoring
// (registers ISiteHealthCollector) and AddAuditLog (registers the
// NoOp default this call replaces).
services.AddAuditLogHealthMetricsBridge();
// WP-13: Akka.NET bootstrap via hosted service
services.AddSingleton<AkkaHostedService>();
services.AddHostedService(sp => sp.GetRequiredService<AkkaHostedService>());


@@ -101,6 +101,10 @@ public class ScriptExecutionActor : ReceiveActor
// provider supplies the site id stamped on enqueued notifications.
StoreAndForwardService? storeAndForward = null;
var siteId = string.Empty;
// Audit Log #23 (M2 Bundle F): the writer is a singleton (FallbackAuditWriter
// composes the SQLite hot-path + drop-oldest ring); null in tests / hosts
// that haven't called AddAuditLog, which the helper handles as a no-op.
IAuditWriter? auditWriter = null;
if (serviceProvider != null)
{
@@ -110,6 +114,7 @@ public class ScriptExecutionActor : ReceiveActor
storeAndForward = serviceScope.ServiceProvider.GetService<StoreAndForwardService>();
siteId = serviceScope.ServiceProvider.GetService<ISiteIdentityProvider>()?.SiteId
?? string.Empty;
auditWriter = serviceScope.ServiceProvider.GetService<IAuditWriter>();
}
var context = new ScriptRuntimeContext(
@@ -128,7 +133,12 @@ public class ScriptExecutionActor : ReceiveActor
siteId,
// Notification Outbox (FU3): stamp the executing script onto outbound
// notifications using the Site Event Logging "Source" convention.
-sourceScript: $"ScriptActor:{scriptName}");
+sourceScript: $"ScriptActor:{scriptName}",
// Audit Log #23 (M2 Bundle F): emit one ApiOutbound/ApiCall row per
// ExternalSystem.Call. Writer is best-effort; failures are logged
// and swallowed inside the helper so the script's call path is
// never aborted by an audit failure.
auditWriter: auditWriter);
var globals = new ScriptGlobals
{

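The comment on the `auditWriter` argument states that audit failures are logged and swallowed so the script's call path is never aborted. A minimal sketch of that best-effort wrapper shape is below. `BestEffortAudit` and `CallWithAuditAsync` are hypothetical names, the result type is simplified to `string`, and the swallow is shown inline in the `finally` block, whereas the source describes it as living inside the emitting helper.

```csharp
using System;
using System.Threading.Tasks;

public static class BestEffortAudit
{
    // Runs `call`, then hands its result (or its exception) to `audit`.
    // Anything thrown by `audit` is swallowed, so the caller always observes
    // the original result or the original exception from `call`.
    public static async Task<string> CallWithAuditAsync(
        Func<Task<string>> call,
        Action<string?, Exception?> audit)
    {
        string? result = null;
        Exception? thrown = null;
        try
        {
            result = await call();
            return result;
        }
        catch (Exception ex)
        {
            thrown = ex;
            throw; // rethrow without resetting the stack trace
        }
        finally
        {
            try
            {
                audit(result, thrown);
            }
            catch
            {
                // Audit is best-effort: never mask the outcome of the call.
            }
        }
    }
}
```

Putting the emission in `finally` means exactly one audit attempt happens per call on both the success and the failure path.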

@@ -1,6 +1,9 @@
using System.Diagnostics;
using System.Text.Json;
using System.Text.RegularExpressions;
using Akka.Actor;
using Microsoft.Extensions.Logging;
using ScadaLink.Commons.Entities.Audit;
using ScadaLink.Commons.Interfaces.Services;
using ScadaLink.Commons.Messages.Instance;
using ScadaLink.Commons.Messages.Notification;
@@ -75,6 +78,13 @@ public class ScriptRuntimeContext
/// </summary>
private readonly string? _sourceScript;
/// <summary>
/// Audit Log #23: best-effort emitter for boundary-crossing actions executed
/// by the script. Optional — when null the helpers degrade to a no-op audit
/// path so tests / contexts that do not need the audit pipeline still work.
/// </summary>
private readonly IAuditWriter? _auditWriter;
public ScriptRuntimeContext(
IActorRef instanceActor,
IActorRef self,
@@ -89,7 +99,8 @@ public class ScriptRuntimeContext
StoreAndForwardService? storeAndForward = null,
ICanTell? siteCommunicationActor = null,
string siteId = "",
-string? sourceScript = null)
+string? sourceScript = null,
+IAuditWriter? auditWriter = null)
{
_instanceActor = instanceActor;
_self = self;
@@ -105,6 +116,7 @@ public class ScriptRuntimeContext
_siteCommunicationActor = siteCommunicationActor;
_siteId = siteId;
_sourceScript = sourceScript;
_auditWriter = auditWriter;
}
/// <summary>
@@ -204,7 +216,8 @@ public class ScriptRuntimeContext
/// ExternalSystem.Call("systemName", "methodName", params)
/// ExternalSystem.CachedCall("systemName", "methodName", params)
/// </summary>
-public ExternalSystemHelper ExternalSystem => new(_externalSystemClient, _instanceName, _logger);
+public ExternalSystemHelper ExternalSystem => new(
+_externalSystemClient, _instanceName, _logger, _auditWriter, _siteId, _sourceScript);
/// <summary>
/// WP-13: Provides access to database operations.
@@ -275,17 +288,41 @@ public class ScriptRuntimeContext
/// <summary>
/// WP-13: Helper for ExternalSystem.Call/CachedCall syntax.
/// </summary>
/// <remarks>
/// Audit Log #23 (M2 Bundle F): every <see cref="Call"/> invocation emits
/// one <c>ApiOutbound</c>/<c>ApiCall</c> audit row via <see cref="IAuditWriter"/>.
/// The audit emission is wrapped in a try/catch that swallows every exception
/// — the audit pipeline is best-effort and must NEVER abort the script's
/// outbound call (alog.md §7). The original <see cref="ExternalCallResult"/>
/// (or the original thrown exception) flows back to the caller unchanged.
/// </remarks>
public class ExternalSystemHelper
{
private static readonly Regex HttpStatusRegex = new(
@"HTTP\s+(?<code>\d{3})",
RegexOptions.Compiled | RegexOptions.CultureInvariant);
private readonly IExternalSystemClient? _client;
private readonly string _instanceName;
private readonly ILogger _logger;
private readonly IAuditWriter? _auditWriter;
private readonly string _siteId;
private readonly string? _sourceScript;
-internal ExternalSystemHelper(IExternalSystemClient? client, string instanceName, ILogger logger)
+internal ExternalSystemHelper(
IExternalSystemClient? client,
string instanceName,
ILogger logger,
IAuditWriter? auditWriter = null,
string siteId = "",
string? sourceScript = null)
{
_client = client;
_instanceName = instanceName;
_logger = logger;
_auditWriter = auditWriter;
_siteId = siteId;
_sourceScript = sourceScript;
}
public async Task<ExternalCallResult> Call(
@@ -297,7 +334,31 @@ public class ScriptRuntimeContext
if (_client == null) if (_client == null)
throw new InvalidOperationException("External system client not available"); throw new InvalidOperationException("External system client not available");
return await _client.CallAsync(systemName, methodName, parameters, cancellationToken); // Audit Log #23 (M2 Bundle F): wrap the outbound call so every
// attempt emits exactly one ApiOutbound/ApiCall row. The wrapper
// mirrors the existing call-site behaviour — the original result
// OR original exception flows back to the script untouched; the
// audit emission is best-effort.
var occurredAtUtc = DateTime.UtcNow;
var startTicks = Stopwatch.GetTimestamp();
ExternalCallResult? result = null;
Exception? thrown = null;
try
{
result = await _client.CallAsync(systemName, methodName, parameters, cancellationToken);
return result;
}
catch (Exception ex)
{
thrown = ex;
throw;
}
finally
{
var elapsedMs = (int)((Stopwatch.GetTimestamp() - startTicks)
* 1000d / Stopwatch.Frequency);
EmitCallAudit(systemName, methodName, occurredAtUtc, elapsedMs, result, thrown);
}
} }
public async Task<ExternalCallResult> CachedCall( public async Task<ExternalCallResult> CachedCall(
@@ -311,6 +372,145 @@ public class ScriptRuntimeContext
return await _client.CachedCallAsync(systemName, methodName, parameters, _instanceName, cancellationToken); return await _client.CachedCallAsync(systemName, methodName, parameters, _instanceName, cancellationToken);
} }
/// <summary>
/// Best-effort emission of one <c>ApiOutbound</c>/<c>ApiCall</c> audit
/// row. Any exception thrown by the writer is logged and swallowed —
/// audit-write failures must never abort the user-facing action.
/// </summary>
private void EmitCallAudit(
string systemName,
string methodName,
DateTime occurredAtUtc,
int durationMs,
ExternalCallResult? result,
Exception? thrown)
{
if (_auditWriter == null)
{
return;
}
AuditEvent evt;
try
{
evt = BuildCallAuditEvent(systemName, methodName, occurredAtUtc, durationMs, result, thrown);
}
catch (Exception buildEx)
{
// Building the event itself must never propagate. This is a
// defensive guard — populating a record from already-validated
// values shouldn't throw, but we honour the alog.md §7
// best-effort contract regardless.
_logger.LogWarning(buildEx,
"Failed to build Audit Log #23 event for {System}.{Method} — skipping emission",
systemName, methodName);
return;
}
try
{
// Fire-and-forget so we never block the script on the audit
// writer; the writer itself is responsible for fast, durable
// enqueue (site SQLite hot-path). We DO observe failures via
// ContinueWith so a faulted write is logged rather than
// surfacing later as an unobserved task exception.
var writeTask = _auditWriter.WriteAsync(evt, CancellationToken.None);
if (!writeTask.IsCompleted)
{
writeTask.ContinueWith(
t => _logger.LogWarning(t.Exception,
"Audit Log #23 write failed for EventId {EventId} ({System}.{Method})",
evt.EventId, systemName, methodName),
CancellationToken.None,
TaskContinuationOptions.OnlyOnFaulted | TaskContinuationOptions.ExecuteSynchronously,
TaskScheduler.Default);
}
else if (writeTask.IsFaulted)
{
_logger.LogWarning(writeTask.Exception,
"Audit Log #23 write failed for EventId {EventId} ({System}.{Method})",
evt.EventId, systemName, methodName);
}
}
catch (Exception writeEx)
{
// Synchronous throw from WriteAsync (e.g. ArgumentNullException
// before the writer's own try/catch). Swallow + log per the
// alog.md §7 contract.
_logger.LogWarning(writeEx,
"Audit Log #23 write threw synchronously for EventId {EventId} ({System}.{Method})",
evt.EventId, systemName, methodName);
}
}
private AuditEvent BuildCallAuditEvent(
string systemName,
string methodName,
DateTime occurredAtUtc,
int durationMs,
ExternalCallResult? result,
Exception? thrown)
{
// Status: Delivered on a Success result; Failed otherwise (the
// ExternalSystemClient already maps HTTP non-2xx + transient
// exceptions into Success=false on the result, or surfaces a raw
// exception). M2 makes no distinction between transient + permanent
// failure here — both manifest as Status.Failed on the sync path.
var status = (thrown == null && result != null && result.Success)
? AuditStatus.Delivered
: AuditStatus.Failed;
string? errorMessage = null;
string? errorDetail = null;
int? httpStatus = null;
if (thrown != null)
{
errorMessage = thrown.Message;
errorDetail = thrown.ToString();
}
else if (result != null && !result.Success)
{
errorMessage = result.ErrorMessage;
// The ExternalSystemClient embeds the HTTP status code in the
// error message as "HTTP {code}". Parse it back out so the
// audit row carries the structured value.
if (!string.IsNullOrEmpty(result.ErrorMessage))
{
var match = HttpStatusRegex.Match(result.ErrorMessage);
if (match.Success
&& int.TryParse(match.Groups["code"].Value, out var parsed))
{
httpStatus = parsed;
}
}
}
return new AuditEvent
{
EventId = Guid.NewGuid(),
OccurredAtUtc = DateTime.SpecifyKind(occurredAtUtc, DateTimeKind.Utc),
Channel = AuditChannel.ApiOutbound,
Kind = AuditKind.ApiCall,
CorrelationId = null,
SourceSiteId = string.IsNullOrEmpty(_siteId) ? null : _siteId,
SourceInstanceId = _instanceName,
SourceScript = _sourceScript,
Actor = null,
Target = $"{systemName}.{methodName}",
Status = status,
HttpStatus = httpStatus,
DurationMs = durationMs,
ErrorMessage = errorMessage,
ErrorDetail = errorDetail,
RequestSummary = null,
ResponseSummary = null,
PayloadTruncated = false,
Extra = null,
ForwardState = AuditForwardState.Pending,
};
}
} }
/// <summary> /// <summary>
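
The status-code recovery in `BuildCallAuditEvent` can be tried in isolation. The following is a minimal sketch, assuming only that failure messages embed the code as `HTTP {code}` as the comment in the diff states; the class and method names (`HttpStatusParsing`, `TryExtractHttpStatus`) are hypothetical and not part of the commit.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical stand-alone helper; it mirrors the regex from the diff
// but is not code from the commit itself.
public static class HttpStatusParsing
{
    private static readonly Regex HttpStatusRegex = new(
        @"HTTP\s+(?<code>\d{3})",
        RegexOptions.Compiled | RegexOptions.CultureInvariant);

    // Returns the first three-digit code that follows "HTTP", or null when
    // the message is empty or carries no such marker.
    public static int? TryExtractHttpStatus(string? errorMessage)
    {
        if (string.IsNullOrEmpty(errorMessage))
        {
            return null;
        }

        var match = HttpStatusRegex.Match(errorMessage);
        return match.Success && int.TryParse(match.Groups["code"].Value, out var parsed)
            ? parsed
            : (int?)null;
    }
}
```

Because the pattern is unanchored, it picks up the first `HTTP nnn` occurrence anywhere in the message. A message with no such marker simply leaves `HttpStatus` null on the audit row.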

@@ -1,28 +1,43 @@
using Microsoft.Extensions.Configuration; using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.DependencyInjection; using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Logging;
using Microsoft.Extensions.Logging.Abstractions;
using Microsoft.Extensions.Options; using Microsoft.Extensions.Options;
using ScadaLink.AuditLog.Configuration; using ScadaLink.AuditLog.Configuration;
using ScadaLink.AuditLog.Site;
using ScadaLink.AuditLog.Site.Telemetry;
using ScadaLink.Commons.Interfaces.Services;
using ScadaLink.HealthMonitoring;
namespace ScadaLink.AuditLog.Tests; namespace ScadaLink.AuditLog.Tests;
/// <summary> /// <summary>
/// Bundle E (M1) smoke tests for the Audit Log (#23) DI scaffold. Verifies /// Bundle E (M2 Task E1) DI surface tests for <c>AddAuditLog</c>. M1 shipped
/// <c>AddAuditLog</c> registers <see cref="AuditLogOptions"/> against the /// the options-only scaffold; M2 extends it with the site writer chain
/// <c>AuditLog</c> configuration section. Bundle E ships only the scaffold; /// (<see cref="SqliteAuditWriter"/> + <see cref="RingBufferFallback"/> +
/// the validator + full options surface land in Task 9. /// <see cref="FallbackAuditWriter"/>) and the telemetry collaborators
/// (<see cref="ISiteAuditQueue"/>, <see cref="ISiteStreamAuditClient"/>,
/// <see cref="IAuditWriteFailureCounter"/>).
/// </summary> /// </summary>
public class AddAuditLogTests public class AddAuditLogTests
{ {
[Fact] private static ServiceProvider BuildProvider(IDictionary<string, string?>? settings = null)
public void AddAuditLog_RegistersAuditLogOptions()
{ {
var config = new ConfigurationBuilder() var config = new ConfigurationBuilder()
.AddInMemoryCollection(new Dictionary<string, string?>()) .AddInMemoryCollection(settings ?? new Dictionary<string, string?>())
.Build(); .Build();
var services = new ServiceCollection(); var services = new ServiceCollection();
services.AddSingleton<ILoggerFactory, NullLoggerFactory>();
services.AddSingleton(typeof(ILogger<>), typeof(NullLogger<>));
services.AddAuditLog(config); services.AddAuditLog(config);
var provider = services.BuildServiceProvider(); return services.BuildServiceProvider();
}
[Fact]
public void AddAuditLog_RegistersAuditLogOptions()
{
using var provider = BuildProvider();
var opts = provider.GetService<IOptions<AuditLogOptions>>(); var opts = provider.GetService<IOptions<AuditLogOptions>>();
@@ -47,4 +62,182 @@ public class AddAuditLogTests
Assert.Throws<ArgumentNullException>( Assert.Throws<ArgumentNullException>(
() => services.AddAuditLog(null!)); () => services.AddAuditLog(null!));
} }
// -- Bundle E (M2 Task E1) ---------------------------------------------
[Fact]
public void AddAuditLog_Registers_SqliteAuditWriter_Singleton_FromDI()
{
using var provider = BuildProvider(new Dictionary<string, string?>
{
// In-memory database keeps the writer's owned connection portable
// across tests; the per-instance Cache=Shared in the writer's
// default connection string ensures no on-disk file is touched.
["AuditLog:SiteWriter:DatabasePath"] = ":memory:",
});
var writer = provider.GetService<SqliteAuditWriter>();
Assert.NotNull(writer);
// Singleton — same instance on a second resolve.
Assert.Same(writer, provider.GetService<SqliteAuditWriter>());
}
[Fact]
public void AddAuditLog_Registers_IAuditWriter_AsFallbackAuditWriter()
{
using var provider = BuildProvider(new Dictionary<string, string?>
{
["AuditLog:SiteWriter:DatabasePath"] = ":memory:",
});
var writer = provider.GetService<IAuditWriter>();
Assert.NotNull(writer);
Assert.IsType<FallbackAuditWriter>(writer);
}
[Fact]
public void AddAuditLog_Registers_ISiteAuditQueue_AsSameInstance_As_SqliteAuditWriter()
{
// The telemetry actor reads from ISiteAuditQueue while scripts write
// through IAuditWriter → SqliteAuditWriter. Both surfaces MUST resolve
// to the same instance or pending rows will never be visible.
using var provider = BuildProvider(new Dictionary<string, string?>
{
["AuditLog:SiteWriter:DatabasePath"] = ":memory:",
});
var queue = provider.GetService<ISiteAuditQueue>();
var writer = provider.GetService<SqliteAuditWriter>();
Assert.NotNull(queue);
Assert.NotNull(writer);
Assert.Same(writer, queue);
}
[Fact]
public void AddAuditLog_Registers_RingBufferFallback_Singleton()
{
using var provider = BuildProvider(new Dictionary<string, string?>
{
["AuditLog:SiteWriter:DatabasePath"] = ":memory:",
});
var ring = provider.GetService<RingBufferFallback>();
Assert.NotNull(ring);
Assert.Same(ring, provider.GetService<RingBufferFallback>());
}
[Fact]
public void AddAuditLog_Registers_AuditWriteFailureCounter_AsNoOpDefault()
{
using var provider = BuildProvider(new Dictionary<string, string?>
{
["AuditLog:SiteWriter:DatabasePath"] = ":memory:",
});
var counter = provider.GetService<IAuditWriteFailureCounter>();
Assert.NotNull(counter);
Assert.IsType<NoOpAuditWriteFailureCounter>(counter);
}
[Fact]
public void AddAuditLog_Registers_SiteStreamAuditClient_AsNoOpDefault()
{
using var provider = BuildProvider(new Dictionary<string, string?>
{
["AuditLog:SiteWriter:DatabasePath"] = ":memory:",
});
var client = provider.GetService<ISiteStreamAuditClient>();
Assert.NotNull(client);
Assert.IsType<NoOpSiteStreamAuditClient>(client);
}
[Fact]
public void AddAuditLog_Options_Bind_RoundTrip_SqliteWriter()
{
using var provider = BuildProvider(new Dictionary<string, string?>
{
["AuditLog:SiteWriter:DatabasePath"] = "/tmp/test-audit.db",
["AuditLog:SiteWriter:ChannelCapacity"] = "8192",
["AuditLog:SiteWriter:BatchSize"] = "128",
["AuditLog:SiteWriter:FlushIntervalMs"] = "75",
});
var opts = provider.GetRequiredService<IOptions<SqliteAuditWriterOptions>>().Value;
Assert.Equal("/tmp/test-audit.db", opts.DatabasePath);
Assert.Equal(8192, opts.ChannelCapacity);
Assert.Equal(128, opts.BatchSize);
Assert.Equal(75, opts.FlushIntervalMs);
}
[Fact]
public void AddAuditLog_Options_Bind_RoundTrip_SiteTelemetry()
{
using var provider = BuildProvider(new Dictionary<string, string?>
{
["AuditLog:SiteTelemetry:BatchSize"] = "512",
["AuditLog:SiteTelemetry:BusyIntervalSeconds"] = "3",
["AuditLog:SiteTelemetry:IdleIntervalSeconds"] = "60",
});
var opts = provider.GetRequiredService<IOptions<SiteAuditTelemetryOptions>>().Value;
Assert.Equal(512, opts.BatchSize);
Assert.Equal(3, opts.BusyIntervalSeconds);
Assert.Equal(60, opts.IdleIntervalSeconds);
}
// -- Bundle G (M2 Task G1) Site Health Monitoring bridge ----------------
[Fact]
public void AddAuditLogHealthMetricsBridge_Swaps_FailureCounter_To_HealthMetrics_Implementation()
{
var config = new ConfigurationBuilder()
.AddInMemoryCollection(new Dictionary<string, string?>
{
["AuditLog:SiteWriter:DatabasePath"] = ":memory:",
})
.Build();
var services = new ServiceCollection();
services.AddSingleton<ILoggerFactory, NullLoggerFactory>();
services.AddSingleton(typeof(ILogger<>), typeof(NullLogger<>));
services.AddAuditLog(config);
// The bridge depends on ISiteHealthCollector; AddHealthMonitoring is
// what registers it on the site (and the central self-host).
services.AddHealthMonitoring();
services.AddAuditLogHealthMetricsBridge();
using var provider = services.BuildServiceProvider();
var counter = provider.GetRequiredService<IAuditWriteFailureCounter>();
Assert.IsType<HealthMetricsAuditWriteFailureCounter>(counter);
}
[Fact]
public void AddAuditLogHealthMetricsBridge_Without_HealthMonitoring_Throws_On_Resolve()
{
// The bridge replaces the registration unconditionally; resolving the
// counter when ISiteHealthCollector is missing throws at GetRequiredService
// time. This documents the contract — callers must register
// AddHealthMonitoring() before the bridge.
var config = new ConfigurationBuilder()
.AddInMemoryCollection(new Dictionary<string, string?>
{
["AuditLog:SiteWriter:DatabasePath"] = ":memory:",
})
.Build();
var services = new ServiceCollection();
services.AddSingleton<ILoggerFactory, NullLoggerFactory>();
services.AddSingleton(typeof(ILogger<>), typeof(NullLogger<>));
services.AddAuditLog(config);
services.AddAuditLogHealthMetricsBridge();
using var provider = services.BuildServiceProvider();
Assert.Throws<InvalidOperationException>(
() => provider.GetRequiredService<IAuditWriteFailureCounter>());
}
} }
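
The same-instance and swap behaviours these tests assert can be sketched with plain `Microsoft.Extensions.DependencyInjection` calls. This is an illustration only: the types (`Writer`, `IQueue`, `ICounter`, `NoOpCounter`, `MetricsCounter`) are hypothetical stand-ins, and how `AddAuditLog` and `AddAuditLogHealthMetricsBridge` are actually implemented is not shown in this diff.

```csharp
using System;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.DependencyInjection.Extensions;

// Hypothetical stand-ins for the writer/queue and failure-counter types.
public interface IQueue { }
public interface ICounter { }
public sealed class Writer : IQueue { }
public sealed class NoOpCounter : ICounter { }
public sealed class MetricsCounter : ICounter { }

public static class DiSketch
{
    public static ServiceProvider Build(bool withBridge)
    {
        var services = new ServiceCollection();

        // Register the concrete singleton once, then forward the interface
        // to it so both service types resolve to the same instance.
        services.AddSingleton<Writer>();
        services.AddSingleton<IQueue>(sp => sp.GetRequiredService<Writer>());

        // Default registration, replaced when the bridge is requested.
        services.AddSingleton<ICounter, NoOpCounter>();
        if (withBridge)
        {
            services.Replace(ServiceDescriptor.Singleton<ICounter, MetricsCounter>());
        }

        return services.BuildServiceProvider();
    }
}
```

Forwarding through a factory avoids the pitfall of two separate `AddSingleton` registrations, which would create two instances and leave rows written through one invisible to a reader of the other.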

@@ -0,0 +1,220 @@
using Akka.Actor;
using Akka.TestKit.Xunit2;
using Microsoft.EntityFrameworkCore;
using Microsoft.Extensions.Logging.Abstractions;
using ScadaLink.AuditLog.Central;
using ScadaLink.Commons.Entities.Audit;
using ScadaLink.Commons.Interfaces.Repositories;
using ScadaLink.Commons.Messages.Audit;
using ScadaLink.Commons.Types.Audit;
using ScadaLink.Commons.Types.Enums;
using ScadaLink.ConfigurationDatabase;
using ScadaLink.ConfigurationDatabase.Repositories;
using ScadaLink.ConfigurationDatabase.Tests.Migrations;
namespace ScadaLink.AuditLog.Tests.Central;
/// <summary>
/// Bundle D D2 tests for <see cref="AuditLogIngestActor"/>. Uses the same
/// <see cref="MsSqlMigrationFixture"/> as the M1 repository tests so the actor
/// exercises real <see cref="AuditLogRepository.InsertIfNotExistsAsync"/>
/// against a partitioned MSSQL schema (the only way to verify the
/// IngestedAtUtc stamp + duplicate-key idempotency end to end).
/// </summary>
public class AuditLogIngestActorTests : TestKit, IClassFixture<MsSqlMigrationFixture>
{
private readonly MsSqlMigrationFixture _fixture;
public AuditLogIngestActorTests(MsSqlMigrationFixture fixture)
{
_fixture = fixture;
}
private ScadaLinkDbContext CreateContext()
{
var options = new DbContextOptionsBuilder<ScadaLinkDbContext>()
.UseSqlServer(_fixture.ConnectionString)
.Options;
return new ScadaLinkDbContext(options);
}
private static string NewSiteId() =>
"test-bundle-d2-" + Guid.NewGuid().ToString("N").Substring(0, 8);
private static AuditEvent NewEvent(string siteId, Guid? id = null) => new()
{
EventId = id ?? Guid.NewGuid(),
OccurredAtUtc = new DateTime(2026, 5, 20, 10, 0, 0, DateTimeKind.Utc),
Channel = AuditChannel.ApiOutbound,
Kind = AuditKind.ApiCall,
Status = AuditStatus.Delivered,
SourceSiteId = siteId,
};
private IActorRef CreateActor(IAuditLogRepository repository) =>
Sys.ActorOf(Props.Create(() => new AuditLogIngestActor(
repository,
NullLogger<AuditLogIngestActor>.Instance)));
[SkippableFact]
public async Task Receive_BatchOf5_Calls_Repo_5Times_Acks_All_5()
{
Skip.IfNot(_fixture.Available, _fixture.SkipReason);
var siteId = NewSiteId();
var events = Enumerable.Range(0, 5).Select(_ => NewEvent(siteId)).ToList();
await using var context = CreateContext();
var repo = new AuditLogRepository(context);
var actor = CreateActor(repo);
actor.Tell(new IngestAuditEventsCommand(events), TestActor);
var reply = ExpectMsg<IngestAuditEventsReply>(TimeSpan.FromSeconds(10));
Assert.Equal(5, reply.AcceptedEventIds.Count);
Assert.True(events.Select(e => e.EventId).ToHashSet().SetEquals(reply.AcceptedEventIds.ToHashSet()));
// Verify rows landed in MSSQL.
await using var readContext = CreateContext();
var rows = await readContext.Set<AuditEvent>()
.Where(e => e.SourceSiteId == siteId)
.ToListAsync();
Assert.Equal(5, rows.Count);
}
[SkippableFact]
public async Task Receive_BatchWith_AlreadyExistingEvent_AcksAll_NoDoubleInsert()
{
Skip.IfNot(_fixture.Available, _fixture.SkipReason);
var siteId = NewSiteId();
var pre = NewEvent(siteId);
// Pre-insert one event directly via the repo so the actor sees it
// already present when it processes the batch.
await using (var seedContext = CreateContext())
{
var seedRepo = new AuditLogRepository(seedContext);
await seedRepo.InsertIfNotExistsAsync(pre);
}
// Build the batch including the pre-existing event plus 2 new ones.
var fresh1 = NewEvent(siteId);
var fresh2 = NewEvent(siteId);
var batch = new List<AuditEvent> { pre, fresh1, fresh2 };
await using var context = CreateContext();
var repo = new AuditLogRepository(context);
var actor = CreateActor(repo);
actor.Tell(new IngestAuditEventsCommand(batch), TestActor);
var reply = ExpectMsg<IngestAuditEventsReply>(TimeSpan.FromSeconds(10));
// All 3 acked under idempotent first-write-wins.
Assert.Equal(3, reply.AcceptedEventIds.Count);
// Verify no double-insert.
await using var readContext = CreateContext();
var count = await readContext.Set<AuditEvent>()
.Where(e => e.SourceSiteId == siteId)
.CountAsync();
Assert.Equal(3, count);
}
[SkippableFact]
public async Task Receive_Sets_IngestedAtUtc_Before_Insert()
{
Skip.IfNot(_fixture.Available, _fixture.SkipReason);
var siteId = NewSiteId();
var events = Enumerable.Range(0, 3).Select(_ => NewEvent(siteId)).ToList();
var before = DateTime.UtcNow.AddSeconds(-1);
await using var context = CreateContext();
var repo = new AuditLogRepository(context);
var actor = CreateActor(repo);
actor.Tell(new IngestAuditEventsCommand(events), TestActor);
ExpectMsg<IngestAuditEventsReply>(TimeSpan.FromSeconds(10));
var after = DateTime.UtcNow.AddSeconds(1);
await using var readContext = CreateContext();
var rows = await readContext.Set<AuditEvent>()
.Where(e => e.SourceSiteId == siteId)
.ToListAsync();
Assert.Equal(3, rows.Count);
Assert.All(rows, r =>
{
Assert.NotNull(r.IngestedAtUtc);
Assert.InRange(r.IngestedAtUtc!.Value, before, after);
});
}
[SkippableFact]
public async Task Receive_RepoThrowsForOneEvent_Other4StillPersisted()
{
Skip.IfNot(_fixture.Available, _fixture.SkipReason);
var siteId = NewSiteId();
var events = Enumerable.Range(0, 5).Select(_ => NewEvent(siteId)).ToList();
var poisonId = events[2].EventId;
// Wrapper repo that throws only when the poison EventId is being
// inserted. The four neighbours must still land in MSSQL.
await using var context = CreateContext();
var realRepo = new AuditLogRepository(context);
var wrappedRepo = new ThrowingRepository(realRepo, poisonId);
var actor = CreateActor(wrappedRepo);
actor.Tell(new IngestAuditEventsCommand(events), TestActor);
var reply = ExpectMsg<IngestAuditEventsReply>(TimeSpan.FromSeconds(10));
// The actor catches the throw per-row, so 4 ids are accepted and 1 is
// left out.
Assert.Equal(4, reply.AcceptedEventIds.Count);
Assert.DoesNotContain(poisonId, reply.AcceptedEventIds);
await using var readContext = CreateContext();
var rows = await readContext.Set<AuditEvent>()
.Where(e => e.SourceSiteId == siteId)
.ToListAsync();
Assert.Equal(4, rows.Count);
Assert.DoesNotContain(rows, r => r.EventId == poisonId);
}
/// <summary>
/// Tiny test double that delegates to a real repository but throws on a
/// specified EventId. Used to verify per-row failure isolation: one bad
/// row must not cause the rest of the batch to be lost.
/// </summary>
private sealed class ThrowingRepository : IAuditLogRepository
{
private readonly IAuditLogRepository _inner;
private readonly Guid _poisonId;
public ThrowingRepository(IAuditLogRepository inner, Guid poisonId)
{
_inner = inner;
_poisonId = poisonId;
}
public Task InsertIfNotExistsAsync(AuditEvent evt, CancellationToken ct = default)
{
if (evt.EventId == _poisonId)
{
throw new InvalidOperationException("simulated repo failure for poison row");
}
return _inner.InsertIfNotExistsAsync(evt, ct);
}
public Task<IReadOnlyList<AuditEvent>> QueryAsync(
AuditLogQueryFilter filter, AuditLogPaging paging, CancellationToken ct = default) =>
_inner.QueryAsync(filter, paging, ct);
public Task SwitchOutPartitionAsync(DateTime monthBoundary, CancellationToken ct = default) =>
_inner.SwitchOutPartitionAsync(monthBoundary, ct);
}
}
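
The per-row isolation that `Receive_RepoThrowsForOneEvent_Other4StillPersisted` checks can be sketched as a loop that catches around each insert. This is a sketch under stated assumptions, not the actor's code: `BatchIngestSketch` and its delegate parameter are hypothetical, and the real `AuditLogIngestActor` is not part of this section.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical sketch of per-row failure isolation; the delegate stands in
// for a repository insert such as InsertIfNotExistsAsync.
public static class BatchIngestSketch
{
    public static async Task<IReadOnlyList<Guid>> IngestAsync(
        IEnumerable<Guid> eventIds,
        Func<Guid, Task> insertIfNotExists)
    {
        var accepted = new List<Guid>();
        foreach (var id in eventIds)
        {
            try
            {
                // Covers both a synchronous throw and a faulted task.
                await insertIfNotExists(id);
                accepted.Add(id);
            }
            catch (Exception)
            {
                // One bad row is left out of the ack; the loop continues so
                // the remaining rows in the batch are still attempted.
            }
        }

        return accepted;
    }
}
```

Leaving the failed id out of the accepted list means the sender can keep that row pending and retry it later, while the rest of the batch is acknowledged.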

@@ -0,0 +1,341 @@
using Akka.Actor;
using Akka.TestKit.Xunit2;
using Microsoft.EntityFrameworkCore;
using Microsoft.Extensions.Logging.Abstractions;
using Microsoft.Extensions.Options;
using ScadaLink.AuditLog.Central;
using ScadaLink.AuditLog.Site;
using ScadaLink.AuditLog.Site.Telemetry;
using ScadaLink.Commons.Entities.Audit;
using ScadaLink.Commons.Interfaces.Repositories;
using ScadaLink.Commons.Messages.Audit;
using ScadaLink.Commons.Types.Audit;
using ScadaLink.Commons.Types.Enums;
using ScadaLink.ConfigurationDatabase;
using ScadaLink.ConfigurationDatabase.Repositories;
using ScadaLink.ConfigurationDatabase.Tests.Migrations;
using ScadaLink.Communication.Grpc;
namespace ScadaLink.AuditLog.Tests.Integration;
/// <summary>
/// Bundle H — end-to-end test wiring the full Audit Log #23 M2 sync-call pipeline:
/// <see cref="FallbackAuditWriter"/> over a <see cref="SqliteAuditWriter"/> backed by
/// an in-memory SQLite database; the <see cref="SiteAuditTelemetryActor"/> drains
/// Pending rows and pushes them through a stub <see cref="ISiteStreamAuditClient"/>
/// that forwards directly to the central <see cref="AuditLogIngestActor"/> backed
/// by a real <see cref="AuditLogRepository"/> on the <see cref="MsSqlMigrationFixture"/>.
/// </summary>
/// <remarks>
/// <para>
/// This is a <b>component-level</b> integration test, not a full Akka-cluster
/// test (per the M2 brainstorm decision). The stub gRPC client short-circuits
/// the wire so we exercise the real telemetry actor, the real ingest actor, the
/// real SQLite writer, and the real MSSQL repository — without standing up a
/// Kestrel host or two-cluster topology.
/// </para>
/// <para>
/// The site-side telemetry actor's <c>Drain</c> message is private; rather than
/// expose it we drive the drain by setting <c>BusyIntervalSeconds = 1</c> so the
/// initial scheduled tick fires within a second of actor start. Tests then
/// poll with <see cref="TestKitBase.AwaitAssertAsync"/> until the central
/// repository observes the expected rows.
/// </para>
/// <para>
/// Each test uses a unique <c>SourceSiteId</c> (Guid suffix) so concurrent tests
/// sharing the per-fixture MSSQL database don't interfere with each other.
/// </para>
/// </remarks>
public class SyncCallEmissionEndToEndTests : TestKit, IClassFixture<MsSqlMigrationFixture>
{
private readonly MsSqlMigrationFixture _fixture;
public SyncCallEmissionEndToEndTests(MsSqlMigrationFixture fixture)
{
_fixture = fixture;
}
private static string NewSiteId() =>
"test-bundle-h-" + Guid.NewGuid().ToString("N").Substring(0, 8);
private ScadaLinkDbContext CreateContext()
{
var options = new DbContextOptionsBuilder<ScadaLinkDbContext>()
.UseSqlServer(_fixture.ConnectionString)
.Options;
return new ScadaLinkDbContext(options);
}
private static AuditEvent NewEvent(string siteId, Guid? id = null) => new()
{
EventId = id ?? Guid.NewGuid(),
OccurredAtUtc = DateTime.UtcNow,
Channel = AuditChannel.ApiOutbound,
Kind = AuditKind.ApiCall,
Status = AuditStatus.Delivered,
SourceSiteId = siteId,
Target = "external-system-a/method",
};
private static IOptions<SqliteAuditWriterOptions> InMemorySqliteOptions() =>
Options.Create(new SqliteAuditWriterOptions
{
// Per-test unique database name + Mode=Memory + Cache=Shared keeps
// the in-memory database alive for as long as the writer holds a
// connection; SQLite discards a shared in-memory database once its
// last connection closes. The DatabasePath field is unused because
// the writer is constructed with a connection-string override.
DatabasePath = "ignored",
BatchSize = 64,
ChannelCapacity = 1024,
});
private static SqliteAuditWriter CreateInMemorySqliteWriter() =>
// The 3rd constructor argument is connectionStringOverride. A unique
// shared-cache in-memory URI keeps the schema scoped to this writer
// instance, and it is torn down when the writer is disposed.
new SqliteAuditWriter(
InMemorySqliteOptions(),
NullLogger<SqliteAuditWriter>.Instance,
connectionStringOverride: $"Data Source=file:auditlog-h-{Guid.NewGuid():N}?mode=memory&cache=shared");
private static IOptions<SiteAuditTelemetryOptions> FastTelemetryOptions() =>
Options.Create(new SiteAuditTelemetryOptions
{
BatchSize = 256,
// 1s for both intervals so the initial scheduled tick fires fast
// and any failure-driven re-tick also fires fast — without
// requiring a public Drain message to be exposed.
BusyIntervalSeconds = 1,
IdleIntervalSeconds = 1,
});
private IActorRef CreateIngestActor(IAuditLogRepository repo) =>
Sys.ActorOf(Props.Create(() => new AuditLogIngestActor(
repo,
NullLogger<AuditLogIngestActor>.Instance)));
private IActorRef CreateTelemetryActor(
ISiteAuditQueue queue,
ISiteStreamAuditClient client) =>
Sys.ActorOf(Props.Create(() => new SiteAuditTelemetryActor(
queue,
client,
FastTelemetryOptions(),
NullLogger<SiteAuditTelemetryActor>.Instance)));
[SkippableFact]
public async Task EndToEnd_OneWrittenEvent_Reaches_Central_AuditLog_Within_Reasonable_Time()
{
Skip.IfNot(_fixture.Available, _fixture.SkipReason);
var siteId = NewSiteId();
// Real central wiring: repo + ingest actor.
await using var ingestContext = CreateContext();
var ingestRepo = new AuditLogRepository(ingestContext);
var ingestActor = CreateIngestActor(ingestRepo);
// Real site wiring: SQLite (in-memory) + ring + fallback + telemetry.
await using var sqliteWriter = CreateInMemorySqliteWriter();
var ring = new RingBufferFallback();
var fallback = new FallbackAuditWriter(
sqliteWriter,
ring,
new NoOpAuditWriteFailureCounter(),
NullLogger<FallbackAuditWriter>.Instance);
var stubClient = new DirectActorSiteStreamAuditClient(ingestActor);
CreateTelemetryActor(sqliteWriter, stubClient);
// Act — one fresh event written via the FallbackAuditWriter hot-path.
var evt = NewEvent(siteId);
await fallback.WriteAsync(evt);
// Assert — the central AuditLog row materialises within a window that
// covers initial tick (1s) + a generous slack for SQLite + the actor
// round-trip + EF/MSSQL latency.
await AwaitAssertAsync(async () =>
{
await using var readContext = CreateContext();
var readRepo = new AuditLogRepository(readContext);
var rows = await readRepo.QueryAsync(
new AuditLogQueryFilter(SourceSiteId: siteId),
new AuditLogPaging(PageSize: 10));
Assert.Single(rows);
Assert.Equal(evt.EventId, rows[0].EventId);
// Central stamps IngestedAtUtc; site never sets it.
Assert.NotNull(rows[0].IngestedAtUtc);
}, TimeSpan.FromSeconds(15));
}
[SkippableFact]
public async Task EndToEnd_GrpcStubError_RowStays_Pending_NextTick_Succeeds()
{
Skip.IfNot(_fixture.Available, _fixture.SkipReason);
var siteId = NewSiteId();
await using var ingestContext = CreateContext();
var ingestRepo = new AuditLogRepository(ingestContext);
var ingestActor = CreateIngestActor(ingestRepo);
await using var sqliteWriter = CreateInMemorySqliteWriter();
var ring = new RingBufferFallback();
var fallback = new FallbackAuditWriter(
sqliteWriter,
ring,
new NoOpAuditWriteFailureCounter(),
NullLogger<FallbackAuditWriter>.Instance);
// Stub fails the first push; subsequent calls flow through. The
// telemetry actor's on-failure branch keeps rows in Pending state, so
// the next tick re-reads them and tries again.
var stubClient = new DirectActorSiteStreamAuditClient(ingestActor)
{
FailNextCallCount = 1,
};
CreateTelemetryActor(sqliteWriter, stubClient);
var evt = NewEvent(siteId);
await fallback.WriteAsync(evt);
// Wait long enough for at least one failure-then-success cycle. With
// both intervals = 1s the actor retries quickly; allow 15s for slow CI.
await AwaitAssertAsync(async () =>
{
await using var readContext = CreateContext();
var readRepo = new AuditLogRepository(readContext);
var rows = await readRepo.QueryAsync(
new AuditLogQueryFilter(SourceSiteId: siteId),
new AuditLogPaging(PageSize: 10));
Assert.Single(rows);
Assert.Equal(evt.EventId, rows[0].EventId);
}, TimeSpan.FromSeconds(15));
Assert.True(stubClient.CallCount >= 2,
$"Expected at least one failed push + one successful push; saw {stubClient.CallCount} total client calls.");
// The site SQLite row must have flipped to Forwarded after the
// successful retry. ReadPendingAsync only returns Pending rows; the
// row should NOT show up there anymore.
var stillPending = await sqliteWriter.ReadPendingAsync(64);
Assert.DoesNotContain(stillPending, p => p.EventId == evt.EventId);
}
[SkippableFact]
public async Task EndToEnd_DuplicateSubmit_OnlyOneCentralRow()
{
Skip.IfNot(_fixture.Available, _fixture.SkipReason);
var siteId = NewSiteId();
await using var ingestContext = CreateContext();
var ingestRepo = new AuditLogRepository(ingestContext);
var ingestActor = CreateIngestActor(ingestRepo);
await using var sqliteWriter = CreateInMemorySqliteWriter();
var ring = new RingBufferFallback();
var fallback = new FallbackAuditWriter(
sqliteWriter,
ring,
new NoOpAuditWriteFailureCounter(),
NullLogger<FallbackAuditWriter>.Instance);
var stubClient = new DirectActorSiteStreamAuditClient(ingestActor);
CreateTelemetryActor(sqliteWriter, stubClient);
// Both writes carry the SAME EventId. Site SQLite's PRIMARY KEY
// constraint and the central repo's InsertIfNotExistsAsync both
// enforce first-write-wins, so only one central row must materialise.
var sharedId = Guid.NewGuid();
var evt1 = NewEvent(siteId, sharedId);
var evt2 = NewEvent(siteId, sharedId);
await fallback.WriteAsync(evt1);
await fallback.WriteAsync(evt2);
await AwaitAssertAsync(async () =>
{
await using var readContext = CreateContext();
var readRepo = new AuditLogRepository(readContext);
var rows = await readRepo.QueryAsync(
new AuditLogQueryFilter(SourceSiteId: siteId),
new AuditLogPaging(PageSize: 10));
Assert.Single(rows);
Assert.Equal(sharedId, rows[0].EventId);
}, TimeSpan.FromSeconds(15));
}
/// <summary>
/// Test double for <see cref="ISiteStreamAuditClient"/> that short-circuits
/// the gRPC wire and forwards the batch directly to a central
/// <see cref="AuditLogIngestActor"/> via Akka <see cref="Futures.Ask"/>. The
/// Akka <see cref="IngestAuditEventsReply"/> is converted to the proto
/// <see cref="IngestAck"/> that the telemetry actor expects.
/// </summary>
private sealed class DirectActorSiteStreamAuditClient : ISiteStreamAuditClient
{
private readonly IActorRef _ingestActor;
private int _failsRemaining;
private int _callCount;
public DirectActorSiteStreamAuditClient(IActorRef ingestActor)
{
_ingestActor = ingestActor ?? throw new ArgumentNullException(nameof(ingestActor));
}
/// <summary>
/// When &gt; 0, the next <c>FailNextCallCount</c> invocations of
/// <see cref="IngestAuditEventsAsync"/> throw to simulate a gRPC error;
/// after that count is exhausted, calls succeed normally.
/// </summary>
public int FailNextCallCount
{
get => _failsRemaining;
set => _failsRemaining = value;
}
public int CallCount => Volatile.Read(ref _callCount);
public async Task<IngestAck> IngestAuditEventsAsync(AuditEventBatch batch, CancellationToken ct)
{
Interlocked.Increment(ref _callCount);
// Atomically consume one of the queued failures, if any. This
// lets the test arm a deterministic number of failures before the
// stub recovers.
if (Interlocked.Decrement(ref _failsRemaining) >= 0)
{
throw new InvalidOperationException("simulated gRPC failure for test");
}
// Decrement under-ran into negative territory; clamp at -1 to keep
// the field bounded even under many calls.
Interlocked.Exchange(ref _failsRemaining, -1);
// Decode the proto batch back into AuditEvent records — this
// mirrors what the production SiteStreamGrpcServer does before
// dispatching to the ingest actor (see Bundle D's gRPC handler).
var events = new List<AuditEvent>(batch.Events.Count);
foreach (var dto in batch.Events)
{
events.Add(ScadaLink.AuditLog.Telemetry.AuditEventMapper.FromDto(dto));
}
// Ask the central actor; the reply carries the accepted EventIds.
var reply = await _ingestActor
.Ask<IngestAuditEventsReply>(
new IngestAuditEventsCommand(events),
TimeSpan.FromSeconds(10))
.ConfigureAwait(false);
var ack = new IngestAck();
foreach (var id in reply.AcceptedEventIds)
{
ack.AcceptedEventIds.Add(id.ToString());
}
return ack;
}
}
}

@@ -9,11 +9,30 @@
</PropertyGroup>
<ItemGroup>
<PackageReference Include="Akka.TestKit.Xunit2" />
<PackageReference Include="coverlet.collector" />
<!--
Bundle D D2 needs Microsoft.Data.SqlClient for the MsSqlMigrationFixture
(mirroring ScadaLink.ConfigurationDatabase.Tests). Pinning 6.1.1 here for
the same reason: EF SqlServer 10.0.7 needs >= 6.1.1 but the central pin
is 6.0.2 (production ExternalSystemGateway). Override is test-only.
-->
<PackageReference Include="Microsoft.Data.SqlClient" VersionOverride="6.1.1" />
<PackageReference Include="Microsoft.Data.Sqlite" />
<PackageReference Include="Microsoft.EntityFrameworkCore.SqlServer" />
<PackageReference Include="Microsoft.Extensions.Configuration.Json" />
<PackageReference Include="Microsoft.Extensions.DependencyInjection" />
<PackageReference Include="Microsoft.Extensions.Logging.Abstractions" />
<PackageReference Include="Microsoft.NET.Test.Sdk" />
<PackageReference Include="NSubstitute" />
<PackageReference Include="xunit" />
<PackageReference Include="xunit.runner.visualstudio" />
<!--
SkippableFact pattern (xunit 2.9.x has no native Assert.Skip) — used by
Bundle D D2 MSSQL-backed AuditLogIngestActor tests to report Skipped when
the dev MSSQL container is not reachable.
-->
<PackageReference Include="Xunit.SkippableFact" />
</ItemGroup>
<ItemGroup>
@@ -22,6 +41,13 @@
<ItemGroup>
<ProjectReference Include="../../src/ScadaLink.AuditLog/ScadaLink.AuditLog.csproj" />
<!--
D2: the AuditLogIngestActor tests use the real AuditLogRepository against
a per-test MSSQL database via MsSqlMigrationFixture. The fixture lives in
ScadaLink.ConfigurationDatabase.Tests; we reference that test project so
the fixture + EF migrations come along without duplicating them.
-->
<ProjectReference Include="../ScadaLink.ConfigurationDatabase.Tests/ScadaLink.ConfigurationDatabase.Tests.csproj" />
</ItemGroup>
</Project>

@@ -0,0 +1,133 @@
using Microsoft.Extensions.Logging.Abstractions;
using NSubstitute;
using ScadaLink.AuditLog.Site;
using ScadaLink.Commons.Entities.Audit;
using ScadaLink.Commons.Interfaces.Services;
using ScadaLink.Commons.Types.Enums;
namespace ScadaLink.AuditLog.Tests.Site;
/// <summary>
/// Bundle B (M2-T4) tests for <see cref="FallbackAuditWriter"/> — composes the
/// primary <see cref="SqliteAuditWriter"/>, the drop-oldest
/// <see cref="RingBufferFallback"/>, and an
/// <see cref="IAuditWriteFailureCounter"/> health counter.
/// </summary>
public class FallbackAuditWriterTests
{
private static AuditEvent NewEvent(string? target = null) => new()
{
EventId = Guid.NewGuid(),
OccurredAtUtc = DateTime.UtcNow,
Channel = AuditChannel.ApiOutbound,
Kind = AuditKind.ApiCall,
Status = AuditStatus.Delivered,
Target = target,
PayloadTruncated = false,
ForwardState = AuditForwardState.Pending,
};
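/// <summary>
/// Hand-rolled primary writer stub. While <see cref="FailNext"/> is true every
/// write fails; otherwise the event is recorded in <see cref="Written"/>.
/// </summary>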
private sealed class FlipSwitchPrimary : IAuditWriter
{
public bool FailNext { get; set; }
public List<AuditEvent> Written { get; } = new();
public Task WriteAsync(AuditEvent evt, CancellationToken ct = default)
{
if (FailNext)
{
return Task.FromException(new InvalidOperationException("primary down"));
}
Written.Add(evt);
return Task.CompletedTask;
}
}
[Fact]
public async Task WriteAsync_PrimaryThrows_EventLandsInRing_CallReturnsSuccess()
{
var primary = new FlipSwitchPrimary { FailNext = true };
var ring = new RingBufferFallback(capacity: 16);
var counter = Substitute.For<IAuditWriteFailureCounter>();
var fallback = new FallbackAuditWriter(primary, ring, counter, NullLogger<FallbackAuditWriter>.Instance);
var evt = NewEvent("doomed");
// Must NOT throw — audit failures are always swallowed at this layer.
await fallback.WriteAsync(evt);
Assert.Equal(1, ring.Count);
counter.Received(1).Increment();
}
[Fact]
public async Task WriteAsync_PrimaryRecovers_RingDrains_InFIFOOrder_OnNextWrite()
{
var primary = new FlipSwitchPrimary { FailNext = true };
var ring = new RingBufferFallback(capacity: 16);
var counter = Substitute.For<IAuditWriteFailureCounter>();
var fallback = new FallbackAuditWriter(primary, ring, counter, NullLogger<FallbackAuditWriter>.Instance);
var failed = new[] { NewEvent("a"), NewEvent("b"), NewEvent("c") };
foreach (var e in failed)
{
await fallback.WriteAsync(e);
}
Assert.Equal(3, ring.Count);
// Primary recovers; the very next successful write should drain the
// ring in FIFO order through the primary.
primary.FailNext = false;
var trigger = NewEvent("trigger");
await fallback.WriteAsync(trigger);
Assert.Equal(0, ring.Count);
// Order: the triggering event reaches the primary first (that's the
// signal the primary has recovered), then the backlog drains in FIFO
// submission order behind it.
Assert.Equal(4, primary.Written.Count);
Assert.Equal("trigger", primary.Written[0].Target);
Assert.Equal("a", primary.Written[1].Target);
Assert.Equal("b", primary.Written[2].Target);
Assert.Equal("c", primary.Written[3].Target);
}
[Fact]
public async Task WriteAsync_PrimaryAlwaysSucceeds_Ring_StaysEmpty()
{
var primary = new FlipSwitchPrimary();
var ring = new RingBufferFallback(capacity: 16);
var counter = Substitute.For<IAuditWriteFailureCounter>();
var fallback = new FallbackAuditWriter(primary, ring, counter, NullLogger<FallbackAuditWriter>.Instance);
for (int i = 0; i < 10; i++)
{
await fallback.WriteAsync(NewEvent());
}
Assert.Equal(0, ring.Count);
Assert.Equal(10, primary.Written.Count);
counter.DidNotReceive().Increment();
}
[Fact]
public async Task WriteAsync_FailureCounter_Incremented_Per_PrimaryFailure()
{
var primary = new FlipSwitchPrimary { FailNext = true };
var ring = new RingBufferFallback(capacity: 16);
var counter = Substitute.For<IAuditWriteFailureCounter>();
var fallback = new FallbackAuditWriter(primary, ring, counter, NullLogger<FallbackAuditWriter>.Instance);
for (int i = 0; i < 5; i++)
{
await fallback.WriteAsync(NewEvent());
}
counter.Received(5).Increment();
}
}
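
A minimal sketch of the write path that the FallbackAuditWriterTests above pin down, assuming a single caller at a time. `FallbackWriterSketch`, its `Func<string, Task>` primary, and the unbounded `Queue<string>` backlog are hypothetical stand-ins. The real FallbackAuditWriter, RingBufferFallback and IAuditWriteFailureCounter are not shown in this part of the diff and may differ.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical sketch, NOT the project's FallbackAuditWriter.
// Assumptions: events are plain strings, the primary is a delegate, and
// the backlog is an unbounded queue (the real ring is drop-oldest).
public sealed class FallbackWriterSketch
{
    private readonly Func<string, Task> _primary;
    private readonly Queue<string> _backlog = new();

    public FallbackWriterSketch(Func<string, Task> primary)
    {
        _primary = primary ?? throw new ArgumentNullException(nameof(primary));
    }

    // Number of primary failures seen on the hot path.
    public int Failures { get; private set; }

    // Number of events currently parked in the backlog.
    public int BacklogCount => _backlog.Count;

    public async Task WriteAsync(string evt)
    {
        try
        {
            await _primary(evt);
        }
        catch (Exception)
        {
            // Never surface the failure to the caller: count it and park the event.
            Failures++;
            _backlog.Enqueue(evt);
            return;
        }

        // The primary accepted the new event, so replay the backlog in FIFO order.
        while (_backlog.TryPeek(out var pending))
        {
            try
            {
                await _primary(pending);
            }
            catch (Exception)
            {
                return; // primary failed again; keep the remaining backlog
            }
            _backlog.Dequeue();
        }
    }
}
```

Whether a failure during the backlog replay also bumps the failure counter is not covered by the tests above, so the sketch leaves it uncounted.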

@@ -0,0 +1,46 @@
using NSubstitute;
using ScadaLink.AuditLog.Site;
using ScadaLink.HealthMonitoring;
namespace ScadaLink.AuditLog.Tests.Site;
/// <summary>
/// Bundle G (M2-T11) — the <see cref="HealthMetricsAuditWriteFailureCounter"/>
/// adapter is the production binding for <see cref="IAuditWriteFailureCounter"/>
/// on site nodes; it forwards every FallbackAuditWriter primary failure into
/// the shared <see cref="ISiteHealthCollector"/> so the site health report
/// surfaces the failure count as <c>SiteAuditWriteFailures</c>.
/// </summary>
public class HealthMetricsAuditWriteFailureCounterTests
{
[Fact]
public void Increment_Routes_To_Collector_IncrementSiteAuditWriteFailures()
{
var collector = Substitute.For<ISiteHealthCollector>();
var counter = new HealthMetricsAuditWriteFailureCounter(collector);
counter.Increment();
collector.Received(1).IncrementSiteAuditWriteFailures();
}
[Fact]
public void Increment_Multiple_Calls_Route_To_Collector_Each_Time()
{
var collector = Substitute.For<ISiteHealthCollector>();
var counter = new HealthMetricsAuditWriteFailureCounter(collector);
counter.Increment();
counter.Increment();
counter.Increment();
collector.Received(3).IncrementSiteAuditWriteFailures();
}
[Fact]
public void Construction_With_Null_Collector_Throws_ArgumentNullException()
{
Assert.Throws<ArgumentNullException>(
() => new HealthMetricsAuditWriteFailureCounter(null!));
}
}

@@ -0,0 +1,91 @@
using ScadaLink.AuditLog.Site;
using ScadaLink.Commons.Entities.Audit;
using ScadaLink.Commons.Types.Enums;
namespace ScadaLink.AuditLog.Tests.Site;
/// <summary>
/// Bundle B (M2-T3) tests for <see cref="RingBufferFallback"/> — the
/// drop-oldest fallback used by <see cref="FallbackAuditWriter"/> when the
/// primary SQLite writer is throwing.
/// </summary>
public class RingBufferFallbackTests
{
private static AuditEvent NewEvent(string? target = null)
{
return new AuditEvent
{
EventId = Guid.NewGuid(),
OccurredAtUtc = DateTime.UtcNow,
Channel = AuditChannel.ApiOutbound,
Kind = AuditKind.ApiCall,
Status = AuditStatus.Delivered,
Target = target,
PayloadTruncated = false,
ForwardState = AuditForwardState.Pending,
};
}
[Fact]
public async Task Enqueue_1025_Into_1024Cap_Ring_DropsOldest_AndRaisesOverflowOnce()
{
var ring = new RingBufferFallback(capacity: 1024);
var overflowCount = 0;
ring.RingBufferOverflowed += () => Interlocked.Increment(ref overflowCount);
var events = Enumerable.Range(0, 1025).Select(i => NewEvent(target: i.ToString())).ToList();
foreach (var e in events)
{
Assert.True(ring.TryEnqueue(e));
}
Assert.Equal(1, overflowCount);
// The surviving 1024 are events[1..1024] (oldest dropped).
var drained = new List<AuditEvent>();
ring.Complete();
await foreach (var e in ring.DrainAsync(CancellationToken.None))
{
drained.Add(e);
}
Assert.Equal(1024, drained.Count);
Assert.Equal("1", drained[0].Target);
Assert.Equal("1024", drained[^1].Target);
}
[Fact]
public async Task DrainAsync_Yields_FIFO_Then_Completes_When_Empty()
{
var ring = new RingBufferFallback(capacity: 16);
var enqueued = Enumerable.Range(0, 5).Select(i => NewEvent(target: i.ToString())).ToList();
foreach (var e in enqueued)
{
Assert.True(ring.TryEnqueue(e));
}
ring.Complete();
var drained = new List<AuditEvent>();
await foreach (var e in ring.DrainAsync(CancellationToken.None))
{
drained.Add(e);
}
Assert.Equal(5, drained.Count);
for (int i = 0; i < 5; i++)
{
Assert.Equal(i.ToString(), drained[i].Target);
}
}
[Fact]
public void TryEnqueue_AllSucceeds_ReturnsTrue()
{
var ring = new RingBufferFallback(capacity: 16);
for (int i = 0; i < 8; i++)
{
Assert.True(ring.TryEnqueue(NewEvent()));
}
}
}
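
One way to get the drop-oldest behaviour that RingBufferFallbackTests exercise is a bounded `System.Threading.Channels` channel in `DropOldest` mode. The sketch below is an assumption about a possible shape, not the project's RingBufferFallback. `DropOldestRing<T>` and its `Dropped` counter are hypothetical names. The `itemDropped` callback overload of `Channel.CreateBounded` needs .NET 6 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Channels;

// Hypothetical sketch, NOT the project's RingBufferFallback.
public sealed class DropOldestRing<T>
{
    private readonly Channel<T> _channel;
    private int _dropped;

    public DropOldestRing(int capacity)
    {
        _channel = Channel.CreateBounded<T>(
            new BoundedChannelOptions(capacity)
            {
                // When full, evict the oldest queued item to make room.
                FullMode = BoundedChannelFullMode.DropOldest,
            },
            // Invoked once per evicted item.
            itemDropped: _ => Interlocked.Increment(ref _dropped));
    }

    // Number of items evicted so far.
    public int Dropped => Volatile.Read(ref _dropped);

    // In DropOldest mode this returns true until the writer is completed.
    public bool TryEnqueue(T item) => _channel.Writer.TryWrite(item);

    // Signals that no more items will arrive so a drain can terminate.
    public void Complete() => _channel.Writer.TryComplete();

    // Yields the surviving items in FIFO order.
    public IAsyncEnumerable<T> DrainAsync(CancellationToken ct) =>
        _channel.Reader.ReadAllAsync(ct);
}
```

Enqueuing five items into a ring of capacity four with this sketch evicts the first item and leaves the last four to drain in order, which mirrors the 1025-into-1024 test above at a smaller scale.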

@@ -0,0 +1,128 @@
using Microsoft.Data.Sqlite;
using Microsoft.Extensions.Logging.Abstractions;
using Microsoft.Extensions.Options;
using ScadaLink.AuditLog.Site;
namespace ScadaLink.AuditLog.Tests.Site;
/// <summary>
/// Bundle B (M2-T1) schema-bootstrap tests for <see cref="SqliteAuditWriter"/>.
/// Uses an in-memory shared-cache SQLite database so the same connection name
/// reaches the same file-less db across both the writer and the verifier.
/// </summary>
public class SqliteAuditWriterSchemaTests
{
/// <summary>
/// Each test uses a unique shared-cache in-memory database. The
/// <c>file:{name}?mode=memory&amp;cache=shared</c> URI lets two SqliteConnections
/// see the same in-memory store as long as both use the same Data Source name.
/// </summary>
private static (SqliteAuditWriter writer, string dataSource) CreateWriter(string testName)
{
var dataSource = $"file:{testName}-{Guid.NewGuid():N}?mode=memory&cache=shared";
var options = new SqliteAuditWriterOptions
{
DatabasePath = dataSource,
};
// The writer builds its connection string as "Data Source={path}" plus
// Cache=Shared. Bypass that here by passing the full connection string
// through the connectionStringOverride hook.
var writer = new SqliteAuditWriter(
Options.Create(options),
NullLogger<SqliteAuditWriter>.Instance,
connectionStringOverride: $"Data Source={dataSource};Cache=Shared");
return (writer, dataSource);
}
private static SqliteConnection OpenVerifierConnection(string dataSource)
{
var connection = new SqliteConnection($"Data Source={dataSource};Cache=Shared");
connection.Open();
return connection;
}
[Fact]
public void Opens_Creates_AuditLog_Table_With_20Columns_And_PK_On_EventId()
{
var (writer, dataSource) = CreateWriter(nameof(Opens_Creates_AuditLog_Table_With_20Columns_And_PK_On_EventId));
using (writer)
{
using var connection = OpenVerifierConnection(dataSource);
using var cmd = connection.CreateCommand();
cmd.CommandText = "PRAGMA table_info(AuditLog);";
using var reader = cmd.ExecuteReader();
var columns = new List<(string Name, int Pk)>();
while (reader.Read())
{
columns.Add((reader.GetString(1), reader.GetInt32(5)));
}
Assert.Equal(20, columns.Count);
var expected = new[]
{
"EventId", "OccurredAtUtc", "Channel", "Kind", "CorrelationId",
"SourceSiteId", "SourceInstanceId", "SourceScript", "Actor", "Target",
"Status", "HttpStatus", "DurationMs", "ErrorMessage", "ErrorDetail",
"RequestSummary", "ResponseSummary", "PayloadTruncated", "Extra",
"ForwardState",
};
Assert.Equal(expected.OrderBy(n => n), columns.Select(c => c.Name).OrderBy(n => n));
// PK is EventId only.
var pkColumns = columns.Where(c => c.Pk > 0).Select(c => c.Name).ToList();
Assert.Single(pkColumns);
Assert.Equal("EventId", pkColumns[0]);
}
}
[Fact]
public void Opens_Creates_IX_ForwardState_Occurred_Index()
{
var (writer, dataSource) = CreateWriter(nameof(Opens_Creates_IX_ForwardState_Occurred_Index));
using (writer)
{
using var connection = OpenVerifierConnection(dataSource);
using var cmd = connection.CreateCommand();
cmd.CommandText = "PRAGMA index_list(AuditLog);";
using var reader = cmd.ExecuteReader();
var indexNames = new List<string>();
while (reader.Read())
{
indexNames.Add(reader.GetString(1));
}
Assert.Contains("IX_SiteAuditLog_ForwardState_Occurred", indexNames);
// Verify the index columns are ForwardState, OccurredAtUtc in that order.
using var infoCmd = connection.CreateCommand();
infoCmd.CommandText = "PRAGMA index_info(IX_SiteAuditLog_ForwardState_Occurred);";
using var infoReader = infoCmd.ExecuteReader();
var indexColumns = new List<string>();
while (infoReader.Read())
{
indexColumns.Add(infoReader.GetString(2));
}
Assert.Equal(new[] { "ForwardState", "OccurredAtUtc" }, indexColumns);
}
}
[Fact]
public void PRAGMA_auto_vacuum_Is_INCREMENTAL()
{
var (writer, dataSource) = CreateWriter(nameof(PRAGMA_auto_vacuum_Is_INCREMENTAL));
using (writer)
{
using var connection = OpenVerifierConnection(dataSource);
using var cmd = connection.CreateCommand();
cmd.CommandText = "PRAGMA auto_vacuum;";
var value = Convert.ToInt32(cmd.ExecuteScalar());
// INCREMENTAL = 2 (0 = NONE, 1 = FULL, 2 = INCREMENTAL).
Assert.Equal(2, value);
}
}
}

@@ -0,0 +1,207 @@
using Microsoft.Data.Sqlite;
using Microsoft.Extensions.Logging.Abstractions;
using Microsoft.Extensions.Options;
using ScadaLink.AuditLog.Site;
using ScadaLink.Commons.Entities.Audit;
using ScadaLink.Commons.Types.Enums;
namespace ScadaLink.AuditLog.Tests.Site;
/// <summary>
/// Bundle B (M2-T2) hot-path tests for <see cref="SqliteAuditWriter"/>. Exercise
/// the Channel-based enqueue, the background writer's batch INSERTs, duplicate-
/// EventId swallowing, ForwardState defaulting, and the
/// <see cref="SqliteAuditWriter.ReadPendingAsync"/> /
/// <see cref="SqliteAuditWriter.MarkForwardedAsync"/> support surface that
/// Bundle D's telemetry actor will call.
/// </summary>
public class SqliteAuditWriterWriteTests
{
private static (SqliteAuditWriter writer, string dataSource) CreateWriter(
string testName,
int? channelCapacity = null)
{
var dataSource = $"file:{testName}-{Guid.NewGuid():N}?mode=memory&cache=shared";
var opts = new SqliteAuditWriterOptions { DatabasePath = dataSource };
if (channelCapacity is int cap)
{
opts.ChannelCapacity = cap;
}
var writer = new SqliteAuditWriter(
Options.Create(opts),
NullLogger<SqliteAuditWriter>.Instance,
connectionStringOverride: $"Data Source={dataSource};Cache=Shared");
return (writer, dataSource);
}
private static SqliteConnection OpenVerifierConnection(string dataSource)
{
var connection = new SqliteConnection($"Data Source={dataSource};Cache=Shared");
connection.Open();
return connection;
}
private static AuditEvent NewEvent(Guid? id = null, DateTime? occurredAtUtc = null)
{
return new AuditEvent
{
EventId = id ?? Guid.NewGuid(),
OccurredAtUtc = occurredAtUtc ?? DateTime.UtcNow,
Channel = AuditChannel.ApiOutbound,
Kind = AuditKind.ApiCall,
Status = AuditStatus.Delivered,
PayloadTruncated = false,
};
}
[Fact]
public async Task WriteAsync_FreshEvent_PersistsWithForwardStatePending()
{
var (writer, dataSource) = CreateWriter(nameof(WriteAsync_FreshEvent_PersistsWithForwardStatePending));
await using var _ = writer;
var evt = NewEvent();
await writer.WriteAsync(evt);
using var connection = OpenVerifierConnection(dataSource);
using var cmd = connection.CreateCommand();
cmd.CommandText = "SELECT ForwardState FROM AuditLog WHERE EventId = $id;";
cmd.Parameters.AddWithValue("$id", evt.EventId.ToString());
var actual = cmd.ExecuteScalar() as string;
Assert.Equal(AuditForwardState.Pending.ToString(), actual);
}
[Fact]
public async Task WriteAsync_Concurrent_1000Calls_All_Persist_NoExceptions()
{
var (writer, dataSource) = CreateWriter(nameof(WriteAsync_Concurrent_1000Calls_All_Persist_NoExceptions));
await using var _ = writer;
var events = Enumerable.Range(0, 1000).Select(_ => NewEvent()).ToList();
await Parallel.ForEachAsync(events, new ParallelOptions { MaxDegreeOfParallelism = 16 },
async (evt, ct) => await writer.WriteAsync(evt, ct));
using var connection = OpenVerifierConnection(dataSource);
using var cmd = connection.CreateCommand();
cmd.CommandText = "SELECT COUNT(*) FROM AuditLog;";
var count = Convert.ToInt64(cmd.ExecuteScalar());
Assert.Equal(1000, count);
}
[Fact]
public async Task WriteAsync_DuplicateEventId_FirstWriteWins_NoException()
{
var (writer, dataSource) = CreateWriter(nameof(WriteAsync_DuplicateEventId_FirstWriteWins_NoException));
await using var _ = writer;
var sharedId = Guid.NewGuid();
var first = NewEvent(sharedId) with { Target = "first" };
var second = NewEvent(sharedId) with { Target = "second" };
await writer.WriteAsync(first);
await writer.WriteAsync(second);
using var connection = OpenVerifierConnection(dataSource);
using var countCmd = connection.CreateCommand();
countCmd.CommandText = "SELECT COUNT(*) FROM AuditLog WHERE EventId = $id;";
countCmd.Parameters.AddWithValue("$id", sharedId.ToString());
var count = Convert.ToInt64(countCmd.ExecuteScalar());
Assert.Equal(1, count);
using var targetCmd = connection.CreateCommand();
targetCmd.CommandText = "SELECT Target FROM AuditLog WHERE EventId = $id;";
targetCmd.Parameters.AddWithValue("$id", sharedId.ToString());
Assert.Equal("first", targetCmd.ExecuteScalar() as string);
}
[Fact]
public async Task WriteAsync_ForcesForwardStatePending_IfNull()
{
var (writer, dataSource) = CreateWriter(nameof(WriteAsync_ForcesForwardStatePending_IfNull));
await using var _ = writer;
var evt = NewEvent() with { ForwardState = null };
await writer.WriteAsync(evt);
using var connection = OpenVerifierConnection(dataSource);
using var cmd = connection.CreateCommand();
cmd.CommandText = "SELECT ForwardState FROM AuditLog WHERE EventId = $id;";
cmd.Parameters.AddWithValue("$id", evt.EventId.ToString());
Assert.Equal(AuditForwardState.Pending.ToString(), cmd.ExecuteScalar() as string);
}
[Fact]
public async Task ReadPendingAsync_Returns_OldestFirst_LimitedToN()
{
var (writer, _) = CreateWriter(nameof(ReadPendingAsync_Returns_OldestFirst_LimitedToN));
await using var _writer = writer;
var baseTime = new DateTime(2026, 5, 20, 12, 0, 0, DateTimeKind.Utc);
var evts = new[]
{
NewEvent(occurredAtUtc: baseTime.AddSeconds(5)),
NewEvent(occurredAtUtc: baseTime.AddSeconds(1)),
NewEvent(occurredAtUtc: baseTime.AddSeconds(3)),
NewEvent(occurredAtUtc: baseTime.AddSeconds(2)),
NewEvent(occurredAtUtc: baseTime.AddSeconds(4)),
};
foreach (var e in evts)
{
await writer.WriteAsync(e);
}
var rows = await writer.ReadPendingAsync(limit: 3);
Assert.Equal(3, rows.Count);
Assert.Equal(baseTime.AddSeconds(1), rows[0].OccurredAtUtc);
Assert.Equal(baseTime.AddSeconds(2), rows[1].OccurredAtUtc);
Assert.Equal(baseTime.AddSeconds(3), rows[2].OccurredAtUtc);
}
[Fact]
public async Task MarkForwardedAsync_FlipsRowsToForwarded()
{
var (writer, dataSource) = CreateWriter(nameof(MarkForwardedAsync_FlipsRowsToForwarded));
await using var _ = writer;
var ids = new[] { Guid.NewGuid(), Guid.NewGuid(), Guid.NewGuid() };
foreach (var id in ids)
{
await writer.WriteAsync(NewEvent(id));
}
await writer.MarkForwardedAsync(ids);
using var connection = OpenVerifierConnection(dataSource);
using var cmd = connection.CreateCommand();
cmd.CommandText = "SELECT ForwardState, COUNT(*) FROM AuditLog GROUP BY ForwardState;";
using var reader = cmd.ExecuteReader();
var byState = new Dictionary<string, long>();
while (reader.Read())
{
byState[reader.GetString(0)] = reader.GetInt64(1);
}
Assert.Equal(3, byState[AuditForwardState.Forwarded.ToString()]);
Assert.False(byState.ContainsKey(AuditForwardState.Pending.ToString()));
}
[Fact]
public async Task MarkForwardedAsync_NonExistentId_NoThrow()
{
var (writer, _) = CreateWriter(nameof(MarkForwardedAsync_NonExistentId_NoThrow));
await using var _writer = writer;
var phantomIds = new[] { Guid.NewGuid(), Guid.NewGuid() };
await writer.MarkForwardedAsync(phantomIds);
// No assertion needed: the call must complete without throwing.
}
}

@@ -0,0 +1,235 @@
using Akka.Actor;
using Akka.TestKit.Xunit2;
using Google.Protobuf.WellKnownTypes;
using Microsoft.Extensions.Logging.Abstractions;
using Microsoft.Extensions.Options;
using NSubstitute;
using NSubstitute.ExceptionExtensions;
using ScadaLink.AuditLog.Site.Telemetry;
using ScadaLink.Commons.Entities.Audit;
using ScadaLink.Commons.Types.Enums;
using ScadaLink.Communication.Grpc;
namespace ScadaLink.AuditLog.Tests.Site.Telemetry;
/// <summary>
/// Bundle D D1 tests for <see cref="SiteAuditTelemetryActor"/>. The actor drains
/// the site SQLite queue via <see cref="ISiteAuditQueue"/>, pushes batches via
/// <see cref="ISiteStreamAuditClient"/>, and flips ack'd rows to Forwarded.
/// Both collaborators are NSubstitute mocks so the tests never touch real
/// SQLite or gRPC.
/// </summary>
public class SiteAuditTelemetryActorTests : TestKit
{
private readonly ISiteAuditQueue _queue = Substitute.For<ISiteAuditQueue>();
private readonly ISiteStreamAuditClient _client = Substitute.For<ISiteStreamAuditClient>();
/// <summary>
/// Fast options so tests don't stall waiting for the scheduler. 1s busy /
/// 2s idle still exercises the busy-vs-idle branching, but each test
/// completes in &lt; 5 s wall-clock.
/// </summary>
private static IOptions<SiteAuditTelemetryOptions> Opts(
int batchSize = 256,
int busySeconds = 1,
int idleSeconds = 2) =>
Options.Create(new SiteAuditTelemetryOptions
{
BatchSize = batchSize,
BusyIntervalSeconds = busySeconds,
IdleIntervalSeconds = idleSeconds,
});
private IActorRef CreateActor(IOptions<SiteAuditTelemetryOptions>? options = null) =>
Sys.ActorOf(Props.Create(() => new SiteAuditTelemetryActor(
_queue,
_client,
options ?? Opts(),
NullLogger<SiteAuditTelemetryActor>.Instance)));
private static AuditEvent NewEvent(Guid? id = null) => new()
{
EventId = id ?? Guid.NewGuid(),
OccurredAtUtc = new DateTime(2026, 5, 20, 10, 0, 0, DateTimeKind.Utc),
Channel = AuditChannel.ApiOutbound,
Kind = AuditKind.ApiCall,
Status = AuditStatus.Delivered,
SourceSiteId = "site-1",
ForwardState = AuditForwardState.Pending,
};
private static IngestAck AckAll(IReadOnlyList<AuditEvent> events)
{
var ack = new IngestAck();
foreach (var e in events)
{
ack.AcceptedEventIds.Add(e.EventId.ToString());
}
return ack;
}
[Fact]
public async Task Drain_With_50PendingRows_Sends_OneBatch_Of_50_Then_FlipsToForwarded()
{
// Arrange — 50 pending rows on the first read, then empty on subsequent
// reads so the actor settles after one productive drain.
var pending = Enumerable.Range(0, 50).Select(_ => NewEvent()).ToList();
_queue.ReadPendingAsync(Arg.Any<int>(), Arg.Any<CancellationToken>())
.Returns(
Task.FromResult<IReadOnlyList<AuditEvent>>(pending),
Task.FromResult<IReadOnlyList<AuditEvent>>(Array.Empty<AuditEvent>()));
AuditEventBatch? capturedBatch = null;
_client.IngestAuditEventsAsync(Arg.Any<AuditEventBatch>(), Arg.Any<CancellationToken>())
.Returns(call =>
{
capturedBatch = call.Arg<AuditEventBatch>();
return Task.FromResult(AckAll(pending));
});
// Act
CreateActor();
// Assert — give the scheduler time to fire the initial Drain tick.
await AwaitAssertAsync(async () =>
{
await _client.Received(1).IngestAuditEventsAsync(
Arg.Any<AuditEventBatch>(), Arg.Any<CancellationToken>());
await _queue.Received(1).MarkForwardedAsync(
Arg.Is<IReadOnlyList<Guid>>(g => g.Count == 50), Arg.Any<CancellationToken>());
}, TimeSpan.FromSeconds(5));
Assert.NotNull(capturedBatch);
Assert.Equal(50, capturedBatch!.Events.Count);
var expected = pending.Select(e => e.EventId).ToHashSet();
await _queue.Received(1).MarkForwardedAsync(
Arg.Is<IReadOnlyList<Guid>>(g => g.ToHashSet().SetEquals(expected)),
Arg.Any<CancellationToken>());
}
[Fact]
public async Task Drain_GrpcThrows_RowsStayPending_NextDrainRetries()
{
// Arrange — first read returns 3 rows; the gRPC client throws on the
// first push, then succeeds on the second. After the second push the
// queue returns empty so the actor settles.
var batch = Enumerable.Range(0, 3).Select(_ => NewEvent()).ToList();
_queue.ReadPendingAsync(Arg.Any<int>(), Arg.Any<CancellationToken>())
.Returns(
Task.FromResult<IReadOnlyList<AuditEvent>>(batch),
Task.FromResult<IReadOnlyList<AuditEvent>>(batch),
Task.FromResult<IReadOnlyList<AuditEvent>>(Array.Empty<AuditEvent>()));
var calls = 0;
_client.IngestAuditEventsAsync(Arg.Any<AuditEventBatch>(), Arg.Any<CancellationToken>())
.Returns(_ =>
{
// The callback runs on the actor's thread, so count atomically.
if (Interlocked.Increment(ref calls) == 1)
{
throw new InvalidOperationException("simulated gRPC failure");
}
return Task.FromResult(AckAll(batch));
});
// Act
CreateActor();
// Assert — eventually MarkForwardedAsync is called exactly once (after
// the retry succeeded). The first failure must NOT have called
// MarkForwardedAsync because the rows stay Pending.
await AwaitAssertAsync(async () =>
{
await _queue.Received(1).MarkForwardedAsync(
Arg.Any<IReadOnlyList<Guid>>(), Arg.Any<CancellationToken>());
}, TimeSpan.FromSeconds(10));
Assert.True(calls >= 2, $"Expected at least 2 client calls (1 failure + 1 retry); saw {calls}");
}
[Fact]
public async Task Drain_ZeroPending_SchedulesAtIdleInterval_NoClientCall()
{
// Arrange — queue always empty.
_queue.ReadPendingAsync(Arg.Any<int>(), Arg.Any<CancellationToken>())
.Returns(Task.FromResult<IReadOnlyList<AuditEvent>>(Array.Empty<AuditEvent>()));
// Idle interval = 2 s. Wait 3 s from actor creation: the initial tick
// fires after the 1 s busy interval scheduled in PreStart. Then assert
// the empty-queue branch did NOT push to the client.
CreateActor(Opts(busySeconds: 1, idleSeconds: 2));
// Allow the initial tick (~1 s) + a generous window for the idle re-tick.
await Task.Delay(TimeSpan.FromSeconds(3));
await _client.DidNotReceiveWithAnyArgs().IngestAuditEventsAsync(default!, default);
// ReadPendingAsync was called at least once (initial tick), and at
// most twice within the 3 s window (initial + one idle re-tick).
var readCalls = _queue.ReceivedCalls()
.Count(c => c.GetMethodInfo().Name == nameof(ISiteAuditQueue.ReadPendingAsync));
Assert.InRange(readCalls, 1, 2);
}
[Fact]
public async Task Drain_NonZeroPending_SchedulesAtBusyInterval()
{
// Arrange — every read returns 1 row. With busy=1s the actor should
// re-drain quickly, producing multiple client calls inside a short
// window.
var single = new List<AuditEvent> { NewEvent() };
_queue.ReadPendingAsync(Arg.Any<int>(), Arg.Any<CancellationToken>())
.Returns(Task.FromResult<IReadOnlyList<AuditEvent>>(single));
_client.IngestAuditEventsAsync(Arg.Any<AuditEventBatch>(), Arg.Any<CancellationToken>())
.Returns(call => Task.FromResult(AckAll(single)));
CreateActor(Opts(busySeconds: 1, idleSeconds: 10));
// 3-second window with busy=1s should fit at least 2 drains.
await Task.Delay(TimeSpan.FromSeconds(3));
var pushCalls = _client.ReceivedCalls()
.Count(c => c.GetMethodInfo().Name == nameof(ISiteStreamAuditClient.IngestAuditEventsAsync));
Assert.True(pushCalls >= 2,
$"Expected ≥2 pushes within 3s when busy=1s; saw {pushCalls}");
}
[Fact]
public async Task Drain_AcceptedEventIdsSubset_OnlyMarksAccepted()
{
// Arrange — 5 rows pushed, but the central ack only lists 3.
var rows = Enumerable.Range(0, 5).Select(_ => NewEvent()).ToList();
var ackedIds = rows.Take(3).Select(r => r.EventId).ToList();
_queue.ReadPendingAsync(Arg.Any<int>(), Arg.Any<CancellationToken>())
.Returns(
Task.FromResult<IReadOnlyList<AuditEvent>>(rows),
Task.FromResult<IReadOnlyList<AuditEvent>>(Array.Empty<AuditEvent>()));
var partialAck = new IngestAck();
foreach (var id in ackedIds)
{
partialAck.AcceptedEventIds.Add(id.ToString());
}
_client.IngestAuditEventsAsync(Arg.Any<AuditEventBatch>(), Arg.Any<CancellationToken>())
.Returns(Task.FromResult(partialAck));
// Act
CreateActor();
await AwaitAssertAsync(async () =>
{
await _queue.Received(1).MarkForwardedAsync(
Arg.Any<IReadOnlyList<Guid>>(), Arg.Any<CancellationToken>());
}, TimeSpan.FromSeconds(5));
// Assert — exactly the 3 ack'd ids made it to MarkForwardedAsync, not
// the other 2.
var ackedSet = ackedIds.ToHashSet();
await _queue.Received(1).MarkForwardedAsync(
Arg.Is<IReadOnlyList<Guid>>(g => g.Count == 3 && g.ToHashSet().SetEquals(ackedSet)),
Arg.Any<CancellationToken>());
}
}
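
The drain rules exercised by SiteAuditTelemetryActorTests can be summarised as two pure functions. This is a hedged sketch: `DrainSketch`, `IdsToMarkForwarded` and `NextDrainDelay` are hypothetical names, and filtering the ack against the ids that were actually sent is an assumption. The tests only require that un-acked rows are not marked.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch, NOT the project's SiteAuditTelemetryActor.
public static class DrainSketch
{
    // Keeps only acked ids that parse as GUIDs and belong to the batch that
    // was sent; everything else stays Pending for the next drain.
    public static IReadOnlyList<Guid> IdsToMarkForwarded(
        IEnumerable<Guid> sent,
        IEnumerable<string> acceptedEventIds)
    {
        var sentSet = new HashSet<Guid>(sent);
        var result = new List<Guid>();
        foreach (var raw in acceptedEventIds)
        {
            if (Guid.TryParse(raw, out var id) && sentSet.Contains(id))
            {
                result.Add(id);
            }
        }
        return result;
    }

    // A read that returned rows re-drains at the busy interval; an empty
    // read backs off to the idle interval.
    public static TimeSpan NextDrainDelay(int rowsRead, TimeSpan busy, TimeSpan idle) =>
        rowsRead > 0 ? busy : idle;
}
```

Which interval the actor picks after a failed push is not asserted by the tests above, so the sketch does not model it.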

@@ -0,0 +1,224 @@
using Google.Protobuf.WellKnownTypes;
using ScadaLink.AuditLog.Telemetry;
using ScadaLink.Commons.Entities.Audit;
using ScadaLink.Commons.Types.Enums;
using ScadaLink.Communication.Grpc;
namespace ScadaLink.AuditLog.Tests.Telemetry;
/// <summary>
/// Round-trip + edge tests for the <see cref="AuditEventMapper"/> that bridges
/// <see cref="AuditEvent"/> (Commons) ↔ <see cref="AuditEventDto"/> (proto).
/// ForwardState is site-local and IngestedAtUtc is central-set, so neither survives
/// the proto round-trip.
/// </summary>
public class AuditEventMapperTests
{
[Fact]
public void ToDto_FromDto_Roundtrip_FullyPopulated_PreservesAllFields()
{
var occurredAt = new DateTime(2026, 5, 20, 10, 15, 30, 123, DateTimeKind.Utc);
var ingestedAt = new DateTime(2026, 5, 20, 10, 15, 31, 0, DateTimeKind.Utc);
var correlationId = Guid.NewGuid();
var eventId = Guid.NewGuid();
var original = new AuditEvent
{
EventId = eventId,
OccurredAtUtc = occurredAt,
IngestedAtUtc = ingestedAt,
Channel = AuditChannel.ApiOutbound,
Kind = AuditKind.ApiCallCached,
CorrelationId = correlationId,
SourceSiteId = "site-1",
SourceInstanceId = "Pump01",
SourceScript = "OnDemand",
Actor = "design-key",
Target = "weather-api",
Status = AuditStatus.Forwarded,
HttpStatus = 200,
DurationMs = 42,
ErrorMessage = "transient timeout",
ErrorDetail = "stack-trace",
RequestSummary = "GET /weather",
ResponseSummary = "{ \"ok\": true }",
PayloadTruncated = true,
Extra = "{ \"retryCount\": 1 }",
ForwardState = AuditForwardState.Pending
};
var dto = AuditEventMapper.ToDto(original);
var roundTripped = AuditEventMapper.FromDto(dto);
Assert.Equal(original.EventId, roundTripped.EventId);
Assert.Equal(original.OccurredAtUtc, roundTripped.OccurredAtUtc);
Assert.Equal(original.Channel, roundTripped.Channel);
Assert.Equal(original.Kind, roundTripped.Kind);
Assert.Equal(original.CorrelationId, roundTripped.CorrelationId);
Assert.Equal(original.SourceSiteId, roundTripped.SourceSiteId);
Assert.Equal(original.SourceInstanceId, roundTripped.SourceInstanceId);
Assert.Equal(original.SourceScript, roundTripped.SourceScript);
Assert.Equal(original.Actor, roundTripped.Actor);
Assert.Equal(original.Target, roundTripped.Target);
Assert.Equal(original.Status, roundTripped.Status);
Assert.Equal(original.HttpStatus, roundTripped.HttpStatus);
Assert.Equal(original.DurationMs, roundTripped.DurationMs);
Assert.Equal(original.ErrorMessage, roundTripped.ErrorMessage);
Assert.Equal(original.ErrorDetail, roundTripped.ErrorDetail);
Assert.Equal(original.RequestSummary, roundTripped.RequestSummary);
Assert.Equal(original.ResponseSummary, roundTripped.ResponseSummary);
Assert.Equal(original.PayloadTruncated, roundTripped.PayloadTruncated);
Assert.Equal(original.Extra, roundTripped.Extra);
// ForwardState + IngestedAtUtc are NOT carried in the proto contract.
Assert.Null(roundTripped.ForwardState);
Assert.Null(roundTripped.IngestedAtUtc);
}
[Fact]
public void ToDto_NullableStringFields_BecomeEmptyString()
{
var evt = new AuditEvent
{
EventId = Guid.NewGuid(),
OccurredAtUtc = DateTime.UtcNow,
Channel = AuditChannel.Notification,
Kind = AuditKind.NotifySend,
Status = AuditStatus.Submitted
// all string? fields left null; CorrelationId null
};
var dto = AuditEventMapper.ToDto(evt);
Assert.Equal(string.Empty, dto.CorrelationId);
Assert.Equal(string.Empty, dto.SourceSiteId);
Assert.Equal(string.Empty, dto.SourceInstanceId);
Assert.Equal(string.Empty, dto.SourceScript);
Assert.Equal(string.Empty, dto.Actor);
Assert.Equal(string.Empty, dto.Target);
Assert.Equal(string.Empty, dto.ErrorMessage);
Assert.Equal(string.Empty, dto.ErrorDetail);
Assert.Equal(string.Empty, dto.RequestSummary);
Assert.Equal(string.Empty, dto.ResponseSummary);
Assert.Equal(string.Empty, dto.Extra);
}
[Fact]
public void FromDto_EmptyString_BecomesNullProperty()
{
var dto = new AuditEventDto
{
EventId = Guid.NewGuid().ToString(),
OccurredAtUtc = Timestamp.FromDateTime(DateTime.UtcNow),
Channel = nameof(AuditChannel.ApiOutbound),
Kind = nameof(AuditKind.ApiCall),
Status = nameof(AuditStatus.Submitted),
CorrelationId = string.Empty,
SourceSiteId = string.Empty,
SourceInstanceId = string.Empty,
SourceScript = string.Empty,
Actor = string.Empty,
Target = string.Empty,
ErrorMessage = string.Empty,
ErrorDetail = string.Empty,
RequestSummary = string.Empty,
ResponseSummary = string.Empty,
Extra = string.Empty
};
var evt = AuditEventMapper.FromDto(dto);
Assert.Null(evt.CorrelationId);
Assert.Null(evt.SourceSiteId);
Assert.Null(evt.SourceInstanceId);
Assert.Null(evt.SourceScript);
Assert.Null(evt.Actor);
Assert.Null(evt.Target);
Assert.Null(evt.ErrorMessage);
Assert.Null(evt.ErrorDetail);
Assert.Null(evt.RequestSummary);
Assert.Null(evt.ResponseSummary);
Assert.Null(evt.Extra);
}
[Fact]
public void ToDto_OccurredAtUtc_PreservesUtcKind()
{
var occurredAt = new DateTime(2026, 5, 20, 8, 0, 0, DateTimeKind.Utc);
var evt = new AuditEvent
{
EventId = Guid.NewGuid(),
OccurredAtUtc = occurredAt,
Channel = AuditChannel.DbOutbound,
Kind = AuditKind.DbWrite,
Status = AuditStatus.Delivered
};
var dto = AuditEventMapper.ToDto(evt);
var roundTripped = AuditEventMapper.FromDto(dto);
Assert.Equal(DateTimeKind.Utc, roundTripped.OccurredAtUtc.Kind);
Assert.Equal(occurredAt, roundTripped.OccurredAtUtc);
}
[Fact]
public void ToDto_NullableInt_BecomesNullInt32Value()
{
var evt = new AuditEvent
{
EventId = Guid.NewGuid(),
OccurredAtUtc = DateTime.UtcNow,
Channel = AuditChannel.Notification,
Kind = AuditKind.NotifySend,
Status = AuditStatus.Submitted,
HttpStatus = null,
DurationMs = null
};
var dto = AuditEventMapper.ToDto(evt);
Assert.Null(dto.HttpStatus);
Assert.Null(dto.DurationMs);
}
[Fact]
public void FromDto_NullInt32Value_BecomesNullProperty()
{
var dto = new AuditEventDto
{
EventId = Guid.NewGuid().ToString(),
OccurredAtUtc = Timestamp.FromDateTime(DateTime.UtcNow),
Channel = nameof(AuditChannel.ApiInbound),
Kind = nameof(AuditKind.InboundRequest),
Status = nameof(AuditStatus.Delivered)
// HttpStatus + DurationMs intentionally left absent
};
Assert.Null(dto.HttpStatus);
Assert.Null(dto.DurationMs);
var evt = AuditEventMapper.FromDto(dto);
Assert.Null(evt.HttpStatus);
Assert.Null(evt.DurationMs);
}
[Fact]
public void ToDto_EnumValues_StoredAsStringNames()
{
var evt = new AuditEvent
{
EventId = Guid.NewGuid(),
OccurredAtUtc = DateTime.UtcNow,
Channel = AuditChannel.ApiOutbound,
Kind = AuditKind.ApiCallCached,
Status = AuditStatus.Parked
};
var dto = AuditEventMapper.ToDto(evt);
Assert.Equal("ApiOutbound", dto.Channel);
Assert.Equal("ApiCallCached", dto.Kind);
Assert.Equal("Parked", dto.Status);
}
}
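// Illustrative sketch only, not part of this commit: the tests above imply that
// null string properties travel as empty strings on the wire (generated proto3
// string properties reject null) and are read back as null. The class and
// method names below are hypothetical; the real AuditEventMapper may be
// structured differently.
internal static class NullableStringWireSketch
{
    // Entity -> DTO: null collapses to the proto3 default (empty string).
    public static string ToWire(string? value) => value ?? string.Empty;

    // DTO -> entity: the empty string is read back as "no value".
    public static string? FromWire(string? value) =>
        string.IsNullOrEmpty(value) ? null : value;
}
// One consequence of this convention: a genuinely empty string on the entity
// is indistinguishable from null after a round-trip.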

@@ -0,0 +1,123 @@
using Google.Protobuf;
using Google.Protobuf.WellKnownTypes;
using ScadaLink.Communication.Grpc;
namespace ScadaLink.Communication.Tests.Protos;
/// <summary>
/// Wire-format round-trip tests for the Audit Log (#23) telemetry proto messages
/// (<see cref="AuditEventDto"/>, <see cref="AuditEventBatch"/>, <see cref="IngestAck"/>).
/// Locks the additive contract the site → central audit pipeline depends on.
/// </summary>
public class AuditEventProtoTests
{
[Fact]
public void AuditEventDto_RoundTrip_PreservesAllFields()
{
var occurredAt = Timestamp.FromDateTimeOffset(
new DateTimeOffset(2026, 5, 20, 10, 15, 30, 123, TimeSpan.Zero));
var original = new AuditEventDto
{
EventId = Guid.NewGuid().ToString(),
OccurredAtUtc = occurredAt,
Channel = "ApiOutbound",
Kind = "ApiCall",
CorrelationId = Guid.NewGuid().ToString(),
SourceSiteId = "site-1",
SourceInstanceId = "Pump01",
SourceScript = "OnDemand",
Actor = "design-key",
Target = "weather-api",
Status = "Delivered",
HttpStatus = 200,
DurationMs = 42,
ErrorMessage = "no error",
ErrorDetail = "stack",
RequestSummary = "GET /weather?city=brisbane",
ResponseSummary = "{ \"temp\": 22.5 }",
PayloadTruncated = true,
Extra = "{ \"retryCount\": 0 }"
};
var bytes = original.ToByteArray();
var deserialized = AuditEventDto.Parser.ParseFrom(bytes);
Assert.Equal(original.EventId, deserialized.EventId);
Assert.Equal(original.OccurredAtUtc, deserialized.OccurredAtUtc);
Assert.Equal(original.Channel, deserialized.Channel);
Assert.Equal(original.Kind, deserialized.Kind);
Assert.Equal(original.CorrelationId, deserialized.CorrelationId);
Assert.Equal(original.SourceSiteId, deserialized.SourceSiteId);
Assert.Equal(original.SourceInstanceId, deserialized.SourceInstanceId);
Assert.Equal(original.SourceScript, deserialized.SourceScript);
Assert.Equal(original.Actor, deserialized.Actor);
Assert.Equal(original.Target, deserialized.Target);
Assert.Equal(original.Status, deserialized.Status);
Assert.Equal(original.HttpStatus, deserialized.HttpStatus);
Assert.Equal(original.DurationMs, deserialized.DurationMs);
Assert.Equal(original.ErrorMessage, deserialized.ErrorMessage);
Assert.Equal(original.ErrorDetail, deserialized.ErrorDetail);
Assert.Equal(original.RequestSummary, deserialized.RequestSummary);
Assert.Equal(original.ResponseSummary, deserialized.ResponseSummary);
Assert.Equal(original.PayloadTruncated, deserialized.PayloadTruncated);
Assert.Equal(original.Extra, deserialized.Extra);
}
[Fact]
public void AuditEventDto_NullableInt_AbsentByDefault_NotIncludedInWire()
{
// Int32Value fields (http_status, duration_ms) are wrapper-typed in proto;
// when unset, the wrapper is absent, not serialized, and deserializes back to null.
var original = new AuditEventDto
{
EventId = Guid.NewGuid().ToString(),
OccurredAtUtc = Timestamp.FromDateTimeOffset(DateTimeOffset.UtcNow),
Channel = "Notification",
Kind = "NotifySend",
Status = "Submitted"
};
Assert.Null(original.HttpStatus);
Assert.Null(original.DurationMs);
var bytes = original.ToByteArray();
var deserialized = AuditEventDto.Parser.ParseFrom(bytes);
Assert.Null(deserialized.HttpStatus);
Assert.Null(deserialized.DurationMs);
}
[Fact]
public void AuditEventBatch_Empty_RoundTrip_Yields_EmptyEvents()
{
var original = new AuditEventBatch();
Assert.Empty(original.Events);
var bytes = original.ToByteArray();
var deserialized = AuditEventBatch.Parser.ParseFrom(bytes);
Assert.Empty(deserialized.Events);
}
[Fact]
public void IngestAck_PreservesAcceptedEventIds()
{
var id1 = Guid.NewGuid().ToString();
var id2 = Guid.NewGuid().ToString();
var id3 = Guid.NewGuid().ToString();
var original = new IngestAck();
original.AcceptedEventIds.Add(id1);
original.AcceptedEventIds.Add(id2);
original.AcceptedEventIds.Add(id3);
var bytes = original.ToByteArray();
var deserialized = IngestAck.Parser.ParseFrom(bytes);
Assert.Equal(3, deserialized.AcceptedEventIds.Count);
Assert.Equal(id1, deserialized.AcceptedEventIds[0]);
Assert.Equal(id2, deserialized.AcceptedEventIds[1]);
Assert.Equal(id3, deserialized.AcceptedEventIds[2]);
}
}

@@ -0,0 +1,100 @@
using Akka.Actor;
using Akka.TestKit.Xunit2;
using Google.Protobuf.WellKnownTypes;
using Grpc.Core;
using Microsoft.Extensions.Logging.Abstractions;
using NSubstitute;
using ScadaLink.Commons.Messages.Audit;
using ScadaLink.Communication.Grpc;
namespace ScadaLink.Communication.Tests;
/// <summary>
/// Bundle D D2 tests for <see cref="SiteStreamGrpcServer.IngestAuditEvents"/>.
/// Verifies the DTO→entity→actor→ack round-trip through the gRPC handler.
/// A tiny <c>EchoIngestActor</c> stands in for the central
/// <c>AuditLogIngestActor</c>, replying with the EventIds it received so the
/// test asserts the wiring without depending on MSSQL.
/// </summary>
public class SiteStreamIngestAuditEventsTests : TestKit
{
private readonly ISiteStreamSubscriber _subscriber = Substitute.For<ISiteStreamSubscriber>();
private SiteStreamGrpcServer CreateServer() =>
new(_subscriber, NullLogger<SiteStreamGrpcServer>.Instance);
private static ServerCallContext NewContext(CancellationToken ct = default)
{
var context = Substitute.For<ServerCallContext>();
context.CancellationToken.Returns(ct);
return context;
}
private static AuditEventDto NewDto(Guid? id = null) => new()
{
EventId = (id ?? Guid.NewGuid()).ToString(),
OccurredAtUtc = Timestamp.FromDateTime(
DateTime.SpecifyKind(new DateTime(2026, 5, 20, 10, 0, 0), DateTimeKind.Utc)),
Channel = "ApiOutbound",
Kind = "ApiCall",
Status = "Delivered",
SourceSiteId = "site-1",
};
[Fact]
public async Task IngestAuditEvents_With_AuditIngestActor_Routes_To_Actor_Returns_Reply()
{
// Arrange — a stub actor that echoes every received EventId back.
var stubActor = Sys.ActorOf(Props.Create(() => new EchoIngestActor()));
var server = CreateServer();
server.SetAuditIngestActor(stubActor);
// Build a 3-event batch.
var dtos = Enumerable.Range(0, 3).Select(_ => NewDto()).ToList();
var batch = new AuditEventBatch();
batch.Events.AddRange(dtos);
// Act
var ack = await server.IngestAuditEvents(batch, NewContext());
// Assert: every DTO's id appears in the ack, demonstrating
// end-to-end routing through the actor.
Assert.Equal(3, ack.AcceptedEventIds.Count);
var expectedIds = dtos.Select(d => d.EventId).ToHashSet();
Assert.True(expectedIds.SetEquals(ack.AcceptedEventIds.ToHashSet()));
}
[Fact]
public async Task IngestAuditEvents_NoActor_Wired_ReturnsEmptyAck()
{
var server = CreateServer();
// Intentionally do NOT call SetAuditIngestActor — simulates host
// startup race window.
var batch = new AuditEventBatch();
batch.Events.Add(NewDto());
var ack = await server.IngestAuditEvents(batch, NewContext());
Assert.Empty(ack.AcceptedEventIds);
}
/// <summary>
/// Tiny ReceiveActor that echoes every EventId in an incoming
/// <see cref="IngestAuditEventsCommand"/> back as an
/// <see cref="IngestAuditEventsReply"/>. Stands in for the central
/// AuditLogIngestActor so this test never touches MSSQL.
/// </summary>
private sealed class EchoIngestActor : ReceiveActor
{
public EchoIngestActor()
{
Receive<IngestAuditEventsCommand>(cmd =>
{
var ids = cmd.Events.Select(e => e.EventId).ToList();
Sender.Tell(new IngestAuditEventsReply(ids));
});
}
}
}

@@ -217,6 +217,98 @@ public class AuditLogRepositoryTests : IClassFixture<MsSqlMigrationFixture>
Assert.Equal(t0.AddMinutes(0), page3[0].OccurredAtUtc);
}
[SkippableFact]
public async Task InsertIfNotExistsAsync_ConcurrentDuplicateInserts_ProduceExactlyOneRow()
{
Skip.IfNot(_fixture.Available, _fixture.SkipReason);
var siteId = NewSiteId();
// Single event used by every parallel call — same EventId, same payload.
// The repository's IF NOT EXISTS … INSERT pattern has a check-then-act
// race window between sessions; under concurrent load SQL Server can
// raise a unique-index violation (error 2601) on UX_AuditLog_EventId.
// Bundle A's hardening swallows 2601/2627 so duplicates collapse silently.
var evt = NewEvent(siteId, occurredAtUtc: new DateTime(2026, 5, 20, 12, 0, 0, DateTimeKind.Utc));
// 50 parallel inserters, each with its own DbContext (DbContext is not
// thread-safe). Parallel.ForEachAsync aggregates exceptions, so a single
// unhandled 2601 from the repository would fail this test loudly.
await Parallel.ForEachAsync(
Enumerable.Range(0, 50),
new ParallelOptions { MaxDegreeOfParallelism = 50 },
async (_, ct) =>
{
await using var context = CreateContext();
var repo = new AuditLogRepository(context);
await repo.InsertIfNotExistsAsync(evt, ct);
});
await using var readContext = CreateContext();
var count = await readContext.Set<AuditEvent>()
.Where(e => e.SourceSiteId == siteId)
.CountAsync();
Assert.Equal(1, count);
}
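// Illustrative sketch only, not the repository's actual code: one way to
// classify the duplicate-key errors described in the test above. The helper
// name is hypothetical and the use of Microsoft.Data.SqlClient is an
// assumption; 2601 (unique index) and 2627 (unique/primary-key constraint)
// are the SQL Server error numbers named in this commit.
private static bool IsDuplicateKeyViolationSketch(Exception ex) =>
    ex is Microsoft.Data.SqlClient.SqlException { Number: 2601 or 2627 }
    || ex.InnerException is Microsoft.Data.SqlClient.SqlException { Number: 2601 or 2627 };
// A caller could use it as an exception filter, for example
// `catch (Exception ex) when (IsDuplicateKeyViolationSketch(ex)) { }`,
// treating a lost insert race as "row already exists".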
[SkippableFact]
public async Task QueryAsync_Keyset_SameOccurredAtUtc_TiebreaksOnEventId()
{
Skip.IfNot(_fixture.Available, _fixture.SkipReason);
var siteId = NewSiteId();
await using var context = CreateContext();
var repo = new AuditLogRepository(context);
// Four events all sharing the exact same OccurredAtUtc — the keyset
// cursor must lean on the EventId tiebreaker (descending) to page
// deterministically. Bundle D's reviewer flagged this as a deferred
// verification because it depends on EF Core 10 translating
// Guid.CompareTo against SQL Server's uniqueidentifier sort order.
var occurredAt = new DateTime(2026, 5, 20, 13, 0, 0, DateTimeKind.Utc);
// Build four distinct Guids; we don't care about the literal ordering
// produced by Guid.CompareTo — only that paging is deterministic and
// covers every row exactly once.
var events = Enumerable.Range(0, 4)
.Select(_ => NewEvent(siteId, occurredAtUtc: occurredAt))
.ToList();
foreach (var e in events)
{
await repo.InsertIfNotExistsAsync(e);
}
var filter = new AuditLogQueryFilter(SourceSiteId: siteId);
var page1 = await repo.QueryAsync(filter, new AuditLogPaging(PageSize: 2));
Assert.Equal(2, page1.Count);
Assert.All(page1, r => Assert.Equal(occurredAt, r.OccurredAtUtc));
var cursor = page1[^1];
var page2 = await repo.QueryAsync(
filter,
new AuditLogPaging(
PageSize: 2,
AfterOccurredAtUtc: cursor.OccurredAtUtc,
AfterEventId: cursor.EventId));
Assert.Equal(2, page2.Count);
Assert.All(page2, r => Assert.Equal(occurredAt, r.OccurredAtUtc));
var page1Ids = page1.Select(r => r.EventId).ToHashSet();
var page2Ids = page2.Select(r => r.EventId).ToHashSet();
// No overlap between pages.
Assert.Empty(page1Ids.Intersect(page2Ids));
// Every inserted EventId appears in exactly one of the two pages.
var allIds = page1Ids.Union(page2Ids).ToHashSet();
Assert.Equal(4, allIds.Count);
Assert.True(events.Select(e => e.EventId).ToHashSet().SetEquals(allIds));
}
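// Illustrative in-memory sketch only, not the repository's query: the keyset
// predicate the test above relies on, assuming descending order on
// (OccurredAtUtc, EventId). The method name and tuple shape are hypothetical.
// Note that Guid.CompareTo in memory need not match SQL Server's
// uniqueidentifier sort order, which is why the test asserts only determinism
// and full coverage rather than a literal ordering.
private static IEnumerable<(DateTime OccurredAtUtc, Guid EventId)> KeysetPageSketch(
    IEnumerable<(DateTime OccurredAtUtc, Guid EventId)> rows,
    (DateTime OccurredAtUtc, Guid EventId)? after,
    int pageSize) =>
    rows
        // Keep only rows strictly "after" the cursor in descending order:
        // an older timestamp, or the same timestamp with a smaller EventId.
        .Where(r => after is null
            || r.OccurredAtUtc < after.Value.OccurredAtUtc
            || (r.OccurredAtUtc == after.Value.OccurredAtUtc
                && r.EventId.CompareTo(after.Value.EventId) < 0))
        .OrderByDescending(r => r.OccurredAtUtc)
        .ThenByDescending(r => r.EventId)
        .Take(pageSize);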
[SkippableFact]
public async Task SwitchOutPartitionAsync_ThrowsNotSupported_ForM1()
{

@@ -0,0 +1,52 @@
namespace ScadaLink.HealthMonitoring.Tests;
/// <summary>
/// Bundle G (M2-T11) regression coverage. The site-side Audit Log writer chain
/// (FallbackAuditWriter) increments <see cref="IAuditWriteFailureCounter"/>
/// every time the primary SQLite writer throws. Bundle G bridges that counter
/// into the Site Health Monitoring report payload as <c>SiteAuditWriteFailures</c>
/// so a sustained audit-write outage surfaces on /monitoring/health rather than
/// disappearing into a NoOp sink.
/// </summary>
public class SiteAuditWriteFailuresMetricTests
{
private readonly SiteHealthCollector _collector = new();
[Fact]
public void Increment_Three_Times_Counter_Reports_3()
{
_collector.IncrementSiteAuditWriteFailures();
_collector.IncrementSiteAuditWriteFailures();
_collector.IncrementSiteAuditWriteFailures();
var report = _collector.CollectReport("site-1");
Assert.Equal(3, report.SiteAuditWriteFailures);
}
[Fact]
public void Report_Payload_Includes_SiteAuditWriteFailures_AsZeroByDefault()
{
var report = _collector.CollectReport("site-1");
Assert.Equal(0, report.SiteAuditWriteFailures);
}
/// <summary>
/// Mirrors the existing per-interval reset semantics for ScriptErrorCount /
/// AlarmEvaluationErrorCount / DeadLetterCount — SiteAuditWriteFailures is an
/// interval count, not a running total.
/// </summary>
[Fact]
public void CollectReport_Resets_SiteAuditWriteFailures()
{
_collector.IncrementSiteAuditWriteFailures();
_collector.IncrementSiteAuditWriteFailures();
var first = _collector.CollectReport("site-1");
Assert.Equal(2, first.SiteAuditWriteFailures);
var second = _collector.CollectReport("site-1");
Assert.Equal(0, second.SiteAuditWriteFailures);
}
}
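// Illustrative sketch only, not the actual SiteHealthCollector: a counter with
// the per-interval reset semantics the tests above assert. The class and member
// names are hypothetical; the commit message states only that the real counter
// uses Interlocked.Increment.
internal sealed class IntervalCounterSketch
{
    private int _count;

    // Thread-safe increment, callable from any writer thread.
    public void Increment() => System.Threading.Interlocked.Increment(ref _count);

    // Atomically read the current value and reset it to zero, so no increment
    // is lost or double-counted between two consecutive reports.
    public int CollectAndReset() => System.Threading.Interlocked.Exchange(ref _count, 0);
}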

@@ -0,0 +1,306 @@
using Akka.Configuration;
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Mvc.Testing;
using Microsoft.EntityFrameworkCore;
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Hosting;
using Microsoft.Extensions.Options;
using ScadaLink.AuditLog;
using ScadaLink.AuditLog.Site;
using ScadaLink.AuditLog.Site.Telemetry;
using ScadaLink.ClusterInfrastructure;
using ScadaLink.Commons.Interfaces.Services;
using ScadaLink.ConfigurationDatabase;
using ScadaLink.Host;
using ScadaLink.Host.Actors;
namespace ScadaLink.Host.Tests;
/// <summary>
/// Bundle E (M2 Task E1) — verifies the Audit Log (#23) DI surface is wired
/// into both composition roots and that the HOCON document emitted by
/// <see cref="AkkaHostedService.BuildHocon"/> includes the dedicated
/// <c>audit-telemetry-dispatcher</c> the site telemetry actor binds to.
/// </summary>
/// <remarks>
/// <para>
/// Full cluster bring-up is exercised by the existing
/// <see cref="CompositionRootTests"/> pattern — these tests reuse the same
/// <see cref="AkkaHostedServiceRemover"/> trick to short-circuit
/// <see cref="AkkaHostedService.StartAsync"/> so DI resolution is exercised
/// without the actor system actually being created.
/// </para>
/// </remarks>
public class AkkaHostedServiceAuditWiringHoconTests
{
[Fact]
public void BuildHocon_Emits_AuditTelemetryDispatcher_Block()
{
// Bundle E acceptance: the HOCON document the host parses must declare
// the dedicated dispatcher the SiteAuditTelemetryActor binds to. A
// missing dispatcher block would route the actor to the default
// dispatcher and silently lose the isolation guarantee.
var nodeOptions = new NodeOptions
{
Role = "Site",
NodeHostname = "site-test-1",
RemotingPort = 0,
SiteId = "TestSite",
};
var clusterOptions = new ClusterOptions
{
SeedNodes = new List<string> { "akka.tcp://scadalink@localhost:2551" },
SplitBrainResolverStrategy = "keep-oldest",
MinNrOfMembers = 1,
StableAfter = TimeSpan.FromSeconds(15),
HeartbeatInterval = TimeSpan.FromSeconds(2),
FailureDetectionThreshold = TimeSpan.FromSeconds(10),
};
var hocon = AkkaHostedService.BuildHocon(
nodeOptions,
clusterOptions,
new[] { "Site", "site-TestSite" },
TimeSpan.FromSeconds(5),
TimeSpan.FromSeconds(15));
var config = ConfigurationFactory.ParseString(hocon);
// The dispatcher is declared at the root, so the lookup is by its
// unqualified name. The HOCON parser must accept the block as a
// standalone dispatcher definition the actor system can resolve.
var dispatcherType = config.GetString("audit-telemetry-dispatcher.type");
Assert.Equal("ForkJoinDispatcher", dispatcherType);
var throughput = config.GetInt("audit-telemetry-dispatcher.throughput");
Assert.Equal(100, throughput);
var threadCount = config.GetInt("audit-telemetry-dispatcher.dedicated-thread-pool.thread-count");
Assert.Equal(2, threadCount);
}
}
/// <summary>
/// Verifies Audit Log (#23) services land in the Central composition root.
/// </summary>
public class CentralAuditWiringTests : IDisposable
{
private readonly WebApplicationFactory<Program> _factory;
private readonly string? _previousEnv;
public CentralAuditWiringTests()
{
_previousEnv = Environment.GetEnvironmentVariable("DOTNET_ENVIRONMENT");
Environment.SetEnvironmentVariable("DOTNET_ENVIRONMENT", "Central");
_factory = new WebApplicationFactory<Program>()
.WithWebHostBuilder(builder =>
{
builder.ConfigureAppConfiguration((_, config) =>
{
config.AddInMemoryCollection(new Dictionary<string, string?>
{
["ScadaLink:Node:NodeHostname"] = "localhost",
["ScadaLink:Node:RemotingPort"] = "0",
["ScadaLink:Cluster:SeedNodes:0"] = "akka.tcp://scadalink@localhost:2551",
["ScadaLink:Cluster:SeedNodes:1"] = "akka.tcp://scadalink@localhost:2552",
["ScadaLink:Database:SkipMigrations"] = "true",
["ScadaLink:Security:JwtSigningKey"] = "test-signing-key-must-be-at-least-32-chars-long!",
["ScadaLink:Security:LdapServer"] = "localhost",
["ScadaLink:Security:LdapPort"] = "3893",
["ScadaLink:Security:LdapUseTls"] = "false",
["ScadaLink:Security:AllowInsecureLdap"] = "true",
["ScadaLink:Security:LdapSearchBase"] = "dc=scadalink,dc=local",
["ScadaLink:InboundApi:ApiKeyPepper"] = "test-inbound-api-key-pepper-at-least-32-chars!",
});
});
builder.UseSetting("ScadaLink:Node:Role", "Central");
builder.UseSetting("ScadaLink:Database:SkipMigrations", "true");
builder.ConfigureServices(services =>
{
var descriptorsToRemove = services
.Where(d =>
d.ServiceType == typeof(DbContextOptions<ScadaLinkDbContext>) ||
d.ServiceType == typeof(DbContextOptions) ||
d.ServiceType == typeof(ScadaLinkDbContext) ||
d.ServiceType.FullName?.Contains("EntityFrameworkCore") == true)
.ToList();
foreach (var d in descriptorsToRemove)
services.Remove(d);
services.AddDbContext<ScadaLinkDbContext>(options =>
options.UseInMemoryDatabase($"CentralAuditWiringTests_{Guid.NewGuid()}"));
AkkaHostedServiceRemover.RemoveAkkaHostedServiceOnly(services);
});
});
// Touching Server forces the test host to build so services can be resolved below.
_ = _factory.Server;
}
public void Dispose()
{
_factory.Dispose();
Environment.SetEnvironmentVariable("DOTNET_ENVIRONMENT", _previousEnv);
}
[Fact]
public void Central_Resolves_IAuditWriter_AsFallbackAuditWriter()
{
// Central nodes still register the writer chain because AddAuditLog is
// shared between roles — the registrations are lazy singletons and the
// writer is never resolved on a central node in production. Asserting
// it resolves here confirms the chain is intact and ready for the
// future case where a central-only actor needs to emit audit events.
var writer = _factory.Services.GetService<IAuditWriter>();
Assert.NotNull(writer);
Assert.IsType<FallbackAuditWriter>(writer);
}
[Fact]
public void Central_Resolves_AuditLogOptions()
{
var opts = _factory.Services.GetService<IOptions<ScadaLink.AuditLog.Configuration.AuditLogOptions>>();
Assert.NotNull(opts);
Assert.NotNull(opts!.Value);
}
[Fact]
public void Central_Resolves_SqliteAuditWriterOptions()
{
var opts = _factory.Services.GetService<IOptions<SqliteAuditWriterOptions>>();
Assert.NotNull(opts);
Assert.NotNull(opts!.Value);
}
[Fact]
public void Central_Resolves_SiteAuditTelemetryOptions()
{
var opts = _factory.Services.GetService<IOptions<SiteAuditTelemetryOptions>>();
Assert.NotNull(opts);
Assert.NotNull(opts!.Value);
}
[Fact]
public void Central_Resolves_ISiteStreamAuditClient_AsNoOpDefault()
{
var client = _factory.Services.GetService<ISiteStreamAuditClient>();
Assert.NotNull(client);
Assert.IsType<NoOpSiteStreamAuditClient>(client);
}
}
/// <summary>
/// Verifies Audit Log (#23) services land in the Site composition root.
/// </summary>
public class SiteAuditWiringTests : IDisposable
{
private readonly WebApplication _host;
private readonly string _tempDbPath;
public SiteAuditWiringTests()
{
_tempDbPath = Path.Combine(Path.GetTempPath(), $"scadalink_audit_wiring_{Guid.NewGuid()}.db");
var builder = WebApplication.CreateBuilder();
builder.Configuration.Sources.Clear();
builder.Configuration.AddInMemoryCollection(new Dictionary<string, string?>
{
["ScadaLink:Node:Role"] = "Site",
["ScadaLink:Node:NodeHostname"] = "test-site",
["ScadaLink:Node:SiteId"] = "TestSite",
["ScadaLink:Node:RemotingPort"] = "0",
["ScadaLink:Node:GrpcPort"] = "0",
["ScadaLink:Database:SiteDbPath"] = _tempDbPath,
["ScadaLink:Cluster:SeedNodes:0"] = "akka.tcp://scadalink@localhost:2551",
["ScadaLink:Cluster:SeedNodes:1"] = "akka.tcp://scadalink@localhost:2552",
// SqliteAuditWriter would attempt to open a SQLite file when first
// resolved; point it at an in-memory connection so the test doesn't
// pollute the working directory.
["AuditLog:SiteWriter:DatabasePath"] = ":memory:",
});
builder.Services.AddGrpc();
builder.Services.AddSingleton<ScadaLink.Communication.Grpc.SiteStreamGrpcServer>();
SiteServiceRegistration.Configure(builder.Services, builder.Configuration);
AkkaHostedServiceRemover.RemoveAkkaHostedServiceOnly(builder.Services);
_host = builder.Build();
}
public void Dispose()
{
(_host as IDisposable)?.Dispose();
try { File.Delete(_tempDbPath); } catch { /* best effort */ }
}
[Fact]
public void Site_Resolves_IAuditWriter_AsFallbackAuditWriter()
{
var writer = _host.Services.GetService<IAuditWriter>();
Assert.NotNull(writer);
Assert.IsType<FallbackAuditWriter>(writer);
}
[Fact]
public void Site_Resolves_SqliteAuditWriter_AsSingleton()
{
var a = _host.Services.GetService<SqliteAuditWriter>();
var b = _host.Services.GetService<SqliteAuditWriter>();
Assert.NotNull(a);
Assert.NotNull(b);
Assert.Same(a, b);
}
[Fact]
public void Site_ISiteAuditQueue_AndSqliteAuditWriter_AreSameInstance()
{
// The telemetry actor reads from ISiteAuditQueue while ScriptRuntimeContext
// writes through IAuditWriter → SqliteAuditWriter. If these don't resolve
// to the same instance, pending rows are invisible to the actor.
var queue = _host.Services.GetService<ISiteAuditQueue>();
var writer = _host.Services.GetService<SqliteAuditWriter>();
Assert.NotNull(queue);
Assert.NotNull(writer);
Assert.Same(writer, queue);
}
[Fact]
public void Site_Resolves_RingBufferFallback()
{
var ring = _host.Services.GetService<RingBufferFallback>();
Assert.NotNull(ring);
}
[Fact]
public void Site_Resolves_IAuditWriteFailureCounter_AsHealthMetricsBridge()
{
// Bundle G (M2-T11): site composition root calls
// AddAuditLogHealthMetricsBridge() after AddAuditLog + AddSiteHealthMonitoring,
// which swaps the NoOp default for the real health-metrics bridge so
// FallbackAuditWriter primary failures surface in the site health
// report payload as SiteAuditWriteFailures.
var counter = _host.Services.GetService<IAuditWriteFailureCounter>();
Assert.NotNull(counter);
Assert.IsType<HealthMetricsAuditWriteFailureCounter>(counter);
}
[Fact]
public void Site_Resolves_ISiteStreamAuditClient_AsNoOpDefault()
{
var client = _host.Services.GetService<ISiteStreamAuditClient>();
Assert.NotNull(client);
Assert.IsType<NoOpSiteStreamAuditClient>(client);
}
[Fact]
public void Site_Resolves_SiteAuditTelemetryOptions_WithDefaults()
{
var opts = _host.Services.GetService<IOptions<SiteAuditTelemetryOptions>>();
Assert.NotNull(opts);
Assert.Equal(256, opts!.Value.BatchSize);
Assert.Equal(5, opts.Value.BusyIntervalSeconds);
Assert.Equal(30, opts.Value.IdleIntervalSeconds);
}
}

@@ -69,6 +69,7 @@ public class DeploymentManagerRedeployTests : TestKit, IDisposable
public void IncrementScriptError() { }
public void IncrementAlarmError() { }
public void IncrementDeadLetter() { }
public void IncrementSiteAuditWriteFailures() { }
public void UpdateConnectionHealth(string connectionName, ConnectionHealth health) { }
public void RemoveConnection(string connectionName) { }
public void UpdateTagResolution(string connectionName, int totalSubscribed, int successfullyResolved) { }

@@ -0,0 +1,214 @@
using Microsoft.Extensions.Logging.Abstractions;
using Moq;
using ScadaLink.Commons.Entities.Audit;
using ScadaLink.Commons.Interfaces.Services;
using ScadaLink.Commons.Types;
using ScadaLink.Commons.Types.Enums;
using ScadaLink.SiteRuntime.Scripts;
namespace ScadaLink.SiteRuntime.Tests.Scripts;
/// <summary>
/// Audit Log #23 — M2 Bundle F (Task F1): every script-initiated
/// <c>ExternalSystem.Call</c> emits exactly one <c>ApiOutbound</c>/<c>ApiCall</c>
/// audit event via the wrapper inside
/// <see cref="ScriptRuntimeContext.ExternalSystemHelper"/>. The audit emission
/// is best-effort: an exception from <see cref="IAuditWriter.WriteAsync"/> must
/// never abort the script's call, and the original <see cref="ExternalCallResult"/>
/// (or original exception) must surface to the caller unchanged.
/// </summary>
public class ExternalSystemCallAuditEmissionTests
{
/// <summary>
/// In-memory <see cref="IAuditWriter"/> that records every event passed to
/// <see cref="WriteAsync"/>. Can optionally be configured to return a faulted
/// task, simulating a catastrophic audit-writer failure that the wrapper must swallow.
/// </summary>
private sealed class CapturingAuditWriter : IAuditWriter
{
public List<AuditEvent> Events { get; } = new();
public Exception? ThrowOnWrite { get; set; }
public Task WriteAsync(AuditEvent evt, CancellationToken ct = default)
{
if (ThrowOnWrite != null)
{
return Task.FromException(ThrowOnWrite);
}
Events.Add(evt);
return Task.CompletedTask;
}
}
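// Illustrative sketch only, not the wrapper's actual code: the best-effort
// contract these tests exercise, reduced to a single guard. The method name is
// hypothetical; the commit message describes three nested fail-safe layers,
// of which this shows only the general shape.
private static async Task TryWriteAuditSketch(IAuditWriter writer, AuditEvent evt)
{
    try
    {
        // Awaiting inside the try also observes faulted tasks, such as the one
        // CapturingAuditWriter returns when ThrowOnWrite is set.
        await writer.WriteAsync(evt);
    }
    catch (Exception)
    {
        // Swallowed on purpose: an audit failure must not reach the script.
    }
}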
private const string SiteId = "site-77";
private const string InstanceName = "Plant.Pump42";
private const string SourceScript = "ScriptActor:CheckPressure";
private static ScriptRuntimeContext.ExternalSystemHelper CreateHelper(
IExternalSystemClient client,
IAuditWriter? auditWriter)
{
return new ScriptRuntimeContext.ExternalSystemHelper(
client,
InstanceName,
NullLogger.Instance,
auditWriter,
SiteId,
SourceScript);
}
[Fact]
public async Task Call_Success_EmitsOneEvent_Channel_ApiOutbound_Kind_ApiCall_Status_Delivered()
{
var client = new Mock<IExternalSystemClient>();
client
.Setup(c => c.CallAsync("ERP", "GetOrder", It.IsAny<IReadOnlyDictionary<string, object?>?>(), It.IsAny<CancellationToken>()))
.ReturnsAsync(new ExternalCallResult(true, "{}", null));
var writer = new CapturingAuditWriter();
var helper = CreateHelper(client.Object, writer);
var result = await helper.Call("ERP", "GetOrder");
Assert.True(result.Success);
Assert.Single(writer.Events);
var evt = writer.Events[0];
Assert.Equal(AuditChannel.ApiOutbound, evt.Channel);
Assert.Equal(AuditKind.ApiCall, evt.Kind);
Assert.Equal(AuditStatus.Delivered, evt.Status);
Assert.Equal("ERP.GetOrder", evt.Target);
Assert.Equal(AuditForwardState.Pending, evt.ForwardState);
Assert.Equal(DateTimeKind.Utc, evt.OccurredAtUtc.Kind);
Assert.NotEqual(Guid.Empty, evt.EventId);
Assert.False(evt.PayloadTruncated);
}
[Fact]
public async Task Call_HTTP500_EmitsEvent_Status_Failed_HttpStatus_500_ErrorMessage_Set()
{
var client = new Mock<IExternalSystemClient>();
client
.Setup(c => c.CallAsync("ERP", "GetOrder", It.IsAny<IReadOnlyDictionary<string, object?>?>(), It.IsAny<CancellationToken>()))
.ReturnsAsync(new ExternalCallResult(false, null, "Transient error: HTTP 500 from ERP: Internal Server Error"));
var writer = new CapturingAuditWriter();
var helper = CreateHelper(client.Object, writer);
var result = await helper.Call("ERP", "GetOrder");
Assert.False(result.Success);
Assert.Single(writer.Events);
var evt = writer.Events[0];
Assert.Equal(AuditStatus.Failed, evt.Status);
Assert.Equal(500, evt.HttpStatus);
Assert.False(string.IsNullOrEmpty(evt.ErrorMessage));
Assert.Contains("500", evt.ErrorMessage);
}
[Fact]
public async Task Call_HTTP400_EmitsEvent_Status_Failed_HttpStatus_400()
{
var client = new Mock<IExternalSystemClient>();
client
.Setup(c => c.CallAsync("ERP", "GetOrder", It.IsAny<IReadOnlyDictionary<string, object?>?>(), It.IsAny<CancellationToken>()))
.ReturnsAsync(new ExternalCallResult(false, null, "Permanent error: HTTP 400 from ERP: Bad Request"));
var writer = new CapturingAuditWriter();
var helper = CreateHelper(client.Object, writer);
var result = await helper.Call("ERP", "GetOrder");
Assert.False(result.Success);
Assert.Single(writer.Events);
var evt = writer.Events[0];
Assert.Equal(AuditStatus.Failed, evt.Status);
Assert.Equal(400, evt.HttpStatus);
}
[Fact]
public async Task Call_ClientThrows_NetworkException_EmitsEvent_Status_Failed_ErrorMessage_FromException()
{
var client = new Mock<IExternalSystemClient>();
var networkEx = new HttpRequestException("network down");
client
.Setup(c => c.CallAsync("ERP", "GetOrder", It.IsAny<IReadOnlyDictionary<string, object?>?>(), It.IsAny<CancellationToken>()))
.ThrowsAsync(networkEx);
var writer = new CapturingAuditWriter();
var helper = CreateHelper(client.Object, writer);
// The client's exception must reach the script unchanged, and the failure is still audited.
var thrown = await Assert.ThrowsAsync<HttpRequestException>(() => helper.Call("ERP", "GetOrder"));
Assert.Same(networkEx, thrown);
Assert.Single(writer.Events);
var evt = writer.Events[0];
Assert.Equal(AuditStatus.Failed, evt.Status);
Assert.Null(evt.HttpStatus);
Assert.Equal("network down", evt.ErrorMessage);
Assert.NotNull(evt.ErrorDetail);
Assert.Contains("HttpRequestException", evt.ErrorDetail);
}
[Fact]
public async Task AuditWriter_Throws_Script_Call_Returns_Original_Result_Unchanged()
{
var client = new Mock<IExternalSystemClient>();
var expected = new ExternalCallResult(true, "{\"v\":1}", null);
client
.Setup(c => c.CallAsync("ERP", "GetOrder", It.IsAny<IReadOnlyDictionary<string, object?>?>(), It.IsAny<CancellationToken>()))
.ReturnsAsync(expected);
var writer = new CapturingAuditWriter
{
ThrowOnWrite = new InvalidOperationException("audit writer down")
};
var helper = CreateHelper(client.Object, writer);
var result = await helper.Call("ERP", "GetOrder");
// A throwing audit writer must not abort the call: the script still receives the client's result.
Assert.Same(expected, result);
Assert.Empty(writer.Events);
}
[Fact]
public async Task Provenance_Populated_FromContext()
{
var client = new Mock<IExternalSystemClient>();
client
.Setup(c => c.CallAsync("ERP", "GetOrder", It.IsAny<IReadOnlyDictionary<string, object?>?>(), It.IsAny<CancellationToken>()))
.ReturnsAsync(new ExternalCallResult(true, null, null));
var writer = new CapturingAuditWriter();
var helper = CreateHelper(client.Object, writer);
await helper.Call("ERP", "GetOrder");
Assert.Single(writer.Events);
var evt = writer.Events[0];
Assert.NotEqual(Guid.Empty, evt.EventId);
Assert.Equal(SiteId, evt.SourceSiteId);
Assert.Equal(InstanceName, evt.SourceInstanceId);
Assert.Equal(SourceScript, evt.SourceScript);
Assert.Null(evt.Actor);
Assert.Null(evt.CorrelationId);
}
[Fact]
public async Task DurationMs_Recorded_NonNegative_And_Bounded()
{
var client = new Mock<IExternalSystemClient>();
client
.Setup(c => c.CallAsync("ERP", "Slow", It.IsAny<IReadOnlyDictionary<string, object?>?>(), It.IsAny<CancellationToken>()))
.Returns(async () =>
{
await Task.Delay(20);
return new ExternalCallResult(true, null, null);
});
var writer = new CapturingAuditWriter();
var helper = CreateHelper(client.Object, writer);
await helper.Call("ERP", "Slow");
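Assert.Single(writer.Events);
var evt = writer.Events[0];
Assert.NotNull(evt.DurationMs);
// Only a sanity range is asserted: a strictly positive lower bound could be flaky under coarse timer resolution.
Assert.True(evt.DurationMs >= 0, $"DurationMs={evt.DurationMs} should be >= 0");
Assert.True(evt.DurationMs <= 5000, $"DurationMs={evt.DurationMs} should be <= 5000");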
}
}