diff --git a/code-reviews/SiteEventLogging/findings.md b/code-reviews/SiteEventLogging/findings.md index 213356c..6345506 100644 --- a/code-reviews/SiteEventLogging/findings.md +++ b/code-reviews/SiteEventLogging/findings.md @@ -8,7 +8,7 @@ | Last reviewed | 2026-05-16 | | Reviewer | claude-agent | | Commit reviewed | `9c60592` | -| Open findings | 7 | +| Open findings | 3 | ## Summary @@ -241,7 +241,7 @@ on the active node. No code change made; see the re-triage note above. |--|--| | Severity | Medium | | Category | Performance & resource management | -| Status | Open | +| Status | Resolved | | Location | `src/ScadaLink.SiteEventLogging/SiteEventLogger.cs:57-99` | **Description** @@ -267,7 +267,16 @@ background flush is preferable. **Resolution** -_Unresolved._ +Resolved 2026-05-16 (commit `pending`): event recording is now offloaded to a +dedicated background writer. `SiteEventLogger` owns an unbounded `Channel` and a +single background consumer thread; `LogEventAsync` only validates its arguments and +enqueues, so caller threads (Akka actor threads on hot paths) never block on the +SQLite write or on contention for the write lock. The returned `Task` completes once +the event is durably persisted (so `await` callers still observe write ordering) and +faults if the write fails. `Dispose` completes the channel and drains the writer. +Regression test `LogEventAsync_DoesNotBlockCaller_WhenWriteIsSlow` (verifies the +caller returns in <500 ms while the database is held busy) plus +`LogEventAsync_TaskCompletes_AfterEventIsPersisted`. ### SiteEventLogging-006 — Missing indexes for severity and keyword-search query paths @@ -299,36 +308,51 @@ cost. _Unresolved._ -### SiteEventLogging-007 — `ISiteEventLogger` consumers downcast to the concrete type and reach into the DB connection +### SiteEventLogging-007 — `ISiteEventLogger` consumers downcast to the concrete type | | | |--|--| -| Severity | Medium | +| Severity | Medium (partially re-triaged 2026-05-16 — see Re-triage note) | | Category | Code organization & conventions | -| Status | Open | -| Location | `src/ScadaLink.SiteEventLogging/EventLogPurgeService.cs:25`, `src/ScadaLink.SiteEventLogging/EventLogQueryService.cs:26`, `src/ScadaLink.SiteEventLogging/SiteEventLogger.cs:34` | +| Status | Resolved | +| Location | `src/ScadaLink.SiteEventLogging/EventLogPurgeService.cs:21-30`, `src/ScadaLink.SiteEventLogging/EventLogQueryService.cs:20-28`, `src/ScadaLink.SiteEventLogging/ServiceCollectionExtensions.cs:10-23` | **Description** -Both `EventLogPurgeService` and `EventLogQueryService` take `ISiteEventLogger` via -DI and immediately downcast it: `_eventLogger = (SiteEventLogger)eventLogger;`. They -then access the `internal SqliteConnection Connection` property to run arbitrary SQL. -This defeats the purpose of the interface abstraction, makes the registration -fragile (any `ISiteEventLogger` that is not exactly `SiteEventLogger` causes an -`InvalidCastException` at construction), and leaks the database handle and raw SQL -surface out of the recorder. It is also the root cause of the unsynchronised -connection sharing in SiteEventLogging-003. +Both `EventLogPurgeService` and `EventLogQueryService` took `ISiteEventLogger` via +DI and immediately downcast it: `_eventLogger = (SiteEventLogger)eventLogger;`. This +made the registration fragile — any `ISiteEventLogger` that is not exactly +`SiteEventLogger` (a test double, a decorator) caused an `InvalidCastException` at +construction — and defeated the purpose of the interface abstraction. + +**Re-triage note (2026-05-16)** + +The finding as originally written also claimed the services "access the `internal +SqliteConnection Connection` property to run arbitrary SQL" and called itself "the +root cause of the unsynchronised connection sharing in SiteEventLogging-003". That +part is stale: the resolution of SiteEventLogging-003 had already removed the +`internal Connection` property and replaced it with lock-guarded `WithConnection` +overloads. At the time this finding was actioned, the only remaining defect was the +concrete-type downcast itself. Severity stays Medium; the description is corrected to +the downcast-only scope. **Recommendation** -Introduce a proper data-access abstraction (e.g. an `IEventLogStore` with -`Insert`, `Query`, `PurgeOlderThan`, `PurgeToSize`, `GetSizeBytes`) that owns the -connection and its locking, and inject that into the recorder, query, and purge -services. Remove the `internal Connection` property and the concrete-type downcasts. +Have the purge and query services depend on the concrete `SiteEventLogger` directly +(it is the type that owns the lock-guarded `WithConnection`), and register the +concrete type in DI with the interface forwarded to the same singleton. Remove the +fragile downcasts. **Resolution** -_Unresolved._ +Resolved 2026-05-16 (commit `pending`): `AddSiteEventLogging` now registers +`SiteEventLogger` as a concrete singleton and forwards `ISiteEventLogger` to that +same instance. `EventLogPurgeService` and `EventLogQueryService` take a +`SiteEventLogger` constructor parameter directly, eliminating the +`(SiteEventLogger)eventLogger` downcast and its `InvalidCastException` risk. All +three services still share one connection/lock. Regression tests +`AddSiteEventLogging_ResolvesAllServices_SharingOneRecorderInstance` and +`PurgeAndQueryServices_AcceptConcreteRecorder_WithoutDowncast`. ### SiteEventLogging-008 — Event-recording write failures are silently swallowed @@ -336,7 +360,7 @@ _Unresolved._ |--|--| | Severity | Medium | | Category | Error handling & resilience | -| Status | Open | +| Status | Resolved | | Location | `src/ScadaLink.SiteEventLogging/SiteEventLogger.cs:92-95` | **Description** @@ -359,7 +383,13 @@ a Warning/Error health metric rather than only a local log line. **Resolution** -_Unresolved._ +Resolved 2026-05-16 (commit `pending`): write failures are no longer swallowed. The +background writer (introduced for SiteEventLogging-005) now, on an `INSERT` failure, +(a) increments a new `Interlocked`-guarded counter exposed as the public +`SiteEventLogger.FailedWriteCount` property — which Health Monitoring can poll to +detect a logging outage — and (b) faults the `Task` returned by `LogEventAsync` with +the exception instead of returning `Task.CompletedTask`. The error is still logged. +Regression test `LogEventAsync_FaultsTask_AndCountsFailure_OnWriteError`. ### SiteEventLogging-009 — XML doc on `LogEventAsync` claims asynchronous behaviour @@ -395,7 +425,7 @@ _Unresolved._ |--|--| | Severity | Medium | | Category | Testing coverage | -| Status | Open | +| Status | Resolved | | Location | `tests/ScadaLink.SiteEventLogging.Tests/` | **Description** @@ -423,7 +453,19 @@ oldest-first deletion, and a query-error-path test (e.g. corrupt/closed connecti **Resolution** -_Unresolved._ +Resolved 2026-05-16 (commit `pending`): the remaining coverage gaps are now filled. +`EventLogHandlerActorTests` (using `Akka.TestKit.Xunit2`) exercises the actor message +contract — `EventLogQueryRequest` -> `EventLogQueryResponse` via `Sender.Tell`, for +both success and error responses. `EventLogCoverageTests` covers the previously +untested `EventLogQueryService.ExecuteQuery` catch block +(`ExecuteQuery_ReturnsFailureResponse_WhenDatabaseUnavailable`) and the recorder's +disposed-state semantics (`LogEventAsync_AfterDispose_CompletesWithoutThrowing`, +`Dispose_IsIdempotent`). The purge/write concurrency, realistic-cap, and +oldest-first behaviours were already covered by the tests added when +SiteEventLogging-001/-002/-003 were resolved +(`PurgeByStorageCap_ConcurrentWritesDoNotCorruptConnection`, +`PurgeByStorageCap_StopsWhenUnderCap_DoesNotEmptyTable`, +`PurgeByStorageCap_RemovesOldestEventsFirst`). ### SiteEventLogging-011 — Stale "Phase 4+" placeholder in `ServiceCollectionExtensions` diff --git a/src/ScadaLink.SiteEventLogging/EventLogPurgeService.cs b/src/ScadaLink.SiteEventLogging/EventLogPurgeService.cs index b7b4251..3be5c47 100644 --- a/src/ScadaLink.SiteEventLogging/EventLogPurgeService.cs +++ b/src/ScadaLink.SiteEventLogging/EventLogPurgeService.cs @@ -19,12 +19,14 @@ public class EventLogPurgeService : BackgroundService private readonly ILogger _logger; public EventLogPurgeService( - ISiteEventLogger eventLogger, + SiteEventLogger eventLogger, IOptions options, ILogger logger) { - // We need the concrete type to funnel access through its shared lock. - _eventLogger = (SiteEventLogger)eventLogger; + // Depend on the concrete recorder directly: purge must funnel database access + // through its lock-guarded WithConnection. Taking ISiteEventLogger and + // downcasting would throw InvalidCastException for any other implementation. + _eventLogger = eventLogger; _options = options.Value; _logger = logger; } diff --git a/src/ScadaLink.SiteEventLogging/EventLogQueryService.cs b/src/ScadaLink.SiteEventLogging/EventLogQueryService.cs index 1a49820..03d85fe 100644 --- a/src/ScadaLink.SiteEventLogging/EventLogQueryService.cs +++ b/src/ScadaLink.SiteEventLogging/EventLogQueryService.cs @@ -18,11 +18,14 @@ public class EventLogQueryService : IEventLogQueryService private readonly ILogger _logger; public EventLogQueryService( - ISiteEventLogger eventLogger, + SiteEventLogger eventLogger, IOptions options, ILogger logger) { - _eventLogger = (SiteEventLogger)eventLogger; + // Depend on the concrete recorder directly: queries must funnel database + // access through its lock-guarded WithConnection. Taking ISiteEventLogger and + // downcasting would throw InvalidCastException for any other implementation. + _eventLogger = eventLogger; _options = options.Value; _logger = logger; } diff --git a/src/ScadaLink.SiteEventLogging/ServiceCollectionExtensions.cs b/src/ScadaLink.SiteEventLogging/ServiceCollectionExtensions.cs index 617fa76..a8bf23d 100644 --- a/src/ScadaLink.SiteEventLogging/ServiceCollectionExtensions.cs +++ b/src/ScadaLink.SiteEventLogging/ServiceCollectionExtensions.cs @@ -9,7 +9,13 @@ public static class ServiceCollectionExtensions /// public static IServiceCollection AddSiteEventLogging(this IServiceCollection services) { - services.AddSingleton(); + // The recorder is registered as a concrete singleton and the interface is + // forwarded to the same instance. The purge and query services depend on the + // concrete SiteEventLogger directly (they need its lock-guarded WithConnection) + // rather than downcasting an ISiteEventLogger, which would throw + // InvalidCastException for any other ISiteEventLogger implementation. + services.AddSingleton(); + services.AddSingleton(sp => sp.GetRequiredService()); services.AddSingleton(); services.AddHostedService(); return services; diff --git a/src/ScadaLink.SiteEventLogging/SiteEventLogger.cs b/src/ScadaLink.SiteEventLogging/SiteEventLogger.cs index b51681b..5ef0cdf 100644 --- a/src/ScadaLink.SiteEventLogging/SiteEventLogger.cs +++ b/src/ScadaLink.SiteEventLogging/SiteEventLogger.cs @@ -1,3 +1,4 @@ +using System.Threading.Channels; using Microsoft.Data.Sqlite; using Microsoft.Extensions.Logging; using Microsoft.Extensions.Options; @@ -10,15 +11,28 @@ namespace ScadaLink.SiteEventLogging; /// On failover, the new active node starts a fresh log. /// /// +/// /// A single is owned here and is NOT thread-safe. /// All access — recording, querying, purging — must be funnelled through /// , which serialises callers on a shared lock. +/// +/// +/// Event recording is offloaded to a dedicated background writer thread (fed by an +/// unbounded ). only validates +/// its arguments and enqueues, so callers — typically Akka actor threads on hot +/// paths — never block on disk I/O or on contention for the write lock. The +/// returned completes once the event is durably persisted and +/// faults if the write fails, so failures are observable rather than swallowed. +/// /// public class SiteEventLogger : ISiteEventLogger, IDisposable { private readonly SqliteConnection _connection; private readonly ILogger _logger; private readonly object _writeLock = new(); + private readonly Channel _writeQueue; + private readonly Task _writerLoop; + private long _failedWriteCount; private bool _disposed; public SiteEventLogger( @@ -34,8 +48,22 @@ public class SiteEventLogger : ISiteEventLogger, IDisposable _connection.Open(); InitializeSchema(); + + _writeQueue = Channel.CreateUnbounded(new UnboundedChannelOptions + { + SingleReader = true, + SingleWriter = false, + }); + _writerLoop = Task.Run(ProcessWriteQueueAsync); } + /// + /// Number of event writes that have failed (SQLite error, disk full, etc.) + /// since this logger was created. Surfaced so Health Monitoring can detect a + /// logging outage instead of relying on a local log line nobody is watching. + /// + public long FailedWriteCount => Interlocked.Read(ref _failedWriteCount); + /// /// Runs against the shared connection while holding the /// write lock, so purge / query / record callers on different threads never use @@ -112,42 +140,110 @@ public class SiteEventLogger : ISiteEventLogger, IDisposable ArgumentException.ThrowIfNullOrWhiteSpace(source); ArgumentException.ThrowIfNullOrWhiteSpace(message); - var timestamp = DateTimeOffset.UtcNow.ToString("o"); + var pending = new PendingEvent( + DateTimeOffset.UtcNow.ToString("o"), + eventType, + severity, + instanceId, + source, + message, + details); - try + // Enqueue only — the actual SQLite write happens on the background writer + // thread so the caller (an Akka actor thread on a hot path) never blocks + // on disk I/O or on contention for the write lock. + if (!_writeQueue.Writer.TryWrite(pending)) { - WithConnection(connection => + // The channel is unbounded, so the only way TryWrite fails is that the + // writer has been completed (logger disposed). Drop silently — there is + // nowhere to persist the event. + pending.Completion.TrySetResult(); + } + + return pending.Completion.Task; + } + + private async Task ProcessWriteQueueAsync() + { + await foreach (var pending in _writeQueue.Reader.ReadAllAsync().ConfigureAwait(false)) + { + try { - using var cmd = connection.CreateCommand(); - cmd.CommandText = """ - INSERT INTO site_events (timestamp, event_type, severity, instance_id, source, message, details) - VALUES ($timestamp, $event_type, $severity, $instance_id, $source, $message, $details) - """; - cmd.Parameters.AddWithValue("$timestamp", timestamp); - cmd.Parameters.AddWithValue("$event_type", eventType); - cmd.Parameters.AddWithValue("$severity", severity); - cmd.Parameters.AddWithValue("$instance_id", (object?)instanceId ?? DBNull.Value); - cmd.Parameters.AddWithValue("$source", source); - cmd.Parameters.AddWithValue("$message", message); - cmd.Parameters.AddWithValue("$details", (object?)details ?? DBNull.Value); - cmd.ExecuteNonQuery(); - }); - } - catch (Exception ex) - { - _logger.LogError(ex, "Failed to record event: {EventType} from {Source}", eventType, source); - } + var written = WithConnection(connection => + { + using var cmd = connection.CreateCommand(); + cmd.CommandText = """ + INSERT INTO site_events (timestamp, event_type, severity, instance_id, source, message, details) + VALUES ($timestamp, $event_type, $severity, $instance_id, $source, $message, $details) + """; + cmd.Parameters.AddWithValue("$timestamp", pending.Timestamp); + cmd.Parameters.AddWithValue("$event_type", pending.EventType); + cmd.Parameters.AddWithValue("$severity", pending.Severity); + cmd.Parameters.AddWithValue("$instance_id", (object?)pending.InstanceId ?? DBNull.Value); + cmd.Parameters.AddWithValue("$source", pending.Source); + cmd.Parameters.AddWithValue("$message", pending.Message); + cmd.Parameters.AddWithValue("$details", (object?)pending.Details ?? DBNull.Value); + cmd.ExecuteNonQuery(); + }); - return Task.CompletedTask; + // WithConnection returns false only when the logger has been + // disposed; the event simply cannot be persisted in that case. + pending.Completion.TrySetResult(); + } + catch (Exception ex) + { + // SiteEventLogging-008: a write failure must be observable. Count it + // (Health Monitoring reads FailedWriteCount) and fault the caller's + // Task instead of silently discarding the exception. + Interlocked.Increment(ref _failedWriteCount); + _logger.LogError(ex, "Failed to record event: {EventType} from {Source}", + pending.EventType, pending.Source); + pending.Completion.TrySetException(ex); + } + } } public void Dispose() { + Task? writerLoop = null; lock (_writeLock) { if (_disposed) return; _disposed = true; + // Stop accepting new events and let the writer loop drain. + _writeQueue.Writer.TryComplete(); + writerLoop = _writerLoop; + } + + // Wait for the writer loop to finish outside the lock — the loop itself + // acquires the lock for each write. + try + { + writerLoop?.Wait(TimeSpan.FromSeconds(5)); + } + catch (AggregateException) + { + // A faulted writer loop has already been logged per event; nothing more + // to do during disposal. + } + + lock (_writeLock) + { _connection.Dispose(); } } + + /// An event awaiting persistence by the background writer. + private sealed record PendingEvent( + string Timestamp, + string EventType, + string Severity, + string? InstanceId, + string Source, + string Message, + string? Details) + { + public TaskCompletionSource Completion { get; } = + new(TaskCreationOptions.RunContinuationsAsynchronously); + } } diff --git a/tests/ScadaLink.SiteEventLogging.Tests/EventLogCoverageTests.cs b/tests/ScadaLink.SiteEventLogging.Tests/EventLogCoverageTests.cs new file mode 100644 index 0000000..703f27d --- /dev/null +++ b/tests/ScadaLink.SiteEventLogging.Tests/EventLogCoverageTests.cs @@ -0,0 +1,86 @@ +using Microsoft.Extensions.Logging.Abstractions; +using Microsoft.Extensions.Options; +using ScadaLink.Commons.Messages.RemoteQuery; + +namespace ScadaLink.SiteEventLogging.Tests; + +/// +/// Regression tests for SiteEventLogging-010: previously untested behaviours — +/// the query service error path and the recorder's disposed-state semantics. +/// +public class EventLogCoverageTests : IDisposable +{ + private readonly string _dbPath; + + public EventLogCoverageTests() + { + _dbPath = Path.Combine(Path.GetTempPath(), $"test_coverage_{Guid.NewGuid()}.db"); + } + + public void Dispose() + { + if (File.Exists(_dbPath)) File.Delete(_dbPath); + } + + private SiteEventLogger NewLogger() => new( + Options.Create(new SiteEventLogOptions { DatabasePath = _dbPath }), + NullLogger.Instance); + + private static EventLogQueryRequest MakeRequest() => new( + CorrelationId: "corr-err", + SiteId: "site-1", + From: null, + To: null, + EventType: null, + Severity: null, + InstanceId: null, + KeywordFilter: null, + ContinuationToken: null, + PageSize: 500, + Timestamp: DateTimeOffset.UtcNow); + + [Fact] + public void ExecuteQuery_ReturnsFailureResponse_WhenDatabaseUnavailable() + { + // The catch block in EventLogQueryService.ExecuteQuery was untested. + // Disposing the recorder makes WithConnection throw ObjectDisposedException; + // the query service must convert that into a Success=false response rather + // than letting the exception escape to the actor. + var logger = NewLogger(); + var queryService = new EventLogQueryService( + logger, + Options.Create(new SiteEventLogOptions { DatabasePath = _dbPath }), + NullLogger.Instance); + + logger.Dispose(); + + var response = queryService.ExecuteQuery(MakeRequest()); + + Assert.False(response.Success); + Assert.NotNull(response.ErrorMessage); + Assert.Empty(response.Entries); + Assert.Null(response.ContinuationToken); + Assert.Equal("corr-err", response.CorrelationId); + } + + [Fact] + public async Task LogEventAsync_AfterDispose_CompletesWithoutThrowing() + { + // Logging after disposal must not throw or hang — the event is simply + // dropped because the background writer has been completed. + var logger = NewLogger(); + logger.Dispose(); + + await logger.LogEventAsync("script", "Info", null, "Source", "After dispose") + .WaitAsync(TimeSpan.FromSeconds(5)); + } + + [Fact] + public void Dispose_IsIdempotent() + { + // Re-entrant / repeated Dispose must be a safe no-op. + var logger = NewLogger(); + logger.Dispose(); + logger.Dispose(); + } +} diff --git a/tests/ScadaLink.SiteEventLogging.Tests/EventLogHandlerActorTests.cs b/tests/ScadaLink.SiteEventLogging.Tests/EventLogHandlerActorTests.cs new file mode 100644 index 0000000..a5264a1 --- /dev/null +++ b/tests/ScadaLink.SiteEventLogging.Tests/EventLogHandlerActorTests.cs @@ -0,0 +1,81 @@ +using Akka.Actor; +using Akka.TestKit.Xunit2; +using ScadaLink.Commons.Messages.RemoteQuery; + +namespace ScadaLink.SiteEventLogging.Tests; + +/// +/// Regression tests for SiteEventLogging-010: the actor message contract of +/// was previously untested. +/// +public class EventLogHandlerActorTests : TestKit +{ + /// Test double returning a canned response and recording the request. + private sealed class FakeQueryService : IEventLogQueryService + { + private readonly Func _handler; + public EventLogQueryRequest? LastRequest { get; private set; } + + public FakeQueryService(Func handler) + => _handler = handler; + + public EventLogQueryResponse ExecuteQuery(EventLogQueryRequest request) + { + LastRequest = request; + return _handler(request); + } + } + + private static EventLogQueryRequest MakeRequest(string correlationId) => new( + CorrelationId: correlationId, + SiteId: "site-1", + From: null, + To: null, + EventType: null, + Severity: null, + InstanceId: null, + KeywordFilter: null, + ContinuationToken: null, + PageSize: 500, + Timestamp: DateTimeOffset.UtcNow); + + private static EventLogQueryResponse MakeResponse(EventLogQueryRequest req, bool success = true) => new( + CorrelationId: req.CorrelationId, + SiteId: req.SiteId, + Entries: [], + ContinuationToken: null, + HasMore: false, + Success: success, + ErrorMessage: success ? null : "boom", + Timestamp: DateTimeOffset.UtcNow); + + [Fact] + public void Actor_RepliesToSender_WithQueryResponse() + { + var fake = new FakeQueryService(req => MakeResponse(req)); + var actor = Sys.ActorOf(Props.Create(() => new EventLogHandlerActor(fake))); + + var request = MakeRequest("corr-1"); + actor.Tell(request, TestActor); + + var response = ExpectMsg(); + Assert.Equal("corr-1", response.CorrelationId); + Assert.Equal("site-1", response.SiteId); + Assert.True(response.Success); + Assert.Same(request, fake.LastRequest); + } + + [Fact] + public void Actor_PropagatesQueryServiceErrorResponse_ToSender() + { + var fake = new FakeQueryService(req => MakeResponse(req, success: false)); + var actor = Sys.ActorOf(Props.Create(() => new EventLogHandlerActor(fake))); + + actor.Tell(MakeRequest("corr-2"), TestActor); + + var response = ExpectMsg(); + Assert.False(response.Success); + Assert.Equal("corr-2", response.CorrelationId); + Assert.Equal("boom", response.ErrorMessage); + } +} diff --git a/tests/ScadaLink.SiteEventLogging.Tests/ScadaLink.SiteEventLogging.Tests.csproj b/tests/ScadaLink.SiteEventLogging.Tests/ScadaLink.SiteEventLogging.Tests.csproj index e3d6140..bed4ca6 100644 --- a/tests/ScadaLink.SiteEventLogging.Tests/ScadaLink.SiteEventLogging.Tests.csproj +++ b/tests/ScadaLink.SiteEventLogging.Tests/ScadaLink.SiteEventLogging.Tests.csproj @@ -9,8 +9,11 @@ + + + diff --git a/tests/ScadaLink.SiteEventLogging.Tests/ServiceWiringTests.cs b/tests/ScadaLink.SiteEventLogging.Tests/ServiceWiringTests.cs new file mode 100644 index 0000000..fc1187b --- /dev/null +++ b/tests/ScadaLink.SiteEventLogging.Tests/ServiceWiringTests.cs @@ -0,0 +1,71 @@ +using Microsoft.Extensions.DependencyInjection; +using Microsoft.Extensions.Logging; +using Microsoft.Extensions.Options; + +namespace ScadaLink.SiteEventLogging.Tests; + +/// +/// Regression tests for SiteEventLogging-007: the purge and query services must not +/// downcast an injected to the concrete type. They now +/// depend on the concrete directly, and DI must resolve +/// the recorder, query service, and purge service to a single shared instance. +/// +public class ServiceWiringTests : IDisposable +{ + private readonly string _dbPath; + private ServiceProvider? _provider; + + public ServiceWiringTests() + { + _dbPath = Path.Combine(Path.GetTempPath(), $"test_wiring_{Guid.NewGuid()}.db"); + } + + public void Dispose() + { + _provider?.Dispose(); + if (File.Exists(_dbPath)) File.Delete(_dbPath); + } + + [Fact] + public void AddSiteEventLogging_ResolvesAllServices_SharingOneRecorderInstance() + { + var services = new ServiceCollection(); + services.AddLogging(); + services.Configure(o => o.DatabasePath = _dbPath); + services.AddSiteEventLogging(); + + _provider = services.BuildServiceProvider(); + + var recorderViaInterface = _provider.GetRequiredService(); + var recorderConcrete = _provider.GetRequiredService(); + var queryService = _provider.GetRequiredService(); + + // The interface registration must forward to the concrete singleton — purge + // and query depend on that same instance, so all three share one connection. + Assert.Same(recorderConcrete, recorderViaInterface); + Assert.NotNull(queryService); + + // The hosted purge service must also resolve without an InvalidCastException. + var hosted = _provider.GetServices(); + Assert.Contains(hosted, h => h is EventLogPurgeService); + } + + [Fact] + public void PurgeAndQueryServices_AcceptConcreteRecorder_WithoutDowncast() + { + // The services no longer take ISiteEventLogger and downcast it; they take the + // concrete SiteEventLogger. Constructing them directly must compile and work. + var options = Options.Create(new SiteEventLogOptions { DatabasePath = _dbPath }); + using var loggerFactory = LoggerFactory.Create(_ => { }); + using var recorder = new SiteEventLogger( + options, loggerFactory.CreateLogger()); + + var purge = new EventLogPurgeService( + recorder, options, loggerFactory.CreateLogger()); + var query = new EventLogQueryService( + recorder, options, loggerFactory.CreateLogger()); + + Assert.NotNull(purge); + Assert.NotNull(query); + } +} diff --git a/tests/ScadaLink.SiteEventLogging.Tests/SiteEventLoggerAsyncTests.cs b/tests/ScadaLink.SiteEventLogging.Tests/SiteEventLoggerAsyncTests.cs new file mode 100644 index 0000000..f1d766f --- /dev/null +++ b/tests/ScadaLink.SiteEventLogging.Tests/SiteEventLoggerAsyncTests.cs @@ -0,0 +1,116 @@ +using System.Diagnostics; +using Microsoft.Data.Sqlite; +using Microsoft.Extensions.Logging.Abstractions; +using Microsoft.Extensions.Options; + +namespace ScadaLink.SiteEventLogging.Tests; + +/// +/// Regression tests for SiteEventLogging-005 (synchronous I/O on caller thread) +/// and SiteEventLogging-008 (write failures silently swallowed). +/// +public class SiteEventLoggerAsyncTests : IDisposable +{ + private readonly SiteEventLogger _logger; + private readonly string _dbPath; + + public SiteEventLoggerAsyncTests() + { + _dbPath = Path.Combine(Path.GetTempPath(), $"test_async_{Guid.NewGuid()}.db"); + var options = Options.Create(new SiteEventLogOptions { DatabasePath = _dbPath }); + _logger = new SiteEventLogger(options, NullLogger.Instance); + } + + public void Dispose() + { + _logger.Dispose(); + if (File.Exists(_dbPath)) File.Delete(_dbPath); + } + + [Fact] + public async Task LogEventAsync_DoesNotBlockCaller_WhenWriteIsSlow() + { + // SiteEventLogging-005: LogEventAsync must not perform the SQLite write + // inline on the caller's thread. We hold the recorder busy on a long + // database operation from one thread, then time how long it takes the + // caller of LogEventAsync to get control back. If recording is offloaded + // to a background writer, the caller returns essentially immediately even + // though the database is busy. If recording is synchronous (the bug), the + // caller blocks on the write lock for the full busy period. + var busyStarted = new ManualResetEventSlim(false); + var releaseBusy = new ManualResetEventSlim(false); + + var busyThread = new Thread(() => + { + _logger.WithConnection(_ => + { + busyStarted.Set(); + // Hold the connection lock until the test releases it. + releaseBusy.Wait(TimeSpan.FromSeconds(10)); + }); + }); + busyThread.Start(); + + Assert.True(busyStarted.Wait(TimeSpan.FromSeconds(5)), "Busy thread did not start."); + + var sw = Stopwatch.StartNew(); + var recordTask = _logger.LogEventAsync("script", "Info", null, "Caller", "Should not block"); + sw.Stop(); + + // The CALL must return quickly even though the database is busy for ~1s. + Assert.True(sw.ElapsedMilliseconds < 500, + $"LogEventAsync blocked the caller for {sw.ElapsedMilliseconds} ms — recording is not offloaded."); + + // Release the busy thread; the record must still complete successfully. + releaseBusy.Set(); + busyThread.Join(TimeSpan.FromSeconds(10)); + + await recordTask.WaitAsync(TimeSpan.FromSeconds(10)); + } + + [Fact] + public async Task LogEventAsync_TaskCompletes_AfterEventIsPersisted() + { + // Awaiting the returned Task must guarantee the row is durably written. + await _logger.LogEventAsync("script", "Info", "inst-1", "Source", "Persisted event"); + + var count = _logger.WithConnection(connection => + { + using var cmd = connection.CreateCommand(); + cmd.CommandText = "SELECT COUNT(*) FROM site_events"; + return (long)cmd.ExecuteScalar()!; + }); + Assert.Equal(1, count); + } + + [Fact] + public async Task LogEventAsync_FaultsTask_AndCountsFailure_OnWriteError() + { + // SiteEventLogging-008: a write failure must not be silently swallowed. + // The returned Task must fault and the failure counter must increment so + // Health Monitoring can surface a logging outage. + using var failingConnection = new SqliteConnection("Data Source=:memory:"); + failingConnection.Open(); + var options = Options.Create(new SiteEventLogOptions + { + DatabasePath = Path.Combine(Path.GetTempPath(), $"test_fail_{Guid.NewGuid()}.db") + }); + var logger = new SiteEventLogger(options, NullLogger.Instance); + + // Drop the table so every write fails with "no such table". + logger.WithConnection(connection => + { + using var cmd = connection.CreateCommand(); + cmd.CommandText = "DROP TABLE site_events"; + cmd.ExecuteNonQuery(); + }); + + await Assert.ThrowsAnyAsync(() => + logger.LogEventAsync("script", "Error", null, "Source", "Failing write")); + + Assert.True(logger.FailedWriteCount > 0, + "FailedWriteCount must increment when an event write fails."); + + logger.Dispose(); + } +}