fix(notification-service): resolve NotificationService-010,011,012 — disconnect SMTP on failure, relocate exception type, OAuth2/token-cache test coverage

This commit is contained in:
Joseph Doherty
2026-05-16 22:24:03 -04:00
parent dab0056d1b
commit a9bd017c88
5 changed files with 328 additions and 18 deletions

View File

@@ -8,7 +8,7 @@
| Last reviewed | 2026-05-16 | | Last reviewed | 2026-05-16 |
| Reviewer | claude-agent | | Reviewer | claude-agent |
| Commit reviewed | `9c60592` | | Commit reviewed | `9c60592` |
| Open findings | 3 | | Open findings | 0 |
## Summary ## Summary
@@ -360,7 +360,7 @@ under NotificationService-009.
|--|--| |--|--|
| Severity | Low | | Severity | Low |
| Category | Performance & resource management | | Category | Performance & resource management |
| Status | Open | | Status | Resolved |
| Location | `src/ScadaLink.NotificationService/NotificationDeliveryService.cs:121-154` | | Location | `src/ScadaLink.NotificationService/NotificationDeliveryService.cs:121-154` |
**Description** **Description**
@@ -373,7 +373,16 @@ Move disconnect/dispose into a `finally` block (or use `await using` once `ISmtp
**Resolution** **Resolution**
_Unresolved._ Resolved 2026-05-16 (commit pending). Root cause confirmed against the current source:
`DeliverAsync` invoked `smtp.DisconnectAsync` only on the success path inside the `try`
block, so a failed `Connect`/`Authenticate`/`Send` left the connection open until
finalization. `DisconnectAsync` was moved into the existing `finally` block (which also
releases the NS-007 concurrency slot), so the SMTP connection is always torn down
regardless of outcome. The disconnect is best-effort — wrapped in its own `try/catch` that
logs at debug level — so a disconnect failure (e.g. the connection is already dead) cannot
mask the original delivery exception. Regression tests
`Send_TransientFailureDuringSend_StillDisconnectsClient` and
`Send_FailureDuringAuthenticate_StillDisconnectsClient`.
### NotificationService-011 — `SmtpPermanentException` declared in the wrong file; module conventions ### NotificationService-011 — `SmtpPermanentException` declared in the wrong file; module conventions
@@ -381,7 +390,7 @@ _Unresolved._
|--|--| |--|--|
| Severity | Low | | Severity | Low |
| Category | Code organization & conventions | | Category | Code organization & conventions |
| Status | Open | | Status | Resolved |
| Location | `src/ScadaLink.NotificationService/NotificationDeliveryService.cs:173-177`, `src/ScadaLink.Commons/Entities/Notifications/SmtpConfiguration.cs:5-15` | | Location | `src/ScadaLink.NotificationService/NotificationDeliveryService.cs:173-177`, `src/ScadaLink.Commons/Entities/Notifications/SmtpConfiguration.cs:5-15` |
**Description** **Description**
@@ -394,7 +403,22 @@ Move `SmtpPermanentException` to its own file. For `SmtpConfiguration`, either k
**Resolution** **Resolution**
_Unresolved._ Resolved 2026-05-16 (commit pending). Both issues confirmed against source.
1. **`SmtpPermanentException` placement (in scope — fixed).** The public exception type
was extracted from the bottom of `NotificationDeliveryService.cs` into its own file,
`src/ScadaLink.NotificationService/SmtpPermanentException.cs`, restoring the
one-type-per-file layout. No behaviour change, so no dedicated regression test — the
move is verified by the module test suite still compiling and passing (56 tests green).
2. **`SmtpConfiguration` non-nullable strings (out of scope — re-triaged).**
`SmtpConfiguration` lives in `src/ScadaLink.Commons`, which is outside the
NotificationService module's edit scope; it cannot be changed here. This is the same
Commons entity already flagged for follow-up under **NotificationService-013**
(Deferred). The `required`-members / documented-constructor change should be folded into
that Commons/ConfigurationDatabase-scoped work. Note the delivery service's actual risk
is bounded: `SendAsync`/`DeliverBufferedAsync` already validate `TlsMode` and addresses
up front (NS-005/NS-008), and a null `Host`/`AuthType` from a malformed config row
surfaces as a classified/clean failure rather than silent corruption.
### NotificationService-012 — Test coverage gaps: OAuth2 delivery path, permanent-classification fallback, token-cache concurrency ### NotificationService-012 — Test coverage gaps: OAuth2 delivery path, permanent-classification fallback, token-cache concurrency
@@ -402,7 +426,7 @@ _Unresolved._
|--|--| |--|--|
| Severity | Low | | Severity | Low |
| Category | Testing coverage | | Category | Testing coverage |
| Status | Open | | Status | Resolved |
| Location | `tests/ScadaLink.NotificationService.Tests/NotificationDeliveryServiceTests.cs`, `tests/ScadaLink.NotificationService.Tests/OAuth2TokenServiceTests.cs` | | Location | `tests/ScadaLink.NotificationService.Tests/NotificationDeliveryServiceTests.cs`, `tests/ScadaLink.NotificationService.Tests/OAuth2TokenServiceTests.cs` |
**Description** **Description**
@@ -415,4 +439,29 @@ Add tests for: OAuth2-authenticated send with a mocked `OAuth2TokenService`; the
**Resolution** **Resolution**
_Unresolved._ Resolved 2026-05-16 (commit pending). Each listed gap was re-triaged against the current
source (the module was substantially rewritten by NS-001..009 since this finding was
filed) and the genuinely-missing tests were added:
1. **OAuth2 delivery path (gap real — covered).** New test
`Send_OAuth2Config_AuthenticatesWithResolvedAccessToken` drives `DeliverAsync` with an
`oauth2` `SmtpConfiguration` and a real `OAuth2TokenService` (mocked `IHttpClientFactory`),
asserting the SMTP client is authenticated with the *resolved access token*, not the raw
`tenant:client:secret` triple.
2. **5xx-message permanent fallback (gap obsolete — re-triaged).** The substring-based
permanent fallback this item describes no longer exists: NS-003 replaced message-text
matching with typed-exception/`SmtpStatusCode` classification. The equivalent path is
already covered by `Send_Smtp5xxCommandException_ClassifiedPermanent`, and
`Send_NonSmtpExceptionWith5xxLookalikeText_NotPromotedToPermanent` proves a "5xx
lookalike" message is no longer promoted. No new test needed for a removed branch.
3. **Token expiry/refresh and concurrent acquisition (gaps real — covered).** New tests
`GetTokenAsync_ExpiredToken_RefreshesOnNextCall` (uses `expires_in: 60` so the token is
stale immediately given the 60s refresh margin) and
`GetTokenAsync_ConcurrentCalls_MakeExactlyOneHttpRequest` (20 racing callers collapse to
a single token fetch, exercising the per-credential double-checked lock).
4. **End-to-end buffer-and-retry (gap already closed — re-triaged).** The
buffer-then-retry-then-deliver round trip was added when NS-001 was resolved
(`DeliverBuffered_HappyPath_ReturnsTrue`, `DeliverBuffered_ListNoLongerExists_…`, plus
`AkkaHostedService` handler registration), so no further test is required here.
Module test suite is green at 56 tests.

View File

@@ -315,8 +315,6 @@ public class NotificationDeliveryService : INotificationDeliveryService
var bccAddresses = recipients.Select(r => r.EmailAddress).ToList(); var bccAddresses = recipients.Select(r => r.EmailAddress).ToList();
await smtp.SendAsync(config.FromAddress, bccAddresses, subject, body, cancellationToken); await smtp.SendAsync(config.FromAddress, bccAddresses, subject, body, cancellationToken);
await smtp.DisconnectAsync(cancellationToken);
} }
catch (OperationCanceledException) when (cancellationToken.IsCancellationRequested) catch (OperationCanceledException) when (cancellationToken.IsCancellationRequested)
{ {
@@ -334,6 +332,23 @@ public class NotificationDeliveryService : INotificationDeliveryService
// catch filters (SmtpPermanentException / IsTransientSmtpError) handle them. // catch filters (SmtpPermanentException / IsTransientSmtpError) handle them.
finally finally
{ {
// NS-010: always tear the connection down, regardless of outcome. The
// SMTP QUIT used to run only on the success path inside the try block,
// so a failed Connect/Authenticate/Send left an open, authenticated
// connection until finalization reclaimed the socket — exhausting the
// server's connection slots under sustained transient failures.
// Disconnect is best-effort: a disconnect failure (e.g. the connection
// is already dead) must not mask the original delivery exception.
try
{
await smtp.DisconnectAsync(cancellationToken);
}
catch (Exception disconnectEx)
{
_logger.LogDebug(
"Ignoring SMTP disconnect failure during cleanup: {Reason}", disconnectEx.Message);
}
// NS-007: always release the concurrency slot, even on failure. // NS-007: always release the concurrency slot, even on failure.
limiter.Release(); limiter.Release();
} }
@@ -399,12 +414,3 @@ public class NotificationDeliveryService : INotificationDeliveryService
return ClassifySmtpError(ex, cancellationToken) == SmtpErrorClass.Transient; return ClassifySmtpError(ex, cancellationToken) == SmtpErrorClass.Transient;
} }
} }
/// <summary>
/// Signals a permanent SMTP failure (5xx) that should not be retried.
/// </summary>
public class SmtpPermanentException : Exception
{
public SmtpPermanentException(string message, Exception? innerException = null)
: base(message, innerException) { }
}

View File

@@ -0,0 +1,10 @@
namespace ScadaLink.NotificationService;
/// <summary>
/// Signals a permanent SMTP failure (5xx) that should not be retried.
/// </summary>
public class SmtpPermanentException : Exception
{
public SmtpPermanentException(string message, Exception? innerException = null)
: base(message, innerException) { }
}

View File

@@ -1,3 +1,4 @@
using System.Text.Json;
using MailKit; using MailKit;
using MailKit.Net.Smtp; using MailKit.Net.Smtp;
using Microsoft.Extensions.Logging.Abstractions; using Microsoft.Extensions.Logging.Abstractions;
@@ -407,6 +408,78 @@ public class NotificationDeliveryServiceTests
storage, new StoreAndForwardOptions(), NullLogger<StoreAndForwardService>.Instance); storage, new StoreAndForwardOptions(), NullLogger<StoreAndForwardService>.Instance);
} }
// ── NotificationService-010: SMTP client is disconnected on the failure path ──
/// <summary>
/// An SMTP wrapper that records whether <see cref="DisconnectAsync"/> ran and
/// can be told to fail at a chosen stage of the delivery sequence.
/// </summary>
private sealed class DisconnectTrackingClient : ISmtpClientWrapper, IDisposable
{
private readonly Func<Exception>? _failOnSend;
private readonly Func<Exception>? _failOnAuthenticate;
public DisconnectTrackingClient(
Func<Exception>? failOnSend = null, Func<Exception>? failOnAuthenticate = null)
{
_failOnSend = failOnSend;
_failOnAuthenticate = failOnAuthenticate;
}
public bool Disconnected { get; private set; }
public bool Disposed { get; private set; }
public Task ConnectAsync(string host, int port, SmtpTlsMode tlsMode, int connectionTimeoutSeconds, CancellationToken cancellationToken = default)
=> Task.CompletedTask;
public Task AuthenticateAsync(string authType, string? credentials, CancellationToken cancellationToken = default)
=> _failOnAuthenticate != null ? Task.FromException(_failOnAuthenticate()) : Task.CompletedTask;
public Task SendAsync(string from, IEnumerable<string> bccRecipients, string subject, string body, CancellationToken cancellationToken = default)
=> _failOnSend != null ? Task.FromException(_failOnSend()) : Task.CompletedTask;
public Task DisconnectAsync(CancellationToken cancellationToken = default)
{
Disconnected = true;
return Task.CompletedTask;
}
public void Dispose() => Disposed = true;
}
[Fact]
public async Task Send_TransientFailureDuringSend_StillDisconnectsClient()
{
// NS-010: DisconnectAsync used to run only on the success path inside the
// try block. A failure in SendAsync left the authenticated connection open
// (the SMTP QUIT was never issued), leaking server connection slots under
// sustained transient failures.
SetupHappyPath();
var tracking = new DisconnectTrackingClient(
failOnSend: () => new SmtpProtocolException("protocol error"));
var service = new NotificationDeliveryService(
_repository, () => tracking, NullLogger<NotificationDeliveryService>.Instance);
await service.SendAsync("ops-team", "Alert", "Body");
Assert.True(tracking.Disconnected, "DeliverAsync must disconnect the SMTP client even when the send fails");
}
[Fact]
public async Task Send_FailureDuringAuthenticate_StillDisconnectsClient()
{
// NS-010: an AuthenticateAsync failure must also tear the connection down.
SetupHappyPath();
var tracking = new DisconnectTrackingClient(
failOnAuthenticate: () => new SmtpProtocolException("auth handshake failed"));
var service = new NotificationDeliveryService(
_repository, () => tracking, NullLogger<NotificationDeliveryService>.Instance);
await service.SendAsync("ops-team", "Alert", "Body");
Assert.True(tracking.Disconnected, "DeliverAsync must disconnect the SMTP client even when authentication fails");
}
// ── NotificationService-005: explicit TLS mode passed through to the wrapper ── // ── NotificationService-005: explicit TLS mode passed through to the wrapper ──
/// <summary>An SMTP wrapper that records the TLS mode and timeout it was connected with.</summary> /// <summary>An SMTP wrapper that records the TLS mode and timeout it was connected with.</summary>
@@ -642,4 +715,77 @@ public class NotificationDeliveryServiceTests
Assert.False(result.Success); Assert.False(result.Success);
Assert.Contains("address", result.ErrorMessage, StringComparison.OrdinalIgnoreCase); Assert.Contains("address", result.ErrorMessage, StringComparison.OrdinalIgnoreCase);
} }
// ── NotificationService-012: OAuth2 delivery path coverage ──
/// <summary>An SMTP wrapper that records the auth type and credentials it received.</summary>
private sealed class RecordingAuthClient : ISmtpClientWrapper
{
public string? AuthType { get; private set; }
public string? Credentials { get; private set; }
public Task ConnectAsync(string host, int port, SmtpTlsMode tlsMode, int connectionTimeoutSeconds, CancellationToken cancellationToken = default)
=> Task.CompletedTask;
public Task AuthenticateAsync(string authType, string? credentials, CancellationToken cancellationToken = default)
{
AuthType = authType;
Credentials = credentials;
return Task.CompletedTask;
}
public Task SendAsync(string from, IEnumerable<string> bccRecipients, string subject, string body, CancellationToken cancellationToken = default)
=> Task.CompletedTask;
public Task DisconnectAsync(CancellationToken cancellationToken = default) => Task.CompletedTask;
}
private static OAuth2TokenService CreateTokenService(string accessToken, int expiresIn = 3600)
{
var json = JsonSerializer.Serialize(new
{
access_token = accessToken,
expires_in = expiresIn,
token_type = "Bearer"
});
var factory = Substitute.For<IHttpClientFactory>();
factory.CreateClient(Arg.Any<string>())
.Returns(_ => new HttpClient(new StubHttpHandler(json)));
return new OAuth2TokenService(factory, NullLogger<OAuth2TokenService>.Instance);
}
private sealed class StubHttpHandler : HttpMessageHandler
{
private readonly string _json;
public StubHttpHandler(string json) => _json = json;
protected override Task<HttpResponseMessage> SendAsync(
HttpRequestMessage request, CancellationToken cancellationToken)
=> Task.FromResult(new HttpResponseMessage(System.Net.HttpStatusCode.OK)
{
Content = new StringContent(_json)
});
}
[Fact]
public async Task Send_OAuth2Config_AuthenticatesWithResolvedAccessToken()
{
// NS-012: the OAuth2 delivery branch in DeliverAsync (token resolution during
// a send) was never exercised — every other test uses Basic Auth and a null
// token service. The credentials reaching the SMTP client must be the access
// token from OAuth2TokenService, not the raw tenant:client:secret triple.
var cfg = new SmtpConfiguration("smtp.office365.com", "oauth2", "noreply@example.com")
{
Id = 1, Port = 587, Credentials = "tenant1:client1:secret1", TlsMode = "starttls"
};
SetupHappyPathWithSmtp(cfg);
var recording = new RecordingAuthClient();
var service = new NotificationDeliveryService(
_repository,
() => recording,
NullLogger<NotificationDeliveryService>.Instance,
tokenService: CreateTokenService("oauth2-access-token-xyz"));
var result = await service.SendAsync("ops-team", "Alert", "Body");
Assert.True(result.Success);
Assert.Equal("oauth2", recording.AuthType);
Assert.Equal("oauth2-access-token-xyz", recording.Credentials);
}
} }

View File

@@ -124,6 +124,105 @@ public class OAuth2TokenServiceTests
Assert.Equal(2, handler.CallCount); // one per distinct credential, not per call Assert.Equal(2, handler.CallCount); // one per distinct credential, not per call
} }
// ── NotificationService-012: token expiry/refresh and concurrent acquisition ──
[Fact]
public async Task GetTokenAsync_ExpiredToken_RefreshesOnNextCall()
{
// NS-012: token expiry/refresh was untested — the cache test used a 3600s
// token so the refresh branch never ran. The service refreshes 60s before
// the stated expiry, so an expires_in of 60 makes the token immediately
// stale and the next call must fetch a fresh one.
var handler = new SequenceHttpMessageHandler(
TokenJson("first-token", expiresIn: 60),
TokenJson("second-token", expiresIn: 3600));
var client = new HttpClient(handler);
var factory = CreateMockFactory(client);
var service = new OAuth2TokenService(factory, NullLogger<OAuth2TokenService>.Instance);
var token1 = await service.GetTokenAsync("tenant:client:secret");
var token2 = await service.GetTokenAsync("tenant:client:secret");
Assert.Equal("first-token", token1);
Assert.Equal("second-token", token2); // refreshed because the first was already stale
Assert.Equal(2, handler.CallCount);
}
[Fact]
public async Task GetTokenAsync_ConcurrentCalls_MakeExactlyOneHttpRequest()
{
// NS-012: the double-checked-locking path was never exercised. Many callers
// racing for the same uncached credential must collapse to a single token
// fetch, not one HTTP call per caller.
var handler = new SlowCountingHttpMessageHandler(
TokenJson("concurrent-token", expiresIn: 3600), delay: TimeSpan.FromMilliseconds(100));
var client = new HttpClient(handler);
var factory = CreateMockFactory(client);
var service = new OAuth2TokenService(factory, NullLogger<OAuth2TokenService>.Instance);
var tasks = Enumerable.Range(0, 20)
.Select(_ => service.GetTokenAsync("tenant:client:secret"))
.ToArray();
var tokens = await Task.WhenAll(tasks);
Assert.All(tokens, t => Assert.Equal("concurrent-token", t));
Assert.Equal(1, handler.CallCount);
}
private static string TokenJson(string accessToken, int expiresIn) =>
JsonSerializer.Serialize(new
{
access_token = accessToken,
expires_in = expiresIn,
token_type = "Bearer"
});
/// <summary>HTTP handler returning a different response per invocation, in order.</summary>
private class SequenceHttpMessageHandler : HttpMessageHandler
{
private readonly string[] _responses;
public int CallCount { get; private set; }
public SequenceHttpMessageHandler(params string[] responses) => _responses = responses;
protected override Task<HttpResponseMessage> SendAsync(
HttpRequestMessage request, CancellationToken cancellationToken)
{
var body = _responses[Math.Min(CallCount, _responses.Length - 1)];
CallCount++;
return Task.FromResult(new HttpResponseMessage(HttpStatusCode.OK)
{
Content = new StringContent(body)
});
}
}
/// <summary>HTTP handler that delays and counts invocations (thread-safe count).</summary>
private class SlowCountingHttpMessageHandler : HttpMessageHandler
{
private readonly string _response;
private readonly TimeSpan _delay;
private int _callCount;
public int CallCount => _callCount;
public SlowCountingHttpMessageHandler(string response, TimeSpan delay)
{
_response = response;
_delay = delay;
}
protected override async Task<HttpResponseMessage> SendAsync(
HttpRequestMessage request, CancellationToken cancellationToken)
{
Interlocked.Increment(ref _callCount);
await Task.Delay(_delay, cancellationToken);
return new HttpResponseMessage(HttpStatusCode.OK)
{
Content = new StringContent(_response)
};
}
}
/// <summary> /// <summary>
/// HTTP handler that returns a distinct access token per tenant id, parsed from /// HTTP handler that returns a distinct access token per tenant id, parsed from
/// the request URL (<c>https://login.microsoftonline.com/{tenantId}/...</c>). /// the request URL (<c>https://login.microsoftonline.com/{tenantId}/...</c>).