fix(store-and-forward): resolve StoreAndForward-004,005,010,013 — accurate handler-contract doc, conditional sweep writes, reset LastAttemptAt on parked retry, test coverage

This commit is contained in:
Joseph Doherty
2026-05-16 21:44:10 -04:00
parent a88bec9376
commit 5672502d83
7 changed files with 447 additions and 18 deletions

View File

@@ -8,7 +8,7 @@
| Last reviewed | 2026-05-16 |
| Reviewer | claude-agent |
| Commit reviewed | `9c60592` |
| Open findings | 11 |
| Open findings | 7 |
## Summary
@@ -212,7 +212,7 @@ corrected semantics.
|--|--|
| Severity | Medium |
| Category | Documentation & comments |
| Status | Open |
| Status | Resolved |
| Location | `src/ScadaLink.StoreAndForward/StoreAndForwardService.cs:38`, `:60` |
**Description**
@@ -233,15 +233,25 @@ retry); exception = transient failure (buffer / increment retry).
**Resolution**
_Unresolved._
Resolved 2026-05-16 (commit pending). Confirmed the root cause against the source: the
old XML comment's "Permanent failures should return false (message will NOT be
buffered)" describes only the `EnqueueAsync` immediate path; on the retry path a
`false` return parks the already-buffered message. Reworded the `_deliveryHandlers`
field doc into an explicit three-way contract (`true` = delivered/removed-or-never-
buffered; `false` = permanent failure = not-buffered-on-immediate / parked-on-retry;
throws = transient = buffered-for-retry / retry-count-incremented), noting it applies
identically on both paths, and pointed `RegisterDeliveryHandler`'s summary at it.
Documentation-only change — no behavioural code touched, so no regression test
(an XML comment is not test-observable).
### StoreAndForward-005 — Parked-message retry/discard can race with the in-progress retry sweep
| | |
|--|--|
| Severity | Medium |
| Severity | Low — re-triaged down from Medium on 2026-05-16; the described data-loss race is not actually reachable, see Re-triage note |
| Original severity | Medium |
| Category | Concurrency & thread safety |
| Status | Open |
| Status | Resolved |
| Location | `src/ScadaLink.StoreAndForward/StoreAndForwardService.cs:184`, `:266`, `:280` |
**Description**
@@ -266,9 +276,42 @@ so the actor mailbox serialises them, or make status transitions conditional in
(e.g. `UPDATE ... WHERE id = @id AND status = @expected`) and re-check the affected
row count.
**Re-triage note (2026-05-16)**
Verified against the source: the *specific* race the finding describes — *"an operator
`RetryParkedMessageAsync` resets a row to Pending while the sweep simultaneously parks
the same in-flight message"* — **cannot occur** at the reviewed code. The two operator
operations (`StoreAndForwardStorage.RetryParkedMessageAsync` /
`DiscardParkedMessageAsync`) are already SQL-conditional on `status = Parked`, and the
retry sweep only ever loads and processes rows that are `Pending`. For the operator and
the sweep to touch the *same* message the row would have to be simultaneously `Parked`
(operator-actionable) and in the sweep's in-flight `Pending` snapshot — mutually
exclusive states. Overlapping sweeps are additionally prevented by the `Interlocked`
guard, so there is only ever one sweep writer. The described data-loss outcome is
therefore not reachable, which makes the original Medium ("risky behaviour") severity
an over-statement — **re-triaged to Low**.
What *is* real is a latent fragility: the sweep's own state-changing writes used the
unconditional `UpdateMessageAsync` (`WHERE id = @id`), so the no-clobber guarantee
rested entirely on the `Interlocked` invariant with zero defence-in-depth if that
invariant were ever broken (e.g. `RetryMessageAsync` called outside the sweep, or the
guard removed). That residual Low-severity weakness is worth closing, so the
recommendation's conditional-SQL option was applied.
**Resolution**
_Unresolved._
Resolved 2026-05-16 (commit pending). Re-triaged Medium → Low (the described race is
unreachable; see Re-triage note) and the residual latent fragility fixed: added
`StoreAndForwardStorage.UpdateMessageIfStatusAsync`, a conditional update
(`UPDATE ... WHERE id = @id AND status = @expectedStatus`) returning whether a row was
written. `RetryMessageAsync`'s three state-changing writes (park-on-permanent-failure,
park-on-max-retries, retry-count increment) now use it with `expectedStatus = Pending`,
so the sweep can never overwrite a row whose status changed underneath it; a skipped
write is logged and the message left for the next sweep. Regression test
`RetryMessageAsync_StatusChangedDuringDelivery_SweepParkWriteIsSkipped` drives a writer
that moves the row out of `Pending` mid-delivery and asserts the sweep's stale park
write is skipped (it failed against the pre-fix unconditional update, clobbering the
other writer's `RetryCount`).
### StoreAndForward-006 — `GetParkedMessagesAsync` count and page run without a transaction
@@ -395,7 +438,7 @@ _Unresolved._
|--|--|
| Severity | Medium |
| Category | Correctness & logic bugs |
| Status | Open |
| Status | Resolved |
| Location | `src/ScadaLink.StoreAndForward/StoreAndForwardStorage.cs:203`, `:101` |
**Description**
@@ -419,7 +462,16 @@ interval. Document the chosen behaviour in the method's XML comment.
**Resolution**
_Unresolved._
Resolved 2026-05-16 (commit pending). Confirmed against the source: `RetryParkedMessage
Async` reset `status`, `retry_count` and `last_error` but left `last_attempt_at` stale,
so the operator-retried message's retry timing depended on the elapsed time since the
original pre-park attempt. Encoded the intent explicitly — an operator retry means
"attempt this again now" — by also setting `last_attempt_at = NULL` in the UPDATE, so
the re-queued message is unambiguously due on the next sweep regardless of the
configured `retry_interval_ms`. The method's XML comment now documents this. Regression
test `RetryParkedMessageAsync_ClearsLastAttemptAt_SoMessageIsImmediatelyDue` uses a
1-hour retry interval and a recent `last_attempt_at`; it failed pre-fix (timestamp not
cleared, message excluded from the retry-due set) and passes post-fix.
### StoreAndForward-011 — `StoreAndForwardMessageStatus.InFlight` is unused and the doc's "retrying" status is unmodelled
@@ -488,7 +540,7 @@ _Unresolved._
|--|--|
| Severity | Medium |
| Category | Testing coverage |
| Status | Open |
| Status | Resolved |
| Location | `tests/ScadaLink.StoreAndForward.Tests/` (whole directory); `src/ScadaLink.StoreAndForward/StoreAndForwardStorage.cs:101`; `src/ScadaLink.StoreAndForward/ParkedMessageHandlerActor.cs` |
**Description**
@@ -514,7 +566,20 @@ invalid-JSON payloads.
**Resolution**
_Unresolved._
Resolved 2026-05-16 (commit pending). All three gaps closed.
(1) Retry-due timing: `GetMessagesForRetryAsync_NonZeroInterval_ExcludesNotYetDue
IncludesDue` exercises the `julianday` elapsed-time SQL with non-zero intervals — a
just-attempted message (1-hour interval) is excluded, an old one (attempted 2h ago,
5-min interval) and a never-attempted one are included.
(2) Active-side replication: this sub-claim was **already stale** at the reviewed code —
`StoreAndForwardReplicationTests` (added with finding 001's fix) asserts the Add, Remove
and Park replication operations; no new test was needed.
(3) `ParkedMessageHandlerActor`: new `ParkedMessageHandlerActorTests` using
`Akka.TestKit.Xunit2` (package reference added to the test project) covers Query/Retry/
Discard request-to-response mapping, correlation-ID propagation, the unknown-message
failure responses, and `ExtractMethodName` for `MethodName`, `Subject`, missing-property
and malformed-JSON payloads (all falling back to the category name without throwing).
These are coverage-gap tests over already-correct code, so they pass on first run.
### StoreAndForward-014 — Storage does not create its SQLite database directory

View File

@@ -36,8 +36,20 @@ public class StoreAndForwardService
private int _retryInProgress;
/// <summary>
/// WP-10: Delivery handler delegate. Returns true on success, throws on transient failure.
/// Permanent failures should return false (message will NOT be buffered).
/// WP-10: Delivery handler delegate. The return value / exception is interpreted
/// the same way on both the immediate-delivery path (<see cref="EnqueueAsync"/>)
/// and the background retry path (<c>RetryMessageAsync</c>):
/// <list type="bullet">
/// <item><description><c>true</c> — delivered successfully. The message is
/// removed from the buffer (or, on the immediate path, never buffered).</description></item>
/// <item><description><c>false</c> — permanent failure. On the immediate path
/// the message is NOT buffered; on a retry the message is already buffered and
/// is parked immediately (no further retries).</description></item>
/// <item><description>throws — transient failure. On the immediate path the
/// message is buffered for retry; on a retry the retry count is incremented and
/// the message is parked once <see cref="StoreAndForwardMessage.MaxRetries"/> is
/// reached.</description></item>
/// </list>
/// </summary>
private readonly Dictionary<StoreAndForwardCategory, Func<StoreAndForwardMessage, Task<bool>>> _deliveryHandlers = new();
@@ -59,7 +71,9 @@ public class StoreAndForwardService
}
/// <summary>
/// Registers a delivery handler for a given message category.
/// Registers a delivery handler for a given message category. See the
/// <c>_deliveryHandlers</c> field documentation for the true/false/throws contract,
/// which applies identically on the immediate and retry paths.
/// </summary>
public void RegisterDeliveryHandler(
StoreAndForwardCategory category,
@@ -241,11 +255,22 @@ public class StoreAndForwardService
return;
}
// Permanent failure on retry — park immediately
// Permanent failure on retry — park immediately.
// StoreAndForward-005: the sweep observed this row as Pending; only commit
// the park if it is still Pending so a concurrent operator action that
// moved it (retry/discard) is not silently overwritten.
message.Status = StoreAndForwardMessageStatus.Parked;
message.LastAttemptAt = DateTimeOffset.UtcNow;
message.LastError = "Permanent failure (handler returned false)";
await _storage.UpdateMessageAsync(message);
var parked = await _storage.UpdateMessageIfStatusAsync(
message, StoreAndForwardMessageStatus.Pending);
if (!parked)
{
_logger.LogDebug(
"Message {MessageId} changed status during delivery; sweep park skipped",
message.Id);
return;
}
_replication?.ReplicatePark(message);
RaiseActivity("Parked", message.Category,
$"Permanent failure for {message.Target}: handler returned false");
@@ -259,8 +284,18 @@ public class StoreAndForwardService
if (message.MaxRetries > 0 && message.RetryCount >= message.MaxRetries)
{
// StoreAndForward-005: conditional park — see the permanent-failure
// branch above for rationale.
message.Status = StoreAndForwardMessageStatus.Parked;
await _storage.UpdateMessageAsync(message);
var parked = await _storage.UpdateMessageIfStatusAsync(
message, StoreAndForwardMessageStatus.Pending);
if (!parked)
{
_logger.LogDebug(
"Message {MessageId} changed status during delivery; sweep park skipped",
message.Id);
return;
}
_replication?.ReplicatePark(message);
RaiseActivity("Parked", message.Category,
$"Max retries ({message.MaxRetries}) reached for {message.Target}");
@@ -270,7 +305,17 @@ public class StoreAndForwardService
}
else
{
await _storage.UpdateMessageAsync(message);
// StoreAndForward-005: the retry-count increment is also conditional
// on the row still being Pending so it cannot clobber an operator
// action that ran during the failed delivery.
if (!await _storage.UpdateMessageIfStatusAsync(
message, StoreAndForwardMessageStatus.Pending))
{
_logger.LogDebug(
"Message {MessageId} changed status during delivery; sweep retry-count update skipped",
message.Id);
return;
}
RaiseActivity("Retried", message.Category,
$"Retry {message.RetryCount}/{message.MaxRetries} for {message.Target}: {ex.Message}");
}

View File

@@ -164,6 +164,47 @@ public class StoreAndForwardStorage
await cmd.ExecuteNonQueryAsync();
}
/// <summary>
/// WP-10: Updates a message after a delivery attempt, but only if the row is still
/// in the expected status. Returns true if the row was updated, false if it had
/// already been changed (e.g. an operator retried or discarded the message) and so
/// was skipped.
///
/// StoreAndForward-005: the retry sweep uses this for its state-changing writes so
/// it cannot clobber a concurrent operator action (RetryParkedMessageAsync /
/// DiscardParkedMessageAsync). Those operator operations are themselves SQL-
/// conditional on <c>status = Parked</c>; making the sweep's writes conditional on
/// the status the sweep observed closes the sweep-vs-management race rather than
/// relying only on the in-process overlapping-sweep guard.
/// </summary>
public async Task<bool> UpdateMessageIfStatusAsync(
StoreAndForwardMessage message,
StoreAndForwardMessageStatus expectedStatus)
{
await using var connection = new SqliteConnection(_connectionString);
await connection.OpenAsync();
await using var cmd = connection.CreateCommand();
cmd.CommandText = @"
UPDATE sf_messages
SET retry_count = @retryCount,
last_attempt_at = @lastAttempt,
status = @status,
last_error = @lastError
WHERE id = @id AND status = @expectedStatus";
cmd.Parameters.AddWithValue("@id", message.Id);
cmd.Parameters.AddWithValue("@retryCount", message.RetryCount);
cmd.Parameters.AddWithValue("@lastAttempt", message.LastAttemptAt.HasValue
? message.LastAttemptAt.Value.ToString("O") : DBNull.Value);
cmd.Parameters.AddWithValue("@status", (int)message.Status);
cmd.Parameters.AddWithValue("@lastError", (object?)message.LastError ?? DBNull.Value);
cmd.Parameters.AddWithValue("@expectedStatus", (int)expectedStatus);
var rows = await cmd.ExecuteNonQueryAsync();
return rows > 0;
}
/// <summary>
/// WP-10: Removes a successfully delivered message.
/// </summary>
@@ -221,6 +262,13 @@ public class StoreAndForwardStorage
/// <summary>
/// WP-12: Moves a parked message back to pending for retry.
///
/// StoreAndForward-010: <c>last_attempt_at</c> is reset to NULL so the re-queued
/// message is unambiguously due on the next retry sweep. An operator-initiated
/// retry means "attempt this again now"; leaving the stale parked timestamp in
/// place would make the message's retry timing depend on the configured retry
/// interval relative to the original (pre-park) attempt — "try immediately" only
/// by accident, and a long interval would instead delay the operator's retry.
/// </summary>
public async Task<bool> RetryParkedMessageAsync(string messageId)
{
@@ -230,7 +278,7 @@ public class StoreAndForwardStorage
await using var cmd = connection.CreateCommand();
cmd.CommandText = @"
UPDATE sf_messages
SET status = @pending, retry_count = 0, last_error = NULL
SET status = @pending, retry_count = 0, last_error = NULL, last_attempt_at = NULL
WHERE id = @id AND status = @parked";
cmd.Parameters.AddWithValue("@id", messageId);

View File

@@ -0,0 +1,173 @@
using Akka.Actor;
using Akka.TestKit.Xunit2;
using Microsoft.Data.Sqlite;
using Microsoft.Extensions.Logging.Abstractions;
using ScadaLink.Commons.Messages.RemoteQuery;
using ScadaLink.Commons.Types.Enums;
namespace ScadaLink.StoreAndForward.Tests;
/// <summary>
/// StoreAndForward-013: tests for the <see cref="ParkedMessageHandlerActor"/> actor
/// bridge — the Query/Retry/Discard request-to-response mapping and the
/// <c>ExtractMethodName</c> payload-JSON parsing (including the malformed-JSON branch).
/// </summary>
public class ParkedMessageHandlerActorTests : TestKit, IAsyncLifetime, IDisposable
{
private readonly SqliteConnection _keepAlive;
private readonly StoreAndForwardStorage _storage;
private readonly StoreAndForwardService _service;
public ParkedMessageHandlerActorTests()
{
var connStr = $"Data Source=ActorTests_{Guid.NewGuid():N};Mode=Memory;Cache=Shared";
_keepAlive = new SqliteConnection(connStr);
_keepAlive.Open();
_storage = new StoreAndForwardStorage(connStr, NullLogger<StoreAndForwardStorage>.Instance);
var options = new StoreAndForwardOptions
{
DefaultRetryInterval = TimeSpan.Zero,
DefaultMaxRetries = 1,
RetryTimerInterval = TimeSpan.FromMinutes(10),
ReplicationEnabled = false,
};
_service = new StoreAndForwardService(
_storage, options, NullLogger<StoreAndForwardService>.Instance);
}
public async Task InitializeAsync() => await _storage.InitializeAsync();
public Task DisposeAsync() => Task.CompletedTask;
protected override void Dispose(bool disposing)
{
if (disposing) _keepAlive.Dispose();
base.Dispose(disposing);
}
/// <summary>Enqueues a message and parks it via the retry sweep (MaxRetries = 1).</summary>
private async Task<string> ParkMessageAsync(StoreAndForwardCategory category, string payloadJson)
{
_service.RegisterDeliveryHandler(category, _ => throw new HttpRequestException("always fails"));
var result = await _service.EnqueueAsync(category, "target", payloadJson, maxRetries: 1);
await _service.RetryPendingMessagesAsync();
return result.MessageId;
}
[Fact]
public async Task Query_ReturnsParkedEntries_WithExtractedMethodName()
{
await ParkMessageAsync(StoreAndForwardCategory.ExternalSystem,
"""{"MethodName":"StartPump","Args":{}}""");
var actor = Sys.ActorOf(Props.Create(() => new ParkedMessageHandlerActor(_service, "site-1")));
actor.Tell(new ParkedMessageQueryRequest("corr-1", "site-1", 1, 50, DateTimeOffset.UtcNow));
var response = ExpectMsg<ParkedMessageQueryResponse>();
Assert.True(response.Success);
Assert.Equal("corr-1", response.CorrelationId);
Assert.Equal("site-1", response.SiteId);
Assert.Single(response.Messages);
Assert.Equal("StartPump", response.Messages[0].MethodName);
Assert.Equal(StoreAndForwardCategory.ExternalSystem, response.Messages[0].Category);
}
[Fact]
public async Task Query_NotificationPayload_UsesSubjectAsMethodName()
{
await ParkMessageAsync(StoreAndForwardCategory.Notification,
"""{"Subject":"Tank overflow","Body":"..."}""");
var actor = Sys.ActorOf(Props.Create(() => new ParkedMessageHandlerActor(_service, "site-1")));
actor.Tell(new ParkedMessageQueryRequest("corr-2", "site-1", 1, 50, DateTimeOffset.UtcNow));
var response = ExpectMsg<ParkedMessageQueryResponse>();
Assert.True(response.Success);
Assert.Equal("Tank overflow", response.Messages[0].MethodName);
}
[Fact]
public async Task Query_MalformedJsonPayload_FallsBackToCategoryName()
{
await ParkMessageAsync(StoreAndForwardCategory.CachedDbWrite, "not-valid-json{");
var actor = Sys.ActorOf(Props.Create(() => new ParkedMessageHandlerActor(_service, "site-1")));
actor.Tell(new ParkedMessageQueryRequest("corr-3", "site-1", 1, 50, DateTimeOffset.UtcNow));
var response = ExpectMsg<ParkedMessageQueryResponse>();
Assert.True(response.Success);
// Malformed JSON must not throw — ExtractMethodName falls back to the category.
Assert.Equal(StoreAndForwardCategory.CachedDbWrite.ToString(), response.Messages[0].MethodName);
}
[Fact]
public async Task Query_PayloadWithoutMethodNameOrSubject_FallsBackToCategoryName()
{
await ParkMessageAsync(StoreAndForwardCategory.ExternalSystem, """{"Unrelated":"value"}""");
var actor = Sys.ActorOf(Props.Create(() => new ParkedMessageHandlerActor(_service, "site-1")));
actor.Tell(new ParkedMessageQueryRequest("corr-4", "site-1", 1, 50, DateTimeOffset.UtcNow));
var response = ExpectMsg<ParkedMessageQueryResponse>();
Assert.Equal(StoreAndForwardCategory.ExternalSystem.ToString(), response.Messages[0].MethodName);
}
[Fact]
public async Task Retry_ParkedMessage_ReturnsSuccess()
{
var messageId = await ParkMessageAsync(StoreAndForwardCategory.ExternalSystem, """{}""");
var actor = Sys.ActorOf(Props.Create(() => new ParkedMessageHandlerActor(_service, "site-1")));
actor.Tell(new ParkedMessageRetryRequest("corr-5", "site-1", messageId, DateTimeOffset.UtcNow));
var response = ExpectMsg<ParkedMessageRetryResponse>();
Assert.True(response.Success);
Assert.Equal("corr-5", response.CorrelationId);
Assert.Null(response.ErrorMessage);
var msg = await _storage.GetMessageByIdAsync(messageId);
Assert.Equal(StoreAndForwardMessageStatus.Pending, msg!.Status);
}
[Fact]
public void Retry_UnknownMessage_ReturnsFailureWithMessage()
{
var actor = Sys.ActorOf(Props.Create(() => new ParkedMessageHandlerActor(_service, "site-1")));
actor.Tell(new ParkedMessageRetryRequest("corr-6", "site-1", "does-not-exist", DateTimeOffset.UtcNow));
var response = ExpectMsg<ParkedMessageRetryResponse>();
Assert.False(response.Success);
Assert.Equal("corr-6", response.CorrelationId);
Assert.NotNull(response.ErrorMessage);
}
[Fact]
public async Task Discard_ParkedMessage_ReturnsSuccessAndRemovesMessage()
{
var messageId = await ParkMessageAsync(StoreAndForwardCategory.ExternalSystem, """{}""");
var actor = Sys.ActorOf(Props.Create(() => new ParkedMessageHandlerActor(_service, "site-1")));
actor.Tell(new ParkedMessageDiscardRequest("corr-7", "site-1", messageId, DateTimeOffset.UtcNow));
var response = ExpectMsg<ParkedMessageDiscardResponse>();
Assert.True(response.Success);
Assert.Equal("corr-7", response.CorrelationId);
var msg = await _storage.GetMessageByIdAsync(messageId);
Assert.Null(msg);
}
[Fact]
public void Discard_UnknownMessage_ReturnsFailureWithMessage()
{
var actor = Sys.ActorOf(Props.Create(() => new ParkedMessageHandlerActor(_service, "site-1")));
actor.Tell(new ParkedMessageDiscardRequest("corr-8", "site-1", "does-not-exist", DateTimeOffset.UtcNow));
var response = ExpectMsg<ParkedMessageDiscardResponse>();
Assert.False(response.Success);
Assert.NotNull(response.ErrorMessage);
}
}

View File

@@ -9,6 +9,7 @@
</PropertyGroup>
<ItemGroup>
<PackageReference Include="Akka.TestKit.Xunit2" />
<PackageReference Include="coverlet.collector" />
<PackageReference Include="Microsoft.Data.Sqlite" />
<PackageReference Include="Microsoft.NET.Test.Sdk" />

View File

@@ -177,6 +177,49 @@ public class StoreAndForwardServiceTests : IAsyncLifetime, IDisposable
Assert.Equal(1, msg.RetryCount); // one sweep retry recorded
}
// ── StoreAndForward-005: sweep-vs-management race hardening ──
[Fact]
public async Task RetryMessageAsync_StatusChangedDuringDelivery_SweepParkWriteIsSkipped()
{
// StoreAndForward-005: the retry sweep's state-changing writes must be
// conditional on the status it observed, so a concurrent operator action that
// moved the row out of Pending (e.g. between the sweep's snapshot load and its
// park write) is not silently overwritten by the sweep's stale view.
var result = await _service.EnqueueAsync(
StoreAndForwardCategory.ExternalSystem, "api", """{}""",
attemptImmediateDelivery: false, maxRetries: 1);
_service.RegisterDeliveryHandler(StoreAndForwardCategory.ExternalSystem,
async msg =>
{
// Simulate an operator action winning the race: the row leaves Pending
// (here: parked) while the sweep is still mid-delivery. The sweep would
// otherwise unconditionally re-write this row from its stale snapshot.
var parkedOutFromUnderTheSweep = new StoreAndForwardMessage
{
Id = msg.Id, Category = msg.Category, Target = msg.Target,
PayloadJson = msg.PayloadJson, RetryCount = 7,
MaxRetries = msg.MaxRetries, RetryIntervalMs = msg.RetryIntervalMs,
CreatedAt = msg.CreatedAt, LastAttemptAt = DateTimeOffset.UtcNow,
Status = StoreAndForwardMessageStatus.Parked,
LastError = "operator/other writer"
};
await _storage.UpdateMessageAsync(parkedOutFromUnderTheSweep);
throw new HttpRequestException("transient — sweep will try to park");
});
await _service.RetryPendingMessagesAsync();
// The sweep observed Pending; the row is now Parked with the other writer's
// RetryCount (7), not the sweep's (1). The sweep's conditional write was skipped.
var msg = await _storage.GetMessageByIdAsync(result.MessageId);
Assert.NotNull(msg);
Assert.Equal(StoreAndForwardMessageStatus.Parked, msg!.Status);
Assert.Equal(7, msg.RetryCount);
Assert.Equal("operator/other writer", msg.LastError);
}
[Fact]
public async Task RetryPendingMessagesAsync_PermanentFailureOnRetry_ParksMessage()
{

View File

@@ -106,6 +106,35 @@ public class StoreAndForwardStorageTests : IAsyncLifetime, IDisposable
Assert.All(forRetry, m => Assert.Equal(StoreAndForwardMessageStatus.Pending, m.Status));
}
[Fact]
public async Task GetMessagesForRetryAsync_NonZeroInterval_ExcludesNotYetDueIncludesDue()
{
// StoreAndForward-013: exercise the julianday elapsed-time comparison with a
// non-zero retry interval. A message attempted just now must NOT be due; one
// attempted long ago must be due.
var notDue = CreateMessage("notdue", StoreAndForwardCategory.ExternalSystem);
notDue.RetryIntervalMs = (long)TimeSpan.FromHours(1).TotalMilliseconds;
notDue.LastAttemptAt = DateTimeOffset.UtcNow;
await _storage.EnqueueAsync(notDue);
var due = CreateMessage("due", StoreAndForwardCategory.ExternalSystem);
due.RetryIntervalMs = (long)TimeSpan.FromMinutes(5).TotalMilliseconds;
due.LastAttemptAt = DateTimeOffset.UtcNow.AddHours(-2);
await _storage.EnqueueAsync(due);
var neverAttempted = CreateMessage("never", StoreAndForwardCategory.ExternalSystem);
neverAttempted.RetryIntervalMs = (long)TimeSpan.FromHours(1).TotalMilliseconds;
neverAttempted.LastAttemptAt = null;
await _storage.EnqueueAsync(neverAttempted);
var forRetry = await _storage.GetMessagesForRetryAsync();
var ids = forRetry.Select(m => m.Id).ToHashSet();
Assert.DoesNotContain("notdue", ids);
Assert.Contains("due", ids);
Assert.Contains("never", ids);
}
[Fact]
public async Task GetParkedMessagesAsync_ReturnsParkedOnly()
{
@@ -136,6 +165,31 @@ public class StoreAndForwardStorageTests : IAsyncLifetime, IDisposable
Assert.Equal(0, retrieved.RetryCount);
}
[Fact]
public async Task RetryParkedMessageAsync_ClearsLastAttemptAt_SoMessageIsImmediatelyDue()
{
// StoreAndForward-010: a re-queued parked message must be unambiguously due
// for the next sweep regardless of its (stale) last_attempt_at. Use a large
// retry interval so a leftover timestamp would otherwise exclude the message.
var msg = CreateMessage("requeue1", StoreAndForwardCategory.ExternalSystem);
msg.RetryIntervalMs = (long)TimeSpan.FromHours(1).TotalMilliseconds;
msg.LastAttemptAt = DateTimeOffset.UtcNow; // recent attempt
msg.Status = StoreAndForwardMessageStatus.Parked;
await _storage.EnqueueAsync(msg);
await _storage.UpdateMessageAsync(msg);
var requeued = await _storage.RetryParkedMessageAsync("requeue1");
Assert.True(requeued);
var retrieved = await _storage.GetMessageByIdAsync("requeue1");
Assert.Null(retrieved!.LastAttemptAt);
// It must appear in the retry-due set even though the configured interval
// (1 hour) has not elapsed since the original attempt.
var due = await _storage.GetMessagesForRetryAsync();
Assert.Contains(due, m => m.Id == "requeue1");
}
[Fact]
public async Task DiscardParkedMessageAsync_RemovesMessage()
{