fix(store-and-forward): resolve StoreAndForward-015..017 — document maxRetries=0 contract, replicate operator retry/discard, real category in activity log

Joseph Doherty
2026-05-17 03:18:41 -04:00
parent be274212f0
commit 0135a6b2a6
6 changed files with 283 additions and 12 deletions

@@ -8,7 +8,7 @@
| Last reviewed | 2026-05-17 |
| Reviewer | claude-agent |
| Commit reviewed | `39d737e` |
-| Open findings | 3 (3 Deferred: 002, 011, 012 — see notes) |
+| Open findings | 0 (3 Deferred: 002, 011, 012 — see notes) |
## Summary
@@ -735,7 +735,7 @@ all six `SiteActorPathTests` now pass. Fixed by the commit whose message referen
|--|--|
| Severity | Medium |
| Category | Documentation & comments |
-| Status | Open |
+| Status | Resolved |
| Location | `src/ScadaLink.StoreAndForward/StoreAndForwardService.cs:114`–`:130`, `:285` |
**Description**
@@ -784,7 +784,17 @@ public `EnqueueAsync` contract must state the chosen meaning; today it states no
**Resolution**
-_Unresolved._
+Resolved 2026-05-17. Documentation-only fix — retry semantics confirmed correct and
+left unchanged. Root cause verified against the source: `EnqueueAsync`'s `maxRetries`
+parameter had no `<param>` documentation and the method/class summaries described only
+the "park on max retries" path, never the `0 = no limit / retry forever` special case
+that `RetryMessageAsync`'s `MaxRetries > 0` guard actually enforces. Added an explicit
+`<param>` tag for every `EnqueueAsync` parameter — `maxRetries` now states in bold that
+`0` means "no limit, never parked for retry exhaustion" and is **not** a "never retry"
+value — extended the method summary with a retry-count lifecycle paragraph, updated the
+class-level lifecycle bullet, and tightened the `StoreAndForwardMessage.MaxRetries` field
+doc to the same wording. No behavioural code touched; an XML comment is not
+test-observable so no regression test was added.
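
The `maxRetries` contract described above can be illustrated with a minimal sketch. This is not the code from `StoreAndForwardService`: the `ShouldPark` helper, its parameter names, and the `retryCount >= maxRetries` comparison are assumptions made for illustration. Only the `MaxRetries > 0` guard and the "0 means no limit" meaning come from the finding.

```csharp
using System;

public static class RetryContractSketch
{
    /// <summary>
    /// Decides whether a message whose delivery attempt just failed should be parked.
    /// Hypothetical helper, not part of the reviewed code base.
    /// </summary>
    /// <param name="retryCount">Attempts made so far, including the one that just failed.</param>
    /// <param name="maxRetries">
    /// Retry limit before parking. <b>0 means no limit</b>: the message is retried
    /// forever and is never parked for retry exhaustion. It does not mean "never retry".
    /// </param>
    public static bool ShouldPark(int retryCount, int maxRetries)
    {
        // The "maxRetries > 0" guard is what turns 0 into the "no limit" value.
        // The ">=" comparison is an assumption about how the limit is applied.
        return maxRetries > 0 && retryCount >= maxRetries;
    }

    public static void Main()
    {
        // maxRetries = 0: never parked, however many attempts have failed.
        Console.WriteLine(ShouldPark(retryCount: 1000, maxRetries: 0));
        // Limit reached: the message is parked.
        Console.WriteLine(ShouldPark(retryCount: 3, maxRetries: 3));
        // Below the limit: the message stays in the retry cycle.
        Console.WriteLine(ShouldPark(retryCount: 2, maxRetries: 3));
    }
}
```

Documenting the special case on the `<param>` tag keeps the contract next to the parameter a caller actually sets, which is where a reader deciding between `0` and a positive limit will look.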
### StoreAndForward-016 — Operator-initiated parked-message retry and discard are not replicated to the standby
@@ -792,7 +802,7 @@ _Unresolved._
|--|--|
| Severity | Medium |
| Category | Error handling & resilience |
-| Status | Open |
+| Status | Resolved |
| Location | `src/ScadaLink.StoreAndForward/StoreAndForwardService.cs:339`–`:362`; `src/ScadaLink.StoreAndForward/ReplicationService.cs:131`–`:136` |
**Description**
@@ -842,7 +852,18 @@ paths).
**Resolution**
-_Unresolved._
+Resolved 2026-05-17. Root cause confirmed: the two operator paths
+(`RetryParkedMessageAsync`, `DiscardParkedMessageAsync`) changed local SQLite state but
+never touched `_replication`, so a failover after an operator action diverged the
+standby buffer. `DiscardParkedMessageAsync` now calls `_replication?.ReplicateRemove`
+after a successful local delete (the existing `Remove` op deletes on the standby). A new
+`ReplicationOperationType.Requeue` was added; `RetryParkedMessageAsync` re-loads the
+requeued row and calls `_replication?.ReplicateRequeue`, and the standby's
+`ApplyReplicatedOperationAsync` `Requeue` case resets its matching row to `Pending` with
+`retry_count = 0`. Regression tests `DiscardingAParkedMessage_ReplicatesARemoveOperation`,
+`RetryingAParkedMessage_ReplicatesARequeueOperation` and
+`ApplyReplicatedOperation_Requeue_MovesStandbyRowBackToPending` (in
+`StoreAndForwardReplicationTests`) all pass; the first two fail against the pre-fix code.
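
The standby-side behaviour described above can be sketched with an in-memory stand-in for the SQLite buffer. Everything except the `Remove` and `Requeue` operation names is an assumption for illustration: `StandbyBufferSketch`, `StandbyRow`, `RowStatus`, and the string message IDs are hypothetical, and the real `ReplicationOperationType` presumably has further members that are omitted here.

```csharp
using System;
using System.Collections.Generic;

// Only the two operations discussed in this finding; other members are omitted.
public enum ReplicationOperationType { Remove, Requeue }

// Hypothetical status type and row shape standing in for the buffered-message table.
public enum RowStatus { Pending, Parked }

public sealed class StandbyRow
{
    public RowStatus Status { get; set; }
    public int RetryCount { get; set; }
}

public sealed class StandbyBufferSketch
{
    private readonly Dictionary<string, StandbyRow> _rows = new Dictionary<string, StandbyRow>();

    public void Seed(string id, RowStatus status, int retryCount) =>
        _rows[id] = new StandbyRow { Status = status, RetryCount = retryCount };

    public bool TryGet(string id, out StandbyRow row) => _rows.TryGetValue(id, out row);

    // Applies an operation replicated from the active node.
    public void Apply(ReplicationOperationType op, string id)
    {
        switch (op)
        {
            case ReplicationOperationType.Remove:
                // Operator discard: delete the standby copy as well.
                _rows.Remove(id);
                break;
            case ReplicationOperationType.Requeue:
                // Operator retry: move the matching row back to Pending with a fresh count.
                if (_rows.TryGetValue(id, out var row))
                {
                    row.Status = RowStatus.Pending;
                    row.RetryCount = 0;
                }
                break;
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var standby = new StandbyBufferSketch();
        standby.Seed("retried", RowStatus.Parked, 5);
        standby.Seed("discarded", RowStatus.Parked, 5);

        standby.Apply(ReplicationOperationType.Requeue, "retried");
        standby.Apply(ReplicationOperationType.Remove, "discarded");

        standby.TryGet("retried", out var retried);
        Console.WriteLine($"{retried.Status} {retried.RetryCount}"); // row is pending again
        Console.WriteLine(standby.TryGet("discarded", out _));       // row no longer exists
    }
}
```

Both operations are idempotent in this sketch: requeueing or removing a row the standby does not hold is a no-op, which is a reasonable property for replicated operations that may arrive after the row is already gone.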
### StoreAndForward-017 — Retry/Discard activity-log entries hard-code the `ExternalSystem` category
@@ -850,7 +871,7 @@ _Unresolved._
|--|--|
| Severity | Low |
| Category | Correctness & logic bugs |
-| Status | Open |
+| Status | Resolved |
| Location | `src/ScadaLink.StoreAndForward/StoreAndForwardService.cs:344`, `:358` |
**Description**
@@ -882,4 +903,14 @@ allow a nullable category for management actions rather than asserting a false o
**Resolution**
-_Unresolved._
+Resolved 2026-05-17. Root cause confirmed: `RetryParkedMessageAsync` and
+`DiscardParkedMessageAsync` raised activity notifications with a hard-coded
+`StoreAndForwardCategory.ExternalSystem`, mislabelling parked `Notification` and
+`CachedDbWrite` messages in the site event log. Both methods now obtain the message's
+real category — `DiscardParkedMessageAsync` loads the row via `GetMessageByIdAsync`
+before the delete, `RetryParkedMessageAsync` re-loads the requeued row (also used for
+the StoreAndForward-016 replication) — and pass it to `RaiseActivity` (falling back to
+`ExternalSystem` only if the row is unexpectedly absent). Regression tests
+`RetryParkedMessageAsync_ActivityUsesMessageRealCategory` and
+`DiscardParkedMessageAsync_ActivityUsesMessageRealCategory` assert the activity carries
+`Notification` / `CachedDbWrite` respectively; both fail against the pre-fix code.
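
The category lookup with its fallback can be sketched as follows. The `StoredMessage` type and the `ResolveCategory` helper are hypothetical names introduced for illustration, and the enum lists only the three categories named in this finding; the real `StoreAndForwardCategory` and `RaiseActivity` signatures are not shown in this diff.

```csharp
using System;

// Only the categories named in this finding; the real enum may differ.
public enum StoreAndForwardCategory { ExternalSystem, Notification, CachedDbWrite }

// Hypothetical stand-in for the row returned by a lookup such as GetMessageByIdAsync.
public sealed class StoredMessage
{
    public StoreAndForwardCategory Category { get; set; }
}

public static class ActivityCategorySketch
{
    // Uses the message's own category, falling back to ExternalSystem only
    // when the row is unexpectedly absent (null).
    public static StoreAndForwardCategory ResolveCategory(StoredMessage? row) =>
        row?.Category ?? StoreAndForwardCategory.ExternalSystem;

    public static void Main()
    {
        var parked = new StoredMessage { Category = StoreAndForwardCategory.CachedDbWrite };

        // Row found: the activity carries the real category.
        Console.WriteLine(ResolveCategory(parked));
        // Row missing: the previous hard-coded value is used as a last resort.
        Console.WriteLine(ResolveCategory(null));
    }
}
```

For the discard path the lookup has to happen before the delete, as the resolution notes, because the row and its category no longer exist afterwards.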