fix(store-and-forward): resolve S&F delivery + replication wiring (3 Critical findings)

Resolves StoreAndForward-001, ExternalSystemGateway-001, NotificationService-001
— one systemic gap where buffered messages were persisted but never delivered,
and the active node never replicated its buffer to the standby.

Delivery handlers (ExternalSystemGateway-001 / NotificationService-001):
- AkkaHostedService registers delivery handlers for the ExternalSystem,
  CachedDbWrite and Notification categories after StoreAndForwardService starts;
  each resolves its scoped consumer in a fresh DI scope.
- ExternalSystemClient, DatabaseGateway and NotificationDeliveryService each
  gain a DeliverBufferedAsync method that re-resolves the target and re-attempts
  delivery; it returns true or false, or throws, per the transient-vs-permanent
  contract.
- EnqueueAsync gains an attemptImmediateDelivery flag; CachedCallAsync and
  NotificationDeliveryService.SendAsync pass false (they already attempted
  delivery themselves) so registering a handler does not dispatch twice.

Replication (StoreAndForward-001):
- ReplicationService is injected into StoreAndForwardService; a new BufferAsync
  helper replicates every enqueue, and removals after a successful retry and
  parks are replicated too. Replication is fire-and-forget and a no-op when
  replication is disabled.

Tests: StoreAndForwardReplicationTests (Add/Remove/Park observed),
attemptImmediateDelivery behaviour, and DeliverBufferedAsync paths for each
consumer. Full solution builds; StoreAndForward/ExternalSystemGateway/
NotificationService suites green.
Joseph Doherty
2026-05-16 18:58:11 -04:00
parent a9bd7ee37c
commit 61253e3269
15 changed files with 538 additions and 37 deletions


@@ -8,7 +8,7 @@
| Last reviewed | 2026-05-16 |
| Reviewer | claude-agent |
| Commit reviewed | `9c60592` |
-| Open findings | 13 |
+| Open findings | 12 |
## Summary
@@ -53,7 +53,7 @@ replication and retry-count issues are functional defects against the design.
|--|--|
| Severity | Critical |
| Category | Error handling & resilience |
-| Status | Open |
+| Status | Resolved |
| Location | `src/ScadaLink.StoreAndForward/ReplicationService.cs:40`, `:53`, `:66`; `src/ScadaLink.StoreAndForward/StoreAndForwardService.cs:155`, `:212`, `:222`, `:236` |
**Description**
@@ -81,7 +81,14 @@ asserts the replication handler observes each operation type.
**Resolution**
-_Unresolved._
+Resolved 2026-05-16. `ReplicationService` is now injected into `StoreAndForwardService`
+(wired in `AddStoreAndForward`), and every buffer operation is forwarded to the standby:
+a new `BufferAsync` helper calls `ReplicateEnqueue` after each persist, `ReplicateRemove`
+runs after a successful retry removes a message, and `ReplicatePark` runs on both park
+paths. Replication stays fire-and-forget and is a no-op when `ReplicationEnabled` is
+false or no handler is wired. Regression tests `StoreAndForwardReplicationTests` assert
+the replication handler observes the Add, Remove and Park operations. Fixed by the
+commit whose message references `StoreAndForward-001`.
### StoreAndForward-002 — Messages enqueued with no registered handler are buffered but never deliverable