Verify component designs against Akka.NET best practices documentation

Cluster Infrastructure: add min-nr-of-members=1 requirement for single-node
operation after failover. Add graceful shutdown / CoordinatedShutdown section
for fast singleton handover during planned maintenance.

Site Runtime: add explicit supervision strategies per actor type (Resume for
coordinators, Stop for short-lived execution actors). Stagger Instance Actor
startup to prevent reconnection storms. Add Tell-vs-Ask usage guidance per
Akka.NET best practices (Tell for hot path, Ask for system boundaries only).

Data Connection Layer: add Connection Actor Model section documenting the
Become/Stash pattern for connection lifecycle state machine.

Health Monitoring: add dead letter count as a monitored metric.

Host: add REQ-HOST-8a for dead letter monitoring (subscribe to EventStream,
log at Warning level, report as health metric).
This commit is contained in:
Joseph Doherty
2026-03-16 09:12:36 -04:00
parent de636b908b
commit 409cc62309
5 changed files with 58 additions and 4 deletions

View File

@@ -98,6 +98,10 @@ The Host must configure Serilog as the logging provider with:
- Automatic enrichment of every log entry with `SiteId`, `NodeHostname`, and `NodeRole` properties sourced from `NodeConfiguration`.
- Structured (machine-parseable) output format.
### REQ-HOST-8a: Dead Letter Monitoring
The Host must subscribe to the Akka.NET `DeadLetter` event stream and log dead letters at Warning level. Dead letters indicate messages sent to actors that no longer exist — a common symptom of failover timing issues, stale actor references, or race conditions during instance lifecycle transitions. The dead letter count is reported as a health metric (see Health Monitoring).
### REQ-HOST-9: Graceful Shutdown
When the Host process receives a stop signal (Windows Service stop, `Ctrl+C`, or SIGTERM), it must trigger Akka.NET CoordinatedShutdown to allow actors to drain in-flight work before the process exits. The Host must not call `Environment.Exit()` or forcibly terminate the actor system without coordinated shutdown.