fix(health): route site heartbeats into the aggregator

CentralCommunicationActor.HandleHeartbeat was forwarding each incoming
HeartbeatMessage to Context.Parent, which resolves to the /user
guardian, and the guardian has no handler for it. Every site heartbeat
went straight to dead letters (~1026 per central node per 30 minutes
at the default ~2s interval across three sites).

The aggregator now exposes MarkHeartbeat(siteId, receivedAt), which
bumps LastReportReceivedAt on already-known sites (and sets IsOnline
back to true if it had flipped to offline) without touching
LatestReport. Heartbeats from unregistered sites are dropped; first
registration still happens on the first full report.
CentralCommunicationActor calls this in place of the dead-lettered
Tell.

The result: heartbeats now serve their stated health-monitoring
purpose (per CLAUDE.md). A site stays marked online between the 30s
full reports even if a single report is briefly delayed, and the
dead-letter noise disappears entirely.
Joseph Doherty
2026-05-13 08:11:43 -04:00
parent 7bba48a14a
commit f66dc031a4
5 changed files with 41 additions and 8 deletions

@@ -100,8 +100,8 @@ public class CentralCommunicationActor : ReceiveActor
     private void HandleHeartbeat(HeartbeatMessage heartbeat)
     {
-        // Forward heartbeat to parent for any interested central actors
-        Context.Parent.Tell(heartbeat);
+        var aggregator = _serviceProvider.GetService<ICentralHealthAggregator>();
+        aggregator?.MarkHeartbeat(heartbeat.SiteId, heartbeat.Timestamp);
     }
     /// <summary>