f66dc031a4
CentralCommunicationActor.HandleHeartbeat was forwarding each incoming HeartbeatMessage to Context.Parent, which resolves to the /user guardian. The guardian has no handler for it, so every site heartbeat went straight to dead letters (~1026 per central node per 30 minutes at the default ~2s interval across three sites).

The aggregator now exposes MarkHeartbeat(siteId, receivedAt), which bumps LastReportReceivedAt on already-known sites (and restores IsOnline if it had flipped to offline) without touching LatestReport. Heartbeats from unregistered sites are dropped; first registration still happens on the first full report. CentralCommunicationActor calls this in place of the no-op Tell.

The result: heartbeats now serve their stated health-monitoring purpose (per CLAUDE.md) by keeping a site marked online between the 30s full reports if a single report is briefly delayed, and the dead letter noise disappears entirely.
24 lines
838 B
C#
using ScadaLink.Commons.Messages.Health;

namespace ScadaLink.HealthMonitoring;

/// <summary>
/// Interface for central-side health aggregation.
/// Consumed by Central UI to display site health dashboards.
/// </summary>
public interface ICentralHealthAggregator
{
    void ProcessReport(SiteHealthReport report);

    /// <summary>
    /// Bumps the last-seen timestamp for a site already known via a prior
    /// SiteHealthReport. Used to keep a site marked online between full
    /// 30s reports when ~2s heartbeats are arriving; protects against the
    /// 60s offline threshold firing on a transiently delayed report.
    /// </summary>
    void MarkHeartbeat(string siteId, DateTimeOffset receivedAt);

    IReadOnlyDictionary<string, SiteHealthState> GetAllSiteStates();
    SiteHealthState? GetSiteState(string siteId);
}
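
The MarkHeartbeat contract described above can be illustrated with a minimal in-memory sketch. It is not the project's actual aggregator: the `SiteHealthReport` and `SiteHealthState` records below are hypothetical stand-ins whose shapes are assumed, `InMemoryHealthAggregator` and `CheckOffline` are invented names, and `ProcessReport` takes an explicit `receivedAt` (unlike the real interface) only so the timeline is deterministic.

```csharp
#nullable enable
using System;
using System.Collections.Concurrent;

// Hypothetical stand-ins: the real types live in ScadaLink and are not shown
// in this commit, so these shapes are assumptions for illustration only.
public sealed record SiteHealthReport(string SiteId);

public sealed record SiteHealthState(
    string SiteId,
    SiteHealthReport LatestReport,
    DateTimeOffset LastReportReceivedAt,
    bool IsOnline);

public sealed class InMemoryHealthAggregator
{
    // 60s offline threshold, as stated in the MarkHeartbeat doc comment.
    private static readonly TimeSpan OfflineThreshold = TimeSpan.FromSeconds(60);
    private readonly ConcurrentDictionary<string, SiteHealthState> _sites = new();

    // A full report registers the site (if new) and replaces LatestReport.
    // Assumption: receivedAt is passed in here to keep the demo deterministic.
    public void ProcessReport(SiteHealthReport report, DateTimeOffset receivedAt) =>
        _sites[report.SiteId] = new SiteHealthState(report.SiteId, report, receivedAt, true);

    // Bumps the last-seen timestamp and restores IsOnline for known sites only.
    // LatestReport is left untouched; heartbeats from unknown sites are dropped.
    public void MarkHeartbeat(string siteId, DateTimeOffset receivedAt)
    {
        // Retry the compare-and-swap if another thread updated the entry meanwhile.
        while (_sites.TryGetValue(siteId, out var current))
        {
            var updated = current with { LastReportReceivedAt = receivedAt, IsOnline = true };
            if (_sites.TryUpdate(siteId, updated, current)) return;
        }
    }

    // Periodic sweep: flips a site offline once nothing has been seen for the threshold.
    public void CheckOffline(DateTimeOffset now)
    {
        foreach (var (id, state) in _sites)
            if (state.IsOnline && now - state.LastReportReceivedAt > OfflineThreshold)
                _sites.TryUpdate(id, state with { IsOnline = false }, state);
    }

    public SiteHealthState? GetSiteState(string siteId) =>
        _sites.TryGetValue(siteId, out var s) ? s : null;
}

public static class Demo
{
    public static void Main()
    {
        var agg = new InMemoryHealthAggregator();
        var t0 = DateTimeOffset.UnixEpoch;

        // Heartbeat before any full report: dropped, site stays unregistered.
        agg.MarkHeartbeat("site-a", t0);
        Console.WriteLine(agg.GetSiteState("site-a") is null);

        // Both sites report at t0; the next full report is late for both.
        agg.ProcessReport(new SiteHealthReport("site-a"), t0);
        agg.ProcessReport(new SiteHealthReport("site-b"), t0);

        // Only site-a keeps sending ~2s heartbeats during the gap.
        for (var s = 2; s <= 64; s += 2)
            agg.MarkHeartbeat("site-a", t0.AddSeconds(s));

        agg.CheckOffline(t0.AddSeconds(65));
        Console.WriteLine(agg.GetSiteState("site-a")!.IsOnline); // last seen 1s ago
        Console.WriteLine(agg.GetSiteState("site-b")!.IsOnline); // last seen 65s ago

        // A later heartbeat brings site-b back online without a new report.
        agg.MarkHeartbeat("site-b", t0.AddSeconds(66));
        Console.WriteLine(agg.GetSiteState("site-b")!.IsOnline);
    }
}
```

In this sketch, site-a survives the 60s sweep because its heartbeats keep the last-seen timestamp fresh, while site-b flips offline and is restored by its next heartbeat. Using immutable records with `TryUpdate` is one way to keep updates atomic per site; a lock around a mutable state object would serve equally well.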