fix(health-monitoring): resolve HealthMonitoring-013,014,016 — shorter-timeout cadence, options validation, injected TimeProvider; HealthMonitoring-015 left open (cross-module design decision)
@@ -8,7 +8,7 @@
|
|||||||
| Last reviewed | 2026-05-17 |
|
| Last reviewed | 2026-05-17 |
|
||||||
| Reviewer | claude-agent |
|
| Reviewer | claude-agent |
|
||||||
| Commit reviewed | `39d737e` |
|
| Commit reviewed | `39d737e` |
|
||||||
| Open findings | 4 |
|
| Open findings | 1 |
|
||||||
|
|
||||||
## Summary
|
## Summary
|
||||||
|
|
||||||
@@ -585,7 +585,7 @@ the contract is now honest and no further code change was required.
|
|||||||
|--|--|
|
|--|--|
|
||||||
| Severity | Low |
|
| Severity | Low |
|
||||||
| Category | Documentation & comments |
|
| Category | Documentation & comments |
|
||||||
| Status | Open |
|
| Status | Resolved |
|
||||||
| Location | `src/ScadaLink.HealthMonitoring/CentralHealthAggregator.cs:194-196` |
|
| Location | `src/ScadaLink.HealthMonitoring/CentralHealthAggregator.cs:194-196` |
|
||||||
|
|
||||||
**Description**
|
**Description**
|
||||||
@@ -617,7 +617,16 @@ that the cadence is derived from `OfflineTimeout` only (acceptable while the def
|
|||||||
|
|
||||||
**Resolution**
|
**Resolution**
|
||||||
|
|
||||||
_Unresolved._
|
Resolved 2026-05-17. Root cause confirmed against source — `ExecuteAsync` derived the
|
||||||
|
`PeriodicTimer` cadence solely from `OfflineTimeout` while the comment claimed it
|
||||||
|
tracked the "(shorter)" timeout. Took the code-matches-comment option: extracted the
|
||||||
|
cadence into `CentralHealthAggregator.ComputeCheckInterval`, which now derives it from
|
||||||
|
half of the *shorter* of `OfflineTimeout` and `CentralOfflineTimeout`, so a
|
||||||
|
`CentralOfflineTimeout` configured below `OfflineTimeout` is still polled at least
|
||||||
|
twice within its window. The comment was rewritten to match. Regression test
|
||||||
|
`CentralHealthAggregatorTests.CheckInterval_IsHalfTheShorterTimeout` asserts the
|
||||||
|
default case (30s) and the shorter-`CentralOfflineTimeout` case (10s) — the latter
|
||||||
|
would have returned 30s against the pre-fix code.
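
The cadence arithmetic above can be made concrete with a small sketch. It is illustrative only: `Timeouts` and `CadenceSketch` are hypothetical stand-ins for `HealthMonitoringOptions` and the aggregator, and `PreFix` only reconstructs the old behaviour as this note describes it.

```csharp
using System;

// Hypothetical stand-in for HealthMonitoringOptions; only the two timeouts matter here.
public sealed record Timeouts(TimeSpan Offline, TimeSpan CentralOffline);

public static class CadenceSketch
{
    // Pre-fix behaviour as described above: half of OfflineTimeout only.
    public static TimeSpan PreFix(Timeouts t) =>
        TimeSpan.FromMilliseconds(t.Offline.TotalMilliseconds / 2);

    // Post-fix behaviour: half of whichever timeout is shorter.
    public static TimeSpan PostFix(Timeouts t)
    {
        var shorter = t.Offline < t.CentralOffline ? t.Offline : t.CentralOffline;
        return TimeSpan.FromMilliseconds(shorter.TotalMilliseconds / 2);
    }

    public static void Main()
    {
        // The shorter-CentralOfflineTimeout case from the regression test.
        var t = new Timeouts(TimeSpan.FromSeconds(60), TimeSpan.FromSeconds(20));
        Console.WriteLine(PreFix(t));   // 00:00:30
        Console.WriteLine(PostFix(t));  // 00:00:10
    }
}
```

With a 20s central window, a 30s cadence can miss the window entirely, while a 10s cadence checks it at least twice.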
|
||||||
|
|
||||||
### HealthMonitoring-014 — `HealthMonitoringOptions` intervals are unvalidated; a zero/negative value crashes the hosted service
|
### HealthMonitoring-014 — `HealthMonitoringOptions` intervals are unvalidated; a zero/negative value crashes the hosted service
|
||||||
|
|
||||||
@@ -625,7 +634,7 @@ _Unresolved._
|
|||||||
|--|--|
|
|--|--|
|
||||||
| Severity | Low |
|
| Severity | Low |
|
||||||
| Category | Error handling & resilience |
|
| Category | Error handling & resilience |
|
||||||
| Status | Open |
|
| Status | Resolved |
|
||||||
| Location | `src/ScadaLink.HealthMonitoring/HealthMonitoringOptions.cs:3-20`, `src/ScadaLink.HealthMonitoring/CentralHealthAggregator.cs:196`, `src/ScadaLink.HealthMonitoring/HealthReportSender.cs:67`, `src/ScadaLink.HealthMonitoring/CentralHealthReportLoop.cs:63` |
|
| Location | `src/ScadaLink.HealthMonitoring/HealthMonitoringOptions.cs:3-20`, `src/ScadaLink.HealthMonitoring/CentralHealthAggregator.cs:196`, `src/ScadaLink.HealthMonitoring/HealthReportSender.cs:67`, `src/ScadaLink.HealthMonitoring/CentralHealthReportLoop.cs:63` |
|
||||||
|
|
||||||
**Description**
|
**Description**
|
||||||
@@ -653,7 +662,21 @@ configuration fails fast with a message naming the section and key.
|
|||||||
|
|
||||||
**Resolution**
|
**Resolution**
|
||||||
|
|
||||||
_Unresolved._
|
Resolved 2026-05-17. Root cause confirmed — `HealthMonitoringOptions` had no
|
||||||
|
validator, so a zero/negative interval reached `new PeriodicTimer(...)` and crashed
|
||||||
|
the hosted service with an opaque `ArgumentOutOfRangeException`. Added
|
||||||
|
`HealthMonitoringOptionsValidator : IValidateOptions<HealthMonitoringOptions>` that
|
||||||
|
rejects non-positive `ReportInterval`/`OfflineTimeout`/`CentralOfflineTimeout` and a
|
||||||
|
`CentralOfflineTimeout` shorter than `OfflineTimeout`, each failure naming the
|
||||||
|
`ScadaLink:HealthMonitoring` config key. It is registered (idempotently, via
|
||||||
|
`TryAddEnumerable`) by all three `ServiceCollectionExtensions` registration methods,
|
||||||
|
so it fires when the hosted services resolve `IOptions<HealthMonitoringOptions>.Value` at startup, failing
|
||||||
|
fast with a clear message. (`ValidateOnStart()` lives in the Host module's binding
|
||||||
|
call, which is out of scope; the validator nonetheless runs at startup because the
|
||||||
|
hosted-service constructors resolve the options eagerly — matching the existing
|
||||||
|
`ClusterOptionsValidator` registration pattern.) Regression tests in
|
||||||
|
`HealthMonitoringOptionsValidatorTests` cover the valid default plus zero/negative
|
||||||
|
intervals and the `CentralOfflineTimeout < OfflineTimeout` case.
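
A minimal sketch of the mechanism relied on here, namely that an `IValidateOptions<T>` registered with `TryAddEnumerable` runs the first time `IOptions<T>.Value` is read. `DemoOptions`, `DemoOptionsValidator`, `ValidationSketch` and the `Demo:ReportInterval` key are hypothetical names for illustration, not the module's types, and the sketch assumes the `Microsoft.Extensions.DependencyInjection` and `Microsoft.Extensions.Options` packages are referenced.

```csharp
using System;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.DependencyInjection.Extensions;
using Microsoft.Extensions.Options;

// Hypothetical options type with a single interval.
public sealed class DemoOptions
{
    public TimeSpan ReportInterval { get; set; } = TimeSpan.FromSeconds(30);
}

// Hypothetical validator: rejects a non-positive interval and names the key.
public sealed class DemoOptionsValidator : IValidateOptions<DemoOptions>
{
    public ValidateOptionsResult Validate(string? name, DemoOptions options) =>
        options.ReportInterval <= TimeSpan.Zero
            ? ValidateOptionsResult.Fail("Demo:ReportInterval must be a positive duration.")
            : ValidateOptionsResult.Success;
}

public static class ValidationSketch
{
    // Returns null when the options resolve, otherwise the validation message.
    public static string? TryResolve(TimeSpan interval)
    {
        var services = new ServiceCollection();
        services.AddOptions<DemoOptions>().Configure(o => o.ReportInterval = interval);

        // Registering twice is harmless: TryAddEnumerable skips a duplicate
        // service/implementation pair, so the validator runs only once.
        for (var i = 0; i < 2; i++)
        {
            services.TryAddEnumerable(
                ServiceDescriptor.Singleton<IValidateOptions<DemoOptions>, DemoOptionsValidator>());
        }

        using var provider = services.BuildServiceProvider();
        try
        {
            // Validation runs here, on first access to Value.
            _ = provider.GetRequiredService<IOptions<DemoOptions>>().Value;
            return null;
        }
        catch (OptionsValidationException ex)
        {
            return ex.Message;
        }
    }
}
```

Calling `ValidationSketch.TryResolve(TimeSpan.Zero)` should surface the key-naming message, whereas a positive interval resolves normally. A consumer that reads `Value` in its constructor therefore fails at construction time rather than later inside a timer.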
|
||||||
|
|
||||||
### HealthMonitoring-015 — Heartbeat-registered site is left with a year-0001 `LastReportReceivedAt`
|
### HealthMonitoring-015 — Heartbeat-registered site is left with a year-0001 `LastReportReceivedAt`
|
||||||
|
|
||||||
@@ -689,7 +712,25 @@ nullable option is safer and matches the existing `LatestReport` treatment.
|
|||||||
|
|
||||||
**Resolution**
|
**Resolution**
|
||||||
|
|
||||||
_Unresolved._
|
_Unresolved — left Open; requires a coordinated cross-module change._ Root cause
|
||||||
|
confirmed against source: `CentralHealthAggregator.MarkHeartbeat` registers an
|
||||||
|
unknown site with `LastReportReceivedAt = default` (`0001-01-01`), and the audit of
|
||||||
|
the single UI reader (`src/ScadaLink.CentralUI/Components/Pages/Monitoring/Health.razor:74`)
|
||||||
|
confirms the bug is live — it passes `state.LastReportReceivedAt` straight into the
|
||||||
|
`TimestampDisplay` component, which would render the year-0001 sentinel verbatim for
|
||||||
|
a heartbeat-only site. The finding's recommended fix — change the field to
|
||||||
|
`DateTimeOffset?` — is the correct one, but it is a **breaking public-API change**:
|
||||||
|
`TimestampDisplay.Value` is a non-nullable `DateTimeOffset` parameter, so making
|
||||||
|
`LastReportReceivedAt` nullable stops `Health.razor` compiling, and the
|
||||||
|
`CentralUI` module is explicitly outside this task's edit scope. The only fix that
|
||||||
|
stays module-local would be to stamp the heartbeat time into `LastReportReceivedAt`,
|
||||||
|
which is semantically dishonest (the field documents "time the latest full report
|
||||||
|
was processed") and would simply move the bug rather than fix it. The correct
|
||||||
|
resolution therefore needs the HealthMonitoring change (`DateTimeOffset?`) **and** a
|
||||||
|
matching `CentralUI` change (a null-checked `TimestampDisplay` or an "awaiting first
|
||||||
|
report" placeholder) applied together — a design decision spanning two modules.
|
||||||
|
**Called out for follow-up: open as a coordinated HealthMonitoring + CentralUI work
|
||||||
|
item.**
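
To illustrate the shape of the coordinated change, here is a sketch of a null-aware formatter. `TimestampSketch.Format` is a hypothetical helper, not the real `TimestampDisplay` component, and the placeholder text is only the wording suggested above.

```csharp
using System;

public static class TimestampSketch
{
    // Hypothetical formatter standing in for a null-checked TimestampDisplay:
    // a missing report is shown as a placeholder instead of a sentinel date.
    public static string Format(DateTimeOffset? lastReportReceivedAt) =>
        lastReportReceivedAt is { } value
            ? value.ToString("u")
            : "awaiting first report";
}
```

Passing `default(DateTimeOffset)` still renders a year-0001 timestamp, which is the current behaviour for a heartbeat-only site; passing `null` renders the placeholder. That difference is why the field type has to change in the same work item as the UI.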
|
||||||
|
|
||||||
### HealthMonitoring-016 — `SiteHealthCollector.CollectReport` reads `DateTimeOffset.UtcNow` directly instead of an injected `TimeProvider`
|
### HealthMonitoring-016 — `SiteHealthCollector.CollectReport` reads `DateTimeOffset.UtcNow` directly instead of an injected `TimeProvider`
|
||||||
|
|
||||||
@@ -697,7 +738,7 @@ _Unresolved._
|
|||||||
|--|--|
|
|--|--|
|
||||||
| Severity | Low |
|
| Severity | Low |
|
||||||
| Category | Code organization & conventions |
|
| Category | Code organization & conventions |
|
||||||
| Status | Open |
|
| Status | Resolved |
|
||||||
| Location | `src/ScadaLink.HealthMonitoring/SiteHealthCollector.cs:151` |
|
| Location | `src/ScadaLink.HealthMonitoring/SiteHealthCollector.cs:151` |
|
||||||
|
|
||||||
**Description**
|
**Description**
|
||||||
@@ -722,4 +763,15 @@ testable and the module is consistent.
|
|||||||
|
|
||||||
**Resolution**
|
**Resolution**
|
||||||
|
|
||||||
_Unresolved._
|
Resolved 2026-05-17. Root cause confirmed — `CollectReport` stamped `ReportTimestamp`
|
||||||
|
from `DateTimeOffset.UtcNow` directly while every other time-dependent class in the
|
||||||
|
module takes an injectable `TimeProvider`. Added an optional `TimeProvider`
|
||||||
|
constructor parameter to `SiteHealthCollector` (defaulting to `TimeProvider.System`,
|
||||||
|
mirroring `CentralHealthAggregator`/`HealthReportSender`/`CentralHealthReportLoop`)
|
||||||
|
and `CollectReport` now derives `ReportTimestamp` from `_timeProvider.GetUtcNow()`.
|
||||||
|
The DI registration (`AddSingleton<ISiteHealthCollector, SiteHealthCollector>`)
|
||||||
|
continues to work via the optional parameter. Regression test
|
||||||
|
`SiteHealthCollectorTests.CollectReport_StampsTimestamp_FromInjectedTimeProvider`
|
||||||
|
asserts the timestamp equals a fixed injected instant exactly (not just a
|
||||||
|
before/after window); it would not compile against the pre-fix parameterless
|
||||||
|
constructor.
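
The injection pattern described here can be sketched in isolation. `FixedTimeProvider` and `StampingCollector` are hypothetical miniatures, since the repository's own `TestTimeProvider` and the full `SiteHealthCollector` are not shown in this diff. The sketch assumes .NET 8 or later, where `System.TimeProvider` is available.

```csharp
using System;

// Hypothetical fixed clock: always reports the instant it was constructed with.
public sealed class FixedTimeProvider : TimeProvider
{
    private readonly DateTimeOffset _now;

    public FixedTimeProvider(DateTimeOffset now) => _now = now;

    public override DateTimeOffset GetUtcNow() => _now;
}

// Hypothetical miniature of a collector with an optional TimeProvider parameter.
public sealed class StampingCollector
{
    private readonly TimeProvider _timeProvider;

    // Falls back to the system clock when no provider is supplied, so existing
    // call sites that pass no argument keep working.
    public StampingCollector(TimeProvider? timeProvider = null) =>
        _timeProvider = timeProvider ?? TimeProvider.System;

    public DateTimeOffset Stamp() => _timeProvider.GetUtcNow();
}
```

A test can then assert `new StampingCollector(new FixedTimeProvider(instant)).Stamp()` equals `instant` exactly, rather than checking that it falls inside a before/after window.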
|
||||||
|
|||||||
@@ -191,9 +191,10 @@ public class CentralHealthAggregator : BackgroundService, ICentralHealthAggregat
|
|||||||
"Central health aggregator started, offline timeout {Timeout}s (central {CentralTimeout}s)",
|
"Central health aggregator started, offline timeout {Timeout}s (central {CentralTimeout}s)",
|
||||||
_options.OfflineTimeout.TotalSeconds, _options.CentralOfflineTimeout.TotalSeconds);
|
_options.OfflineTimeout.TotalSeconds, _options.CentralOfflineTimeout.TotalSeconds);
|
||||||
|
|
||||||
// Check at half the (shorter) offline timeout interval for timely detection
|
// Check at half the shorter of the two offline timeouts so detection is
|
||||||
var checkInterval = TimeSpan.FromMilliseconds(_options.OfflineTimeout.TotalMilliseconds / 2);
|
// timely for whichever site class (real or "central") has the tighter
|
||||||
using var timer = new PeriodicTimer(checkInterval);
|
// window — see ComputeCheckInterval.
|
||||||
|
using var timer = new PeriodicTimer(ComputeCheckInterval(_options));
|
||||||
|
|
||||||
while (await timer.WaitForNextTickAsync(stoppingToken).ConfigureAwait(false))
|
while (await timer.WaitForNextTickAsync(stoppingToken).ConfigureAwait(false))
|
||||||
{
|
{
|
||||||
@@ -201,6 +202,24 @@ public class CentralHealthAggregator : BackgroundService, ICentralHealthAggregat
|
|||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
|
/// <summary>
|
||||||
|
/// Computes the offline-check timer cadence: half of the <em>shorter</em> of
|
||||||
|
/// <see cref="HealthMonitoringOptions.OfflineTimeout"/> and
|
||||||
|
/// <see cref="HealthMonitoringOptions.CentralOfflineTimeout"/>. Deriving it
|
||||||
|
/// from the shorter timeout guarantees that whichever site class has the
|
||||||
|
/// tighter window is still polled at least twice within it — so if an
|
||||||
|
/// operator configures <c>CentralOfflineTimeout</c> smaller than
|
||||||
|
/// <c>OfflineTimeout</c>, central offline detection is not delayed by up to a
|
||||||
|
/// full <c>OfflineTimeout / 2</c>.
|
||||||
|
/// </summary>
|
||||||
|
internal static TimeSpan ComputeCheckInterval(HealthMonitoringOptions options)
|
||||||
|
{
|
||||||
|
var shorter = options.OfflineTimeout < options.CentralOfflineTimeout
|
||||||
|
? options.OfflineTimeout
|
||||||
|
: options.CentralOfflineTimeout;
|
||||||
|
return TimeSpan.FromMilliseconds(shorter.TotalMilliseconds / 2);
|
||||||
|
}
|
||||||
|
|
||||||
internal void CheckForOfflineSites()
|
internal void CheckForOfflineSites()
|
||||||
{
|
{
|
||||||
var now = _timeProvider.GetUtcNow();
|
var now = _timeProvider.GetUtcNow();
|
||||||
|
|||||||
@@ -0,0 +1,59 @@
|
|||||||
|
using Microsoft.Extensions.Options;
|
||||||
|
|
||||||
|
namespace ScadaLink.HealthMonitoring;
|
||||||
|
|
||||||
|
/// <summary>
|
||||||
|
/// HealthMonitoring-014: validates <see cref="HealthMonitoringOptions"/> at
|
||||||
|
/// startup. The interval values are fed straight into <c>new PeriodicTimer(...)</c>
|
||||||
|
/// (and into a division for the offline-check cadence); a zero or negative value
|
||||||
|
/// makes <see cref="PeriodicTimer"/>'s constructor throw
|
||||||
|
/// <see cref="ArgumentOutOfRangeException"/>, crashing the
|
||||||
|
/// <see cref="HealthReportSender"/> / <see cref="CentralHealthReportLoop"/> /
|
||||||
|
/// <see cref="CentralHealthAggregator"/> hosted service with an opaque exception
|
||||||
|
/// that does not name the offending config key. Registered with
|
||||||
|
/// <c>TryAddEnumerable</c> so a bad <c>ScadaLink:HealthMonitoring</c> section
|
||||||
|
/// fails fast at boot with a clear, key-naming message.
|
||||||
|
/// </summary>
|
||||||
|
public sealed class HealthMonitoringOptionsValidator : IValidateOptions<HealthMonitoringOptions>
|
||||||
|
{
|
||||||
|
public ValidateOptionsResult Validate(string? name, HealthMonitoringOptions options)
|
||||||
|
{
|
||||||
|
var failures = new List<string>();
|
||||||
|
|
||||||
|
if (options.ReportInterval <= TimeSpan.Zero)
|
||||||
|
{
|
||||||
|
failures.Add(
|
||||||
|
$"ScadaLink:HealthMonitoring:ReportInterval must be a positive duration " +
|
||||||
|
$"(was {options.ReportInterval}); it is used directly as a PeriodicTimer period.");
|
||||||
|
}
|
||||||
|
|
||||||
|
if (options.OfflineTimeout <= TimeSpan.Zero)
|
||||||
|
{
|
||||||
|
failures.Add(
|
||||||
|
$"ScadaLink:HealthMonitoring:OfflineTimeout must be a positive duration " +
|
||||||
|
$"(was {options.OfflineTimeout}); it drives the offline-check PeriodicTimer cadence.");
|
||||||
|
}
|
||||||
|
|
||||||
|
if (options.CentralOfflineTimeout <= TimeSpan.Zero)
|
||||||
|
{
|
||||||
|
failures.Add(
|
||||||
|
$"ScadaLink:HealthMonitoring:CentralOfflineTimeout must be a positive duration " +
|
||||||
|
$"(was {options.CentralOfflineTimeout}).");
|
||||||
|
}
|
||||||
|
|
||||||
|
if (options.OfflineTimeout > TimeSpan.Zero
|
||||||
|
&& options.CentralOfflineTimeout > TimeSpan.Zero
|
||||||
|
&& options.CentralOfflineTimeout < options.OfflineTimeout)
|
||||||
|
{
|
||||||
|
failures.Add(
|
||||||
|
$"ScadaLink:HealthMonitoring:CentralOfflineTimeout ({options.CentralOfflineTimeout}) " +
|
||||||
|
$"must be >= OfflineTimeout ({options.OfflineTimeout}): the synthetic 'central' site has " +
|
||||||
|
"no heartbeat source and is fed only by the slower self-report loop, so it needs at " +
|
||||||
|
"least as much offline grace as a real site.");
|
||||||
|
}
|
||||||
|
|
||||||
|
return failures.Count > 0
|
||||||
|
? ValidateOptionsResult.Fail(failures)
|
||||||
|
: ValidateOptionsResult.Success;
|
||||||
|
}
|
||||||
|
}
|
||||||
@@ -1,4 +1,6 @@
|
|||||||
using Microsoft.Extensions.DependencyInjection;
|
using Microsoft.Extensions.DependencyInjection;
|
||||||
|
using Microsoft.Extensions.DependencyInjection.Extensions;
|
||||||
|
using Microsoft.Extensions.Options;
|
||||||
|
|
||||||
namespace ScadaLink.HealthMonitoring;
|
namespace ScadaLink.HealthMonitoring;
|
||||||
|
|
||||||
@@ -10,6 +12,7 @@ public static class ServiceCollectionExtensions
|
|||||||
/// </summary>
|
/// </summary>
|
||||||
public static IServiceCollection AddSiteHealthMonitoring(this IServiceCollection services)
|
public static IServiceCollection AddSiteHealthMonitoring(this IServiceCollection services)
|
||||||
{
|
{
|
||||||
|
AddOptionsValidation(services);
|
||||||
services.AddSingleton<ISiteHealthCollector, SiteHealthCollector>();
|
services.AddSingleton<ISiteHealthCollector, SiteHealthCollector>();
|
||||||
services.AddHostedService<HealthReportSender>();
|
services.AddHostedService<HealthReportSender>();
|
||||||
return services;
|
return services;
|
||||||
@@ -21,6 +24,7 @@ public static class ServiceCollectionExtensions
|
|||||||
/// </summary>
|
/// </summary>
|
||||||
public static IServiceCollection AddHealthMonitoring(this IServiceCollection services)
|
public static IServiceCollection AddHealthMonitoring(this IServiceCollection services)
|
||||||
{
|
{
|
||||||
|
AddOptionsValidation(services);
|
||||||
services.AddSingleton<ISiteHealthCollector, SiteHealthCollector>();
|
services.AddSingleton<ISiteHealthCollector, SiteHealthCollector>();
|
||||||
return services;
|
return services;
|
||||||
}
|
}
|
||||||
@@ -32,10 +36,27 @@ public static class ServiceCollectionExtensions
|
|||||||
/// </summary>
|
/// </summary>
|
||||||
public static IServiceCollection AddCentralHealthAggregation(this IServiceCollection services)
|
public static IServiceCollection AddCentralHealthAggregation(this IServiceCollection services)
|
||||||
{
|
{
|
||||||
|
AddOptionsValidation(services);
|
||||||
services.AddSingleton<CentralHealthAggregator>();
|
services.AddSingleton<CentralHealthAggregator>();
|
||||||
services.AddSingleton<ICentralHealthAggregator>(sp => sp.GetRequiredService<CentralHealthAggregator>());
|
services.AddSingleton<ICentralHealthAggregator>(sp => sp.GetRequiredService<CentralHealthAggregator>());
|
||||||
services.AddHostedService(sp => sp.GetRequiredService<CentralHealthAggregator>());
|
services.AddHostedService(sp => sp.GetRequiredService<CentralHealthAggregator>());
|
||||||
services.AddHostedService<CentralHealthReportLoop>();
|
services.AddHostedService<CentralHealthReportLoop>();
|
||||||
return services;
|
return services;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
/// <summary>
|
||||||
|
/// HealthMonitoring-014: register the <see cref="HealthMonitoringOptionsValidator"/>
|
||||||
|
/// so a misconfigured <c>ScadaLink:HealthMonitoring</c> section (zero/negative
|
||||||
|
/// intervals, or a <c>CentralOfflineTimeout</c> shorter than
|
||||||
|
/// <c>OfflineTimeout</c>) is rejected with a clear, key-naming message when the
|
||||||
|
/// hosted services resolve their options at startup — rather than crashing
|
||||||
|
/// later inside a <see cref="PeriodicTimer"/> constructor with an opaque
|
||||||
|
/// <see cref="ArgumentOutOfRangeException"/>. Idempotent so it is safe when
|
||||||
|
/// more than one of the registration methods above is called.
|
||||||
|
/// </summary>
|
||||||
|
private static void AddOptionsValidation(IServiceCollection services)
|
||||||
|
{
|
||||||
|
services.TryAddEnumerable(
|
||||||
|
ServiceDescriptor.Singleton<IValidateOptions<HealthMonitoringOptions>, HealthMonitoringOptionsValidator>());
|
||||||
|
}
|
||||||
}
|
}
|
||||||
|
|||||||
@@ -23,6 +23,18 @@ public class SiteHealthCollector : ISiteHealthCollector
|
|||||||
private volatile string _nodeHostname = "";
|
private volatile string _nodeHostname = "";
|
||||||
private volatile IReadOnlyList<Commons.Messages.Health.NodeStatus>? _clusterNodes;
|
private volatile IReadOnlyList<Commons.Messages.Health.NodeStatus>? _clusterNodes;
|
||||||
private volatile bool _isActiveNode;
|
private volatile bool _isActiveNode;
|
||||||
|
private readonly TimeProvider _timeProvider;
|
||||||
|
|
||||||
|
/// <summary>
|
||||||
|
/// Creates a collector. The <paramref name="timeProvider"/> stamps each
|
||||||
|
/// report's timestamp; it defaults to <see cref="TimeProvider.System"/> and
|
||||||
|
/// is injectable so the report timestamp is deterministically testable —
|
||||||
|
/// consistent with the rest of the module's time-dependent classes.
|
||||||
|
/// </summary>
|
||||||
|
public SiteHealthCollector(TimeProvider? timeProvider = null)
|
||||||
|
{
|
||||||
|
_timeProvider = timeProvider ?? TimeProvider.System;
|
||||||
|
}
|
||||||
|
|
||||||
/// <summary>
|
/// <summary>
|
||||||
/// Increment the script error counter. Covers unhandled exceptions,
|
/// Increment the script error counter. Covers unhandled exceptions,
|
||||||
@@ -148,7 +160,7 @@ public class SiteHealthCollector : ISiteHealthCollector
|
|||||||
return new SiteHealthReport(
|
return new SiteHealthReport(
|
||||||
SiteId: siteId,
|
SiteId: siteId,
|
||||||
SequenceNumber: 0, // Caller (HealthReportSender) assigns the sequence number
|
SequenceNumber: 0, // Caller (HealthReportSender) assigns the sequence number
|
||||||
ReportTimestamp: DateTimeOffset.UtcNow,
|
ReportTimestamp: _timeProvider.GetUtcNow(),
|
||||||
DataConnectionStatuses: connectionStatuses,
|
DataConnectionStatuses: connectionStatuses,
|
||||||
TagResolutionCounts: tagResolution,
|
TagResolutionCounts: tagResolution,
|
||||||
ScriptErrorCount: scriptErrors,
|
ScriptErrorCount: scriptErrors,
|
||||||
|
|||||||
@@ -307,6 +307,36 @@ public class CentralHealthAggregatorTests
|
|||||||
Assert.False(_aggregator.GetSiteState(CentralHealthReportLoop.CentralSiteId)!.IsOnline);
|
Assert.False(_aggregator.GetSiteState(CentralHealthReportLoop.CentralSiteId)!.IsOnline);
|
||||||
}
|
}
|
||||||
|
|
||||||
|
/// <summary>
|
||||||
|
/// HealthMonitoring-013 regression: the offline-check cadence must be derived
|
||||||
|
/// from the *shorter* of <see cref="HealthMonitoringOptions.OfflineTimeout"/>
|
||||||
|
/// and <see cref="HealthMonitoringOptions.CentralOfflineTimeout"/>, so that if
|
||||||
|
/// an operator configures <c>CentralOfflineTimeout</c> smaller than
|
||||||
|
/// <c>OfflineTimeout</c>, central offline detection is still timely instead of
|
||||||
|
/// being delayed by up to a full <c>OfflineTimeout / 2</c>.
|
||||||
|
/// </summary>
|
||||||
|
[Fact]
|
||||||
|
public void CheckInterval_IsHalfTheShorterTimeout()
|
||||||
|
{
|
||||||
|
// Default: OfflineTimeout (60s) is the shorter of the two.
|
||||||
|
Assert.Equal(
|
||||||
|
TimeSpan.FromSeconds(30),
|
||||||
|
CentralHealthAggregator.ComputeCheckInterval(new HealthMonitoringOptions
|
||||||
|
{
|
||||||
|
OfflineTimeout = TimeSpan.FromSeconds(60),
|
||||||
|
CentralOfflineTimeout = TimeSpan.FromMinutes(3)
|
||||||
|
}));
|
||||||
|
|
||||||
|
// Operator configures CentralOfflineTimeout shorter — cadence must adapt.
|
||||||
|
Assert.Equal(
|
||||||
|
TimeSpan.FromSeconds(10),
|
||||||
|
CentralHealthAggregator.ComputeCheckInterval(new HealthMonitoringOptions
|
||||||
|
{
|
||||||
|
OfflineTimeout = TimeSpan.FromSeconds(60),
|
||||||
|
CentralOfflineTimeout = TimeSpan.FromSeconds(20)
|
||||||
|
}));
|
||||||
|
}
|
||||||
|
|
||||||
[Fact]
|
[Fact]
|
||||||
public void SequenceNumberReset_RejectedUntilExceedsPrevMax()
|
public void SequenceNumberReset_RejectedUntilExceedsPrevMax()
|
||||||
{
|
{
|
||||||
|
|||||||
@@ -0,0 +1,73 @@
|
|||||||
|
using Microsoft.Extensions.Options;
|
||||||
|
|
||||||
|
namespace ScadaLink.HealthMonitoring.Tests;
|
||||||
|
|
||||||
|
/// <summary>
|
||||||
|
/// HealthMonitoring-014 regression: <see cref="HealthMonitoringOptions"/> intervals
|
||||||
|
/// are fed straight into <c>new PeriodicTimer(...)</c>, which throws
|
||||||
|
/// <see cref="ArgumentOutOfRangeException"/> for a zero/negative period. A
|
||||||
|
/// misconfigured <c>appsettings.json</c> must be rejected by an
|
||||||
|
/// <see cref="IValidateOptions{TOptions}"/> with a clear, key-naming message
|
||||||
|
/// rather than crashing the hosted service with an opaque exception.
|
||||||
|
/// </summary>
|
||||||
|
public class HealthMonitoringOptionsValidatorTests
|
||||||
|
{
|
||||||
|
private static ValidateOptionsResult Validate(HealthMonitoringOptions options) =>
|
||||||
|
new HealthMonitoringOptionsValidator().Validate(Options.DefaultName, options);
|
||||||
|
|
||||||
|
[Fact]
|
||||||
|
public void DefaultOptions_AreValid()
|
||||||
|
{
|
||||||
|
var result = Validate(new HealthMonitoringOptions());
|
||||||
|
Assert.True(result.Succeeded, result.FailureMessage);
|
||||||
|
}
|
||||||
|
|
||||||
|
[Fact]
|
||||||
|
public void ZeroReportInterval_IsRejected()
|
||||||
|
{
|
||||||
|
var result = Validate(new HealthMonitoringOptions { ReportInterval = TimeSpan.Zero });
|
||||||
|
|
||||||
|
Assert.True(result.Failed);
|
||||||
|
Assert.Contains("ReportInterval", result.FailureMessage);
|
||||||
|
}
|
||||||
|
|
||||||
|
[Fact]
|
||||||
|
public void NegativeReportInterval_IsRejected()
|
||||||
|
{
|
||||||
|
var result = Validate(new HealthMonitoringOptions { ReportInterval = TimeSpan.FromSeconds(-1) });
|
||||||
|
|
||||||
|
Assert.True(result.Failed);
|
||||||
|
Assert.Contains("ReportInterval", result.FailureMessage);
|
||||||
|
}
|
||||||
|
|
||||||
|
[Fact]
|
||||||
|
public void ZeroOfflineTimeout_IsRejected()
|
||||||
|
{
|
||||||
|
var result = Validate(new HealthMonitoringOptions { OfflineTimeout = TimeSpan.Zero });
|
||||||
|
|
||||||
|
Assert.True(result.Failed);
|
||||||
|
Assert.Contains("OfflineTimeout", result.FailureMessage);
|
||||||
|
}
|
||||||
|
|
||||||
|
[Fact]
|
||||||
|
public void ZeroCentralOfflineTimeout_IsRejected()
|
||||||
|
{
|
||||||
|
var result = Validate(new HealthMonitoringOptions { CentralOfflineTimeout = TimeSpan.Zero });
|
||||||
|
|
||||||
|
Assert.True(result.Failed);
|
||||||
|
Assert.Contains("CentralOfflineTimeout", result.FailureMessage);
|
||||||
|
}
|
||||||
|
|
||||||
|
[Fact]
|
||||||
|
public void CentralOfflineTimeout_ShorterThanOfflineTimeout_IsRejected()
|
||||||
|
{
|
||||||
|
var result = Validate(new HealthMonitoringOptions
|
||||||
|
{
|
||||||
|
OfflineTimeout = TimeSpan.FromSeconds(60),
|
||||||
|
CentralOfflineTimeout = TimeSpan.FromSeconds(30)
|
||||||
|
});
|
||||||
|
|
||||||
|
Assert.True(result.Failed);
|
||||||
|
Assert.Contains("CentralOfflineTimeout", result.FailureMessage);
|
||||||
|
}
|
||||||
|
}
|
||||||
@@ -131,6 +131,24 @@ public class SiteHealthCollectorTests
|
|||||||
Assert.InRange(report.ReportTimestamp, before, after);
|
Assert.InRange(report.ReportTimestamp, before, after);
|
||||||
}
|
}
|
||||||
|
|
||||||
|
/// <summary>
|
||||||
|
/// HealthMonitoring-016 regression: <see cref="SiteHealthCollector.CollectReport"/>
|
||||||
|
/// must stamp <c>ReportTimestamp</c> from an injected <see cref="TimeProvider"/>
|
||||||
|
/// (consistent with the rest of the module), not directly from
|
||||||
|
/// <c>DateTimeOffset.UtcNow</c>, so the report timestamp is deterministically
|
||||||
|
/// testable against a known instant.
|
||||||
|
/// </summary>
|
||||||
|
[Fact]
|
||||||
|
public void CollectReport_StampsTimestamp_FromInjectedTimeProvider()
|
||||||
|
{
|
||||||
|
var fixedInstant = new DateTimeOffset(2026, 5, 17, 9, 30, 0, TimeSpan.Zero);
|
||||||
|
var collector = new SiteHealthCollector(new TestTimeProvider(fixedInstant));
|
||||||
|
|
||||||
|
var report = collector.CollectReport("site-1");
|
||||||
|
|
||||||
|
Assert.Equal(fixedInstant, report.ReportTimestamp);
|
||||||
|
}
|
||||||
|
|
||||||
[Fact]
|
[Fact]
|
||||||
public void CollectReport_SequenceNumberIsZero_CallerAssignsIt()
|
public void CollectReport_SequenceNumberIsZero_CallerAssignsIt()
|
||||||
{
|
{
|
||||||
|
|||||||