fix(host): resolve Host-012..015 — consume DownIfAlone in HOCON, sub-second timing precision, config-driven Serilog sinks, transient-only startup retry
This commit is contained in:
@@ -8,7 +8,7 @@
|
||||
| Last reviewed | 2026-05-17 |
|
||||
| Reviewer | claude-agent |
|
||||
| Commit reviewed | `39d737e` |
|
||||
| Open findings | 4 |
|
||||
| Open findings | 0 |
|
||||
|
||||
## Summary
|
||||
|
||||
@@ -585,7 +585,7 @@ key). Full Host suite green (175 passed).
|
||||
|--|--|
|
||||
| Severity | Medium |
|
||||
| Category | Design-document adherence |
|
||||
| Status | Open |
|
||||
| Status | Resolved |
|
||||
| Location | `src/ScadaLink.Host/Actors/AkkaHostedService.cs:146-148` |
|
||||
|
||||
**Description**
|
||||
@@ -618,7 +618,15 @@ declared-but-dead.
|
||||
|
||||
**Resolution**
|
||||
|
||||
_Unresolved._
|
||||
Resolved 2026-05-17 (commit `<pending>`). Root cause confirmed against
|
||||
`AkkaHostedService.cs`: `BuildHocon` emitted `keep-oldest { down-if-alone = on }` as
|
||||
a literal, and `ClusterOptions.DownIfAlone` was referenced nowhere outside its own
|
||||
declaration, so the bound option was dead. Fix: `BuildHocon` now emits
|
||||
`down-if-alone = {(clusterOptions.DownIfAlone ? "on" : "off")}`, consuming the bound
|
||||
property. Regression tests in `HoconBuilderTests`: `BuildHocon_DownIfAloneTrue_EmitsOn`
|
||||
and `BuildHocon_DownIfAloneFalse_EmitsOff` parse the resulting HOCON and assert the
|
||||
`keep-oldest.down-if-alone` token tracks the property; the `false` case failed before
|
||||
the fix (token was still `on`) and passes after. Full Host suite green (182 passed).
|
||||
|
||||
### Host-013 — `:F0` rounding of cluster timing values silently degrades sub-second configuration
|
||||
|
||||
@@ -626,7 +634,7 @@ _Unresolved._
|
||||
|--|--|
|
||||
| Severity | Low |
|
||||
| Category | Correctness & logic bugs |
|
||||
| Status | Open |
|
||||
| Status | Resolved |
|
||||
| Location | `src/ScadaLink.Host/Actors/AkkaHostedService.cs:135-136,145,151-152` |
|
||||
|
||||
**Description**
|
||||
@@ -653,7 +661,17 @@ number of seconds and fail fast otherwise. Either way, do not silently round.
|
||||
|
||||
**Resolution**
|
||||
|
||||
_Unresolved._
|
||||
Resolved 2026-05-17 (commit `<pending>`). Root cause confirmed: `BuildHocon`
|
||||
rendered every duration via `TotalSeconds:F0`, rounding sub-second values — a 750ms
|
||||
heartbeat became `1s` and a 500ms value would collapse to a degenerate `0s`. Fix:
|
||||
durations are now rendered by a new `DurationHocon` helper that emits whole
|
||||
milliseconds with an `ms` suffix (Akka's HOCON parser accepts it), so sub-second
|
||||
configuration survives exactly; `BuildHocon` takes the transport heartbeat/failure
|
||||
as `TimeSpan` rather than pre-rounded `double` seconds. Regression test
|
||||
`HoconBuilderTests.BuildHocon_SubSecondTimings_PreservedWithoutRounding` configures
|
||||
750ms/2600ms/500ms/2.5s/7.5s timings and asserts each round-trips through the parsed
|
||||
HOCON unchanged; it failed before the fix (750ms parsed as `00:00:01`) and passes
|
||||
after. Full Host suite green (182 passed).
|
||||
|
||||
### Host-014 — Serilog sinks are hard-coded in `Program.cs`, not configuration-driven (REQ-HOST-8)
|
||||
|
||||
@@ -661,7 +679,7 @@ _Unresolved._
|
||||
|--|--|
|
||||
| Severity | Low |
|
||||
| Category | Design-document adherence |
|
||||
| Status | Open |
|
||||
| Status | Resolved |
|
||||
| Location | `src/ScadaLink.Host/Program.cs:43-48`, `src/ScadaLink.Host/appsettings.json:1-7` |
|
||||
|
||||
**Description**
|
||||
@@ -690,7 +708,22 @@ state the sinks are intentionally code-defined — but the design doc currently
|
||||
|
||||
**Resolution**
|
||||
|
||||
_Unresolved._
|
||||
Resolved 2026-05-17 (commit `<pending>`). Root cause confirmed: `LoggerConfigurationFactory.Build`
|
||||
called `ReadFrom.Configuration` but neither `appsettings.json` nor either
|
||||
role-specific file contained a `Serilog` section, so the two sinks that actually ran
|
||||
were hard-coded `.WriteTo.Console(...)` / `.WriteTo.File(...)` calls in `Program.cs`
|
||||
— uneditable without recompiling, contrary to REQ-HOST-8. Fix: added a `Serilog`
|
||||
section to `appsettings.json` with `WriteTo` entries for the Console and File sinks
|
||||
(carrying the output template, file path `logs/scadalink-.log` and `Day` rolling
|
||||
interval as args) and removed the hard-coded `.WriteTo` calls from `Program.cs`; the
|
||||
factory's `ReadFrom.Configuration` (backed by the transitive
|
||||
`Serilog.Settings.Configuration` package) now drives the sinks. Regression tests in
|
||||
new `SerilogSinkConfigTests`: `ShippedAppSettings_HasSerilogSection_WithConsoleAndFileSinks`
|
||||
asserts the shipped `appsettings.json` declares both sinks in a `Serilog:WriteTo`
|
||||
array (fails against the pre-fix file, which had no `Serilog` section), and
|
||||
`LoggerConfigurationFactory_AppliesConfiguredFileSink` proves a configuration-supplied
|
||||
File sink reaches the built logger and writes to the configured path. Full Host suite
|
||||
green (182 passed).
|
||||
|
||||
### Host-015 — `StartupRetry` retries on every exception type, including permanent failures
|
||||
|
||||
@@ -698,7 +731,7 @@ _Unresolved._
|
||||
|--|--|
|
||||
| Severity | Low |
|
||||
| Category | Error handling & resilience |
|
||||
| Status | Open |
|
||||
| Status | Resolved |
|
||||
| Location | `src/ScadaLink.Host/StartupRetry.cs:36-45` |
|
||||
|
||||
**Description**
|
||||
@@ -727,4 +760,20 @@ single attempt.
|
||||
|
||||
**Resolution**
|
||||
|
||||
_Unresolved._
|
||||
Resolved 2026-05-17 (commit `<pending>`). Root cause confirmed: `ExecuteWithRetryAsync`
|
||||
caught `Exception` with only the `when (attempt < maxAttempts)` guard, so it retried
|
||||
any exception — including a permanent schema-version mismatch raised by the
|
||||
migration call site, delaying the inevitable fatal exit by ~2 minutes. Fix: added an
|
||||
optional `Func<Exception, bool> isTransient` parameter (default treats all
|
||||
exceptions as transient, preserving prior behaviour for unclassified callers); the
|
||||
catch guard is now `when (attempt < maxAttempts && isTransient(ex))`, so a
|
||||
non-transient fault propagates after a single attempt. Added a
|
||||
`StartupRetry.IsTransientDatabaseFault` classifier that returns `true` only for
|
||||
connection-class faults (`TimeoutException`, `SocketException`, `SqlException` /
|
||||
`DbException` by type-name match) and `false` for everything else — notably
|
||||
schema-validation `InvalidOperationException`s — and the `Program.cs` migration call
|
||||
site now passes it. Regression tests in `StartupRetryTests`:
|
||||
`ExecuteWithRetry_NonTransientFailure_RethrowsAfterSingleAttempt` (runs exactly once
|
||||
when `isTransient` returns false) and `ExecuteWithRetry_TransientThenPermanent_StopsAtPermanent`
|
||||
(retries a `TimeoutException` then stops at a permanent `InvalidOperationException`).
|
||||
Full Host suite green (182 passed).
|
||||
|
||||
@@ -67,7 +67,8 @@ public class AkkaHostedService : IHostedService
|
||||
// interpolated value, so a hostname, seed node or strategy containing a quote,
|
||||
// backslash or whitespace cannot corrupt the configuration document.
|
||||
var hocon = BuildHocon(_nodeOptions, _clusterOptions, roles,
|
||||
transportHeartbeatSec, transportFailureSec);
|
||||
_communicationOptions.TransportHeartbeatInterval,
|
||||
_communicationOptions.TransportFailureThreshold);
|
||||
|
||||
var config = ConfigurationFactory.ParseString(hocon);
|
||||
_actorSystem = ActorSystem.Create("scadalink", config);
|
||||
@@ -106,13 +107,21 @@ public class AkkaHostedService : IHostedService
|
||||
/// routed through <see cref="QuoteHocon"/> (string values) so a hostname,
|
||||
/// seed-node URI, role or split-brain strategy containing a quote, backslash or
|
||||
/// whitespace cannot corrupt the document or be silently misparsed (Host-006).
|
||||
///
|
||||
/// Host-012: the <c>keep-oldest down-if-alone</c> flag is emitted from
|
||||
/// <see cref="ClusterOptions.DownIfAlone"/> rather than hard-coded, so the bound
|
||||
/// configuration value is actually consumed.
|
||||
///
|
||||
/// Host-013: every duration is rendered via <see cref="DurationHocon"/> in
|
||||
/// milliseconds, so sub-second cluster timing values (e.g. a 750ms heartbeat) are
|
||||
/// preserved exactly instead of being rounded to whole seconds.
|
||||
/// </summary>
|
||||
public static string BuildHocon(
|
||||
NodeOptions nodeOptions,
|
||||
ClusterOptions clusterOptions,
|
||||
IEnumerable<string> roles,
|
||||
double transportHeartbeatSec,
|
||||
double transportFailureSec)
|
||||
TimeSpan transportHeartbeat,
|
||||
TimeSpan transportFailure)
|
||||
{
|
||||
var seedNodesStr = string.Join(",",
|
||||
clusterOptions.SeedNodes.Select(QuoteHocon));
|
||||
@@ -132,8 +141,8 @@ akka {{
|
||||
port = {nodeOptions.RemotingPort}
|
||||
}}
|
||||
transport-failure-detector {{
|
||||
heartbeat-interval = {transportHeartbeatSec:F0}s
|
||||
acceptable-heartbeat-pause = {transportFailureSec:F0}s
|
||||
heartbeat-interval = {DurationHocon(transportHeartbeat)}
|
||||
acceptable-heartbeat-pause = {DurationHocon(transportFailure)}
|
||||
}}
|
||||
}}
|
||||
cluster {{
|
||||
@@ -142,14 +151,14 @@ akka {{
|
||||
min-nr-of-members = {clusterOptions.MinNrOfMembers}
|
||||
split-brain-resolver {{
|
||||
active-strategy = {QuoteHocon(clusterOptions.SplitBrainResolverStrategy)}
|
||||
stable-after = {clusterOptions.StableAfter.TotalSeconds:F0}s
|
||||
stable-after = {DurationHocon(clusterOptions.StableAfter)}
|
||||
keep-oldest {{
|
||||
down-if-alone = on
|
||||
down-if-alone = {(clusterOptions.DownIfAlone ? "on" : "off")}
|
||||
}}
|
||||
}}
|
||||
failure-detector {{
|
||||
heartbeat-interval = {clusterOptions.HeartbeatInterval.TotalSeconds:F0}s
|
||||
acceptable-heartbeat-pause = {clusterOptions.FailureDetectionThreshold.TotalSeconds:F0}s
|
||||
heartbeat-interval = {DurationHocon(clusterOptions.HeartbeatInterval)}
|
||||
acceptable-heartbeat-pause = {DurationHocon(clusterOptions.FailureDetectionThreshold)}
|
||||
}}
|
||||
run-coordinated-shutdown-when-down = on
|
||||
}}
|
||||
@@ -159,6 +168,18 @@ akka {{
|
||||
}}";
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// Renders a <see cref="TimeSpan"/> as a HOCON duration in milliseconds. Akka's
|
||||
/// HOCON parser accepts a <c>ms</c> suffix, so emitting whole milliseconds
|
||||
/// preserves sub-second configuration exactly — a 750ms heartbeat stays 750ms
|
||||
/// rather than being rounded to <c>1s</c> (or, for sub-half-second values,
|
||||
/// silently collapsing to a degenerate <c>0s</c>) — Host-013.
|
||||
/// </summary>
|
||||
private static string DurationHocon(TimeSpan duration)
|
||||
{
|
||||
return $"{(long)Math.Round(duration.TotalMilliseconds)}ms";
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// Renders a value as a HOCON double-quoted string, escaping backslashes and
|
||||
/// double quotes so the resulting token cannot break out of its string literal.
|
||||
|
||||
@@ -8,11 +8,13 @@ namespace ScadaLink.Host;
|
||||
///
|
||||
/// REQ-HOST-8 / Host-011: the configured minimum level comes from
|
||||
/// <c>ScadaLink:Logging:MinimumLevel</c> (bound to <see cref="LoggingOptions"/>) so an
|
||||
/// operator editing that key changes the effective log level. The standard
|
||||
/// <c>Serilog</c> configuration section is still read (via
|
||||
/// <see cref="Serilog.Configuration.ConfigurationLoggerConfigurationExtensions"/>)
|
||||
/// for sink/override customisation; the explicit <c>MinimumLevel.Is</c> below pins
|
||||
/// the floor from <see cref="LoggingOptions"/>.
|
||||
/// operator editing that key changes the effective log level.
|
||||
///
|
||||
/// REQ-HOST-8 / Host-014: the console and file sinks are read from the standard
|
||||
/// <c>Serilog</c> configuration section via <c>ReadFrom.Configuration</c> — the sink
|
||||
/// set, console output template, file path and rolling interval are all
|
||||
/// configuration-driven (defined in <c>appsettings.json</c>), not hard-coded. The
|
||||
/// explicit <c>MinimumLevel.Is</c> below pins the floor from <see cref="LoggingOptions"/>.
|
||||
/// </summary>
|
||||
public static class LoggerConfigurationFactory
|
||||
{
|
||||
|
||||
@@ -40,11 +40,12 @@ var siteId = configuration["ScadaLink:Node:SiteId"] ?? "central";
|
||||
|
||||
// WP-14: Serilog structured logging.
|
||||
// Host-011: minimum level is driven by ScadaLink:Logging:MinimumLevel (LoggingOptions).
|
||||
// Host-014: console and file sinks are defined in the `Serilog` configuration
|
||||
// section (appsettings.json) and applied via ReadFrom.Configuration inside the
|
||||
// factory — the sink set, output template, file path and rolling interval are all
|
||||
// configuration-driven per REQ-HOST-8, not hard-coded here.
|
||||
Log.Logger = ScadaLink.Host.LoggerConfigurationFactory
|
||||
.Build(configuration, nodeRole, siteId, nodeHostname)
|
||||
.WriteTo.Console(outputTemplate:
|
||||
"[{Timestamp:HH:mm:ss} {Level:u3}] [{NodeRole}/{NodeHostname}] {Message:lj}{NewLine}{Exception}")
|
||||
.WriteTo.File("logs/scadalink-.log", rollingInterval: Serilog.RollingInterval.Day)
|
||||
.CreateLogger();
|
||||
|
||||
try
|
||||
@@ -121,6 +122,8 @@ try
|
||||
// Host-010: tolerate a database that is briefly unreachable at boot
|
||||
// (e.g. app and DB containers starting together) with a bounded
|
||||
// exponential backoff before failing fatally.
|
||||
// Host-015: only connection-class (transient) faults are retried — a
|
||||
// schema-version mismatch is permanent and must fail fast on attempt 1.
|
||||
await StartupRetry.ExecuteWithRetryAsync(
|
||||
"database-migration",
|
||||
async () =>
|
||||
@@ -131,7 +134,8 @@ try
|
||||
},
|
||||
maxAttempts: 8,
|
||||
initialDelay: TimeSpan.FromSeconds(2),
|
||||
migrationLogger);
|
||||
migrationLogger,
|
||||
isTransient: StartupRetry.IsTransientDatabaseFault);
|
||||
}
|
||||
|
||||
// Middleware pipeline
|
||||
|
||||
@@ -9,6 +9,12 @@ namespace ScadaLink.Host;
|
||||
/// unreachable. Rather than crashing the process on the first connection failure,
|
||||
/// the migration step is wrapped in this bounded exponential backoff: it tolerates a
|
||||
/// short outage and only fails fatally once attempts are exhausted.
|
||||
///
|
||||
/// Host-015: only <em>transient</em> faults are retried. The optional
|
||||
/// <c>isTransient</c> predicate classifies each exception; a permanent failure
|
||||
/// (e.g. a database schema-version mismatch — which no amount of waiting can fix)
|
||||
/// is rethrown immediately rather than being retried for minutes before the
|
||||
/// inevitable fatal exit.
|
||||
/// </summary>
|
||||
public static class StartupRetry
|
||||
{
|
||||
@@ -18,8 +24,13 @@ public static class StartupRetry
|
||||
int maxAttempts,
|
||||
TimeSpan initialDelay,
|
||||
ILogger logger,
|
||||
Func<Exception, bool>? isTransient = null,
|
||||
CancellationToken cancellationToken = default)
|
||||
{
|
||||
// Default: treat every exception as transient (preserves the pre-Host-015
|
||||
// behaviour for callers that do not classify faults).
|
||||
isTransient ??= static _ => true;
|
||||
|
||||
var delay = initialDelay;
|
||||
for (var attempt = 1; ; attempt++)
|
||||
{
|
||||
@@ -33,10 +44,10 @@ public static class StartupRetry
|
||||
operationName, attempt);
|
||||
return;
|
||||
}
|
||||
catch (Exception ex) when (attempt < maxAttempts)
|
||||
catch (Exception ex) when (attempt < maxAttempts && isTransient(ex))
|
||||
{
|
||||
logger.LogWarning(ex,
|
||||
"Startup operation '{Operation}' failed on attempt {Attempt}/{MaxAttempts}; " +
|
||||
"Startup operation '{Operation}' failed (transient) on attempt {Attempt}/{MaxAttempts}; " +
|
||||
"retrying in {Delay}.",
|
||||
operationName, attempt, maxAttempts, delay);
|
||||
await Task.Delay(delay, cancellationToken);
|
||||
@@ -45,4 +56,39 @@ public static class StartupRetry
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
/// <summary>
|
||||
/// Transient-fault classifier for the database-migration startup step (Host-015).
|
||||
/// Returns <c>true</c> only for connection-class faults that a brief wait can
|
||||
/// resolve — a SQL connection/transport error or a timeout — and <c>false</c>
|
||||
/// for everything else (notably schema-validation <see cref="InvalidOperationException"/>s
|
||||
/// raised by <c>MigrationHelper.ApplyOrValidateMigrationsAsync</c>, which are
|
||||
/// permanent and must fail fast).
|
||||
/// </summary>
|
||||
public static bool IsTransientDatabaseFault(Exception ex)
|
||||
{
|
||||
// Unwrap a single layer of aggregation so a faulted Task surfaces correctly.
|
||||
if (ex is AggregateException agg && agg.InnerException != null)
|
||||
ex = agg.InnerException;
|
||||
|
||||
if (ex is TimeoutException)
|
||||
return true;
|
||||
|
||||
// Socket / network errors raised while opening the connection.
|
||||
if (ex is System.Net.Sockets.SocketException)
|
||||
return true;
|
||||
|
||||
// Microsoft.Data.SqlClient throws SqlException; matching by type name keeps
|
||||
// the Host free of a direct SqlClient package reference. A SqlException at
|
||||
// the migration stage is, in practice, a connection failure (the server is
|
||||
// not yet reachable) rather than a schema fault — schema mismatches surface
|
||||
// as InvalidOperationException from the migration helper.
|
||||
var typeName = ex.GetType().FullName;
|
||||
if (typeName != null &&
|
||||
(typeName.EndsWith("SqlException", StringComparison.Ordinal) ||
|
||||
typeName.EndsWith("DbException", StringComparison.Ordinal)))
|
||||
return true;
|
||||
|
||||
return false;
|
||||
}
|
||||
}
|
||||
|
||||
@@ -3,5 +3,26 @@
|
||||
"LogLevel": {
|
||||
"Default": "Information"
|
||||
}
|
||||
},
|
||||
"Serilog": {
|
||||
"Using": [
|
||||
"Serilog.Sinks.Console",
|
||||
"Serilog.Sinks.File"
|
||||
],
|
||||
"WriteTo": [
|
||||
{
|
||||
"Name": "Console",
|
||||
"Args": {
|
||||
"outputTemplate": "[{Timestamp:HH:mm:ss} {Level:u3}] [{NodeRole}/{NodeHostname}] {Message:lj}{NewLine}{Exception}"
|
||||
}
|
||||
},
|
||||
{
|
||||
"Name": "File",
|
||||
"Args": {
|
||||
"path": "logs/scadalink-.log",
|
||||
"rollingInterval": "Day"
|
||||
}
|
||||
}
|
||||
]
|
||||
}
|
||||
}
|
||||
|
||||
@@ -37,7 +37,8 @@ public class HoconBuilderTests
|
||||
};
|
||||
|
||||
var hocon = AkkaHostedService.BuildHocon(
|
||||
node, DefaultCluster(), new[] { "Central" }, 5, 15);
|
||||
node, DefaultCluster(), new[] { "Central" },
|
||||
TimeSpan.FromSeconds(5), TimeSpan.FromSeconds(15));
|
||||
|
||||
var config = ConfigurationFactory.ParseString(hocon);
|
||||
Assert.Equal("central-node1", config.GetString("akka.remote.dot-netty.tcp.hostname"));
|
||||
@@ -57,7 +58,8 @@ public class HoconBuilderTests
|
||||
};
|
||||
|
||||
var hocon = AkkaHostedService.BuildHocon(
|
||||
node, DefaultCluster(), new[] { "Central" }, 5, 15);
|
||||
node, DefaultCluster(), new[] { "Central" },
|
||||
TimeSpan.FromSeconds(5), TimeSpan.FromSeconds(15));
|
||||
|
||||
// Must still parse, and the keys after the hostname must remain intact.
|
||||
var config = ConfigurationFactory.ParseString(hocon);
|
||||
@@ -79,9 +81,83 @@ public class HoconBuilderTests
|
||||
};
|
||||
|
||||
var hocon = AkkaHostedService.BuildHocon(
|
||||
node, cluster, new[] { "Central" }, 5, 15);
|
||||
node, cluster, new[] { "Central" },
|
||||
TimeSpan.FromSeconds(5), TimeSpan.FromSeconds(15));
|
||||
|
||||
var config = ConfigurationFactory.ParseString(hocon);
|
||||
Assert.Equal("keep oldest", config.GetString("akka.cluster.split-brain-resolver.active-strategy"));
|
||||
}
|
||||
|
||||
private static NodeOptions DefaultNode() => new()
|
||||
{
|
||||
Role = "Central",
|
||||
NodeHostname = "node1",
|
||||
RemotingPort = 8081,
|
||||
};
|
||||
|
||||
[Fact]
|
||||
public void BuildHocon_DownIfAloneTrue_EmitsOn()
|
||||
{
|
||||
// Host-012: BuildHocon must consume ClusterOptions.DownIfAlone, not
|
||||
// hard-code the down-if-alone token.
|
||||
var cluster = DefaultCluster();
|
||||
cluster.DownIfAlone = true;
|
||||
|
||||
var hocon = AkkaHostedService.BuildHocon(
|
||||
DefaultNode(), cluster, new[] { "Central" },
|
||||
TimeSpan.FromSeconds(5), TimeSpan.FromSeconds(15));
|
||||
|
||||
var config = ConfigurationFactory.ParseString(hocon);
|
||||
Assert.True(config.GetBoolean(
|
||||
"akka.cluster.split-brain-resolver.keep-oldest.down-if-alone"));
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void BuildHocon_DownIfAloneFalse_EmitsOff()
|
||||
{
|
||||
// Host-012: setting DownIfAlone=false must actually flip the emitted token.
|
||||
var cluster = DefaultCluster();
|
||||
cluster.DownIfAlone = false;
|
||||
|
||||
var hocon = AkkaHostedService.BuildHocon(
|
||||
DefaultNode(), cluster, new[] { "Central" },
|
||||
TimeSpan.FromSeconds(5), TimeSpan.FromSeconds(15));
|
||||
|
||||
var config = ConfigurationFactory.ParseString(hocon);
|
||||
Assert.False(config.GetBoolean(
|
||||
"akka.cluster.split-brain-resolver.keep-oldest.down-if-alone"));
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void BuildHocon_SubSecondTimings_PreservedWithoutRounding()
|
||||
{
|
||||
// Host-013: cluster timing values below one second must not be rounded to
|
||||
// whole seconds. A 750ms heartbeat must survive as 750ms, not collapse to
|
||||
// 1s — and a 500ms value must not collapse to a degenerate 0s.
|
||||
var cluster = DefaultCluster();
|
||||
cluster.HeartbeatInterval = TimeSpan.FromMilliseconds(750);
|
||||
cluster.FailureDetectionThreshold = TimeSpan.FromMilliseconds(2600);
|
||||
cluster.StableAfter = TimeSpan.FromMilliseconds(500);
|
||||
|
||||
var hocon = AkkaHostedService.BuildHocon(
|
||||
DefaultNode(), cluster, new[] { "Central" },
|
||||
TimeSpan.FromMilliseconds(2500), TimeSpan.FromMilliseconds(7500));
|
||||
|
||||
var config = ConfigurationFactory.ParseString(hocon);
|
||||
Assert.Equal(
|
||||
TimeSpan.FromMilliseconds(750),
|
||||
config.GetTimeSpan("akka.cluster.failure-detector.heartbeat-interval"));
|
||||
Assert.Equal(
|
||||
TimeSpan.FromMilliseconds(2600),
|
||||
config.GetTimeSpan("akka.cluster.failure-detector.acceptable-heartbeat-pause"));
|
||||
Assert.Equal(
|
||||
TimeSpan.FromMilliseconds(500),
|
||||
config.GetTimeSpan("akka.cluster.split-brain-resolver.stable-after"));
|
||||
Assert.Equal(
|
||||
TimeSpan.FromMilliseconds(2500),
|
||||
config.GetTimeSpan("akka.remote.transport-failure-detector.heartbeat-interval"));
|
||||
Assert.Equal(
|
||||
TimeSpan.FromMilliseconds(7500),
|
||||
config.GetTimeSpan("akka.remote.transport-failure-detector.acceptable-heartbeat-pause"));
|
||||
}
|
||||
}
|
||||
|
||||
93
tests/ScadaLink.Host.Tests/SerilogSinkConfigTests.cs
Normal file
93
tests/ScadaLink.Host.Tests/SerilogSinkConfigTests.cs
Normal file
@@ -0,0 +1,93 @@
|
||||
using System.Reflection;
|
||||
using System.Text.Json;
|
||||
using Microsoft.Extensions.Configuration;
|
||||
|
||||
namespace ScadaLink.Host.Tests;
|
||||
|
||||
/// <summary>
|
||||
/// Host-014 regression: REQ-HOST-8 requires the Host's Serilog sinks
|
||||
/// (console and file at minimum) to be <em>configuration-driven</em>. The sinks
|
||||
/// must be defined in a <c>Serilog</c> section in <c>appsettings.json</c> and
|
||||
/// applied via <c>ReadFrom.Configuration</c> — they must not be hard-coded in
|
||||
/// <c>Program.cs</c>, so an operator can change the file path, rolling interval or
|
||||
/// output template without recompiling.
|
||||
/// </summary>
|
||||
public class SerilogSinkConfigTests
|
||||
{
|
||||
private static string FindHostProjectDirectory()
|
||||
{
|
||||
var assemblyDir = Path.GetDirectoryName(Assembly.GetExecutingAssembly().Location)!;
|
||||
var dir = new DirectoryInfo(assemblyDir);
|
||||
while (dir != null)
|
||||
{
|
||||
var hostPath = Path.Combine(dir.FullName, "src", "ScadaLink.Host");
|
||||
if (Directory.Exists(hostPath))
|
||||
return hostPath;
|
||||
dir = dir.Parent;
|
||||
}
|
||||
throw new DirectoryNotFoundException("Could not locate src/ScadaLink.Host");
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void ShippedAppSettings_HasSerilogSection_WithConsoleAndFileSinks()
|
||||
{
|
||||
var path = Path.Combine(FindHostProjectDirectory(), "appsettings.json");
|
||||
using var doc = JsonDocument.Parse(File.ReadAllText(path));
|
||||
|
||||
Assert.True(
|
||||
doc.RootElement.TryGetProperty("Serilog", out var serilog),
|
||||
"appsettings.json must contain a `Serilog` section so ReadFrom.Configuration " +
|
||||
"drives the sinks (REQ-HOST-8 / Host-014).");
|
||||
|
||||
Assert.True(
|
||||
serilog.TryGetProperty("WriteTo", out var writeTo),
|
||||
"the `Serilog` section must contain a `WriteTo` array defining the sinks.");
|
||||
|
||||
var sinkNames = writeTo.EnumerateArray()
|
||||
.Select(e => e.TryGetProperty("Name", out var n) ? n.GetString() : null)
|
||||
.ToList();
|
||||
|
||||
Assert.Contains("Console", sinkNames);
|
||||
Assert.Contains("File", sinkNames);
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public void LoggerConfigurationFactory_AppliesConfiguredFileSink()
|
||||
{
|
||||
// A `Serilog` section in configuration must actually reach the built logger
|
||||
// via ReadFrom.Configuration — proving the sink set is configuration-driven.
|
||||
var logDir = Path.Combine(Path.GetTempPath(), "scadalink-host014-" + Guid.NewGuid().ToString("N"));
|
||||
var logPath = Path.Combine(logDir, "test-.log");
|
||||
try
|
||||
{
|
||||
var configuration = new ConfigurationBuilder()
|
||||
.AddInMemoryCollection(new Dictionary<string, string?>
|
||||
{
|
||||
["Serilog:Using:0"] = "Serilog.Sinks.File",
|
||||
["Serilog:WriteTo:0:Name"] = "File",
|
||||
["Serilog:WriteTo:0:Args:path"] = logPath,
|
||||
["Serilog:WriteTo:0:Args:rollingInterval"] = "Day",
|
||||
})
|
||||
.Build();
|
||||
|
||||
var logger = LoggerConfigurationFactory
|
||||
.Build(configuration, "Central", "central", "node1")
|
||||
.CreateLogger();
|
||||
|
||||
logger.Information("host-014 configured-sink probe");
|
||||
logger.Dispose();
|
||||
|
||||
Assert.True(Directory.Exists(logDir), "the configured file sink must have created its directory.");
|
||||
var written = Directory.GetFiles(logDir, "test-*.log");
|
||||
Assert.NotEmpty(written);
|
||||
Assert.Contains(
|
||||
"host-014 configured-sink probe",
|
||||
File.ReadAllText(written[0]));
|
||||
}
|
||||
finally
|
||||
{
|
||||
if (Directory.Exists(logDir))
|
||||
Directory.Delete(logDir, recursive: true);
|
||||
}
|
||||
}
|
||||
}
|
||||
@@ -57,9 +57,59 @@ public class StartupRetryTests
|
||||
},
|
||||
maxAttempts: 3,
|
||||
initialDelay: TimeSpan.FromMilliseconds(1),
|
||||
NullLogger.Instance));
|
||||
NullLogger.Instance,
|
||||
isTransient: _ => true));
|
||||
|
||||
Assert.Equal(3, attempts);
|
||||
Assert.Equal("failure 3", ex.Message);
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public async Task ExecuteWithRetry_NonTransientFailure_RethrowsAfterSingleAttempt()
|
||||
{
|
||||
// Host-015: a permanent failure (e.g. a schema-version mismatch) must NOT be
|
||||
// retried — retrying it cannot succeed and only delays the fatal exit by
|
||||
// minutes. The isTransient predicate classifies it as non-retryable, so the
|
||||
// operation runs exactly once before the exception propagates.
|
||||
var attempts = 0;
|
||||
var ex = await Assert.ThrowsAsync<InvalidOperationException>(() =>
|
||||
StartupRetry.ExecuteWithRetryAsync(
|
||||
"test-op",
|
||||
() =>
|
||||
{
|
||||
attempts++;
|
||||
throw new InvalidOperationException("permanent schema mismatch");
|
||||
},
|
||||
maxAttempts: 8,
|
||||
initialDelay: TimeSpan.FromMilliseconds(1),
|
||||
NullLogger.Instance,
|
||||
isTransient: _ => false));
|
||||
|
||||
Assert.Equal(1, attempts);
|
||||
Assert.Equal("permanent schema mismatch", ex.Message);
|
||||
}
|
||||
|
||||
[Fact]
|
||||
public async Task ExecuteWithRetry_TransientThenPermanent_StopsAtPermanent()
|
||||
{
|
||||
// A transient fault is retried; a subsequent permanent fault is not.
|
||||
var attempts = 0;
|
||||
await Assert.ThrowsAsync<InvalidOperationException>(() =>
|
||||
StartupRetry.ExecuteWithRetryAsync(
|
||||
"test-op",
|
||||
() =>
|
||||
{
|
||||
attempts++;
|
||||
if (attempts == 1)
|
||||
throw new TimeoutException("transient");
|
||||
throw new InvalidOperationException("permanent");
|
||||
},
|
||||
maxAttempts: 8,
|
||||
initialDelay: TimeSpan.FromMilliseconds(1),
|
||||
NullLogger.Instance,
|
||||
isTransient: e => e is TimeoutException));
|
||||
|
||||
// 1 transient (retried) + 1 permanent (not retried) = 2.
|
||||
Assert.Equal(2, attempts);
|
||||
}
|
||||
}
|
||||
|
||||
Reference in New Issue
Block a user