30 KiB
Code Review — Host
| Field | Value |
|---|---|
| Module | src/ScadaLink.Host |
| Design doc | docs/requirements/Component-Host.md |
| Status | Reviewed |
| Last reviewed | 2026-05-16 |
| Reviewer | claude-agent |
| Commit reviewed | 9c60592 |
| Open findings | 0 |
Summary
The Host module is the composition root for the entire ScadaLink system: a single
binary whose behaviour (Central vs Site) is driven entirely by configuration. The
implementation is generally faithful to Component-Host.md — startup validation,
role-based registration, Serilog enrichment, Windows Service support, dead-letter
monitoring, CoordinatedShutdown, and gRPC hosting on site nodes are all present and
backed by a solid test suite (tests/ScadaLink.Host.Tests).
The most significant problem is the readiness endpoint: /health/ready runs all
registered health checks, including the leader-only active-node check, so a fully
operational standby central node permanently reports 503 on /health/ready —
directly contradicting REQ-HOST-4a, which defines readiness as cluster membership +
DB connectivity (not leadership). Several other findings concern configuration that
is validated-but-never-consumed (MachineDataDb), design-doc drift (Akka.Persistence
is required by REQ-HOST-6 but the system uses no persistent actors), an incorrect
seed-node entry in the shipped site config, blocking sync-over-async during startup,
and unguarded string interpolation when building HOCON. None are crash/data-loss
class, but the readiness bug is High because it breaks load-balancer behaviour with
no safe workaround.
Checklist coverage
| # | Category | Examined | Notes |
|---|---|---|---|
| 1 | Correctness & logic bugs | ☑ | /health/ready includes the leader-only check (Host-001); site seed-node config points at the gRPC port (Host-004). |
| 2 | Akka.NET conventions | ☑ | CoordinatedShutdown, receptionist registration, singleton scoping all correct. HOCON built by raw string interpolation (Host-006); StartAsync returns before actors are confirmed running (Host-009). |
| 3 | Concurrency & thread safety | ☑ | Blocking GetAwaiter().GetResult() on a hosted-service startup thread (Host-005). DeadLetterMonitorActor state is actor-confined — no issues. |
| 4 | Error handling & resilience | ☑ | Top-level try/catch logs fatal and rethrows. No retry around DB migration / readiness preconditions (Host-010). |
| 5 | Security | ☑ | Plaintext DB password, LDAP service-account password and dev JWT key checked into appsettings.Central.json (Host-003). |
| 6 | Performance & resource management | ☑ | No undisposed resources. Inbound API script compilation is a synchronous startup loop — acceptable. |
| 7 | Design-document adherence | ☑ | REQ-HOST-6 mandates Akka.Persistence config but none exists and no persistent actors exist — doc is stale (Host-002). REQ-HOST-4 GrpcPort-≠-RemotingPort rule not enforced (Host-007). |
| 8 | Code organization & conventions | ☑ | MachineDataDb validated/declared but never consumed (Host-008). LoggingOptions.MinimumLevel is dead (Host-011). |
| 9 | Testing coverage | ☑ | Strong suite; no test asserts /health/ready excludes active-node, which is why Host-001 slipped through (noted in Host-001). |
| 10 | Documentation & comments | ☑ | Comments are accurate. REQ-HOST-6 in the design doc is the main stale-doc item (Host-002). |
Findings
Host-001 — /health/ready includes the leader-only active-node check
| Severity | High |
| Category | Correctness & logic bugs |
| Status | Resolved |
| Location | src/ScadaLink.Host/Program.cs:135-145 |
Description
/health/ready is mapped with MapHealthChecks("/health/ready", ...) and no
Predicate, so it executes every registered check: database, akka-cluster
and active-node. ActiveNodeHealthCheck (Health/ActiveNodeHealthCheck.cs:38)
returns Unhealthy on any node that is not the cluster leader. As a result a
standby central node that is fully operational (cluster member Up, database
reachable) still returns 503 on /health/ready. This contradicts REQ-HOST-4a,
which defines readiness as cluster membership + DB connectivity + singletons —
explicitly not leadership. /health/active is the endpoint intended to report
leadership. A load balancer using /health/ready to decide whether a node may
serve traffic will permanently treat the standby as unready, defeating failover
readiness. No test covers this: HealthCheckTests.HealthReady_Endpoint_ReturnsResponse
only asserts a response is returned, not the standby semantics.
Recommendation
Add a Predicate to the /health/ready mapping that excludes the active-node
check, e.g. Predicate = check => check.Name != "active-node" (or tag the readiness
checks and filter by tag). Add a regression test asserting a non-leader node returns
200 on /health/ready.
Resolution
Resolved 2026-05-16 (commit <pending>). Root cause confirmed against
Program.cs: the /health/ready mapping had no Predicate, so it executed all
three registered checks including the leader-only active-node check, while
ActiveNodeHealthCheck returns Unhealthy on any non-leader node — making a fully
operational standby central node permanently report 503. Fix: added
Predicate = check => check.Name != "active-node" to the /health/ready
HealthCheckOptions, so readiness now reflects cluster membership + DB connectivity
only (REQ-HOST-4a); leadership remains reported solely by /health/active.
Regression test HealthCheckTests.HealthReady_Endpoint_ExcludesActiveNodeCheck
asserts the active-node check name does not appear in the /health/ready
response body; it failed before the fix and passes after. Full Host suite green
(156 passed).
Host-002 — Akka.Persistence required by REQ-HOST-6 is not configured and not used
| Severity | Medium |
| Category | Design-document adherence |
| Status | Resolved |
| Location | src/ScadaLink.Host/Actors/AkkaHostedService.cs:70-108, docs/requirements/Component-Host.md REQ-HOST-6 |
Description
REQ-HOST-6 states the Host "must configure the Akka.NET actor system using
Akka.Hosting with ... Persistence: Configured with the appropriate journal and
snapshot store (SQL for central, SQLite for site)." The HOCON built in
AkkaHostedService.StartAsync contains no akka.persistence section, no journal and
no snapshot-store plugin, and ScadaLink.Host.csproj references neither
Akka.Persistence.Hosting nor any persistence plugin (the design doc Dependencies
list Akka.Persistence.Hosting). A repo-wide search finds no PersistentActor /
ReceivePersistentActor subclasses — the system deliberately uses custom SQLite
storage services instead. The code is internally consistent, but the design document
is stale: it mandates a subsystem that does not exist. This is a documented-vs-actual
drift that will mislead future maintainers and any audit against REQ-HOST-6.
Recommendation
Update Component-Host.md REQ-HOST-6 and the Dependencies list to remove the
Akka.Persistence requirement (or explicitly state persistence is provided by
component-owned SQLite storage, not Akka.Persistence). If persistence is intended,
add the plugin packages and HOCON. Either way, code and doc must agree.
Resolution
Resolved 2026-05-16 (commit <pending>). Confirmed accurate: a repo-wide search
finds no PersistentActor / ReceivePersistentActor subclasses anywhere in
src/, no akka.persistence section in the HOCON built by
AkkaHostedService.StartAsync, and ScadaLink.Host.csproj references no
persistence plugin packages. The Host code is correct and internally consistent;
the defect was a stale design doc. Fix: updated docs/requirements/Component-Host.md
— REQ-HOST-6 no longer lists Persistence as a configured subsystem and now carries
an explicit Persistence note stating that durable state is owned by individual
components and persisted through component-owned stores (SQLite at sites, MS SQL
centrally), with no Akka journal/snapshot store and no PersistentActor subclasses
by design. The Responsibilities line and the Dependencies entry
(Akka.Persistence.Hosting removed) were corrected to match. Code and doc now
agree; no code or test change required.
Host-003 — Secrets committed in plaintext in appsettings.Central.json
| Severity | Medium |
| Category | Security |
| Status | Resolved |
| Location | src/ScadaLink.Host/appsettings.Central.json:20-31 |
Description
appsettings.Central.json contains real-looking secrets in plaintext, checked into
source control: SQL Server passwords in the ConfigurationDb / MachineDataDb
connection strings (Password=ScadaLink_Dev1#), an LDAP service-account password
(LdapServiceAccountPassword: "password"), and a JWT signing key
(JwtSigningKey: "scadalink-dev-jwt-signing-key-..."). Even though these are
intended as development defaults, shipping them in the default config invites them
being reused verbatim in production, and a committed JWT signing key allows anyone
with repo access to forge session tokens. TrustServerCertificate=true additionally
disables TLS validation for the SQL connection.
Recommendation
Move all secrets out of committed appsettings*.json into environment variables,
user-secrets, or a secret store. Keep only non-sensitive structural defaults in the
file and document the required environment variables. At minimum add a clear comment
that these values are dev-only and must be overridden, and rotate the JWT key per
environment.
Resolution
Resolved 2026-05-16 (commit pending). Root cause confirmed against
appsettings.Central.json: the committed file carried real-looking secrets in
plaintext — SQL Server passwords (Password=ScadaLink_Dev1#) in both connection
strings, an LDAP service-account password, and a JWT signing key. Fix: all four
secrets were removed from the committed file and replaced with non-functional
${...} placeholder references (ConfigurationDb / MachineDataDb,
LdapServiceAccountPassword, JwtSigningKey). A new top-level _secrets note
documents that the Host's configuration builder (AddEnvironmentVariables())
overlays the real values supplied via environment variables
(ScadaLink__Database__ConfigurationDb, ScadaLink__Database__MachineDataDb,
ScadaLink__Security__LdapServiceAccountPassword,
ScadaLink__Security__JwtSigningKey); the placeholders are intentionally invalid so
a misconfigured deployment fails loudly rather than silently using a committed key.
Regression test class ConfigSecretsTests asserts the committed file carries no
plaintext Password= value, no committed LDAP service-account password, and no
committed JWT signing key; all three tests failed before the fix and pass after.
Tests that drive the full Program startup pipeline against the real SQL provider
(HealthCheckTests, HostStartupTests.CentralRole_StartsWithoutError) were adapted
to supply the local dev connection strings themselves via the new
CentralDbTestEnvironment test helper (environment variables) — they must no longer
depend on committed secrets. Note: the docker/central-node-*/appsettings.Central.json
files still contain the same dev secrets but lie outside this task's permitted edit
scope; they should receive the same treatment in a follow-up.
Host-004 — Site seed-node list points at the gRPC port, not a remoting port
| Severity | Medium |
| Category | Correctness & logic bugs |
| Status | Resolved |
| Location | src/ScadaLink.Host/appsettings.Site.json:10-19 |
Description
The shipped site config sets Node:RemotingPort = 8082 and Node:GrpcPort = 8083,
but Cluster:SeedNodes is ["akka.tcp://scadalink@localhost:8082", "akka.tcp://scadalink@localhost:8083"]. The second seed node targets 8083, which
is the Kestrel HTTP/2 gRPC port — not an Akka remoting endpoint. A node attempting to
join via that seed will try to establish an Akka.Remote TCP association against the
gRPC listener and fail. StartupValidator only checks that ≥2 seed nodes exist
(StartupValidator.cs:54-56), so this misconfiguration passes validation silently.
For the single-node dev site it is harmless (the first seed succeeds), but it is an
incorrect example that will be copied into multi-node site configs.
Recommendation
Correct the site seed-node list to reference the two site nodes' remoting ports
(e.g. 8082 and 8084), never the gRPC port. Consider extending StartupValidator
to reject a seed node whose port equals this node's GrpcPort.
Resolution
Resolved 2026-05-16 (commit pending). Root cause confirmed against
appsettings.Site.json: with Node:RemotingPort = 8082 and Node:GrpcPort = 8083,
the second Cluster:SeedNodes entry was akka.tcp://scadalink@localhost:8083 — the
Kestrel HTTP/2 gRPC port, not an Akka.Remote endpoint. StartupValidator only
checked seed-node count (≥2), so the misconfiguration passed silently. Fix, two
parts: (1) the shipped appsettings.Site.json second seed entry was corrected to a
remoting port (localhost:8084); (2) StartupValidator.Validate was extended — for
Site nodes it now parses each seed node's TCP port (via a new SeedNodePort
helper) and rejects any entry whose port equals the node's GrpcPort, using the
resolved GrpcPort including the 8083 NodeOptions default when the key is absent.
The seed-node-count check was hoisted above the Site block so the new rule can reuse
the parsed list. Regression tests in StartupValidatorTests:
Site_SeedNodeOnGrpcPort_FailsValidation,
Site_SeedNodeOnDefaultGrpcPort_FailsValidation (default-8083 path),
Site_SeedNodesOnRemotingPort_PassesValidation, and
Central_SeedNodeOnPort8083_PassesValidation (rule is Site-only) — all failed
appropriately before the fix and pass after.
Host-005 — Blocking sync-over-async (GetAwaiter().GetResult()) inside StartAsync
| Severity | Low |
| Category | Concurrency & thread safety |
| Status | Resolved |
| Location | src/ScadaLink.Host/Actors/AkkaHostedService.cs:345 |
Description
RegisterSiteActors calls storeAndForwardService.StartAsync().GetAwaiter().GetResult()
synchronously, blocking inside the IHostedService.StartAsync path. StartAsync is
itself declared synchronous (returns Task.CompletedTask), so the work cannot be
awaited cleanly. Blocking on async work risks thread-pool starvation during startup
and, if the awaited operation captures a synchronization context, deadlock. It also
hides exceptions behind an AggregateException wrapper.
Recommendation
Make AkkaHostedService.StartAsync genuinely async and await storeAndForwardService.StartAsync(cancellationToken). Propagate the
CancellationToken and let exceptions surface as the original type.
Resolution
Resolved 2026-05-16 (commit pending). Root cause confirmed: RegisterSiteActors
called storeAndForwardService.StartAsync().GetAwaiter().GetResult() synchronously
inside the IHostedService.StartAsync path, blocking a startup thread and wrapping
any failure in AggregateException. Fix: AkkaHostedService.StartAsync is now
genuinely async Task; the site path was renamed RegisterSiteActorsAsync and made
async; the store-and-forward init is now await storeAndForwardService.StartAsync()
(the service's own signature, in another module, takes no CancellationToken, so the
token is honoured via a cancellationToken.ThrowIfCancellationRequested() checkpoint
immediately before the await). No blocking, no sync-context deadlock risk, and the
original exception type now surfaces. No dedicated regression test: the only
observable change is non-blocking/exception-unwrapping behaviour on a path that
requires the full site DI graph plus a real SQLite store; the existing
HostStartupTests.SiteRole_* and CompositionRootTests continue to exercise the
async startup pipeline and remain green (175 passed).
Host-006 — HOCON assembled by unescaped string interpolation
| Severity | Low |
| Category | Akka.NET conventions |
| Status | Resolved |
| Location | src/ScadaLink.Host/Actors/AkkaHostedService.cs:70-108 |
Description
The Akka HOCON is built with an interpolated string that injects
_nodeOptions.NodeHostname, _clusterOptions.SeedNodes, the computed roles, and
SplitBrainResolverStrategy directly into the configuration text. Values are not
escaped. A hostname or seed-node string containing a quote, backslash, brace, or
comment sequence would corrupt the HOCON and produce a confusing parse error far from
the real cause; SplitBrainResolverStrategy is interpolated without quoting, so a
value with whitespace breaks the document. Building cluster configuration from raw
string concatenation is also harder to maintain than the typed Akka.Hosting builder
the design doc (REQ-HOST-6) actually calls for ("via Akka.Hosting").
Recommendation
Prefer the Akka.Hosting AddAkka(...) builder with strongly-typed WithRemoting,
WithClustering, and split-brain-resolver configuration instead of hand-built HOCON.
If HOCON must be retained, validate/escape interpolated values (especially hostname
and seed nodes) before substitution.
Resolution
Resolved 2026-05-16 (commit pending). Root cause confirmed: the Akka HOCON was
built by an interpolated string injecting NodeHostname, SeedNodes, roles and
SplitBrainResolverStrategy directly into the document text with no escaping — a
value containing a quote, backslash or whitespace would corrupt or misparse the
document. Re-triaged the recommendation: switching wholesale to the typed
Akka.Hosting builder is a larger refactor than this Low finding warrants and would
touch how the whole cluster is bootstrapped, so the HOCON form was retained but made
safe. Fix: the HOCON construction was extracted into a testable static
AkkaHostedService.BuildHocon, and every interpolated string value is now routed
through a QuoteHocon helper that wraps the value in a double-quoted HOCON literal
and escapes embedded backslashes and quotes — NodeHostname, every seed node, every
role, and SplitBrainResolverStrategy (the last was previously unquoted entirely, so
whitespace alone would break it). Regression tests in new HoconBuilderTests:
BuildHocon_HostnameWithQuote_DoesNotCorruptDocument (a hostname with a " still
parses and following keys stay intact), BuildHocon_StrategyWithWhitespace_RemainsQuoted,
and BuildHocon_PlainValues_ParsesAndPreservesValues. With the old raw interpolation
a quote-containing hostname produced hostname = "evil"host", closing the literal
early and corrupting following keys; the escaped form parses correctly. Full Host
suite green (175 passed).
Host-007 — REQ-HOST-4 rule "GrpcPort ≠ RemotingPort" is not enforced
| Severity | Low |
| Category | Design-document adherence |
| Status | Resolved |
| Location | src/ScadaLink.Host/StartupValidator.cs:43-47 |
Description
REQ-HOST-4 requires: "Site nodes must have GrpcPort in valid port range (1–65535)
and different from RemotingPort." StartupValidator validates the GrpcPort
range but never compares it to RemotingPort. A site config that sets both ports to
the same value passes validation and then fails opaquely at runtime when Kestrel and
Akka.Remote both try to bind the port. The GrpcPort range check is also skipped
entirely when the key is absent (grpcPortStr != null), relying on the
NodeOptions default of 8083 — acceptable, but the equality rule is the missing
piece.
Recommendation
Add a check in the role == "Site" block: if GrpcPort (resolved, including the
8083 default) equals RemotingPort, add an error
"ScadaLink:Node:GrpcPort must differ from RemotingPort".
Resolution
Resolved 2026-05-16 (commit pending). Root cause confirmed against
StartupValidator.cs: the role == "Site" block validated the GrpcPort range but
never compared it to RemotingPort, so a site config setting both ports equal
passed validation and then failed opaquely when Kestrel and Akka.Remote both bound
the port. Fix: added if (port == grpcPort) errors.Add("ScadaLink:Node:GrpcPort must differ from RemotingPort") in the Site block, using the resolved GrpcPort (including
the 8083 NodeOptions default when the key is absent) against the parsed
RemotingPort. Regression tests in StartupValidatorTests:
Site_GrpcPortEqualsRemotingPort_FailsValidation,
Site_DefaultGrpcPortEqualsRemotingPort_FailsValidation (default-8083 path), and
Site_GrpcPortDiffersFromRemotingPort_PassesValidation — all failed appropriately
before the fix and pass after. Full Host suite green (175 passed).
Host-008 — MachineDataDb is validated and declared but never consumed
| Severity | Low |
| Category | Code organization & conventions |
| Status | Resolved |
| Location | src/ScadaLink.Host/StartupValidator.cs:33-34, src/ScadaLink.Host/DatabaseOptions.cs:6 |
Description
StartupValidator requires a non-empty ScadaLink:Database:MachineDataDb connection
string for Central nodes, and DatabaseOptions exposes a MachineDataDb property,
but a repo-wide search shows the value is never read anywhere outside the Host module
— only ConfigurationDb is passed to AddConfigurationDatabase
(Program.cs:83-85). The Host therefore fails startup if MachineDataDb is missing
even though nothing uses it. This is either dead configuration that should be removed
or a missing wiring (a machine-data DbContext that was never registered).
Recommendation
Determine whether a machine-data store is actually required. If yes, wire it into the
relevant component's DI registration. If no, remove the MachineDataDb validation
rule, the DatabaseOptions property, and the key from appsettings.Central.json.
Resolution
Resolved 2026-05-16 (commit pending). Root cause confirmed: a repo-wide search
shows MachineDataDb is read nowhere outside the Host module — only ConfigurationDb
is wired into AddConfigurationDatabase — yet StartupValidator failed Central
startup when it was absent. No machine-data store exists or is registered, so this is
dead configuration. Fix (the "if no" branch of the recommendation): removed the
MachineDataDb validation rule from StartupValidator, the MachineDataDb property
from DatabaseOptions, the MachineDataDb key from appsettings.Central.json, and
the now-pointless MachineDataDb reference from that file's _secrets note and from
the CentralDbTestEnvironment test helper. Regression test: the prior
StartupValidatorTests.Central_MissingMachineDataDb_FailsValidation was inverted to
Central_MissingMachineDataDb_PassesValidation (absence must now NOT fail startup),
and MachineDataDb was removed from the shared ValidCentralConfig fixture so all
Central tests prove the key is no longer required; it failed under the old code and
passes after. Full Host suite green (175 passed). Note: appsettings.Central.json
under docker/central-node-* is outside this task's edit scope and still carries the
key harmlessly — it will simply be ignored; a follow-up may remove it for tidiness.
Host-009 — StartAsync reports success before role actors are confirmed running
| Severity | Low |
| Category | Akka.NET conventions |
| Status | Resolved |
| Location | src/ScadaLink.Host/Actors/AkkaHostedService.cs:127-141 |
Description
StartAsync creates actors with ActorOf (a fire-and-forget operation — the actor's
PreStart runs asynchronously on its own thread) and then returns
Task.CompletedTask. For site nodes, grpcServer.SetReady(_actorSystem) is called
synchronously at the end of RegisterSiteActors, marking the gRPC server ready even
though SiteCommunicationActor, the deployment-manager singleton, and the
ClusterClient may not yet have completed their PreStart/initial-contact handshake.
REQ-HOST-7 requires "Actor system and SiteStreamManager ... initialized before gRPC
begins accepting connections" — SiteStreamManager.Initialize is awaited-equivalent,
but the broader actor graph is not. The window is small and the gRPC server still
rejects streams until SetReady, so impact is limited, but readiness is being
asserted optimistically.
Recommendation
If strict ordering matters, gate SetReady on confirmation that
SiteCommunicationActor is fully initialized (e.g. an Ask round-trip or a
readiness message), or document explicitly that gRPC readiness only guarantees the
actor system exists, not that the cluster handshake has completed.
Resolution
Resolved 2026-05-16 (commit pending) — re-triaged, documentation fix. Root cause
confirmed: SetReady is called optimistically after ActorOf registrations whose
PreStart/initial-contact handshakes run asynchronously. Re-triaged the first
recommended option (gating SetReady on a SiteCommunicationActor Ask
round-trip): that option is incorrect because the meaningful asynchronous step is
the ClusterClient initial-contact handshake with central — gating site gRPC
readiness on it would prevent a site from coming up and streaming locally while
central is briefly unreachable, contradicting the design's site-autonomy model.
Confirmed SiteStreamGrpcServer already rejects any stream opened before SetReady
with StatusCode.Unavailable, so the window is benign. Fix (the second, correct
option): the SetReady call site now carries an explicit comment documenting the
narrow readiness contract — by SetReady the actor system exists,
SiteStreamManager.Initialize has run, and every role actor has been created with
ordered synchronous ActorOf + registration Tells; what is deliberately NOT
guaranteed is PreStart completion or the central handshake. No regression test is
meaningful for a documentation-only clarification of an intentional design contract;
the existing HostStartupTests.SiteRole_* cover the surrounding startup path. Full
Host suite green (175 passed).
Host-010 — No retry/backoff around startup preconditions (DB migration, readiness)
| Severity | Low |
| Category | Error handling & resilience |
| Status | Resolved |
| Location | src/ScadaLink.Host/Program.cs:112-125 |
Description
On Central startup the Host opens a DI scope and calls
MigrationHelper.ApplyOrValidateMigrationsAsync directly. If the SQL Server is not
yet reachable (common in container orchestration where the DB and app start
together), the call throws, the top-level catch logs Fatal, and the process
exits. There is no bounded retry/backoff to tolerate a database that is briefly
unavailable at boot. The design intent (REQ-HOST-4a, readiness gating, 503 until
ready) is about serving traffic, but the migration step happens before the host
even runs and has no such tolerance.
Recommendation
Wrap the migration/validation step in a bounded retry with exponential backoff (e.g.
Polly), or move schema apply behind the readiness gate so the process stays up and
reports 503 until the database becomes reachable.
Resolution
Resolved 2026-05-16 (commit pending). Root cause confirmed: Central startup called
MigrationHelper.ApplyOrValidateMigrationsAsync directly with no tolerance for a
database that is briefly unreachable at boot — a first connection failure threw, the
top-level catch logged Fatal, and the process exited. Fix: added a new
StartupRetry.ExecuteWithRetryAsync helper (bounded retry with capped exponential
backoff — no Polly dependency, keeping the Host's package set minimal) and wrapped
the migration step in it (maxAttempts: 8, initialDelay: 2s, doubling, capped at
30s). Each retry opens a fresh DI scope so a transient connection fault does not
leave a poisoned DbContext; once attempts are exhausted the last exception
propagates and the existing fatal-log path still fires. Regression tests in new
StartupRetryTests: ExecuteWithRetry_SucceedsFirstTry_RunsOnce,
ExecuteWithRetry_TransientFailures_RetriesUntilSuccess (fails twice then succeeds),
and ExecuteWithRetry_ExhaustsAttempts_RethrowsLastException (verifies the original
exception type/message surfaces after exhaustion) — they fail to compile/pass against
the pre-fix code (the helper did not exist) and pass after. Full Host suite green
(175 passed).
Host-011 — LoggingOptions.MinimumLevel is dead configuration
| Severity | Low |
| Category | Code organization & conventions |
| Status | Resolved |
| Location | src/ScadaLink.Host/LoggingOptions.cs:5, src/ScadaLink.Host/Program.cs:42-50 |
Description
LoggingOptions exposes a MinimumLevel property bound from ScadaLink:Logging
(SiteServiceRegistration.BindSharedOptions), and both appsettings.Central.json
and appsettings.Site.json set "Logging": { "MinimumLevel": "Information" }.
However Serilog is configured purely via ReadFrom.Configuration(configuration),
which reads the standard Serilog section — not ScadaLink:Logging. The
LoggingOptions.MinimumLevel value is never read by any code, so changing it has no
effect. This is misleading: an operator editing ScadaLink:Logging:MinimumLevel
expecting a log-level change will see nothing happen.
Recommendation
Either consume LoggingOptions.MinimumLevel when configuring the Serilog
LoggerConfiguration (e.g. set MinimumLevel.Is(...) from it), or remove the option
class and the ScadaLink:Logging sections and rely solely on the Serilog
configuration section. Keep one mechanism, not two.
Resolution
Resolved 2026-05-16 (commit pending). Root cause confirmed: LoggingOptions.MinimumLevel
was bound from ScadaLink:Logging but Serilog was configured purely via
ReadFrom.Configuration (standard Serilog section), so editing
ScadaLink:Logging:MinimumLevel had no effect. Re-triaged the two options: the
"remove the option class" branch was rejected because Component-Host.md REQ-HOST-3
explicitly lists ScadaLink:Logging → LoggingOptions and that design doc is outside
this task's edit scope — removing the class would create a fresh code-vs-doc drift.
Fix (the "consume it" branch): added a LoggerConfigurationFactory.Build that binds
LoggingOptions, parses MinimumLevel into a Serilog LogEventLevel (falling back
to Information for null/blank/unrecognised values), and pins it via
MinimumLevel.Is(...) on top of ReadFrom.Configuration; Program.cs now builds the
logger through this factory. ScadaLink:Logging:MinimumLevel is now live.
Regression tests in new LoggerConfigurationTests:
MinimumLevel_Warning_SuppressesInformationLogs,
MinimumLevel_Debug_AllowsDebugLogs, and MinimumLevel_Absent_DefaultsToInformation
— they assert the configured level actually filters log events; they pass against the
new factory (and would fail against the old inline configuration, which ignored the
key). Full Host suite green (175 passed).