Joseph Doherty 3d0c1c6963 fix(host): resolve Host-002 — REQ-HOST-6 doc corrected to drop unused Akka.Persistence requirement

2026-05-16 23:49:25 -04:00

Code Review — Host

| Field | Value |
|---|---|
| Module | `src/ScadaLink.Host` |
| Design doc | `docs/requirements/Component-Host.md` |
| Status | Reviewed |
| Last reviewed | 2026-05-16 |
| Reviewer | claude-agent |
| Commit reviewed | `9c60592` |
| Open findings | 0 |

Summary

The Host module is the composition root for the entire ScadaLink system: a single binary whose behaviour (Central vs Site) is driven entirely by configuration. The implementation is generally faithful to Component-Host.md — startup validation, role-based registration, Serilog enrichment, Windows Service support, dead-letter monitoring, CoordinatedShutdown, and gRPC hosting on site nodes are all present and backed by a solid test suite (tests/ScadaLink.Host.Tests).

The most significant problem is the readiness endpoint: /health/ready runs all registered health checks, including the leader-only active-node check, so a fully operational standby central node permanently reports 503 on /health/ready — directly contradicting REQ-HOST-4a, which defines readiness as cluster membership + DB connectivity (not leadership). Several other findings concern configuration that is validated-but-never-consumed (MachineDataDb), design-doc drift (Akka.Persistence is required by REQ-HOST-6 but the system uses no persistent actors), an incorrect seed-node entry in the shipped site config, blocking sync-over-async during startup, and unguarded string interpolation when building HOCON. None are crash/data-loss class, but the readiness bug is High because it breaks load-balancer behaviour with no safe workaround.

Checklist coverage

| # | Category | Examined | Notes |
|---|---|---|---|
| 1 | Correctness & logic bugs | ☑ | `/health/ready` includes the leader-only check (Host-001); site seed-node config points at the gRPC port (Host-004). |
| 2 | Akka.NET conventions | ☑ | CoordinatedShutdown, receptionist registration, singleton scoping all correct. HOCON built by raw string interpolation (Host-006); `StartAsync` returns before actors are confirmed running (Host-009). |
| 3 | Concurrency & thread safety | ☑ | Blocking `GetAwaiter().GetResult()` on a hosted-service startup thread (Host-005). `DeadLetterMonitorActor` state is actor-confined; no issues. |
| 4 | Error handling & resilience | ☑ | Top-level try/catch logs fatal and rethrows. No retry around DB migration / readiness preconditions (Host-010). |
| 5 | Security | ☑ | Plaintext DB password, LDAP service-account password and dev JWT key checked into `appsettings.Central.json` (Host-003). |
| 6 | Performance & resource management | ☑ | No undisposed resources. Inbound API script compilation is a synchronous startup loop, which is acceptable. |
| 7 | Design-document adherence | ☑ | REQ-HOST-6 mandates Akka.Persistence config but none exists and no persistent actors exist; the doc is stale (Host-002). REQ-HOST-4 GrpcPort-≠-RemotingPort rule not enforced (Host-007). |
| 8 | Code organization & conventions | ☑ | `MachineDataDb` validated/declared but never consumed (Host-008). `LoggingOptions.MinimumLevel` is dead (Host-011). |
| 9 | Testing coverage | ☑ | Strong suite; no test asserts `/health/ready` excludes `active-node`, which is why Host-001 slipped through (noted in Host-001). |
| 10 | Documentation & comments | ☑ | Comments are accurate. REQ-HOST-6 in the design doc is the main stale-doc item (Host-002). |

Findings

Host-001 — `/health/ready` includes the leader-only `active-node` check


Severity	High
Category	Correctness & logic bugs
Status	Resolved
Location	`src/ScadaLink.Host/Program.cs:135-145`

Description

/health/ready is mapped with MapHealthChecks("/health/ready", ...) and no Predicate, so it executes every registered check: database, akka-cluster and active-node. ActiveNodeHealthCheck (Health/ActiveNodeHealthCheck.cs:38) returns Unhealthy on any node that is not the cluster leader. As a result a standby central node that is fully operational (cluster member Up, database reachable) still returns 503 on /health/ready. This contradicts REQ-HOST-4a, which defines readiness as cluster membership + DB connectivity + singletons — explicitly not leadership. /health/active is the endpoint intended to report leadership. A load balancer using /health/ready to decide whether a node may serve traffic will permanently treat the standby as unready, defeating failover readiness. No test covers this: HealthCheckTests.HealthReady_Endpoint_ReturnsResponse only asserts a response is returned, not the standby semantics.

Recommendation

Add a Predicate to the /health/ready mapping that excludes the active-node check, e.g. Predicate = check => check.Name != "active-node" (or tag the readiness checks and filter by tag). Add a regression test asserting a non-leader node returns 200 on /health/ready.

Resolution

Resolved 2026-05-16 (commit <pending>). Root cause confirmed against Program.cs: the /health/ready mapping had no Predicate, so it executed all three registered checks including the leader-only active-node check, while ActiveNodeHealthCheck returns Unhealthy on any non-leader node — making a fully operational standby central node permanently report 503. Fix: added Predicate = check => check.Name != "active-node" to the /health/ready HealthCheckOptions, so readiness now reflects cluster membership + DB connectivity only (REQ-HOST-4a); leadership remains reported solely by /health/active. Regression test HealthCheckTests.HealthReady_Endpoint_ExcludesActiveNodeCheck asserts the active-node check name does not appear in the /health/ready response body; it failed before the fix and passes after. Full Host suite green (156 passed).
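The effect of the predicate can be reproduced outside the Host with a minimal sketch. It assumes the Microsoft.Extensions.Diagnostics.HealthChecks and Microsoft.Extensions.Logging packages, and it replaces the real checks with stand-in delegates that only share their names; the `active-node` stand-in always reports Unhealthy to simulate a standby. The endpoint mapping passes its `Predicate` to the same `HealthCheckService.CheckHealthAsync` call used here.

```csharp
using System;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Diagnostics.HealthChecks;

var services = new ServiceCollection();
services.AddLogging(); // the default HealthCheckService resolves an ILogger
services.AddHealthChecks()
    // Stand-in delegates (assumptions); the real checks live in the Host module.
    .AddCheck("database", () => HealthCheckResult.Healthy())
    .AddCheck("akka-cluster", () => HealthCheckResult.Healthy())
    // Simulates a standby node: operational, but not the cluster leader.
    .AddCheck("active-node", () => HealthCheckResult.Unhealthy("not the leader"));

await using var provider = services.BuildServiceProvider();
var health = provider.GetRequiredService<HealthCheckService>();

// No predicate: all three checks run, so the standby is reported Unhealthy
// (which the health-check endpoint maps to 503 by default).
HealthReport unfiltered = await health.CheckHealthAsync();

// Readiness predicate from the fix: skip the leader-only check.
HealthReport readiness = await health.CheckHealthAsync(check => check.Name != "active-node");

Console.WriteLine($"unfiltered: {unfiltered.Status}, readiness: {readiness.Status}");
```

Filtering by name keeps the change to one line; tagging the readiness checks and filtering by tag, as the recommendation also suggests, would scale better if more leader-only checks are added later.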

Host-002 — Akka.Persistence required by REQ-HOST-6 is not configured and not used


Severity	Medium
Category	Design-document adherence
Status	Resolved
Location	`src/ScadaLink.Host/Actors/AkkaHostedService.cs:70-108`, `docs/requirements/Component-Host.md` REQ-HOST-6

Description

REQ-HOST-6 states the Host "must configure the Akka.NET actor system using Akka.Hosting with ... Persistence: Configured with the appropriate journal and snapshot store (SQL for central, SQLite for site)." The HOCON built in AkkaHostedService.StartAsync contains no akka.persistence section, no journal and no snapshot-store plugin, and ScadaLink.Host.csproj references neither Akka.Persistence.Hosting nor any persistence plugin, even though the design doc's Dependencies list names Akka.Persistence.Hosting. A repo-wide search finds no PersistentActor / ReceivePersistentActor subclasses; the system deliberately uses custom SQLite storage services instead. The code is internally consistent, but the design document is stale: it mandates a subsystem that does not exist. This documented-vs-actual drift will mislead future maintainers and any audit against REQ-HOST-6.

Recommendation

Update Component-Host.md REQ-HOST-6 and the Dependencies list to remove the Akka.Persistence requirement (or explicitly state persistence is provided by component-owned SQLite storage, not Akka.Persistence). If persistence is intended, add the plugin packages and HOCON. Either way, code and doc must agree.

Resolution

Resolved 2026-05-16 (commit <pending>). Confirmed accurate: a repo-wide search finds no PersistentActor / ReceivePersistentActor subclasses anywhere in src/, no akka.persistence section in the HOCON built by AkkaHostedService.StartAsync, and ScadaLink.Host.csproj references no persistence plugin packages. The Host code is correct and internally consistent; the defect was a stale design doc. Fix: updated docs/requirements/Component-Host.md — REQ-HOST-6 no longer lists Persistence as a configured subsystem and now carries an explicit Persistence note stating that durable state is owned by individual components and persisted through component-owned stores (SQLite at sites, MS SQL centrally), with no Akka journal/snapshot store and no PersistentActor subclasses by design. The Responsibilities line and the Dependencies entry (Akka.Persistence.Hosting removed) were corrected to match. Code and doc now agree; no code or test change required.

Host-003 — Secrets committed in plaintext in `appsettings.Central.json`


Severity	Medium
Category	Security
Status	Resolved
Location	`src/ScadaLink.Host/appsettings.Central.json:20-31`

Description

appsettings.Central.json contains real-looking secrets in plaintext, checked into source control: SQL Server passwords in the ConfigurationDb / MachineDataDb connection strings (Password=ScadaLink_Dev1#), an LDAP service-account password (LdapServiceAccountPassword: "password"), and a JWT signing key (JwtSigningKey: "scadalink-dev-jwt-signing-key-..."). Although these are intended as development defaults, shipping them in the default config invites verbatim reuse in production, and a committed JWT signing key lets anyone with repo access forge session tokens. TrustServerCertificate=true additionally disables server-certificate validation for the SQL connection.

Recommendation

Move all secrets out of committed appsettings*.json into environment variables, user-secrets, or a secret store. Keep only non-sensitive structural defaults in the file and document the required environment variables. At minimum add a clear comment that these values are dev-only and must be overridden, and rotate the JWT key per environment.

Resolution

Resolved 2026-05-16 (commit pending). Root cause confirmed against appsettings.Central.json: the committed file carried real-looking secrets in plaintext: SQL Server passwords (Password=ScadaLink_Dev1#) in both connection strings, an LDAP service-account password, and a JWT signing key. Fix: all four secrets were removed from the committed file and replaced with non-functional ${...} placeholder references (ConfigurationDb / MachineDataDb, LdapServiceAccountPassword, JwtSigningKey). A new top-level _secrets note documents that the Host's configuration builder (AddEnvironmentVariables()) overlays the real values supplied via environment variables (ScadaLink__Database__ConfigurationDb, ScadaLink__Database__MachineDataDb, ScadaLink__Security__LdapServiceAccountPassword, ScadaLink__Security__JwtSigningKey); the placeholders are intentionally invalid so a misconfigured deployment fails loudly rather than silently using a committed key. Regression test class ConfigSecretsTests asserts the committed file carries no plaintext Password= value, no committed LDAP service-account password, and no committed JWT signing key; all three tests failed before the fix and pass after. Tests that drive the full Program startup pipeline against the real SQL provider (HealthCheckTests, HostStartupTests.CentralRole_StartsWithoutError) were adapted to supply the local dev connection strings themselves through environment variables, via the new CentralDbTestEnvironment test helper, so they no longer depend on committed secrets. Note: the docker/central-node-*/appsettings.Central.json files still contain the same dev secrets but lie outside this task's permitted edit scope; they should receive the same treatment in a follow-up.

Host-004 — Site seed-node list points at the gRPC port, not a remoting port


Severity	Medium
Category	Correctness & logic bugs
Status	Resolved
Location	`src/ScadaLink.Host/appsettings.Site.json:10-19`

Description

The shipped site config sets Node:RemotingPort = 8082 and Node:GrpcPort = 8083, but Cluster:SeedNodes is ["akka.tcp://scadalink@localhost:8082", "akka.tcp://scadalink@localhost:8083"]. The second seed node targets 8083, which is the Kestrel HTTP/2 gRPC port — not an Akka remoting endpoint. A node attempting to join via that seed will try to establish an Akka.Remote TCP association against the gRPC listener and fail. StartupValidator only checks that ≥2 seed nodes exist (StartupValidator.cs:54-56), so this misconfiguration passes validation silently. For the single-node dev site it is harmless (the first seed succeeds), but it is an incorrect example that will be copied into multi-node site configs.

Recommendation

Correct the site seed-node list to reference the two site nodes' remoting ports (e.g. 8082 and 8084), never the gRPC port. Consider extending StartupValidator to reject a seed node whose port equals this node's GrpcPort.

Resolution

Resolved 2026-05-16 (commit pending). Root cause confirmed against appsettings.Site.json: with Node:RemotingPort = 8082 and Node:GrpcPort = 8083, the second Cluster:SeedNodes entry was akka.tcp://scadalink@localhost:8083 — the Kestrel HTTP/2 gRPC port, not an Akka.Remote endpoint. StartupValidator only checked seed-node count (≥2), so the misconfiguration passed silently. Fix, two parts: (1) the shipped appsettings.Site.json second seed entry was corrected to a remoting port (localhost:8084); (2) StartupValidator.Validate was extended — for Site nodes it now parses each seed node's TCP port (via a new SeedNodePort helper) and rejects any entry whose port equals the node's GrpcPort, using the resolved GrpcPort including the 8083 NodeOptions default when the key is absent. The seed-node-count check was hoisted above the Site block so the new rule can reuse the parsed list. Regression tests in StartupValidatorTests: Site_SeedNodeOnGrpcPort_FailsValidation, Site_SeedNodeOnDefaultGrpcPort_FailsValidation (default-8083 path), Site_SeedNodesOnRemotingPort_PassesValidation, and Central_SeedNodeOnPort8083_PassesValidation (rule is Site-only) — all failed appropriately before the fix and pass after.
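The seed-node rule can be sketched in a few lines. This is an illustration under assumptions, not the code in StartupValidator: the `SeedNodePort` body, the `ValidateSeedNodes` name, and the error wording are hypothetical, and the port is parsed with `System.Uri`, which may differ from how the real helper does it.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the SeedNodePort helper: extracts the TCP port from an
// Akka address such as "akka.tcp://scadalink@localhost:8082"; null if it cannot be parsed.
static int? SeedNodePort(string seedNode) =>
    Uri.TryCreate(seedNode, UriKind.Absolute, out var uri) && uri.Port > 0
        ? uri.Port
        : (int?)null;

// Site-only rule: no seed node may target this node's gRPC port.
static List<string> ValidateSeedNodes(IEnumerable<string> seedNodes, int grpcPort)
{
    var errors = new List<string>();
    foreach (var seed in seedNodes)
    {
        if (SeedNodePort(seed) == grpcPort)
            errors.Add($"Seed node '{seed}' targets GrpcPort {grpcPort}; seed nodes must use an Akka remoting port");
    }
    return errors;
}

// The originally shipped list versus the corrected one, both with GrpcPort 8083.
var shipped = new[] { "akka.tcp://scadalink@localhost:8082", "akka.tcp://scadalink@localhost:8083" };
var corrected = new[] { "akka.tcp://scadalink@localhost:8082", "akka.tcp://scadalink@localhost:8084" };

Console.WriteLine($"shipped errors: {ValidateSeedNodes(shipped, grpcPort: 8083).Count}");
Console.WriteLine($"corrected errors: {ValidateSeedNodes(corrected, grpcPort: 8083).Count}");
```

A seed entry that cannot be parsed yields no port and so never trips this rule; whether such entries should be rejected separately is a design choice the sketch leaves open.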

Host-005 — Blocking sync-over-async (`GetAwaiter().GetResult()`) inside `StartAsync`


Severity	Low
Category	Concurrency & thread safety
Status	Resolved
Location	`src/ScadaLink.Host/Actors/AkkaHostedService.cs:345`

Description

RegisterSiteActors calls storeAndForwardService.StartAsync().GetAwaiter().GetResult() synchronously, blocking inside the IHostedService.StartAsync path. StartAsync is itself declared synchronous (returns Task.CompletedTask), so the work cannot be awaited cleanly. Blocking on async work risks thread-pool starvation during startup and, if the awaited operation captures a synchronization context, deadlock. Unlike .Result or .Wait(), GetAwaiter().GetResult() rethrows the original exception rather than wrapping it in an AggregateException, so the cost here is the blocked thread, not exception wrapping.

Recommendation

Make AkkaHostedService.StartAsync genuinely async and await storeAndForwardService.StartAsync(cancellationToken). Propagate the CancellationToken and let exceptions surface as the original type.

Resolution

Resolved 2026-05-16 (commit pending). Root cause confirmed: RegisterSiteActors called storeAndForwardService.StartAsync().GetAwaiter().GetResult() synchronously inside the IHostedService.StartAsync path, blocking a startup thread. (GetAwaiter().GetResult() already rethrows the original exception, so no AggregateException wrapping was involved.) Fix: AkkaHostedService.StartAsync is now genuinely async Task; the site path was renamed RegisterSiteActorsAsync and made async; the store-and-forward init is now await storeAndForwardService.StartAsync(). The service's own signature, in another module, takes no CancellationToken, so the token is honoured via a cancellationToken.ThrowIfCancellationRequested() checkpoint immediately before the await. No blocking and no sync-context deadlock risk remain, and exceptions continue to surface as their original type. No dedicated regression test: the only observable change is non-blocking behaviour on a path that requires the full site DI graph plus a real SQLite store; the existing HostStartupTests.SiteRole_* and CompositionRootTests continue to exercise the async startup pipeline and remain green (175 passed).

Host-006 — HOCON assembled by unescaped string interpolation


Severity	Low
Category	Akka.NET conventions
Status	Resolved
Location	`src/ScadaLink.Host/Actors/AkkaHostedService.cs:70-108`

Description

The Akka HOCON is built with an interpolated string that injects _nodeOptions.NodeHostname, _clusterOptions.SeedNodes, the computed roles, and SplitBrainResolverStrategy directly into the configuration text. Values are not escaped. A hostname or seed-node string containing a quote, backslash, brace, or comment sequence would corrupt the HOCON and produce a confusing parse error far from the real cause; SplitBrainResolverStrategy is interpolated without quoting, so a value with whitespace breaks the document. Building cluster configuration from raw string concatenation is also harder to maintain than the typed Akka.Hosting builder the design doc (REQ-HOST-6) actually calls for ("via Akka.Hosting").

Recommendation

Prefer the Akka.Hosting AddAkka(...) builder with strongly-typed WithRemoting, WithClustering, and split-brain-resolver configuration instead of hand-built HOCON. If HOCON must be retained, validate/escape interpolated values (especially hostname and seed nodes) before substitution.

Resolution

Resolved 2026-05-16 (commit pending). Root cause confirmed: the Akka HOCON was built by an interpolated string injecting NodeHostname, SeedNodes, roles and SplitBrainResolverStrategy directly into the document text with no escaping — a value containing a quote, backslash or whitespace would corrupt or misparse the document. Re-triaged the recommendation: switching wholesale to the typed Akka.Hosting builder is a larger refactor than this Low finding warrants and would touch how the whole cluster is bootstrapped, so the HOCON form was retained but made safe. Fix: the HOCON construction was extracted into a testable static AkkaHostedService.BuildHocon, and every interpolated string value is now routed through a QuoteHocon helper that wraps the value in a double-quoted HOCON literal and escapes embedded backslashes and quotes — NodeHostname, every seed node, every role, and SplitBrainResolverStrategy (the last was previously unquoted entirely, so whitespace alone would break it). Regression tests in new HoconBuilderTests: BuildHocon_HostnameWithQuote_DoesNotCorruptDocument (a hostname with a " still parses and following keys stay intact), BuildHocon_StrategyWithWhitespace_RemainsQuoted, and BuildHocon_PlainValues_ParsesAndPreservesValues. With the old raw interpolation a quote-containing hostname produced hostname = "evil"host", closing the literal early and corrupting following keys; the escaped form parses correctly. Full Host suite green (175 passed).
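A minimal sketch of the quoting approach described above, under assumptions: the body of `QuoteHocon` is a guess at what the real helper does (wrap in double quotes, escape backslashes first and then quotes), and `BuildRemoteFragment` is a hypothetical, much smaller stand-in for `BuildHocon`.

```csharp
using System;

// Wraps a value in a double-quoted HOCON literal. Backslashes are escaped first
// so that the backslashes added for quotes are not escaped a second time.
static string QuoteHocon(string value) =>
    "\"" + value.Replace("\\", "\\\\").Replace("\"", "\\\"") + "\"";

// Hypothetical fragment builder; the real document has many more keys.
static string BuildRemoteFragment(string hostname, int port) =>
    $"akka.remote.dot-netty.tcp {{ hostname = {QuoteHocon(hostname)}, port = {port} }}";

var hostile = "evil\"host";

// Raw interpolation: the embedded quote closes the literal early.
Console.WriteLine($"hostname = \"{hostile}\"");

// Quoted and escaped: the embedded quote stays inside the literal.
Console.WriteLine(BuildRemoteFragment(hostile, 8082));
```

The sketch escapes only backslashes and double quotes, matching the two characters the resolution names; control characters such as newlines would need additional handling if they can occur in configuration values.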

Host-007 — REQ-HOST-4 rule "GrpcPort ≠ RemotingPort" is not enforced


Severity	Low
Category	Design-document adherence
Status	Resolved
Location	`src/ScadaLink.Host/StartupValidator.cs:43-47`

Description

REQ-HOST-4 requires: "Site nodes must have GrpcPort in valid port range (1–65535) and different from RemotingPort." StartupValidator validates the GrpcPort range but never compares it to RemotingPort. A site config that sets both ports to the same value passes validation and then fails opaquely at runtime when Kestrel and Akka.Remote both try to bind the port. The GrpcPort range check is also skipped entirely when the key is absent (grpcPortStr != null), relying on the NodeOptions default of 8083 — acceptable, but the equality rule is the missing piece.

Recommendation

Add a check in the role == "Site" block: if GrpcPort (resolved, including the 8083 default) equals RemotingPort, add an error "ScadaLink:Node:GrpcPort must differ from RemotingPort".

Resolution

Resolved 2026-05-16 (commit pending). Root cause confirmed against StartupValidator.cs: the role == "Site" block validated the GrpcPort range but never compared it to RemotingPort, so a site config setting both ports equal passed validation and then failed opaquely when Kestrel and Akka.Remote both bound the port. Fix: added if (port == grpcPort) errors.Add("ScadaLink:Node:GrpcPort must differ from RemotingPort") in the Site block, using the resolved GrpcPort (including the 8083 NodeOptions default when the key is absent) against the parsed RemotingPort. Regression tests in StartupValidatorTests: Site_GrpcPortEqualsRemotingPort_FailsValidation, Site_DefaultGrpcPortEqualsRemotingPort_FailsValidation (default-8083 path), and Site_GrpcPortDiffersFromRemotingPort_PassesValidation — all failed appropriately before the fix and pass after. Full Host suite green (175 passed).
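The port rule, including the default-8083 path, can be sketched as follows. Only the "must differ" message comes from the resolution above; the function name, the range-error wording, and the early returns are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;

// Validates the two site ports as read from configuration (null means the key is absent).
static List<string> ValidateSitePorts(string? remotingPortStr, string? grpcPortStr)
{
    const int defaultGrpcPort = 8083; // NodeOptions default, per the finding
    var errors = new List<string>();

    if (!int.TryParse(remotingPortStr, out var remotingPort) || remotingPort is < 1 or > 65535)
    {
        errors.Add("ScadaLink:Node:RemotingPort must be in range 1-65535"); // hypothetical wording
        return errors;
    }

    // Resolve GrpcPort: use the configured value when present, otherwise the default.
    var grpcPort = defaultGrpcPort;
    if (grpcPortStr != null && (!int.TryParse(grpcPortStr, out grpcPort) || grpcPort is < 1 or > 65535))
    {
        errors.Add("ScadaLink:Node:GrpcPort must be in range 1-65535"); // hypothetical wording
        return errors;
    }

    // The rule from REQ-HOST-4: compare against the resolved value, not the raw key.
    if (remotingPort == grpcPort)
        errors.Add("ScadaLink:Node:GrpcPort must differ from RemotingPort");
    return errors;
}

Console.WriteLine(string.Join("; ", ValidateSitePorts("8083", null)));     // default path collides
Console.WriteLine(ValidateSitePorts("8082", "8083").Count);                // distinct ports
```

Comparing against the resolved value matters because a config that omits GrpcPort but sets RemotingPort to 8083 would otherwise slip through.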

Host-008 — `MachineDataDb` is validated and declared but never consumed


Severity	Low
Category	Code organization & conventions
Status	Resolved
Location	`src/ScadaLink.Host/StartupValidator.cs:33-34`, `src/ScadaLink.Host/DatabaseOptions.cs:6`

Description

StartupValidator requires a non-empty ScadaLink:Database:MachineDataDb connection string for Central nodes, and DatabaseOptions exposes a MachineDataDb property, but a repo-wide search shows the value is never read anywhere outside the Host module — only ConfigurationDb is passed to AddConfigurationDatabase (Program.cs:83-85). The Host therefore fails startup if MachineDataDb is missing even though nothing uses it. This is either dead configuration that should be removed or a missing wiring (a machine-data DbContext that was never registered).

Recommendation

Determine whether a machine-data store is actually required. If yes, wire it into the relevant component's DI registration. If no, remove the MachineDataDb validation rule, the DatabaseOptions property, and the key from appsettings.Central.json.

Resolution

Resolved 2026-05-16 (commit pending). Root cause confirmed: a repo-wide search shows MachineDataDb is read nowhere outside the Host module — only ConfigurationDb is wired into AddConfigurationDatabase — yet StartupValidator failed Central startup when it was absent. No machine-data store exists or is registered, so this is dead configuration. Fix (the "if no" branch of the recommendation): removed the MachineDataDb validation rule from StartupValidator, the MachineDataDb property from DatabaseOptions, the MachineDataDb key from appsettings.Central.json, and the now-pointless MachineDataDb reference from that file's _secrets note and from the CentralDbTestEnvironment test helper. Regression test: the prior StartupValidatorTests.Central_MissingMachineDataDb_FailsValidation was inverted to Central_MissingMachineDataDb_PassesValidation (absence must now NOT fail startup), and MachineDataDb was removed from the shared ValidCentralConfig fixture so all Central tests prove the key is no longer required; it failed under the old code and passes after. Full Host suite green (175 passed). Note: appsettings.Central.json under docker/central-node-* is outside this task's edit scope and still carries the key harmlessly — it will simply be ignored; a follow-up may remove it for tidiness.

Host-009 — `StartAsync` reports success before role actors are confirmed running


Severity	Low
Category	Akka.NET conventions
Status	Resolved
Location	`src/ScadaLink.Host/Actors/AkkaHostedService.cs:127-141`

Description

StartAsync creates actors with ActorOf (a fire-and-forget operation: the actor's PreStart runs asynchronously on a dispatcher thread) and then returns Task.CompletedTask. For site nodes, grpcServer.SetReady(_actorSystem) is called synchronously at the end of RegisterSiteActors, marking the gRPC server ready even though SiteCommunicationActor, the deployment-manager singleton, and the ClusterClient may not yet have completed their PreStart/initial-contact handshake. REQ-HOST-7 requires "Actor system and SiteStreamManager ... initialized before gRPC begins accepting connections". SiteStreamManager.Initialize has already run by the time SetReady is called, but the broader actor graph is not awaited. The window is small and the gRPC server still rejects streams until SetReady, so impact is limited, but readiness is being asserted optimistically.

Recommendation

If strict ordering matters, gate SetReady on confirmation that SiteCommunicationActor is fully initialized (e.g. an Ask round-trip or a readiness message), or document explicitly that gRPC readiness only guarantees the actor system exists, not that the cluster handshake has completed.

Resolution

Resolved 2026-05-16 (commit pending) — re-triaged, documentation fix. Root cause confirmed: SetReady is called optimistically after ActorOf registrations whose PreStart/initial-contact handshakes run asynchronously. Re-triaged the first recommended option (gating SetReady on a SiteCommunicationActor Ask round-trip): that option is incorrect because the meaningful asynchronous step is the ClusterClient initial-contact handshake with central — gating site gRPC readiness on it would prevent a site from coming up and streaming locally while central is briefly unreachable, contradicting the design's site-autonomy model. Confirmed SiteStreamGrpcServer already rejects any stream opened before SetReady with StatusCode.Unavailable, so the window is benign. Fix (the second, correct option): the SetReady call site now carries an explicit comment documenting the narrow readiness contract — by SetReady the actor system exists, SiteStreamManager.Initialize has run, and every role actor has been created with ordered synchronous ActorOf + registration Tells; what is deliberately NOT guaranteed is PreStart completion or the central handshake. No regression test is meaningful for a documentation-only clarification of an intentional design contract; the existing HostStartupTests.SiteRole_* cover the surrounding startup path. Full Host suite green (175 passed).

Host-010 — No retry/backoff around startup preconditions (DB migration, readiness)


Severity	Low
Category	Error handling & resilience
Status	Resolved
Location	`src/ScadaLink.Host/Program.cs:112-125`

Description

On Central startup the Host opens a DI scope and calls MigrationHelper.ApplyOrValidateMigrationsAsync directly. If the SQL Server is not yet reachable (common in container orchestration where the DB and app start together), the call throws, the top-level catch logs Fatal, and the process exits. There is no bounded retry/backoff to tolerate a database that is briefly unavailable at boot. The design intent (REQ-HOST-4a, readiness gating, 503 until ready) is about serving traffic, but the migration step happens before the host even runs and has no such tolerance.

Recommendation

Wrap the migration/validation step in a bounded retry with exponential backoff (e.g. Polly), or move schema apply behind the readiness gate so the process stays up and reports 503 until the database becomes reachable.

Resolution

Resolved 2026-05-16 (commit pending). Root cause confirmed: Central startup called MigrationHelper.ApplyOrValidateMigrationsAsync directly with no tolerance for a database that is briefly unreachable at boot; a first connection failure threw, the top-level catch logged Fatal, and the process exited. Fix: added a new StartupRetry.ExecuteWithRetryAsync helper (bounded retry with capped exponential backoff, with no Polly dependency, keeping the Host's package set minimal) and wrapped the migration step in it (maxAttempts: 8, initialDelay: 2s, doubling, capped at 30s). Each retry opens a fresh DI scope so a transient connection fault does not leave a poisoned DbContext; once attempts are exhausted the last exception propagates and the existing fatal-log path still fires. Regression tests in new StartupRetryTests: ExecuteWithRetry_SucceedsFirstTry_RunsOnce, ExecuteWithRetry_TransientFailures_RetriesUntilSuccess (fails twice then succeeds), and ExecuteWithRetry_ExhaustsAttempts_RethrowsLastException (verifies the original exception type/message surfaces after exhaustion). They could not compile against the pre-fix code, because the helper did not exist, and pass after. Full Host suite green (175 passed).
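A bounded retry with capped exponential backoff can be sketched as below. This is not the StartupRetry source: the signature, the `maxDelay` and `sleep` parameters, and the choice to retry on every exception type are assumptions, and the injectable `sleep` exists only so the example runs without real waiting. With the parameters quoted above (8 attempts, 2s initial, 30s cap) the waits between attempts would be 2, 4, 8, 16, 30, 30 and 30 seconds.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Runs `operation` up to maxAttempts times, doubling the wait after each failure
// up to maxDelay. The final failure is not caught, so it propagates unchanged.
static async Task ExecuteWithRetryAsync(
    Func<Task> operation,
    int maxAttempts,
    TimeSpan initialDelay,
    TimeSpan maxDelay,
    Func<TimeSpan, Task>? sleep = null)
{
    sleep ??= d => Task.Delay(d);
    var delay = initialDelay;
    for (var attempt = 1; ; attempt++)
    {
        try
        {
            // In the Host, each attempt would open its own DI scope inside `operation`.
            await operation();
            return;
        }
        catch (Exception) when (attempt < maxAttempts)
        {
            await sleep(delay);
            delay = TimeSpan.FromTicks(Math.Min(delay.Ticks * 2, maxDelay.Ticks));
        }
    }
}

// Demo: an operation that fails twice and then succeeds, with waits recorded instead of slept.
var delays = new List<TimeSpan>();
var calls = 0;
await ExecuteWithRetryAsync(
    () => ++calls < 3
        ? Task.FromException(new InvalidOperationException("database unreachable"))
        : Task.CompletedTask,
    maxAttempts: 8,
    initialDelay: TimeSpan.FromSeconds(2),
    maxDelay: TimeSpan.FromSeconds(30),
    sleep: d => { delays.Add(d); return Task.CompletedTask; });

Console.WriteLine($"{calls} calls, {delays.Count} waits");
```

The exception filter is what keeps the last failure intact: on the final attempt the `when` clause is false, so the exception is never caught and reaches the caller with its original type and stack.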

Host-011 — `LoggingOptions.MinimumLevel` is dead configuration


Severity	Low
Category	Code organization & conventions
Status	Resolved
Location	`src/ScadaLink.Host/LoggingOptions.cs:5`, `src/ScadaLink.Host/Program.cs:42-50`

Description

LoggingOptions exposes a MinimumLevel property bound from ScadaLink:Logging (SiteServiceRegistration.BindSharedOptions), and both appsettings.Central.json and appsettings.Site.json set "Logging": { "MinimumLevel": "Information" }. However Serilog is configured purely via ReadFrom.Configuration(configuration), which reads the standard Serilog section — not ScadaLink:Logging. The LoggingOptions.MinimumLevel value is never read by any code, so changing it has no effect. This is misleading: an operator editing ScadaLink:Logging:MinimumLevel expecting a log-level change will see nothing happen.

Recommendation

Either consume LoggingOptions.MinimumLevel when configuring the Serilog LoggerConfiguration (e.g. set MinimumLevel.Is(...) from it), or remove the option class and the ScadaLink:Logging sections and rely solely on the Serilog configuration section. Keep one mechanism, not two.

Resolution

Resolved 2026-05-16 (commit pending). Root cause confirmed: LoggingOptions.MinimumLevel was bound from ScadaLink:Logging but Serilog was configured purely via ReadFrom.Configuration (standard Serilog section), so editing ScadaLink:Logging:MinimumLevel had no effect. Re-triaged the two options: the "remove the option class" branch was rejected because Component-Host.md REQ-HOST-3 explicitly lists ScadaLink:Logging → LoggingOptions and that design doc is outside this task's edit scope — removing the class would create a fresh code-vs-doc drift. Fix (the "consume it" branch): added a LoggerConfigurationFactory.Build that binds LoggingOptions, parses MinimumLevel into a Serilog LogEventLevel (falling back to Information for null/blank/unrecognised values), and pins it via MinimumLevel.Is(...) on top of ReadFrom.Configuration; Program.cs now builds the logger through this factory. ScadaLink:Logging:MinimumLevel is now live. Regression tests in new LoggerConfigurationTests: MinimumLevel_Warning_SuppressesInformationLogs, MinimumLevel_Debug_AllowsDebugLogs, and MinimumLevel_Absent_DefaultsToInformation — they assert the configured level actually filters log events; they pass against the new factory (and would fail against the old inline configuration, which ignored the key). Full Host suite green (175 passed).
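The level-parsing and pinning step can be sketched with Serilog alone. This is an assumption-laden illustration rather than the LoggerConfigurationFactory source: `ParseMinimumLevel` is a hypothetical name, and the sketch omits the ReadFrom.Configuration call that the real factory layers this on top of.

```csharp
using System;
using Serilog;
using Serilog.Events;

// Maps the configured string to a Serilog level, falling back to Information
// for null, blank, or unrecognised values (including out-of-range numbers).
static LogEventLevel ParseMinimumLevel(string? configured) =>
    Enum.TryParse<LogEventLevel>(configured, ignoreCase: true, out var level) && Enum.IsDefined(level)
        ? level
        : LogEventLevel.Information;

// Pin the parsed level on the logger configuration.
using var logger = new LoggerConfiguration()
    .MinimumLevel.Is(ParseMinimumLevel("Warning"))
    .CreateLogger();

Console.WriteLine($"Information enabled: {logger.IsEnabled(LogEventLevel.Information)}");
Console.WriteLine($"Error enabled: {logger.IsEnabled(LogEventLevel.Error)}");
```

The `Enum.IsDefined` guard is there because `Enum.TryParse` accepts any numeric string, so a value such as "99" would otherwise produce an undefined level instead of the fallback.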
