Eight area-focused reviews (BCD rewriter, multiplexer, response cache, supervisor + hot-reload, admin + diagnostics, hosting + options, test suite) plus an Overview that prioritises findings across areas, and a RemediationPlan that groups the work into three waves with per-item file:line citations and regression-test sketches. Findings call out: hot-reload tag-list/cache changes that don't reach the running multiplexer, a coalescing factory leak that hangs late attachers, backend-reader head-of-line block on a wedged upstream, stranded outbound frames after cascade, and ShutdownCoordinator double-stop ordering. Plus the unconventional 32-bit BCD wire format (two base-10000 digits in CDAB, not standard binary), unreachable BcdValidationError.DuplicateAddress, mbproxy.cache.flushed event that's defined but never emitted, and missing test coverage for Cache.AllowLongTtl. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
9.9 KiB
Hosting / DI / Options / Windows Service Audit
Scope: Program.cs, HostingExtensions.cs, Mbproxy.csproj, appsettings.json, Options/*.cs, install/, Mbproxy.slnx. Cross-checked against docs/design.md (Configuration, Logging, Startup posture), docs/Architecture/Overview.md, docs/Operations/Configuration.md. Tests under tests/Mbproxy.Tests/Options/MbproxyOptionsBindingTests.cs + tests/Mbproxy.Tests/HostSmokeTests.cs.
Summary
- The host pipeline is small, idiomatic for .NET 10
Host.CreateApplicationBuilder, and the Windows Service / console-fallback story is correctly handled viaAddWindowsService()+WindowsServiceHelpers.IsWindowsService(). - Hot-reload posture is mostly clean (
IOptionsMonitor<MbproxyOptions>everywhere it matters), with one regression:ShutdownCoordinatorconsumesIOptions<MbproxyOptions>and will freezeGracefulShutdownTimeoutMsat startup-time value. appsettings.jsonshipped insrc/Mbproxy/is an empty template (no PLCs, no BCD tags, noCachesection). The richer working example lives only ininstall/mbproxy.config.template.json— a maintenance-time hazard.- Single-file self-contained publish is wired correctly for net10.0; package versions are pinned and GA (no preview/RC strings present).
Critical findings
None. The service starts, reaches mbproxy.startup.ready, and shuts down cleanly per HostSmokeTests.
Major findings
-
ShutdownCoordinatorusesIOptions<MbproxyOptions>, defeating hot-reload ofGracefulShutdownTimeoutMs.src/Mbproxy/Diagnostics/ShutdownCoordinator.cs:93,102,118and the DI factory atsrc/Mbproxy/Program.cs:42.IOptions<T>is a singleton snapshot: any operator who editsConnection.GracefulShutdownTimeoutMsafter start will silently see the original value at shutdown. Perdocs/Operations/Configuration.md:160"GracefulShutdownTimeoutMsis sampled only at shutdown" — so the contract is "current config wins at stop time". Switch toIOptionsMonitor<MbproxyOptions>and read_options.CurrentValue.Connection.GracefulShutdownTimeoutMsinsideShutdownAsync. -
appsettings.jsonshipped with the binary is a near-empty stub.src/Mbproxy/appsettings.json:1shipsBcdTags.Global = [],Plcs = [], noCachesection, and no example tag. The fully-commented working example lives only ininstall/mbproxy.config.template.json. Copy-on-build (Mbproxy.csproj:52) will publish the empty stub next toMbproxy.exe, andinstall.ps1:99-105then copies that intoInstallPath. The Production binary therefore carries anappsettings.jsonthat only the install script's separate copy of the template will ever fix at%ProgramData%\mbproxy\…. A freshMbproxy.exerun from the publish folder yields a no-op service. Recommend either markingsrc/Mbproxy/appsettings.jsonCopyToOutputDirectory=Neverso the install template is the single source of truth, or making the shipped file mirror the template. -
ConfigReconciler.Attachhappens between supervisor construction andStartAsync— not "BEFORE accepting hot-reload events".src/Mbproxy/Proxy/ProxyWorker.cs:153. The comment claims this is safe because the channel "only enqueues" and apply runs async. That is true, but it does mean a config save that lands during the very early window betweenhost.RunAsync()and_reconciler.Attachis observed byIOptionsMonitor.OnChangebefore the reconciler has its supervisor map. Whether that is benign depends onConfigReconcilerinternals (it does buffer); worth a quick review to confirm the buffer is unbounded or that the missed event is replayed from_options.CurrentValueafterAttach. -
IHostApplicationLifetime.ApplicationStoppingcallback runscoordinator.ShutdownAsync().GetAwaiter().GetResult()synchronously on the lifetime thread.src/Mbproxy/Program.cs:60-66. The comment acknowledges async isn't supported and notes the coordinator manages its own deadline, which is correct. But the drain loop inShutdownCoordinator.ShutdownAsync(line 167)Task.Delay(10, drainCts.Token)ContinueWith chains will resume on the threadpool — fine — yet the synchronousGetResultblocks one IHost lifetime thread for up toGracefulShutdownTimeoutMs + ~7 s(5 s stop + drain + 2 s admin). That is by design, but worth documenting as the reason theGracefulShutdownTimeoutMsdefault (10 s) is bounded well below the SCM 30 s wait hint.
Minor findings
-
Bare-config-only Serilog wiring; the design's "Event Log sink for Error+ in service mode" is satisfied via a custom sink, not the standard
Serilog.Sinks.EventLogpackage.src/Mbproxy/HostingExtensions.cs:79-84+src/Mbproxy/Diagnostics/EventLogBridge.cs:25. This is intentional (the bridge silently no-ops if the source isn't registered, avoiding admin-requiredEventLog.CreateEventSourceat startup). Just call this out indocs/Operations/Configuration.mdso reviewers don't add a duplicate sink. -
ProxyWorkerregistered as singleton + hosted service via thesp.GetRequiredServicefactory pattern.src/Mbproxy/Program.cs:29-30and the same trick atsrc/Mbproxy/HostingExtensions.cs:57-58forAdminEndpointHost. The pattern is correct and idiomatic; just notice thatProxyWorkerholds a private mutable dictionary_supervisors(line 41) — singleton lifetime is correct because that dict is per-process, not per-PLC, andSupervisorsis exposed asIReadOnlyDictionaryfor the status page. -
MbproxyOptions.Plcstyped asIReadOnlyList<PlcOptions>.src/Mbproxy/Options/MbproxyOptions.cs:8. Configuration binder support forIReadOnlyList<T>works on .NET 8+ but historically had quirks;MbproxyOptionsBindingTests.cs:75confirms binding works. No action needed. -
AddJsonFileis implicit viaHost.CreateApplicationBuilder. Not called manually inProgram.cs. The default loader usesoptional: true, reloadOnChange: trueforappsettings.jsonandappsettings.{Environment}.json. Perdocs/Operations/Configuration.md:13the design statesreloadOnChange: true— satisfied by the default. Environment overrides and command-line are also picked up automatically byCreateApplicationBuilder(args)atProgram.cs:9. -
MbproxyOptionsValidatoris registered manually asIValidateOptions<>(line 28-30) AND.ValidateOnStart()is called (line 26). Correct combination, but noteValidateOnStart()only gates startup — runtime reload validation is delegated toReloadValidator(perConfiguration.md:245). The schema-level validator does NOT run on reload viaIOptionsMonitor.OnChangebecause options binding rebinds without re-invokingIValidateOptions<>. This is the design (the validation entry-point on reload is the reconciler), but it's a sharp edge — a width-of-8 sneaked in via hot-reload would NOT be caught byMbproxyOptionsValidator; it has to be re-applied byReloadValidator. -
Mbproxy.csproj:42describes Polly as "deferred" — but Polly is now used inPolicyFactory.BuildBackendConnect/ListenerRecovery. Comment is stale; remove it. -
No
appsettings.Development.jsonexists — fine, but the host will still attempt to load it (optional). If you want to suppress dev-only Serilog overrides in published artifacts, OK as-is.
What looks good
- TFM
net10.0, GA package versionsMicrosoft.Extensions.Hosting.WindowsServices 10.0.8,Serilog.Extensions.Hosting 10.0.0,Polly 8.6.6— all production releases, no*-preview*/*-rc*strings (Mbproxy.csproj:37-43). OutputType=Exe+Sdk="Microsoft.NET.Sdk.Worker"is the right combo for a Worker Service that also hosts ASP.NET Core (Kestrel admin) via<FrameworkReference Include="Microsoft.AspNetCore.App" />(line 31).- Single-file publish gated to Release only (
Mbproxy.csproj:22-27) — Debug iteration stays fast.IncludeNativeLibrariesForSelfExtract=trueis the correct hint for self-extracting native deps. AddWindowsService()is called unconditionally (Program.cs:12);EventLogBridgewiring is gated byWindowsServiceHelpers.IsWindowsService()so console runs don't depend on Event Log source registration (Program.cs:15,HostingExtensions.cs:79-84).- Cancellation flows correctly:
BackgroundService.ExecuteAsync(stoppingToken)is plumbed into supervisorStartAsync(stoppingToken)andWaitForInitialBindAttemptAsync(readyLinked.Token)(ProxyWorker.cs:163-177). - Install/uninstall scripts are idempotent:
install.ps1:71-84stops gracefully before rebuilding,:112-117updates an existing service in-place rather than recreating,:166-171checks Event Log source existence before creating,:154-162preserves an existing operator config.uninstall.ps1:75-93archives logs to a timestamped folder rather than deleting — excellent operational instinct. - Failure-recovery actions are configured (
install.ps1:127):restart/60000/restart/60000/""/0with a 24 h reset window — restart twice then back off, which is the right policy for a fleet proxy that should not flap-restart. Mbproxy.slnx(the new XML solution format) is correctly minimal and references only the two projects.
Open questions
- Is the empty
src/Mbproxy/appsettings.jsonshipping intentionally so operators are forced into%ProgramData%? If so, document it explicitly indocs/Operations/Configuration.md. If not, mirror the install template into the source tree. - Should
ShutdownCoordinatorswitch toIOptionsMonitor<T>to honour a hot-reloadedGracefulShutdownTimeoutMs, or is the doc claim ("sampled only at shutdown") meant to mean "as of the last reload before shutdown was requested"? Either intent is fine, but the code currently means "as of process start" and that contradicts both readings. - Is the missing per-PLC reload-validation re-run of
MbproxyOptionsValidator(Width != 16/32) covered byReloadValidatortest coverage? The width check appears to live only inMbproxyOptionsValidator(MbproxyOptions.cs:62-63) — worth confirmingReloadValidatorre-checks it on hot-reload, or it could be skipped on a runtime config change.