mbproxy: Wave 2 fixes from 2026-05-14 code review

Resolves the 21 Major findings catalogued in
codereviews/2026-05-14/RemediationPlan.md (Wave 2). Tests: 370 pass / 0 fail
(baseline 363 + 7 new W2 regression tests).

Multiplexer / concurrency:
  W2.1  ConfigReconciler.Attach now threads the live coalescingAccessor through
        to add/restart-built supervisors so a hot-reload of
        ReadCoalescing.{Enabled,MaxParties} propagates to PLCs added or
        restarted via reload.
  W2.2  PlcMultiplexer._disposed and UpstreamPipe._disposed are now volatile
        for ARM/portability defense.
  W2.3  ProxyWorker._supervisors / ConfigReconciler._supervisors switched from
        Dictionary to ConcurrentDictionary; reconciler uses TryRemove. The
        outer Apply is serialised by a semaphore but the inner Add/Remove/
        Restart Task.WhenAll continuations run in parallel.
  W2.4  Counter parity for cache miss + coalescing-saturation miss documented
        inline (per-design contract; behavior unchanged).
  W2.5  _disposeCts.Dispose() and _connectGate.Dispose() guarded against late
        watchdog ticks.
  W2.6  _connectGate disposed in DisposeAsync.
  W2.7  Inline doc clarifying the post-rewriter FC byte read.
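
The W2.3 situation (a serialised outer step whose inner per-PLC tasks mutate one shared map in parallel) can be sketched roughly as follows. This is a minimal illustration, not the mbproxy code: `ParallelRegistrySketch`, `AddThenRemoveAsync`, and the `plc-N` keys are hypothetical names, and plain `object` values stand in for supervisors.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical sketch: names are illustrative and not taken from mbproxy.
public static class ParallelRegistrySketch
{
    // Adds `count` entries from parallel tasks, then removes the even-numbered
    // ones from parallel tasks, and returns how many entries remain.
    public static async Task<int> AddThenRemoveAsync(int count)
    {
        var registry = new ConcurrentDictionary<string, object>(StringComparer.Ordinal);

        // Parallel adds: safe on ConcurrentDictionary, unsafe on a plain Dictionary.
        await Task.WhenAll(Enumerable.Range(0, count)
            .Select(i => Task.Run(() => registry.TryAdd($"plc-{i}", new object()))));

        // Parallel removes use TryRemove, which reports whether the key was present.
        await Task.WhenAll(Enumerable.Range(0, count)
            .Where(i => i % 2 == 0)
            .Select(i => Task.Run(() => registry.TryRemove($"plc-{i}", out _))));

        return registry.Count;
    }
}
```

With a plain `Dictionary` the same concurrent writes are not supported and can corrupt the table, which is the hazard the commit message describes.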

Cache / hot-reload:
  W2.8  PlcListenerSupervisor.ReplaceContextAsync now calls Clear() to capture
        the entry count, emits mbproxy.cache.flushed, then disposes the old
        cache. Previously the event was defined but never emitted.
  W2.9  Inline doc explaining the implicit "skip cache invalidation while
        recovering" gating (no backend reader during recovery → no FC06/FC16
        response → no invalidation).
  W2.10 ReloadValidator now re-checks resolved per-tag CacheTtlMs against
        Cache.AllowLongTtl after BcdTagMapBuilder folds the per-PLC default.
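
The W2.10 check, validating the TTL only after the per-PLC default has been folded in, might look something like the sketch below. It is an assumption-laden illustration: `TtlCapSketch` and `IsTtlAllowed` are hypothetical names, and the 60 000 ms threshold is inferred from the `Above60s` regression test names rather than read from the validator.

```csharp
using System;

// Hypothetical sketch: names and the threshold constant are assumptions.
public static class TtlCapSketch
{
    // Assumed cap, inferred from the "Above60s" test names in this commit.
    public const int LongTtlThresholdMs = 60_000;

    // Resolves the effective TTL first (per-tag value, else per-PLC default),
    // then applies the cap to the resolved value rather than the raw per-tag one.
    public static bool IsTtlAllowed(int? perTagTtlMs, int perPlcDefaultTtlMs, bool allowLongTtl)
    {
        int resolved = perTagTtlMs ?? perPlcDefaultTtlMs;
        return allowLongTtl || resolved <= LongTtlThresholdMs;
    }
}
```

Checking only the raw per-tag value would miss a tag that has no TTL of its own but inherits an over-cap default, which is the case `Validate_ResolvedTtl_FromPerPlcDefault_AboveCap_Fails` appears to cover.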

BCD rewriter:
  W2.11 Duplicate addresses detected within Global itself and within the per-PLC
        Add list itself, BEFORE the working dictionary collapses keys. Cross-list
        collisions (Global vs Add) remain the documented width-override pattern.
        Previously the DuplicateAddress error was unreachable dead code.
  W2.12 OverlappingHighRegister reports each colliding pair exactly once
        (canonicalised low/high pair tracked in a HashSet).
  W2.13 FC16 32-bit write rejects clientLow > 9999 or clientHigh > 9999 BEFORE
        the high*10000+low reconstruction. Without this guard, (high=9999,
        low=9999) silently re-encoded as (high=9998, low=9999), losing 1 from
        the high word.
  W2.14 FC16 validates pdu.Length >= 6 + qty*2 upfront, so a malformed client
        that claims more registers than it ships never produces a
        half-rewritten request.
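
The two FC16 guards (W2.13 and W2.14) can be sketched as below. This is a hedged illustration, not the rewriter itself: `Fc16GuardSketch` and its method names are hypothetical, and it assumes the client words are plain decimal values in the range 0..9999. The PDU layout used (function code, 2-byte start address, 2-byte quantity, 1-byte byte count, then `qty * 2` data bytes) is the standard Modbus FC16 request.

```csharp
using System;

// Hypothetical sketch: names are illustrative and not taken from mbproxy.
public static class Fc16GuardSketch
{
    // FC16 request PDU: FC (1) + start address (2) + quantity (2) + byte count (1)
    // + quantity * 2 data bytes. Returns false when the data is shorter than claimed.
    public static bool HasCompleteRegisterData(ReadOnlySpan<byte> pdu)
    {
        if (pdu.Length < 6) return false;
        int qty = (pdu[3] << 8) | pdu[4]; // quantity is big-endian
        return pdu.Length >= 6 + qty * 2;
    }

    // Combines two decimal words into one value only when both are valid
    // 4-digit words. Without the range check, low = 10000 would carry into
    // the high word: (1, 10000) would combine to the same value as (2, 0).
    public static bool TryCombine(ushort clientHigh, ushort clientLow, out uint value)
    {
        value = 0;
        if (clientHigh > 9999 || clientLow > 9999) return false;
        value = (uint)clientHigh * 10000u + clientLow;
        return true;
    }
}
```

Running both checks before any byte is rewritten keeps the request either fully rewritten or passed through raw, matching the `PassesThroughRaw` regression test names.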

Supervisor:
  W2.15 WaitForInitialBindAttemptAsync is now backed by a TaskCompletionSource
        instead of a 10 ms busy-poll. Resolves the race against fast
        Stopped→Bound→Stopped transitions and the hang when the supervisor
        task throws.
  W2.16 StartAsync refuses re-entry on a non-Stopped supervisor (was leaking
        the previous _supervisorCts).
  W2.17 New TransitionTo helper writes _state, _lastBindError, and (optionally)
        _recoveryAttempts under one lock. Snapshot() reads under the same lock
        so the status page never reports an inconsistent triple. Truncate
        helper extracted (was copy-pasted across three sites).
  W2.18 MbproxyOptionsValidator + ReloadValidator reject
        Connection.{BackendConnectTimeoutMs, BackendRequestTimeoutMs,
        GracefulShutdownTimeoutMs} <= 0. A misconfigured 0 produces immediate
        CancelAfter(0) failures.
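
The W2.15 change from polling to a completion signal can be illustrated with the sketch below. It is not the supervisor's actual code: `InitialBindSignalSketch`, `ReportBindAttempt`, and `ReportFault` are hypothetical names, and `Task.WaitAsync` assumes .NET 6 or later.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical sketch: names are illustrative and not taken from mbproxy.
public sealed class InitialBindSignalSketch
{
    // RunContinuationsAsynchronously keeps waiters from running inline on the
    // thread that completes the source.
    private readonly TaskCompletionSource<bool> _firstAttempt =
        new(TaskCreationOptions.RunContinuationsAsynchronously);

    // Called by the supervisor loop after its first bind attempt. Only the
    // first call wins, so a later Bound -> Stopped flip cannot be missed.
    public void ReportBindAttempt(bool bound) => _firstAttempt.TrySetResult(bound);

    // Called if the supervisor task throws, so waiters fault instead of hanging.
    public void ReportFault(Exception ex) => _firstAttempt.TrySetException(ex);

    // Completes as soon as the first attempt (or a fault) has been reported.
    public Task<bool> WaitForInitialBindAttemptAsync(CancellationToken ct) =>
        _firstAttempt.Task.WaitAsync(ct);
}
```

A poll that samples state every 10 ms can miss a transition that starts and ends between two samples, whereas the completion source latches the first reported outcome.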

Hosting / diagnostics:
  W2.20 ProxyWorker.StopAsync supervisor-stop deadline now reads from
        IOptionsMonitor.CurrentValue.Connection.GracefulShutdownTimeoutMs
        (was hard-coded 5s).
  W2.21 src/Mbproxy/appsettings.json deleted; the published file is now a Link
        to install/mbproxy.config.template.json so the binary ships with a
        usable, fully-commented example config instead of an empty stub. Tests
        strip the inherited file from their bin via an AfterTargets="Build"
        Target so they don't pick up the template's example PLCs.
  W2.22 invalidBcdWarnings (PlcPdusStatus) and codeOther (ExceptionCounts)
        added to StatusDto, plumbed through StatusSnapshotBuilder, surfaced
        in StatusHtmlRenderer table cells.
  W2.23 EventLogBridge caches EventLog.SourceExists at construction so Emit
        doesn't hit the registry on every Error+ log line.

New regression tests:
  ReloadValidatorTests:
    Validate_PerTagCacheTtl_Above60s_Without_AllowLongTtl_Fails
    Validate_PerTagCacheTtl_Above60s_With_AllowLongTtl_Passes
    Validate_ResolvedTtl_FromPerPlcDefault_AboveCap_Fails
    Validate_ZeroBackendConnectTimeoutMs_Fails
    Validate_NegativeGracefulShutdownTimeoutMs_Fails
  BcdPduPipelineTests:
    FC16_32Bit_ClientHighOrLowAbove9999_PassesThroughRaw_WithInvalidBcdWarning
    FC16_TruncatedRegisterData_PassesThroughRaw_NoPartialRewrite

Reworked tests in BcdTagMapBuilderTests for the W2.11 contract (Global dup,
Add dup, Add-overrides-Global accepted as width override).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

@@ -1,3 +1,4 @@
+using System.Collections.Concurrent;
 using System.Diagnostics;
 using Mbproxy.Admin;
 using Mbproxy.Bcd;
@@ -49,7 +50,13 @@ internal sealed partial class ProxyWorker : BackgroundService
     // Phase 06: supervisors are now managed jointly by ProxyWorker (initial bootstrap)
     // and ConfigReconciler (subsequent hot-reload changes). The dictionary is shared
     // via ConfigReconciler.Attach() after initial startup.
-    private readonly Dictionary<string, PlcListenerSupervisor> _supervisors = new(StringComparer.Ordinal);
+    //
+    // Phase 12 (W2.3) — ConcurrentDictionary because ConfigReconciler mutates this from
+    // parallel Task.WhenAll continuations (Add/Remove/Restart paths). The outer Apply is
+    // serialised by a semaphore but the inner per-PLC tasks run concurrently. Status-page
+    // reads via IReadOnlyDictionary still work without locking.
+    private readonly ConcurrentDictionary<string, PlcListenerSupervisor> _supervisors =
+        new(StringComparer.Ordinal);
 
     /// <summary>
     /// Read-only view of the live supervisor dictionary. Consumed by Phase 07's
@@ -164,7 +171,11 @@ internal sealed partial class ProxyWorker : BackgroundService
         // initial options snapshot. The reconciler won't process OnChange events until
         // after this call — the brief window between Attach and first supervisor start
         // is safe because the channel signal only enqueues; apply runs asynchronously.
-        _reconciler.Attach(_supervisors, opts);
+        // Phase 12 (W2.1) — also pass the live coalescing accessor so reconciler-built
+        // supervisors (add/restart paths) honour hot-reloaded ReadCoalescing values.
+        Func<ReadCoalescingOptions> reconcilerCoalescingAccessor =
+            () => _options.CurrentValue.Resilience.ReadCoalescing;
+        _reconciler.Attach(_supervisors, opts, reconcilerCoalescingAccessor);
 
         if (_supervisors.Count == 0)
         {
@@ -252,9 +263,13 @@ internal sealed partial class ProxyWorker : BackgroundService
         await base.StopAsync(cancellationToken).ConfigureAwait(false);
 
         var sw = Stopwatch.StartNew();
+        // Phase 12 (W2.20) — supervisor stop deadline read from the live config so a
+        // hot-reloaded GracefulShutdownTimeoutMs is honoured. Previously hard-coded 5 s.
+        // The supervisor stop budget is bounded by the same total-shutdown budget.
+        int gracefulMs = _options.CurrentValue.Connection.GracefulShutdownTimeoutMs;
 
         // ── 1. Stop accepting new connections ─────────────────────────────────────────
-        using var stopCts = new CancellationTokenSource(TimeSpan.FromSeconds(5));
+        using var stopCts = new CancellationTokenSource(TimeSpan.FromMilliseconds(gracefulMs));
         using var linked = CancellationTokenSource.CreateLinkedTokenSource(
             stopCts.Token, cancellationToken);
 
@@ -272,9 +287,8 @@ internal sealed partial class ProxyWorker : BackgroundService
         }
 
         // ── 2. Drain in-flight PDUs ───────────────────────────────────────────────────
-        // Reads the current configured deadline so a hot-reloaded
-        // GracefulShutdownTimeoutMs is honoured at stop time, not frozen at process start.
-        int drainDeadlineMs = _options.CurrentValue.Connection.GracefulShutdownTimeoutMs;
+        // Same `gracefulMs` budget the supervisor-stop step used.
+        int drainDeadlineMs = gracefulMs;
         int inFlightAtCancel = 0;
 
         if (drainDeadlineMs > 0)
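
The stop-deadline pattern in the StopAsync hunks (a timeout token built from a configured millisecond budget, linked with the caller's token) can be sketched in isolation as follows. This is an illustrative sketch under assumptions, not the ProxyWorker code: `StopDeadlineSketch` and `RunWithDeadlineAsync` are hypothetical names, and `gracefulMs` is passed in directly instead of being read from an options monitor.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical sketch: names are illustrative and not taken from mbproxy.
public static class StopDeadlineSketch
{
    // Returns true if `work` finished before the deadline, false if the
    // deadline or the caller's token cancelled it first.
    public static async Task<bool> RunWithDeadlineAsync(
        Func<CancellationToken, Task> work, int gracefulMs, CancellationToken callerToken)
    {
        // The deadline comes from a value read at stop time, not a hard-coded constant.
        using var stopCts = new CancellationTokenSource(TimeSpan.FromMilliseconds(gracefulMs));
        using var linked = CancellationTokenSource.CreateLinkedTokenSource(
            stopCts.Token, callerToken);
        try
        {
            await work(linked.Token).ConfigureAwait(false);
            return true;
        }
        catch (OperationCanceledException)
        {
            return false;
        }
    }
}
```

A budget of 0 would cancel the token immediately, which is consistent with the W2.18 validators rejecting non-positive timeouts.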