mbproxy: Wave 1 fixes from 2026-05-14 code review

Resolves the four critical correctness defects + the ShutdownCoordinator double-stop ordering bug called out in codereviews/2026-05-14/Overview.md.

Tests: 362 pass / 0 fail (baseline 358 + 4 new W1 regression tests).

W1.1 — Context swap on running multiplexer.
PlcMultiplexer._ctx becomes volatile with a new ReplaceContext() method that re-registers the cache stats provider on the (preserved) counters. PlcListener exposes its multiplexer; PlcListenerSupervisor.ReplaceContextAsync swaps the running mux first, then disposes the old cache. Hot-reload tag-list changes and the cache-flush-on-reload contract now actually take effect on the next PDU instead of waiting for the next listener fault.

W1.2 — Coalescing factory leak.
When the InFlightByKey factory soft-fails (allocator saturation or duplicate TxId), the cleanup path now TryRemoves the stub and walks every party on it (including late attachers) to deliver Modbus exception 0x04. Previously only the leader got the exception; late attachers waited forever for a response that no backend round-trip would ever fire.

W1.3 — Backend-reader head-of-line block.
UpstreamPipe gains TrySendResponse for non-blocking enqueue. The per-PLC backend reader's fan-out loop uses it instead of awaiting SendResponseAsync, so a wedged upstream's full bounded response channel can no longer stall the single backend reader and starve every other client on that PLC. New responseDropForFullUpstream counter on ProxyCounters / CounterSnapshot records the drops.

W1.4 — Stranded outbound frames after cascade.
TearDownBackendAsync acquires _connectGate and drains any frames left in _outboundChannel after the writer task faulted/cancelled, releasing their proxy TxIds back to the allocator. Without this, a fresh EnsureBackendConnectedAsync racing the cascade would send stranded frames with old TxIds onto the new backend socket; the responses would arrive with no correlation entry and the upstream peers would hang on the watchdog until BackendRequestTimeoutMs.

W1.5 — Delete ShutdownCoordinator (Option B).
Drain logic moved into ProxyWorker.StopAsync. AdminEndpointHost is no longer registered as IHostedService; ProxyWorker drives its lifecycle directly so admin starts after listeners are bound and stops AFTER the in-flight drain (the design's documented contract). Admin is resolved lazily in ExecuteAsync to break the circular DI graph (Admin -> StatusSnapshotBuilder -> ProxyWorker). GracefulShutdownTimeoutMs is now read fresh from IOptionsMonitor.CurrentValue at stop time, so a hot-reloaded value is honoured. Removes ShutdownCoordinator + tests.

New tests:
  PlcMultiplexerTests.ReplaceContext_NewTagMap_VisibleOnNextPdu
  PlcMultiplexerTests.ReplaceContext_NewCache_NextReadGoesToBackend_NotOldCache
  UpstreamPipeTests.TrySendResponse_WhenChannelFull_ReturnsFalse_WithoutBlocking
  UpstreamPipeTests.TrySendResponse_AfterDispose_ReturnsFalse

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
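
The W1.3 fix depends on a bounded channel's TryWrite returning false instead of waiting when the channel is full. The following is a minimal sketch of that fan-out pattern, not the mbproxy code itself: FanOutSketch, FanOut, and the two channels are hypothetical stand-ins for the backend reader loop and the per-pipe response channels. It assumes the channels use BoundedChannelFullMode.Wait; the drop modes make TryWrite return true and discard an item silently, so the drop could not be counted.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Channels;

static class FanOutSketch
{
    // Hypothetical stand-in for the backend reader's fan-out loop.
    // Returns the number of frames dropped because a destination channel was full.
    public static int FanOut(byte[] frame, IEnumerable<Channel<byte[]>> upstreams)
    {
        int drops = 0;
        foreach (var upstream in upstreams)
        {
            // TryWrite never waits: on a full bounded channel in Wait mode it returns false.
            if (!upstream.Writer.TryWrite(frame))
                drops++;
        }
        return drops;
    }

    static void Main()
    {
        // Capacity 1 keeps the example small; real capacities are an application choice.
        var options = new BoundedChannelOptions(1) { FullMode = BoundedChannelFullMode.Wait };
        var wedged = Channel.CreateBounded<byte[]>(options);
        var healthy = Channel.CreateBounded<byte[]>(options);

        // Simulate a client that is not draining its socket: its channel is already full.
        wedged.Writer.TryWrite(new byte[] { 0x00, 0x01 });

        int drops = FanOut(new byte[] { 0x00, 0x02 }, new[] { wedged, healthy });
        Console.WriteLine($"drops={drops}, healthy queued={healthy.Reader.Count}");
        // prints "drops=1, healthy queued=1"
    }
}
```

The wedged channel costs one dropped frame, while the healthy channel still receives its copy without the loop ever awaiting.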
2026-05-14 05:16:13 -04:00
parent f2c6669444
commit ce32c5cee8
14 changed files with 614 additions and 532 deletions
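
The W1.1 change in the first hunk below relies on each operation reading the volatile context field once into a local, so that a swap in the middle of an operation cannot mix an old tag map with a new cache. A rough sketch of that pattern follows, under the assumption that the context is an immutable object; Ctx, Mux, Handle, and the sample values are hypothetical simplifications, not the real PerPlcContext or PlcMultiplexer.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, simplified stand-in for the per-PLC context: an immutable pair.
sealed record Ctx(IReadOnlyDictionary<ushort, string> TagMap, string CacheName);

sealed class Mux
{
    // Volatile so a swap made on another thread is observed by the next read.
    private volatile Ctx _ctx;

    public Mux(Ctx initial) => _ctx = initial;

    // Single mutator: reference writes are atomic, so readers see old or new, never a mix.
    public void ReplaceContext(Ctx next) => _ctx = next;

    public string Handle(ushort address)
    {
        // Snapshot once; every later access in this operation uses the same pair.
        var ctx = _ctx;
        string tag = ctx.TagMap.TryGetValue(address, out var name) ? name : "?";
        return $"{tag}@{ctx.CacheName}";
    }
}

static class Program
{
    static void Main()
    {
        var mux = new Mux(new Ctx(new Dictionary<ushort, string> { [100] = "old" }, "cacheA"));
        Console.WriteLine(mux.Handle(100)); // prints "old@cacheA"

        mux.ReplaceContext(new Ctx(new Dictionary<ushort, string> { [100] = "new" }, "cacheB"));
        Console.WriteLine(mux.Handle(100)); // prints "new@cacheB"
    }
}
```

An operation that took its snapshot before the swap finishes on the old pair, which matches the "one-PDU old map tail" the doc comment describes as acceptable.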
@@ -47,7 +47,12 @@ internal sealed class PlcMultiplexer : IAsyncDisposable, IMultiplexCountersProvi
    private readonly PlcOptions _plc;
    private readonly ConnectionOptions _connectionOptions;
    private readonly IPduPipeline _pipeline;
-    private readonly PerPlcContext _ctx;
+
+    // Phase 12 (W1.1) — `_ctx` is volatile so a hot-reload reseat can swap it on the running
+    // multiplexer. Each method that uses the context snapshots it into a local at the start
+    // of the operation so a single PDU sees a consistent (TagMap, Cache) pair even if the
+    // swap fires mid-PDU. ReplaceContext is the single mutator.
+    private volatile PerPlcContext _ctx;
    private readonly ILogger<PlcMultiplexer> _logger;
    private readonly ResiliencePipeline? _backendConnectPipeline;
    // Phase 10: live read-coalescing config accessor. The accessor is read per-PDU on the
@@ -145,6 +150,35 @@ internal sealed class PlcMultiplexer : IAsyncDisposable, IMultiplexCountersProvi
        _pipes[pipe.Id] = pipe;
    }

+    /// <summary>
+    /// Phase 12 (W1.1) — atomically swaps the per-PLC context on a running multiplexer.
+    /// Called by <see cref="Supervision.PlcListenerSupervisor.ReplaceContextAsync"/> when a
+    /// hot-reload tag-list change is applied to a PLC whose listener is already bound.
+    ///
+    /// <para>The new context's tag map and (optional) response cache become visible on the
+    /// next PDU through the volatile <c>_ctx</c> field. Counters are PRESERVED across reseat
+    /// (the supervisor builds the new context with the running counters), so we only need
+    /// to re-register the cache stats provider — the multiplex provider already points at
+    /// this same instance.</para>
+    ///
+    /// <para>Existing per-call snapshots of the old context held by in-flight PDUs (via
+    /// <c>WithCurrentRequest</c>) finish on the old map. New PDUs after this call see the
+    /// new map. Per the design contract a one-PDU "old map" tail is acceptable; partial-BCD
+    /// rewrites mid-request would be worse.</para>
+    /// </summary>
+    public void ReplaceContext(PerPlcContext newContext)
+    {
+        if (_disposed) return;
+
+        _ctx = newContext;
+
+        // Re-register the cache stats provider on the (preserved) counters so the status
+        // page sees the new cache's count/bytes immediately. Pass null when the new context
+        // opted out of caching to clear any stale provider from the previous context.
+        newContext.Counters.SetCacheStatsProvider(
+            newContext.Cache is not null ? new CacheStatsAdapter(newContext.Cache) : null);
+    }
+
    /// <summary>
    /// Starts the read+write tasks for <paramref name="pipe"/> and returns a task that
    /// completes when the pipe's read loop ends. The multiplexer detaches the pipe when
@@ -284,73 +318,98 @@ internal sealed class PlcMultiplexer : IAsyncDisposable, IMultiplexCountersProvi

    private async Task TearDownBackendAsync(string reason, bool cascadeUpstreams)
    {
-        Socket? oldSocket;
-        CancellationTokenSource? oldCts;
-        Task? writer, reader;
-        lock (_backendLock)
+        // Phase 12 (W1.4) — serialise tear-down vs connect-up via the connect gate. Without
+        // this, a fresh EnsureBackendConnectedAsync racing with the channel drain below
+        // could see stranded frames sent on its new socket with old (already-released) TxIds,
+        // producing orphaned responses that hang upstream peers via the watchdog.
+        await _connectGate.WaitAsync().ConfigureAwait(false);
+        try
        {
-            oldSocket = _backendSocket;
-            oldCts    = _backendCts;
-            writer    = _backendWriterTask;
-            reader    = _backendReaderTask;
-
-            _backendSocket    = null;
-            _backendCts       = null;
-            _backendWriterTask = null;
-            _backendReaderTask = null;
-        }
-
-        if (oldSocket is null && oldCts is null) return;
-
-        try { oldCts?.Cancel(); } catch { /* best effort */ }
-
-        try { oldSocket?.Shutdown(SocketShutdown.Both); } catch { /* already closed */ }
-        try { oldSocket?.Dispose(); } catch { /* best effort */ }
-
-        // Drain correlation map; cascade-close every interested upstream pipe.
-        var dropped = _correlation.DrainAll();
-        var cascadeIds = new HashSet<Guid>();
-
-        foreach (var kvp in dropped)
-        {
-            _allocator.Release(kvp.Key);
-            foreach (var party in kvp.Value.InterestedParties)
-                cascadeIds.Add(party.Pipe.Id);
-        }
-
-        // Phase 10 — also drain the in-flight-by-key map so a brand-new identical request
-        // through the freshly-reconnected backend is treated as a miss (no stale entries
-        // outlive the backend they were destined for).
-        _inFlightByKey.DrainAll();
-
-        int upstreamCount = 0;
-        if (cascadeUpstreams)
-        {
-            // Close every attached pipe that had a request in flight; the others will
-            // simply re-issue on next request through a fresh backend connect.
-            // Per the design doc, ALL attached upstreams cascade on backend disconnect.
-            upstreamCount = _pipes.Count;
-
-            // Snapshot keys before disposal modifies the dictionary indirectly.
-            var pipeList = _pipes.Values.ToArray();
-            foreach (var pipe in pipeList)
+            Socket? oldSocket;
+            CancellationTokenSource? oldCts;
+            Task? writer, reader;
+            lock (_backendLock)
            {
-                try { await pipe.DisposeAsync().ConfigureAwait(false); }
-                catch { /* best effort */ }
+                oldSocket = _backendSocket;
+                oldCts    = _backendCts;
+                writer    = _backendWriterTask;
+                reader    = _backendReaderTask;
+
+                _backendSocket    = null;
+                _backendCts       = null;
+                _backendWriterTask = null;
+                _backendReaderTask = null;
            }
-            _pipes.Clear();

-            _ctx.Counters.AddDisconnectCascades(upstreamCount);
+            if (oldSocket is null && oldCts is null) return;
+
+            try { oldCts?.Cancel(); } catch { /* best effort */ }
+
+            try { oldSocket?.Shutdown(SocketShutdown.Both); } catch { /* already closed */ }
+            try { oldSocket?.Dispose(); } catch { /* best effort */ }
+
+            // Drain the correlation map and release each dropped proxy TxId back to the allocator.
+            var dropped = _correlation.DrainAll();
+
+            foreach (var kvp in dropped)
+            {
+                _allocator.Release(kvp.Key);
+            }
+
+            // Phase 10 — also drain the in-flight-by-key map so a brand-new identical request
+            // through the freshly-reconnected backend is treated as a miss (no stale entries
+            // outlive the backend they were destined for).
+            _inFlightByKey.DrainAll();
+
+            int upstreamCount = 0;
+            if (cascadeUpstreams)
+            {
+                // Close every attached pipe that had a request in flight; the others will
+                // simply re-issue on next request through a fresh backend connect.
+                // Per the design doc, ALL attached upstreams cascade on backend disconnect.
+                upstreamCount = _pipes.Count;
+
+                // Snapshot keys before disposal modifies the dictionary indirectly.
+                var pipeList = _pipes.Values.ToArray();
+                foreach (var pipe in pipeList)
+                {
+                    try { await pipe.DisposeAsync().ConfigureAwait(false); }
+                    catch { /* best effort */ }
+                }
+                _pipes.Clear();
+
+                _ctx.Counters.AddDisconnectCascades(upstreamCount);
+            }
+
+            // Phase 12 (W1.4) — drain any stranded frames left in the outbound channel by
+            // the writer task that just faulted/cancelled. Release their proxy TxIds back
+            // to the allocator. By the time we reach this line the writer has stopped
+            // reading from the channel (cancelled CTS) and the upstream pipes have been
+            // cascaded (no more enqueues), so the channel state is stable.
+            int strandedDropped = 0;
+            while (_outboundChannel.Reader.TryRead(out byte[]? stranded))
+            {
+                if (stranded.Length >= 2)
+                {
+                    ushort strandedTxId = (ushort)((stranded[0] << 8) | stranded[1]);
+                    _allocator.Release(strandedTxId);
+                }
+                strandedDropped++;
+            }
+
+            // Best-effort join.
+            try { if (writer is not null) await writer.WaitAsync(TimeSpan.FromSeconds(2)).ConfigureAwait(false); } catch { /* swallow */ }
+            try { if (reader is not null) await reader.WaitAsync(TimeSpan.FromSeconds(2)).ConfigureAwait(false); } catch { /* swallow */ }
+
+            oldCts?.Dispose();
+
+            if (upstreamCount > 0 || dropped.Count > 0 || strandedDropped > 0)
+                MultiplexerLogEvents.BackendDisconnected(_logger, _plc.Name, upstreamCount, dropped.Count + strandedDropped, reason);
+        }
+        finally
+        {
+            _connectGate.Release();
        }
-
-        // Best-effort join.
-        try { if (writer is not null) await writer.WaitAsync(TimeSpan.FromSeconds(2)).ConfigureAwait(false); } catch { /* swallow */ }
-        try { if (reader is not null) await reader.WaitAsync(TimeSpan.FromSeconds(2)).ConfigureAwait(false); } catch { /* swallow */ }
-
-        oldCts?.Dispose();
-
-        if (upstreamCount > 0 || dropped.Count > 0)
-            MultiplexerLogEvents.BackendDisconnected(_logger, _plc.Name, upstreamCount, dropped.Count, reason);
    }

    // ── Backend writer / reader tasks ─────────────────────────────────────────
@@ -513,6 +572,13 @@ internal sealed class PlcMultiplexer : IAsyncDisposable, IMultiplexCountersProvi
                // Phase 9: always exactly one party. Phase 10: N parties (read coalescing).
                // Note: the InFlightByKey TryRemove above (for FC03/FC04) guarantees no
                // further attaches can occur — the parties list is now a stable snapshot.
+                //
+                // Phase 12 (W1.3) — non-blocking fan-out via `TrySendResponse`. The
+                // single backend reader task must NEVER `await` a per-upstream channel
+                // write: a wedged upstream (full bounded response channel) would otherwise
+                // stall the reader and starve every other client on this PLC. A drop here
+                // is recorded via `responseDropForFullUpstream`; the wedged upstream loses
+                // its own response and will be reaped by its own socket-close path.
                foreach (var party in inFlight.InterestedParties)
                {
                    if (!party.Pipe.IsAlive)
@@ -542,7 +608,10 @@ internal sealed class PlcMultiplexer : IAsyncDisposable, IMultiplexCountersProvi
                    outFrame[0] = (byte)(party.OriginalTxId >> 8);
                    outFrame[1] = (byte)(party.OriginalTxId & 0xFF);

-                    await party.Pipe.SendResponseAsync(outFrame, ct).ConfigureAwait(false);
+                    if (!party.Pipe.TrySendResponse(outFrame))
+                    {
+                        _ctx.Counters.IncrementResponseDropForFullUpstream();
+                    }
                }
            }

@@ -723,12 +792,38 @@ internal sealed class PlcMultiplexer : IAsyncDisposable, IMultiplexCountersProvi

            if (inFlightForSend is null)
            {
-                // The factory hit the allocator-saturation path or a duplicate-key race.
-                // Surface a Modbus exception 04 to the upstream and clean up.
+                // Phase 12 (W1.2) — the factory hit the allocator-saturation path or a
+                // duplicate-key race and stored a stub `InFlightRequest` under `key`. Late
+                // attachers may have joined the stub between the factory call and this
+                // cleanup; we must deliver the saturation exception to ALL of them, not just
+                // the leader, otherwise the late attachers wait forever for a response that
+                // never comes (the stub has no proxy TxId, so no backend round-trip will
+                // ever fire).
                MultiplexerLogEvents.Saturated(_logger, _plc.Name, pipe.RemoteEp?.ToString() ?? "?");
-                byte[] excFrame = BuildExceptionFrame(originalTxId, unitId, fcByte, exceptionCode: 4);
-                _inFlightByKey.TryRemove(key, out _);
-                await pipe.SendResponseAsync(excFrame, ct).ConfigureAwait(false);
+
+                if (_inFlightByKey.TryRemove(key, out var stub))
+                {
+                    foreach (var party in stub.InterestedParties)
+                    {
+                        byte[] excFrame = BuildExceptionFrame(party.OriginalTxId, unitId, fcByte, exceptionCode: 4);
+                        try
+                        {
+                            await party.Pipe.SendResponseAsync(excFrame, ct).ConfigureAwait(false);
+                        }
+                        catch
+                        {
+                            // Best-effort delivery. A dead pipe will be collected by its own
+                            // socket close path; nothing more we can do here.
+                        }
+                    }
+                }
+                else
+                {
+                    // The stub was already removed by another path (extremely unlikely, but
+                    // defensive). Surface the exception to the original requester.
+                    byte[] excFrame = BuildExceptionFrame(originalTxId, unitId, fcByte, exceptionCode: 4);
+                    await pipe.SendResponseAsync(excFrame, ct).ConfigureAwait(false);
+                }
                return;
            }

@@ -224,6 +224,26 @@ internal sealed partial class UpstreamPipe : IAsyncDisposable
        }
    }

+    /// <summary>
+    /// Phase 12 (W1.3) — non-blocking response enqueue. Returns <c>true</c> when the frame
+    /// was queued for delivery, <c>false</c> when the pipe is dead OR the response channel
+    /// is full. Used by the per-PLC backend reader's fan-out loop so a single wedged
+    /// upstream cannot stall responses to peers sharing the same backend socket — without
+    /// this, a full <c>_responseChannel</c> on one pipe would block the reader task.
+    ///
+    /// <para>On a <c>false</c> return the multiplexer is responsible for dropping the frame
+    /// and (optionally) accounting for the drop via a counter. The wedged upstream's socket
+    /// will eventually time out and close on its own; its read loop will then dispose the
+    /// pipe and the multiplexer's correlation/coalescing entries will be reaped naturally.</para>
+    /// </summary>
+    public bool TrySendResponse(byte[] frame)
+    {
+        if (!IsAlive)
+            return false;
+
+        return _responseChannel.Writer.TryWrite(frame);
+    }
+
    /// <summary>
    /// Closes the pipe: cancels the read+write loops and shuts down the socket. Idempotent.
    /// </summary>
@@ -48,6 +48,13 @@ internal sealed partial class PlcListener : IAsyncDisposable
    public IReadOnlyCollection<UpstreamPipe> ActiveUpstreams
        => _multiplexer?.AttachedPipes ?? Array.Empty<UpstreamPipe>();

+    /// <summary>
+    /// Phase 12 (W1.1) — exposes the running multiplexer so a hot-reload reseat can swap
+    /// the per-PLC context on the live instance. <c>null</c> between StopAsync and a fresh
+    /// start; callers must null-check.
+    /// </summary>
+    internal PlcMultiplexer? Multiplexer => _multiplexer;
+
    public PlcListener(
        PlcOptions plc,
        ConnectionOptions connectionOptions,
@@ -124,7 +124,15 @@ public sealed record CounterSnapshot(
    /// Phase 11 — point-in-time approximation of cached PDU bytes for this PLC. Sum of
    /// <see cref="Cache.CacheEntry.Length"/> across entries. Read on the snapshot path.
    /// </summary>
-    long CacheBytes);
+    long CacheBytes,
+    /// <summary>
+    /// Phase 12 (W1.3) — cumulative count of backend response frames the per-PLC reader
+    /// task dropped because the destination upstream pipe's bounded response channel was
+    /// full. A non-zero value indicates one or more upstream clients are not draining their
+    /// socket fast enough to keep up with the backend; the wedged client loses its own
+    /// responses but its peers on the same PLC continue to receive theirs.
+    /// </summary>
+    long ResponseDropForFullUpstream);

 /// <summary>
 /// Thread-safe per-PLC counters backed by <see cref="System.Threading.Interlocked"/> longs.
@@ -169,6 +177,12 @@ internal sealed class ProxyCounters
    private long _cacheMissCount;
    private long _cacheInvalidations;

+    // Phase 12 (W1.3) — backend-reader fan-out drop counter. Increments when the reader
+    // task tried to enqueue a response to an upstream pipe whose bounded response channel
+    // was full. Without the non-blocking enqueue this would deadlock the reader; with it
+    // we drop and account.
+    private long _responseDropForFullUpstream;
+
    // Phase 11 — live cache state pulled from a per-PLC ResponseCache on each snapshot.
    // The multiplexer registers a single provider via SetCacheStatsProvider so the status
    // page sees current entry-count / bytes without a separate poll.
@@ -293,6 +307,13 @@ internal sealed class ProxyCounters
    public void AddCacheInvalidations(int n)
        => Interlocked.Add(ref _cacheInvalidations, n);

+    /// <summary>
+    /// Phase 12 (W1.3) — records one backend response frame dropped because the destination
+    /// upstream pipe's response channel was full.
+    /// </summary>
+    public void IncrementResponseDropForFullUpstream()
+        => Interlocked.Increment(ref _responseDropForFullUpstream);
+
    /// <summary>
    /// Phase 11 — wires the per-PLC <see cref="Cache.ResponseCache"/> as the live stats
    /// source for the snapshot path. Pass <c>null</c> to detach during disposal.
@@ -422,7 +443,8 @@ internal sealed class ProxyCounters
            CacheMissCount:            Interlocked.Read(ref _cacheMissCount),
            CacheInvalidations:        Interlocked.Read(ref _cacheInvalidations),
            CacheEntryCount:           cacheEntries,
-            CacheBytes:                cacheBytes);
+            CacheBytes:                cacheBytes,
+            ResponseDropForFullUpstream: Interlocked.Read(ref _responseDropForFullUpstream));
    }
 }

@@ -1,3 +1,5 @@
+using System.Diagnostics;
+using Mbproxy.Admin;
 using Mbproxy.Bcd;
 using Mbproxy.Configuration;
 using Mbproxy.Options;
@@ -5,7 +7,6 @@ using Mbproxy.Proxy.Cache;
 using Mbproxy.Proxy.Multiplexing;
 using Mbproxy.Proxy.Supervision;
 using Microsoft.Extensions.Options;
-using Polly;

 namespace Mbproxy.Proxy;

@@ -34,6 +35,16 @@ internal sealed partial class ProxyWorker : BackgroundService
    private readonly ILogger<ProxyWorker> _logger;
    private readonly ILoggerFactory _loggerFactory;
    private readonly ConfigReconciler _reconciler;
+    // Phase 12 (W1.5) — admin endpoint is no longer IHostedService; ProxyWorker drives its
+    // lifecycle directly so the design's "drain THEN stop admin" ordering is honoured.
+    //
+    // Resolved LAZILY (in ExecuteAsync) rather than in the constructor because the DI graph
+    // is circular: AdminEndpointHost → StatusSnapshotBuilder → ProxyWorker. A constructor
+    // GetService<AdminEndpointHost>() during ProxyWorker's own construction returns null
+    // silently. Lazy resolution sidesteps the cycle — by the time ExecuteAsync runs the DI
+    // container is fully built.
+    private readonly IServiceProvider _services;
+    private AdminEndpointHost? _admin;

    // Phase 06: supervisors are now managed jointly by ProxyWorker (initial bootstrap)
    // and ConfigReconciler (subsequent hot-reload changes). The dictionary is shared
@@ -52,13 +63,16 @@ internal sealed partial class ProxyWorker : BackgroundService
        IPduPipeline pipeline,
        ILogger<ProxyWorker> logger,
        ILoggerFactory loggerFactory,
-        ConfigReconciler reconciler)
+        ConfigReconciler reconciler,
+        IServiceProvider services)
    {
        _options       = options;
        _pipeline      = pipeline;
        _logger        = logger;
        _loggerFactory = loggerFactory;
        _reconciler    = reconciler;
+        _services      = services;
+        // Phase 12 (W1.5) — admin endpoint resolved lazily in ExecuteAsync (see field comment).
    }

    protected override async Task ExecuteAsync(CancellationToken stoppingToken)
@@ -188,17 +202,58 @@ internal sealed partial class ProxyWorker : BackgroundService
        int boundCount = _supervisors.Values.Count(s => s.Snapshot().State == SupervisorState.Bound);
        LogStartupReady(_logger, boundCount, plcsConfigured);

+        // Phase 12 (W1.5) — start the admin endpoint AFTER listeners are bound so the
+        // status page can never observe the service in a "no PLCs configured yet" state.
+        // The admin endpoint is no longer registered as IHostedService (the host's reverse
+        // stop order would tear it down BEFORE drain). ProxyWorker drives both ends.
+        //
+        // Resolution happens here, not in the constructor — the DI graph is circular
+        // (admin → StatusSnapshotBuilder → ProxyWorker) and a constructor-time lookup
+        // returns null silently.
+        _admin = _services.GetService<AdminEndpointHost>();
+        if (_admin is not null)
+        {
+            try
+            {
+                await _admin.StartAsync(stoppingToken).ConfigureAwait(false);
+            }
+            catch (Exception ex)
+            {
+                _logger.LogError(ex, "Admin endpoint failed to start: {Message}", ex.Message);
+            }
+        }
+
        // ── 6. Keep the worker alive until the host signals stop ─────────────────────
        // Supervisors run their own background loops; ExecuteAsync just waits.
        await Task.Delay(Timeout.Infinite, stoppingToken).ConfigureAwait(false);
    }

+    /// <summary>
+    /// Phase 12 (W1.5) — graceful shutdown sequence (replaces the deleted
+    /// <c>ShutdownCoordinator</c>):
+    /// <list type="number">
+    ///   <item>Cancel <see cref="ExecuteAsync"/> via <c>base.StopAsync</c>.</item>
+    ///   <item>Stop all supervisors with a 5 s hard deadline (no new connections; existing
+    ///         pipes are cascaded by <see cref="PlcListenerSupervisor"/> teardown).</item>
+    ///   <item>Wait for in-flight PDUs to drain via the live
+    ///         <see cref="ConnectionOptions.GracefulShutdownTimeoutMs"/> (read fresh from
+    ///         <see cref="IOptionsMonitor{T}.CurrentValue"/> so a hot-reloaded value is
+    ///         honoured at stop time).</item>
+    ///   <item>Stop the admin endpoint LAST so the status page survives the drain phase
+    ///         and an operator polling it sees the in-flight count fall to zero.</item>
+    ///   <item>Dispose every supervisor to release sockets, channels, and watchdog timers.</item>
+    /// </list>
+    /// Logs <c>mbproxy.shutdown.complete</c> on the way out with the in-flight count at
+    /// drain-deadline (zero on a clean shutdown, positive when forced cancel).
+    /// </summary>
    public override async Task StopAsync(CancellationToken cancellationToken)
    {
        // Cancel ExecuteAsync first.
        await base.StopAsync(cancellationToken).ConfigureAwait(false);

-        // Stop all supervisors in parallel with a 5-second hard deadline.
+        var sw = Stopwatch.StartNew();
+
+        // ── 1. Stop accepting new connections ─────────────────────────────────────────
        using var stopCts = new CancellationTokenSource(TimeSpan.FromSeconds(5));
        using var linked  = CancellationTokenSource.CreateLinkedTokenSource(
            stopCts.Token, cancellationToken);
@@ -216,10 +271,59 @@ internal sealed partial class ProxyWorker : BackgroundService
            // Best effort — don't let individual supervisor failures block shutdown.
        }

+        // ── 2. Drain in-flight PDUs ───────────────────────────────────────────────────
+        // Reads the current configured deadline so a hot-reloaded
+        // GracefulShutdownTimeoutMs is honoured at stop time, not frozen at process start.
+        int drainDeadlineMs = _options.CurrentValue.Connection.GracefulShutdownTimeoutMs;
+        int inFlightAtCancel = 0;
+
+        if (drainDeadlineMs > 0)
+        {
+            using var drainCts = new CancellationTokenSource(TimeSpan.FromMilliseconds(drainDeadlineMs));
+            try
+            {
+                while (!drainCts.Token.IsCancellationRequested)
+                {
+                    int total = CountInFlight();
+                    if (total == 0) break;
+                    await Task.Delay(10, drainCts.Token).ConfigureAwait(false);
+                }
+            }
+            catch (OperationCanceledException)
+            {
+                inFlightAtCancel = CountInFlight();
+            }
+        }
+
+        // ── 3. Stop admin endpoint LAST ───────────────────────────────────────────────
+        if (_admin is not null)
+        {
+            try
+            {
+                using var adminCts = new CancellationTokenSource(TimeSpan.FromSeconds(2));
+                await _admin.StopAsync(adminCts.Token).ConfigureAwait(false);
+            }
+            catch
+            {
+                // Best-effort.
+            }
+        }
+
+        // ── 4. Dispose supervisors (releases sockets, channels, watchdog timers) ─────
        foreach (var supervisor in _supervisors.Values)
            await supervisor.DisposeAsync().ConfigureAwait(false);

        _supervisors.Clear();
+
+        LogShutdownComplete(_logger, inFlightAtCancel, sw.ElapsedMilliseconds);
+    }
+
+    private int CountInFlight()
+    {
+        int total = 0;
+        foreach (var supervisor in _supervisors.Values)
+            total += (int)supervisor.CurrentCounters.Snapshot().InFlightCount;
+        return total;
    }

    // ── Logging ───────────────────────────────────────────────────────────────────────────
@@ -247,4 +351,10 @@ internal sealed partial class ProxyWorker : BackgroundService
        Level = LogLevel.Error,
        Message = "Failed to bind listener: Plc={Plc} Port={Port} Reason={Reason}")]
    private static partial void LogBindFailed(ILogger logger, string plc, int port, string reason);
+
+    // Phase 12 (W1.5) — moved here from the deleted ShutdownCoordinator.
+    [LoggerMessage(EventId = 80, EventName = "mbproxy.shutdown.complete",
+        Level = LogLevel.Information,
+        Message = "Graceful shutdown complete: InFlightAtCancel={InFlightAtCancel} ElapsedMs={ElapsedMs}")]
+    private static partial void LogShutdownComplete(ILogger logger, int inFlightAtCancel, long elapsedMs);
 }
@@ -180,35 +180,40 @@ internal sealed partial class PlcListenerSupervisor : IAsyncDisposable
        RecoveryAttempts: Interlocked.CompareExchange(ref _recoveryAttempts, 0, 0));

    /// <summary>
-    /// Atomically swaps the per-PLC context (tag map) without restarting the listener.
+    /// Atomically swaps the per-PLC context (tag map + optional response cache) on the
+    /// running listener AND its live multiplexer.
    ///
-    /// <para><b>Transition window</b>: there is a brief overlap where the old
-    /// <see cref="PlcListener"/> is running its accept loop with the old context while the
-    /// new context reference is being written. The volatile write ensures that the very
-    /// next <c>PlcListener</c> constructed inside the Polly loop (on any subsequent fault
-    /// recovery) picks up <paramref name="newCtx"/>. Existing in-flight upstream pipes
-    /// served by the current multiplexer keep their reference to the context captured at
-    /// multiplexer construction time; they finish on the old map. New connections after
-    /// this call use the new map. This is the correct design — partial-BCD rewrites
-    /// mid-request would be worse than a one-request gap.</para>
+    /// <para><b>Phase 12 (W1.1)</b> — previously this method only updated the supervisor's
+    /// <c>_currentContext</c> slot, which meant the running <see cref="PlcMultiplexer"/>
+    /// kept using the OLD context (it captured the reference at construction). A reload
+    /// only became visible on the next listener fault. Now the swap propagates into the
+    /// running mux via <see cref="PlcMultiplexer.ReplaceContext"/>, so the very next PDU
+    /// sees the new tag map / new cache. Counters are preserved (the new context carries
+    /// the same <c>ProxyCounters</c> instance) so operator history is not reset.</para>
    ///
-    /// <para>This method is intentionally lightweight: it performs only the volatile write
-    /// and returns immediately. The <paramref name="ct"/> parameter is present for API
-    /// symmetry with start/stop and to accommodate future async expansion.</para>
+    /// <para><b>Old cache lifecycle</b>: the supervisor disposes the outgoing context's
+    /// cache AFTER the multiplexer has been swapped to the new context. By that point no
+    /// more reads or writes can land on the old cache. Per the design contract, any
+    /// tag-list change drops the entire PLC cache.</para>
    /// </summary>
    public Task ReplaceContextAsync(PerPlcContext newCtx, CancellationToken ct)
    {
-        // Phase 11: dispose the outgoing context's response cache (if any) so its
-        // eviction loop terminates. The "any tag-list reload flushes the affected PLC's
-        // whole cache" doctrine is satisfied here — the new context constructs its own
-        // fresh cache, the old cache is dropped wholesale.
        var oldCache = _currentContext?.Cache;

-        // Volatile write: the next PlcListener created in RunSupervisorAsync will see
-        // the new context. The accept loop itself does not hold a direct reference to
-        // _currentContext — it was captured at PlcListener construction time.
+        // Volatile write: the next PlcListener created in RunSupervisorAsync (on any
+        // subsequent fault recovery) will pick up newCtx through this slot.
        _currentContext = newCtx;

+        // Phase 12 (W1.1) — push the swap into the running multiplexer so existing
+        // connections see the new tag map / new cache on their next PDU. _currentListener
+        // may be null between Polly retry attempts; in that case the next listener built
+        // inside the Polly loop will pick up newCtx through _currentContext above.
+        _currentListener?.Multiplexer?.ReplaceContext(newCtx);
+
+        // Phase 12 (W1.1 + W2.8 prereq) — drop the outgoing cache AFTER the swap so the
+        // running multiplexer can no longer reach it. Dispose stops the eviction loop and
+        // releases the timer. (The cache.flushed log event is W2.8 work; this Wave-1 fix
+        // is the "no longer in use, safe to drop" piece.)
        if (oldCache is not null && !ReferenceEquals(oldCache, newCtx.Cache))
            oldCache.Dispose();