wwtools/mbproxy/codereviews/2026-05-14/Multiplexer.md

# PlcMultiplexer Concurrency Review

Scope: `src/Mbproxy/Proxy/Multiplexing/*`, `src/Mbproxy/Proxy/{PlcListener,PerPlcContext,ProxyWorker,ProxyCounters}.cs`, the relevant `Options` POCOs, and the matching tests under `tests/Mbproxy.Tests/Proxy/Multiplexing/*` + `ProxyForwardingTests.cs`. Cross-checked against `docs/design.md` (Connection model, Read coalescing), `docs/Architecture/ConnectionModel.md`, `docs/Architecture/ReadCoalescing.md`.

## Summary

- The architecture is sound and the documented invariants (single backend reader/writer, claim-then-dispatch in the watchdog, remove-from-key-map-before-fanout) are upheld. Most paths are well-defended.
- Two correctness defects in the coalescing path can deliver responses without restoring the original TxId or duplicate counter increments; one in the cache path lets a hit survive a freshly-cascaded backend.
- A handful of leaks and lifecycle races exist around `_connectGate`, the outbound channel after teardown, the `_disposed` flag, and the second backend-CTS dispose.
- Test coverage is solid for happy-path multiplexing/coalescing/saturation/cascade/watchdog but does not exercise the watchdog↔response race or coalescing-mid-cascade.

## Critical Findings

### C1. Coalescing factory failure can leak the entry, deliver no response, and hang every late attacher
`PlcMultiplexer.cs:692-700` — when `_correlation.TryAdd` fails inside the factory, the code releases the TxId, logs, and **returns the stub `inFlight`**. The map keeps the just-stored entry under `key` but `inFlightForSend` stays `null`, so the post-lock branch at line 724 takes the saturation cleanup (`_inFlightByKey.TryRemove(key, out _)`) and sends an exception **only to the leader**. Any late attacher that joined between the factory call and the `TryRemove` is appended to a `List<InterestedParty>` that no one will ever drain — its pipe hangs until the upstream socket closes. Same pathology if `_allocator.TryAllocate` returns false inside the factory: the stub is stored under `key`, late attaches accumulate, and only the leader gets exception 0x04. The factory must either (a) not store on failure, or (b) the post-lock cleanup must walk all parties on the stub before removing. (See also `InFlightByKeyMap.cs:82-83`: `_entries[key] = req` runs unconditionally.)

### C2. Coalesce-hit late attaches against an FC03/FC04 cache hit can be silently dropped
`PlcMultiplexer.cs:603-624` — for an FC03/FC04 cache hit, the path returns immediately at line 623 **without** consulting `_inFlightByKey`. Meanwhile, a peer in flight on the exact same `(unit, fc, start, qty)` may already be coalescing late attaches (line 645-718). The cache lookup happens before `EnsureBackendConnectedAsync` and before coalescing — but if the cached entry was just stored by the backend reader at line 490 *while a coalesced-leader's response is still in fan-out*, a late upstream that arrives during fan-out can take the cache hit path and return correctly. The real hazard is the inverse: at `RunBackendReaderAsync` line 490 the cache `Set` happens **before** the InFlight TryRemove at line 441. A new request arriving between those two steps could miss the cache (entry not yet stored… actually the order is reversed: cache Set is at 490 and `_inFlightByKey.TryRemove` is at 441, so cache is populated *after* the key is removed — correct order). However, between line 441 and line 490 a new identical FC03 will: cache miss (not yet set), then check `_inFlightByKey` — missing too — and open a fresh round-trip. That's wasteful but not corrupting. **Downgrade to Major.**

### C3. Cache hit can be served against a backend that has just cascaded
`PlcMultiplexer.cs:610-624` — by design the cache "survives backend disconnects," but the contract in `ConnectionModel.md` says cascade closes **every attached upstream pipe**. A new pipe that connects mid-cascade and immediately reads a cached tag returns successfully, while a peer attached one millisecond earlier just had its socket closed. This is an inconsistent visibility model — but the design explicitly accepts it for cached reads. Note however: after a cascade, `_inFlightByKey.DrainAll()` runs (line 324) without delivering anything to the parties on the drained entries. Those parties' pipes are then disposed at line 338 so the hang in C1 doesn't apply here. **Acceptable** but worth a doc note.

### C4. Backend reader's send to upstream uses the backend CTS, so a slow upstream blocks the whole reader
`PlcMultiplexer.cs:545` — `await party.Pipe.SendResponseAsync(outFrame, ct)` uses the backend reader's `ct` (= `_backendCts.Token`). If the upstream pipe's `_responseChannel` is full (capacity 16) and the backend writer keeps producing for it, this `WriteAsync` blocks **inside the single backend reader task**, stalling responses to **every other PLC client** sharing that backend socket. The bounded outbound channel only backpressures inbound; on the response side a single wedged upstream stops the world for that PLC. Mitigations: drop-on-full for response channel, write with timeout, or fan-out tasks per party.

## Major Findings

### M1. `_disposed` is non-volatile and read on every hot path
`PlcMultiplexer.cs:86`, checked at lines 142, 181, 216, 566. A non-volatile `bool` set on the disposing thread is not guaranteed to be observed by other threads under the .NET memory model on weakly-ordered platforms (ARM). Same in `UpstreamPipe.cs:52,77`. Recommend `volatile bool` or use `Interlocked.Exchange(ref int, 1)` pattern.

### M2. `_disposeCts` is disposed while watchdog/cancel paths may still touch it
`PlcMultiplexer.cs:209` disposes `_disposeCts` after a 2-second `WaitAsync` on the watchdog. If the watchdog took >2 s to settle (e.g. blocked on a slow `SendResponseAsync`), it will subsequently call `Task.Delay(tickMs, ct)` against a disposed CTS → `ObjectDisposedException`, which the outer `catch (Exception ex)` (line 909) logs as an error. Best-effort, but noisy. Same risk: `oldCts?.Dispose()` at line 350 races with `RunBackendReaderAsync`'s `ct.IsCancellationRequested` check at line 392 and with the catch handler.

### M3. `_connectGate` is never disposed
`PlcMultiplexer.cs:283`. `SemaphoreSlim` holds an internal `WaitHandle` lazily; in practice rarely realized so leak is small, but it's a missed dispose on a long-lived object.

### M4. Counter parity contract violated when allocator saturates on a coalescing miss
`PlcMultiplexer.cs:721-733` — the miss is counted at line 721 (`IncrementCoalescedMiss`). If the factory then fails (saturation/duplicate-key), the path at line 729 sends exception 04 but **never** decrements the miss counter. The doc's identity `coalescedHitCount + coalescedMissCount = total FC03+FC04` still holds in raw counting terms (each FC03 entered the path once), but the operator semantically expects "miss = a fresh backend round-trip." Saturation produced no round-trip. Document this, or move the increment after success.

### M5. Watchdog snapshot enumerates a `ConcurrentDictionary` non-atomically across multiple ticks
`CorrelationMap.cs:75-79` uses `foreach (var kvp in _entries)` which is a snapshot enumerator but does **not** take a moment-in-time lock. An entry added after the enumerator started may or may not be seen this tick — fine. The issue is that the watchdog reads `kvp.Value.SentAtUtc` and may evaluate against an entry whose true `SentAtUtc` is later than threshold because the entry was added between the snapshot copy at the dictionary level and the field read. Since `InFlightRequest` is immutable, the field read is consistent, and a too-young entry just won't satisfy `<=` — so this is benign. Worth a comment.

### M6. Watchdog comment says "1-second floor" but code uses 100 ms
`PlcMultiplexer.cs:844-846` — the doc string and `ConnectionModel.md` both say "100 ms floor"; the inline comment at line 845 says "1-second floor". Confusing. The 100 ms floor in code is correct.

### M7. Pre-rewriter FC byte read after rewriter mutates the buffer
`PlcMultiplexer.cs:468` — `byte fcInResponse = frame[MbapFrame.HeaderSize]; // post-rewriter, but the FC byte is never rewritten`. True for FC03/04/06/16 responses, but if a future rewriter ever touches byte 0 of the PDU body this becomes silently wrong. Use `inFlight.Fc` from the request side instead.

### M8. `RunBackendWriterAsync` doesn't drain remaining frames on cascade
`PlcMultiplexer.cs:362-385` — when the writer faults at line 380 it triggers a cascade. Frames already enqueued in `_outboundChannel` are stranded (the channel is not completed and no subsequent writer task will ever exist against the same channel — `EnsureBackendConnectedAsync` wires a fresh task at line 269, which then drains those stranded frames into the *new* backend socket). Those frames carry old proxy TxIds whose correlation entries were just dropped by `TearDownBackendAsync`. The PLC will respond to TxIds that have no correlation entry, the reader logs nothing (line 423-428: silent drop), the upstream peers hang waiting for the watchdog. Either complete the channel on cascade and recreate it on reconnect, or drain-and-discard before re-running the writer.

### M9. `IncrementCacheMiss` always increments before the fall-through to coalescing
`PlcMultiplexer.cs:626` — for a cache-eligible read, the miss is counted, then the request falls through to coalescing. If two upstream peers issue the same read, one stores the response in cache, the other coalesces onto it. Result: one CacheMiss, one CoalescedMiss, one CoalescedHit (the second hits coalesce, not cache). If two peers issue, peer A is the leader: A → CacheMiss, CoalescedMiss; B → CacheMiss, CoalescedHit. Per-second the operator reads `CacheMiss = 2` but only one actually had to go to backend. This is reasonable but worth documenting.

## Minor Findings

- **`PlcMultiplexer.cs:445-450`**: Dead code — `elapsedMs` is computed and never used; the actual EWMA conversion happens two lines below.
- **`UpstreamPipe.cs:262-263`**: `return firstRead && remaining == count ? false : false;` — both branches return `false`. The conditional is meaningless; just `return false;`.
- **`PlcMultiplexer.cs:269-270`**: Backend tasks are created with `CancellationToken.None` so they cannot be observed via that token; relies on `cts2.Token` propagating cancellation through `SendAsync`/`ReceiveAsync`. Fine but worth a comment.
- **`PlcMultiplexer.cs:537-540`**: The `Count == 1` reuse of the original `frame` buffer is correct only because the BCD rewriter has already run. After patching `outFrame[0..2]` for a single party we mutate the same buffer the cache `Set` at line 478 already snapshotted (`Buffer.BlockCopy`), so cache integrity holds. Good — but a comment would help reviewers.
- **`InFlightByKeyMap.cs:62`** signature: `public bool TryAttachOrCreate(...)` always returns `true`; the bool is dead.
- **Test gap**: No test exercises C1 (factory-saturation/duplicate-key with multiple attachers in flight). No test exercises the watchdog↔response race (response arriving in the same tick as `SnapshotOlderThan`). No test exercises cascade racing with a brand-new accept (does the new pipe get attached to the freshly-empty `_pipes` and survive, or does it land in a zombie multiplexer mid-teardown?).
- **`ProxyWorker.cs:64-194`**: There is no add/remove reconciliation path here — this is delegated to `ConfigReconciler` via `Attach(_supervisors, opts)`. Out of scope for this audit but worth verifying that reconciler tear-down respects the same disposal ordering.

## What Looks Good
- **TxIdAllocator** is straightforward and demonstrably correct: single lock on the hot path, scan-from-cursor with wrap detection, defensive double-release no-op, exhaustion returns `false` cleanly. Stress test at `TxIdAllocatorTests.cs:101` covers concurrent allocate/release for duplicate detection.
- **Claim-then-dispatch** in the watchdog (`PlcMultiplexer.cs:863`): `TryRemove` is the single source of truth; a response that wins the race makes the watchdog skip silently. Symmetric in the response path (line 423).
- **`InFlightByKeyMap` lock discipline** — every state mutation under the same `object` lock; remove-before-iterate ordering at the multiplexer (line 441) preserves the no-late-attach invariant for fan-out. The `Concurrent_AttachOrCreate_From_Two_Threads` test (16×500 ops, single key, unbounded MaxParties → exactly one create) is a clean concurrency proof.
- **`TearDownBackendAsync` idempotency** via the `_backendLock`-protected null-swap (line 290-301) is correct; second invocation early-returns at line 303.
- **`BuildExceptionFrame`** correctly produces a real 9-byte Modbus exception frame so client libraries see a normal `ModbusException`.
- **Per-call context cloning** via `WithCurrentRequest` (`PerPlcContext.cs:61`) avoids cross-talk between concurrent multiplexed responses on the shared `PerPlcContext`.
- **Counters are uniformly Interlocked** — no `++` on shared longs, peak via CAS, EWMA via fixed-point CAS loop.

## Open Questions
1. Is the design intent that **FC03/FC04 with `MaxParties = int.MaxValue`** is supported in production, or is the test-only setting an antipattern? `InFlightByKeyMapTests.cs:208` uses it — a real backend response with 8000 parties means the reader walks 8000 frame clones synchronously inside the single backend reader (C4 amplified).
2. When `EnsureBackendConnectedAsync` returns false, `OnUpstreamFrameAsync` disposes the upstream pipe (line 636). Is that the right behavior given the cache-hit-survives-backend-down contract? An upstream that needs a non-cached read will be killed even if its next read would have been served from cache.
3. The cascade walks `_pipes.Values.ToArray()` (line 335). A pipe accepted between snapshot and `_pipes.Clear()` (line 341) is **not** disposed and **not** counted in `upstreamCount`. The next `OnUpstreamFrameAsync` it receives will see `_disposed == false` and try to reuse the multiplexer — likely hits `EnsureBackendConnectedAsync` and proceeds normally. Intentional?
4. `RunBackendReaderAsync` line 550 does `_ = TearDownBackendAsync(...)` fire-and-forget. The reader task itself then exits cleanly. But the cascade's "best-effort join" at line 348 awaits the same reader task (`reader = _backendReaderTask`) → self-join? Trace: `TearDown` snapshots `reader` from the field, the field is then set to null inside the lock, and `reader.WaitAsync` is awaited. Since the reader returned and started cascade fire-and-forget, the await on its task completes immediately — fine. Worth a comment confirming the intent.