wwtools/mbproxy/docs/plan/03-proxy-plumbing.md
Joseph Doherty 56eee3c563 mbproxy: initial commit through Phase 9 (TxId multiplexing)
Adds the mbproxy service end-to-end. Phases 00-08 implement the
production-ready single-listener / 1:1-backend transparent Modbus TCP
proxy with bidirectional BCD rewriting for the ~54-PLC DL205/DL260
fleet. Phase 9 replaces the connection layer with a single backend
socket per PLC plus MBAP TxId rewriting, lifting the H2-ECOM100's
4-concurrent-client cap as an operational ceiling.

Phase 9 additions of note:
- PlcMultiplexer + UpstreamPipe + TxIdAllocator + CorrelationMap
- InFlightRequest with IReadOnlyList<InterestedParty> (load-bearing
  for Phase 10 read coalescing — do not collapse to a single field)
- Per-request watchdog: surfaces Modbus exception 0x0B to upstream
  on BackendRequestTimeoutMs, defending against lost responses,
  dead-PLC paths, and pymodbus 3.13.0's concurrent-multiplexed-
  request bug (its ServerRequestHandler.last_pdu state race)
- Status DTO + HTML gain inFlight / maxInFlight / txIdWraps /
  disconnectCascades / queueDepth (Tier 1.6 in docs/kpi.md)

Tests: 263 unit + 38 E2E. Multiplexer correctness under truly
concurrent backend traffic is proved against a stub backend in
PlcMultiplexerTests; MultiplexerE2ETests paces requests so pymodbus
3.13's single-PDU framer stays in known-good mode.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 01:49:35 -04:00


Phase 03 — Proxy plumbing

The minimum-viable proxy: one TcpListener per configured PLC, 1:1 upstream-client ↔ backend-socket, byte-for-byte forwarding both directions, transparent MBAP TxId / unit ID. No BCD rewriting yet — that's phase 04. No supervisor / auto-recovery — that's phase 05.

Depends on: Phase 00 (host, options). Parallel-safe with: Phase 02 (BCD codec lives under src/Mbproxy/Bcd/; this phase lives under src/Mbproxy/Proxy/).

Goal

Stand up the listener-and-forwarder pair so an e2e test can:

  1. Configure the proxy with Plcs: [{ Host: "127.0.0.1", Port: <simPort>, ListenPort: <proxyPort> }].
  2. Start the host.
  3. Drive NModbus against 127.0.0.1:<proxyPort> and see the SAME bytes the simulator would return on a direct connection.

The proxy is transparent in this phase. The BCD rewrite hook point is reserved but not wired.

Outputs

src/Mbproxy/Proxy/PlcListener.cs            # owns one TcpListener; accept loop
src/Mbproxy/Proxy/PlcConnectionPair.cs      # one upstream socket + one backend socket; forwarder
src/Mbproxy/Proxy/IPduPipeline.cs           # the rewrite hook contract (no-op impl in this phase)
src/Mbproxy/Proxy/NoopPduPipeline.cs        # the no-op impl
src/Mbproxy/Proxy/ProxyWorker.cs            # BackgroundService that owns all PlcListeners
src/Mbproxy/Proxy/MbapFrame.cs              # MBAP header parse helpers (length, txid, unit)

tests/Mbproxy.Tests/Proxy/ProxyForwardingTests.cs   # e2e against the simulator
tests/Mbproxy.Tests/Proxy/MbapFrameTests.cs         # unit tests for the MBAP parser

Modifications:

  • src/Mbproxy/Program.cs — register ProxyWorker as a hosted service. The HeartbeatWorker from phase 00 is DELETED in this phase (its job is replaced by ProxyWorker logging mbproxy.startup.ready after all listeners are bound).
  • src/Mbproxy/Workers/HeartbeatWorker.cs — DELETED.

Tasks

  1. MbapFrame.cs — pure helpers, no allocations. Static methods:
    • static bool TryParseHeader(ReadOnlySpan<byte> buffer, out ushort txId, out ushort protocolId, out ushort length, out byte unitId) — returns false if buffer.Length < 7.
    • static int TotalFrameLength(ushort lengthField): returns lengthField + 6 (the 7 header bytes minus the 1-byte unit ID, which is already counted in the length field).
  2. IPduPipeline.cs — the rewrite hook. Single method:
    void Process(MbapDirection direction, ReadOnlySpan<byte> mbapHeader, Span<byte> pdu, PduContext context);
    
    MbapDirection is RequestToBackend or ResponseToClient. PduContext carries the per-pair state (counters, PLC name, configured tag map). In phase 03, the only implementation is NoopPduPipeline which does nothing.
  3. NoopPduPipeline.cs — empty Process method. Registered as the default IPduPipeline in DI for this phase. Phase 04 replaces it with the real rewriter.
  4. PlcConnectionPair.cs — owns the upstream Socket (or TcpClient) handed to it by PlcListener.Accept, opens a fresh backend socket to the configured PLC, and runs two Tasks:
    • Upstream → backend: read one full MBAP frame at a time (header → length → rest), call pipeline.Process(RequestToBackend, header, pdu, ctx), write the frame to the backend.
    • Backend → upstream: same shape, with ResponseToClient.
    
    Either task ending (socket closed, exception, cancellation) tears down both sides cleanly. No retry loop; that's phase 05.
    
    Backend connect is wrapped in a try/catch and bounded by the configured BackendConnectTimeoutMs. A connect failure closes the upstream socket immediately and logs mbproxy.backend.failed. Polly bounded retries on backend connect are deferred to phase 05 to keep this phase's scope tight; note the deferral in code with // Phase 05: wrap in Polly pipeline.
  5. PlcListener.cs — owns one TcpListener for one PLC. StartAsync binds; on bind failure, throws (caller logs mbproxy.startup.bind.failed and decides what to do — phase 05 will introduce the supervisor that turns this into a recoverable state). On each accept, hands the socket to a fresh PlcConnectionPair and runs it on the thread-pool.
  6. ProxyWorker.cs: a BackgroundService. On start: enumerates MbproxyOptions.Plcs, instantiates one PlcListener per entry, starts them all. Each bind that succeeds logs mbproxy.startup.bind; each that fails logs mbproxy.startup.bind.failed and continues to the next PLC (matching the design's "eager, continue on per-port failure" posture). After all bind attempts, logs mbproxy.startup.ready with { ListenersBound, PlcsConfigured }. On stop: cancels and disposes all listeners and their open pairs.
  7. Program.cs — remove the HeartbeatWorker registration; register ProxyWorker. Also register IPduPipeline as a singleton NoopPduPipeline in DI.
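
Task 1's helpers can be sketched as follows. This is a minimal illustration, not the committed implementation: it assumes the standard MBAP layout (TxId, protocol ID, and length as big-endian 16-bit fields, followed by a 1-byte unit ID), and it declares the type as a static class because it only holds static helpers. The HeaderLength constant is an addition for readability and is not named in the plan.

```csharp
using System;
using System.Buffers.Binary;

// Sketch only. MBAP header layout (big-endian):
//   bytes 0-1 TxId, 2-3 protocol ID, 4-5 length, 6 unit ID.
internal static class MbapFrame
{
    // Assumption: constant name is illustrative, not from the plan.
    public const int HeaderLength = 7;

    public static bool TryParseHeader(
        ReadOnlySpan<byte> buffer,
        out ushort txId, out ushort protocolId, out ushort length, out byte unitId)
    {
        if (buffer.Length < HeaderLength)
        {
            txId = 0;
            protocolId = 0;
            length = 0;
            unitId = 0;
            return false;
        }

        txId = BinaryPrimitives.ReadUInt16BigEndian(buffer);
        protocolId = BinaryPrimitives.ReadUInt16BigEndian(buffer.Slice(2));
        length = BinaryPrimitives.ReadUInt16BigEndian(buffer.Slice(4));
        unitId = buffer[6];
        return true;
    }

    // The length field covers the unit ID plus the PDU, so the only bytes it
    // does not cover are TxId (2) + protocol ID (2) + length (2) = 6.
    public static int TotalFrameLength(ushort lengthField) => lengthField + 6;
}
```

For an FC03 request such as 00 01 00 00 00 06 01 03 00 00 00 01, the header parses to TxId 1, protocol ID 0, length 6, unit ID 1, and TotalFrameLength(6) is 12, which matches the 12 bytes on the wire.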

Public surface declared in this phase

All internal sealed class: the proxy types are not consumed outside this assembly. The only public-shaped surfaces are the IPduPipeline interface, the MbapDirection enum, and the PduContext class that the interface signature references (so phase 04 can implement its own pipeline cleanly).

namespace Mbproxy.Proxy;

public interface IPduPipeline {
    void Process(MbapDirection direction, ReadOnlySpan<byte> mbapHeader, Span<byte> pdu, PduContext context);
}

public enum MbapDirection { RequestToBackend, ResponseToClient }

public sealed class PduContext {
    public string PlcName { get; init; } = "";
    // Phase 04 adds: BcdTagMap, counters, logger
}

internal sealed class NoopPduPipeline : IPduPipeline { /* no-op */ }
internal sealed class MbapFrame { /* static helpers */ }
internal sealed class PlcListener : IAsyncDisposable { /* ... */ }
internal sealed class PlcConnectionPair : IAsyncDisposable { /* ... */ }
internal sealed class ProxyWorker : BackgroundService { /* ... */ }

Tests required

Unit (Category = Unit)

MbapFrameTests (≥ 8 tests):

  1. TryParseHeader_TooShort_ReturnsFalse
  2. TryParseHeader_ValidFrame_ParsesAllFields
  3. TryParseHeader_ProtocolId_NotZero_StillParses — we don't reject non-zero protocol IDs; that's the PLC's job.
  4. TotalFrameLength_LengthField7_Returns13
  5. TotalFrameLength_LengthFieldMax_Returns_LengthFieldPlus6
  6. Round-trip: parse a known good FC03 frame and assert each field.
  7. Round-trip: parse a known good FC16 write-multiple frame.
  8. Negative: a frame with length < 2 still returns the parsed value; rejecting it is the caller's responsibility. Document this in a test.

E2E (Category = E2E)

ProxyForwardingTests (≥ 5 tests, [Collection(nameof(DL205SimulatorCollection))]):

  1. Forward_FC03_HR0_Returns_SimulatorRawValue_0xCAFE — proxy is transparent; client sees the raw simulator value.
  2. Forward_FC03_HR1072_Returns_RawBCD_0x1234 — the BCD register is NOT rewritten in phase 03 (NoopPduPipeline). This test will be REPLACED in phase 04 with one that asserts 1234 instead. Document the planned replacement in a comment so phase 04's agent knows what to update.
  3. Forward_FC06_WriteHR200_ThenReadBack_RoundTrips — proves the write path forwards correctly.
  4. Forward_FC16_WriteMultipleHR201_203_ThenReadBack_RoundTrips.
  5. MbapTxId_IsPreservedEndToEnd — issue 20 back-to-back FC03 reads with monotonically increasing TxIds; assert every response carries the matching TxId.
  6. BackendConnectFailure_ClosesUpstreamCleanly — point the proxy at an unreachable backend (127.0.0.1:1), assert the client's socket is closed within BackendConnectTimeoutMs + 200ms.

Phase gate

  • Zero-warnings build.
  • All phase 00, 02 tests still green.
  • All new unit tests green (≥ 8 in MbapFrameTests).
  • All new e2e tests green when the simulator is available; skip cleanly when it isn't.
  • dotnet run --project src/Mbproxy with an appsettings.json pointing at the simulator: NModbus can read/write through the proxy and gets the simulator's raw values.
  • On startup with one bad and one good PLC config, the good one binds and the bad one logs mbproxy.startup.bind.failed, and the service does NOT abort. (Hand the supervisor work to phase 05; this phase only proves the "continue on per-port failure" posture.)
  • mbproxy.startup.ready is now logged by ProxyWorker, not by a heartbeat worker. The heartbeat worker file is deleted.

Out of scope

  • BCD rewriting (phase 04 replaces NoopPduPipeline).
  • Polly retries on backend connect (phase 05 supervisor wraps this).
  • Auto-recovery for failed listener binds (phase 05).
  • Counter tracking / per-PLC telemetry (phase 04 starts adding counters via PduContext).
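  • Dedicated half-MBAP-frame handling (split TCP packets): no special machinery in this phase. The forwarder relies on the read call (NetworkStream.ReadAsync or its Socket equivalent) being allowed to return short reads: loop to fill the header (7 bytes), then loop to fill the body (length - 1 more bytes, since the unit ID is already in the header). E2E test 5 (MbapTxId_IsPreservedEndToEnd) verifies this stays correct over 20 back-to-back requests.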
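
The fill-the-header-then-fill-the-body loop from the last bullet can be sketched as below. It is written against Stream so it can be exercised without a network; Socket.ReceiveAsync(Memory<byte>, SocketFlags, CancellationToken) has the same "may return fewer bytes than requested" behavior and would slot into the same loop. FrameReader and its method names are hypothetical, and rejecting length < 2 or an oversized frame with an exception is an assumption about how the caller wants bad frames surfaced.

```csharp
using System;
using System.Buffers.Binary;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper; names are illustrative only.
internal static class FrameReader
{
    // Keeps reading until `buffer` is full. Returns false if the peer
    // closed the connection before the buffer could be filled.
    public static async ValueTask<bool> FillAsync(
        Stream stream, Memory<byte> buffer, CancellationToken ct)
    {
        int filled = 0;
        while (filled < buffer.Length)
        {
            int n = await stream.ReadAsync(buffer.Slice(filled), ct);
            if (n == 0) return false; // end of stream
            filled += n;
        }
        return true;
    }

    // Reads one MBAP frame into `buffer` and returns its total length,
    // or 0 if the stream ended before a whole frame arrived.
    public static async ValueTask<int> ReadFrameAsync(
        Stream stream, Memory<byte> buffer, CancellationToken ct)
    {
        if (!await FillAsync(stream, buffer.Slice(0, 7), ct)) return 0;

        ushort length = BinaryPrimitives.ReadUInt16BigEndian(buffer.Span.Slice(4, 2));
        int total = length + 6;
        if (length < 2 || total > buffer.Length)
            throw new InvalidDataException("MBAP length field out of range: " + length);

        // The unit ID (1 byte of `length`) is already in the header.
        if (!await FillAsync(stream, buffer.Slice(7, length - 1), ct)) return 0;
        return total;
    }
}
```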

Notes for the subagent

  • Socket vs TcpClient: prefer Socket directly so framing reads can receive into Memory<byte> without NetworkStream allocation overhead. The performance difference is small but the byte-precise API matches what the rewriter in phase 04 will need.
  • Frame reads use a per-pair pooled buffer of 260 bytes (MBAP header 7 + max PDU 253). Don't allocate per-frame.
  • The "Phase 04 will replace test 2" pattern is intentional. Leave breadcrumbs so the next phase's agent knows exactly which test to update; do NOT silently make the test pass against a future rewriter.
  • Both forwarder tasks run with the same CancellationTokenSource. Cancellation propagates from listener stop → pair stop → both task ends → socket dispose.
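
The shared-CancellationTokenSource teardown in the last note could look roughly like this. It is a sketch under assumptions: PairRunner is a hypothetical name, the two forwarders are passed in as delegates so the pattern can be shown without sockets, and socket disposal is left to the caller.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper; names are illustrative only.
internal static class PairRunner
{
    // Runs both forwarding directions under one CancellationTokenSource
    // that is linked to the listener's stop token.
    public static async Task RunAsync(
        Func<CancellationToken, Task> upstreamToBackend,
        Func<CancellationToken, Task> backendToUpstream,
        CancellationToken listenerStop)
    {
        using var cts = CancellationTokenSource.CreateLinkedTokenSource(listenerStop);

        Task up = upstreamToBackend(cts.Token);
        Task down = backendToUpstream(cts.Token);

        // Whichever direction ends first (close, fault, cancellation)
        // stops the other one.
        await Task.WhenAny(up, down);
        cts.Cancel();

        try
        {
            await Task.WhenAll(up, down);
        }
        catch (OperationCanceledException)
        {
            // Expected for the direction that was still running.
        }
        // Other exceptions (for example socket errors) propagate to the
        // caller, which is where logging and socket disposal would happen.
    }
}
```

Because the source is linked to the listener's token, stopping the listener cancels both directions the same way a closed socket on either side does.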