When two or more upstream clients send the same FC03/FC04 read while a matching request is already in flight on the same PLC's multiplexed backend socket, attach the late arrivals to the existing InFlightRequest.InterestedParties list instead of opening a second backend round-trip. The single backend response fans out to every attached party with each party's original MBAP TxId restored individually. Zero post-response staleness: coalescing operates entirely within the in-flight window (microseconds to ~10 ms typical); the proxy is NOT a cache layer.

Headline mechanism:
- New record struct CoalescingKey(UnitId, Fc, StartAddress, Qty) keys the per-PLC InFlightByKeyMap. FC03 and FC04 are separate Modbus tables and never share a key; different unit IDs never coalesce; writes (FC06/FC16) bypass the coalescing path entirely.
- InFlightByKeyMap uses a simple lock around a Dictionary; atomic TryAttachOrCreate either appends a new party to the in-flight request's mutable List<InterestedParty> or invokes a factory to build a fresh entry. Per-entry MaxParties cap (default 32) bounds fan-out cost; past the cap, the next arrival opens a new entry.
- PlcMultiplexer.OnUpstreamFrameAsync takes the coalescing path for FC03/FC04 when Mbproxy.Resilience.ReadCoalescing.Enabled is set. The factory closure does the Phase-9 work (allocate TxId, add to CorrelationMap); the channel send happens AFTER returning from TryAttachOrCreate so the map lock is not held across the async send.
- Response fan-out in RunBackendReaderAsync removes the entry from InFlightByKeyMap before iterating InterestedParties, ensuring no concurrent attach can mutate the list during iteration.
- Cascade + watchdog paths also drain the key map so a stale entry cannot outlive its backend round-trip.

Counter accounting balance (per snapshot): CoalescedHitCount + CoalescedMissCount equals total FC03 + FC04 requests since startup. Even with coalescing disabled, every read still bumps Miss so dashboard math stays balanced.

New surface (additive only):
- src/Mbproxy/Proxy/Multiplexing/CoalescingKey.cs
- src/Mbproxy/Proxy/Multiplexing/InFlightByKeyMap.cs
- src/Mbproxy/Proxy/Multiplexing/CoalescingLogEvents.cs
- ReadCoalescingOptions on ResilienceOptions
- CoalescedHitCount / CoalescedMissCount / CoalescedResponseToDeadUpstream counters surfaced on /status.json per PLC and as a compact "Coal" cell on the HTML status page.

Phase 9 test patch: TwoUpstreams_ProxyTxIds_AreDistinct_OnTheWire previously read the same register from both clients (which now coalesces). Patched to read two different addresses so the test still proves distinct backend TxIds without violating the coalescing contract.

Tests added: 24 new (19 unit + 5 E2E):
- CoalescingKeyTests (5)
- InFlightByKeyMapTests (6, includes concurrent stress)
- ReadCoalescingTests (8, stub-backend with deterministic delay)
- ReadCoalescingE2ETests (5, pymodbus simulator; coalescing-active during overlap is proven against the stub, not the sim, due to pymodbus 3.13's known concurrent-frame bug)

Total: 325 tests passing (282 unit + 43 E2E).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
What this is
mbproxy is a C# .NET 10 background service (Windows Service) that sits inline as a Modbus TCP proxy in front of a fleet of ~54 AutomationDirect DirectLOGIC DL205 / DL260 equipment controllers. It is pre-configured with two pieces of static data:
- A list of BCD tags: the holding/input registers (by Modbus address and bit width) that the controllers store in DirectLOGIC's native BCD encoding (`V2000 = 1234` is stored on the wire as `0x1234`, not `0x04D2`).
- A list of equipment controller IP addresses (~54 entries) for the DL205/DL260 fleet. Each controller speaks Modbus TCP on port 502 via either the built-in DL260 Ethernet port or an H2-ECOM100 / H2-EBC100 coprocessor.
Purpose: bidirectional BCD rewrite inline on the MBTCP stream
The service is not a polling/cache layer. It is a transparent Modbus TCP proxy whose job is to rewrite the configured BCD tags in real time, in both directions, while proxying every other byte of the MBTCP connection untouched:
- Upstream read path (client → PLC → client). When a client reads a register on the BCD tag list, the proxy intercepts the PLC's response and rewrites the raw BCD nibbles (`0x1234`) into the binary integer the client expects (`0x04D2` = decimal 1234) before forwarding. 32-bit BCD values that span the CDAB word pair are rewritten as a unit.
- Downstream write path (client → PLC). When a client writes a register on the BCD tag list, the proxy intercepts the request and re-encodes the client's binary integer (`0x04D2`) into BCD nibbles (`0x1234`) before forwarding to the PLC, so the value the operator sees in ladder matches what the client wrote.
- Everything else passes through unchanged. Non-BCD registers, coils, discrete inputs, function codes the service doesn't touch (diagnostics, exception responses, etc.) are forwarded byte-for-byte. MBAP transaction IDs and unit IDs are preserved end-to-end so the proxy is invisible to both sides.
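The two rewrite directions above reduce to a pair of nibble conversions. The following is a minimal sketch in Python for illustration only (the service itself is C#). The names `bcd_to_int` and `int_to_bcd` are hypothetical, and raising on a non-decimal nibble is an assumption, since this file does not say how the proxy treats invalid BCD.

```python
def bcd_to_int(word: int) -> int:
    """Decode a 16-bit BCD word (four decimal nibbles) into a binary integer."""
    value = 0
    for shift in (12, 8, 4, 0):           # most significant nibble first
        nibble = (word >> shift) & 0xF
        if nibble > 9:                    # assumption: reject non-decimal nibbles
            raise ValueError(f"invalid BCD nibble {nibble:#x} in {word:#06x}")
        value = value * 10 + nibble
    return value


def int_to_bcd(value: int) -> int:
    """Encode an integer in the range 0..9999 as a 16-bit BCD word."""
    if not 0 <= value <= 9999:
        raise ValueError(f"{value} does not fit in four BCD digits")
    word = 0
    for shift in (0, 4, 8, 12):           # least significant digit first
        value, digit = divmod(value, 10)
        word |= digit << shift
    return word
```

With these helpers, the read path maps `0x1234` to `0x04D2` and the write path maps `0x04D2` back to `0x1234`, matching the `V2000 = 1234` example.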
The integration win is that upstream consumers (Wonderware / Historian / OPC UA gateways / generic Modbus clients) can read and write the configured BCD tags as if they were plain Int16/Int32, and the proxy is the only place that has to know which registers are BCD.
Architecture
The full design plan is in docs/design.md — settled 2026-05-13, updated for Phase 9 multiplexing on 2026-05-14. Headline choices the agent should keep in mind without opening that file:
- One `TcpListener` per PLC (54 distinct ports). Each PLC has one shared backend socket owned by a `PlcMultiplexer`; many upstream clients are multiplexed onto that single backend via MBAP TxId rewriting (Phase 9). The H2-ECOM100's 4-client cap no longer caps upstream connections.
- Transparent pass-through of every byte except the MBAP TxId field (rewritten by the multiplexer on each request and restored on each response) and FC03/FC04 response payloads + FC06/FC16 request payloads at configured BCD addresses (re-encoded between BCD nibbles and binary integers).
- In-flight FC03/FC04 read coalescing (Phase 10): same-key reads arriving while a peer is in flight attach to the existing `InFlightRequest.InterestedParties` list; the single backend response fans out to every attached client with original TxIds restored. Zero post-response staleness: coalescing entries die when the response arrives. Hot-reload via `Mbproxy.Resilience.ReadCoalescing.Enabled`.
- Polly-backed listener supervisor auto-recovers any listener that fails to bind at startup or faults at runtime; the same code path also brings up newly-added PLCs from hot-reload and tears down removed ones.
- `appsettings.json` is hot-reloadable via `IOptionsMonitor<MbproxyOptions>`; tag-list changes propagate per-PDU, PLC add/remove flows through the supervisor.
- Polly bounded retries on backend connect (3 attempts at 100ms / 500ms / 2000ms). No retries on mid-request failures (FC06/FC16 are non-idempotent on BCD tags). A per-request watchdog in the multiplexer surfaces Modbus exception 0x0B to the upstream client if a backend response never arrives within `BackendRequestTimeoutMs`.
- Backend disconnect cascades upstream: when the shared backend socket dies, every attached upstream pipe is closed in the same cycle (counter `BackendDisconnectCascades`); clients reconnect on their next request.
- Read-only Kestrel admin port (default 8080) exposes `GET /` (auto-refreshing HTML) and `GET /status.json` with service-wide and per-PLC counters (including Phase-9 mux fields `inFlight`, `maxInFlight`, `txIdWraps`, `disconnectCascades`, `queueDepth` and Phase-10 coalescing fields `coalescedHitCount`, `coalescedMissCount`, `coalescedResponseToDeadUpstream`).
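The Phase 9 TxId rewriting named in the list above can be sketched roughly as follows. This is Python for illustration only, not the C# implementation: `TxIdMux`, its method names, and the string client labels are hypothetical. The only wire fact relied on is that the MBAP transaction ID is the first two bytes of a Modbus TCP frame, big-endian.

```python
import struct
from typing import Dict, Tuple


class TxIdMux:
    """Hypothetical stand-in for the TxId bookkeeping a multiplexer needs."""

    def __init__(self) -> None:
        self._next = 0
        # proxy TxId -> (client label, client's original TxId)
        self._pending: Dict[int, Tuple[str, int]] = {}

    def rewrite_request(self, client: str, frame: bytes) -> bytes:
        """Swap the client's TxId for a proxy-allocated one before sending."""
        original_txid = struct.unpack_from(">H", frame, 0)[0]
        proxy_txid = self._next
        # The field is 16 bits, so the counter wraps. A real allocator would
        # also have to skip IDs that are still pending; this sketch does not.
        self._next = (self._next + 1) & 0xFFFF
        self._pending[proxy_txid] = (client, original_txid)
        return struct.pack(">H", proxy_txid) + frame[2:]

    def restore_response(self, frame: bytes) -> Tuple[str, bytes]:
        """Find the owning client and put its original TxId back."""
        proxy_txid = struct.unpack_from(">H", frame, 0)[0]
        client, original_txid = self._pending.pop(proxy_txid)
        return client, struct.pack(">H", original_txid) + frame[2:]
```

Two clients that happen to pick the same TxId get distinct IDs on the shared backend socket, and each response is routed back with the TxId its client originally sent.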
Anything beyond this short list — JSON schema, propagation table, stable log event names, status counter catalog, test plan — lives in docs/design.md. Open that doc before writing code; keep it in sync when decisions change.
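As a rough illustration of the Phase 10 coalescing bullet in the Architecture list, here is a minimal Python sketch of an attach-or-create map. It is not the C# code: the snake_case names mirror `CoalescingKey`, `InFlightRequest` and `InFlightByKeyMap` only loosely, the party tuples are hypothetical, and replacing the map entry once the cap is reached is an assumption about how overflow is handled.

```python
import threading
from dataclasses import dataclass, field
from typing import Callable, Dict, List, NamedTuple, Tuple


class CoalescingKey(NamedTuple):
    unit_id: int
    fc: int              # 3 or 4; the two tables never share a key
    start_address: int
    qty: int


@dataclass
class InFlightRequest:
    # Each party is (client label, original TxId) in this sketch.
    interested_parties: List[Tuple[str, int]] = field(default_factory=list)


class InFlightByKeyMap:
    def __init__(self, max_parties: int = 32) -> None:
        self._lock = threading.Lock()
        self._entries: Dict[CoalescingKey, InFlightRequest] = {}
        self._max_parties = max_parties

    def try_attach_or_create(
        self,
        key: CoalescingKey,
        party: Tuple[str, int],
        factory: Callable[[], InFlightRequest],
    ) -> Tuple[InFlightRequest, bool]:
        """Return (entry, attached); attached=True means no new round-trip."""
        with self._lock:
            entry = self._entries.get(key)
            if entry is not None and len(entry.interested_parties) < self._max_parties:
                entry.interested_parties.append(party)
                return entry, True
            # No entry, or the cap is reached: open a fresh entry (assumption).
            entry = factory()
            entry.interested_parties.append(party)
            self._entries[key] = entry
            return entry, False

    def remove(self, key: CoalescingKey, entry: InFlightRequest) -> None:
        """Drop the entry before fan-out so no late attach can mutate it."""
        with self._lock:
            if self._entries.get(key) is entry:
                del self._entries[key]
```

The caller sends to the backend only when `attached` is false, and does so after `try_attach_or_create` returns, so the lock is never held across the send. The hit and miss counters would presumably be bumped from the same flag.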
Current state
Implementation complete through Phase 10. Phases 00–08 shipped the production-ready 1:1-model service; Phase 9 swapped the connection layer for the TxId-multiplexed model without changing the transparent-rewrite contract; Phase 10 added in-flight read coalescing as an additive optimization on top of the multiplexer. The service is production-ready as a Windows Service:
- 325 tests passing: 282 unit tests + 43 E2E tests (against the pymodbus DL205 simulator + stub backends).
- Single-file self-contained publish (`dotnet publish -c Release -r win-x64`).
- PowerShell install/uninstall scripts under `install/`.
- Graceful shutdown with configurable drain timeout (`Connection.GracefulShutdownTimeoutMs`, default 10 s).
- Windows Event Log integration (Error+ events when running as a service).
- Read-only HTTP status page at `AdminPort` (default 8080); it surfaces Phase-9 mux fields alongside Phase-7 counters.
- `connectsSuccess` / `connectsFailed` counters wired in `PlcMultiplexer`.
- Phase 9 per-request watchdog defends against any backend that drops or mis-echoes a response (real-world packet loss; pymodbus 3.13 simulator's concurrent-multiplexed-request bug).
- `AssemblyInformationalVersion` set to `1.0.0` (CI can override via `/p:InformationalVersion=...`).
The human-facing entry point is README.md. All design decisions remain in docs/design.md.
Constraints that still apply to this codebase (do not change without updating the design doc):
- The csproj targets .NET 10 (`net10.0`). This is the only tool in `wwtools/` not pinned to .NET Framework 4.8 / x86.
- The sample test `DL260/DL205BcdQuirkTests.cs` is a pattern reference only; its types are not available in this project.
Device quirks (read before writing Modbus code)
The DL205/DL260 family is almost Modbus-spec-compliant, but every category below has at least one trap. The authoritative reference is DL260/dl205.md — read it end-to-end before touching the wire protocol. Highlights that bear directly on this proxy:
- BCD-by-default numeric encoding. `V2000 = 1234` stores `0x1234` on the wire, not `0x04D2`. This is the entire reason this service exists.
- CDAB word order for 32-bit values. Low word first, big-endian bytes within each word. `0xAABBCCDD` lands as `[0xCC 0xDD][0xAA 0xBB]`.
- Octal V-memory ↔ decimal Modbus translation. `V2000` octal = decimal 1024 = Modbus PDU `0x0400`. Config addresses are PDU-decimal, not octal V-memory and not 1-based 4xxxx.
- FC03/FC04 max qty = 128 (above spec's 125). FC16 max qty = 100 (below spec's 123). The proxy passes these through; the PLC enforces the cap with exception 03.
- Max 4 concurrent TCP clients per ECOM100. This was a direct constraint on the original 1:1 connection model; since Phase 9 each PLC uses one shared backend socket, so the cap no longer limits upstream connections. See `docs/design.md` → "Connection model" for the band-aid-vs-rearchitect decision tree.
- No TCP keepalive from the device. Middleboxes typically drop idle sockets at 2–5 min. With the 1:1 model, backend liveness tracks upstream client liveness; if both are idle long enough, the path dies on its own and the next request reconnects.
- Register 0 is valid on DL205/DL260 in factory "absolute" addressing mode; don't probe-skip it.
- As-deployed PLC parameters (captured in `DL260/mbtcp_settings.JPG`): port 502, "Use Concept data structures (Longs/Reals)" enabled, "Swap bytes" enabled, "Use Zero Based Addressing" unchecked, Register type = Binary, max coil read 1976 / coil write 800 / register read 122 / register write 100. The proxy must speak Modbus as-is; these settings describe the wire it'll see.
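Two of the quirks above, the octal address translation and the CDAB word order, are easy to get wrong, so a small worked sketch may help. It is Python for illustration only. The helper names are hypothetical, and the direct octal-to-decimal mapping is assumed to hold as in the `V2000` example; `DL260/dl205.md` remains the authority for other memory ranges.

```python
from typing import Tuple


def vmem_to_pdu(v_address: str) -> int:
    """Translate an octal V-memory address such as 'V2000' to a PDU address."""
    return int(v_address.lstrip("Vv"), 8)   # raises ValueError on digits 8 or 9


def cdab_words(value: int) -> Tuple[int, int]:
    """Split a 32-bit value into (low word, high word), its on-wire order."""
    return value & 0xFFFF, (value >> 16) & 0xFFFF


def bcd32_from_cdab(low_word: int, high_word: int) -> int:
    """Decode an eight-digit BCD value from a CDAB register pair."""
    raw = (high_word << 16) | low_word      # reassemble in natural order
    digits = f"{raw:08X}"                   # each hex nibble is one BCD digit
    if not digits.isdigit():                # assumption: reject nibbles A..F
        raise ValueError(f"invalid BCD value {raw:#010x}")
    return int(digits)
```

For example, `vmem_to_pdu("V2000")` gives 1024 (`0x0400`), and `cdab_words(0xAABBCCDD)` gives `(0xCCDD, 0xAABB)`, which is the register order the PLC puts on the wire.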
Resource index
| Task | Go to |
|---|---|
| Full architecture / design plan (decisions, schema, log events, status counters, test plan) | docs/design.md |
| Phase-by-phase implementation plan (parallel-safety, phase gates, per-phase test list) | docs/plan/README.md |
| Dashboard KPI catalogue — what's exposed today and proposed additions (rates, percentiles, availability, fleet aggregates) | docs/kpi.md |
| DL205/DL260 Modbus quirks (BCD, CDAB, octal V-memory, FC limits, exception codes, oddities) | DL260/dl205.md |
| pymodbus simulator profile that models those quirks as concrete register values | DL260/dl205.json |
| Example integration test pattern (xUnit + Shouldly + simulator fixture) | DL260/DL205BcdQuirkTests.cs |
| As-deployed PLC Modbus parameters screenshot | DL260/mbtcp_settings.JPG |
Maintenance
Documentation doctrine for wwtools/ lives in ../DOCS-GUIDE.md. The three-layer rules apply:
- `README.md` is the canonical human entry point (Layer-2 per DOCS-GUIDE). It routes to deep docs; it does not duplicate them. Update it when the service's public surface or install steps change.
- This `CLAUDE.md` stays a router for LLM coding agents. Deep design decisions live in `docs/design.md`; device quirks live in `DL260/dl205.md`. When you change a design decision, update `docs/design.md` first (it's the source of truth) and only mirror the change into the Architecture summary above if it shifts one of the headline bullets.
- When the service's task→tool mapping changes in the root index, update `../CLAUDE.md` too.
- Any further work beyond Phase 08 belongs in a new design revision (dated, in `docs/design.md`) and a new phase plan.