wwtools/mbproxy/CLAUDE.md
Joseph Doherty 56eee3c563 mbproxy: initial commit through Phase 9 (TxId multiplexing)
Adds the mbproxy service end-to-end. Phases 00-08 implement the
production-ready single-listener / 1:1-backend transparent Modbus TCP
proxy with bidirectional BCD rewriting for the ~54-PLC DL205/DL260
fleet. Phase 9 replaces the connection layer with a single backend
socket per PLC plus MBAP TxId rewriting, lifting the H2-ECOM100's
4-concurrent-client cap as an operational ceiling.

Phase 9 additions of note:
- PlcMultiplexer + UpstreamPipe + TxIdAllocator + CorrelationMap
- InFlightRequest with IReadOnlyList<InterestedParty> (load-bearing
  for Phase 10 read coalescing — do not collapse to a single field)
- Per-request watchdog: surfaces Modbus exception 0x0B to upstream
  on BackendRequestTimeoutMs, defending against lost responses,
  dead-PLC paths, and pymodbus 3.13.0's concurrent-multiplexed-
  request bug (its ServerRequestHandler.last_pdu state race)
- Status DTO + HTML gain inFlight / maxInFlight / txIdWraps /
  disconnectCascades / queueDepth (Tier 1.6 in docs/kpi.md)

Tests: 263 unit + 38 E2E. Multiplexer correctness under truly
concurrent backend traffic is proved against a stub backend in
PlcMultiplexerTests; MultiplexerE2ETests paces requests so pymodbus
3.13's single-PDU framer stays in known-good mode.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 01:49:35 -04:00


CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

What this is

mbproxy is a C# .NET 10 background service (Windows Service) that sits inline as a Modbus TCP proxy in front of a fleet of ~54 AutomationDirect DirectLOGIC DL205 / DL260 equipment controllers. It is pre-configured with two pieces of static data:

  1. A list of BCD tags — the holding/input registers (by Modbus address and bit width) that the controllers store in DirectLOGIC's native BCD encoding (V2000 = 1234 is stored on the wire as 0x1234, not 0x04D2).
  2. A list of equipment controller IP addresses (~54 entries) for the DL205/DL260 fleet. Each controller speaks Modbus TCP on port 502 via either the built-in DL260 Ethernet port or an H2-ECOM100 / H2-EBC100 coprocessor.

Purpose: bidirectional BCD rewrite inline on the MBTCP stream

The service is not a polling/cache layer. It is a transparent Modbus TCP proxy whose job is to rewrite the configured BCD tags in real time, in both directions, while proxying every other byte of the MBTCP connection untouched:

  • Upstream read path (client → PLC → client). When a client reads a register on the BCD tag list, the proxy intercepts the PLC's response and rewrites the raw BCD nibbles (0x1234) into the binary integer the client expects (0x04D2 = decimal 1234) before forwarding. 32-bit BCD values that span the CDAB word pair are rewritten as a unit.
  • Downstream write path (client → PLC). When a client writes a register on the BCD tag list, the proxy intercepts the request and re-encodes the client's binary integer (0x04D2) into BCD nibbles (0x1234) before forwarding to the PLC, so the value the operator sees in ladder matches what the client wrote.
  • Everything else passes through unchanged. Non-BCD registers, coils, discrete inputs, and function codes the service doesn't touch (diagnostics, exception responses, etc.) are forwarded byte-for-byte. Clients see their own MBAP transaction IDs and unit IDs end-to-end (since Phase 9 the multiplexer rewrites the TxId on the backend leg and restores it on each response), so the proxy is invisible to both sides.

The integration win is that upstream consumers (Wonderware / Historian / OPC UA gateways / generic Modbus clients) can read and write the configured BCD tags as if they were plain Int16/Int32, and the proxy is the only place that has to know which registers are BCD.
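
The rewrite itself is a nibble-level conversion. The sketch below shows the 16-bit case in Python for brevity (the service is C#, and these function names are hypothetical, not the project's API). Raising on a non-decimal nibble is an assumption; docs/design.md is the authority on what the proxy actually does with malformed BCD.

```python
def bcd_to_int(raw: int, digits: int = 4) -> int:
    """Decode packed BCD nibbles (0x1234) into a binary integer (1234)."""
    value = 0
    # Walk the nibbles from most significant to least significant.
    for shift in range((digits - 1) * 4, -1, -4):
        nibble = (raw >> shift) & 0xF
        if nibble > 9:
            # Assumption: reject invalid BCD rather than pass it through.
            raise ValueError(f"invalid BCD nibble 0x{nibble:X} in 0x{raw:X}")
        value = value * 10 + nibble
    return value


def int_to_bcd(value: int, digits: int = 4) -> int:
    """Encode a binary integer (1234) as packed BCD nibbles (0x1234)."""
    if not 0 <= value < 10 ** digits:
        raise ValueError(f"{value} does not fit in {digits} BCD digits")
    raw = 0
    # Peel off decimal digits from least significant to most significant.
    for position in range(digits):
        value, digit = divmod(value, 10)
        raw |= digit << (4 * position)
    return raw
```

On the read path the proxy would apply something like `bcd_to_int(0x1234)` to get 1234 (0x04D2); on the write path `int_to_bcd(0x04D2)` yields 0x1234 again.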

Architecture

The full design plan is in docs/design.md — settled 2026-05-13, updated for Phase 9 multiplexing on 2026-05-14. Headline choices the agent should keep in mind without opening that file:

  • One TcpListener per PLC (54 distinct ports). Each PLC has one shared backend socket owned by a PlcMultiplexer; many upstream clients are multiplexed onto that single backend via MBAP TxId rewriting (Phase 9). The H2-ECOM100's 4-client cap no longer caps upstream connections.
  • Transparent pass-through of every byte except the MBAP TxId field (rewritten by the multiplexer on each request and restored on each response) and FC03/FC04 response payloads + FC06/FC16 request payloads at configured BCD addresses (re-encoded between BCD nibbles and binary integers).
  • Polly-backed listener supervisor auto-recovers any listener that fails to bind at startup or faults at runtime; the same code path also brings up newly-added PLCs from hot-reload and tears down removed ones.
  • appsettings.json is hot-reloadable via IOptionsMonitor<MbproxyOptions>; tag-list changes propagate per-PDU, while PLC add/remove flows through the supervisor.
  • Polly bounded retries on backend connect (3 attempts at 100ms / 500ms / 2000ms). No retries on mid-request failures (FC06/FC16 are non-idempotent on BCD tags). A per-request watchdog in the multiplexer surfaces Modbus exception 0x0B to the upstream client if a backend response never arrives within BackendRequestTimeoutMs.
  • Backend disconnect cascades upstream: when the shared backend socket dies, every attached upstream pipe is closed in the same cycle (counter BackendDisconnectCascades); clients reconnect on their next request.
  • Read-only Kestrel admin port (default 8080) exposes GET / (auto-refreshing HTML) and GET /status.json with service-wide and per-PLC counters (including Phase-9 mux fields inFlight, maxInFlight, txIdWraps, disconnectCascades, queueDepth).
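
The TxId rewriting in the first two bullets can be pictured as a small correlation map keyed by the backend-side transaction ID. This is a minimal Python sketch of the idea only: `TxIdMux` and its methods are hypothetical and do not mirror the real PlcMultiplexer / TxIdAllocator / CorrelationMap types, and the linear free-ID scan is an assumption made for readability.

```python
import struct

MBAP = struct.Struct(">HHHB")  # TxId, protocol id, length, unit id (7 bytes)


class TxIdMux:
    """Toy correlation map: backend TxId -> (client id, client's original TxId)."""

    def __init__(self) -> None:
        self._next = 0
        self._in_flight: dict[int, tuple[str, int]] = {}

    def _allocate(self) -> int:
        # Scan for a free 16-bit id, wrapping around after 0xFFFF.
        for _ in range(0x10000):
            candidate = self._next
            self._next = (self._next + 1) & 0xFFFF
            if candidate not in self._in_flight:
                return candidate
        raise RuntimeError("all 65536 transaction ids are in flight")

    def rewrite_request(self, client: str, frame: bytes) -> bytes:
        """Swap the client's TxId for a backend-unique one and remember the pair."""
        tx_id, proto, length, unit = MBAP.unpack_from(frame)
        backend_tx = self._allocate()
        self._in_flight[backend_tx] = (client, tx_id)
        return MBAP.pack(backend_tx, proto, length, unit) + frame[MBAP.size:]

    def restore_response(self, frame: bytes) -> tuple[str, bytes]:
        """Look up the owner of a backend response and put its TxId back."""
        backend_tx, proto, length, unit = MBAP.unpack_from(frame)
        client, tx_id = self._in_flight.pop(backend_tx)  # KeyError if unknown
        return client, MBAP.pack(tx_id, proto, length, unit) + frame[MBAP.size:]
```

Two clients that both happen to use TxId 1 get distinct backend IDs, so their responses can come back in any order and still be routed to the right upstream pipe with the original TxId restored.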

Anything beyond this short list — JSON schema, propagation table, stable log event names, status counter catalog, test plan — lives in docs/design.md. Open that doc before writing code; keep it in sync when decisions change.
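
For orientation, the frame the per-request watchdog surfaces on timeout is a standard Modbus exception response: the request's function code with the high bit set, followed by exception code 0x0B (gateway target device failed to respond). A minimal Python sketch, assuming the watchdog still holds the client's original request frame; `build_timeout_exception` is a hypothetical helper, not the service's code.

```python
import struct

GATEWAY_TARGET_FAILED_TO_RESPOND = 0x0B


def build_timeout_exception(request: bytes) -> bytes:
    """Build the exception frame for a request that got no backend response."""
    tx_id, proto, _length, unit = struct.unpack_from(">HHHB", request)
    function_code = request[7]  # first PDU byte follows the 7-byte MBAP header
    pdu = bytes([function_code | 0x80, GATEWAY_TARGET_FAILED_TO_RESPOND])
    # The MBAP length field counts the unit id byte plus the PDU bytes.
    return struct.pack(">HHHB", tx_id, proto, 1 + len(pdu), unit) + pdu
```

For an FC03 request with TxId 7 and unit 1, this produces the nine bytes `00 07 00 00 00 03 01 83 0B`.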

Current state

Implementation complete through Phase 9. Phases 00-08 shipped the production-ready 1:1-model service; Phase 9 swapped the connection layer for the TxId-multiplexed model without changing the transparent-rewrite contract. The service is production-ready as a Windows Service:

  • 301 tests passing: 263 unit tests + 38 E2E tests (against the pymodbus DL205 simulator + stub backends).
  • Single-file self-contained publish (dotnet publish -c Release -r win-x64).
  • PowerShell install/uninstall scripts under install/.
  • Graceful shutdown with configurable drain timeout (Connection.GracefulShutdownTimeoutMs, default 10 s).
  • Windows Event Log integration (Error+ events when running as a service).
  • Read-only HTTP status page at AdminPort (default 8080) — surfaces Phase-9 mux fields alongside Phase-7 counters.
  • connectsSuccess / connectsFailed counters wired in PlcMultiplexer.
  • Phase 9 per-request watchdog defends against any backend that drops or mis-echoes a response (real-world packet loss; pymodbus 3.13 simulator's concurrent-multiplexed-request bug).
  • AssemblyInformationalVersion set to 1.0.0 (CI can override via /p:InformationalVersion=...).

The human-facing entry point is README.md. All design decisions remain in docs/design.md.

Constraints that still apply to this codebase (do not change without updating the design doc):

  • The csproj targets .NET 10 (net10.0). This is the only tool in wwtools/ not pinned to .NET Framework 4.8 / x86.
  • The sample test DL260/DL205BcdQuirkTests.cs is a pattern reference only — its types are not available in this project.

Device quirks (read before writing Modbus code)

The DL205/DL260 family is almost Modbus-spec-compliant, but every category below has at least one trap. The authoritative reference is DL260/dl205.md — read it end-to-end before touching the wire protocol. Highlights that bear directly on this proxy:

  • BCD-by-default numeric encoding. V2000 = 1234 stores 0x1234 on the wire, not 0x04D2. This is the entire reason this service exists.
  • CDAB word order for 32-bit values. Low word first, big-endian bytes within each word. 0xAABBCCDD lands as [0xCC 0xDD][0xAA 0xBB].
  • Octal V-memory ↔ decimal Modbus translation. V2000 octal = decimal 1024 = Modbus PDU 0x0400. Config addresses are PDU-decimal, not octal V-memory and not 1-based 4xxxx.
  • FC03/FC04 max qty = 128 (above spec's 125). FC16 max qty = 100 (below spec's 123). The proxy passes these through; the PLC enforces the cap with exception 03.
  • Max 4 concurrent TCP clients per ECOM100. This was a direct constraint on the original 1:1 connection model; since Phase 9 the proxy holds a single backend socket per PLC, so the cap no longer limits upstream clients. docs/design.md → "Connection model" has the band-aid-vs-rearchitect decision tree.
  • No TCP keepalive from the device. Middleboxes typically drop idle sockets at 2-5 min. With the 1:1 model, backend liveness tracks upstream client liveness; if both are idle long enough, the path dies on its own and the next request reconnects.
  • Register 0 is valid on DL205/DL260 in factory "absolute" addressing mode — don't probe-skip it.
  • As-deployed PLC parameters (captured in DL260/mbtcp_settings.JPG): port 502, "Use Concept data structures (Longs/Reals)" enabled, "Swap bytes" enabled, "Use Zero Based Addressing" unchecked, Register type = Binary, max coil read 1976 / coil write 800 / register read 122 / register write 100. The proxy must speak Modbus as-is; these settings describe the wire it'll see.
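
Two of the quirks above are plain arithmetic and easy to get backwards, so here is a small Python sketch of the CDAB layout and the octal address translation. The helper names are hypothetical, and `vmem_to_pdu` assumes the straight octal-to-decimal mapping that the V2000 example implies; DL260/dl205.md remains the authority for other memory ranges.

```python
import struct


def vmem_to_pdu(octal_address: str) -> int:
    """Translate an octal V-memory address ("2000") to a 0-based PDU address."""
    return int(octal_address, 8)


def split_cdab(value32: int) -> bytes:
    """Lay out a 32-bit value on the wire: low word first, big-endian in each word."""
    return struct.pack(">HH", value32 & 0xFFFF, value32 >> 16)


def decode_bcd32_cdab(first_register: int, second_register: int) -> int:
    """Decode an 8-digit BCD value from a CDAB register pair (low word arrives first)."""
    raw = (second_register << 16) | first_register
    value = 0
    for shift in range(28, -1, -4):
        nibble = (raw >> shift) & 0xF
        if nibble > 9:
            # Assumption: treat non-decimal nibbles as an error.
            raise ValueError(f"invalid BCD nibble 0x{nibble:X} in 0x{raw:08X}")
        value = value * 10 + nibble
    return value
```

`vmem_to_pdu("2000")` gives 1024 (0x0400), `split_cdab(0xAABBCCDD)` gives the bytes `CC DD AA BB`, and a register pair read as `0x5678, 0x1234` decodes to 12345678.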

Resource index

| Task | Go to |
| --- | --- |
| Full architecture / design plan (decisions, schema, log events, status counters, test plan) | docs/design.md |
| Phase-by-phase implementation plan (parallel-safety, phase gates, per-phase test list) | docs/plan/README.md |
| Dashboard KPI catalogue: what's exposed today and proposed additions (rates, percentiles, availability, fleet aggregates) | docs/kpi.md |
| DL205/DL260 Modbus quirks (BCD, CDAB, octal V-memory, FC limits, exception codes, oddities) | DL260/dl205.md |
| pymodbus simulator profile that models those quirks as concrete register values | DL260/dl205.json |
| Example integration test pattern (xUnit + Shouldly + simulator fixture) | DL260/DL205BcdQuirkTests.cs |
| As-deployed PLC Modbus parameters screenshot | DL260/mbtcp_settings.JPG |

Maintenance

Documentation doctrine for wwtools/ lives in ../DOCS-GUIDE.md. The three-layer rules apply:

  • README.md is the canonical human entry point (Layer-2 per DOCS-GUIDE). It routes to deep docs; it does not duplicate them. Update it when the service's public surface or install steps change.
  • This CLAUDE.md stays a router for LLM coding agents. Deep design decisions live in docs/design.md; device quirks live in DL260/dl205.md. When you change a design decision, update docs/design.md first (it's the source of truth) and only mirror the change into the Architecture summary above if it shifts one of the headline bullets.
  • When the service's task→tool mapping changes in the root index, update ../CLAUDE.md too.
  • Any further work beyond Phase 9 belongs in a new design revision (dated, in docs/design.md) and a new phase plan.