Four simple .bat files — install / remove / start / stop-service — that
manage the 'mbproxy' Windows service against the Mbproxy.exe in their
own folder. publish.ps1 / publish.sh copy them into each win-* publish
flavour, so a published Windows folder is self-managing.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The publish-out appsettings templates carried ~250 lines of mostly
prose — BCD/CDAB encoding, coalescing rationale, the full cache
contract — all of which is already documented in
docs/Operations/Configuration.md. Replaced the prose with brief
per-section pointers and a header directing operators to that
reference; all config values are unchanged. Also dropped a stale
comment claiming a 1:1 connection model and a 4-client cap (lifted by
the Phase-9 TxId multiplexer).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Fixes every finding from the codereviews/2026-05-16 multi-agent review
(2 Critical, 20 Major, 38 Minor) and adds that review to the repo.
Highlights:
- dashboard XSS escape
- response cache invalidated on the write request (not just the
response)
- ReloadValidator now runs at startup so port collisions / duplicate
names / malformed Resilience profiles fail fast
- AdminPort 0 genuinely disables the admin endpoint
- PlcListener accept-loop faults propagate to the supervisor's faulted
path
- reconciler Restart builds before removing
- Resilience pipelines are restart-only from a frozen snapshot
- multiplexer connect-race leak, watchdog party-list snapshot,
backend-response and FC16 framing validation
- frontend reconnect retry and util.js load guard
- log-event/doc drift sweep and test-port hygiene
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Reviewed the new SignalR dashboard and fixed its two top findings: a
stored XSS on the connection-detail page (unescaped tag name / direction
/ timestamp rendered into innerHTML) and FC03/FC04 cache hits bypassing
the debug-view capture, which left cached tags frozen while their age
climbed. Also adds an optional human-friendly Name to BCD tags surfaced
on the debug view, and loads the real fleet config from tags.txt (12
named BCD tags, PLC Z28061) so the published appsettings.json is
deploy-ready.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Make the service build, run, and install on Linux as a first-class
target while keeping the Windows Service + Event Log behaviour intact.
- Build: drop the hardcoded win-x64 RID — single-file publish now works
for any RID. publish.ps1 gains -Rid; new publish.sh for Linux hosts.
- Diagnostics: DiagnosticSinkSelector picks the Error+ sink per host —
Windows Event Log under the SCM, local syslog under systemd
(Serilog.Sinks.SyslogMessages), none for interactive runs. The
EventLog truncation helper is extracted so it is testable cross-OS.
- Host: Program.cs registers AddSystemd() alongside AddWindowsService().
- Config: a RID-conditioned appsettings template ships Windows or Unix
paths; both templates are schema-validated by a test.
- Install: systemd unit (Type=exec) plus install.sh / uninstall.sh.
Also fixes two cross-platform bugs found while testing: install.ps1
and uninstall.ps1 used New-EventLog / Remove-EventLog (absent in
PowerShell 7), and the E2E sim launcher hardcoded Windows venv paths.
- Docs updated across README, CLAUDE.md, and docs/ for dual-platform.
413 tests pass on Windows; 374 (all non-simulator) on Linux.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The DL205/DL260 ECOM emits no TCP keepalives, so an idle backend socket
can be silently dropped by a middlebox (switch, firewall, NAT) after
2-5 minutes. Enable OS SO_KEEPALIVE on backend and accepted upstream
sockets, and drive a periodic synthetic FC03 heartbeat on each idle
backend socket so a dead path is detected before a real client request
hits it. Controlled by Connection.Keepalive (ON by default).
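The two mechanisms above can be sketched as follows. This is an illustrative Python sketch, not the service's code: the function names, the default unit ID, and the choice of register 0 for the heartbeat read are assumptions, since the message does not say which register the heartbeat targets.

```python
import socket
import struct


def enable_keepalive(sock: socket.socket) -> None:
    """Ask the OS to send TCP keepalive probes on this socket."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)


def build_fc03_heartbeat(tx_id: int, unit_id: int = 1, address: int = 0) -> bytes:
    """Build a Modbus TCP FC03 request that reads a single holding register.

    MBAP header: TxId (2 bytes), protocol id 0 (2), length (2), unit id (1).
    PDU: function code 0x03, start address (2), quantity (2).
    The address and unit id defaults are hypothetical.
    """
    pdu = struct.pack(">BHH", 0x03, address, 1)
    # The MBAP length field counts the unit id byte plus the PDU.
    mbap = struct.pack(">HHHB", tx_id & 0xFFFF, 0, len(pdu) + 1, unit_id)
    return mbap + pdu
```

A heartbeat loop would send such a frame only when the backend socket has been idle for the configured interval, and treat a missing reply as a dead path.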
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The standalone design.md, kpi.md, operations.md, and the docs/plan/
phase tree were point-in-time planning artefacts now superseded by the
topic-organized docs/ tree (Architecture/, Features/, Operations/,
Reference/, Testing/). The DL260/ folder mixed a device-reference doc, a
test fixture, a sample test, and a screenshot; its contents now live in
their natural homes (dl205.md + mbtcp_settings.JPG under docs/Reference/,
dl205.json next to its launcher in tests/sim/, sample test dropped).
All cross-references in the surviving docs, README, CLAUDE.md, the config
template, and source comments are repointed to the new locations.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Captures the dotnet publish invocations used to produce the self-contained
(~100 MB, bundles .NET 10) and framework-dependent (~1.5 MB, requires .NET
10 preinstalled) win-x64 single-file Mbproxy.exe builds, so re-cutting a
release isn't institutional knowledge.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Lands the design-contract pivot ahead of any cache implementation code so
reviewers can evaluate the change to the "purely transparent proxy" stance
independently of the Phase-11 code that depends on it.
- docs/design.md: rewrite "What this is" / Read-coalescing / Failure-modes
sections to acknowledge the opt-in cache; add new "Response cache (Phase
11)" section covering lookup order (cache -> coalesce -> backend), multi-
tag range TTL = min, post-rewriter storage, address-range-overlap write
invalidation, hot-reload PLC-wide flush, no-persistence, AllowLongTtl gate,
and LRU-bounded capacity. Extend log event table with mbproxy.cache.*
events. Extend per-PLC status field table with cacheHitCount /
cacheMissCount / cacheInvalidations / cacheEntryCount / cacheBytes.
Extend hot-reload propagation table with CacheTtlMs / Cache.* rows.
- docs/kpi.md: graduate Tier 1.8 (response cache) from "requires Phase 11"
to "shipped in Phase 11" and add Tier 2.4a cache-memory section.
- CLAUDE.md (mbproxy): update Purpose paragraph and the Architecture
headline bullets to reflect the transparent-by-default + opt-in-cache
contract; flip "Implementation complete through Phase 10" to "through
Phase 11".
- install/mbproxy.config.template.json: add a fully-commented Mbproxy.Cache
block and a CacheTtlMs example on a BcdTags.Global entry, with prominent
staleness commentary documenting the design contract.
No code changes in this commit - implementation lands in a follow-up.
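For reviewers, three parts of the contract described above (range TTL = min, address-range-overlap write invalidation, LRU-bounded capacity) can be pictured with a small Python sketch. It is not the Phase-11 implementation: the class name, the key shape (unit_id, fc, start, qty), and the choice to invalidate across both read function codes are assumptions made for illustration.

```python
import time
from collections import OrderedDict


class ResponseCache:
    """Toy TTL + LRU cache keyed by (unit_id, fc, start, qty)."""

    def __init__(self, capacity=1024, clock=time.monotonic):
        self._entries = OrderedDict()  # key -> (expires_at, response)
        self._capacity = capacity
        self._clock = clock

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, response = entry
        if self._clock() >= expires_at:
            del self._entries[key]      # expired: treat as a miss
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        return response

    def put(self, key, response, tag_ttls_ms):
        """Store a response; a range covering several tags uses the minimum TTL."""
        if not tag_ttls_ms:
            return                      # no cacheable tag in range
        expires_at = self._clock() + min(tag_ttls_ms) / 1000.0
        self._entries[key] = (expires_at, response)
        self._entries.move_to_end(key)
        while len(self._entries) > self._capacity:
            self._entries.popitem(last=False)  # evict least recently used

    def invalidate_overlapping(self, unit_id, write_start, write_qty):
        """Drop every entry on this unit whose range overlaps the written range."""
        write_end = write_start + write_qty
        stale = [
            k for k in self._entries
            if k[0] == unit_id and k[2] < write_end and write_start < k[2] + k[3]
        ]
        for k in stale:
            del self._entries[k]
        return len(stale)
```

In the lookup order named above, a read would consult this cache first, fall through to coalescing on a miss, and only then reach the backend.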
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
When two or more upstream clients send the same FC03/FC04 read while a
matching request is already in flight on the same PLC's multiplexed
backend socket, attach the late arrivals to the existing
InFlightRequest.InterestedParties list instead of opening a second
backend round-trip.
The single backend response fans out to every attached party with each
party's original MBAP TxId restored individually. Zero post-response
staleness — coalescing operates entirely within the in-flight window
(microseconds to ~10 ms typical); the proxy is NOT a cache layer.
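The fan-out step can be illustrated with a short Python sketch. It assumes each attached party is represented only by its original MBAP TxId; the function name is hypothetical and the real code tracks more per-party state.

```python
import struct


def fan_out(backend_response: bytes, original_tx_ids: list[int]) -> list[bytes]:
    """Return one copy of the backend response per attached party.

    The first two bytes of a Modbus TCP frame are the MBAP transaction id;
    everything after them is shared verbatim between the copies.
    """
    body = backend_response[2:]
    return [struct.pack(">H", tx_id) + body for tx_id in original_tx_ids]
```

Each upstream client therefore sees a reply carrying the TxId it sent, even though only one request reached the PLC.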
Headline mechanism:
- New record struct CoalescingKey(UnitId, Fc, StartAddress, Qty) keys
the per-PLC InFlightByKeyMap. FC03 and FC04 are separate Modbus
tables and never share a key; different unit IDs never coalesce;
writes (FC06/FC16) bypass the coalescing path entirely.
- InFlightByKeyMap uses a simple lock around a Dictionary; atomic
TryAttachOrCreate either appends a new party to the in-flight
request's mutable List<InterestedParty> or invokes a factory to
build a fresh entry. Per-entry MaxParties cap (default 32) bounds
fan-out cost; past the cap, the next arrival opens a new entry.
- PlcMultiplexer.OnUpstreamFrameAsync takes the coalescing path for
FC03/FC04 when Mbproxy.Resilience.ReadCoalescing.Enabled. The
factory closure does the Phase-9 work (allocate TxId, add to
CorrelationMap); the channel send happens AFTER returning from
TryAttachOrCreate so the map lock is not held across the async send.
- Response fan-out in RunBackendReaderAsync removes the entry from
InFlightByKeyMap before iterating InterestedParties, ensuring no
concurrent attach can mutate the list during iteration.
- Cascade + watchdog paths also drain the key map so a stale entry
cannot outlive its backend round-trip.
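The attach-or-create behaviour described in the list above can be sketched in Python. This is a simplified model, not a port of InFlightByKeyMap.cs: the names mirror the message, but the return shape, the remove helper, and the identity check in it are assumptions.

```python
import threading
from dataclasses import dataclass, field
from typing import Callable, NamedTuple


class CoalescingKey(NamedTuple):
    unit_id: int
    fc: int             # FC03 and FC04 differ here, so they never share a key
    start_address: int
    qty: int


@dataclass
class InFlightRequest:
    parties: list = field(default_factory=list)


class InFlightByKeyMap:
    def __init__(self, max_parties: int = 32):
        self._lock = threading.Lock()
        self._by_key = {}
        self._max_parties = max_parties

    def try_attach_or_create(self, key: CoalescingKey, party,
                             factory: Callable[[], InFlightRequest]):
        """Return (request, attached). attached is False when a new entry was built."""
        with self._lock:
            existing = self._by_key.get(key)
            if existing is not None and len(existing.parties) < self._max_parties:
                existing.parties.append(party)
                return existing, True
            # No entry, or the entry is at its cap: build a fresh one.
            request = factory()
            request.parties.append(party)
            self._by_key[key] = request
            return request, False

    def remove(self, key: CoalescingKey, request: InFlightRequest):
        """Detach the entry before fan-out; ignore it if a newer entry replaced it."""
        with self._lock:
            if self._by_key.get(key) is request:
                return self._by_key.pop(key)
            return None
```

The caller sends to the backend only when attached is False, and does so after the call returns, so the lock is never held across I/O.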
Counter accounting balance (per snapshot): CoalescedHitCount +
CoalescedMissCount equals total FC03 + FC04 requests since startup.
Even with coalescing disabled, every read still bumps Miss so dashboard
math stays balanced.
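The balance rule can be stated as a tiny Python sketch. The class and method names are hypothetical; only the invariant (hits + misses = total reads) comes from the message.

```python
from dataclasses import dataclass


@dataclass
class CoalescingCounters:
    hits: int = 0
    misses: int = 0

    def record_read(self, coalescing_enabled: bool, attached: bool) -> None:
        """Count one FC03/FC04 request as either a hit or a miss, never both."""
        if coalescing_enabled and attached:
            self.hits += 1
        else:
            # Covers both a fresh backend round-trip and coalescing being off.
            self.misses += 1
```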
New surface (additive only):
- src/Mbproxy/Proxy/Multiplexing/CoalescingKey.cs
- src/Mbproxy/Proxy/Multiplexing/InFlightByKeyMap.cs
- src/Mbproxy/Proxy/Multiplexing/CoalescingLogEvents.cs
- ReadCoalescingOptions on ResilienceOptions
- CoalescedHitCount / CoalescedMissCount /
CoalescedResponseToDeadUpstream counters surfaced on /status.json
per PLC and as a compact "Coal" cell on the HTML status page.
Phase 9 test patch: TwoUpstreams_ProxyTxIds_AreDistinct_OnTheWire
previously read the same register from both clients (which now
coalesces). Patched to read two different addresses so the test still
proves distinct backend TxIds without violating the coalescing
contract.
Tests added (24 new: 19 unit + 5 E2E):
- CoalescingKeyTests (5)
- InFlightByKeyMapTests (6, includes concurrent stress)
- ReadCoalescingTests (8, stub-backend with deterministic delay)
- ReadCoalescingE2ETests (5, pymodbus simulator; coalescing-active
during overlap is proven against the stub, not the sim, due to
pymodbus 3.13's known concurrent-frame bug)
Total: 325 tests passing (282 unit + 43 E2E).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds the mbproxy service end-to-end. Phases 00-08 implement the
production-ready single-listener / 1:1-backend transparent Modbus TCP
proxy with bidirectional BCD rewriting for the ~54-PLC DL205/DL260
fleet. Phase 9 replaces the connection layer with a single backend
socket per PLC plus MBAP TxId rewriting, lifting the H2-ECOM100's
4-concurrent-client cap as an operational ceiling.
Phase 9 additions of note:
- PlcMultiplexer + UpstreamPipe + TxIdAllocator + CorrelationMap
- InFlightRequest with IReadOnlyList<InterestedParty> (load-bearing
for Phase 10 read coalescing — do not collapse to a single field)
- Per-request watchdog: surfaces Modbus exception 0x0B to upstream
on BackendRequestTimeoutMs, defending against lost responses,
dead-PLC paths, and pymodbus 3.13.0's concurrent-multiplexed-
request bug (its ServerRequestHandler.last_pdu state race)
- Status DTO + HTML gain inFlight / maxInFlight / txIdWraps /
disconnectCascades / queueDepth (Tier 1.6 in docs/kpi.md)
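The TxId rewriting and the watchdog's exception reply can be illustrated in Python. This is a hedged sketch rather than the shipped TxIdAllocator: the in-use set, the wrap counter placement, and both helper function names are assumptions. Exception code 0x0B is the standard Modbus "gateway target device failed to respond" code.

```python
import struct


class TxIdAllocator:
    """Hand out 16-bit proxy transaction ids, skipping ids still in flight."""

    def __init__(self):
        self._next = 0
        self._in_use = set()
        self.wraps = 0

    def allocate(self) -> int:
        for _ in range(0x10000):
            candidate = self._next
            self._next = (self._next + 1) & 0xFFFF
            if self._next == 0:
                self.wraps += 1          # counter rolled over past 0xFFFF
            if candidate not in self._in_use:
                self._in_use.add(candidate)
                return candidate
        raise RuntimeError("all 65536 transaction ids are in flight")

    def release(self, tx_id: int) -> None:
        self._in_use.discard(tx_id)


def rewrite_tx_id(frame: bytes, tx_id: int) -> bytes:
    """Replace the MBAP transaction id (first two bytes) of a frame."""
    return struct.pack(">H", tx_id) + frame[2:]


def timeout_exception(request: bytes) -> bytes:
    """Build the exception response sent upstream when the backend times out."""
    tx_id, proto, _length, unit_id, fc = struct.unpack(">HHHBB", request[:8])
    # Exception PDU: function code with the high bit set, then the code 0x0B.
    return struct.pack(">HHHBBB", tx_id, proto, 3, unit_id, fc | 0x80, 0x0B)
```

A correlation map would pair each allocated proxy TxId with the upstream party and its original TxId, so the response path can reverse the rewrite.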
Tests: 263 unit + 38 E2E. Multiplexer correctness under truly
concurrent backend traffic is proved against a stub backend in
PlcMultiplexerTests; MultiplexerE2ETests paces requests so pymodbus
3.13's single-PDU framer stays in known-good mode.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>