Commit Graph

11 Commits

Author SHA1 Message Date
Joseph Doherty 1eeee1e292 mbproxy/install: add Windows service-management batch files
Four simple .bat files — install / remove / start / stop-service — that
manage the 'mbproxy' Windows service against the Mbproxy.exe in their
own folder. publish.ps1 / publish.sh copy them into each win-* publish
flavour, so a published Windows folder is self-managing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-16 21:32:33 -04:00
Joseph Doherty 0756bb0066 mbproxy/install: slim config templates, point to Configuration.md
The publish-out appsettings templates carried ~250 lines of mostly
prose — BCD/CDAB encoding, coalescing rationale, the full cache
contract — all of which is already documented in
docs/Operations/Configuration.md. Replaced the prose with brief
per-section pointers and a header directing operators to that
reference; all config values are unchanged. Also dropped a stale
comment claiming a 1:1 connection model and a 4-client cap (lifted by
the Phase-9 TxId multiplexer).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-16 18:18:59 -04:00
Joseph Doherty b222362ce0 mbproxy: remediate the 2026-05-16 code-review findings
Fixes every finding from the codereviews/2026-05-16 multi-agent review
(2 Critical, 20 Major, 38 Minor) and adds that review to the repo.

Highlights: dashboard XSS escape; response cache invalidated on the
write request (not just the response); ReloadValidator now runs at
startup so port collisions / duplicate names / malformed Resilience
profiles fail fast; AdminPort 0 genuinely disables the admin endpoint;
PlcListener accept-loop faults propagate to the supervisor's faulted
path; reconciler Restart builds before removing; Resilience pipelines
are restart-only from a frozen snapshot; multiplexer connect-race leak,
watchdog party-list snapshot, backend-response and FC16 framing
validation; frontend reconnect retry and util.js load guard; plus the
log-event/doc drift sweep and test-port hygiene.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-16 18:08:06 -04:00
Joseph Doherty 554b05d28c mbproxy: fix dashboard review findings, add named BCD tags + fleet config
Reviewed the new SignalR dashboard and fixed its two top findings: a stored XSS on the connection-detail page (unescaped tag name / direction / timestamp rendered into innerHTML) and FC03/FC04 cache hits bypassing the debug-view capture, which left cached tags frozen while their age climbed. Also adds an optional human-friendly Name to BCD tags surfaced on the debug view, and loads the real fleet config from tags.txt (12 named BCD tags, PLC Z28061) so the published appsettings.json is deploy-ready.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-16 03:39:39 -04:00
Joseph Doherty b330faff03 mbproxy: cross-platform support — Linux/systemd alongside Windows
Make the service build, run, and install on Linux as a first-class
target while keeping the Windows Service + Event Log behaviour intact.

- Build: drop the hardcoded win-x64 RID — single-file publish now works
  for any RID. publish.ps1 gains -Rid; new publish.sh for Linux hosts.
- Diagnostics: DiagnosticSinkSelector picks the Error+ sink per host —
  Windows Event Log under the SCM, local syslog under systemd
  (Serilog.Sinks.SyslogMessages), none for interactive runs. The
  EventLog truncation helper is extracted so it is testable cross-OS.
- Host: Program.cs registers AddSystemd() alongside AddWindowsService().
- Config: a RID-conditioned appsettings template ships Windows or Unix
  paths; both templates are schema-validated by a test.
- Install: systemd unit (Type=exec) plus install.sh / uninstall.sh.
  Also fixes two cross-platform bugs found while testing: install.ps1
  and uninstall.ps1 used New-EventLog / Remove-EventLog (absent in
  PowerShell 7), and the E2E sim launcher hardcoded Windows venv paths.
- Docs updated across README, CLAUDE.md, and docs/ for dual-platform.

413 tests pass on Windows; 374 (all non-simulator) on Linux.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-15 09:41:59 -04:00
Joseph Doherty 0868613890 mbproxy: add keepalive / connection monitoring
The DL205/DL260 ECOM emits no TCP keepalives, so an idle backend socket
can be silently dropped by a middlebox (switch, firewall, NAT) after
2-5 minutes. Enable OS SO_KEEPALIVE on backend and accepted upstream
sockets, and drive a periodic synthetic FC03 heartbeat on each idle
backend socket so a dead path is detected before a real client request
hits it. Controlled by Connection.Keepalive (ON by default).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-15 09:40:54 -04:00
Joseph Doherty 7466a46aa7 mbproxy/docs: retire superseded design/plan docs and dissolve DL260/
The standalone design.md, kpi.md, operations.md, and the docs/plan/
phase tree were point-in-time planning artefacts now superseded by the
topic-organized docs/ tree (Architecture/, Features/, Operations/,
Reference/, Testing/). The DL260/ folder mixed a device-reference doc, a
test fixture, a sample test, and a screenshot; its contents now live in
their natural homes (dl205.md + mbtcp_settings.JPG under docs/Reference/,
dl205.json next to its launcher in tests/sim/, sample test dropped).

All cross-references in the surviving docs, README, CLAUDE.md, the config
template, and source comments are repointed to the new locations.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-15 07:37:48 -04:00
Joseph Doherty ee1ae89e25 mbproxy/install: add publish.ps1 for the two single-file build flavours
Captures the dotnet publish invocations used to produce the self-contained
(~100 MB, bundles .NET 10) and framework-dependent (~1.5 MB, requires .NET
10 preinstalled) win-x64 single-file Mbproxy.exe builds, so re-cutting a
release isn't institutional knowledge.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 13:53:22 -04:00
Joseph Doherty 892b10baf4 mbproxy/docs: pivot design contract for Phase 11 response cache
Lands the design-contract pivot ahead of any cache implementation code so
reviewers can evaluate the change to the "purely transparent proxy" stance
independently of the Phase-11 code that depends on it.

- docs/design.md: rewrite "What this is" / Read-coalescing / Failure-modes
  sections to acknowledge the opt-in cache; add new "Response cache (Phase
  11)" section covering lookup order (cache -> coalesce -> backend), multi-
  tag range TTL = min, post-rewriter storage, address-range-overlap write
  invalidation, hot-reload PLC-wide flush, no-persistence, AllowLongTtl gate,
  and LRU-bounded capacity. Extend log event table with mbproxy.cache.*
  events. Extend per-PLC status field table with cacheHitCount /
  cacheMissCount / cacheInvalidations / cacheEntryCount / cacheBytes.
  Extend hot-reload propagation table with CacheTtlMs / Cache.* rows.
- docs/kpi.md: graduate Tier 1.8 (response cache) from "requires Phase 11"
  to "shipped in Phase 11" and add Tier 2.4a cache-memory section.
- CLAUDE.md (mbproxy): update Purpose paragraph and the Architecture
  headline bullets to reflect the transparent-by-default + opt-in-cache
  contract; flip "Implementation complete through Phase 10" to "through
  Phase 11".
- install/mbproxy.config.template.json: add a fully-commented Mbproxy.Cache
  block and a CacheTtlMs example on a BcdTags.Global entry, with prominent
  staleness commentary documenting the design contract.

No code changes in this commit - implementation lands in a follow-up.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 02:35:35 -04:00
Joseph Doherty a2dba4bd07 mbproxy: add in-flight read coalescing (Phase 10)
When two or more upstream clients send the same FC03/FC04 read while a
matching request is already in flight on the same PLC's multiplexed
backend socket, attach the late arrivals to the existing InFlightRequest
.InterestedParties list instead of opening a second backend round-trip.
The single backend response fans out to every attached party with each
party's original MBAP TxId restored individually. Zero post-response
staleness — coalescing operates entirely within the in-flight window
(microseconds to ~10 ms typical); the proxy is NOT a cache layer.

Headline mechanism:

- New record struct CoalescingKey(UnitId, Fc, StartAddress, Qty) keys
  the per-PLC InFlightByKeyMap. FC03 and FC04 are separate Modbus
  tables and never share a key; different unit IDs never coalesce;
  writes (FC06/FC16) bypass the coalescing path entirely.
- InFlightByKeyMap uses a simple lock around a Dictionary; atomic
  TryAttachOrCreate either appends a new party to the in-flight
  request's mutable List<InterestedParty> or invokes a factory to
  build a fresh entry. Per-entry MaxParties cap (default 32) bounds
  fan-out cost; past the cap, the next arrival opens a new entry.
- PlcMultiplexer.OnUpstreamFrameAsync takes the coalescing path for
  FC03/FC04 when Mbproxy.Resilience.ReadCoalescing.Enabled. The
  factory closure does the Phase-9 work (allocate TxId, add to
  CorrelationMap); the channel send happens AFTER returning from
  TryAttachOrCreate so the map lock is not held across the async send.
- Response fan-out in RunBackendReaderAsync removes the entry from
  InFlightByKeyMap before iterating InterestedParties, ensuring no
  concurrent attach can mutate the list during iteration.
- Cascade + watchdog paths also drain the key map so a stale entry
  cannot outlive its backend round-trip.

Counter accounting balance (per snapshot): CoalescedHitCount +
CoalescedMissCount equals total FC03 + FC04 requests since startup.
Even with coalescing disabled, every read still bumps Miss so dashboard
math stays balanced.

New surface (additive only):
- src/Mbproxy/Proxy/Multiplexing/CoalescingKey.cs
- src/Mbproxy/Proxy/Multiplexing/InFlightByKeyMap.cs
- src/Mbproxy/Proxy/Multiplexing/CoalescingLogEvents.cs
- ReadCoalescingOptions on ResilienceOptions
- CoalescedHitCount / CoalescedMissCount /
  CoalescedResponseToDeadUpstream counters surfaced on /status.json
  per PLC and as a compact "Coal" cell on the HTML status page.

Phase 9 test patch: TwoUpstreams_ProxyTxIds_AreDistinct_OnTheWire
previously read the same register from both clients (which now
coalesces). Patched to read two different addresses so the test still
proves distinct backend TxIds without violating the coalescing
contract.

Tests added: 24 new (19 unit + 5 E2E):
- CoalescingKeyTests (5)
- InFlightByKeyMapTests (6, includes concurrent stress)
- ReadCoalescingTests (8, stub-backend with deterministic delay)
- ReadCoalescingE2ETests (5, pymodbus simulator; coalescing-active
  during overlap is proven against the stub, not the sim, due to
  pymodbus 3.13's known concurrent-frame bug)

Total: 325 tests passing (282 unit + 43 E2E).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 02:26:06 -04:00
Joseph Doherty 56eee3c563 mbproxy: initial commit through Phase 9 (TxId multiplexing)
Adds the mbproxy service end-to-end. Phases 00-08 implement the
production-ready single-listener / 1:1-backend transparent Modbus TCP
proxy with bidirectional BCD rewriting for the ~54-PLC DL205/DL260
fleet. Phase 9 replaces the connection layer with a single backend
socket per PLC plus MBAP TxId rewriting, lifting the H2-ECOM100's
4-concurrent-client cap as an operational ceiling.

Phase 9 additions of note:
- PlcMultiplexer + UpstreamPipe + TxIdAllocator + CorrelationMap
- InFlightRequest with IReadOnlyList<InterestedParty> (load-bearing
  for Phase 10 read coalescing — do not collapse to a single field)
- Per-request watchdog: surfaces Modbus exception 0x0B to upstream
  on BackendRequestTimeoutMs, defending against lost responses,
  dead-PLC paths, and pymodbus 3.13.0's concurrent-multiplexed-
  request bug (its ServerRequestHandler.last_pdu state race)
- Status DTO + HTML gain inFlight / maxInFlight / txIdWraps /
  disconnectCascades / queueDepth (Tier 1.6 in docs/kpi.md)

Tests: 263 unit + 38 E2E. Multiplexer correctness under truly
concurrent backend traffic is proved against a stub backend in
PlcMultiplexerTests; MultiplexerE2ETests paces requests so pymodbus
3.13's single-PDU framer stays in known-good mode.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 01:49:35 -04:00