a2dba4bd07
When two or more upstream clients send the same FC03/FC04 read while a matching
request is already in flight on the same PLC's multiplexed backend socket, attach
the late arrivals to the existing InFlightRequest.InterestedParties list instead of
opening a second backend round-trip. The single backend response fans out to every
attached party with each party's original MBAP TxId restored individually.

Zero post-response staleness — coalescing operates entirely within the in-flight
window (microseconds to ~10 ms typical); the proxy is NOT a cache layer.

Headline mechanism:

- New record struct CoalescingKey(UnitId, Fc, StartAddress, Qty) keys the per-PLC
  InFlightByKeyMap. FC03 and FC04 are separate Modbus tables and never share a key;
  different unit IDs never coalesce; writes (FC06/FC16) bypass the coalescing path
  entirely.
- InFlightByKeyMap uses a simple lock around a Dictionary; atomic TryAttachOrCreate
  either appends a new party to the in-flight request's mutable
  List<InterestedParty> or invokes a factory to build a fresh entry. Per-entry
  MaxParties cap (default 32) bounds fan-out cost; past the cap, the next arrival
  opens a new entry.
- PlcMultiplexer.OnUpstreamFrameAsync takes the coalescing path for FC03/FC04 when
  Mbproxy.Resilience.ReadCoalescing.Enabled. The factory closure does the Phase-9
  work (allocate TxId, add to CorrelationMap); the channel send happens AFTER
  returning from TryAttachOrCreate so the map lock is not held across the async send.
- Response fan-out in RunBackendReaderAsync removes the entry from InFlightByKeyMap
  before iterating InterestedParties, ensuring no concurrent attach can mutate the
  list during iteration.
- Cascade + watchdog paths also drain the key map so a stale entry cannot outlive
  its backend round-trip.

Counter accounting balance (per snapshot): CoalescedHitCount + CoalescedMissCount
equals total FC03 + FC04 requests since startup. Even with coalescing disabled,
every read still bumps Miss so dashboard math stays balanced.

New surface (additive only):

- src/Mbproxy/Proxy/Multiplexing/CoalescingKey.cs
- src/Mbproxy/Proxy/Multiplexing/InFlightByKeyMap.cs
- src/Mbproxy/Proxy/Multiplexing/CoalescingLogEvents.cs
- ReadCoalescingOptions on ResilienceOptions
- CoalescedHitCount / CoalescedMissCount / CoalescedResponseToDeadUpstream counters
  surfaced on /status.json per PLC and as a compact "Coal" cell on the HTML status
  page.

Phase 9 test patch: TwoUpstreams_ProxyTxIds_AreDistinct_OnTheWire previously read
the same register from both clients (which now coalesces). Patched to read two
different addresses so the test still proves distinct backend TxIds without
violating the coalescing contract.

Tests added: 24 new (19 unit + 5 E2E):

- CoalescingKeyTests (5)
- InFlightByKeyMapTests (6, includes concurrent stress)
- ReadCoalescingTests (8, stub-backend with deterministic delay)
- ReadCoalescingE2ETests (5, pymodbus simulator; coalescing-active during overlap
  is proven against the stub, not the sim, due to pymodbus 3.13's known
  concurrent-frame bug)

Total: 325 tests passing (282 unit + 43 E2E).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
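
The attach-or-create and fan-out steps described in the commit message can be sketched roughly as follows. This is a minimal Python illustration, not the C# code from the commit: the class and function names mirror the message but their fields (`upstream_id`, `proxy_tx_id`) and the `fan_out` helper are hypothetical, the async channel send is omitted, and the rule that a fresh entry replaces a full one in the map is an assumption about how "past the cap, the next arrival opens a new entry" might work.

```python
import struct
import threading
from dataclasses import dataclass, field
from typing import Callable, Dict, List, NamedTuple, Tuple


class CoalescingKey(NamedTuple):
    """Identity of a read; two requests coalesce only if all four fields match."""
    unit_id: int
    fc: int              # 3 = holding registers, 4 = input registers
    start_address: int
    qty: int


@dataclass
class InterestedParty:
    upstream_id: str     # hypothetical handle for the upstream connection
    original_tx_id: int  # MBAP TxId the client sent; restored on fan-out


@dataclass
class InFlightRequest:
    proxy_tx_id: int     # TxId used on the backend socket (allocated by the factory)
    parties: List[InterestedParty] = field(default_factory=list)


class InFlightByKeyMap:
    def __init__(self, max_parties: int = 32) -> None:
        self._lock = threading.Lock()
        self._entries: Dict[CoalescingKey, InFlightRequest] = {}
        self._max_parties = max_parties

    def try_attach_or_create(
        self,
        key: CoalescingKey,
        party: InterestedParty,
        factory: Callable[[], InFlightRequest],
    ) -> Tuple[InFlightRequest, bool]:
        """Return (entry, created); created=False means the party was coalesced."""
        with self._lock:
            entry = self._entries.get(key)
            if entry is not None and len(entry.parties) < self._max_parties:
                entry.parties.append(party)
                return entry, False
            # No entry yet, or the existing one is full (assumed behaviour):
            # build a fresh entry, which needs its own backend round-trip.
            entry = factory()
            entry.parties.append(party)
            self._entries[key] = entry
            return entry, True

    def remove(self, key: CoalescingKey, entry: InFlightRequest) -> None:
        """Drop the entry, but only if the map still points at this exact one."""
        with self._lock:
            if self._entries.get(key) is entry:
                del self._entries[key]


def fan_out(key_map: InFlightByKeyMap, key: CoalescingKey,
            entry: InFlightRequest, response_pdu: bytes) -> List[Tuple[str, bytes]]:
    """Build one response frame per party, each with its own original TxId."""
    # Remove first, so no late attach can mutate entry.parties while we iterate.
    key_map.remove(key, entry)
    frames = []
    for party in entry.parties:
        # MBAP header: TxId, protocol id 0, length (unit id + PDU), unit id.
        mbap = struct.pack(">HHHB", party.original_tx_id, 0,
                           len(response_pdu) + 1, key.unit_id)
        frames.append((party.upstream_id, mbap + response_pdu))
    return frames


# Two clients issue the same FC03 read while the first is still in flight.
key_map = InFlightByKeyMap(max_parties=32)
key = CoalescingKey(unit_id=1, fc=3, start_address=1024, qty=2)
first, created_a = key_map.try_attach_or_create(
    key, InterestedParty("hmi", 7), lambda: InFlightRequest(proxy_tx_id=100))
second, created_b = key_map.try_attach_or_create(
    key, InterestedParty("historian", 9), lambda: InFlightRequest(proxy_tx_id=101))

pdu = bytes([0x03, 0x04, 0x12, 0x34, 0x00, 0x00])  # FC03 response, 2 registers
frames = fan_out(key_map, key, first, pdu)
```

In this sketch the second call does not invoke its factory, so only one backend request exists for both clients. Because the entry is removed before iteration, a request that arrives after the response starts a new round-trip rather than reusing old data.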
183 lines
8.9 KiB
JSON
// mbproxy configuration template — copy to %ProgramData%\mbproxy\appsettings.json
// and edit before starting the service.
//
// The .NET configuration loader accepts // and /* */ comments in JSON files
// (JSONC semantics) when using the default Host.CreateApplicationBuilder path.
//
// IMPORTANT: This file is written on install ONLY if no appsettings.json
// already exists at the destination. An existing file is always preserved.
{
  "Mbproxy": {

    // ── Global BCD tag list ─────────────────────────────────────────────────────────────
    // These tags apply to EVERY PLC by default.
    // Each entry: Address (Modbus PDU address, decimal), Width (16 or 32 bits).
    //
    //   Width 16 — one register holds 4 BCD digits (0–9999).
    //              Wire value 0x1234 decodes to decimal 1234.
    //
    //   Width 32 — a CDAB-ordered register pair (Address = low word, Address+1 = high word).
    //              Decoded decimal = high * 10000 + low (DirectLOGIC CDAB word order).
    //
    // Per-PLC overrides (see Plcs[].BcdTags below):
    //   Add    — appends extra tags beyond what Global defines, or overrides a
    //            Global entry's Width when the same Address appears in both.
    //   Remove — removes specific addresses from the effective set for that PLC.
    //   Effective set = (Global ∪ Add) − Remove, resolved per PDU.
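    //
    // Worked example (illustrative wire values, not taken from a real PLC):
    //   Width 16 at Address 1024: register value 0x0042 decodes to decimal 42.
    //   Width 32 at Address 1056: register 1056 = 0x5678 (low word) and
    //     register 1057 = 0x1234 (high word) decode to 1234 * 10000 + 5678 = 12345678.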
    "BcdTags": {
      "Global": [
        // V2000 (octal) = decimal address 1024. 16-bit BCD counter.
        { "Address": 1024, "Width": 16 },

        // V2040 (octal) = decimal address 1056. 32-bit BCD total at 1056/1057.
        { "Address": 1056, "Width": 32 },

        // V2100 (octal) = decimal address 1088. 16-bit BCD setpoint.
        { "Address": 1088, "Width": 16 }
      ]
    },

    // ── PLC list ────────────────────────────────────────────────────────────────────────
    // Each entry maps one upstream proxy port → one backend PLC.
    // Upstream clients connect to ListenPort; the proxy forwards to Host:Port.
    //
    // IMPORTANT: H2-ECOM100 modules accept at most 4 simultaneous TCP connections.
    // With the 1:1 upstream↔backend model, a fifth upstream client to the same proxy
    // port will cause a backend connect failure and an immediate upstream disconnect.
    "Plcs": [
      {
        "Name": "Line1-Mixer", // Human-readable name (shown on status page and in logs)
        "ListenPort": 5020, // Port the proxy listens on (upstream clients connect here)
        "Host": "10.0.1.1", // PLC IP address or hostname
        "Port": 502, // PLC Modbus TCP port (almost always 502)
        "BcdTags": {
          // Additional 32-bit tag specific to this PLC only.
          "Add": [
            { "Address": 1200, "Width": 32 }
          ],
          // Remove address 1056 from the Global list for this PLC
          // (this mixer doesn't use the 32-bit BCD total).
          "Remove": [ 1056 ]
        }
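        // Resulting effective set for Line1-Mixer, applying (Global ∪ Add) − Remove
        // to the entries above: 1024 (Width 16), 1088 (Width 16), 1200 (Width 32).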
      },
      {
        "Name": "Line1-Conveyor",
        "ListenPort": 5021,
        "Host": "10.0.1.2",
        "Port": 502
        // No BcdTags override — uses the Global set as-is.
      }
      // Add one entry per PLC. Each ListenPort must be unique on the proxy host. Typical fleet: 54 PLCs.
    ],

    // ── Admin port ──────────────────────────────────────────────────────────────────────
    // Read-only HTTP status page.
    //   GET /            → self-contained HTML (auto-refreshes every 5 s)
    //   GET /status.json → same data as JSON for monitoring scrapers
    //
    // Authentication is assumed at the network layer (trusted internal segment).
    // Set to 0 to disable the admin endpoint.
    "AdminPort": 8080,

    // ── Connection timeouts ─────────────────────────────────────────────────────────────
    "Connection": {
      // Max time (ms) to wait for a TCP connect to the PLC backend.
      // Each Polly retry attempt gets its own copy of this timeout.
      "BackendConnectTimeoutMs": 3000,

      // Max time (ms) to wait for the PLC to respond to a forwarded PDU.
      // Non-idempotent FC06/FC16 writes are one-shot — the upstream client
      // is disconnected immediately on timeout (no retry).
      "BackendRequestTimeoutMs": 3000,

      // Max time (ms) to wait for in-flight PDUs to complete during graceful shutdown
      // (sc.exe stop / Windows Service stop signal). After this deadline the coordinator
      // cancels remaining work and proceeds. Keep at or below the SCM wait-hint (30 s).
      "GracefulShutdownTimeoutMs": 10000
    },

    // ── Resilience policies ─────────────────────────────────────────────────────────────
    "Resilience": {

      // Polly retry policy for backend TCP connect attempts.
      //   MaxAttempts: total connect tries (including the first).
      //   BackoffMs: delay between each attempt (must have MaxAttempts−1 entries).
      "BackendConnect": {
        "MaxAttempts": 3,
        "BackoffMs": [ 100, 500, 2000 ]
      },

      // Polly recovery policy for listener bind failures.
      // If a PLC's listen port can't be bound (in-use, bad IP, transient OS error),
      // the supervisor retries according to this schedule.
      //   InitialBackoffMs: backoff per step (first N retries).
      //   SteadyStateMs: backoff for all subsequent retries (runs indefinitely).
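      //
      // Reading the default schedule below under that rule: the supervisor waits
      // 1 s, 2 s, 5 s, 15 s and 30 s before the first five retries, then 30 s
      // before every retry after that.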
      "ListenerRecovery": {
        "InitialBackoffMs": [ 1000, 2000, 5000, 15000, 30000 ],
        "SteadyStateMs": 30000
      },

      // Phase 10 — in-flight read coalescing.
      //
      // When two or more upstream clients (HMI / historian / engineering workstation /
      // gateway) issue the SAME FC03 or FC04 read while a matching backend round-trip is
      // already in flight, the proxy attaches the late arrivals to the existing in-flight
      // entry and fans the single PLC response out to every attached client — saving the
      // ECOM's per-scan PDU budget on duplicated reads.
      //
      // Zero post-response staleness: coalescing operates ONLY between "first request
      // sent to PLC" and "response received from PLC" (microseconds to ~10 ms typical).
      // Each upstream client still sees its own MBAP transaction ID echoed correctly;
      // the proxy is transparent.
      //
      // FC06 / FC16 writes are NEVER coalesced (non-idempotent). FC03 vs FC04 are
      // separate Modbus tables and never share a coalescing key. Different unit IDs
      // (multi-drop / gateway-backed setups) never coalesce.
      //
      //   Enabled    — master switch. Hot-reloadable; flipping to false leaves running
      //                coalesced entries to drain naturally.
      //   MaxParties — per-entry cap on attached parties. Past the cap, the next
      //                identical request opens a fresh backend round-trip (load-shedding
      //                safety valve for very fan-out-heavy fleets).
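      //
      // Example timeline (illustrative timings and transaction IDs):
      //   t = 0 ms   HMI sends FC03, unit 1, address 1024, qty 2 (TxId 7);
      //              the proxy forwards one request to the PLC.
      //   t = 2 ms   Historian sends the identical read (TxId 9); it is attached
      //              to the in-flight entry and no second backend request is sent.
      //   t = 6 ms   The PLC responds; both clients receive the same register data,
      //              the HMI with TxId 7 and the historian with TxId 9.
      //   t = 8 ms   A further identical read starts a new backend round-trip,
      //              because the previous entry was removed when the response arrived.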
      "ReadCoalescing": {
        "Enabled": true,
        "MaxParties": 32
      }
    }
  },

  // ── Serilog ─────────────────────────────────────────────────────────────────────────────
  // Structured log output. Default: Information level, rolling-file under ProgramData.
  // The EventLogBridge writes Error+ events to the Windows Application Event Log
  // automatically when the service runs under the SCM (not under dotnet run).
  "Serilog": {
    "Using": [ "Serilog.Sinks.Console", "Serilog.Sinks.File" ],
    "MinimumLevel": {
      "Default": "Information",
      "Override": {
        "Microsoft": "Warning",
        "System": "Warning"
      }
    },
    "WriteTo": [
      {
        "Name": "Console",
        "Args": {
          "outputTemplate": "[{Timestamp:HH:mm:ss} {Level:u3}] {Message:lj} {Properties:j}{NewLine}{Exception}"
        }
      },
      {
        "Name": "File",
        "Args": {
          // Rolling log: one file per day, kept for 30 days.
          // Survives uninstall — logs are archived to %ProgramData%\mbproxy.archived-<ts>\.
          "path": "C:\\ProgramData\\mbproxy\\logs\\mbproxy-.log",
          "rollingInterval": "Day",
          "retainedFileCountLimit": 30,
          "outputTemplate": "[{Timestamp:yyyy-MM-dd HH:mm:ss.fff zzz} {Level:u3}] {Message:lj} {Properties:j}{NewLine}{Exception}"
        }
      }
    ]
  }
}