Adds 11 topic-focused docs under docs/{Architecture,Features,Operations,Reference,Testing}/
and links them from README.md's new "Detailed documentation" section. Existing
top-level docs (design.md, kpi.md, operations.md) remain as canonical landings.
Architecture/
- Overview.md (150 lines) — listener topology, request flow, per-PLC isolation
- ConnectionModel.md (247 lines) — TxId multiplexer, watchdog, disconnect cascade
- ReadCoalescing.md (243 lines) — in-flight FC03/04 dedup via InFlightByKeyMap
- ResponseCache.md (398 lines) — opt-in per-tag TTL cache + range-overlap invalidation
Features/
- BcdRewriting.md (252 lines) — codec, CDAB, FC scope, partial-overlap policy
- HotReload.md (189 lines) — IOptionsMonitor + per-change-kind reconcile rules
Operations/
- Configuration.md (422 lines) — every Mbproxy:* option + validation rules
- StatusPage.md (334 lines) — admin endpoint surface, every JSON field
- Troubleshooting.md (364 lines) — diagnosis playbook keyed to log events
Reference/
- LogEvents.md (499 lines) — 28 events across 7 categories, grep-verified
Testing/
- Simulator.md (235 lines) — pymodbus fixture, skip policy, 3.13 framer quirk
Each doc was written by a dedicated agent against the StyleGuide.md rules with
a per-doc phase gate (PascalCase filename, H1 Title Case, code-fence language
tags, Related Documentation section with >=3 relative links, real type names
verified against src/). Cross-references between docs use relative paths;
all 18 README->docs links and all sibling links resolve.
Known follow-up: docs/design.md lines 215-251 are stale on two log-event
property templates (config.reload.applied and config.reload.rejected) and
mention LogContext.PushProperty scoping that isn't actually used. Reference/
LogEvents.md is now the authoritative event catalog and source-of-truth.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Read Coalescing
In-flight read coalescing collapses identical FC03/FC04 requests that arrive while a backend response is still in flight onto a single backend round-trip, then fans the single response out to every attached upstream client with each client's original MBAP transaction ID restored.
What Coalescing Does
When two upstream clients each send (unitId=1, FC=3, start=100, qty=10)
within the in-flight window of a previously-routed request, the second
arrival attaches to the existing InFlightRequest instead of opening a new
proxy transaction ID and a second backend round-trip. The PLC's reply is
delivered to both upstream pipes; each pipe sees its own MBAP TxId
restored on its copy of the response.
The value each upstream sees is the same value an uncoalesced request would
have returned within the PLC's own scan-time precision (microseconds to
~10 ms typical window). Coalescing is not a cache layer — once the response
fans out, the in-flight entry dies, and a subsequent identical read opens a
fresh round-trip. Bounded-staleness caching is a separate feature; see
./ResponseCache.md.
The Coalescing Key
The lookup tuple is defined in CoalescingKey.cs:
internal readonly record struct CoalescingKey(
    byte UnitId,
    byte Fc,
    ushort StartAddress,
    ushort Qty);
Record-struct value equality drives the dictionary lookup in
InFlightByKeyMap. Several axes never coalesce, by design:
- Function code. FC03 (Read Holding Registers) and FC04 (Read Input Registers) read different Modbus tables on the device. Their responses are not interchangeable, so they do not share a key even at the same address.
- Unit ID. Distinct unit IDs behind a shared socket address different Modbus personalities — coalescing never crosses a unit boundary.
- Start address and quantity. Two reads with overlapping but non-identical ranges never coalesce. Range-overlap logic exists for cache invalidation, not for coalescing.
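As a minimal sketch of how value equality separates these axes, the snippet below re-declares the key shape from above and counts how many distinct in-flight entries a batch of reads would need. `CoalescingKeyDemo` and `CountDistinctEntries` are hypothetical names used only for illustration, and a `HashSet` stands in for the real `InFlightByKeyMap`.

```csharp
using System.Collections.Generic;

// Same shape as the CoalescingKey shown above, re-declared so the sketch
// compiles on its own.
internal readonly record struct CoalescingKey(
    byte UnitId,
    byte Fc,
    ushort StartAddress,
    ushort Qty);

internal static class CoalescingKeyDemo
{
    // Hypothetical helper: counts the distinct keys in a batch of reads.
    // Record-struct equality compares all four fields, so only fully
    // identical tuples collapse into one entry.
    public static int CountDistinctEntries(IEnumerable<CoalescingKey> reads)
    {
        var distinct = new HashSet<CoalescingKey>();
        foreach (var key in reads)
        {
            distinct.Add(key); // Add is a no-op for an equal key
        }
        return distinct.Count;
    }
}
```

Two reads of `(1, 0x03, 100, 10)` share one entry, while changing the function code, the unit ID, or the start address by any amount yields a separate key.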
Eligibility
Only FC03 and FC04 enter the coalescing path. The multiplexer's request
handler parses the function code from the inbound PDU and gates on
fcByte is 0x03 or 0x04 before consulting _inFlightByKey.
- FC06 (Write Single Register) and FC16 (Write Multiple Registers) are non-idempotent on BCD tags — a second send would write the value twice. Writes bypass coalescing entirely and always take the one-round-trip path.
- Exception responses do not coalesce. Each upstream sees an exception delivered against its own MBAP TxId through the normal correlation map fan-out; there is no special exception-deduplication path.
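The function-code gate described above can be sketched as a small predicate. `CoalescingEligibility` and `IsCoalescable` are hypothetical names, and the sketch assumes the MBAP header has already been stripped so that the first PDU byte is the function code.

```csharp
using System;

internal static class CoalescingEligibility
{
    // Hypothetical helper, not the multiplexer's real method.
    // Assumes pdu[0] is the Modbus function code.
    public static bool IsCoalescable(ReadOnlySpan<byte> pdu)
    {
        if (pdu.IsEmpty)
        {
            return false; // no function code to inspect
        }

        byte fcByte = pdu[0];
        // Only FC03 and FC04 reads are eligible; writes such as FC06 and
        // FC16 always take the one-round-trip path.
        return fcByte is 0x03 or 0x04;
    }
}
```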
The InterestedParties Seam
The data shape that powers fan-out lives on InFlightRequest:
internal sealed record InFlightRequest(
    byte UnitId,
    byte Fc,
    ushort StartAddress,
    ushort Qty,
    IReadOnlyList<InterestedParty> InterestedParties,
    DateTimeOffset SentAtUtc,
    int ResolvedCacheTtlMs = 0);

internal sealed record InterestedParty(UpstreamPipe Pipe, ushort OriginalTxId);
Each InterestedParty records the upstream pipe to deliver the response to
and the original MBAP TxId that pipe sent. The backend reader iterates
this list, patches each party's OriginalTxId into a per-party copy of the
response frame, and hands the frame to party.Pipe.SendResponseAsync.
Multi-writer multi-reader safety
The list typed as IReadOnlyList<InterestedParty> on the public surface is
in fact a mutable List<InterestedParty> underneath. InFlightByKeyMap
serialises every state mutation under a single object lock:
- TryAttachOrCreate looks up the key, casts the existing InterestedParties back to List<InterestedParty>, and appends the new party, all under the lock.
- The backend reader calls TryRemove(coalKey, out _) before it iterates the parties list during fan-out. Once the key is gone from the map, no future attach can find it, so no further appends can occur.
The reader's removal-before-iteration ordering is the load-bearing
invariant. By the time fan-out reads the list, the list is effectively
frozen — there is no other writer that can reach it. The watchdog timeout
path observes the same protocol: it removes the coalescing key before it
walks req.InterestedParties to deliver exception 0x0B.
The reverse race (reader removes first, then a late attach arrives) is
impossible by construction — TryRemove and TryAttachOrCreate both take
the same map lock, so any late attach is serialised either entirely before
the removal (and is part of the fan-out) or entirely after (and opens a
fresh entry under a new factory call).
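The locking protocol can be illustrated with a stripped-down map. This is a sketch under stated assumptions rather than the real `InFlightByKeyMap`: the method names are borrowed from the text, but their signatures here are invented, parties are reduced to their original TxId, and the MaxParties cap is left out.

```csharp
#nullable enable
using System.Collections.Generic;

internal readonly record struct Key(byte UnitId, byte Fc, ushort Start, ushort Qty);

internal sealed class InFlightMapSketch
{
    private readonly object _gate = new();
    private readonly Dictionary<Key, List<ushort>> _entries = new();

    // Returns true when a fresh entry was created (the caller would then
    // send to the backend), false when the party joined an existing entry.
    public bool TryAttachOrCreate(Key key, ushort originalTxId)
    {
        lock (_gate)
        {
            if (_entries.TryGetValue(key, out var parties))
            {
                parties.Add(originalTxId); // append under the lock
                return false;
            }

            _entries[key] = new List<ushort> { originalTxId };
            return true;
        }
    }

    // Called by the reader BEFORE it iterates the parties. After this
    // returns, no attach can reach the removed list any more.
    public bool TryRemove(Key key, out List<ushort>? parties)
    {
        lock (_gate)
        {
            return _entries.Remove(key, out parties);
        }
    }
}
```

Because both methods take the same lock, an attach that races with the removal either lands in the list the reader is about to walk or creates a new entry; it can never mutate a list that is already being iterated.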
MaxParties Cap
ResilienceOptions.cs exposes the load-shedding cap:
public sealed class ReadCoalescingOptions
{
    public bool Enabled { get; init; } = true;
    public int MaxParties { get; init; } = 32;
}
Mbproxy.Resilience.ReadCoalescing.MaxParties defaults to 32. Inside
TryAttachOrCreate, an existing entry is only extended when
existingList.Count < maxParties; once the cap is hit, the next identical
arrival falls through to the factory branch and opens a fresh in-flight
entry (which means a fresh backend round-trip).
The cap bounds two costs:
- Fan-out cost per entry at O(MaxParties). The backend reader's per-party copy-and-patch loop runs at most MaxParties times for any single response.
- Backend reader latency under pile-on. A single pathologically popular read (every HMI hitting the same tag at the same second) cannot stretch one fan-out arbitrarily long.
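A small worked calculation shows what the cap costs in backend traffic. It assumes that every identical arrival lands inside one in-flight window and that each fresh entry takes over the map slot for later arrivals; neither detail is spelled out above, so treat `BackendRoundTrips` as a hypothetical illustration.

```csharp
using System;

internal static class MaxPartiesArithmetic
{
    // Hypothetical helper: the number of backend round-trips needed when
    // `identicalArrivals` identical reads pile up and each in-flight
    // entry accepts at most `maxParties` parties.
    public static int BackendRoundTrips(int identicalArrivals, int maxParties)
    {
        if (identicalArrivals < 0)
        {
            throw new ArgumentOutOfRangeException(nameof(identicalArrivals));
        }
        if (maxParties < 1)
        {
            throw new ArgumentOutOfRangeException(nameof(maxParties));
        }

        // Ceiling division: every full entry forces the next arrival
        // into a fresh entry and therefore a fresh round-trip.
        return (identicalArrivals + maxParties - 1) / maxParties;
    }
}
```

With the default cap of 32, up to 32 simultaneous identical reads cost one round-trip, the 33rd opens a second, and 70 arrivals need three.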
Hot-Reloadable On/Off
Mbproxy.Resilience.ReadCoalescing.Enabled defaults to true. The
multiplexer holds a Func<ReadCoalescingOptions> accessor that production
binds to () => optionsMonitor.CurrentValue.Resilience.ReadCoalescing, so
a hot-reload of appsettings.json propagates immediately on the next
inbound PDU.
Flipping Enabled to false at runtime does not disturb already-coalesced
entries: existing fan-outs drain through the backend reader naturally.
Subsequent FC03/FC04 requests skip the coalescing branch entirely and take
the one-proxy-TxId-per-upstream-request path verbatim.
The same accessor reads MaxParties per PDU, so an operator can raise or
lower the cap without restarting the service.
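The accessor pattern can be sketched without the options framework. `ReadPathSketch` and `ShouldCoalesce` are hypothetical; the point is only that the delegate is invoked on every PDU, so whatever the accessor returns at that moment decides the path.

```csharp
using System;

// Same shape as the options class shown earlier in this document.
public sealed class ReadCoalescingOptions
{
    public bool Enabled { get; init; } = true;
    public int MaxParties { get; init; } = 32;
}

internal sealed class ReadPathSketch
{
    private readonly Func<ReadCoalescingOptions> _options;

    // In production the text binds this to optionsMonitor.CurrentValue;
    // here any delegate will do.
    public ReadPathSketch(Func<ReadCoalescingOptions> options) => _options = options;

    // Hypothetical per-PDU check: reads the options fresh on each call
    // instead of caching them at construction time.
    public bool ShouldCoalesce(byte fcByte)
    {
        ReadCoalescingOptions current = _options();
        return current.Enabled && (fcByte is 0x03 or 0x04);
    }
}
```

Swapping the object the delegate returns, which is what a reload does in effect, changes the answer on the very next call without rebuilding `ReadPathSketch`.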
Lookup Order in the Multiplexer's Read Path
OnUpstreamFrameAsync consults three tiers in fixed order for FC03/FC04:
- Cache: if _ctx.Cache is wired and _ctx.TagMap.ResolveCacheTtlMs returns a positive TTL for the read range, the response cache is checked first. A hit short-circuits everything, including the EnsureBackendConnectedAsync call. See ./ResponseCache.md.
- Coalesce: on a cache miss (or no cache configured), the request consults _inFlightByKey via TryAttachOrCreate. A hit attaches the new party to an in-flight peer and emits no backend traffic.
- Backend: on a coalescing miss, the factory branch allocates a proxy TxId through TxIdAllocator, registers the entry in CorrelationMap, runs the BCD rewriter on the request PDU, and queues the frame onto the outbound channel.
The order is load-bearing. Cache hits avoid both backend traffic and any coalescing-entry housekeeping. Coalescing hits avoid the backend but still incur a list-append and a fan-out. Backend round-trips are the most expensive of the three.
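The tier order reduces to a short decision function. This is only a sketch: `ServedBy` and `Route` are hypothetical names, the three inputs stand in for the real cache and map lookups, and the disabled-coalescing and MaxParties cases are ignored.

```csharp
internal enum ServedBy
{
    Cache,
    Coalesce,
    Backend,
}

internal static class LookupOrderSketch
{
    // Hypothetical decision function mirroring the fixed tier order.
    public static ServedBy Route(
        int resolvedCacheTtlMs,
        bool cacheHasFreshEntry,
        bool identicalReadInFlight)
    {
        // Tier 1: the cache is consulted only when a positive TTL
        // resolves for the read range.
        if (resolvedCacheTtlMs > 0 && cacheHasFreshEntry)
        {
            return ServedBy.Cache;
        }

        // Tier 2: attach to an in-flight peer if one exists.
        if (identicalReadInFlight)
        {
            return ServedBy.Coalesce;
        }

        // Tier 3: open a fresh backend round-trip.
        return ServedBy.Backend;
    }
}
```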
Counter Accounting
PerPlcContext.Counters exposes three coalescing-specific counters, all
surfaced on the status page:
- coalescedHitCount: increments inside OnUpstreamFrameAsync when TryAttachOrCreate returns wasNew == false (the request attached to an existing in-flight entry).
- coalescedMissCount: increments when wasNew == true. The non-coalescing FC03/FC04 path also increments this counter when coalescing is disabled, so the identity coalescedHitCount + coalescedMissCount == total FC03+FC04 requests since multiplexer construction holds regardless of Enabled.
- coalescedResponseToDeadUpstream: increments inside the backend reader's fan-out loop when a coalesced party's pipe has gone dead (party.Pipe.IsAlive == false) before the response landed. Only counted when the in-flight entry had more than one party; single-party dead-upstream skips are the normal Phase-9 behaviour and are silent.
When ReadCoalescing.Enabled == false, coalescedHitCount remains zero
and every FC03/FC04 read increments coalescedMissCount. Aggregate fleet
metrics (hit ratio, requests per second) read directly from these
counters; see ../Operations/StatusPage.md.
The Debug-level log events mbproxy.coalesce.hit,
mbproxy.coalesce.miss, and mbproxy.coalesce.dead_upstream mirror each
counter increment; see ../Reference/LogEvents.md.
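The hit/miss identity can be checked with a toy counter pair. `CoalescingCountersSketch`, `RecordRead`, and `HitRatio` are hypothetical; the real counters live on `PerPlcContext.Counters`, whose API is not shown in this document.

```csharp
using System.Threading;

internal sealed class CoalescingCountersSketch
{
    private long _hit;
    private long _miss;

    public long CoalescedHitCount => Interlocked.Read(ref _hit);
    public long CoalescedMissCount => Interlocked.Read(ref _miss);

    // Hypothetical recorder: exactly one of the two counters moves for
    // every FC03/FC04 read, which is what makes hit + miss equal the
    // total read count.
    public void RecordRead(bool coalescingEnabled, bool wasNew)
    {
        if (coalescingEnabled && !wasNew)
        {
            Interlocked.Increment(ref _hit);  // attached to a peer
        }
        else
        {
            Interlocked.Increment(ref _miss); // new entry, or coalescing off
        }
    }

    // Hit ratio as a fleet metric might derive it; 0 when no reads yet.
    public double HitRatio()
    {
        long hit = CoalescedHitCount;
        long total = hit + CoalescedMissCount;
        return total == 0 ? 0.0 : (double)hit / total;
    }
}
```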
Transparency Contract Preserved
Each upstream client receives the same response shape it would have received from a one-to-one proxy:
- Original MBAP TxId restored. The backend reader patches outFrame[0..2] with party.OriginalTxId for each party in the InterestedParties list. The proxy's internal TxId never reaches an upstream socket.
- BCD rewriter runs once. _pipeline.Process(ResponseToClient, ...) fires exactly once against the shared backend response buffer. Cached rewriter context (start address, quantity) comes from the InFlightRequest that opened the round-trip.
- One-party fan-out reuses the buffer. When inFlight.InterestedParties.Count == 1, the backend reader assigns the original frame reference to outFrame instead of cloning, saving the allocation. Multi-party fan-outs clone the frame per party so each can carry a distinct TxId without trampling its peers.
Coalescing is invisible at the wire-protocol layer. An upstream client cannot tell whether its read was served by a fresh backend round-trip or by attaching to a peer's in-flight request — only the timing distribution changes.
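The TxId patching can be sketched as follows. The MBAP transaction ID occupies the first two bytes of a Modbus TCP frame in big-endian order; everything else here (`FanOutSketch`, `BuildFrames`, reducing each party to its TxId) is a hypothetical simplification of the real backend reader.

```csharp
using System;
using System.Buffers.Binary;
using System.Collections.Generic;

internal static class FanOutSketch
{
    private const int MbapHeaderLength = 7;

    // Hypothetical helper: produces one outbound frame per party, each
    // carrying that party's original TxId in bytes 0..1.
    public static List<byte[]> BuildFrames(byte[] frame, IReadOnlyList<ushort> originalTxIds)
    {
        if (frame.Length < MbapHeaderLength)
        {
            throw new ArgumentException("Frame is shorter than an MBAP header.", nameof(frame));
        }

        var result = new List<byte[]>(originalTxIds.Count);
        foreach (ushort txId in originalTxIds)
        {
            // A single party reuses the buffer; several parties each get
            // a clone so their TxIds cannot overwrite one another.
            byte[] outFrame = originalTxIds.Count == 1 ? frame : (byte[])frame.Clone();
            BinaryPrimitives.WriteUInt16BigEndian(outFrame.AsSpan(0, 2), txId);
            result.Add(outFrame);
        }
        return result;
    }
}
```

In the single-party case the returned array is the same instance as the input, so the patch is applied in place; in the multi-party case the shared input buffer is left untouched.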
Related Documentation
- ./ConnectionModel.md: multiplexer overview; the InterestedParties seam, CorrelationMap, and TxIdAllocator live here.
- ./ResponseCache.md: bounded-staleness cache that sits above coalescing in the lookup order; cache hits short-circuit coalescing entirely.
- ../Operations/StatusPage.md: exposes coalescedHitCount, coalescedMissCount, and coalescedResponseToDeadUpstream per PLC and as fleet aggregates.
- ../Reference/LogEvents.md: full mbproxy.coalesce.* event catalogue with event IDs.
- ../Operations/Configuration.md: binding for Mbproxy.Resilience.ReadCoalescing.Enabled and MaxParties, hot-reload semantics.
- ../Features/BcdRewriting.md: the rewriter that runs once on the shared response buffer before fan-out.