mbproxy: Wave 3 cleanups, docs, and test gaps from 2026-05-14 review
Closes the Wave 3 (cleanup) tier of codereviews/2026-05-14/RemediationPlan.md.
Tests: 378 pass / 0 fail (baseline 370 + 8 new W3 regression tests).
Code cleanups:
* PlcMultiplexer: removed dead `elapsedMs` calculation (the actual EWMA
conversion uses Stopwatch ticks two lines below).
* UpstreamPipe.FillAsync: dropped the meaningless `firstRead && remaining
== count ? false : false` ternary; both branches were `false`.
* InFlightByKeyMap.TryAttachOrCreate (always returned `true`) renamed to
`AttachOrCreate` and made `void`. Test sites updated to drop the dead
`bool ok = ...; ok.ShouldBeTrue();` assertions.
* BcdCodec.HasBadNibble promoted from private to internal; the duplicate
copy in BcdPduPipeline removed and the call sites updated to
`BcdCodec.HasBadNibble`.
* PlcMultiplexer watchdog comment fixed: said "1-second floor", code uses
100 ms. Now both agree.
* StatusSnapshotBuilder: simplified the unreachable
`RemoteEp?.ToString() ?? RemoteEp?.Address.ToString() ?? "?"` to
`RemoteEp?.ToString() ?? "?"`.
* Mbproxy.csproj: stale "deferred" Polly comment replaced with a real
description of where Polly is used (BackendConnect + ListenerRecovery).
Doc updates:
* README: added a callout about the unconventional 32-bit BCD wire format
("two base-10000 digits in CDAB", not standard binary CDAB Int32) so
integrators using off-the-shelf clients learn about the silent-corruption
hazard before configuring writes.
* docs/design.md: clarified `cacheMissCount` and `coalescedMissCount`
semantics — "miss" means "did not find a fresh entry / did not coalesce",
NOT "produced a backend round-trip". Operators wanting actual backend
traffic should compute `miss − coalescedHit − exception04`.
* docs/Architecture/ResponseCache.md: documented the structural
"skip invalidation while recovering" gating (no backend reader during
recovery → no FC06/FC16 response → no invalidation).
* docs/Operations/Configuration.md: noted that the Event Log sink is the
custom EventLogBridge, not Serilog.Sinks.EventLog (W2.23 cached check).
* docs/plan/README.md: added a Phase 12 row pointing at the remediation
plan and linking out to codereviews/2026-05-14/.
Test additions (W3 high-value gaps):
* BcdPduPipelineTests:
- FC16_WriteStartsOnHighWord_Of32BitPair_PassesThroughRaw_WithPartialWarning
(symmetric inverse of the existing low-side partial-overlap test).
- FC03_Mixed_16Bit_32Bit_AndNonBcd_InOneRead_OnlyConfiguredSlotsRewritten
(mixed-slot routing in a single FC03 read).
- FC16_Response_PassesThroughUnchanged_RegardlessOfTagMap (FC16 response
carries no register data; rewriter must pass through).
* AdminEndpointTests:
- NonGetMethod_AgainstAdminRoutes_Returns405 (Theory: POST/PUT/DELETE/
PATCH against `/` and `/status.json` must return 405; guards against
an accidental MapPost being added later).
* HotReloadE2ETests:
- E2E_TagListReload_OnCacheablePlc_EmitsCacheFlushedEvent (validates the
W2.8 cache.flushed wiring end-to-end via the real FileSystemWatcher
reload path).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
@@ -303,6 +303,17 @@ cached read is still consistent with the device's actual state. Skipping
|
||||
the invalidation matches reality — the write did not take effect, so the
|
||||
read is not stale.
|
||||
|
||||
The skip is **structural**, not conditional. Cache invalidation only
|
||||
fires inside the per-PLC backend reader task, after a non-exception
|
||||
FC06/FC16 response arrives from the PLC. A `recovering` supervisor has
|
||||
torn down its multiplexer and there is no backend reader, so no response
|
||||
can land and the invalidation path is never entered. This is the
|
||||
reasoning the code at `Proxy/Multiplexing/PlcMultiplexer.cs` documents
|
||||
inline (W2.9). If a future change ever produced a write response off the
|
||||
live backend (e.g. a mocked-response path), an explicit `Recovering`
|
||||
check would need to be added at the invalidator call site to keep the
|
||||
skip semantics correct.
|
||||
|
||||
## No Persistence
|
||||
|
||||
The cache is purely in-memory. Process restart wipes every entry. There
|
||||
|
||||
@@ -86,6 +86,8 @@ Every supported key under `Mbproxy:*`, populated to a representative default:
|
||||
|
||||
`Serilog` configuration is documented in [`./Troubleshooting.md`](./Troubleshooting.md) and lives outside the `Mbproxy` section.
|
||||
|
||||
> The Windows Event Log sink is **not** the standard `Serilog.Sinks.EventLog` package. It is a custom `EventLogBridge` (`src/Mbproxy/Diagnostics/EventLogBridge.cs`) that writes Error+ events to the `mbproxy` source under `Application` only when the service runs under the SCM. Event Log source registration is intentionally NOT attempted at runtime (the service account may not be admin); `install.ps1` registers the source at install time. Don't add `Serilog.Sinks.EventLog` — the bridge would duplicate every event. The bridge caches the source-exists check at construction (Phase 12 / W2.23), so a missing source produces no per-event registry traffic.
|
||||
|
||||
## `Mbproxy.AdminPort`
|
||||
|
||||
Port for the read-only HTTP status server. Binds to all interfaces on startup.
|
||||
|
||||
@@ -136,7 +136,7 @@ Properties:
|
||||
- **Hot-reloadable on/off.** `Mbproxy.Resilience.ReadCoalescing.Enabled` defaults to `true`. Flipping it to `false` at runtime leaves running coalesced entries to drain naturally; subsequent FC03/04 requests take the Phase-9 (one round-trip per upstream request) path.
|
||||
- **Transparency contract preserved.** Each upstream client still sees its own original MBAP TxId on the response. The BCD rewriter runs once on the shared response buffer; per-party copies are only made when fan-out has more than one party.
|
||||
|
||||
Counter accounting balance (per snapshot): `coalescedHitCount + coalescedMissCount` equals the total FC03 + FC04 requests seen since the multiplexer was constructed. Both counters increment regardless of whether the coalescing feature is enabled — `coalescedHitCount` is 0 when disabled, but every read still increments `coalescedMissCount`.
|
||||
Counter accounting balance (per snapshot): `coalescedHitCount + coalescedMissCount` equals the total FC03 + FC04 requests seen since the multiplexer was constructed. Both counters increment regardless of whether the coalescing feature is enabled — `coalescedHitCount` is 0 when disabled, but every read still increments `coalescedMissCount`. **Saturation paths (allocator full, duplicate-key race) also count as a miss** even though they produce no backend round-trip — the identity above is preserved by counting every entry into the coalescing path, not every backend send. Operators wanting "actual backend round-trips opened" should subtract the multiplexer's exception-04 frames produced from saturation.
|
||||
|
||||
## Response cache (Phase 11) — opt-in bounded-staleness cache
|
||||
|
||||
@@ -173,7 +173,7 @@ The BCD rewriter runs once on the cache-miss path (the backend reader task decod
|
||||
### Counter accounting
|
||||
|
||||
- `cacheHitCount` — FC03/FC04 requests served from the cache.
|
||||
- `cacheMissCount` — FC03/FC04 requests that fell through to the coalescing/backend path. (Cache hit + Cache miss = total FC03/FC04 requests that were cache-eligible, i.e. whose resolved TTL was > 0; reads whose effective TTL is 0 increment neither.)
|
||||
- `cacheMissCount` — FC03/FC04 requests that fell through to the coalescing/backend path. (Cache hit + Cache miss = total FC03/FC04 requests that were cache-eligible, i.e. whose resolved TTL was > 0; reads whose effective TTL is 0 increment neither.) **A "miss" does NOT mean "produced a backend round-trip."** Two upstream peers issuing the same cache-eligible read both increment `cacheMissCount`; one then opens a backend round-trip and the other coalesces onto it via the InFlightByKey path (incrementing `coalescedHitCount`). Operators reading these counters as "backend reads opened" should use `cacheMissCount − coalescedHitCount` as the lower bound on actual backend traffic.
|
||||
- `cacheInvalidations` — count of cache entries invalidated by FC06/FC16 write responses.
|
||||
- `cacheEntryCount` — point-in-time snapshot of `ResponseCache.Count` (Tier-2 memory-watch KPI).
|
||||
- `cacheBytes` — point-in-time approximation of cached PDU bytes (Tier-2 memory-watch KPI).
|
||||
|
||||
@@ -20,6 +20,7 @@ Phase-by-phase implementation plan for the `mbproxy` service. Each phase is a se
|
||||
| 09 | [TxId multiplexing](09-txid-multiplexing.md) — single backend connection per PLC (post-1.0 follow-on) | 04, 05, 07 | — |
|
||||
| 10 | [Read coalescing](10-read-coalescing.md) — in-flight FC03/04 dedup (post-1.0 follow-on) | 09 | — |
|
||||
| 11 | [Response cache](11-response-cache.md) — short-TTL post-response cache, bounded staleness (post-1.0; **design-contract pivot**) | 10 | — |
|
||||
| 12 | Code-review remediation (2026-05-14) — Wave 1 critical, Wave 2 major, Wave 3 cleanup. Plan and findings in [`../../codereviews/2026-05-14/`](../../codereviews/2026-05-14/RemediationPlan.md). | 11 | — |
|
||||
|
||||
```
|
||||
┌── 01 (sim) ──┐
|
||||
|
||||
Reference in New Issue
Block a user