wwtools/mbproxy/docs/plan/11-response-cache.md
Joseph Doherty 56eee3c563 mbproxy: initial commit through Phase 9 (TxId multiplexing)
2026-05-14 01:49:35 -04:00


Phase 11 — Short-TTL response cache (bounded staleness)

Cache FC03/FC04 responses with a per-tag TTL. Subsequent same-key reads within the TTL window are served from the cache without backend traffic. FC06/FC16 writes invalidate overlapping cache entries on the response side. This phase is a deliberate design-contract change — the proxy gains an opt-in cache layer with explicit bounded staleness.

Status: post-1.0 follow-on, depends on Phase 10. Architectural pivot: read the "Design pivot" section below before scoping.
Depends on: Phase 09 (multiplexer chokepoint), Phase 10 (CoalescingKey is reused as CacheKey; same shape).
Parallel-safe with: nothing.

Design pivot — do NOT skip this section

Phases 09 and 10 were additive performance optimisations that preserved the design's "transparent inline proxy" contract. Phase 11 is different. It changes the load-bearing claim in docs/design.md:

  • Today's contract (lines 12-20 of design.md): "The service is not a polling/cache layer. It is a transparent Modbus TCP proxy whose job is to rewrite the configured BCD tags in real time, in both directions, while proxying every other byte of the MBTCP connection untouched."
  • Post-Phase-11 contract: the proxy is optionally a cache layer within a bounded TTL. The TTL is per-tag, default 0 (no caching), opt-in by operator action.

Implication: Task 1 of this phase is rewriting the relevant design.md sections. The contract update is a code commit too — review, land first, then build the implementation against the new contract. Shipping cache code while design.md still says "not a cache layer" is a gate failure, not a merge-it-and-fix-later situation.

The cache is OFF by default. A fresh post-Phase-11 deployment with no TTL configuration behaves identically to a Phase-10 deployment. The opt-in shape (per-tag CacheTtlMs configuration) means a deployment can adopt Phase 11 without changing semantics until an operator explicitly opts a tag in.

Goal

Reduce backend Modbus traffic for the common SCADA case where many clients poll the same registers at near-identical cadences. Phase 10 already coalesces within the in-flight window (~10 ms). Phase 11 extends the "served without backend traffic" window from those in-flight milliseconds to operator-configurable seconds.

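Concretely: with CacheTtlMs = 1000 on a frequently-read BCD tag, and barring write invalidations, the backend sees at most one read per second per PLC for each distinct read key (unit ID, function code, start address, quantity) that covers that tag, regardless of how many upstream clients are polling.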

What it does NOT do

  • No active polling. Cache entries are populated on demand by upstream reads, not by proactive polling. (Active polling is Tier C-3 from the conversation history — a separate phase if ever wanted.)
  • No predictive prefetching.
  • No SCADA-style subscription/notification model.
  • No write-back caching. Writes always go straight through to the backend; cache invalidation happens on the write-response side, not by intercepting the write.
  • No cross-PLC caching. Each PLC's cache is independent.
  • No persistence. Process restart wipes the cache. Cache survives backend disconnects (the cached data was fresh when stored; disconnects don't retroactively invalidate it).

Outputs (new files)

src/Mbproxy/Proxy/Cache/CacheKey.cs                  # reuses CoalescingKey shape; type-aliased or reflected
src/Mbproxy/Proxy/Cache/CacheEntry.cs                # response bytes + expiry + lastFetched
src/Mbproxy/Proxy/Cache/ResponseCache.cs             # the cache itself; TTL-based eviction, LRU under cap
src/Mbproxy/Proxy/Cache/CacheInvalidator.cs          # address-range-overlap matcher for write invalidation
src/Mbproxy/Proxy/Cache/CacheLogEvents.cs            # [LoggerMessage] vocab for this phase

tests/Mbproxy.Tests/Proxy/Cache/CacheKeyTests.cs
tests/Mbproxy.Tests/Proxy/Cache/CacheEntryTests.cs
tests/Mbproxy.Tests/Proxy/Cache/ResponseCacheTests.cs
tests/Mbproxy.Tests/Proxy/Cache/CacheInvalidatorTests.cs
tests/Mbproxy.Tests/Proxy/Cache/ResponseCacheE2ETests.cs

Files modified

src/Mbproxy/Proxy/Multiplexing/PlcMultiplexer.cs       # OnFrame: cache check BEFORE coalescing; OnResponse: cache store + write invalidation
src/Mbproxy/Options/BcdTagOptions.cs                   # add CacheTtlMs (default 0 = no caching)
src/Mbproxy/Options/PlcOptions.cs                      # add DefaultCacheTtlMs
src/Mbproxy/Options/MbproxyOptions.cs                  # add Cache section (AllowLongTtl, MaxEntriesPerPlc, EvictionIntervalMs)
src/Mbproxy/Bcd/BcdTag.cs                              # carry CacheTtlMs on the record
src/Mbproxy/Bcd/BcdTagMapBuilder.cs                    # resolve per-tag TTL with per-PLC default fallback
src/Mbproxy/Proxy/ProxyCounters.cs                     # new: CacheHit, CacheMiss, CacheInvalidations, CacheEntryCount, CacheBytes
src/Mbproxy/Admin/StatusDto.cs                         # surface cache KPIs in PlcBackendStatus
src/Mbproxy/Admin/StatusSnapshotBuilder.cs             # populate
src/Mbproxy/Admin/StatusHtmlRenderer.cs                # show cache-hit ratio per PLC row
src/Mbproxy/Configuration/ReloadValidator.cs           # validate CacheTtlMs bounds; require AllowLongTtl=true for > 60s

docs/design.md                                         # SUBSTANTIAL — see Task 1
docs/kpi.md                                            # graduate cache KPIs from future to Tier 1
install/mbproxy.config.template.json                   # add CacheTtlMs examples + staleness commentary
mbproxy/CLAUDE.md                                      # Architecture summary: add the cache-layer bullet

Tasks

11.1 Design contract update — DO THIS FIRST

  1. docs/design.md updates (review and land before writing implementation code):

    a. "What this is" section — add the cache disclosure paragraph:

    As of Phase 11, the proxy gains an optional per-tag response cache with a bounded staleness window (CacheTtlMs). The cache is OFF by default (CacheTtlMs = 0) and must be opt-in per tag. With caching enabled, the proxy is no longer purely transparent — upstream reads may return a value up to CacheTtlMs milliseconds old. The 1:1 read-to-backend-request guarantee no longer holds; operators opting tags into caching MUST acknowledge the staleness bound.

    b. New section "Cache contract" between "Rewriter" and "Failure modes":

    • Cache populates on demand only. No polling.
    • Cache entries carry their TTL with them. Entries older than their TTL are evicted on access.
    • FC06/FC16 successful responses invalidate cache entries whose address range overlaps the write.
    • Cache survives backend disconnects (cached data was valid at cache time).
    • Cache does NOT survive process restart.
    • Multi-tag read range: effective TTL is the minimum of all configured tags in the range. Any tag with TTL = 0 in the range disables caching for the whole read.
    • Cache stores POST-rewriter bytes (BCD already decoded). Hits bypass the rewriter entirely.

    c. "Failure modes" section — add bullet on cache behaviour during backend recovery:

    • Cache hits remain valid during a recovering listener state. Data was fresh when cached; recovery only affects future requests.
    • Invalidations during recovery: writes that arrive cannot reach the backend, so the invalidation never happens. This is consistent — the write didn't take effect either. Cache entries remain valid until their TTL expires.

    d. "Rewriter" section — clarify that the rewriter runs on the cache-miss path (decode on store), and that cache hits return pre-decoded bytes without re-invoking the rewriter.

    Treat (a)-(d) as one atomic change. Get them reviewed, land them, then implement against the new contract.
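The multi-tag rule in the cache contract (effective TTL is the minimum across configured tags in the read range, and any tag with TTL = 0 disables caching for the whole read) can be sketched as follows. This is a minimal illustration, not the project's BcdTagMap code: TagTtl, RegisterCount, and TtlResolutionSketch are hypothetical names, and treating a range with no configured tags as uncached is an assumption the plan does not state.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical tag shape. RegisterCount (how many 16-bit registers the tag
// occupies) is an assumption; the plan only names Address, Width, CacheTtlMs.
public readonly record struct TagTtl(ushort Address, ushort RegisterCount, int CacheTtlMs);

public static class TtlResolutionSketch
{
    // Effective TTL for a read of [startAddress, startAddress + qty).
    public static int ResolveCacheTtlMs(IEnumerable<TagTtl> tags, ushort startAddress, ushort qty)
    {
        int readEnd = startAddress + qty;          // exclusive end; int avoids ushort overflow
        int? min = null;
        foreach (var tag in tags)
        {
            int tagEnd = tag.Address + tag.RegisterCount;
            bool overlaps = tag.Address < readEnd && startAddress < tagEnd;
            if (!overlaps) continue;
            if (tag.CacheTtlMs <= 0) return 0;     // one uncached tag disables caching for the whole read
            min = min is null ? tag.CacheTtlMs : Math.Min(min.Value, tag.CacheTtlMs);
        }
        return min ?? 0;                           // assumption: no configured tags in range means no caching
    }
}
```

With tags at 100 (TTL 1000), 105 (TTL 500, two registers), and 200 (TTL 0), a read of ten registers starting at 100 resolves to 500, while any read touching register 200 resolves to 0.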

11.2 Cache key

  1. CacheKey: same shape as Phase 10's CoalescingKey, i.e. readonly record struct CacheKey(byte UnitId, byte Fc, ushort StartAddress, ushort Qty). If Phase 10 is already merged, prefer a using CacheKey = CoalescingKey; alias over a redefinition (same data, same hashing, single source of truth). If the two phases land together (Phase 10 + 11 in a coordinated release), consider renaming CoalescingKey to ReadKey to make the shared use site neutral.

11.3 Cache entry and storage

  1. CacheEntry: internal sealed record CacheEntry(byte[] PduBytes, DateTimeOffset CachedAtUtc, DateTimeOffset ExpiresAtUtc, int Length, ushort LastUsedTick). LastUsedTick is a monotonic counter for LRU ordering (avoids DateTimeOffset.UtcNow calls on every cache access).

  2. ResponseCache: internal sealed class ResponseCache : IDisposable. Methods:

    • bool TryGet(CacheKey key, out CacheEntry entry) — returns true ONLY if entry exists and entry.ExpiresAtUtc > DateTimeOffset.UtcNow. Updates LastUsedTick on hit. Expired entries removed lazily.
    • void Set(CacheKey key, CacheEntry entry) — replaces any existing entry. If Count >= MaxEntriesPerPlc, evict the LRU entry first.
    • int Invalidate(byte unitId, ushort startAddress, ushort qty) — delegates to CacheInvalidator. Returns count invalidated.
    • int Count { get; }, long ApproximateBytes { get; }
    • Background eviction loop (started in constructor, stopped in Dispose): every EvictionIntervalMs (default 5000), scans the map and removes entries past ExpiresAtUtc.
  3. CacheInvalidator — pure logic: static IEnumerable<CacheKey> FindOverlapping(IReadOnlyCollection<CacheKey> haystack, byte unitId, ushort writeStart, ushort writeQty). Returns keys whose range [StartAddress, StartAddress + Qty) intersects [writeStart, writeStart + writeQty). Limit scope to keys matching unitId and Fc in {3, 4} (we never cache writes; invalidation only applies to read entries).
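The range-overlap matcher described for CacheInvalidator can be sketched as below. This is an illustration under assumptions, not the final implementation: RangeKey stands in for CacheKey, and OverlapSketch is a hypothetical name. Interval ends are computed as int so that StartAddress + Qty cannot overflow ushort near address 65535.

```csharp
using System.Collections.Generic;

// Stand-in for CacheKey (same four fields as in the plan).
public readonly record struct RangeKey(byte UnitId, byte Fc, ushort StartAddress, ushort Qty);

public static class OverlapSketch
{
    // Half-open test: [entryStart, entryStart + entryQty) vs [writeStart, writeStart + writeQty).
    public static bool Overlaps(ushort entryStart, ushort entryQty, ushort writeStart, ushort writeQty)
    {
        int entryEnd = entryStart + entryQty;   // ushort + ushort promotes to int
        int writeEnd = writeStart + writeQty;
        return writeStart < entryEnd && entryStart < writeEnd;
    }

    public static IEnumerable<RangeKey> FindOverlapping(
        IReadOnlyCollection<RangeKey> haystack, byte unitId, ushort writeStart, ushort writeQty)
    {
        foreach (var key in haystack)
        {
            if (key.UnitId != unitId) continue;                 // other unit ids are never touched
            if (key.Fc != 0x03 && key.Fc != 0x04) continue;     // only read entries are cached
            if (Overlaps(key.StartAddress, key.Qty, writeStart, writeQty))
                yield return key;
        }
    }
}
```

Under this test, a single-register write at 105 matches entries starting at 100 or 105 with Qty 10, does not match an entry starting at 200, and a five-register write starting at 10 does not match an entry starting at 15.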

11.4 Multiplexer integration

  1. Cache lookup in PlcMultiplexer.OnFrame — for FC03/04 requests when the read range has a non-zero resolved TTL:

    if (fc is 0x03 or 0x04 && resolvedTtlMs > 0) {
        var key = new CacheKey(unitId, fc, startAddr, qty);
        if (cache.TryGet(key, out var entry)) {
            counters.IncrementCacheHit();
            // Build a fresh MBAP wrapper for this client and send.
            var hitFrame = BuildResponseFrame(entry.PduBytes, originalTxId, unitId);
            upstreamPipe.SendResponse(hitFrame);
            return;   // no coalescing check, no backend round-trip
        }
        counters.IncrementCacheMiss();
    }
    // Fall through to Phase 10 coalescing path → Phase 9 send path
    

    Order matters: cache check FIRST, then coalescing. A cache hit short-circuits everything; only on a miss do we engage Phase 10's coalescing logic.

  2. Cache store on response — in the backend reader fan-out path, AFTER the rewriter has run on the response:

    if (req.Fc is 0x03 or 0x04 && req.ResolvedCacheTtlMs > 0) {
        var key = new CacheKey(req.UnitId, req.Fc, req.StartAddress, req.Qty);
        var now = DateTimeOffset.UtcNow;
        var entry = new CacheEntry(
            PduBytes:     rewrittenPduBytes.ToArray(),   // defensive copy
            CachedAtUtc:  now,
            ExpiresAtUtc: now.AddMilliseconds(req.ResolvedCacheTtlMs),
            Length:       rewrittenPduBytes.Length,
            LastUsedTick: NextLruTick());
        cache.Set(key, entry);
    }
    

    Note: req.ResolvedCacheTtlMs is computed at request-receive time by walking the BcdTagMap for tags in [StartAddress, StartAddress + Qty) and taking min(CacheTtlMs). If any tag has TTL = 0, ResolvedCacheTtlMs = 0 and the whole read is uncached.

  3. Cache invalidation on write response — FC06 / FC16 successful response (NOT exception response):

    if (req.Fc is 0x06 or 0x10 && (fc & 0x80) == 0) {
        int invalidated = cache.Invalidate(req.UnitId, req.StartAddress, req.Qty);
        if (invalidated > 0) {
            counters.AddCacheInvalidations(invalidated);
            CacheLogEvents.WriteInvalidatedEntries(logger, req.UnitId,
                req.StartAddress, req.Qty, invalidated);
        }
    }
    

    Invalidation is by ADDRESS RANGE OVERLAP, not by exact key match. A write to register 105 invalidates a cached read of [100..110] and a cached read of [105..115] but NOT a cached read of [200..210].
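Step 1's cache-hit path wraps the cached PDU in a fresh MBAP header carrying the requesting client's transaction id. The sketch below shows one way such a helper could look; it is a guess at BuildResponseFrame's behaviour, not the project's code, and MbapSketch is a hypothetical name. The header layout itself is standard Modbus TCP: transaction id (2 bytes), protocol id (2 bytes, always 0), length (2 bytes, counting the unit id plus the PDU), unit id (1 byte), all big-endian.

```csharp
using System;
using System.Buffers.Binary;

public static class MbapSketch
{
    // TxId (2) | ProtocolId (2) | Length (2) | UnitId (1)
    public const int HeaderLength = 7;

    public static byte[] BuildResponseFrame(ReadOnlySpan<byte> pdu, ushort originalTxId, byte unitId)
    {
        var frame = new byte[HeaderLength + pdu.Length];
        BinaryPrimitives.WriteUInt16BigEndian(frame.AsSpan(0, 2), originalTxId);
        BinaryPrimitives.WriteUInt16BigEndian(frame.AsSpan(2, 2), 0);   // protocol id is 0 for Modbus
        // The length field counts the unit id byte plus the PDU bytes.
        BinaryPrimitives.WriteUInt16BigEndian(frame.AsSpan(4, 2), (ushort)(pdu.Length + 1));
        frame[6] = unitId;
        pdu.CopyTo(frame.AsSpan(HeaderLength));   // cached PDU is copied, never mutated
        return frame;
    }
}
```

For a cached FC03 PDU 03 02 04 D2 (one register holding 1234), TxId 0x1234, and unit id 1, this yields 12 34 00 00 00 05 01 03 02 04 D2. Because the cached array is only copied, many clients can be served from the same entry concurrently.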

11.5 Per-tag TTL configuration

  1. BcdTagOptions extension:

    public sealed class BcdTagOptions {
        public ushort Address    { get; init; }
        public byte   Width      { get; init; }
        public int?   CacheTtlMs { get; init; }         // null = unset (use PlcOptions.DefaultCacheTtlMs); 0 = explicitly no caching
    }
    
  2. PlcOptions.DefaultCacheTtlMs — applies to any tag whose explicit CacheTtlMs was not set (use a nullable int? on BcdTagOptions instead of int = 0 to distinguish "explicitly zero" from "unset"). Default for the PLC default itself is 0.

  3. MbproxyOptions.Cache section:

    public sealed class CacheOptions {
        public bool AllowLongTtl       { get; init; } = false; // gate for TTL > 60_000
        public int  MaxEntriesPerPlc   { get; init; } = 1000;
        public int  EvictionIntervalMs { get; init; } = 5000;
    }
    
  4. Validation in ReloadValidator: CacheTtlMs >= 0 always; CacheTtlMs > 60_000 requires Cache.AllowLongTtl = true. Reject reloads that violate. Prevents "left at 1 hour by accident" deployments.

  5. BcdTagMapBuilder.Build resolution: returns each BcdTag with CacheTtlMs resolved per fallback rules: explicit per-tag → per-PLC default → 0.
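The fallback chain and the validation rule from the steps above can be sketched together. TtlConfigSketch and its method names are hypothetical; only the rules themselves (explicit per-tag value, then per-PLC default, then 0; values above 60_000 ms require AllowLongTtl) come from this plan.

```csharp
using System.Collections.Generic;

public static class TtlConfigSketch
{
    public const int LongTtlThresholdMs = 60_000;

    // Explicit per-tag value wins, including an explicit 0; otherwise the
    // per-PLC default applies (which itself defaults to 0 = no caching).
    public static int ResolveTagTtl(int? tagCacheTtlMs, int plcDefaultCacheTtlMs)
        => tagCacheTtlMs ?? plcDefaultCacheTtlMs;

    // Returns human-readable problems; an empty list means the value is acceptable.
    public static IReadOnlyList<string> Validate(int cacheTtlMs, bool allowLongTtl)
    {
        var errors = new List<string>();
        if (cacheTtlMs < 0)
            errors.Add($"CacheTtlMs must be >= 0 (was {cacheTtlMs}).");
        if (cacheTtlMs > LongTtlThresholdMs && !allowLongTtl)
            errors.Add($"CacheTtlMs {cacheTtlMs} exceeds {LongTtlThresholdMs} ms; set Cache.AllowLongTtl = true to permit it.");
        return errors;
    }
}
```

Validating the resolved value rather than only the raw per-tag value would also catch an over-long DefaultCacheTtlMs, though whether ReloadValidator should check both is a choice this plan leaves open.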

11.6 Counters and status surfacing

  1. ProxyCounters additions:

    • CacheHitCount (Interlocked long)
    • CacheMissCount (Interlocked long)
    • CacheInvalidations (Interlocked long)
    • CacheEntryCount (snapshot from ResponseCache.Count — read-time)
    • CacheBytes (snapshot from ResponseCache.ApproximateBytes — read-time)
  2. StatusDto.PlcBackendStatus extension:

    public sealed record PlcBackendStatus(
        long ConnectsSuccess, long ConnectsFailed,
        ExceptionCounts ExceptionsByCode,
        double LastRoundTripMs,
        long CoalescedHitCount, long CoalescedMissCount, long CoalescedResponseToDeadUpstream,  // Phase 10
        long CacheHitCount, long CacheMissCount,                                                 // Phase 11
        long CacheInvalidations, long CacheEntryCount, long CacheBytes);                         // Phase 11
    
  3. HTML page — add a compact Cache: 73% cell per PLC row. Page-weight assertion (under 50 KB for 54 PLCs) must continue to pass.
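The per-row cache cell needs a hit ratio derived from the two counters. A minimal sketch follows, assuming the ratio is hits / (hits + misses) and that a PLC with no cacheable reads yet shows a placeholder. CacheRatioSketch and the "n/a" text are assumptions, not part of the renderer.

```csharp
using System;
using System.Globalization;

public static class CacheRatioSketch
{
    public static double HitRatio(long hits, long misses)
    {
        long total = hits + misses;
        return total == 0 ? 0.0 : (double)hits / total;   // guard against divide-by-zero
    }

    public static string FormatCell(long hits, long misses)
    {
        if (hits + misses == 0) return "Cache: n/a";       // assumption: placeholder before any lookups
        int percent = (int)Math.Round(HitRatio(hits, misses) * 100, MidpointRounding.AwayFromZero);
        return "Cache: " + percent.ToString(CultureInfo.InvariantCulture) + "%";
    }
}
```

Keeping the cell to a short fixed-format string helps the page-weight assertion, since the added bytes per PLC row stay constant.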

11.7 Documentation and template

  1. docs/kpi.md — graduate cache-hit-ratio KPIs from "deferred / future" to Tier 1 supported. Add cacheEntryCount and cacheBytes as Tier 2 memory-watch KPIs.

  2. install/mbproxy.config.template.json — add a fully-commented Mbproxy.Cache section showing AllowLongTtl, MaxEntriesPerPlc, EvictionIntervalMs. Show example per-tag CacheTtlMs: 1000 and per-PLC DefaultCacheTtlMs: 500 entries. Include a prominent comment explaining the staleness contract: "clients reading these tags will see values up to CacheTtlMs milliseconds old".

  3. mbproxy/CLAUDE.md Architecture summary — add a bullet:

    • Optional response cache with per-tag TTL (default 0 = off). Cached FC03/04 responses serve subsequent same-key reads without backend traffic; FC06/FC16 write responses invalidate overlapping entries by address range.

Public surface declared in this phase

namespace Mbproxy.Proxy.Cache;

internal readonly record struct CacheKey(
    byte UnitId, byte Fc, ushort StartAddress, ushort Qty);

internal sealed record CacheEntry(
    byte[] PduBytes,
    DateTimeOffset CachedAtUtc, DateTimeOffset ExpiresAtUtc,
    int Length, ushort LastUsedTick);

internal sealed class ResponseCache : IDisposable {
    public bool TryGet(CacheKey key, out CacheEntry entry);
    public void Set(CacheKey key, CacheEntry entry);
    public int Invalidate(byte unitId, ushort startAddress, ushort qty);
    public int Count { get; }
    public long ApproximateBytes { get; }
    public void Dispose();
}

internal static class CacheInvalidator {
    public static IEnumerable<CacheKey> FindOverlapping(
        IReadOnlyCollection<CacheKey> haystack,
        byte unitId, ushort writeStart, ushort writeQty);
}

namespace Mbproxy.Options;

public sealed class CacheOptions {
    public bool AllowLongTtl       { get; init; } = false;
    public int  MaxEntriesPerPlc   { get; init; } = 1000;
    public int  EvictionIntervalMs { get; init; } = 5000;
}
// Added field on MbproxyOptions:
public CacheOptions Cache { get; init; } = new();

// Added field on BcdTagOptions (nullable to distinguish "unset" from "explicitly 0"):
public int? CacheTtlMs { get; init; }

// Added field on PlcOptions:
public int DefaultCacheTtlMs { get; init; } = 0;

ProxyCounters and CounterSnapshot gain 5 new long fields. No public-surface removals or renames.
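One way the TryGet / Set half of the ResponseCache surface could be filled in is sketched below. It is illustrative only: SketchKey, SketchEntry, and SketchCache are hypothetical names, the injected clock is an assumption made for testability, and the LRU tick is a long rather than the declared ushort because a ushort wraps after 65,536 increments. A plain Dictionary under a lock keeps the sketch short; the O(n) LRU scan is tolerable only because the cap is small.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

public readonly record struct SketchKey(byte UnitId, byte Fc, ushort StartAddress, ushort Qty);

public sealed class SketchEntry
{
    public SketchEntry(byte[] pduBytes, DateTimeOffset expiresAtUtc)
    {
        PduBytes = pduBytes;
        ExpiresAtUtc = expiresAtUtc;
    }
    public byte[] PduBytes { get; }
    public DateTimeOffset ExpiresAtUtc { get; }
    public long LastUsedTick { get; set; }   // assumption: long instead of ushort to avoid wrap-around
}

public sealed class SketchCache
{
    private readonly object _gate = new();
    private readonly Dictionary<SketchKey, SketchEntry> _map = new();
    private readonly int _maxEntries;
    private readonly Func<DateTimeOffset> _utcNow;   // injected clock (assumption)
    private long _tick;

    public SketchCache(int maxEntries, Func<DateTimeOffset> utcNow)
    {
        _maxEntries = maxEntries;
        _utcNow = utcNow;
    }

    public int Count { get { lock (_gate) return _map.Count; } }

    public bool TryGet(SketchKey key, out SketchEntry? entry)
    {
        lock (_gate)
        {
            if (_map.TryGetValue(key, out var found))
            {
                if (found.ExpiresAtUtc > _utcNow())
                {
                    found.LastUsedTick = ++_tick;   // refresh LRU position on hit
                    entry = found;
                    return true;
                }
                _map.Remove(key);                   // lazy removal of an expired entry
            }
            entry = null;
            return false;
        }
    }

    public void Set(SketchKey key, SketchEntry entry)
    {
        lock (_gate)
        {
            // Evict only when a NEW key would exceed the cap; replacing an
            // existing key does not grow the map (a choice made in this sketch).
            if (!_map.ContainsKey(key) && _map.Count > 0 && _map.Count >= _maxEntries)
            {
                var lruKey = _map.MinBy(kv => kv.Value.LastUsedTick).Key;
                _map.Remove(lruKey);
            }
            entry.LastUsedTick = ++_tick;
            _map[key] = entry;
        }
    }
}
```

With a cap of 2, setting keys A and B, reading A, then setting C evicts B, because A's read refreshed its tick.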

Tests required

Unit (Category = Unit)

CacheKeyTests (≥ 3 tests): equality across identical keys; FC03 vs FC04 differs; UnitId differs.

CacheEntryTests (≥ 3 tests): expired detection at boundary; immutability of PduBytes; LRU tick monotonicity.

CacheInvalidatorTests (≥ 5 tests, range-overlap math):

  1. FullOverlap_WriteCoversEntryRange_Invalidates
  2. PartialOverlap_WriteStartsBeforeEntry_Invalidates
  3. PartialOverlap_WriteEndsAfterEntry_Invalidates
  4. Adjacent_NotOverlapping_DoesNotInvalidate: write to [10..15] does NOT invalidate cached [15..20] (half-open intervals: the write covers 10 through 14, so 15 is not in the write's range).
  5. NoOverlap_DoesNotInvalidate
  6. DifferentUnitId_DoesNotInvalidate

ResponseCacheTests (≥ 8 tests):

  1. SetThenGet_RoundTrips
  2. GetExpiredEntry_ReturnsFalse_AndRemoves — uses a small TTL + Task.Delay
  3. Invalidate_OverlappingRange_RemovesMatching — set 3 entries, invalidate a range overlapping 2 of them, verify Count drops by 2
  4. Invalidate_OnlyAffectsFc03Fc04_KeysWithFcOther_NotTouched — there shouldn't be FC06/FC16 entries in cache, but a defensive test
  5. Set_AtMaxEntries_EvictsLRU
  6. LRU_TracksAccessOrder_Across_Get_And_Set
  7. Concurrent_GetSet_NoDataRace — 100 tasks, 1000 ops each
  8. Dispose_StopsEvictionLoop

E2E (Category = E2E)

ResponseCacheE2ETests (≥ 6 tests, against pymodbus simulator):

  1. E2E_CacheHit_AfterFirstRead_NoBackendTraffic: configure tag at HR1072 with CacheTtlMs = 5000; first read goes to backend; second read within 5s hits cache. Verify via the simulator's HTTP introspection or by timing (cache hits should return well under the ~10 ms that a backend read takes).
  2. E2E_CacheExpires_AfterTtl_NextReadHitsBackend — short TTL (e.g., 200 ms); after delay, second read goes to backend.
  3. E2E_WriteInvalidatesOverlappingCacheEntries — read HR1072 (cache it), write to HR1072 with FC06, next read MUST miss cache and re-fetch.
  4. E2E_NonOverlappingWrite_DoesNotInvalidate — read HR1072 (cache it), write to HR1080, next read of HR1072 still hits cache.
  5. E2E_BcdDecodedBytesAreCached_NotRawBcd — cache hit returns the decoded 1234, not 0x1234. Proves the cache stores post-rewriter bytes.
  6. E2E_DisablingCache_ViaHotReload_FlushesEntries — set CacheTtlMs = 1000 on a tag, do a read (cached), hot-reload with CacheTtlMs = 0, next read must hit the backend even though the old entry is still within its TTL window.
  7. E2E_MultiTagRead_RangeWithZeroTtlTag_DisablesCaching — read [100..110] where one tag in the range has CacheTtlMs = 0; verify no caching of the whole read.

Phase gate

  • docs/design.md updates from Task 1 are merged FIRST (or in the same PR). The contract change is not optional and not deferrable. Gate fail otherwise.
  • dotnet build Mbproxy.slnx -c Debug — zero warnings, zero errors.
  • All prior tests still green — the 4 critical Phase-9 regression guards + Phase 10's coalescing tests.
  • All new unit + e2e tests pass (≥ 25 new).
  • Default TTL = 0 → no observable behavior change vs Phase 10. Verify: run the full Phase 10 test suite with the Phase 11 build; everything green.
  • Headline assertion (E2E): configure CacheTtlMs = 1000 on HR1072; issue 10 reads at 100 ms intervals; backend (stub or sim with introspection) sees exactly 1 backend round-trip.
  • Write invalidation correctly handles all 6 range-overlap cases (full, two partial, adjacent, none, different-unit-id).
  • Memory cap enforced: with MaxEntriesPerPlc = 5, 6 distinct cache inserts produce 5 entries (one LRU eviction observed).
  • Validation rejects CacheTtlMs > 60_000 unless Cache.AllowLongTtl = true.
  • Hot-reload of CacheTtlMs flushes entries for the affected tag (or, simpler: flushes the entire cache for the PLC). Pick the simpler option (PLC-wide flush) and document.
  • HTML page weight under 50 KB for 54 PLCs (verify with the existing renderer test).
  • docs/kpi.md Tier 1 includes cache-hit-ratio.
  • install/mbproxy.config.template.json includes the new Mbproxy.Cache block with the staleness commentary.

Out of scope

  • Active polling — cache populates on demand only. No background poll loop.
  • Predictive prefetching — no speculative reads.
  • Range-overlap coalescing of cache entries — if reads [100..110] and [105..115] are both cached, no attempt to merge them into one [100..115] entry. Same-key only.
  • Cross-PLC caching — each PLC's cache is independent. No optimisation across PLCs.
  • Persistence — process restart wipes the cache. No file/Redis backing store.
  • Cache warming — no pre-populating the cache from a snapshot, last-known-good file, etc.
  • TTL > 60 seconds without explicit AllowLongTtl opt-in — refused at validation.
  • Adaptive TTL — operator-configured only. No auto-tuning.

Subagent briefing

If you're the agent picking up this phase:

  1. Task 1 is design.md, not code. The contract update is the gate. Do not write the cache code until the design changes have been reviewed and merged (or are in the same PR with explicit reviewer attention). A reviewer who lands the code without the design update has failed the gate, and so have you.

  2. Default TTL = 0 means default behavior = Phase 10 unchanged. Critical for backwards-compat. Every existing test that doesn't set CacheTtlMs must continue to pass without modification.

  3. Cache stores POST-rewriter bytes. The rewriter runs once on the cache-miss path; subsequent hits return cached decoded bytes directly. Do not re-invoke the rewriter on hits — wastes CPU and changes nothing.

  4. Write-invalidation is by ADDRESS RANGE OVERLAP, not by exact key match. A write to register 105 invalidates a cached read of [100..110]. Use half-open interval math: write [w, w+q) overlaps entry [s, s+n) iff w < s+n && s < w+q.

  5. Multi-tag read range: effective TTL is min(TTLs). If any tag in the read range has TTL = 0, the whole read is uncached. Conservative-by-design.

  6. Cache lookup happens BEFORE coalescing. Order: cache check → cache miss → coalescing check (Phase 10) → backend send (Phase 9). A cache hit short-circuits everything.

  7. CacheKey is structurally identical to CoalescingKey. Prefer aliasing over redefinition. If the two phases land together, rename the shared type to ReadKey to make the joint use site neutral.

  8. MBAP TxId restoration on cache-hit responses. The cache stores the PDU bytes (post-rewriter); on hit, build a fresh MBAP wrapper with the requesting client's OriginalTxId. There's no cached MBAP — the per-request TxId is supplied by the upstream pipe's request.

  9. Hot-reload of CacheTtlMs: flush the whole PLC cache on any tag-list change. Tag-level granularity is technically possible but complicates the reload code path. The simple correctness move is "any tag-list change to this PLC → drop all cached entries for this PLC and let them re-populate." Document the choice.

  10. Eviction loop: PeriodicTimer + cancellation token. Not System.Timers.Timer. The cache is IDisposable; the loop honours Dispose.

  11. Update docs/design.md AND docs/kpi.md AND mbproxy/CLAUDE.md AND install/mbproxy.config.template.json IN THE SAME PR AS THE CODE. Doc drift is a gate fail. The architectural pivot must be visible across all reader-facing surfaces.
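The eviction-loop guidance in item 10 (PeriodicTimer plus a cancellation token, with the loop honouring Dispose) can be sketched as below. EvictionLoopSketch is a hypothetical class that tracks only expiry times keyed by string, not the real ResponseCache. The sweep is exposed as a method taking the current time so it can be exercised without waiting on the timer, which is an assumption made for testability.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

public sealed class EvictionLoopSketch : IDisposable
{
    private readonly ConcurrentDictionary<string, DateTimeOffset> _expiries = new();
    private readonly CancellationTokenSource _cts = new();
    private readonly PeriodicTimer _timer;
    private readonly Task _loop;

    public EvictionLoopSketch(TimeSpan evictionInterval)
    {
        _timer = new PeriodicTimer(evictionInterval);
        _loop = RunAsync(_cts.Token);   // started in the constructor, as the plan describes
    }

    public int Count => _expiries.Count;

    public void Set(string key, DateTimeOffset expiresAtUtc) => _expiries[key] = expiresAtUtc;

    // Removes every entry whose expiry is at or before nowUtc; returns the count removed.
    public int SweepExpired(DateTimeOffset nowUtc)
    {
        int removed = 0;
        foreach (var pair in _expiries)
        {
            // TryRemove(pair) removes only if the value is unchanged, so a
            // concurrent Set with a fresh expiry is not thrown away.
            if (pair.Value <= nowUtc && _expiries.TryRemove(pair)) removed++;
        }
        return removed;
    }

    private async Task RunAsync(CancellationToken ct)
    {
        try
        {
            while (await _timer.WaitForNextTickAsync(ct))
                SweepExpired(DateTimeOffset.UtcNow);
        }
        catch (OperationCanceledException)
        {
            // Dispose requested shutdown; exit quietly.
        }
    }

    public void Dispose()
    {
        _cts.Cancel();      // wakes the pending WaitForNextTickAsync
        _timer.Dispose();
        _loop.Wait();       // loop has observed cancellation once this returns
        _cts.Dispose();
    }
}
```

Waiting on the loop task inside Dispose gives Dispose_StopsEvictionLoop something deterministic to assert: once Dispose returns, no further sweeps can run.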

Cross-references

  • Phase 9's multiplexer is the chokepoint that hosts the cache check: 09-txid-multiplexing.md.
  • Phase 10's CoalescingKey is the same shape as Phase 11's CacheKey: 10-read-coalescing.md.
  • The "not a polling/cache layer" stance that this phase pivots away from: ../design.md → "What this is" + "Purpose".
  • KPI graduation target: ../kpi.md → Tier 1 (cache-hit-ratio joins this tier).
  • Resolution rules for per-tag CacheTtlMs (Global Add Remove fallback + per-PLC default): ../design.md → "Hybrid tag resolution".