# Response Cache

The response cache is an opt-in per-tag, bounded-staleness layer that serves
FC03 and FC04 reads from in-process memory. It sits above read coalescing in
the request path so a hit avoids both the coalescing entry and the backend
round-trip entirely.

## Cache Contract

The cache is **off by default for every tag**. With no TTL configured at
either the tag or the PLC level, every BCD tag resolves to an effective
`CacheTtlMs` of `0`, and a deployment that ships without any TTL
configuration behaves identically to one compiled without the cache at
all: no in-memory entries are created, every FC03/FC04 read falls through
to the coalescing-then-backend path, and counters that track cache
activity stay at zero.

Operators opt a tag in by setting a positive `CacheTtlMs`. That positive
value is the explicit acknowledgement of the staleness window: the operator
is stating, "I am willing for upstream clients to see a value up to N
milliseconds old in exchange for taking the read off the backend." There is
no implicit cache enablement. There is no global cache toggle that turns
caching on for previously-uncached tags. Every cached tag is one whose
configuration has a positive TTL on its line.

This stance is the design-contract pivot the cache introduces: before it,
the proxy was purely transparent except for BCD rewriting. With the cache,
the proxy is transparent **by default**, with an opt-in cache layer the
operator can engage tag-by-tag.

## TTL Resolution Order

Each FC03/FC04 read range resolves to one effective TTL through three
tiers:

1. **Explicit per-tag.** `BcdTagOptions.CacheTtlMs` on the tag entry. A
   non-null value wins regardless of the per-PLC default. An explicit `0`
   here disables caching for that tag even when the PLC default is
   positive.
2. **Per-PLC default.** `PlcOptions.DefaultCacheTtlMs` applies to any tag
   whose explicit `CacheTtlMs` is `null` (unset). A `0` default means "no
   caching by default at this PLC."
3. **Zero.** With nothing set at either tier, the resolved TTL is `0` and
   the read is uncached.

`BcdTagMap.ResolveCacheTtlMs(startAddress, qty)` implements the per-read
resolution. It enumerates the BCD tags whose register footprints intersect
the requested range and returns the smallest TTL across the hits. It
returns `0` if any intersecting tag is uncached (`CacheTtlMs <= 0`) or if
the range covers no configured tags.

```csharp
public int ResolveCacheTtlMs(ushort startAddress, ushort qty)
{
    if (!TryGetForRange(startAddress, qty, out var hits) || hits.Count == 0)
        return 0;

    int min = int.MaxValue;
    foreach (var hit in hits)
    {
        int ttl = hit.Tag.CacheTtlMs;
        if (ttl <= 0) return 0;
        if (ttl < min) min = ttl;
    }
    return min == int.MaxValue ? 0 : min;
}
```

The `hit.Tag.CacheTtlMs` value resolved on each `BcdTag` already reflects
the explicit-then-default order — the options binder resolves the per-tag
override against the per-PLC default at config build time, so the runtime
hot path sees a single integer per tag.
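
The explicit-then-default order can be sketched as a single null-coalescing
expression. This is a minimal illustration, not the binder's actual code:
`TtlResolution` and `Resolve` are hypothetical names, and the sketch assumes
both settings bind as nullable integers.

```csharp
// Hypothetical helper; the real options binder is not shown in this document.
public static class TtlResolution
{
    // explicitTtlMs:   the per-tag CacheTtlMs, null when unset.
    // plcDefaultTtlMs: the per-PLC DefaultCacheTtlMs, null when unset (assumed nullable).
    public static int Resolve(int? explicitTtlMs, int? plcDefaultTtlMs)
    {
        // Tier 1 wins when present (including an explicit 0),
        // then tier 2, then the tier 3 fallback of 0 (uncached).
        return explicitTtlMs ?? plcDefaultTtlMs ?? 0;
    }
}
```

Under these assumptions, `Resolve(0, 500)` yields `0` because the explicit
zero overrides the PLC default, while `Resolve(null, 500)` yields `500`.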

## Multi-Tag Range TTL Rule

When a single FC03/FC04 read covers multiple configured BCD tags, the
effective TTL is the minimum across them:

```text
range covers tags { A:TTL=500, B:TTL=2000, C:TTL=100 } → effective TTL = 100
range covers tags { A:TTL=500, B:TTL=0 (uncached)    } → effective TTL = 0
range covers tags { A:TTL=500 }                        → effective TTL = 500
range covers no configured tags                        → effective TTL = 0
```

If any covered tag has `CacheTtlMs = 0`, the whole read is uncached. The
rationale is conservative-by-design: one cached entry backs the whole
range, so it can stay fresh no longer than the shortest window any covered
tag allows. A multi-tag read whose narrowest TTL is, for example, 100 ms
cannot be served safely from an entry held for 2 s, because that longer
window was only ever acknowledged for the other tag. Rather than partition
a range read across heterogeneous TTLs or invent inheritance rules that an
operator would have to reason about per-deployment, the cache applies the
narrowest covered TTL to the whole read and does not cache the read at all
when that TTL is zero. Operators who want a tag cached in isolation but
uncached when read alongside an uncached neighbour get the expected
behaviour by leaving the neighbour at `CacheTtlMs = 0`.

A read whose range covers no configured BCD tags also resolves to `0`.
There is nothing to be conservative about because the cache only serves
ranges that contain rewriter-tracked tags — a read of plain non-BCD
registers does not engage the cache regardless of any per-PLC default.

## Lookup Order

The multiplexer's FC03/FC04 path consults three tiers in fixed order:

1. **Cache.** When `_ctx.Cache` is wired and `BcdTagMap.ResolveCacheTtlMs`
   returns a positive TTL for the read range, `ResponseCache.TryGet` is
   called against a `CacheKey(unitId, fc, startAddress, qty)`. A hit
   splices the cached payload onto a fresh MBAP header carrying the
   original upstream TxId, pushes the frame onto that pipe's response
   channel, and **returns without engaging coalescing or the backend at
   all**.
2. **Coalesce.** On a cache miss (or when the resolved TTL is zero), the
   request is offered to `InFlightByKeyMap.TryAttachOrCreate`. A hit
   attaches the new party to a peer's in-flight request.
3. **Backend.** On a coalescing miss, the request opens a proxy TxId,
   registers a `CorrelationMap` entry, runs the BCD rewriter on any FC06
   or FC16 payload, and queues the frame onto the outbound channel.

The cache check happens **before** the multiplexer's
`EnsureBackendConnectedAsync` call. A cache hit serves the upstream even
when the backend socket is currently disconnected or recovering. This is
not an accident — the cached payload's freshness is bounded by its TTL,
not by the liveness of the backend socket. See
[`../Operations/Troubleshooting.md`](../Operations/Troubleshooting.md) for
the operator view of cache-served reads during a backend outage.
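
The header splice on a cache hit can be sketched as follows. The field
layout is the standard 7-byte Modbus TCP MBAP header; the `MbapFraming`
class and `BuildFrame` method are hypothetical names used for illustration,
not the multiplexer's actual code.

```csharp
using System;
using System.Buffers.Binary;

// Hypothetical helper showing how a cached PDU could be framed for one upstream request.
public static class MbapFraming
{
    public static byte[] BuildFrame(ushort upstreamTxId, byte unitId, ReadOnlySpan<byte> cachedPdu)
    {
        var frame = new byte[7 + cachedPdu.Length];

        // Transaction id: the original upstream TxId, so the client can correlate the reply.
        BinaryPrimitives.WriteUInt16BigEndian(frame.AsSpan(0, 2), upstreamTxId);
        // Protocol id: always 0 for Modbus.
        BinaryPrimitives.WriteUInt16BigEndian(frame.AsSpan(2, 2), 0);
        // Length: number of bytes that follow this field (unit id + PDU).
        BinaryPrimitives.WriteUInt16BigEndian(frame.AsSpan(4, 2), (ushort)(cachedPdu.Length + 1));
        frame[6] = unitId;

        // The stored post-rewriter PDU is copied unchanged after the header.
        cachedPdu.CopyTo(frame.AsSpan(7));
        return frame;
    }
}
```

Only the seven header bytes differ between two hits on the same entry; the
PDU body is byte-for-byte the stored payload.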

## Storage Format: Post-Rewriter Bytes

`CacheEntry.PduBytes` holds the **post-rewriter response PDU body** — the
function code byte, the byte count, and the rewriter-decoded register
data, with no MBAP header. The backend reader task decodes the response
through `BcdPduPipeline` first and only then hands the rewritten payload
to `ResponseCache.Set`.

```csharp
internal sealed record CacheEntry(
    byte[] PduBytes,
    DateTimeOffset CachedAtUtc,
    DateTimeOffset ExpiresAtUtc,
    int Length,
    long LastUsedTick);
```

Storing post-rewriter bytes is both a CPU optimisation and a correctness
guarantee:

- **CPU.** A cache hit returns ready-to-send bytes. The rewriter does not
  re-run per hit; only the MBAP header is regenerated to carry the
  upstream's original TxId.
- **Correctness.** An entry decoded against an earlier rewriter version
  never gets retroactively re-transformed against a newer version. If the
  rewriter's behaviour changes mid-process (it does not today, but the
  guarantee is durable across future changes), existing cached entries
  age out under their TTL and are replaced by fresh entries decoded
  through the new rewriter. A bidirectional re-encode never happens to an
  already-stored entry.

## Write Invalidation by Address Range Overlap

A successful (non-exception) FC06 or FC16 response invalidates every
cached FC03 or FC04 entry whose address range
`[StartAddress, StartAddress + Qty)` overlaps the write range
`[writeStart, writeStart + writeQty)`. The pure overlap math lives in
`CacheInvalidator.FindOverlapping`:

```csharp
int writeEnd = writeStart + writeQty;   // half-open upper bound

foreach (var key in haystack)
{
    if (key.UnitId != unitId) continue;
    if (key.Fc != 0x03 && key.Fc != 0x04) continue;

    int keyEnd = key.StartAddress + key.Qty;
    // Overlap iff writeStart < keyEnd AND key.StartAddress < writeEnd.
    if (writeStart < keyEnd && key.StartAddress < writeEnd)
        hits.Add(key);
}
```

Worked examples on a single unit ID:

```text
Write to register 105 (qty=1)
  └─ invalidates cached FC03 [100..110) — register 105 is inside the cached range
  └─ leaves    cached FC03 [200..210) untouched

Write to registers [10..15) (qty=5)
  └─ leaves    cached FC03 [15..20) untouched — half-open intervals, 15 is not in [10..15)

Write to registers [98..108) (qty=10)
  └─ invalidates cached FC03 [100..110) — ranges overlap on [100..108)
```

Three properties of the invalidator deserve calling out:

- **Exception responses do not invalidate.** A Modbus exception (code 01,
  02, 03, 04, or any other) means the write did not take effect on the
  PLC. The cached read is still consistent with the device, so the
  invalidator is not engaged.
- **Different unit IDs never invalidate each other.** Multi-drop and
  gateway personalities behind a shared socket expose logically separate
  Modbus tables, one per unit ID. `CacheKey.UnitId` discriminates between
  them.
- **Only FC03 and FC04 entries are evicted.** The cache never stores write
  responses, so the invalidator's function-code filter is defensive
  rather than load-bearing.
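
The worked examples above can be checked against a standalone version of
the half-open overlap test. This is a sketch that isolates the interval
math only; `OverlapCheck.Overlaps` is a hypothetical name and omits the
unit ID and function-code filters that `CacheInvalidator.FindOverlapping`
applies.

```csharp
// Hypothetical standalone form of the half-open interval overlap test.
public static class OverlapCheck
{
    // True when [aStart, aStart + aQty) and [bStart, bStart + bQty)
    // share at least one register. Assumes both quantities are >= 1.
    public static bool Overlaps(ushort aStart, ushort aQty, ushort bStart, ushort bQty)
    {
        // Widening to int keeps the upper bounds from wrapping near address 65535.
        int aEnd = aStart + aQty;
        int bEnd = bStart + bQty;
        return aStart < bEnd && bStart < aEnd;
    }
}
```

With the write as the first range and the cached read as the second,
`Overlaps(105, 1, 100, 10)` and `Overlaps(98, 10, 100, 10)` are true, while
`Overlaps(105, 1, 200, 10)` and `Overlaps(10, 5, 15, 5)` are false.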

## Bounded Capacity (LRU)

Each `ResponseCache` instance is capped at `Cache.MaxEntriesPerPlc`
(default 1000). When the dictionary is at the cap and a fresh insert
arrives, `EvictLeastRecentlyUsed` walks the entries and removes the one
with the smallest `CacheEntry.LastUsedTick`. The linear scan is
intentional — at 1000 entries the scan is cheaper than the network
round-trip the cache is saving, and a sorted secondary structure would
add complexity for no measurable win.

`LastUsedTick` is a monotonic 64-bit counter incremented on every hit and
every fresh insert. Using the counter rather than `DateTimeOffset.UtcNow`
keeps the hot path free of clock calls and survives wall-clock skew.
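
The counter-based LRU described above can be sketched as below. This is an
illustration under simplifying assumptions, not the `ResponseCache`
implementation: `LruSketch` is a hypothetical class, it uses string keys
instead of `CacheKey`, it ignores TTL expiry, and it is single-threaded for
brevity.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, single-threaded sketch of a capped cache with linear-scan LRU eviction.
public sealed class LruSketch
{
    private readonly Dictionary<string, (byte[] Pdu, long LastUsedTick)> _entries = new();
    private readonly int _maxEntries;
    private long _tick; // monotonic use counter; no clock calls on the hot path

    public LruSketch(int maxEntries) => _maxEntries = maxEntries;

    public int Count => _entries.Count;

    public void Set(string key, byte[] pdu)
    {
        // Evict only when a new key arrives and the dictionary is at the cap.
        if (!_entries.ContainsKey(key) && _entries.Count >= _maxEntries)
            EvictLeastRecentlyUsed();
        _entries[key] = (pdu, ++_tick);
    }

    public bool TryGet(string key, out byte[] pdu)
    {
        if (_entries.TryGetValue(key, out var entry))
        {
            _entries[key] = (entry.Pdu, ++_tick); // a hit refreshes recency
            pdu = entry.Pdu;
            return true;
        }
        pdu = Array.Empty<byte>();
        return false;
    }

    private void EvictLeastRecentlyUsed()
    {
        string victim = "";
        bool found = false;
        long oldest = long.MaxValue;
        foreach (var pair in _entries) // linear scan for the smallest tick
        {
            if (pair.Value.LastUsedTick < oldest)
            {
                oldest = pair.Value.LastUsedTick;
                victim = pair.Key;
                found = true;
            }
        }
        if (found) _entries.Remove(victim);
    }
}
```

With a cap of 2, storing `a` and `b`, reading `a`, then storing `c` evicts
`b`, because the read gave `a` a newer tick than `b`.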

A background task drives proactive expiry. The constructor starts a
`PeriodicTimer` at `Cache.EvictionIntervalMs` (default 5000 ms; values
under 100 ms are clamped to 100 ms to prevent tight loops) and the
eviction loop sweeps every entry whose `ExpiresAtUtc` has passed. The
loop is the safety net that keeps abandoned entries — say, those for a
PLC whose upstream clients have all dropped — from holding memory until
process exit. Lazy expiry on `TryGet` still removes entries on demand
when traffic is steady; the background loop only matters under low- or
zero-traffic conditions.
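
The interval clamp and one sweep pass might look like the sketch below.
`ExpirySweep`, `ClampIntervalMs`, and `SweepExpired` are hypothetical names,
the dictionary shape is simplified to key and expiry only, and treating an
entry as expired when `ExpiresAtUtc <= now` is an assumption about the
boundary. In the real cache the sweep would run on each `PeriodicTimer` tick.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch of the proactive-expiry pieces; not the ResponseCache code.
public static class ExpirySweep
{
    // Values under 100 ms are raised to 100 ms so the loop cannot spin.
    public static int ClampIntervalMs(int configuredMs) => Math.Max(100, configuredMs);

    // Removes every entry whose expiry has passed and returns how many were removed.
    public static int SweepExpired(IDictionary<string, DateTimeOffset> expiresAtByKey, DateTimeOffset now)
    {
        // Collect first: a dictionary must not be mutated while it is being enumerated.
        var expired = new List<string>();
        foreach (var pair in expiresAtByKey)
        {
            if (pair.Value <= now) expired.Add(pair.Key);
        }
        foreach (var key in expired) expiresAtByKey.Remove(key);
        return expired.Count;
    }
}
```

Passing `now` in as a parameter keeps the sweep deterministic under test,
which is a design choice of this sketch rather than a statement about the
production code.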

## Long-TTL Safety Gate

`MbproxyOptionsValidator.ValidateCacheTtl` rejects any explicit
`CacheTtlMs > 60_000` unless `Cache.AllowLongTtl = true`. The same gate
applies to `PlcOptions.DefaultCacheTtlMs`. The rejection runs at config
bind / hot-reload time, so a misconfigured `appsettings.json` fails fast
before the cache sees the value.

The gate exists to catch the "left at 1 hour by accident" mistake — a
deployment where a developer set `CacheTtlMs = 3_600_000` for a debugging
session and the value survived into production. Operators who legitimately
need long TTLs (slow-moving setpoints, configuration values that change
once per shift) flip `Cache.AllowLongTtl` to `true` as the explicit
acknowledgement that the long staleness window is intentional.
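
The gate reduces to a single predicate. The sketch below is illustrative
only: `LongTtlGate` and `IsAllowed` are hypothetical names, and the real
`MbproxyOptionsValidator.ValidateCacheTtl` presumably reports a validation
failure rather than returning a boolean.

```csharp
// Hypothetical predicate form of the long-TTL gate.
public static class LongTtlGate
{
    // TTLs above this value require the explicit Cache.AllowLongTtl opt-in.
    public const int MaxTtlWithoutOptInMs = 60_000;

    public static bool IsAllowed(int ttlMs, bool allowLongTtl)
    {
        return ttlMs <= MaxTtlWithoutOptInMs || allowLongTtl;
    }
}
```

A TTL of exactly `60_000` passes without the opt-in, since the documented
rejection is for values strictly greater than `60_000`; `3_600_000` passes
only when `allowLongTtl` is `true`.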

## Cache and the Rewriter

The BCD rewriter runs **once** on the cache-miss path: the backend reader
task decodes the response through `BcdPduPipeline` and only then hands the
decoded bytes to `ResponseCache.Set`. Cache hits return the stored
post-rewriter bytes directly.

This division has two consequences worth restating:

- **The rewriter cost is amortised across hits.** A high cache hit ratio
  on a tag-dense PLC drops the per-request rewriter cost from "every
  response" to "every cache-miss response," which on a hot register at
  TTL=500 ms is one-in-many.
- **The cached payload is decoupled from the rewriter implementation.**
  An entry stored under one rewriter does not get re-transformed if the
  rewriter changes. Entries age out under TTL and are replaced by fresh
  entries decoded under the current rewriter — there is no in-place
  recomputation pass.

## Hot-Reload Semantics

Configuration changes propagate through `IOptionsMonitor<MbproxyOptions>`.
The cache reacts to five kinds of change:

| Change | Cache behaviour |
|--------|----------------|
| Tag's `CacheTtlMs` changed (`0 → N`, `N → 0`, `N → M`) | Entire PLC cache is flushed via `ResponseCache.Clear()`; entries re-populate on demand under the new TTL. |
| New PLC added / removed | New PLC starts with an empty cache; removed PLC's `ResponseCache` is disposed with the multiplexer. |
| `Cache.AllowLongTtl` flipped | Validation runs on the next reload only; existing entries are unaffected. |
| `Cache.MaxEntriesPerPlc` changed | Existing entries are unaffected; the new cap applies to subsequent inserts. |
| `Cache.EvictionIntervalMs` changed | Existing eviction loop continues with its old period; subsequent loops use the new interval. |

Per-tag flush granularity is intentionally not implemented. The clean move
is "any tag-list change to a PLC → drop every entry for that PLC and let
the natural traffic re-populate." Tracking which keys correspond to which
tag IDs adds bookkeeping for no operational win — a tag-list reload is
already a once-in-a-while event, and the rebuild cost on the affected
PLC's hot keys is one round-trip per key under traffic.

See [`../Features/HotReload.md`](../Features/HotReload.md) for the
broader `IOptionsMonitor` propagation model.

## Cache Survives Backend Disconnects

A cached entry's data was valid when stored. A subsequent backend
disconnect does not retroactively invalidate it — the value the upstream
client sees on a hit is the value the PLC reported within the TTL
window, irrespective of whether the backend socket is up at the moment
of the hit. This is the cache's most operationally visible property
during PLC outages: upstream consumers that read hot tags within the
cache window continue to receive responses while the listener supervisor
is in `recovering` state.

The companion rule on the write side keeps the invariant consistent:
**invalidations during a `recovering` listener state are skipped**. If
the backend is down, an FC06 or FC16 write did not reach the PLC, so the
cached read is still consistent with the device's actual state. Skipping
the invalidation matches reality — the write did not take effect, so the
read is not stale.

The skip is **structural**, not conditional. Cache invalidation only
fires inside the per-PLC backend reader task, after a non-exception
FC06/FC16 response arrives from the PLC. A `recovering` supervisor has
torn down its multiplexer and there is no backend reader, so no response
can land and the invalidation path is never entered. This is the
reasoning the code at `Proxy/Multiplexing/PlcMultiplexer.cs` documents
inline (W2.9). If a future change ever produced a write response that did
not come from the live backend (e.g. a mocked-response path), an explicit
`Recovering` check would need to be added at the invalidator call site to
keep the skip semantics correct.

## No Persistence

The cache is purely in-memory. Process restart wipes every entry. There
is no file-backed snapshot, no Redis or other external store, and no
last-known-good replay. A restarted service rebuilds its cache from
fresh backend round-trips driven by upstream traffic, exactly as it
would after a TTL-induced flush.

This is intentional, for two reasons. First, the staleness contract is bounded
by `CacheTtlMs` measured from when the data was first read, and a
persisted entry would re-emerge with an unknown wall-clock age — every
invariant the cache offers would need a freshness field, freshness
arithmetic on load, and recovery against a clock that may have jumped.
Second, the operational model is that the proxy is a stateless
transformer; treating its cache as durable state would change the
deployment story for no measurable production benefit.

## Counter Accounting

`ProxyCounters` exposes five cache counters per PLC, plus a hit ratio
derived from two of them. All six are surfaced on the status page as both
per-PLC and fleet-aggregate values:

- **`cacheHitCount`** — FC03/FC04 requests served from the cache. Bumped
  inside `OnUpstreamFrameAsync` when `ResponseCache.TryGet` returns true.
- **`cacheMissCount`** — FC03/FC04 requests whose resolved TTL was
  positive but whose key was not in the cache (or whose entry had
  expired). The identity `cacheHitCount + cacheMissCount = total
  cache-eligible FC03/FC04 requests` holds — reads whose effective TTL
  is `0` (uncached) increment neither counter.
- **`cacheHitRatio`** — derived on the status page snapshot as
  `cacheHitCount / (cacheHitCount + cacheMissCount)` when the
  denominator is non-zero.
- **`cacheInvalidations`** — count of cache entries invalidated by
  successful FC06/FC16 write responses, summed across writes.
- **`cacheEntryCount`** — point-in-time snapshot of
  `ResponseCache.Count` (Tier-2 memory-watch KPI).
- **`cacheBytes`** — point-in-time approximation of cached PDU bytes,
  computed as the running sum of `CacheEntry.Length` across entries
  (Tier-2 memory-watch KPI).
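
The derived ratio can be sketched as below. `CacheStats.HitRatio` is a
hypothetical helper, and returning `null` for a zero denominator is an
assumption; the document does not state what the status page shows in that
case.

```csharp
// Hypothetical helper for the derived cacheHitRatio value.
public static class CacheStats
{
    public static double? HitRatio(long cacheHitCount, long cacheMissCount)
    {
        long eligible = cacheHitCount + cacheMissCount;
        // No cache-eligible reads yet: the ratio is undefined (assumed to be null here).
        if (eligible == 0) return null;
        return (double)cacheHitCount / eligible;
    }
}
```

Reads with an effective TTL of `0` never reach either counter, so they do
not dilute the ratio.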

The structured log events `mbproxy.cache.hit`, `mbproxy.cache.miss`,
`mbproxy.cache.store`, `mbproxy.cache.invalidated`, and
`mbproxy.cache.flushed` (defined in `CacheLogEvents`) mirror the counter
increments at Debug level for incident-time diagnosis. Counters are the
steady-state observability surface; the events are for tracing one
request through the cache when something looks wrong. See
[`../Operations/StatusPage.md`](../Operations/StatusPage.md) and
[`../Reference/LogEvents.md`](../Reference/LogEvents.md).

## Design-Contract Note

The cache changes the proxy's posture from "purely transparent except
for BCD rewriting" to "transparent by default, with an opt-in cache
layer." The transition is deliberate and operator-driven: setting
`CacheTtlMs > 0` on a tag is the explicit consent to the staleness
window, and a deployment that ships no positive TTLs is observationally
indistinguishable from one compiled without the cache code path.

There is no global switch, no implicit warm-up, and no behavioural
divergence from the transparent baseline until the operator opts in
tag-by-tag. The cache is the only place in the proxy where an upstream
read can resolve to a value that did not just round-trip the wire, and
its engagement is gated entirely by the per-tag and per-PLC TTL
configuration described above.

## Related Documentation

- [`./ConnectionModel.md`](./ConnectionModel.md) — TxId multiplexing,
  correlation map, and the backend socket the cache short-circuits on a
  hit.
- [`./ReadCoalescing.md`](./ReadCoalescing.md) — sits below the cache in
  the lookup order; cache hits short-circuit coalescing entirely.
- [`../Features/BcdRewriting.md`](../Features/BcdRewriting.md) — the
  `BcdPduPipeline` whose post-decode bytes the cache stores.
- [`../Features/HotReload.md`](../Features/HotReload.md) — the
  `IOptionsMonitor` propagation that drives the per-PLC flush on
  tag-list change.
- [`../Operations/Configuration.md`](../Operations/Configuration.md) —
  binding for `BcdTagOptions.CacheTtlMs`,
  `PlcOptions.DefaultCacheTtlMs`, and the `Cache` section
  (`AllowLongTtl`, `MaxEntriesPerPlc`, `EvictionIntervalMs`).
- [`../Operations/StatusPage.md`](../Operations/StatusPage.md) — exposes
  `cacheHitCount`, `cacheMissCount`, `cacheHitRatio`,
  `cacheInvalidations`, `cacheEntryCount`, and `cacheBytes`.
- [`../Operations/Troubleshooting.md`](../Operations/Troubleshooting.md)
  — the operator view of cache-served reads while a backend is in
  `recovering` state.
- [`../Reference/LogEvents.md`](../Reference/LogEvents.md) — full
  `mbproxy.cache.*` event catalogue with event IDs.
- [`../Testing/Simulator.md`](../Testing/Simulator.md) — the
  `pymodbus` DL205 stand-in used by the end-to-end cache tests.