mbproxy: add opt-in response cache (Phase 11)

Layers a per-PLC, per-tag response cache on top of Phase 10's coalescing.
Cache is OFF by default per tag (CacheTtlMs = 0); a fresh deployment with no
TTL config behaves identically to Phase 10. Operators opt tags in by setting
CacheTtlMs > 0 on a BcdTagOptions entry (or DefaultCacheTtlMs > 0 on a
PlcOptions entry), explicitly acknowledging the staleness window.

Cache lookup order: cache -> coalesce -> backend. A cache hit short-circuits
both Phase 10's coalescing path and Phase 9's backend send. Cache stores
POST-rewriter PDU bytes so hits never re-invoke the BCD rewriter. FC06/FC16
write responses invalidate every cached entry whose address range overlaps
the write (half-open interval math).
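
The overlap test behind write invalidation can be sketched as below. This is a minimal illustration, not the code in CacheInvalidator: the RangeOverlap class and Overlaps method are hypothetical names, and treating a zero-quantity range as overlapping nothing is an assumption. Sums are widened to int so a range ending at address 65535 does not wrap.

```csharp
using System;

// Hypothetical helper for illustration; not the actual CacheInvalidator API.
public static class RangeOverlap
{
    // Two half-open ranges [aStart, aStart + aQty) and [bStart, bStart + bQty)
    // overlap when each one starts before the other ends.
    public static bool Overlaps(ushort aStart, ushort aQty, ushort bStart, ushort bQty)
    {
        // Assumption: an empty range overlaps nothing.
        if (aQty == 0 || bQty == 0) return false;

        // ushort + ushort is promoted to int, so the end cannot wrap past 65535.
        int aEnd = aStart + aQty;
        int bEnd = bStart + bQty;
        return aStart < bEnd && bStart < aEnd;
    }

    public static void Main()
    {
        // Cached read of registers 100..109 (start 100, qty 10).
        Console.WriteLine(Overlaps(100, 10, 105, 1)); // write inside the read: True
        Console.WriteLine(Overlaps(100, 10, 110, 2)); // adjacent, no shared register: False
        Console.WriteLine(Overlaps(100, 10, 95, 6));  // partial overlap at register 100: True
    }
}
```

With half-open ends, a write that starts exactly where a cached read ends leaves that entry in place, which matches the adjacent-not-overlapping test case listed below.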

New types (Mbproxy.Proxy.Cache, all internal):
- CacheKey (record-struct, same shape as CoalescingKey but kept SEPARATE so
  the two phases evolve independently).
- CacheEntry, ResponseCache (IDisposable; LRU + PeriodicTimer eviction
  loop), CacheInvalidator (pure overlap matcher), CacheLogEvents (stable
  mbproxy.cache.* names).

Multi-tag range TTL = min(TTLs); any tag with TTL = 0 in the range disables
caching for the whole read (conservative-by-design).
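
A minimal sketch of the min rule, assuming the per-tag TTLs for the read have already been resolved to non-negative integers. RangeTtl and Resolve are hypothetical names, and returning 0 for a read that touches no configured tag is an assumption, not something this commit states.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration; the real resolution code may differ.
public static class RangeTtl
{
    // TTL for a read spanning several tags: the smallest per-tag TTL, or 0
    // (do not cache) as soon as any tag in the range has caching disabled.
    public static int Resolve(IReadOnlyList<int> ttlsMs)
    {
        // Assumption: a read that touches no configured tag is not cached.
        if (ttlsMs.Count == 0) return 0;

        int min = int.MaxValue;
        foreach (int ttl in ttlsMs)
        {
            if (ttl <= 0) return 0;   // one uncached tag disables the whole read
            if (ttl < min) min = ttl;
        }
        return min;
    }

    public static void Main()
    {
        Console.WriteLine(Resolve(new[] { 1000, 250, 5000 })); // 250
        Console.WriteLine(Resolve(new[] { 1000, 0, 5000 }));   // 0
    }
}
```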

Options surface:
- BcdTagOptions.CacheTtlMs (nullable int; null = fall through to PLC default)
- PlcOptions.DefaultCacheTtlMs
- MbproxyOptions.Cache.{AllowLongTtl, MaxEntriesPerPlc, EvictionIntervalMs}
- TTL > 60_000 ms requires Cache.AllowLongTtl = true (reload validation).
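
The fall-through and long-TTL rules above could look roughly like this. It is a sketch under stated assumptions: TtlConfig, EffectiveTtlMs, Validate, and the message strings are hypothetical, and rejecting negative TTLs is an assumption that the commit does not spell out.

```csharp
#nullable enable
using System;

// Hypothetical helper for illustration; names and messages are assumptions.
public static class TtlConfig
{
    public const int LongTtlThresholdMs = 60_000;

    // A tag-level value wins when present; null falls through to the PLC default.
    public static int EffectiveTtlMs(int? tagCacheTtlMs, int plcDefaultCacheTtlMs)
        => tagCacheTtlMs ?? plcDefaultCacheTtlMs;

    // Returns null when the TTL is acceptable, otherwise a validation message.
    public static string? Validate(int ttlMs, bool allowLongTtl)
    {
        // Assumption: negative TTLs are rejected.
        if (ttlMs < 0) return "TTL must be >= 0.";
        if (ttlMs > LongTtlThresholdMs && !allowLongTtl)
            return "TTL above 60000 ms requires Cache.AllowLongTtl = true.";
        return null;
    }

    public static void Main()
    {
        Console.WriteLine(EffectiveTtlMs(null, 500)); // tag unset: PLC default applies
        Console.WriteLine(EffectiveTtlMs(0, 500));    // explicit 0: caching stays off
        Console.WriteLine(Validate(120_000, allowLongTtl: false) ?? "ok");
        Console.WriteLine(Validate(120_000, allowLongTtl: true) ?? "ok");
    }
}
```

An explicit CacheTtlMs = 0 on a tag is distinct from null here: it overrides a non-zero PLC default and keeps that tag uncached.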

Admin counters (Tier 1.8 + Tier 2 cache-memory KPIs from docs/kpi.md):
- CacheHitCount, CacheMissCount, CacheInvalidations on ProxyCounters.
- CacheEntryCount, CacheBytes via a new ICacheStatsProvider snapshot path.
- /status.json and the HTML page surface a new Cache cell per PLC row.

Hot-reload: any tag-list change to a PLC reseats the per-PLC context with a
fresh cache; the old cache is disposed inside ReplaceContextAsync. Per-tag
flush granularity is intentionally not implemented in v1.

PLCs with no cache-eligible tags (every resolved tag has CacheTtlMs = 0)
get Cache = null on the context and skip the eviction timer entirely, so
the no-cache path is byte-identical to Phase 10.

Tests (32 new unit + 5 new E2E = 37 new; suite now 314 unit + 48 E2E):
- CacheKeyTests, CacheEntryTests (records + boundary semantics).
- CacheInvalidatorTests: full overlap, both partials, adjacent-not-
  overlapping, disjoint, different unit ID + auxiliary FC-filter / zero-qty.
- ResponseCacheTests: round-trip, lazy expiry, range invalidation,
  unit-id filter, LRU bound, LRU access tracking, concurrent get/set,
  dispose, clear, approximate-bytes accounting.
- ResponseCacheMultiplexerTests (stub-backend): hit short-circuits
  coalescing, BCD-decoded bytes are cached not raw, FC06 invalidates
  overlapping, non-overlapping write does not invalidate, multi-tag
  TTL=min rule, regression-cache-disabled-by-default-is-Phase-10, hit
  works even when backend unreachable.
- ResponseCacheE2ETests (pymodbus DL205 sim, sequential reads):
  * Headline: 10 reads with TTL=1000 ms -> 9 hits, 1 miss, 1 backend trip.
  * TTL expiry path with sleep > TTL.
  * Write invalidation through the proxy on a scratch register.
  * BCD-decoded bytes are cached, not raw BCD nibbles.
  * Regression: Cache disabled by default -> behaviour byte-identical to
    Phase 10.

Pre-existing flake hardened: BackendDisconnect_CascadesToAllUpstreams now
polls briefly for the cascade counter to absorb the inherent scheduling
gap between "upstream EOF observed" and "counter incremented inside
TearDownBackendAsync." Counter semantics unchanged.

Phase doc updated with implementation clarifications discovered during
this work (CacheKey kept separate from CoalescingKey, LastUsedTick is
long, FC06/FC16 startAddr/qty parsing extension, cache-pre-connect
short-circuit, write-invalidation only on successful responses).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Joseph Doherty
2026-05-14 03:08:51 -04:00
parent 892b10baf4
commit 1db900edef
33 changed files with 2407 additions and 38 deletions
@@ -0,0 +1,277 @@
namespace Mbproxy.Proxy.Cache;

/// <summary>
/// Per-PLC opt-in response cache for FC03 / FC04 read responses. Phase 11.
///
/// <para><b>Lifecycle.</b> One instance per PLC, owned by the per-PLC context. The cache
/// is consulted on every FC03/FC04 request before coalescing; populated by the backend
/// reader task AFTER the BCD rewriter has decoded the response; invalidated on every
/// successful FC06/FC16 write response that overlaps a cached read range.</para>
///
/// <para><b>Concurrency.</b> A single <see cref="object"/> lock serialises every method.
/// A per-PLC cache sees at most one outstanding FC03/FC04 read on the backend at any
/// instant (the multiplexer serialises onto the shared socket), but the read-on-hit path
/// is called from many upstream task contexts concurrently; each critical section is
/// short, so contention on the lock stays low.</para>
///
/// <para><b>LRU eviction.</b> Each touch (hit or insert) assigns the entry the next value
/// from <see cref="_lruTicker"/>. When the cache reaches <see cref="_maxEntries"/> and a
/// new entry is inserted, the entry with the smallest <see cref="CacheEntry.LastUsedTick"/>
/// is removed.</para>
///
/// <para><b>TTL expiry.</b> Entries past their <see cref="CacheEntry.ExpiresAtUtc"/> are
/// dropped lazily on every read attempt, and also swept proactively by a background
/// <see cref="PeriodicTimer"/> loop every <see cref="_evictionIntervalMs"/>. The background
/// loop is the safety net that prevents abandoned entries (PLC whose clients all dropped)
/// from holding memory until process exit.</para>
/// </summary>
internal sealed class ResponseCache : IDisposable
{
    // ── State ────────────────────────────────────────────────────────────────────
    private readonly object _lock = new();
    private readonly Dictionary<CacheKey, CacheEntry> _entries;
    private readonly int _maxEntries;
    private readonly int _evictionIntervalMs;
    private long _lruTicker;
    private long _approxBytes;
    private readonly CancellationTokenSource _cts = new();
    private readonly Task _evictionTask;
    private bool _disposed;

    /// <summary>
    /// Constructs a cache with the supplied capacity and eviction tick interval. The
    /// eviction loop starts immediately; the cache becomes usable as soon as the
    /// constructor returns.
    /// </summary>
    /// <param name="maxEntriesPerPlc">LRU cap. Past this count, the next insert evicts
    /// the least-recently-used entry. Must be &gt;= 0; 0 disables caching entirely (every
    /// <see cref="Set"/> call no-ops).</param>
    /// <param name="evictionIntervalMs">Background sweep interval in milliseconds. Clamped
    /// to a 100 ms floor and an effective ceiling of <c>int.MaxValue</c>.</param>
    public ResponseCache(int maxEntriesPerPlc, int evictionIntervalMs)
    {
        if (maxEntriesPerPlc < 0)
            throw new ArgumentOutOfRangeException(nameof(maxEntriesPerPlc),
                "maxEntriesPerPlc must be >= 0.");
        if (evictionIntervalMs < 0)
            throw new ArgumentOutOfRangeException(nameof(evictionIntervalMs),
                "evictionIntervalMs must be >= 0.");
        _maxEntries = maxEntriesPerPlc;
        // 100 ms floor: any configured value below 100 (including an operator-pinned 0)
        // becomes 100 ms, so the eviction task never turns into a tight loop contending
        // for the lock on _entries.
        _evictionIntervalMs = Math.Max(100, evictionIntervalMs);
        _entries = new Dictionary<CacheKey, CacheEntry>(capacity: Math.Min(_maxEntries, 64));
        _evictionTask = Task.Run(() => RunEvictionLoopAsync(_cts.Token));
    }

    /// <summary>Current entry count. Stable read under lock.</summary>
    public int Count
    {
        get { lock (_lock) return _entries.Count; }
    }

    /// <summary>Approximation of cached PDU bytes (sum of <see cref="CacheEntry.Length"/>). Stable read under lock.</summary>
    public long ApproximateBytes
    {
        get { lock (_lock) return _approxBytes; }
    }

    /// <summary>
    /// Returns <c>true</c> with the cached <see cref="CacheEntry"/> when a non-expired
    /// entry is present for <paramref name="key"/>. Expired entries are removed lazily.
    /// Updates LRU ordering on hit.
    /// </summary>
    public bool TryGet(CacheKey key, out CacheEntry entry)
    {
        DateTimeOffset now = DateTimeOffset.UtcNow;
        lock (_lock)
        {
            if (!_entries.TryGetValue(key, out var existing))
            {
                entry = null!;
                return false;
            }
            if (existing.ExpiresAtUtc <= now)
            {
                // Expired: remove and miss.
                _entries.Remove(key);
                _approxBytes -= existing.Length;
                entry = null!;
                return false;
            }
            long tick = ++_lruTicker;
            var refreshed = existing with { LastUsedTick = tick };
            _entries[key] = refreshed;
            entry = refreshed;
            return true;
        }
    }

    /// <summary>
    /// Inserts or replaces the entry under <paramref name="key"/>. If the cache is at
    /// capacity, evicts the LRU entry first. No-op when <see cref="_maxEntries"/> is 0.
    /// </summary>
    public void Set(CacheKey key, CacheEntry entry)
    {
        if (_maxEntries == 0) return;
        lock (_lock)
        {
            long tick = ++_lruTicker;
            var stamped = entry with { LastUsedTick = tick };
            if (_entries.TryGetValue(key, out var existing))
            {
                // Replace; adjust byte accounting.
                _approxBytes -= existing.Length;
                _approxBytes += stamped.Length;
                _entries[key] = stamped;
                return;
            }
            // Insert. Evict LRU if at cap.
            if (_entries.Count >= _maxEntries)
                EvictLeastRecentlyUsed();
            _entries[key] = stamped;
            _approxBytes += stamped.Length;
        }
    }

    /// <summary>
    /// Invalidates every entry whose <see cref="CacheKey"/> range overlaps the write
    /// <c>[startAddress, startAddress + qty)</c> on <paramref name="unitId"/>. Returns the
    /// count of invalidated entries.
    /// </summary>
    public int Invalidate(byte unitId, ushort startAddress, ushort qty)
    {
        lock (_lock)
        {
            // Snapshot keys for the pure overlap matcher.
            var keys = _entries.Keys.ToArray();
            int count = 0;
            foreach (var k in CacheInvalidator.FindOverlapping(keys, unitId, startAddress, qty))
            {
                if (_entries.TryGetValue(k, out var existing))
                {
                    _entries.Remove(k);
                    _approxBytes -= existing.Length;
                    count++;
                }
            }
            return count;
        }
    }

    /// <summary>
    /// Drops every entry. Used by hot-reload (per-PLC flush on tag-map change).
    /// Returns the count of entries that were present before the flush.
    /// </summary>
    public int Clear()
    {
        lock (_lock)
        {
            int n = _entries.Count;
            _entries.Clear();
            _approxBytes = 0;
            return n;
        }
    }

    /// <summary>
    /// Stops the eviction loop and disposes the internal CTS. Idempotent.
    /// </summary>
    public void Dispose()
    {
        if (_disposed) return;
        _disposed = true;
        try { _cts.Cancel(); } catch { /* best effort */ }
        // Best-effort join the eviction loop; the loop will observe the cancellation and
        // exit. We bound the wait so a faulted loop doesn't hold up disposal.
        try { _evictionTask.Wait(TimeSpan.FromSeconds(1)); } catch { /* best effort */ }
        _cts.Dispose();
    }

    // ── Eviction internals ───────────────────────────────────────────────────────
    private void EvictLeastRecentlyUsed()
    {
        // Linear scan: acceptable at MaxEntriesPerPlc = 1000 (insert path is far cheaper
        // than the network round-trip the cache is saving). A sorted secondary structure
        // would be a premature optimisation.
        CacheKey lruKey = default;
        long lruTick = long.MaxValue;
        bool found = false;
        foreach (var kvp in _entries)
        {
            if (kvp.Value.LastUsedTick < lruTick)
            {
                lruTick = kvp.Value.LastUsedTick;
                lruKey = kvp.Key;
                found = true;
            }
        }
        if (found && _entries.TryGetValue(lruKey, out var existing))
        {
            _entries.Remove(lruKey);
            _approxBytes -= existing.Length;
        }
    }

    private async Task RunEvictionLoopAsync(CancellationToken ct)
    {
        var period = TimeSpan.FromMilliseconds(_evictionIntervalMs);
        using var timer = new PeriodicTimer(period);
        try
        {
            while (await timer.WaitForNextTickAsync(ct).ConfigureAwait(false))
            {
                SweepExpired();
            }
        }
        catch (OperationCanceledException)
        {
            // Normal disposal.
        }
        catch
        {
            // Defensive: the eviction loop must never fault the host. A swallow here means
            // entries are only evicted on access until disposal, which is correctness-preserving.
        }
    }

    private void SweepExpired()
    {
        DateTimeOffset now = DateTimeOffset.UtcNow;
        lock (_lock)
        {
            if (_entries.Count == 0) return;
            // Two-pass to avoid mutating during enumeration.
            var expired = new List<CacheKey>();
            foreach (var kvp in _entries)
            {
                if (kvp.Value.ExpiresAtUtc <= now)
                    expired.Add(kvp.Key);
            }
            foreach (var k in expired)
            {
                if (_entries.TryGetValue(k, out var existing))
                {
                    _entries.Remove(k);
                    _approxBytes -= existing.Length;
                }
            }
        }
    }
}