Files
wwtools/mbproxy/install/mbproxy.config.template.json
T
Joseph Doherty 892b10baf4 mbproxy/docs: pivot design contract for Phase 11 response cache
Lands the design-contract pivot ahead of any cache implementation code so
reviewers can evaluate the change to the "purely transparent proxy" stance
independently of the Phase-11 code that depends on it.

- docs/design.md: rewrite "What this is" / Read-coalescing / Failure-modes
  sections to acknowledge the opt-in cache; add new "Response cache (Phase
  11)" section covering lookup order (cache -> coalesce -> backend), multi-
  tag range TTL = min, post-rewriter storage, address-range-overlap write
  invalidation, hot-reload PLC-wide flush, no-persistence, AllowLongTtl gate,
  and LRU-bounded capacity. Extend log event table with mbproxy.cache.*
  events. Extend per-PLC status field table with cacheHitCount /
  cacheMissCount / cacheInvalidations / cacheEntryCount / cacheBytes.
  Extend hot-reload propagation table with CacheTtlMs / Cache.* rows.
- docs/kpi.md: graduate Tier 1.8 (response cache) from "requires Phase 11"
  to "shipped in Phase 11" and add Tier 2.4a cache-memory section.
- CLAUDE.md (mbproxy): update Purpose paragraph and the Architecture
  headline bullets to reflect the transparent-by-default + opt-in-cache
  contract; flip "Implementation complete through Phase 10" to "through
  Phase 11".
- install/mbproxy.config.template.json: add a fully-commented Mbproxy.Cache
  block and a CacheTtlMs example on a BcdTags.Global entry, with prominent
  staleness commentary documenting the design contract.

No code changes in this commit - implementation lands in a follow-up.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 02:35:35 -04:00

223 lines
12 KiB
JSON
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
// mbproxy configuration template — copy to %ProgramData%\mbproxy\appsettings.json
// and edit before starting the service.
//
// The .NET configuration loader accepts // and /* */ comments in JSON files
// (JSONC semantics) when using the default Host.CreateApplicationBuilder path.
//
// IMPORTANT: This file is overwritten on each install ONLY if no appsettings.json
// already exists at the destination. An existing file is always preserved.
{
"Mbproxy": {
// ── Global BCD tag list ─────────────────────────────────────────────────────────────
// These tags apply to EVERY PLC by default.
// Each entry: Address (Modbus PDU address, decimal), Width (16 or 32 bits).
//
// Width 16 — one register holds 4 BCD digits (09999).
// Wire value 0x1234 decodes to decimal 1234.
//
// Width 32 — a CDAB-ordered register pair (Address = low word, Address+1 = high word).
// Decoded decimal = high * 10000 + low (DirectLOGIC CDAB word order).
//
// Per-PLC overrides (see Plcs[].BcdTags below):
// Add — appends extra tags beyond what Global defines, or overrides a
// Global entry's Width when the same Address appears in both.
// Remove — removes specific addresses from the effective set for that PLC.
// Effective set = (Global Add) Remove, resolved per PDU.
"BcdTags": {
"Global": [
// V2000 (octal) = decimal address 1024. 16-bit BCD counter.
{ "Address": 1024, "Width": 16 },
// V2040 (octal) = decimal address 1056. 32-bit BCD total at 1056/1057.
{ "Address": 1056, "Width": 32 },
// V2100 (octal) = decimal address 1088. 16-bit BCD setpoint.
//
// Phase 11: CacheTtlMs (optional) opts this tag into the response cache. With
// CacheTtlMs > 0 set, upstream clients reading this register will see values up
// to CacheTtlMs MILLISECONDS OLD — explicit acknowledgement of the staleness
// window is required by enabling it. Default (omitted or 0) = cache disabled
// for this tag. The cache is OFF by default for every tag.
{ "Address": 1088, "Width": 16 /* , "CacheTtlMs": 1000 */ }
]
},
// ── PLC list ────────────────────────────────────────────────────────────────────────
// Each entry maps one upstream proxy port → one backend PLC.
// Upstream clients connect to ListenPort; the proxy forwards to Host:Port.
//
// IMPORTANT: H2-ECOM100 modules accept at most 4 simultaneous TCP connections.
// With the 1:1 upstream↔backend model, a fifth upstream client to the same proxy
// port will cause a backend connect failure and an immediate upstream disconnect.
"Plcs": [
{
"Name": "Line1-Mixer", // Human-readable name (shown on status page and in logs)
"ListenPort": 5020, // Port the proxy listens on (upstream clients connect here)
"Host": "10.0.1.1", // PLC IP address or hostname
"Port": 502, // PLC Modbus TCP port (almost always 502)
"BcdTags": {
// Additional 32-bit tag specific to this PLC only.
"Add": [
{ "Address": 1200, "Width": 32 }
],
// Remove address 1056 from the Global list for this PLC
// (this mixer doesn't use the 32-bit BCD total).
"Remove": [ 1056 ]
}
},
{
"Name": "Line1-Conveyor",
"ListenPort": 5021,
"Host": "10.0.1.2",
"Port": 502
// No BcdTags override — uses the Global set as-is.
}
// Add one entry per PLC. Ports must be unique per host. Typical fleet: 54 PLCs.
],
// ── Admin port ──────────────────────────────────────────────────────────────────────
// Read-only HTTP status page.
// GET / → self-contained HTML (auto-refreshes every 5 s)
// GET /status.json → same data as JSON for monitoring scrapers
//
// Authentication is assumed at the network layer (trusted internal segment).
// Set to 0 to disable the admin endpoint.
"AdminPort": 8080,
// ── Connection timeouts ─────────────────────────────────────────────────────────────
"Connection": {
// Max time (ms) to wait for a TCP connect to the PLC backend.
// Each Polly retry attempt gets its own copy of this timeout.
"BackendConnectTimeoutMs": 3000,
// Max time (ms) to wait for the PLC to respond to a forwarded PDU.
// Non-idempotent FC06/FC16 writes are one-shot — the upstream client
// is disconnected immediately on timeout (no retry).
"BackendRequestTimeoutMs": 3000,
// Max time (ms) to wait for in-flight PDUs to complete during graceful shutdown
// (sc.exe stop / Windows Service stop signal). After this deadline the coordinator
// cancels remaining work and proceeds. Keep at or below the SCM wait-hint (30 s).
"GracefulShutdownTimeoutMs": 10000
},
// ── Resilience policies ─────────────────────────────────────────────────────────────
"Resilience": {
// Polly retry policy for backend TCP connect attempts.
// MaxAttempts: total connect tries (including the first).
// BackoffMs: delay between each attempt (must have MaxAttempts1 entries).
"BackendConnect": {
"MaxAttempts": 3,
"BackoffMs": [ 100, 500, 2000 ]
},
// Polly recovery policy for listener bind failures.
// If a PLC's listen port can't be bound (in-use, bad IP, transient OS error),
// the supervisor retries according to this schedule.
// InitialBackoffMs: backoff per step (first N retries).
// SteadyStateMs: backoff for all subsequent retries (runs indefinitely).
"ListenerRecovery": {
"InitialBackoffMs": [ 1000, 2000, 5000, 15000, 30000 ],
"SteadyStateMs": 30000
},
// Phase 10 — in-flight read coalescing.
//
// When two or more upstream clients (HMI / historian / engineering workstation /
// gateway) issue the SAME FC03 or FC04 read while a matching backend round-trip is
// already in flight, the proxy attaches the late arrivals to the existing in-flight
// entry and fans the single PLC response out to every attached client — saving the
// ECOM's per-scan PDU budget on duplicated reads.
//
// Zero post-response staleness: coalescing operates ONLY between "first request
// sent to PLC" and "response received from PLC" (microseconds to ~10 ms typical).
// Each upstream client still sees its own MBAP transaction ID echoed correctly;
// the proxy is transparent.
//
// FC06 / FC16 writes are NEVER coalesced (non-idempotent). FC03 vs FC04 are
// separate Modbus tables and never share a coalescing key. Different unit IDs
// (multi-drop / gateway-backed setups) never coalesce.
//
// Enabled — master switch. Hot-reloadable; flipping to false leaves running
// coalesced entries to drain naturally.
// MaxParties — per-entry cap on attached parties. Past the cap, the next
// identical request opens a fresh backend round-trip (load-shedding
// safety valve for very fan-out-heavy fleets).
"ReadCoalescing": {
"Enabled": true,
"MaxParties": 32
}
},
// ── Response cache (Phase 11) — opt-in bounded-staleness cache ──────────────────
//
// ⚠ DESIGN-CONTRACT PIVOT: with caching enabled the proxy is no longer purely
// transparent. Upstream FC03/FC04 reads for cache-enabled tags may return values
// up to CacheTtlMs MILLISECONDS OLD. Operators opt tags in by setting a non-zero
// CacheTtlMs on a BcdTagOptions entry (or DefaultCacheTtlMs on a PlcOptions entry).
//
// The cache is OFF BY DEFAULT for every tag. A deployment with NO TTL config (this
// section entirely absent and no BcdTags.*.CacheTtlMs / Plcs[i].DefaultCacheTtlMs)
// behaves IDENTICALLY to a pre-Phase-11 deployment — no behaviour change.
//
// AllowLongTtl — gate for any CacheTtlMs > 60_000. Reload validation
// rejects configs that exceed 60 s without this opt-in,
// to prevent accidentally-stale-for-an-hour deployments.
// MaxEntriesPerPlc — LRU cap per-PLC. Past this cap, the next insert evicts
// the least-recently-used entry. Defaults to 1000.
// EvictionIntervalMs — background eviction tick. Scans each PLC's cache and
// removes entries past their TTL. Defaults to 5000.
//
// Properties (full text in docs/design.md → "Response cache"):
// * Cache hits SHORT-CIRCUIT coalescing entirely (cache → coalesce → backend).
// * Successful FC06/FC16 write responses invalidate every cached FC03/FC04 entry
// whose address range OVERLAPS the write — not just exact-key match.
// * Multi-tag read range: effective TTL = min(TTLs). Any tag with TTL=0 in the
// range disables caching for the whole read.
// * Cache stores POST-rewriter bytes; hits never re-invoke the BCD rewriter.
// * Tag-list hot-reload flushes the affected PLC's whole cache.
// * No persistence — process restart wipes the cache.
"Cache": {
"AllowLongTtl": false,
"MaxEntriesPerPlc": 1000,
"EvictionIntervalMs": 5000
}
},
// ── Serilog ─────────────────────────────────────────────────────────────────────────────
// Structured log output. Default: Information level, rolling-file under ProgramData.
// The EventLogBridge writes Error+ events to the Windows Application Event Log
// automatically when the service runs under the SCM (not under dotnet run).
"Serilog": {
"Using": [ "Serilog.Sinks.Console", "Serilog.Sinks.File" ],
"MinimumLevel": {
"Default": "Information",
"Override": {
"Microsoft": "Warning",
"System": "Warning"
}
},
"WriteTo": [
{
"Name": "Console",
"Args": {
"outputTemplate": "[{Timestamp:HH:mm:ss} {Level:u3}] {Message:lj} {Properties:j}{NewLine}{Exception}"
}
},
{
"Name": "File",
"Args": {
// Rolling log: one file per day, kept for 30 days.
// Survives uninstall — logs are archived to %ProgramData%\mbproxy.archived-<ts>\.
"path": "C:\\ProgramData\\mbproxy\\logs\\mbproxy-.log",
"rollingInterval": "Day",
"retainedFileCountLimit": 30,
"outputTemplate": "[{Timestamp:yyyy-MM-dd HH:mm:ss.fff zzz} {Level:u3}] {Message:lj} {Properties:j}{NewLine}{Exception}"
}
}
]
}
}