mbproxy: cross-platform support — Linux/systemd alongside Windows

Make the service build, run, and install on Linux as a first-class
target while keeping the Windows Service + Event Log behaviour intact.

- Build: drop the hardcoded win-x64 RID — single-file publish now works
  for any RID. publish.ps1 gains -Rid; new publish.sh for Linux hosts.
- Diagnostics: DiagnosticSinkSelector picks the Error+ sink per host —
  Windows Event Log under the SCM, local syslog under systemd
  (Serilog.Sinks.SyslogMessages), none for interactive runs. The
  EventLog truncation helper is extracted so it is testable cross-OS.
- Host: Program.cs registers AddSystemd() alongside AddWindowsService().
- Config: a RID-conditioned appsettings template ships Windows or Unix
  paths; both templates are schema-validated by a test.
- Install: systemd unit (Type=exec) plus install.sh / uninstall.sh.
  Also fixes two cross-platform bugs found while testing: install.ps1
  and uninstall.ps1 used New-EventLog / Remove-EventLog (absent in
  PowerShell 7), and the E2E sim launcher hardcoded Windows venv paths.
- Docs updated across README, CLAUDE.md, and docs/ for dual-platform.

413 tests pass on Windows; 374 (all non-simulator) on Linux.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Joseph Doherty
2026-05-15 09:41:59 -04:00
parent 0868613890
commit b330faff03
29 changed files with 1805 additions and 106 deletions
+1 -1
@@ -23,7 +23,7 @@ When in doubt about where content belongs, default to pushing it deeper. `DOCS-G
- [`graccesscli/`](graccesscli/README.md) — `.NET Framework 4.8 / x86` CliFx-based CLI for automating Galaxy configuration through the ArchestrA GRAccess COM interop.
- [`grdb/`](grdb/README.md) — SQL/DDL exploration of the Galaxy Repository SQL database (queries, schema, hierarchy/tag-name translation).
- [`histdb/`](histdb/README.md) — LLM-oriented reference for AVEVA Historian retrieval (extension tables, `wwXxx` time-domain extensions, retrieval modes/options, alarm-event SQL, REST API). Distilled from the official Historian Retrieval Guide.
- [`mbproxy/`](mbproxy/README.md) — `.NET 10` Windows Service that proxies Modbus TCP for a fleet of ~54 DL205/DL260 PLCs: inline bidirectional BCD rewriting, single-backend-conn TxId multiplexing (lifts the H2-ECOM100 4-client cap), in-flight read coalescing, and opt-in per-tag response caching.
- [`mbproxy/`](mbproxy/README.md) — `.NET 10` background service (Windows Service or Linux systemd unit) that proxies Modbus TCP for a fleet of ~54 DL205/DL260 PLCs: inline bidirectional BCD rewriting, single-backend-conn TxId multiplexing (lifts the H2-ECOM100 4-client cap), in-flight read coalescing, and opt-in per-tag response caching.
- [`mxaccesscli/`](mxaccesscli/README.md) — `.NET Framework 4.8 / x86` CliFx-based CLI for reading, writing, and subscribing to System Platform tags via the **MxAccess** COM proxy (`LMXProxyServerClass`).
- [`secrets/`](secrets/README.md) — Self-hosted Infisical CLI + `secret` PowerShell helper for fetching credentials from `https://infisical.dohertylan.com` instead of inlining plaintext.
+9 -7
@@ -4,7 +4,7 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
## What this is
`mbproxy` is a **C# .NET 10** background service (Windows Service) that sits **inline as a Modbus TCP proxy** in front of a fleet of **~54 AutomationDirect DirectLOGIC DL205 / DL260** equipment controllers. It is pre-configured with two pieces of static data:
`mbproxy` is a **C# .NET 10** background service (a **Windows Service** or a **Linux systemd unit**) that sits **inline as a Modbus TCP proxy** in front of a fleet of **~54 AutomationDirect DirectLOGIC DL205 / DL260** equipment controllers. It is pre-configured with two pieces of static data:
1. **A list of BCD tags** — the holding/input registers (by Modbus address and bit width) that the controllers store in DirectLOGIC's native BCD encoding (`V2000 = 1234` is stored on the wire as `0x1234`, *not* `0x04D2`).
2. **A list of equipment controller IP addresses** (~54 entries) for the DL205/DL260 fleet. Each controller speaks Modbus TCP on port 502 via either the built-in DL260 Ethernet port or an H2-ECOM100 / H2-EBC100 coprocessor.
@@ -31,19 +31,20 @@ The full architecture is documented under **[`docs/`](docs/)** — see the `Arch
- **`appsettings.json` is hot-reloadable** via `IOptionsMonitor<MbproxyOptions>`; tag-list changes propagate per-PDU, PLC add/remove flows through the supervisor. A tag-list reload flushes the affected PLC's response cache (per-tag granularity intentionally not done in v1).
- **Polly bounded retries** on backend connect (3 attempts at 100ms / 500ms / 2000ms). No retries on mid-request failures (FC06/FC16 are non-idempotent on BCD tags). A per-request watchdog in the multiplexer surfaces Modbus exception 0x0B to the upstream client if a backend response never arrives within `BackendRequestTimeoutMs`.
- **Backend disconnect cascades upstream**: when the shared backend socket dies, every attached upstream pipe is closed in the same cycle (counter `BackendDisconnectCascades`); clients reconnect on their next request.
- **Keepalive / connection monitoring** (ON by default, `Connection.Keepalive`): OS `SO_KEEPALIVE` on backend and accepted upstream sockets, plus a per-PLC application heartbeat — a synthetic FC03 qty=1 read fired on an idle backend socket (`BackendHeartbeatIdleMs`). An unanswered heartbeat proactively tears the backend down (counters `backendHeartbeatsSent/Failed`, `backendIdleDisconnects`). The DL260 has no FC08, so the probe is a real register read. See [`docs/Architecture/Keepalive.md`](docs/Architecture/Keepalive.md).
- **Read-only Kestrel admin port** (default 8080) exposes `GET /` (auto-refreshing HTML) and `GET /status.json` with service-wide and per-PLC counters (including Phase-9 mux fields, Phase-10 coalescing fields, and Phase-11 cache fields `cacheHitCount`, `cacheMissCount`, `cacheInvalidations`, `cacheEntryCount`, `cacheBytes`).
Anything beyond this short list lives in the `docs/` tree: the appsettings.json schema in [`docs/Operations/Configuration.md`](docs/Operations/Configuration.md), config propagation in [`docs/Features/HotReload.md`](docs/Features/HotReload.md), stable log event names in [`docs/Reference/LogEvents.md`](docs/Reference/LogEvents.md), the status counter catalog in [`docs/Operations/StatusPage.md`](docs/Operations/StatusPage.md), and the simulator-backed test fixture in [`docs/Testing/Simulator.md`](docs/Testing/Simulator.md). Open the relevant page before writing code; keep it in sync when decisions change.
## Current state
**Implementation complete through Phase 11.** Phases 00–08 shipped the production-ready 1:1-model service; Phase 9 swapped the connection layer for the TxId-multiplexed model; Phase 10 added in-flight read coalescing on top; Phase 11 added an opt-in per-tag response cache (bounded staleness, OFF by default; see [`docs/Architecture/ResponseCache.md`](docs/Architecture/ResponseCache.md)). The service is production-ready as a Windows Service:
**Implementation complete through Phase 11.** Phases 00–08 shipped the production-ready 1:1-model service; Phase 9 swapped the connection layer for the TxId-multiplexed model; Phase 10 added in-flight read coalescing on top; Phase 11 added an opt-in per-tag response cache (bounded staleness, OFF by default; see [`docs/Architecture/ResponseCache.md`](docs/Architecture/ResponseCache.md)). The service is production-ready as a **Windows Service or a Linux systemd unit**:
- Test count grew through Phase 11 (see `tests/Mbproxy.Tests/` for the current suite; previous baseline was 325 = 282 unit + 43 E2E).
- Single-file self-contained publish (`dotnet publish -c Release -r win-x64`).
- PowerShell install/uninstall scripts under `install/`.
- Graceful shutdown with configurable drain timeout (`Connection.GracefulShutdownTimeoutMs`, default 10 s).
- Windows Event Log integration (Error+ events when running as a service).
- Single-file self-contained publish for `win-x64` **and** `linux-x64` (`dotnet publish -c Release -r <rid>`) — the RID is supplied per publish, never hardcoded in the csproj.
- Install/uninstall scripts under `install/`: PowerShell (`install.ps1` / `uninstall.ps1`) for the Windows Service; shell (`install.sh` / `uninstall.sh` + the `mbproxy.service` unit) for systemd.
- Graceful shutdown with configurable drain timeout (`Connection.GracefulShutdownTimeoutMs`, default 10 s) — driven by the Windows SCM stop signal or POSIX `SIGTERM`.
- Platform diagnostic sink for Error+ events, chosen once at the composition root by `DiagnosticSinkSelector`: Windows Application Event Log under the SCM, local syslog under systemd, none for interactive/dev runs. The systemd unit is `Type=exec` (not `notify`).
- Read-only HTTP status page at `AdminPort` (default 8080) — surfaces Phase-9 mux fields alongside Phase-7 counters.
- `connectsSuccess` / `connectsFailed` counters wired in `PlcMultiplexer`.
- Phase 9 per-request watchdog defends against any backend that drops or mis-echoes a response (real-world packet loss; pymodbus 3.13 simulator's concurrent-multiplexed-request bug).
@@ -63,7 +64,7 @@ The DL205/DL260 family is *almost* Modbus-spec-compliant, but every category bel
- **Octal V-memory ↔ decimal Modbus translation.** `V2000` octal = decimal 1024 = Modbus PDU `0x0400`. Config addresses are PDU-decimal, **not** octal V-memory and **not** 1-based 4xxxx.
- **FC03/FC04 max qty = 128** (above spec's 125). **FC16 max qty = 100** (below spec's 123). The proxy passes these through; the PLC enforces the cap with exception 03.
- **Max 4 concurrent TCP clients per ECOM100.** This is why the proxy uses a single TxId-multiplexed backend socket per PLC — see [`docs/Architecture/ConnectionModel.md`](docs/Architecture/ConnectionModel.md) for how the connection model lifts this cap.
- **No TCP keepalive from the device.** Middleboxes typically drop idle sockets at 2–5 min. With the 1:1 model, backend liveness tracks upstream client liveness; if both are idle long enough, the path dies on its own and the next request reconnects.
- **No TCP keepalive from the device.** Middleboxes typically drop idle sockets at 2–5 min. The proxy compensates with its own keepalive: `SO_KEEPALIVE` on every socket plus an idle backend FC03 heartbeat (see the Architecture summary and [`docs/Architecture/Keepalive.md`](docs/Architecture/Keepalive.md)).
- **Register 0 is valid** on DL205/DL260 in factory "absolute" addressing mode — don't probe-skip it.
- **As-deployed PLC parameters** (captured in `docs/Reference/mbtcp_settings.JPG`): port 502, "Use Concept data structures (Longs/Reals)" enabled, "Swap bytes" enabled, "Use Zero Based Addressing" **unchecked**, Register type = Binary, max coil read 1976 / coil write 800 / register read 122 / register write 100. The proxy must speak Modbus as-is; these settings describe the wire it'll see.
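The octal-to-PDU translation in the list above can be sketched as follows. This is an illustrative helper, not code from the repository: the names `vmem_to_pdu` and `pdu_to_4x` are hypothetical, and the sketch assumes a plain octal-to-decimal mapping with no offset, which is what the `V2000` example implies.

```python
def vmem_to_pdu(vmem: str) -> int:
    """Convert a DirectLOGIC V-memory label such as 'V2000' to a PDU-decimal address.

    Hypothetical helper: assumes the label's digits are octal and map
    directly to the Modbus PDU address (V2000 octal -> 1024 -> 0x0400).
    """
    digits = vmem.upper().removeprefix("V")
    address = int(digits, 8)  # raises ValueError on non-octal digits such as 8 or 9
    if not 0 <= address <= 0xFFFF:
        raise ValueError(f"{vmem} is outside the 16-bit PDU address range")
    return address


def pdu_to_4x(pdu_address: int) -> int:
    """Conventional 1-based holding-register reference, shown only for contrast."""
    return 40001 + pdu_address


if __name__ == "__main__":
    pdu = vmem_to_pdu("V2000")
    print(pdu, hex(pdu), pdu_to_4x(pdu))
```

The value to put in the config is the PDU-decimal one (`1024` here), not the octal label and not the 1-based `4xxxx` reference.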
@@ -73,6 +74,7 @@ The DL205/DL260 family is *almost* Modbus-spec-compliant, but every category bel
| --- | --- |
| Architecture — listener topology, request flow, per-PLC isolation | [`docs/Architecture/Overview.md`](docs/Architecture/Overview.md) |
| Connection model — single backend socket per PLC, TxId multiplexing, request-timeout watchdog, disconnect cascade | [`docs/Architecture/ConnectionModel.md`](docs/Architecture/ConnectionModel.md) |
| Keepalive / connection monitoring — TCP `SO_KEEPALIVE` + backend FC03 heartbeat | [`docs/Architecture/Keepalive.md`](docs/Architecture/Keepalive.md) |
| In-flight read coalescing / opt-in response cache | [`docs/Architecture/ReadCoalescing.md`](docs/Architecture/ReadCoalescing.md), [`docs/Architecture/ResponseCache.md`](docs/Architecture/ResponseCache.md) |
| BCD rewriting (codec, CDAB word order, FC03/04/06/16 scope) and config hot-reload | [`docs/Features/BcdRewriting.md`](docs/Features/BcdRewriting.md), [`docs/Features/HotReload.md`](docs/Features/HotReload.md) |
| Operations — full appsettings.json reference, status page / JSON fields, troubleshooting playbook | [`docs/Operations/Configuration.md`](docs/Operations/Configuration.md), [`docs/Operations/StatusPage.md`](docs/Operations/StatusPage.md), [`docs/Operations/Troubleshooting.md`](docs/Operations/Troubleshooting.md) |
+37 -20
@@ -1,14 +1,14 @@
# mbproxy
A .NET 10 Windows Service that sits inline as a Modbus TCP proxy in front of a fleet of AutomationDirect DirectLOGIC DL205/DL260 controllers, rewriting BCD-encoded registers bidirectionally so upstream clients can read and write them as plain integers. The proxy also offers an opt-in per-tag response cache (default OFF) for FC03/FC04 reads with bounded operator-configured staleness — see [`docs/Architecture/ResponseCache.md`](docs/Architecture/ResponseCache.md) before enabling it.
A .NET 10 background service (a **Windows Service** or a **Linux systemd unit**) that sits inline as a Modbus TCP proxy in front of a fleet of AutomationDirect DirectLOGIC DL205/DL260 controllers, rewriting BCD-encoded registers bidirectionally so upstream clients can read and write them as plain integers. The proxy also offers an opt-in per-tag response cache (default OFF) for FC03/FC04 reads with bounded operator-configured staleness; see [`docs/Architecture/ResponseCache.md`](docs/Architecture/ResponseCache.md) before enabling it.
> ⚠ **32-bit BCD wire format is "two base-10000 digits in CDAB", not standard CDAB binary Int32.** A 32-bit BCD tag at address `A` decodes as `decimal = high * 10_000 + low` where `low` is the register at `A` and `high` is the register at `A+1`. Each word independently must be 0–9999. Standard Modbus clients (NModbus, FluentModbus, Wonderware DAServer) that interpret CDAB as straight binary Int32 will silently corrupt any value > 9999 on writes and read garbage on reads. Configure your client to send/receive each register as a separate base-10000 BCD digit pair, not as a single binary Int32. Full details in [`docs/Features/BcdRewriting.md`](docs/Features/BcdRewriting.md).
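A minimal client-side sketch of the rule in the warning above, assuming the two registers arrive as plain integers in the range 0 to 9999. The function names are hypothetical and are not part of mbproxy or any Modbus library; `naive_cdab_int32` is included only to show the misreading the warning describes.

```python
def decode_bcd32(low: int, high: int) -> int:
    """Combine two base-10000 words: low is the register at A, high is at A+1."""
    for name, word in (("low", low), ("high", high)):
        if not 0 <= word <= 9999:
            raise ValueError(f"{name} word {word} is outside 0..9999")
    return high * 10_000 + low


def encode_bcd32(value: int) -> tuple[int, int]:
    """Split a value into (low, high) base-10000 words, in register order A, A+1."""
    if not 0 <= value <= 99_999_999:
        raise ValueError(f"{value} does not fit in two base-10000 words")
    high, low = divmod(value, 10_000)
    return low, high


def naive_cdab_int32(low: int, high: int) -> int:
    """What a client computes if it treats the same pair as a binary Int32."""
    return (high << 16) | low


if __name__ == "__main__":
    low, high = encode_bcd32(12_345_678)
    print(low, high, decode_bcd32(low, high))
    print(naive_cdab_int32(low, high))  # differs from 12345678
```

For values up to 9999 the high word is zero and both interpretations agree, which is why the mismatch tends to go unnoticed until a larger value is written.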
## Hard constraints / prerequisites
- **Windows 10 / Server 2019 or later, 64-bit.** No Linux or Docker support — the service uses `Microsoft.Extensions.Hosting.WindowsServices` and the Windows Event Log.
- **Windows (10 / Server 2019+) or Linux (any systemd distro), 64-bit.** Ships as a Windows Service (Application Event Log integration) or a systemd unit (syslog integration); builds single-file for `win-x64` and `linux-x64`. macOS is not a deployment target — it runs only as a foreground console process.
- **Modbus TCP backends reachable** from the proxy host on port 502 (or the port configured per PLC). The H2-ECOM100 module caps simultaneous connections at **4 per PLC** — a fifth upstream client will fail to connect.
- **Admin rights** to install the service (`install.ps1` requires elevation).
- **Admin / root rights** to install the service (`install.ps1` requires elevation; `install.sh` requires root).
- **No COM dependency** — this is a pure .NET 10 socket-level proxy (unlike the `.NET Framework 4.8 / x86` siblings in this repo).
- **Python 3.10+** on the test machine to run the pymodbus-backed E2E simulator (not needed to run the service in production).
@@ -16,8 +16,8 @@ A .NET 10 Windows Service that sits inline as a Modbus TCP proxy in front of a f
```
src/Mbproxy/ Main C# project (net10.0, Microsoft.NET.Sdk.Worker)
tests/Mbproxy.Tests/ xUnit v3 test project (314 unit + 48 E2E tests)
install/ PowerShell install/uninstall scripts and config template
tests/Mbproxy.Tests/ xUnit v3 test project (unit + simulator-backed E2E tests)
install/ Install/uninstall + publish scripts (PowerShell + shell), systemd unit, config templates
docs/ Architecture, features, operations, reference, and testing docs
```
@@ -40,6 +40,7 @@ The `docs/` tree is organized by topic. Start with [`Architecture/Overview.md`](
- [`Architecture/ConnectionModel.md`](docs/Architecture/ConnectionModel.md) — Single backend connection per PLC, TxId multiplexing, request-timeout watchdog, disconnect cascade.
- [`Architecture/ReadCoalescing.md`](docs/Architecture/ReadCoalescing.md) — In-flight FC03/FC04 deduplication via `InFlightByKeyMap`.
- [`Architecture/ResponseCache.md`](docs/Architecture/ResponseCache.md) — Opt-in per-tag response cache with bounded operator-configured staleness.
- [`Architecture/Keepalive.md`](docs/Architecture/Keepalive.md) — TCP `SO_KEEPALIVE` on every socket plus an idle-backend FC03 heartbeat.
### Features
@@ -54,7 +55,7 @@ The `docs/` tree is organized by topic. Start with [`Architecture/Overview.md`](
### Reference
- [`Reference/LogEvents.md`](docs/Reference/LogEvents.md) — Stable `mbproxy.*` event catalog (28 events across 7 categories).
- [`Reference/LogEvents.md`](docs/Reference/LogEvents.md) — Stable `mbproxy.*` event catalog (31 events across 8 categories).
### Testing
@@ -68,20 +69,27 @@ The `docs/` tree is organized by topic. Start with [`Architecture/Overview.md`](
dotnet build Mbproxy.slnx -c Debug
```
**Publish (Release, single-file, win-x64):**
**Publish (Release, single-file):**
```powershell
.\install\publish.ps1 -Clean
.\install\publish.ps1 -Clean # win-x64 (default)
.\install\publish.ps1 -Rid linux-x64 -Clean # cross-publish for linux-x64
```
Produces both flavours under `publish-out\`:
On a Linux build host, use the shell counterpart:
| Flavour | Path | Size | Target prerequisite |
```bash
./install/publish.sh --clean # linux-x64 (default)
```
Each run produces both flavours under `publish-out\`:
| Flavour | Path (win-x64) | Size | Target prerequisite |
|---|---|---|---|
| Self-contained | `publish-out\self-contained\Mbproxy.exe` | ~100 MB | None — bundles .NET 10 + ASP.NET Core runtime |
| Framework-dependent | `publish-out\framework-dependent\Mbproxy.exe` | ~1.5 MB | .NET 10 + ASP.NET Core preinstalled |
| Framework-dependent | `publish-out\framework-dependent\Mbproxy.exe` | ~1.6 MB | .NET 10 + ASP.NET Core preinstalled |
Pass `-OutputDir <path>` to publish elsewhere; omit `-Clean` to skip the wipe. The script wraps `dotnet publish src/Mbproxy/Mbproxy.csproj -c Release -r win-x64 [-p:SelfContained=false]` — run those directly if you only need one flavour.
On `linux-x64` the binary is `Mbproxy` (no extension) and ships the Linux config template. Pass `-OutputDir`/`-o` to publish elsewhere; omit `-Clean`/`--clean` to skip the wipe. The scripts wrap `dotnet publish src/Mbproxy/Mbproxy.csproj -c Release -r <rid> [-p:SelfContained=false]` — run that directly if you only need one flavour.
**Run tests:**
@@ -102,21 +110,30 @@ Edit `src/Mbproxy/appsettings.json` to configure PLCs before running. The admin
## Install
The `install/` directory holds the publish, install, and uninstall scripts. Quick path:
The `install/` directory holds the publish, install, and uninstall scripts for both platforms.
**Windows** — elevated PowerShell:
```powershell
# 1. Publish (produces publish-out\self-contained\ and publish-out\framework-dependent\)
.\install\publish.ps1 -Clean
# 2. Install (elevated PowerShell) — point at the flavour you want to deploy
.\install\install.ps1 -PublishOutput .\publish-out\self-contained -Start
# 3. Edit the config that was placed at %ProgramData%\mbproxy\appsettings.json
# 4. Verify
# Config is placed at %ProgramData%\mbproxy\appsettings.json — edit it, then:
# Restart-Service mbproxy
Invoke-WebRequest http://localhost:8080/ -UseBasicParsing
```
**Linux** — root / `sudo` on a systemd host:
```bash
./install/publish.sh --clean
sudo ./install/install.sh --publish-dir ./publish-out/self-contained
# Config is placed at /etc/mbproxy/appsettings.json — edit it, then:
# sudo systemctl restart mbproxy
curl http://localhost:8080/
```
`uninstall.ps1` / `uninstall.sh` reverse the install; both archive log files rather than deleting them. The systemd unit runs mbproxy as `Type=exec` under a dedicated `mbproxy` service account.
## Maintenance
Documentation doctrine for this repo: [`../DOCS-GUIDE.md`](../DOCS-GUIDE.md).
+1 -1
@@ -6,7 +6,7 @@ This document is the entry point for readers new to the codebase. It sketches th
## Runtime Shape
The process is a single .NET 10 Generic Host worker. `Microsoft.Extensions.Hosting.WindowsServices` registers the host as a Windows Service so the same binary runs interactively (for development) or under the SCM (in production). All configuration binds from `appsettings.json` through `IOptionsMonitor<MbproxyOptions>`, which makes the tag list and PLC roster hot-reloadable without process restart. `ProxyWorker` is the long-lived `BackgroundService` that owns startup, shutdown, and the listener supervisors for every PLC. A small Kestrel admin endpoint runs in the same process to serve the read-only status page.
The process is a single .NET 10 Generic Host worker. It registers both `Microsoft.Extensions.Hosting.WindowsServices` and `Microsoft.Extensions.Hosting.Systemd` — each a no-op off its own init system — so the same binary runs interactively (for development), as a Windows Service under the SCM, or as a Linux systemd unit. All configuration binds from `appsettings.json` through `IOptionsMonitor<MbproxyOptions>`, which makes the tag list and PLC roster hot-reloadable without process restart. `ProxyWorker` is the long-lived `BackgroundService` that owns startup, shutdown, and the listener supervisors for every PLC. A small Kestrel admin endpoint runs in the same process to serve the read-only status page.
There is no in-process database, no message broker, and no persistent cache file: state is per-PLC, in-memory, and ephemeral. Restarting the service drops every in-flight request and every cached response. Upstream clients are expected to reconnect and reissue; the proxy never replays a request on their behalf.
+1 -1
@@ -6,7 +6,7 @@ A save to `appsettings.json` propagates to a running `mbproxy` without restartin
`Microsoft.Extensions.Configuration` loads `appsettings.json` with `reloadOnChange: true`. Every consumer reads its options through `IOptionsMonitor<MbproxyOptions>` instead of capturing a one-shot `IOptions<T>` snapshot at construction. When the framework's `FileSystemWatcher` sees the file change, it re-parses the JSON, re-binds the option tree, and notifies subscribers through `IOptionsMonitor.OnChange`.
The chosen mechanism is deliberate. There is no custom file watcher, no IPC channel, no admin-port mutation endpoint, and no SIGHUP-style trigger. An operator edits the file in place (or a deployment tool atomically rewrites it) and the running service catches up. The reload contract is identical whether the service is running interactively or as a Windows Service under the SCM.
The chosen mechanism is deliberate. There is no custom file watcher, no IPC channel, no admin-port mutation endpoint, and no SIGHUP-style trigger. An operator edits the file in place (or a deployment tool atomically rewrites it) and the running service catches up. The reload contract is identical whether the service is running interactively, as a Windows Service under the SCM, or as a Linux systemd unit.
The `OnChange` callback can fire multiple times for a single logical save because text editors on Windows commonly use a rename-and-replace pattern that produces two or three `FileSystemWatcher` events. The reconciler debounces these inside its own background loop with a 250 ms quiescent window so a single save produces a single apply.
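The debounce described above can be modelled as a pure function over event timestamps. This is only a sketch of the idea; the reconciler's actual loop is not shown in this section, and `debounce` and `QUIET_WINDOW_MS` are hypothetical names.

```python
QUIET_WINDOW_MS = 250  # quiescent window quoted in the text above


def debounce(event_times_ms, quiet_window_ms=QUIET_WINDOW_MS):
    """Return the times at which an apply would run.

    Events closer together than quiet_window_ms belong to one burst; each
    burst produces a single apply, quiet_window_ms after its last event.
    """
    applies = []
    last = None
    for t in sorted(event_times_ms):
        if last is not None and t - last >= quiet_window_ms:
            # The previous burst went quiet before this event arrived.
            applies.append(last + quiet_window_ms)
        last = t
    if last is not None:
        applies.append(last + quiet_window_ms)
    return applies


if __name__ == "__main__":
    # Three watcher events from one editor save collapse into one apply.
    print(debounce([0, 40, 90]))
    # A second save a second later gets its own apply.
    print(debounce([0, 40, 1000]))
```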
+31 -4
@@ -7,8 +7,11 @@
The configuration loader resolves `appsettings.json` relative to the executable.
- **Development run** (`dotnet run`): `src/Mbproxy/appsettings.json` next to the build output.
- **Single-file publish** (`dotnet publish -c Release -r win-x64`): `appsettings.json` next to `Mbproxy.exe` in the publish folder.
- **Installed as a Windows Service**: `%ProgramData%\mbproxy\appsettings.json`. The install script copies the template at `install/mbproxy.config.template.json` to this path the first time only — an existing file is preserved across reinstalls.
- **Single-file publish** (`dotnet publish -c Release -r <rid>`): `appsettings.json` next to the published binary. A `win-x64` publish ships `install/mbproxy.config.template.json`; a `linux-x64` publish ships `install/mbproxy.linux.config.template.json` (same keys, Unix log path) — each linked into the bundle as `appsettings.json`.
- **Installed as a Windows Service**: `%ProgramData%\mbproxy\appsettings.json`, seeded by `install.ps1` from `mbproxy.config.template.json`.
- **Installed as a systemd unit**: `/etc/mbproxy/appsettings.json` (the unit's `WorkingDirectory`), seeded by `install.sh` from the Linux template.
In both installed cases the install script copies the template only when no config already exists — an existing file is preserved across reinstalls.
The file is loaded with `reloadOnChange: true`. All consumers read through `IOptionsMonitor<MbproxyOptions>`, so a save propagates without restarting the service. See [`../Features/HotReload.md`](../Features/HotReload.md) for per-key propagation semantics.
@@ -51,11 +54,19 @@ Every supported key under `Mbproxy:*`, populated to a representative default:
// Read-only HTTP status page. Set to 0 to disable.
"AdminPort": 8080,
// Backend connection / request / shutdown timeouts.
// Backend connection / request / shutdown timeouts and keepalive.
"Connection": {
"BackendConnectTimeoutMs": 3000,
"BackendRequestTimeoutMs": 3000,
"GracefulShutdownTimeoutMs": 10000
"GracefulShutdownTimeoutMs": 10000,
"Keepalive": {
"Enabled": true,
"TcpIdleTimeMs": 30000,
"TcpProbeIntervalMs": 5000,
"TcpProbeCount": 4,
"BackendHeartbeatIdleMs": 30000,
"BackendHeartbeatProbeAddress": 0
}
},
// Polly resilience policies.
@@ -169,6 +180,21 @@ Operational sizing notes:
- A 3 s request timeout is generous compared with typical DL205/DL260 scan times (a few ms to tens of ms for FC03 of 100 registers). The slack absorbs PLC scan-overlap jitter without faulting the upstream client.
- `GracefulShutdownTimeoutMs` should be less than the Service Control Manager's stop deadline. The default 10 s suits a fleet of 54 PLCs; on a much larger fleet, raise both the SCM wait hint and this value in lockstep.
## `Mbproxy.Connection.Keepalive`
TCP keepalive and backend heartbeat settings. Source: `KeepaliveOptions.cs`. Enabled by default, because the DL205/DL260 ECOM never emits TCP keepalives and an idle socket is otherwise dropped by middleboxes after 2–5 minutes. See [`../Architecture/Keepalive.md`](../Architecture/Keepalive.md) for the full design.
| Field | Type | Default | Notes |
|-------|------|---------|-------|
| `Enabled` | bool | `true` | Master switch. When `false`, neither `SO_KEEPALIVE` nor the backend heartbeat is applied and the proxy behaves exactly as a pre-keepalive build. |
| `TcpIdleTimeMs` | int | `30000` | `SO_KEEPALIVE` idle time before the OS sends its first probe. Applied to the backend socket and accepted upstream sockets. |
| `TcpProbeIntervalMs` | int | `5000` | `SO_KEEPALIVE` interval between probes once idle. |
| `TcpProbeCount` | int | `4` | `SO_KEEPALIVE` unanswered probes before the OS declares the socket dead. |
| `BackendHeartbeatIdleMs` | int | `30000` | After this much backend idle, the proxy issues a synthetic FC03 qty=1 read to keep the path warm and prove the ECOM still answers Modbus. Must be greater than `BackendRequestTimeoutMs`. |
| `BackendHeartbeatProbeAddress` | int | `0` | Modbus PDU address the heartbeat FC03 probe reads. Address `0` (`V0`) is valid on DL205/DL260 in factory absolute mode. Range `[0, 65535]`. |
On hot reload, the heartbeat interval and probe address are re-read on every loop tick. The `Tcp*` socket options are applied at connect/accept time, so a reload affects only sockets opened after the change. A reload where `BackendHeartbeatIdleMs <= BackendRequestTimeoutMs` is rejected — a heartbeat interval at or below the request timeout would fire continuously.
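The two constraints stated above (heartbeat idle time strictly greater than the request timeout, probe address within `[0, 65535]`) can be sketched as a validation step. This is an illustration in Python, not the service's C# validator: the class and function names are hypothetical, the defaults are copied from the table, and the sketch checks the values regardless of `Enabled`, which is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KeepaliveSettings:
    """Mirror of the Keepalive table above, with its documented defaults."""
    enabled: bool = True
    tcp_idle_time_ms: int = 30_000
    tcp_probe_interval_ms: int = 5_000
    tcp_probe_count: int = 4
    backend_heartbeat_idle_ms: int = 30_000
    backend_heartbeat_probe_address: int = 0


def validate_keepalive(settings: KeepaliveSettings,
                       backend_request_timeout_ms: int) -> list[str]:
    """Return a list of problems; an empty list means the reload is acceptable."""
    errors = []
    if settings.backend_heartbeat_idle_ms <= backend_request_timeout_ms:
        errors.append(
            "BackendHeartbeatIdleMs must be greater than BackendRequestTimeoutMs"
        )
    if not 0 <= settings.backend_heartbeat_probe_address <= 65535:
        errors.append("BackendHeartbeatProbeAddress must be within [0, 65535]")
    return errors


if __name__ == "__main__":
    print(validate_keepalive(KeepaliveSettings(), backend_request_timeout_ms=3000))
    too_short = KeepaliveSettings(backend_heartbeat_idle_ms=3000)
    print(validate_keepalive(too_short, backend_request_timeout_ms=3000))
```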
## `Mbproxy.Resilience`
Polly retry pipelines for backend connect, listener bind, and the in-flight read coalescer. Source: `ResilienceOptions.cs`.
@@ -391,6 +417,7 @@ A reduced view of [`../Features/HotReload.md`](../Features/HotReload.md), restri
| `Plcs[i]` removed | Supervisor stops the listener and closes all upstream connections for that PLC. |
| `Plcs[i].ListenPort` or `Host` changed | Equivalent to remove + add. |
| `Connection.Backend*TimeoutMs` | Next backend connect or request uses the new value. |
| `Connection.Keepalive` heartbeat fields | Re-read on every heartbeat loop tick. `Tcp*` socket options apply to backend/upstream sockets opened after the change. |
| `AdminPort` | Requires a service restart — the Kestrel admin host is built once at startup. |
| `Resilience.ReadCoalescing.Enabled` | Hot-reloadable; in-flight coalesced entries drain naturally. |
| `BcdTags.*.CacheTtlMs`, `Plcs[i].DefaultCacheTtlMs` | Tag-map reseat for the affected PLC drops that PLC's entire cache. |
+25 -2
@@ -2,7 +2,9 @@
Operator diagnosis playbook for mbproxy. Each entry maps an observable symptom to the log event name and status-page counter that confirms it, then lists likely causes and remediation steps.
The rolling log lives at `C:\ProgramData\mbproxy\logs\mbproxy-<date>.log`. The live counters are at `http://<host>:<AdminPort>/status.json` (default port `8080`). Events at Error level and above are also mirrored to the Windows Application Event Log under source `mbproxy`.
The rolling log lives at `C:\ProgramData\mbproxy\logs\mbproxy-<date>.log` on Windows, or `/var/log/mbproxy/mbproxy-<date>.log` on Linux. The live counters are at `http://<host>:<AdminPort>/status.json` (default port `8080`). Events at Error level and above are also mirrored to the **Windows Application Event Log** (Windows Service) or the **local syslog / journal** (systemd) under source `mbproxy` — view the latter with `journalctl -t mbproxy` or `journalctl -u mbproxy`.
Paths and service commands below are written for Windows (`%ProgramData%`, `sc.exe`); the systemd equivalents are `/etc/mbproxy` + `/var/log/mbproxy` and `systemctl start|stop|status mbproxy`.
## Service Startup Failures
@@ -124,7 +126,28 @@ The rolling log lives at `C:\ProgramData\mbproxy\logs\mbproxy-<date>.log`. The l
1. Verify the upstream count on the status page returns to normal as clients reconnect — `plcs[].clients.connected` should climb again within seconds.
2. If cascades fire repeatedly against the same PLC, investigate the PLC and intermediate network for stability. The proxy itself has no state to repair.
3. If cascades correlate with idle periods, the idle middlebox-drop pattern is the likeliest cause; reduce the upstream client's poll interval below the middlebox idle timeout to keep traffic flowing.
3. If cascades correlate with idle periods, the idle middlebox-drop pattern is the likeliest cause. Keepalive is enabled by default and should already be preventing this — confirm `Connection.Keepalive.Enabled` is `true` and that `BackendHeartbeatIdleMs` is comfortably below the middlebox idle timeout. See [`../Architecture/Keepalive.md`](../Architecture/Keepalive.md).
### Backend keepalive heartbeat failing
**Symptom.** A PLC's backend connection is torn down while idle — no client was actively talking to it. `plcs[].backend.backendIdleDisconnects` increments and the upstream clients (if any were attached) are cascaded.
**Where to look.**
- Log events: `mbproxy.keepalive.heartbeat.timeout` (Warning) followed by `mbproxy.keepalive.backend.idle_disconnect` (Information).
- Status fields: `plcs[].backend.backendHeartbeatsSent`, `backendHeartbeatsFailed`, `backendIdleDisconnects`.
**Root causes.**
- The ECOM is reachable at the IP layer but no longer answering Modbus (firmware hang, ECOM reset mid-session).
- The path died between heartbeats and the heartbeat was the first request to discover it — this is the feature working as intended (the failure is found during idle, not on a client request).
- `BackendHeartbeatProbeAddress` points at an address the PLC rejects. The default (0 = `V0`) is safe on DL205/DL260; only an operator override could break this.
**Remediation.**
1. A single idle-disconnect that recovers on the next client request needs no action — the proxy reconnected the path proactively.
2. Repeated idle-disconnects on one PLC mean it keeps going dark while idle. Investigate the device and the network path; the proxy has no state to repair.
3. If `backendHeartbeatsFailed` climbs but the PLC answers real client requests fine, check that `BackendHeartbeatProbeAddress` is a register the device actually serves.
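To tell whether a counter "climbs", compare two snapshots of the status JSON taken some time apart. The sketch below is illustrative Python, not part of mbproxy. It assumes each `plcs[]` entry carries a `name` field (a guess at the payload shape) alongside the `backend` counters listed above; `heartbeat_deltas` is a hypothetical helper.

```python
COUNTERS = ("backendHeartbeatsSent", "backendHeartbeatsFailed", "backendIdleDisconnects")


def heartbeat_deltas(before: dict, after: dict) -> dict:
    """Per-PLC change in the keepalive counters between two status snapshots."""
    # Index the earlier snapshot by PLC name ("name" is an assumed field).
    earlier = {plc["name"]: plc.get("backend", {}) for plc in before.get("plcs", [])}
    deltas = {}
    for plc in after.get("plcs", []):
        prev = earlier.get(plc["name"], {})
        now = plc.get("backend", {})
        deltas[plc["name"]] = {c: now.get(c, 0) - prev.get(c, 0) for c in COUNTERS}
    return deltas


# Two hand-written snapshots standing in for successive status.json fetches.
before = {"plcs": [{"name": "Line1-Mixer", "backend": {
    "backendHeartbeatsSent": 10, "backendHeartbeatsFailed": 0, "backendIdleDisconnects": 0}}]}
after = {"plcs": [{"name": "Line1-Mixer", "backend": {
    "backendHeartbeatsSent": 14, "backendHeartbeatsFailed": 2, "backendIdleDisconnects": 2}}]}

print(heartbeat_deltas(before, after))
```

A PLC whose `backendHeartbeatsFailed` delta is non-zero across several intervals is the "repeated idle-disconnects" case from step 2.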
### Request timeout watchdog firing
+49 -3
@@ -6,9 +6,9 @@ The stable catalog of every `mbproxy.*` event name the service emits, with its l
The service uses [Serilog](https://serilog.net/) wired through the `Microsoft.Extensions.Logging` bridge. Three sinks are configured (see `src/Mbproxy/HostingExtensions.cs`):
- **Console**: written to stdout for interactive `--console` runs and for the SCM stdout capture.
- **Rolling file** under `%ProgramData%\mbproxy\logs\` (`mbproxy-<date>.log`).
- **Windows Event Log** — only when running as a Windows Service, and only for events at `Error` and above (see `src/Mbproxy/Diagnostics/EventLogBridge.cs`).
- **Console**: stdout; captured by the Windows SCM or by systemd-journald.
- **Rolling file**: `%ProgramData%\mbproxy\logs\` on Windows, `/var/log/mbproxy/` on Linux (`mbproxy-<date>.log`).
- **Platform diagnostic sink**: `Error`+ events only. `DiagnosticSinkSelector` picks it once at the composition root: the **Windows Application Event Log** under the SCM (`EventLogBridge`), **local syslog** under systemd (`SyslogBridge`), or none for interactive/dev runs.
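The per-host choice of the `Error`+ sink amounts to a pure function of two host facts. The sketch below restates that decision in Python for illustration only. The real `DiagnosticSinkSelector` is C# and its signature is not shown here, so the member names and the two boolean parameters are assumptions.

```python
from enum import Enum


class DiagnosticSink(Enum):
    # Member names are illustrative; the C# enum's members are not shown here.
    NONE = "none"
    WINDOWS_EVENT_LOG = "windows-event-log"
    SYSLOG = "syslog"


def select_diagnostic_sink(is_windows_service: bool, is_systemd_service: bool) -> DiagnosticSink:
    """Pick the Error+ sink for the current host.

    Windows Service (SCM) -> Application Event Log
    systemd service       -> local syslog
    anything else         -> no platform sink (interactive/dev run)
    """
    if is_windows_service:
        return DiagnosticSink.WINDOWS_EVENT_LOG
    if is_systemd_service:
        return DiagnosticSink.SYSLOG
    return DiagnosticSink.NONE


print(select_diagnostic_sink(is_windows_service=False, is_systemd_service=True))
```

Keeping the decision free of side effects is what lets it be unit-tested on either OS without a real SCM or systemd present.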
Every event uses source-generated `[LoggerMessage]` definitions, so the property names below match the message template token-for-token. The default minimum level is `Information`; lower the floor for `Mbproxy.*` categories via the standard `Logging:LogLevel` configuration to surface `Debug` events such as the coalesce and cache traces.
@@ -385,6 +385,51 @@ Fires whenever the entire per-PLC cache is wiped at once — primarily after a b
**Operator action:** none unless flushes happen on a tight loop, which would indicate the backend connection itself is unstable.
## Keepalive
See [`../Architecture/Keepalive.md`](../Architecture/Keepalive.md) for the backend heartbeat design.
### mbproxy.keepalive.heartbeat.sent
**Level:** Debug &middot; **EventId:** 150 &middot; **Source:** `src/Mbproxy/Proxy/Multiplexing/KeepaliveLogEvents.cs`
| Property | Type | Meaning |
|----------|------|---------|
| `Plc` | `string` | Configured PLC name. |
| `ProxyTxId` | `ushort` | Proxy-allocated TxId carrying the synthetic FC03 probe. |
| `Address` | `ushort` | Modbus address the probe reads (`BackendHeartbeatProbeAddress`). |
Fires each time the heartbeat loop issues a probe on an idle backend socket — at most one per `BackendHeartbeatIdleMs` per idle PLC.
**Operator action:** none. Debug-level; useful only when confirming the heartbeat is alive.
### mbproxy.keepalive.heartbeat.timeout
**Level:** Warning &middot; **EventId:** 151 &middot; **Source:** `src/Mbproxy/Proxy/Multiplexing/KeepaliveLogEvents.cs`
| Property | Type | Meaning |
|----------|------|---------|
| `Plc` | `string` | Configured PLC name. |
| `ProxyTxId` | `ushort` | Proxy TxId of the unanswered probe. |
| `ElapsedMs` | `long` | Milliseconds from probe send to timeout. |
Fires when a heartbeat probe is not answered within `BackendRequestTimeoutMs` — the backend is connected but no longer answering Modbus.
**Operator action:** check the PLC and the network path. Paired with `mbproxy.keepalive.backend.idle_disconnect` for the same PLC.
### mbproxy.keepalive.backend.idle_disconnect
**Level:** Information &middot; **EventId:** 152 &middot; **Source:** `src/Mbproxy/Proxy/Multiplexing/KeepaliveLogEvents.cs`
| Property | Type | Meaning |
|----------|------|---------|
| `Plc` | `string` | Configured PLC name. |
| `ElapsedMs` | `long` | Milliseconds the failed heartbeat waited before the teardown. |
Fires when a failed heartbeat triggers a proactive backend teardown. Every attached upstream pipe is cascaded; clients reconnect on their next request. This is the keepalive feature doing its job — finding a dead path during idle instead of on the next real request.
**Operator action:** none if isolated. Repeated idle-disconnects on one PLC indicate it keeps going dark while idle — investigate the device or the network path.
## BCD Rewriter
### mbproxy.rewrite.partial_bcd
@@ -495,5 +540,6 @@ Lifecycle events (`startup.*`, `listener.*`, `admin.*`, `shutdown.*`, `config.re
- [Response Cache](../Architecture/ResponseCache.md) — context for the `mbproxy.cache.*` events.
- [Status Page](../Operations/StatusPage.md) — counter equivalents for the high-volume Debug-level events.
- [Read Coalescing](../Architecture/ReadCoalescing.md) — context for the `mbproxy.coalesce.*` events.
- [Keepalive](../Architecture/Keepalive.md) — context for the `mbproxy.keepalive.*` events.
- [BCD Rewriting](../Features/BcdRewriting.md) — context for the `mbproxy.rewrite.*` and `mbproxy.exception.passthrough` events.
- [Hot Reload](../Features/HotReload.md) — context for the `mbproxy.config.reload.*` events.
+4 -1
@@ -165,7 +165,10 @@ if (-not (Test-Path $configDest)) {
if (-not [System.Diagnostics.EventLog]::SourceExists('mbproxy')) {
Write-Host "Registering Windows Event Log source 'mbproxy'..."
New-EventLog -Source 'mbproxy' -LogName 'Application'
# .NET API, not New-EventLog: the *-EventLog cmdlets exist only in Windows
# PowerShell 5.1, not PowerShell 7+. This call is symmetric with the
# SourceExists check above and works on every PowerShell edition.
[System.Diagnostics.EventLog]::CreateEventSource('mbproxy', 'Application')
} else {
Write-Host "Windows Event Log source 'mbproxy' already registered."
}
+134
@@ -0,0 +1,134 @@
#!/usr/bin/env bash
#
# install.sh — install the mbproxy service on a Linux / systemd host.
#
# The Linux counterpart of install.ps1. Copies the published binary to
# /opt/mbproxy, seeds the config at /etc/mbproxy/appsettings.json (preserving any
# existing one), creates the log and bundle-cache directories and the mbproxy
# service account, installs the systemd unit, and enables + starts the service.
#
# Re-running on an already-installed service is safe (idempotent): the binary is
# refreshed, an existing /etc/mbproxy/appsettings.json is preserved, and the
# service is restarted.
#
# Usage:
# sudo ./install.sh [--publish-dir DIR] [--no-start]
#
# --publish-dir DIR directory containing the published Mbproxy binary.
# Default: <repo>/publish-out/self-contained
# --no-start install and enable the unit but do not start it.
#
set -euo pipefail
# ── 0. Settings ──────────────────────────────────────────────────────────────
SERVICE_NAME="mbproxy"
SERVICE_USER="mbproxy"
INSTALL_DIR="/opt/mbproxy"
CONFIG_DIR="/etc/mbproxy"
LOG_DIR="/var/log/mbproxy"
CACHE_DIR="/var/cache/mbproxy"
UNIT_DEST="/etc/systemd/system/${SERVICE_NAME}.service"
script_dir="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
repo_root="$(dirname "$script_dir")"
publish_dir="${repo_root}/publish-out/self-contained"
start_service=1
while [[ $# -gt 0 ]]; do
case "$1" in
--publish-dir) publish_dir="$2"; shift 2 ;;
--no-start) start_service=0; shift ;;
*) echo "Unknown argument: $1" >&2; exit 2 ;;
esac
done
# ── 1. Pre-flight checks ─────────────────────────────────────────────────────
if [[ "$(id -u)" -ne 0 ]]; then
echo "install.sh must run as root (use sudo)." >&2
exit 1
fi
binary_src="${publish_dir}/Mbproxy"
if [[ ! -f "$binary_src" ]]; then
echo "Mbproxy binary not found at '${binary_src}'." >&2
echo "Run install/publish.sh first, or pass --publish-dir." >&2
exit 1
fi
unit_src="${script_dir}/mbproxy.service"
config_src="${publish_dir}/appsettings.json"
if [[ ! -f "$unit_src" ]]; then
echo "Unit file not found at '${unit_src}'." >&2
exit 1
fi
echo "Installing ${SERVICE_NAME} service..."
echo " Publish dir : ${publish_dir}"
echo " Install dir : ${INSTALL_DIR}"
echo " Config dir : ${CONFIG_DIR}"
# ── 2. Service account ───────────────────────────────────────────────────────
if ! id -u "$SERVICE_USER" >/dev/null 2>&1; then
echo "Creating service account '${SERVICE_USER}'..."
useradd --system --no-create-home --shell /usr/sbin/nologin "$SERVICE_USER"
else
echo "Service account '${SERVICE_USER}' already exists."
fi
# ── 3. Stop the service if running (so the binary can be replaced) ───────────
if systemctl is-active --quiet "$SERVICE_NAME" 2>/dev/null; then
echo "Stopping running service '${SERVICE_NAME}'..."
systemctl stop "$SERVICE_NAME"
fi
# ── 4. Directories ───────────────────────────────────────────────────────────
install -d -m 0755 "$INSTALL_DIR"
install -d -m 0755 "$CONFIG_DIR"
install -d -m 0750 -o "$SERVICE_USER" -g "$SERVICE_USER" "$LOG_DIR"
install -d -m 0750 -o "$SERVICE_USER" -g "$SERVICE_USER" "$CACHE_DIR"
# ── 5. Binary ────────────────────────────────────────────────────────────────
echo "Copying binary to '${INSTALL_DIR}/Mbproxy'..."
install -m 0755 "$binary_src" "${INSTALL_DIR}/Mbproxy"
# ── 6. Config (preserve an existing one) ─────────────────────────────────────
config_dest="${CONFIG_DIR}/appsettings.json"
if [[ -f "$config_dest" ]]; then
echo "Preserving existing config at '${config_dest}'."
elif [[ -f "$config_src" ]]; then
echo "Seeding config template to '${config_dest}'..."
install -m 0644 "$config_src" "$config_dest"
else
echo "WARNING: no appsettings.json in '${publish_dir}' — create '${config_dest}' manually." >&2
fi
# ── 7. systemd unit ──────────────────────────────────────────────────────────
echo "Installing systemd unit to '${UNIT_DEST}'..."
install -m 0644 "$unit_src" "$UNIT_DEST"
systemctl daemon-reload
systemctl enable "$SERVICE_NAME" >/dev/null
# ── 8. Start ─────────────────────────────────────────────────────────────────
if [[ "$start_service" -eq 1 ]]; then
echo "Starting service '${SERVICE_NAME}'..."
systemctl start "$SERVICE_NAME"
sleep 1
if systemctl is-active --quiet "$SERVICE_NAME"; then
echo "Service '${SERVICE_NAME}' is running."
else
echo "WARNING: service '${SERVICE_NAME}' did not reach active state." >&2
echo "Check: journalctl -u ${SERVICE_NAME} -e" >&2
fi
fi
echo ""
echo "Install complete."
echo " Config : ${config_dest}"
echo " Logs : ${LOG_DIR}"
echo " Binary : ${INSTALL_DIR}/Mbproxy"
echo ""
echo "Next steps:"
echo " 1. Edit '${config_dest}' to configure your PLC list and BCD tags."
echo " 2. Restart: sudo systemctl restart ${SERVICE_NAME}"
echo " 3. Logs: journalctl -u ${SERVICE_NAME} -f"
echo " 4. Status: http://localhost:8080/"
@@ -0,0 +1,255 @@
// mbproxy configuration template (Linux / systemd): copy to /etc/mbproxy/appsettings.json
// and edit before starting the service.
//
// The .NET configuration loader accepts // and /* */ comments in JSON files
// (JSONC semantics) when using the default Host.CreateApplicationBuilder path.
//
// IMPORTANT: install.sh copies this file to the destination ONLY if no
// appsettings.json already exists there. An existing file is always preserved.
//
// This is the Linux counterpart of mbproxy.config.template.json, identical except
// for the rolling-log path (/var/log/mbproxy) and a few platform notes. It is shipped
// as appsettings.json by a `dotnet publish -r linux-*` build.
{
"Mbproxy": {
// Global BCD tag list
// These tags apply to EVERY PLC by default.
// Each entry: Address (Modbus PDU address, decimal), Width (16 or 32 bits).
//
// Width 16: one register holds 4 BCD digits (0-9999).
// Wire value 0x1234 decodes to decimal 1234.
//
// Width 32: a CDAB-ordered register pair (Address = low word, Address+1 = high word).
// Decoded decimal = high * 10000 + low (DirectLOGIC CDAB word order).
//
// Per-PLC overrides (see Plcs[].BcdTags below):
// Add: appends extra tags beyond what Global defines, or overrides a
// Global entry's Width when the same Address appears in both.
// Remove: removes specific addresses from the effective set for that PLC.
// Effective set = (Global ∪ Add) \ Remove, resolved per PDU.
"BcdTags": {
"Global": [
// V2000 (octal) = decimal address 1024. 16-bit BCD counter.
{ "Address": 1024, "Width": 16 },
// V2040 (octal) = decimal address 1056. 32-bit BCD total at 1056/1057.
{ "Address": 1056, "Width": 32 },
// V2100 (octal) = decimal address 1088. 16-bit BCD setpoint.
//
// Phase 11: CacheTtlMs (optional) opts this tag into the response cache. With
// CacheTtlMs > 0 set, upstream clients reading this register will see values up
// to CacheTtlMs MILLISECONDS OLD; enabling it is the explicit acknowledgement of that
// staleness window. Default (omitted or 0) = cache disabled
// for this tag. The cache is OFF by default for every tag.
{ "Address": 1088, "Width": 16 /* , "CacheTtlMs": 1000 */ }
]
},
// PLC list
// Each entry maps one upstream proxy port to one backend PLC.
// Upstream clients connect to ListenPort; the proxy forwards to Host:Port.
//
// IMPORTANT: H2-ECOM100 modules accept at most 4 simultaneous TCP connections.
// With the 1:1 upstream-to-backend model, a fifth upstream client to the same proxy
// port will cause a backend connect failure and an immediate upstream disconnect.
"Plcs": [
{
"Name": "Line1-Mixer", // Human-readable name (shown on status page and in logs)
"ListenPort": 5020, // Port the proxy listens on (upstream clients connect here)
"Host": "10.0.1.1", // PLC IP address or hostname
"Port": 502, // PLC Modbus TCP port (almost always 502)
"BcdTags": {
// Additional 32-bit tag specific to this PLC only.
"Add": [
{ "Address": 1200, "Width": 32 }
],
// Remove address 1056 from the Global list for this PLC
// (this mixer doesn't use the 32-bit BCD total).
"Remove": [ 1056 ]
}
},
{
"Name": "Line1-Conveyor",
"ListenPort": 5021,
"Host": "10.0.1.2",
"Port": 502
// No BcdTags override: uses the Global set as-is.
}
// Add one entry per PLC. Ports must be unique per host. Typical fleet: 54 PLCs.
],
// Admin port
// Read-only HTTP status page.
// GET / self-contained HTML (auto-refreshes every 5 s)
// GET /status.json same data as JSON for monitoring scrapers
//
// Authentication is assumed at the network layer (trusted internal segment).
// Set to 0 to disable the admin endpoint.
"AdminPort": 8080,
// Connection timeouts
"Connection": {
// Max time (ms) to wait for a TCP connect to the PLC backend.
// Each Polly retry attempt gets its own copy of this timeout.
"BackendConnectTimeoutMs": 3000,
// Max time (ms) to wait for the PLC to respond to a forwarded PDU.
// Non-idempotent FC06/FC16 writes are one-shot: the upstream client
// is disconnected immediately on timeout (no retry).
"BackendRequestTimeoutMs": 3000,
// Max time (ms) to wait for in-flight PDUs to complete during graceful shutdown
// (systemctl stop sends SIGTERM). After this deadline the coordinator cancels
// remaining work and proceeds. Keep at or below the unit's TimeoutStopSec.
"GracefulShutdownTimeoutMs": 10000,
// Keepalive / connection monitoring
// The DL205/DL260 ECOM does not emit TCP keepalives, so an idle backend
// socket can be silently dropped by a middlebox (switch, firewall, NAT)
// after 2-5 minutes. This section enables OS-level SO_KEEPALIVE on both
// backend and upstream sockets, and drives a periodic Modbus FC03 heartbeat
// on each idle backend socket so a dead path is detected before a real
// client request hits it. See docs/Architecture/Keepalive.md.
"Keepalive": {
// Master switch. false = no SO_KEEPALIVE and no heartbeat; the proxy
// behaves exactly as a pre-keepalive build.
"Enabled": true,
// SO_KEEPALIVE: idle time (ms) before the OS sends its first probe.
"TcpIdleTimeMs": 30000,
// SO_KEEPALIVE: interval (ms) between probes once the idle time elapses.
"TcpProbeIntervalMs": 5000,
// SO_KEEPALIVE: unanswered probes before the OS declares the socket dead.
"TcpProbeCount": 4,
// Backend heartbeat: after this much backend idle (ms) the proxy issues a
// synthetic FC03 qty=1 read to keep the path warm and prove the ECOM is
// still answering Modbus. Must be greater than BackendRequestTimeoutMs.
"BackendHeartbeatIdleMs": 30000,
// FC03 PDU address the heartbeat reads. 0 = V0, valid on DL205/DL260.
"BackendHeartbeatProbeAddress": 0
}
},
// Resilience policies
"Resilience": {
// Polly retry policy for backend TCP connect attempts.
// MaxAttempts: total connect tries (including the first).
// BackoffMs: delay between each attempt (must have MaxAttempts-1 entries).
"BackendConnect": {
"MaxAttempts": 3,
"BackoffMs": [ 100, 500, 2000 ]
},
// Polly recovery policy for listener bind failures.
// If a PLC's listen port can't be bound (in-use, bad IP, transient OS error),
// the supervisor retries according to this schedule.
// InitialBackoffMs: backoff per step (first N retries).
// SteadyStateMs: backoff for all subsequent retries (runs indefinitely).
"ListenerRecovery": {
"InitialBackoffMs": [ 1000, 2000, 5000, 15000, 30000 ],
"SteadyStateMs": 30000
},
// Phase 10: in-flight read coalescing.
//
// When two or more upstream clients (HMI / historian / engineering workstation /
// gateway) issue the SAME FC03 or FC04 read while a matching backend round-trip is
// already in flight, the proxy attaches the late arrivals to the existing in-flight
// entry and fans the single PLC response out to every attached client, saving the
// ECOM's per-scan PDU budget on duplicated reads.
//
// Zero post-response staleness: coalescing operates ONLY between "first request
// sent to PLC" and "response received from PLC" (microseconds to ~10 ms typical).
// Each upstream client still sees its own MBAP transaction ID echoed correctly;
// the proxy is transparent.
//
// FC06 / FC16 writes are NEVER coalesced (non-idempotent). FC03 vs FC04 are
// separate Modbus tables and never share a coalescing key. Different unit IDs
// (multi-drop / gateway-backed setups) never coalesce.
//
// Enabled: master switch. Hot-reloadable; flipping to false leaves running
// coalesced entries to drain naturally.
// MaxParties: per-entry cap on attached parties. Past the cap, the next
// identical request opens a fresh backend round-trip (load-shedding
// safety valve for very fan-out-heavy fleets).
"ReadCoalescing": {
"Enabled": true,
"MaxParties": 32
}
},
// Response cache (Phase 11): opt-in bounded-staleness cache
//
// DESIGN-CONTRACT PIVOT: with caching enabled the proxy is no longer purely
// transparent. Upstream FC03/FC04 reads for cache-enabled tags may return values
// up to CacheTtlMs MILLISECONDS OLD. Operators opt tags in by setting a non-zero
// CacheTtlMs on a BcdTagOptions entry (or DefaultCacheTtlMs on a PlcOptions entry).
//
// The cache is OFF BY DEFAULT for every tag. A deployment with NO TTL config (this
// section entirely absent and no BcdTags.*.CacheTtlMs / Plcs[i].DefaultCacheTtlMs)
// behaves IDENTICALLY to a pre-Phase-11 deployment: no behaviour change.
//
// AllowLongTtl: gate for any CacheTtlMs > 60_000. Reload validation
// rejects configs that exceed 60 s without this opt-in,
// to prevent accidentally-stale-for-an-hour deployments.
// MaxEntriesPerPlc: LRU cap per-PLC. Past this cap, the next insert evicts
// the least-recently-used entry. Defaults to 1000.
// EvictionIntervalMs: background eviction tick. Scans each PLC's cache and
// removes entries past their TTL. Defaults to 5000.
//
// Properties (full text in docs/Architecture/ResponseCache.md):
// * Cache hits SHORT-CIRCUIT coalescing entirely (cache -> coalesce -> backend).
// * Successful FC06/FC16 write responses invalidate every cached FC03/FC04 entry
// whose address range OVERLAPS the write, not just exact-key match.
// * Multi-tag read range: effective TTL = min(TTLs). Any tag with TTL=0 in the
// range disables caching for the whole read.
// * Cache stores POST-rewriter bytes; hits never re-invoke the BCD rewriter.
// * Tag-list hot-reload flushes the affected PLC's whole cache.
// * No persistence: a process restart wipes the cache.
"Cache": {
"AllowLongTtl": false,
"MaxEntriesPerPlc": 1000,
"EvictionIntervalMs": 5000
}
},
// Serilog
// Structured log output. Default: Information level, console + rolling-file.
// The console sink is captured by systemd-journald (view with `journalctl -u mbproxy`).
// In addition, when mbproxy runs as a systemd service the SyslogBridge writes Error+
// events to the local syslog with proper RFC5424 severity (wired in code, not here).
"Serilog": {
"Using": [ "Serilog.Sinks.Console", "Serilog.Sinks.File" ],
"MinimumLevel": {
"Default": "Information",
"Override": {
"Microsoft": "Warning",
"System": "Warning"
}
},
"WriteTo": [
{
"Name": "Console",
"Args": {
"outputTemplate": "[{Timestamp:HH:mm:ss} {Level:u3}] {Message:lj} {Properties:j}{NewLine}{Exception}"
}
},
{
"Name": "File",
"Args": {
// Rolling log: one file per day, kept for 30 days, under /var/log/mbproxy
// (created by install.sh and owned by the mbproxy service account).
// Survives uninstall: uninstall.sh archives logs to /var/log/mbproxy.archived-<ts>.
"path": "/var/log/mbproxy/mbproxy-.log",
"rollingInterval": "Day",
"retainedFileCountLimit": 30,
"outputTemplate": "[{Timestamp:yyyy-MM-dd HH:mm:ss.fff zzz} {Level:u3}] {Message:lj} {Properties:j}{NewLine}{Exception}"
}
}
]
}
}
+45
@@ -0,0 +1,45 @@
# systemd unit for mbproxy — the Modbus TCP BCD proxy.
#
# Installed to /etc/systemd/system/mbproxy.service by install.sh.
# The Linux counterpart of the Windows Service registered by install.ps1.
#
# Type=exec (not Type=notify): mbproxy is a leaf service that nothing orders
# against, so systemd's readiness signal is unnecessary. Type=exec marks the
# unit active once the binary is exec'd; graceful stop still works because the
# .NET generic host handles SIGTERM directly (drains in-flight requests within
# Connection.GracefulShutdownTimeoutMs).
[Unit]
Description=mbproxy — Modbus TCP BCD proxy
After=network-online.target
Wants=network-online.target
[Service]
Type=exec
ExecStart=/opt/mbproxy/Mbproxy
WorkingDirectory=/etc/mbproxy
User=mbproxy
Group=mbproxy
# Restart on crash, but not on a clean SIGTERM stop.
Restart=on-failure
RestartSec=5
# Keep above Connection.GracefulShutdownTimeoutMs (default 10 s) so the drain
# completes before systemd escalates to SIGKILL.
TimeoutStopSec=30
# Self-contained single-file publish: pin native-library extraction to a stable,
# writable directory (install.sh creates it and grants the mbproxy account access).
Environment=DOTNET_BUNDLE_EXTRACT_BASE_DIR=/var/cache/mbproxy
# Hardening. The service only needs to write its log and bundle-cache directories.
NoNewPrivileges=true
ProtectSystem=strict
ProtectHome=true
PrivateTmp=true
ReadWritePaths=/var/log/mbproxy /var/cache/mbproxy
# If any configured ListenPort is below 1024, also add:
# AmbientCapabilities=CAP_NET_BIND_SERVICE
[Install]
WantedBy=multi-user.target
+34 -21
@@ -1,19 +1,27 @@
<#
.SYNOPSIS
Publishes Mbproxy.exe in two flavours: self-contained and framework-dependent.
Publishes the Mbproxy binary in two flavours: self-contained and framework-dependent.
.DESCRIPTION
Produces two single-file win-x64 builds under <repo>\publish-out\:
Produces two single-file builds for the requested runtime under <repo>\publish-out\:
self-contained\Mbproxy.exe ~100 MB bundles the .NET 10 runtime;
no .NET install needed on target.
framework-dependent\Mbproxy.exe ~1.6 MB requires .NET 10 + ASP.NET Core
runtime preinstalled on target.
self-contained\ ~100 MB bundles the .NET 10 + ASP.NET Core runtime;
no .NET install needed on the target.
framework-dependent\ ~1.6 MB requires the .NET 10 + ASP.NET Core runtime
preinstalled on the target.
Both builds use the Release configuration and inherit the publish settings
declared in src\Mbproxy\Mbproxy.csproj (PublishSingleFile=true,
IncludeNativeLibrariesForSelfExtract=true). The framework-dependent build
overrides SelfContained=false on the command line.
The runtime is selected with -Rid (default win-x64). The binary is Mbproxy.exe on
Windows RIDs and Mbproxy on Linux/macOS RIDs.
Both builds use the Release configuration and inherit the publish settings declared
in src\Mbproxy\Mbproxy.csproj (PublishSingleFile=true, SelfContained=true,
IncludeNativeLibrariesForSelfExtract=true; those settings are gated on an explicit
RID, which is supplied here). The framework-dependent build overrides
SelfContained=false on the command line.
.PARAMETER Rid
.NET runtime identifier to publish for. Examples: win-x64, linux-x64.
Default: win-x64
.PARAMETER OutputDir
Root output directory. Two subfolders are created beneath it.
@@ -24,10 +32,12 @@
.EXAMPLE
.\publish.ps1
.\publish.ps1 -Clean
.\publish.ps1 -Rid linux-x64
.\publish.ps1 -Rid win-x64 -Clean
#>
[CmdletBinding()]
param(
[string]$Rid = 'win-x64',
[string]$OutputDir = (Join-Path (Split-Path -Parent $PSScriptRoot) 'publish-out'),
[switch]$Clean
)
@@ -46,15 +56,18 @@ if ($Clean -and (Test-Path $OutputDir)) {
Remove-Item -Recurse -Force $OutputDir
}
# Binary name: Windows RIDs produce an .exe, every other RID produces an extensionless ELF/Mach-O.
$exeName = if ($Rid -like 'win-*') { 'Mbproxy.exe' } else { 'Mbproxy' }
$selfContainedOut = Join-Path $OutputDir 'self-contained'
$frameworkDependentOut = Join-Path $OutputDir 'framework-dependent'
Write-Host "`n=== Publishing self-contained (~100 MB) ===" -ForegroundColor Cyan
& dotnet publish $csproj -c Release -r win-x64 -o $selfContainedOut --nologo
Write-Host "`n=== Publishing self-contained ($Rid, ~100 MB) ===" -ForegroundColor Cyan
& dotnet publish $csproj -c Release -r $Rid -o $selfContainedOut --nologo
if ($LASTEXITCODE -ne 0) { throw "self-contained publish failed (exit $LASTEXITCODE)" }
Write-Host "`n=== Publishing framework-dependent (~1.6 MB) ===" -ForegroundColor Cyan
& dotnet publish $csproj -c Release -r win-x64 -p:SelfContained=false -p:PublishSingleFile=true -o $frameworkDependentOut --nologo
Write-Host "`n=== Publishing framework-dependent ($Rid, ~1.6 MB) ===" -ForegroundColor Cyan
& dotnet publish $csproj -c Release -r $Rid -p:SelfContained=false -p:PublishSingleFile=true -o $frameworkDependentOut --nologo
if ($LASTEXITCODE -ne 0) { throw "framework-dependent publish failed (exit $LASTEXITCODE)" }
function Format-Size {
@@ -63,14 +76,14 @@ function Format-Size {
else { '{0:N1} KB' -f ($Bytes / 1KB) }
}
Write-Host "`n=== Result ===" -ForegroundColor Green
Write-Host "`n=== Result ($Rid) ===" -ForegroundColor Green
foreach ($flavour in 'self-contained','framework-dependent') {
$exe = Join-Path $OutputDir "$flavour\Mbproxy.exe"
if (Test-Path $exe) {
$size = (Get-Item $exe).Length
Write-Host (" {0,-22} {1,10} {2}" -f $flavour, (Format-Size $size), $exe)
$bin = Join-Path $OutputDir "$flavour\$exeName"
if (Test-Path $bin) {
$size = (Get-Item $bin).Length
Write-Host (" {0,-22} {1,10} {2}" -f $flavour, (Format-Size $size), $bin)
} else {
Write-Warning "Missing: $exe"
Write-Warning "Missing: $bin"
}
}
Write-Host ""
+82
@@ -0,0 +1,82 @@
#!/usr/bin/env bash
#
# publish.sh — Linux/macOS counterpart of publish.ps1.
#
# Publishes the Mbproxy binary in two flavours for the requested runtime under
# <repo>/publish-out/:
#
# self-contained/ ~100 MB — bundles the .NET 10 + ASP.NET Core runtime;
# no .NET install needed on the target.
# framework-dependent/ ~1.6 MB — requires the .NET 10 + ASP.NET Core runtime
# preinstalled on the target.
#
# Both builds use the Release configuration and inherit the publish settings in
# src/Mbproxy/Mbproxy.csproj (those settings are gated on an explicit RID, which
# is supplied here). The framework-dependent build overrides SelfContained=false.
#
# Usage:
# ./publish.sh [-r RID] [-o OUTPUT_DIR] [--clean]
#
# -r RID .NET runtime identifier (default: linux-x64)
# -o OUTPUT_DIR root output directory (default: <repo>/publish-out)
# --clean delete OUTPUT_DIR before publishing
#
# Examples:
# ./publish.sh
# ./publish.sh -r linux-x64 --clean
#
set -euo pipefail
rid="linux-x64"
script_dir="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
repo_root="$(dirname "$script_dir")"
output_dir="$repo_root/publish-out"
clean=0
while [[ $# -gt 0 ]]; do
case "$1" in
-r) rid="$2"; shift 2 ;;
-o) output_dir="$2"; shift 2 ;;
--clean) clean=1; shift ;;
*) echo "Unknown argument: $1" >&2; exit 2 ;;
esac
done
csproj="$repo_root/src/Mbproxy/Mbproxy.csproj"
if [[ ! -f "$csproj" ]]; then
echo "Cannot find $csproj" >&2
exit 1
fi
if [[ "$clean" -eq 1 && -d "$output_dir" ]]; then
echo "Cleaning $output_dir"
rm -rf "$output_dir"
fi
# Binary name: Windows RIDs produce an .exe, every other RID an extensionless binary.
if [[ "$rid" == win-* ]]; then bin_name="Mbproxy.exe"; else bin_name="Mbproxy"; fi
self_contained_out="$output_dir/self-contained"
framework_dependent_out="$output_dir/framework-dependent"
echo
echo "=== Publishing self-contained ($rid, ~100 MB) ==="
dotnet publish "$csproj" -c Release -r "$rid" -o "$self_contained_out" --nologo
echo
echo "=== Publishing framework-dependent ($rid, ~1.6 MB) ==="
dotnet publish "$csproj" -c Release -r "$rid" \
-p:SelfContained=false -p:PublishSingleFile=true -o "$framework_dependent_out" --nologo
echo
echo "=== Result ($rid) ==="
for flavour in self-contained framework-dependent; do
bin="$output_dir/$flavour/$bin_name"
if [[ -f "$bin" ]]; then
size="$(du -h "$bin" | cut -f1)"
printf ' %-22s %8s %s\n' "$flavour" "$size" "$bin"
else
echo " WARNING: missing $bin" >&2
fi
done
echo
+4 -1
@@ -122,7 +122,10 @@ if (Test-Path $InstallPath) {
if ([System.Diagnostics.EventLog]::SourceExists('mbproxy')) {
Write-Host "Removing Windows Event Log source 'mbproxy'..."
try {
Remove-EventLog -Source 'mbproxy'
# .NET API, not Remove-EventLog: the *-EventLog cmdlets exist only in
# Windows PowerShell 5.1, not PowerShell 7+. Symmetric with the
# SourceExists check above.
[System.Diagnostics.EventLog]::DeleteEventSource('mbproxy')
} catch {
Write-Warning "Could not remove Event Log source: $_"
}
+85
@@ -0,0 +1,85 @@
#!/usr/bin/env bash
#
# uninstall.sh — remove the mbproxy service from a Linux / systemd host.
#
# The Linux counterpart of uninstall.ps1. Stops and disables the service,
# removes the systemd unit and installed files, and (unless --keep-config)
# removes the config directory. Log files are always preserved: they are moved
# to a timestamped archive so post-uninstall diagnostics remain accessible.
#
# Usage:
# sudo ./uninstall.sh [--keep-config] [--keep-user]
#
# --keep-config leave /etc/mbproxy/appsettings.json in place.
# --keep-user leave the mbproxy service account in place.
#
set -euo pipefail

SERVICE_NAME="mbproxy"
SERVICE_USER="mbproxy"
INSTALL_DIR="/opt/mbproxy"
CONFIG_DIR="/etc/mbproxy"
LOG_DIR="/var/log/mbproxy"
CACHE_DIR="/var/cache/mbproxy"
UNIT_DEST="/etc/systemd/system/${SERVICE_NAME}.service"

keep_config=0
keep_user=0
while [[ $# -gt 0 ]]; do
    case "$1" in
        --keep-config) keep_config=1; shift ;;
        --keep-user)   keep_user=1; shift ;;
        *) echo "Unknown argument: $1" >&2; exit 2 ;;
    esac
done

if [[ "$(id -u)" -ne 0 ]]; then
    echo "uninstall.sh must run as root (use sudo)." >&2
    exit 1
fi

echo "Uninstalling ${SERVICE_NAME} service..."

# ── 1. Stop + disable the service ────────────────────────────────────────────
if systemctl list-unit-files "${SERVICE_NAME}.service" >/dev/null 2>&1 \
    && [[ -n "$(systemctl list-unit-files "${SERVICE_NAME}.service" --no-legend 2>/dev/null)" ]]; then
    echo "Stopping and disabling '${SERVICE_NAME}'..."
    systemctl disable --now "$SERVICE_NAME" >/dev/null 2>&1 || true
fi

# ── 2. Remove the systemd unit ───────────────────────────────────────────────
if [[ -f "$UNIT_DEST" ]]; then
    echo "Removing systemd unit '${UNIT_DEST}'..."
    rm -f "$UNIT_DEST"
fi
systemctl daemon-reload
systemctl reset-failed "$SERVICE_NAME" >/dev/null 2>&1 || true

# ── 3. Archive logs (always preserved, never deleted) ────────────────────────
if [[ -d "$LOG_DIR" ]]; then
    timestamp="$(date -u +%Y%m%dT%H%M%SZ)"
    archive_dir="${LOG_DIR}.archived-${timestamp}"
    echo "Archiving logs to '${archive_dir}'..."
    mv "$LOG_DIR" "$archive_dir"
fi

# ── 4. Remove installed files ────────────────────────────────────────────────
rm -rf "$INSTALL_DIR" "$CACHE_DIR"
if [[ "$keep_config" -eq 1 ]]; then
    echo "Keeping config at '${CONFIG_DIR}/appsettings.json' (--keep-config)."
else
    rm -rf "$CONFIG_DIR"
fi

# ── 5. Remove the service account ────────────────────────────────────────────
if [[ "$keep_user" -eq 0 ]] && id -u "$SERVICE_USER" >/dev/null 2>&1; then
    echo "Removing service account '${SERVICE_USER}'..."
    userdel "$SERVICE_USER" 2>/dev/null || true
fi

echo ""
echo "Uninstall complete."
if compgen -G "${LOG_DIR}.archived-*" >/dev/null; then
    echo "Archived logs: ${LOG_DIR}.archived-*"
fi
@@ -0,0 +1,576 @@
# mbproxy Multiplatform Implementation Plan
**Created:** 2026-05-15
**Status:** All six phases implemented. 413 tests green on Windows; Windows Service and
Linux systemd install E2E both green. Two findings (pymodbus-sim-on-Linux, `AddSystemd()`
notify) logged as orthogonal follow-ups. Working tree only — nothing committed.
**Working artifact** — not part of the `docs/` source-of-truth tree (per `../DOCS-GUIDE.md`).
Delete or archive once the work lands.
### Progress log
- **2026-05-15 — Phase 1 done, Gate 1 green.** RID removed from `csproj`
(single-file settings now gated on `'$(RuntimeIdentifier)' != ''`);
`publish.ps1` gained `-Rid`; `publish.sh` added. `dotnet build -c Debug` 0
warnings; `dotnet test` **398 passed / 0 failed** (baseline 325 → 398, the
Keepalive feature added tests); `win-x64` → `Mbproxy.exe` 100.1 MB,
`linux-x64` → `Mbproxy` ELF 97.2 MB. ELF launch-smoked on `10.100.0.35`:
full startup, listeners bound, `mbproxy.startup.ready` + admin endpoint up,
no errors. Box prep done (.NET SDK 10.0.300, shellcheck 0.10.0 installed).
- **2026-05-15 — Phases 2 + 3 code done (combined integrator pass).** Packages
added: `Microsoft.Extensions.Hosting.Systemd` 10.0.8,
`Serilog.Sinks.SyslogMessages` 4.1.0 (the maintained IonxSolutions package —
the bare `Serilog.Sinks.Syslog` ID is a near-abandoned 0.2.0 package; same
approved intent). New `DiagnosticSink` enum + `DiagnosticSinkSelector` (pure);
new `SyslogBridge`; `EventLogBridge` truncation extracted to a non-annotated
`EventLogMessage` type (testable cross-OS). `AddMbproxySerilog` now selects
the sink internally; `Program.cs` calls `AddSystemd()` + `AddWindowsService()`.
13 new tests. **411 passed / 0 failed on Windows**; on `10.100.0.35`
**372 passed / 39 skipped / 0 failed** — all 39 skips are simulator-backed
E2E (see finding below), every host/diagnostic/smoke test green on Linux.
- **2026-05-15 — Two cross-platform bugs found and fixed in install tooling.**
(1) `tests/sim/run-dl205-sim.ps1` was Windows-only — hardcoded venv paths
`Scripts\*.exe`; now branches `Scripts`/`.exe` vs `bin`/no extension on `$IsWindows`
and adds `python3` to the interpreter candidates. (2) `install.ps1` /
`uninstall.ps1` used `New-EventLog` / `Remove-EventLog`, which exist only in
Windows PowerShell 5.1 — they fail under PowerShell 7+. Switched to the .NET
API (`[EventLog]::CreateEventSource` / `DeleteEventSource`), symmetric with
the `SourceExists` calls already in those scripts.
- **2026-05-15 — Windows Service E2E green (local, admin).** Republished
`win-x64`; `install.ps1 -Start` installs + starts the service; verified
Running/Automatic, `status.json` served, listeners bound,
`mbproxy.startup.ready` logged, Event Log source registered,
`WindowsServiceLifetime` wrote "Service started successfully" (proves the
process runs under the SCM). `uninstall.ps1` stopped/deleted the service,
archived logs, removed the Event Log source. Box left clean. (A forced
`EventLogBridge` Error+ write was not pursued — `Emit` is unchanged code,
covered by `EventLogMessageTests`; sink selection is covered by
`DiagnosticSinkSelectorTests`.)
- **2026-05-15 — Linux systemd E2E done.** The `linux-x64` ELF runs under a
real systemd unit on `10.100.0.35`: starts, binds listeners, serves the
admin endpoint, and `systemctl stop` → graceful SIGTERM drain
(`mbproxy.shutdown.complete` in the journal). `Type=notify` does not work
(see Findings) → Phase 5 will ship `Type=exec`. Box prep this session:
`dotnet-sdk-10.0`, `shellcheck`, `python3-venv`, pwsh 7.6.1 (dotnet global
tool), pymodbus 3.13.0 venv.
- **2026-05-15 — Phases 4-6 done.** Phase 4: new `install/mbproxy.linux.config.template.json`
(Unix log path `/var/log/mbproxy`, systemd-oriented comments); `csproj` links the
platform-correct template into the published `appsettings.json` by RID
(`win-*`/RID-less → Windows, else Unix) — verified by publishing both RIDs;
`MbproxyOptionsBindingTests` extended to load + schema-validate both templates
(now 413 tests on Windows). Phase 5: `install/mbproxy.service` (`Type=exec`,
hardened, `mbproxy` service account), `install/install.sh`, `install/uninstall.sh`
(`shellcheck` clean); install→active→`status.json` served→uninstall→clean E2E
passed on `10.100.0.35`. Phase 6: `README.md`, `mbproxy/CLAUDE.md`,
`../CLAUDE.md`, `docs/Operations/Configuration.md`, `docs/Reference/LogEvents.md`,
`docs/Operations/Troubleshooting.md`, `docs/Architecture/Overview.md`,
`docs/Features/HotReload.md` updated for the dual-platform reality.
### Findings
- **Linux full run: 374 passed / 37 failed / 0 skipped.** With the simulator
launcher fixed and pymodbus provisioned, the simulator-backed E2E tests now
*run* on Linux (0 skipped) but **37 fail** with `IOException: Broken pipe`
(`SocketException`) when the NModbus client writes through the proxy. The
failures are broad across all simulator-backed E2E (cache, forwarding,
rewriter, supervision). **Not a Phases 1-3 regression:** the multiplatform
work touches only build config, diagnostic sinks, and host registration —
none of the Modbus proxy data path. The same 37 tests pass on Windows
(411/411), and every non-E2E test — including all 13 new diagnostic tests —
passes on Linux. **Root cause isolated:** the `SimulatorSmokeTests` — which
connect *directly to the pymodbus simulator with no proxy in the path* — also
fail (TCP connect error). So the fault is the pymodbus 3.13.0 simulator
itself on this box, not mbproxy's proxy code. Likely pymodbus 3.13.0 vs
Python 3.13.5 (both very new), or the box's Docker-host networking. Treated
as a **separate investigation** (pymodbus-simulator-on-Linux), entirely
orthogonal to the multiplatform service work — see the session report.
- The `run-dl205-sim.ps1` idempotency check keys on `Test-Path $venvDir` only;
a venv left structurally broken by a killed run (no `bin/`) is not detected
and re-created. Pre-existing latent gap, not platform-specific — noted, not
fixed (out of scope; a clean run is unaffected).
- **`AddSystemd()` does not deliver `sd_notify(READY=1)` here → Phase 5 uses
`Type=exec`.** mbproxy runs correctly under systemd (starts, binds, serves,
and SIGTERM → graceful drain all work — verified in the journal), but a
`Type=notify` unit never receives `READY=1` and times out. Isolated step by
step: `SystemdHelpers.IsSystemdService()` correctly returns `True` under
systemd; a *minimal* `Host.CreateApplicationBuilder()` + `AddSystemd()` host
reproduces the failure; both a `systemd-run` transient unit and a real
`Type=notify` unit file fail identically. So it is **not an mbproxy bug**;
it is a `HostApplicationBuilder` + `Microsoft.Extensions.Hosting.Systemd`
10.0.8 (minimal-hosting) issue. **Resolution:** the Phase 5 unit uses
`Type=exec` — mbproxy is a leaf service that nothing orders against, so the
readiness signal is unnecessary; `Type=exec` + the generic host's built-in
POSIX `SIGTERM` handling (independent of `SystemdLifetime`) gives a fully
working unit with `Restart=on-failure`. `AddSystemd()` stays in `Program.cs`
(correct, documented, forward-compatible, harmless). Root-causing the .NET
notify gap is logged as a separate follow-up.
A plan to make mbproxy run on Linux (and incidentally macOS) as a first-class
target while keeping the Windows Service + Event Log behavior intact and adding
systemd + journald/syslog equivalents.
The hosting model (`Host.CreateApplicationBuilder` + `IHostedService` + Kestrel)
is already portable, so the work is narrow: generalize the build, abstract one
diagnostic sink, add one package + one call, and add Linux tooling/docs.
---
## 0. Test Environments
Both platforms can be exercised fully — no environment is simulated or
deferred.
### 0.1 Windows (the dev box — local)
The dev box runs **with administrator rights**, so every Windows gate runs
locally with no separate test machine:
- `install.ps1` (requires elevation) installs the real Windows Service.
- The Event Log source `mbproxy` can be registered and `EventLogBridge` writes
verified against the Application log.
- Install → start → stop → uninstall is a full local round-trip.
> Windows Service E2E mutates machine state (a registered service + Event Log
> source). It is **integrator-only** and the integrator always runs
> `uninstall.ps1` to leave the box clean after each gate.
### 0.2 Linux
**Host:** `dohertj2@10.100.0.35` — Debian 13 (trixie), amd64, kernel 6.12,
hostname `DOCKER`. systemd 257.
- **Access:** passwordless SSH from the Windows dev box; passwordless `sudo`
(verified 2026-05-15).
- **Reachable** on `10.100.0.35` (also `10.50.0.35`, `10.200.0.35`).
- **One-time prep** (run once before Wave 1 gates):
```
ssh dohertj2@10.100.0.35 'sudo apt-get update && \
sudo apt-get install -y dotnet-sdk-10.0 shellcheck'
```
`dotnet-sdk-10.0` candidate is `10.0.203` — matches the `net10.0` target.
- **Docker is installed** on the box (the user is in the `docker` group). Use
ephemeral Debian containers to isolate per-subagent E2E runs so parallel
Wave-4 agents don't collide on the host's systemd / ports (see section 3,
rule 8).
**How the integrator uses the box per gate:**
- Push the integration branch (or `rsync` the worktree) to the box, then run
`dotnet build` / `dotnet test` / `dotnet publish -r linux-x64` over SSH.
- Run the *actual* `linux-x64` ELF binary, the systemd unit, and `shellcheck`
here — Windows can cross-*publish* a `linux-x64` binary but cannot *run* or
service-host it.
> The box is a **shared mutable resource**. Host-level mutations (apt installs,
> `systemctl` on the real host, privileged-port binds) are integrator-only and
> run serially between waves. Subagents that need Linux E2E use throwaway
> Docker containers, never the host's init system directly.
---
## 1. Scope
**In scope**
- Linux (`linux-x64`) as a supported runtime target alongside `win-x64`.
- systemd integration (`Type=notify`, sd_notify readiness, SIGTERM drain).
- A Linux-appropriate error-event diagnostic sink (syslog, severity-mapped).
- RID-agnostic build + dual-RID publish tooling.
- Linux install tooling (systemd unit + shell scripts).
- Docs/README/CLAUDE.md updates.
**Out of scope (state explicitly in docs)**
- macOS `launchd` integration — mbproxy will *run* on macOS as a console
process but ships no service-manager integration.
- ARM RIDs (`linux-arm64`) — the build will not *forbid* them, but they are
untested.
- Container/Docker packaging — separate future effort.
**Locked design decisions**
- Reference `Microsoft.Extensions.Hosting.WindowsServices` *and*
`Microsoft.Extensions.Hosting.Systemd` unconditionally; both packages are
portable and both helpers self-detect their host. No conditional
`<PackageReference>`.
- All Windows API calls (`System.Diagnostics.EventLog`) stay behind
`OperatingSystem.IsWindows()` + `[SupportedOSPlatform("windows")]`; CA1416
(already enforced via `TreatWarningsAsErrors`) is the safety net.
- Diagnostic sink selection happens **once**, at the composition root
(`AddMbproxySerilog`). No OS branching anywhere else.
- Prefer **new files** over editing shared files, to keep parallel work
conflict-free.
- **Linux error-event sink: `Serilog.Sinks.Syslog`** (decided 2026-05-15).
Error+ events get RFC5424 severity mapping on Linux, mirroring the Windows
Event Log behavior where Error+ is surfaced distinctly.
`DiagnosticSinkSelector` returns `EventLog | Syslog | None`.
---
## 2. Phase Breakdown
Each phase lists its **owned file set** (the parallel-safety contract),
changes, tests, and a **gate** that must be green before the next phase starts.
### Phase 1 — Build & publish generalization (foundation)
**Objective:** Remove the hardcoded RID so the project builds/publishes for any
runtime; keep the Windows output byte-identical.
**Owned files**
- `src/Mbproxy/Mbproxy.csproj`
- `install/publish.ps1`
- `install/publish.sh` *(new)*
**Changes**
- `Mbproxy.csproj`: delete `<RuntimeIdentifier>win-x64</RuntimeIdentifier>`
from the Release `PropertyGroup`; keep `PublishSingleFile` / `SelfContained`
/ `IncludeNativeLibrariesForSelfExtract`. RID becomes a publish-time `-r`
argument.
- `publish.ps1`: add a `-Rid` parameter (default `win-x64`), keep the
two-flavor logic.
- `publish.sh`: Linux counterpart producing `linux-x64` self-contained +
framework-dependent builds.
- (The RID-conditioned `appsettings.json` content item is Phase 4; in Phase 1
just confirm the build works without a baked RID.)
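As a rough illustration of the `publish.sh` item above, the two-flavor loop could look something like the sketch below. The project path comes from the owned-files list; the `publish/<rid>/<flavor>` output layout, the function names, and the `DRY_RUN` switch are hypothetical choices for this sketch, not the script that shipped.

```shell
# Sketch only: the output layout and DRY_RUN switch are assumptions.
PROJECT="src/Mbproxy/Mbproxy.csproj"

# Prints the dotnet publish command line for one RID and one flavor.
# flavor is "self-contained" or "framework-dependent".
publish_cmd() {
    rid="$1"
    flavor="$2"
    case "$flavor" in
        self-contained)      sc=true ;;
        framework-dependent) sc=false ;;
        *) echo "unknown flavor: $flavor" >&2; return 2 ;;
    esac
    echo "dotnet publish $PROJECT -c Release -r $rid --self-contained $sc -o publish/$rid/$flavor"
}

# Publishes both flavors for a RID (default linux-x64).
# With DRY_RUN=1 the commands are printed but not executed.
publish_rid() {
    rid="${1:-linux-x64}"
    for flavor in self-contained framework-dependent; do
        cmd="$(publish_cmd "$rid" "$flavor")" || return $?
        echo "$cmd"
        if [ "${DRY_RUN:-0}" != "1" ]; then
            $cmd || return $?
        fi
    done
}
```

Running `DRY_RUN=1 publish_rid win-x64` prints the two command lines without invoking the SDK, which is a cheap way to review what a real run would do.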
**Tests**
- No xunit tests (build-config change). Gate is publish success on both RIDs.
**Gate 1**
- `dotnet build -c Debug` green; `dotnet test` full suite green (unchanged
count).
- `dotnet publish -c Release -r win-x64` produces a single-file `Mbproxy.exe`
(same size class as before).
- `dotnet publish -c Release -r linux-x64` produces a single-file `Mbproxy`
ELF binary. Cross-published from the Windows dev box; the ELF is then copied
to `10.100.0.35` and confirmed to launch (`./Mbproxy --version`-class smoke).
- Zero new analyzer warnings.
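
One inexpensive way to confirm that a publish produced the expected binary format, before copying anything to the Linux box, is to look at the file's magic bytes. The helper below is a hypothetical sketch (the name `binary_kind` is not part of the repository); it distinguishes an ELF image from a Windows PE image by the first bytes only and says nothing about whether the binary actually runs.

```shell
# Prints "elf", "pe", or "unknown" based on the file's leading magic bytes.
binary_kind() {
    magic="$(head -c 4 "$1" | od -An -tx1 | tr -d ' \n')"
    case "$magic" in
        7f454c46) echo "elf" ;;      # 0x7F 'E' 'L' 'F'
        4d5a*)    echo "pe" ;;       # 'M' 'Z' (DOS/PE header)
        *)        echo "unknown" ;;
    esac
}
```

For example, `binary_kind publish/linux-x64/Mbproxy` would be expected to print `elf` for a correct `linux-x64` publish, assuming that output path.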
---
### Phase 2 — Diagnostic sink abstraction
**Objective:** Make error-event delivery a platform-selected sink. Windows
keeps `EventLogBridge`; Linux gets a syslog sink.
**Owned files**
- `src/Mbproxy/Diagnostics/DiagnosticSinkSelector.cs` *(new — pure selection
logic)*
- `src/Mbproxy/Diagnostics/SyslogBridge.cs` *(new)*
- `src/Mbproxy/Diagnostics/EventLogBridge.cs` *(minor: extract the 32 KB
truncation helper into a testable static method)*
- `src/Mbproxy/HostingExtensions.cs` *(only `AddMbproxySerilog`)*
- `src/Mbproxy/Mbproxy.csproj` *(add `Serilog.Sinks.Syslog` package)*
- New test files (see below)
> `HostingExtensions.cs` and `Mbproxy.csproj` are also touched by Phase 3.
> **Phases 2 and 3 must not run in parallel** (see section 3). They are
> sequential.
**Changes**
- `DiagnosticSinkSelector` — a pure function taking
`(bool isWindows, bool isWindowsService, bool isSystemd)` and returning an
enum (`EventLog | Syslog | None`). No I/O, fully unit-testable.
- `SyslogBridge`: Serilog `ILogEventSink` wrapping `Serilog.Sinks.Syslog`,
active for Error+ only, mirroring `EventLogBridge`'s contract (silent no-op
if syslog unavailable).
- `AddMbproxySerilog`: replace the `addEventLogBridge` bool parameter with a
`DiagnosticSinkSelector` result; wire the chosen sink. Keep the
`OperatingSystem.IsWindows()` guard around `EventLogBridge`.
- Extract `EventLogBridge`'s message-truncation into
`internal static string TruncateToEventLogLimit(string)` so it can be tested
OS-independently.
**Tests** (`tests/Mbproxy.Tests/Diagnostics/`)
- `DiagnosticSinkSelectorTests` — table-driven: Windows+service→`EventLog`;
Windows console→`None`; Linux+systemd→`Syslog`; Linux console→`None`;
macOS→`None`.
- `EventLogBridgeTests`: `[Trait("Category","Unit")]`, Windows-guarded facts:
source-missing → silent no-op; truncation helper caps at 32 KB and appends
`...` (this fact runs on all OSes since the helper is pure).
- `SyslogBridgeTests` — Error+ filter; no-throw when transport unavailable.
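
The table-driven cases for the selector can be written down compactly. The shell below is only an illustrative mirror of the decision rule (the real selector is a C# pure function); the function names and the `true`/`false` string encoding are assumptions made for this sketch.

```shell
# Illustrative mirror of the selection rule.
# Arguments are "true"/"false": is_windows is_windows_service is_systemd
select_sink() {
    if [ "$1" = "true" ]; then
        if [ "$2" = "true" ]; then echo "EventLog"; else echo "None"; fi
    elif [ "$3" = "true" ]; then
        echo "Syslog"
    else
        echo "None"
    fi
}

# Walks the expected-outcome table; stops at the first mismatch.
check_table() {
    while read -r w s d expected; do
        actual="$(select_sink "$w" "$s" "$d")"
        if [ "$actual" != "$expected" ]; then
            echo "FAIL: $w $s $d -> $actual (want $expected)"
            return 1
        fi
    done <<'EOF'
true true false EventLog
true false false None
false false true Syslog
false false false None
EOF
    echo "all rows ok"
}
```

The macOS case collapses into the last row: it is neither Windows nor systemd-hosted, so it selects `None` like any other console run.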
**Gate 2**
- Full test suite green on Windows (local); full suite green on Linux —
integrator runs `dotnet test` over SSH on `10.100.0.35`.
- `EventLogBridge` emits to the Application log — verified locally via a real
Windows Service install (`install.ps1`, admin rights available), then
`uninstall.ps1` to clean up.
- CA1416: zero warnings.
---
### Phase 3 — Service host integration (systemd)
**Objective:** Register both init-system integrations; the host correctly
reports readiness to whichever launched it.
**Owned files**
- `src/Mbproxy/Program.cs`
- `src/Mbproxy/HostingExtensions.cs` *(call-site update only)*
- `src/Mbproxy/Mbproxy.csproj` *(add `Microsoft.Extensions.Hosting.Systemd`)*
**Changes**
- `csproj`: add
`<PackageReference Include="Microsoft.Extensions.Hosting.Systemd" />` (pin to
the 10.0.x line matching the existing Windows-services package).
- `Program.cs`: call `builder.Services.AddSystemd();` alongside
`AddWindowsService();`. Compute `isSystemd` via
`SystemdHelpers.IsSystemdService()` and feed `DiagnosticSinkSelector`
together with `isWindowsService`.
- Confirm SIGTERM → host shutdown → existing
`Connection.GracefulShutdownTimeoutMs` drain path works (it does — POSIX
signal handling is built into the generic host; just verify).
**Tests** (`tests/Mbproxy.Tests/HostSmokeTests.cs` — extend existing file)
- `HostSmoke_RegistersBothServiceIntegrations_StartsAndStops` — builds the host
with both `AddWindowsService` + `AddSystemd`, asserts no throw, asserts
`mbproxy.startup.ready` still logged.
- Existing two smoke tests must remain green.
**Gate 3**
- Full suite green on Windows (local) and Linux (`10.100.0.35` via SSH).
- Windows Service E2E, run locally with admin rights: `install.ps1` → service
starts, logs `mbproxy.startup.ready` + writes to Event Log, `Stop-Service`
drains cleanly, `uninstall.ps1` removes it. **No regression** in Windows
behavior is the hard requirement of this gate.
- Linux systemd E2E on `10.100.0.35`: **done.** The `linux-x64` binary runs
under a real systemd unit: it starts, binds listeners, serves the admin
endpoint, and `systemctl stop` (SIGTERM) drains gracefully
(`mbproxy.shutdown.complete` in the journal). `Type=notify` was found not to
deliver `READY=1` (Findings) → the Phase 5 unit uses `Type=exec`, under which
the service is fully functional.
---
### Phase 4 — Config & filesystem portability
**Objective:** No Windows-only paths in the shipped/installed config.
**Owned files**
- `install/mbproxy.config.template.json` *(Windows — keep `C:\ProgramData\...`
path)*
- `install/mbproxy.linux.config.template.json` *(new — `/var/log/mbproxy/...`,
Linux syslog `Using` entry)*
- `src/Mbproxy/Mbproxy.csproj` *(condition the linked `appsettings.json`
content item by `$(RuntimeIdentifier)`)*
> Touches `csproj`. Must run after Phase 3's csproj edit is merged (sequential
> w.r.t. csproj), but is otherwise independent of Phase 5/6.
**Changes**
- New Linux template: log path `/var/log/mbproxy/mbproxy-.log`; Serilog
`Using` array includes the syslog sink; comment header points at
`/etc/mbproxy/appsettings.json`.
- `csproj`: link the win template for `win-*` RIDs and the linux template for
`linux-*` RIDs into the published `appsettings.json` (RID-conditioned
`<Content>` items).
**Tests** (`tests/Mbproxy.Tests/Options/`)
- Extend `MbproxyOptionsBindingTests`: load **each** shipped template through
the config binder + `MbproxyOptionsValidator`; assert both bind and validate
cleanly. Catches a malformed Linux template at build time.
**Gate 4**
- Both templates bind + validate (new test green).
- `dotnet publish -r linux-x64` ships the Linux template as `appsettings.json`;
`-r win-x64` ships the Windows one. Verify by inspecting publish output.
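
The publish-output inspection in Gate 4 can be approximated with a small script. This sketch follows the rule recorded in the progress log (`win-*` or RID-less selects the Windows template, anything else the Unix one) and assumes the published `appsettings.json` is a byte-for-byte copy of the chosen template; the function names and directory arguments are hypothetical.

```shell
# Maps a RID to the template that becomes appsettings.json.
# An empty RID (a RID-less build) falls back to the Windows template.
template_for_rid() {
    case "${1:-}" in
        ""|win-*) echo "install/mbproxy.config.template.json" ;;
        *)        echo "install/mbproxy.linux.config.template.json" ;;
    esac
}

# Succeeds when the published appsettings.json matches the template for
# that RID. Arguments: repo_root publish_dir rid
verify_published_config() {
    repo_root="$1"
    publish_dir="$2"
    rid="$3"
    cmp -s "$repo_root/$(template_for_rid "$rid")" "$publish_dir/appsettings.json"
}
```

A call such as `verify_published_config . publish/linux-x64 linux-x64` returns a non-zero status if the wrong template was shipped, so it can sit directly in a gate script.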
---
### Phase 5 — Linux install tooling
**Objective:** Parity with `install.ps1` for systemd hosts.
**Owned files** (all new, fully disjoint from all other phases)
- `install/mbproxy.service` — systemd unit, **`Type=exec`** (not `Type=notify`
because, per Findings, `AddSystemd()` does not deliver `READY=1` for the minimal
hosting model), `Restart=on-failure`, `User=mbproxy`, `ExecStart` pointing at
the installed binary; sets `DOTNET_BUNDLE_EXTRACT_BASE_DIR`.
- `install/install.sh` — creates `mbproxy` service account, lays down binary +
`/etc/mbproxy/appsettings.json` (preserve-if-exists, matching `install.ps1`
semantics), creates `/var/log/mbproxy`, installs + `systemctl enable --now`.
- `install/uninstall.sh`: runs `systemctl disable --now`, archives logs (mirror the
`.archived-<ts>` convention), removes unit.
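
To make the unit description concrete, here is a sketch of a generator for a unit with these properties. The install and cache directories match the variables in `uninstall.sh`; the binary name `Mbproxy`, the `Description`, the network ordering lines, and the two hardening directives are assumptions for illustration and may differ from the shipped `mbproxy.service`.

```shell
# Renders a Type=exec unit to stdout.
# Arguments (both optional): install_dir cache_dir
render_unit() {
    install_dir="${1:-/opt/mbproxy}"
    cache_dir="${2:-/var/cache/mbproxy}"
    cat <<EOF
[Unit]
Description=mbproxy Modbus proxy
After=network-online.target
Wants=network-online.target

[Service]
Type=exec
User=mbproxy
ExecStart=${install_dir}/Mbproxy
Environment=DOTNET_BUNDLE_EXTRACT_BASE_DIR=${cache_dir}
Restart=on-failure
NoNewPrivileges=true
ProtectSystem=full

[Install]
WantedBy=multi-user.target
EOF
}
```

The rendered text can be written to a scratch file and passed to `systemd-analyze verify` on a systemd host before it is installed.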
**Tests**
- Not xunit. Gate = `shellcheck` clean + a dry-run inside a throwaway Debian
container on `10.100.0.35`.
**Gate 5**
- `shellcheck install/*.sh` clean — run on `10.100.0.35` (shellcheck installed
in the one-time prep).
- End-to-end on `10.100.0.35`, inside a throwaway Debian container:
`install.sh` → service active → proxy answers Modbus on a configured port →
`uninstall.sh` → service gone, logs archived. Container isolation keeps the
`mbproxy` service account / unit off the real host.
---
### Phase 6 — Documentation
**Objective:** Docs reflect dual-platform reality; doctrine in `DOCS-GUIDE.md`
respected.
**Owned files**
- `README.md` — rewrite "Hard constraints / prerequisites" (drop "No Linux or
Docker support"); add Linux install path; document both publish flavors ×
both RIDs.
- `docs/Operations/Configuration.md` — both config templates, log-path
differences, syslog vs Event Log.
- `docs/Operations/Troubleshooting.md`: `journalctl` guidance alongside Event
Viewer.
- `docs/Architecture/Overview.md` — note dual init-system hosting (only if it
shifts a headline bullet).
- `docs/Reference/LogEvents.md` — note Error+ events route to Event Log
(Windows) / syslog (Linux).
- `mbproxy/CLAUDE.md` — correct the implied Windows-only framing.
- `wwtools/CLAUDE.md` — broaden the mbproxy index row if the task→tool mapping
changed.
**Tests**
- Markdown link-check across touched files.
**Gate 6**
- All internal doc links resolve.
- README "Hard constraints" no longer contradicts the shipped tooling.
---
## 3. Parallel Subagent Execution Plan
### Dependency graph
```
Phase 1 (build) ──> Phase 2 (diagnostics) ──> Phase 3 (host) ──┬─> Phase 4 (config)
                                                               ├─> Phase 5 (install)
                                                               └─> Phase 6 (docs)
```
Phases 2 and 3 are **strictly sequential**: Phase 3 calls the new
`AddMbproxySerilog` signature Phase 2 defines, and both edit
`HostingExtensions.cs` + `csproj`. Phases 4, 5, 6 are **mutually independent**
and parallelizable once Phase 3 is merged.
### Wave plan
| Wave | Phases | Agents | Mode |
| ---- | --------- | ------------------- | ----------------------------------------------- |
| W1 | Phase 1 | 1 agent | Single — touches `csproj` |
| W2 | Phase 2 | 1 agent | Single — touches `csproj` + `HostingExtensions` |
| W3 | Phase 3 | 1 agent | Single — touches `csproj` + `HostingExtensions` + `Program.cs` |
| W4 | 4, 5, 6 | 3 agents (parallel) | Parallel — disjoint file sets |
> Phase 4 touches `csproj` but no other W4 phase does, so within W4 the file
> sets are still disjoint. Safe.
### File-ownership matrix (the parallel-safety contract)
| File | P1 | P2 | P3 | P4 | P5 | P6 |
| --------------------------------------------- | -- | -- | -- | -- | -- | -- |
| `Mbproxy.csproj` | x | x | x | x | | |
| `HostingExtensions.cs` | | x | x | | | |
| `Program.cs` | | | x | | | |
| `Diagnostics/*` (new + EventLogBridge) | | x | | | | |
| `install/publish.*` | x | | | | | |
| `install/*.config.template.json` | | | | x | | |
| `install/install.sh`, `uninstall.sh`, `.service` | | | | | x | |
| `tests/**` | | x | x | x | | |
| docs / READMEs / CLAUDE.md | | | | | | x |
No column in W4 (P4/P5/P6) shares a row. Confirmed conflict-free.
### Subagent rules (enforce in every dispatch prompt)
1. **One git worktree per subagent** — dispatch each `Agent` call with
`isolation: "worktree"`. Physical isolation means even a stray edit can't
corrupt a sibling's tree.
2. **Owned-file contract** — each subagent is told its exact owned file set
from the matrix and instructed to edit nothing outside it. A subagent that
discovers it needs an out-of-set file must stop and report, not edit.
3. **No intra-wave API coupling** — subagents in the same wave may only depend
on public APIs from *already-merged* prior waves, never on a sibling's
in-progress work. (This is why P2→P3 are separate waves, not parallel.)
4. **Tests ship with code** — the subagent that writes a phase's code also
writes that phase's tests and runs `dotnet test` green *in its own
worktree* before reporting done. No separate "test agent."
5. **Integrator merges in declared order** — the main agent merges each
worktree, runs the full build + test suite, and only then declares the
phase gate met. A failed gate blocks the next wave.
6. **High-contention files are single-agent-only**: `csproj`,
`HostingExtensions.cs`, `Program.cs`, `CLAUDE.md` are never edited by two
agents in the same wave (the matrix guarantees this).
7. **Prefer new files**: `DiagnosticSinkSelector.cs`, `SyslogBridge.cs`,
`mbproxy.linux.config.template.json`, the shell scripts, the unit file are
all new — new files can't merge-conflict, maximizing safe parallelism.
8. **Shared test hosts are integrator-only for mutations** — subagents may run
`dotnet build` / `dotnet test` (read-mostly) but must **not** install a
Windows Service, register an Event Log source, or `systemctl` against the
real `10.100.0.35` host. Service-level E2E is the integrator's job at gate
time; if a subagent needs Linux E2E it spins an ephemeral Docker container
on the box (named per-agent, `--rm`) so parallel agents never collide on
ports, the init system, or service accounts.
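
Rule 8's per-agent container naming might be sketched as follows. The `mbproxy-e2e-` prefix, the function names, and the `debian:13` default image are assumptions for illustration; the sketch only prints the `docker run` prefix and leaves the command to run inside the container to the caller.

```shell
# Builds a per-agent container name that is valid for Docker:
# anything outside letters, digits, '_', '.', '-' becomes '-'.
container_name() {
    printf 'mbproxy-e2e-%s' "$(printf '%s' "$1" | tr -c 'A-Za-z0-9_.-' '-')"
}

# Prints the docker invocation prefix for an agent; nothing is executed.
# Arguments: agent_id [image]
agent_container_cmd() {
    agent="$1"
    image="${2:-debian:13}"
    echo "docker run --rm --name $(container_name "$agent") $image"
}
```

Because Docker rejects a second container with the same name, a name derived from the agent identifier also acts as a simple guard against one agent starting two E2E runs at once.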
### Merge protocol per wave
```
for each wave:
    dispatch agent(s) with isolation: worktree + owned-file list
    on completion:
        integrator: merge worktree(s) in matrix order
        integrator: dotnet build -c Debug (must be green)
        integrator: dotnet test (green, count >= prior)
        integrator: dotnet publish -r win-x64 AND -r linux-x64 (must succeed)
        integrator: verify phase-specific gate checklist
    gate green? -> next wave. gate red? -> fix in a single-agent pass, re-gate.
```
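
The integrator's part of this protocol lends itself to a small fail-fast runner. The sketch below is one possible shape, with hypothetical function names; it runs each step through `sh -c`, stops at the first failure, and includes a helper for the "count >= prior" rule that takes the two test counts as plain numbers.

```shell
# Runs each gate step in order and stops at the first failure.
# Each argument is one command line, evaluated by sh.
run_gate() {
    for step in "$@"; do
        echo "gate step: $step"
        if ! sh -c "$step"; then
            echo "gate RED at: $step" >&2
            return 1
        fi
    done
    echo "gate GREEN"
}

# Succeeds when the current passing-test count is not below the prior one.
# Arguments: prior_count current_count
count_not_regressed() {
    [ "$2" -ge "$1" ]
}

# Example (not executed here):
# run_gate "dotnet build -c Debug" "dotnet test" \
#          "dotnet publish -c Release -r win-x64" \
#          "dotnet publish -c Release -r linux-x64"
```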
---
## 4. Cross-Cutting Test Strategy
- **Existing baseline (325 = 282 unit + 43 E2E) must never regress.** Every
gate re-runs the full suite.
- **New tests target pure logic**: `DiagnosticSinkSelector` is a pure function
precisely so platform-selection is testable without being a service. Highest-
value new test.
- **OS-conditional tests** use `[Trait]` + a runtime `OperatingSystem.IsWindows()`
skip so the suite is green on both Windows and Linux.
- **Both platforms are exercised every gate, no simulation.** Windows runs
locally (admin rights → real Windows Service install). Linux runs on
`dohertj2@10.100.0.35` (Debian 13, systemd 257) — the integrator drives
`dotnet build` / `dotnet test` / publish / systemd E2E over SSH.
- **CI** (if/when a pipeline exists): add a `linux-x64` build+test leg, ideally
pointed at the same box or an equivalent image. Until then the integrator's
per-gate SSH run on `10.100.0.35` is the Linux leg.
- **CA1416 platform analyzer** is treated as a test — `TreatWarningsAsErrors`
already fails the build if a Windows API escapes its guard.
---
## 5. Risk Register
| Risk | Phase | Mitigation |
| --------------------------------------------- | ----- | -------------------------------------------------------------------------- |
| Windows Service behavior regresses unnoticed | P3 | Gate 3 mandates a real Windows Service install/start/stop smoke check |
| `Serilog.Sinks.Syslog` version drift | P2 | Pin the version; `SyslogBridge` is isolated behind `DiagnosticSinkSelector` |
| Linux publish ships Windows config path | P4 | RID-conditioned `<Content>` item + `MbproxyOptionsBindingTests` on both templates |
| Self-extracting single-file temp-dir perms | P1/P5 | Document + set `DOTNET_BUNDLE_EXTRACT_BASE_DIR` in the systemd unit |
| Two agents racing `csproj` | all | Matrix forbids it — `csproj` edited only in single-agent waves W1-W3 + lone P4 |
| Hidden Windows path elsewhere in code | all | `Grep` sweep for `C:\\`, `ProgramData`, `\\\\` before Gate 6 |
| Parallel Wave-4 agents collide on the shared `10.100.0.35` host | W4 | Rule 8 — service-level E2E is integrator-only and serial; subagent E2E uses per-agent `--rm` Docker containers |
| Windows Service E2E leaves stale service/Event Log source | P2/P3 | Integrator always runs `uninstall.ps1` after each Windows gate |
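
The hidden-Windows-path sweep in the register can be run with plain `grep`. The sketch below is an assumed form of that sweep (the function name and the default `src` directory are hypothetical); it searches for the three patterns the register names: a `C:\` drive prefix, `ProgramData`, and a doubled backslash.

```shell
# Lists lines under a directory that still carry Windows-style paths.
# Exit status follows grep: 0 when something was found, 1 when the tree is clean.
sweep_windows_paths() {
    grep -rnE 'C:\\|ProgramData|\\\\' "${1:-src}"
}
```

Hits inside the Windows config template or Windows-only code are expected; the sweep is meant to surface paths that leak into shared or Linux-facing files.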
---
## 6. Deliverable Summary
- **3 modified source files** (`csproj`, `HostingExtensions.cs`, `Program.cs`)
+ **3 new** (`DiagnosticSinkSelector.cs`, `SyslogBridge.cs`, and the
truncation-helper extraction in `EventLogBridge.cs`).
- **2 new packages** (`Microsoft.Extensions.Hosting.Systemd`,
`Serilog.Sinks.Syslog`).
- **5 new install/tooling files** (`publish.sh`, Linux config template,
`mbproxy.service`, `install.sh`, `uninstall.sh`).
- **~6-8 new tests** across 3 new/extended test files; baseline 325 preserved.
- **7 doc files** updated.
- **4 waves**, max 3 concurrent subagents, conflict-free by construction.
@@ -0,0 +1,60 @@
namespace Mbproxy.Diagnostics;
/// <summary>
/// The platform diagnostic sink to wire for <c>Error</c>+ events — picked once,
/// at the composition root, by <see cref="DiagnosticSinkSelector"/>.
/// </summary>
internal enum DiagnosticSink
{
    /// <summary>
    /// No platform diagnostic sink — console (and rolling-file) sinks only. Used
    /// for interactive / dev runs on every OS.
    /// </summary>
    None,

    /// <summary>
    /// Windows Application Event Log, via <see cref="EventLogBridge"/>. Selected
    /// only when the process is hosted as a Windows Service.
    /// </summary>
    EventLog,

    /// <summary>
    /// Local syslog, via <see cref="SyslogBridge"/>. Selected only when the
    /// process is hosted as a systemd service on Linux.
    /// </summary>
    Syslog,
}
/// <summary>
/// Pure platform-selection logic for the <c>Error</c>+ diagnostic sink. Holds no
/// I/O and no host APIs so it is unit-testable for every OS / host combination;
/// the host detection itself happens in <see cref="HostingExtensions.AddMbproxySerilog"/>.
/// </summary>
internal static class DiagnosticSinkSelector
{
    /// <summary>
    /// Picks the diagnostic sink for the current host:
    /// <list type="bullet">
    /// <item>Windows hosted as a Windows Service → <see cref="DiagnosticSink.EventLog"/>.</item>
    /// <item>Linux hosted as a systemd service → <see cref="DiagnosticSink.Syslog"/>.</item>
    /// <item>Everything else — interactive / dev runs, macOS, launches not owned
    /// by an init system → <see cref="DiagnosticSink.None"/>.</item>
    /// </list>
    /// The managed-service gate mirrors the original <see cref="EventLogBridge"/>
    /// contract: a diagnostic sink is wired only when an init system actually owns
    /// the process, so dev / console runs never need an Event Log source registered
    /// or a syslog socket reachable.
    /// </summary>
    /// <param name="isWindows">Running on Windows.</param>
    /// <param name="isWindowsService">Hosted by the Windows Service Control Manager.</param>
    /// <param name="isSystemd">Hosted by systemd.</param>
    public static DiagnosticSink Select(bool isWindows, bool isWindowsService, bool isSystemd)
    {
        // Windows takes precedence: isSystemd is meaningless there, and on
        // non-Windows isWindowsService is always false.
        if (isWindows)
            return isWindowsService ? DiagnosticSink.EventLog : DiagnosticSink.None;

        return isSystemd ? DiagnosticSink.Syslog : DiagnosticSink.None;
    }
}
@@ -5,6 +5,32 @@ using Serilog.Events;
namespace Mbproxy.Diagnostics;
/// <summary>
/// Pure message-shaping helpers for the Windows Event Log. Kept on a separate,
/// non-platform-annotated type — <em>not</em> on <see cref="EventLogBridge"/>,
/// which is <c>[SupportedOSPlatform("windows")]</c> — so the truncation logic is
/// unit-testable on any OS without tripping the platform-compatibility analyzer.
/// </summary>
internal static class EventLogMessage
{
    /// <summary>The Windows Event Log single-entry limit, in bytes (32 KB).</summary>
    public const int MaxBytes = 32 * 1024;

    /// <summary>
    /// Truncates <paramref name="message"/> so its UTF-16 byte length stays within
    /// <see cref="MaxBytes"/>, appending an ellipsis when truncation occurs. Shorter
    /// messages are returned unchanged.
    /// </summary>
    public static string TruncateToLimit(string message)
    {
        // Rough UTF-16 upper bound: 2 bytes per char.
        if (message.Length * 2 <= MaxBytes) return message;

        int charLimit = MaxBytes / 2 - 3; // leave room for the "..." suffix
        return message[..charLimit] + "...";
    }
}
/// <summary>
/// Serilog sink that writes events at level Error and above to the Windows Event Log
/// under source <c>mbproxy</c>.
@@ -26,7 +52,6 @@ internal sealed class EventLogBridge : ILogEventSink
{
private const string Source = "mbproxy";
private const string LogName = "Application";
private const int MaxMessageBytes = 32 * 1024; // 32 KB Event Log limit
private readonly bool _enabled;
// Cache the source-exists check at construction so Emit doesn't hit the registry on
@@ -63,11 +88,7 @@ internal sealed class EventLogBridge : ILogEventSink
}
// Truncate to the Event Log single-entry limit.
if (message.Length * 2 > MaxMessageBytes) // rough UTF-16 upper bound
{
int charLimit = MaxMessageBytes / 2 - 3;
message = message[..charLimit] + "...";
}
message = EventLogMessage.TruncateToLimit(message);
var type = logEvent.Level switch
{
@@ -0,0 +1,50 @@
using Serilog;
using Serilog.Debugging;
using Serilog.Events;
namespace Mbproxy.Diagnostics;
/// <summary>
/// Wires the local-syslog sink for <c>Error</c>+ events when mbproxy runs as a
/// systemd service on Linux — the cross-platform counterpart of
/// <see cref="EventLogBridge"/>.
///
/// <para>Events at <see cref="LogEventLevel.Error"/> and above are written to the
/// local syslog socket (<c>/dev/log</c>) under the application name
/// <see cref="AppName"/>, with Serilog levels mapped to syslog severities by the
/// sink. On a systemd host the local syslog socket is provided by
/// <c>systemd-journald</c>, so these events land in the journal at
/// <c>err</c>/<c>crit</c> priority — distinct from the process's stdout, which
/// journald captures at <c>info</c>.</para>
///
/// <para>If the local syslog socket is unavailable the bridge degrades silently
/// to the console (and rolling-file) sinks rather than failing logger
/// construction, mirroring <see cref="EventLogBridge"/>'s no-op-when-unavailable
/// contract.</para>
/// </summary>
internal static class SyslogBridge
{
/// <summary>syslog application name — the <c>TAG</c> field of each entry.</summary>
internal const string AppName = "mbproxy";
/// <summary>
/// Attaches the <c>Error</c>+ local-syslog sink to <paramref name="cfg"/> and
/// returns it for fluent chaining. Never throws: a host where the syslog sink
/// cannot be configured degrades to <paramref name="cfg"/> unchanged.
/// </summary>
public static LoggerConfiguration AttachTo(LoggerConfiguration cfg)
{
try
{
return cfg.WriteTo.LocalSyslog(
appName: AppName,
restrictedToMinimumLevel: LogEventLevel.Error);
}
catch (Exception ex)
{
// Degrade to the sinks already configured (console, rolling file) rather than crash logger construction.
SelfLog.WriteLine("SyslogBridge: local syslog unavailable, console-only: {0}", ex);
return cfg;
}
}
}
+30 -13
@@ -2,7 +2,10 @@ using Mbproxy.Admin;
using Mbproxy.Configuration;
using Mbproxy.Diagnostics;
using Mbproxy.Options;
using Microsoft.Extensions.Hosting.Systemd;
using Microsoft.Extensions.Hosting.WindowsServices;
using Serilog;
using Serilog.Events;
namespace Mbproxy;
@@ -62,25 +65,39 @@ internal static class HostingExtensions
/// Configures Serilog from the <c>"Serilog"</c> configuration section, with console
/// and rolling-file sinks as defaults.
///
/// <para>When <paramref name="addEventLogBridge"/> is <c>true</c>, the
/// <see cref="Diagnostics.EventLogBridge"/> is added as a sub-sink for events at
/// <see cref="Serilog.Events.LogEventLevel.Error"/> and above. This flag should only be
/// set when the service is running as a Windows Service — the bridge silently ignores
/// events when the Event Log source is not registered.</para>
/// <para>This is the single composition-root point where the platform diagnostic
/// sink for <c>Error</c>+ events is chosen. <see cref="DiagnosticSinkSelector"/>
/// picks it from the current host:
/// <list type="bullet">
/// <item>Windows Service → <see cref="Diagnostics.EventLogBridge"/> (Application
/// Event Log).</item>
/// <item>systemd service → <see cref="Diagnostics.SyslogBridge"/> (local syslog).</item>
/// <item>interactive / dev runs (any OS) → no platform sink.</item>
/// </list>
/// Both bridges silently no-op when their backing facility is unavailable, so a
/// dev run never needs an Event Log source registered or a syslog socket.</para>
/// </summary>
public static IHostApplicationBuilder AddMbproxySerilog(
this IHostApplicationBuilder builder,
bool addEventLogBridge = false)
public static IHostApplicationBuilder AddMbproxySerilog(this IHostApplicationBuilder builder)
{
var cfg = new LoggerConfiguration()
.ReadFrom.Configuration(builder.Configuration);
if (addEventLogBridge && OperatingSystem.IsWindows())
var sink = DiagnosticSinkSelector.Select(
isWindows: OperatingSystem.IsWindows(),
isWindowsService: WindowsServiceHelpers.IsWindowsService(),
isSystemd: SystemdHelpers.IsSystemdService());
cfg = sink switch
{
cfg = cfg.WriteTo.Sink(
new EventLogBridge(enabled: true),
Serilog.Events.LogEventLevel.Error);
}
// EventLogBridge is [SupportedOSPlatform("windows")]; the extra
// OperatingSystem.IsWindows() guard satisfies the platform analyzer
// (DiagnosticSinkSelector already guarantees Windows for this case).
DiagnosticSink.EventLog when OperatingSystem.IsWindows()
=> cfg.WriteTo.Sink(new EventLogBridge(enabled: true), LogEventLevel.Error),
DiagnosticSink.Syslog
=> SyslogBridge.AttachTo(cfg),
_ => cfg,
};
Log.Logger = cfg.CreateLogger();
+36 -14
@@ -12,16 +12,19 @@
<InformationalVersion>1.0.0</InformationalVersion>
</PropertyGroup>
<!-- Single-file self-contained publish (Release only; Debug stays normal for fast iteration).
The resulting Mbproxy.exe is ~100 MB because the self-contained publish bundles the full
.NET 10 + ASP.NET Core runtime — fixed cost of self-contained deployment on .NET 10 with
ASP.NET Core. Operators who need a smaller footprint can use a framework-dependent publish
(dotnet publish -c Release -r win-x64 -p:SelfContained=false -p:PublishSingleFile=true)
if the target machine has .NET 10 installed. -->
<PropertyGroup Condition="'$(Configuration)' == 'Release'">
<!-- Single-file publish settings — apply only to a Release publish with an explicit RID.
Publishing with -r <rid> produces a single-file binary, self-contained by default
(bundles the .NET 10 + ASP.NET Core runtime, ~100 MB) so no .NET install is needed on
the target. Override with -p:SelfContained=false for a framework-dependent build
(~1.6 MB) when the target already has the .NET 10 + ASP.NET Core runtime.
The RID is supplied per publish (win-x64, linux-x64, ...) and is deliberately NOT
hardcoded here — see install/publish.ps1 / install/publish.sh. The
'$(RuntimeIdentifier)' != '' guard means a plain `dotnet build -c Release` with no RID
stays an ordinary framework build (SelfContained without a RID is an SDK error). -->
<PropertyGroup Condition="'$(Configuration)' == 'Release' and '$(RuntimeIdentifier)' != ''">
<PublishSingleFile>true</PublishSingleFile>
<SelfContained>true</SelfContained>
<RuntimeIdentifier>win-x64</RuntimeIdentifier>
<IncludeNativeLibrariesForSelfExtract>true</IncludeNativeLibrariesForSelfExtract>
</PropertyGroup>
@@ -32,12 +35,19 @@
<ItemGroup>
<!-- Microsoft.Extensions.Hosting is already included transitively via
Microsoft.AspNetCore.App — do not re-add it explicitly. -->
Microsoft.AspNetCore.App — do not re-add it explicitly.
The two init-system integration packages are both portable: each is
safe to reference and call on any OS (the helper self-detects its host
and no-ops otherwise), so no conditional reference is needed. -->
<PackageReference Include="Microsoft.Extensions.Hosting.WindowsServices" Version="10.0.8" />
<PackageReference Include="Microsoft.Extensions.Hosting.Systemd" Version="10.0.8" />
<PackageReference Include="Serilog.Extensions.Hosting" Version="10.0.0" />
<PackageReference Include="Serilog.Settings.Configuration" Version="10.0.0" />
<PackageReference Include="Serilog.Sinks.Console" Version="6.1.1" />
<PackageReference Include="Serilog.Sinks.File" Version="7.0.0" />
<!-- Local-syslog sink for the Linux diagnostic bridge (Error+ events).
Serilog.Sinks.SyslogMessages is the maintained IonxSolutions package. -->
<PackageReference Include="Serilog.Sinks.SyslogMessages" Version="4.1.0" />
<!-- Polly: backend-connect retry pipeline (PolicyFactory.BuildBackendConnect) and
listener-recovery pipeline (PolicyFactory.BuildListenerRecovery). -->
<PackageReference Include="Polly" Version="8.6.6" />
@@ -48,17 +58,29 @@
<InternalsVisibleTo Include="Mbproxy.Tests" />
</ItemGroup>
<!-- Link the platform-appropriate install template as the published appsettings.json so
the binary ships with a fully-commented, usable example config (PLCs, BCD tags, all
sections present) instead of an empty stub. The .NET configuration loader supports
JSONC (comments) under the default Host.CreateApplicationBuilder path, so the comments
in the template are valid at runtime.
The two templates differ only in OS-specific paths (log directory) and platform
notes. A `dotnet publish -r linux-*` (or any non-win RID) ships the Linux template;
win-* and a plain RID-less dev build ship the Windows one. -->
<ItemGroup>
<!-- Link the install template as the published appsettings.json so the binary ships
with a fully-commented, usable example config (one PLC, one BCD tag, all sections
present) instead of an empty stub. The .NET configuration loader supports JSONC
(comments) under the default Host.CreateApplicationBuilder path, so the comments
in the template are valid at runtime. -->
<None Remove="appsettings.json" />
</ItemGroup>
<ItemGroup Condition="'$(RuntimeIdentifier)' == '' or $(RuntimeIdentifier.StartsWith('win'))">
<Content Include="..\..\install\mbproxy.config.template.json"
Link="appsettings.json">
<CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
</Content>
</ItemGroup>
<ItemGroup Condition="'$(RuntimeIdentifier)' != '' and !$(RuntimeIdentifier.StartsWith('win'))">
<Content Include="..\..\install\mbproxy.linux.config.template.json"
Link="appsettings.json">
<CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
</Content>
</ItemGroup>
</Project>
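The two `ItemGroup` conditions above assign every RID to exactly one template. A minimal C# sketch of the same predicate follows. `TemplateFor` is a hypothetical helper used only for illustration (the real selection happens in MSBuild), and the ordinal comparison is an assumption about how `StartsWith('win')` is meant to be read.

```csharp
using System;

// Hypothetical helper mirroring the csproj conditions; not part of mbproxy.
string TemplateFor(string rid) =>
    rid.Length == 0 || rid.StartsWith("win", StringComparison.Ordinal)
        ? "mbproxy.config.template.json"        // Windows template (also the RID-less dev build)
        : "mbproxy.linux.config.template.json"; // every other RID

// Sample RIDs chosen for illustration.
foreach (var rid in new[] { "", "win-x64", "win-arm64", "linux-x64", "osx-arm64" })
    Console.WriteLine($"'{rid}' -> {TemplateFor(rid)}");
```

Under this reading, a non-Windows, non-Linux RID such as `osx-arm64` also receives the Linux template, which matches the "any non-win RID" wording in the csproj comment.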
+10 -6
@@ -1,17 +1,21 @@
using Mbproxy;
using Mbproxy.Proxy;
using Microsoft.Extensions.Hosting.Systemd;
using Microsoft.Extensions.Hosting.WindowsServices;
var builder = Host.CreateApplicationBuilder(args);
// Windows Service support; no-op when running under dotnet run / console.
// Init-system integration. Both helpers self-detect their host and are no-ops
// otherwise, so calling both unconditionally is correct on every platform:
// - AddWindowsService(): active only when launched by the Windows SCM.
// - AddSystemd(): active only when launched by systemd (wires sd_notify
// readiness; SIGTERM shutdown is handled by the host).
builder.Services.AddWindowsService();
builder.Services.AddSystemd();
// Wire EventLogBridge only when actually running as a Windows Service.
bool isWindowsService = WindowsServiceHelpers.IsWindowsService();
// Wire up structured config, Serilog, and typed options.
builder.AddMbproxySerilog(addEventLogBridge: isWindowsService);
// Wire up structured config, Serilog, and typed options. AddMbproxySerilog selects
// the platform diagnostic sink (Windows Event Log / syslog / none) internally.
builder.AddMbproxySerilog();
builder.AddMbproxyOptions();
// PDU pipeline: BcdPduPipeline is stateless (per-call correlation flows through
@@ -0,0 +1,40 @@
using Mbproxy.Diagnostics;
using Shouldly;
using Xunit;
namespace Mbproxy.Tests.Diagnostics;
/// <summary>
/// Unit tests for <see cref="DiagnosticSinkSelector"/> — the pure platform-selection
/// logic for the Error+ diagnostic sink. Covers every OS / host combination so the
/// selection contract is pinned without needing a real Windows Service or systemd host.
/// </summary>
[Trait("Category", "Unit")]
public sealed class DiagnosticSinkSelectorTests
{
// 'expected' is the underlying int of DiagnosticSink: the enum is internal and
// cannot appear in a public (xunit-discoverable) method signature.
[Theory]
[InlineData(true, true, false, (int)DiagnosticSink.EventLog)] // Windows, hosted as a Windows Service
[InlineData(true, false, false, (int)DiagnosticSink.None)] // Windows, interactive / dev run
[InlineData(false, false, true, (int)DiagnosticSink.Syslog)] // Linux, hosted as a systemd service
[InlineData(false, false, false, (int)DiagnosticSink.None)] // Linux / macOS, interactive / dev run
public void Select_PicksExpectedSink(
bool isWindows, bool isWindowsService, bool isSystemd, int expected)
=> ((int)DiagnosticSinkSelector.Select(isWindows, isWindowsService, isSystemd)).ShouldBe(expected);
[Fact]
public void Select_Windows_TakesPrecedence_OverASpuriousSystemdFlag()
=> DiagnosticSinkSelector.Select(isWindows: true, isWindowsService: true, isSystemd: true)
.ShouldBe(DiagnosticSink.EventLog);
[Fact]
public void Select_WindowsConsoleRun_GetsNoSink_EvenIfSystemdFlagSet()
=> DiagnosticSinkSelector.Select(isWindows: true, isWindowsService: false, isSystemd: true)
.ShouldBe(DiagnosticSink.None);
[Fact]
public void Select_NonWindowsWithoutSystemd_GetsNoSink()
=> DiagnosticSinkSelector.Select(isWindows: false, isWindowsService: false, isSystemd: false)
.ShouldBe(DiagnosticSink.None);
}
@@ -0,0 +1,41 @@
using Mbproxy.Diagnostics;
using Shouldly;
using Xunit;
namespace Mbproxy.Tests.Diagnostics;
/// <summary>
/// Unit tests for <see cref="EventLogMessage.TruncateToLimit"/> — the 32 KB Windows
/// Event Log truncation rule. The helper is pure and OS-agnostic, so these run on
/// every platform (the Windows-only <see cref="EventLogBridge"/> sink itself is not
/// exercised here).
/// </summary>
[Trait("Category", "Unit")]
public sealed class EventLogMessageTests
{
[Fact]
public void TruncateToLimit_ShortMessage_ReturnedUnchanged()
{
const string msg = "mbproxy backend connect failed";
EventLogMessage.TruncateToLimit(msg).ShouldBeSameAs(msg);
}
[Fact]
public void TruncateToLimit_MessageAtTheLimit_NotTruncated()
{
// MaxBytes / 2 chars = exactly MaxBytes at the 2-bytes-per-char upper bound.
var atLimit = new string('y', EventLogMessage.MaxBytes / 2);
EventLogMessage.TruncateToLimit(atLimit).ShouldBe(atLimit);
}
[Fact]
public void TruncateToLimit_OversizeMessage_TruncatedWithinLimit_AndEndsWithEllipsis()
{
var huge = new string('x', EventLogMessage.MaxBytes); // well over the limit
var result = EventLogMessage.TruncateToLimit(huge);
(result.Length * 2).ShouldBeLessThanOrEqualTo(EventLogMessage.MaxBytes);
result.ShouldEndWith("...");
result.Length.ShouldBeLessThan(huge.Length);
}
}
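The boundary arithmetic behind these tests can be checked by hand. The sketch below is a standalone copy of the truncation rule written as C# top-level statements. The local `TruncateToLimit` and `MaxBytes` are stand-ins for the `EventLogMessage` members in the diff, not the shipped type, and the sample strings are made up for illustration.

```csharp
using System;

// Stand-in for EventLogMessage.MaxBytes: 32 * 1024 = 32,768 bytes.
const int MaxBytes = 32 * 1024;

// Stand-in for EventLogMessage.TruncateToLimit (same logic as the diff).
string TruncateToLimit(string message)
{
    if (message.Length * 2 <= MaxBytes) return message;
    int charLimit = MaxBytes / 2 - 3; // 16,384 - 3 = 16,381 chars, leaving room for "..."
    return message[..charLimit] + "...";
}

string atLimit = new string('y', MaxBytes / 2);  // 16,384 chars = exactly 32,768 bytes
string over = new string('x', MaxBytes / 2 + 1); // one char past the limit
string cut = TruncateToLimit(over);

Console.WriteLine(TruncateToLimit(atLimit).Length); // 16384 (unchanged)
Console.WriteLine(cut.Length);                      // 16384 (16,381 chars + "...")
Console.WriteLine(cut.Length * 2 <= MaxBytes);      // True
```

A fixed cut at `charLimit` can land between the two halves of a surrogate pair. A caller that needs well-formed output could step back one character when `char.IsHighSurrogate(message[charLimit - 1])` is true; that guard is a suggestion, not something the diff implements.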
@@ -0,0 +1,27 @@
using Mbproxy.Diagnostics;
using Serilog;
using Shouldly;
using Xunit;
namespace Mbproxy.Tests.Diagnostics;
/// <summary>
/// Unit tests for <see cref="SyslogBridge"/>. The bridge's fail-safe contract is that
/// attaching the local-syslog sink and building the resulting logger never throw —
/// even on a host with no <c>/dev/log</c> (e.g. the Windows test leg), where the sink
/// connects lazily and degrades silently.
/// </summary>
[Trait("Category", "Unit")]
public sealed class SyslogBridgeTests
{
[Fact]
public void AttachTo_ReturnsAConfiguration_AndNeverThrows()
=> SyslogBridge.AttachTo(new LoggerConfiguration()).ShouldNotBeNull();
[Fact]
public void AttachTo_ResultCreatesALogger_WithoutThrowing()
{
using var logger = SyslogBridge.AttachTo(new LoggerConfiguration()).CreateLogger();
logger.ShouldNotBeNull();
}
}
@@ -5,6 +5,8 @@ using Mbproxy.Proxy;
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Hosting;
using Microsoft.Extensions.Hosting.Systemd;
using Microsoft.Extensions.Hosting.WindowsServices;
using Serilog;
using Serilog.Core;
using Serilog.Events;
@@ -71,6 +73,26 @@ public sealed class HostSmokeTests
// Assert: does not throw / time out.
await stopTask.ShouldCompleteWithinAsync(TimeSpan.FromSeconds(3));
}
[Fact]
public async Task HostSmoke_BothInitSystemIntegrations_CoRegister_AndHostRunsCleanly()
{
// Arrange: register BOTH init-system integrations. Each is a no-op off its
// own init system, so on a test run (neither) the default console lifetime
// applies — they must co-register without conflict and leave the host
// startable and stoppable.
var builder = Host.CreateApplicationBuilder();
builder.Services.AddWindowsService();
builder.Services.AddSystemd();
builder.ConfigureForTest(new LoggerConfiguration().CreateLogger());
using var host = builder.Build();
using var cts = new CancellationTokenSource(TimeSpan.FromSeconds(5));
// Act + Assert: start/stop do not throw or time out.
await host.StartAsync(cts.Token);
await host.StopAsync(cts.Token);
}
}
/// <summary>
@@ -102,6 +102,39 @@ public sealed class MbproxyOptionsBindingTests
options.Resilience.ListenerRecovery.InitialBackoffMs.ShouldBe([1000, 2000, 5000, 15000, 30000]);
options.Plcs.ShouldBeEmpty();
options.BcdTags.Global.ShouldBeEmpty();
// Keepalive defaults — enabled, with the documented timer values.
options.Connection.Keepalive.Enabled.ShouldBeTrue();
options.Connection.Keepalive.TcpIdleTimeMs.ShouldBe(30000);
options.Connection.Keepalive.TcpProbeIntervalMs.ShouldBe(5000);
options.Connection.Keepalive.TcpProbeCount.ShouldBe(4);
options.Connection.Keepalive.BackendHeartbeatIdleMs.ShouldBe(30000);
options.Connection.Keepalive.BackendHeartbeatProbeAddress.ShouldBe(0);
}
// -------------------------------------------------------------------------
// Test 5 — the Connection:Keepalive block binds from configuration
// -------------------------------------------------------------------------
[Fact]
public void MbproxyOptionsBinding_BindsKeepaliveBlock()
{
var options = BindOptions(new Dictionary<string, string?>
{
["Mbproxy:Connection:Keepalive:Enabled"] = "false",
["Mbproxy:Connection:Keepalive:TcpIdleTimeMs"] = "45000",
["Mbproxy:Connection:Keepalive:TcpProbeIntervalMs"] = "7000",
["Mbproxy:Connection:Keepalive:TcpProbeCount"] = "6",
["Mbproxy:Connection:Keepalive:BackendHeartbeatIdleMs"] = "20000",
["Mbproxy:Connection:Keepalive:BackendHeartbeatProbeAddress"] = "1024",
});
var ka = options.Connection.Keepalive;
ka.Enabled.ShouldBeFalse();
ka.TcpIdleTimeMs.ShouldBe(45000);
ka.TcpProbeIntervalMs.ShouldBe(7000);
ka.TcpProbeCount.ShouldBe(6);
ka.BackendHeartbeatIdleMs.ShouldBe(20000);
ka.BackendHeartbeatProbeAddress.ShouldBe(1024);
}
// -------------------------------------------------------------------------
@@ -129,4 +162,47 @@ public sealed class MbproxyOptionsBindingTests
result.Failed.ShouldBeTrue("Width=8 should fail schema validation");
result.Failures.ShouldNotBeEmpty();
}
// -------------------------------------------------------------------------
// Test 6 — every shipped install template (Windows + Linux) loads as JSONC,
// binds to MbproxyOptions, and passes schema validation. This catches
// a malformed template at build time and keeps the two platform
// variants in lockstep.
// -------------------------------------------------------------------------
[Theory]
[InlineData("mbproxy.config.template.json")]
[InlineData("mbproxy.linux.config.template.json")]
public void MbproxyOptionsBinding_ShippedInstallTemplate_BindsAndValidates(string templateFileName)
{
var templatePath = ResolveInstallFile(templateFileName);
// The templates are JSONC; the .NET JSON config provider skips // and /* */
// comments and allows trailing commas, so AddJsonFile loads them directly.
var config = new ConfigurationBuilder()
.AddJsonFile(templatePath, optional: false)
.Build();
var options = config.GetSection("Mbproxy").Get<MbproxyOptions>() ?? new MbproxyOptions();
var result = new MbproxyOptionsValidator().Validate(null, options);
result.Succeeded.ShouldBeTrue(
$"{templateFileName} must pass schema validation — failures: " +
string.Join("; ", result.Failures ?? []));
}
/// <summary>
/// Resolves an <c>install/</c> file by walking up from the test assembly directory.
/// Works from both the Windows dev box and the Linux test box.
/// </summary>
private static string ResolveInstallFile(string fileName)
{
for (var dir = new DirectoryInfo(AppContext.BaseDirectory); dir is not null; dir = dir.Parent)
{
var candidate = Path.Combine(dir.FullName, "install", fileName);
if (File.Exists(candidate))
return candidate;
}
throw new FileNotFoundException(
$"Could not locate install/{fileName} above {AppContext.BaseDirectory}");
}
}
+13 -5
@@ -48,9 +48,13 @@ if (-not $ProfileResolved) {
}
# ── 2. Locate Python ─────────────────────────────────────────────────────────
# Try 'python' first (standard PATH install), then the Windows-store launcher 'py'.
# Windows: 'python' (standard PATH install), then the 'py' launcher.
# Linux/macOS: 'python3' (the canonical name), then 'python'.
# The candidate order is platform-specific so Windows never matches the Microsoft
# Store 'python3' stub.
$pythonExe = $null
foreach ($candidate in 'python', 'py') {
$pythonCandidates = $IsWindows ? @('python', 'py') : @('python3', 'python')
foreach ($candidate in $pythonCandidates) {
try {
$ver = & $candidate --version 2>&1
if ($LASTEXITCODE -eq 0) {
@@ -77,9 +81,13 @@ or use the Windows Store launcher ('py').
$PYMODBUS_VERSION = '3.13.0'
$venvDir = Join-Path $PSScriptRoot '.venv'
$venvPython = Join-Path $venvDir 'Scripts\python.exe'
$pipExe = Join-Path $venvDir 'Scripts\pip.exe'
$simulatorExe = Join-Path $venvDir 'Scripts\pymodbus.simulator.exe' # sentinel for complete install
# venv executable layout differs by OS: Windows puts them in Scripts\ with a .exe
# extension; Linux/macOS put them in bin/ with no extension.
$venvBin = $IsWindows ? 'Scripts' : 'bin'
$exeExt = $IsWindows ? '.exe' : ''
$venvPython = Join-Path $venvDir $venvBin "python$exeExt"
$pipExe = Join-Path $venvDir $venvBin "pip$exeExt"
$simulatorExe = Join-Path $venvDir $venvBin "pymodbus.simulator$exeExt" # sentinel for complete install
# Provisioning is idempotent: we only skip it when the pymodbus.simulator executable ($simulatorExe) exists.
# Checking only the .venv directory is not enough — a previous run killed mid-install