b330faff03
Make the service build, run, and install on Linux as a first-class target while keeping the Windows Service + Event Log behaviour intact.

- Build: drop the hardcoded win-x64 RID — single-file publish now works for any RID. publish.ps1 gains -Rid; new publish.sh for Linux hosts.
- Diagnostics: DiagnosticSinkSelector picks the Error+ sink per host — Windows Event Log under the SCM, local syslog under systemd (Serilog.Sinks.SyslogMessages), none for interactive runs. The EventLog truncation helper is extracted so it is testable cross-OS.
- Host: Program.cs registers AddSystemd() alongside AddWindowsService().
- Config: a RID-conditioned appsettings template ships Windows or Unix paths; both templates are schema-validated by a test.
- Install: systemd unit (Type=exec) plus install.sh / uninstall.sh. Also fixes two cross-platform bugs found while testing: install.ps1 and uninstall.ps1 used New-EventLog / Remove-EventLog (absent in PowerShell 7), and the E2E sim launcher hardcoded Windows venv paths.
- Docs updated across README, CLAUDE.md, and docs/ for dual-platform.

413 tests pass on Windows; 374 (all non-simulator) on Linux.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

# mbproxy Multiplatform Implementation Plan

**Created:** 2026-05-15
**Status:** All six phases implemented. 413 tests green on Windows; Windows Service and
Linux systemd install E2E both green. Two findings (pymodbus-sim-on-Linux, `AddSystemd()`
notify) logged as orthogonal follow-ups. Working tree only — nothing committed.
**Working artifact** — not part of the `docs/` source-of-truth tree (per `../DOCS-GUIDE.md`).
Delete or archive once the work lands.

### Progress log

- **2026-05-15 — Phase 1 done, Gate 1 green.** RID removed from `csproj`
  (single-file settings now gated on `'$(RuntimeIdentifier)' != ''`);
  `publish.ps1` gained `-Rid`; `publish.sh` added. `dotnet build -c Debug` 0
  warnings; `dotnet test` **398 passed / 0 failed** (baseline 325 → 398, the
  Keepalive feature added tests); `win-x64` → `Mbproxy.exe` 100.1 MB,
  `linux-x64` → `Mbproxy` ELF 97.2 MB. ELF launch-smoked on `10.100.0.35`:
  full startup, listeners bound, `mbproxy.startup.ready` + admin endpoint up,
  no errors. Box prep done (.NET SDK 10.0.300, shellcheck 0.10.0 installed).
- **2026-05-15 — Phases 2 + 3 code done (combined integrator pass).** Packages
  added: `Microsoft.Extensions.Hosting.Systemd` 10.0.8,
  `Serilog.Sinks.SyslogMessages` 4.1.0 (the maintained IonxSolutions package —
  the bare `Serilog.Sinks.Syslog` ID is a near-abandoned 0.2.0 package; same
  approved intent). New `DiagnosticSink` enum + `DiagnosticSinkSelector` (pure);
  new `SyslogBridge`; `EventLogBridge` truncation extracted to a non-annotated
  `EventLogMessage` type (testable cross-OS). `AddMbproxySerilog` now selects
  the sink internally; `Program.cs` calls `AddSystemd()` + `AddWindowsService()`.
  13 new tests. **411 passed / 0 failed on Windows**; on `10.100.0.35`
  **372 passed / 39 skipped / 0 failed** — all 39 skips are simulator-backed
  E2E (see finding below), every host/diagnostic/smoke test green on Linux.

- **2026-05-15 — Two cross-platform bugs found and fixed in install tooling.**
  (1) `tests/sim/run-dl205-sim.ps1` was Windows-only: it hardcoded the venv
  paths `Scripts\*.exe`. It now branches on `$IsWindows` between `Scripts` with
  an `.exe` suffix and `bin` with no suffix, and adds `python3` to the
  interpreter candidates. (2) `install.ps1` / `uninstall.ps1` used
  `New-EventLog` / `Remove-EventLog`, which exist only in Windows PowerShell 5.1
  and fail under PowerShell 7+. Switched to the .NET API
  (`[EventLog]::CreateEventSource` / `DeleteEventSource`), symmetric with
  the `SourceExists` calls already in those scripts.
- **2026-05-15 — Windows Service E2E green (local, admin).** Republished
  `win-x64`; `install.ps1 -Start` installs + starts the service; verified
  Running/Automatic, `status.json` served, listeners bound,
  `mbproxy.startup.ready` logged, Event Log source registered,
  `WindowsServiceLifetime` wrote "Service started successfully" (proves the
  process runs under the SCM). `uninstall.ps1` stopped/deleted the service,
  archived logs, removed the Event Log source. Box left clean. (A forced
  `EventLogBridge` Error+ write was not pursued — `Emit` is unchanged code,
  covered by `EventLogMessageTests`; sink selection is covered by
  `DiagnosticSinkSelectorTests`.)
- **2026-05-15 — Linux systemd E2E done.** The `linux-x64` ELF runs under a
  real systemd unit on `10.100.0.35`: starts, binds listeners, serves the
  admin endpoint, and `systemctl stop` → graceful SIGTERM drain
  (`mbproxy.shutdown.complete` in the journal). `Type=notify` does not work
  (see Findings) → Phase 5 will ship `Type=exec`. Box prep this session:
  `dotnet-sdk-10.0`, `shellcheck`, `python3-venv`, pwsh 7.6.1 (dotnet global
  tool), pymodbus 3.13.0 venv.

- **2026-05-15 — Phases 4–6 done.** Phase 4: new `install/mbproxy.linux.config.template.json`
  (Unix log path `/var/log/mbproxy`, systemd-oriented comments); `csproj` links the
  platform-correct template into the published `appsettings.json` by RID
  (`win-*`/RID-less → Windows, else Unix) — verified by publishing both RIDs;
  `MbproxyOptionsBindingTests` extended to load + schema-validate both templates
  (now 413 tests on Windows). Phase 5: `install/mbproxy.service` (`Type=exec`,
  hardened, `mbproxy` service account), `install/install.sh`, `install/uninstall.sh`
  — `shellcheck` clean; install→active→`status.json` served→uninstall→clean E2E
  passed on `10.100.0.35`. Phase 6: `README.md`, `mbproxy/CLAUDE.md`,
  `../CLAUDE.md`, `docs/Operations/Configuration.md`, `docs/Reference/LogEvents.md`,
  `docs/Operations/Troubleshooting.md`, `docs/Architecture/Overview.md`,
  `docs/Features/HotReload.md` updated for the dual-platform reality.

### Findings

- **Linux full run: 374 passed / 37 failed / 0 skipped.** With the simulator
  launcher fixed and pymodbus provisioned, the simulator-backed E2E tests now
  *run* on Linux (0 skipped) but **37 fail** with `IOException: Broken pipe`
  (`SocketException`) when the NModbus client writes through the proxy. The
  failures are broad across all simulator-backed E2E (cache, forwarding,
  rewriter, supervision). **Not a Phases 1–3 regression:** the multiplatform
  work touches only build config, diagnostic sinks, and host registration —
  none of the Modbus proxy data path. The same 37 tests pass on Windows
  (411/411), and every non-E2E test — including all 13 new diagnostic tests —
  passes on Linux. **Root cause isolated:** the `SimulatorSmokeTests` — which
  connect *directly to the pymodbus simulator with no proxy in the path* — also
  fail (TCP connect error). So the fault is the pymodbus 3.13.0 simulator
  itself on this box, not mbproxy's proxy code. Likely pymodbus 3.13.0 vs
  Python 3.13.5 (both very new), or the box's Docker-host networking. Treated
  as a **separate investigation** (pymodbus-simulator-on-Linux), entirely
  orthogonal to the multiplatform service work — see the session report.
- The `run-dl205-sim.ps1` idempotency check keys on `Test-Path $venvDir` only,
  so a venv left structurally broken by a killed run (no `bin/`) is not
  detected and is therefore never re-created. This is a pre-existing latent
  gap, not platform-specific. Noted, not fixed (out of scope; a clean run is
  unaffected).
- **`AddSystemd()` does not deliver `sd_notify(READY=1)` here → Phase 5 uses
  `Type=exec`.** mbproxy runs correctly under systemd (starts, binds, serves,
  and SIGTERM → graceful drain all work — verified in the journal), but a
  `Type=notify` unit never receives `READY=1` and times out. Isolated step by
  step: `SystemdHelpers.IsSystemdService()` correctly returns `True` under
  systemd; a *minimal* `Host.CreateApplicationBuilder()` + `AddSystemd()` host
  reproduces the failure; both a `systemd-run` transient unit and a real
  `Type=notify` unit file fail identically. So it is **not an mbproxy bug** —
  it is a `HostApplicationBuilder` + `Microsoft.Extensions.Hosting.Systemd`
  10.0.8 (minimal-hosting) issue. **Resolution:** the Phase 5 unit uses
  `Type=exec` — mbproxy is a leaf service that nothing orders against, so the
  readiness signal is unnecessary; `Type=exec` + the generic host's built-in
  POSIX `SIGTERM` handling (independent of `SystemdLifetime`) gives a fully
  working unit with `Restart=on-failure`. `AddSystemd()` stays in `Program.cs`
  (correct, documented, forward-compatible, harmless). Root-causing the .NET
  notify gap is logged as a separate follow-up.

A plan to make mbproxy run on Linux (and incidentally macOS) as a first-class
target while keeping the Windows Service + Event Log behavior intact and adding
systemd + journald/syslog equivalents.

The hosting model (`Host.CreateApplicationBuilder` + `IHostedService` + Kestrel)
is already portable, so the work is narrow: generalize the build, abstract one
diagnostic sink, add one package + one call, and add Linux tooling/docs.

---

## 0. Test Environments

Both platforms can be exercised fully — no environment is simulated or
deferred.

### 0.1 Windows (the dev box — local)

The dev box runs **with administrator rights**, so every Windows gate runs
locally with no separate test machine:

- `install.ps1` (requires elevation) installs the real Windows Service.
- The Event Log source `mbproxy` can be registered and `EventLogBridge` writes
  verified against the Application log.
- Install → start → stop → uninstall is a full local round-trip.

> Windows Service E2E mutates machine state (a registered service + Event Log
> source). It is **integrator-only** and the integrator always runs
> `uninstall.ps1` to leave the box clean after each gate.

### 0.2 Linux

**Host:** `dohertj2@10.100.0.35` — Debian 13 (trixie), amd64, kernel 6.12,
hostname `DOCKER`. systemd 257.

- **Access:** passwordless SSH from the Windows dev box; passwordless `sudo`
  (verified 2026-05-15).
- **Reachable** on `10.100.0.35` (also `10.50.0.35`, `10.200.0.35`).
- **One-time prep** (run once before Wave 1 gates):
  ```
  ssh dohertj2@10.100.0.35 'sudo apt-get update && \
    sudo apt-get install -y dotnet-sdk-10.0 shellcheck'
  ```
  `dotnet-sdk-10.0` candidate is `10.0.203` — matches the `net10.0` target.
- **Docker is installed** on the box (the user is in the `docker` group). Use
  ephemeral Debian containers to isolate per-subagent E2E runs so parallel
  Wave-4 agents don't collide on the host's systemd / ports (see section 3,
  rule 8).

**How the integrator uses the box per gate:**
- Push the integration branch (or `rsync` the worktree) to the box, then run
  `dotnet build` / `dotnet test` / `dotnet publish -r linux-x64` over SSH.
- Run the *actual* `linux-x64` ELF binary, the systemd unit, and `shellcheck`
  here — Windows can cross-*publish* a `linux-x64` binary but cannot *run* or
  service-host it.

> The box is a **shared mutable resource**. Host-level mutations (apt installs,
> `systemctl` on the real host, privileged-port binds) are integrator-only and
> run serially between waves. Subagents that need Linux E2E use throwaway
> Docker containers, never the host's init system directly.

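
The per-gate routine described above (sync the worktree, then build, test, and publish over SSH) could be scripted roughly as follows. This is a minimal sketch, not the integrator's actual tooling: the remote directory `mbproxy-gate` and the helper names `run` and `linux_gate` are hypothetical, and the script only prints the commands unless `RUN=1` is set, so it is safe to try without touching the shared box.

```shell
#!/bin/sh
# Hypothetical per-gate driver: sync the worktree to the Linux box, then
# build, test, and publish there over SSH. REMOTE_DIR is an assumed path,
# not one the plan specifies.
REMOTE="${REMOTE:-dohertj2@10.100.0.35}"
REMOTE_DIR="${REMOTE_DIR:-mbproxy-gate}"

# Echo the command; execute it only when RUN=1 (dry run is the default).
run() {
  echo "+ $*"
  if [ "${RUN:-0}" = "1" ]; then "$@"; fi
}

linux_gate() {
  # Skip build output so stale Windows binaries never reach the box.
  run rsync -az --delete --exclude bin/ --exclude obj/ ./ "$REMOTE:$REMOTE_DIR/" &&
  run ssh "$REMOTE" "cd $REMOTE_DIR && dotnet build -c Debug && dotnet test" &&
  run ssh "$REMOTE" "cd $REMOTE_DIR && dotnet publish src/Mbproxy/Mbproxy.csproj -c Release -r linux-x64"
}

linux_gate
```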
---

## 1. Scope

**In scope**
- Linux (`linux-x64`) as a supported runtime target alongside `win-x64`.
- systemd integration (`Type=notify`, sd_notify readiness, SIGTERM drain).
- A Linux-appropriate error-event diagnostic sink (syslog, severity-mapped).
- RID-agnostic build + dual-RID publish tooling.
- Linux install tooling (systemd unit + shell scripts).
- Docs/README/CLAUDE.md updates.

**Out of scope (state explicitly in docs)**
- macOS `launchd` integration — mbproxy will *run* on macOS as a console
  process but ships no service-manager integration.
- ARM RIDs (`linux-arm64`) — the build will not *forbid* them, but they are
  untested.
- Container/Docker packaging — separate future effort.

**Locked design decisions**
- Reference `Microsoft.Extensions.Hosting.WindowsServices` *and*
  `Microsoft.Extensions.Hosting.Systemd` unconditionally; both packages are
  portable and both helpers self-detect their host. No conditional
  `<PackageReference>`.
- All Windows API calls (`System.Diagnostics.EventLog`) stay behind
  `OperatingSystem.IsWindows()` + `[SupportedOSPlatform("windows")]`; CA1416
  (already enforced via `TreatWarningsAsErrors`) is the safety net.
- Diagnostic sink selection happens **once**, at the composition root
  (`AddMbproxySerilog`). No OS branching anywhere else.
- Prefer **new files** over editing shared files, to keep parallel work
  conflict-free.
- **Linux error-event sink: `Serilog.Sinks.Syslog`** (decided 2026-05-15;
  implemented with the maintained `Serilog.Sinks.SyslogMessages` package, see
  the Progress log). Error+ events get RFC5424 severity mapping on Linux,
  mirroring the Windows Event Log behavior where Error+ is surfaced distinctly.
  `DiagnosticSinkSelector` returns `EventLog | Syslog | None`.

---

## 2. Phase Breakdown

Each phase lists its **owned file set** (the parallel-safety contract),
changes, tests, and a **gate** that must be green before the next phase starts.

### Phase 1 — Build & publish generalization (foundation)

**Objective:** Remove the hardcoded RID so the project builds/publishes for any
runtime; keep the Windows output byte-identical.

**Owned files**
- `src/Mbproxy/Mbproxy.csproj`
- `install/publish.ps1`
- `install/publish.sh` *(new)*

**Changes**
- `Mbproxy.csproj`: delete `<RuntimeIdentifier>win-x64</RuntimeIdentifier>`
  from the Release `PropertyGroup`; keep `PublishSingleFile` / `SelfContained`
  / `IncludeNativeLibrariesForSelfExtract`. RID becomes a publish-time `-r`
  argument.
- `publish.ps1`: add a `-Rid` parameter (default `win-x64`), keep the
  two-flavor logic.
- `publish.sh`: Linux counterpart producing `linux-x64` self-contained +
  framework-dependent builds.
- (The RID-conditioned `appsettings.json` content item is Phase 4; in Phase 1
  just confirm the build works without a baked RID.)

**Tests**
- No xunit tests (build-config change). Gate is publish success on both RIDs.

**Gate 1**
- `dotnet build -c Debug` green; `dotnet test` full suite green (unchanged
  count).
- `dotnet publish -c Release -r win-x64` produces a single-file `Mbproxy.exe`
  (same size class as before).
- `dotnet publish -c Release -r linux-x64` produces a single-file `Mbproxy`
  ELF binary. Cross-published from the Windows dev box; the ELF is then copied
  to `10.100.0.35` and confirmed to launch (`./Mbproxy --version`-class smoke).
- Zero new analyzer warnings.

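
To make the "RID becomes a publish-time `-r` argument" change concrete, here is a rough sketch of what a `publish.sh`-style helper might print for the two flavors. It is not the repository's script: the function name `publish_cmd` and the `publish/<rid>/<flavor>` output layout are assumptions for illustration, and the sketch only prints the `dotnet publish` command lines rather than running them.

```shell
#!/bin/sh
# Hypothetical sketch of a publish.sh-style helper. The output layout
# (publish/<rid>/<flavor>) is an assumption for illustration.
PROJECT="${PROJECT:-src/Mbproxy/Mbproxy.csproj}"

# Print the dotnet publish command line for one RID and one flavor.
#   $1 = RID, e.g. linux-x64 or win-x64
#   $2 = self-contained | framework-dependent
publish_cmd() {
  case "$2" in
    self-contained)      sc=true ;;
    framework-dependent) sc=false ;;
    *) echo "unknown flavor: $2" >&2; return 2 ;;
  esac
  echo "dotnet publish $PROJECT -c Release -r $1 --self-contained $sc -o publish/$1/$2"
}

# Print both flavors for the requested RID (default linux-x64).
rid="${1:-linux-x64}"
for flavor in self-contained framework-dependent; do
  publish_cmd "$rid" "$flavor"
done
```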
---

### Phase 2 — Diagnostic sink abstraction

**Objective:** Make error-event delivery a platform-selected sink. Windows
keeps `EventLogBridge`; Linux gets a syslog sink.

**Owned files**
- `src/Mbproxy/Diagnostics/DiagnosticSinkSelector.cs` *(new — pure selection
  logic)*
- `src/Mbproxy/Diagnostics/SyslogBridge.cs` *(new)*
- `src/Mbproxy/Diagnostics/EventLogBridge.cs` *(minor: extract the 32 KB
  truncation helper into a testable static method)*
- `src/Mbproxy/HostingExtensions.cs` *(only `AddMbproxySerilog`)*
- `src/Mbproxy/Mbproxy.csproj` *(add `Serilog.Sinks.Syslog` package)*
- New test files (see below)

> `HostingExtensions.cs` and `Mbproxy.csproj` are also touched by Phase 3.
> **Phases 2 and 3 must not run in parallel** (see section 3). They are
> sequential.

**Changes**
- `DiagnosticSinkSelector` — a pure function taking
  `(bool isWindows, bool isWindowsService, bool isSystemd)` and returning an
  enum (`EventLog | Syslog | None`). No I/O, fully unit-testable.
- `SyslogBridge`: Serilog `ILogEventSink` wrapping `Serilog.Sinks.Syslog`,
  active for Error+ only, mirroring `EventLogBridge`'s contract (silent no-op
  if syslog unavailable).
- `AddMbproxySerilog`: replace the `addEventLogBridge` bool parameter with a
  `DiagnosticSinkSelector` result; wire the chosen sink. Keep the
  `OperatingSystem.IsWindows()` guard around `EventLogBridge`.
- Extract `EventLogBridge`'s message-truncation into
  `internal static string TruncateToEventLogLimit(string)` so it can be tested
  OS-independently.

**Tests** (`tests/Mbproxy.Tests/Diagnostics/`)
- `DiagnosticSinkSelectorTests` — table-driven: Windows+service→`EventLog`;
  Windows console→`None`; Linux+systemd→`Syslog`; Linux console→`None`;
  macOS→`None`.
- `EventLogBridgeTests` — `[Trait("Category","Unit")]`, Windows-guarded facts:
  source-missing → silent no-op; truncation helper caps at 32 KB and appends
  `...` (this fact runs on all OSes since the helper is pure).
- `SyslogBridgeTests` — Error+ filter; no-throw when transport unavailable.

**Gate 2**
- Full test suite green on Windows (local); full suite green on Linux —
  integrator runs `dotnet test` over SSH on `10.100.0.35`.
- `EventLogBridge` emits to the Application log — verified locally via a real
  Windows Service install (`install.ps1`, admin rights available), then
  `uninstall.ps1` to clean up.
- CA1416: zero warnings.

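
The selector itself is C#, but the decision table that `DiagnosticSinkSelectorTests` enumerates is small enough to model in a few lines. The shell function below is only a language-neutral model of that table, written in shell for consistency with the install tooling. The name `select_sink` is hypothetical, and sending every combination the plan does not list to `None` is an assumption.

```shell
#!/bin/sh
# Model of the sink decision table, not the C# implementation.
# Arguments are 1 (true) or 0 (false): is_windows is_windows_service is_systemd.
select_sink() {
  if [ "$1" = 1 ] && [ "$2" = 1 ]; then
    echo EventLog        # Windows, launched by the SCM
  elif [ "$1" = 0 ] && [ "$3" = 1 ]; then
    echo Syslog          # non-Windows, launched by systemd
  else
    echo None            # interactive runs, macOS, anything unlisted (assumption)
  fi
}

select_sink 1 1 0   # Windows service
select_sink 1 0 0   # Windows console
select_sink 0 0 1   # Linux under systemd
select_sink 0 0 0   # Linux or macOS console
```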
---

### Phase 3 — Service host integration (systemd)

**Objective:** Register both init-system integrations; the host correctly
reports readiness to whichever launched it.

**Owned files**
- `src/Mbproxy/Program.cs`
- `src/Mbproxy/HostingExtensions.cs` *(call-site update only)*
- `src/Mbproxy/Mbproxy.csproj` *(add `Microsoft.Extensions.Hosting.Systemd`)*

**Changes**
- `csproj`: add
  `<PackageReference Include="Microsoft.Extensions.Hosting.Systemd" />` (pin to
  the 10.0.x line matching the existing Windows-services package).
- `Program.cs`: call `builder.Services.AddSystemd();` alongside
  `AddWindowsService();`. Compute `isSystemd` via
  `SystemdHelpers.IsSystemdService()` and feed `DiagnosticSinkSelector`
  together with `isWindowsService`.
- Confirm SIGTERM → host shutdown → existing
  `Connection.GracefulShutdownTimeoutMs` drain path works (it does — POSIX
  signal handling is built into the generic host; just verify).

**Tests** (`tests/Mbproxy.Tests/HostSmokeTests.cs` — extend existing file)
- `HostSmoke_RegistersBothServiceIntegrations_StartsAndStops` — builds the host
  with both `AddWindowsService` + `AddSystemd`, asserts no throw, asserts
  `mbproxy.startup.ready` still logged.
- Existing two smoke tests must remain green.

**Gate 3**
- Full suite green on Windows (local) and Linux (`10.100.0.35` via SSH).
- Windows Service E2E, run locally with admin rights: `install.ps1` → service
  starts, logs `mbproxy.startup.ready` + writes to Event Log, `Stop-Service`
  drains cleanly, `uninstall.ps1` removes it. **No regression** in Windows
  behavior is the hard requirement of this gate.
- Linux systemd E2E on `10.100.0.35` — **done.** The `linux-x64` binary runs
  under a real systemd unit: it starts, binds listeners, serves the admin
  endpoint, and `systemctl stop` (SIGTERM) drains gracefully
  (`mbproxy.shutdown.complete` in the journal). `Type=notify` was found not to
  deliver `READY=1` (Findings) → the Phase 5 unit uses `Type=exec`, under which
  the service is fully functional.

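
The SIGTERM drain can also be smoke-checked outside systemd, from a plain shell on the Linux box. The helper below is a sketch under assumptions: `term_and_wait` is a hypothetical name, the default 10-second window is arbitrary and would normally be set a little above the configured `Connection.GracefulShutdownTimeoutMs`, and the binary path in the usage comment is illustrative. It complements the journal-based systemd check and does not replace it.

```shell
#!/bin/sh
# Send SIGTERM to a PID and wait up to N seconds for it to exit.
#   $1 = PID, $2 = timeout in seconds (default 10, an arbitrary choice)
# Returns 0 if the process is gone within the window, 1 if it is still alive.
term_and_wait() {
  tw_pid="$1"; tw_limit="${2:-10}"; tw_waited=0
  kill -TERM "$tw_pid" 2>/dev/null || return 0   # not running: nothing to drain
  while kill -0 "$tw_pid" 2>/dev/null; do
    if [ "$tw_waited" -ge "$tw_limit" ]; then return 1; fi
    sleep 1
    tw_waited=$((tw_waited + 1))
  done
  return 0
}

# Usage (hypothetical path): start the published binary in the background,
# give it time to come up, then check that SIGTERM produces an exit.
#   ./Mbproxy & pid=$!
#   sleep 5
#   if term_and_wait "$pid" 15; then echo "drained"; else echo "still running"; fi
```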
---

### Phase 4 — Config & filesystem portability

**Objective:** No Windows-only paths in the shipped/installed config.

**Owned files**
- `install/mbproxy.config.template.json` *(Windows — keep `C:\ProgramData\...`
  path)*
- `install/mbproxy.linux.config.template.json` *(new — `/var/log/mbproxy/...`,
  Linux syslog `Using` entry)*
- `src/Mbproxy/Mbproxy.csproj` *(condition the linked `appsettings.json`
  content item by `$(RuntimeIdentifier)`)*

> Touches `csproj`. Must run after Phase 3's csproj edit is merged (sequential
> w.r.t. csproj), but is otherwise independent of Phase 5/6.

**Changes**
- New Linux template: log path `/var/log/mbproxy/mbproxy-.log`; Serilog
  `Using` array includes the syslog sink; comment header points at
  `/etc/mbproxy/appsettings.json`.
- `csproj`: link the win template for `win-*` RIDs and the linux template for
  `linux-*` RIDs into the published `appsettings.json` (RID-conditioned
  `<Content>` items).

**Tests** (`tests/Mbproxy.Tests/Options/`)
- Extend `MbproxyOptionsBindingTests`: load **each** shipped template through
  the config binder + `MbproxyOptionsValidator`; assert both bind and validate
  cleanly. Catches a malformed Linux template at build time.

**Gate 4**
- Both templates bind + validate (new test green).
- `dotnet publish -r linux-x64` ships the Linux template as `appsettings.json`;
  `-r win-x64` ships the Windows one. Verify by inspecting publish output.

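
The "verify by inspecting publish output" step in Gate 4 can be reduced to a small check. The sketch below models the RID rule recorded in the Progress log (`win-*` and RID-less builds get the Windows template, everything else the Unix one) and greps a published `appsettings.json` for the log path expected on that platform. Both function names are hypothetical, and matching on `ProgramData` and `/var/log/mbproxy` is an assumed heuristic based on the paths the templates are described as carrying.

```shell
#!/bin/sh
# Model of the RID-to-template rule: win-* and RID-less builds get the
# Windows template, everything else gets the Unix one.
template_for_rid() {
  case "$1" in
    ""|win-*) echo install/mbproxy.config.template.json ;;
    *)        echo install/mbproxy.linux.config.template.json ;;
  esac
}

# Hypothetical Gate 4 check: does a published appsettings.json carry the
# log path expected for its RID? $1 = RID, $2 = path to appsettings.json.
check_published_config() {
  case "$1" in
    ""|win-*) want='ProgramData' ;;
    *)        want='/var/log/mbproxy' ;;
  esac
  grep -q "$want" "$2"
}

template_for_rid linux-x64
template_for_rid win-x64
```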
---

### Phase 5 — Linux install tooling

**Objective:** Parity with `install.ps1` for systemd hosts.

**Owned files** (all new, fully disjoint from all other phases)
- `install/mbproxy.service` — systemd unit, **`Type=exec`** (not `Type=notify` —
  see Findings: `AddSystemd()` does not deliver `READY=1` for the minimal
  hosting model), `Restart=on-failure`, `User=mbproxy`, `ExecStart` pointing at
  the installed binary; sets `DOTNET_BUNDLE_EXTRACT_BASE_DIR`.
- `install/install.sh` — creates `mbproxy` service account, lays down binary +
  `/etc/mbproxy/appsettings.json` (preserve-if-exists, matching `install.ps1`
  semantics), creates `/var/log/mbproxy`, installs + `systemctl enable --now`.
- `install/uninstall.sh` — `systemctl disable --now`, archives logs (mirror the
  `.archived-<ts>` convention), removes unit.

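
For orientation, a unit with the properties listed above might look roughly like the one this helper renders. It is a sketch, not the shipped `install/mbproxy.service`: the binary path `/opt/mbproxy/Mbproxy`, the extract directory under `/var/lib/mbproxy`, the `network-online.target` ordering, and the specific hardening directives are all assumptions. Only `Type=exec`, `Restart=on-failure`, `User=mbproxy`, and the presence of `DOTNET_BUNDLE_EXTRACT_BASE_DIR` come from the plan.

```shell
#!/bin/sh
# Render a Type=exec unit to stdout. The install path, extract directory,
# and hardening directives are assumptions for illustration, not the
# contents of the shipped install/mbproxy.service.
render_unit() {
  bin="${1:-/opt/mbproxy/Mbproxy}"
  cat <<EOF
[Unit]
Description=mbproxy Modbus proxy
After=network-online.target
Wants=network-online.target

[Service]
Type=exec
User=mbproxy
ExecStart=$bin
Restart=on-failure
# Single-file apps may extract native libraries at startup; point the
# extraction at a directory the service account can write.
Environment=DOTNET_BUNDLE_EXTRACT_BASE_DIR=/var/lib/mbproxy/extract
NoNewPrivileges=true
ProtectSystem=strict
ReadWritePaths=/var/log/mbproxy /var/lib/mbproxy

[Install]
WantedBy=multi-user.target
EOF
}

render_unit
```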
**Tests**
- Not xunit. Gate = `shellcheck` clean + a dry-run inside a throwaway Debian
  container on `10.100.0.35`.

**Gate 5**
- `shellcheck install/*.sh` clean — run on `10.100.0.35` (shellcheck installed
  in the one-time prep).
- End-to-end on `10.100.0.35`, inside a throwaway Debian container:
  `install.sh` → service active → proxy answers Modbus on a configured port →
  `uninstall.sh` → service gone, logs archived. Container isolation keeps the
  `mbproxy` service account / unit off the real host.

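
Two behaviours named in this phase, preserve-if-exists for the config and the `.archived-<ts>` convention for logs, are easy to get subtly wrong, so a sketch may help. The function names below are hypothetical and the timestamp format is an assumption (the plan only names the suffix). Neither function needs root, so both can be exercised against a temporary directory before being pointed at `/etc/mbproxy` or `/var/log/mbproxy`.

```shell
#!/bin/sh
# Copy a config template into place only if no config exists yet, so an
# operator's edits survive a reinstall. $1 = template, $2 = destination.
install_config_if_absent() {
  if [ -e "$2" ]; then
    echo "keeping existing $2"
  else
    mkdir -p "$(dirname "$2")" && cp "$1" "$2" && echo "installed $2"
  fi
}

# Rename a log directory to <dir>.archived-<timestamp> instead of deleting
# it. The timestamp format is an assumption; the plan only names the suffix.
archive_logs() {
  [ -d "$1" ] || return 0
  mv "$1" "$1.archived-$(date +%Y%m%d-%H%M%S)"
}
```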
---

### Phase 6 — Documentation

**Objective:** Docs reflect dual-platform reality; doctrine in `DOCS-GUIDE.md`
respected.

**Owned files**
- `README.md` — rewrite "Hard constraints / prerequisites" (drop "No Linux or
  Docker support"); add Linux install path; document both publish flavors ×
  both RIDs.
- `docs/Operations/Configuration.md` — both config templates, log-path
  differences, syslog vs Event Log.
- `docs/Operations/Troubleshooting.md` — `journalctl` guidance alongside Event
  Viewer.
- `docs/Architecture/Overview.md` — note dual init-system hosting (only if it
  shifts a headline bullet).
- `docs/Reference/LogEvents.md` — note Error+ events route to Event Log
  (Windows) / syslog (Linux).
- `mbproxy/CLAUDE.md` — correct the implied Windows-only framing.
- `wwtools/CLAUDE.md` — broaden the mbproxy index row if the task→tool mapping
  changed.

**Tests**
- Markdown link-check across touched files.

**Gate 6**
- All internal doc links resolve.
- README "Hard constraints" no longer contradicts the shipped tooling.

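
The plan does not say which tool performs the link check. If no dedicated checker is at hand, a rough stand-in can be written with `grep` and `sed`. The sketch below is deliberately simplified and the name `broken_links` is hypothetical: it handles inline `[text](target)` links only, skips `http(s)`, `mailto:`, and pure `#anchor` targets, strips any `#fragment` before testing, and does not understand link titles or reference-style links.

```shell
#!/bin/sh
# List relative Markdown link targets in a file that do not exist on disk.
# Simplified: inline [text](target) links only; external and anchor-only
# targets are skipped; #fragment suffixes are stripped before checking.
broken_links() {
  bl_dir="$(dirname "$1")"
  grep -o ']([^)]*)' "$1" | sed 's/^](//; s/)$//' | while IFS= read -r target; do
    case "$target" in
      http://*|https://*|mailto:*|'#'*) continue ;;
    esac
    path="${target%%#*}"
    [ -e "$bl_dir/$path" ] || echo "$1: $target"
  done
}

# Usage: broken_links README.md
# Prints one "file: target" line per unresolved link; no output means clean.
```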
---

## 3. Parallel Subagent Execution Plan

### Dependency graph

```
Phase 1 (build) ──> Phase 2 (diagnostics) ──> Phase 3 (host) ──┬─> Phase 4 (config)
                                                               ├─> Phase 5 (install)
                                                               └─> Phase 6 (docs)
```

Phases 2 and 3 are **strictly sequential**: Phase 3 calls the new
`AddMbproxySerilog` signature Phase 2 defines, and both edit
`HostingExtensions.cs` + `csproj`. Phases 4, 5, 6 are **mutually independent**
and parallelizable once Phase 3 is merged.

### Wave plan

| Wave | Phases  | Agents              | Mode                                                           |
| ---- | ------- | ------------------- | -------------------------------------------------------------- |
| W1   | Phase 1 | 1 agent             | Single — touches `csproj`                                      |
| W2   | Phase 2 | 1 agent             | Single — touches `csproj` + `HostingExtensions`                |
| W3   | Phase 3 | 1 agent             | Single — touches `csproj` + `HostingExtensions` + `Program.cs` |
| W4   | 4, 5, 6 | 3 agents (parallel) | Parallel — disjoint file sets                                  |

> Phase 4 touches `csproj` but no other W4 phase does, so within W4 the file
> sets are still disjoint. Safe.

### File-ownership matrix (the parallel-safety contract)

| File                                             | P1 | P2 | P3 | P4 | P5 | P6 |
| ------------------------------------------------ | -- | -- | -- | -- | -- | -- |
| `Mbproxy.csproj`                                 | x  | x  | x  | x  |    |    |
| `HostingExtensions.cs`                           |    | x  | x  |    |    |    |
| `Program.cs`                                     |    |    | x  |    |    |    |
| `Diagnostics/*` (new + EventLogBridge)           |    | x  |    |    |    |    |
| `install/publish.*`                              | x  |    |    |    |    |    |
| `install/*.config.template.json`                 |    |    |    | x  |    |    |
| `install/install.sh`, `uninstall.sh`, `.service` |    |    |    |    | x  |    |
| `tests/**`                                       |    | x  | x  | x  |    |    |
| docs / READMEs / CLAUDE.md                       |    |    |    |    |    | x  |

No column in W4 (P4/P5/P6) shares a row. Confirmed conflict-free.

### Subagent rules (enforce in every dispatch prompt)

1. **One git worktree per subagent** — dispatch each `Agent` call with
   `isolation: "worktree"`. Physical isolation means even a stray edit can't
   corrupt a sibling's tree.
2. **Owned-file contract** — each subagent is told its exact owned file set
   from the matrix and instructed to edit nothing outside it. A subagent that
   discovers it needs an out-of-set file must stop and report, not edit.
3. **No intra-wave API coupling** — subagents in the same wave may only depend
   on public APIs from *already-merged* prior waves, never on a sibling's
   in-progress work. (This is why P2→P3 are separate waves, not parallel.)
4. **Tests ship with code** — the subagent that writes a phase's code also
   writes that phase's tests and runs `dotnet test` green *in its own
   worktree* before reporting done. No separate "test agent."
5. **Integrator merges in declared order** — the main agent merges each
   worktree, runs the full build + test suite, and only then declares the
   phase gate met. A failed gate blocks the next wave.
6. **High-contention files are single-agent-only** — `csproj`,
   `HostingExtensions.cs`, `Program.cs`, `CLAUDE.md` are never edited by two
   agents in the same wave (the matrix guarantees this).
7. **Prefer new files** — `DiagnosticSinkSelector.cs`, `SyslogBridge.cs`,
   `mbproxy.linux.config.template.json`, the shell scripts, the unit file are
   all new — new files can't merge-conflict, maximizing safe parallelism.
8. **Shared test hosts are integrator-only for mutations** — subagents may run
   `dotnet build` / `dotnet test` (read-mostly) but must **not** install a
   Windows Service, register an Event Log source, or `systemctl` against the
   real `10.100.0.35` host. Service-level E2E is the integrator's job at gate
   time; if a subagent needs Linux E2E it spins an ephemeral Docker container
   on the box (named per-agent, `--rm`) so parallel agents never collide on
   ports, the init system, or service accounts.

### Merge protocol per wave

```
for each wave:
  dispatch agent(s) with isolation: worktree + owned-file list
  on completion:
    integrator: merge worktree(s) in matrix order
    integrator: dotnet build -c Debug (must be green)
    integrator: dotnet test (green, count >= prior)
    integrator: dotnet publish -r win-x64 AND -r linux-x64 (must succeed)
    integrator: verify phase-specific gate checklist
  gate green? -> next wave. gate red? -> fix in a single-agent pass, re-gate.
```

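
The mechanical part of the protocol above (run the steps in order, stop at the first red one, and refuse a shrinking test count) could be wrapped in a small helper. This is a sketch with hypothetical function names, not the integrator's actual script. The `-c Release` flag in the usage comment follows Gate 1, and the phase-specific checklist still needs a human or a per-phase script.

```shell
#!/bin/sh
# Run gate steps in order and stop at the first failure.
# Each argument is one shell command line.
run_gate() {
  for step in "$@"; do
    echo "gate step: $step"
    if ! sh -c "$step"; then
      echo "gate RED at: $step" >&2
      return 1
    fi
  done
  echo "gate GREEN"
}

# The suite may grow between gates but must never shrink.
#   $1 = passed-test count at the prior gate, $2 = count now
count_ok() { [ "$2" -ge "$1" ]; }

# Usage:
#   run_gate "dotnet build -c Debug" \
#            "dotnet test" \
#            "dotnet publish -c Release -r win-x64" \
#            "dotnet publish -c Release -r linux-x64"
#   count_ok 398 411 || echo "test count regressed"
```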
---

## 4. Cross-Cutting Test Strategy

- **Existing baseline (325 = 282 unit + 43 E2E) must never regress.** Every
  gate re-runs the full suite.
- **New tests target pure logic** — `DiagnosticSinkSelector` is a pure function
  precisely so platform-selection is testable without being a service.
  Highest-value new test.
- **OS-conditional tests** use `[Trait]` + a runtime `OperatingSystem.IsWindows()`
  skip so the suite is green on both Windows and Linux.
- **Both platforms are exercised every gate, no simulation.** Windows runs
  locally (admin rights → real Windows Service install). Linux runs on
  `dohertj2@10.100.0.35` (Debian 13, systemd 257) — the integrator drives
  `dotnet build` / `dotnet test` / publish / systemd E2E over SSH.
- **CI** (if/when a pipeline exists): add a `linux-x64` build+test leg, ideally
  pointed at the same box or an equivalent image. Until then the integrator's
  per-gate SSH run on `10.100.0.35` is the Linux leg.
- **CA1416 platform analyzer** is treated as a test — `TreatWarningsAsErrors`
  already fails the build if a Windows API escapes its guard.

---

## 5. Risk Register

| Risk | Phase | Mitigation |
| ---- | ----- | ---------- |
| Windows Service behavior regresses unnoticed | P3 | Gate 3 mandates a real Windows Service install/start/stop smoke check |
| `Serilog.Sinks.Syslog` version drift | P2 | Pin the version; `SyslogBridge` is isolated behind `DiagnosticSinkSelector` |
| Linux publish ships Windows config path | P4 | RID-conditioned `<Content>` item + `MbproxyOptionsBindingTests` on both templates |
| Self-extracting single-file temp-dir perms | P1/P5 | Document + set `DOTNET_BUNDLE_EXTRACT_BASE_DIR` in the systemd unit |
| Two agents racing `csproj` | all | Matrix forbids it — `csproj` edited only in single-agent waves W1–W3 + lone P4 |
| Hidden Windows path elsewhere in code | all | `Grep` sweep for `C:\\`, `ProgramData`, `\\\\` before Gate 6 |
| Parallel Wave-4 agents collide on the shared `10.100.0.35` host | W4 | Rule 8 — service-level E2E is integrator-only and serial; subagent E2E uses per-agent `--rm` Docker containers |
| Windows Service E2E leaves stale service/Event Log source | P2/P3 | Integrator always runs `uninstall.ps1` after each Windows gate |

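
The path sweep named in the register can be run with plain `grep`. The sketch below wraps it so that it can sit in a gate script: the function name `sweep_windows_paths` is hypothetical, and the regular expression is one reading of the three patterns in the table (a `C:\` drive prefix, `ProgramData`, and a double backslash). Point it at source directories only, since the Windows config template is expected to match.

```shell
#!/bin/sh
# Report lines that look like hardcoded Windows paths: a C:\ drive prefix,
# the ProgramData directory, or a double backslash.
# Returns 0 when the tree is clean, 1 when matches were printed, 2 on error.
sweep_windows_paths() {
  sw_rc=0
  grep -rnE 'C:\\|ProgramData|\\\\' "$1" || sw_rc=$?
  case "$sw_rc" in
    0) return 1 ;;   # grep found at least one match
    1) return 0 ;;   # no matches
    *) return 2 ;;   # grep itself failed (for example, a missing directory)
  esac
}

# Usage: sweep_windows_paths src/Mbproxy || echo "review the hits above"
```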
---

## 6. Deliverable Summary

- **3 modified source files** (`csproj`, `HostingExtensions.cs`, `Program.cs`)
  + **2 new** (`DiagnosticSinkSelector.cs`, `SyslogBridge.cs`), plus the
  truncation-helper extraction from `EventLogBridge.cs`.
- **2 new packages** (`Microsoft.Extensions.Hosting.Systemd`,
  `Serilog.Sinks.Syslog`, implemented as `Serilog.Sinks.SyslogMessages`).
- **5 new install/tooling files** (`publish.sh`, Linux config template,
  `mbproxy.service`, `install.sh`, `uninstall.sh`).
- **~6–8 new tests** across 3 new/extended test files; baseline 325 preserved.
- **7 doc files** updated.
- **4 waves**, max 3 concurrent subagents, conflict-free by construction.