wwtools/mbproxy/plans/2026-05-15-multiplatform.md
Joseph Doherty b330faff03 mbproxy: cross-platform support — Linux/systemd alongside Windows
Make the service build, run, and install on Linux as a first-class
target while keeping the Windows Service + Event Log behaviour intact.

- Build: drop the hardcoded win-x64 RID — single-file publish now works
  for any RID. publish.ps1 gains -Rid; new publish.sh for Linux hosts.
- Diagnostics: DiagnosticSinkSelector picks the Error+ sink per host —
  Windows Event Log under the SCM, local syslog under systemd
  (Serilog.Sinks.SyslogMessages), none for interactive runs. The
  EventLog truncation helper is extracted so it is testable cross-OS.
- Host: Program.cs registers AddSystemd() alongside AddWindowsService().
- Config: a RID-conditioned appsettings template ships Windows or Unix
  paths; both templates are schema-validated by a test.
- Install: systemd unit (Type=exec) plus install.sh / uninstall.sh.
  Also fixes two cross-platform bugs found while testing: install.ps1
  and uninstall.ps1 used New-EventLog / Remove-EventLog (absent in
  PowerShell 7), and the E2E sim launcher hardcoded Windows venv paths.
- Docs updated across README, CLAUDE.md, and docs/ for dual-platform.

413 tests pass on Windows; 374 (all non-simulator) on Linux.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-15 09:41:59 -04:00


mbproxy Multiplatform Implementation Plan

Created: 2026-05-15

Status: All six phases implemented. 413 tests green on Windows; Windows Service and Linux systemd install E2E both green. Two findings (pymodbus-sim-on-Linux, AddSystemd() notify) logged as orthogonal follow-ups. Working tree only — nothing committed.

Working artifact — not part of the docs/ source-of-truth tree (per ../DOCS-GUIDE.md). Delete or archive once the work lands.

Progress log

  • 2026-05-15 — Phase 1 done, Gate 1 green. RID removed from csproj (single-file settings now gated on '$(RuntimeIdentifier)' != ''); publish.ps1 gained -Rid; publish.sh added. dotnet build -c Debug 0 warnings; dotnet test 398 passed / 0 failed (baseline 325 → 398, the Keepalive feature added tests); win-x64 Mbproxy.exe 100.1 MB, linux-x64 Mbproxy ELF 97.2 MB. ELF launch-smoked on 10.100.0.35: full startup, listeners bound, mbproxy.startup.ready + admin endpoint up, no errors. Box prep done (.NET SDK 10.0.300, shellcheck 0.10.0 installed).

  • 2026-05-15 — Phases 2 + 3 code done (combined integrator pass). Packages added: Microsoft.Extensions.Hosting.Systemd 10.0.8, Serilog.Sinks.SyslogMessages 4.1.0 (the maintained IonxSolutions package — the bare Serilog.Sinks.Syslog ID is a near-abandoned 0.2.0 package; same approved intent). New DiagnosticSink enum + DiagnosticSinkSelector (pure); new SyslogBridge; EventLogBridge truncation extracted to a non-annotated EventLogMessage type (testable cross-OS). AddMbproxySerilog now selects the sink internally; Program.cs calls AddSystemd() + AddWindowsService(). 13 new tests. 411 passed / 0 failed on Windows; on 10.100.0.35 372 passed / 39 skipped / 0 failed — all 39 skips are simulator-backed E2E (see finding below), every host/diagnostic/smoke test green on Linux.

  • 2026-05-15 — Two cross-platform bugs found and fixed in install tooling. (1) tests/sim/run-dl205-sim.ps1 was Windows-only: it hardcoded the venv paths Scripts\*.exe. It now branches between Scripts/*.exe and bin/* on $IsWindows and adds python3 to the interpreter candidates. (2) install.ps1 / uninstall.ps1 used New-EventLog / Remove-EventLog, which exist only in Windows PowerShell 5.1 and fail under PowerShell 7+. Switched to the .NET API ([EventLog]::CreateEventSource / DeleteEventSource), symmetric with the SourceExists calls already in those scripts.

  • 2026-05-15 — Windows Service E2E green (local, admin). Republished win-x64; install.ps1 -Start installs + starts the service; verified Running/Automatic, status.json served, listeners bound, mbproxy.startup.ready logged, Event Log source registered, WindowsServiceLifetime wrote "Service started successfully" (proves the process runs under the SCM). uninstall.ps1 stopped/deleted the service, archived logs, removed the Event Log source. Box left clean. (A forced EventLogBridge Error+ write was not pursued — Emit is unchanged code, covered by EventLogMessageTests; sink selection is covered by DiagnosticSinkSelectorTests.)

  • 2026-05-15 — Linux systemd E2E done. The linux-x64 ELF runs under a real systemd unit on 10.100.0.35: starts, binds listeners, serves the admin endpoint, and systemctl stop → graceful SIGTERM drain (mbproxy.shutdown.complete in the journal). Type=notify does not work (see Findings) → Phase 5 will ship Type=exec. Box prep this session: dotnet-sdk-10.0, shellcheck, python3-venv, pwsh 7.6.1 (dotnet global tool), pymodbus 3.13.0 venv.

  • 2026-05-15 — Phases 4-6 done. Phase 4: new install/mbproxy.linux.config.template.json (Unix log path /var/log/mbproxy, systemd-oriented comments); csproj links the platform-correct template into the published appsettings.json by RID (win-*/RID-less → Windows, else Unix), verified by publishing both RIDs; MbproxyOptionsBindingTests extended to load + schema-validate both templates (now 413 tests on Windows). Phase 5: install/mbproxy.service (Type=exec, hardened, mbproxy service account), install/install.sh, install/uninstall.sh, all shellcheck clean; install→active→status.json served→uninstall→clean E2E passed on 10.100.0.35. Phase 6: README.md, mbproxy/CLAUDE.md, ../CLAUDE.md, docs/Operations/Configuration.md, docs/Reference/LogEvents.md, docs/Operations/Troubleshooting.md, docs/Architecture/Overview.md, docs/Features/HotReload.md updated for the dual-platform reality.

Findings

  • Linux full run: 374 passed / 37 failed / 0 skipped. With the simulator launcher fixed and pymodbus provisioned, the simulator-backed E2E tests now run on Linux (0 skipped) but 37 fail with IOException: Broken pipe (SocketException) when the NModbus client writes through the proxy. The failures are broad across all simulator-backed E2E (cache, forwarding, rewriter, supervision). Not a Phases 1-3 regression: the multiplatform work touches only build config, diagnostic sinks, and host registration, none of the Modbus proxy data path. The same 37 tests pass on Windows (411/411), and every non-E2E test, including all 13 new diagnostic tests, passes on Linux. Root cause isolated: the SimulatorSmokeTests, which connect directly to the pymodbus simulator with no proxy in the path, also fail (TCP connect error). So the fault is the pymodbus 3.13.0 simulator itself on this box, not mbproxy's proxy code. Likely pymodbus 3.13.0 vs Python 3.13.5 (both very new), or the box's Docker-host networking. Treated as a separate investigation (pymodbus-simulator-on-Linux), entirely orthogonal to the multiplatform service work; see the session report.
  • The run-dl205-sim.ps1 idempotency check keys on Test-Path $venvDir only; a venv left structurally broken by a killed run (no bin/) is not detected and re-created. Pre-existing latent gap, not platform-specific — noted, not fixed (out of scope; a clean run is unaffected).
  • AddSystemd() does not deliver sd_notify(READY=1) here → Phase 5 uses Type=exec. mbproxy runs correctly under systemd (starts, binds, serves, and SIGTERM → graceful drain all work — verified in the journal), but a Type=notify unit never receives READY=1 and times out. Isolated step by step: SystemdHelpers.IsSystemdService() correctly returns True under systemd; a minimal Host.CreateApplicationBuilder() + AddSystemd() host reproduces the failure; both a systemd-run transient unit and a real Type=notify unit file fail identically. So it is not an mbproxy bug — it is a HostApplicationBuilder + Microsoft.Extensions.Hosting.Systemd 10.0.8 (minimal-hosting) issue. Resolution: the Phase 5 unit uses Type=exec — mbproxy is a leaf service that nothing orders against, so the readiness signal is unnecessary; Type=exec + the generic host's built-in POSIX SIGTERM handling (independent of SystemdLifetime) gives a fully working unit with Restart=on-failure. AddSystemd() stays in Program.cs (correct, documented, forward-compatible, harmless). Root-causing the .NET notify gap is logged as a separate follow-up.
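
As a rough illustration of the Type=exec resolution described in the last finding, the following sketch prints a candidate unit file. It is not the shipped install/mbproxy.service: the install prefix, the extract directory, the Description text, and the WantedBy target are all assumptions made for the example, and the hardening directives the progress log mentions are deliberately left out because this plan does not list them.

```shell
#!/bin/sh
# Sketch only: prints a candidate mbproxy.service to stdout.
# render_unit [PREFIX] [EXTRACT_DIR]
render_unit() {
    prefix="${1:-/opt/mbproxy}"                   # hypothetical install location
    extract_dir="${2:-/var/lib/mbproxy/extract}"  # hypothetical single-file extract dir
    cat <<EOF
[Unit]
Description=mbproxy

[Service]
# Type=exec: systemd treats the unit as started once the binary has been
# executed, so no READY=1 notification is required.
Type=exec
User=mbproxy
ExecStart=${prefix}/Mbproxy
Environment=DOTNET_BUNDLE_EXTRACT_BASE_DIR=${extract_dir}
Restart=on-failure

[Install]
# Assumed target; the plan does not name one.
WantedBy=multi-user.target
EOF
}
```

Calling `render_unit > mbproxy.service` writes the sketch to a file for inspection; the values the plan actually fixes are Type=exec, Restart=on-failure, User=mbproxy, and the DOTNET_BUNDLE_EXTRACT_BASE_DIR variable.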

A plan to make mbproxy run on Linux (and incidentally macOS) as a first-class target while keeping the Windows Service + Event Log behavior intact and adding systemd + journald/syslog equivalents.

The hosting model (Host.CreateApplicationBuilder + IHostedService + Kestrel) is already portable, so the work is narrow: generalize the build, abstract one diagnostic sink, add one package + one call, and add Linux tooling/docs.


0. Test Environments

Both platforms can be exercised fully — no environment is simulated or deferred.

0.1 Windows (the dev box — local)

The dev box runs with administrator rights, so every Windows gate runs locally with no separate test machine:

  • install.ps1 (requires elevation) installs the real Windows Service.
  • The Event Log source mbproxy can be registered and EventLogBridge writes verified against the Application log.
  • Install → start → stop → uninstall is a full local round-trip.

Windows Service E2E mutates machine state (a registered service + Event Log source). It is integrator-only and the integrator always runs uninstall.ps1 to leave the box clean after each gate.

0.2 Linux

Host: dohertj2@10.100.0.35 — Debian 13 (trixie), amd64, kernel 6.12, hostname DOCKER. systemd 257.

  • Access: passwordless SSH from the Windows dev box; passwordless sudo (verified 2026-05-15).
  • Reachable on 10.100.0.35 (also 10.50.0.35, 10.200.0.35).
  • One-time prep (run once before Wave 1 gates):
    ssh dohertj2@10.100.0.35 'sudo apt-get update && \
        sudo apt-get install -y dotnet-sdk-10.0 shellcheck'
    
    dotnet-sdk-10.0 candidate is 10.0.203 — matches the net10.0 target.
  • Docker is installed on the box (the user is in the docker group). Use ephemeral Debian containers to isolate per-subagent E2E runs so parallel Wave-4 agents don't collide on the host's systemd / ports (see section 3, rule 8).

How the integrator uses the box per gate:

  • Push the integration branch (or rsync the worktree) to the box, then run dotnet build / dotnet test / dotnet publish -r linux-x64 over SSH.
  • Run the actual linux-x64 ELF binary, the systemd unit, and shellcheck here — Windows can cross-publish a linux-x64 binary but cannot run or service-host it.

The box is a shared mutable resource. Host-level mutations (apt installs, systemctl on the real host, privileged-port binds) are integrator-only and run serially between waves. Subagents that need Linux E2E use throwaway Docker containers, never the host's init system directly.


1. Scope

In scope

  • Linux (linux-x64) as a supported runtime target alongside win-x64.
  • systemd integration (Type=notify, sd_notify readiness, SIGTERM drain).
  • A Linux-appropriate error-event diagnostic sink (syslog, severity-mapped).
  • RID-agnostic build + dual-RID publish tooling.
  • Linux install tooling (systemd unit + shell scripts).
  • Docs/README/CLAUDE.md updates.

Out of scope (state explicitly in docs)

  • macOS launchd integration — mbproxy will run on macOS as a console process but ships no service-manager integration.
  • ARM RIDs (linux-arm64) — the build will not forbid them, but they are untested.
  • Container/Docker packaging — separate future effort.

Locked design decisions

  • Reference Microsoft.Extensions.Hosting.WindowsServices and Microsoft.Extensions.Hosting.Systemd unconditionally; both packages are portable and both helpers self-detect their host. No conditional <PackageReference>.
  • All Windows API calls (System.Diagnostics.EventLog) stay behind OperatingSystem.IsWindows() + [SupportedOSPlatform("windows")]; CA1416 (already enforced via TreatWarningsAsErrors) is the safety net.
  • Diagnostic sink selection happens once, at the composition root (AddMbproxySerilog). No OS branching anywhere else.
  • Prefer new files over editing shared files, to keep parallel work conflict-free.
  • Linux error-event sink: Serilog.Sinks.Syslog (decided 2026-05-15). Error+ events get RFC5424 severity mapping on Linux, mirroring the Windows Event Log behavior where Error+ is surfaced distinctly. DiagnosticSinkSelector returns EventLog | Syslog | None.

2. Phase Breakdown

Each phase lists its owned file set (the parallel-safety contract), changes, tests, and a gate that must be green before the next phase starts.

Phase 1 — Build & publish generalization (foundation)

Objective: Remove the hardcoded RID so the project builds/publishes for any runtime; keep the Windows output byte-identical.

Owned files

  • src/Mbproxy/Mbproxy.csproj
  • install/publish.ps1
  • install/publish.sh (new)

Changes

  • Mbproxy.csproj: delete <RuntimeIdentifier>win-x64</RuntimeIdentifier> from the Release PropertyGroup; keep PublishSingleFile / SelfContained / IncludeNativeLibrariesForSelfExtract. RID becomes a publish-time -r argument.
  • publish.ps1: add a -Rid parameter (default win-x64), keep the two-flavor logic.
  • publish.sh: Linux counterpart producing linux-x64 self-contained + framework-dependent builds.
  • (The RID-conditioned appsettings.json content item is Phase 4; in Phase 1 just confirm the build works without a baked RID.)
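
A minimal sketch of what a RID-parameterised publish step might look like, assuming the two flavors are self-contained and framework-dependent as described above. The function name, the default RID, and the output directory layout are hypothetical; the function only prints the dotnet publish command so the shape can be checked without a .NET SDK.

```shell
#!/bin/sh
# Sketch of the publish step: prints the command instead of running it.
# publish_cmd [RID] [FLAVOR]
publish_cmd() {
    rid="${1:-linux-x64}"          # assumed default for a Linux host
    flavor="${2:-self-contained}"  # or: framework-dependent
    case "$flavor" in
        self-contained)      sc=true  ;;
        framework-dependent) sc=false ;;
        *) echo "unknown flavor: $flavor" >&2; return 2 ;;
    esac
    # The publish/<rid>/<flavor> output layout is an assumption.
    echo "dotnet publish src/Mbproxy/Mbproxy.csproj -c Release -r $rid --self-contained $sc -o publish/$rid/$flavor"
}
```

To execute rather than print, the output can be passed to the shell, for example `sh -c "$(publish_cmd linux-x64 self-contained)"`.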

Tests

  • No xunit tests (build-config change). Gate is publish success on both RIDs.

Gate 1

  • dotnet build -c Debug green; dotnet test full suite green (unchanged count).
  • dotnet publish -c Release -r win-x64 produces a single-file Mbproxy.exe (same size class as before).
  • dotnet publish -c Release -r linux-x64 produces a single-file Mbproxy ELF binary. Cross-published from the Windows dev box; the ELF is then copied to 10.100.0.35 and confirmed to launch (./Mbproxy --version-class smoke).
  • Zero new analyzer warnings.

Phase 2 — Diagnostic sink abstraction

Objective: Make error-event delivery a platform-selected sink. Windows keeps EventLogBridge; Linux gets a syslog sink.

Owned files

  • src/Mbproxy/Diagnostics/DiagnosticSinkSelector.cs (new — pure selection logic)
  • src/Mbproxy/Diagnostics/SyslogBridge.cs (new)
  • src/Mbproxy/Diagnostics/EventLogBridge.cs (minor: extract the 32 KB truncation helper into a testable static method)
  • src/Mbproxy/HostingExtensions.cs (only AddMbproxySerilog)
  • src/Mbproxy/Mbproxy.csproj (add Serilog.Sinks.Syslog package)
  • New test files (see below)

HostingExtensions.cs and Mbproxy.csproj are also touched by Phase 3. Phases 2 and 3 must not run in parallel (see section 3). They are sequential.

Changes

  • DiagnosticSinkSelector — a pure function taking (bool isWindows, bool isWindowsService, bool isSystemd) and returning an enum (EventLog | Syslog | None). No I/O, fully unit-testable.
  • SyslogBridge: Serilog ILogEventSink wrapping Serilog.Sinks.Syslog, active for Error+ only, mirroring EventLogBridge's contract (silent no-op if syslog unavailable).
  • AddMbproxySerilog: replace the addEventLogBridge bool parameter with a DiagnosticSinkSelector result; wire the chosen sink. Keep the OperatingSystem.IsWindows() guard around EventLogBridge.
  • Extract EventLogBridge's message-truncation into internal static string TruncateToEventLogLimit(string) so it can be tested OS-independently.

Tests (tests/Mbproxy.Tests/Diagnostics/)

  • DiagnosticSinkSelectorTests — table-driven: Windows+service→EventLog; Windows console→None; Linux+systemd→Syslog; Linux console→None; macOS→None.
  • EventLogBridgeTests: [Trait("Category","Unit")], Windows-guarded facts (source-missing → silent no-op); the truncation helper caps at 32 KB and appends ... (this fact runs on all OSes since the helper is pure).
  • SyslogBridgeTests — Error+ filter; no-throw when transport unavailable.

Gate 2

  • Full test suite green on Windows (local); full suite green on Linux — integrator runs dotnet test over SSH on 10.100.0.35.
  • EventLogBridge emits to the Application log — verified locally via a real Windows Service install (install.ps1, admin rights available), then uninstall.ps1 to clean up.
  • CA1416: zero warnings.

Phase 3 — Service host integration (systemd)

Objective: Register both init-system integrations; the host correctly reports readiness to whichever launched it.

Owned files

  • src/Mbproxy/Program.cs
  • src/Mbproxy/HostingExtensions.cs (call-site update only)
  • src/Mbproxy/Mbproxy.csproj (add Microsoft.Extensions.Hosting.Systemd)

Changes

  • csproj: add <PackageReference Include="Microsoft.Extensions.Hosting.Systemd" /> (pin to the 10.0.x line matching the existing Windows-services package).
  • Program.cs: call builder.Services.AddSystemd(); alongside AddWindowsService();. Compute isSystemd via SystemdHelpers.IsSystemdService() and feed DiagnosticSinkSelector together with isWindowsService.
  • Confirm SIGTERM → host shutdown → existing Connection.GracefulShutdownTimeoutMs drain path works (it does — POSIX signal handling is built into the generic host; just verify).

Tests (tests/Mbproxy.Tests/HostSmokeTests.cs — extend existing file)

  • HostSmoke_RegistersBothServiceIntegrations_StartsAndStops — builds the host with both AddWindowsService + AddSystemd, asserts no throw, asserts mbproxy.startup.ready still logged.
  • Existing two smoke tests must remain green.

Gate 3

  • Full suite green on Windows (local) and Linux (10.100.0.35 via SSH).
  • Windows Service E2E, run locally with admin rights: install.ps1 → service starts, logs mbproxy.startup.ready + writes to Event Log, Stop-Service drains cleanly, uninstall.ps1 removes it. No regression in Windows behavior is the hard requirement of this gate.
  • Linux systemd E2E on 10.100.0.35: done. The linux-x64 binary runs under a real systemd unit: it starts, binds listeners, serves the admin endpoint, and systemctl stop (SIGTERM) drains gracefully (mbproxy.shutdown.complete in the journal). Type=notify was found not to deliver READY=1 (Findings) → the Phase 5 unit uses Type=exec, under which the service is fully functional.
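
One way the journal check in the Linux E2E bullet could be scripted is sketched below. The function name is hypothetical, and it assumes the two event names appear verbatim in the journal text; it reads log lines on stdin and succeeds only if mbproxy.startup.ready is followed later by mbproxy.shutdown.complete.

```shell
#!/bin/sh
# drain_verified: reads log text on stdin; exit 0 if a startup.ready event is
# followed by a shutdown.complete event, exit 1 otherwise.
drain_verified() {
    awk '
        index($0, "mbproxy.startup.ready")     { ready = 1 }
        index($0, "mbproxy.shutdown.complete") { if (ready) drained = 1 }
        END { exit drained ? 0 : 1 }
    '
}
```

On the test box this might be fed from the journal after a stop, for example `journalctl -u mbproxy --no-pager | drain_verified`, assuming the unit is named mbproxy.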

Phase 4 — Config & filesystem portability

Objective: No Windows-only paths in the shipped/installed config.

Owned files

  • install/mbproxy.config.template.json (Windows — keep C:\ProgramData\... path)
  • install/mbproxy.linux.config.template.json (new — /var/log/mbproxy/..., Linux syslog Using entry)
  • src/Mbproxy/Mbproxy.csproj (condition the linked appsettings.json content item by $(RuntimeIdentifier))

Touches csproj. Must run after Phase 3's csproj edit is merged (sequential w.r.t. csproj), but is otherwise independent of Phase 5/6.

Changes

  • New Linux template: log path /var/log/mbproxy/mbproxy-.log; Serilog Using array includes the syslog sink; comment header points at /etc/mbproxy/appsettings.json.
  • csproj: link the win template for win-* RIDs and the linux template for linux-* RIDs into the published appsettings.json (RID-conditioned <Content> items).
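
The RID-to-template rule lives in MSBuild conditions in the csproj; as a sketch of the same decision, using the rule recorded in the progress log (win-* or no RID selects the Windows template, anything else the Unix one), a shell equivalent might look like this. The function name is hypothetical.

```shell
#!/bin/sh
# template_for_rid [RID]: prints the template that would be linked into the
# published appsettings.json for the given RID.
template_for_rid() {
    case "${1:-}" in
        ""|win-*) echo "install/mbproxy.config.template.json" ;;
        *)        echo "install/mbproxy.linux.config.template.json" ;;
    esac
}
```

Under this rule an untested RID such as linux-arm64 would also receive the Unix template, since only win-* and RID-less builds fall into the first branch.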

Tests (tests/Mbproxy.Tests/Options/)

  • Extend MbproxyOptionsBindingTests: load each shipped template through the config binder + MbproxyOptionsValidator; assert both bind and validate cleanly. Catches a malformed Linux template at build time.

Gate 4

  • Both templates bind + validate (new test green).
  • dotnet publish -r linux-x64 ships the Linux template as appsettings.json; -r win-x64 ships the Windows one. Verify by inspecting publish output.

Phase 5 — Linux install tooling

Objective: Parity with install.ps1 for systemd hosts.

Owned files (all new, fully disjoint from all other phases)

  • install/mbproxy.service — systemd unit, Type=exec (not Type=notify — see Findings: AddSystemd() does not deliver READY=1 for the minimal hosting model), Restart=on-failure, User=mbproxy, ExecStart pointing at the installed binary; sets DOTNET_BUNDLE_EXTRACT_BASE_DIR.
  • install/install.sh — creates mbproxy service account, lays down binary + /etc/mbproxy/appsettings.json (preserve-if-exists, matching install.ps1 semantics), creates /var/log/mbproxy, installs + systemctl enable --now.
  • install/uninstall.sh: systemctl disable --now, archives logs (mirroring the .archived-<ts> convention), removes the unit.
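
Two of the behaviors listed above, preserve-if-exists for the config file and log archiving on uninstall, can be sketched as small helpers. Both function names are hypothetical, and the timestamp format in the archive suffix is an assumption; the plan only says to mirror the .archived-<ts> convention.

```shell
#!/bin/sh
# install_config SRC DEST: copy the template only when DEST does not exist,
# so an operator-edited config survives a reinstall.
install_config() {
    src="$1"
    dest="$2"
    if [ -e "$dest" ]; then
        echo "keeping existing $dest"
        return 0
    fi
    mkdir -p "$(dirname "$dest")"
    cp "$src" "$dest"
    echo "installed $dest"
}

# archive_logs DIR: rename DIR to DIR.archived-<ts> and print the new name.
# The UTC timestamp format is an assumption.
archive_logs() {
    dir="$1"
    [ -d "$dir" ] || return 0    # nothing to archive
    ts="$(date -u +%Y%m%dT%H%M%SZ)"
    mv "$dir" "${dir}.archived-${ts}"
    echo "${dir}.archived-${ts}"
}
```

In an installer these would typically be called with the real paths, for example `install_config appsettings.json /etc/mbproxy/appsettings.json` and `archive_logs /var/log/mbproxy`, both of which need root on the target host.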

Tests

  • Not xunit. Gate = shellcheck clean + a dry-run inside a throwaway Debian container on 10.100.0.35.

Gate 5

  • shellcheck install/*.sh clean — run on 10.100.0.35 (shellcheck installed in the one-time prep).
  • End-to-end on 10.100.0.35, inside a throwaway Debian container: install.sh → service active → proxy answers Modbus on a configured port → uninstall.sh → service gone, logs archived. Container isolation keeps the mbproxy service account / unit off the real host.

Phase 6 — Documentation

Objective: Docs reflect dual-platform reality; doctrine in DOCS-GUIDE.md respected.

Owned files

  • README.md — rewrite "Hard constraints / prerequisites" (drop "No Linux or Docker support"); add Linux install path; document both publish flavors × both RIDs.
  • docs/Operations/Configuration.md — both config templates, log-path differences, syslog vs Event Log.
  • docs/Operations/Troubleshooting.md: journalctl guidance alongside Event Viewer.
  • docs/Architecture/Overview.md — note dual init-system hosting (only if it shifts a headline bullet).
  • docs/Reference/LogEvents.md — note Error+ events route to Event Log (Windows) / syslog (Linux).
  • mbproxy/CLAUDE.md — correct the implied Windows-only framing.
  • wwtools/CLAUDE.md — broaden the mbproxy index row if the task→tool mapping changed.

Tests

  • Markdown link-check across touched files.

Gate 6

  • All internal doc links resolve.
  • README "Hard constraints" no longer contradicts the shipped tooling.

3. Parallel Subagent Execution Plan

Dependency graph

Phase 1 (build) ──> Phase 2 (diagnostics) ──> Phase 3 (host) ──┬─> Phase 4 (config)
                                                               ├─> Phase 5 (install)
                                                               └─> Phase 6 (docs)

Phases 2 and 3 are strictly sequential: Phase 3 calls the new AddMbproxySerilog signature Phase 2 defines, and both edit HostingExtensions.cs + csproj. Phases 4, 5, 6 are mutually independent and parallelizable once Phase 3 is merged.

Wave plan

| Wave | Phases | Agents | Mode |
|------|--------|--------|------|
| W1 | Phase 1 | 1 agent | Single — touches csproj |
| W2 | Phase 2 | 1 agent | Single — touches csproj + HostingExtensions |
| W3 | Phase 3 | 1 agent | Single — touches csproj + HostingExtensions + Program.cs |
| W4 | 4, 5, 6 | 3 agents (parallel) | Parallel — disjoint file sets |

Phase 4 touches csproj but no other W4 phase does, so within W4 the file sets are still disjoint. Safe.

File-ownership matrix (the parallel-safety contract)

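| File | P1 | P2 | P3 | P4 | P5 | P6 |
|------|----|----|----|----|----|----|
| Mbproxy.csproj | x | x | x | x |  |  |
| HostingExtensions.cs |  | x | x |  |  |  |
| Program.cs |  |  | x |  |  |  |
| Diagnostics/* (new + EventLogBridge) |  | x |  |  |  |  |
| install/publish.* | x |  |  |  |  |  |
| install/*.config.template.json |  |  |  | x |  |  |
| install/install.sh, uninstall.sh, .service |  |  |  |  | x |  |
| tests/** |  | x | x | x |  |  |
| docs / READMEs / CLAUDE.md |  |  |  |  |  | x |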

No column in W4 (P4/P5/P6) shares a row. Confirmed conflict-free.

Subagent rules (enforce in every dispatch prompt)

  1. One git worktree per subagent — dispatch each Agent call with isolation: "worktree". Physical isolation means even a stray edit can't corrupt a sibling's tree.
  2. Owned-file contract — each subagent is told its exact owned file set from the matrix and instructed to edit nothing outside it. A subagent that discovers it needs an out-of-set file must stop and report, not edit.
  3. No intra-wave API coupling — subagents in the same wave may only depend on public APIs from already-merged prior waves, never on a sibling's in-progress work. (This is why P2→P3 are separate waves, not parallel.)
  4. Tests ship with code — the subagent that writes a phase's code also writes that phase's tests and runs dotnet test green in its own worktree before reporting done. No separate "test agent."
  5. Integrator merges in declared order — the main agent merges each worktree, runs the full build + test suite, and only then declares the phase gate met. A failed gate blocks the next wave.
  6. High-contention files are single-agent-only: csproj, HostingExtensions.cs, Program.cs, CLAUDE.md are never edited by two agents in the same wave (the matrix guarantees this).
  7. Prefer new files: DiagnosticSinkSelector.cs, SyslogBridge.cs, mbproxy.linux.config.template.json, the shell scripts, the unit file are all new — new files can't merge-conflict, maximizing safe parallelism.
  8. Shared test hosts are integrator-only for mutations — subagents may run dotnet build / dotnet test (read-mostly) but must not install a Windows Service, register an Event Log source, or systemctl against the real 10.100.0.35 host. Service-level E2E is the integrator's job at gate time; if a subagent needs Linux E2E it spins an ephemeral Docker container on the box (named per-agent, --rm) so parallel agents never collide on ports, the init system, or service accounts.

Merge protocol per wave

for each wave:
    dispatch agent(s) with isolation: worktree + owned-file list
    on completion:
        integrator: merge worktree(s) in matrix order
        integrator: dotnet build -c Debug          (must be green)
        integrator: dotnet test                    (green, count >= prior)
        integrator: dotnet publish -r win-x64 AND -r linux-x64 (must succeed)
        integrator: verify phase-specific gate checklist
    gate green? -> next wave.  gate red? -> fix in a single-agent pass, re-gate.
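
The integrator's common gate steps could be driven by a small runner along these lines. This is a sketch under stated assumptions: the function names and the GATE_RUNNER hook are hypothetical, -c Release on the publish steps follows Gate 1, and the test-count comparison and the phase-specific checklist are left to the integrator.

```shell
#!/bin/sh
# Steps common to every gate, in the order the protocol lists them.
gate_steps() {
    cat <<'EOF'
dotnet build -c Debug
dotnet test
dotnet publish -c Release -r win-x64
dotnet publish -c Release -r linux-x64
EOF
}

# run_gate: run each step in order and stop at the first failure (gate red).
# GATE_RUNNER is a hypothetical hook; set it to `echo` for a dry run.
# The body is a subshell so `exit` never terminates the calling shell.
run_gate() (
    gate_steps | while IFS= read -r step; do
        echo "gate: $step"
        # stdin is redirected so a step cannot swallow the remaining steps
        ${GATE_RUNNER:-sh -c} "$step" </dev/null || {
            echo "gate red: $step" >&2
            exit 1
        }
    done
)
```

`GATE_RUNNER=echo run_gate` prints the four steps without running them; a plain `run_gate` executes them from the repository root and returns non-zero as soon as one fails.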

4. Cross-Cutting Test Strategy

  • Existing baseline (325 = 282 unit + 43 E2E) must never regress. Every gate re-runs the full suite.
  • New tests target pure logic: DiagnosticSinkSelector is a pure function precisely so platform-selection is testable without being a service. Highest-value new test.
  • OS-conditional tests use [Trait] + a runtime OperatingSystem.IsWindows() skip so the suite is green on both Windows and Linux.
  • Both platforms are exercised every gate, no simulation. Windows runs locally (admin rights → real Windows Service install). Linux runs on dohertj2@10.100.0.35 (Debian 13, systemd 257) — the integrator drives dotnet build / dotnet test / publish / systemd E2E over SSH.
  • CI (if/when a pipeline exists): add a linux-x64 build+test leg, ideally pointed at the same box or an equivalent image. Until then the integrator's per-gate SSH run on 10.100.0.35 is the Linux leg.
  • CA1416 platform analyzer is treated as a test — TreatWarningsAsErrors already fails the build if a Windows API escapes its guard.

5. Risk Register

| Risk | Phase | Mitigation |
|------|-------|------------|
| Windows Service behavior regresses unnoticed | P3 | Gate 3 mandates a real Windows Service install/start/stop smoke check |
| Serilog.Sinks.Syslog version drift | P2 | Pin the version; SyslogBridge is isolated behind DiagnosticSinkSelector |
| Linux publish ships Windows config path | P4 | RID-conditioned <Content> item + MbproxyOptionsBindingTests on both templates |
| Self-extracting single-file temp-dir perms | P1/P5 | Document + set DOTNET_BUNDLE_EXTRACT_BASE_DIR in the systemd unit |
| Two agents racing csproj | all | Matrix forbids it — csproj edited only in single-agent waves W1-W3 + lone P4 |
| Hidden Windows path elsewhere in code | all | Grep sweep for C:\\, ProgramData, \\\\ before Gate 6 |
| Parallel Wave-4 agents collide on the shared 10.100.0.35 host | W4 | Rule 8 — service-level E2E is integrator-only and serial; subagent E2E uses per-agent --rm Docker containers |
| Windows Service E2E leaves stale service/Event Log source | P2/P3 | Integrator always runs uninstall.ps1 after each Windows gate |
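
The grep sweep named in the register might be scripted roughly as follows. The function name is hypothetical and the patterns are one reading of the register entry: a drive-rooted path, the ProgramData folder name, and a doubled backslash, all matched as fixed strings.

```shell
#!/bin/sh
# find_windows_paths DIR...: list lines that look like Windows paths.
# Exit status follows grep: 0 = hits found (review them), 1 = clean.
find_windows_paths() {
    # -F treats each pattern as a literal string, so the backslashes need
    # no regex escaping.
    grep -rnF -e 'C:\' -e 'ProgramData' -e '\\' -- "$@"
}
```

Run as `find_windows_paths src install` from the repository root; hits in the Windows config template are expected, so the output is a review list rather than a pass/fail signal.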

6. Deliverable Summary

  • 3 modified source files (csproj, HostingExtensions.cs, Program.cs)
    • 3 new (DiagnosticSinkSelector.cs, SyslogBridge.cs, and the truncation-helper extraction in EventLogBridge.cs).
  • 2 new packages (Microsoft.Extensions.Hosting.Systemd, Serilog.Sinks.Syslog).
  • 5 new install/tooling files (publish.sh, Linux config template, mbproxy.service, install.sh, uninstall.sh).
  • ~6-8 new tests across 3 new/extended test files; baseline 325 preserved.
  • 7 doc files updated.
  • 4 waves, max 3 concurrent subagents, conflict-free by construction.