Adds the mbproxy service end-to-end.

Phases 00-08 implement the production-ready single-listener / 1:1-backend transparent Modbus TCP proxy with bidirectional BCD rewriting for the ~54-PLC DL205/DL260 fleet. Phase 9 replaces the connection layer with a single backend socket per PLC plus MBAP TxId rewriting, lifting the H2-ECOM100's 4-concurrent-client cap as an operational ceiling.

Phase 9 additions of note:
- PlcMultiplexer + UpstreamPipe + TxIdAllocator + CorrelationMap
- InFlightRequest with IReadOnlyList<InterestedParty> (load-bearing for Phase 10 read coalescing; do not collapse to a single field)
- Per-request watchdog: surfaces Modbus exception 0x0B to upstream on BackendRequestTimeoutMs, defending against lost responses, dead-PLC paths, and pymodbus 3.13.0's concurrent-multiplexed-request bug (its ServerRequestHandler.last_pdu state race)
- Status DTO + HTML gain inFlight / maxInFlight / txIdWraps / disconnectCascades / queueDepth (Tier 1.6 in docs/kpi.md)

Tests: 263 unit + 38 E2E. Multiplexer correctness under truly concurrent backend traffic is proved against a stub backend in PlcMultiplexerTests; MultiplexerE2ETests paces requests so pymodbus 3.13's single-PDU framer stays in known-good mode.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
mbproxy operations runbook
Day-two operations reference for the mbproxy Windows Service: install, upgrade, configuration, logs, and troubleshooting.
Install
Prerequisites
- Windows 10 / Server 2019 or later (64-bit).
- PowerShell 5.1+ run as Administrator (the install script uses #Requires -RunAsAdministrator).
- The compiled publish output from dotnet publish (see README.md for the exact command).
- Modbus TCP reachable from the proxy host to the PLCs on port 502.
- Port 8080 (or whatever AdminPort is set to) available for the status page.
Steps
1. Publish the binaries on the build machine:

       dotnet publish src/Mbproxy/Mbproxy.csproj -c Release -r win-x64 --self-contained true -o C:\build\mbproxy-publish

2. Copy the publish output to the target server (or run the install script locally if you built on the server).

3. Open an elevated PowerShell prompt and run the install script:

       .\install\install.ps1 -PublishOutput C:\build\mbproxy-publish -Start

   The script:
   - Copies binaries to C:\Program Files\Mbproxy\ (configurable via -InstallPath).
   - Registers the service with sc.exe create.
   - Sets failure-recovery: restart after 60 s on first/second failure, no action on third.
   - Creates %ProgramData%\mbproxy\logs\ and sets ACLs if needed.
   - Copies mbproxy.config.template.json → %ProgramData%\mbproxy\appsettings.json only if no config exists.
   - Registers the Windows Event Log source mbproxy.
   - With -Start, starts the service and waits up to 30 s for RUNNING state.

4. Edit %ProgramData%\mbproxy\appsettings.json to configure your PLC list and BCD tags. See the template for inline comments on every field.

5. If you edited the config before starting, start the service:

       sc.exe start mbproxy

6. Verify the deployment with the First-install smoke checklist below.
Re-running install on an existing installation
The install script is idempotent. Re-running it:
- Stops the service if running.
- Overwrites the binaries.
- Updates the service config via sc.exe config (not sc.exe create).
- Preserves %ProgramData%\mbproxy\appsettings.json (never overwritten on update).
- Skips Event Log source creation if already registered.
Upgrade procedure
1. Publish new binaries on the build machine (same command as install step 1).

2. Stop the service:

       sc.exe stop mbproxy

   Wait for the service to reach STOPPED state; graceful shutdown drains in-flight PDUs (up to Connection.GracefulShutdownTimeoutMs, default 10 s).

3. Copy new binaries to C:\Program Files\Mbproxy\ (or run install.ps1 -PublishOutput ... to automate steps 2–4):

       Copy-Item -Path C:\build\mbproxy-publish\* -Destination 'C:\Program Files\Mbproxy\' -Force

4. Start the service:

       sc.exe start mbproxy

5. Check the status page to confirm the new version:

       Invoke-RestMethod http://localhost:8080/status.json | Select-Object -ExpandProperty service

   The version field should show the new build.
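The version check in the last step can also be run from a monitoring host. The following is a minimal Python sketch, assuming /status.json exposes service.version as listed under Status page. The helper names, the version string "1.2.3", and the sample payload are hypothetical and only illustrate the comparison.

```python
import json
from urllib.request import urlopen


def fetch_status(base_url: str = "http://localhost:8080", timeout: float = 5.0) -> dict:
    """Download and parse /status.json from the admin endpoint."""
    with urlopen(f"{base_url}/status.json", timeout=timeout) as resp:
        return json.load(resp)


def version_matches(status: dict, expected: str) -> bool:
    """Return True when service.version equals the build that was just deployed."""
    return status.get("service", {}).get("version") == expected


# Hypothetical payload, trimmed to the fields this check reads.
sample = json.loads('{"service": {"version": "1.2.3", "uptimeSeconds": 4}}')
print(version_matches(sample, "1.2.3"))
```

In a real check, pass the result of fetch_status() instead of the sample payload and compare against the informational version set at publish time.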
Uninstall
.\install\uninstall.ps1
Options:
- -KeepConfig: preserves %ProgramData%\mbproxy\appsettings.json for re-install.
- Log files are always archived to %ProgramData%\mbproxy.archived-<timestamp>\logs\ regardless of -KeepConfig. They are never deleted.
Configuration
The service reads %ProgramData%\mbproxy\appsettings.json at startup and watches it for changes while running. Most settings are hot-reloadable; a few require a restart.
Hot-reload vs. restart
| Setting | Behaviour on file save |
|---|---|
| BcdTags.Global add/remove/width | Next PDU uses the new map; in-flight PDUs complete with the old map. |
| Plcs[].BcdTags.{Add,Remove} | Same per-PDU propagation. |
| Plcs[].Name or .Host or .ListenPort changed | Treated as remove + add: old listener stops, new one starts. |
| New Plcs[] entry | New listener binds immediately (subject to port availability). |
| Plcs[] entry removed | Supervisor stops the listener; all connected clients for that PLC are disconnected. |
| Connection.Backend*TimeoutMs | Next connect/request uses the new value. |
| Connection.GracefulShutdownTimeoutMs | Picked up on the next ApplicationStopping event. |
| AdminPort | Admin endpoint re-binds on the new port; old port released. |
| Invalid reload (schema error, duplicate ports/addresses) | Rejected as a whole. Current in-memory config stays; mbproxy.config.reload.rejected logged at Error. |
For more detail on the hot-reload propagation model, see design.md → "Configuration hot-reload".
Editing appsettings.json
The service picks up changes automatically, so a restart is needed only when updating the binary. A change to Connection.GracefulShutdownTimeoutMs is accepted like any other setting but has no visible effect until the next stop, because it applies only when the service shuts down.
If a reload is rejected (mbproxy.config.reload.rejected in the log), the service continues running with the previous config. Fix the JSON error and save again — the next valid file write will be accepted.
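Because an invalid file is rejected as a whole, it can help to check an edited copy before saving it over the live config. The following is a rough Python sketch that looks only for JSON syntax errors and duplicate ports. It assumes a top-level Mbproxy section containing AdminPort and a Plcs list with ListenPort fields; that layout is inferred from the setting names above and is not confirmed. It is not the service's own validation, and Python's json module rejects comments, so it only works on a comment-free file.

```python
import json
from collections import Counter


def find_port_conflicts(config_text: str) -> list[str]:
    """Return human-readable problems; an empty list means none were found."""
    try:
        root = json.loads(config_text)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg} (line {exc.lineno})"]

    section = root.get("Mbproxy", {})  # assumed section name
    plcs = section.get("Plcs", [])
    ports = Counter(plc.get("ListenPort") for plc in plcs)

    problems = [
        f"ListenPort {port} is used by {count} Plcs[] entries"
        for port, count in ports.items()
        if port is not None and count > 1
    ]
    admin = section.get("AdminPort")
    if admin is not None and admin in ports:
        problems.append(f"AdminPort {admin} collides with a ListenPort")
    return problems


# Hypothetical config with two PLC entries sharing one port.
sample = json.dumps({"Mbproxy": {"AdminPort": 8080, "Plcs": [
    {"Name": "a", "ListenPort": 5020},
    {"Name": "b", "ListenPort": 5020},
]}})
print(find_port_conflicts(sample))
```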
Logs
Location
Rolling log files live at: C:\ProgramData\mbproxy\logs\mbproxy-<date>.log
One file per day, retained for 30 days by default (controlled by retainedFileCountLimit in the Serilog config section).
Windows Event Log
When running as a Windows Service, the EventLogBridge sink writes events at Error level and above to the Windows Application Event Log under source mbproxy. View with:
Get-EventLog -LogName Application -Source mbproxy -Newest 20
Or open Event Viewer → Windows Logs → Application, filter by source mbproxy.
Log survival after uninstall
uninstall.ps1 never deletes log files. It moves logs\ to a timestamped archive at %ProgramData%\mbproxy.archived-<timestamp>\logs\ so post-crash diagnostics remain accessible.
Status page
URL: http://<proxy-host>:<AdminPort>/
Default port: 8080. Change with Mbproxy.AdminPort in appsettings.json.
Routes:
- GET /: HTML table, auto-refreshes every 5 s. No external assets.
- GET /status.json: same data as JSON for monitoring scrapers.
Key fields on /status.json:
| Field | Meaning |
|---|---|
| service.version | Assembly informational version (set at publish time). |
| service.uptimeSeconds | Seconds since service start. |
| service.config.lastReloadUtc | Last accepted hot-reload timestamp. |
| listeners.bound / listeners.configured | Bound count vs. configured PLC count. |
| plcs[].listener.state | bound / recovering / stopped. |
| plcs[].backend.connectsSuccess | Successful backend TCP connects since start. |
| plcs[].backend.connectsFailed | Failed backend connects (all retries exhausted). |
| plcs[].pdus.forwarded | Total PDUs forwarded through this PLC's proxy. |
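A monitoring scraper might reduce these fields to a simple health verdict. The sketch below is one possible approach in Python, operating on an already-parsed /status.json document. The per-PLC name key and the sample values are assumptions made for illustration; only the listeners.* and plcs[].listener.state fields come from the table above.

```python
def listeners_healthy(status: dict) -> bool:
    """True when every configured listener is currently bound."""
    listeners = status.get("listeners", {})
    return "bound" in listeners and listeners["bound"] == listeners.get("configured")


def unbound_plcs(status: dict) -> list[str]:
    """Describe each PLC entry whose listener is not in the 'bound' state."""
    result = []
    for index, plc in enumerate(status.get("plcs", [])):
        state = plc.get("listener", {}).get("state")
        if state != "bound":
            # "name" is an assumed key; fall back to the list index if absent.
            result.append(f"{plc.get('name', index)}: {state}")
    return result


# Hypothetical status document with one listener still recovering.
sample_status = {
    "listeners": {"bound": 1, "configured": 2},
    "plcs": [
        {"name": "press-01", "listener": {"state": "bound"}},
        {"name": "press-02", "listener": {"state": "recovering"}},
    ],
}
print(listeners_healthy(sample_status), unbound_plcs(sample_status))
```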
Common failure modes
mbproxy.startup.bind.failed — port in use
Symptom: The service starts but one or more PLCs show listener.state = recovering.
Cause: Another process is bound to the configured ListenPort.
Remediation:
netstat -ano | findstr :<port> # find PID holding the port
Get-Process -Id <pid> # identify the process
Release the port or change Plcs[].ListenPort in appsettings.json. The supervisor will retry automatically — watch for mbproxy.listener.recovered in the log.
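Before pointing a PLC entry at a replacement port, it can be useful to confirm that the candidate port is actually free. A small Python sketch of that check follows; it simply attempts a bind and releases the socket immediately. The function name and the default bind address are assumptions, and the result is only a snapshot, since another process can still take the port afterwards.

```python
import socket


def port_is_free(port: int, host: str = "0.0.0.0") -> bool:
    """Try to bind the port; success means no other process holds it right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            return False
    return True


# Example: check the default admin port on this host.
print(port_is_free(8080))
```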
mbproxy.listener.recovered — no action needed
A previously-failing listener successfully bound. The service is self-healing. This is informational.
mbproxy.backend.failed — PLC unreachable
Symptom: Upstream clients cannot connect through the proxy, or connections are immediately dropped.
Cause: The PLC backend (Plcs[].Host:Port) is unreachable — network issue, PLC power cycle, or H2-ECOM100 firmware issue.
Remediation: Check network path to the PLC. Verify the PLC Modbus port is responding:
Test-NetConnection -ComputerName <plc-ip> -Port 502
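Note: the H2-ECOM100 module caps connections at 4 simultaneous TCP clients. Under the 1:1-backend model (one backend connection per upstream client), a fifth upstream client on one PLC port triggers mbproxy.backend.failed. The Phase 9 multiplexer uses a single backend socket per PLC, so upstream clients that go through the proxy no longer count individually against that cap.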
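Test-NetConnection only shows that the TCP port accepts connections. To see whether the PLC also answers Modbus requests, a single Read Holding Registers (function 0x03) request can be sent. The Python sketch below builds such a frame from the standard MBAP header layout; the unit id 1, register address 0, and the helper names are assumptions for illustration. Keep in mind that a probe briefly occupies one of the module's connection slots.

```python
import socket
import struct


def build_read_holding_request(tx_id: int, unit_id: int, address: int, quantity: int) -> bytes:
    """MBAP header (TxId, protocol 0, length, unit id) followed by a function-0x03 PDU."""
    pdu = struct.pack(">BHH", 0x03, address, quantity)
    # The MBAP length field counts the unit id byte plus the PDU.
    mbap = struct.pack(">HHHB", tx_id, 0, len(pdu) + 1, unit_id)
    return mbap + pdu


def probe(host: str, port: int = 502, timeout: float = 3.0) -> bytes:
    """Send one read of register 0 and return the raw reply (may be an exception response)."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(build_read_holding_request(1, 1, 0, 1))
        return conn.recv(260)  # 260 bytes is the maximum Modbus TCP frame size


frame = build_read_holding_request(tx_id=1, unit_id=1, address=0, quantity=1)
print(frame.hex())  # 000100000006010300000001
```

Any reply, including a Modbus exception response, shows that the backend is processing requests; a timeout points back at the network path or the module.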
mbproxy.config.reload.rejected — bad config
Symptom: The log shows a rejection event after a file save; the current config is unchanged.
Cause: The saved appsettings.json has a schema error, duplicate port, or conflicting BCD address.
Remediation: Check the log for the joined error list immediately following the rejection event. Fix the JSON and save again.
mbproxy.admin.bind.failed — admin port in use
Symptom: The status page is unreachable.
Cause: Another process is using AdminPort.
Remediation: The proxy continues to forward Modbus traffic — only the status page is affected. Change AdminPort in appsettings.json (hot-reload applies).
mbproxy.rewrite.partial_bcd — client reading half a 32-bit BCD pair
Symptom: Warning in the log; the value passes through raw (no rewrite).
Cause: The upstream client is reading only one register of a configured 32-bit BCD pair (e.g., quantity = 1 at the low address, or any read at the high address alone). This is almost always a client-side tag-definition bug.
Remediation: Verify the client's tag definition specifies quantity = 2 for 32-bit BCD addresses.
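The condition behind this warning can be expressed as a range check. The Python sketch below is an illustration only, not the proxy's implementation: it assumes each 32-bit BCD pair occupies two consecutive registers (low and low + 1) and reports pairs that a read touches only in part. The function name and the example addresses are hypothetical.

```python
def partially_covered_pairs(start: int, quantity: int, pair_low_addresses: set[int]) -> list[int]:
    """Low addresses of 32-bit pairs that the read [start, start + quantity) covers only in part."""
    end = start + quantity  # exclusive upper bound of the read
    partial = []
    for low in sorted(pair_low_addresses):
        has_low = start <= low < end
        has_high = start <= low + 1 < end
        if has_low != has_high:  # exactly one half of the pair is inside the read
            partial.append(low)
    return partial


pairs = {100}  # hypothetical 32-bit BCD tag at registers 100 and 101
print(partially_covered_pairs(100, 1, pairs))  # [100]: low register only
print(partially_covered_pairs(100, 2, pairs))  # []: both registers read together
```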
mbproxy.rewrite.invalid_bcd — non-BCD value from PLC
Symptom: Warning in the log; the value passes through raw.
Cause: The PLC returned a register value that contains non-BCD nibbles (e.g., 0xA123 — the nibble A is invalid BCD). This usually indicates the ladder program wrote a non-BCD value to a register configured as a BCD tag.
Remediation: Investigate the PLC ladder program. The proxy cannot decode non-BCD data — passing it through is safer than guessing.
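To see why a value such as 0xA123 cannot be decoded, consider how a 16-bit BCD register maps to a decimal number: each 4-bit nibble must hold a digit from 0 to 9. The Python sketch below illustrates that rule under the usual packed-BCD convention. It raises an error on a bad nibble purely for demonstration, whereas the proxy passes the raw value through; the function name is hypothetical.

```python
def decode_bcd16(register: int) -> int:
    """Decode a 16-bit packed-BCD register; raise ValueError when any nibble exceeds 9."""
    if not 0 <= register <= 0xFFFF:
        raise ValueError(f"{register} does not fit in a 16-bit register")
    value = 0
    for shift in (12, 8, 4, 0):  # most significant nibble first
        nibble = (register >> shift) & 0xF
        if nibble > 9:
            raise ValueError(f"nibble {nibble:X} in 0x{register:04X} is not valid BCD")
        value = value * 10 + nibble
    return value


print(decode_bcd16(0x1234))  # 1234
try:
    decode_bcd16(0xA123)
except ValueError as exc:
    print(exc)  # nibble A in 0xA123 is not valid BCD
```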
First-install smoke checklist
Run these commands after install.ps1 -Start to verify the deployment:
# 1. Service is running
Get-Service mbproxy | Select-Object Status, DisplayName
# 2. Status page is reachable
Invoke-WebRequest http://localhost:8080/ -UseBasicParsing | Select-Object StatusCode
# 3. JSON endpoint returns expected fields
$status = Invoke-RestMethod http://localhost:8080/status.json
$status.service | Select-Object version, uptimeSeconds
$status.listeners
# 4. Log file exists and is recent
Get-Item "C:\ProgramData\mbproxy\logs\mbproxy-*.log" | Sort-Object LastWriteTime -Descending | Select-Object -First 1
# 5. No Error events in the Event Log (a "No matches found" error from Get-EventLog is the expected result)
Get-EventLog -LogName Application -Source mbproxy -EntryType Error -Newest 5
# 6. Stop the service cleanly (graceful shutdown within 10 s)
$sw = [System.Diagnostics.Stopwatch]::StartNew()
sc.exe stop mbproxy
$deadline = [DateTime]::UtcNow.AddSeconds(15)
do { Start-Sleep 1 } until ((Get-Service mbproxy).Status -eq 'Stopped' -or [DateTime]::UtcNow -gt $deadline)
$sw.Stop()
Write-Host "Stop elapsed: $($sw.ElapsedMilliseconds) ms"
(Get-Service mbproxy).Status # Should be Stopped
Note: This checklist documents the expected steps. It was not executed on a dedicated clean VM (the proxy was developed and unit/E2E tested in-process). Run this checklist on first deployment to a production host.