# mbproxy operations runbook

Day-two operations reference for the mbproxy Windows Service: install, upgrade, configuration, logs, and troubleshooting.

## Install

### Prerequisites

- Windows 10 / Server 2019 or later (64-bit).
- PowerShell 5.1+ run as Administrator (the install script uses `#Requires -RunAsAdministrator`).
- The compiled publish output from `dotnet publish` (see [README.md](../README.md) for the exact command).
- Modbus TCP reachable from the proxy host to the PLCs on port 502.
- Port 8080 (or whatever `AdminPort` is set to) available for the status page.

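The network and shell prerequisites can be checked from the target host before installing. The following is a minimal sketch, not part of the shipped scripts; the PLC address `192.0.2.10` is a placeholder, and port 8080 is assumed to be the unchanged `AdminPort` default.

```powershell
# Hypothetical PLC address; substitute one from your own PLC list.
$plcHost = '192.0.2.10'

# Modbus TCP reachability from the proxy host (port 502).
Test-NetConnection -ComputerName $plcHost -Port 502 |
    Select-Object ComputerName, RemotePort, TcpTestSucceeded

# Is anything already listening on the default admin port?
# No output means the port is free.
Get-NetTCPConnection -LocalPort 8080 -State Listen -ErrorAction SilentlyContinue |
    Select-Object LocalAddress, LocalPort, OwningProcess

# PowerShell version (the install script needs 5.1 or later).
$PSVersionTable.PSVersion
```
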
### Steps

1. Publish the binaries on the build machine:

```powershell
dotnet publish src/Mbproxy/Mbproxy.csproj -c Release -r win-x64 --self-contained true -o C:\build\mbproxy-publish
```

2. Copy the publish output to the target server (or run the install script locally if you built on the server).

3. Open an elevated PowerShell prompt and run the install script:

```powershell
.\install\install.ps1 -PublishOutput C:\build\mbproxy-publish -Start
```

The script:
- Copies binaries to `C:\Program Files\Mbproxy\` (configurable via `-InstallPath`).
- Registers the service with `sc.exe create`.
- Sets failure-recovery: restart after 60 s on first/second failure, no action on third.
- Creates `%ProgramData%\mbproxy\logs\` and sets ACLs if needed.
- Copies `mbproxy.config.template.json` → `%ProgramData%\mbproxy\appsettings.json` **only if no config exists**.
- Registers the Windows Event Log source `mbproxy`.
- With `-Start`, starts the service and waits up to 30 s for `RUNNING` state.

4. Edit `%ProgramData%\mbproxy\appsettings.json` to configure your PLC list and BCD tags. See the template for inline comments on every field.

5. If you ran the install script without `-Start` so that you could edit the config first, start the service now:

```powershell
sc.exe start mbproxy
```

6. Verify the deployment with the [first-install smoke checklist](#first-install-smoke-checklist) below.

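Once the install script has finished, what it registered can be inspected with standard Windows tooling. This is a sketch for manual verification only; it assumes the default service name `mbproxy` and the default `%ProgramData%` paths listed above, and should be run from an elevated prompt.

```powershell
# Service registration: binary path, start type, service account.
sc.exe qc mbproxy

# Failure-recovery actions (expected: restart after 60 s on the
# first two failures, no action on the third).
sc.exe qfailure mbproxy

# Config file and log directory created by the script.
Test-Path "$env:ProgramData\mbproxy\appsettings.json"
Test-Path "$env:ProgramData\mbproxy\logs"

# Event Log source registration (returns True when registered).
[System.Diagnostics.EventLog]::SourceExists('mbproxy')
```
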
### Re-running install on an existing installation

The install script is idempotent. Re-running it:
- Stops the service if running.
- Overwrites the binaries.
- Updates the service config via `sc.exe config` (not `sc.exe create`).
- Preserves `%ProgramData%\mbproxy\appsettings.json` (never overwritten on update).
- Skips Event Log source creation if already registered.

## Upgrade procedure

1. Publish new binaries on the build machine (same command as install step 1).

2. Stop the service:

```powershell
sc.exe stop mbproxy
```

Wait for the service to reach `STOPPED` state. Graceful shutdown drains in-flight PDUs (up to `Connection.GracefulShutdownTimeoutMs`, default 10 s).

3. Copy new binaries to `C:\Program Files\Mbproxy\` (or run `install.ps1 -PublishOutput ...` to automate steps 2–4):

```powershell
Copy-Item -Path C:\build\mbproxy-publish\* -Destination 'C:\Program Files\Mbproxy\' -Force
```

4. Start the service:

```powershell
sc.exe start mbproxy
```

5. Check the status page to confirm the new version:

```powershell
Invoke-RestMethod http://localhost:8080/status.json | Select-Object -ExpandProperty service
```

The `version` field should show the new build.

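For repeated manual upgrades, steps 2 through 5 can be chained into one elevated script. The sketch below is an illustration rather than a replacement for `install.ps1`; it assumes the default install path, the default admin port 8080, and timeouts of 15 s and 30 s chosen here to sit above the documented shutdown and start-up limits.

```powershell
#Requires -RunAsAdministrator
$ErrorActionPreference = 'Stop'

$publish = 'C:\build\mbproxy-publish'      # publish output from step 1
$install = 'C:\Program Files\Mbproxy\'     # default install path

# Step 2: stop and wait. WaitForStatus throws a TimeoutException
# if the state is not reached within the given time span.
sc.exe stop mbproxy
(Get-Service mbproxy).WaitForStatus('Stopped', [TimeSpan]::FromSeconds(15))

# Step 3: overwrite the binaries (same command as in the manual step).
Copy-Item -Path "$publish\*" -Destination $install -Force

# Step 4: start and wait for the service to report Running.
sc.exe start mbproxy
(Get-Service mbproxy).WaitForStatus('Running', [TimeSpan]::FromSeconds(30))

# Step 5: print the version reported by the status page.
(Invoke-RestMethod http://localhost:8080/status.json).service.version
```

If the last line fails immediately after start-up, the status page may not be listening yet; retrying after a few seconds is a reasonable first step.
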
## Uninstall

```powershell
.\install\uninstall.ps1
```

Options:
- `-KeepConfig`: preserves `%ProgramData%\mbproxy\appsettings.json` for re-install.

Log files are **always archived** to `%ProgramData%\mbproxy.archived-<timestamp>\logs\` regardless of `-KeepConfig`. They are never deleted.

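As an illustration, an uninstall that keeps the config for a later re-install, followed by a lookup of the archive it produced, might look like this. The directory filter is derived from the archive path pattern above; the exact `<timestamp>` format is not assumed.

```powershell
# Uninstall but keep appsettings.json for a later re-install.
.\install\uninstall.ps1 -KeepConfig

# Locate the most recent log archive left behind by the uninstall.
Get-ChildItem -Path $env:ProgramData -Directory -Filter 'mbproxy.archived-*' |
    Sort-Object LastWriteTime -Descending |
    Select-Object -First 1 -ExpandProperty FullName
```
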
## Configuration

The service reads `%ProgramData%\mbproxy\appsettings.json` at startup and watches it for changes while running. Most settings are hot-reloadable; a save triggers a re-bind of `IOptionsMonitor<MbproxyOptions>` and a per-change-kind reconcile.

- Full schema (every `Mbproxy:*` key, defaults, validation rules, examples): [`Operations/Configuration.md`](Operations/Configuration.md).
- Per-change-kind reconcile semantics (what propagates instantly vs. what requires a restart): [`Features/HotReload.md`](Features/HotReload.md).

If a reload is rejected (`mbproxy.config.reload.rejected` in the log), the service continues running with the previous config. Fix the JSON and save again; the next valid file write is accepted.

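After saving a config change, one quick way to see whether the reload was rejected is to search the rolling logs for the event name. This sketch assumes the event name appears literally in the log lines and that the logs are in the default location.

```powershell
# Show the five most recent rejected-reload entries, if any.
# -SimpleMatch treats the dots in the event name as literal characters.
Select-String -Path "$env:ProgramData\mbproxy\logs\mbproxy-*.log" `
    -Pattern 'mbproxy.config.reload.rejected' -SimpleMatch |
    Select-Object -Last 5 |
    ForEach-Object { $_.Line }
```

No output means no rejection has been logged in the retained files.
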
## Logs

### Location

Rolling log files live at: `C:\ProgramData\mbproxy\logs\mbproxy-<date>.log`

One file per day, retained for 30 days by default (controlled by `retainedFileCountLimit` in the Serilog config section).

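To follow the current day's file while reproducing a problem, the newest log can be tailed from PowerShell. This is a small convenience sketch that assumes the default log directory.

```powershell
# Pick the most recently written log file.
$latest = Get-ChildItem "$env:ProgramData\mbproxy\logs\mbproxy-*.log" |
    Sort-Object LastWriteTime -Descending |
    Select-Object -First 1

# Print the last 50 lines, then keep streaming new ones (Ctrl+C to stop).
Get-Content -Path $latest.FullName -Tail 50 -Wait
```
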
### Windows Event Log

When running as a Windows Service, the `EventLogBridge` sink writes events at Error level and above to the Windows Application Event Log under source `mbproxy`. View with:

```powershell
Get-EventLog -LogName Application -Source mbproxy -Newest 20
```

Or open Event Viewer → Windows Logs → Application, filter by source `mbproxy`.

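`Get-EventLog` exists only in Windows PowerShell 5.1; it is not available in PowerShell 7. On hosts where the newer shell is used, `Get-WinEvent` can run an equivalent query. The sketch below assumes the source `mbproxy` is exposed as the provider name, which is how classic event sources normally appear.

```powershell
# Level 1 = Critical, 2 = Error. -ErrorAction suppresses the
# "no events were found" error when nothing matches.
Get-WinEvent -FilterHashtable @{
        LogName      = 'Application'
        ProviderName = 'mbproxy'
        Level        = 1, 2
    } -MaxEvents 20 -ErrorAction SilentlyContinue |
    Select-Object TimeCreated, LevelDisplayName, Message
```
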
### Log survival after uninstall

`uninstall.ps1` **never deletes log files**. It moves `logs\` to a timestamped archive at `%ProgramData%\mbproxy.archived-<timestamp>\logs\` so post-crash diagnostics remain accessible.

## Status page

**URL:** `http://<proxy-host>:<AdminPort>/` (default port 8080; change via `Mbproxy.AdminPort` in `appsettings.json`).

Routes: `GET /` (auto-refreshing HTML, no external assets) and `GET /status.json` (same data as JSON for monitoring scrapers).

The full endpoint shape, every JSON field, counter semantics, and scraping examples live in [`Operations/StatusPage.md`](Operations/StatusPage.md). KPI catalog and dashboard guidance: [`kpi.md`](kpi.md).

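For a quick liveness probe from a shell, without the full scraping setup described in `Operations/StatusPage.md`, something like the following may be enough. It is a sketch that uses only the `service.version` and `service.uptimeSeconds` fields referenced elsewhere in this runbook; the 5-second timeout and the `$base` variable are choices made for this example.

```powershell
# Adjust host and port if AdminPort was changed.
$base = 'http://localhost:8080'

try {
    $status = Invoke-RestMethod -Uri "$base/status.json" -TimeoutSec 5
    'mbproxy {0}, up {1} s' -f $status.service.version, $status.service.uptimeSeconds
} catch {
    # Connection refused, timeout, or a non-success HTTP status.
    Write-Warning "Status page not reachable: $($_.Exception.Message)"
}
```
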
## Common failure modes

The full diagnosis playbook (startup bind conflicts, backend connectivity, hot-reload validation errors, BCD rewrite anomalies, performance and queue-depth issues, response-cache anomalies, and graceful-shutdown problems) is keyed to log events and status counters in [`Operations/Troubleshooting.md`](Operations/Troubleshooting.md). The complete `mbproxy.*` event catalog with levels, properties, and operator implications is in [`Reference/LogEvents.md`](Reference/LogEvents.md).

## First-install smoke checklist

Run these commands after `install.ps1 -Start` to verify the deployment:

```powershell
# 1. Service is running
Get-Service mbproxy | Select-Object Status, DisplayName

# 2. Status page is reachable
Invoke-WebRequest http://localhost:8080/ -UseBasicParsing | Select-Object StatusCode

# 3. JSON endpoint returns expected fields
$status = Invoke-RestMethod http://localhost:8080/status.json
$status.service | Select-Object version, uptimeSeconds
$status.listeners

# 4. Log file exists and is recent
Get-Item "C:\ProgramData\mbproxy\logs\mbproxy-*.log" | Sort-Object LastWriteTime -Descending | Select-Object -First 1

# 5. No Error events in the Event Log
#    (a "No matches found" error from Get-EventLog is the passing result)
Get-EventLog -LogName Application -Source mbproxy -EntryType Error -Newest 5

# 6. Stop the service cleanly (graceful shutdown within 10 s)
$sw = [System.Diagnostics.Stopwatch]::StartNew()
sc.exe stop mbproxy
$deadline = [DateTime]::UtcNow.AddSeconds(15)
do { Start-Sleep 1 } until ((Get-Service mbproxy).Status -eq 'Stopped' -or [DateTime]::UtcNow -gt $deadline)
$sw.Stop()
Write-Host "Stop elapsed: $($sw.ElapsedMilliseconds) ms"
(Get-Service mbproxy).Status # Should be Stopped

# Step 6 leaves the service stopped. Restart it if this host should stay in operation:
# sc.exe start mbproxy
```

**Note:** This checklist documents the expected steps. It was not executed on a dedicated clean VM (the proxy was developed and unit/E2E tested in-process). Run this checklist on first deployment to a production host.