docs: alarms-over-gateway plan — add Track D deployment refresh
After A/B/C all merge, the running services on C:\publish need to be refreshed before the Galaxy alarm-event family flows end-to-end. Add PR D.1: a Refresh-Services.ps1 script + runbook for stopping in reverse-dependency order, restaging binaries from the build outputs, restarting in forward-dependency order, and capturing a smoke-run artifact. D.1 gates B.5 (docs sweep) — the documentation records the as-deployed shape, so the deployment has to be live first.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@@ -442,7 +442,8 @@ storage) plug into the same path.

### PR B.5 — docs + memory housekeeping

**Depends on:** B.1 / B.2 / B.3 / B.4 all green on the parity rig + D.1
(deployment refresh) verified on the dev rig.

**Files:**

@@ -533,6 +534,143 @@ completes that slot. Two PRs in the sidecar + one consumer-side PR
C.2's lmxopcua-side consumer is **PR B.4 in Track B**, which depends
on C.2 being deployed.

## Track D — deployment refresh

The dev box at `DESKTOP-6JL3KKO` runs three live services from
`C:\publish\` (installed in the session that produced commit
`ea04547`'s install scripts). Once Tracks A / B / C are merged, the
deployed binaries need to be refreshed so the running services pick
up the new alarm path. Track D is one PR — pure ops, no code change.

### PR D.1 — refresh C:\publish + restart services

**Depends on:** A.4 + B.4 + C.2 merged (every code-change PR landed).

**Order matters** — services must stop in reverse-dependency order
(`OtOpcUa` → `OtOpcUaWonderwareHistorian` → `MxAccessGw`) and start in
forward-dependency order (`MxAccessGw` → `OtOpcUaWonderwareHistorian`
→ `OtOpcUa`). Touching binaries while a dependent service holds them
locked produces the publish-time `MSB3027` file-lock error caught
during the original install (see commit `80104ca`).

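The stop order is the start order reversed, so a script only needs to store one of them. A minimal sketch, assuming the three NSSM service names quoted above; the variable names are illustrative and are not taken from `Refresh-Services.ps1`:

```powershell
# Forward-dependency (start) order, as listed in the plan.
$startOrder = @('MxAccessGw', 'OtOpcUaWonderwareHistorian', 'OtOpcUa')

# Reverse-dependency (stop) order: index the same array from last to first.
$stopOrder = $startOrder[($startOrder.Count - 1)..0]

$stopOrder -join ' -> '
# OtOpcUa -> OtOpcUaWonderwareHistorian -> MxAccessGw
```

Deriving one list from the other keeps the two sequences from drifting apart if a fourth service is ever added.
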
**Steps (run as a single PowerShell session on the deploy host):**

1. **Stop in reverse order**:
```powershell
nssm stop OtOpcUa
nssm stop OtOpcUaWonderwareHistorian
nssm stop MxAccessGw
Start-Sleep -Seconds 3
Get-Process MxGateway.Server, MxGateway.Worker, OtOpcUa.Server, `
    OtOpcUa.Driver.Historian.Wonderware -ErrorAction SilentlyContinue |
    Stop-Process -Force
```

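The fixed `Start-Sleep -Seconds 3` is an estimate of how long shutdown takes. One possible alternative, sketched here and not part of the plan, is to block until the Service Control Manager reports `Stopped`, using the `WaitForStatus` method on the object that `Get-Service` returns. The function name and the 30-second default are assumptions.

```powershell
function Wait-ServiceStopped {
    param(
        [Parameter(Mandatory)] [string] $Name,
        [int] $TimeoutSeconds = 30   # assumed budget; tune for the host
    )
    $svc = Get-Service -Name $Name -ErrorAction Stop
    # WaitForStatus throws a TimeoutException if the state is not reached in time.
    $svc.WaitForStatus('Stopped', [TimeSpan]::FromSeconds($TimeoutSeconds))
}

# Usage, after the matching `nssm stop` call:
# Wait-ServiceStopped -Name 'OtOpcUa'
```

The `Get-Process ... | Stop-Process -Force` sweep from step 1 would still be worth keeping afterwards, since a stopped service does not guarantee that every child process has exited.
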
2. **Refresh mxaccessgw binaries** (Track A output):
```powershell
$gwSrc = "C:\Users\dohertj2\Desktop\mxaccessgw"
dotnet build "$gwSrc\src\MxGateway.Worker" -c Release
dotnet build "$gwSrc\src\MxGateway.Server" -c Release

Copy-Item -Recurse -Force `
    "$gwSrc\src\MxGateway.Server\bin\Release\net10.0\*" `
    "C:\publish\mxaccessgw\Server\"
Copy-Item -Recurse -Force `
    "$gwSrc\src\MxGateway.Worker\bin\x86\Release\net48\*" `
    "C:\publish\mxaccessgw\Worker\"
```

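`dotnet build` reports failure through its exit code, which PowerShell does not treat as a terminating error by default, so a failed build could be followed by a copy of whatever output was already in `bin\`. A hedged sketch of a guard; `Invoke-Checked` is a hypothetical helper name, not something in either repo:

```powershell
function Invoke-Checked {
    param(
        [Parameter(Mandatory)] [string] $Command,
        [string[]] $Arguments = @()
    )
    # Run the native command; $LASTEXITCODE holds its exit code afterwards.
    & $Command @Arguments
    if ($LASTEXITCODE -ne 0) {
        throw "$Command exited with code $LASTEXITCODE"
    }
}

# Usage (paths as in step 2):
# Invoke-Checked -Command dotnet -Arguments 'build', "$gwSrc\src\MxGateway.Worker", '-c', 'Release'
```

Throwing here stops the session before any `Copy-Item` runs, which keeps the previously deployed binaries in place.
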
3. **Refresh OtOpcUa + historian sidecar binaries** (Tracks B + C
output):
```powershell
$repo = "C:\Users\dohertj2\Desktop\lmxopcua"
dotnet publish "$repo\src\ZB.MOM.WW.OtOpcUa.Server" `
    -c Release -o "C:\publish\lmxopcua"
dotnet publish "$repo\src\ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware" `
    -c Release -o "C:\publish\lmxopcua\WonderwareHistorian"
```

4. **Update service env block if Track C added the new toggle**:
```powershell
# Pull existing env, append OTOPCUA_HISTORIAN_ALARM_WRITE_ENABLED=true
# (default-on per C.2 design, but explicit assignment lets us flip false
# for read-only deployments without re-installing)
nssm set OtOpcUaWonderwareHistorian AppEnvironmentExtra `
    (((nssm get OtOpcUaWonderwareHistorian AppEnvironmentExtra) `
    + "`r`nOTOPCUA_HISTORIAN_ALARM_WRITE_ENABLED=true"))
```

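Because the step reads the existing block and appends to it, running it a second time would add the same assignment again. A minimal sketch of an idempotent merge, written as a pure function over `KEY=VALUE` lines so it can be checked without touching NSSM. `Set-EnvLine` is a hypothetical name, `FOO=1` is made-up sample data, and how the result is handed back to `nssm set` is deliberately left open.

```powershell
function Set-EnvLine {
    param(
        [string[]] $Lines = @(),              # existing KEY=VALUE entries
        [Parameter(Mandatory)] [string] $Key,
        [Parameter(Mandatory)] [string] $Value
    )
    # Drop blanks and any previous assignment of the same key, then append the new one.
    $kept = @($Lines | Where-Object { $_ -and ($_ -notlike "$Key=*") })
    return $kept + "$Key=$Value"
}

$existing = @('FOO=1', 'OTOPCUA_HISTORIAN_ALARM_WRITE_ENABLED=false')
Set-EnvLine -Lines $existing -Key 'OTOPCUA_HISTORIAN_ALARM_WRITE_ENABLED' -Value 'true'
# FOO=1
# OTOPCUA_HISTORIAN_ALARM_WRITE_ENABLED=true
```

Replacing by key also covers the read-only case mentioned in the comment: calling the function with `-Value 'false'` flips the toggle without leaving the old line behind.
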
5. **Start in forward order**:
```powershell
nssm start MxAccessGw
Start-Sleep -Seconds 4
nssm start OtOpcUaWonderwareHistorian
Start-Sleep -Seconds 4
nssm start OtOpcUa
Start-Sleep -Seconds 8
```

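The sleeps between starts assume each service is ready within a few seconds. A possible refinement, not part of the plan, is to poll for a listening port before starting the next dependent, reusing the `Get-NetTCPConnection` probe from the smoke step. `Wait-PortListening` and its defaults are assumptions, and since the plan does not state which port belongs to which service, the port is left as a parameter.

```powershell
function Wait-PortListening {
    param(
        [Parameter(Mandatory)] [int] $Port,
        [int] $TimeoutSeconds = 30   # assumed budget
    )
    $deadline = (Get-Date).AddSeconds($TimeoutSeconds)
    do {
        # Any result means some process is listening on the port.
        $hit = Get-NetTCPConnection -LocalPort $Port -State Listen -ErrorAction SilentlyContinue
        if ($hit) { return $true }
        Start-Sleep -Milliseconds 500
    } while ((Get-Date) -lt $deadline)
    return $false
}
```

The function returns `$false` on timeout instead of throwing, so the caller decides whether a slow start should abort the refresh. This would not help for a service that exposes no TCP port, such as one that only serves a named pipe.
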
6. **Smoke verification:**
```powershell
foreach ($s in 'MxAccessGw','OtOpcUaWonderwareHistorian','OtOpcUa') {
    (Get-Service $s).Status
}
foreach ($p in 5120, 4840, 4841) {
    Get-NetTCPConnection -LocalPort $p -State Listen `
        -ErrorAction SilentlyContinue
}
Get-Content "C:\publish\lmxopcua\logs\otopcua-*.log" -Tail 20
Get-Content "C:\publish\mxaccessgw\stdout.log" -Tail 20
Get-Content "C:\ProgramData\OtOpcUa\historian-wonderware-*.log" -Tail 10
```

Pass criterion: all three services `Running`; ports 5120 + 4840
listening; sidecar log shows
`Wonderware historian sidecar serving — pipe=OtOpcUaWonderwareHistorian`;
OtOpcUa log shows
`OPC UA server started — endpoint=opc.tcp://0.0.0.0:4840/OtOpcUa`
and a new line `IAlarmHistorianWriter resolved: Sidecar` (added
in B.4).

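The log part of the pass criterion can be turned into a check instead of being read by eye. A minimal sketch, assuming the log tails have already been read into string arrays; `Test-LogMarkers` is a hypothetical name, and the markers are the fixed tails of the log lines quoted above.

```powershell
function Test-LogMarkers {
    param(
        [string[]] $Lines = @(),                     # log tail lines, e.g. from Get-Content -Tail
        [Parameter(Mandatory)] [string[]] $Markers   # literal substrings that must appear
    )
    # Emit the markers that were NOT found; no output means the check passed.
    foreach ($m in $Markers) {
        $found = $Lines | Where-Object { $_.Contains($m) } | Select-Object -First 1
        if (-not $found) { $m }
    }
}

$otopcuaMarkers = @(
    'endpoint=opc.tcp://0.0.0.0:4840/OtOpcUa',
    'IAlarmHistorianWriter resolved: Sidecar'
)
$sample = @('... endpoint=opc.tcp://0.0.0.0:4840/OtOpcUa')   # made-up log line
Test-LogMarkers -Lines $sample -Markers $otopcuaMarkers
# IAlarmHistorianWriter resolved: Sidecar
```

With the sample line only the second marker is reported as missing, which is what a pre-B.4 build would be expected to show. The match is a case-sensitive literal comparison, so a change in log wording would fail the check.
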
7. **Functional verification — fire one alarm of each kind and assert
it propagates:**
- **Galaxy-native** — raise the `OtOpcUaParityTest_001.Counter`
`$Alarm*` extension via Galaxy's alarm-fire mechanism; assert that an
OPC UA Part 9 transition reaches a connected `otopcua-cli alarms`
subscriber with a rich payload (operator-comment field non-null,
original-raise-timestamp present). This validates Track A + B.1
+ B.2 + B.3.
- **Scripted** — author a one-line scripted alarm in the Admin UI
against any always-true predicate; assert that the transition lands in
AVEVA Historian via an `aaHistClientTrend` query (or
`Driver.Historian.Wonderware.IntegrationTests` with a query for
the alarm event). This validates Track C + B.4.
- **Sub-attribute fallback** — disable `IAlarmSource` on the
GalaxyDriver via the test seam (B.3 will introduce one); fire an
alarm; assert that a Part 9 transition is still raised by the
value-driven path. This validates that the fallback wasn't broken.

**Files:**

- `scripts\install\Refresh-Services.ps1` *(new — automates the above)*
- `docs\v2\dev-environment.md` — add the refresh script to the dev
workflow section.

**Tests:** smoke run on the dev rig (`DESKTOP-6JL3KKO`) producing
`docs\plans\artifacts\d1-rollout-YYYY-MM-DD.md` with the captured log
tails + smoke-test assertions. Captured artifact lands as part of the
PR.

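The `YYYY-MM-DD` placeholder in the artifact name can be filled from the run date. A small sketch under that assumption; `Get-RolloutArtifactPath` is a hypothetical helper, and the date in the example call is arbitrary.

```powershell
function Get-RolloutArtifactPath {
    param(
        [datetime] $Date = (Get-Date),
        [string] $Root = 'docs\plans\artifacts'   # relative to the repo root
    )
    # Fill the YYYY-MM-DD placeholder from the naming pattern above.
    Join-Path $Root ('d1-rollout-{0:yyyy-MM-dd}.md' -f $Date)
}

# Example call with an arbitrary date:
Get-RolloutArtifactPath -Date ([datetime]'2025-01-31')
```

The resulting path could then be the target of `Set-Content` or `Out-File` for the captured log tails.
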
**Rollback:** the refresh script keeps a timestamped backup of the
existing `C:\publish\mxaccessgw\` and `C:\publish\lmxopcua\` trees
before overwriting (mirrored to `C:\publish\.backup-YYYY-MM-DD\`).
Rollback is a stop / restore-from-backup / start sequence; no service
re-install needed since the NSSM service definitions don't change.

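The backup half of that sequence might look like the following. This is a sketch under stated assumptions, not the contents of `Refresh-Services.ps1`: the function name and parameters are hypothetical, and only the two tree names and the `.backup-YYYY-MM-DD` pattern come from the text above.

```powershell
function Backup-PublishTree {
    param(
        [Parameter(Mandatory)] [string] $PublishRoot,   # e.g. C:\publish
        [string[]] $Trees = @('mxaccessgw', 'lmxopcua'),
        [datetime] $Date = (Get-Date)
    )
    $backupRoot = Join-Path $PublishRoot ('.backup-{0:yyyy-MM-dd}' -f $Date)
    New-Item -ItemType Directory -Path $backupRoot -Force | Out-Null
    foreach ($t in $Trees) {
        $src = Join-Path $PublishRoot $t
        if (Test-Path $src) {
            # Copies the tree itself, giving <backupRoot>\<tree>\...
            Copy-Item -Path $src -Destination $backupRoot -Recurse -Force
        }
    }
    return $backupRoot
}

# Usage, after the services are stopped and before step 2:
# $backup = Backup-PublishTree -PublishRoot 'C:\publish'
```

One caveat with a date-only folder name: a second refresh on the same day would copy the already-refreshed binaries over the first backup, so a real script may want a time component or a guard against an existing folder.
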
**Production deploy:** out of scope for D.1 — the dev rig is the only
deployment in scope at this point. A separate PR-or-runbook lands the
production refresh once the dev rig has soaked for the documented
duration (parity-rig validation gate; see "Test gates" above).

## Sequencing matrix

```
@@ -552,13 +690,22 @@ A.4 ConditionRefresh │ │
│ │
B.4 SidecarAlarmHistorianWriter
(depends on C.2 deployed)
▼
Track D (deployment)
─────────────────────────
D.1 Refresh C:\publish + restart services
(depends on A.4 + B.4 + C.2 merged)
▼
──►B.5 docs + memory + completion banner
```

A.1 + B.1 + C.1 can all land in parallel — none have cross-repo runtime
dependencies. B.1's tests use proto types without needing a running
gateway. C.1 is purely sidecar-internal. The gateway-side dispatch (A.3)
gates B.2; the sidecar-side wiring (C.2) gates B.4. D.1 (deployment
refresh) gates B.5 (docs) — the docs sweep records the as-deployed
state, so the deploy must be live first.

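The gates named in this section can be written down as data and queried. A minimal sketch that encodes only the edges quoted here (A.3 gates B.2, C.2 gates B.4, the `Depends on:` lines for D.1 and B.5); the full plan may define more, and the names `$dependsOn` and `Get-MissingPrerequisites` are illustrative.

```powershell
# Gating edges as stated in this section; PRs not listed have no prerequisite recorded here.
$dependsOn = @{
    'B.2' = @('A.3')
    'B.4' = @('C.2')
    'D.1' = @('A.4', 'B.4', 'C.2')
    'B.5' = @('B.1', 'B.2', 'B.3', 'B.4', 'D.1')
}

function Get-MissingPrerequisites {
    param([string] $Pr, [string[]] $Done = @())
    # Prerequisites of $Pr that are not yet in $Done.
    @($dependsOn[$Pr]) | Where-Object { $_ -and ($Done -notcontains $_) }
}

Get-MissingPrerequisites -Pr 'B.5' -Done 'B.1', 'B.2', 'B.3', 'B.4'
# D.1
```

The example shows the new gate: with all of Track B's code PRs done, B.5 is still blocked until D.1 has been verified.
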
## Test gates

@@ -677,7 +824,14 @@ needed); land B.4 last and only after end-of-epic gate is green.
- `scripts\install\Install-Services.ps1` (C.2 — env-var toggle for write-enable)
- `tests\ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware.Tests\` (C.1 — outcome mapping + batch + cluster failover)

**lmxopcua — deployment refresh (Track D):**

- `scripts\install\Refresh-Services.ps1` *(new — D.1)*
- `docs\v2\dev-environment.md` (D.1 — document the refresh workflow)
- `docs\plans\artifacts\d1-rollout-YYYY-MM-DD.md` *(new — D.1 captured smoke run)*

Total: ~10 source files added/modified in mxaccessgw; ~14 in lmxopcua
proper; ~3 in the historian sidecar; ~2 deployment scripts; ~12 test
files across all repos. Should land in 4-6 weeks of focused work given
the parity-rig dependency for end-to-end validation, plus a short
final-week ops slot for D.1.