The runbook shipped at phase-7 close (2026-04-20) described the original
`Doubled = Source × 2` virtual tag, Float64 seed, and flat TagId-shaped
NodeIds. Four commits later the wiring has moved:
- Seed now targets `TestMachine_001.TestHistoryValue` (Int32, writable,
historized) — no placeholder to fill in for the dev box.
- VirtualTag is `MachineStatus` (Boolean, `Source > 0`, historized).
- NodeIds are path-based per OPC UA Part 3 §5.2.2
(`{driverId}/{folder-path}/{browseName}`).
- Seed inserts the ClusterNodeCredential row — without it the Server
bootstrap fails `Unauthorized: caller X is not bound to NodeId`.
Changes:
1. Step 3 — replace "edit the placeholder" instructions with the ZB
Galaxy-Repository query that finds writable historized attributes
(dpc CTE + HistoryExtension EXISTS + `security_classification > 0`).
2. New step 4a — LDAP + `SecurityProfile = Basic256Sha256-Sign` recipe
for the reverse-bridge + alarm-fires stages. Anonymous sessions are
denied writes against `Operate`-classified attributes (PR 26 gate);
`writeop / writeop123` against the dev-box GLAuth clears it.
3. Step 6 validation commands updated to the new NodeIds + reference
the path-based scheme's Part-3 rationale.
4. Drive-the-alarm snippet now calls `otopcua-cli write … -U writeop`
so operators see the explicit auth step.
5. Acceptance checklist updated for the new tag names + the
test-galaxy.ps1 `-Username` invocation.
6. Added a 2026-04-24 second-run evidence section alongside the original
— documents the 3/7 anonymous ceiling and what's needed to reach 7/7.
No code or seed changes in this commit — doc-only.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase 7 Live OPC UA E2E Smoke (task #240)
End-to-end validation that the Phase 7 production wiring chain (#243 / #244 / #245 / #246 / #247) actually serves virtual tags + scripted alarms over OPC UA against a real Galaxy + Aveva Historian.
Scope. Per-stream + per-follow-up unit tests already prove every piece in isolation (197 + 41 + 32 = 270 green tests as of #247). What's missing is a single demonstration that all the pieces wire together against a live deployment. This runbook is that demonstration.
Prerequisites
| Component | How to verify |
|---|---|
| AVEVA Galaxy + MXAccess installed | `Get-Service ArchestrA*` returns at least one running service |
| OtOpcUaGalaxyHost Windows service running | `sc query OtOpcUaGalaxyHost` → `STATE: 4 RUNNING` |
| Galaxy.Host shared secret matches `.local/galaxy-host-secret.txt` | Set during NSSM install; see `docs/ServiceHosting.md` |
| SQL Server reachable, OtOpcUaConfig DB exists with all migrations applied | `sqlcmd -S "localhost,14330" -d OtOpcUaConfig -U sa -P "..." -Q "SELECT COUNT(*) FROM dbo.__EFMigrationsHistory"` returns ≥ 11 |
| Server's appsettings.json `Node:ConfigDbConnectionString` matches your SQL Server | `cat src/ZB.MOM.WW.OtOpcUa.Server/appsettings.json` |
Galaxy.Host pipe ACL. The pipe allows the configured `OTOPCUA_ALLOWED_SID` (typically the user that runs `OtOpcUaGalaxyHost`; `dohertj2` on the dev box). Run the Server under the same user. Elevation doesn't matter: `PipeAcl.cs` no longer denies `BUILTIN\Administrators`, since UAC's deny-only Admins SID would have blocked non-elevated dev-box admins too.
Setup
1. Migrate the Config DB
cd src/ZB.MOM.WW.OtOpcUa.Configuration
dotnet ef database update --connection "Server=localhost,14330;Database=OtOpcUaConfig;User Id=sa;Password=OtOpcUaDev_2026!;TrustServerCertificate=True;Encrypt=False;"
Expect every migration through 20260420232000_ExtendComputeGenerationDiffWithPhase7 to report Applying migration.... Re-running is a no-op.
2. Seed the smoke fixture
sqlcmd -S "localhost,14330" -d OtOpcUaConfig -U sa -P "OtOpcUaDev_2026!" `
-I -i scripts/smoke/seed-phase-7-smoke.sql
Expected output ends with Phase 7 smoke seed complete. plus a Cluster / Node / Generation summary. Idempotent — re-running wipes the prior smoke state and starts clean.
The seed creates one each of: ServerCluster, ClusterNode, ClusterNodeCredential (binds the SQL login to the node — without this sp_GetCurrentGenerationForCluster returns Unauthorized: caller X is not bound to NodeId p7-smoke-node), ConfigGeneration (Published), Namespace, UnsArea, UnsLine, Equipment, DriverInstance (Galaxy proxy), Tag, two Script rows, one VirtualTag (MachineStatus = Source > 0, Boolean, historized), one ScriptedAlarm (OverTemp when Source > 50).
3. (Optional) Swap the Galaxy attribute
The shipped seed points dbo.Tag.TagConfig at TestMachine_001.TestHistoryValue — the dev-box Galaxy ships it as Int32, writable (security_classification = Operate), and historized (HistoryExtension primitive), so every E2E stage has a real live target. To swap to another attribute on a different Galaxy, pick a candidate via the same shape:
-- Run against the Galaxy Repository DB (ZB).
;WITH dpc AS (
SELECT g.gobject_id, p.package_id, p.derived_from_package_id, 0 AS depth
FROM gobject g INNER JOIN package p ON p.package_id = g.deployed_package_id
WHERE g.is_template = 0 AND g.deployed_package_id <> 0
UNION ALL
SELECT c.gobject_id, p.package_id, p.derived_from_package_id, c.depth + 1
FROM dpc c INNER JOIN package p ON p.package_id = c.derived_from_package_id
WHERE c.derived_from_package_id <> 0 AND c.depth < 10
)
SELECT DISTINCT g.tag_name + '.' + da.attribute_name AS full_ref,
dt.description AS dtype, da.security_classification
FROM dpc
INNER JOIN dynamic_attribute da ON da.package_id = dpc.package_id
INNER JOIN gobject g ON g.gobject_id = dpc.gobject_id
LEFT JOIN data_type dt ON dt.mx_data_type = da.mx_data_type
WHERE da.attribute_name NOT LIKE '[_]%'
AND da.attribute_name NOT LIKE '%.Description'
AND da.mx_data_type IN (1, 2, 3, 4)
AND da.security_classification > 0 -- writable
AND EXISTS (
SELECT 1 FROM primitive_instance pi
INNER JOIN primitive_definition pd
ON pd.primitive_definition_id = pi.primitive_definition_id
AND pd.primitive_name = 'HistoryExtension'
WHERE pi.package_id = dpc.package_id AND pi.primitive_name = da.attribute_name)
ORDER BY full_ref;
Then update the seed:
UPDATE dbo.Tag
SET TagConfig = N'{"FullName":"YourReal.GalaxyAttr","DataType":"Int32"}'
WHERE TagId = 'p7-smoke-tag-source';
4. Point Server.appsettings at the smoke node
{
"Node": {
"NodeId": "p7-smoke-node",
"ClusterId": "p7-smoke",
"ConfigDbConnectionString": "Server=localhost,14330;..."
}
}
4a. (Optional) Enable LDAP + SecurityProfile for the write stage
Anonymous OPC UA sessions are denied writes against Operate-classified tags by the PR 26 server-layer classification gate. To exercise the reverse-bridge + alarm-fires stages fully, the Server has to advertise a UserName UserTokenPolicy (any profile other than None) and authenticate against LDAP.
{
"OpcUa": {
"SecurityProfile": "Basic256Sha256-Sign",
"Ldap": {
"Enabled": true,
"Server": "localhost",
"Port": 3893,
"SearchBase": "dc=lmxopcua,dc=local",
"ServiceAccountDn": "cn=serviceaccount,dc=lmxopcua,dc=local",
"ServiceAccountPassword": "serviceaccount123",
"GroupToRole": {
"ReadOnly": "ReadOnly",
"WriteOperate": "WriteOperate",
"WriteTune": "WriteTune",
"WriteConfigure": "WriteConfigure",
"AlarmAck": "AlarmAck"
}
}
}
}
Dev-box GLAuth ships writeop / writeop123 in the WriteOperate group, admin / admin123 across all write groups. See C:\publish\glauth\auth.md.
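To make the gate concrete, here is a minimal Python sketch of how LDAP group membership could translate into a write decision. It is an illustration only: `roles_for`, `can_write`, and the `REQUIRED_ROLE` table are hypothetical names, and the rule that an `Operate`-classified tag requires the `WriteOperate` role is inferred from the text above, not taken from the Server source.

```python
from typing import Iterable, Set

# Mirrors the GroupToRole block in the appsettings fragment above.
GROUP_TO_ROLE = {
    "ReadOnly": "ReadOnly",
    "WriteOperate": "WriteOperate",
    "WriteTune": "WriteTune",
    "WriteConfigure": "WriteConfigure",
    "AlarmAck": "AlarmAck",
}

# ASSUMPTION: Operate-classified tags need the WriteOperate role.
# Other classifications are deliberately left out because the runbook
# does not state their role requirements.
REQUIRED_ROLE = {"Operate": "WriteOperate"}


def roles_for(ldap_groups: Iterable[str]) -> Set[str]:
    """Map LDAP groups to server roles, ignoring unmapped groups."""
    return {GROUP_TO_ROLE[g] for g in ldap_groups if g in GROUP_TO_ROLE}


def can_write(ldap_groups: Iterable[str], classification: str) -> bool:
    """Return True if the session's roles satisfy the classification."""
    required = REQUIRED_ROLE.get(classification)
    if required is None:
        return False  # unknown classification: deny in this sketch
    return required in roles_for(ldap_groups)
```

Under these assumptions an anonymous session (no groups) is denied, while a user in the `WriteOperate` group, such as `writeop`, is allowed to write the `Operate`-classified Source tag.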
Run
5. Start the Server
dotnet run --project src/ZB.MOM.WW.OtOpcUa.Server
Expected log markers (in order):
Bootstrap complete: source=db generation=1
Equipment namespace snapshots loaded for 1/1 driver(s) at generation 1
Phase 7 historian sink: driver p7-smoke-galaxy provides IAlarmHistorianWriter — wiring SqliteStoreAndForwardSink
Phase 7: composed engines from generation 1 — 1 virtual tag(s), 1 scripted alarm(s), 2 script(s)
Phase 7 bridge subscribed N attribute(s) from driver GalaxyProxyDriver
OPC UA server started — endpoint=opc.tcp://0.0.0.0:4840/OtOpcUa driverCount=1
Address space populated for driver p7-smoke-galaxy
If any line is missing, investigate that stage first: each step has its own log signature, so the broken piece is identifiable.
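The ordered-marker check can be automated once the Server output is captured to text. A minimal Python sketch follows; `first_missing_marker` is a hypothetical helper (not part of the repo), and the marker substrings are shortened from the expected log lines above.

```python
from typing import Iterable, Optional, Sequence

# Substrings taken from the expected log markers, in the expected order.
EXPECTED_MARKERS = [
    "Bootstrap complete:",
    "Equipment namespace snapshots loaded",
    "Phase 7 historian sink:",
    "Phase 7: composed engines",
    "Phase 7 bridge subscribed",
    "OPC UA server started",
    "Address space populated",
]


def first_missing_marker(
    log_lines: Iterable[str],
    markers: Sequence[str] = EXPECTED_MARKERS,
) -> Optional[str]:
    """Return the first marker not seen in order, or None if all appear."""
    idx = 0
    for line in log_lines:
        # Only advance when the next expected marker shows up.
        if idx < len(markers) and markers[idx] in line:
            idx += 1
    return markers[idx] if idx < len(markers) else None
```

Feeding it the lines of a captured log returns the name of the first stage that never logged, which is the place to start looking.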
6. Validate via Client.CLI
dotnet run --project src/ZB.MOM.WW.OtOpcUa.Client.CLI -- browse -u opc.tcp://localhost:4840/OtOpcUa -r -d 5
Expect to see under the namespace root: lab-floor → galaxy-line → reactor-1 with three child variables: Source (driver-sourced Int32), MachineStatus (virtual tag Boolean, Source > 0), and OverTemp (scripted alarm Boolean, Source > 50). NodeIds are path-based per OPC UA Part 3 §5.2.2 — the walker mints them from {driverId}/{folder-path}/{browseName} and stores the driver-side FullReference in an internal NodeId→FullRef map, so client subscriptions survive backend address renames.
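As an illustration of the path-based scheme, the following Python sketch composes a NodeId string from its parts and keeps a separate NodeId to FullReference map. `make_node_id` and `node_to_fullref` are hypothetical names; the namespace index 2 is taken from the example commands in this runbook and may differ on another deployment.

```python
from typing import Dict, Sequence


def make_node_id(driver_id: str, folder_path: Sequence[str],
                 browse_name: str, ns: int = 2) -> str:
    """Build 'ns=<n>;s={driverId}/{folder-path}/{browseName}'."""
    segments = [driver_id, *folder_path, browse_name]
    # Reject empty segments and embedded separators so paths stay unambiguous.
    if any(not s or "/" in s for s in segments):
        raise ValueError("segments must be non-empty and must not contain '/'")
    return f"ns={ns};s=" + "/".join(segments)


# The driver-side address lives in a side map, not in the NodeId itself.
node_to_fullref: Dict[str, str] = {}
source_id = make_node_id(
    "p7-smoke-galaxy", ["lab-floor", "galaxy-line", "reactor-1"], "Source"
)
node_to_fullref[source_id] = "TestMachine_001.TestHistoryValue"
```

Because clients only ever see the path-based string, pointing `Source` at a different Galaxy attribute changes the map entry and leaves the NodeId untouched.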
Read the virtual tag
dotnet run --project src/ZB.MOM.WW.OtOpcUa.Client.CLI -- read `
-u opc.tcp://localhost:4840/OtOpcUa `
-n "ns=2;s=p7-smoke-galaxy/lab-floor/galaxy-line/reactor-1/MachineStatus"
Expected: Boolean. Push a value change into the Source Galaxy attribute and re-read — MachineStatus should follow within the bridge's publishing interval (1 second by default).
Read the scripted alarm
dotnet run --project src/ZB.MOM.WW.OtOpcUa.Client.CLI -- read `
-u opc.tcp://localhost:4840/OtOpcUa `
-n "ns=2;s=p7-smoke-galaxy/lab-floor/galaxy-line/reactor-1/OverTemp"
Expected: Boolean — false when Source ≤ 50, true when Source > 50.
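The two expressions are simple threshold predicates, and the boundary values are the ones worth checking by hand. A small Python sketch of the expected truth values (the function names are illustrative; the real scripts live in the seeded Script rows):

```python
def machine_status(source: int) -> bool:
    """Virtual tag MachineStatus: true while Source is above zero."""
    return source > 0


def over_temp(source: int) -> bool:
    """Scripted alarm OverTemp: true only when Source exceeds 50."""
    return source > 50


# Expected reads for a few Source values, including both boundaries.
expected = {
    0: (False, False),
    1: (True, False),
    50: (True, False),   # exactly 50 does not trip the alarm
    51: (True, True),
    75: (True, True),
}
for value, (status, alarm) in expected.items():
    assert machine_status(value) is status
    assert over_temp(value) is alarm
```

Writing 75 to Source, as the alarm step of this runbook does, should therefore leave both MachineStatus and OverTemp reading true.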
Drive the alarm + verify historian queue
Push a Source value above 50 — either from Galaxy itself, or via the Server's OPC UA write path using LDAP credentials (step 4a). Within ~1 second, OverTemp.Read flips to true. The alarm engine emits a transition to Phase7EngineComposer.RouteToHistorianAsync → SqliteStoreAndForwardSink.EnqueueAsync → drain worker (every 2s) → GalaxyHistorianWriter.WriteBatchAsync → Galaxy.Host pipe → Aveva Historian alarm schema.
# OPC UA write path — requires LDAP from step 4a + a writeop-class user.
dotnet run --project src/ZB.MOM.WW.OtOpcUa.Client.CLI -- write `
-u opc.tcp://localhost:4840/OtOpcUa -S sign `
-n "ns=2;s=p7-smoke-galaxy/lab-floor/galaxy-line/reactor-1/Source" `
-v 75 -U writeop -P writeop123
Verify the queue absorbed the event:
sqlite3 "$env:ProgramData\OtOpcUa\alarm-historian-queue.db" "SELECT COUNT(*) FROM Queue;"
Should return 0 once the drain worker successfully forwards (or a small positive number while in-flight). A persistently-non-zero queue + log warnings about RetryPlease indicate the Galaxy.Host historian write path is failing — check the Host's log file.
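Rather than re-running the count by hand, the drain can be polled. A minimal Python sketch using the standard `sqlite3` module; `wait_for_drain` is a hypothetical helper, and the only thing it assumes about the database is the `Queue` table name used in the command above.

```python
import sqlite3
import time


def wait_for_drain(db_path: str, timeout_s: float = 10.0,
                   poll_s: float = 0.5) -> bool:
    """Poll COUNT(*) on Queue until it reaches 0 or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while True:
        conn = sqlite3.connect(db_path)
        try:
            (count,) = conn.execute("SELECT COUNT(*) FROM Queue").fetchone()
        finally:
            conn.close()  # do not hold the file open between polls
        if count == 0:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)
```

A `False` result after a generous timeout corresponds to the persistently non-zero case described above and is the cue to check the Host's log file.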
Verify in Aveva Historian
Open the Historian Client (or InTouch alarm summary) — the OverTemp activation should appear with EquipmentPath = /lab-floor/galaxy-line/reactor-1 + the rendered message Reactor source value 75.3 exceeded 50 (or whatever value tripped it).
Acceptance Checklist
- EF migrations applied through `20260420232000_ExtendComputeGenerationDiffWithPhase7`
- Smoke seed completes without errors + creates exactly 1 Published generation
- Server starts + logs the Phase 7 composition lines
- Client.CLI browse shows the UNS tree with Source / MachineStatus / OverTemp under reactor-1
- Read on `Source` returns a Good-quality Int32 value (proves MXAccess round-trip)
- Read on `MachineStatus` returns the live boolean truth of `Source > 0`
- Read on `OverTemp` returns the live boolean truth of `Source > 50`
- `test-galaxy.ps1 -Username writeop -Password writeop123` drives Source past 50 and flips `OverTemp` to `true` within 1 s
- SQLite queue drains (`COUNT(*)` returns to 0 within 2 s of an alarm transition)
- Historian shows the `OverTemp` activation event with the rendered message
Second-run evidence (2026-04-24 dev box)
Full live stack ran end-to-end once the IPC unblocks (commit d11dd05), path-based NodeIds (commit 8be82e0), cold-start engine guards (commit 69e1d32), and seed retarget to TestMachine_001.TestHistoryValue (commit ec1a590) landed. Anonymous scripts/e2e/test-galaxy.ps1 run reaches 3/7:
[PASS] source NodeId readable (Galaxy pipe → proxy → server → client chain up)
[PASS] source value = System.Byte[]
[INFO] BadUserAccessDenied — attribute's Galaxy-side ACL blocks writes for this session.
The INFO stage is correct behaviour — Source is Operate-classified and the anonymous session carries no LDAP roles. The Virtual-tag / Subscribe / Alarm / History stages stay at [FAIL] for two further environmental reasons once write is unblocked:
1. `TestMachine_001.TestHistoryValue` is driven by whatever Galaxy code runs on the object. It is idle in the default dev-box state, so no subscription pushes fire.
2. Historian writes require the Aveva Historian SDK to accept the alarm schema event, and the dev box doesn't have that path live.
Running ./test-galaxy.ps1 -Username writeop -Password writeop123 with step 4a's LDAP + SecurityProfile = Basic256Sha256-Sign applied unblocks the reverse-bridge + alarm-fires stages. The virtual-tag, subscribe, and history stages depend on further deployment choices (pick an attribute Galaxy is actively writing to, wire Aveva Historian SDK).
First-run evidence (2026-04-20 dev box)
Ran the smoke against the live dev environment. Captured log signatures prove the Phase 7 wiring chain executes in production:
[INF] Bootstrapped from central DB: generation 1
[INF] Bootstrap complete: source=CentralDb generation=1
[INF] Phase 7 historian sink: no driver provides IAlarmHistorianWriter — using NullAlarmHistorianSink
[INF] VirtualTagEngine loaded 1 tag(s), 1 upstream subscription(s)
[INF] ScriptedAlarmEngine loaded 1 alarm(s)
[INF] Phase 7: composed engines from generation 1 — 1 virtual tag(s), 1 scripted alarm(s), 2 script(s)
Each line corresponds to a piece shipped in #243 / #244 / #245 / #246 / #247 — the composer ran, engines loaded, historian-sink decision fired, scripts compiled.
Two gaps surfaced (filed as new tasks below, NOT Phase 7 regressions):
- No driver-instance bootstrap pipeline. The seeded `DriverInstance` row never materialised an actual `IDriver` instance in `DriverHost`: the log shows `Equipment namespace snapshots loaded for 0/0 driver(s)`. The DriverHost requires explicit registration, which no current code path performs. Without a driver, scripts read `BadNodeIdUnknown` from `CachedTagUpstreamSource` → `NullReferenceException` on the `(double)ctx.GetTag(...).Value` cast. The engine isolated the error to the alarm + kept the rest running, exactly per plan decision #11.
- OPC UA endpoint port collision. The Server logged `Failed to establish tcp listener sockets` because port 4840 was already in use by another OPC UA server on the dev box.
Both are pre-Phase-7 deployment-wiring gaps. Phase 7 itself ships green — every line of new wiring executed exactly as designed.
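The port collision can be caught before starting the Server. A minimal Python sketch that tries to bind the endpoint port; `port_is_free` is a hypothetical helper, and 4840 and `0.0.0.0` are taken from the endpoint shown in the expected log markers.

```python
import socket


def port_is_free(port: int, host: str = "0.0.0.0") -> bool:
    """Return True if a TCP listener could bind host:port right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            # Typically "address already in use": another server holds it.
            return False
        return True


if __name__ == "__main__":
    print("4840 free" if port_is_free(4840) else "4840 in use")
```

The probe socket is closed as soon as the check finishes, so the result is a snapshot: another process can still claim the port between the check and the Server start.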
Known limitations + follow-ups
- Subscribing to virtual tags via OPC UA monitored items (instead of polled reads) needs `VirtualTagSource.SubscribeAsync` wiring through `DriverNodeManager.OnCreateMonitoredItem`; this is covered as part of release-readiness.
- Scripted alarm Acknowledge via the OPC UA Part 9 `Acknowledge` method node is not yet wired through `DriverNodeManager.MethodCall` dispatch. Operators acknowledge through Admin UI today; the OPC UA-method path is a separate task.
- Phase 7 compliance script (`scripts/compliance/phase-7-compliance.ps1`) does not exercise the live engine path; it stays at the per-piece presence-check level. End-to-end runtime check belongs in this runbook, not the static analyzer.