# OtOpcUa ↔ HistorianGateway — Follow-up & Deferred Items

**Status:** the 21-task integration (`feat/historian-gateway-backend`, Gitea PR
[#423](https://gitea.dohertylan.com/dohertj2/lmxopcua/pulls/423)) + the continuous-historization
ref-feed are complete and **live-validated** against `wonder-sql-vd03`. The offline suite is green;
the live `Category=LiveIntegration` suite is green (read ✅, write-persist ✅, alarm-send ✅,
alarm-readback ⏭ skip). This doc tracks everything deliberately deferred or surfaced during
validation, with the **owning repo** for each.

> **Execution update (2026-06-27 — this follow-up pass):**
> - **FU-1 — RESOLVED as a documented protocol limitation** (NOT a fixable gateway bug): the captured
>   CM_EVENT event-send wire never carries `SourceName`, so `Source_Object` cannot be populated by the
>   gateway. Recorded as `pending.md` **C4** + a CLAUDE.md note in the HistorianGateway repo (commit
>   `174a4a9` on `fix/gateway-otopcua-followups`). The OtOpcUa live test stays skipped with the corrected
>   reason. See FU-1 below for the (now-confirmed) root cause.
> - **FU-2 — ✅ DONE + live-validated** in HistorianGateway (`fix/gateway-otopcua-followups`, commits
>   `150868c` + `1c2d11d`). The SQL live-write path converts UTC→server-local in-SQL via
>   `DATEADD(MINUTE, DATEPART(TZOFFSET, SYSDATETIMEOFFSET()), @dt)`; an explicit-timestamp round-trip is
>   now EXACT against the live historian (delta 00:00:00).
> - **FU-3 — ✅ DONE** in OtOpcUa (this branch, commit `111adc92`): `HistorizedTagRef(MuxRef, HistorianName)`
>   carried through the sink/recorder; interest registered by mux ref, values written under the historian
>   name. Recorder + applier tests green.
> - **FU-4 — ✅ DONE** in OtOpcUa (this branch, commit `b2276b5b`).
> - **FU-5** — still pre-existing/not-ours (tracked below). **FU-6** — still pending the merges.

**Live-validation harness recap (how to reproduce any of the live findings below):** run the
HistorianGateway locally against the live historian, then point the OtOpcUa live tests (or `grpcurl`)
at it. The gateway boots from env-var config (secrets from `~/.zshenv`):

```
ASPNETCORE_ENVIRONMENT=Development
Historian__Host=$HISTORIAN_GRPC_HOST Historian__Port=32565 Historian__GrpcUseTls=true
Historian__UserName=$HISTORIAN_USER Historian__Password=$HISTORIAN_PASSWORD
Historian__AllowUntrustedServerCertificate=true
Galaxy__ConnectionString=$GALAXY_SQL_CONNECTION
RuntimeDb__Enabled=true RuntimeDb__EventReadsEnabled=true
RuntimeDb__ConnectionString="Server=$HISTORIAN_GRPC_HOST;Database=Runtime;User Id=$HISTORIAN_SQL_USER;Password=$HISTORIAN_SQL_PASSWORD;TrustServerCertificate=true;Encrypt=false"
ApiKeys__Mode=Disabled
# dotnet run the Server → gRPC h2c on localhost:5221, HTTP on :5220 (/healthz, /health/ready)
```

OtOpcUa live tests then read `HISTGW_GATEWAY_ENDPOINT=http://localhost:5221` +
`HISTGW_GATEWAY_APIKEY=<any>` + `HISTGW_TEST_TAG`/`HISTGW_WRITE_SANDBOX_TAG`/`HISTGW_ALARM_SOURCE`.
Direct SQL: `Runtime.dbo.Events` is an **INSQL linked-server view that rejects untimed queries** —
always include an `EventTimeUtc` range. `sqlcmd -S $HISTORIAN_GRPC_HOST -d Runtime -U $HISTORIAN_SQL_USER -C`
(password via `SQLCMDPASSWORD`).

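Because the `Events` view rejects untimed queries, it can help to generate ad-hoc query text through a helper that refuses to run without a range. The following is a minimal sketch only: `events_query` is a hypothetical helper (not part of either repo), and the selected column list is an assumption built from the column names mentioned in this doc (`EventTimeUtc`, `Type`, `Source_Object`).

```python
from datetime import datetime, timezone

def events_query(start_utc, end_utc, source=None):
    """Build a time-bounded query against Runtime.dbo.Events (hypothetical helper)."""
    for ts in (start_utc, end_utc):
        # The view rejects untimed queries, so both bounds are mandatory.
        if ts is None or ts.tzinfo is None:
            raise ValueError("both bounds must be timezone-aware datetimes")
    if start_utc >= end_utc:
        raise ValueError("start must be earlier than end")

    def lit(ts):
        # Normalize to UTC and render as a SQL datetime literal.
        return ts.astimezone(timezone.utc).strftime("'%Y-%m-%d %H:%M:%S'")

    sql = (
        "SELECT EventTimeUtc, Type, Source_Object FROM Runtime.dbo.Events "
        f"WHERE EventTimeUtc >= {lit(start_utc)} AND EventTimeUtc < {lit(end_utc)}"
    )
    if source is not None:
        # Double any single quotes so the source name is a valid SQL string literal.
        sql += " AND Source_Object = '" + source.replace("'", "''") + "'"
    return sql

query = events_query(
    datetime(2026, 6, 25, tzinfo=timezone.utc),
    datetime(2026, 6, 27, tzinfo=timezone.utc),
    source="TableAlarms_006",
)
print(query)
```

The resulting text can be pasted into the `sqlcmd` session shown above; the two-day window mirrors the one used during validation.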

---

## Priority 1 — Gateway-side bugs that block OtOpcUa write/read use cases

**Owning repo: `~/Desktop/HistorianGateway` (HistorianGateway).** OtOpcUa code is correct for both;
these are gateway defects that gate the "write OtOpcUa's own data, read it back" use case.

### FU-1 — `SendEvent` does not populate `Source_Object` — ✅ RESOLVED as a documented protocol limitation (2026-06-27)

> **Outcome:** root-caused and confirmed **not fixable at the gateway** — the captured CM_EVENT event-send
> wire (`HistorianEventWriteProtocol.SerializeEventValueBlob`) serializes Namespace/Type/properties but
> **never `SourceName`** (the gateway threads it correctly; the wire drops it). `Source_Object` is a
> Galaxy-platform association for object-raised events. Documented as `pending.md` **C4** + a CLAUDE.md note
> in HistorianGateway; likely won't-fix (would need new wire-capture evidence in `histsdk` — vendored
> sources aren't hand-edited). The "Investigation/Proposed fix" below is retained for the record; option 1
> is now known to be infeasible.

**Symptom (live-proven):** OtOpcUa's `GatewayAlarmHistorianWriter.SendEvent` of an event with
`source_name="HistGW.LiveTest.AlarmSource"` **acks** and **lands in `Runtime.dbo.Events`** with the
correct `Type` (`LimitAlarm`) and `EventTimeUtc` (no shift) — but with **`Source_Object = NULL`** (and
all other `Source_*`/`Provider_*` columns null). The gateway's `SqlEventReader` filters
`WHERE Source_Object = @source`, so a source-filtered `ReadEvents` of a just-sent event returns 0.

**What works (so this is narrow, not "C2 won't-fix"):**

- Time-only `ReadEvents` (no source filter) returns events (50 in a 2-day window during validation).
- Source-filtered `ReadEvents` for a **real Galaxy event source** (`TableAlarms_006`) returns its
  history (`System.Deploy`/`Undeploy`/`Alarm.Set`, each with `source_name` populated). So the SQL
  reader + source filter are functional; only **ad-hoc SendEvents lack a `Source_Object`.**
- ⇒ **Reading existing Galaxy alarm/event history by source already works** (the mxaccessgw read use
  case). Only round-tripping OtOpcUa's *own* sends by source is blocked.

**Investigation (gateway repo):**

- Read the v8 event-send path: `RegisterCmEventTag` + the `ConnectionType=Event` send (CM_EVENT). Find
  where the event's source/tag is set on the wire payload and whether the historian maps any send-side
  field → the `Events.Source_Object` column. Start at the gateway `SendEvent` service + the vendored
  `AVEVA.Historian.Client` event session (`HistorianEventSession`), and the
  `event-session-reuse-spike` notes in `../histsdk/docs/reverse-engineering/`.
- Determine whether the historian's CM_EVENT API even *allows* setting a `Source_Object` for an event
  not raised by a Galaxy object. If the source must be a registered event-tag/source name, decide how
  OtOpcUa's `EquipmentPath` should map to it.

**Proposed fix (one of):**

1. If the send payload has a source/tag field that maps to `Source_Object`: populate it from the event's
   `source_name` in the gateway `SendEvent` handler. (Preferred — makes write-back-by-source work.)
2. If the historian cannot carry a source for ad-hoc events: document it, and have the gateway's
   `SqlEventReader` optionally match the source in a fallback column the send *does* populate (if any),
   or expose a "read all events in window, filter client-side" mode. Update OtOpcUa's
   `GatewayHistorianDataSource.ReadEventsAsync` defensive client-side source filter accordingly (it
   currently drops events whose mapped `SourceName` ≠ requested source — which would also drop
   source-less sends even if the server returned them).

**Acceptance:** an OtOpcUa `SendEvent(source=X)` is readable back via `ReadEvents(source=X)` within the
window. Then **un-skip** `Alarm_SendEvent_then_ReadEvents` in
`tests/Drivers/.../Live/GatewayLiveIntegrationTests.cs` (it currently `Assert.Skip`s on a 0-result with
the accurate reason).

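The interaction described in option 2 (a strict client-side source filter also discarding source-less sends) can be modeled in a few lines. This is an illustrative Python sketch, not the C# in `GatewayHistorianDataSource.ReadEventsAsync`: the `Event` class and both filter functions are hypothetical, and the lenient variant is just one possible reading of "update the filter accordingly".

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Event:
    type: str
    source_name: Optional[str]  # None models Source_Object = NULL

def strict_source_filter(events, source):
    """Models the current behavior: keep only events whose source matches exactly."""
    return [e for e in events if e.source_name == source]

def lenient_source_filter(events, source):
    """Hypothetical variant: also keep events that carry no source at all."""
    return [e for e in events if e.source_name in (None, source)]

window = [
    Event("LimitAlarm", None),              # an ad-hoc OtOpcUa send
    Event("Alarm.Set", "TableAlarms_006"),  # a Galaxy-raised event
]

requested = "HistGW.LiveTest.AlarmSource"
strict_hits = strict_source_filter(window, requested)
lenient_hits = lenient_source_filter(window, requested)
print(len(strict_hits), len(lenient_hits))
```

The trade-off is visible in the lenient variant: it would return every source-less event in the window, not only the ones the caller sent, so it trades precision for recall.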
### FU-2 — `WriteLiveValues` shifts an explicit timestamp by the local↔UTC offset (~+4h) — ✅ DONE + live-validated (2026-06-27)

> **Outcome:** fixed in HistorianGateway (`fix/gateway-otopcua-followups`). The SQL live-write path now
> converts UTC→server-local in-SQL via `DATEADD(MINUTE, DATEPART(TZOFFSET, SYSDATETIMEOFFSET()), @dt)` (a
> single atomic offset read). An explicit-timestamp round-trip (real SQL write → gateway UTC ReadRaw) is now
> EXACT against the live 2023 R2 historian (delta 00:00:00); offline unit test locks the exact conversion
> expression. The OtOpcUa live write test can now be tightened (see acceptance).

**Symptom (live-proven, reproduces via raw `grpcurl` — no OtOpcUa code involved):** a `WriteLiveValues`
with an **explicit** `timestamp=2026-06-27T03:45:00Z` lands in the historian at
`2026-06-27T07:45:00Z` (+4h = the deployment's local↔UTC delta). A **server-stamped** write (null
timestamp) lands correctly at the gateway's UTC now. The OtOpcUa value-writer sends correct UTC
(`Timestamp.FromDateTime(SpecifyKind(ts, Utc))`), so the shift is in the gateway's SQL write path.

**Impact:** the continuous-historization recorder writes the driver's **source** timestamp (explicit),
so historized values would carry timestamps offset by the host's UTC offset until fixed. (The OtOpcUa
live write test currently uses a ±12h tz-tolerant readback window to validate *persistence* around
this — see FU-2 acceptance.)

**Investigation (gateway repo):** `SqlLiveValueWriter` (the `aaAnalogTagInsert` + `INSERT INTO History`
path). Inspect which `History` DateTime column is written (local vs `*UTC`) and the conversion applied
to the incoming proto UTC `Timestamp`. The +4h (value lands *later* than supplied UTC) is consistent
with writing a UTC value into a **local** column that `ReadRaw` then converts local→UTC, on a server
whose offset is −4h (EDT). Compare against the **server-stamped** path (which is correct) to see what
conversion the explicit path is missing.

**Proposed fix:** convert the supplied UTC timestamp to the historian server's local time before the
`History` insert (or write the UTC-typed column), so an explicit UTC timestamp round-trips unchanged.
Add a gateway unit/live test: write explicit `T`, read back, assert the sample timestamp == `T`.

**Acceptance:** an explicit-timestamp `WriteLiveValues` reads back at the supplied UTC time. Then
**tighten** the OtOpcUa live write test (`Write_then_read_on_sandbox_tag`) back to a narrow recent
window anchored on the write time.

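The +4h shift and its fix reduce to simple offset arithmetic, which the sketch below reproduces. It is a Python model under stated assumptions, not the gateway's code: the server offset is fixed at −4h (EDT), the `History` column is modeled as a naive server-local datetime, and all function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the historian server runs at a fixed UTC-4 offset (EDT).
SERVER_OFFSET = timedelta(hours=-4)

def read_raw_as_utc(stored_local):
    """Model of the read path: treat the stored value as server-local, convert to UTC."""
    return (stored_local - SERVER_OFFSET).replace(tzinfo=timezone.utc)

def write_buggy(ts_utc):
    """Pre-fix: the UTC wall-clock value is stored unconverted in the local column."""
    return ts_utc.replace(tzinfo=None)

def write_fixed(ts_utc):
    """Post-fix: shift UTC to server-local before storing (the role of the DATEADD step)."""
    return (ts_utc + SERVER_OFFSET).replace(tzinfo=None)

supplied = datetime(2026, 6, 27, 3, 45, tzinfo=timezone.utc)
buggy_readback = read_raw_as_utc(write_buggy(supplied))
fixed_readback = read_raw_as_utc(write_fixed(supplied))

print(buggy_readback)             # 2026-06-27 07:45:00+00:00
print(fixed_readback - supplied)  # 0:00:00
```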

---

## Priority 2 — OtOpcUa-side follow-ups

**Owning repo: `~/Desktop/OtOpcUa` (this repo).**

### FU-3 — Continuous-historization `HistorianTagname` override edge case — ✅ DONE (2026-06-27, commit `111adc92`)

> **Outcome:** implemented the "carry both identifiers" fix below. A new `HistorizedTagRef(MuxRef,
> HistorianName)` record threads through `IHistorizedTagSubscriptionSink` → the recorder; the recorder keeps
> a **muxRef → SET-of-historian-names** map, registers/filters mux interest by `MuxRef` (= driver `FullName`)
> but writes under every `HistorianName` (override-or-FullName) sharing that ref. The applier resolves both.
> The set (not a single name) closes a code-review **Critical**: one driver ref can back several historized
> equipment tags via aliasing (identical machines sharing a register), each with its own override — a single
> fan must write ALL of them, not silently drop all but one. Tests: divergent-override,
> aliased-refs-each-get-the-value, remove-one-alias-keeps-the-ref, override-rename updates the write target
> without mux churn; applier feed tests assert the full pairs. Commits `111adc92` + `60695179` (review fix).

Original issue (pre-fix behavior, retained for the record): the `ContinuousHistorizationRecorder`
registered `DependencyMuxActor` interest **by the resolved historian name** (`HistorianTagname` override
else `FullName`) — the same key the EnsureTags hook and the writer use. The mux fans
`DependencyValueChanged` **keyed by `FullReference`** (the driver's published ref). In the **common case
(no override)** historian-name == `FullReference`, so it was fully consistent and worked (live-validated
path is the value writer; mux fan-out is the recorder's input).
**When a `HistorianTagname` override was set** (override ≠ `FullReference`), the recorder registered
interest under a key the mux never fans → that tag's values were never captured.

**Fix (as implemented):** register mux interest by `FullReference` (the mux key) while writing to the
historian under the resolved historian name — i.e. carry both identifiers through
`IHistorizedTagSubscriptionSink` / the recorder (a `(muxRef, historianName)` pair) instead of a single
string. Add a recorder test with a divergent override. **Low urgency** (overrides are uncommon); only
matters for non-Galaxy historized tags that set an explicit `HistorianTagname`.

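The muxRef → set-of-historian-names bookkeeping can be sketched as follows. This is a Python model for illustration, not the C# recorder: `RecorderModel`, its method names, and the sample tag names are hypothetical; only the `HistorizedTagRef(MuxRef, HistorianName)` pairing comes from the description above.

```python
from collections import defaultdict
from typing import NamedTuple

class HistorizedTagRef(NamedTuple):
    mux_ref: str         # driver FullName; the key the mux fans values under
    historian_name: str  # HistorianTagname override, else FullName

class RecorderModel:
    """Hypothetical stand-in for the recorder's bookkeeping."""

    def __init__(self):
        self._names_by_ref = defaultdict(set)
        self.writes = []  # (historian_name, value) pairs "written" so far

    def update_refs(self, added=(), removed=()):
        """Apply a delta; return (mux refs to register, mux refs to unregister)."""
        register, unregister = [], []
        # Adds run first so a rename (add new name + remove old) never empties the set.
        for ref in added:
            if ref.mux_ref not in self._names_by_ref:
                register.append(ref.mux_ref)
            self._names_by_ref[ref.mux_ref].add(ref.historian_name)
        for ref in removed:
            names = self._names_by_ref.get(ref.mux_ref)
            if names is None:
                continue
            names.discard(ref.historian_name)
            if not names:
                del self._names_by_ref[ref.mux_ref]
                unregister.append(ref.mux_ref)
        return register, unregister

    def on_value_changed(self, mux_ref, value):
        """One fanned value is written under every historian name sharing the ref."""
        for name in sorted(self._names_by_ref.get(mux_ref, ())):
            self.writes.append((name, value))

rec = RecorderModel()
a = HistorizedTagRef("Line1.Temp", "Plant.A.Temp")
b = HistorizedTagRef("Line1.Temp", "Plant.B.Temp")
first_delta = rec.update_refs(added=[a, b])
rec.on_value_changed("Line1.Temp", 21.5)
print(first_delta, rec.writes)
```

Keeping a set per mux ref is what lets two aliased tags share one mux registration while each still receives the value under its own historian name.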
### FU-4 — `AlarmHistorianOptions.Validate()` `MaxAttempts<=0` test coverage (minor) — ✅ DONE (2026-06-27, commit `b2276b5b`)

T19 pruned the Wonderware-shaped fields and reworked `AlarmHistorianRegistrationTests`. The
`MaxAttempts <= 0` warning branch in `AlarmHistorianOptions.Validate()` is exercised in prod but not
covered by a test (the sibling warnings for `DrainIntervalSeconds`/`Capacity`/`DeadLetterRetentionDays`
are). Add a `Validate_warns_on_non_positive_max_attempts` case. Trivial.

### FU-5 — Pre-existing `Host.IntegrationTests` failure (NOT ours — track separately)

`EquipmentNamespaceMaterializationTests.Deploying_an_equipment_namespace_carries_the_signal_into_the_artifact`
fails (`Rejected` vs expected `Accepted`) on a **Modbus-only** namespace via `DraftValidator`/
`ConfigComposer` — untouched by this branch. **Verified failing identically on `master`** (via
`git stash`). Environment/pre-existing; out of scope for the historian work but worth a separate ticket.


---

## Priority 3 — Cross-repo propagation (after merges)

- **FU-6 — scadaproj index + agent memory.** When PR #423 merges (and the Plan 1 client PR), update
  `../scadaproj/CLAUDE.md` (the HistorianGateway + OtOpcUa entries) and the agent memory notes
  (`otopcua-historian-backend`, `scadaproj-umbrella`) to record: OtOpcUa now consumes
  `ZB.MOM.WW.HistorianGateway.Client` as its historian backend; the Wonderware historian driver was
  retired; the two gateway follow-ups (FU-1/FU-2). Per the CLAUDE.md cross-repo propagation rule.

---

## Already resolved this effort (for the record — do NOT redo)

- **Alarm SendEvent event-id bug** — `AlarmEventMapper` set the wire `Id` → gateway handler throws →
  every alarm send `PermanentFail`. **Fixed** (`44644ddc`): leave `Id` unset, carry the id as an
  `AlarmId` property. Live-validated (send acks).
- **Continuous-historization ref-feed gap** — recorder spawned with an empty ref set. **Closed**
  (`2982cc4b`): `IHistorizedTagSubscriptionSink` + recorder `UpdateHistorizedRefs(added, removed)`
  converges mux interest on each `AddressSpaceApplier.Apply()`.
- **Read path / use case 1** — live-validated PASS (ReadRaw through `GatewayHistorianDataSource`).
- **C2 mis-attribution** — the alarm readback-0 was NOT the "C2 server-gated event reads" limitation;
  the SQL reader works (see FU-1).