lmxopcua/docs/plans/2026-06-27-otopcua-historian-followups.md
Joseph Doherty 9fca3d9c05
docs(historian-gateway): follow-up & deferred-items plan (gateway SendEvent source + tz, recorder override, propagation)
Consolidates everything deferred or surfaced during live validation, with owning repo
per item: P1 gateway bugs (FU-1 SendEvent doesn't populate Source_Object → alarm
write-back-by-source; FU-2 WriteLiveValues +4h explicit-timestamp shift), P2 OtOpcUa
items (FU-3 HistorianTagname-override recorder edge; FU-4 MaxAttempts test; FU-5
pre-existing Modbus Host.IntegrationTests failure), P3 cross-repo propagation. Includes
the live-validation reproduction recipe + the dbo.Events INSQL-view caveat.

Claude-Session: https://claude.ai/code/session_012SDSQ3AcaXqPcBtDESBRii
2026-06-27 00:04:21 -04:00

# OtOpcUa ↔ HistorianGateway — Follow-up & Deferred Items
**Status:** the 21-task integration (`feat/historian-gateway-backend`, Gitea PR
[#423](https://gitea.dohertylan.com/dohertj2/lmxopcua/pulls/423)) + the continuous-historization
ref-feed are complete and **live-validated** against `wonder-sql-vd03`. The offline suite is green;
the live `Category=LiveIntegration` suite is green (read ✅, write-persist ✅, alarm-send ✅,
alarm-readback ⏭ skip). This doc tracks everything deliberately deferred or surfaced during
validation, with the **owning repo** for each.
**Live-validation harness recap (how to reproduce any of the live findings below):** run the
HistorianGateway locally against the live historian, then point the OtOpcUa live tests (or `grpcurl`)
at it. The gateway boots from env-var config (secrets from `~/.zshenv`):
```
ASPNETCORE_ENVIRONMENT=Development
Historian__Host=$HISTORIAN_GRPC_HOST Historian__Port=32565 Historian__GrpcUseTls=true
Historian__UserName=$HISTORIAN_USER Historian__Password=$HISTORIAN_PASSWORD
Historian__AllowUntrustedServerCertificate=true
Galaxy__ConnectionString=$GALAXY_SQL_CONNECTION
RuntimeDb__Enabled=true RuntimeDb__EventReadsEnabled=true
RuntimeDb__ConnectionString="Server=$HISTORIAN_GRPC_HOST;Database=Runtime;User Id=$HISTORIAN_SQL_USER;Password=$HISTORIAN_SQL_PASSWORD;TrustServerCertificate=true;Encrypt=false"
ApiKeys__Mode=Disabled
# dotnet run the Server → gRPC h2c on localhost:5221, HTTP on :5220 (/healthz, /health/ready)
```
OtOpcUa live tests then read `HISTGW_GATEWAY_ENDPOINT=http://localhost:5221` +
`HISTGW_GATEWAY_APIKEY=<any>` + `HISTGW_TEST_TAG`/`HISTGW_WRITE_SANDBOX_TAG`/`HISTGW_ALARM_SOURCE`.
Direct SQL: `Runtime.dbo.Events` is an **INSQL linked-server view that rejects untimed queries**, so
always include an `EventTimeUtc` range. Connect with
`sqlcmd -S $HISTORIAN_GRPC_HOST -d Runtime -U $HISTORIAN_SQL_USER -C` (password via `SQLCMDPASSWORD`).
---
## Priority 1 — Gateway-side bugs that block OtOpcUa write/read use cases
**Owning repo: `~/Desktop/HistorianGateway` (HistorianGateway).** OtOpcUa code is correct for both;
these are gateway defects that gate the "write OtOpcUa's own data, read it back" use case.
### FU-1 — `SendEvent` does not populate `Source_Object` (alarm write-back-by-source)
**Symptom (live-proven):** OtOpcUa's `GatewayAlarmHistorianWriter.SendEvent` of an event with
`source_name="HistGW.LiveTest.AlarmSource"` **acks** and **lands in `Runtime.dbo.Events`** with the
correct `Type` (`LimitAlarm`) and `EventTimeUtc` (no shift) — but with **`Source_Object = NULL`** (and
all other `Source_*`/`Provider_*` columns null). The gateway's `SqlEventReader` filters
`WHERE Source_Object = @source`, so a source-filtered `ReadEvents` of a just-sent event returns 0.
**What works (so this is narrow, not "C2 won't-fix"):**
- Time-only `ReadEvents` (no source filter) returns events (50 in a 2-day window during validation).
- Source-filtered `ReadEvents` for a **real Galaxy event source** (`TableAlarms_006`) returns its
history (`System.Deploy`/`Undeploy`/`Alarm.Set`, each with `source_name` populated). So the SQL
reader + source filter are functional; only **ad-hoc SendEvents lack a `Source_Object`.**
- **Reading existing Galaxy alarm/event history by source already works** (the mxaccessgw read use
case). Only round-tripping OtOpcUa's *own* sends by source is blocked.
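
The narrowness of the failure can be illustrated with a toy table. The following is a minimal Python sketch that uses an in-memory SQLite table as a stand-in for `Runtime.dbo.Events`; the table layout, the sample rows, and the `read_events` helper are assumptions for illustration, not the gateway's actual `SqlEventReader` code. It shows why an equality filter on `Source_Object` never matches a row whose source is NULL, while a time-only read still returns it.

```python
import sqlite3

# Toy stand-in for Runtime.dbo.Events (assumed columns; the real view has more).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Events (EventTimeUtc TEXT, Type TEXT, Source_Object TEXT)")
conn.executemany(
    "INSERT INTO Events VALUES (?, ?, ?)",
    [
        # Galaxy-raised event: source populated.
        ("2026-06-26T10:00:00Z", "Alarm.Set", "TableAlarms_006"),
        # Ad-hoc SendEvent: correct Type and time, but a NULL source.
        ("2026-06-26T11:00:00Z", "LimitAlarm", None),
    ],
)

# Always bound the query in time, mirroring the EventTimeUtc-range requirement.
WINDOW = ("2026-06-25T00:00:00Z", "2026-06-27T00:00:00Z")


def read_events(source=None):
    """Time-bounded read, optionally filtered by source (hypothetical helper)."""
    sql = (
        "SELECT Type, Source_Object FROM Events "
        "WHERE EventTimeUtc >= ? AND EventTimeUtc < ?"
    )
    params = list(WINDOW)
    if source is not None:
        # NULL never satisfies an equality comparison, so source-less rows drop out.
        sql += " AND Source_Object = ?"
        params.append(source)
    return conn.execute(sql + " ORDER BY EventTimeUtc", params).fetchall()


time_only = read_events()                             # both rows
galaxy = read_events("TableAlarms_006")               # the Galaxy row only
ad_hoc = read_events("HistGW.LiveTest.AlarmSource")   # nothing: source was never stored
```

The three calls correspond to the three live observations: the time-only read succeeds, the Galaxy source read succeeds, and the read-back of the ad-hoc send is empty.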
**Investigation (gateway repo):**
- Read the v8 event-send path: `RegisterCmEventTag` + the `ConnectionType=Event` send (CM_EVENT). Find
where the event's source/tag is set on the wire payload and whether the historian maps any send-side
field → the `Events.Source_Object` column. Start at the gateway `SendEvent` service + the vendored
`AVEVA.Historian.Client` event session (`HistorianEventSession`), and the
`event-session-reuse-spike` notes in `../histsdk/docs/reverse-engineering/`.
- Determine whether the historian's CM_EVENT API even *allows* setting a `Source_Object` for an event
not raised by a Galaxy object. If the source must be a registered event-tag/source name, decide how
OtOpcUa's `EquipmentPath` should map to it.
**Proposed fix (one of):**
1. If the send payload has a source/tag field that maps to `Source_Object`: populate it from the event's
`source_name` in the gateway `SendEvent` handler. (Preferred — makes write-back-by-source work.)
2. If the historian cannot carry a source for ad-hoc events: document it, and have the gateway's
`SqlEventReader` optionally match the source in a fallback column the send *does* populate (if any),
or expose a "read all events in window, filter client-side" mode. Update OtOpcUa's
`GatewayHistorianDataSource.ReadEventsAsync` defensive client-side source filter accordingly (it
currently drops events whose mapped `SourceName` ≠ requested source — which would also drop
source-less sends even if the server returned them).
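
The interaction between option 2 and the client-side filter can be sketched as follows. This is a hedged Python illustration only: the `Event` type, `filter_by_source`, and the `keep_sourceless` flag are hypothetical names, and the real filter lives in the C# `GatewayHistorianDataSource.ReadEventsAsync`. It shows that a strict source match discards source-less sends even when the server returns them, and what one possible relaxation looks like.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Event:
    event_type: str
    source_name: Optional[str]  # None models a row whose Source_Object is NULL


def filter_by_source(
    events: List[Event], requested: str, keep_sourceless: bool = False
) -> List[Event]:
    """Hypothetical client-side source filter.

    keep_sourceless=False mirrors the behaviour described above: anything whose
    source differs from the requested one is dropped, including source-less sends.
    keep_sourceless=True is one possible relaxation under option 2.
    """
    kept = []
    for ev in events:
        if ev.source_name == requested:
            kept.append(ev)
        elif ev.source_name is None and keep_sourceless:
            kept.append(ev)
    return kept


events = [
    Event("Alarm.Set", "TableAlarms_006"),
    Event("LimitAlarm", None),  # an ad-hoc send that landed without a source
]
strict = filter_by_source(events, "HistGW.LiveTest.AlarmSource")
relaxed = filter_by_source(events, "HistGW.LiveTest.AlarmSource", keep_sourceless=True)
```

The relaxed mode has an obvious cost: every source-less event in the window is attributed to whichever source was requested, which is one reason option 1 is preferred.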
**Acceptance:** an OtOpcUa `SendEvent(source=X)` is readable back via `ReadEvents(source=X)` within the
window. Then **un-skip** `Alarm_SendEvent_then_ReadEvents` in
`tests/Drivers/.../Live/GatewayLiveIntegrationTests.cs` (it currently `Assert.Skip`s on a 0-result with
the accurate reason).
### FU-2 — `WriteLiveValues` shifts an explicit timestamp by the local↔UTC offset (~+4h)
**Symptom (live-proven, reproduces via raw `grpcurl` — no OtOpcUa code involved):** a `WriteLiveValues`
with an **explicit** `timestamp=2026-06-27T03:45:00Z` lands in the historian at
`2026-06-27T07:45:00Z` (+4h = the deployment's local↔UTC delta). A **server-stamped** write (null
timestamp) lands correctly at the gateway's UTC now. The OtOpcUa value-writer sends correct UTC
(`Timestamp.FromDateTime(SpecifyKind(ts, Utc))`), so the shift is in the gateway's SQL write path.
**Impact:** the continuous-historization recorder writes the driver's **source** timestamp (explicit),
so historized values would carry timestamps offset by the host's UTC offset until fixed. (The OtOpcUa
live write test currently uses a ±12h tz-tolerant readback window so that it can still validate
*persistence* despite the shift; see the FU-2 acceptance below.)
**Investigation (gateway repo):** `SqlLiveValueWriter` (the `aaAnalogTagInsert` + `INSERT INTO History`
path). Inspect which `History` DateTime column is written (local vs `*UTC`) and the conversion applied
to the incoming proto UTC `Timestamp`. The +4h (value lands *later* than supplied UTC) is consistent
with writing a UTC value into a **local** column that `ReadRaw` then converts local→UTC, on a server
whose offset is 4h (EDT). Compare against the **server-stamped** path (which is correct) to see what
conversion the explicit path is missing.
**Proposed fix:** convert the supplied UTC timestamp to the historian server's local time before the
`History` insert (or write the UTC-typed column), so an explicit UTC timestamp round-trips unchanged.
Add a gateway unit/live test: write explicit `T`, read back, assert the sample timestamp == `T`.
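
The arithmetic behind the hypothesis and the proposed fix can be checked with a short Python sketch. It assumes a fixed UTC-4 server offset (EDT, as noted above) and models the `History` column as a naive wall-clock value that the read path treats as server-local; the function names are hypothetical and this is not the gateway's `SqlLiveValueWriter` code.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the historian server sits at a fixed UTC-4 offset (EDT).
SERVER_LOCAL = timezone(timedelta(hours=-4))


def read_back(stored_naive: datetime) -> datetime:
    """Model a read that treats the stored value as server-local and converts to UTC."""
    return stored_naive.replace(tzinfo=SERVER_LOCAL).astimezone(timezone.utc)


def store_suspected_bug(ts_utc: datetime) -> datetime:
    """Suspected current behaviour: the UTC wall-clock is written unchanged into a local column."""
    return ts_utc.replace(tzinfo=None)


def store_fixed(ts_utc: datetime) -> datetime:
    """Proposed fix: convert to server-local wall-clock before the insert."""
    return ts_utc.astimezone(SERVER_LOCAL).replace(tzinfo=None)


supplied = datetime(2026, 6, 27, 3, 45, tzinfo=timezone.utc)
shifted = read_back(store_suspected_bug(supplied))   # 2026-06-27T07:45:00Z, i.e. +4h
round_tripped = read_back(store_fixed(supplied))     # equals the supplied timestamp
```

Under these assumptions the unconverted write reproduces the observed 03:45Z to 07:45Z shift exactly, and the converted write round-trips unchanged, which is the assertion the gateway test should make.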
**Acceptance:** an explicit-timestamp `WriteLiveValues` reads back at the supplied UTC time. Then
**tighten** the OtOpcUa live write test (`Write_then_read_on_sandbox_tag`) back to a narrow recent
window anchored on the write time.
---
## Priority 2 — OtOpcUa-side follow-ups
**Owning repo: `~/Desktop/OtOpcUa` (this repo).**
### FU-3 — Continuous-historization `HistorianTagname` override edge case
The `ContinuousHistorizationRecorder` registers `DependencyMuxActor` interest **by the resolved
historian name** (`HistorianTagname` override else `FullName`) — the same key the EnsureTags hook and
the writer use. The mux fans `DependencyValueChanged` **keyed by `FullReference`** (the driver's
published ref). In the **common case (no override)** historian-name == `FullReference`, so it's fully
consistent and works (live-validated path is the value writer; mux fan-out is the recorder's input).
**When a `HistorianTagname` override is set** (override ≠ `FullReference`), the recorder registers
interest under a key the mux never fans → that tag's values are never captured.
**Fix options:** register mux interest by `FullReference` (the mux key) while writing to the historian
under the resolved historian name — i.e. carry both identifiers through `IHistorizedTagSubscriptionSink`
/ the recorder (a `(muxRef, historianName)` pair) instead of a single string. Add a recorder test with
a divergent override. **Low urgency** (overrides are uncommon); only matters for non-Galaxy historized
tags that set an explicit `HistorianTagname`.
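
The key mismatch and the pair-based fix can be sketched in a few lines of Python. Everything here is a toy model under stated assumptions: `Mux`, `register_single_key`, `register_pair`, and the tag names are hypothetical stand-ins for `DependencyMuxActor`, the recorder, and real tag references, not the actual actor code.

```python
from typing import Callable, Dict, List, Optional, Tuple


class Mux:
    """Toy stand-in for the dependency mux: fans values keyed by FullReference."""

    def __init__(self) -> None:
        self._interest: Dict[str, List[Callable[[str, float], None]]] = {}

    def register(self, key: str, callback: Callable[[str, float], None]) -> None:
        self._interest.setdefault(key, []).append(callback)

    def publish(self, full_reference: str, value: float) -> None:
        # Only callbacks registered under the published FullReference are invoked.
        for cb in self._interest.get(full_reference, []):
            cb(full_reference, value)


def resolve_historian_name(full_reference: str, override: Optional[str]) -> str:
    """HistorianTagname override if set, else the FullReference."""
    return override if override else full_reference


written: List[Tuple[str, float]] = []  # (historian name, value) pairs handed to the writer


def register_single_key(mux: Mux, full_reference: str, override: Optional[str] = None) -> None:
    """Current shape: one string serves as both the mux key and the write name."""
    name = resolve_historian_name(full_reference, override)
    mux.register(name, lambda _ref, v: written.append((name, v)))


def register_pair(mux: Mux, full_reference: str, override: Optional[str] = None) -> None:
    """Sketched fix: subscribe by the mux key, write under the historian name."""
    name = resolve_historian_name(full_reference, override)
    mux.register(full_reference, lambda _ref, v: written.append((name, v)))


mux = Mux()
register_single_key(mux, "Line1.Temp", override="HIST_LINE1_TEMP")
mux.publish("Line1.Temp", 21.5)
captured_single = list(written)   # empty: interest was registered under the override

written.clear()
mux = Mux()
register_pair(mux, "Line1.Temp", override="HIST_LINE1_TEMP")
mux.publish("Line1.Temp", 21.5)
captured_pair = list(written)     # the value arrives and is written under the override name
```

With no override both registration styles behave identically, which matches the common case described above; the difference only appears when the two identifiers diverge, and that is the case the new recorder test should cover.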
### FU-4 — `AlarmHistorianOptions.Validate()` `MaxAttempts<=0` test coverage (minor)
T19 pruned the Wonderware-shaped fields and reworked `AlarmHistorianRegistrationTests`. The
`MaxAttempts <= 0` warning branch in `AlarmHistorianOptions.Validate()` is exercised in prod but not
covered by a test (the sibling warnings for `DrainIntervalSeconds`/`Capacity`/`DeadLetterRetentionDays`
are). Add a `Validate_warns_on_non_positive_max_attempts` case. Trivial.
### FU-5 — Pre-existing `Host.IntegrationTests` failure (NOT ours — track separately)
`EquipmentNamespaceMaterializationTests.Deploying_an_equipment_namespace_carries_the_signal_into_the_artifact`
fails (`Rejected` vs expected `Accepted`) on a **Modbus-only** namespace via `DraftValidator`/
`ConfigComposer` — untouched by this branch. **Verified failing identically on `master`** (via
`git stash`). Environment/pre-existing; out of scope for the historian work but worth a separate ticket.
---
## Priority 3 — Cross-repo propagation (after merges)
- **FU-6 — scadaproj index + agent memory.** When PR #423 merges (and the Plan 1 client PR), update
`../scadaproj/CLAUDE.md` (the HistorianGateway + OtOpcUa entries) and the agent memory notes
(`otopcua-historian-backend`, `scadaproj-umbrella`) to record: OtOpcUa now consumes
`ZB.MOM.WW.HistorianGateway.Client` as its historian backend; the Wonderware historian driver was
retired; the two gateway follow-ups (FU-1/FU-2) are tracked here. This follows the CLAUDE.md
cross-repo propagation rule.
---
## Already resolved this effort (for the record — do NOT redo)
- **Alarm SendEvent event-id bug** — `AlarmEventMapper` set the wire `Id` → gateway handler throws →
every alarm send `PermanentFail`. **Fixed** (`44644ddc`): leave `Id` unset, carry the id as an
`AlarmId` property. Live-validated (send acks).
- **Continuous-historization ref-feed gap** — recorder spawned with an empty ref set. **Closed**
(`2982cc4b`): `IHistorizedTagSubscriptionSink` + recorder `UpdateHistorizedRefs(added, removed)`
converges mux interest on each `AddressSpaceApplier.Apply()`.
- **Read path / use case 1** — live-validated PASS (ReadRaw through `GatewayHistorianDataSource`).
- **C2 mis-attribution** — the alarm readback-0 was NOT the "C2 server-gated event reads" limitation;
the SQL reader works (see FU-1).