From 9fca3d9c05283a3bdcb8b8811549d551f2662904 Mon Sep 17 00:00:00 2001
From: Joseph Doherty
Date: Sat, 27 Jun 2026 00:04:21 -0400
Subject: [PATCH] docs(historian-gateway): follow-up & deferred-items plan (gateway SendEvent source + tz, recorder override, propagation)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Consolidates everything deferred or surfaced during live validation, with
owning repo per item: P1 gateway bugs (FU-1 SendEvent doesn't populate
Source_Object → alarm write-back-by-source; FU-2 WriteLiveValues +4h
explicit-timestamp shift), P2 OtOpcUa items (FU-3 HistorianTagname-override
recorder edge; FU-4 MaxAttempts test; FU-5 pre-existing Modbus
Host.IntegrationTests failure), P3 cross-repo propagation. Includes the
live-validation reproduction recipe + the dbo.Events INSQL-view caveat.

Claude-Session: https://claude.ai/code/session_012SDSQ3AcaXqPcBtDESBRii
---
 .../2026-06-27-otopcua-historian-followups.md | 155 ++++++++++++++++++
 1 file changed, 155 insertions(+)
 create mode 100644 docs/plans/2026-06-27-otopcua-historian-followups.md

diff --git a/docs/plans/2026-06-27-otopcua-historian-followups.md b/docs/plans/2026-06-27-otopcua-historian-followups.md
new file mode 100644
index 00000000..d4084c22
--- /dev/null
+++ b/docs/plans/2026-06-27-otopcua-historian-followups.md
@@ -0,0 +1,155 @@
+# OtOpcUa ↔ HistorianGateway: Follow-up & Deferred Items
+
+**Status:** the 21-task integration (`feat/historian-gateway-backend`, Gitea PR
+[#423](https://gitea.dohertylan.com/dohertj2/lmxopcua/pulls/423)) + the continuous-historization
+ref-feed are complete and **live-validated** against `wonder-sql-vd03`. The offline suite is green;
+the live `Category=LiveIntegration` suite is green (read ✅, write-persist ✅, alarm-send ✅,
+alarm-readback ⏭ skip). This doc tracks everything deliberately deferred or surfaced during
+validation, with the **owning repo** for each.
+
+**Live-validation harness recap (how to reproduce any of the live findings below):** run the
+HistorianGateway locally against the live historian, then point the OtOpcUa live tests (or `grpcurl`)
+at it. The gateway boots from env-var config (secrets from `~/.zshenv`):
+
+```
+ASPNETCORE_ENVIRONMENT=Development
+Historian__Host=$HISTORIAN_GRPC_HOST Historian__Port=32565 Historian__GrpcUseTls=true
+Historian__UserName=$HISTORIAN_USER Historian__Password=$HISTORIAN_PASSWORD
+Historian__AllowUntrustedServerCertificate=true
+Galaxy__ConnectionString=$GALAXY_SQL_CONNECTION
+RuntimeDb__Enabled=true RuntimeDb__EventReadsEnabled=true
+RuntimeDb__ConnectionString="Server=$HISTORIAN_GRPC_HOST;Database=Runtime;User Id=$HISTORIAN_SQL_USER;Password=$HISTORIAN_SQL_PASSWORD;TrustServerCertificate=true;Encrypt=false"
+ApiKeys__Mode=Disabled
+# dotnet run the Server → gRPC h2c on localhost:5221, HTTP on :5220 (/healthz, /health/ready)
+```
+OtOpcUa live tests then read `HISTGW_GATEWAY_ENDPOINT=http://localhost:5221` +
+`HISTGW_GATEWAY_APIKEY=` + `HISTGW_TEST_TAG`/`HISTGW_WRITE_SANDBOX_TAG`/`HISTGW_ALARM_SOURCE`.
+Direct SQL: `Runtime.dbo.Events` is an **INSQL linked-server view that rejects untimed queries**, so
+always include an `EventTimeUtc` range. `sqlcmd -S $HISTORIAN_GRPC_HOST -d Runtime -U $HISTORIAN_SQL_USER -C`
+(password via `SQLCMDPASSWORD`).
+
+---
+
+## Priority 1: Gateway-side bugs that block OtOpcUa write/read use cases
+**Owning repo: `~/Desktop/HistorianGateway` (HistorianGateway).** OtOpcUa code is correct for both;
+these are gateway defects that gate the "write OtOpcUa's own data, read it back" use case.
+
+### FU-1: `SendEvent` does not populate `Source_Object` (alarm write-back-by-source)
+**Symptom (live-proven):** OtOpcUa's `GatewayAlarmHistorianWriter.SendEvent` of an event with
+`source_name="HistGW.LiveTest.AlarmSource"` **acks** and **lands in `Runtime.dbo.Events`** with the
+correct `Type` (`LimitAlarm`) and `EventTimeUtc` (no shift), but with **`Source_Object = NULL`** (and
+all other `Source_*`/`Provider_*` columns null). The gateway's `SqlEventReader` filters
+`WHERE Source_Object = @source`, so a source-filtered `ReadEvents` of a just-sent event returns 0.
+
+**What works (so this is narrow, not "C2 won't-fix"):**
+- Time-only `ReadEvents` (no source filter) returns events (50 in a 2-day window during validation).
+- Source-filtered `ReadEvents` for a **real Galaxy event source** (`TableAlarms_006`) returns its
+  history (`System.Deploy`/`Undeploy`/`Alarm.Set`, each with `source_name` populated). So the SQL
+  reader + source filter are functional; only **ad-hoc SendEvents lack a `Source_Object`.**
+- ⇒ **Reading existing Galaxy alarm/event history by source already works** (the mxaccessgw read use
+  case). Only round-tripping OtOpcUa's *own* sends by source is blocked.
+
+**Investigation (gateway repo):**
+- Read the v8 event-send path: `RegisterCmEventTag` + the `ConnectionType=Event` send (CM_EVENT). Find
+  where the event's source/tag is set on the wire payload and whether the historian maps any send-side
+  field → the `Events.Source_Object` column. Start at the gateway `SendEvent` service + the vendored
+  `AVEVA.Historian.Client` event session (`HistorianEventSession`), and the
+  `event-session-reuse-spike` notes in `../histsdk/docs/reverse-engineering/`.
+- Determine whether the historian's CM_EVENT API even *allows* setting a `Source_Object` for an event
+  not raised by a Galaxy object. If the source must be a registered event-tag/source name, decide how
+  OtOpcUa's `EquipmentPath` should map to it.
+
+**Proposed fix (one of):**
+1. If the send payload has a source/tag field that maps to `Source_Object`: populate it from the event's
+   `source_name` in the gateway `SendEvent` handler. (Preferred: makes write-back-by-source work.)
+2. If the historian cannot carry a source for ad-hoc events: document it, and have the gateway's
+   `SqlEventReader` optionally match the source in a fallback column the send *does* populate (if any),
+   or expose a "read all events in window, filter client-side" mode. Update OtOpcUa's
+   `GatewayHistorianDataSource.ReadEventsAsync` defensive client-side source filter accordingly (it
+   currently drops events whose mapped `SourceName` ≠ requested source, which would also drop
+   source-less sends even if the server returned them).
+
+**Acceptance:** an OtOpcUa `SendEvent(source=X)` is readable back via `ReadEvents(source=X)` within the
+window. Then **un-skip** `Alarm_SendEvent_then_ReadEvents` in
+`tests/Drivers/.../Live/GatewayLiveIntegrationTests.cs` (it currently `Assert.Skip`s on a 0-result with
+the accurate reason).
+
+### FU-2: `WriteLiveValues` shifts an explicit timestamp by the local↔UTC offset (~+4h)
+**Symptom (live-proven, reproduces via raw `grpcurl`, no OtOpcUa code involved):** a `WriteLiveValues`
+with an **explicit** `timestamp=2026-06-27T03:45:00Z` lands in the historian at
+`2026-06-27T07:45:00Z` (+4h = the deployment's local↔UTC delta). A **server-stamped** write (null
+timestamp) lands correctly at the gateway's UTC now. The OtOpcUa value-writer sends correct UTC
+(`Timestamp.FromDateTime(SpecifyKind(ts, Utc))`), so the shift is in the gateway's SQL write path.
+
+**Impact:** the continuous-historization recorder writes the driver's **source** timestamp (explicit),
+so historized values would carry timestamps offset by the host's UTC offset until fixed. (The OtOpcUa
+live write test currently uses a ±12h tz-tolerant readback window to validate *persistence* around
+this; see FU-2 acceptance.)
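
The +4h shift in FU-2 can be reproduced on paper. The sketch below is illustrative Python, not gateway code. It assumes one reading consistent with the symptom (not a confirmed root cause): the supplied UTC wall-clock value is stored as if it were server-local time and converted local→UTC on read, on a server at UTC−4. `SERVER_TZ`, `store_buggy`, `store_fixed`, and `read_back_as_utc` are hypothetical names.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the historian server runs at UTC-4 (EDT), modeled as a fixed offset.
SERVER_TZ = timezone(timedelta(hours=-4))


def read_back_as_utc(stored_wall_clock: datetime) -> datetime:
    """Model a read path that treats the stored value as server-local time."""
    return stored_wall_clock.replace(tzinfo=SERVER_TZ).astimezone(timezone.utc)


def store_buggy(ts_utc: datetime) -> datetime:
    """Hypothesized defect: UTC wall-clock digits written into a local-time column."""
    return ts_utc.replace(tzinfo=None)


def store_fixed(ts_utc: datetime) -> datetime:
    """Candidate fix: convert to server-local time before the insert."""
    return ts_utc.astimezone(SERVER_TZ).replace(tzinfo=None)


supplied = datetime(2026, 6, 27, 3, 45, tzinfo=timezone.utc)
shifted = read_back_as_utc(store_buggy(supplied))
round_tripped = read_back_as_utc(store_fixed(supplied))

print(shifted.isoformat())        # 2026-06-27T07:45:00+00:00
print(round_tripped.isoformat())  # 2026-06-27T03:45:00+00:00
```

Under these assumptions the buggy path reproduces the observed `03:45Z` → `07:45Z` shift, and converting to server-local time before the insert round-trips the supplied value unchanged.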
+
+**Investigation (gateway repo):** `SqlLiveValueWriter` (the `aaAnalogTagInsert` + `INSERT INTO History`
+path). Inspect which `History` DateTime column is written (local vs `*UTC`) and the conversion applied
+to the incoming proto UTC `Timestamp`. The +4h (value lands *later* than supplied UTC) is consistent
+with writing a UTC value into a **local** column that `ReadRaw` then converts local→UTC, on a server
+whose offset is −4h (EDT). Compare against the **server-stamped** path (which is correct) to see what
+conversion the explicit path is missing.
+
+**Proposed fix:** convert the supplied UTC timestamp to the historian server's local time before the
+`History` insert (or write the UTC-typed column), so an explicit UTC timestamp round-trips unchanged.
+Add a gateway unit/live test: write explicit `T`, read back, assert the sample timestamp == `T`.
+
+**Acceptance:** an explicit-timestamp `WriteLiveValues` reads back at the supplied UTC time. Then
+**tighten** the OtOpcUa live write test (`Write_then_read_on_sandbox_tag`) back to a narrow recent
+window anchored on the write time.
+
+---
+
+## Priority 2: OtOpcUa-side follow-ups
+**Owning repo: `~/Desktop/OtOpcUa` (this repo).**
+
+### FU-3: Continuous-historization `HistorianTagname` override edge case
+The `ContinuousHistorizationRecorder` registers `DependencyMuxActor` interest **by the resolved
+historian name** (`HistorianTagname` override else `FullName`), the same key the EnsureTags hook and
+the writer use. The mux fans `DependencyValueChanged` **keyed by `FullReference`** (the driver's
+published ref). In the **common case (no override)** historian-name == `FullReference`, so it's fully
+consistent and works (live-validated path is the value writer; mux fan-out is the recorder's input).
+**When a `HistorianTagname` override is set** (override ≠ `FullReference`), the recorder registers
+interest under a key the mux never fans → that tag's values are never captured.
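
The FU-3 key mismatch can be shown with a toy model. This is an illustrative Python sketch, not the actor code: `ToyMux`, `capture`, the tag names, and the `register_by_mux_ref` switch are hypothetical, and it assumes the un-overridden historian name equals `FullReference`.

```python
from collections import defaultdict


class ToyMux:
    """Toy stand-in for the mux: fans values keyed by the driver's FullReference."""

    def __init__(self):
        self._interest = defaultdict(list)

    def register(self, key, callback):
        self._interest[key].append(callback)

    def publish(self, full_reference, value):
        # Only callbacks registered under the published reference are invoked.
        for callback in self._interest.get(full_reference, []):
            callback(value)


def capture(tags, register_by_mux_ref):
    """Return (historian_name, value) pairs that a recorder would write."""
    mux = ToyMux()
    written = []
    for full_reference, override in tags.items():
        historian_name = override or full_reference
        key = full_reference if register_by_mux_ref else historian_name
        mux.register(key, lambda value, name=historian_name: written.append((name, value)))
    for full_reference in tags:
        mux.publish(full_reference, 1.0)
    return written


# Hypothetical tags: one without an override, one with a divergent override.
tags = {"Line1.Pump.Speed": None, "Line1.Tank.Level": "HIST_TANK_LEVEL"}

by_historian_name = capture(tags, register_by_mux_ref=False)
by_mux_ref = capture(tags, register_by_mux_ref=True)

print(by_historian_name)  # [('Line1.Pump.Speed', 1.0)]
print(by_mux_ref)         # [('Line1.Pump.Speed', 1.0), ('HIST_TANK_LEVEL', 1.0)]
```

Registering under the historian name loses the overridden tag, while registering under the mux key and carrying the historian name alongside it captures both.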
+**Fix options:** register mux interest by `FullReference` (the mux key) while writing to the historian
+under the resolved historian name, i.e. carry both identifiers through `IHistorizedTagSubscriptionSink`
+/ the recorder (a `(muxRef, historianName)` pair) instead of a single string. Add a recorder test with
+a divergent override. **Low urgency** (overrides are uncommon); only matters for non-Galaxy historized
+tags that set an explicit `HistorianTagname`.
+
+### FU-4: `AlarmHistorianOptions.Validate()` `MaxAttempts<=0` test coverage (minor)
+T19 pruned the Wonderware-shaped fields and reworked `AlarmHistorianRegistrationTests`. The
+`MaxAttempts <= 0` warning branch in `AlarmHistorianOptions.Validate()` is exercised in prod but not
+covered by a test (the sibling warnings for `DrainIntervalSeconds`/`Capacity`/`DeadLetterRetentionDays`
+are). Add a `Validate_warns_on_non_positive_max_attempts` case. Trivial.
+
+### FU-5: Pre-existing `Host.IntegrationTests` failure (NOT ours; track separately)
+`EquipmentNamespaceMaterializationTests.Deploying_an_equipment_namespace_carries_the_signal_into_the_artifact`
+fails (`Rejected` vs expected `Accepted`) on a **Modbus-only** namespace via `DraftValidator`/
+`ConfigComposer`, untouched by this branch. **Verified failing identically on `master`** (via
+`git stash`). Environment/pre-existing; out of scope for the historian work but worth a separate ticket.
+
+---
+
+## Priority 3: Cross-repo propagation (after merges)
+- **FU-6: scadaproj index + agent memory.** When PR #423 merges (and the Plan 1 client PR), update
+  `../scadaproj/CLAUDE.md` (the HistorianGateway + OtOpcUa entries) and the agent memory notes
+  (`otopcua-historian-backend`, `scadaproj-umbrella`) to record: OtOpcUa now consumes
+  `ZB.MOM.WW.HistorianGateway.Client` as its historian backend; the Wonderware historian driver was
+  retired; the two gateway follow-ups (FU-1/FU-2). Per the CLAUDE.md cross-repo propagation rule.
+
+---
+
+## Already resolved this effort (for the record; do NOT redo)
+- **Alarm SendEvent event-id bug:** `AlarmEventMapper` set the wire `Id` → gateway handler throws →
+  every alarm send `PermanentFail`. **Fixed** (`44644ddc`): leave `Id` unset, carry the id as an
+  `AlarmId` property. Live-validated (send acks).
+- **Continuous-historization ref-feed gap:** recorder spawned with an empty ref set. **Closed**
+  (`2982cc4b`): `IHistorizedTagSubscriptionSink` + recorder `UpdateHistorizedRefs(added, removed)`
+  converges mux interest on each `AddressSpaceApplier.Apply()`.
+- **Read path / use case 1:** live-validated PASS (ReadRaw through `GatewayHistorianDataSource`).
+- **C2 mis-attribution:** the alarm readback-0 was NOT the "C2 server-gated event reads" limitation;
+  the SQL reader works (see FU-1).