feat: HistorianGateway as the OtOpcUa historian backend (read/write/alarms + continuous historization); retire Wonderware #423

Merged
dohertj2 merged 40 commits from feat/historian-gateway-backend into master 2026-06-27 11:09:04 -04:00
Owner

Summary

Makes HistorianGateway the sole historian read/write backend for OtOpcUa, replacing the
bespoke Wonderware TCP/ArchestrA sidecar. New ZB.MOM.WW.OtOpcUa.Driver.Historian.Gateway
driver consumes the published ZB.MOM.WW.HistorianGateway.Client 0.1.0 gRPC package (Gitea feed)
behind a thin IHistorianGatewayClient seam, so every adapter is unit-tested against a fake.

What changed

  • Read cutoverAddServerHistorian now builds a GatewayHistorianDataSource
    (IHistorianDataSource): ReadRaw / ReadProcessed (15 aggregate modes) / ReadAtTime / ReadEvents
    • health snapshot. Matrix-guarded mappers translate OPC UA shapes ↔ gateway proto.
  • Alarm-write cutoverGatewayAlarmHistorianWriter (IAlarmHistorianWriter) behind the
    existing SqliteStoreAndForwardSink; gRPC status → Ack / RetryPlease / PermanentFail.
  • Continuous historization — a crash-safe FasterLog durable outbox + an Akka
    ContinuousHistorizationRecorder actor (append-before-ack, numeric-only gate, backoff drain)
    draining to the gateway's WriteLiveValues, wired into DI + hosted lifecycle with
    observable-gauge meters.
  • Auto-provisioning — a non-blocking IHistorianProvisioning EnsureTags hook fires from
    AddressSpaceApplier.Apply() (off the publish thread) for historized tags.
  • ServerHistorianOptions reshaped to the gateway form (Endpoint / ApiKey / UseTls /
    AllowUntrustedServerCertificate / CaCertificatePath / CallTimeout) + a new
    ContinuousHistorization config section.
  • Wonderware retired — the 5 historian projects and their (vestigial, factory-less)
    AdminUI driver surface removed; gateway is the sole backend.

Gateway-side prerequisites

The target gateway must run with RuntimeDb:Enabled=true (WriteLiveValues) and
RuntimeDb:EventReadsEnabled=true (alarm-history reads); the API key must carry scopes
historian:read, historian:write, historian:tags:write. Supply ServerHistorian__ApiKey
via env — never commit a key.

Two gates before merge (documented in CLAUDE.md)

  1. Live validation (required). The env-gated Category=LiveIntegration round-trips
    (read / write-then-read / alarm SendEvent→ReadEvents) must pass against the live gateway →
    wonder-sql-vd03 on VPN. They skip cleanly offline. Do not merge until green.
  2. Continuous-historization ref-feed (follow-on). The recorder is fully wired (actor + outbox
    • writer + meters) but is currently spawned with an empty historized-ref set — it captures
      nothing until a SetHistorizedRefs-style feed from the address-space deploy flow lands.
      Reads and alarms work today; the recorder's value-capture is the remaining piece.

Test plan

  • dotnet build ZB.MOM.WW.OtOpcUa.slnx — 0 errors (touched projects 0 warnings; the ~381 warnings
    are the pre-existing legacy-test-project baseline).
  • Offline suite green: Gateway driver 98 pass / 3 live-skip, Runtime 341, OpcUaServer 324,
    AdminUI 514, Core.AlarmHistorian 29.
  • One pre-existing Host.IntegrationTests failure (Modbus-only namespace deploy) is unrelated —
    verified failing identically on master.
  • Not run here: the live VPN round-trips (gate 1 above).

Built task-by-task with classification-driven review (implement → spec/code review → fix) per task.

https://claude.ai/code/session_012SDSQ3AcaXqPcBtDESBRii

## Summary Makes **HistorianGateway** the sole historian read/write backend for OtOpcUa, replacing the bespoke Wonderware TCP/ArchestrA sidecar. New `ZB.MOM.WW.OtOpcUa.Driver.Historian.Gateway` driver consumes the published `ZB.MOM.WW.HistorianGateway.Client` 0.1.0 gRPC package (Gitea feed) behind a thin `IHistorianGatewayClient` seam, so every adapter is unit-tested against a fake. ### What changed - **Read cutover** — `AddServerHistorian` now builds a `GatewayHistorianDataSource` (`IHistorianDataSource`): ReadRaw / ReadProcessed (15 aggregate modes) / ReadAtTime / ReadEvents + health snapshot. Matrix-guarded mappers translate OPC UA shapes ↔ gateway proto. - **Alarm-write cutover** — `GatewayAlarmHistorianWriter` (`IAlarmHistorianWriter`) behind the existing `SqliteStoreAndForwardSink`; gRPC status → Ack / RetryPlease / PermanentFail. - **Continuous historization** — a crash-safe **FasterLog** durable outbox + an Akka `ContinuousHistorizationRecorder` actor (append-before-ack, numeric-only gate, backoff drain) draining to the gateway's `WriteLiveValues`, wired into DI + hosted lifecycle with observable-gauge meters. - **Auto-provisioning** — a non-blocking `IHistorianProvisioning` EnsureTags hook fires from `AddressSpaceApplier.Apply()` (off the publish thread) for historized tags. - **`ServerHistorianOptions` reshaped** to the gateway form (Endpoint / ApiKey / UseTls / AllowUntrustedServerCertificate / CaCertificatePath / CallTimeout) + a new `ContinuousHistorization` config section. - **Wonderware retired** — the 5 historian projects *and* their (vestigial, factory-less) AdminUI driver surface removed; gateway is the sole backend. ### Gateway-side prerequisites The target gateway must run with `RuntimeDb:Enabled=true` (WriteLiveValues) and `RuntimeDb:EventReadsEnabled=true` (alarm-history reads); the API key must carry scopes `historian:read`, `historian:write`, `historian:tags:write`. Supply `ServerHistorian__ApiKey` via env — never commit a key. ## Two gates before merge (documented in CLAUDE.md) 1. **Live validation (required).** The env-gated `Category=LiveIntegration` round-trips (read / write-then-read / alarm SendEvent→ReadEvents) must pass against the live gateway → `wonder-sql-vd03` on VPN. They skip cleanly offline. Do not merge until green. 2. **Continuous-historization ref-feed (follow-on).** The recorder is fully wired (actor + outbox + writer + meters) but is currently spawned with an **empty** historized-ref set — it captures nothing until a `SetHistorizedRefs`-style feed from the address-space deploy flow lands. **Reads and alarms work today**; the recorder's value-capture is the remaining piece. ## Test plan - `dotnet build ZB.MOM.WW.OtOpcUa.slnx` — 0 errors (touched projects 0 warnings; the ~381 warnings are the pre-existing legacy-test-project baseline). - Offline suite green: Gateway driver **98** pass / 3 live-skip, Runtime **341**, OpcUaServer **324**, AdminUI **514**, Core.AlarmHistorian **29**. - One pre-existing `Host.IntegrationTests` failure (Modbus-only namespace deploy) is unrelated — verified failing identically on `master`. - **Not run here:** the live VPN round-trips (gate 1 above). Built task-by-task with classification-driven review (implement → spec/code review → fix) per task. https://claude.ai/code/session_012SDSQ3AcaXqPcBtDESBRii
dohertj2 added 30 commits 2026-06-26 23:01:31 -04:00
Claude-Session: https://claude.ai/code/session_012SDSQ3AcaXqPcBtDESBRii
Claude-Session: https://claude.ai/code/session_012SDSQ3AcaXqPcBtDESBRii
Claude-Session: https://claude.ai/code/session_012SDSQ3AcaXqPcBtDESBRii
Claude-Session: https://claude.ai/code/session_012SDSQ3AcaXqPcBtDESBRii
Claude-Session: https://claude.ai/code/session_012SDSQ3AcaXqPcBtDESBRii
Claude-Session: https://claude.ai/code/session_012SDSQ3AcaXqPcBtDESBRii
Claude-Session: https://claude.ai/code/session_012SDSQ3AcaXqPcBtDESBRii
Addresses Task 1 code-review: document that ReadEventsAsync.maxEvents is enforced
client-side (no server cap in the wire contract); add Platforms=AnyCPU;x64 to match
sibling drivers; use ValueTask.CompletedTask in FakeHistorianGatewayClient.

Claude-Session: https://claude.ai/code/session_012SDSQ3AcaXqPcBtDESBRii
Claude-Session: https://claude.ai/code/session_012SDSQ3AcaXqPcBtDESBRii
Claude-Session: https://claude.ai/code/session_012SDSQ3AcaXqPcBtDESBRii
Claude-Session: https://claude.ai/code/session_012SDSQ3AcaXqPcBtDESBRii
Claude-Session: https://claude.ai/code/session_012SDSQ3AcaXqPcBtDESBRii
Addresses T7/T8/T11 code-review minors: route the sync dispose through DisposeAsync
so a double Dispose()+DisposeAsync() stays a no-op; cover the sync path.

Claude-Session: https://claude.ai/code/session_012SDSQ3AcaXqPcBtDESBRii
Claude-Session: https://claude.ai/code/session_012SDSQ3AcaXqPcBtDESBRii
Addresses Task 9 review: add the enabled+nonpositive MaxTieClusterOverfetch warning
test; update the AddServerHistorian XML doc to describe the gateway-backed data source
(the alarm-path Wonderware doc stays until T13).

Claude-Session: https://claude.ai/code/session_012SDSQ3AcaXqPcBtDESBRii
Claude-Session: https://claude.ai/code/session_012SDSQ3AcaXqPcBtDESBRii
Claude-Session: https://claude.ai/code/session_012SDSQ3AcaXqPcBtDESBRii
Claude-Session: https://claude.ai/code/session_012SDSQ3AcaXqPcBtDESBRii
Claude-Session: https://claude.ai/code/session_012SDSQ3AcaXqPcBtDESBRii
I-1: GatewayAlarmHistorianWriter no longer dead-letters events cancelled
mid-drain at shutdown. WriteBatchAsync short-circuits remaining events to
RetryPlease once cancellation is requested, and SendOneAsync catches
OperationCanceledException (when the token is cancelled) -> RetryPlease,
so in-flight events stay queued instead of being permanently dropped.

I-2: FasterLogHistorizationOutbox.Dispose now guards the awaited periodic
loop with a broad catch (Exception) after the OperationCanceledException
catch, so a non-Faster teardown fault (e.g. ObjectDisposedException) can
never escape Dispose.

M-1: GatewayTagProvisioner skips the empty EnsureTags round-trip when every
request is non-historizable (early return).

M-2: GatewayTagProvisioner handles plain shutdown cancellation quietly
(Debug, not Warning), counting the unsent batch as Failed, never throwing.

M-3/M-4: Added remove-last-entry (TailAddress truncation branch) and
FIFO implicit-ack (RemoveAsync acks up to and including the target)
durability tests, both reopen-and-survive.

M-5: Clarifying comment in RecoverState on the transient over-capacity
rebuild after a crash between append-commit and drop-truncation-commit.

Claude-Session: https://claude.ai/code/session_012SDSQ3AcaXqPcBtDESBRii
Claude-Session: https://claude.ai/code/session_012SDSQ3AcaXqPcBtDESBRii
Claude-Session: https://claude.ai/code/session_012SDSQ3AcaXqPcBtDESBRii
Continuous-historization engine for non-Galaxy driver tags. Registers
interest with the per-node DependencyMuxActor for the historized refs and
taps the VirtualTagActor.DependencyValueChanged values the mux fans:
coerce to numeric -> append to the durable IHistorizationOutbox (crash
boundary) -> off-thread drain writes batches through IHistorianValueWriter
and acks (FIFO-truncates) on success, backing off (exponential, capped) on
failure. Non-numeric values are dropped + metered (SQL analog path is
numeric-only).

- New seam IHistorianValueWriter + HistorizationValue in Core.Abstractions
  so Runtime stays free of the gRPC driver.
- GatewayHistorianValueWriter (driver) adapts IHistorianGatewayClient.
  WriteLiveValues: HistorizationValue -> HistorianLiveValue proto, WriteAck
  Success||Queued -> true; non-throwing (errors -> false for retry).
- Drain runs via PipeTo(Self) so the mailbox never blocks on the gateway
  write; appends awaited on the actor thread to stay serialized.

Adaptation vs plan: the mux fans DependencyValueChanged (TagId/Value/
TimestampUtc, no quality), not DriverInstanceActor.AttributeValuePublished,
so values are recorded Good-quality (192) by the same convention the
scripted-alarm host uses.

Claude-Session: https://claude.ai/code/session_012SDSQ3AcaXqPcBtDESBRii
Addresses T15 review: treat a canceled EnsureTags task like a faulted one so the
fire-and-forget continuation never reaches t.Result (which would re-throw and leave
the discarded task unobserved).

Claude-Session: https://claude.ai/code/session_012SDSQ3AcaXqPcBtDESBRii
I-1: Wrap the OnValueChangedAsync AppendAsync in try/catch so a durable-boundary
failure (e.g. a PerEntry fsync hitting disk-full/I-O error) can no longer propagate
out of the handler and trip Akka supervision into a restart loop. A canceled append
during shutdown returns quietly; any other exception increments a new
_outboxAppendFailures counter, logs a Warning (exception type name only), and drops
the value without recording it or nudging the drain. The counter is surfaced on
RecorderStatus (new OutboxAppendFailures field).

I-2: Strengthen Writer_failure_keeps_entry_for_retry to prove the drain actually ran
— assert the writer was invoked (the fake records even on Succeed=false) AND the
outbox stayed at 1 (RemoveAsync not called), via AwaitAssertAsync.

M-3: Capture Sender before the await in the GetStatus handler, then Tell the reply.

M-4: Add Retry_after_writer_failure_eventually_acks proving the retry -> success ->
ack path; FakeValueWriter gains a FailFirstN option + CallCount (Succeed behaviour
unchanged). Short minBackoff keeps it fast and deterministic (AwaitAssert, no sleep).

M-5: Deregister mux interest on PostStop via DependencyMuxActor.UnregisterInterest,
mirroring VirtualTagActor.PostStop, closing the dead-letter window before Terminated.

Claude-Session: https://claude.ai/code/session_012SDSQ3AcaXqPcBtDESBRii
Bind ContinuousHistorizationOptions (Enabled/OutboxPath/CommitMode/
CommitIntervalMs/DrainBatchSize/DrainIntervalSeconds/Capacity/backoff) with a
warn-only Validate(); gated on Enabled AND the ServerHistorian gateway being
configured, the Host registers the durable FasterLogHistorizationOutbox (container
-disposed) + a gateway-backed GatewayHistorianValueWriter, and binds outbox
depth/dropped observable gauges on the central scraped meter. WithOtOpcUaRuntimeActors
spawns the recorder (over the same dependency-mux ref) when the options + writer +
outbox resolve, registering ContinuousHistorizationRecorderKey. Spawned with an EMPTY
historized-ref set: the deployed address space builds later, so ref population is a
documented follow-on (a later SetHistorizedRefs feed) — T18 wires the actor + outbox
+ writer + meters; the ref feed is the known remaining gap.

Claude-Session: https://claude.ai/code/session_012SDSQ3AcaXqPcBtDESBRii
Claude-Session: https://claude.ai/code/session_012SDSQ3AcaXqPcBtDESBRii
Addresses T18 review: GatewayHistorianValueWriter is a DI singleton holding a gRPC
channel — make it IAsyncDisposable so the container closes the channel gracefully at
shutdown. Tighten the blank-OutboxPath warning to state startup will fail.

Claude-Session: https://claude.ai/code/session_012SDSQ3AcaXqPcBtDESBRii
The HistorianGateway driver is now the sole historian read/write+alarm backend, so the
Wonderware sidecar projects are dead code. Removes the 5 Wonderware projects (driver,
.Client, .Client.Contracts, + their 2 test projects) from the solution and tree, and fully
retires the vestigial 'Historian.Wonderware' driver type (UI/probe-only; it had no driver
factory): the Host probe registration, the AdminUI driver-config surface (driver page,
tag-config editor/model/validator entry, address picker/builder, driver-type catalog +
dropdown + edit-router entries), and their tests. Prunes the now-unused Wonderware
connection fields (Host/Port/UseTls/ServerCertThumbprint/SharedSecret) from
AlarmHistorianOptions (keeping Enabled + the SQLite store-and-forward knobs) and refreshes
the stale XML docs that named Wonderware as the production backend.

Claude-Session: https://claude.ai/code/session_012SDSQ3AcaXqPcBtDESBRii
docs(historian-gateway): document gateway backend, config keys, EnsureTags hook, known gates; retire Wonderware from docs
v2-ci / build (pull_request) Failing after 38s
v2-ci / unit-tests (tests/Core/ZB.MOM.WW.OtOpcUa.Cluster.Tests) (pull_request) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests) (pull_request) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.Tests) (pull_request) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests) (pull_request) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Security.Tests) (pull_request) Has been skipped
v2-ci / integration (tests/Server/ZB.MOM.WW.OtOpcUa.Host.IntegrationTests) (pull_request) Has been skipped
v2-ci / integration (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.IntegrationTests) (pull_request) Has been skipped
2124f21ab6
HistorianGateway is now the sole historian backend (read + alarm SendEvent +
continuous WriteLiveValues). Document the final state and retire the Wonderware
sidecar from the docs/config/labels:

- CLAUDE.md: rewrite the Historian section — ServerHistorian /
  ContinuousHistorization / AlarmHistorian config keys, the IHistorianProvisioning
  EnsureTags hook, the GatewayAlarmHistorianWriter SendEvent path + ReadEvents
  dependency on gateway RuntimeDb:EventReadsEnabled=true, gateway-side
  prerequisites (RuntimeDb flags + historian:read/write/tags:write scopes),
  migration note, and two KNOWN-LIMITATION callouts (live-validation gate +
  empty historized-ref-set recorder follow-on).
- appsettings.json: fix the stale ServerHistorian block (Host/Port/SharedSecret/
  ServerCertThumbprint -> Endpoint/ApiKey/UseTls/AllowUntrustedServerCertificate/
  CaCertificatePath/CallTimeout, keep MaxTieClusterOverfetch); add a disabled
  ContinuousHistorization block; prune the orphaned Wonderware keys from
  AlarmHistorian (keep the SQLite knobs). ApiKey env-supplied via
  ServerHistorian__ApiKey (commented; valid strict JSON via _comment keys).
- README.md + docs (Historian.md, AlarmHistorian.md, Configuration.md,
  ServiceHosting.md, DriverLifecycle.md, drivers/README.md, Uns.md, VirtualTags.md,
  AlarmTracking.md, Client.UI.md, README.md, TestConnectProbes.md): retire the
  Wonderware historian backend from current-backend descriptions; fix the stale
  ServerHistorian/AlarmHistorian config tables (now gateway shape); convert
  drivers/Historian.Wonderware.md to a retired stub pointing at the gateway.
- Source/UI labels (descriptive text only, no behavior change):
  OtOpcUaServerHostedService.cs, HistoryPaging.cs, OtOpcUaSdkServer.cs,
  HistorianAdapterActor.cs, VirtualTagModal.razor, ScriptedAlarmModal.razor,
  AlarmsHistorian.razor now name the HistorianGateway backend.

Build clean (0 errors); AdminUI.Tests green (514 passed).

Claude-Session: https://claude.ai/code/session_012SDSQ3AcaXqPcBtDESBRii
dohertj2 added 1 commit 2026-06-26 23:22:46 -04:00
feat(historian-gateway): feed historized refs to the recorder on deploy (close continuous-historization ref-feed gap)
v2-ci / build (pull_request) Failing after 39s
v2-ci / unit-tests (tests/Core/ZB.MOM.WW.OtOpcUa.Cluster.Tests) (pull_request) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests) (pull_request) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.Tests) (pull_request) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests) (pull_request) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Security.Tests) (pull_request) Has been skipped
v2-ci / integration (tests/Server/ZB.MOM.WW.OtOpcUa.Host.IntegrationTests) (pull_request) Has been skipped
v2-ci / integration (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.IntegrationTests) (pull_request) Has been skipped
2982cc4bb5
The ContinuousHistorizationRecorder was spawned with an EMPTY historized-ref
set, so it registered interest in nothing and historized nothing. This feeds it
the currently-historized tag refs on every address-space deploy/redeploy so its
DependencyMuxActor interest converges to exactly the historized set (the same
refs the EnsureTags provisioning hook resolves: override-or-FullName).

Design — delta convergence (the plan is a pure DIFF):
- New seam IHistorizedTagSubscriptionSink (Core.Abstractions/Historian) with a
  Null no-op singleton, mirroring how IHistorianProvisioning decouples the T15
  hook. AddressSpaceApplier gains a DEFAULTED ctor param (Null sink) so all ~80
  existing call sites + the production site compile unchanged.
- Apply() only ever sees a plan diff (an incremental/surgical apply carries a
  delta, not the full set), so the applier feeds an add/remove DELTA computed
  from AddedEquipmentTags / RemovedEquipmentTags / ChangedEquipmentTags. The
  recorder keeps the full set and re-registers it. The feed is a single
  non-blocking Tell behind the sink, wrapped in try/catch so a faulting feed
  never blocks or breaks a deploy (same discipline as the provisioning hook).
- Recorder.UpdateHistorizedRefs(added, removed) converges the tracked set, then
  — only when it actually changed — sends ONE RegisterInterest with the full set
  (the mux's RegisterInterest is a full-REPLACE) or one UnregisterInterest when
  it drains to empty (the mux has no per-ref unregister). An unchanged delta is
  a no-op (no mux churn).
- DI: the recorder is now spawned BEFORE the applier so the adapter
  (ActorHistorizedTagSubscriptionSink) can wrap its IActorRef; the Null sink is
  used when continuous historization is off/unwired.

Tests: recorder convergence (add-from-empty, add+remove converge, idempotent,
drain-to-empty unregisters); applier feeds resolved added refs, removed+renamed
deltas, and survives a throwing sink. Build clean (0 warnings on touched
projects); Runtime/OpcUaServer/Gateway/AdminUI suites green.

Claude-Session: https://claude.ai/code/session_012SDSQ3AcaXqPcBtDESBRii
dohertj2 added 1 commit 2026-06-26 23:48:56 -04:00
fix(historian-gateway): alarm SendEvent must not set wire event Id (live-validated)
v2-ci / build (pull_request) Failing after 45s
v2-ci / unit-tests (tests/Core/ZB.MOM.WW.OtOpcUa.Cluster.Tests) (pull_request) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests) (pull_request) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.Tests) (pull_request) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests) (pull_request) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Security.Tests) (pull_request) Has been skipped
v2-ci / integration (tests/Server/ZB.MOM.WW.OtOpcUa.Host.IntegrationTests) (pull_request) Has been skipped
v2-ci / integration (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.IntegrationTests) (pull_request) Has been skipped
44644ddc7f
Live validation against wonder-sql-vd03 caught that the gateway's SendEvent handler
throws when the wire event carries a client-supplied Id — so every alarm send from
OtOpcUa failed (PermanentFail). AlarmEventMapper now leaves HistorianEvent.Id unset
(the historian assigns event identity) and preserves the alarm's id as an 'AlarmId'
property. With this, the live alarm send acks.

Also harden the env-gated live tests against two gateway/historian-side limitations
surfaced during validation (neither an OtOpcUa defect): the write readback uses a
timezone-tolerant window (an explicit-timestamp WriteLiveValues lands offset by the
deployment's local-vs-UTC delta — reproducible via raw grpcurl; OtOpcUa sends correct
UTC), and the alarm ReadEvents readback skips with a clear reason when the historian's
server-gated event reads (C2, won't-fix) return nothing. Read + write-persist +
alarm-send are all live-validated green; the alarm send-ack is split into its own test.

Claude-Session: https://claude.ai/code/session_012SDSQ3AcaXqPcBtDESBRii
dohertj2 added 1 commit 2026-06-26 23:59:59 -04:00
docs(historian-gateway): correct the alarm-readback skip reason (SQL reader works)
v2-ci / build (pull_request) Failing after 44s
v2-ci / unit-tests (tests/Core/ZB.MOM.WW.OtOpcUa.Cluster.Tests) (pull_request) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests) (pull_request) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.Tests) (pull_request) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests) (pull_request) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Security.Tests) (pull_request) Has been skipped
v2-ci / integration (tests/Server/ZB.MOM.WW.OtOpcUa.Host.IntegrationTests) (pull_request) Has been skipped
v2-ci / integration (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.IntegrationTests) (pull_request) Has been skipped
240c967576
Live investigation showed the earlier 'C2 server-gated event reads' attribution was
wrong: the gateway's SQL event reader works (a source-filtered ReadEvents returns a
real Galaxy-sourced event's history; a time-only ReadEvents returns 50 events). The
alarm round-trip's source-filtered readback is empty only because an ad-hoc SendEvent
is recorded in Runtime.dbo.Events WITHOUT a Source_Object — so reading existing Galaxy
alarm/event history by source works, but round-tripping OtOpcUa's own sends by source
needs the gateway's SendEvent to populate the event source. Skip message corrected.

Claude-Session: https://claude.ai/code/session_012SDSQ3AcaXqPcBtDESBRii
dohertj2 added 1 commit 2026-06-27 00:04:24 -04:00
docs(historian-gateway): follow-up & deferred-items plan (gateway SendEvent source + tz, recorder override, propagation)
v2-ci / build (pull_request) Failing after 40s
v2-ci / unit-tests (tests/Core/ZB.MOM.WW.OtOpcUa.Cluster.Tests) (pull_request) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests) (pull_request) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.Tests) (pull_request) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests) (pull_request) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Security.Tests) (pull_request) Has been skipped
v2-ci / integration (tests/Server/ZB.MOM.WW.OtOpcUa.Host.IntegrationTests) (pull_request) Has been skipped
v2-ci / integration (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.IntegrationTests) (pull_request) Has been skipped
9fca3d9c05
Consolidates everything deferred or surfaced during live validation, with owning repo
per item: P1 gateway bugs (FU-1 SendEvent doesn't populate Source_Object → alarm
write-back-by-source; FU-2 WriteLiveValues +4h explicit-timestamp shift), P2 OtOpcUa
items (FU-3 HistorianTagname-override recorder edge; FU-4 MaxAttempts test; FU-5
pre-existing Modbus Host.IntegrationTests failure), P3 cross-repo propagation. Includes
the live-validation reproduction recipe + the dbo.Events INSQL-view caveat.

Claude-Session: https://claude.ai/code/session_012SDSQ3AcaXqPcBtDESBRii
dohertj2 added 5 commits 2026-06-27 00:59:41 -04:00
The MaxAttempts<=0 warning branch in AlarmHistorianOptions.Validate() was the
only one without a test (the sibling DrainIntervalSeconds/Capacity/
DeadLetterRetentionDays warnings are covered). Add the matching case. Closes FU-4.

Claude-Session: https://claude.ai/code/session_012SDSQ3AcaXqPcBtDESBRii
The continuous-historization recorder conflated two identifiers into one string:
the dependency mux fans DependencyValueChanged keyed by the driver FullName
(the mux ref), but a value must be historized under the resolved historian name
(HistorianTagname override, else FullName). In the common no-override case the
two are equal, so it worked; with an override they diverge and the recorder
registered mux interest under a key the mux never fans — that tag's values were
never captured (and, had they been, would have been written under the mux ref).

Carry BOTH identifiers through the seam: a new HistorizedTagRef(MuxRef,
HistorianName) record on IHistorizedTagSubscriptionSink. The applier resolves
MuxRef = FullName and HistorianName = override-or-FullName. The recorder now
keeps a muxRef->historianName map: it registers/filters mux interest by MuxRef
but writes the outbox entry (and drains) under HistorianName. The convergence
handler re-registers the mux only when the registered key-set changes, so an
override-only rename (same FullName) updates the write target without mux churn.

Tests: a divergent-override recorder test (interest by mux ref, value written
under the override name, never the mux ref) + an override-rename no-churn test;
the applier feed tests now assert the full (mux ref, historian name) pairs.
Runtime 348/0, OpcUaServer 327/0; 0 warnings. Closes FU-3.

Claude-Session: https://claude.ai/code/session_012SDSQ3AcaXqPcBtDESBRii
Record the execution outcomes in the follow-up plan: FU-1 resolved as a documented
protocol limitation (gateway pending.md C4; not fixable without histsdk wire-capture
evidence), FU-2 done + live-validated (exact round-trip), FU-3 done (mux-ref vs
historian-name decoupled via HistorizedTagRef), FU-4 done. FU-5 (pre-existing Modbus
failure) and FU-6 (post-merge propagation) remain tracked.

Claude-Session: https://claude.ai/code/session_012SDSQ3AcaXqPcBtDESBRii
Code review found a residual silent-data-loss path: a single driver ref (mux
ref) can back SEVERAL historized equipment tags via aliasing (identical machines
sharing a register — DriverHostActor._nodeIdByDriverRef is a HashSet), each with
its own HistorianTagname. The muxRef->single-name map collapsed last-wins, so
under alias + divergent overrides only one historian tag got the value and the
rest were silently dropped — the exact failure class FU-3 exists to eliminate.

Model the fan as muxRef -> HashSet<historianName> and append ONE outbox entry per
name in OnValueChangedAsync (a per-name append failure drops only that name and
continues). Convergence removes/adds each (muxRef, name) pair individually from
the per-ref set, dropping the mux key only when its last name is removed — so
removing one alias leaves the shared ref fanning for the others with no mux churn.

Tests: aliased-refs-each-get-the-value (one fan → both historian names written),
removing-one-alias-keeps-the-ref-registered, and the override-rename test now
feeds a value post-rename to prove the write target actually moved to the new
name. Runtime 350/0, OpcUaServer 327/0; 0 warnings.

Claude-Session: https://claude.ai/code/session_012SDSQ3AcaXqPcBtDESBRii
docs(historian-gateway): note FU-3 alias handling (review fix) in follow-up plan
v2-ci / build (pull_request) Failing after 41s
v2-ci / unit-tests (tests/Core/ZB.MOM.WW.OtOpcUa.Cluster.Tests) (pull_request) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests) (pull_request) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.Tests) (pull_request) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests) (pull_request) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Security.Tests) (pull_request) Has been skipped
v2-ci / integration (tests/Server/ZB.MOM.WW.OtOpcUa.Host.IntegrationTests) (pull_request) Has been skipped
v2-ci / integration (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.IntegrationTests) (pull_request) Has been skipped
10a6ac6f3e
Claude-Session: https://claude.ai/code/session_012SDSQ3AcaXqPcBtDESBRii
dohertj2 added 1 commit 2026-06-27 02:00:13 -04:00
docs(historian-gateway): FU-5 tracked via issue #424 (pre-existing, not ours)
v2-ci / build (pull_request) Failing after 36s
v2-ci / unit-tests (tests/Core/ZB.MOM.WW.OtOpcUa.Cluster.Tests) (pull_request) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests) (pull_request) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.Tests) (pull_request) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests) (pull_request) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Security.Tests) (pull_request) Has been skipped
v2-ci / integration (tests/Server/ZB.MOM.WW.OtOpcUa.Host.IntegrationTests) (pull_request) Has been skipped
v2-ci / integration (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.IntegrationTests) (pull_request) Has been skipped
1030d00b3f
Claude-Session: https://claude.ai/code/session_012SDSQ3AcaXqPcBtDESBRii
dohertj2 merged commit 53798bbf40 into master 2026-06-27 11:09:04 -04:00
Sign in to join this conversation.