lmxopcua

Author	SHA1	Message	Date
Joseph Doherty	907005d2d2	docs(claude): note local docker-dev rig has login disabled — run live /run verification directly, don't wait for sign-in	2026-06-15 11:50:55 -04:00
Joseph Doherty	c6a543d1b6	docs(vtags): note rename-respawn transient + write-side-only historize (integration review)	2026-06-15 10:50:08 -04:00
Joseph Doherty	aaa5d8b851	docs(vtags): document runtime Historize honoring + infra-gated durable sink (Phase 1 H5)	2026-06-15 10:43:29 -04:00
Joseph Doherty	4501f12669	feat(vtags): wire IHistoryWriter through DriverHostActor (Null default; durable sink infra-gated) (H5d, stillpending §1)	2026-06-15 10:38:49 -04:00
Joseph Doherty	2f30c54dc1	test(vtags): thread-safe CapturingHistoryWriter + drop redundant wait (H5c review follow-up)	2026-06-15 10:33:14 -04:00
Joseph Doherty	0c6d4c5491	feat(vtags): forward historized vtag results to IHistoryWriter (H5c, stillpending §1)	2026-06-15 10:26:25 -04:00
Joseph Doherty	83d3b9f7be	test(vtags): planner detects Historize-only toggle as a change + doc nit (H5a review follow-up)	2026-06-15 10:21:31 -04:00
Joseph Doherty	9c5a091395	feat(vtags): decode VirtualTag Historize from artifact, byte-parity with composer (H5b, stillpending §1)	2026-06-15 10:17:08 -04:00
Joseph Doherty	fc8121cbf3	feat(vtags): carry VirtualTag.Historize onto EquipmentVirtualTagPlan (H5a, stillpending §1)	2026-06-15 10:17:05 -04:00
Joseph Doherty	ebf2f1dd7a	fix(vtags): prune _planByVtag on child termination + crash-then-change test (H1b review follow-up)	2026-06-15 10:12:11 -04:00
Joseph Doherty	ada01e1af8	fix(vtags): respawn equipment virtualtag child on in-place plan change (H1b, stillpending §1)	2026-06-15 10:05:29 -04:00
Joseph Doherty	1dc713693a	fix(deploy): count removed equipment tags/vtags in RemovedNodes (H1a review follow-up)	2026-06-15 10:01:37 -04:00
Joseph Doherty	1e95856b00	fix(deploy): rebuild address space on changed-only deploys (H1a, stillpending §1)	2026-06-15 09:57:40 -04:00
Joseph Doherty	50a2fdf32d	chore(plans): mark confirmed-shipped .tasks.json completed so audits don't re-flag (stillpending §7)	2026-06-15 09:52:51 -04:00
Joseph Doherty	a9d267c91a	docs(security,core): correct stale write-outcome doc + note benign DraftSnapshot/LeaderChanged residue (stillpending §9/§3)	2026-06-15 09:48:14 -04:00
Joseph Doherty	b4af9e7f37	docs(comments): correct 7 stale 'later task/milestone' comments (stillpending §9)	2026-06-15 09:47:08 -04:00
Joseph Doherty	68a0f759f0	docs(plans): Phase 0+1 implementation plan for the still-pending backlog 12 tasks (0 branch; 1-3 Phase 0 hygiene; 4-5 H1 changed-only-deploy fix; 6-9 H5 vtag Historize threading + IHistoryWriter seam; 10 docs; 11 verify). Conservative rebuild-on-change; no EF migration (Historize column + artifact already carry it); durable AVEVA sink flagged infra-gated.	2026-06-15 09:40:03 -04:00
Joseph Doherty	f64be52796	docs(plans): phased completion design for the still-pending backlog Roadmap for closing stillpending.md §1-§5 + §7/§9 cleanup in 9 phases (0 hygiene -> 1 silent-deploy bugs H1/H5 -> 2 ServiceLevel H3 -> 3 OPC UA standards H4/H2-bit/H6 -> 4 driver coverage -> 5 probes -> 6 AdminUI -> 7 Client.UI -> 8 per-cluster scoping). Conservative rebuild-on-change for H1; plan-and-execute phase-by-phase; no EF migration; defer-list flagged (Denied/Simulated/Language/InlayHints/HistoryUpdate-service/Galaxy-gateway-write).	2026-06-15 09:27:06 -04:00
Joseph Doherty	151b7165af	docs(abcip,focas): document RetireAsync one-tick overlap residual + guard Dispose Code-review follow-ups on the poll-loop collapse: (1) RetireAsync is fire-and-forget and does NOT guarantee zero overlap — the retired loop runs until its in-flight read+tick finish and it observes cancellation, so a device transition landing in that one-tick window can fire once on both loops (at most ONE duplicate raise/clear per reconnect, transient + self-correcting; upstream Part 9 conditions dedupe on ConditionId). Documented in both RetireAsync XML docs so it isn't mistaken for a zero-overlap guarantee. (2) wrap Cts.Dispose() so the fire-and-forget task has no theoretical unobserved-exception path.	2026-06-15 06:14:44 -04:00
Joseph Doherty	6ba59f9d4d	fix(abcip,focas): collapse alarm projection to a single poll loop (no reconnect leak) The owning DriverInstanceActor re-subscribes alarms on every Connected entry (DetachAlarmSource nulls its cached handle on Connected->Reconnecting without calling UnsubscribeAlarmsAsync), and the driver object + its alarm projection are reused across every in-place reconnect. Each SubscribeAsync started a fresh, never-cancelled Task.Run poll loop and added it to _subs, so N reconnects leaked N concurrent loops all polling the device and all firing the same raise/clear transitions => duplicate alarm events + CPU/mem growth. Mirrors the Galaxy #399 fix (Clear-before-Add) but for live poll loops the collapse must also CANCEL the superseded loops, not just drop references. SubscribeAsync now snapshots existing subs under _subsLock, clears _subs, adds the new sub, starts its loop, then retires each stale sub out-of-band (RetireAsync: Cancel + await loop + Dispose CTS, fire-and-forget so the new subscription's return isn't blocked on a poll interval). Snapshot+clear under the same lock DisposeAsync uses guarantees no double-own / double-dispose. There is exactly one consumer per driver instance (factory-per-actor), so retiring all prior subscriptions before starting the new one is faithful. Regression tests (TDD, fail->pass): subscribe twice then drive one device raise; assert OnAlarmEvent fires exactly once (was twice with two leaked loops).	2026-06-15 06:09:38 -04:00
Joseph Doherty	43b3769a1d	docs(plans): add write-outcome self-correction implementation plan The plan + task list for the write-outcome self-correction work (B1, already shipped via master `1d797c1c`). Its design-doc counterpart is already committed; this adds the matching plan artifacts, consistent with the other docs/plans/.	2026-06-15 05:57:15 -04:00
Joseph Doherty	5a70cd7910	chore(deps): bump vendored MxGateway.Client/Contracts 0.1.0 -> 0.1.1 User-published 0.1.1 of the MxGateway client + contracts packages into the local-mxgw vendored source (nuget-packages/). Bumps Directory.Packages.props to match and adds the 0.1.1 .nupkg artifacts alongside the existing 0.1.0 ones. Full solution builds clean against 0.1.1.	2026-06-15 05:57:09 -04:00
Joseph Doherty	013882262a	fix(galaxy): bound alarm-subscription handles to one (no reconnect leak) GalaxyDriver's StreamAlarms feed is session-less and survives an in-place reconnect, so DriverInstanceActor re-subscribed on every Connected re-entry (after dropping its own cached handle without an Unsubscribe — sync teardown). The re-subscribe was additive: _alarmSubscriptions.Add grew the list by one untracked handle per reconnect cycle — a slow unbounded leak. Functionally harmless (the gate is Count>0 and OnAlarmFeedTransition only reads [0], firing once regardless), but it accumulated forever. Fix: SubscribeAlarmsAsync clears the set before adding, collapsing to a single live handle (under the existing _alarmHandlersLock, atomic w.r.t. the fan-out reader). There is exactly one consumer per driver instance (factory-per-actor lifecycle), so replacing the set with the latest handle is faithful. Chosen over making the actor's sync DetachAlarmSource call UnsubscribeAlarmsAsync async/fire-and-forget — disproportionate for a minor leak. Regression test Re_subscribe_collapses_to_a_single_handle_no_accumulation (TDD-verified: FAILS without the Clear — releasing the latest handle leaves the feed open because stale handles remain; PASSES with the fix). Galaxy tests 263 pass / 3 skip; Runtime native-alarm 24 pass. Code-reviewed (approved).	2026-06-15 05:49:07 -04:00
Joseph Doherty	c9643f68ba	fix(runtime): restart driver no longer throws 'actor name is not unique' HandleRestartDriver stopped + respawned the child within one synchronous message handler, reusing the base actor name drv-<id>. Context.Stop is async (the child processes its own stop on its own mailbox), so the old child was ALWAYS still registered when the respawn ran — Context.ActorOf threw InvalidActorNameException deterministically on every AdminUI Restart press, crashing + restarting the host. Fix: a monotonic _childSpawnGeneration counter (single-threaded actor) feeds a -g<gen> suffix on every spawned child name, so a respawn can never collide with the still-terminating predecessor. Children are tracked by the _children dict (by IActorRef), never by actor path, so the suffix is invisible to callers. This also closes the same-shaped latent race in the reconcile path (a removed-then-readded instance, and a driver-type-change ToStop+ToSpawn in one plan). Regression test RestartDriver_respawns_the_child_without_an_actor_name_collision (verified: FAILS on the old code with the exact InvalidActorNameException, PASSES with the fix). Runtime.Tests 238/238 green. Code-reviewed (approved).	2026-06-15 05:41:18 -04:00
Joseph Doherty	aa1e21f53c	docs(historian): clarify modified-value history is infra-gated (no backend modified-read path)	2026-06-15 05:15:50 -04:00
Joseph Doherty	bea0b482d4	fix(historian): address code review on Raw HistoryRead paging C1 (critical): a boundary tie cluster larger than NumValuesPerNode could silently truncate a resumed read to GoodNoData, permanently dropping the un-emitted ties — the (timestamp, skip) cursor cannot advance past a single timestamp the fixed-(start,end,cap) backend keeps re-returning. Now detected and failed LOUDLY per node with BadHistoryOperationUnsupported + a log naming the tag/timestamp/cap; documented in Historian.md with the larger-cap remedy. Regression test Raw_tie_cluster_larger_than_page_fails_loudly_not_silently. I3: build HistoryData before Save() so a projection failure can never orphan a stored continuation cursor. N1 (YAGNI): drop the never-produced HistoryReadKind enum + Processed-only Aggregate/IntervalTicks fields from HistoryContinuationState — only Raw pages. N3: ComputeResumeCursor guards its documented non-empty precondition. I1: document InMemoryHistoryContinuationStore's eventual-consistency (test double). Build clean, 182/182 OpcUaServer tests pass.	2026-06-15 05:15:07 -04:00
Joseph Doherty	94c3ca60fc	feat(historian): server-side continuation-point paging for HistoryRead-Raw The Wonderware historian backend is single-shot — it returns up to NumValuesPerNode samples with a null continuation point — so paging is synthesised server-side, time-based, for the only count-capped arm (Raw): - A full page (count == NumValuesPerNode, NumValuesPerNode > 0) emits an opaque 16-byte continuation point and stores a resume cursor; a short page (or NumValuesPerNode == 0 "all values") emits none. - A resume read takes the stored cursor, reads the next page from the boundary forward, and emits a fresh CP only if that page is also full. - The resume cursor is tie-safe (HistoryPaging.ComputeResumeCursor / TrimBoundaryDuplicates): the next page resumes from the boundary timestamp INCLUSIVE and drops the head ties already returned, so samples sharing the boundary SourceTimestamp are neither duplicated nor skipped. Continuation points are bound to the OPC UA session via the SDK's ISession.SaveHistoryContinuationPoint / RestoreHistoryContinuationPoint store (SessionHistoryContinuationStore) — capped by ServerConfiguration.MaxHistoryContinuationPoints (default 100, oldest-evicted) and disposed on session close. releaseContinuationPoints is honoured via an override of HistoryReleaseContinuationPoints (the base dispatcher routes release-only reads there, never to the per-details arms). An unknown / evicted / released point resumes to BadContinuationPointInvalid. Processed and AtTime stay single-shot: neither details type carries a client count cap, so the single-shot backend returns the complete result in one read and there is no "full page" signal to page on (spec-conformant). Modified-value history remains out of scope. The pure paging decisions + CP store contract are unit-tested via HistoryPaging + InMemoryHistoryContinuationStore; the full multi-page round trip is driven end-to-end through the node manager with an in-memory store + a series-backed fake historian (the in-process harness is session-less).	2026-06-15 03:02:48 -04:00
Joseph Doherty	a5c0c82661	fix(opcua): address code review on write-outcome surfacing - A.1 (false-rejection safety): restrict the structural fail-fast's confident-mismatch check to the CLOSED set of built-in types ResolveBuiltInDataType emits (numeric families + Boolean/String/DateTime/ByteString). Any other expected type (Enumeration, Guid, …) now defers to the SDK, so a coercible write (Int32→Enumeration) is never false-rejected. + A7/A8 regression tests. - C.1: guard BuildWriteFailureAuditEvent (under Lock) in try/catch like ReportAuditEvent, so a SetChildValue surprise is swallowed+logged, never thrown out of the fire-and-forget continuation.	2026-06-15 02:45:51 -04:00
Joseph Doherty	bb59fd4e75	feat(opcua): surface failed inbound writes to clients (fail-fast, Bad blip, audit event) Three deferred 'surface the failed write' enhancements on the write-outcome self-correction path in OtOpcUaNodeManager: - Item A: synchronous structural fail-fast. EvaluateEquipmentWriteStructure (pure static) rejects a structurally-invalid write INLINE (Bad sync) after the authz gate but before the optimistic dispatch, so the SDK never applies it. Null payload -> BadTypeMismatch; plus a confidence-gated cheap built-in type compatibility check (numeric widening + BaseDataType wildcard tolerated; uncertain cases defer to the SDK's own coercion). - Item B: Bad-quality blip on device-write failure. On a revert, RevertOptimisticWriteIfNeeded first publishes the still-applied optimistic value with StatusCode BadDeviceFailure, then restores the prior value/status (both under the existing Lock). Documents the queue-coalescing caveat (a slow subscriber may see only the restored value -> the audit event is the reliable signal). - Item C: Part 8 AuditWriteUpdateEvent on device-write failure. Builds an AuditWriteUpdateEventState (SourceNode=node, AttributeId=Value, OldValue=prior, NewValue=attempted, ClientUserId from the threaded identity, Message carries outcome.Reason) under Lock and reports it via Server.ReportEvent OUTSIDE Lock. Guarded so auditing-disabled / report failure never breaks the revert. Threads the writing identity's user-id + node into the continuation. Adds 6 unit tests for EvaluateEquipmentWriteStructure. Build clean (0 warnings); 158/158 OpcUaServer.Tests green.	2026-06-15 02:38:57 -04:00
Joseph Doherty	dcb0be650e	feat(galaxy): debug-log native alarm feed delivery (subs + fanout) Native-alarm delivery through OnAlarmFeedTransition was a black box — there was no way to answer 'is the gateway feed delivering / is a subscription un-gating it', which is partly why the missing-SubscribeAlarmsAsync wiring shipped undetected. Add a single per-transition Debug line (kind, ref, live subscription count, fanout flag). Debug so a flapping galaxy doesn't flood prod, but available on demand.	2026-06-15 01:30:15 -04:00
Joseph Doherty	a833d1b4aa	fix(alarms): address code review — accurate reconnect comment + SubscribeAlarms drop handlers - Correct the misleading DetachAlarmSource comment: a session-less feed (Galaxy) is NOT torn down on an in-place reconnect, so re-subscribe is additive (harmless; gate reads [0]). - Add trace-only SubscribeAlarms drop handlers in Connecting/Reconnecting (symmetry with NativeAlarmRaised) so a self-tell overtaken by a queued disconnect doesn't dead-letter. - Document the deliberate no-unsubscribe-on-empty asymmetry vs the value path. Behavior-neutral for the un-gate path. Minor handle-accumulation leak tracked as follow-up.	2026-06-15 00:49:19 -04:00
Joseph Doherty	7f313df7a6	fix(alarms): subscribe native alarms to un-gate the IAlarmSource feed Phase B native alarms never fired end-to-end: GalaxyDriver suppresses OnAlarmEvent until an alarm subscription exists (_alarmSubscriptions.Count > 0), but the runtime only attached the OnAlarmEvent handler and never called SubscribeAlarmsAsync — so the central feed stayed gated and no transition reached the Part 9 condition / /alerts. Unit tests passed because they inject through the IAlarmSource seam directly; the deferred live /run surfaced it. DriverHostActor computes per-driver alarm refs (alarm-bearing tags' FullNames) and hands them via SetDesiredSubscriptions; DriverInstanceActor calls SubscribeAlarmsAsync for IAlarmSource drivers on Connected entry and whenever alarm refs are pushed while Connected (the deploy path), idempotent via a cached handle reset on detach so reconnect re-subscribes.	2026-06-15 00:42:43 -04:00
Joseph Doherty	063d004fda	docs(security): fix data-plane alarm-ack GroupToRole value (AlarmAck, not AlarmAcknowledge) The gate reads the literal role string OpcUaDataPlaneRoles.AlarmAck = "AlarmAck" (OtOpcUaNodeManager.cs:643), but the Role-grant-source section told operators to map their alarm-ack group to "AlarmAcknowledge" (the PermissionFlags ACL bit, a different vocabulary) — which silently never satisfies the ack gate. Fix the three role-string occurrences + add a code-true note; generalize the scripted-alarm note to native alarms.	2026-06-15 00:42:32 -04:00
Joseph Doherty	d19deb9b42	test(galaxy): readback via explicit TCS + skip unused buffered-interval RPC (review) Code-review refinement of the live-gw read-back helper: complete a TaskCompletionSource<double?> from the pump instead of a captured local (explicit cross-task visibility), pass bufferedUpdateIntervalMs:0 (Advise snapshot needs no SetBufferedUpdateInterval), and document the Advise->OnDataChange filter. Live re-verified 2/2.	2026-06-14 23:32:00 -04:00
Joseph Doherty	622bfda27d	test(galaxy): live-gw reopen + supervisory-write-persist integration smokes Skip-gated (MXGW_ENDPOINT + GALAXY_MXGW_API_KEY) like GatewayGalaxyAlarmFeedLiveTests. Covers the two seams the unit suite can only fake in isolation: - reopen: RecreateAsync + InvalidateHandleCaches re-establish write handles - write: no-login supervisory write commits + persists (fresh-session read-back) Live-verified 2/2 against 10.100.0.48:5120 (2026-06-14); skips cleanly in CI.	2026-06-14 23:24:50 -04:00
Joseph Doherty	fa2388eabf	test(alarms): assert reconnect-dropped native alarm does not dead-letter; tighten severity doc Add AllDeadLetters probe to Native_alarm_during_reconnect_is_dropped_not_forwarded so the test genuinely guards the Reconnecting state's Receive<NativeAlarmRaised> drop handler — removing that handler would now cause a dead-letter and fail the assertion (false-negative gap closed). Reword the ScriptedAlarms.md severity-mapping note: "snaps on the first transition" → "every transition maps … overriding the authored seed from the first transition onward", clarifying that MapSeverity runs on every event, not just the first.	2026-06-14 22:56:18 -04:00
Joseph Doherty	c03361de1b	test(drivers): extract shared stub-driver harness (de-dup)	2026-06-14 22:49:26 -04:00
Joseph Doherty	d8129e5ab7	test(alarms): fix reconnect-drop test to use mailbox-ordering approach	2026-06-14 22:48:36 -04:00
Joseph Doherty	49d98cba31	docs(alarms): note native-alarm severity-bucket snapping	2026-06-14 22:41:57 -04:00
Joseph Doherty	51cda2c744	test(alarms): assert OperatorComment flows through ForwardNativeAlarm	2026-06-14 22:41:43 -04:00
Joseph Doherty	c8db5767ea	test(alarms): guard native-alarm-during-reconnect is dropped not dead-lettered	2026-06-14 22:41:08 -04:00
Joseph Doherty	a1d333869e	docs(plans): residual-followups-cleanup plan (4 offline items; reconcile stale residuals)	2026-06-14 22:38:42 -04:00
Joseph Doherty	cd20c3c064	docs: refresh pending.md for compaction (Phase C shipped; open-items digest)	2026-06-14 22:23:58 -04:00
Joseph Doherty	c24abc8a97	docs(historian): mark Phase C tasks 1-6 complete (T7 live deferred)	2026-06-14 21:19:28 -04:00
Joseph Doherty	7eb8f4d054	docs(historian): Phase C HistoryRead guide + CLAUDE.md pointer	2026-06-14 20:27:17 -04:00
Joseph Doherty	83e1318425	refactor(historian): align ServerHistorianOptions with AlarmHistorian (Port default, Validate list, log context)	2026-06-14 20:23:22 -04:00
Joseph Doherty	a6f1f4ef15	feat(historian): AddServerHistorian DI + Host wiring of IHistorianDataSource	2026-06-14 20:17:10 -04:00
Joseph Doherty	e6ec0ad8be	fix(historian): events arm sets results on bad paths + Variant.Null SourceName + test hardening - HistoryReadEvents miss path + catch path now both set results[handle.Index] explicitly (new SdkHistoryReadResult { StatusCode = BadHistoryOperationUnsupported }) — don't rely on base pre-seeding results[i] so every path sets BOTH errors and results coherently (#1) - ProjectEventField: SourceName null now emits Variant.Null instead of a String-typed null variant (evt.SourceName is null ? Variant.Null : new Variant(evt.SourceName)) (#3) - Comment near the HistoryRead dispatcher block updated: all four arms (Raw/Processed/AtTime + Events/Task 4) are now overridden — "left to the base" wording was stale (#5) - Happy-path test adds ReceiveTime to select clauses and asserts it projects ReceivedTimeUtc as a DateTime Variant at the correct select-order position (#4) - Backend-throw test hardened: asserts errors[0] via ServiceResult.IsBad + explicit code, asserts results[0] is non-null with the Bad code (no longer relies on base seeding), and asserts EventsEntered to prove the override reached the bridge before the throw (#1) - RecordingHistorianDataSource gains EventsEntered flag (set before ThrowOnRead check) (#1) - Events_non_source_node test gains clarifying doc comment explaining the SDK base rejects variable nodes (EventNotifier=None) for event reads before our override runs; the override's source-guard is exercised by the promoted-without-source test instead (#2)	2026-06-14 20:10:16 -04:00
Joseph Doherty	e3c0ef7b41	feat(historian): HistoryReadEvents over equipment-folder notifiers + event-field projection	2026-06-14 19:56:38 -04:00
Joseph Doherty	059f18bdad	test(historian): multi-node HistoryRead isolation + single-lookup ServeNode + comment fix Fix A: add Raw_multi_node_per_node_error_isolation test — two historized variables (eqA/good→A.PV, eqB/bad→B.PV) in one Raw batch; per-tagname fake throws for B.PV, returns a sample for A.PV; asserts errors[0]=Good+sample, errors[1]=Bad, HistoryData[1]=null (no cross-slot leak), no exception escapes. Fix B: collapse double ConcurrentDictionary lookup in ServeNode — TryGetHistorizedTagname now captures `out var tagname` on the guard; the resolved tagname is threaded into the read callback as a second parameter (Func<IHistorianDataSource, string, Task<HistorianRead>>), removing the redundant ResolveTagname helper (deleted) and the tiny race window between the check and the second lookup. All three call-sites (Raw/Processed/AtTime) updated. Fix C: rewrite the IsReadModified comment at NodeManagerHistoryReadTests.cs:102 — the SDK's ReadRawModifiedDetails.Initialize() sets m_isReadModified=true (generated ctor body in Opc.Ua.DataTypes.cs), so the default IS true; the test must explicitly clear it to false for a plain raw read. Previous comment said the same thing but imprecisely; now cites the SDK mechanism (Initialize() call in the public ctor).	2026-06-14 19:44:56 -04:00

1695 Commits