Commit Graph

74 Commits

Author SHA1 Message Date
Joseph Doherty 95be607a07 feat(opcua): remove SystemPlatform-mirror GalaxyTags contract end-to-end (composer+applier+artifact, byte-parity) 2026-06-12 21:45:19 -04:00
Joseph Doherty 7ce7505a36 feat(historian-host): bind TCP host/port/tls config 2026-06-12 11:19:46 -04:00
Joseph Doherty 57355405a6 chore(security): drop dead audit suppressions; patch OpenTelemetry + Tmds.DBus CVEs
All five suppressed advisories are now resolved at baseline/resolved versions,
so every NuGetAuditSuppress is removed repo-wide:
- System.Security.Cryptography.Xml (GHSA-37gx-xxp4-5rgx / GHSA-w3x6-4m5h-cxqf)
  -> fixed by the .NET 10 baseline (10.0.6)
- OPCFoundation Opc.Ua.Core (GHSA-h958-fxgg-g7w3) -> fixed at resolved 1.5.378.106

Two were still live and are now patched via direct security pins:
- OpenTelemetry.Api 1.9.0 -> 1.15.3 (GHSA-g94r-2vxg-569j) pinned in Cluster;
  Runtime/ControlPlane/AdminUI + tests inherit via project reference
- Tmds.DBus.Protocol 0.20.0 -> 0.21.3 (GHSA-xrw6-gwf8-vvr9) pinned in Client.UI

Also correct the Historian sidecar runtime comments (x86 -> x64, matching the
csproj PlatformTarget). Solution audit: 0 vulnerable packages; full build clean.
2026-06-12 09:03:42 -04:00
Joseph Doherty bc9e83ed9f feat(composer): admit GalaxyMxGateway-backed equipment alias tags (+byte-parity) 2026-06-11 21:10:21 -04:00
Joseph Doherty 7f535c0e9d test(historian): cover non-positive DeadLetterRetentionDays validation warning 2026-06-11 13:24:46 -04:00
Joseph Doherty 56750e110f fix(alarms): historize the real operator for shelve/unshelve/enable/disable transitions 2026-06-11 13:14:00 -04:00
Joseph Doherty 5ea6e9d7d9 fix(historian): validate non-positive drain/capacity/retention knobs (review) + log prefix 2026-06-11 13:09:13 -04:00
Joseph Doherty f215982b93 feat(historian): drain/capacity/retention config knobs + startup config-warning validation 2026-06-11 13:04:16 -04:00
Joseph Doherty 61b230d79a harden(historian): nullable HistorizeToAveva (missing→historize) for rolling-restart-safe deserialize + middle-link test 2026-06-11 13:00:57 -04:00
Joseph Doherty 8012509584 feat(historian): honor per-alarm HistorizeToAveva opt-out at the durable write 2026-06-11 12:48:13 -04:00
Joseph Doherty f64f7ce669 fix(alarms): historize the operator (not 'system') for CommentAdded transitions (review)
v2-ci / build (push) Failing after 50s
v2-ci / unit-tests (tests/Core/ZB.MOM.WW.OtOpcUa.Cluster.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Security.Tests) (push) Has been skipped
v2-ci / integration (tests/Server/ZB.MOM.WW.OtOpcUa.Host.IntegrationTests) (push) Has been skipped
v2-ci / integration (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.IntegrationTests) (push) Has been skipped
2026-06-11 11:42:56 -04:00
Joseph Doherty 943c621371 feat(historian): config-gated SqliteStoreAndForward→Wonderware sink (AddAlarmHistorian) 2026-06-11 11:30:31 -04:00
Joseph Doherty e9355e9514 refactor(historian): gate before translate (no discarded alloc on secondary) + strengthen double-write warning (review) 2026-06-11 11:24:48 -04:00
Joseph Doherty bb42e5834a feat(historian): subscribe to alerts topic + translate to AlarmHistorianEvent (Primary-gated, exactly-once) 2026-06-11 11:18:26 -04:00
Joseph Doherty 8ac3ac5be9 feat(alarms): carry AlarmTypeName + operator Comment on AlarmTransitionEvent (historian feed prep) 2026-06-11 11:03:00 -04:00
Joseph Doherty 4f291ed09c test(redundancy): cover absent-node default-historize for HistorianAdapter (A2) 2026-06-11 09:02:02 -04:00
Joseph Doherty 0742946108 feat(redundancy): gate alarm historization on Primary (A2, defensive — actor currently unfed)
HistorianAdapterActor now subscribes to the redundancy-state DPS topic,
caches the local node's RedundancyRole, and SKIPS the durable-sink enqueue
when the local node is Secondary or Detached. Unknown/null role default-writes
so single-node deploys and the boot window never silently drop historization.
GetStatus stays ungated.

PREMISE: verified the actor is registered but FED BY NOTHING in production —
there is no AlarmHistorianEvent producer and nothing resolves its registry key
to Tell it. This is a FORWARD-LOOKING / DEFENSIVE guard, not a fix for a live
double-write: the moment a per-node feeder lands (engine -> historian, expected
as a per-node cluster broadcast like the alerts topic), only the Primary will
write to the durable sink (exactly-once across all alarm sources).

Mirrors the sibling A1 treatment of ScriptedAlarmHostActor (06c4155) and
OpcUaPublishActor's redundancy-state handler. localNode threaded through
HistorianAdapterActor.Props from ServiceCollectionExtensions (roleInfo.LocalNode).
2026-06-11 08:57:41 -04:00
Joseph Doherty 9ac9f0b7a9 test(redundancy): cover Detached suppression + absent-node default-emit (A1) 2026-06-11 08:50:59 -04:00
Joseph Doherty 06c415598c feat(redundancy): gate scripted-alarm alerts publish on Primary (A1) 2026-06-11 08:44:44 -04:00
Joseph Doherty 1d7e2a0f8b fix(runtime): reject empty AddComment instead of silently swallowing it
Validate AddComment up-front (IsNullOrWhiteSpace guard + Warning log) so
a blank-comment command is cleanly rejected before reaching the engine
rather than faulting inside ApplyAddComment and being silently swallowed
by the outer catch.  Mirrors the existing TimedShelve missing-UnshelveAtUtc
pattern.

Also fix two stale inline comments: the "async void crash" note on
TimedShelve now correctly says "fault escaping async Task → supervision
restart", and the ownership-filter now documents the benign race with a
concurrent LoadAsync clearing the loaded set.

Tests: AlarmCommand_add_comment_empty_text_is_rejected_not_driven (Theory
— empty string + whitespace) and AlarmCommand_add_comment_nonempty_drives_engine
(positive path, asserts CommentAdded transition on alerts topic).
2026-06-11 06:32:53 -04:00
Joseph Doherty 4f7999eac2 feat(alarms): consume alarm-commands topic in ScriptedAlarmHostActor (T19)
Subscribe the host to the cluster alarm-commands DPS topic in PreStart and
drive the matching ScriptedAlarmEngine op per inbound AlarmCommand. An
ownership filter (engine.LoadedAlarmIds) ignores commands for alarms this
node does not own; TimedShelve without UnshelveAtUtc and unknown operations
are logged + rejected (never thrown); op failures are caught + logged so a
faulting op can't fault the actor. Re-projection is left to the engine's
existing OnEvent -> OnEngineEmission path.

Handler is a Task-returning ReceiveAsync (the project's AK2003 analyzer
forbids an async-void Receive delegate), giving ordered awaited async on the
actor thread. Adds 3 TestKit tests: ack drives the engine with mapped args,
unowned command ignored, missing-UnshelveAtUtc TimedShelve rejected not
thrown.
2026-06-11 06:23:08 -04:00
Joseph Doherty 4eb1d65e2b feat(scripted-alarms): richer AlarmConditionState bridge to the OPC UA node (T15) 2026-06-10 19:41:16 -04:00
Joseph Doherty 60d48a2a0a feat(scripted-alarms): materialise real Part 9 AlarmConditionState nodes (T14) 2026-06-10 19:19:10 -04:00
Joseph Doherty a8640a9331 test(scripted-alarms): cover bootstrap-restore path forwarding alarms (T10 review) 2026-06-10 15:24:39 -04:00
Joseph Doherty fc0d43a3dc refactor(scripted-alarms): retire orphaned ScriptedAlarmActor + F9b evaluator (T11) 2026-06-10 15:22:26 -04:00
Joseph Doherty 5256761368 feat(scripted-alarms): spawn + apply ScriptedAlarmHostActor in DriverHostActor (T10) 2026-06-10 15:17:29 -04:00
Joseph Doherty dafaf2faec fix(scripted-alarms): ScriptedAlarmHostActor review fixes — load-gen guard, quiet cancel, parse guard (T9 review) 2026-06-10 15:08:54 -04:00
Joseph Doherty 3b418a54f1 feat(scripted-alarms): ScriptedAlarmHostActor — engine runtime host (T9) 2026-06-10 14:57:42 -04:00
Joseph Doherty 8e8ca9efe8 feat(scripted-alarms): DeploymentArtifact byte-parity for the alarm plan (T6) 2026-06-10 14:41:46 -04:00
Joseph Doherty 1c96fe0be0 feat(scripted-alarms): EfAlarmConditionStateStore (T8) 2026-06-10 14:21:19 -04:00
Joseph Doherty 945ccd0b85 feat(scripted-alarms): DependencyMuxTagUpstreamSource (T7)
Concrete ITagUpstreamSource the scripted-alarm host actor pushes
DependencyValueChanged values into and ScriptedAlarmEngine reads/subscribes
from. Thread-safe: ConcurrentDictionary value cache + per-path ImmutableList
observer lists with atomic add/remove and capture-then-invoke fan-out.
ReadTag of an unknown path returns a Bad-quality (0x80000000) snapshot stamped
via the injected clock. Adds the Core.ScriptedAlarms project reference Runtime
needs to see the interface.
2026-06-10 14:20:02 -04:00
Joseph Doherty 73014258ef feat(scripting): root script logger + DPS publisher wired in Host 2026-06-10 11:50:50 -04:00
Joseph Doherty 66ea9c56f6 feat(runtime): DeploymentArtifact substitutes {{equip}} (parity with composer) 2026-06-10 07:53:20 -04:00
Joseph Doherty d909a8e4f6 docs+test(deploy): clarify driver-less attribution docs + no-line exclusion test (Task 2 review) 2026-06-08 07:02:25 -04:00
Joseph Doherty c688899134 fix(deploy): cluster-attribute driver-less equipment via its UNS line area (BuildClusterSets) 2026-06-08 06:53:41 -04:00
Joseph Doherty 6b36eff2d3 refactor(runtime): capture-first in HandleWriteAsync; assert no handler leak on resubscribe; fix stale comment 2026-06-07 10:31:20 -04:00
Joseph Doherty 98259ab026 fix(runtime): capture Sender before await in DriverInstanceActor subscribe (no-ActorContext race) 2026-06-07 10:26:17 -04:00
Joseph Doherty 4c221ce2b3 merge: equipment-namespace live values (VirtualTag route)
v2-ci / build (push) Failing after 36s
v2-ci / unit-tests (tests/Core/ZB.MOM.WW.OtOpcUa.Cluster.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Security.Tests) (push) Has been skipped
v2-ci / integration (tests/Server/ZB.MOM.WW.OtOpcUa.Host.IntegrationTests) (push) Has been skipped
v2-ci / integration (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.IntegrationTests) (push) Has been skipped
2026-06-07 09:33:21 -04:00
Joseph Doherty 1c579410cd fix(runtime): flag cross-cluster orphan-equipment bindings on rebuild
v2-ci / build (push) Failing after 42s
v2-ci / unit-tests (tests/Core/ZB.MOM.WW.OtOpcUa.Cluster.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Security.Tests) (push) Has been skipped
v2-ci / integration (tests/Server/ZB.MOM.WW.OtOpcUa.Host.IntegrationTests) (push) Has been skipped
v2-ci / integration (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.IntegrationTests) (push) Has been skipped
ParseComposition(blob, nodeId, onInconsistency?) detects a kept equipment whose
UNS line belongs to another cluster (a same-cluster-invariant violation that
would orphan the equipment folder) and reports it via an optional callback,
wired to OpcUaPublishActor's logger. Detection-only; the upstream draft
validator remains the authority. Adds two unit tests.
2026-06-07 08:24:11 -04:00
Joseph Doherty 397f9b783a feat(runtime): spawn+apply VirtualTagHostActor on deploy apply and restore 2026-06-07 05:41:04 -04:00
Joseph Doherty 5e2869bab7 fix(runtime): VirtualTagHost watches children + respawns after unexpected death
Context.Watch each spawned child; OnChildTerminated evicts it from _children so the
next ApplyVirtualTags (still containing that vtagId) falls through the ContainsKey
guard and re-spawns a fresh VirtualTagActor.  Adds a spawn-site Debug log, moves the
TODO about in-place plan mutation to the skip-existing branch where it belongs, and
adds a deterministic TestKit test (Child_is_respawned_after_unexpected_termination)
that kills the first child, drains its UnregisterInterest from the mux probe, re-applies,
and asserts a second distinct RegisterInterest arrives.
2026-06-07 05:34:50 -04:00
Joseph Doherty 85a36cec54 feat(runtime): VirtualTagHostActor spawns VTag actors + bridges results to OPC UA 2026-06-07 05:28:46 -04:00
Joseph Doherty c7661d0510 feat(opcua): parse Equipment VirtualTag plans from the deployment artifact 2026-06-07 05:09:53 -04:00
Joseph Doherty 8ce57e47a3 feat(runtime): OPC UA rebuild materialises only the node's ClusterId slice 2026-06-07 03:23:02 -04:00
Joseph Doherty 1b7f995aea feat(runtime): DriverHost spawns + subscribes only its own ClusterId's drivers 2026-06-07 03:19:22 -04:00
Joseph Doherty 4fca4e1aca feat(runtime): node-scoped ParseComposition filters address space by ClusterId 2026-06-07 03:15:46 -04:00
Joseph Doherty 7b2f64fdb8 refactor(runtime): case-insensitive ClusterId/NodeId match + suppress short-circuit + edge tests (review) 2026-06-07 03:12:09 -04:00
Joseph Doherty 24796f2c12 feat(runtime): ClusterId scope resolution + node-scoped driver-spec parse 2026-06-07 03:05:02 -04:00
Joseph Doherty aaf869145a fix(opcua): equipment-tag planner diff + folder-scoped NodeIds (review findings)
Two bundle-review fixes + idempotency coverage:
- CRITICAL: the planner ignored EquipmentTags, so an incremental deploy changing only
  equipment tags produced an empty plan and HandleRebuild short-circuited before
  materialising them. Add TagId to EquipmentTagPlan + Added/Removed/ChangedEquipmentTags
  to Phase7Plan (diffed by TagId, in IsEmpty, driving Apply's needsRebuild) — mirroring
  the GalaxyTags treatment.
- IMPORTANT: equipment variable NodeId was the raw driver FullName, which collides across
  identical machines (e.g. two PLCs both exposing register 40001) — the second variable
  was silently dropped. NodeId is now folder-scoped (parent/Name); FullName stays on
  EquipmentTagPlan for the later values-routing milestone.
- Task 4: SDK-backed idempotency test (double-apply -> single variable); restart-safety
  confirmed (RestoreApplied reuses the same RebuildAddressSpace -> HandleRebuild path).
- Minor: align composer equipment-tag sort with the artifact decoder (coalesce FolderPath).
2026-06-06 15:02:50 -04:00
Joseph Doherty 08cddfe128 fix(opcua): UNS equipment folders browse by friendly Name, NodeId stays the logical Id
Equipment folder DisplayName was the colloquial MachineCode; the live rebuild (artifact
ReadEquipmentNode) + composer now use the UNS level-5 Name segment, matching Area/Line
folders + EquipmentNodeWalker. NodeId stays the logical EquipmentId so browse-path
resolution + ACLs are unaffected.
2026-06-06 14:51:12 -04:00