Commit Graph

218 Commits

Author SHA1 Message Date
Joseph Doherty
a5c6ce279e docs(v2): finish path corrections in phase-7-status, admin-ui, OpcUaClient fixture 2026-05-26 12:09:47 -04:00
Joseph Doherty
59b3d9f295 docs: rewrite stale src/Server/Server|Admin/ paths to v2 project locations 2026-05-26 12:06:59 -04:00
Joseph Doherty
89095c15e3 docs(v2): update for gap-closeout — peer-URI discovery, role overlays, release status 2026-05-26 11:58:06 -04:00
Joseph Doherty
bdae749b2b docs(plans): mark gap-closeout tasks complete 2026-05-26 11:48:05 -04:00
Joseph Doherty
7209bc99e2 docs(plans): gap-closeout plan + task persistence file 2026-05-26 11:15:59 -04:00
Joseph Doherty
d21f6947e1 feat(opcua): F10b SDK NodeManager binding — real OPC UA address-space writes
Some checks failed
v2-ci / build (push) Failing after 38s
v2-ci / unit-tests (tests/Core/ZB.MOM.WW.OtOpcUa.Cluster.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Security.Tests) (push) Has been skipped
v2-ci / integration (push) Has been skipped
OtOpcUaNodeManager + SdkAddressSpaceSink: the v2 IOpcUaAddressSpaceSink
seam now has a production adapter against a real Opc.Ua.Server
CustomNodeManager2. Writes through OpcUaPublishActor's sink materialise
as real OPC UA Variable updates that subscribed clients see via the
standard ClearChangeMasks notification path.

OtOpcUaNodeManager (CustomNodeManager2):
  - Owns a ConcurrentDictionary<string, BaseDataVariableState> under a
    single namespace (https://zb.com/otopcua/ns) hung off Objects/.
  - WriteValue lazy-creates the variable on first write, sets Value +
    StatusCode (mapped from OpcUaQuality severity bits) + SourceTimestamp,
    then ClearChangeMasks to notify subscribers.
  - WriteAlarmState surfaces a [active, acknowledged] pair on a
    dedicated node id — full AlarmConditionState/event firing comes
    with #85 F14b (EquipmentNodeWalker SDK integration).
  - RebuildAddressSpace tears down every registered variable + clears
    the dictionary so the next write-pass starts fresh.
  - Address-space root folder is materialised in CreateAddressSpace.

SdkAddressSpaceSink: thin IOpcUaAddressSpaceSink → OtOpcUaNodeManager
bridge. Production DI binding (#108) constructs this once the host's
StandardServer has booted.

OtOpcUaSdkServer (StandardServer subclass): overrides
CreateMasterNodeManager to inject OtOpcUaNodeManager via the
MasterNodeManager additionalManagers ctor. NodeManager property
exposes the live instance so OpcUaApplicationHost callers can wrap
it in a sink.

Tests: OpcUaServer 20 -> 24 (+4):
- WriteValue creates + updates variables in the manager
- WriteAlarmState creates a node distinct from value writes
- RebuildAddressSpace clears everything; subsequent writes start fresh
- NullOpcUaAddressSpaceSink no-op sanity

Each test boots a real OpcUaApplicationHost on a free port with the
SDK certificate auto-create flow (F13a) intact — full integration
slice on macOS.

All 6 v2 test suites green: 167 tests passing.

F10 status updated to reflect SDK binding shipped. Residuals:
- #109 OpcUaPublishActor.RebuildAddressSpace → Phase7Applier wiring
- #108 Host DI default to SdkAddressSpaceSink when hasDriver
- #85 F14b EquipmentNodeWalker integration (proper AlarmConditionState
  + folder hierarchy)
- IServiceLevelPublisher SDK binding (writes Server.ServiceLevel node)
2026-05-26 09:49:44 -04:00
Joseph Doherty
7fa863f6da feat(runtime): #113 DependencyMuxActor — drivers → virtual-tag fan-out
Some checks failed
v2-ci / build (push) Failing after 36s
v2-ci / unit-tests (tests/Core/ZB.MOM.WW.OtOpcUa.Cluster.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Security.Tests) (push) Has been skipped
v2-ci / integration (push) Has been skipped
End-to-end data path is now wired on the read side: driver subscriptions
fire AttributeValuePublished → DriverHostActor → DependencyMuxActor →
DependencyValueChanged to every interested VirtualTagActor. Previously
the publish hit a dead-letter at the host.

DependencyMuxActor:
  - Per-node fan-out router. Maintains tagRef → Set<IActorRef> with a
    reverse subscriber → refs index so unregister/replace are O(refs).
  - Watches subscribers; Terminated triggers automatic unregister so
    dead virtual-tag actors stop receiving publishes.
  - Re-register replaces the prior interest set — no stale-ref leaks
    on actor restart.
  - Drops publishes for refs with no interested subscribers.

VirtualTagActor:
  - New Props params: dependencyRefs + mux ActorRef.
  - PreStart sends RegisterInterest to the mux; PostStop sends
    UnregisterInterest. Default both null so older callers stay quiet.

DriverHostActor:
  - New dependencyMux Props param. Steady + Applying states now
    receive AttributeValuePublished from their DriverInstance children
    and forward to the mux. Null mux is a no-op (dev/Mac).

ServiceCollectionExtensions:
  - WithOtOpcUaRuntimeActors spawns DependencyMuxActor before
    DriverHostActor and threads its ActorRef into the host's Props.
    New DependencyMuxActorKey + DependencyMuxActorName.

Tests: Runtime 57 -> 63 (+6):
- Mux forwards to only subscribers interested in each ref
- Publish for unregistered ref is dropped silently
- Unregister stops forwarding
- Re-register replaces prior interest set
- VirtualTagActor PreStart registration drives end-to-end eval
  (uses AwaitAssert to race-safely settle the PreStart Tell)
- DriverHostActor forwards AttributeValuePublished through to mux

All 6 v2 test suites green: 163 tests passing.

F8 (#79) state updated — dep subscribe seam shipped, Core.VirtualTags
production engine binding (compile + ITagUpstreamSource subscribe) is
the residual.
2026-05-26 09:43:06 -04:00
Joseph Doherty
f427dc4f26 feat(runtime): #112 ScriptedAlarmActor state persistence via IAlarmActorStateStore
Some checks failed
v2-ci / build (push) Failing after 42s
v2-ci / unit-tests (tests/Core/ZB.MOM.WW.OtOpcUa.Cluster.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Security.Tests) (push) Has been skipped
v2-ci / integration (push) Has been skipped
ScriptedAlarmActor now survives actor restart: PreStart loads from
the configured store + restores in-memory state; every Transition()
fires a fire-and-forget save. ActiveState still re-derives from the
evaluator on first tick (Phase 7 decision #14), but Acked state +
lastAckUser persist verbatim so operators don't re-ack across an
outage.

Three pieces:
- IAlarmActorStateStore seam in Commons.Engines, with the
  AlarmActorStateSnapshot record (alarmId / state / lastTransitionUtc
  / lastAckUser) and NullAlarmActorStateStore default.
- EfAlarmActorStateStore in Runtime.ScriptedAlarms — production
  adapter over the existing ScriptedAlarmState table in ConfigDb.
  Maps the actor's 3-state enum to the table's AckedState column
  (Active⇒Unacknowledged, Acknowledged⇒Acknowledged, Inactive⇒
  Acknowledged). Concurrency conflicts are logged + dropped — the
  next transition writes again.
- ScriptedAlarmActor PreStart load (async, piped back as
  StateRestored) + Transition save. New Props overload takes the
  store; default is NullAlarmActorStateStore so tests stay quiet.

Tests: Runtime 52 -> 57 (+5):
- Transition writes Active then Acknowledged snapshots with
  lastAckUser populated
- PreStart with persisted Active state restores so a subsequent
  AcknowledgeAlarm fires (not ignored as it would be from Inactive)
- Empty store boots Inactive (AcknowledgeAlarm correctly ignored)
- EfAlarmActorStateStore Save + Load round-trips via in-memory EF
- Load for unknown alarmId returns null

All 6 v2 test suites green: 157 tests passing.

Closes #112. F9 (#80) remaining residual is predicate binding to
Core.ScriptedAlarms.ScriptedAlarmEngine — split as F9b in tasks JSON.
2026-05-26 09:34:37 -04:00
Joseph Doherty
3e3f7588bd feat(runtime,host): close F7 — driver subscribe + write paths + Host DI
Some checks failed
v2-ci / build (push) Failing after 42s
v2-ci / unit-tests (tests/Core/ZB.MOM.WW.OtOpcUa.Cluster.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Security.Tests) (push) Has been skipped
v2-ci / integration (push) Has been skipped
Three pieces landed in one batch, closing F7-residual + Host DI #106:

Runtime/DriverInstanceActor:
  - Subscribe / Unsubscribe message contracts; the Connected state
    handles them via IDriver.ISubscribable. On every OnDataChange
    event the actor publishes AttributeValuePublished to its parent
    (DriverHostActor → OpcUaPublishActor). OPC UA StatusCode is
    mapped to the 3-state OpcUaQuality enum via severity bits
    (00=Good, 01=Uncertain, 10/11=Bad).
  - DetachSubscription tears the handler off the driver on
    DisconnectObserved, Unsubscribe, and PostStop so a stale handler
    never pushes to a dead actor.
  - WriteAttribute now dispatches IWritable.WriteAsync (batch of one)
    with a 5s CancellationTokenSource; status-code propagated to
    WriteAttributeResult on non-Good results.

Host:
  - New ProjectReferences to Core + every cross-platform driver
    assembly (AbCip/AbLegacy/FOCAS/Galaxy/Modbus/S7/TwinCAT).
    Galaxy is net10 (gRPC client to mxaccessgw); the COM-bound net48
    Wonderware Historian driver stays out of the Host's reference
    closure — its .Client gRPC wrapper is what binds for historian
    needs.
  - New DriverFactoryBootstrap.AddOtOpcUaDriverFactories() registers
    a singleton DriverFactoryRegistry, invokes each driver's
    Register(registry, loggerFactory), and binds IDriverFactory to
    DriverFactoryRegistryAdapter. Replaces the F7 NullDriverFactory
    default so deploys actually materialise real IDriver instances
    on driver-role nodes. ShouldStub() still gates per-platform
    behaviour at spawn time.
  - Program.cs wires AddOtOpcUaDriverFactories() before AddAkka so
    the runtime extension can resolve IDriverFactory from DI.

Tests: Runtime 46 -> 52 (+6):
- Write returns success when StatusCode = Good
- Write propagates non-Good status code in failure Reason
- Subscribe forwards OnDataChange to parent as AttributeValuePublished
- Quality translation: Uncertain (0x40...) and Bad (0x80...)
- Subscribe against non-ISubscribable returns failure
- DisconnectObserved detaches handler so late events are dropped

All 6 v2 test suites green: 152 tests passing.

Closes F7. F7-residual sub-tasks #110 (subscribe) and #111 (write)
both shipped. Host DI binding #106 shipped.
2026-05-26 09:28:34 -04:00
Joseph Doherty
c02f016f1d feat(opcua): F14 Phase7Plan + Phase7Applier
Some checks failed
v2-ci / build (push) Failing after 34s
v2-ci / unit-tests (tests/Core/ZB.MOM.WW.OtOpcUa.Cluster.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Security.Tests) (push) Has been skipped
v2-ci / integration (push) Has been skipped
Splits the side-effecting half of Phase7Composer (deferred at Task 47)
into two pieces that mirror DriverHostActor's spawn-plan pattern:

Phase7Plan + Phase7Planner.Compute (pure):
  Diff two Phase7CompositionResult snapshots by stable id (EquipmentId,
  DriverInstanceId, ScriptedAlarmId). Emits Added/Removed/Changed lists
  per entity class. Added/Removed are sorted by id for deterministic
  apply order. Changed wraps both Previous and Current projections so
  consumers can decide between in-place mutation and tear-down +
  rebuild.

Phase7Applier (side-effecting):
  Drives an IOpcUaAddressSpaceSink against a plan. Removed equipment/
  alarms get an inactive AlarmState write per id; Added/Removed of
  Equipment or ScriptedAlarm triggers RebuildAddressSpace. Driver-only
  changes correctly skip the rebuild — those flow through DriverHost-
  Actor's spawn-plan in Runtime. Sink exceptions are caught + logged so
  one bad node doesn't abort the apply.

Tests: OpcUaServer 6 -> 20 (+14):
- Phase7PlannerTests x9 (empty-in/empty-out, add/remove/change per
  entity class, mixed changes, deterministic ordering)
- Phase7ApplierTests x5 (empty plan no-op, removal writes inactive
  states + rebuild, added equipment triggers rebuild, driver-only
  skips rebuild, sink fault is non-fatal)

The remaining piece is the EquipmentNodeWalker integration against a
real SDK NodeManager — split as F14b, gated on F10b's SDK builder.

All 6 v2 test suites green: 146 tests passing.
2026-05-26 09:16:08 -04:00
Joseph Doherty
a1325299ce feat(runtime): F10 OpcUaPublishActor sink seams + redundancy-driven ServiceLevel
OpcUaPublishActor now routes through pluggable seams instead of just
incrementing a counter:

- IOpcUaAddressSpaceSink (Commons.OpcUa) — WriteValue / WriteAlarmState
  / RebuildAddressSpace. OpcUaQuality enum moved here from the actor's
  nested type so producers don't have to reference the actor itself.
- IServiceLevelPublisher — Publish(byte). NullServiceLevelPublisher
  retains the last level for inspection.
- The actor subscribes to the redundancy-state DPS topic in PreStart
  and maps the local node's NodeRedundancyState to a coarse
  ServiceLevel (Primary+leader=240, Primary=200, Secondary=100,
  Detached=0). This keeps the local SDK's ServiceLevel node honest
  without round-tripping back through the admin-singleton calculator.
- ServiceLevelChanged dedupes identical levels so the SDK doesn't see
  redundant writes.
- Sink + publisher exceptions are caught and logged; the actor never
  crashes its own dispatcher.
- PropsForTests gets optional sink/publisher/localNode params and
  skips the DPS subscribe so unit tests stay on a vanilla TestKit
  cluster.

Production binding to a real SDK NodeManager + Variable nodes is the
remaining residual — split as F10b. Task 60 still blocked on F10b.

Tests: Runtime 40 -> 46 (+6):
- AttributeValueUpdate routes to sink
- AlarmStateUpdate routes to sink
- RebuildAddressSpace calls sink.Rebuild
- ServiceLevelChanged dedupes
- RedundancyStateChanged for primary-leader publishes 240
- RedundancyStateChanged for secondary publishes 100

All 6 v2 test suites green: 132 tests passing.
2026-05-26 09:10:55 -04:00
Joseph Doherty
14fb2b05ed feat(runtime): F8/F9 engine evaluator seams + DPS fan-out
VirtualTagActor and ScriptedAlarmActor now route through pluggable
evaluator interfaces and fan out to the cluster's live-tail topics
shipped in F15.3:

- IVirtualTagEvaluator + NullVirtualTagEvaluator in Commons.Engines.
  VirtualTagActor calls evaluator on every DependencyValueChanged,
  dedupes unchanged values, forwards EvaluationResult to its parent,
  and publishes ScriptLogEntry Warning to the script-logs DPS topic
  whenever the evaluator fails.

- IScriptedAlarmEvaluator + NullScriptedAlarmEvaluator. ScriptedAlarmActor
  takes an AlarmConfig (id/name/equipment-path/severity/predicate) and
  publishes both an AlarmTransitionEvent (alerts topic) and a
  ScriptLogEntry (script-logs topic) at every transition. Manual
  ConditionMet/Acknowledge/Cleared still flow through the same
  Transition() so callers without engine bindings still drive the
  state machine; the legacy single-string Props() overload routes
  through a default AlarmConfig.

The Null* defaults keep the actors safe when no engine is bound —
unconfigured nodes never spuriously alarm. Production binding to
Core.VirtualTags.VirtualTagEngine and Core.ScriptedAlarms is the
remaining residual (F8b/F9b — split in tasks JSON).

Tests: Runtime 34 -> 40 (+6):
- VirtualTagActorTests x3 (evaluator drives EvaluationResult,
  unchanged-value dedup, failure publishes Warning ScriptLogEntry)
- ScriptedAlarmActorTests x3 (engine threshold drives Activated +
  Cleared on alerts topic, manual Acknowledge attribution).

All 6 v2 test suites green: 126 tests passing.
2026-05-26 09:05:04 -04:00
Joseph Doherty
da141497f8 feat(runtime): F7 spawn lifecycle + F20 ShouldStub gate
DriverHostActor.ApplyAndAck now reads the deployment artifact and
reconciles its set of DriverInstanceActor children — spawn the missing,
ApplyDelta to those with changed config, stop the removed/disabled.
The diff lives in pure DriverSpawnPlanner so it can be unit-tested
without an ActorSystem.

Adds IDriverFactory in Core.Abstractions (consumed by Runtime) +
DriverFactoryRegistryAdapter in Core.Hosting that wraps the existing
v1 DriverFactoryRegistry — Runtime stays decoupled from Polly/Serilog,
the Host wires the adapter once driver assemblies have registered.

ShouldStub(type, roles) is now actually called on every spawn — Galaxy
+ Wonderware-Historian boot stubbed on macOS/Linux or whenever the host
carries the dev role. Missing factory ⇒ stub fallback, never a crash.

Tests: 24 → 34 in Runtime (+10):
- DriverSpawnPlannerTests x7 (diff cases, type change ⇒ stop+respawn)
- DeploymentArtifactTests  x5 (empty/malformed/missing fields tolerant)
- DriverHostActorReconcileTests x4 (spawn count, stub fallback,
  ShouldStub gate, second-apply stops the removed)
All 6 v2 test suites green: 120 tests passing.

Closes F20 (ShouldStub wired). F7 marked partial — subscription
publishing + write path still stubbed in DriverInstanceActor itself.
2026-05-26 08:57:16 -04:00
Joseph Doherty
9892ceae9a docs(plans): mark F15.3 complete — F15 fully shipped
Some checks failed
v2-ci / build (push) Failing after 42s
v2-ci / unit-tests (tests/Core/ZB.MOM.WW.OtOpcUa.Cluster.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Security.Tests) (push) Has been skipped
v2-ci / integration (push) Has been skipped
2026-05-26 08:39:47 -04:00
Joseph Doherty
e248e037e7 docs(plans): mark F15 complete — read views + live-edit CRUD
Some checks failed
v2-ci / build (push) Failing after 39s
v2-ci / unit-tests (tests/Core/ZB.MOM.WW.OtOpcUa.Cluster.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Security.Tests) (push) Has been skipped
v2-ci / integration (push) Has been skipped
2026-05-26 08:28:13 -04:00
Joseph Doherty
d055cb059e docs(plans): mark F15 partial — Phases A–D shipped
Some checks failed
v2-ci / build (push) Failing after 34s
v2-ci / unit-tests (tests/Core/ZB.MOM.WW.OtOpcUa.Cluster.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Security.Tests) (push) Has been skipped
v2-ci / integration (push) Has been skipped
2026-05-26 08:02:02 -04:00
Joseph Doherty
5c754ecffd docs(v2): F15 UX kickoff — AdminUI rebuild plan
Some checks failed
v2-ci / build (push) Failing after 41s
v2-ci / unit-tests (tests/Core/ZB.MOM.WW.OtOpcUa.Cluster.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Security.Tests) (push) Has been skipped
v2-ci / integration (push) Has been skipped
47-page legacy inventory mapped to v2 disposition (5 already done, 22 port
as-is, 7 reshape, 5 dropped because live-edit replaces draft/publish, 4
deferred driver-typed editors). Net ~30 active pages to rebuild.

Five open design questions surfaced for review before per-page work starts:
Q1 driver-typed editors (defer vs. ship), Q2 top-level fleet-wide views
(drop vs. keep), Q3 ClusterDetail tabs vs. split routes, Q4 RoleGrants
cluster-scoped vs. LDAP-group fleet-wide, Q5 Login error UX.

Proposed 4-phase sequencing (~5 days total): shell+auth+fleet, cluster
CRUD, config tabs, logic+ops. Each phase independently mergeable.
2026-05-26 07:38:58 -04:00
Joseph Doherty
68c6f36cfe docs(plans): mark F13a partial-complete (36c4751)
Some checks failed
v2-ci / build (push) Failing after 42s
v2-ci / unit-tests (tests/Core/ZB.MOM.WW.OtOpcUa.Cluster.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Security.Tests) (push) Has been skipped
v2-ci / integration (push) Has been skipped
2026-05-26 07:35:04 -04:00
Joseph Doherty
229282ad8b docs(plans): mark F21 complete (b0a2bb0)
Some checks failed
v2-ci / build (push) Failing after 43s
v2-ci / unit-tests (tests/Core/ZB.MOM.WW.OtOpcUa.Cluster.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Security.Tests) (push) Has been skipped
v2-ci / integration (push) Has been skipped
2026-05-26 07:25:36 -04:00
Joseph Doherty
ba6e5dd7f9 docs(plans): mark F11 + F22 complete
Some checks failed
v2-ci / build (push) Failing after 49s
v2-ci / unit-tests (tests/Core/ZB.MOM.WW.OtOpcUa.Cluster.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Security.Tests) (push) Has been skipped
v2-ci / integration (push) Has been skipped
F22 → cd5540c (failover scenarios on TwoNodeClusterHarness)
F11 → 6861381 (HistorianAdapterActor → IAlarmHistorianSink bridge)

Branch follow-ups: 11/22 → 13/22 done. Remaining 9 are engine wiring
gated on real drivers/SDKs (F7-F10, F13-F14), Admin UI rebuild (F15),
fold-in to F7 (F20), and SQL/LDAP harness mode (F21).
2026-05-26 07:19:07 -04:00
Joseph Doherty
4e6ef648d1 docs(plans): mark F16 complete
Some checks failed
v2-ci / build (push) Failing after 3m8s
v2-ci / unit-tests (tests/Core/ZB.MOM.WW.OtOpcUa.Cluster.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Security.Tests) (push) Has been skipped
v2-ci / integration (push) Has been skipped
2026-05-26 07:01:54 -04:00
Joseph Doherty
7a6b016d9e docs(plans): mark F17 complete
Some checks failed
v2-ci / build (push) Failing after 41s
v2-ci / unit-tests (tests/Core/ZB.MOM.WW.OtOpcUa.Cluster.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Security.Tests) (push) Has been skipped
v2-ci / integration (push) Has been skipped
2026-05-26 06:58:27 -04:00
Joseph Doherty
337a691629 docs(plans): mark F3, F4, F5, F12 follow-ups complete
Some checks failed
v2-ci / build (push) Failing after 38s
v2-ci / unit-tests (tests/Core/ZB.MOM.WW.OtOpcUa.Cluster.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Security.Tests) (push) Has been skipped
v2-ci / integration (push) Has been skipped
2026-05-26 06:55:39 -04:00
Joseph Doherty
b06e3ae740 feat(runtime): PeerOpcUaProbeActor real TCP-connect probe (F12)
Replaces the Ok=true stub with a TCP connect to the peer's OPC UA port (4840
default) with a 2s timeout. A successful connect indicates the OPC UA server
process is up + accepting connections — enough for the redundancy calculator
to treat the peer as live. A full secure-channel Hello/Acknowledge handshake
is overkill for what the redundancy calc consumes and would pull in the OPC
UA Client SDK + a PKI setup. Upgrade later if a deeper liveness signal is ever
required.

Probe extracts the host from NodeId by stripping the :port suffix (commit
5cfbe8b encoded host:port into NodeId for cluster-member identity).

Tests: 2 new tests — Ok=true against a live TcpListener on a chosen port,
Ok=false against an unreachable endpoint. All 17 Runtime tests pass (was 16
covering only the message-contract surface).
2026-05-26 06:54:51 -04:00
Joseph Doherty
8e5c8e29f7 docs(plans): mark Task 61 complete
Some checks failed
v2-ci / build (push) Failing after 1m21s
v2-ci / unit-tests (tests/Core/ZB.MOM.WW.OtOpcUa.Cluster.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.ControlPlane.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.OpcUaServer.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Runtime.Tests) (push) Has been skipped
v2-ci / unit-tests (tests/Server/ZB.MOM.WW.OtOpcUa.Security.Tests) (push) Has been skipped
v2-ci / integration (push) Has been skipped
2026-05-26 06:49:06 -04:00
Joseph Doherty
8ac71db464 docs(plans): mark Tasks 62, 63, 64, 65 complete 2026-05-26 06:46:55 -04:00
Joseph Doherty
1689901c0e docs(v2): Architecture-v2 + Cluster + ControlPlane + Runtime overviews (Task 65)
Four new docs at docs/v2/ giving a single-page tour of each v2 piece:
- Architecture-v2.md: top-level mental model (fused Host + roles + cluster + live-edit)
- Cluster.md: AkkaClusterOptions + IClusterRoleInfo + WithOtOpcUaClusterBootstrap
- ControlPlane.md: 5 admin singletons + DPS topics + deploy flow + failover recovery
- Runtime.md: per-node actor tree + state machines + engine-wiring follow-up map

Each links back to the design doc for depth. Architecture-v2 cross-references
the other three + ServiceHosting + Redundancy + security.
2026-05-26 06:41:48 -04:00
Joseph Doherty
3c3fef911c docs: v2 updates to Redundancy, ServiceHosting, security, README (Task 64)
- Redundancy.md: full rewrite — Akka-leader-driven ServiceLevel replaces
  operator-managed RedundancyRole. Documents the 5-tier ServiceLevelCalculator,
  RedundancyStateActor cluster singleton, and the DPS data flow.

- ServiceHosting.md: full rewrite — single fused OtOpcUa.Host binary with
  OTOPCUA_ROLES env gating. Documents the conditional DI graph and the new
  health endpoints (/health/ready, /health/active, /healthz).

- security.md: v2 banner at top covering path/project renames + new JWT bearer
  + DataProtection persisted to ConfigDb. Body unchanged because the 4-concern
  security model is unchanged in v2; full per-section rewrite waits for F15
  (Admin pages migration) since security.md references many pages that move.

- README.md: platform overview updated to v2 (fused Host + role gating).
2026-05-26 06:38:55 -04:00
Joseph Doherty
a8becc9c46 docs(plans): mark Task 59 complete; track F22 failover scenarios 2026-05-26 06:35:03 -04:00
Joseph Doherty
62e3cd6599 docs(plans): mark Task 58 complete; track F21 docker-compose follow-up 2026-05-26 06:27:33 -04:00
Joseph Doherty
bb353c4d43 docs(plans): mark F1, F2, F6, F18, F19 follow-ups complete
Post-compaction batch of 5 follow-ups landed:
- F19 (09d6676): WithOtOpcUaRuntimeActors extension for driver-role spawn
- F1  (463512d): AuthEndpoints integration tests via TestServer
- F6  (dfc143c): RedundancyStateActor broadcast override + un-skip 2 tests
- F18 (b266f63): User.Identity.Name into Deployments createdBy
- F2  (45a8c79): JwtBearer validation via IPostConfigureOptions
2026-05-26 06:18:31 -04:00
Joseph Doherty
698709a578 docs(plans): mark Tasks 56+57 complete 2026-05-26 05:39:07 -04:00
Joseph Doherty
2b75ce3876 docs(plans): mark Phase 9 tasks 53-55 complete; track F19/F20 follow-ups 2026-05-26 05:23:48 -04:00
Joseph Doherty
eb4280b7eb docs(plans): add F15-F18 follow-ups for Phase 8 deferred scope 2026-05-26 05:19:02 -04:00
Joseph Doherty
8a1f97b27f docs(plans): mark Phase 8 tasks 48-52 complete; track F15-F18 follow-ups 2026-05-26 05:18:37 -04:00
Joseph Doherty
5e31449529 docs(plans): mark Phase 7 tasks 46+47 complete; track F13/F14 full-extraction follow-ups 2026-05-26 05:15:18 -04:00
Joseph Doherty
2e4f1399bb docs(plans): mark Phase 6 tasks 39-45 complete (race-recovered commit) 2026-05-26 05:10:21 -04:00
Joseph Doherty
e31547d00e docs(plans): mark Phase 6 tasks 37-45 complete; track F7-F12 engine-wiring follow-ups 2026-05-26 05:09:52 -04:00
Joseph Doherty
ea6f972e96 docs(plans): mark Phase 5 tasks 30-36 complete with commit hashes 2026-05-26 04:57:37 -04:00
Joseph Doherty
bad2aef137 docs(plans): track F5 multi-node coordinator test + F6 RedundancyState publisher refactor 2026-05-26 04:53:32 -04:00
Joseph Doherty
9582e448d5 docs(plans): track F4 WrapDetails JSON hardening follow-up 2026-05-26 04:46:18 -04:00
Joseph Doherty
1955bc5f4d docs(plans): mark Task 33+35 partial complete; track F3 audit-idempotency follow-up 2026-05-26 04:44:37 -04:00
Joseph Doherty
32574b3e4e docs(plans): track JwtBearer DI antipattern as follow-up F2 2026-05-26 04:39:10 -04:00
Joseph Doherty
fc22d4f7b6 docs(plans): track AuthEndpoints integration tests as follow-up F1 2026-05-26 04:37:49 -04:00
Joseph Doherty
973a3d1b9a docs(plans): mark Tasks 24-29 complete in tasks.json 2026-05-26 04:36:17 -04:00
Joseph Doherty
f35925b57e docs(plans): mark Tasks 19-23 complete in tasks.json 2026-05-26 04:31:28 -04:00
Joseph Doherty
fdb4ac7051 docs(plans): mark Tasks 15-18 complete in tasks.json 2026-05-26 04:27:41 -04:00
Joseph Doherty
990ce343fe docs(plans): split Task 14 into 14a-14f (entity-model rewrite)
The original Task 14 (5-min EF migration that "drops ConfigGeneration") was
under-scoped: the design doc (live-edit model, ~line 208) requires removing
GenerationId from 13 entities (Equipment, DriverInstance, Device, Tag,
PollGroup, Namespace, UnsArea, UnsLine, NodeAcl, Script, VirtualTag,
ScriptedAlarm) and adding RowVersion columns for last-write-wins detection.
That cascades into GenerationApplier / GenerationDiff / GenerationSealedCache
and the legacy Server/Admin CRUD services.

New decomposition (~85 min total, replacing the original 5-min estimate):

  14a  standard   10m  Add RowVersion to live-edit entities
  14b  high-risk  30m  Drop GenerationId FK from those entities
  14c  high-risk  20m  Obsolete GenerationApplier/Diff/SealedCache
  14d  standard   5m   Drop ClusterNode.RedundancyRole
  14e  small      5m   Delete ConfigGeneration + ClusterNodeGenerationState
  14f  high-risk  15m  Consolidator: generate V2HostingAlignment migration

Policy decision (recorded with user): OtOpcUa.Server + OtOpcUa.Admin are
allowed to fail-to-compile between 14b and Task 56 - only the new v2 projects
need to stay green. Task 56 deletes the legacy projects.

Plan markdown: replaces the original Task 14 section with the 6-task
decomposition + a header explaining the rewrite. Task index table at the
bottom of the plan updated.

Tasks JSON: replaces the single Task 14 row with 6 string-id rows
("14a", "14b", ..., "14f"). Task 15 (Migrate-To-V2.ps1) and downstream
consumers re-pointed at "14f".

Verification step in 14f rewritten to use the shared docker host at
10.100.0.35 per CLAUDE.md (Docker is not installed on this Mac dev VM).
2026-05-26 03:55:48 -04:00
Joseph Doherty
fac32ad69b docs(plans): add v2 implementation plan with 66 bite-sized tasks
Converts the akka-hosting-alignment design into an executable plan:
12 phases covering branch/scaffold, ConfigDb schema, Commons,
Cluster, Security, ControlPlane singletons, Runtime per-node actors,
OpcUaServer extraction, AdminUI migration, Host entry point, cleanup,
integration tests, deploy scripts, and docs. Each task has files,
TDD steps, exact commands, classification, time estimate, and
parallelizable list. Co-located .tasks.json drives executing-plans
resume from any session.
2026-05-26 03:17:29 -04:00
Joseph Doherty
ef4a70751c docs(plans): add v2 Akka + fused hosting alignment design
Captures the brainstormed design to align OtOpcUa with ScadaLink:
single role-gated binary, Akka.NET cluster with admin/driver roles,
cluster singletons for control plane, per-node actor hierarchy for
OPC UA runtime, dual-endpoint warm redundancy preserved with
ServiceLevel driven by Akka leader, cookie+JWT auth, Traefik routing,
and ScadaLink-style live-edit + deploy model replacing the
draft/publish ConfigGeneration lifecycle.
2026-05-26 03:04:21 -04:00