Three pieces landed in one batch, closing F7-residual + Host DI #106:
Runtime/DriverInstanceActor:
- Subscribe / Unsubscribe message contracts; the Connected state
handles them via IDriver.ISubscribable. On every OnDataChange
event the actor publishes AttributeValuePublished to its parent
(DriverHostActor → OpcUaPublishActor). OPC UA StatusCode is
mapped to the 3-state OpcUaQuality enum via severity bits
(00=Good, 01=Uncertain, 10/11=Bad).
- DetachSubscription tears the handler off the driver on
DisconnectObserved, Unsubscribe, and PostStop so a stale handler
never pushes to a dead actor.
- WriteAttribute now dispatches IWritable.WriteAsync (batch of one)
with a 5s CancellationTokenSource; status-code propagated to
WriteAttributeResult on non-Good results.
Host:
- New ProjectReferences to Core + every cross-platform driver
assembly (AbCip/AbLegacy/FOCAS/Galaxy/Modbus/S7/TwinCAT).
Galaxy is net10 (gRPC client to mxaccessgw); the COM-bound net48
Wonderware Historian driver stays out of the Host's reference
closure — its .Client gRPC wrapper is what binds for historian
needs.
- New DriverFactoryBootstrap.AddOtOpcUaDriverFactories() registers
a singleton DriverFactoryRegistry, invokes each driver's
Register(registry, loggerFactory), and binds IDriverFactory to
DriverFactoryRegistryAdapter. Replaces the F7 NullDriverFactory
default so deploys actually materialise real IDriver instances
on driver-role nodes. ShouldStub() still gates per-platform
behaviour at spawn time.
- Program.cs wires AddOtOpcUaDriverFactories() before AddAkka so
the runtime extension can resolve IDriverFactory from DI.
Tests: Runtime 46 -> 52 (+6):
- Write returns success when StatusCode = Good
- Write propagates non-Good status code in failure Reason
- Subscribe forwards OnDataChange to parent as AttributeValuePublished
- Quality translation: Uncertain (0x40...) and Bad (0x80...)
- Subscribe against non-ISubscribable returns failure
- DisconnectObserved detaches handler so late events are dropped
All 6 v2 test suites green: 152 tests passing.
Closes F7. F7-residual sub-tasks #110 (subscribe) and #111 (write)
both shipped. Host DI binding #106 shipped.
DriverHostActor.ApplyAndAck now reads the deployment artifact and
reconciles its set of DriverInstanceActor children — spawn the missing,
ApplyDelta to those with changed config, stop the removed/disabled.
The diff lives in pure DriverSpawnPlanner so it can be unit-tested
without an ActorSystem.
Adds IDriverFactory in Core.Abstractions (consumed by Runtime) +
DriverFactoryRegistryAdapter in Core.Hosting that wraps the existing
v1 DriverFactoryRegistry — Runtime stays decoupled from Polly/Serilog,
the Host wires the adapter once driver assemblies have registered.
ShouldStub(type, roles) is now actually called on every spawn — Galaxy
+ Wonderware-Historian boot stubbed on macOS/Linux or whenever the host
carries the dev role. Missing factory ⇒ stub fallback, never a crash.
Tests: 24 → 34 in Runtime (+10):
- DriverSpawnPlannerTests x7 (diff cases, type change ⇒ stop+respawn)
- DeploymentArtifactTests x5 (empty/malformed/missing fields tolerant)
- DriverHostActorReconcileTests x4 (spawn count, stub fallback,
ShouldStub gate, second-apply stops the removed)
All 6 v2 test suites green: 120 tests passing.
Closes F20 (ShouldStub wired). F7 marked partial — subscription
publishing + write path still stubbed in DriverInstanceActor itself.
- New Commons.Messages.Fleet.GetDiagnostics request record.
- DriverHostActor handles GetDiagnostics in all three states (Steady, Applying,
Stale); replies with a NodeDiagnosticsSnapshot built from _currentRevision
+ the local NodeId. Drivers list is empty until F7 wires the per-instance
children.
- FleetDiagnosticsClient now resolves the target via ActorSelection at
akka.tcp://{system}@{nodeId}/user/driver-host and Asks with a 3s timeout.
On timeout/peer-down it returns an empty snapshot so the UI degrades
gracefully rather than throwing.
Two new integration tests in Host.IntegrationTests:
- GetDiagnostics_returns_snapshot_with_target_NodeId verifies the
cross-node Ask/Reply works.
- GetDiagnostics_after_deploy_reports_current_revision exercises the
end-to-end path: AdminOps starts a deployment, both DriverHostActors
apply, then diagnostics reports the new revision on both nodes.
All 98 v2 tests pass (was 96 + 2 new).
DeployHappyPathTests exercises the full deploy pipeline on the 2-node harness:
AdminOperationsActor → ConfigPublishCoordinator → DistributedPubSub →
DriverHostActor on both nodes → ApplyAck → coordinator seals. Verifies both
NodeDeploymentState rows reach Applied and Deployment.Status reaches Sealed.
Exposed + fixed two production bugs along the way:
1. Coordinator was publishing DispatchDeployment on the "deployments" topic but
never subscribed to anything — DriverHostActor ACKs published on the same
topic could not reach it. Added dedicated "deployment-acks" topic with
coordinator subscription in PreStart, and DriverHostActor publishes ACKs
there.
2. NodeId derivation used member.Address.Host only — two cluster members on a
shared loopback host (test harness, dev VMs) collided to one identity. The
coordinator's expected-ack set became {1} and the system sealed after only
half the nodes acked. Switched to host:port everywhere (ClusterRoleInfo +
coordinator) so loopback nodes stay distinct and production identities are
harmlessly more specific.
Tests: 95 v2 tests pass (was 93 + 2 deploy tests), 0 skipped.
Failover scenarios (design §8 cases 3-7: node-kill-mid-apply, split-brain,
restart-during-deploy) deferred — they need controlled node-down primitives
on the harness. Tracked as F22 (failover scenario test cases).