HistorianAdapterActor now subscribes to the redundancy-state DPS topic,
caches the local node's RedundancyRole, and SKIPS the durable-sink enqueue
when the local node is Secondary or Detached. Unknown/null role default-writes
so single-node deploys and the boot window never silently drop historization.
GetStatus stays ungated.
PREMISE: verified the actor is registered but FED BY NOTHING in production —
there is no AlarmHistorianEvent producer and nothing resolves its registry key
to Tell it. This is a FORWARD-LOOKING / DEFENSIVE guard, not a fix for a live
double-write: the moment a per-node feeder lands (engine -> historian, expected
as a per-node cluster broadcast like the alerts topic), only the Primary will
write to the durable sink (exactly-once across all alarm sources).
Mirrors the sibling A1 treatment of ScriptedAlarmHostActor (06c4155) and
OpcUaPublishActor's redundancy-state handler. localNode threaded through
HistorianAdapterActor.Props from ServiceCollectionExtensions (roleInfo.LocalNode).
Validate AddComment up-front (IsNullOrWhiteSpace guard + Warning log) so
a blank-comment command is cleanly rejected before reaching the engine
rather than faulting inside ApplyAddComment and being silently swallowed
by the outer catch. Mirrors the existing TimedShelve missing-UnshelveAtUtc
pattern.
Also fix two stale inline comments: the "async void crash" note on
TimedShelve now correctly says "fault escaping async Task → supervision
restart", and the ownership-filter now documents the benign race with a
concurrent LoadAsync clearing the loaded set.
Tests: AlarmCommand_add_comment_empty_text_is_rejected_not_driven (Theory
— empty string + whitespace) and AlarmCommand_add_comment_nonempty_drives_engine
(positive path, asserts CommentAdded transition on alerts topic).
Subscribe the host to the cluster alarm-commands DPS topic in PreStart and
drive the matching ScriptedAlarmEngine op per inbound AlarmCommand. An
ownership filter (engine.LoadedAlarmIds) ignores commands for alarms this
node does not own; TimedShelve without UnshelveAtUtc and unknown operations
are logged + rejected (never thrown); op failures are caught + logged so a
faulting op can't fault the actor. Re-projection is left to the engine's
existing OnEvent -> OnEngineEmission path.
Handler is a Task-returning ReceiveAsync (the project's AK2003 analyzer
forbids an async-void Receive delegate), giving ordered awaited async on the
actor thread. Adds 3 TestKit tests: ack drives the engine with mapped args,
unowned command ignored, missing-UnshelveAtUtc TimedShelve rejected not
thrown.
Concrete ITagUpstreamSource the scripted-alarm host actor pushes
DependencyValueChanged values into and ScriptedAlarmEngine reads/subscribes
from. Thread-safe: ConcurrentDictionary value cache + per-path ImmutableList
observer lists with atomic add/remove and capture-then-invoke fan-out.
ReadTag of an unknown path returns a Bad-quality (0x80000000) snapshot stamped
via the injected clock. Adds the Core.ScriptedAlarms project reference Runtime
needs to see the interface.
ParseComposition(blob, nodeId, onInconsistency?) detects a kept equipment whose
UNS line belongs to another cluster (a same-cluster-invariant violation that
would orphan the equipment folder) and reports it via an optional callback,
wired to OpcUaPublishActor's logger. Detection-only; the upstream draft
validator remains the authority. Adds two unit tests.
Context.Watch each spawned child; OnChildTerminated evicts it from _children so the
next ApplyVirtualTags (still containing that vtagId) falls through the ContainsKey
guard and re-spawns a fresh VirtualTagActor. Adds a spawn-site Debug log, moves the
TODO about in-place plan mutation to the skip-existing branch where it belongs, and
adds a deterministic TestKit test (Child_is_respawned_after_unexpected_termination)
that kills the first child, drains its UnregisterInterest from the mux probe, re-applies,
and asserts a second distinct RegisterInterest arrives.
Two bundle-review fixes + idempotency coverage:
- CRITICAL: the planner ignored EquipmentTags, so an incremental deploy changing only
equipment tags produced an empty plan and HandleRebuild short-circuited before
materialising them. Add TagId to EquipmentTagPlan + Added/Removed/ChangedEquipmentTags
to Phase7Plan (diffed by TagId, in IsEmpty, driving Apply's needsRebuild) — mirroring
the GalaxyTags treatment.
- IMPORTANT: equipment variable NodeId was the raw driver FullName, which collides across
identical machines (e.g. two PLCs both exposing register 40001) — the second variable
was silently dropped. NodeId is now folder-scoped (parent/Name); FullName stays on
EquipmentTagPlan for the later values-routing milestone.
- Task 4: SDK-backed idempotency test (double-apply -> single variable); restart-safety
confirmed (RestoreApplied reuses the same RebuildAddressSpace -> HandleRebuild path).
- Minor: align composer equipment-tag sort with the artifact decoder (coalesce FolderPath).
Equipment folder DisplayName was the colloquial MachineCode; the live rebuild (artifact
ReadEquipmentNode) + composer now use the UNS level-5 Name segment, matching Area/Line
folders + EquipmentNodeWalker. NodeId stays the logical EquipmentId so browse-path
resolution + ACLs are unaffected.
Add EquipmentTagPlan + an init-only EquipmentTags member on Phase7CompositionResult
(mirror of GalaxyTags). Populate it compose-side (Tag.EquipmentId != null AND owning
namespace Kind == Equipment) and artifact-decode-side via BuildEquipmentTagPlans, with
FullName extracted from Tag.TagConfig. Init-only member (not a 7th positional param) so
existing convenience constructors + call sites are untouched.
Materialised SystemPlatform/Galaxy variables previously stayed
BadWaitingForInitialData because nothing told the driver to subscribe
(OpcUaPublishActor TODO 'on a future SubscribeBulk pass') and published
values were only forwarded to the VirtualTag mux, never the OPC UA sink.
DriverHostActor now, after each apply, groups the deployment's galaxy tag
MXAccess refs by driver and sends DriverInstanceActor.SetDesiredSubscriptions;
the actor retains the set and (re)subscribes on every Connected entry, so
values resume after reconnects/redeploys (closes the F8b/#113 gap). Published
values are also forwarded to OpcUaPublishActor as AttributeValueUpdate
(NodeId == galaxy MxAccessRef) so the materialised variable shows live data.
Verified live in docker-dev: galaxy TestMachine_001 tags go Good with a
changing TestChangingInt. +1 unit test.
Resolves the 12 reported build errors (7 CS0535 sink fakes + 5 CLI CS1587).
Runtime.Tests green (74). NOTE: OpcUaServer.Tests still has pre-existing CS7036
errors from the in-progress Galaxy-tag workstream (Phase7Plan/Phase7CompositionResult
new required params) — separate, test-only, not addressed here.
Adds <summary>, <param>, <typeparam>, and <inheritdoc/> tags to public
members surfaced by commentchecker — resolves 5,847 of 5,869 issues
(99.6%) across three /fixdocs passes.
Phase7Composer now carries UnsAreaProjection + UnsLineProjection lists so
the applier can materialise the full UNS topology in the OPC UA address
space. New IOpcUaAddressSpaceSink.EnsureFolder(folderNodeId, parentNodeId,
displayName) seam (no-op default, recorded in tests, forwarded by
DeferredAddressSpaceSink, implemented by SdkAddressSpaceSink). The SDK-
side OtOpcUaNodeManager gains an EnsureFolder API that creates
FolderState nodes with proper parent linkage; RebuildAddressSpace now
clears folders too so re-applies don't accumulate stale topology.
Phase7Applier.MaterialiseHierarchy walks composition.UnsAreas →
composition.UnsLines → composition.EquipmentNodes, calling EnsureFolder
with the correct parent at each level. Idempotent — calling twice with
the same composition is a no-op. OpcUaPublishActor.HandleRebuild invokes
it after Phase7Applier.Apply so OPC UA clients browsing the server now
see Area/Line/Equipment as proper folders rather than flat tag ids.
DeploymentArtifact.ParseComposition reads UnsAreas + UnsLines from the
JSON snapshot the ControlPlane emits, populating the new fields when
present.
Phase7Composer.Compose now accepts UnsAreas + UnsLines; a 3-arg overload
preserves the old signature for legacy callers + existing tests. The
Phase7CompositionResult convenience ctor likewise keeps the planner
tests working without UNS data.
3 new hierarchy tests (pure unit + boot-verify against a real
OtOpcUaSdkServer); OpcUaServer suite is 48/48 green (was 45, +3),
Runtime 74/74 unchanged.
Closes#85.
Boots a real StandardServer + OpcUaApplicationHost, wires
SdkServiceLevelPublisher into a DeferredServiceLevelPublisher (production
binding pattern), spawns OpcUaPublishActor against the deferred
publisher, sends RedundancyStateChanged snapshots, and asserts that
ServerObject.ServiceLevel.Value reflects the role-derived byte:
Primary + RoleLeaderForDriver → 240
Secondary → 100
Together with the F13b endpoint-security tests (which already verify
ServerConfiguration.SecurityPolicies populates the three baseline
profiles), this closes Task 60's "dual-endpoint + ServiceLevel" scope.
Cross-node failover tests stay in the 2-node integration harness
(Task 59 / FailoverScenarioTests).
Runtime suite now 74 / 74 green (+2). Closes Task 60.
OtOpcUaTelemetry (Commons/Observability) centralizes the project's Meter
+ ActivitySource so all instrumentation points emit through a single
named surface. Counters cover the hot paths:
otopcua.deploy.applied (outcome=ack|reject)
otopcua.deploy.apply.duration (s, histogram)
otopcua.driver.lifecycle (event=spawn|spawn_stub|stop|fault)
otopcua.virtualtag.eval (outcome=ok|fail|skip)
otopcua.scriptedalarm.transition (state=activated|acknowledged|cleared)
otopcua.opcua.sink.write (kind=value|alarm|rebuild)
otopcua.redundancy.service_level_change (level=byte)
Plus two ActivitySource spans:
otopcua.deploy.apply wraps DriverHostActor.ApplyAndAck
otopcua.opcua.address_space_rebuild wraps OpcUaPublishActor.HandleRebuild
Instruments are no-op until a listener attaches, so tests + dev hosts
pay nothing for unread telemetry.
Host Program.cs gains AddOtOpcUaObservability() (binds the OtOpcUa Meter
+ ActivitySource to OpenTelemetry, attaches a Prometheus exporter) and
MapOtOpcUaMetrics() (mounts /metrics scrape endpoint). Driver-side
internals + ASP.NET request metrics deliberately stay off — the scrape
payload is scoped to OtOpcUa signals only.
Tests use MeterListener + ActivityListener to verify
VirtualTagActor.eval, OpcUaPublishActor.AttributeValueUpdate, and
RebuildAddressSpace actually emit on the central instruments. Runtime
suite is 72 / 72 green (+3).
Closes#105. Path A (F13b/c/d) complete; next batch options: #85 UNS
folder hierarchy in SDK, or F8b/F9b production engine bindings.
Wires the OPC UA SDK into the fused Host's lifecycle on driver-role
nodes + spawns OpcUaPublishActor with the proper sink/publisher/dbFactory/
applier resolution. The full read+write data path is now live in
production: Deploy → DriverHost → OpcUaPublish → SDK NodeManager →
subscribed OPC UA clients.
DeferredAddressSpaceSink (Commons.OpcUa):
- Thread-safe wrapper IOpcUaAddressSpaceSink that delegates to an
inner sink swapped in at runtime. Needed because Akka actors
resolve the sink at construction time, but the production sink
(SdkAddressSpaceSink wrapping OtOpcUaNodeManager) only exists
after the SDK StandardServer has started.
- Defaults to NullOpcUaAddressSpaceSink so calls before swap are
safe; SetSink(null) reverts (for graceful shutdown).
OtOpcUaServerHostedService (Host.OpcUa):
- IHostedService that owns the OPC UA SDK lifecycle. Reads
OpcUaApplicationHostOptions from the 'OpcUa' config section,
creates an OtOpcUaSdkServer, boots it through OpcUaApplicationHost,
then swaps a real SdkAddressSpaceSink into the DeferredAddressSpaceSink
singleton.
- SDK boot failure is logged + non-fatal — the rest of the host
(admin UI, driver actors) keeps running. Stop reverts to null sink.
WithOtOpcUaRuntimeActors (Runtime):
- Now spawns OpcUaPublishActor (new actor) + threads its ActorRef
into DriverHostActor's Props so successful applies trigger the
address-space rebuild pipeline.
- Phase7Applier is constructed here from the resolved sink + a
logger; OpcUaPublishActor takes both.
- Prepends the opcua-synchronized-dispatcher HOCON so the extension
is self-contained — consumers (Host, tests) don't need to redeclare
the dispatcher block.
- New OpcUaPublishActorKey + OpcUaPublishActorName for actor-registry
resolution.
- AddOtOpcUaRuntime now also TryAddSingleton's NullOpcUaAddressSpaceSink
+ NullServiceLevelPublisher so admin-only nodes (or tests that
don't bind the Deferred sink) stay safe.
Host.Program.cs (driver-role only):
- Binds DeferredAddressSpaceSink as singleton + as IOpcUaAddressSpaceSink
- AddHostedService<OtOpcUaServerHostedService>()
Tests: OpcUaServer 24 -> 28 (+4 DeferredAddressSpaceSink unit tests),
Runtime 69 -> 69 (existing ServiceCollectionExtensionsTests extended
to verify the new mux + publish actor registration).
All 6 v2 test suites green: 177 tests passing.
Closes#108. Engine-wiring is now production-bound end-to-end on
driver-role nodes — Deploy reaches real OPC UA Variable nodes that
subscribed clients see.
Closes the loop between F10b (SDK NodeManager) and F14 (Phase7Plan +
Phase7Applier). DriverHostActor's successful apply now triggers a
RebuildAddressSpace on the publish actor, which loads the latest
deployment artifact + walks composer → planner → applier through the
sink. The OPC UA address space tracks the deployed composition.
DeploymentArtifact:
- New ParseComposition(blob) → Phase7CompositionResult that decodes
Equipment + DriverInstance + ScriptedAlarm arrays into the
projection records Phase7Planner consumes. Pascal-case property
names mirror ConfigComposer.SnapshotAndFlattenAsync's output.
- Each entity reader is tolerant: missing-id rows are dropped,
natural-key sort matches Phase7Composer's contract.
OpcUaPublishActor:
- New Props params: dbFactory + applier. When wired, RebuildAddressSpace
does:
1. LoadLatestArtifact (most recent Sealed Deployment.ArtifactBlob)
2. ParseComposition → Phase7CompositionResult
3. Phase7Planner.Compute(lastApplied, next) → Phase7Plan
4. Empty plan ⇒ no-op (deploy of unchanged composition is benign)
5. applier.Apply(plan) drives sink.RebuildAddressSpace +
WriteAlarmState for removed nodes
6. lastApplied = next so the next rebuild diffs forward
- Without dbFactory/applier wiring, falls back to raw
sink.RebuildAddressSpace — the dev/Mac path before #108 binds prod.
DriverHostActor:
- New Props param opcUaPublishActor (IActorRef?). After successful
ApplyAndAck (status Applied, ACK sent), tells the publish actor
RebuildAddressSpace with the same correlation id so the audit trail
threads through. Null publish actor ⇒ no trigger (admin-only nodes).
Tests: Runtime 63 -> 69 (+6):
- ParseComposition reads Equipment/Driver/Alarm sorted by natural key
- ParseComposition returns empty for empty blob
- Rebuild with dbFactory + sealed deployment artifact triggers exactly
one sink.Rebuild call (Equipment topology added)
- Rebuild with no artifact is idempotent no-op
- Second rebuild with same composition is empty-plan no-op
- Rebuild without dbFactory falls back to raw sink.Rebuild (legacy path)
All 6 v2 test suites green: 173 tests passing.
Closes#109. Engine-wiring data flow is now end-to-end through:
Deploy → DriverHostActor.ApplyAndAck → driver spawn + ACK +
RebuildAddressSpace → OpcUaPublishActor → Phase7Applier → SDK
NodeManager → subscribed OPC UA clients see the change.