fix(runtime): restart driver no longer throws 'actor name is not unique'

HandleRestartDriver stopped + respawned the child within one synchronous
message handler, reusing the base actor name drv-<id>. Context.Stop is async
(the child processes its own stop on its own mailbox), so the old child was
ALWAYS still registered when the respawn ran — Context.ActorOf threw
InvalidActorNameException deterministically on every AdminUI Restart press,
crashing + restarting the host.

Fix: a monotonic _childSpawnGeneration counter (single-threaded actor) feeds a
-g<gen> suffix on every spawned child name, so a respawn can never collide with
the still-terminating predecessor. Children are tracked by the _children dict
(by IActorRef), never by actor path, so the suffix is invisible to callers.
This also closes the same-shaped latent race in the reconcile path (a removed-
then-readded instance, and a driver-type-change ToStop+ToSpawn in one plan).
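
The naming scheme above can be sketched in isolation. This is a minimal,
hypothetical illustration, not the project's code: the HashSet stands in for
the set of names the parent still holds (it is not Akka's API), and the class
name and sample id "plc/line:1" are assumptions. Only the ActorNameFor body
mirrors the diff.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GenerationNamingSketch
{
    // Same mangling as the diff: replace characters that are not valid in an
    // actor name, then append the monotonic generation suffix.
    public static string ActorNameFor(string driverInstanceId, long generation)
    {
        var chars = driverInstanceId
            .Select(c => char.IsLetterOrDigit(c) || c is '-' or '_' or '.' ? c : '_')
            .ToArray();
        return "drv-" + new string(chars) + "-g" + generation;
    }

    public static void Main()
    {
        // Hypothetical stand-in for "names still registered with the parent".
        var registered = new HashSet<string>(StringComparer.Ordinal);
        long generation = 0;

        // Initial spawn.
        var first = ActorNameFor("plc/line:1", generation++);
        registered.Add(first);

        // Restart: the old name is deliberately NOT removed first, modelling a
        // predecessor that is still terminating when the respawn runs.
        var second = ActorNameFor("plc/line:1", generation++);
        bool collided = !registered.Add(second);

        Console.WriteLine(first);    // drv-plc_line_1-g0
        Console.WriteLine(second);   // drv-plc_line_1-g1
        Console.WriteLine(collided); // False
    }
}
```

Because the counter only grows, the second name differs from the first even
though the first is still present, which is the property the fix relies on.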

Adds regression test
RestartDriver_respawns_the_child_without_an_actor_name_collision (verified: it
fails on the old code with the exact InvalidActorNameException and passes with
the fix). Runtime.Tests 238/238 green. Code-reviewed (approved).
Joseph Doherty
2026-06-15 05:41:18 -04:00
parent aa1e21f53c
commit c9643f68ba
2 changed files with 65 additions and 5 deletions
@@ -88,6 +88,11 @@ public sealed class DriverHostActor : ReceiveActor, IWithTimers
     private readonly Dictionary<string, ChildEntry> _children = new(StringComparer.Ordinal);
+    // Monotonic counter feeding the child actor-name suffix (see ActorNameFor / SpawnChild). Single-
+    // threaded actor, so a plain increment is safe; it only ever grows, guaranteeing a unique name per
+    // spawn so a restart's respawn never collides with the still-terminating old child.
+    private long _childSpawnGeneration;
     /// <summary>
     /// Driver live-value routing map: <c>(DriverInstanceId, FullName) → folder-scoped equipment
     /// NodeId(s)</c>. Rebuilt every apply by <see cref="PushDesiredSubscriptions"/> from the
@@ -984,6 +989,13 @@ public sealed class DriverHostActor : ReceiveActor, IWithTimers
         // identity for pre-PR artifacts that don't carry it yet (older deploys persisted before
         // ClusterId was added to DriverInstanceSpec).
         var clusterId = !string.IsNullOrEmpty(spec.ClusterId) ? spec.ClusterId : _localNode.Value;
+        // A fresh generation-suffixed name on EVERY spawn so a respawn can never collide with a child
+        // still tearing down: Akka frees a stopped actor's name only after it FULLY terminates (async),
+        // but HandleRestartDriver stops + respawns within one (synchronous) message handler — the old
+        // child is still registered, so reusing the base name throws InvalidActorNameException
+        // ("actor name is not unique"). Children are tracked by the _children dict (by IActorRef), never
+        // by path, so the suffix is invisible to every caller.
+        var actorName = ActorNameFor(spec.DriverInstanceId, _childSpawnGeneration++);
         if (stub)
         {
             child = Context.ActorOf(
@@ -993,7 +1005,7 @@ public sealed class DriverHostActor : ReceiveActor, IWithTimers
                     startStubbed: true,
                     healthPublisher: _healthPublisher,
                     clusterId: clusterId),
-                ActorNameFor(spec.DriverInstanceId));
+                actorName);
         }
         else
         {
@@ -1002,7 +1014,7 @@ public sealed class DriverHostActor : ReceiveActor, IWithTimers
                     driver!,
                     healthPublisher: _healthPublisher,
                     clusterId: clusterId),
-                ActorNameFor(spec.DriverInstanceId));
+                actorName);
             child.Tell(new DriverInstanceActor.InitializeRequested(spec.DriverConfig));
         }
@@ -1028,11 +1040,13 @@ public sealed class DriverHostActor : ReceiveActor, IWithTimers
         _log.Info("DriverHost {Node}: stopped driver child {Id}", _localNode, driverInstanceId);
     }
-    private static string ActorNameFor(string driverInstanceId)
+    private static string ActorNameFor(string driverInstanceId, long generation)
     {
-        // Akka actor names cannot contain '/', ':', or whitespace. Mangle defensively.
+        // Akka actor names cannot contain '/', ':', or whitespace. Mangle defensively. The monotonic
+        // generation suffix guarantees a never-before-used name on every spawn (see SpawnChild) so a
+        // restart's respawn can never collide with the still-terminating predecessor.
         var chars = driverInstanceId.Select(c => char.IsLetterOrDigit(c) || c is '-' or '_' or '.' ? c : '_').ToArray();
-        return "drv-" + new string(chars);
+        return "drv-" + new string(chars) + "-g" + generation;
     }
     /// <summary>