Notes and documentation covering actors, remoting, clustering, persistence, streams, serialization, hosting, testing, and best practices for the Akka.NET framework used throughout the ScadaLink system.
4.9 KiB
04 — Cluster Sharding (Akka.Cluster.Sharding)
Overview
Cluster Sharding distributes a set of actors (called "entities") across cluster members using a consistent hashing strategy. Entities are identified by a unique ID, and the sharding infrastructure ensures each entity exists on exactly one node at a time, automatically rebalancing when nodes join or leave.
In our SCADA system, Cluster Sharding is a candidate for managing device actors across the cluster — but with a 2-node active/cold-standby topology where only the active node communicates with equipment, the value proposition is nuanced.
When to Use
- If the system evolves beyond a strict active/standby model to allow both nodes to handle subsets of devices
- If device counts grow beyond what a single node can handle (significantly more than 500 devices with high-frequency tag updates)
- If the architecture shifts to active/active with partitioned device ownership
When Not to Use
- In the current design (active/cold-standby), Cluster Singleton is a better fit for owning all device communication on a single node
- Sharding adds complexity (shard coordinators, rebalancing, remember-entities configuration) that is unnecessary when one node does all the work
- With only 2 nodes, rebalancing is not meaningful — entities would just move from one node to the other
Design Decisions for the SCADA System
Current Recommendation: Do Not Use Sharding
For the current architecture where one node is active and the other is a cold standby, Cluster Sharding adds overhead without benefit. The active node hosts all device actors via the Singleton pattern, and on failover, all actors restart on the new active node.
Future Consideration: Active/Active Migration Path
If the system eventually needs to scale beyond one node's capacity, Sharding provides a clean migration path. Each device becomes a sharded entity identified by its device ID:
// Hypothetical future sharding setup
var sharding = ClusterSharding.Get(system);
var deviceRegion = sharding.Start(
typeName: "Device",
entityPropsFactory: entityId => Props.Create(() => new DeviceActor(entityId)),
settings: ClusterShardingSettings.Create(system),
messageExtractor: new DeviceMessageExtractor()
);
The message extractor would route based on device ID:
public class DeviceMessageExtractor : HashCodeMessageExtractor
{
public DeviceMessageExtractor() : base(maxNumberOfShards: 100) { }
public override string EntityId(object message) => message switch
{
IDeviceMessage m => m.DeviceId,
_ => null
};
}
If Sharding Is Adopted: Remember Entities
For SCADA, if sharding is used, enable remember-entities so that device actors are automatically restarted after rebalancing without waiting for a new message:
akka.cluster.sharding {
remember-entities = on
remember-entities-store = "eventsourced" # Requires Akka.Persistence
}
This is important because device actors need to maintain their tag subscriptions — they can't wait for an external trigger to restart.
Common Patterns
Passivation (Not Recommended for SCADA)
Sharding supports passivating idle entities to save memory. For SCADA device actors, passivation is inappropriate — devices need continuous monitoring regardless of command activity. If sharding is used, disable passivation or set very long timeouts:
akka.cluster.sharding {
passivate-idle-entity-after = off
}
Shard Count Sizing
If sharding is adopted, the number of shards should be significantly larger than the number of nodes but not excessively large. For 2 nodes with 500 devices, 50–100 shards is reasonable. The shard count is permanent and cannot be changed without data migration.
Anti-Patterns
Sharding as a Default Choice
Do not adopt Sharding just because it's available. For a 2-node active/standby system, it adds a shard coordinator (which itself is a Singleton), persistence requirements for shard state, and rebalancing logic — all of which increase failure surface area without providing benefit.
Mixing Sharding and Singleton for the Same Concern
If device actors are managed by Sharding, do not also wrap them in a Singleton. These are alternative distribution strategies. Pick one.
Configuration Guidance
If Sharding is adopted in the future, here is a starting configuration for the SCADA system:
akka.cluster.sharding {
guardian-name = "sharding"
role = "scada-node"
remember-entities = on
remember-entities-store = "eventsourced"
passivate-idle-entity-after = off
# For 2-node cluster, keep coordinator on oldest node
least-shard-allocation-strategy {
rebalance-absolute-limit = 5
rebalance-relative-limit = 0.1
}
}
References
- Official Documentation: https://getakka.net/articles/clustering/cluster-sharding.html
- Sharding Configuration: https://getakka.net/articles/configuration/modules/akka.cluster.sharding.html