# 04 — Cluster Sharding (Akka.Cluster.Sharding) ## Overview Cluster Sharding distributes a set of actors (called "entities") across cluster members using a consistent hashing strategy. Entities are identified by a unique ID, and the sharding infrastructure ensures each entity exists on exactly one node at a time, automatically rebalancing when nodes join or leave. In our SCADA system, Cluster Sharding is a candidate for managing device actors across the cluster — but with a 2-node active/cold-standby topology where only the active node communicates with equipment, the value proposition is nuanced. ## When to Use - If the system evolves beyond a strict active/standby model to allow both nodes to handle subsets of devices - If device counts grow beyond what a single node can handle (significantly more than 500 devices with high-frequency tag updates) - If the architecture shifts to active/active with partitioned device ownership ## When Not to Use - In the current design (active/cold-standby), Cluster Singleton is a better fit for owning all device communication on a single node - Sharding adds complexity (shard coordinators, rebalancing, remember-entities configuration) that is unnecessary when one node does all the work - With only 2 nodes, rebalancing is not meaningful — entities would just move from one node to the other ## Design Decisions for the SCADA System ### Current Recommendation: Do Not Use Sharding For the current architecture where one node is active and the other is a cold standby, Cluster Sharding adds overhead without benefit. The active node hosts all device actors via the Singleton pattern, and on failover, all actors restart on the new active node. ### Future Consideration: Active/Active Migration Path If the system eventually needs to scale beyond one node's capacity, Sharding provides a clean migration path. Each device becomes a sharded entity identified by its device ID: ```csharp // Hypothetical future sharding setup var sharding = ClusterSharding.Get(system); var deviceRegion = sharding.Start( typeName: "Device", entityPropsFactory: entityId => Props.Create(() => new DeviceActor(entityId)), settings: ClusterShardingSettings.Create(system), messageExtractor: new DeviceMessageExtractor() ); ``` The message extractor would route based on device ID: ```csharp public class DeviceMessageExtractor : HashCodeMessageExtractor { public DeviceMessageExtractor() : base(maxNumberOfShards: 100) { } public override string EntityId(object message) => message switch { IDeviceMessage m => m.DeviceId, _ => null }; } ``` ### If Sharding Is Adopted: Remember Entities For SCADA, if sharding is used, enable `remember-entities` so that device actors are automatically restarted after rebalancing without waiting for a new message: ```hocon akka.cluster.sharding { remember-entities = on remember-entities-store = "eventsourced" # Requires Akka.Persistence } ``` This is important because device actors need to maintain their tag subscriptions — they can't wait for an external trigger to restart. ## Common Patterns ### Passivation (Not Recommended for SCADA) Sharding supports passivating idle entities to save memory. For SCADA device actors, passivation is inappropriate — devices need continuous monitoring regardless of command activity. If sharding is used, disable passivation or set very long timeouts: ```hocon akka.cluster.sharding { passivate-idle-entity-after = off } ``` ### Shard Count Sizing If sharding is adopted, the number of shards should be significantly larger than the number of nodes but not excessively large. For 2 nodes with 500 devices, 50–100 shards is reasonable. The shard count is permanent and cannot be changed without data migration. ## Anti-Patterns ### Sharding as a Default Choice Do not adopt Sharding just because it's available. For a 2-node active/standby system, it adds a shard coordinator (which itself is a Singleton), persistence requirements for shard state, and rebalancing logic — all of which increase failure surface area without providing benefit. ### Mixing Sharding and Singleton for the Same Concern If device actors are managed by Sharding, do not also wrap them in a Singleton. These are alternative distribution strategies. Pick one. ## Configuration Guidance If Sharding is adopted in the future, here is a starting configuration for the SCADA system: ```hocon akka.cluster.sharding { guardian-name = "sharding" role = "scada-node" remember-entities = on remember-entities-store = "eventsourced" passivate-idle-entity-after = off # For 2-node cluster, keep coordinator on oldest node least-shard-allocation-strategy { rebalance-absolute-limit = 5 rebalance-relative-limit = 0.1 } } ``` ## References - Official Documentation: - Sharding Configuration: