# 20 — Serialization

## Overview

Akka.NET's serialization system converts messages to bytes and back. Serialization is required whenever a message crosses a boundary: Remoting (between nodes), Persistence (to the journal/snapshot store), or Cluster Sharding (messages routed to entities hosted on another node). The serialization system is pluggable: you can register different serializers for different message types.

In the SCADA system, serialization matters in two critical paths: messages between the active and standby nodes (via Remoting/Cluster), and command events persisted to the SQLite journal (via Persistence). The default JSON serializer (Newtonsoft.Json) works but is comparatively slow and produces large payloads. For a production SCADA system, a more efficient serializer is recommended.

## When to Use

- Serialization configuration is required whenever Remoting or Persistence is enabled, which is always the case in our system
- Explicit serializer registration is recommended for all application messages that cross node boundaries or are persisted
- Custom serialization is needed if messages contain types that the default serializer handles poorly (e.g., complex object graphs, binary data)

## When Not to Use

- Messages that stay within a single ActorSystem (local-only messages between actors on the same node) are not serialized; they are passed by reference
- Do not serialize large binary blobs (equipment firmware, images) through Akka messages; use out-of-band transfer instead

## Design Decisions for the SCADA System

### Serializer Choice

**Recommended: System.Text.Json or a binary serializer**

Akka.NET v1.5+ supports pluggable serializers.
Options:

| Serializer | Pros | Cons |
|---|---|---|
| Newtonsoft.Json (default) | Human-readable, easy debugging | Slow, large payloads, type handling quirks |
| System.Text.Json | Fast, built into .NET, human-readable | Requires explicit converters for some types |
| Hyperion | Fast binary, handles complex types | Not human-readable, occasional compatibility issues across versions |
| Custom (protobuf, MessagePack) | Maximum performance, schema evolution | Requires manual schema management |

**Recommendation for SCADA:** Use Hyperion for application messages sent over Remoting, where throughput matters, and Newtonsoft.Json or System.Text.Json for Persistence events, where a human-readable journal aids debugging. Akka.NET's own cluster heartbeats and gossip are handled by built-in serializers and are not affected by these bindings.

### Message Design for Serialization

Design all cross-boundary messages as simple, immutable records with primitive or well-known types:

```csharp
// GOOD: simple types, easy to serialize
public record CommandDispatched(string CommandId, string DeviceId, string TagName, double Value, DateTime Timestamp);
public record TagValueChanged(string DeviceId, string TagName, double Value, DateTime Timestamp);

// BAD: an actor reference and a mutable framework collection, hard to serialize
public record DeviceSnapshot(IActorRef DeviceActor, ConcurrentDictionary<string, object> State);
```

Rules for serializable messages:

- Use primitive types (string, int, double, bool, DateTime, Guid)
- Use immutable collections (`IReadOnlyList<T>`, `IReadOnlyDictionary<TKey, TValue>`)
- Never include `IActorRef` in persisted messages, because actor references are not stable across restarts
- Never include mutable state or framework types

### Serializer Binding Configuration

Register serializers for application message types:

```hocon
akka.actor {
  serializers {
    hyperion = "Akka.Serialization.HyperionSerializer, Akka.Serialization.Hyperion"
    json = "Akka.Serialization.NewtonSoftJsonSerializer, Akka"
  }
  serialization-bindings {
    # Remoting messages: use Hyperion for speed
    "ScadaSystem.Messages.IClusterMessage, ScadaSystem" = hyperion
    # Persistence events: use JSON for readability
    "ScadaSystem.Persistence.IPersistedEvent, ScadaSystem" = json
  }
}
```

### Marker Interfaces for Binding

Use marker interfaces to group messages by serialization strategy:

```csharp
// All messages that cross the Remoting boundary
public interface IClusterMessage { }

// All events persisted to the journal
public interface IPersistedEvent { }

// Application messages (parameter lists elided)
public record CommandDispatched(...) : IClusterMessage, IPersistedEvent;
public record TagValueChanged(...) : IClusterMessage;
public record AlarmRaised(...) : IClusterMessage, IPersistedEvent;
```

Bindings are resolved per message type, not per transport. A type that implements both interfaces matches both bindings, so verify which serializer Akka.NET actually selects for it (for example with `FindSerializerFor`) rather than assuming Hyperion over Remoting and JSON in the journal.

## Common Patterns

### Versioning Persisted Events

Persistence events are stored permanently. When the message schema changes (new fields, renamed fields), the journal still contains old-format events. Handle this with version-tolerant deserialization:

```csharp
// Version 1
public record CommandDispatched(string CommandId, string DeviceId, string TagName, double Value, DateTime Timestamp);

// Version 2: added Priority field
public record CommandDispatchedV2(string CommandId, string DeviceId, string TagName, double Value, DateTime Timestamp, int Priority);

// In the persistent actor's recovery
Recover<CommandDispatched>(evt =>
{
    // Handle v1 events: no priority was recorded, so fall back to 0
    _state.AddPendingCommand(evt.CommandId, evt.DeviceId, priority: 0);
});

Recover<CommandDispatchedV2>(evt =>
{
    // Handle v2 events
    _state.AddPendingCommand(evt.CommandId, evt.DeviceId, evt.Priority);
});
```

Alternatively, use a custom serializer with built-in schema evolution (protobuf, Avro).
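As a complementary illustration of version-tolerant deserialization, the sketch below evolves a single record instead of introducing a second type. It assumes the journal stores events as JSON and uses `System.Text.Json` directly, outside Akka.NET, to read a version 1 payload into a record that has gained an optional `Priority` parameter. The names `CommandDispatchedEvent` and `EventJson` are hypothetical, and whether the serializer configured in Akka.NET behaves the same way should be confirmed with a round-trip test.

```csharp
using System;
using System.Text.Json;

// Hypothetical evolved event: Priority was added later and has a default,
// so payloads written before the change can still be read.
public record CommandDispatchedEvent(
    string CommandId,
    string DeviceId,
    string TagName,
    double Value,
    DateTime Timestamp,
    int Priority = 0);

public static class EventJson
{
    // Reads an event from JSON. A payload without "Priority" yields Priority == 0.
    public static CommandDispatchedEvent Read(string json) =>
        JsonSerializer.Deserialize<CommandDispatchedEvent>(json)
        ?? throw new JsonException("Payload deserialized to null.");

    // Writes an event in the current (version 2) shape.
    public static string Write(CommandDispatchedEvent evt) =>
        JsonSerializer.Serialize(evt);
}
```

With this shape, recovery needs only one handler, at the cost of restricting schema changes to additions with sensible defaults. Renamed or removed fields still call for separate versioned types or a serializer with explicit schema evolution.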
### Serialization Verification in Tests

Verify that all cross-boundary messages serialize and deserialize correctly:

```csharp
[Theory]
[MemberData(nameof(AllClusterMessages))]
public void All_cluster_messages_should_roundtrip_serialize(IClusterMessage message)
{
    // Resolve the serializer the ActorSystem would actually use for this type
    var serializer = Sys.Serialization.FindSerializerFor(message);
    var bytes = serializer.ToBinary(message);
    var deserialized = serializer.FromBinary(bytes, message.GetType());

    // Records compare by value, so this checks every field
    Assert.Equal(message, deserialized);
}
```

### Excluding Local-Only Messages

Not all messages need serialization. Mark local-only messages to avoid accidentally sending them across Remoting:

```csharp
// Local-only message: never crosses node boundaries
public record InternalDeviceStateUpdate(string TagName, object Value);
// This does NOT implement IClusterMessage
```

## Anti-Patterns

### IActorRef in Persisted Messages

`IActorRef` contains a node address that becomes invalid after restart. Never persist actor references. Store the actor's logical identifier (device ID, entity name) and resolve the reference at runtime.

### Serializing Everything as JSON

Sending every application message as JSON adds unnecessary latency and payload size on high-rate paths such as tag updates and values carried in Distributed Data. Akka.NET's infrastructure messages (cluster heartbeats, gossip, Singleton coordination) already use built-in binary serializers; use a binary serializer such as Hyperion for high-volume application traffic as well.

### Ignoring Serialization in Development

Serialization issues often surface only when Remoting is enabled (in multi-node testing or production). Test serialization explicitly during development, not just in production. Setting `akka.actor.serialize-messages = on` in test configuration forces local messages through serialization as well, which exposes problems on a single node.

### Large Serialized Payloads

If a serialized message exceeds Remoting's maximum frame size (default 128,000 bytes), the message is dropped. Remoting logs an error, but the sender receives no failure notification. Monitor serialized message sizes, especially for device state snapshots.
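One way to monitor payload sizes is a small guard that measures a message before it is sent or in a unit test. The sketch below is a minimal illustration under stated assumptions: it uses `System.Text.Json` as a stand-in for whichever serializer Akka.NET resolves, the `PayloadSizeGuard` class and its members are hypothetical, and the byte budget is an assumed value mirroring the default frame size mentioned above. A real frame also carries envelope overhead, so the budget should leave some headroom.

```csharp
using System;
using System.Text.Json;

public static class PayloadSizeGuard
{
    // Assumed budget in bytes, mirroring the default frame size discussed above.
    public const int DefaultMaxBytes = 128_000;

    // Size of the message when serialized as UTF-8 JSON (stand-in serializer).
    public static int SerializedSize<T>(T message) =>
        JsonSerializer.SerializeToUtf8Bytes(message).Length;

    // True when the serialized message fits within the given budget.
    public static bool FitsInFrame<T>(T message, int maxBytes = DefaultMaxBytes) =>
        SerializedSize(message) <= maxBytes;
}
```

In a test suite, the same check could be run over representative device state snapshots so that growth toward the limit is caught before deployment.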
## Configuration Guidance

### Hyperion for Remoting

```
NuGet: Akka.Serialization.Hyperion
```

```hocon
akka.actor {
  serializers {
    hyperion = "Akka.Serialization.HyperionSerializer, Akka.Serialization.Hyperion"
  }
  serialization-bindings {
    "System.Object" = hyperion  # Default all messages to Hyperion
  }
  serialization-settings.hyperion {
    preserve-object-references = false  # Better performance, no circular refs needed
    known-types-provider = "ScadaSystem.HyperionKnownTypes, ScadaSystem"
  }
}
```

### Known Types for Performance

Register frequently serialized types to improve Hyperion performance:

```csharp
using System;
using System.Collections.Generic;
using Akka.Serialization;

public class HyperionKnownTypes : IKnownTypesProvider
{
    // Types listed here are registered with Hyperion as known types
    public IEnumerable<Type> GetKnownTypes()
    {
        return new[]
        {
            typeof(CommandDispatched),
            typeof(CommandAcknowledged),
            typeof(TagValueChanged),
            typeof(AlarmRaised),
            typeof(DeviceStatus)
        };
    }
}
```

## References

- Official Documentation:
- Hyperion Serializer: (Note: check current status. Hyperion has had maintenance concerns; evaluate alternatives.)