scadalink-design/AkkaDotNet/02-Remoting.md
Joseph Doherty de636b908b Add Akka.NET reference documentation
Notes and documentation covering actors, remoting, clustering, persistence,
streams, serialization, hosting, testing, and best practices for the Akka.NET
framework used throughout the ScadaLink system.
2026-03-16 09:08:17 -04:00

# 02 — Remoting (Akka.Remote)
## Overview
Akka.Remote is the transport layer that enables actor systems on different machines to exchange messages transparently. In our SCADA system, Remoting is the communication backbone between the active and standby nodes in the failover pair. It underpins the Cluster module — you rarely interact with Remoting directly, but understanding its configuration is critical because it governs how the two nodes discover each other, exchange heartbeats, and transfer messages.
## When to Use
- Remoting is always enabled as a prerequisite for Akka.Cluster — it's not optional in our 2-node topology
- Direct Remoting APIs (sending to remote actor paths) are rarely needed; prefer Cluster-aware abstractions (Singleton, Pub-Sub, Distributed Data) instead
## When Not to Use
- Do not use raw Remoting for device communication — use Akka.IO or protocol-specific libraries for equipment comms
- Do not use Remoting to communicate with external systems (databases, APIs) — it's strictly for inter-ActorSystem communication
- Do not attempt to use Remoting without Cluster in our architecture; the Cluster module adds membership management and failure detection that raw Remoting lacks
## Design Decisions for the SCADA System
### Transport: DotNetty TCP
Akka.NET uses DotNetty as its TCP transport. For our Windows Server environment, this works well out of the box. Key decision: use a fixed, known port (not port 0/random) since both nodes in the failover pair have static IPs or hostnames.
### Port and Hostname Configuration
Each node in the pair must be individually addressable. Use explicit hostnames and ports:
- **Node A (Active):** `akka.tcp://scada-system@nodeA-hostname:4053`
- **Node B (Standby):** `akka.tcp://scada-system@nodeB-hostname:4053`
The ActorSystem name (`scada-system`) must be identical on both nodes.
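A minimal sketch of how the addresses above might appear in Node A's configuration. It assumes the pair is joined through `akka.cluster.seed-nodes`; the seed-node list is an illustration, not something this document prescribes. Node B would use the same block with its own hostname:

```hocon
akka {
  actor.provider = cluster              # Cluster provider; Remoting is enabled underneath it
  remote.dot-netty.tcp {
    hostname = "nodeA-hostname"         # This node's address (nodeB-hostname on Node B)
    port = 4053                         # Fixed port, identical on both nodes
  }
  cluster.seed-nodes = [                # Assumed: both nodes listed as seeds, same order on both
    "akka.tcp://scada-system@nodeA-hostname:4053",
    "akka.tcp://scada-system@nodeB-hostname:4053"
  ]
}
```

Each seed-node entry should match the address the target node advertises (system name, hostname, and port); a mismatch in any part typically prevents the nodes from joining.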
### Public Hostname vs. Bind Hostname
On machines with multiple NICs (common in industrial environments — one NIC for the equipment network, one for the corporate/management network), configure `public-hostname` to advertise the correct address:
```hocon
akka.remote.dot-netty.tcp {
  hostname = "0.0.0.0"                    # Bind to all interfaces
  public-hostname = "nodeA.scada.local"   # Advertise this address
  port = 4053
}
```
### Serialization Boundary
Every message that crosses the Remoting boundary must be serializable. This includes messages sent via Cluster Singleton, Distributed Data, and Pub-Sub. Design messages as simple, immutable records and register appropriate serializers (see [20-Serialization.md](./20-Serialization.md)).
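Serializer registration happens in HOCON under `akka.actor`. The following is a sketch only: the serializer class, the marker interface, and the assembly name `ScadaLink` are hypothetical placeholders, and 20-Serialization.md remains the reference for the serializers actually used:

```hocon
akka.actor {
  serializers {
    # Hypothetical serializer class, written as "Namespace.Type, Assembly"
    scada-messages = "ScadaLink.Serialization.ScadaMessageSerializer, ScadaLink"
  }
  serialization-bindings {
    # Hypothetical marker interface; messages implementing it use the serializer above
    "ScadaLink.Messages.IScadaMessage, ScadaLink" = scada-messages
  }
}
```

Binding a shared marker interface rather than individual message types keeps the configuration short as new messages are added.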
## Common Patterns
### Separate Equipment and Cluster Networks
If the site has separate networks for equipment communication and inter-node communication, bind Akka.Remote to the inter-node network only. Device actors use Akka.IO or direct protocol libraries on the equipment network. This prevents equipment traffic from interfering with cluster heartbeats.
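One way to express this, sketched with a hypothetical IP address and hostname for the inter-node NIC (substitute the site's real values), is to bind `hostname` to that interface instead of `0.0.0.0`:

```hocon
akka.remote.dot-netty.tcp {
  hostname = "10.10.0.11"                        # Hypothetical IP of the inter-node NIC; listen here only
  public-hostname = "nodeA-cluster.scada.local"  # Hypothetical name that resolves to that NIC
  port = 4053
}
```

With this binding, Akka.Remote does not listen on the equipment-network interface at all.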
### Logging Remote Lifecycle Events
Enable remote lifecycle event logging during development and initial deployment to diagnose connection issues:
```hocon
akka.remote {
  log-remote-lifecycle-events = on
  log-received-messages = off   # Too verbose for production
  log-sent-messages = off
}
```
### Watching Remote Actors
`Context.Watch()` works across Remoting boundaries. The device manager on the standby node can watch actors on the active node to detect failover conditions — though in practice, Cluster membership changes are a better signal.
## Anti-Patterns
### Using Remote Actor Paths Directly
Avoid constructing remote `ActorSelection` paths like `akka.tcp://scada-system@nodeB:4053/user/some-actor`. This creates tight coupling to physical addresses. Instead, use Cluster-aware mechanisms: Cluster Singleton proxy, Distributed Data, or Pub-Sub.
### Large Message Payloads
Remoting is optimized for small, frequent messages — not bulk data transfer. If you need to transfer large datasets between nodes (e.g., a full device state snapshot during failover), consider chunking the data or using an out-of-band mechanism (shared file system, database).
The default maximum frame size is 128000 bytes (about 128 KB). For our SCADA system, this should be sufficient for individual messages, but if device state snapshots exceed it, raise the limit:
```hocon
akka.remote.dot-netty.tcp {
  maximum-frame-size = 256000b   # Increase only if needed
}
```
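If the frame size has to grow beyond the socket buffer sizes, the buffers likely need to grow with it. A hedged sketch, assuming the `send-buffer-size` and `receive-buffer-size` settings of the DotNetty transport and using illustrative values that are not tuned for this system:

```hocon
akka.remote.dot-netty.tcp {
  maximum-frame-size = 512000b    # Illustrative value only
  send-buffer-size = 1024000b     # Assumed: keep socket buffers at least as large as the frame size
  receive-buffer-size = 1024000b
}
```

Both nodes should carry the same values, since a frame accepted by the sender can still be rejected by a receiver with a smaller limit.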
### Assuming Reliable Delivery
Remoting provides at-most-once delivery. Messages can be lost if the connection drops at the wrong moment. For commands that must not be lost during failover, use Akka.Persistence to journal them (see [08-Persistence.md](./08-Persistence.md)).
## Configuration Guidance
### Heartbeat and Failure Detection
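Two failure detectors apply at the Remoting level: the transport failure detector monitors the connection between the two ActorSystems, and the watch failure detector drives `Terminated` notifications for remotely watched actors. For a SCADA failover pair, faster detection means faster failover, but overly aggressive settings cause false positives: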
```hocon
akka.remote {
  transport-failure-detector {
    heartbeat-interval = 4s
    acceptable-heartbeat-pause = 20s   # Confirm the shipped default for your Akka.NET version in the configuration reference; increase to 30s on unreliable networks
  }
  watch-failure-detector {
    heartbeat-interval = 1s
    threshold = 10.0
    acceptable-heartbeat-pause = 10s
  }
}
```
For our 2-node SCADA pair on a reliable local network, the defaults are generally appropriate. Do not reduce `acceptable-heartbeat-pause` below 10s — garbage collection pauses on .NET can trigger false positives.
### Connection Limits
With only 2 nodes, connection limits are not a concern. Keep defaults.
### Retry and Backoff
If the active node crashes, the standby will keep trying to reconnect. After a failed attempt, Remoting gates the address for `retry-gate-closed-for` before the next attempt:
```hocon
akka.remote {
  retry-gate-closed-for = 5s   # Wait before retrying after a failed connection
}
```
## References
- Official Documentation: <https://getakka.net/articles/remoting/index.html>
- Configuration Reference: <https://getakka.net/articles/configuration/modules/akka.remote.html>
- Transports: <https://getakka.net/articles/remoting/transports.html>