
ScadaLink Docker Infrastructure

Local Docker deployment of the full ScadaLink cluster topology: a 2-node central cluster and three 2-node site clusters.

Cluster Topology

┌───────────────────────────────────────────────────┐
│                  Central Cluster                  │
│                                                   │
│  ┌──────────────────┐    ┌──────────────────┐     │
│  │  central-node-a  │◄──►│  central-node-b  │     │
│  │  (leader/oldest) │    │  (standby)       │     │
│  │  Web UI :9001    │    │  Web UI :9002    │     │
│  │  Akka   :9011    │    │  Akka   :9012    │     │
│  └────────┬─────────┘    └──────────────────┘     │
│           │                                       │
└───────────┼───────────────────────────────────────┘
            │ Akka.NET Remoting (hub-and-spoke)
            ├──────────────────┬──────────────────┐
            ▼                  ▼                  ▼
┌────────────────────┐ ┌────────────────────┐ ┌────────────────────┐
│  Site-A Cluster    │ │  Site-B Cluster    │ │  Site-C Cluster    │
│  (Test Plant A)    │ │  (Test Plant B)    │ │  (Test Plant C)    │
│                    │ │                    │ │                    │
│  node-a ◄──► node-b│ │  node-a ◄──► node-b│ │  node-a ◄──► node-b│
│  :9021      :9022  │ │  :9031      :9032  │ │  :9041      :9042  │
└────────────────────┘ └────────────────────┘ └────────────────────┘

Central Cluster (active/standby)

Runs the web UI (Blazor Server), Template Engine, Deployment Manager, Security, Inbound API, Management Service, and Health Monitoring. Connects to MS SQL for configuration and machine data, LDAP for authentication, and SMTP for notifications.

Site Clusters (active/standby each)

Each site cluster runs Site Runtime, Data Connection Layer, Store-and-Forward, and Site Event Logging. Sites connect to OPC UA for device data and to the central cluster via Akka.NET remoting. Deployed configurations and S&F buffers are stored in local SQLite databases per node.
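Each site node reads its settings from its own appsettings.Site.json (see the directory structure below). A purely illustrative sketch of the kind of per-node settings involved — the key names here are hypothetical, not the application's actual configuration schema:

```json
{
  "Site": {
    "SiteId": "site-a"
  },
  "ConnectionStrings": {
    "LocalStore": "Data Source=/app/data/site.db"
  }
}
```

The site identifier is what distinguishes otherwise identical node configurations, which is why each node gets its own directory rather than a shared file.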

Site Cluster   Site Identifier   Central UI Name
Site-A         site-a            Test Plant A
Site-B         site-b            Test Plant B
Site-C         site-c            Test Plant C

Port Allocation

Application Nodes

Node        Container Name        Host Web Port   Host Akka Port   Internal Ports
Central A   scadalink-central-a   9001            9011             5000 (web), 8081 (Akka)
Central B   scadalink-central-b   9002            9012             5000 (web), 8081 (Akka)
Site-A A    scadalink-site-a-a    —               9021             8082 (Akka)
Site-A B    scadalink-site-a-b    —               9022             8082 (Akka)
Site-B A    scadalink-site-b-a    —               9031             8082 (Akka)
Site-B B    scadalink-site-b-b    —               9032             8082 (Akka)
Site-C A    scadalink-site-c-a    —               9041             8082 (Akka)
Site-C B    scadalink-site-c-b    —               9042             8082 (Akka)

Port block pattern: 90X1/90X2 (node A/node B) where X = 0 (central web), 1 (central Akka), 2 (site-a), 3 (site-b), 4 (site-c).
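The pattern can be expanded with plain shell arithmetic; this snippet is illustrative only and just prints the host-port pairs for each cluster block:

```shell
# Expand the 90X1/90X2 host-port pattern for each cluster block.
for x in 0 1 2 3 4; do
  case $x in
    0) label="central web"  ;;
    1) label="central Akka" ;;
    2) label="site-a Akka"  ;;
    3) label="site-b Akka"  ;;
    4) label="site-c Akka"  ;;
  esac
  echo "X=$x ($label): node A -> 90${x}1, node B -> 90${x}2"
done
```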

Infrastructure Services (from infra/docker-compose.yml)

Service          Container Name      Host Port      Purpose
MS SQL 2022      scadalink-mssql     1433           Configuration and machine data databases
LDAP (GLAuth)    scadalink-ldap      3893           Authentication with test users
SMTP (Mailpit)   scadalink-smtp      1025 / 8025    Email capture (SMTP / web UI)
OPC UA           scadalink-opcua     50000 / 8080   Simulated OPC UA server (protocol / web UI)
REST API         scadalink-restapi   5200           External REST API for integration testing

All containers communicate over the shared scadalink-net Docker bridge network using container names as hostnames.

Directory Structure

docker/
├── Dockerfile                          # Multi-stage build (shared by all nodes)
├── docker-compose.yml                  # 8-node application stack
├── build.sh                            # Build Docker image
├── deploy.sh                           # Build + deploy all containers
├── teardown.sh                         # Stop and remove containers
├── central-node-a/
│   ├── appsettings.Central.json        # Central node A configuration
│   └── logs/                           # Serilog file output (gitignored)
├── central-node-b/
│   ├── appsettings.Central.json
│   └── logs/
├── site-a-node-a/
│   ├── appsettings.Site.json           # Site-A node A configuration
│   ├── data/                           # SQLite databases (gitignored)
│   └── logs/
├── site-a-node-b/
│   ├── appsettings.Site.json
│   ├── data/
│   └── logs/
├── site-b-node-a/
│   ├── appsettings.Site.json           # Site-B node A configuration
│   ├── data/
│   └── logs/
├── site-b-node-b/
│   ├── appsettings.Site.json
│   ├── data/
│   └── logs/
├── site-c-node-a/
│   ├── appsettings.Site.json           # Site-C node A configuration
│   ├── data/
│   └── logs/
└── site-c-node-b/
    ├── appsettings.Site.json
    ├── data/
    └── logs/

Commands

Initial Setup

Start infrastructure services first, then build and deploy the application:

# 1. Start test infrastructure (MS SQL, LDAP, SMTP, OPC UA)
cd infra && docker compose up -d && cd ..

# 2. Build and deploy all 8 ScadaLink nodes
docker/deploy.sh

After Code Changes

Rebuild and redeploy. The Docker build cache skips NuGet restore when only source files change:

docker/deploy.sh

Stop Application Nodes

Stops and removes all 8 application containers. Site SQLite databases and log files are preserved in node directories:

docker/teardown.sh

Stop Everything

docker/teardown.sh
cd infra && docker compose down && cd ..

View Logs

# All nodes (follow mode)
docker compose -f docker/docker-compose.yml logs -f

# Single node
docker logs -f scadalink-central-a

# Filter by site cluster
docker compose -f docker/docker-compose.yml logs -f site-a-a site-a-b
docker compose -f docker/docker-compose.yml logs -f site-b-a site-b-b
docker compose -f docker/docker-compose.yml logs -f site-c-a site-c-b

# Persisted log files
ls docker/central-node-a/logs/

Restart a Single Node

docker restart scadalink-central-a

Check Cluster Health

# Central node A health check
curl -s http://localhost:9001/health/ready | python3 -m json.tool

# Central node B health check
curl -s http://localhost:9002/health/ready | python3 -m json.tool

CLI Access

Connect the ScadaLink CLI to the central cluster via host-mapped Akka remoting ports:

dotnet run --project src/ScadaLink.CLI -- \
    --contact-points akka.tcp://scadalink@localhost:9011 \
    --username admin --password password \
    template list

Clear Site Data

Remove SQLite databases to reset site state (deployed configs, S&F buffers):

# Single site
rm -rf docker/site-a-node-a/data docker/site-a-node-b/data
docker restart scadalink-site-a-a scadalink-site-a-b

# All sites
rm -rf docker/site-*/data
docker restart scadalink-site-a-a scadalink-site-a-b \
    scadalink-site-b-a scadalink-site-b-b \
    scadalink-site-c-a scadalink-site-c-b

Rebuild Image From Scratch (no cache)

docker build --no-cache -t scadalink:latest -f docker/Dockerfile .

Build Cache

The Dockerfile uses a multi-stage build optimized for fast rebuilds:

  1. Restore stage: Copies only .csproj files and runs dotnet restore. This layer is cached as long as no project file changes.
  2. Build stage: Copies source code and runs dotnet publish --no-restore. Re-runs on any source change but skips restore.
  3. Runtime stage: Uses the slim aspnet:10.0 base image with only the published output.

Typical rebuild after a source-only change takes ~5 seconds (restore cached, only build + publish runs).
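The layering can be sketched as follows; the project names and paths here are placeholders, not the repository's actual Dockerfile:

```dockerfile
# Stage 1: restore — cached until a .csproj changes
FROM mcr.microsoft.com/dotnet/sdk:10.0 AS restore
WORKDIR /src
COPY ["src/ScadaLink.Host/ScadaLink.Host.csproj", "src/ScadaLink.Host/"]
RUN dotnet restore src/ScadaLink.Host/ScadaLink.Host.csproj

# Stage 2: build — re-runs on source changes, skips restore
FROM restore AS build
COPY src/ src/
RUN dotnet publish src/ScadaLink.Host/ScadaLink.Host.csproj \
    -c Release -o /app/publish --no-restore

# Stage 3: runtime — slim base image with published output only
FROM mcr.microsoft.com/dotnet/aspnet:10.0
WORKDIR /app
COPY --from=build /app/publish .
ENTRYPOINT ["dotnet", "ScadaLink.Host.dll"]
```

Copying only the .csproj files before `dotnet restore` is what keeps the restore layer cache-stable across source-only edits.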

Test Users

All test users share the password "password". See infra/glauth/config.toml for the full list.

Username     Roles                       Use Case
admin        Admin                       System administration
designer     Design                      Template authoring
deployer     Deployment                  Instance deployment (all sites)
multi-role   Admin, Design, Deployment   Full access for testing

Failover Testing

Central Failover

# Stop the active central node
docker stop scadalink-central-a

# Verify central-b takes over (check logs for leader election)
docker logs -f scadalink-central-b

# Access UI on standby node
open http://localhost:9002

# Restore the original node
docker start scadalink-central-a

Site Failover

# Stop the active site-a node
docker stop scadalink-site-a-a

# Verify site-a-b takes over singleton (DeploymentManager)
docker logs -f scadalink-site-a-b

# Restore
docker start scadalink-site-a-a

Same pattern applies for site-b (scadalink-site-b-a/scadalink-site-b-b) and site-c (scadalink-site-c-a/scadalink-site-c-b).

Failover takes approximately 25 seconds (2s heartbeat + 10s detection threshold + 15s stable-after for split-brain resolver).
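Those three intervals map onto standard Akka.NET cluster settings. A HOCON sketch consistent with the numbers above — the actual values (and the split-brain strategy, which is not stated here) live in the application's configuration:

```hocon
akka.cluster {
  failure-detector {
    heartbeat-interval = 2s           # heartbeat
    acceptable-heartbeat-pause = 10s  # detection threshold
  }
  split-brain-resolver {
    active-strategy = keep-oldest     # hypothetical strategy choice
    stable-after = 15s                # wait before resolving
  }
}
```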