# SCADA System — Design Documentation

## Overview

This document serves as the master index for the SCADA system design. The system is a centrally-managed, distributed SCADA configuration and deployment platform built on Akka.NET, running across a central cluster and multiple site clusters in a hub-and-spoke topology.

### Technology Stack

| Layer | Technology |
|-------|-----------|
| Runtime | .NET, Akka.NET (actors, clustering, remoting, persistence, streams) |
| Central UI | Blazor Server (ASP.NET Core + SignalR) |
| Inbound API | ASP.NET Core Web API (REST/JSON) |
| Central Database | MS SQL Server, Entity Framework Core |
| Site Storage | SQLite (deployed configs, S&F buffer, event logs) |
| Authentication | Direct LDAP/AD bind (LDAPS/StartTLS), JWT sessions |
| Notifications | SMTP with OAuth2 Client Credentials (Microsoft 365) |
| Hosting | Windows Server, Windows Service |
| Cluster | Akka.NET Cluster (active/standby, keep-oldest SBR) |
| Logging | Serilog (structured) |

### Scale

- ~10 site clusters, each with 50–500 machines, 25–75 live tags per machine.
- Central cluster: 2-node active/standby behind a load balancer.
- Site clusters: 2-node active/standby, headless (no UI).

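The scale figures imply a wide spread in live tag counts. The following back-of-the-envelope sketch multiplies out the bounds; it assumes exactly 10 sites (the source only says "~10") and treats the machine and tag ranges as independent. The constant and function names are illustrative only.

```python
# Rough sizing from the stated scale figures.
# Assumption: "~10 site clusters" is taken as exactly 10.
SITES = 10
MACHINES_PER_SITE = (50, 500)   # (min, max) machines per site
TAGS_PER_MACHINE = (25, 75)     # (min, max) live tags per machine


def tag_range(sites: int = SITES) -> tuple[int, int]:
    """Return the (lowest, highest) live tag count for the given number of sites."""
    low = sites * MACHINES_PER_SITE[0] * TAGS_PER_MACHINE[0]
    high = sites * MACHINES_PER_SITE[1] * TAGS_PER_MACHINE[1]
    return low, high


if __name__ == "__main__":
    print("per site:    ", tag_range(1))   # (1250, 37500)
    print("system-wide: ", tag_range())    # (12500, 375000)
```

Under these assumptions a single site carries between 1,250 and 37,500 live tags, and the whole system between 12,500 and 375,000.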
## Document Map

### Requirements
- [HighLevelReqs.md](HighLevelReqs.md) — Complete high-level requirements covering all functional areas.

### Component Design Documents

| # | Component | Document | Description |
|---|-----------|----------|-------------|
| 1 | Template Engine | [Component-TemplateEngine.md](Component-TemplateEngine.md) | Template modeling, inheritance, composition, path-qualified member addressing, override granularity, locking, alarms, flattening, semantic validation, revision hashing, and diff calculation. |
| 2 | Deployment Manager | [Component-DeploymentManager.md](Component-DeploymentManager.md) | Central-side deployment pipeline with deployment ID/idempotency, per-instance operation lock, state transition matrix, all-or-nothing site apply, system-wide artifact deployment with per-site status. |
| 3 | Site Runtime | [Component-SiteRuntime.md](Component-SiteRuntime.md) | Site-side actor hierarchy with explicit supervision strategies, staggered startup, script trust model (constrained APIs), Tell/Ask conventions, concurrency serialization, and site-wide Akka stream with per-subscriber backpressure. |
| 4 | Data Connection Layer | [Component-DataConnectionLayer.md](Component-DataConnectionLayer.md) | Common data connection interface (OPC UA, custom), Become/Stash connection actor model, auto-reconnect, immediate bad quality on disconnect, transparent re-subscribe, synchronous write failures, tag path resolution retry. |
| 5 | Central–Site Communication | [Component-Communication.md](Component-Communication.md) | Akka.NET remoting/cluster topology, 8 message patterns with per-pattern timeouts, application-level correlation IDs, transport heartbeat config, message ordering, connection failure behavior. |
| 6 | Store-and-Forward Engine | [Component-StoreAndForward.md](Component-StoreAndForward.md) | Buffering (transient failures only), fixed-interval retry, parking, async best-effort replication, SQLite persistence at sites. |
| 7 | External System Gateway | [Component-ExternalSystemGateway.md](Component-ExternalSystemGateway.md) | HTTP/REST + JSON, API key/Basic Auth, per-system timeout, dual call modes (Call/CachedCall), transient/permanent error classification, dedicated blocking I/O dispatcher, ADO.NET connection pooling. |
| 8 | Notification Service | [Component-NotificationService.md](Component-NotificationService.md) | SMTP with OAuth2 (M365) or Basic Auth, BCC delivery, plain text, transient/permanent SMTP error classification, store-and-forward integration. |
| 9 | Central UI | [Component-CentralUI.md](Component-CentralUI.md) | Blazor Server with SignalR real-time push, load balancer failover with JWT, all management workflows. |
| 10 | Security & Auth | [Component-Security.md](Component-Security.md) | Direct LDAP bind (LDAPS/StartTLS), JWT sessions (HMAC-SHA256, 15-min refresh, 30-min idle), role-based authorization, site-scoped permissions. |
| 11 | Health Monitoring | [Component-HealthMonitoring.md](Component-HealthMonitoring.md) | 30s report interval, 60s offline threshold, monotonic sequence numbers, raw error counts, tag resolution counts, dead letter monitoring. |
| 12 | Site Event Logging | [Component-SiteEventLogging.md](Component-SiteEventLogging.md) | SQLite storage, 30-day retention + 1GB cap, daily purge, paginated remote queries with keyword search. |
| 13 | Cluster Infrastructure | [Component-ClusterInfrastructure.md](Component-ClusterInfrastructure.md) | Akka.NET cluster, keep-oldest SBR with down-if-alone, min-nr-of-members=1, 2s/10s/15s failure detection, CoordinatedShutdown, automatic dual-node recovery. |
| 14 | Inbound API | [Component-InboundAPI.md](Component-InboundAPI.md) | POST /api/{methodName}, X-API-Key header, flat JSON, extended type system (Object/List), script-based implementations, failures-only logging. |
| 15 | Host | [Component-Host.md](Component-Host.md) | Single deployable binary, role-based component registration, per-component config binding (Options pattern), readiness gating, dead letter monitoring, Akka.NET bootstrap, ASP.NET Core hosting for central. |
| 16 | Commons | [Component-Commons.md](Component-Commons.md) | Namespace/folder convention (Types/Interfaces/Entities/Messages), shared data types, POCOs, repository interfaces, message contracts with additive-only versioning, UTC timestamp convention. |
| 17 | Configuration Database | [Component-ConfigurationDatabase.md](Component-ConfigurationDatabase.md) | EF Core data access, per-component repositories, unit-of-work, optimistic concurrency on deployment status, audit logging (IAuditService), migration management. |
| 18 | Management Service | [Component-ManagementService.md](Component-ManagementService.md) | Akka.NET ManagementActor on central, ClusterClientReceptionist registration, programmatic access to all admin operations, CLI interface. |
| 19 | CLI | [Component-CLI.md](Component-CLI.md) | Standalone command-line tool, System.CommandLine, Akka.NET ClusterClient transport, LDAP auth, JSON/table output, mirrors all Management Service operations. |

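To make the Health Monitoring row (component 11) more concrete, here is a minimal language-neutral sketch in Python of how a 30s report interval, a 60s offline threshold, and monotonic sequence numbers could fit together. It is not the component's actual implementation (which is Akka.NET). The class and method names are hypothetical, and two behaviours are assumptions: reports with a non-increasing sequence number are dropped, and a site that has never reported counts as offline.

```python
from dataclasses import dataclass
from typing import Optional

REPORT_INTERVAL_S = 30    # sites report every 30 seconds (from the table)
OFFLINE_THRESHOLD_S = 60  # silence longer than this marks a site offline


@dataclass
class SiteHealth:
    """Central-side view of one site's health reports (hypothetical type)."""
    last_seq: int = -1
    last_seen: Optional[float] = None

    def accept(self, seq: int, now: float) -> bool:
        # Assumption: stale or duplicate reports (seq not increasing) are ignored.
        if seq <= self.last_seq:
            return False
        self.last_seq = seq
        self.last_seen = now
        return True

    def is_offline(self, now: float) -> bool:
        # Assumption: a site with no report yet is treated as offline.
        if self.last_seen is None:
            return True
        return now - self.last_seen > OFFLINE_THRESHOLD_S
```

With these constants a site can miss one report and still be considered online; it is flagged only after more than two report intervals pass without an accepted report.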
### Reference Documentation

- [AkkaDotNet/](AkkaDotNet/) — Akka.NET reference notes covering actors, remoting, clustering, persistence, streams, serialization, hosting, testing, and best practices.
- [docs/plans/](docs/plans/) — Design decision documents from refinement sessions.

### Architecture Diagram (Logical)

```
                 Users (Blazor Server)
                           │
                     Load Balancer
                           │
  ┌────────────────────────┼────────────────────────────┐
  │              CENTRAL CLUSTER                        │
  │           (2-node active/standby)                   │
  │                                                     │
  │  ┌──────────┐ ┌──────────┐ ┌──────────┐             │
  │  │ Template │ │Deployment│ │ Central  │             │
  │  │ Engine   │ │ Manager  │ │ UI       │  Blazor Svr │
  │  └──────────┘ └──────────┘ └──────────┘             │
  │  ┌──────────┐ ┌──────────┐ ┌──────────┐             │
  │  │ Security │ │ Config   │ │ Health   │             │
  │  │ & Auth   │ │ DB       │ │ Monitor  │             │
  │  │(JWT/LDAP)│ │ (EF+IAud)│ │          │             │
  │  └──────────┘ └──────────┘ └──────────┘             │
  │  ┌──────────┐                                       │
  │  │ Inbound  │ ◄── External Systems (X-API-Key)      │
  │  │ API      │     POST /api/{method}, JSON          │
  │  └──────────┘                                       │
  │  ┌──────────┐                                       │
  │  │ Mgmt     │ ◄── CLI (ClusterClient)               │
  │  │ Service  │     ManagementActor + Receptionist    │
  │  └──────────┘                                       │
  │  ┌───────────────────────────────────┐              │
  │  │  Akka.NET Communication Layer     │              │
  │  │  (correlation IDs, per-pattern    │              │
  │  │   timeouts, message ordering)     │              │
  │  └──────────────┬────────────────────┘              │
  │  ┌──────────────┴────────────────────┐              │
  │  │  Configuration Database (EF)      │──► MS SQL    │
  │  └───────────────────────────────────┘   (Config DB)│
  │                 │                    Machine Data DB│
  └─────────────────┼───────────────────────────────────┘
                    │ Akka.NET Remoting
      ┌─────────────┼─────────────┐
      ▼             ▼             ▼
┌──────────┐  ┌──────────┐  ┌──────────┐
│ SITE A   │  │ SITE B   │  │ SITE N   │
│ (2-node) │  │ (2-node) │  │ (2-node) │
│ ┌──────┐ │  │ ┌──────┐ │  │ ┌──────┐ │
│ │Data  │ │  │ │Data  │ │  │ │Data  │ │
│ │Conn  │ │  │ │Conn  │ │  │ │Conn  │ │
│ │Layer │ │  │ │Layer │ │  │ │Layer │ │
│ ├──────┤ │  │ ├──────┤ │  │ ├──────┤ │
│ │Site  │ │  │ │Site  │ │  │ │Site  │ │
│ │Runtm │ │  │ │Runtm │ │  │ │Runtm │ │
│ ├──────┤ │  │ ├──────┤ │  │ ├──────┤ │
│ │S&F   │ │  │ │S&F   │ │  │ │S&F   │ │
│ │Engine│ │  │ │Engine│ │  │ │Engine│ │
│ ├──────┤ │  │ ├──────┤ │  │ ├──────┤ │
│ │ExtSys│ │  │ │ExtSys│ │  │ │ExtSys│ │
│ │Gatwy │ │  │ │Gatwy │ │  │ │Gatwy │ │
│ └──────┘ │  │ └──────┘ │  │ └──────┘ │
│ SQLite   │  │ SQLite   │  │ SQLite   │
└──────────┘  └──────────┘  └──────────┘
      │             │             │
  OPC UA /      OPC UA /      OPC UA /
   Custom        Custom        Custom
  Protocol      Protocol      Protocol
```

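The diagram shows external systems reaching the Inbound API with `POST /api/{method}`, a JSON body, and an `X-API-Key` header. As a rough client-side illustration, the sketch below only assembles such a request and does not send it. The host name, method name, and parameter names are hypothetical, and the `Content-Type` header is an assumption not stated in this document.

```python
import json


def build_inbound_request(base_url: str, method_name: str,
                          api_key: str, params: dict) -> dict:
    """Assemble (but do not send) a request for the Inbound API.

    The URL shape and X-API-Key header come from the design index;
    the Content-Type header is an assumption.
    """
    return {
        "method": "POST",
        "url": f"{base_url.rstrip('/')}/api/{method_name}",
        "headers": {
            "X-API-Key": api_key,
            "Content-Type": "application/json",  # assumed, not from the source
        },
        "body": json.dumps(params),  # flat JSON object of method parameters
    }


if __name__ == "__main__":
    # Hypothetical host, method, and parameter names, for illustration only.
    request = build_inbound_request(
        "https://central.example", "GetMachineStatus",
        "example-key", {"machineId": "M-1"},
    )
    print(request["url"])
```

The returned dictionary can be handed to whichever HTTP client the caller prefers; keeping construction separate from transport makes the request shape easy to inspect and test.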
### Site Runtime Actor Hierarchy

```
Deployment Manager Singleton (Cluster Singleton)
├── Instance Actor (one per deployed, enabled instance)
│   ├── Script Actor (coordinator, one per instance script)
│   │   └── Script Execution Actor (short-lived, per invocation)
│   ├── Alarm Actor (coordinator, one per alarm definition)
│   │   └── Alarm Execution Actor (short-lived, per on-trigger invocation)
│   └── ... (more Script/Alarm Actors)
├── Instance Actor
│   └── ...
└── ... (more Instance Actors)

Site-Wide Akka Stream (attribute + alarm state changes)
├── All Instance Actors publish to the stream
└── Debug view subscribes with instance-level filtering
```
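The site-wide stream with instance-level filtering can be pictured with a small conceptual sketch. This is plain Python, not Akka Streams: a generator stands in for the stream, and backpressure is not modeled. The `StateChange` type, its fields, and the instance names are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Iterable, Iterator


@dataclass(frozen=True)
class StateChange:
    """One event on the site-wide stream (hypothetical shape)."""
    instance: str   # name of the publishing Instance Actor
    kind: str       # "attribute" or "alarm"
    name: str
    value: Any


def debug_view(stream: Iterable[StateChange], instance: str) -> Iterator[StateChange]:
    """Yield only the events published by the selected instance."""
    return (event for event in stream if event.instance == instance)


if __name__ == "__main__":
    events = [
        StateChange("pump-1", "attribute", "speed", 10),
        StateChange("pump-2", "alarm", "overheat", True),
        StateChange("pump-1", "alarm", "overheat", False),
    ]
    for event in debug_view(events, "pump-1"):
        print(event.kind, event.name, event.value)
```

Every instance publishes to the same stream and each subscriber applies its own filter, so a debug view can follow one instance without the publishers needing to know who is listening.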