# ScadaBridge ScadaBridge is a centrally-managed, distributed SCADA configuration and deployment platform built on Akka.NET, running across a central cluster and multiple site clusters in a hub-and-spoke topology. ## Overview This repository is the full **implementation** project for ScadaBridge — the C#/.NET source (`src/`), tests (`tests/`), deployable Docker topology (`docker/`, `docker-env2/`, `infra/`), and the design documentation (`docs/`) that the code implements. This README is the master index: it links the per-component **design specs** (the spec the code in `src/` implements) and shows the system architecture. The solution file is `ZB.MOM.WW.ScadaBridge.slnx`. ### Technology Stack | Layer | Technology | |-------|-----------| | Runtime | .NET, Akka.NET (actors, clustering, remoting, persistence, streams) | | Central UI | Blazor Server (ASP.NET Core + SignalR) | | Inbound API | ASP.NET Core Web API (REST/JSON) | | Central Database | MS SQL Server, Entity Framework Core | | Site Storage | SQLite (deployed configs, S&F buffer, event logs) | | Authentication | Direct LDAP/AD bind (LDAPS/StartTLS), JWT sessions | | Notifications | Delivered from the central cluster (SMTP, OAuth2/Microsoft 365); store-and-forwarded from sites | | Hosting | Windows Server, Windows Service | | Cluster | Akka.NET Cluster (active/standby, keep-oldest SBR) | | Logging | Serilog (structured) | ### Scale - Central cluster: 2-node active/standby behind a load balancer. - Site clusters: 2-node active/standby, headless (no UI). ## Repository Layout | Path | Contents | |------|----------| | `src/` | C#/.NET implementation — one project per component (`ZB.MOM.WW.ScadaBridge.`). Solution: `ZB.MOM.WW.ScadaBridge.slnx`. | | `tests/` | Unit and integration test projects. | | `docs/` | Design documentation — `docs/requirements/` (high-level + per-component specs, the spec the code implements), `docs/test_infra/`, `docs/plans/`. | | `docker/` | Primary 8-node cluster topology (2 central + 3 sites × 2 nodes + Traefik) + `deploy.sh`. | | `docker-env2/` | Minimal second cluster (2 central + 1 site) for exercising Transport (#24) against a real second environment. | | `infra/` | Local test services (MS SQL, LDAP, OPC UA, SMTP, REST API, Traefik). | | `deploy/` | Production/on-host deployment artifacts (e.g. `wonder-app-vd03/`). | | `AkkaDotNet/` | Akka.NET reference notes. | ## Build, Test & Run ```bash # Build the solution dotnet build ZB.MOM.WW.ScadaBridge.slnx # Run the tests dotnet test ZB.MOM.WW.ScadaBridge.slnx # Bring up the primary local cluster (builds the scadabridge:latest image + recreates containers) bash docker/deploy.sh # central load balancer at http://localhost:9000 # Drive the system from the CLI (reads ~/.scadabridge/config.json; test user has all roles) dotnet run --project src/ZB.MOM.WW.ScadaBridge.CLI -- \ --username multi-role --password password template list ``` See [`docker/README.md`](docker/README.md) for ports and management commands, and [`src/ZB.MOM.WW.ScadaBridge.CLI/README.md`](src/ZB.MOM.WW.ScadaBridge.CLI/README.md) for the full CLI reference. ## Local Test Environments Two Docker-based cluster topologies are available for local development and testing: - **Primary** ([`docker/`](docker/)) — Full topology (2 central + 3 sites × 2 nodes + Traefik). Default development target. - **Env2** ([`docker-env2/`](docker-env2/)) — Minimal sibling stack (2 central + 1 site × 2 nodes + Traefik), runs concurrently with primary on host ports 91XX. Purpose: exercise the Transport (#24) bundle export/import feature against a real second environment. Both stacks share the infrastructure services in [`infra/`](infra/) (MS SQL, LDAP, SMTP, OPC UA, REST API). ## Document Map ### Requirements - [HighLevelReqs.md](docs/requirements/HighLevelReqs.md) — Complete high-level requirements covering all functional areas. ### Component Design Documents | # | Component | Document | Description | |---|-----------|----------|-------------| | 1 | Template Engine | [docs/requirements/Component-TemplateEngine.md](docs/requirements/Component-TemplateEngine.md) | Template modeling, inheritance, composition, path-qualified member addressing, override granularity, locking, alarms, native alarm source bindings, flattening, semantic validation, revision hashing, diff calculation, and folder organization (nested folders, drag-drop). | | 2 | Deployment Manager | [docs/requirements/Component-DeploymentManager.md](docs/requirements/Component-DeploymentManager.md) | Central-side deployment pipeline with deployment ID/idempotency, per-instance operation lock, state transition matrix, all-or-nothing site apply, system-wide artifact deployment with per-site status. | | 3 | Site Runtime | [docs/requirements/Component-SiteRuntime.md](docs/requirements/Component-SiteRuntime.md) | Site-side actor hierarchy with explicit supervision strategies, staggered startup, script trust model (constrained APIs), Tell/Ask conventions, concurrency serialization, site-wide Akka stream with per-subscriber backpressure, and a read-only Native Alarm Actor (peer to the computed Alarm Actor) mirroring native OPC UA A&C / MxAccess alarms with site SQLite persistence. | | 4 | Data Connection Layer | [docs/requirements/Component-DataConnectionLayer.md](docs/requirements/Component-DataConnectionLayer.md) | Common data connection interface (OPC UA, MxGateway, custom), Become/Stash connection actor model, auto-reconnect, immediate bad quality on disconnect, transparent re-subscribe, synchronous write failures, tag path resolution retry, protocol-agnostic address-space browse, and optional read-only native alarm mirroring (`IAlarmSubscribableConnection`, one alarm feed per connection with snapshot replay). | | 5 | Central–Site Communication | [docs/requirements/Component-Communication.md](docs/requirements/Component-Communication.md) | Dual transport: Akka.NET ClusterClient (command/control) + gRPC server-streaming (real-time data). 9 message patterns with per-pattern timeouts, SiteStreamGrpcServer/Client, application-level correlation IDs, transport heartbeat config, gRPC keepalive, message ordering, connection failure behavior. The gRPC stream additively carries the read-only native alarm mirror (computed + native OPC UA / MxAccess) via the enriched `AlarmStateUpdate`. | | 6 | Store-and-Forward Engine | [docs/requirements/Component-StoreAndForward.md](docs/requirements/Component-StoreAndForward.md) | Buffering (transient failures only), fixed-interval retry, parking, async best-effort replication, SQLite persistence at sites. | | 7 | External System Gateway | [docs/requirements/Component-ExternalSystemGateway.md](docs/requirements/Component-ExternalSystemGateway.md) | HTTP/REST + JSON, API key/Basic Auth, per-system timeout, dual call modes (Call/CachedCall), transient/permanent error classification, dedicated blocking I/O dispatcher, ADO.NET connection pooling. | | 8 | Notification Service | [docs/requirements/Component-NotificationService.md](docs/requirements/Component-NotificationService.md) | Central-only — manages typed notification-list and SMTP definitions, supplies per-type delivery adapters (SMTP with OAuth2 (M365) or Basic Auth, BCC, plain text); delivery performed by the Notification Outbox. | | 9 | Central UI | [docs/requirements/Component-CentralUI.md](docs/requirements/Component-CentralUI.md) | Blazor Server with SignalR real-time push, load balancer failover with JWT, all management workflows. | | 10 | Security & Auth | [docs/requirements/Component-Security.md](docs/requirements/Component-Security.md) | Direct LDAP bind (LDAPS/StartTLS), JWT sessions (HMAC-SHA256, 15-min refresh, 30-min idle), role-based authorization, site-scoped permissions. | | 11 | Health Monitoring | [docs/requirements/Component-HealthMonitoring.md](docs/requirements/Component-HealthMonitoring.md) | 30s report interval, 60s offline threshold, monotonic sequence numbers, raw error counts, tag resolution counts, dead letter monitoring. | | 12 | Site Event Logging | [docs/requirements/Component-SiteEventLogging.md](docs/requirements/Component-SiteEventLogging.md) | SQLite storage, 30-day retention + 1GB cap, daily purge, paginated remote queries with keyword search. | | 13 | Cluster Infrastructure | [docs/requirements/Component-ClusterInfrastructure.md](docs/requirements/Component-ClusterInfrastructure.md) | Akka.NET cluster, keep-oldest SBR with down-if-alone, min-nr-of-members=1, 2s/10s/15s failure detection, CoordinatedShutdown, automatic dual-node recovery. The `ClusterInfrastructure` project owns the `ClusterOptions` config model; the Akka bootstrap/SBR/CoordinatedShutdown wiring lives in the Host. | | 14 | Inbound API | [docs/requirements/Component-InboundAPI.md](docs/requirements/Component-InboundAPI.md) | POST /api/{methodName}, X-API-Key header, flat JSON, extended type system (Object/List), script-based implementations, failures-only logging. | | 15 | Host | [docs/requirements/Component-Host.md](docs/requirements/Component-Host.md) | Single deployable binary, role-based component registration, per-component config binding (Options pattern), readiness gating, dead letter monitoring, Akka.NET bootstrap, ASP.NET Core hosting for central. | | 16 | Commons | [docs/requirements/Component-Commons.md](docs/requirements/Component-Commons.md) | Namespace/folder convention (Types/Interfaces/Entities/Messages), shared data types, POCOs, repository interfaces, message contracts with additive-only versioning, UTC timestamp convention, the unified read-only alarm condition model (`AlarmConditionState`/`AlarmKind`), and native alarm source entities + the `IAlarmSubscribableConnection` capability seam. | | 17 | Configuration Database | [docs/requirements/Component-ConfigurationDatabase.md](docs/requirements/Component-ConfigurationDatabase.md) | EF Core data access, per-component repositories, unit-of-work, optimistic concurrency on deployment status, audit logging (IAuditService), migration management (incl. the `AddNativeAlarmSources` migration + native alarm source repository CRUD). | | 18 | Management Service | [docs/requirements/Component-ManagementService.md](docs/requirements/Component-ManagementService.md) | Akka.NET ManagementActor on central, ClusterClientReceptionist registration, programmatic access to all admin operations, CLI interface. | | 19 | CLI | [docs/requirements/Component-CLI.md](docs/requirements/Component-CLI.md) | Standalone command-line tool, System.CommandLine, HTTP transport via Management API, JSON/table output, mirrors all Management Service operations. | | 20 | Traefik Proxy | [docs/requirements/Component-TraefikProxy.md](docs/requirements/Component-TraefikProxy.md) | Reverse proxy/load balancer fronting central cluster, active node routing via `/health/active`, automatic failover. | | 21 | Notification Outbox | [docs/requirements/Component-NotificationOutbox.md](docs/requirements/Component-NotificationOutbox.md) | Central component ingesting store-and-forwarded notifications into the `Notifications` audit table, with `NotificationOutboxActor` singleton dispatcher, per-type delivery adapters, retry/parking, status tracking, daily purge, and delivery KPIs. | | 22 | Site Call Audit | [docs/requirements/Component-SiteCallAudit.md](docs/requirements/Component-SiteCallAudit.md) | Central component auditing site cached calls (`ExternalSystem.CachedCall`/`Database.CachedWrite`) into the `SiteCalls` audit table, with `SiteCallAuditActor` singleton, telemetry ingest, periodic reconciliation, point-in-time KPIs, daily purge, and central→site Retry/Discard relay for parked calls. | | 23 | Audit Log | [docs/requirements/Component-AuditLog.md](docs/requirements/Component-AuditLog.md) | New central append-only AuditLog spanning every script-trust-boundary action (outbound API sync+cached, outbound DB sync+cached, notifications, inbound API). Site-local SQLite hot-path append + gRPC telemetry + central reconciliation; combined telemetry packet with Site Call Audit; central direct-write for Notification Outbox dispatch + Inbound API middleware; monthly partitioning, 365-day default retention. | | 24 | Transport | [docs/requirements/Component-Transport.md](docs/requirements/Component-Transport.md) | Bundle export/import for templates, shared scripts, external systems, central-only artifacts. AES-256-GCM encryption; per-conflict resolution on import; correlated audit trail. | **Shared UI sub-component** (not a top-level component): [TreeView](docs/requirements/Component-TreeView.md) — reusable hierarchical tree/grid Blazor component used by the Central UI (#9) for the templates folder hierarchy, data-connection browse, and tag pickers. ### Reference Documentation - [AkkaDotNet/](AkkaDotNet/) — Akka.NET reference notes covering actors, remoting, clustering, persistence, streams, serialization, hosting, testing, and best practices. - [docs/plans/](docs/plans/) — Design decision documents from refinement sessions. ### Architecture Diagram (Logical) ![Logical architecture](diagrams/architecture-logical.png) ### Site Runtime Actor Hierarchy ![Site runtime actor hierarchy](diagrams/site-runtime-actor-hierarchy.png)