scadalink-design/docs/requirements/Component-Host.md
Joseph Doherty 416a03b782 feat: complete gRPC streaming channel — site host, docker config, docs, integration tests
Switch site host to WebApplicationBuilder with Kestrel HTTP/2 gRPC server,
add GrpcPort/keepalive config, wire SiteStreamManager as ISiteStreamSubscriber,
expose gRPC ports in docker-compose, add site seed script, update all 10
requirement docs + CLAUDE.md + README.md for the new dual-transport architecture.
2026-03-21 12:38:33 -04:00

# Component: Host
## Purpose
The Host component is the single deployable executable for the entire ScadaLink system. The same binary runs on every node — central and site alike. The node's role is determined entirely by configuration (`appsettings.json`), not by which binary is deployed. On central nodes the Host additionally bootstraps ASP.NET Core to serve the Central UI and Inbound API web endpoints.
## Location
All nodes (central and site).
## Responsibilities
- Serve as the single entry point (`Program.cs`) for the ScadaLink process.
- Read and validate node configuration at startup before any actor system is created.
- Register the correct set of component services and actors based on the configured node role.
- Bootstrap the Akka.NET actor system with Remoting, Clustering, Persistence, and split-brain resolution via Akka.Hosting.
- Host ASP.NET Core web endpoints on central nodes only.
- Configure structured logging (Serilog) with environment-specific enrichment.
- Support running as a Windows Service in production and as a console application during development.
- Perform graceful shutdown via Akka.NET CoordinatedShutdown when the service is stopped.
---
## Requirements
### REQ-HOST-1: Single Binary Deployment
The same compiled binary must be deployable to both central and site nodes. The node's role (Central or Site) is determined solely by configuration values in `appsettings.json` (or environment-specific overrides). There must be no separate build targets, projects, or conditional compilation symbols for central vs. site.
### REQ-HOST-2: Role-Based Service Registration
At startup the Host must inspect the configured node role and register only the component services appropriate for that role:
- **Shared** (both Central and Site): ClusterInfrastructure, Communication, HealthMonitoring, ExternalSystemGateway, NotificationService.
- **Central only**: TemplateEngine, DeploymentManager, Security, AuditLogging, CentralUI, InboundAPI, ManagementService.
- **Site only**: SiteRuntime, DataConnectionLayer, StoreAndForward, SiteEventLogging.
Components not applicable to the current role must not be registered in the DI container or the Akka.NET actor system.
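
As an illustration of the role switch, here is a minimal sketch that maps a role to the component names listed above. `ComponentCatalog` is a hypothetical helper, not part of the codebase; in the real Host each name would correspond to an `AddXxx()` call rather than a string.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public enum NodeRole { Central, Site }

// Hypothetical helper: returns the component names from REQ-HOST-2 for a role.
public static class ComponentCatalog
{
    private static readonly string[] Shared =
    {
        "ClusterInfrastructure", "Communication", "HealthMonitoring",
        "ExternalSystemGateway", "NotificationService"
    };

    private static readonly string[] CentralOnly =
    {
        "TemplateEngine", "DeploymentManager", "Security", "AuditLogging",
        "CentralUI", "InboundAPI", "ManagementService"
    };

    private static readonly string[] SiteOnly =
    {
        "SiteRuntime", "DataConnectionLayer", "StoreAndForward", "SiteEventLogging"
    };

    public static IReadOnlyList<string> For(NodeRole role) => role switch
    {
        NodeRole.Central => Shared.Concat(CentralOnly).ToArray(),
        NodeRole.Site => Shared.Concat(SiteOnly).ToArray(),
        _ => throw new ArgumentOutOfRangeException(nameof(role), role, "Unknown node role")
    };
}
```

Keeping the three lists separate makes the rule easy to audit: a component that appears in `SiteOnly` can never be registered on a central node.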
### REQ-HOST-3: Configuration Binding
The Host must bind configuration sections from `appsettings.json` to strongly-typed options classes using the .NET **Options pattern** (`IOptions<T>` / `IOptionsSnapshot<T>`). Each component has its own configuration section under `ScadaLink`, mapped to a dedicated configuration class owned by that component.
#### Infrastructure Sections
| Section | Options Class | Owner | Contents |
|---------|--------------|-------|----------|
| `ScadaLink:Node` | `NodeOptions` | Host | Role, NodeHostname, SiteId, RemotingPort, GrpcPort (site only, default 8083) |
| `ScadaLink:Cluster` | `ClusterOptions` | ClusterInfrastructure | SeedNodes, SplitBrainResolverStrategy, StableAfter, HeartbeatInterval, FailureDetectionThreshold, MinNrOfMembers |
| `ScadaLink:Database` | `DatabaseOptions` | Host | Central: ConfigurationDb, MachineDataDb connection strings; Site: SQLite paths |
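
For orientation, a site node's infrastructure sections might look roughly like the fragment below. The key names come from the table above; every value (hostnames, site ID, remoting port, actor system name in the seed addresses, the `StableAfter` format) is an illustrative assumption. The `Database` section is omitted because its exact key names for SQLite paths are not specified here.

```json
{
  "ScadaLink": {
    "Node": {
      "Role": "Site",
      "NodeHostname": "site-a-node1",
      "SiteId": "site-a",
      "RemotingPort": 8082,
      "GrpcPort": 8083
    },
    "Cluster": {
      "SeedNodes": [
        "akka.tcp://scadalink@site-a-node1:8082",
        "akka.tcp://scadalink@site-a-node2:8082"
      ],
      "StableAfter": "00:00:15"
    }
  }
}
```
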
#### Per-Component Sections
| Section | Options Class | Owner | Contents |
|---------|--------------|-------|----------|
| `ScadaLink:DataConnection` | `DataConnectionOptions` | Data Connection Layer | ReconnectInterval, TagResolutionRetryInterval, WriteTimeout |
| `ScadaLink:StoreAndForward` | `StoreAndForwardOptions` | Store-and-Forward | SqliteDbPath, ReplicationEnabled |
| `ScadaLink:HealthMonitoring` | `HealthMonitoringOptions` | Health Monitoring | ReportInterval, OfflineTimeout |
| `ScadaLink:SiteEventLog` | `SiteEventLogOptions` | Site Event Logging | RetentionDays, MaxStorageMb, PurgeScheduleCron |
| `ScadaLink:Communication` | `CommunicationOptions` | Communication | DeploymentTimeout, LifecycleTimeout, QueryTimeout, TransportHeartbeatInterval, TransportFailureThreshold |
| `ScadaLink:Security` | `SecurityOptions` | Security & Auth | LdapServer, LdapPort, LdapUseTls, JwtSigningKey, JwtExpiryMinutes, IdleTimeoutMinutes |
| `ScadaLink:InboundApi` | `InboundApiOptions` | Inbound API | DefaultMethodTimeout |
| `ScadaLink:Notification` | `NotificationOptions` | Notification Service | (SMTP config is stored in config DB and deployed to sites, not in appsettings) |
| `ScadaLink:ManagementService` | `ManagementServiceOptions` | Management Service | (Reserved for future configuration) |
| `ScadaLink:Logging` | `LoggingOptions` | Host | Serilog sink configuration, log level overrides |
#### Convention
- Each component defines its own options class (e.g., `DataConnectionOptions`) in its own project. The class is a plain POCO with properties matching the JSON section keys.
- The Host binds each section during startup via `services.Configure<T>(configuration.GetSection("ScadaLink:<ComponentName>"))`.
- Each component's `AddXxx()` extension method accepts `IServiceCollection` and reads its options via `IOptions<T>` — the component never reads `IConfiguration` directly.
- Options classes live in the component project, not in Commons, because they are component-specific configuration — not shared contracts.
- Startup validation (REQ-HOST-4) validates all required options before the actor system starts.
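
The convention can be sketched as follows. The property types and defaults on `DataConnectionOptions`, the `ReconnectPolicy` service, and the `HostBinding` helper are assumptions made for illustration; only the `Configure<T>(GetSection(...))` and `IOptions<T>` usage reflects the convention described above.

```csharp
using System;
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Options;

// Options POCO owned by the component; property names mirror the JSON keys.
// The TimeSpan types and default values are assumptions.
public sealed class DataConnectionOptions
{
    public TimeSpan ReconnectInterval { get; set; } = TimeSpan.FromSeconds(5);
    public TimeSpan WriteTimeout { get; set; } = TimeSpan.FromSeconds(10);
}

// Hypothetical component service: reads options via IOptions<T>, never IConfiguration.
public sealed class ReconnectPolicy
{
    public TimeSpan Interval { get; }

    public ReconnectPolicy(IOptions<DataConnectionOptions> options)
        => Interval = options.Value.ReconnectInterval;
}

// The component's AddXxx() extension method registers its own services.
public static class DataConnectionServiceCollectionExtensions
{
    public static IServiceCollection AddDataConnectionLayer(this IServiceCollection services)
        => services.AddSingleton<ReconnectPolicy>();
}

// What the Host does at startup: bind the section, then call AddXxx().
public static class HostBinding
{
    public static IServiceProvider Build(IConfiguration configuration)
    {
        var services = new ServiceCollection();
        services.Configure<DataConnectionOptions>(
            configuration.GetSection("ScadaLink:DataConnection"));
        services.AddDataConnectionLayer();
        return services.BuildServiceProvider();
    }
}
```

Because the component only sees `IOptions<DataConnectionOptions>`, it can be unit tested with `Options.Create(...)` and no configuration file at all.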
### REQ-HOST-4: Startup Validation
Before the Akka.NET actor system is created, the Host must validate all required configuration values and fail fast with a clear error message if any are missing or invalid. Validation rules include:
- `NodeConfiguration.Role` must be a valid `NodeRole` value.
- `NodeConfiguration.NodeHostname` must not be null or empty.
- `NodeConfiguration.RemotingPort` must be in valid port range (1–65535).
- Site nodes must have `GrpcPort` in valid port range (1–65535) and different from `RemotingPort`.
- Site nodes must have a non-empty `SiteId`.
- Central nodes must have non-empty `ConfigurationDb` and `MachineDataDb` connection strings.
- Site nodes must have non-empty SQLite path values. Site nodes do **not** require a `ConfigurationDb` connection string — all configuration is received via artifact deployment and read from local SQLite.
- At least two seed nodes must be configured.
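
The rules above could be checked along these lines. This is a sketch under stated assumptions: `StartupSettings` is a hypothetical flattened stand-in for the bound options classes, and `StartupValidator` is not an existing type. Collecting all errors before failing lets the operator fix a broken configuration in one pass.

```csharp
using System;
using System.Collections.Generic;

public enum NodeRole { Central, Site }

// Hypothetical flattened view of the values the Host validates at startup.
public sealed record StartupSettings(
    string? Role, string? NodeHostname, int RemotingPort, int GrpcPort,
    string? SiteId, string? ConfigurationDb, string? MachineDataDb,
    string? SiteSqlitePath, IReadOnlyList<string> SeedNodes);

public static class StartupValidator
{
    private static bool IsValidPort(int port) => port is >= 1 and <= 65535;

    // Returns every violated rule; an empty list means the configuration is valid.
    public static IReadOnlyList<string> Validate(StartupSettings s)
    {
        var errors = new List<string>();

        // TryParse also accepts numeric strings, so IsDefined rejects values like "7".
        bool roleOk = Enum.TryParse<NodeRole>(s.Role, ignoreCase: true, out var role)
                      && Enum.IsDefined(typeof(NodeRole), role);

        if (!roleOk) errors.Add("Role must be Central or Site.");
        if (string.IsNullOrEmpty(s.NodeHostname)) errors.Add("NodeHostname must not be empty.");
        if (!IsValidPort(s.RemotingPort)) errors.Add("RemotingPort must be in 1-65535.");
        if (s.SeedNodes.Count < 2) errors.Add("At least two seed nodes are required.");

        if (roleOk && role == NodeRole.Site)
        {
            if (!IsValidPort(s.GrpcPort)) errors.Add("GrpcPort must be in 1-65535.");
            if (s.GrpcPort == s.RemotingPort) errors.Add("GrpcPort must differ from RemotingPort.");
            if (string.IsNullOrEmpty(s.SiteId)) errors.Add("SiteId must not be empty on site nodes.");
            if (string.IsNullOrEmpty(s.SiteSqlitePath)) errors.Add("SQLite path must not be empty on site nodes.");
        }

        if (roleOk && role == NodeRole.Central)
        {
            if (string.IsNullOrEmpty(s.ConfigurationDb)) errors.Add("ConfigurationDb connection string is required.");
            if (string.IsNullOrEmpty(s.MachineDataDb)) errors.Add("MachineDataDb connection string is required.");
        }

        return errors;
    }
}
```

The Host would call `Validate` before creating the actor system and throw (for example an `InvalidOperationException` listing all messages) if the result is non-empty.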
### REQ-HOST-4a: Readiness Gating
On central nodes, the ASP.NET Core web endpoints (Central UI, Inbound API) must **not accept traffic** until the node is fully operational:
- Akka.NET cluster membership is established.
- Database connectivity (MS SQL) is verified.
- Required cluster singletons are running (if applicable).
A standard ASP.NET Core health check endpoint (`/health/ready`) reports readiness status. The load balancer uses this endpoint to determine when to route traffic to the node. During startup or failover, the node returns `503 Service Unavailable` until ready.
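
One possible shape for the readiness check is sketched below. `ReadinessState` and `ReadinessHealthCheck` are hypothetical names; the assumption is that startup code flips each flag as the corresponding precondition is met.

```csharp
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.Diagnostics.HealthChecks;

// Hypothetical shared state, updated by startup code as preconditions are met.
public sealed class ReadinessState
{
    public volatile bool ClusterJoined;
    public volatile bool DatabaseReachable;
    public volatile bool SingletonsRunning;
}

// Reports Unhealthy (with the first unmet precondition) until all flags are set.
public sealed class ReadinessHealthCheck : IHealthCheck
{
    private readonly ReadinessState _state;

    public ReadinessHealthCheck(ReadinessState state) => _state = state;

    public Task<HealthCheckResult> CheckHealthAsync(
        HealthCheckContext context, CancellationToken cancellationToken = default)
    {
        if (!_state.ClusterJoined)
            return Task.FromResult(HealthCheckResult.Unhealthy("Cluster membership not established"));
        if (!_state.DatabaseReachable)
            return Task.FromResult(HealthCheckResult.Unhealthy("Database connectivity not verified"));
        if (!_state.SingletonsRunning)
            return Task.FromResult(HealthCheckResult.Unhealthy("Cluster singletons not running"));

        return Task.FromResult(HealthCheckResult.Healthy());
    }
}
```

Registered with `builder.Services.AddHealthChecks().AddCheck<ReadinessHealthCheck>("ready")` and exposed with `app.MapHealthChecks("/health/ready")`, an unhealthy result is returned as `503 Service Unavailable` by the default health check middleware. Note that this only controls what the endpoint reports; keeping traffic away from the node still depends on the load balancer honoring it.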
### REQ-HOST-5: Windows Service Hosting
The Host must support running as a Windows Service via `UseWindowsService()`. When launched outside of a Windows Service context (e.g., during development), it must run as a standard console application. No code changes or conditional compilation are required to switch between the two modes.
### REQ-HOST-6: Akka.NET Bootstrap
The Host must configure the Akka.NET actor system using Akka.Hosting with:
- **Remoting**: Configured with the node's hostname and port from `NodeConfiguration`.
- **Clustering**: Configured with seed nodes and the node's cluster role from configuration.
- **Persistence**: Configured with the appropriate journal and snapshot store (SQL for central, SQLite for site).
- **Split-Brain Resolver**: Configured with the strategy and stable-after duration from `ClusterConfiguration`.
- **Actor registration**: Each component's actors registered via its `AddXxxActors()` extension method, conditional on the node's role.
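
A partial sketch of the remoting and clustering part of this bootstrap, assuming Akka.Hosting 1.0 or later (where `SeedNodes` is a string array). `AkkaBootstrap`, `AddScadaLinkAkka`, and the actor system name `scadalink` are hypothetical. The `ClusterOptions` type used here is the one from `Akka.Cluster.Hosting`, not the ScadaLink options class of the same name. Persistence and split-brain resolver settings are deliberately left out.

```csharp
using Akka.Cluster.Hosting;
using Akka.Hosting;
using Akka.Remote.Hosting;
using Microsoft.Extensions.DependencyInjection;

public static class AkkaBootstrap
{
    // Hypothetical helper; the actor system name "scadalink" is an assumption.
    public static IServiceCollection AddScadaLinkAkka(
        this IServiceCollection services,
        string hostname, int remotingPort, string clusterRole, string[] seedNodes)
    {
        return services.AddAkka("scadalink", (akka, provider) =>
        {
            akka.WithRemoting(hostname, remotingPort)
                .WithClustering(new ClusterOptions
                {
                    Roles = new[] { clusterRole },
                    SeedNodes = seedNodes
                });

            // Persistence, the split-brain resolver, and each component's
            // AddXxxActors() call (conditional on role) would be chained here.
        });
    }
}
```
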
### REQ-HOST-6a: ClusterClientReceptionist (Central Only)
On central nodes, the Host must configure the Akka.NET **ClusterClientReceptionist** and register the ManagementActor with it. This allows external processes (e.g., the CLI) to discover and communicate with the ManagementActor via ClusterClient without joining the cluster as full members. The receptionist is started as part of the Akka.NET bootstrap (REQ-HOST-6) on central nodes only.
### REQ-HOST-7: ASP.NET Web Endpoints
On central nodes, the Host must use `WebApplication.CreateBuilder` to produce a full ASP.NET Core host with Kestrel, and must map web endpoints for:
- Central UI (via `MapCentralUI()` extension method).
- Inbound API (via `MapInboundAPI()` extension method).
On site nodes, the Host must also use `WebApplication.CreateBuilder` (not `Host.CreateDefaultBuilder`) to host the **SiteStreamGrpcServer** via Kestrel HTTP/2 on the configured `GrpcPort` (default 8083). Kestrel is configured with `HttpProtocols.Http2` on the gRPC port only — no HTTP/1.1 web endpoints are exposed. The gRPC service is mapped via `MapGrpcService<SiteStreamGrpcServer>()`.
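
The Kestrel side of this could be configured roughly as follows. `SiteGrpcHosting` and `UseSiteGrpc` are hypothetical names, and binding with `ListenAnyIP` is an assumption (the requirement does not say which address to bind).

```csharp
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Hosting;
using Microsoft.AspNetCore.Server.Kestrel.Core;
using Microsoft.Extensions.DependencyInjection;

public static class SiteGrpcHosting
{
    // Hypothetical helper: one HTTP/2-only listener on the gRPC port, nothing else.
    public static WebApplicationBuilder UseSiteGrpc(
        this WebApplicationBuilder builder, int grpcPort = 8083)
    {
        builder.WebHost.ConfigureKestrel(kestrel =>
            kestrel.ListenAnyIP(grpcPort, listen => listen.Protocols = HttpProtocols.Http2));

        // Registers the gRPC services required by MapGrpcService<T>().
        builder.Services.AddGrpc();
        return builder;
    }
}
```

After `builder.Build()`, the site branch of `Program.cs` would call `app.MapGrpcService<SiteStreamGrpcServer>()`. Restricting the listener to `HttpProtocols.Http2` is what keeps HTTP/1.1 requests off this port.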
**Startup ordering (site nodes)**:
1. Actor system and SiteStreamManager must be initialized before gRPC begins accepting connections.
2. The gRPC server rejects streams with `StatusCode.Unavailable` until the actor system is ready.
**Shutdown ordering (site nodes)**:
1. On `CoordinatedShutdown`, stop accepting new gRPC streams first.
2. Cancel all active gRPC streams (triggering client-side reconnect).
3. Tear down actors.
4. Use `IHostApplicationLifetime.ApplicationStopping` to signal the gRPC server.
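
The ordering rules above can be captured in a small gate object that the gRPC service consults before accepting a stream. `SiteStreamGate` is a hypothetical type written for illustration, not the actual implementation.

```csharp
using System;
using System.Threading;

// Hypothetical gate shared by the gRPC service and the Host's lifecycle code.
public sealed class SiteStreamGate : IDisposable
{
    private const int Starting = 0, Ready = 1, Stopping = 2;

    private readonly CancellationTokenSource _streams = new();
    private int _state = Starting;

    // Call once the actor system and SiteStreamManager are initialized.
    // Has no effect if shutdown has already begun.
    public void MarkReady() => Interlocked.CompareExchange(ref _state, Ready, Starting);

    // A stream handler calls this first. A false result means the handler
    // should reject the call with StatusCode.Unavailable.
    public bool TryAccept(out CancellationToken streamLifetime)
    {
        streamLifetime = _streams.Token;
        return Volatile.Read(ref _state) == Ready;
    }

    // Shutdown steps 1 and 2: refuse new streams, then cancel the active ones.
    public void BeginShutdown()
    {
        Interlocked.Exchange(ref _state, Stopping);
        _streams.Cancel();
    }

    public void Dispose() => _streams.Dispose();
}
```

Wiring it up could be as simple as `lifetime.ApplicationStopping.Register(gate.BeginShutdown)`, with each active stream linking its own cancellation to the token returned by `TryAccept`.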
### REQ-HOST-8: Structured Logging
The Host must configure Serilog as the logging provider with:
- Configuration-driven sink setup (console and file sinks at minimum).
- Automatic enrichment of every log entry with `SiteId`, `NodeHostname`, and `NodeRole` properties sourced from `NodeConfiguration`.
- Structured (machine-parseable) output format.
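
A minimal sketch of the enrichment and structured output, assuming the `Serilog` and `Serilog.Sinks.Console` packages. `LoggingSetup` is a hypothetical helper, and the console sink is hard-coded here only to keep the example short; a configuration-driven setup would typically use `ReadFrom.Configuration(...)` from `Serilog.Settings.Configuration` instead.

```csharp
using Serilog;
using Serilog.Formatting.Json;

public static class LoggingSetup
{
    // Hypothetical helper: every event carries the three node properties
    // and is written to the console as one JSON object per line.
    public static ILogger CreateLogger(string siteId, string nodeHostname, string nodeRole) =>
        new LoggerConfiguration()
            .Enrich.WithProperty("SiteId", siteId)
            .Enrich.WithProperty("NodeHostname", nodeHostname)
            .Enrich.WithProperty("NodeRole", nodeRole)
            .WriteTo.Console(new JsonFormatter())
            .CreateLogger();
}
```
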
### REQ-HOST-8a: Dead Letter Monitoring
The Host must subscribe to the Akka.NET `DeadLetter` event stream and log dead letters at Warning level. Dead letters indicate messages sent to actors that no longer exist — a common symptom of failover timing issues, stale actor references, or race conditions during instance lifecycle transitions. The dead letter count is reported as a health metric (see Health Monitoring).
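
A dead letter subscriber can be a small actor like the one below. `DeadLetterMonitor` and `GetDeadLetterCount` are hypothetical names, and how the count reaches Health Monitoring is left open; here it is simply queryable by message.

```csharp
using Akka.Actor;
using Akka.Event;

// Hypothetical query message for reading the current count.
public sealed class GetDeadLetterCount
{
    public static readonly GetDeadLetterCount Instance = new();
    private GetDeadLetterCount() { }
}

// Subscribes to DeadLetter events, logs each at Warning level, and keeps a count.
public sealed class DeadLetterMonitor : ReceiveActor
{
    private readonly ILoggingAdapter _log = Context.GetLogger();
    private long _count;

    public DeadLetterMonitor()
    {
        Receive<DeadLetter>(dl =>
        {
            _count++;
            _log.Warning("Dead letter #{0}: {1} from {2} to {3}",
                _count, dl.Message.GetType().Name, dl.Sender, dl.Recipient);
        });

        Receive<GetDeadLetterCount>(_ => Sender.Tell(_count));
    }

    protected override void PreStart() =>
        Context.System.EventStream.Subscribe(Self, typeof(DeadLetter));

    protected override void PostStop() =>
        Context.System.EventStream.Unsubscribe(Self);
}
```

The monitor would be created once per node, for example with `system.ActorOf(Props.Create(() => new DeadLetterMonitor()), "dead-letter-monitor")`.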
### REQ-HOST-9: Graceful Shutdown
When the Host process receives a stop signal (Windows Service stop, `Ctrl+C`, or SIGTERM), it must trigger Akka.NET CoordinatedShutdown to allow actors to drain in-flight work before the process exits. The Host must not call `Environment.Exit()` or forcibly terminate the actor system without coordinated shutdown.
### REQ-HOST-10: Extension Method Convention
Each component library must expose its services to the Host via a consistent set of extension methods:
- `IServiceCollection.AddXxx()` — registers the component's DI services.
- `AkkaConfigurationBuilder.AddXxxActors()` — registers the component's actors with the Akka.NET actor system (for components that have actors).
- `WebApplication.MapXxx()` — maps the component's web endpoints (only for CentralUI and InboundAPI).
The Host's `Program.cs` calls these extension methods; the component libraries own the registration logic. This keeps the Host thin and each component self-contained. The ManagementService component additionally registers the ManagementActor with ClusterClientReceptionist in its `AddManagementServiceActors()` method.
---
## Component Registration Matrix
| Component | Central | Site | DI (`AddXxx`) | Actors (`AddXxxActors`) | Endpoints (`MapXxx`) |
|---|---|---|---|---|---|
| ClusterInfrastructure | Yes | Yes | Yes | Yes | No |
| Communication | Yes | Yes | Yes | Yes | No |
| HealthMonitoring | Yes | Yes | Yes | Yes | No |
| ExternalSystemGateway | Yes | Yes | Yes | Yes | No |
| NotificationService | Yes | Yes | Yes | Yes | No |
| TemplateEngine | Yes | No | Yes | Yes | No |
| DeploymentManager | Yes | No | Yes | Yes | No |
| Security | Yes | No | Yes | Yes | No |
| CentralUI | Yes | No | Yes | No | Yes |
| InboundAPI | Yes | No | Yes | No | Yes |
| ManagementService | Yes | No | Yes | Yes | No |
| SiteRuntime | No | Yes | Yes | Yes | No |
| DataConnectionLayer | No | Yes | Yes | Yes | No |
| StoreAndForward | No | Yes | Yes | Yes | No |
| SiteEventLogging | No | Yes | Yes | Yes | No |
| ConfigurationDatabase | Yes | No | Yes | No | No |
---
## Dependencies
- **All 17 component libraries**: The Host references every component project to call their extension methods (excludes CLI, which is a separate executable).
- **Akka.Hosting**: For `AddAkka()` and the hosting configuration builder.
- **Akka.Remote.Hosting, Akka.Cluster.Hosting, Akka.Persistence.Hosting**: For Akka subsystem configuration.
- **Serilog.AspNetCore**: For structured logging integration.
- **Microsoft.Extensions.Hosting.WindowsServices**: For Windows Service support.
- **ASP.NET Core**: For web endpoint hosting on central nodes and for the Kestrel HTTP/2 gRPC server on site nodes (REQ-HOST-7).
## Interactions
- **All components**: The Host is the composition root — it wires every component into the DI container and actor system.
- **Configuration Database**: The Host registers the DbContext and wires repository implementations to their interfaces. In development, it triggers auto-migration; in production, it validates the schema version.
- **ClusterInfrastructure**: The Host configures the underlying Akka.NET cluster that ClusterInfrastructure manages at runtime.
- **CentralUI / InboundAPI**: The Host maps their web endpoints into the ASP.NET Core pipeline on central nodes.
- **ManagementService**: The Host registers the ManagementActor and configures ClusterClientReceptionist on central nodes, enabling CLI access.
- **HealthMonitoring**: The Host's startup validation and logging configuration provide the foundation for health reporting.