docs(components): accuracy fixes from deep review (batch 1)

Commons (third-party dep, 7 namespaces, retired ApiKey, repo SaveChanges
carve-out), ConfigurationDatabase (5 persisted + 1 non-persisted computed col),
ClusterInfrastructure (abbreviated HOCON note, RemotingPort default),
Host (component matrix: CI/HealthMonitoring/ExternalSystemGateway have no
actors; DeadLetterMonitorActor runs on both roles), Security (Bearer not
X-API-Key; ApiKeyAdmin registered by Host), Communication (Task.Run/Sender).
This commit is contained in:
Joseph Doherty
2026-06-03 16:32:01 -04:00
parent 66f0f96328
commit c5fb02d640
6 changed files with 37 additions and 23 deletions
+8 -3
View File
@@ -31,9 +31,12 @@ Cluster Infrastructure provides the hosting platform; each singleton is owned an
### HOCON assembly
`AkkaHostedService.BuildHocon` constructs the Akka HOCON document from the bound options at startup. All interpolated values pass through `QuoteHocon` (string escaping) and `DurationHocon` (millisecond rendering) so the document is never corrupted by hostnames or timing values containing special characters or sub-second precision:
`AkkaHostedService.BuildHocon` constructs the Akka HOCON document from the bound options at startup. All interpolated values pass through `QuoteHocon` (string escaping) and `DurationHocon` (millisecond rendering) so the document is never corrupted by hostnames or timing values containing special characters or sub-second precision.
The snippet below is abbreviated to highlight the cluster stanzas. The full method also emits three additional stanzas: `akka.extensions` (registers `DistributedPubSubExtensionProvider`), `akka.remote.dot-netty.tcp` (binds `NodeOptions.NodeHostname` and `NodeOptions.RemotingPort`), and `akka.remote.transport-failure-detector` (heartbeat interval and acceptable-heartbeat-pause from `CommunicationOptions.TransportHeartbeatInterval` / `TransportFailureThreshold`).
```csharp
// Abbreviated — see AkkaHostedService.BuildHocon for the full method.
public static string BuildHocon(
NodeOptions nodeOptions,
ClusterOptions clusterOptions,
@@ -54,6 +57,8 @@ audit-telemetry-dispatcher {{
}}
}}
akka {{
// akka.extensions, akka.remote.dot-netty.tcp, and
// akka.remote.transport-failure-detector also emitted here (see full method).
actor {{
provider = cluster
}}
@@ -216,7 +221,7 @@ This returns `false` while the actor system is warming up — the safe-by-defaul
| `NodeHostname` | `string` | (required) | Hostname this node advertises to the Akka cluster remoting layer. |
| `NodeName` | `string` | `""` | Semantic label stamped on audit rows (`SourceNode`). Conventional values: `node-a`/`node-b` for sites, `central-a`/`central-b` for central. |
| `SiteId` | `string?` | (required for Site) | Site identifier; appended to the site-specific cluster role (`site-{SiteId}`). |
| `RemotingPort` | `int` | `8081` | Akka.NET TCP remoting port. Default `8081` for central, `8082` for site. |
| `RemotingPort` | `int` | `8081` | Akka.NET TCP remoting port. Code default is `8081`; the site deployment overrides this to `8082` via `appsettings.Site.json`. |
| `GrpcPort` | `int` | `8083` | Kestrel HTTP/2 port for `SiteStreamGrpcServer` (site nodes only). Must differ from `RemotingPort`. |
| `MetricsPort` | `int` | `8084` | Kestrel HTTP/1.1 port for the Prometheus `/metrics` scrape endpoint (site nodes only). Must differ from `RemotingPort` and `GrpcPort`. |
@@ -263,7 +268,7 @@ This returns `false` while the actor system is warming up — the safe-by-defaul
### Node fails to join cluster on startup
`ClusterOptionsValidator` rejects fewer than two seed nodes, a non-`keep-oldest` strategy, `MinNrOfMembers != 1`, or `DownIfAlone = false` at startup with an `OptionsValidationException`. Check that both seed-node URIs reference the Akka remoting port, not the gRPC port (8083) or metrics port (8084) — `StartupValidator` explicitly rejects seed entries whose port matches `GrpcPort`.
`ClusterOptionsValidator` rejects fewer than two seed nodes, a non-`keep-oldest` strategy, `MinNrOfMembers != 1`, or `DownIfAlone = false` at startup with an `OptionsValidationException`. Check that both seed-node URIs reference the Akka remoting port, not the gRPC port (8083) or metrics port (8084) — on site nodes, `StartupValidator` explicitly rejects seed entries whose port matches `GrpcPort`.
### Singleton not starting after failover
+15 -7
View File
@@ -6,9 +6,9 @@ Commons is the foundational shared library that all other ScadaBridge components
Commons (#16) is not a runtime component. It has no actors, no hosted services, and no DI registrations of its own. Its single role is to hold the shared type vocabulary — entity shapes, interface contracts, and message definitions — so that every component agrees on the same types without depending on each other.
The project enforces minimal dependencies by design: only core .NET SDK. It must not reference Akka.NET, ASP.NET Core, Entity Framework Core, or any persistence or framework library, because it is referenced by all other projects and a framework dependency here becomes a transitive constraint on everything.
The project enforces minimal dependencies by design: it references the `ZB.MOM.WW.Audit` package (for the canonical `AuditEvent` type) and the core .NET SDK. It must not reference Akka.NET, ASP.NET Core, Entity Framework Core, or any persistence or framework library, because it is referenced by all other projects and a framework dependency here becomes a transitive constraint on everything.
Source lives in `src/ZB.MOM.WW.ScadaBridge.Commons/`, organized into four top-level namespaces: `Types/`, `Interfaces/`, `Entities/`, and `Messages/`.
Source lives in `src/ZB.MOM.WW.ScadaBridge.Commons/`, organized into seven top-level namespaces: `Types/`, `Interfaces/`, `Entities/`, `Messages/`, `Observability/`, `Serialization/`, and `Validators/`.
## Key Concepts
@@ -42,7 +42,7 @@ public class Template
### Repository interfaces
Commons defines one repository interface per consuming component. Implementations live entirely in the Configuration Database component. Each interface accepts and returns the POCO entity classes from Commons, and every interface includes `SaveChangesAsync()` to support the unit-of-work pattern without requiring a dependency on EF Core.
Commons defines one repository interface per consuming component. Implementations live entirely in the Configuration Database component. Each interface accepts and returns the POCO entity classes from Commons. Most repository interfaces expose `SaveChangesAsync()` to support the unit-of-work pattern without requiring a dependency on EF Core; the append-only audit repositories (`IAuditLogRepository`, `ISiteCallAuditRepository`) do not — they use upsert/insert-only operations that do not require an explicit save step.
### Message contracts and additive-only evolution
@@ -87,7 +87,7 @@ Namespaces mirror folders: `ZB.MOM.WW.ScadaBridge.Commons.Entities.Templates`, `
| `Entities/Sites/` | `Site`, `DataConnection` |
| `Entities/ExternalSystems/` | `ExternalSystemDefinition`, `ExternalSystemMethod`, `DatabaseConnectionDefinition` |
| `Entities/Notifications/` | `NotificationList`, `NotificationRecipient`, `SmtpConfiguration`, `Notification` |
| `Entities/InboundApi/` | `ApiKey`, `ApiMethod` |
| `Entities/InboundApi/` | `ApiMethod` |
| `Entities/Security/` | `LdapGroupMapping`, `SiteScopeRule` |
| `Entities/Deployment/` | `DeploymentRecord`, `SystemArtifactDeploymentRecord`, `DeployedConfigSnapshot` |
| `Entities/Scripts/` | `SharedScript` |
@@ -182,7 +182,7 @@ public interface IAuditLogRepository
| `IDatabaseGateway` | Script-facing ADO.NET database access via named connections. | External System Gateway |
| `IExternalSystemClient` | Script-facing `ExternalSystem.Call()` / `CachedCall()` invocation. | External System Gateway |
| `IInstanceLocator` | Resolves instance unique name to site identifier for `Route.To()`. | Management Service |
| `INotificationDeliveryService` | Script-facing `Notify.Send()` routing with S&F fallback. | Notification Service |
| `IAuditActorAccessor` | Resolves the authenticated principal's actor string for audit rows (inbound API middleware). | Security & Auth |
Transport bundle interfaces (`IBundleExporter`, `IBundleImporter`, `IBundleSessionStore`, `IAuditCorrelationContext`) live in `Interfaces/Transport/` and are defined in Commons so the Configuration Database and Central UI can depend on the abstraction without taking a Transport component dependency.
@@ -279,7 +279,15 @@ public record AlarmStateChanged(
{
public AlarmLevel Level { get; init; } = AlarmLevel.None;
public AlarmKind Kind { get; init; } = AlarmKind.Computed;
public AlarmConditionState Condition { get; init; } = ...; // defaults to computed projection
// Condition uses a private backing field so the getter can return a
// computed default (AlarmConditionStateFactory.ForComputed(State, Priority))
// when no explicit value has been set via the init accessor.
private AlarmConditionState? _condition;
public AlarmConditionState Condition
{
get => _condition ?? AlarmConditionStateFactory.ForComputed(State, Priority);
init => _condition = value;
}
public string SourceReference { get; init; } = string.Empty;
// ... additional native-alarm fields with empty defaults
}
@@ -299,7 +307,7 @@ When adding a new message contract: add an immutable `record` to the appropriate
## Dependencies & Interactions
- **No runtime dependencies** — Commons references only the core .NET SDK. It does not reference Akka.NET, ASP.NET Core, Entity Framework Core, or any third-party library.
- **Minimal dependencies** — Commons references the core .NET SDK and the `ZB.MOM.WW.Audit` package (for the canonical `AuditEvent` type). It does not reference Akka.NET, ASP.NET Core, Entity Framework Core, or any other third-party library.
- [Configuration Database (#17)](./ConfigurationDatabase.md) — implements every repository interface defined here (`ITemplateEngineRepository`, `IAuditLogRepository`, etc.) via EF Core Fluent API; maps the POCO entity classes to the underlying MS SQL schema.
- **All other components** — reference Commons for shared types, entity classes, interface contracts, and message definitions. The dependency graph is strictly one-way: Commons knows nothing about its consumers.
- [Audit Log (#23)](./AuditLog.md) — implements `IAuditWriter`, `ICentralAuditWriter`, `ISiteAuditQueue`, `ICachedCallLifecycleObserver`, and `ICachedCallTelemetryForwarder`; consumes `ScadaBridgeAuditEventFactory`, `AuditDetailsCodec`, `AuditRowProjection`, and the audit message contracts defined here.
+2 -2
View File
@@ -96,7 +96,7 @@ _centralClient.Tell(
new ClusterClient.Send("/user/central-communication", msg), Sender);
```
The original `Sender` is forwarded as the `ClusterClient.Send` sender so any reply from central routes straight back to the waiting Ask on the site, not through `SiteCommunicationActor`.
For request/response messages, the original `Sender` is forwarded as the `ClusterClient.Send` sender so any reply from central routes straight back to the waiting Ask on the site, not through `SiteCommunicationActor`. For fire-and-forget messages (e.g. `SiteHealthReport`), `Self` is used as the sender instead, because no reply is expected.
### Address loading and the 60-second refresh
@@ -145,7 +145,7 @@ private void HandleAttributeValueChanged(AttributeValueChanged msg)
**Central-side**`SiteStreamGrpcClient` / `SiteStreamGrpcClientFactory`:
- `SiteStreamGrpcClientFactory` (singleton) caches one `SiteStreamGrpcClient` per site identifier. On `GetOrCreate`, it compares the cached client's `Endpoint` to the requested endpoint and atomically replaces a stale client (different endpoint — NodeA→NodeB failover flip, or an edited address) with a fresh one.
- `SiteStreamGrpcClient` opens a `GrpcChannel` with HTTP/2 keepalive (`KeepAlivePingDelay` default 15 s, `KeepAlivePingTimeout` default 10 s, `KeepAlivePingPolicy.Always`). It calls `SubscribeInstance` and reads the response stream in a background `Task.Run`, invoking `onEvent` for each received event and `onError` on any non-cancellation exception.
- `SiteStreamGrpcClient` opens a `GrpcChannel` with HTTP/2 keepalive (`KeepAlivePingDelay` default 15 s, `KeepAlivePingTimeout` default 10 s, `KeepAlivePingPolicy.Always`). `SubscribeAsync` is a plain `async Task` that calls `SubscribeInstance` and reads the response stream with `await foreach`, invoking `onEvent` for each received event and `onError` on any non-cancellation exception. The caller (`DebugStreamBridgeActor.OpenGrpcStream`) launches it inside a `Task.Run` so the long-running stream loop runs off the actor thread.
### Debug stream session lifecycle
+2 -2
View File
@@ -61,7 +61,7 @@ Row-level `DELETE` on `AuditLog` is not granted even to the purge role; retentio
Each entity has its own `IEntityTypeConfiguration<T>` in `Configurations/`. Representative examples:
**`AuditLogEntityTypeConfiguration`** maps `AuditLogRow` to `dbo.AuditLog`. The table carries ten writable canonical columns plus six read-only server-side computed columns derived from `DetailsJson` via `JSON_VALUE … PERSISTED`. EF is configured with `ValueGeneratedOnAddOrUpdate()` and no write for the computed columns; the repository writes only the ten canonical columns and lets SQL Server derive the rest:
**`AuditLogEntityTypeConfiguration`** maps `AuditLogRow` to `dbo.AuditLog`. The table carries ten writable canonical columns plus five persisted computed columns derived from `DetailsJson` via `JSON_VALUE … PERSISTED` (`Kind`, `Status`, `SourceSiteId`, `ExecutionId`, `ParentExecutionId`) and one additional non-persisted computed column `IngestedAtUtc` (SWITCHOFFSET-based; SQL Server forbids PERSISTED on a non-deterministic expression). EF is configured with `ValueGeneratedOnAddOrUpdate()` and no write for the computed columns; the repository writes only the ten canonical columns and lets SQL Server derive the rest:
```csharp
// AuditLogEntityTypeConfiguration (excerpt)
@@ -210,7 +210,7 @@ The `SCADABRIDGE_DESIGNTIME_CONNECTIONSTRING` environment variable is an alterna
## Dependencies & Interactions
- [Commons (#16)](./Commons.md) — all POCO entity classes (`Templates`, `Instances`, `Sites`, `AuditLogEntry`, `SiteCall`, …) and all repository interfaces (`ITemplateEngineRepository`, `IDeploymentManagerRepository`, `ISecurityRepository`, `IInboundApiRepository`, `IExternalSystemRepository`, `INotificationRepository`, `INotificationOutboxRepository`, `ISiteCallAuditRepository`, `IAuditLogRepository`, `ICentralUiRepository`) live there. Commons also declares `IAuditService`, `IAuditCorrelationContext`, `IPartitionMaintenance`, and `IInstanceLocator` — all implemented here.
- [Commons (#16)](./Commons.md) — all POCO entity classes (`Templates`, `Instances`, `Sites`, `AuditLogEntry`, `SiteCall`, …) and all repository interfaces (`ITemplateEngineRepository`, `IDeploymentManagerRepository`, `ISecurityRepository`, `IInboundApiRepository`, `IExternalSystemRepository`, `INotificationRepository`, `INotificationOutboxRepository`, `ISiteCallAuditRepository`, `IAuditLogRepository`, `ICentralUiRepository`, `ISiteRepository`) live there. Commons also declares `IAuditService`, `IAuditCorrelationContext`, `IPartitionMaintenance`, and `IInstanceLocator` — all implemented here.
- [Audit Log (#23)](./AuditLog.md) — `IAuditLogRepository` (implemented by `AuditLogRepository`) is the sole central write path for `dbo.AuditLog`. `AuditLogIngestActor`, `CentralAuditWriter`, and `SiteAuditReconciliationActor` all resolve it from a fresh per-message DI scope; the Audit Log component hosts the `AuditLogPartitionMaintenanceService` and `AuditLogPurgeActor` that drive the `IPartitionMaintenance` implementation registered here.
- [Template Engine (#1)](./TemplateEngine.md) — consumes `ITemplateEngineRepository` for all template, attribute, alarm, native alarm source, script, composition, instance, override, connection binding, and area operations.
- [Deployment Manager (#2)](./DeploymentManager.md) — consumes `IDeploymentManagerRepository` for deployment records and configuration snapshots.
+7 -5
View File
@@ -87,13 +87,15 @@ public bool IsActiveNode
`Program.cs` (Central branch) calls `WebApplication.CreateBuilder`, registers shared and central-only components, builds the `WebApplication`, applies or retries database migrations, and mounts the middleware pipeline and endpoints. The order is intentional: `UseAuthentication` and `UseAuthorization` run before `UseAuditWriteMiddleware` so `HttpContext.User` is populated when the audit row is written.
Before branching on role, `AkkaHostedService.StartAsync` creates one actor unconditionally on every node:
- `DeadLetterMonitorActor` — plain `ActorOf`; subscribes to the `DeadLetter` event stream on `PreStart`. Runs on both central and site nodes.
`AkkaHostedService.RegisterCentralActors` creates:
- `CentralCommunicationActor` — registered with `ClusterClientReceptionist` so site `ClusterClient`s can reach it.
- `ManagementActor` — also registered with `ClusterClientReceptionist`; the CLI connects via `ClusterClient` without joining the cluster.
- `NotificationOutboxActor` — cluster singleton (no role scope); a proxy is handed to `CentralCommunicationActor` so forwarded `NotificationSubmit` messages from sites are routed to it.
- `AuditLogIngestActor` — cluster singleton; proxy registered with both `CentralCommunicationActor` and (if present) the `SiteStreamGrpcServer`.
- `SiteCallAuditActor` — cluster singleton; a graceful-stop task is added to the `cluster-leave` coordinated-shutdown phase with a 10-second drain window.
- `DeadLetterMonitorActor` — plain `ActorOf`; subscribes to the `DeadLetter` event stream on `PreStart`.
### Site composition root
@@ -132,10 +134,10 @@ The Host is not consumed as a library; it is the executable entry point. Other c
| Component | Central | Site | `AddXxx` | Actors | `MapXxx` |
|---|:---:|:---:|:---:|:---:|:---:|
| ClusterInfrastructure | Yes | Yes | Yes | Yes | — |
| ClusterInfrastructure | Yes | Yes | Yes | | — |
| Communication | Yes | Yes | Yes | Yes | — |
| HealthMonitoring | Yes | Yes | Yes | Yes | — |
| ExternalSystemGateway | Yes | Yes | Yes | Yes | — |
| HealthMonitoring | Yes | Yes | Yes | | — |
| ExternalSystemGateway | Yes | Yes | Yes | | — |
| AuditLog | Yes | Yes | Yes | Yes | — |
| NotificationService | Yes | No | Yes | — | — |
| NotificationOutbox | Yes | No | Yes | Yes (singleton) | — |
@@ -176,7 +178,7 @@ Options are bound via the .NET Options pattern (`IOptions<T>`). Each component o
| Key | Default | Description |
|-----|---------|-------------|
| `SeedNodes` | — | List of Akka seed-node URIs (`akka.tcp://scadabridge@host:port`). At least 2 required. Must reference remoting ports, not gRPC ports. |
| `SplitBrainResolverStrategy` | | Active strategy name (e.g. `"keep-oldest"`). |
| `SplitBrainResolverStrategy` | `keep-oldest` | Active strategy name (e.g. `"keep-oldest"`). |
| `StableAfter` | `"00:00:15"` | Duration the cluster must be stable before the resolver acts. |
| `HeartbeatInterval` | `"00:00:02"` | Akka failure-detector heartbeat cadence. |
| `FailureDetectionThreshold` | `"00:00:10"` | Acceptable heartbeat pause before a node is considered unreachable. |
+3 -4
View File
@@ -12,7 +12,6 @@ The component registers:
- `RoleMapper` — DB-backed LDAP-group-to-role resolution with site-scope union semantics.
- `ScadaBridgeGroupRoleMapper` — adapter exposing `RoleMapper` on the shared `IGroupRoleMapper<string>` seam.
- `HttpAuditActorAccessor` — resolves the authenticated username from the ambient HTTP context for audit `Actor` stamping.
- `LibraryInboundApiKeyAdmin` — implements `IInboundApiKeyAdmin` over the shared `ApiKeyAdminCommands` facade.
- ASP.NET Core cookie authentication (sliding idle window, HttpOnly/Secure defaults via `ZbCookieDefaults.Apply`).
- Authorization policies (`RequireAdmin`, `RequireDesign`, `RequireDeployment`, `OperationalAudit`, `AuditExport`).
@@ -40,7 +39,7 @@ Claims embedded in every token:
| `ZbClaimTypes.ScopeId` | `JwtTokenService.SiteIdClaimType` | One claim per permitted site (Deployer only) |
| `"LastActivity"` | `JwtTokenService.LastActivityClaimType` | ISO 8601 idle-timeout anchor |
`MapInboundClaims = false` and `MapOutboundClaims` cleared on both mint and validate paths prevent `JwtSecurityTokenHandler`'s default claim-type rewrite maps from transforming the canonical role URI or `zb:` claim types, keeping the type strings byte-for-byte identical in the token and in every policy check.
`MapOutboundClaims` is cleared on the mint path (`GenerateToken`) so `JwtSecurityTokenHandler` writes claim type strings verbatim into the token. On the validate path (`ValidateToken`) only `MapInboundClaims = false` is set — the outbound map is not touched. Together these settings prevent the default claim-type rewrite maps from transforming the canonical role URI or `zb:` claim types, keeping the type strings byte-for-byte identical in the token and in every policy check.
### Token lifecycle and idle timeout
@@ -89,7 +88,7 @@ public static IServiceCollection AddSecurity(this IServiceCollection services)
}
```
The Host composition root calls `AddZbLdapAuth(configuration, LdapSectionPath)` before `AddSecurity()`. `AddZbLdapAuth` registers `ILdapAuthService` as a singleton, binds `LdapOptions` from `ScadaBridge:Security:Ldap`, and registers `IValidateOptions<LdapOptions>` with `ValidateOnStart` so a misconfigured directory fails at boot rather than at first login.
The Host composition root calls `AddZbLdapAuth(configuration, LdapSectionPath)` before `AddSecurity()`. `AddZbLdapAuth` registers `ILdapAuthService` as a singleton, binds `LdapOptions` from `ScadaBridge:Security:Ldap`, and registers `IValidateOptions<LdapOptions>` with `ValidateOnStart` so a misconfigured directory fails at boot rather than at first login. The Host also registers `LibraryInboundApiKeyAdmin` as the `IInboundApiKeyAdmin` singleton via `AddSingleton` — this is not done by `AddSecurity`.
### Role mapping and site scoping
@@ -195,7 +194,7 @@ if (!AuthorizationPolicies.OperationalAuditRoles.Contains(userRole))
- [Configuration Database (#17)](./ConfigurationDatabase.md) — provides the scoped `ISecurityRepository` implementation (`SecurityRepository`) backed by `LdapGroupMappings` and `SiteScopeRules` tables in MS SQL, and hosts the Data Protection key ring via `PersistKeysToDbContext<ScadaBridgeDbContext>()`.
- [Central UI (#9)](./CentralUI.md) — every Blazor Server page and Razor component passes through cookie authentication and named policy authorization. The login page drives the LDAP bind → role map → token mint flow. The Admin → LDAP Mappings page is gated by `RequireAdmin` and calls `ISecurityRepository` directly.
- [Management Service (#18)](./ManagementService.md) — the `ManagementActor` enforces role and site-scope rules on every incoming command using identity carried in the `ManagementEnvelope`. The CLI authenticates users via the same LDAP bind and passes identity in every request.
- [Inbound API (#14)](./InboundAPI.md) — inbound API requests authenticate via `X-API-Key` (library verifier, `sbk_*` token format) rather than the cookie/JWT path. `HttpAuditActorAccessor` resolves the authenticated username for audit `Actor` stamping on the interactive path; the inbound API path keeps its own actor/fallback.
- [Inbound API (#14)](./InboundAPI.md) — inbound API requests authenticate via `Authorization: Bearer sbk_<keyId>_<secret>` (library verifier, `sbk_*` token format) rather than the cookie/JWT path. `HttpAuditActorAccessor` resolves the authenticated username for audit `Actor` stamping on the interactive path; the inbound API path keeps its own actor/fallback.
- [Audit Log (#23)](./AuditLog.md) — `IAuditActorAccessor` is a seam this component implements; the Inbound API audit path calls `CurrentActor` to record the authenticated user as the event actor.
- [Transport (#24)](./Transport.md) — export gates on `RequireDesign`; import gates on `RequireAdmin`, enforced at both the Razor page layer and inside the Transport service entrypoints.
- Design spec: [Component-Security.md](../requirements/Component-Security.md).