Management Service
The Management Service is the Akka.NET actor that provides programmatic access to every admin operation on the central cluster — the same operations the Central UI exposes, made available over an HTTP API and, optionally, a ClusterClient path for cross-cluster callers.
Overview
Management Service (#18) runs on the central cluster only. The component code lives in src/ZB.MOM.WW.ScadaBridge.ManagementService/, with four source files:
- ManagementActor.cs: the ReceiveActor that owns authorization, dispatch, and error mapping for all commands.
- ManagementEndpoints.cs: the POST /management minimal-API endpoint that authenticates over HTTP Basic Auth and forwards to the actor.
- AuditEndpoints.cs: dedicated REST endpoints (GET /api/audit/query, GET /api/audit/export) for the centralized Audit Log (#23); these bypass the actor because the workload is read-only and keyset-paged.
- DebugStreamHub.cs: a SignalR hub for real-time debug stream subscriptions (attribute and alarm state changes).
ServiceCollectionExtensions.AddManagementService registers ManagementActorHolder (a DI singleton that holds the live IActorRef) and binds ManagementServiceOptions from ScadaBridge:ManagementService.
The ManagementActor is not a cluster singleton. Because it is completely stateless — it opens a new DI scope per command and delegates all work to repositories and domain services — every central node runs its own instance. Either node can serve any request independently, so no singleton coordination is needed.
Key Concepts
ManagementEnvelope and the wire protocol
Every command arrives wrapped in a ManagementEnvelope:
```csharp
public record AuthenticatedUser(
    string Username, string DisplayName,
    string[] Roles, string[] PermittedSiteIds);

public record ManagementEnvelope(AuthenticatedUser User, object Command, string CorrelationId);
```
The HTTP endpoint constructs the envelope after LDAP authentication and role resolution; the CorrelationId (a Guid formatted as "N") ties server-log entries to the caller's request. The actor never authenticates a second time — the envelope carries an already-resolved AuthenticatedUser.
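As a small illustration of the correlation ID format, the sketch below formats a fresh Guid with the "N" specifier. How the endpoint actually creates the value is not shown here, so the call site is an assumption; only the format comes from the text above.

```csharp
using System;

// "N" formats a Guid as 32 hexadecimal digits with no hyphens or braces.
var correlationId = Guid.NewGuid().ToString("N");

Console.WriteLine(correlationId.Length);        // 32
Console.WriteLine(correlationId.Contains('-')); // False
```

The compact form is convenient for grepping server logs, since the whole ID is a single unbroken token.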
Role enforcement and site scope
Authorization is a two-level check. GetRequiredRole maps each command type to the minimum role required:
| Role | Commands |
|---|---|
| Administrator | Site management, role mappings, API key management, scope rules, QueryAuditLogCommand, PreviewBundle, ImportBundle |
| Designer | Template authoring (members, folders, compositions), external systems, data connections, notification lists, shared scripts, database connections, inbound API methods, ExportBundle |
| Deployer | Instance lifecycle, connection bindings, overrides, deployments, debug snapshot, RetryParkedMessageCommand, DiscardParkedMessageCommand |
| (any authenticated user) | Read-only list/get queries, health summary |
Within Deployer commands, EnforceSiteScope applies a second check: users whose role mapping carries PermittedSiteIds can only touch instances and sites within their permitted set. Administrators and system-wide deployers (empty PermittedSiteIds) are unrestricted. A violation throws SiteScopeViolationException, which MapFault converts to ManagementUnauthorized.
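The two checks can be sketched as a pair of small predicates. This is a simplified illustration, not the actor's code: RequiredRoleFor, HasRole, and InSiteScope are hypothetical names, the command-to-role samples are taken from the table above, and site IDs are assumed to compare as plain strings.

```csharp
using System;
using System.Linq;

// Level 1: minimum role per command (null means any authenticated user).
static string? RequiredRoleFor(string commandName) => commandName switch
{
    "CreateSite" => "Administrator",
    "CreateTemplate" => "Designer",
    "MgmtDeployInstance" => "Deployer",
    _ => null
};

static bool HasRole(string[] roles, string? required) =>
    required is null || roles.Contains(required, StringComparer.OrdinalIgnoreCase);

// Level 2: Administrators and users with an empty permitted set are unrestricted.
static bool InSiteScope(string[] roles, string[] permittedSiteIds, string siteId) =>
    roles.Contains("Administrator", StringComparer.OrdinalIgnoreCase)
    || permittedSiteIds.Length == 0
    || permittedSiteIds.Contains(siteId);

var roles = new[] { "Deployer" };
var permitted = new[] { "site-a" };

Console.WriteLine(HasRole(roles, RequiredRoleFor("MgmtDeployInstance"))); // True
Console.WriteLine(HasRole(roles, RequiredRoleFor("CreateSite")));         // False
Console.WriteLine(InSiteScope(roles, permitted, "site-b"));               // False
Console.WriteLine(InSiteScope(roles, Array.Empty<string>(), "site-b"));   // True
```

Keeping the role check and the scope check separate mirrors the description above: a caller can hold the right role and still be rejected for touching a site outside the permitted set.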
Command registry
ManagementCommandRegistry (in Commons) maps wire names to CLR types via reflection at startup. It scans the ZB.MOM.WW.ScadaBridge.Commons.Messages.Management namespace for non-abstract types whose name ends in "Command" and stores them in a FrozenDictionary. The HTTP endpoint calls ManagementCommandRegistry.Resolve(commandName) to get the target type, then deserializes the payload JSON into it.
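A minimal sketch of such a reflection scan follows. It is not the Commons implementation: BuildRegistry, the Demo.Messages namespace, and the command types are hypothetical, and it assumes the wire name is the type name without the "Command" suffix (which the ListSites request suggests but this document does not state).

```csharp
using System;
using System.Collections.Frozen;
using System.Linq;
using System.Reflection;
using System.Text.Json;

// Scan one namespace for concrete types named *Command and key them by wire name.
static FrozenDictionary<string, Type> BuildRegistry(Assembly assembly, string ns) =>
    assembly.GetTypes()
        .Where(t => t.Namespace == ns && !t.IsAbstract
                    && t.Name.EndsWith("Command", StringComparison.Ordinal))
        .ToFrozenDictionary(t => t.Name[..^"Command".Length], t => t, StringComparer.Ordinal);

var registry = BuildRegistry(Assembly.GetExecutingAssembly(), "Demo.Messages");

// Resolve the wire name, then deserialize the payload into the resolved CLR type.
var resolved = registry["Echo"];
var command = JsonSerializer.Deserialize(
    "{\"text\":\"hello\"}", resolved, new JsonSerializerOptions(JsonSerializerDefaults.Web));

Console.WriteLine(command);                      // EchoCommand { Text = hello }
Console.WriteLine(registry.ContainsKey("Base")); // False: abstract types are skipped

namespace Demo.Messages
{
    // Hypothetical command types, for illustration only.
    public abstract record BaseCommand;
    public record ListSitesCommand;
    public record EchoCommand(string Text);
}
```

Building the map once at startup keeps per-request resolution to a dictionary lookup, which is presumably why a frozen (read-optimized) dictionary suits this use.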
Audit contract
Mutating handlers that call repositories directly invoke AuditAsync (backed by IAuditService) after a successful write. Most handlers that delegate to a domain service — TemplateService, DeploymentService, ArtifactDeploymentService, TemplateFolderService, SharedScriptService — do not call AuditAsync; those services audit internally, avoiding double-logging. However, some delegating handlers also call AuditAsync directly: HandleCreateInstance delegates to InstanceService.CreateInstanceAsync and then calls AuditAsync itself. SMTP configuration and API key responses project out secrets before the audit entry is written.
Architecture
Actor lifecycle and registration
AkkaHostedService (in the Host) creates the ManagementActor under the path /user/management and registers it with ClusterClientReceptionist:
```csharp
var mgmtActor = _actorSystem!.ActorOf(
    Props.Create(() => new ManagementActor(_serviceProvider, mgmtLogger)),
    "management");
ClusterClientReceptionist.Get(_actorSystem).RegisterService(mgmtActor);

var mgmtHolder = _serviceProvider.GetRequiredService<ManagementActorHolder>();
mgmtHolder.ActorRef = mgmtActor;
```
ClusterClientReceptionist advertises the actor to ClusterClient senders without requiring them to join the Akka cluster. The ManagementActorHolder.ActorRef property is then the bridge from the HTTP endpoint (which runs in ASP.NET Core middleware) into the Akka actor world.
The actor declares an explicit supervisor strategy — one-for-one with Resume and no retry limit — to match the coordinator-actor convention and remain correct if child actors are added later.
HTTP Management API (POST /management)
ManagementEndpoints.MapManagementAPI registers the endpoint. Each request goes through six steps:
- Raise the per-request body size cap to 200 MB (needed for Transport bundle imports).
- Decode Authorization: Basic <base64> and split username/password.
- Authenticate via ILdapAuthService.
- Resolve roles via RoleMapper, building the AuthenticatedUser with any site-scope limits.
- Deserialize the JSON body (command + payload) via ManagementCommandRegistry.
- Ask the ManagementActor with a ManagementEnvelope and map the response:
```csharp
return response switch
{
    ManagementSuccess success => Results.Text(success.JsonData, "application/json", statusCode: 200),
    ManagementError error => Results.Json(new { error = error.Error, code = error.ErrorCode }, statusCode: 400),
    ManagementUnauthorized u => Results.Json(new { error = u.Message, code = "UNAUTHORIZED" }, statusCode: 403),
    _ => Results.Json(new { error = "Unexpected response.", code = "INTERNAL_ERROR" }, statusCode: 500)
};
```
The Ask timeout defaults to 30 seconds and is overridable via ScadaBridge:ManagementService:CommandTimeout. An elapsed timeout returns HTTP 504.
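The header-decoding step in the list above could look roughly like the following. ParseBasicAuth is a hypothetical helper, not the endpoint's actual code; it splits at the first colon because RFC 7617 allows colons in the password but not in the user ID.

```csharp
using System;
using System.Text;

// Returns null when the header is missing, not Basic, not valid base64, or has no colon.
static (string Username, string Password)? ParseBasicAuth(string? header)
{
    const string prefix = "Basic ";
    if (header is null || !header.StartsWith(prefix, StringComparison.OrdinalIgnoreCase))
        return null;
    try
    {
        var decoded = Encoding.UTF8.GetString(
            Convert.FromBase64String(header[prefix.Length..].Trim()));
        var colon = decoded.IndexOf(':');
        if (colon < 0) return null;
        return (decoded[..colon], decoded[(colon + 1)..]);
    }
    catch (FormatException)
    {
        return null; // payload was not valid base64
    }
}

var header = "Basic " + Convert.ToBase64String(Encoding.UTF8.GetBytes("alice:p:ss"));
var credentials = ParseBasicAuth(header);

Console.WriteLine(credentials?.Username);                  // alice
Console.WriteLine(credentials?.Password);                  // p:ss
Console.WriteLine(ParseBasicAuth("Bearer abc") is null);   // True
```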
Actor dispatch and error mapping
ManagementActor.HandleEnvelope checks the required role, then calls ProcessCommand, which opens a DI scope, runs DispatchCommand, and wraps the result in ManagementSuccess. The PipeTo pattern keeps the actor's message loop free during async work; the failure continuation maps exceptions to ManagementError or ManagementUnauthorized:
```csharp
private void HandleEnvelope(ManagementEnvelope envelope)
{
    var sender = Sender;
    var correlationId = envelope.CorrelationId;
    var user = envelope.User;

    var requiredRole = GetRequiredRole(envelope.Command);
    if (requiredRole != null && !user.Roles.Contains(requiredRole, StringComparer.OrdinalIgnoreCase))
    {
        sender.Tell(new ManagementUnauthorized(correlationId,
            $"Role '{requiredRole}' required for {envelope.Command.GetType().Name}"));
        return;
    }

    ProcessCommand(envelope, user)
        .PipeTo(sender,
            success: result => result,
            failure: ex => MapFault(ex, correlationId, envelope.Command));
}
```
ManagementCommandException carries a message safe to surface to callers. Any other exception is an unanticipated fault; only the correlation ID is returned so internal detail (server names, constraint names) is not disclosed.
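The mapping rule can be sketched as a single switch expression. This is a simplified stand-in for MapFault: the two exception classes are hypothetical re-declarations, the tuple replaces the real response records, and the wording of the generic message is an assumption.

```csharp
using System;

// Scope violations become "unauthorized"; known command errors surface their message;
// anything else returns only the correlation ID.
static (string Kind, string Message) MapFault(Exception ex, string correlationId) => ex switch
{
    SiteScopeViolationException scope => ("ManagementUnauthorized", scope.Message),
    ManagementCommandException known => ("ManagementError", known.Message),
    _ => ("ManagementError", $"Command failed. Correlation ID: {correlationId}")
};

var leaked = MapFault(new InvalidOperationException("FK_Sites_Areas violated"), "abc123");
var surfaced = MapFault(new ManagementCommandException("Site 'X' not found."), "abc123");

Console.WriteLine(leaked.Message);   // Command failed. Correlation ID: abc123
Console.WriteLine(surfaced.Message); // Site 'X' not found.

// Hypothetical stand-ins for the real exception types.
sealed class ManagementCommandException : Exception
{
    public ManagementCommandException(string message) : base(message) { }
}

sealed class SiteScopeViolationException : Exception
{
    public SiteScopeViolationException(string message) : base(message) { }
}
```

Note how the constraint name in the first exception never reaches the caller; an operator would look it up in the server log by correlation ID instead.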
Audit REST API (/api/audit/*)
AuditEndpoints.MapAuditAPI registers two GET endpoints that go directly to IAuditLogRepository, bypassing the actor:
- GET /api/audit/query: keyset-paged JSON result. Requires OperationalAudit permission (Admin / Audit / AuditReadOnly roles). Accepts channel, kind, status, sourceSiteId, correlationId, executionId, parentExecutionId, fromUtc, toUtc, pageSize, and cursor params afterOccurredAtUtc / afterEventId. Returns { events, nextCursor } where nextCursor is explicit null on the last page.
- GET /api/audit/export: server-side streaming export (CSV or JSONL) of all matching rows, paging the repository internally at 1 000 rows per batch and flushing after each batch. Requires AuditExport permission (Admin / Audit roles). format=parquet returns HTTP 501 (deferred).
Both endpoints apply the same HTTP Basic Auth / LDAP / role flow as /management. Site-scoped callers have their sourceSiteId filter intersected with their PermittedSiteIds; an explicit out-of-scope filter returns HTTP 403 rather than silently empty results.
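One way to read the site-scope rule is sketched below. ResolveSiteFilter is a hypothetical helper and the exact semantics are an interpretation of the paragraph above: an unrestricted caller keeps whatever filter it sent, a scoped caller with no filter is narrowed to its permitted sites, and an explicit out-of-scope filter is rejected.

```csharp
using System;
using System.Linq;

// Returns the HTTP status plus the site IDs the query is limited to.
// Status 200 with an empty array means "no site restriction".
static (int Status, string[] Sites) ResolveSiteFilter(string[] permittedSiteIds, string? requestedSiteId)
{
    var unrestricted = permittedSiteIds.Length == 0;
    if (requestedSiteId is null)
        return (200, unrestricted ? Array.Empty<string>() : permittedSiteIds);
    if (unrestricted || permittedSiteIds.Contains(requestedSiteId))
        return (200, new[] { requestedSiteId });
    return (403, Array.Empty<string>()); // explicit out-of-scope filter
}

var scoped = new[] { "site-a", "site-b" };

Console.WriteLine(ResolveSiteFilter(scoped, "site-a").Status);       // 200
Console.WriteLine(ResolveSiteFilter(scoped, "site-z").Status);       // 403
Console.WriteLine(ResolveSiteFilter(scoped, null).Sites.Length);     // 2
```

Returning 403 instead of an empty page makes a mistyped or forbidden site ID visible to the caller, which matches the stated intent of avoiding silently empty results.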
Debug stream (/debug-stream)
DebugStreamHub is a SignalR hub registered alongside the management endpoints. It authenticates on OnConnectedAsync (same Basic Auth / LDAP / role flow), requires the Deployer role, and enforces per-instance site scope on SubscribeInstance. Accepted connections receive an initial DebugViewSnapshot followed by incremental AttributeValueChanged and AlarmStateChanged events pushed from DebugStreamService.
Usage
Sending a command from the CLI
The CLI sends a single POST /management with JSON body and Basic Auth; it does not use ClusterClient directly. A typical request:
```http
POST /management
Authorization: Basic base64(username:password)
Content-Type: application/json

{
  "command": "ListSites",
  "payload": {}
}
```
A successful response is HTTP 200 with the JSON result. An authorization failure is HTTP 403 with { "error": "...", "code": "UNAUTHORIZED" }.
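For callers other than the CLI, the same request can be assembled with HttpClient types. The sketch below only builds the message and does not send it; BuildManagementRequest, the host name, and the credentials are hypothetical.

```csharp
using System;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;
using System.Text.Json;

static HttpRequestMessage BuildManagementRequest(
    Uri baseUri, string username, string password, string command, object payload)
{
    var token = Convert.ToBase64String(Encoding.UTF8.GetBytes($"{username}:{password}"));
    var body = JsonSerializer.Serialize(new { command, payload });

    var message = new HttpRequestMessage(HttpMethod.Post, new Uri(baseUri, "/management"))
    {
        Content = new StringContent(body, Encoding.UTF8, "application/json")
    };
    message.Headers.Authorization = new AuthenticationHeaderValue("Basic", token);
    return message;
}

// Hypothetical host and credentials.
var request = BuildManagementRequest(
    new Uri("https://central.example"), "alice", "secret", "ListSites", new { });
var json = await request.Content!.ReadAsStringAsync();

Console.WriteLine(request.RequestUri);            // https://central.example/management
Console.WriteLine(request.Headers.Authorization); // Basic YWxpY2U6c2VjcmV0
Console.WriteLine(json);                          // {"command":"ListSites","payload":{}}
```

Passing the message to HttpClient.SendAsync would then yield one of the status codes described above.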
Sending a command via ClusterClient
The ManagementActor is also reachable from any ClusterClient that has a contact point into the central cluster. The actor is registered with the ClusterClientReceptionist (served at /system/receptionist) and is addressed by its own path, /user/management. Callers wrap a ManagementEnvelope in a ClusterClient.Send addressed to that path and expect one of ManagementSuccess, ManagementError, or ManagementUnauthorized in reply.
Command Groups
DispatchCommand in ManagementActor.cs is the canonical enumeration of every supported command. The table below organizes them by domain area.
| Group | Commands | Minimum role |
|---|---|---|
| Templates | ListTemplates, GetTemplate, CreateTemplate, UpdateTemplate, DeleteTemplate, ValidateTemplate | Designer (mutations) |
| Template members | AddTemplateAttribute, UpdateTemplateAttribute, DeleteTemplateAttribute, AddTemplateAlarm, UpdateTemplateAlarm, DeleteTemplateAlarm, AddTemplateNativeAlarmSource, UpdateTemplateNativeAlarmSource, DeleteTemplateNativeAlarmSource, ListTemplateNativeAlarmSources, AddTemplateScript, UpdateTemplateScript, DeleteTemplateScript, AddTemplateComposition, DeleteTemplateComposition | Designer (mutations) |
| Template folders | ListTemplateFolders, CreateTemplateFolder, RenameTemplateFolder, MoveTemplateFolder, DeleteTemplateFolder, MoveTemplateToFolder | Designer (mutations) |
| Instances | ListInstances, GetInstance, CreateInstance, MgmtDeployInstance, MgmtEnableInstance, MgmtDisableInstance, MgmtDeleteInstance, SetConnectionBindings, SetInstanceOverrides, SetInstanceArea, SetInstanceAlarmOverride, DeleteInstanceAlarmOverride, ListInstanceAlarmOverrides, SetInstanceNativeAlarmSourceOverride, DeleteInstanceNativeAlarmSourceOverride, ListInstanceNativeAlarmSourceOverrides | Deployer (mutations) |
| Sites & areas | ListSites, GetSite, CreateSite, UpdateSite, DeleteSite, ListAreas, CreateArea, UpdateArea, DeleteArea | Administrator (site mutations); Designer (CreateArea, UpdateArea, DeleteArea) |
| Data connections | ListDataConnections, GetDataConnection, CreateDataConnection, UpdateDataConnection, DeleteDataConnection | Designer (mutations) |
| External systems | ListExternalSystems, GetExternalSystem, CreateExternalSystem, UpdateExternalSystem, DeleteExternalSystem, ListExternalSystemMethods, GetExternalSystemMethod, CreateExternalSystemMethod, UpdateExternalSystemMethod, DeleteExternalSystemMethod | Designer (mutations) |
| Notification lists / SMTP | ListNotificationLists, GetNotificationList, CreateNotificationList, UpdateNotificationList, DeleteNotificationList, ListSmtpConfigs, UpdateSmtpConfig | Designer (mutations) |
| Shared scripts | ListSharedScripts, GetSharedScript, CreateSharedScript, UpdateSharedScript, DeleteSharedScript | Designer (mutations) |
| Database connections | ListDatabaseConnections, GetDatabaseConnection, CreateDatabaseConnectionDef, UpdateDatabaseConnectionDef, DeleteDatabaseConnectionDef | Designer (mutations) |
| Inbound API methods | ListApiMethods, GetApiMethod, CreateApiMethod, UpdateApiMethod, DeleteApiMethod | Designer (mutations) |
| Security | ListRoleMappings, CreateRoleMapping, UpdateRoleMapping, DeleteRoleMapping, ListApiKeys, CreateApiKey, UpdateApiKey, DeleteApiKey, SetApiKeyMethods, ListScopeRules, AddScopeRule, DeleteScopeRule | Administrator |
| Deployments | MgmtDeployArtifacts, QueryDeployments, GetDeploymentDiff | Deployer |
| Health | GetHealthSummary, GetSiteHealth | Any authenticated user |
| Remote queries | QueryEventLogsCommand, QueryParkedMessagesCommand (any authenticated user); RetryParkedMessageCommand, DiscardParkedMessageCommand, DebugSnapshotCommand (Deployer) | Varies |
| Audit (legacy) | QueryAuditLog | Administrator |
| Transport | ExportBundle (Designer), PreviewBundle, ImportBundle (Administrator) | Varies |
ValidateTemplate builds a FlattenedConfiguration from the template's attributes, alarms, and scripts, runs the full ValidationService pipeline (collision detection, script compilation, trigger reference checks), and merges in naming-collision errors from TemplateService.DetectCollisionsAsync — all without a deployment.
SetInstanceOverrides validates every attribute name and lock status against the template before applying any write, making the batch all-or-nothing at the validation layer.
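The validate-everything-first pattern behind SetInstanceOverrides can be sketched as follows. ValidateOverrides and the name-to-locked dictionary are hypothetical simplifications of the template model, used only to show why a single bad entry rejects the whole batch.

```csharp
using System;
using System.Collections.Generic;

// templateAttributes maps attribute name -> locked flag.
// Every override is checked before any write happens; the caller applies
// the batch only when the returned list is empty.
static List<string> ValidateOverrides(
    IReadOnlyDictionary<string, bool> templateAttributes, IEnumerable<string> overrideNames)
{
    var errors = new List<string>();
    foreach (var name in overrideNames)
    {
        if (!templateAttributes.TryGetValue(name, out var locked))
            errors.Add($"Unknown attribute '{name}'.");
        else if (locked)
            errors.Add($"Attribute '{name}' is locked.");
    }
    return errors;
}

var template = new Dictionary<string, bool> { ["Setpoint"] = false, ["Units"] = true };
var problems = ValidateOverrides(template, new[] { "Setpoint", "Units", "Typo" });

Console.WriteLine(problems.Count == 0
    ? "apply all overrides"
    : $"rejected, nothing written: {problems.Count} error(s)");
// rejected, nothing written: 2 error(s)
```

Collecting all errors instead of stopping at the first lets the caller fix the whole batch in one round trip.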
Configuration
| Section | Key | Default | Description |
|---|---|---|---|
| ScadaBridge:ManagementService | CommandTimeout | 00:00:30 | Ask timeout the ManagementEndpoints applies when forwarding to the ManagementActor. A non-positive value falls back to the 30-second default. |
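The fallback rule for CommandTimeout amounts to a one-line guard. EffectiveTimeout is a hypothetical name; the sketch only illustrates the documented behavior for zero and negative values.

```csharp
using System;

// Non-positive values fall back to the 30-second default.
static TimeSpan EffectiveTimeout(TimeSpan configured) =>
    configured > TimeSpan.Zero ? configured : TimeSpan.FromSeconds(30);

Console.WriteLine(EffectiveTimeout(TimeSpan.Parse("00:01:30")));  // 00:01:30
Console.WriteLine(EffectiveTimeout(TimeSpan.Zero));               // 00:00:30
Console.WriteLine(EffectiveTimeout(TimeSpan.FromSeconds(-5)));    // 00:00:30
```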
The 200 MB per-request body cap (ManagementEndpoints.MaxManagementRequestBodyBytes) is hard-coded; it exists to accommodate Transport (#24) Import calls where a 100 MB raw bundle base64-inflates to roughly 140 MB plus the envelope overhead.
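The "roughly 140 MB" figure follows from base64 emitting 4 output characters for every 3 input bytes, with the last group padded. The sketch below works the arithmetic, assuming "100 MB" means 100 × 1024 × 1024 bytes; Base64Length is a hypothetical helper, and the result excludes the JSON envelope around the encoded string.

```csharp
using System;

// Base64 output length: 4 characters per 3-byte group, rounding the group count up.
static long Base64Length(long rawBytes) => 4 * ((rawBytes + 2) / 3);

long raw = 100L * 1024 * 1024;      // 104,857,600 bytes
long encoded = Base64Length(raw);

Console.WriteLine(encoded);         // 139810136

// Cross-check the formula against the framework on a small input.
Console.WriteLine(Convert.ToBase64String(new byte[10]).Length == Base64Length(10)); // True
```

At about 139.8 million bytes, the encoded bundle leaves headroom under the 200 MB cap for the rest of the request body.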
Dependencies & Interactions
- Commons (#16): owns the message contracts (Messages/Management/), ManagementEnvelope, ManagementCommandRegistry, AuthenticatedUser, and ManagementSuccess / ManagementError / ManagementUnauthorized response types.
- Configuration Database (#17): every repository (ITemplateEngineRepository, ISiteRepository, IExternalSystemRepository, INotificationRepository, ISecurityRepository, IInboundApiRepository, IDeploymentManagerRepository, ICentralUiRepository) and IAuditService are backed by EF Core against the central MS SQL database. Management Service resolves them per-command through scoped DI.
- Template Engine (#1): TemplateService, TemplateFolderService, SharedScriptService, and the ValidationService handle template authoring and validation. Management Service is the sole entry point for template mutations from outside the Central UI.
- Deployment Manager (#2): DeploymentService and ArtifactDeploymentService own the deployment pipeline. MgmtDeployInstance and MgmtDeployArtifacts delegate here.
- Central–Site Communication (#5): CommunicationService routes QueryEventLogsCommand, QueryParkedMessagesCommand, RetryParkedMessageCommand, DiscardParkedMessageCommand, and DebugSnapshotCommand to site actors via ClusterClient. Deployment commands also flow through the communication layer.
- Security & Auth (#10): ILdapAuthService and RoleMapper authenticate and map roles on every HTTP request; the Roles constants and IInboundApiKeyAdmin are also consumed here.
- Health Monitoring (#11): ICentralHealthAggregator answers GetHealthSummary and GetSiteHealth queries synchronously from its in-memory state.
- Audit Log (#23): AuditEndpoints reads the central AuditLog table via IAuditLogRepository directly (no actor hop). QueryAuditLogCommand through /management is a legacy path for the configuration-change audit via ICentralUiRepository.
- CLI (#19): the primary consumer of POST /management and the /api/audit/* endpoints. Constructs ManagementEnvelope-shaped JSON, sends Basic Auth, and deserializes the response.
- Host (#15): AkkaHostedService creates the ManagementActor, registers it with ClusterClientReceptionist, and sets ManagementActorHolder.ActorRef so the HTTP endpoint can reach it.
- Design spec: Component-ManagementService.md.
Troubleshooting
Actor not ready (HTTP 503)
If POST /management returns 503 SERVICE_UNAVAILABLE, ManagementActorHolder.ActorRef is null — the actor system has not finished starting. This resolves itself once AkkaHostedService.StartAsync completes. The /health/ready endpoint is the gating signal; traffic should not reach /management before it returns 200.
Command timeout (HTTP 504)
A 504 response means the Ask to ManagementActor did not return within the configured CommandTimeout. The server log entry includes the CorrelationId from the response body. Common causes: a long-running deployment waiting on a site that is offline, or a database query against a cold EF Core connection. Increasing ScadaBridge:ManagementService:CommandTimeout buys time while the root cause is investigated.
Unexpected internal error
Any exception that is not a ManagementCommandException or SiteScopeViolationException maps to a generic COMMAND_FAILED error with the correlation ID. The server log at Error level will contain the full exception, keyed by CorrelationId. ManagementCommandException messages are intentionally surfaced verbatim; all other exception messages are suppressed on the wire to avoid leaking internal detail.
Audit log export stalls mid-stream
GET /api/audit/export streams rows in pages of 1 000 and flushes after each page. If the response body stops arriving, check whether a proxy is buffering the response (the endpoint sets Cache-Control: no-store to defeat most buffers). The pageSize parameter on /api/audit/query caps at 1 000; requests above that are silently clamped.