scadalink-design/docs/requirements/Component-CentralUI.md
Joseph Doherty c929562e41 docs(audit): apply cross-bundle review fixes before merge
Final cross-bundle reviewer identified 7 inconsistencies that the per-bundle
reviewers couldn't see; all fixed in one logical commit.

Critical:
- HighLevelReqs AL-3: drop 'then upsert-on-newer-status' — AuditLog is
  strictly append-only (correct for SiteCalls/Notifications, wrong for
  the immutable AuditLog shadow).
- Component-AuditLog Error rate KPI: align with HealthMonitoring's
  exclusion list (Success/Delivered/Enqueued) rather than just non-Success;
  otherwise every Delivered notification or Enqueued cached call would be
  counted as an error.

Important:
- Component-AuditLog line 154: ISiteAuditWriter -> IAuditWriter (canonical
  name per Commons and the rest of this doc).
- Component-AuditLog Central direct-write paragraph: convert remaining
  slash notation (ApiInbound/Completed, Notification/Attempt,
  Notification/Terminal) to dot notation used everywhere else.
- Component-ClusterInfrastructure: scope SiteCallAuditActor to
  reconciliation + KPIs + Retry/Discard relay; cached-telemetry ingest is
  AuditLogIngestActor's role per Combined Telemetry contract.
- Component-CentralUI Audit Log page: state the OperationalAudit read
  permission and the read-vs-export split (matching CLI doc).
- Component-NotificationOutbox: add never-fail-the-action invariant for
  dispatcher audit writes.

Minor:
- Component-InboundAPI: 'Non-blocking semantics' was ambiguous (could be
  read as async); reword to 'Fail-soft' — the write is still synchronous
  before flush, but failures are caught and don't change the response.
- Component-CLI: realign audit-query/audit-export flags to actually match
  the Central UI Audit Log filter set (channel, kind, status, site,
  instance, target, actor, correlation-id, errors-only); drop --user and
  --entity-id which are IAuditService concepts, not Audit Log columns.
- Component-AuditLog KPI tile names: 'Volume/Error rate/Backlog' ->
  'Audit volume/Audit error rate/Audit backlog' (matches Central UI and
  Health Monitoring); drop the two orphan KPIs (Top inbound callers, Top
  outbound 5xx) that were never surfaced anywhere.
- Component-AuditLog Interactions: re-attribute DbOutbound emissions to
  ESG (where Database.* lives) with a note that Site Runtime is the API
  surface for scripts.
- HighLevelReqs AL-12: drop 'and reconciliation operations' (CLI has no
  reconcile command; reconciliation is an internal self-healing pull).
  Add note that verify-chain becomes operational once AL-11's hash chain
  ships.
2026-05-20 09:00:11 -04:00


Component: Central UI

Purpose

The Central UI is a web-based management interface hosted on the central cluster. It provides all configuration, deployment, monitoring, and troubleshooting workflows for the SCADA system. There is no live machine data visualization — the UI is focused on system management, with the exception of on-demand debug views.

Location

Central cluster only. Sites have no user interface.

Technology

  • Framework: Blazor Server (ASP.NET Core). UI logic executes on the server, and updates are pushed to the browser via SignalR.
  • Keeps the entire stack in C#/.NET, consistent with the rest of the system (Akka.NET, EF Core).
  • SignalR provides built-in support for real-time UI updates.

Failover Behavior

  • A load balancer sits in front of the central cluster and routes to the active node.
  • On central failover, the Blazor Server SignalR circuit is interrupted. The browser automatically attempts to reconnect via SignalR's built-in reconnection logic.
  • Since sessions use authentication cookies carrying an embedded JWT (not server-side state), the user's authentication survives failover — the new active node validates the same cookie-embedded JWT. No re-login required if the token is still valid.
  • Active debug view streams and in-progress deployment status subscriptions are lost on failover and must be re-opened by the user.
  • Both central nodes share the same ASP.NET Data Protection keys (stored in the configuration database or shared configuration) so that authentication cookies and anti-forgery tokens remain valid across failover.
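
The failover behavior above depends on both nodes holding the same key material. The following minimal Python sketch illustrates that idea only: it uses a simplified HMAC-signed token as a stand-in for the cookie-embedded JWT, and it is not the ASP.NET Data Protection API. The names `SHARED_KEY`, `issue_token`, and `validate_token` are hypothetical.

```python
import base64
import hashlib
import hmac
import json

# Hypothetical shared key; stands in for key material both central nodes load.
SHARED_KEY = b"example-shared-key"


def _b64(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _unb64(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def issue_token(user: str, ttl_seconds: float, key: bytes, now: float) -> str:
    """Sign a small claims payload (simplified stand-in for a JWT)."""
    payload = _b64(json.dumps({"sub": user, "exp": now + ttl_seconds}).encode())
    sig = _b64(hmac.new(key, payload.encode(), hashlib.sha256).digest())
    return f"{payload}.{sig}"


def validate_token(token: str, key: bytes, now: float):
    """Return the user name if the signature matches and the token is unexpired."""
    payload, _, sig = token.partition(".")
    expected = _b64(hmac.new(key, payload.encode(), hashlib.sha256).digest())
    if not hmac.compare_digest(sig, expected):
        return None  # signed with a different key, or tampered with
    claims = json.loads(_unb64(payload))
    if claims["exp"] <= now:
        return None  # expired: the user would need to log in again
    return claims["sub"]
```

A token issued by one node validates on the other only because both use the same key; a node with a different key rejects it, which is the failure mode that sharing the keys avoids.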

Real-Time Updates

  • Debug view: Real-time display of attribute values and alarm states via gRPC streaming. When the user opens a debug view, a DebugStreamBridgeActor on the central side opens a gRPC server-streaming subscription to the site's SiteStreamGrpcServer for the selected instance, then requests an initial DebugViewSnapshot via ClusterClient. Ongoing AttributeValueChanged and AlarmStateChanged events flow via the gRPC stream (not through ClusterClient) to the bridge actor, which delivers them to the Blazor component via callbacks that call InvokeAsync(StateHasChanged) to push UI updates through the built-in SignalR circuit.
  • Health dashboard: Site status, connection health, error rates, and buffer depths update via a 10-second auto-refresh timer. Since health reports arrive from sites every 30 seconds, a 10-second poll interval shows each new report within 10 seconds of its arrival without unnecessary overhead.
  • Deployment status: Pending/in-progress/success/failed transitions push to the UI immediately via SignalR (built into Blazor Server). No polling required for deployment tracking.
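
The health dashboard timing above can be checked with a little arithmetic. This sketch assumes poll ticks land on multiples of the poll interval, which is a simplification; the function names are hypothetical.

```python
import math

REPORT_INTERVAL_S = 30  # sites send a health report every 30 seconds
POLL_INTERVAL_S = 10    # the dashboard refreshes every 10 seconds


def display_delay(arrival_s: float, poll_interval_s: float = POLL_INTERVAL_S) -> float:
    """Seconds between a report arriving and the next poll tick that shows it."""
    next_poll = math.ceil(arrival_s / poll_interval_s) * poll_interval_s
    return next_poll - arrival_s


def polls_per_report(report_interval_s: float = REPORT_INTERVAL_S,
                     poll_interval_s: float = POLL_INTERVAL_S) -> float:
    """How many dashboard polls happen per site reporting cycle."""
    return report_interval_s / poll_interval_s
```

Under these assumptions the delay is always below one poll interval, and the dashboard polls three times per reporting cycle.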

Responsibilities

  • Provide authenticated access to all management workflows.
  • Enforce role-based access control in the UI (Admin, Design, Deployment with site scoping).
  • Present data from the configuration database, and from site clusters via remote queries.

Workflows / Pages

Template Authoring (Design Role)

  • The /design/templates page uses a split-pane layout: a folder/template tree sidebar on the left and the editor on the right.
  • The tree shows nested TemplateFolder entities with their templates underneath; composition children render inline as leaf nodes beneath their owning template (right-click "Open composed template" reveals and selects the target).
  • Per-kind context menus on folder, template, and composition nodes expose the relevant operations (new folder, new template, rename, move, delete, move to folder). Native HTML5 drag-drop reorganizes templates between folders and reparents folders; a drop that would create a folder cycle is rejected with a toast. Tree expansion state persists in sessionStorage, and deep links (/design/templates/{id}) reveal and select the target node.
  • Create, edit, and delete templates.
  • Template deletion is blocked if any instances or child templates reference the template. The UI displays the references preventing deletion.
  • Manage template hierarchy (inheritance) — visual tree of parent/child relationships.
  • Manage composition — add/remove feature module instances within templates. Naming collision detection provides immediate feedback if composed modules introduce duplicate attribute, alarm, or script names.
  • Define and edit attributes, alarms, and scripts on templates.
  • Set lock flags on attributes, alarms, and scripts.
  • Visual indicator showing inherited vs. locally defined vs. overridden members.
  • On-demand validation: A "Validate" action allows Design users to run comprehensive pre-deployment validation (flattening, naming collisions, script compilation, trigger references) without triggering a deployment. Provides early feedback during authoring.
  • Last-write-wins editing — no pessimistic locks or conflict detection on templates.
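
The cycle check on folder drag-drop could look roughly like the sketch below. It assumes folders are held as a map of folder id to parent id (`None` for a root); that representation and the function names are hypothetical, not taken from the TemplateFolder model.

```python
from typing import Dict, Optional


def would_create_cycle(parents: Dict[int, Optional[int]],
                       folder_id: int,
                       new_parent_id: Optional[int]) -> bool:
    """True if moving folder_id under new_parent_id would make it its own ancestor."""
    current = new_parent_id
    while current is not None:
        if current == folder_id:
            return True
        current = parents.get(current)  # walk up towards the root
    return False


def reparent(parents: Dict[int, Optional[int]],
             folder_id: int,
             new_parent_id: Optional[int]) -> bool:
    """Apply the move unless it creates a cycle; False means 'reject the drop'."""
    if would_create_cycle(parents, folder_id, new_parent_id):
        return False  # the UI would surface this as a toast
    parents[folder_id] = new_parent_id
    return True
```

Walking up from the proposed parent keeps the check proportional to tree depth rather than tree size.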

Shared Script Management (Design Role)

  • Create, edit, and delete shared (global) scripts.
  • Shared scripts are not associated with any template.
  • On-demand validation (compilation check) available.

External System Management (Design Role)

  • Define external system contracts: connection details, API method definitions (parameters, return types).
  • Define retry settings per external system (max retry count, fixed time between retries).
  • The external system detail page includes a "Recent activity" link that opens the Audit Log page pre-filtered to Channel = ApiOutbound and Target starts-with the system name — surfacing the system's recent outbound API audit history.

Database Connection Management (Design Role)

  • Define named database connections: server, database, credentials.
  • Define retry settings per connection (max retry count, fixed time between retries).
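
External systems and database connections share the same retry shape: a maximum retry count and a fixed delay between retries. As a rough illustration, the sketch below computes when retries would fire. The `RetrySettings` type and `retry_times` helper are hypothetical, and what happens once retries are exhausted is outside this sketch.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RetrySettings:
    max_retry_count: int     # retries after the first attempt
    retry_interval_s: float  # fixed time between retries (no backoff)


def retry_times(first_attempt_s: float, settings: RetrySettings) -> List[float]:
    """Times at which each retry fires, relative to the same clock as the first attempt."""
    return [
        first_attempt_s + n * settings.retry_interval_s
        for n in range(1, settings.max_retry_count + 1)
    ]
```

With a fixed interval the total retry window is simply `max_retry_count * retry_interval_s`, which makes the worst-case delay easy to reason about when choosing values.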

Notification List Management (Design Role)

  • Create, edit, and delete notification lists.
  • Each notification list has a Type: Email now, with Teams and other types planned. The type determines the type-specific targets a list carries.
  • Manage recipients (name + email) within each Email list.
  • Configure SMTP settings.

Site & Data Connection Management (Admin Role)

  • Create, edit, and delete site definitions, including Akka node addresses (NodeA/NodeB) and gRPC node addresses (GrpcNodeA/GrpcNodeB).
  • Define data connections and assign them to sites (name, protocol type, connection details).
  • Data connection form: "Primary Endpoint Configuration" (required JSON text area) and optional "Backup Endpoint Configuration" (collapsible section, hidden by default, revealed via "Add Backup Endpoint" button; "Remove Backup" button when editing an existing backup). "Failover Retry Count" numeric input (default 3, min 1, max 20) is visible only when a backup endpoint is configured.
  • Data connection list page: Shows Primary Config and Backup Config columns. Active Endpoint column populated from health reports.
  • The site detail page exposes a new "Audit feed" tab that hosts the Audit Log page pre-filtered to Site = <site> — an in-context view of every operational audit event for that site.
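
The data connection form rules above can be sketched as a validation function. This is an illustration only: the function name and error messages are hypothetical, while the limits (default 3, min 1, max 20) come from the form description.

```python
import json
from typing import List, Optional

DEFAULT_FAILOVER_RETRY_COUNT = 3
MIN_FAILOVER_RETRY_COUNT = 1
MAX_FAILOVER_RETRY_COUNT = 20


def _is_json(text: str) -> bool:
    try:
        json.loads(text)
        return True
    except ValueError:  # json.JSONDecodeError is a ValueError subclass
        return False


def validate_data_connection(primary_json: str,
                             backup_json: Optional[str] = None,
                             failover_retry_count: int = DEFAULT_FAILOVER_RETRY_COUNT) -> List[str]:
    """Return a list of validation errors; an empty list means the form is valid."""
    errors: List[str] = []
    if not primary_json.strip():
        errors.append("Primary Endpoint Configuration is required.")
    elif not _is_json(primary_json):
        errors.append("Primary Endpoint Configuration must be valid JSON.")
    # The retry count only applies (and is only shown) when a backup exists.
    if backup_json is not None:
        if not _is_json(backup_json):
            errors.append("Backup Endpoint Configuration must be valid JSON.")
        if not MIN_FAILOVER_RETRY_COUNT <= failover_retry_count <= MAX_FAILOVER_RETRY_COUNT:
            errors.append("Failover Retry Count must be between 1 and 20.")
    return errors
```

Ignoring the retry count when no backup is configured mirrors the form hiding that input in the same situation.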

Inbound API Management (Admin Role for keys, Design Role for methods)

  • Manage inbound API keys (create, enable / disable, delete) and define API methods (name, parameters, return values, approved keys, implementation script).
  • The API key detail page includes a "Recent calls" link that opens the Audit Log page pre-filtered to Actor = <key name> and Channel = ApiInbound — surfacing the key's recent inbound-call audit history.

Area Management (Admin Role)

  • Define hierarchical area structures per site.
  • Parent-child area relationships.
  • Assign areas when managing instances.

Instance Management (Deployment Role)

  • Create instances from templates at a specific site.
  • Assign instances to areas.
  • Bind data connections — per-attribute binding where each attribute with a data source reference individually selects its data connection from the site's available connections. Bulk assignment supported: select multiple attributes and assign a data connection to all of them at once.
  • Set instance-level attribute overrides (non-locked attributes only).
  • Filter/search instances by site, area, template, or status.
  • Disable instances — stops data collection, script triggers, and alarm evaluation at the site while retaining the deployed configuration.
  • Enable instances — re-activates a disabled instance.
  • Delete instances — removes the running configuration from the site. Blocked if the site is unreachable. Store-and-forward messages are not cleared.
  • The instance detail page exposes a new "Audit feed" tab that hosts the Audit Log page pre-filtered to the instance (Site = <site> and the Instance / Script filter set to the instance unique name) — an in-context view of every operational audit event involving that instance.
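
Bulk binding assignment, as described above, amounts to applying one data connection to a selected set of attributes. A minimal sketch, assuming bindings are a map of attribute name to connection name (a hypothetical shape):

```python
from typing import Dict, Iterable, Set


def bulk_assign(bindings: Dict[str, str],
                selected_attributes: Iterable[str],
                connection: str,
                site_connections: Set[str]) -> Dict[str, str]:
    """Return a copy of bindings with every selected attribute bound to connection."""
    if connection not in site_connections:
        # Only connections available at the instance's site may be selected.
        raise ValueError(f"{connection!r} is not available at this site")
    updated = dict(bindings)
    for attribute in selected_attributes:
        updated[attribute] = connection
    return updated
```

Returning a new map rather than mutating in place keeps the previous bindings available until the user saves.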

Deployment (Deployment Role)

  • View list of instances with staleness indicators (deployed config differs from template-derived config).
  • Filter by site, area, template.
  • View diff between deployed and current template-derived configuration.
  • Deploy updated configuration to individual instances. Pre-deployment validation runs automatically before any deployment is sent — validation errors are displayed and block deployment.
  • Track deployment status (pending, in-progress, success, failed).
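
One plausible way to compute the staleness indicator is to compare a fingerprint of the deployed configuration with one of the current template-derived configuration. The sketch below assumes both are JSON-serializable dictionaries; that representation and the helper names are assumptions, and the real diff is provided by the Template Engine.

```python
import hashlib
import json
from typing import List

_MISSING = object()  # sentinel so a key present on one side only counts as changed


def config_fingerprint(config: dict) -> str:
    """Hash a canonical JSON form so key order does not affect the result."""
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def is_stale(deployed: dict, current: dict) -> bool:
    """True when the deployed config differs from the template-derived config."""
    return config_fingerprint(deployed) != config_fingerprint(current)


def changed_keys(deployed: dict, current: dict) -> List[str]:
    """Top-level keys that were added, removed, or changed (a very coarse diff)."""
    keys = deployed.keys() | current.keys()
    return sorted(k for k in keys
                  if deployed.get(k, _MISSING) != current.get(k, _MISSING))
```

Storing the fingerprint at deploy time would let the instance list show staleness without re-diffing every instance.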

System-Wide Artifact Deployment (Deployment Role)

  • Explicitly deploy shared scripts, external system definitions, database connection definitions, and data connection definitions to all sites or to an individual site. (Notification lists and SMTP configuration are central-only and are not deployed.)
  • Per-site deployment: A "Deploy Artifacts" button on the Sites admin page allows deploying all artifacts to an individual site.
  • Deploy all: A bulk action deploys artifacts to all sites at once.
  • This is a separate action from instance deployment — system-wide artifacts are not automatically pushed when definitions change.
  • Track per-site deployment status.

Debug View (Deployment Role)

  • Select a deployed instance and open a live debug view.
  • Real-time streaming of all attribute values (with quality and timestamp) and alarm states for that instance.
  • The DebugStreamService creates a DebugStreamBridgeActor on the central side. The bridge actor opens a gRPC server-streaming subscription to the site's SiteStreamGrpcServer for the selected instance, then requests an initial DebugViewSnapshot via ClusterClient.
  • Ongoing events (AttributeValueChanged, AlarmStateChanged) flow via the gRPC stream directly to the bridge actor — they do not pass through ClusterClient.
  • Events are delivered to the Blazor component via callbacks, which call InvokeAsync(StateHasChanged) to push UI updates through the built-in SignalR circuit.
  • A pulsing "Live" indicator replaces the static "Connected" badge when streaming is active.
  • Stream includes attribute values formatted as [InstanceUniqueName].[AttributePath].[AttributeName] and alarm states formatted as [InstanceUniqueName].[AlarmName].
  • Subscribe-on-demand — stream starts when opened, stops when closed.
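
The snapshot-then-stream flow above can be sketched as a small state holder: the initial DebugViewSnapshot seeds the view, and each later event overwrites one entry. The dictionary shapes for the snapshot and events are hypothetical, and the square brackets in the documented key format are treated here as placeholders.

```python
from typing import Dict


def attribute_key(instance: str, path: str, name: str) -> str:
    """Key in the form InstanceUniqueName.AttributePath.AttributeName."""
    return f"{instance}.{path}.{name}"


def alarm_key(instance: str, alarm: str) -> str:
    """Key in the form InstanceUniqueName.AlarmName."""
    return f"{instance}.{alarm}"


class DebugViewState:
    """Holds the latest value per key, seeded by a snapshot and updated by events."""

    def __init__(self) -> None:
        self.attributes: Dict[str, dict] = {}
        self.alarms: Dict[str, str] = {}

    def apply_snapshot(self, snapshot: dict) -> None:
        # The snapshot replaces whatever the view held before.
        self.attributes = dict(snapshot.get("attributes", {}))
        self.alarms = dict(snapshot.get("alarms", {}))

    def apply_event(self, event: dict) -> None:
        if event["type"] == "AttributeValueChanged":
            self.attributes[event["key"]] = {
                "value": event["value"],
                "quality": event["quality"],
                "timestamp": event["timestamp"],
            }
        elif event["type"] == "AlarmStateChanged":
            self.alarms[event["key"]] = event["state"]
```

In the Blazor component, each `apply_event` call would be followed by the UI refresh that the callbacks trigger.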

Parked Message Management (Deployment Role)

  • Query sites for parked messages (external system calls, cached DB writes). (Parked notifications are managed centrally on the Notification Outbox page, not here.)
  • View message details (target, payload, retry count, timestamps).
  • Retry or discard individual parked messages.

Notification Outbox (Deployment Role)

  • Monitor and manage centrally-delivered notifications. The Notification Outbox dispatches every notification store-and-forwarded from sites and logs each one to the central Notifications table.
  • KPI tiles at the top of the page: queue depth (Pending + Retrying), stuck count, parked count, delivered in the last interval, and oldest pending age. The KPIs are central-computed on demand from the Notifications table.
  • A queryable notification list filterable by status, type, source site, notification list, and time range, with a stuck-only toggle and keyword search on subject. Each row shows the notification's status, retry count, last error, and key timestamps.
  • Retry and Discard actions are available on parked notifications: Retry returns the notification to Pending and resets RetryCount / NextAttemptAt; Discard moves it to Discarded. The row is retained either way so the table stays a complete audit record.
  • Each row exposes a "View audit history" action that opens the Audit Log page pre-filtered to CorrelationId = NotificationId, surfacing every operational audit event recorded for that notification.
  • Stuck rows are visually badged — a notification is stuck if it is Pending or Retrying and older than the configurable stuck-age threshold. Stuck detection is display-only; there is no automated escalation or alerting.
  • All queries are served from the central Notifications table — no remote per-site queries are needed, unlike the Parked Message Management page.
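
The stuck rule and the headline KPIs above can be illustrated as follows. The sketch assumes age is measured from a creation timestamp and that "oldest pending age" considers Pending rows only; both are assumptions, as are the type and field names.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable, Optional

QUEUED_STATUSES = ("Pending", "Retrying")


@dataclass(frozen=True)
class Notification:
    status: str
    created_at: datetime  # assumed reference point for "age"


def is_stuck(n: Notification, now: datetime, threshold: timedelta) -> bool:
    """Pending or Retrying, and older than the configurable stuck-age threshold."""
    return n.status in QUEUED_STATUSES and (now - n.created_at) > threshold


def outbox_kpis(rows: Iterable[Notification], now: datetime, threshold: timedelta) -> dict:
    rows = list(rows)
    pending_ages = [now - n.created_at for n in rows if n.status == "Pending"]
    oldest: Optional[timedelta] = max(pending_ages) if pending_ages else None
    return {
        "queue_depth": sum(n.status in QUEUED_STATUSES for n in rows),
        "stuck": sum(is_stuck(n, now, threshold) for n in rows),
        "parked": sum(n.status == "Parked" for n in rows),
        "oldest_pending_age": oldest,
    }
```

Because stuck detection is display-only, a function like `is_stuck` would be evaluated at query time rather than stored on the row.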

Site Calls (Deployment Role)

  • Monitor cached calls store-and-forwarded from sites — ExternalSystem.CachedCall() and Database.CachedWrite() operations. Scoped to the ExternalCall and DatabaseWrite kinds only; notifications keep their separate Notification Outbox page and are not merged here.
  • A queryable cached-call list filterable by site, kind, status, and time range. Each row shows the call's timestamp, site, kind, target summary, status badge, retry count, and last error.
  • Retry and Discard actions are available on Parked rows only — Failed rows are not actionable, since a permanent failure would simply fail again and its error was already returned synchronously to the calling script. The actions issue central→site commands to the owning site; if the site is offline the UI surfaces a "site unreachable" message.
  • Each row exposes a "View audit history" action that opens the Audit Log page pre-filtered to CorrelationId = TrackedOperationId, showing every operational audit event recorded for that cached call.
  • Data is served from the central Site Call Audit component's SiteCalls table. The page is read-mostly — an eventually-consistent mirror of site state; the site remains the source of truth.
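
The action rules for Site Calls rows reduce to a small decision: only Parked rows are actionable, and an offline site yields a "site unreachable" message. A sketch with hypothetical function names and return strings:

```python
from typing import Tuple

ACTIONABLE_STATUS = "Parked"


def allowed_actions(status: str) -> Tuple[str, ...]:
    """Retry and Discard are offered on Parked rows only."""
    return ("Retry", "Discard") if status == ACTIONABLE_STATUS else ()


def request_action(action: str, status: str, site_online: bool) -> str:
    """Outcome the UI would report for a Retry or Discard request."""
    if action not in allowed_actions(status):
        return "not allowed"
    if not site_online:
        return "site unreachable"  # the command must reach the owning site
    return f"{action} sent to site"
```

Checking the row status before site reachability means a Failed row never triggers a central-to-site command at all.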

Health Monitoring Dashboard (All Roles)

  • Overview of all sites with online/offline status.
  • Per-site detail: active/standby node status, data connection health, script error rates, alarm evaluation error rates, store-and-forward buffer depths.
  • Headline Notification Outbox KPI tiles — queue depth, stuck count, and parked count. These are central-computed by the Notification Outbox from the central Notifications table (not part of any site health report). The full outbox view is on the dedicated Notification Outbox page.
  • Headline Site Call Audit KPI tiles — buffered count, parked count, and failed-last-interval. These are central-computed by the Site Call Audit component from the central SiteCalls table (not part of any site health report). The full cached-call view is on the dedicated Site Calls page.
  • Headline Audit KPI tiles — three tiles in a new "Audit" KPI group: Audit volume, Audit error rate, and Audit backlog. These are sourced from the Audit Log component (#23) and Health Monitoring per the metric definitions in Component-HealthMonitoring.md; the dashboard simply surfaces them. The full audit query view is on the dedicated Audit Log page.

Site Event Log Viewer (Deployment Role)

  • Query site event logs remotely.
  • Filter by event type, time range, instance.
  • View script executions, alarm events (activations, clears, evaluation errors), deployment events (including script compilation results), connection status changes, store-and-forward activity, instance lifecycle events (enable, disable, delete).

Audit Log (Admin / Audit Role)

  • Lives under a new top-level "Audit" nav group (sibling to Notifications). In v1 the Audit nav group contains this single Audit Log page; the pre-existing Configuration Audit Log Viewer remains its own page below.
  • Global query / filter / drilldown over the central AuditLog table maintained by the Audit Log component (#23). Read-only — the table is append-only, so there are no edit actions on rows.
  • Read access to the page requires the OperationalAudit permission (Security & Auth #10). Per-site row scoping reuses the existing site-permission model: a user sees only rows for sites they are authorized to operate. Bulk export (see below) additionally requires AuditExport. The split mirrors the CLI's permission model (see Component-CLI.md).
  • Filter bar (top of page, collapses to a single row when not focused):
    • Time range — relative (15m / 1h / 24h / 7d) or custom.
    • Channel — multi-select: ApiOutbound, DbOutbound, Notification, ApiInbound.
    • Kind — multi-select; the available options are filtered by the selected Channels.
    • Status — multi-select.
    • Site — multi-select, scoped to the user's authorized sites.
    • Instance / Script — text search with autocomplete.
    • Target — text search (system + method, DB connection, list name).
    • Actor — text search (inbound API key name).
    • CorrelationId — paste a TrackedOperationId / NotificationId / request-id to see the full event sequence for one operation.
    • "Errors only" toggle — shorthand for Status NOT IN (Success, Delivered, Enqueued).
  • Results grid (custom Blazor + Bootstrap component, consistent with the rest of the UI — no third-party grid):
    • Columns, all resizable and reorderable, persisted per user: OccurredAtUtc, Site, Channel, Kind, Status, Target, Actor, DurationMs, HttpStatus, ErrorMessage.
    • Keyset pagination ordered by (OccurredAtUtc desc, EventId desc). Default page size 100.
    • Clicking a row opens the drilldown drawer.
  • Drilldown drawer:
    • Pretty-prints RequestSummary / ResponseSummary — JSON is auto-detected and syntax-highlighted; SQL is syntax-highlighted.
    • Surfaces redaction indicators wherever headers or fields were stripped at write time, per the Audit Log component's "Payload Capture Policy".
    • "Copy as cURL" action on ApiOutbound and ApiInbound rows.
    • "Show all events for this operation" link — re-applies the current view filtered by the row's CorrelationId.
  • Export button on the page header streams a server-side CSV of the current filter (default cap 100k rows; larger exports go through the CLI). Requires the AuditExport permission.
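
Two mechanics from the Audit Log page lend themselves to a short illustration: the "Errors only" shorthand and keyset pagination ordered by (OccurredAtUtc desc, EventId desc). The sketch below works over an in-memory list rather than the AuditLog table, and it assumes ISO-8601 UTC timestamp strings and integer EventId values; both are assumptions made for the example.

```python
from typing import List, Optional, Tuple

NON_ERROR_STATUSES = {"Success", "Delivered", "Enqueued"}
Cursor = Tuple[str, int]  # (OccurredAtUtc, EventId) of the last row already shown


def is_error(status: str) -> bool:
    """'Errors only' means Status NOT IN (Success, Delivered, Enqueued)."""
    return status not in NON_ERROR_STATUSES


def fetch_page(rows: List[dict],
               page_size: int = 100,
               after: Optional[Cursor] = None,
               errors_only: bool = False) -> Tuple[List[dict], Optional[Cursor]]:
    """Return one page plus the cursor for the next page (None when exhausted)."""
    def key(r: dict) -> Cursor:
        return (r["OccurredAtUtc"], r["EventId"])

    candidates = [r for r in rows if not errors_only or is_error(r["Status"])]
    ordered = sorted(candidates, key=key, reverse=True)
    if after is not None:
        # Keyset step: keep only rows strictly "older" than the cursor.
        ordered = [r for r in ordered if key(r) < after]
    page = ordered[:page_size]
    next_cursor = key(page[-1]) if len(page) == page_size else None
    return page, next_cursor
```

Including EventId in the cursor breaks ties between rows with the same timestamp, so no row is skipped or repeated across pages, and new rows arriving at the top do not shift later pages the way offset pagination would.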

Configuration Audit Log Viewer (Admin Role)

  • Pre-existing viewer for the IAuditService configuration-change log (template / instance / site / etc. before-after edits). Lives under the same Audit nav group as the operational Audit Log above.
  • Query the central configuration audit log.
  • Filter by user, entity type, action type, time range.
  • View before/after state for each change.

LDAP Group Mapping (Admin Role)

  • Map LDAP groups to system roles (Admin, Design, Deployment).
  • Configure site-scoping for Deployment role groups.
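
Resolving a user's effective access from LDAP group mappings could look like the sketch below. The `GroupMapping` type is hypothetical, and treating a Deployment mapping with no site list as "all sites" is an assumption made for the example rather than a stated rule.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Iterable, Optional, Set, Tuple


@dataclass(frozen=True)
class GroupMapping:
    role: str                                # "Admin", "Design", or "Deployment"
    sites: Optional[FrozenSet[str]] = None   # None = no site scoping (assumed: all sites)


def resolve_access(user_groups: Iterable[str],
                   mappings: Dict[str, GroupMapping]
                   ) -> Tuple[Set[str], Optional[Set[str]]]:
    """Return (roles, deployment_sites); deployment_sites is None when unrestricted."""
    roles: Set[str] = set()
    scoped_sites: Set[str] = set()
    unrestricted = False
    for group in user_groups:
        mapping = mappings.get(group)
        if mapping is None:
            continue  # unmapped LDAP groups grant nothing
        roles.add(mapping.role)
        if mapping.role == "Deployment":
            if mapping.sites is None:
                unrestricted = True
            else:
                scoped_sites |= mapping.sites
    return roles, (None if unrestricted else scoped_sites)
```

A user in several Deployment groups receives the union of their site scopes, which keeps group membership additive.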

Dependencies

  • Template Engine: Provides template and instance data models, flattening, diff calculation, and validation.
  • Deployment Manager: Triggers deployments, system-wide artifact deployments, and instance lifecycle commands. Provides deployment status.
  • Communication Layer: Routes debug view subscriptions and remote queries to sites.
  • Security & Auth: Authenticates users and enforces role-based access.
  • Configuration Database: All central data, including audit log data for the audit log viewer. Accessed via ICentralUiRepository.
  • Health Monitoring: Provides site health data for the dashboard.
  • Notification Outbox: Provides notification delivery KPIs and serves the Notifications table queries and Retry/Discard actions for the Notification Outbox page.
  • Site Call Audit: Serves the SiteCalls table queries and relays Retry/Discard actions to sites for the Site Calls page.
  • Audit Log (#23): Serves all AuditLog table queries (filter / grid / drilldown / CSV export) for the new Audit Log page and the drill-in surfaces on Notifications, Site Calls, External Systems, Inbound API keys, Sites, and Instances. Payload capture, redaction, and per-site authorization follow the Audit Log component's "Payload Capture Policy" and "Security & Tamper-Evidence" sections.