Gateway Dashboard Detailed Design
Purpose
The gateway should host a basic web dashboard for operators and developers. The dashboard is diagnostic and operational visibility only for v1. It should show gateway health, active MXAccess worker instances, session state, and basic statistics in real time.
Technology Choice
Decision: Blazor Server with the shared ZB.MOM.WW.Theme kit layered over
Bootstrap CSS/JS.
Allowed UI stack:
- ASP.NET Core Blazor Server,
- the ZB.MOM.WW.Theme kit (layout chassis, status components, design tokens),
- Bootstrap CSS,
- Bootstrap JavaScript,
- small local CSS for layout and status styling,
- built-in Blazor components.
Not allowed for v1:
- MudBlazor,
- Radzen,
- Syncfusion,
- Telerik,
- other Blazor UI component libraries,
- client-side SPA framework replacement.
Rationale: Blazor Server keeps the dashboard in the gateway process, avoids a
separate frontend build, and gives real-time UI updates through the Blazor
SignalR circuit. The ZB.MOM.WW.Theme kit gives the dashboard the same chassis,
status vocabulary, and visual identity as the other ZB.MOM.WW operations UIs
without re-implementing layout and status styling per project.
Theme Kit
The dashboard depends on the shared ZB.MOM.WW.Theme NuGet package
(version 0.2.0, referenced in ZB.MOM.WW.MxGateway.Server.csproj). The kit is
a Razor Class Library that ships the technical-light design system: a layout
chassis, a small set of UI components, the design tokens, and the head/script
asset wiring. The dashboard takes its chrome and status presentation from the
kit and adds only its own pages and view CSS on top.
Components and assets used:
| Kit member | Role in the dashboard |
|---|---|
| <ThemeShell> | The application chassis: vertical side rail (brand, hamburger, responsive collapse) plus a content area. MainLayout.razor wraps it and supplies Nav, RailFooter, and ChildContent slots. |
| <NavRailSection> / <NavRailItem> | Grouped navigation items in the rail. Section expand/collapse persistence is owned by the kit (<details> + ThemeScripts); the app runs no JS interop for it. |
| <LoginCard> | The centered login card on Login.razor. Renders a native static <form method="post" action="/login"> so the submit reaches the minimal-API endpoint rather than a Blazor event. |
| <StatusPill State="…"> | The status chip. StatusBadge.razor is a thin adapter that maps domain state text to one of four StatusState values (Ok, Warn, Bad, Idle) and renders this pill. |
| <ThemeHead/> | Loaded in App.razor's <head>; injects the kit's theme.css and related head assets. |
| <ThemeScripts/> | Loaded at the end of App.razor's <body>; supplies the rail's interactive behavior. |
| Token system | theme.css defines all design tokens (var(--card), var(--ink), var(--accent), var(--mono), the state colors, etc.). The local site.css references these tokens and defines no hard-coded colors. |
The dependency on this kit is the reason the layout shell, navigation, status chips, and tokens differ from a stock Bootstrap dashboard. See Dashboard Interface Design for how the kit's tokens and components shape the visual language.
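The adapter role described for StatusBadge.razor can be pictured as a small mapping function. The sketch below is illustrative only: the four StatusState values come from the table above, but the local enum stand-in, the class name, and every domain state string are assumptions rather than the gateway's actual vocabulary.

```csharp
using System;

// Local stand-in for the kit's StatusState so the sketch compiles on its own.
public enum StatusState { Ok, Warn, Bad, Idle }

public static class StatusBadgeMapping
{
    // Hypothetical domain state names; the real gateway states may differ.
    public static StatusState ToStatusState(string? domainState) =>
        domainState?.Trim().ToLowerInvariant() switch
        {
            "ready" or "running" or "healthy" => StatusState.Ok,
            "starting" or "degraded" or "closing" => StatusState.Warn,
            "faulted" or "killed" or "unhealthy" => StatusState.Bad,
            _ => StatusState.Idle, // null, empty, or unknown text falls back to Idle
        };
}
```

Keeping the mapping in one place means a page passes raw state text and never chooses a pill color itself.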
Hosting Model
The dashboard is hosted by ZB.MOM.WW.MxGateway.Server alongside the gRPC API. When
MxGateway:Dashboard:Enabled is true, MapGatewayDashboard() mounts the
Blazor Server app at the host root and registers the login, logout, denied,
SignalR hub, and hub-token endpoints beside it. When dashboard hosting is
disabled, none of those routes are mapped — the same listener still serves
gRPC.
Endpoint layout:
    /
    /sessions
    /sessions/{sessionId}
    /workers
    /events
    /alarms
    /galaxy
    /browse
    /apikeys
    /settings
    /login        (POST also)
    /logout       (POST)
    /denied
    /hubs/snapshot
    /hubs/alarms
    /hubs/events
    /hubs/token
    /_blazor
The /galaxy page surfaces the Galaxy Repository browse summary
(deployed object hierarchy size, last deploy timestamp, attribute totals,
template usage, and connectivity sync info). The summary is fed by
GalaxyHierarchyCache, which is refreshed off the request path by
GalaxyHierarchyRefreshService on the
MxGateway:Galaxy:DashboardRefreshIntervalSeconds cadence so the dashboard
never blocks on SQL. See Galaxy Repository Browse for
the underlying gRPC service.
High-Level Components
    ZB.MOM.WW.MxGateway.Server
      Dashboard/
        Components/
          App.razor                  (loads <ThemeHead/> / <ThemeScripts/>)
          Routes.razor
          DashboardPageBase.cs
          DashboardDisplay.cs
          Layout/
            MainLayout.razor         (ThemeShell side-rail chassis)
            LoginLayout.razor        (minimal, no rail; hosts <LoginCard>)
          Pages/
            DashboardHome.razor
            Login.razor
            SessionsPage.razor
            SessionDetailsPage.razor
            WorkersPage.razor
            EventsPage.razor
            AlarmsPage.razor
            GalaxyPage.razor
            BrowsePage.razor
            ApiKeysPage.razor
            SettingsPage.razor
          Shared/
            MetricCard.razor
            StatusBadge.razor        (adapter over kit <StatusPill>)
            FaultList.razor
            BrowseTreeNodeView.razor
            ConfirmDialog.razor
        DashboardSnapshotService.cs
        DashboardAuthorizationHandler.cs
        DashboardAuthenticator.cs
        DashboardApiKeyAuthorization.cs
        DashboardApiKeyManagementService.cs
        DashboardApiKeySummary.cs
        DashboardSnapshot.cs
        DashboardSessionSummary.cs
        DashboardWorkerSummary.cs
        DashboardMetricSummary.cs
The dashboard exposes three named SignalR hubs in addition to Blazor Server's
internal circuit; pages connect to those hubs from within the circuit via the
DashboardHubConnectionFactory helper. The hubs publish snapshot, alarm, and
per-session event updates that the pages render in place of polling.
Dashboard Data Source
The dashboard should consume read-only snapshots from gateway services:
- SessionRegistry,
- SessionManager,
- WorkerClient,
- GatewayMetrics,
- health checks,
- structured fault/event counters.
Do not let Razor components directly mutate gateway session or worker objects. Create a small read-only dashboard service that projects gateway state into plain DTOs.
GatewayMetrics.GetSnapshot() is the metrics input for the first dashboard
projection. It carries current session and worker gauges, command and event
counters, queue depth, and fault totals. The dashboard reads that snapshot
instead of reading raw Meter instruments because exporter configuration is an
operations concern, not a UI dependency.
Suggested service:
    public interface IDashboardSnapshotService
    {
        DashboardSnapshot GetSnapshot();
        IAsyncEnumerable<DashboardSnapshot> WatchSnapshotsAsync(
            CancellationToken cancellationToken);
    }
Snapshot updates can be driven by:
- periodic timer, default every 1 second,
- session lifecycle notifications,
- worker heartbeat updates,
- event counter updates,
- fault notifications.
Use immutable snapshot DTOs so Razor components can render without locking gateway internals.
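As a rough illustration of the timer-driven option, the sketch below implements the suggested interface with a PeriodicTimer and an immutable record. The DTO fields, the TimerSnapshotService name, and the projection delegate are assumptions made for the example; the real snapshot carries whatever the gateway services expose.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical DTO shape; a positional record is immutable by construction.
public sealed record DashboardSnapshot(
    DateTimeOffset CapturedAt, int OpenSessions, int RunningWorkers, long FaultCount);

public interface IDashboardSnapshotService
{
    DashboardSnapshot GetSnapshot();
    IAsyncEnumerable<DashboardSnapshot> WatchSnapshotsAsync(CancellationToken cancellationToken);
}

public sealed class TimerSnapshotService : IDashboardSnapshotService
{
    private readonly Func<DashboardSnapshot> _project; // reads gateway state, returns a fresh DTO
    private readonly TimeSpan _interval;

    public TimerSnapshotService(Func<DashboardSnapshot> project, TimeSpan interval)
    {
        _project = project;
        _interval = interval;
    }

    public DashboardSnapshot GetSnapshot() => _project();

    public async IAsyncEnumerable<DashboardSnapshot> WatchSnapshotsAsync(
        [EnumeratorCancellation] CancellationToken cancellationToken)
    {
        using var timer = new PeriodicTimer(_interval);
        yield return GetSnapshot(); // first value immediately, so consumers start non-empty
        // One snapshot per tick; cancellation surfaces as OperationCanceledException.
        while (await timer.WaitForNextTickAsync(cancellationToken))
        {
            yield return GetSnapshot();
        }
    }
}
```

The other triggers listed above (lifecycle, heartbeat, and fault notifications) could feed the same stream; the timer alone is enough for a first slice.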
Realtime Updates
Updates flow over three SignalR hubs, all guarded by the
MxGateway.Dashboard.HubClients policy (cookie OR MxGateway.Dashboard.HubToken
bearer). Each hub class is [Authorize(Policy = HubClientsPolicy)].
| Hub | Path | Producer | Payload | Routing |
|---|---|---|---|---|
| DashboardSnapshotHub | /hubs/snapshot | DashboardSnapshotPublisher (BackgroundService consuming IDashboardSnapshotService.WatchSnapshotsAsync) | DashboardSnapshot | Sent to all connected clients on every snapshot tick; new connections receive the current snapshot synchronously in OnConnectedAsync. |
| AlarmsHub | /hubs/alarms | AlarmsHubPublisher (BackgroundService consuming IGatewayAlarmService.StreamAsync(filter: null)) | AlarmFeedMessage (active_alarm / snapshot_complete / transition) | Connected clients auto-join __alarms__; all clients receive every message. Publisher auto-reconnects every 5s on stream faults. |
| EventsHub | /hubs/events | DashboardEventBroadcaster invoked by EventStreamService for each event it forwards to a gRPC client | MxEvent | Clients call SubscribeSession(sessionId) to join session:{id}. Events appear only while a gRPC client is also consuming that session's events: the dashboard is a passive mirror, not a separate worker subscriber. |
DashboardPageBase opens a DashboardSnapshotHub connection via the connection
factory in OnInitializedAsync, seeds Snapshot synchronously from
IDashboardSnapshotService.GetSnapshot() so the first render is non-empty, and
calls InvokeAsync(StateHasChanged) on every SnapshotUpdated push. SignalR's
WithAutomaticReconnect handles transient disconnects.
SessionDetailsPage additionally opens an EventsHub connection for the
current session id and renders the most recent N events (default 50) in a
"Recent events" table with a live/offline connection pill.
Default cadences:
- snapshot service produces one snapshot per MxGateway:Dashboard:SnapshotIntervalMilliseconds (default 1s);
- alarm publisher emits on each transition observed by the central monitor;
- event publisher emits per event forwarded by StreamEvents.
Avoid pushing every MXAccess data-change event into a wider broadcast group.
The current design routes events strictly through session:{id} groups; the
snapshot hub continues to carry aggregate event counters and rates.
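The per-session routing can be sketched with standard SignalR groups. This is a minimal sketch, not the gateway's hub: the class names, the UnsubscribeSession method, and the "EventReceived" client method name are assumptions; only SubscribeSession, the session:{id} group naming, and the HubClients policy name come from the text above.

```csharp
using System.Threading.Tasks;
using Microsoft.AspNetCore.Authorization;
using Microsoft.AspNetCore.SignalR;

[Authorize(Policy = "MxGateway.Dashboard.HubClients")]
public sealed class EventsHubSketch : Hub
{
    public static string GroupFor(string sessionId) => $"session:{sessionId}";

    // A client joins only the group for the session it is viewing.
    public Task SubscribeSession(string sessionId) =>
        Groups.AddToGroupAsync(Context.ConnectionId, GroupFor(sessionId));

    // Hypothetical counterpart for leaving the group when the page navigates away.
    public Task UnsubscribeSession(string sessionId) =>
        Groups.RemoveFromGroupAsync(Context.ConnectionId, GroupFor(sessionId));
}

public sealed class EventBroadcasterSketch
{
    private readonly IHubContext<EventsHubSketch> _hub;

    public EventBroadcasterSketch(IHubContext<EventsHubSketch> hub) => _hub = hub;

    // Sends to one session group only, never to all connected clients.
    public Task PublishAsync(string sessionId, object mxEvent) =>
        _hub.Clients.Group(EventsHubSketch.GroupFor(sessionId))
            .SendAsync("EventReceived", mxEvent);
}
```

Because the broadcaster addresses a single group, a busy session cannot flood circuits that are not looking at it.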
Pages
Dashboard home
Show top-level status:
- gateway status,
- gateway version,
- uptime,
- open sessions,
- workers running,
- sessions faulted,
- command rate,
- command failure count,
- event rate,
- event queue depth,
- worker restart/kill count.
Use Bootstrap cards for individual metric summaries. Keep the layout compact and operational.
Sessions page
Show active and recent sessions in a table:
- session id,
- client identity or API key display name,
- state,
- backend,
- worker process id,
- open time,
- last client activity,
- last worker heartbeat,
- active event subscribers,
- pending commands,
- event queue depth,
- last fault summary.
Rows should link to session details.
Session details page
Show:
- session metadata,
- worker metadata,
- command counters by method,
- event counters by family,
- active server handles and item counts if gateway shadow state has them,
- latest faults,
- last heartbeat payload,
- admin Close session / Kill worker controls (Administrator role only).
The Sessions list, the Workers list, and this details page all render the same
admin controls when the signed-in principal carries the Administrator role. Viewers
and the localhost-anonymous bypass see no action affordances, and the server
re-checks the role on every invocation. Every destructive admin action is
gated by the shared ConfirmDialog component before it reaches
ISessionManager. ConfirmDialog is a reusable Bootstrap modal (title,
message, confirm/cancel buttons, and a busy state that disables both buttons
while the action runs); each page binds its open state and confirm/cancel
callbacks. The API keys page uses the same component.
- Close session routes through ISessionManager.CloseSessionAsync: the worker is asked to shut down gracefully and is killed only as a fallback if shutdown fails.
- Kill worker routes through ISessionManager.KillWorkerAsync: the worker is killed immediately with no graceful-shutdown attempt.

The session is removed from the registry and the open-session slot is released either way.
Workers page
Show:
- worker process id,
- session id,
- executable path/version,
- state,
- startup duration,
- memory and CPU if available,
- last heartbeat,
- current command correlation id,
- pending command count,
- event queue depth,
- restart/kill reason if terminal.
Events page
Show aggregate event diagnostics:
- event rate by session,
- event rate by event family,
- total events since start,
- queue overflow count,
- stream disconnect count,
- recent terminal faults.
Do not display full tag values by default. If value display is later added, make it opt-in and redacted.
Browse page
/browse lets an operator explore the Galaxy tag hierarchy and watch
live values. The tree is built in-process by the static
DashboardBrowseTreeBuilder (in DashboardBrowseModel.cs) from
IGalaxyHierarchyCache.Current — the same cache the Galaxy page reads — so a
render costs no gRPC call and no SQL round-trip. Each node shows its child
objects and, when expanded, its attributes with attribute name, data type
(including array dimension), and the alarm / historized flags. Galaxy SQL
carries no attribute description, so none is shown. A filter box switches the
tree to a flat list of matching attributes.
Right-clicking an attribute (or double-clicking it) adds it to the subscription
panel. The panel shows each subscribed tag's live value, MXAccess data type,
quality and source timestamp, refreshed every two seconds. The subscription
panel is the explicit opt-in tag-value surface: it always shows values
regardless of Dashboard:ShowTagValues, which continues to govern only the
diagnostic session/worker views.
Alarms page
/alarms lists the alarms the gateway's central alarm monitor
currently holds as Active or ActiveAcked. The page injects
IDashboardLiveDataService and drives a PeriodicTimer poll loop that calls
QueryAlarmsAsync every three seconds, rather than subscribing to the snapshot
hub or holding a CurrentAlarms reference directly. It
defaults to showing unacknowledged Active alarms; filters add acknowledged
alarms and narrow by area, severity range, and a reference/source/description
text search. Cleared alarms are not retained — the gateway holds no
alarm-history store, so the page reflects only the live active set. The page is
read-only; it does not acknowledge alarms. If MxGateway:Alarms:Enabled is
false the central monitor never starts, and the page says so instead of showing
an empty list with no explanation.
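The filter behavior described above can be expressed as a pure function over the polled rows. The record shapes, property names, and case-insensitive matching below are assumptions for illustration; only the filter semantics (unacknowledged Active by default, optional acknowledged alarms, area, severity range, text search) follow the text.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public enum AlarmState { Active, ActiveAcked }

// Hypothetical row and filter shapes for the sketch.
public sealed record AlarmRow(
    string Reference, string Source, string Description,
    string Area, int Severity, AlarmState State);

public sealed record AlarmFilter(
    bool IncludeAcknowledged = false,
    string? Area = null,
    int MinSeverity = 0,
    int MaxSeverity = int.MaxValue,
    string? Text = null);

public static class AlarmFiltering
{
    public static IReadOnlyList<AlarmRow> Apply(IEnumerable<AlarmRow> alarms, AlarmFilter filter) =>
        alarms
            // Default view: unacknowledged Active alarms only.
            .Where(a => filter.IncludeAcknowledged || a.State == AlarmState.Active)
            .Where(a => filter.Area is null ||
                        string.Equals(a.Area, filter.Area, StringComparison.OrdinalIgnoreCase))
            .Where(a => a.Severity >= filter.MinSeverity && a.Severity <= filter.MaxSeverity)
            // Text search spans reference, source, and description.
            .Where(a => string.IsNullOrWhiteSpace(filter.Text) ||
                        Contains(a.Reference, filter.Text) ||
                        Contains(a.Source, filter.Text) ||
                        Contains(a.Description, filter.Text))
            .ToList();

    private static bool Contains(string haystack, string needle) =>
        haystack.Contains(needle, StringComparison.OrdinalIgnoreCase);
}
```

Filtering in memory keeps each three-second poll cheap, since the source list is the monitor's in-process cache rather than a query.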
Live data source
Both the Browse subscription panel and the Alarms page read live MXAccess data
through IDashboardLiveDataService (DashboardLiveDataService). For tag data
it owns one shared gateway session for the whole dashboard, opened lazily on
first use via ISessionManager and re-opened transparently when it faults or
its lease expires. One session means one worker process backs every dashboard
circuit; all access is serialised so the worker sees one in-flight command at a
time. Tag reads go through GatewaySession.SubscribeBulkAsync / ReadBulkAsync.
The Alarms page does not use the dashboard session: alarm data comes from
the gateway's always-on central monitor. QueryAlarmsAsync reads
IGatewayAlarmService.CurrentAlarms — the monitor's in-process cache — so the
dashboard sees the same active-alarm set as every StreamAlarms client, with
no per-dashboard alarm subscription. When MxGateway:Alarms:Enabled is false
the monitor never starts and the cache stays empty.
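The "one shared session, one in-flight command" rule can be sketched with a SemaphoreSlim gate and lazy (re)opening. Everything named here (SharedSessionGate, the open and isUsable delegates) is hypothetical; the real DashboardLiveDataService works through ISessionManager and would also close a faulted session before replacing it.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public sealed class SharedSessionGate<TSession> where TSession : class
{
    private readonly Func<CancellationToken, Task<TSession>> _open; // opens a new session
    private readonly Func<TSession, bool> _isUsable;                // false once faulted or lease-expired
    private readonly SemaphoreSlim _gate = new(1, 1);
    private TSession? _session;

    public SharedSessionGate(
        Func<CancellationToken, Task<TSession>> open,
        Func<TSession, bool> isUsable)
    {
        _open = open;
        _isUsable = isUsable;
    }

    public async Task<TResult> RunAsync<TResult>(
        Func<TSession, CancellationToken, Task<TResult>> command,
        CancellationToken cancellationToken = default)
    {
        // Serialise all callers so the worker sees one command at a time.
        await _gate.WaitAsync(cancellationToken);
        try
        {
            // Open lazily on first use, and re-open when the session is no longer usable.
            if (_session is null || !_isUsable(_session))
            {
                _session = await _open(cancellationToken);
            }
            return await command(_session, cancellationToken);
        }
        finally
        {
            _gate.Release();
        }
    }
}
```

Every dashboard circuit would share one instance of such a gate, which is what keeps the worker count at one regardless of how many browsers are open.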
API keys page
/apikeys lists the gateway's API keys and, for authorized
operators, manages them. It reads key metadata through the same
IApiKeyAdminStore the apikey CLI uses, so the dashboard and the CLI act
on one source of truth.
The table shows one row per key:
- key id,
- status (Active or Revoked),
- display name,
- scopes,
- constraints (rendered as unconstrained when none are set),
- created timestamp,
- last-used timestamp.
Key secrets are never listed. Only the peppered hash is stored, and the page never reconstructs a key. See Authorization for what each constraint means and how it is enforced on the gRPC path.
Management actions
Create, Rotate, Revoke, and Delete controls render only when the signed-in
user is authorized. DashboardApiKeyAuthorization.CanManage requires an
authenticated principal carrying the Administrator role claim (resolved at login
from the user's LDAP groups via MxGateway:Dashboard:GroupToRole). A
Viewer role can read the table but sees no action controls, and an
anonymous localhost session shows the same read-only view.
- Create opens a dialog for the key id, display name, scope checkboxes (the GatewayScopes catalog), and the optional constraint fields: read and write subtrees, read and write tag globs, browse subtrees, max write classification, and the read-alarm-only / read-historized-only flags.
- Rotate issues a new secret for an existing key id and invalidates the old one. Active keys only: rotating a revoked key would un-revoke it, so the button is not shown on revoked rows.
- Revoke marks a key revoked; a revoked key cannot be un-revoked.
- Delete permanently removes a key row from the auth database, but only when the key is already revoked. IApiKeyAdminStore.DeleteAsync rejects active keys (returns false) so the revoke event lands in the audit log before the row disappears. Revoked rows show a Delete button in place of the previous "No actions" placeholder.
Every destructive action (Rotate / Revoke / Delete) is gated by the shared
ConfirmDialog component before reaching the service; Create uses its own
form modal as the implicit confirmation step.
Create and Rotate return the assembled mxgw_<keyId>_<secret> token once,
in a one-time banner. It is never shown again, so the operator must copy it
immediately. This mirrors the apikey create-key / rotate-key CLI.
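The per-status action rules and the token shape can be summarised in a few lines. This is a sketch under assumptions: the enum and helper names are invented for the example, and limiting Revoke to active keys is inferred from the statement that revoked rows show only Delete.

```csharp
using System;

public enum ApiKeyStatus { Active, Revoked }

public static class ApiKeyActionRules
{
    // Rotating a revoked key would un-revoke it, so only active keys rotate.
    public static bool CanRotate(ApiKeyStatus status) => status == ApiKeyStatus.Active;

    // Assumption: Revoke is only meaningful while the key is still active.
    public static bool CanRevoke(ApiKeyStatus status) => status == ApiKeyStatus.Active;

    // Delete is refused until the key has been revoked.
    public static bool CanDelete(ApiKeyStatus status) => status == ApiKeyStatus.Revoked;

    // Token shape shown once in the banner: mxgw_<keyId>_<secret>.
    public static string AssembleToken(string keyId, string secret) => $"mxgw_{keyId}_{secret}";
}
```

The page can use the same predicates to decide which buttons to render, while the store still enforces the rule on its own side.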
Every management action writes an entry to the canonical audit_event store
through IAuditWriter (dashboard-create-key, dashboard-rotate-key,
dashboard-revoke-key, dashboard-delete-key) with the key id, the caller's
remote address, and a correlation id. Secrets and pepper values are never
logged.
Settings page
Show read-only effective configuration:
- worker executable path,
- configured timeouts,
- queue capacities,
- auth mode,
- SQLite auth database path with sensitive parts redacted if needed,
- dashboard enabled state,
- protocol version.
Do not show API key secrets or pepper values.
Authentication And Authorization
Dashboard authentication is LDAP-backed, distinct from the API-key model used
on the gRPC API. Users sign in with directory credentials; the gateway maps
their LDAP groups to one of two dashboard roles (Administrator or Viewer) and
issues a cookie carrying those role claims.
Implemented behavior:
- GET /login is served by the [AllowAnonymous] Blazor Login.razor component (under LoginLayout), which renders the shared kit's <LoginCard>. LoginCard emits a native static <form method="post" action="/login"> (username, password, hidden returnUrl) plus an <AntiforgeryToken/>. A native form submit is not a Blazor event, so it reaches the minimal-API POST /login endpoint regardless of the app's InteractiveServer render mode;
- DashboardAuthenticator delegates bind/search to the shared ZB.MOM.WW.Auth.Ldap provider, registered by AddZbLdapAuth(configuration, "MxGateway:Ldap"). The provider performs a service-account bind, user search, then candidate bind, and fails closed;
- the user's group membership (stripped to its first RDN by the provider) is matched against MxGateway:Dashboard:GroupToRole; the resolved role(s) are emitted as ClaimTypes.Role claims, alongside the per-group mxgateway:ldap_group claims;
- a successful login signs in the MxGateway.Dashboard cookie scheme. The cookie defaults to the name MxGatewayDashboard (HttpOnly, SameSite=Strict, Secure) and can be overridden via MxGateway:Dashboard:CookieName;
- a user with no matching group cannot sign in; the login screen returns the generic credential-rejected message via /login?error=…;
- antiforgery tokens guard the login and logout POSTs.
- POST /logout (and a GET /logout convenience redirect) sign the cookie out and return to /login.
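The group-to-role step can be sketched as a function from LDAP group names to a claims principal. The class and method names are hypothetical, and the lookup's case sensitivity simply follows whatever comparer the supplied dictionary uses; the claim types and the "no mapped group, no sign-in" rule follow the list above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Claims;

public static class GroupRoleMapping
{
    public const string LdapGroupClaimType = "mxgateway:ldap_group";

    // Returns null when no group maps to a role; the caller treats that as a rejected login.
    public static ClaimsPrincipal? BuildPrincipal(
        string userName,
        IEnumerable<string> groups,
        IReadOnlyDictionary<string, string> groupToRole,
        string authenticationScheme = "MxGateway.Dashboard")
    {
        var groupList = groups.ToList();
        var roles = groupList
            .Where(groupToRole.ContainsKey)
            .Select(g => groupToRole[g])
            .Distinct(StringComparer.Ordinal)
            .ToList();
        if (roles.Count == 0) return null;

        var claims = new List<Claim> { new(ClaimTypes.Name, userName) };
        claims.AddRange(roles.Select(r => new Claim(ClaimTypes.Role, r)));
        // Every group is kept as its own claim, mapped or not.
        claims.AddRange(groupList.Select(g => new Claim(LdapGroupClaimType, g)));
        return new ClaimsPrincipal(new ClaimsIdentity(claims, authenticationScheme));
    }
}
```

With the sample configuration later in this document, a member of GwAdmin would end up with an Administrator role claim, and a user in neither group would get no principal at all.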
Three authorization policies are registered:
- MxGateway.Dashboard.Viewer: Razor component routes. Satisfied by Admin or Viewer.
- MxGateway.Dashboard.Admin: Admin-only write surfaces (API-key CRUD).
- MxGateway.Dashboard.HubClients: SignalR hubs. Accepts the dashboard cookie OR a MxGateway.Dashboard.HubToken bearer (used by WebSocket upgrades where the cookie can't be forwarded).
Two environmental bypasses still apply: MxGateway:Authentication:Mode = Disabled
authorizes every request, and MxGateway:Dashboard:AllowAnonymousLocalhost
(default true) authorizes any loopback request without a role check. Remote
requests always require an authenticated principal carrying at least the
Viewer role.
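The read-access decision, including both bypasses, can be sketched as a pure function. It covers only the Viewer-level check; the function name and parameter list are assumptions, and the real logic lives in DashboardAuthorizationHandler, which this sketch does not reproduce.

```csharp
using System.Net;
using System.Security.Claims;

public static class DashboardAccessSketch
{
    public static bool IsViewerAllowed(
        ClaimsPrincipal? user,
        IPAddress? remoteAddress,
        bool authenticationDisabled,
        bool allowAnonymousLocalhost)
    {
        // Bypass 1: Authentication:Mode = Disabled authorizes every request.
        if (authenticationDisabled) return true;

        // Bypass 2: loopback requests pass without a role check when enabled.
        if (allowAnonymousLocalhost && remoteAddress is not null && IPAddress.IsLoopback(remoteAddress))
            return true;

        // Everything else needs an authenticated principal with at least Viewer.
        if (user is null || user.Identity?.IsAuthenticated != true) return false;
        return user.IsInRole("Administrator") || user.IsInRole("Viewer");
    }
}
```

Admin-only actions are deliberately left out: as noted for the session controls, the localhost bypass grants read access only.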
Hub bearer flow
SignalR connections cannot reliably reuse the MxGatewayDashboard cookie when the JS
client upgrades to WebSocket: in some edge cases the cookie's SameSite=Strict; Path=/
attributes keep the browser's WebSocket layer from forwarding it. The
dashboard therefore mints short-lived bearer tokens for the connection:
- The cookie-authenticated Blazor page calls GET /hubs/token (gated by ViewerPolicy, cookie-only).
- HubTokenService.Issue(user) serializes the user's name, NameIdentifier, and role claims to JSON, encrypts with the ASP.NET Core data-protection time-limited protector under purpose ZB.MOM.WW.MxGateway.Dashboard.HubToken.v1, and returns the protected string. Lifetime is 30 minutes.
- The SignalR client passes the token as either Authorization: Bearer … or ?access_token=… (WebSocket upgrade query string).
- HubTokenAuthenticationHandler validates the protected payload and rebuilds the ClaimsPrincipal with the carried roles.
- The hubs' [Authorize(Policy = HubClientsPolicy)] accepts the resulting identity.
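The issue and validate steps can be sketched with the ASP.NET Core time-limited data protector. The HubTokenSketch and HubTokenPayload names and the JSON shape are assumptions; the purpose string, the carried claims, and the 30-minute lifetime come from the steps above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Claims;
using System.Text.Json;
using Microsoft.AspNetCore.DataProtection;

// Hypothetical payload shape; only name, NameIdentifier, and roles travel in the token.
public sealed record HubTokenPayload(string? Name, string? NameIdentifier, string[] Roles);

public sealed class HubTokenSketch
{
    private const string Purpose = "ZB.MOM.WW.MxGateway.Dashboard.HubToken.v1";
    private static readonly TimeSpan Lifetime = TimeSpan.FromMinutes(30);
    private readonly ITimeLimitedDataProtector _protector;

    public HubTokenSketch(IDataProtectionProvider provider) =>
        _protector = provider.CreateProtector(Purpose).ToTimeLimitedDataProtector();

    public string Issue(ClaimsPrincipal user)
    {
        var payload = new HubTokenPayload(
            user.FindFirst(ClaimTypes.Name)?.Value,
            user.FindFirst(ClaimTypes.NameIdentifier)?.Value,
            user.FindAll(ClaimTypes.Role).Select(c => c.Value).ToArray());
        return _protector.Protect(JsonSerializer.Serialize(payload), Lifetime);
    }

    // Unprotect throws CryptographicException for tampered or expired tokens.
    public ClaimsPrincipal Validate(string token)
    {
        var payload = JsonSerializer.Deserialize<HubTokenPayload>(_protector.Unprotect(token))
            ?? throw new InvalidOperationException("Empty hub token payload.");
        var claims = new List<Claim>();
        if (payload.Name is not null) claims.Add(new Claim(ClaimTypes.Name, payload.Name));
        if (payload.NameIdentifier is not null)
            claims.Add(new Claim(ClaimTypes.NameIdentifier, payload.NameIdentifier));
        claims.AddRange(payload.Roles.Select(r => new Claim(ClaimTypes.Role, r)));
        return new ClaimsPrincipal(new ClaimsIdentity(claims, "MxGateway.Dashboard.HubToken"));
    }
}
```

Because the token is an opaque protected blob rather than a signed JWT, it needs no key distribution beyond the data-protection key ring the gateway already has.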
DashboardHubConnectionFactory (scoped to the Blazor circuit) wraps the
HubConnectionBuilder and supplies a fresh token via AccessTokenProvider on
every (re)connect.
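A minimal sketch of such a factory, using the SignalR .NET client, is shown below. The class name and the token-fetching delegate are hypothetical stand-ins; how the real factory obtains the token from /hubs/token is not shown here.

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.AspNetCore.SignalR.Client;

public sealed class HubConnectionFactorySketch
{
    private readonly Uri _baseAddress;
    private readonly Func<Task<string?>> _fetchToken; // hypothetical: obtains a hub token

    public HubConnectionFactorySketch(Uri baseAddress, Func<Task<string?>> fetchToken)
    {
        _baseAddress = baseAddress;
        _fetchToken = fetchToken;
    }

    public HubConnection Create(string hubPath) =>
        new HubConnectionBuilder()
            .WithUrl(new Uri(_baseAddress, hubPath), options =>
            {
                // SignalR calls the provider whenever it needs a token,
                // so a reconnect does not reuse an expired one.
                options.AccessTokenProvider = _fetchToken;
            })
            .WithAutomaticReconnect()
            .Build();
}
```

A page would then call Create("/hubs/snapshot"), register a handler for SnapshotUpdated with connection.On, and start the connection; disposing the connection with the page keeps circuits from leaking subscriptions.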
Configuration
Effective configuration:
    {
      "MxGateway": {
        "Dashboard": {
          "Enabled": true,
          "AllowAnonymousLocalhost": true,
          "SnapshotIntervalMilliseconds": 1000,
          "RecentFaultLimit": 100,
          "RecentSessionLimit": 200,
          "ShowTagValues": false,
          "CookieName": null,
          "RequireHttpsCookie": true,
          "GroupToRole": {
            "GwAdmin": "Administrator",
            "GwReader": "Viewer"
          }
        }
      }
    }
Two cookie keys tune the auth cookie:
- CookieName overrides the cookie name. Null or blank keeps the canonical default MxGatewayDashboard, so a misconfiguration cannot leave the cookie unnamed.
- RequireHttpsCookie (default true) sets the cookie SecurePolicy to Always. Set it to false for dev HTTP deployments, which relaxes the policy to SameAsRequest.
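How the two keys could translate into cookie options is sketched below. The helper name and signature are assumptions; the property names on CookieAuthenticationOptions are the standard ASP.NET Core ones, and the HttpOnly and SameSite=Strict settings follow the cookie description in the authentication section.

```csharp
using Microsoft.AspNetCore.Authentication.Cookies;
using Microsoft.AspNetCore.Http;

public static class DashboardCookieSketch
{
    public const string DefaultCookieName = "MxGatewayDashboard";

    public static void Apply(
        CookieAuthenticationOptions options, string? cookieName, bool requireHttpsCookie)
    {
        // Null or blank falls back to the canonical name, so the cookie is never unnamed.
        options.Cookie.Name = string.IsNullOrWhiteSpace(cookieName) ? DefaultCookieName : cookieName;
        options.Cookie.HttpOnly = true;
        options.Cookie.SameSite = SameSiteMode.Strict;
        // true -> Always (HTTPS only); false -> SameAsRequest for dev HTTP deployments.
        options.Cookie.SecurePolicy = requireHttpsCookie
            ? CookieSecurePolicy.Always
            : CookieSecurePolicy.SameAsRequest;
    }
}
```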
See Gateway Configuration for the full option table and the policies/hubs that derive from these values.
Security Rules
- Do not display API key secrets.
- Do not display credential-bearing MXAccess command values.
- Do not display full tag values by default.
- Do not expose worker pipe names with nonce or sensitive details.
- Protect dashboard auth cookies with HttpOnly, Secure, and SameSite.
- Require TLS for remote dashboard access.
- Use anti-forgery protection for login/logout and any future admin actions.
Styling
Styling is layered. From base to top:
- Bootstrap 5.3.3 assets served from src/ZB.MOM.WW.MxGateway.Server/wwwroot/lib/bootstrap/.
- The ZB.MOM.WW.Theme kit's theme.css (the technical-light design system), which owns the design tokens and the kit component styles. App.razor loads it through the kit's <ThemeHead/> component, and pairs it with <ThemeScripts/> at the end of <body> for the rail's interactive behavior.
- The local view stylesheet src/ZB.MOM.WW.MxGateway.Server/wwwroot/css/site.css, which wires the dashboard's own class names and Bootstrap widgets onto the kit tokens. It defines no hard-coded colors.
The minimal /denied page is rendered outside the Blazor circuit, so it loads
the kit CSS directly from the static-web-asset path
(/_content/ZB.MOM.WW.Theme/css/theme.css and …/layout.css) plus Bootstrap
and site.css.
Recommended visual language:
- compact tables,
- the kit StatusPill for state,
- metric cards,
- Bootstrap alerts for faults,
- restrained colors drawn from the kit tokens,
- no decorative hero sections,
- no charting dependency for v1.
If charts are added later, prefer simple server-generated data tables first. Do not add a JavaScript charting dependency without a specific need.
The reusable visual rules for replicating this interface in other projects are documented in Dashboard Interface Design.
Testing
Dashboard unit/component tests should cover:
- snapshot projection,
- dashboard auth authorization decisions,
- login LDAP bind and group-to-role mapping behavior,
- pages render with empty state,
- pages render with active sessions,
- pages render with faulted sessions,
- realtime subscription disposal,
- redaction of API keys and credential values.
Use bUnit if component testing is added. Otherwise keep the first tests focused on snapshot services and authorization logic.
Integration tests should verify:
- dashboard disabled returns not found or configured fallback,
- dashboard requires auth when enabled,
- a user in an Admin-mapped LDAP group can access the dashboard and the API-key CRUD surface,
- a user in a Viewer-mapped LDAP group can render every page but cannot invoke the Admin-only management actions,
- a user with no mapped LDAP group cannot sign in at all,
- live snapshot updates when a fake session changes state are delivered via the /hubs/snapshot push, not by polling.
Initial Implementation Slice
The first dashboard slice implements:
- Blazor Server hosting in ZB.MOM.WW.MxGateway.Server.
- local Bootstrap static assets plus the ZB.MOM.WW.Theme kit layer (chassis, tokens, status components).
- dashboard configuration binding.
- dashboard auth using LDAP bind + role-mapped HTTP-only cookie.
- DashboardSnapshotService projecting gateway state for read views.
- home page with metric cards.
- sessions page with active session table and session details.
- workers page with worker table.
- events page with aggregate counters.
- settings page with redacted effective configuration.
- periodic realtime refresh through Blazor Server.
- route-mapping tests, disabled-dashboard tests, auth tests, and snapshot projection/redaction tests.
Subsequent slices added Admin-gated destructive actions: API-key
Create/Rotate/Revoke (and Delete on revoked keys), and session/worker
Close/Kill via IDashboardSessionAdminService → ISessionManager. Every
destructive action passes through the shared ConfirmDialog component
before reaching its service.