# Gateway Dashboard Detailed Design

## Purpose

The gateway should host a basic web dashboard for operators and developers. The dashboard is diagnostic and operational visibility only for v1. It should show gateway health, active MXAccess worker instances, session state, and basic statistics in real time.

## Technology Choice

Decision: Blazor Server with the shared `ZB.MOM.WW.Theme` kit layered over Bootstrap CSS/JS.

Allowed UI stack:

- ASP.NET Core Blazor Server,
- the `ZB.MOM.WW.Theme` kit (layout chassis, status components, design tokens),
- Bootstrap CSS,
- Bootstrap JavaScript,
- small local CSS for layout and status styling,
- built-in Blazor components.

Not allowed for v1:

- MudBlazor,
- Radzen,
- Syncfusion,
- Telerik,
- other Blazor UI component libraries,
- client-side SPA framework replacement.

Rationale:

Blazor Server keeps the dashboard in the gateway process, avoids a separate frontend build, and gives real-time UI updates through the Blazor SignalR circuit. The `ZB.MOM.WW.Theme` kit gives the dashboard the same chassis, status vocabulary, and visual identity as the other ZB.MOM.WW operations UIs without re-implementing layout and status styling per project.

## Theme Kit

The dashboard depends on the shared `ZB.MOM.WW.Theme` NuGet package (version `0.2.0`, referenced in `ZB.MOM.WW.MxGateway.Server.csproj`). The kit is a Razor Class Library that ships the technical-light design system: a layout chassis, a small set of UI components, the design tokens, and the head/script asset wiring. The dashboard takes its chrome and status presentation from the kit and adds only its own pages and view CSS on top.

Components and assets used:

| Kit member | Role in the dashboard |
|---|---|
| `<ThemeShell>` | The application chassis: a vertical side rail (brand, hamburger, responsive collapse) plus a content area. `MainLayout.razor` wraps it and supplies `Nav`, `RailFooter`, and `ChildContent` slots. |
| Kit navigation components | Grouped navigation items in the rail. Section expand/collapse persistence is owned by the kit (its navigation components and `ThemeScripts`); the app runs no JS interop for it. |
| `<LoginCard>` | The centered login card on `Login.razor`. Renders a native static `<form>` so the submit reaches the minimal-API endpoint rather than a Blazor event. |
| `<StatusPill>` | The status chip. `StatusBadge.razor` is a thin adapter that maps domain state text to one of four `StatusState` values (`Ok`, `Warn`, `Bad`, `Idle`) and renders this pill. |
| Kit head component | Loaded in `App.razor`'s `<head>`; injects the kit's `theme.css` and related head assets. |
| `<ThemeScripts>` | Loaded at the end of `App.razor`'s `<body>`; supplies the rail's interactive behavior. |
| Token system | `theme.css` defines all design tokens (`var(--card)`, `var(--ink)`, `var(--accent)`, `var(--mono)`, the state colors, etc.). The local `site.css` references these tokens and defines no hard-coded colors. |

The dependency on this kit is the reason the layout shell, navigation, status chips, and tokens differ from a stock Bootstrap dashboard. See [Dashboard Interface Design](./DashboardInterfaceDesign.md) for how the kit's tokens and components shape the visual language.

## Hosting Model

The dashboard is hosted by `ZB.MOM.WW.MxGateway.Server` alongside the gRPC API. When `MxGateway:Dashboard:Enabled` is `true`, `MapGatewayDashboard()` mounts the Blazor Server app at the host root and registers the login, logout, denied, SignalR hub, and hub-token endpoints beside it. When dashboard hosting is disabled, none of those routes are mapped; the same listener still serves gRPC.

Endpoint layout:

```text
/
/sessions
/sessions/{sessionId}
/workers
/events
/alarms
/galaxy
/browse
/apikeys
/settings
/login            (POST also)
/logout           (POST)
/denied
/hubs/snapshot
/hubs/alarms
/hubs/events
/hubs/token
/_blazor
```

The `/galaxy` page surfaces the Galaxy Repository browse summary (deployed object hierarchy size, last deploy timestamp, attribute totals, template usage, and connectivity sync info). The summary is fed by `GalaxyHierarchyCache`, which is refreshed off the request path by `GalaxyHierarchyRefreshService` on the `MxGateway:Galaxy:DashboardRefreshIntervalSeconds` cadence so the dashboard never blocks on SQL.
See [Galaxy Repository Browse](./GalaxyRepository.md) for the underlying gRPC service.

## High-Level Components

```text
ZB.MOM.WW.MxGateway.Server
  Dashboard/
    Components/
      App.razor                 (loads the kit head component / <ThemeScripts />)
      Routes.razor
      DashboardPageBase.cs
      DashboardDisplay.cs
      Layout/
        MainLayout.razor        (ThemeShell side-rail chassis)
        LoginLayout.razor       (minimal, no rail; hosts the login card)
      Pages/
        DashboardHome.razor
        Login.razor
        SessionsPage.razor
        SessionDetailsPage.razor
        WorkersPage.razor
        EventsPage.razor
        AlarmsPage.razor
        GalaxyPage.razor
        BrowsePage.razor
        ApiKeysPage.razor
        SettingsPage.razor
      Shared/
        MetricCard.razor
        StatusBadge.razor       (adapter over kit <StatusPill>)
        FaultList.razor
        BrowseTreeNodeView.razor
        ConfirmDialog.razor
    DashboardSnapshotService.cs
    DashboardAuthorizationHandler.cs
    DashboardAuthenticator.cs
    DashboardApiKeyAuthorization.cs
    DashboardApiKeyManagementService.cs
    DashboardApiKeySummary.cs
    DashboardSnapshot.cs
    DashboardSessionSummary.cs
    DashboardWorkerSummary.cs
    DashboardMetricSummary.cs
```

The dashboard exposes three named SignalR hubs in addition to Blazor Server's internal circuit; pages connect to those hubs from within the circuit via the `DashboardHubConnectionFactory` helper. The hubs publish snapshot, alarm, and per-session event updates that the pages render in place of polling.

## Dashboard Data Source

The dashboard should consume read-only snapshots from gateway services:

- `SessionRegistry`,
- `SessionManager`,
- `WorkerClient`,
- `GatewayMetrics`,
- health checks,
- structured fault/event counters.

Do not let Razor components directly mutate gateway session or worker objects. Create a small read-only dashboard service that projects gateway state into plain DTOs.

`GatewayMetrics.GetSnapshot()` is the metrics input for the first dashboard projection. It carries current session and worker gauges, command and event counters, queue depth, and fault totals.
The dashboard reads that snapshot instead of reading raw `Meter` instruments because exporter configuration is an operations concern, not a UI dependency.

Suggested service:

```csharp
using System.Collections.Generic;
using System.Threading;

public interface IDashboardSnapshotService
{
    // Returns the current snapshot.
    DashboardSnapshot GetSnapshot();

    // Streams snapshots as they are produced, until cancellation is requested.
    IAsyncEnumerable<DashboardSnapshot> WatchSnapshotsAsync(
        CancellationToken cancellationToken);
}
```

Snapshot updates can be driven by:

- periodic timer, default every 1 second,
- session lifecycle notifications,
- worker heartbeat updates,
- event counter updates,
- fault notifications.

Use immutable snapshot DTOs so Razor components can render without locking gateway internals.

## Realtime Updates

Updates flow over three SignalR hubs, all guarded by the `MxGateway.Dashboard.HubClients` policy (cookie OR `MxGateway.Dashboard.HubToken` bearer). Each hub class is `[Authorize(Policy = HubClientsPolicy)]`.

| Hub | Path | Producer | Payload | Routing |
|---|---|---|---|---|
| `DashboardSnapshotHub` | `/hubs/snapshot` | `DashboardSnapshotPublisher` (BackgroundService consuming `IDashboardSnapshotService.WatchSnapshotsAsync`) | `DashboardSnapshot` | Sent to all connected clients on every snapshot tick; new connections receive the current snapshot synchronously in `OnConnectedAsync`. |
| `AlarmsHub` | `/hubs/alarms` | `AlarmsHubPublisher` (BackgroundService consuming `IGatewayAlarmService.StreamAsync(filter: null)`) | `AlarmFeedMessage` (`active_alarm` / `snapshot_complete` / `transition`) | Connected clients auto-join `__alarms__`; all clients receive every message. Publisher auto-reconnects every 5s on stream faults. |
| `EventsHub` | `/hubs/events` | `DashboardEventBroadcaster` invoked by `EventStreamService` for each event it forwards to a gRPC client | `MxEvent` | Clients call `SubscribeSession(sessionId)` to join `session:{id}`. Events appear only while a gRPC client is also consuming that session's events: the dashboard is a passive mirror, not a separate worker subscriber. |

`DashboardPageBase` opens a `DashboardSnapshotHub` connection via the connection factory in `OnInitializedAsync`, seeds `Snapshot` synchronously from `IDashboardSnapshotService.GetSnapshot()` so the first render is non-empty, and calls `InvokeAsync(StateHasChanged)` on every `SnapshotUpdated` push. SignalR's `WithAutomaticReconnect` handles transient disconnects.

`SessionDetailsPage` additionally opens an `EventsHub` connection for the current session id and renders the most recent N events (default 50) in a "Recent events" table with a live/offline connection pill.

Default cadences:

- snapshot service produces one snapshot per `MxGateway:Dashboard:SnapshotIntervalMilliseconds` (default 1s);
- alarm publisher emits on each transition observed by the central monitor;
- event publisher emits per event forwarded by `StreamEvents`.

Avoid pushing every MXAccess data-change event into a wider broadcast group. The current design routes events strictly through `session:{id}` groups; the snapshot hub continues to carry aggregate event counters and rates.

## Pages

### Dashboard home

Show top-level status:

- gateway status,
- gateway version,
- uptime,
- open sessions,
- workers running,
- sessions faulted,
- command rate,
- command failure count,
- event rate,
- event queue depth,
- worker restart/kill count.

Use Bootstrap cards for individual metric summaries. Keep the layout compact and operational.

### Sessions page

Show active and recent sessions in a table:

- session id,
- client identity or API key display name,
- state,
- backend,
- worker process id,
- open time,
- last client activity,
- last worker heartbeat,
- active event subscribers,
- pending commands,
- event queue depth,
- last fault summary.

Rows should link to session details.

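
As a hedged illustration of the immutable-DTO approach, the sessions table columns could be projected into a record like the one below. `SessionRow`, its property names, and its types are assumptions made for this sketch; the real `DashboardSessionSummary` may be shaped differently.

```csharp
#nullable enable
using System;

// Hypothetical immutable row for the sessions table. Names and types are
// illustrative only and are not taken from DashboardSessionSummary.
public sealed record SessionRow(
    string SessionId,
    string ClientDisplayName,
    string State,
    string Backend,
    int? WorkerProcessId,
    DateTimeOffset OpenedAt,
    DateTimeOffset? LastClientActivity,
    DateTimeOffset? LastWorkerHeartbeat,
    int ActiveEventSubscribers,
    int PendingCommands,
    int EventQueueDepth,
    string? LastFaultSummary)
{
    // Link target for the row, following the /sessions/{sessionId} route.
    public string DetailsPath => $"/sessions/{Uri.EscapeDataString(SessionId)}";
}
```

Because a record is immutable once constructed, a component can hold a reference to a row while the snapshot service publishes a newer one; a changed state is expressed as a new instance (`row with { State = "Faulted" }`) rather than a mutation.
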
### Session details page

Show:

- session metadata,
- worker metadata,
- command counters by method,
- event counters by family,
- active server handles and item counts if gateway shadow state has them,
- latest faults,
- last heartbeat payload,
- admin Close session / Kill worker controls (Admin role only).

The Sessions list, the Workers list, and this details page all render the same admin controls when the signed-in principal carries the `Administrator` role; viewers and the localhost-anonymous bypass see no action affordances and the server re-checks the role on every invocation.

Every destructive admin action is gated by the shared `ConfirmDialog` component before it reaches `ISessionManager`. `ConfirmDialog` is a reusable Bootstrap modal (title, message, confirm/cancel buttons, and a busy state that disables both buttons while the action runs); each page binds its open state and confirm/cancel callbacks. The API keys page uses the same component.

- **Close session** routes through `ISessionManager.CloseSessionAsync`: the worker is asked to shut down gracefully and is killed only as a fallback if shutdown fails.
- **Kill worker** routes through `ISessionManager.KillWorkerAsync`: the worker is killed immediately with no graceful-shutdown attempt.

The session is removed from the registry and the open-session slot is released either way.

### Workers page

Show:

- worker process id,
- session id,
- executable path/version,
- state,
- startup duration,
- memory and CPU if available,
- last heartbeat,
- current command correlation id,
- pending command count,
- event queue depth,
- restart/kill reason if terminal.

### Events page

Show aggregate event diagnostics:

- event rate by session,
- event rate by event family,
- total events since start,
- queue overflow count,
- stream disconnect count,
- recent terminal faults.

Do not display full tag values by default. If value display is later added, make it opt-in and redacted.

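
If value display is added later, the opt-in rule could be centralized in one helper, sketched below. `TagValueFormatter`, the placeholder text, and the length cap are assumptions for illustration and not part of the gateway; the `showTagValues` flag stands in for whatever opt-in setting is chosen.

```csharp
#nullable enable
using System;
using System.Globalization;

// Hypothetical helper: hides tag values unless display is explicitly enabled.
public static class TagValueFormatter
{
    public const string Placeholder = "[hidden]";
    private const int MaxShownLength = 32; // assumed cap, not a documented limit

    public static string Format(object? value, bool showTagValues)
    {
        // Default path: never reveal the value.
        if (!showTagValues) return Placeholder;
        if (value is null) return "(null)";

        string text = Convert.ToString(value, CultureInfo.InvariantCulture) ?? string.Empty;

        // Even when enabled, truncate long payloads so tables stay compact.
        return text.Length <= MaxShownLength
            ? text
            : text.Substring(0, MaxShownLength) + "...";
    }
}
```

Routing every value cell through a single helper keeps the default-hidden behavior in one place, which also makes it straightforward to cover with a redaction test.
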
### Browse page

`/browse` lets an operator explore the Galaxy tag hierarchy and watch live values. The tree is built in-process by the static `DashboardBrowseTreeBuilder` (in `DashboardBrowseModel.cs`) from `IGalaxyHierarchyCache.Current` (the same cache the Galaxy page reads), so a render costs no gRPC call and no SQL round-trip. Each node shows its child objects and, when expanded, its attributes with attribute name, data type (including array dimension), and the alarm / historized flags. Galaxy SQL carries no attribute description, so none is shown. A filter box switches the tree to a flat list of matching attributes.

Right-clicking an attribute (or double-clicking it) adds it to the subscription panel. The panel shows each subscribed tag's live value, MXAccess data type, quality and source timestamp, refreshed every two seconds. The subscription panel is the explicit opt-in tag-value surface: it always shows values regardless of `Dashboard:ShowTagValues`, which continues to govern only the diagnostic session/worker views.

### Alarms page

`/alarms` lists the alarms the gateway's central alarm monitor currently holds as Active or ActiveAcked. The page injects `IDashboardLiveDataService` and drives a `PeriodicTimer` poll loop that calls `QueryAlarmsAsync` every three seconds, rather than subscribing to the snapshot hub or holding a `CurrentAlarms` reference directly. It defaults to showing unacknowledged `Active` alarms; filters add acknowledged alarms and narrow by area, severity range, and a reference/source/description text search. Cleared alarms are not retained: the gateway holds no alarm-history store, so the page reflects only the live active set. The page is read-only; it does not acknowledge alarms. If `MxGateway:Alarms:Enabled` is false the central monitor never starts, and the page says so instead of showing an empty list with no explanation.

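
The filter rules for the alarm list can be sketched as a pure function over the polled rows. `AlarmRow`, `AlarmFilter`, and `AlarmFiltering` are hypothetical types invented for this example, and case-insensitive matching is an assumption; the gateway's actual alarm model and filter semantics may differ.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row and filter types; the real alarm model may differ.
public sealed record AlarmRow(
    string Reference, string Source, string Description,
    string Area, int Severity, bool Acknowledged);

public sealed record AlarmFilter(
    bool IncludeAcknowledged = false,
    string? Area = null,
    int MinSeverity = int.MinValue,
    int MaxSeverity = int.MaxValue,
    string? Text = null);

public static class AlarmFiltering
{
    public static IReadOnlyList<AlarmRow> Apply(IEnumerable<AlarmRow> alarms, AlarmFilter filter)
    {
        return alarms
            // Default view: unacknowledged alarms only.
            .Where(a => filter.IncludeAcknowledged || !a.Acknowledged)
            .Where(a => filter.Area is null ||
                        string.Equals(a.Area, filter.Area, StringComparison.OrdinalIgnoreCase))
            .Where(a => a.Severity >= filter.MinSeverity && a.Severity <= filter.MaxSeverity)
            // Text search spans reference, source, and description.
            .Where(a => string.IsNullOrWhiteSpace(filter.Text) ||
                        Contains(a.Reference, filter.Text) ||
                        Contains(a.Source, filter.Text) ||
                        Contains(a.Description, filter.Text))
            .ToList();
    }

    private static bool Contains(string haystack, string needle) =>
        haystack.Contains(needle, StringComparison.OrdinalIgnoreCase);
}
```

Keeping the filter free of side effects means each three-second poll can re-apply it to the fresh result without tracking state between polls.
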
### Live data source

Both the Browse subscription panel and the Alarms page read live MXAccess data through `IDashboardLiveDataService` (`DashboardLiveDataService`). For tag data it owns one shared gateway session for the whole dashboard, opened lazily on first use via `ISessionManager` and re-opened transparently when it faults or its lease expires. One session means one worker process backs every dashboard circuit; all access is serialised so the worker sees one in-flight command at a time. Tag reads go through `GatewaySession.SubscribeBulkAsync` / `ReadBulkAsync`.

The Alarms page does **not** use the dashboard session: alarm data comes from the gateway's always-on central monitor. `QueryAlarmsAsync` reads `IGatewayAlarmService.CurrentAlarms` (the monitor's in-process cache), so the dashboard sees the same active-alarm set as every `StreamAlarms` client, with no per-dashboard alarm subscription. When `MxGateway:Alarms:Enabled` is false the monitor never starts and the cache stays empty.

### API keys page

`/apikeys` lists the gateway's API keys and, for authorized operators, manages them. It reads key metadata through the same `IApiKeyAdminStore` the `apikey` CLI uses, so the dashboard and the CLI act on one source of truth.

The table shows one row per key:

- key id,
- status (`Active` or `Revoked`),
- display name,
- scopes,
- constraints (rendered as `unconstrained` when none are set),
- created timestamp,
- last-used timestamp.

Key secrets are never listed. Only the peppered hash is stored, and the page never reconstructs a key. See [Authorization](./Authorization.md#constraint-enforcement) for what each constraint means and how it is enforced on the gRPC path.

#### Management actions

Create, Rotate, Revoke, and Delete controls render only when the signed-in user is authorized.
`DashboardApiKeyAuthorization.CanManage` requires an authenticated principal carrying the `Administrator` role claim (resolved at login from the user's LDAP groups via `MxGateway:Dashboard:GroupToRole`). A `Viewer` role can read the table but sees no action controls, and an anonymous localhost session shows the same read-only view.

- **Create** opens a dialog for the key id, display name, scope checkboxes (the `GatewayScopes` catalog), and the optional constraint fields: read and write subtrees, read and write tag globs, browse subtrees, max write classification, and the read-alarm-only / read-historized-only flags.
- **Rotate** issues a new secret for an existing key id and invalidates the old one. Active keys only: rotating a revoked key would un-revoke it, so the button is not shown on revoked rows.
- **Revoke** marks a key revoked; a revoked key cannot be un-revoked.
- **Delete** permanently removes a key row from the auth database, but only when the key is already revoked. `IApiKeyAdminStore.DeleteAsync` rejects active keys (returns false) so the revoke event lands in the audit log before the row disappears. Revoked rows show a Delete button in place of the previous "No actions" placeholder.

Every destructive action (Rotate / Revoke / Delete) is gated by the shared `ConfirmDialog` component before reaching the service; Create uses its own form modal as the implicit confirmation step.

Create and Rotate return the assembled `mxgw__` token **once**, in a one-time banner. It is never shown again, so the operator must copy it immediately. This mirrors the `apikey create-key` / `rotate-key` CLI.

Every management action writes an entry to the canonical `audit_event` store through `IAuditWriter` (`dashboard-create-key`, `dashboard-rotate-key`, `dashboard-revoke-key`, `dashboard-delete-key`) with the key id, the caller's remote address, and a correlation id. Secrets and pepper values are never logged.

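
The key lifecycle rules (revoke is one-way, rotate needs an active key, delete needs a revoked key) can be illustrated with a small in-memory stand-in. `InMemoryKeyStore` and its `CanRotate` helper are hypothetical and exist only for this sketch; the method names loosely echo `IApiKeyAdminStore.DeleteAsync` but do not reproduce its real signature.

```csharp
#nullable enable
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical in-memory stand-in used only to show the lifecycle rules.
public sealed class InMemoryKeyStore
{
    private readonly object _gate = new();
    private readonly Dictionary<string, bool> _revokedByKeyId = new();

    public void Add(string keyId)
    {
        lock (_gate) { _revokedByKeyId[keyId] = false; }
    }

    public Task<bool> RevokeAsync(string keyId)
    {
        lock (_gate)
        {
            if (!_revokedByKeyId.ContainsKey(keyId)) return Task.FromResult(false);
            _revokedByKeyId[keyId] = true; // one-way: nothing sets it back to false
            return Task.FromResult(true);
        }
    }

    // Rotation is offered only for keys that exist and are still active.
    public bool CanRotate(string keyId)
    {
        lock (_gate)
        {
            return _revokedByKeyId.TryGetValue(keyId, out bool revoked) && !revoked;
        }
    }

    public Task<bool> DeleteAsync(string keyId)
    {
        lock (_gate)
        {
            // Active or unknown keys are rejected; only revoked rows are removed.
            bool deletable = _revokedByKeyId.TryGetValue(keyId, out bool revoked) && revoked;
            if (deletable) _revokedByKeyId.Remove(keyId);
            return Task.FromResult(deletable);
        }
    }
}
```

Enforcing the revoked-only rule in the store, rather than only hiding the button in the UI, means a stale page or a direct call still cannot delete an active key.
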
### Settings page

Show read-only effective configuration:

- worker executable path,
- configured timeouts,
- queue capacities,
- auth mode,
- SQLite auth database path with sensitive parts redacted if needed,
- dashboard enabled state,
- protocol version.

Do not show API key secrets or pepper values.

## Authentication And Authorization

Dashboard authentication is LDAP-backed, distinct from the API-key model used on the gRPC API. Users sign in with directory credentials; the gateway maps their LDAP groups to one of two dashboard roles (`Administrator` or `Viewer`) and issues a cookie carrying those role claims.

Implemented behavior:

- `GET /login` is served by the `[AllowAnonymous]` Blazor `Login.razor` component (under `LoginLayout`), which renders the shared kit's `<LoginCard>`. `LoginCard` emits a native static `<form>` (username, password, hidden returnUrl) plus an antiforgery token. A native form submit is not a Blazor event, so it reaches the minimal-API `POST /login` endpoint regardless of the app's InteractiveServer render mode;
- `DashboardAuthenticator` delegates bind/search to the shared `ZB.MOM.WW.Auth.Ldap` provider, registered by `AddZbLdapAuth(configuration, "MxGateway:Ldap")`. The provider performs a service-account bind, user search, then candidate bind, and fails closed;
- the user's group membership (stripped to its first RDN by the provider) is matched against `MxGateway:Dashboard:GroupToRole`; the resolved role(s) are emitted as `ClaimTypes.Role` claims, alongside the per-group `mxgateway:ldap_group` claims;
- a successful login signs in the `MxGateway.Dashboard` cookie scheme. The cookie defaults to the name `MxGatewayDashboard` (HttpOnly, SameSite=Strict, Secure) and can be overridden via `MxGateway:Dashboard:CookieName`;
- a user with no matching group cannot sign in; the login screen returns the generic credential-rejected message via `/login?error=…`;
- antiforgery tokens guard the login and logout POSTs.

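
The group-to-role step in the list above could look roughly like the following. `GroupRoleMapper` is a hypothetical helper written for this example: it assumes group names arrive as distinguished names, that the value of the first RDN is the lookup key, and that the caller chooses the dictionary's comparer. In the gateway the RDN stripping is done by the LDAP provider, so this sketch only illustrates the mapping logic.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Claims;

// Hypothetical helper illustrating group-to-role claim mapping.
public static class GroupRoleMapper
{
    // Returns the value of the first RDN, e.g. "CN=GwAdmin,OU=Groups" -> "GwAdmin".
    // Simplified: does not handle escaped commas inside a DN.
    public static string FirstRdnValue(string groupDn)
    {
        string firstRdn = groupDn.Split(',')[0];
        int eq = firstRdn.IndexOf('=');
        return (eq >= 0 ? firstRdn.Substring(eq + 1) : firstRdn).Trim();
    }

    // Builds role and group claims; an empty result means no group matched
    // and the login should be rejected.
    public static IReadOnlyList<Claim> BuildClaims(
        IEnumerable<string> groupDns,
        IReadOnlyDictionary<string, string> groupToRole)
    {
        List<string> groups = groupDns
            .Select(FirstRdnValue)
            .Distinct(StringComparer.OrdinalIgnoreCase)
            .ToList();

        List<string> roles = groups
            .Where(g => groupToRole.ContainsKey(g))
            .Select(g => groupToRole[g])
            .Distinct(StringComparer.Ordinal)
            .ToList();

        if (roles.Count == 0) return Array.Empty<Claim>();

        return roles.Select(r => new Claim(ClaimTypes.Role, r))
            .Concat(groups.Select(g => new Claim("mxgateway:ldap_group", g)))
            .ToList();
    }
}
```

Returning no claims at all when no group maps keeps the "no matching group cannot sign in" rule in one place: the caller signs in only when the result is non-empty.
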
`POST /logout` (and a `GET /logout` convenience redirect) sign the cookie out and return to `/login`.

Three authorization policies are registered:

- `MxGateway.Dashboard.Viewer`: Razor component routes. Satisfied by Admin or Viewer.
- `MxGateway.Dashboard.Admin`: Admin-only write surfaces (API-key CRUD).
- `MxGateway.Dashboard.HubClients`: SignalR hubs. Accepts the dashboard cookie OR a `MxGateway.Dashboard.HubToken` bearer (used by WebSocket upgrades where the cookie can't be forwarded).

Two environmental bypasses still apply: `MxGateway:Authentication:Mode = Disabled` authorizes every request, and `MxGateway:Dashboard:AllowAnonymousLocalhost` (default `true`) authorizes any loopback request without a role check. Remote requests always require an authenticated principal carrying at least the Viewer role.

### Hub bearer flow

SignalR connections cannot reuse the `MxGatewayDashboard` cookie when the JS client upgrades to WebSocket: the cookie's `SameSite=Strict; Path=/` keeps it from being forwarded by the browser's WebSocket layer in some edge cases. The dashboard mints short-lived bearer tokens for the connection:

1. The cookie-authenticated Blazor page calls `GET /hubs/token` (gated by `ViewerPolicy`, cookie-only).
2. `HubTokenService.Issue(user)` serializes the user's name, NameIdentifier, and role claims to JSON, encrypts with the ASP.NET Core data-protection time-limited protector under purpose `ZB.MOM.WW.MxGateway.Dashboard.HubToken.v1`, and returns the protected string. Lifetime is 30 minutes.
3. The SignalR client passes the token as either `Authorization: Bearer …` or `?access_token=…` (WebSocket upgrade query string).
4. `HubTokenAuthenticationHandler` validates the protected payload and rebuilds the `ClaimsPrincipal` with the carried roles.
5. The hubs' `[Authorize(Policy = HubClientsPolicy)]` accepts the resulting identity.

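
The issue and validate steps above can be sketched with the ASP.NET Core data-protection time-limited protector. `HubTokenSketch` and `HubTokenPayload` are hypothetical names for this example and are not the gateway's `HubTokenService`; the payload shape is assumed, and the code needs the `Microsoft.AspNetCore.DataProtection` packages (part of the ASP.NET Core shared framework).

```csharp
#nullable enable
using System;
using System.Security.Cryptography;
using System.Text.Json;
using Microsoft.AspNetCore.DataProtection;

// Hypothetical payload; the real token may carry different fields.
public sealed record HubTokenPayload(string Name, string NameIdentifier, string[] Roles);

public sealed class HubTokenSketch
{
    private readonly ITimeLimitedDataProtector _protector;
    private readonly TimeSpan _lifetime;

    public HubTokenSketch(IDataProtectionProvider provider, TimeSpan lifetime)
    {
        // The purpose string isolates these tokens from other protected payloads.
        _protector = provider
            .CreateProtector("ZB.MOM.WW.MxGateway.Dashboard.HubToken.v1")
            .ToTimeLimitedDataProtector();
        _lifetime = lifetime;
    }

    // Serializes the payload to JSON and protects it with an expiry.
    public string Issue(HubTokenPayload payload) =>
        _protector.Protect(JsonSerializer.Serialize(payload), _lifetime);

    // Returns null when the token is tampered with, malformed, or expired.
    public HubTokenPayload? Validate(string token)
    {
        try
        {
            string json = _protector.Unprotect(token, out DateTimeOffset _);
            return JsonSerializer.Deserialize<HubTokenPayload>(json);
        }
        catch (CryptographicException)
        {
            return null;
        }
    }
}
```

A caller would construct it with the application's `IDataProtectionProvider` and `TimeSpan.FromMinutes(30)`, then rebuild a `ClaimsPrincipal` from the validated payload. Because expiry is enforced inside `Unprotect`, the validator needs no separate clock check.
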
`DashboardHubConnectionFactory` (scoped to the Blazor circuit) wraps the HubConnectionBuilder and supplies a fresh token via `AccessTokenProvider` on every (re)connect.

## Configuration

Effective configuration:

```json
{
  "MxGateway": {
    "Dashboard": {
      "Enabled": true,
      "AllowAnonymousLocalhost": true,
      "SnapshotIntervalMilliseconds": 1000,
      "RecentFaultLimit": 100,
      "RecentSessionLimit": 200,
      "ShowTagValues": false,
      "CookieName": null,
      "RequireHttpsCookie": true,
      "GroupToRole": {
        "GwAdmin": "Administrator",
        "GwReader": "Viewer"
      }
    }
  }
}
```

Two cookie keys tune the auth cookie:

- `CookieName` overrides the cookie name. Null or blank keeps the canonical default `MxGatewayDashboard`, so a misconfiguration cannot leave the cookie unnamed.
- `RequireHttpsCookie` (default `true`) sets the cookie `SecurePolicy` to `Always`. Set it to `false` for dev HTTP deployments, which relaxes the policy to `SameAsRequest`.

See [Gateway Configuration](./GatewayConfiguration.md#dashboard-options) for the full option table and the policies/hubs that derive from these values.

## Security Rules

- Do not display API key secrets.
- Do not display credential-bearing MXAccess command values.
- Do not display full tag values by default.
- Do not expose worker pipe names with nonce or sensitive details.
- Protect dashboard auth cookies with `HttpOnly`, `Secure`, and `SameSite`.
- Require TLS for remote dashboard access.
- Use anti-forgery protection for login/logout and any future admin actions.

## Styling

Styling is layered. From base to top:

1. Bootstrap 5.3.3 assets served from `src/ZB.MOM.WW.MxGateway.Server/wwwroot/lib/bootstrap/`.
2. The `ZB.MOM.WW.Theme` kit's `theme.css` (the technical-light design system), which owns the design tokens and the kit component styles. `App.razor` loads it through the kit's head component, and pairs it with `<ThemeScripts>` at the end of `<body>` for the rail's interactive behavior.
3. The local view stylesheet `src/ZB.MOM.WW.MxGateway.Server/wwwroot/css/site.css`, which wires the dashboard's own class names and Bootstrap widgets onto the kit tokens. It defines no hard-coded colors.

The minimal `/denied` page is rendered outside the Blazor circuit, so it loads the kit CSS directly from the static-web-asset path (`/_content/ZB.MOM.WW.Theme/css/theme.css` and `…/layout.css`) plus Bootstrap and `site.css`.

Recommended visual language:

- compact tables,
- the kit `StatusPill` for state,
- metric cards,
- Bootstrap alerts for faults,
- restrained colors drawn from the kit tokens,
- no decorative hero sections,
- no charting dependency for v1.

If charts are added later, prefer simple server-generated data tables first. Do not add a JavaScript charting dependency without a specific need.

The reusable visual rules for replicating this interface in other projects are documented in [Dashboard Interface Design](./DashboardInterfaceDesign.md).

## Testing

Dashboard unit/component tests should cover:

- snapshot projection,
- dashboard auth authorization decisions,
- login LDAP bind and group-to-role mapping behavior,
- pages render with empty state,
- pages render with active sessions,
- pages render with faulted sessions,
- realtime subscription disposal,
- redaction of API keys and credential values.

Use bUnit if component testing is added. Otherwise keep the first tests focused on snapshot services and authorization logic.

Integration tests should verify:

- dashboard disabled returns not found or configured fallback,
- dashboard requires auth when enabled,
- a user in an Admin-mapped LDAP group can access the dashboard and the API-key CRUD surface,
- a user in a Viewer-mapped LDAP group can render every page but cannot invoke the Admin-only management actions,
- a user with no mapped LDAP group cannot sign in at all,
- live snapshot updates when a fake session changes state are delivered via the `/hubs/snapshot` push, not by polling.

## Initial Implementation Slice

The first dashboard slice implements:

1. Blazor Server hosting in `ZB.MOM.WW.MxGateway.Server`.
2. local Bootstrap static assets plus the `ZB.MOM.WW.Theme` kit layer (chassis, tokens, status components).
3. dashboard configuration binding.
4. dashboard auth using LDAP bind + role-mapped HTTP-only cookie.
5. `DashboardSnapshotService` projecting gateway state for read views.
6. home page with metric cards.
7. sessions page with active session table and session details.
8. workers page with worker table.
9. events page with aggregate counters.
10. settings page with redacted effective configuration.
11. periodic realtime refresh through Blazor Server.
12. route-mapping tests, disabled-dashboard tests, auth tests, and snapshot projection/redaction tests.

Subsequent slices added Admin-gated destructive actions: API-key Create/Rotate/Revoke (and Delete on revoked keys), and session/worker Close/Kill via `IDashboardSessionAdminService` → `ISessionManager`. Every destructive action passes through the shared `ConfirmDialog` component before reaching its service.

## Related Documentation

- [Dashboard Interface Design](./DashboardInterfaceDesign.md)
- [Gateway Process Detailed Design](./GatewayProcessDesign.md)
- [Authentication](./Authentication.md)
- [Authorization](./Authorization.md)
- [Sessions](./Sessions.md)
- [Metrics](./Metrics.md)
- [Diagnostics](./Diagnostics.md)