# Gateway Dashboard Detailed Design

## Purpose

The gateway should host a basic web dashboard for operators and developers. For
v1 the dashboard provides diagnostic and operational visibility only. It should
show gateway health, active MXAccess worker instances, session state, and basic
statistics in real time.

## Technology Choice

Decision: Blazor Server with Bootstrap CSS/JS.

Allowed UI stack:

- ASP.NET Core Blazor Server,
- Bootstrap CSS,
- Bootstrap JavaScript,
- small local CSS for layout and status styling,
- built-in Blazor components.

Not allowed for v1:

- MudBlazor,
- Radzen,
- Syncfusion,
- Telerik,
- other Blazor UI component libraries,
- client-side SPA framework replacement.

Rationale: Blazor Server keeps the dashboard in the gateway process, avoids a
separate frontend build, and gives real-time UI updates through the Blazor
SignalR circuit. Bootstrap is sufficient for a basic dashboard.

## Hosting Model

The dashboard is hosted by `MxGateway.Server` alongside the gRPC API. When
`MxGateway:Dashboard:Enabled` is `true`, `MapGatewayDashboard()` maps the
configured `Dashboard:PathBase` to the Blazor Server app and maps the login,
logout, and access-denied HTTP endpoints beside it. When dashboard hosting is
disabled, those routes are not mapped.

Endpoint layout:

```text
/dashboard
/dashboard/sessions
/dashboard/sessions/{sessionId}
/dashboard/workers
/dashboard/events
/dashboard/galaxy
/dashboard/browse
/dashboard/alarms
/dashboard/apikeys
/dashboard/settings
/dashboard/_blazor
```

The `/dashboard/galaxy` page surfaces the Galaxy Repository browse summary
(deployed object hierarchy size, last deploy timestamp, attribute totals,
template usage, and connectivity sync info). The summary is fed by
`GalaxySummaryCache`, which is refreshed off the request path by
`GalaxySummaryRefreshService` on the
`MxGateway:Galaxy:DashboardRefreshIntervalSeconds` cadence so the dashboard
never blocks on SQL. See [Galaxy Repository Browse](./GalaxyRepository.md) for
the underlying gRPC service.

The app should redirect `/` to `/dashboard` only if the deployment wants the
dashboard as the default web page. Otherwise leave gRPC/API hosting unaffected.

## High-Level Components

```text
MxGateway.Server
  Dashboard/
    Components/
      App.razor
      Routes.razor
      DashboardPageBase.cs
      DashboardDisplay.cs
      Layout/
        DashboardLayout.razor
      Pages/
        DashboardHome.razor
        SessionsPage.razor
        SessionDetailsPage.razor
        WorkersPage.razor
        EventsPage.razor
        ApiKeysPage.razor
        SettingsPage.razor
      Shared/
        MetricCard.razor
        StatusBadge.razor
        FaultList.razor
    DashboardSnapshotService.cs
    DashboardAuthorizationHandler.cs
    DashboardAuthenticator.cs
    DashboardApiKeyAuthorization.cs
    DashboardApiKeyManagementService.cs
    DashboardApiKeySummary.cs
    DashboardSnapshot.cs
    DashboardSessionSummary.cs
    DashboardWorkerSummary.cs
    DashboardMetricSummary.cs
```

Blazor Server provides the SignalR circuit for UI updates. The implementation
does not add a separate public dashboard hub.

## Dashboard Data Source

The dashboard should consume read-only snapshots from gateway services:

- `SessionRegistry`,
- `SessionManager`,
- `WorkerClient`,
- `GatewayMetrics`,
- health checks,
- structured fault/event counters.

Do not let Razor components directly mutate gateway session or worker objects.
Create a small read-only dashboard service that projects gateway state into
plain DTOs.

`GatewayMetrics.GetSnapshot()` is the metrics input for the first dashboard
projection. It carries current session and worker gauges, command and event
counters, queue depth, and fault totals. The dashboard reads that snapshot
instead of reading raw `Meter` instruments because exporter configuration is an
operations concern, not a UI dependency.

Suggested service:

```csharp
public interface IDashboardSnapshotService
{
    DashboardSnapshot GetSnapshot();

    IAsyncEnumerable<DashboardSnapshot> WatchSnapshotsAsync(
        CancellationToken cancellationToken);
}
```

Snapshot updates can be driven by:

- periodic timer, default every 1 second,
- session lifecycle notifications,
- worker heartbeat updates,
- event counter updates,
- fault notifications.

Use immutable snapshot DTOs so Razor components can render without locking
gateway internals.
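
As an illustration of the immutable-DTO rule, the sketch below projects mutable session state into records in a single pass. The field names, the tuple input, and `DashboardSnapshotProjector` are hypothetical; the real `DashboardSnapshot` and `DashboardSessionSummary` shapes are not specified in this document and may differ.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.Immutable;
using System.Linq;

// Hypothetical DTO shapes for illustration only.
public sealed record DashboardSessionSummary(
    string SessionId,
    string State,
    int PendingCommands);

public sealed record DashboardSnapshot(
    DateTimeOffset CapturedAt,
    int OpenSessions,
    int FaultedSessions,
    ImmutableArray<DashboardSessionSummary> Sessions);

public static class DashboardSnapshotProjector
{
    // Copies live state into an immutable snapshot. After this returns, the
    // snapshot holds no reference to any gateway object, so a component can
    // render it without taking locks.
    public static DashboardSnapshot Project(
        IEnumerable<(string Id, string State, int Pending)> liveSessions,
        DateTimeOffset now)
    {
        var sessions = liveSessions
            .Select(s => new DashboardSessionSummary(s.Id, s.State, s.Pending))
            .ToImmutableArray();

        return new DashboardSnapshot(
            now,
            OpenSessions: sessions.Length,
            // "Faulted" is an assumed state name.
            FaultedSessions: sessions.Count(s => s.State == "Faulted"),
            Sessions: sessions);
    }
}
```

Because records and `ImmutableArray<T>` cannot be modified after construction, a snapshot handed to one circuit can be shared with every other circuit safely.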

## Realtime Updates

Use Blazor Server component state updates for real-time dashboard refresh.

Implemented pattern:

1. Page/component subscribes to `WatchSnapshotsAsync`.
2. Snapshot service emits updates from a bounded channel or timer.
3. Component stores the latest snapshot.
4. Component calls `InvokeAsync(StateHasChanged)`.
5. Component cancels subscription on dispose.
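
The five steps above can be sketched without any Blazor dependency. This is a minimal sketch of the timer variant, not the gateway's implementation: `TimerSnapshotSource` and `SnapshotSubscriber` are hypothetical names, and the `onChanged` callback stands in for the component's `InvokeAsync(StateHasChanged)` call.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Threading;
using System.Threading.Tasks;

// Step 2: emits one snapshot immediately, then one per timer tick.
public sealed class TimerSnapshotSource<TSnapshot>
{
    private readonly Func<TSnapshot> _capture;
    private readonly TimeSpan _interval;

    public TimerSnapshotSource(Func<TSnapshot> capture, TimeSpan interval)
    {
        _capture = capture;
        _interval = interval;
    }

    public async IAsyncEnumerable<TSnapshot> WatchSnapshotsAsync(
        [EnumeratorCancellation] CancellationToken cancellationToken = default)
    {
        yield return _capture();
        using var timer = new PeriodicTimer(_interval);
        while (await timer.WaitForNextTickAsync(cancellationToken))
        {
            yield return _capture();
        }
    }
}

// Steps 1, 3, 4 and 5: subscribe, keep the latest snapshot, notify, cancel on dispose.
public sealed class SnapshotSubscriber<TSnapshot> : IAsyncDisposable
{
    private readonly CancellationTokenSource _cts = new();
    private readonly Task _loop;

    public TSnapshot? Latest { get; private set; }

    public SnapshotSubscriber(IAsyncEnumerable<TSnapshot> source, Func<Task> onChanged)
    {
        _loop = RunAsync(source, onChanged);
    }

    private async Task RunAsync(IAsyncEnumerable<TSnapshot> source, Func<Task> onChanged)
    {
        try
        {
            await foreach (var snapshot in source.WithCancellation(_cts.Token))
            {
                Latest = snapshot;
                await onChanged(); // in a component: InvokeAsync(StateHasChanged)
            }
        }
        catch (OperationCanceledException)
        {
            // Expected when the subscriber is disposed.
        }
    }

    public async ValueTask DisposeAsync()
    {
        _cts.Cancel();
        await _loop;
        _cts.Dispose();
    }
}
```

Awaiting the loop in `DisposeAsync` ensures no callback runs after the component has been torn down.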

Default update cadence:

- periodic metrics refresh every 1 second,
- event counters update on the next snapshot tick.

Avoid pushing every MXAccess data-change event to the dashboard. Aggregate event
counts and rates instead.
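
One way to aggregate instead of forwarding each event is a counter that the event path increments and the snapshot tick samples. The `EventRateCounter` type below is a hypothetical sketch; it assumes `Sample` is called from a single thread (the snapshot timer), while `Record` may be called concurrently.

```csharp
using System;
using System.Threading;

public sealed class EventRateCounter
{
    private long _total;
    private long _lastTotal;
    private DateTimeOffset _lastSample;

    public EventRateCounter(DateTimeOffset start) => _lastSample = start;

    // Hot path: one atomic increment per event, no allocation, no UI work.
    public void Record() => Interlocked.Increment(ref _total);

    // Called once per snapshot tick. Returns the running total and the
    // average events per second since the previous sample.
    public (long Total, double PerSecond) Sample(DateTimeOffset now)
    {
        long total = Interlocked.Read(ref _total);
        double seconds = (now - _lastSample).TotalSeconds;
        double rate = seconds > 0 ? (total - _lastTotal) / seconds : 0;
        _lastTotal = total;
        _lastSample = now;
        return (total, rate);
    }
}
```

With this shape the dashboard cost is one sample per tick regardless of how many data-change events the worker produces.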

## Pages

### Dashboard home

Show top-level status:

- gateway status,
- gateway version,
- uptime,
- open sessions,
- workers running,
- sessions faulted,
- command rate,
- command failure count,
- event rate,
- event queue depth,
- worker restart/kill count.

Use Bootstrap cards for individual metric summaries. Keep the layout compact
and operational.

### Sessions page

Show active and recent sessions in a table:

- session id,
- client identity or API key display name,
- state,
- backend,
- worker process id,
- open time,
- last client activity,
- last worker heartbeat,
- active event subscribers,
- pending commands,
- event queue depth,
- last fault summary.

Rows should link to session details.

### Session details page

Show:

- session metadata,
- worker metadata,
- command counters by method,
- event counters by family,
- active server handles and item counts if gateway shadow state has them,
- latest faults,
- last heartbeat payload,
- close/kill controls only if admin actions are later enabled.

For v1, details should be read-only unless an explicit admin action design is
added.

### Workers page

Show:

- worker process id,
- session id,
- executable path/version,
- state,
- startup duration,
- memory and CPU if available,
- last heartbeat,
- current command correlation id,
- pending command count,
- event queue depth,
- restart/kill reason if terminal.

### Events page

Show aggregate event diagnostics:

- event rate by session,
- event rate by event family,
- total events since start,
- queue overflow count,
- stream disconnect count,
- recent terminal faults.

Do not display full tag values by default. If value display is later added, make
it opt-in and redacted.

### Browse page

`/dashboard/browse` lets an operator explore the Galaxy tag hierarchy and watch
live values. The tree is built in-process by `DashboardBrowseTreeBuilder` from
`IGalaxyHierarchyCache.Current` (the same cache the Galaxy page reads), so a
render costs no gRPC call and no SQL round-trip. Each node shows its child
objects and, when expanded, its attributes with attribute name, data type
(including array dimension), and the alarm / historized flags. Galaxy SQL
carries no attribute description, so none is shown. A filter box switches the
tree to a flat list of matching attributes.
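
The switch from tree to flat list can be pictured as a depth-first walk that keeps only matching attributes. The sketch below is an assumption-heavy illustration: `BrowseNode`, `BrowseAttribute`, and `BrowseFilter` are hypothetical types, the `Object.Attribute` reference format is assumed, and the actual matching rules of `DashboardBrowseTreeBuilder` are not described here.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical node shapes for illustration only.
public sealed record BrowseAttribute(
    string Name, string DataType, bool IsAlarm, bool IsHistorized);

public sealed record BrowseNode(
    string Name,
    IReadOnlyList<BrowseNode> Children,
    IReadOnlyList<BrowseAttribute> Attributes);

public static class BrowseFilter
{
    // Walks the tree depth-first and yields "Object.Attribute" references
    // that contain the filter text, ignoring case.
    public static IEnumerable<string> FlatMatches(BrowseNode root, string filter)
    {
        var stack = new Stack<BrowseNode>();
        stack.Push(root);
        while (stack.Count > 0)
        {
            var node = stack.Pop();
            foreach (var attribute in node.Attributes)
            {
                var reference = $"{node.Name}.{attribute.Name}";
                if (reference.Contains(filter, StringComparison.OrdinalIgnoreCase))
                {
                    yield return reference;
                }
            }

            // Push in reverse so children are visited in declared order.
            for (int i = node.Children.Count - 1; i >= 0; i--)
            {
                stack.Push(node.Children[i]);
            }
        }
    }
}
```

Matching on the combined reference lets one filter box cover both object names and attribute names.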

Right-clicking an attribute (or double-clicking it) adds it to the subscription
panel. The panel shows each subscribed tag's live value, MXAccess data type,
quality and source timestamp, refreshed every two seconds. The subscription
panel is the explicit opt-in tag-value surface: it always shows values
regardless of `Dashboard:ShowTagValues`, which continues to govern only the
diagnostic session/worker views.

### Alarms page

`/dashboard/alarms` lists the alarms the dashboard session's worker currently
reports as Active or ActiveAcked, refreshed every three seconds. It defaults to
showing unacknowledged `Active` alarms; filters add acknowledged alarms and
narrow by area, severity range, and a reference/source/description text search.
Cleared alarms are not retained: the gateway holds no alarm-history store, so
the page reflects only the live active set. The page is read-only; it does not
acknowledge alarms. If `MxGateway:Alarms:Enabled` is false the session is never
subscribed to an alarm provider, and the page says so instead of showing an
empty list with no explanation.
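
The filter semantics described above (unacknowledged by default, optional area, severity range, and text search) can be expressed as one predicate. This is a sketch under assumptions: `AlarmRow`, `AlarmFilter`, and `AlarmFiltering` are hypothetical names, and case-insensitive matching is a choice made here, not a documented behavior.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

public enum AlarmState { Active, ActiveAcked }

// Hypothetical row shape for illustration only.
public sealed record AlarmRow(
    string Reference,
    string Source,
    string Area,
    int Severity,
    AlarmState State,
    string Description);

// Defaults mirror the page default: unacknowledged Active alarms only.
public sealed record AlarmFilter(
    bool IncludeAcknowledged = false,
    string? Area = null,
    int MinSeverity = int.MinValue,
    int MaxSeverity = int.MaxValue,
    string? Text = null);

public static class AlarmFiltering
{
    public static IEnumerable<AlarmRow> Apply(IEnumerable<AlarmRow> alarms, AlarmFilter f) =>
        alarms.Where(a =>
            (f.IncludeAcknowledged || a.State == AlarmState.Active) &&
            (f.Area is null ||
             string.Equals(a.Area, f.Area, StringComparison.OrdinalIgnoreCase)) &&
            a.Severity >= f.MinSeverity &&
            a.Severity <= f.MaxSeverity &&
            (string.IsNullOrEmpty(f.Text) ||
             a.Reference.Contains(f.Text, StringComparison.OrdinalIgnoreCase) ||
             a.Source.Contains(f.Text, StringComparison.OrdinalIgnoreCase) ||
             a.Description.Contains(f.Text, StringComparison.OrdinalIgnoreCase)));
}
```

Filtering happens on the already-fetched active set, so changing a filter needs no extra worker call.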

### Live data source

Both the Browse subscription panel and the Alarms page read live MXAccess data
through `IDashboardLiveDataService` (`DashboardLiveDataService`). It owns one
shared gateway session for the whole dashboard, opened lazily on first use via
`ISessionManager` and re-opened transparently when it faults or its lease
expires. One session means one worker process backs every dashboard circuit;
all access is serialised so the worker sees one in-flight command at a time.
Tag reads go through `GatewaySession.SubscribeBulkAsync` / `ReadBulkAsync`;
alarm queries go through `IAlarmRpcDispatcher`. Alarm subscription is the
gateway's existing auto-subscribe-on-open hook, so the dashboard session is
alarm-subscribed only when `MxGateway:Alarms:Enabled` is set.
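
The lazy-open, transparent re-open, and one-command-at-a-time behavior can be sketched with a single `SemaphoreSlim`. `SharedSessionGate` and its two delegates are hypothetical; the real service opens its session through `ISessionManager`, whose signatures are not shown in this document.

```csharp
#nullable enable
using System;
using System.Threading;
using System.Threading.Tasks;

public sealed class SharedSessionGate<TSession> where TSession : class
{
    private readonly Func<CancellationToken, Task<TSession>> _open;
    private readonly Func<TSession, bool> _isUsable;
    private readonly SemaphoreSlim _gate = new(1, 1);
    private TSession? _session;

    // open: creates a new session (assumed to wrap the session manager).
    // isUsable: returns false once the session has faulted or its lease expired.
    public SharedSessionGate(
        Func<CancellationToken, Task<TSession>> open,
        Func<TSession, bool> isUsable)
    {
        _open = open;
        _isUsable = isUsable;
    }

    public async Task<TResult> RunAsync<TResult>(
        Func<TSession, CancellationToken, Task<TResult>> command,
        CancellationToken cancellationToken = default)
    {
        // Serialise all callers so only one command is in flight.
        await _gate.WaitAsync(cancellationToken);
        try
        {
            // Open on first use, and again whenever the session went bad.
            if (_session is null || !_isUsable(_session))
            {
                _session = await _open(cancellationToken);
            }

            return await command(_session, cancellationToken);
        }
        finally
        {
            _gate.Release();
        }
    }
}
```

Holding the gate across both the open and the command means two circuits can never race to open a second worker.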

### API keys page

`/dashboard/apikeys` lists the gateway's API keys and, for authorized
operators, manages them. It reads key metadata through the same
`IApiKeyAdminStore` the `apikey` CLI uses, so the dashboard and the CLI act
on one source of truth.

The table shows one row per key:

- key id,
- status (`Active` or `Revoked`),
- display name,
- scopes,
- constraints (rendered as `unconstrained` when none are set),
- created timestamp,
- last-used timestamp.

Key secrets are never listed. Only the peppered hash is stored, and the page
never reconstructs a key. See [Authorization](./Authorization.md#constraint-enforcement)
for what each constraint means and how it is enforced on the gRPC path.

#### Management actions

Create, Rotate, and Revoke controls render only when the signed-in user is
authorized. `DashboardApiKeyAuthorization.CanManage` requires an authenticated
principal that is a member of the LDAP `MxGateway:Ldap:RequiredGroup`, the
same group the dashboard login enforces. An anonymous localhost viewer can read
the table but sees no action controls.

- **Create** opens a dialog for the key id, display name, scope checkboxes
  (the `GatewayScopes` catalog), and the optional constraint fields: read and
  write subtrees, read and write tag globs, browse subtrees, max write
  classification, and the read-alarm-only / read-historized-only flags.
- **Rotate** issues a new secret for an existing key id and invalidates the
  old one.
- **Revoke** marks a key revoked; a revoked key cannot be un-revoked.

Create and Rotate return the assembled `mxgw_<keyId>_<secret>` token **once**,
in a one-time banner. It is never shown again, so the operator must copy it
immediately. This mirrors the `apikey create-key` / `rotate-key` CLI.

Every management action appends an `api_key_audit` entry
(`dashboard-create-key`, `dashboard-rotate-key`, `dashboard-revoke-key`) with
the key id and the caller's remote address. Secrets and pepper values are never
logged.
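
The rule that gates the management controls (authenticated and in the required group) might look like the sketch below. It is not the actual `DashboardApiKeyAuthorization.CanManage`: the `ApiKeyManagementPolicy` name is hypothetical, and it assumes LDAP group membership reaches the principal as role claims and that an unset group name denies access.

```csharp
#nullable enable
using System;
using System.Linq;
using System.Security.Claims;

public static class ApiKeyManagementPolicy
{
    // Assumption: group membership is carried as ClaimTypes.Role claims.
    public static bool CanManage(ClaimsPrincipal? user, string requiredGroup)
    {
        // Anonymous viewers (including the localhost bypass) never manage keys.
        if (user?.Identity is not { IsAuthenticated: true })
        {
            return false;
        }

        // Assumption: fail closed when no group is configured.
        if (string.IsNullOrWhiteSpace(requiredGroup))
        {
            return false;
        }

        return user.Claims.Any(c =>
            c.Type == ClaimTypes.Role &&
            string.Equals(c.Value, requiredGroup, StringComparison.OrdinalIgnoreCase));
    }
}
```

A page would evaluate this once per render and omit the Create, Rotate, and Revoke buttons when it returns `false`.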

### Settings page

Show read-only effective configuration:

- worker executable path,
- configured timeouts,
- queue capacities,
- auth mode,
- SQLite auth database path with sensitive parts redacted if needed,
- dashboard enabled state,
- protocol version.

Do not show API key secrets or pepper values.

## Authentication And Authorization

Dashboard access uses the same API-key authentication model as gRPC where
practical.

Implemented v1 behavior:

- when enabled, require API key auth,
- require `admin` scope for dashboard access,
- accept API key through a secure cookie established by a simple login form,
- do not put API keys in query strings,
- validate anti-forgery tokens for login and logout posts.

The implementation path is:

1. Add `/dashboard/login`.
2. User submits API key over HTTPS.
3. Gateway validates key and `admin` scope.
4. Gateway issues an HTTP-only secure auth cookie for the dashboard.
5. Dashboard pages require that cookie.
6. Logout clears the cookie.

For local development, `Dashboard:AllowAnonymousLocalhost` defaults to `true`.
The bypass applies only to loopback requests; remote dashboard requests still
use the API-key-backed cookie flow.

`DashboardAuthenticator` keeps API-key validation outside UI components. It
formats the submitted key as a bearer authorization header for
`IApiKeyVerifier`, rejects non-admin keys when `Dashboard:RequireAdminScope` is
enabled, and creates the dashboard cookie principal without storing raw API key
material. `DashboardAuthorizationHandler` enforces the cookie, admin-scope, and
explicit loopback bypass decisions for all protected dashboard routes.
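
The three decisions the handler combines (cookie, admin scope, loopback bypass) reduce to a small pure function, which also makes them easy to unit test. The types below are hypothetical, and the evaluation order is an assumption rather than a description of `DashboardAuthorizationHandler`.

```csharp
// Facts about the current request, gathered by the caller.
public sealed record DashboardAccessRequest(
    bool HasValidCookie,
    bool HasAdminScope,
    bool IsLoopback);

// Mirrors the Dashboard:RequireAdminScope and
// Dashboard:AllowAnonymousLocalhost settings.
public sealed record DashboardAccessOptions(
    bool RequireAdminScope = true,
    bool AllowAnonymousLocalhost = true);

public static class DashboardAccessPolicy
{
    public static bool IsAllowed(DashboardAccessRequest request, DashboardAccessOptions options)
    {
        // The bypass applies only to loopback requests and only when enabled.
        if (options.AllowAnonymousLocalhost && request.IsLoopback)
        {
            return true;
        }

        // Everyone else needs the dashboard auth cookie.
        if (!request.HasValidCookie)
        {
            return false;
        }

        // The admin scope is required unless explicitly relaxed.
        return !options.RequireAdminScope || request.HasAdminScope;
    }
}
```

Keeping the decision free of HTTP types lets the authorization tests listed under Testing run without a web host.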

## Configuration

Suggested configuration:

```json
{
  "MxGateway": {
    "Dashboard": {
      "Enabled": true,
      "PathBase": "/dashboard",
      "RequireAdminScope": true,
      "AllowAnonymousLocalhost": true,
      "SnapshotIntervalMilliseconds": 1000,
      "RecentFaultLimit": 100,
      "RecentSessionLimit": 200,
      "ShowTagValues": false
    }
  }
}
```
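
A matching options class could look like the sketch below. The property names follow the JSON keys above; the class name, the `SectionName` constant, the use of the suggested values as defaults, and the `Validate` helper are assumptions made for illustration.

```csharp
using System.Collections.Generic;

public sealed class DashboardOptions
{
    // Assumed configuration section path, matching the JSON nesting.
    public const string SectionName = "MxGateway:Dashboard";

    public bool Enabled { get; set; } = true;
    public string PathBase { get; set; } = "/dashboard";
    public bool RequireAdminScope { get; set; } = true;
    public bool AllowAnonymousLocalhost { get; set; } = true;
    public int SnapshotIntervalMilliseconds { get; set; } = 1000;
    public int RecentFaultLimit { get; set; } = 100;
    public int RecentSessionLimit { get; set; } = 200;
    public bool ShowTagValues { get; set; }

    // Returns one message per invalid setting; empty when the options are usable.
    public IReadOnlyList<string> Validate()
    {
        var errors = new List<string>();
        if (string.IsNullOrEmpty(PathBase) || !PathBase.StartsWith('/'))
        {
            errors.Add("PathBase must start with '/'.");
        }
        if (SnapshotIntervalMilliseconds <= 0)
        {
            errors.Add("SnapshotIntervalMilliseconds must be positive.");
        }
        if (RecentFaultLimit < 0)
        {
            errors.Add("RecentFaultLimit must not be negative.");
        }
        if (RecentSessionLimit < 0)
        {
            errors.Add("RecentSessionLimit must not be negative.");
        }
        return errors;
    }
}
```

Validating at startup surfaces a bad `PathBase` or interval before any route is mapped.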

## Security Rules

- Do not display API key secrets.
- Do not display credential-bearing MXAccess command values.
- Do not display full tag values by default.
- Do not expose worker pipe names with nonce or sensitive details.
- Protect dashboard auth cookies with `HttpOnly`, `Secure`, and `SameSite`.
- Require TLS for remote dashboard access.
- Use anti-forgery protection for login/logout and any future admin actions.

## Styling

The dashboard serves Bootstrap 5.3.3 assets from
`src/MxGateway.Server/wwwroot/lib/bootstrap/` and local layout/status styling
from `src/MxGateway.Server/wwwroot/css/dashboard.css`.

Recommended visual language:

- compact tables,
- status badges,
- metric cards,
- Bootstrap alerts for faults,
- restrained colors,
- no decorative hero sections,
- no charting dependency for v1.

If charts are added later, prefer simple server-generated data tables first. Do
not add a JavaScript charting dependency without a specific need.

The reusable visual rules for replicating this interface in other projects are
documented in [Dashboard Interface Design](./DashboardInterfaceDesign.md).

## Testing

Dashboard unit/component tests should cover:

- snapshot projection,
- dashboard authorization decisions,
- login API-key validation behavior,
- pages render with empty state,
- pages render with active sessions,
- pages render with faulted sessions,
- realtime subscription disposal,
- redaction of API keys and credential values.

Use bUnit if component testing is added. Otherwise keep the first tests focused
on snapshot services and authorization logic.

Integration tests should verify:

- dashboard disabled returns not found or configured fallback,
- dashboard requires auth when enabled,
- admin-scoped key can access dashboard,
- non-admin key is denied,
- live snapshot updates when a fake session changes state.

## Initial Implementation Slice

The first dashboard slice implements:

1. Blazor Server hosting in `MxGateway.Server`.
2. local Bootstrap static assets.
3. dashboard configuration binding.
4. dashboard auth using API key login and HTTP-only cookie.
5. read-only `DashboardSnapshotService`.
6. home page with metric cards.
7. sessions page with active session table and session details.
8. workers page with worker table.
9. events page with aggregate counters.
10. settings page with redacted effective configuration.
11. periodic realtime refresh through Blazor Server.
12. route-mapping tests, disabled-dashboard tests, auth tests, and snapshot
    projection/redaction tests.

## Related Documentation

- [Dashboard Interface Design](./DashboardInterfaceDesign.md)
- [Gateway Process Detailed Design](./GatewayProcessDesign.md)
- [Authentication](./Authentication.md)
- [Authorization](./Authorization.md)
- [Sessions](./Sessions.md)
- [Metrics](./Metrics.md)
- [Diagnostics](./Diagnostics.md)