# Gateway Dashboard Detailed Design

## Purpose

The gateway should host a basic web dashboard for operators and developers. For
v1 the dashboard provides diagnostic and operational visibility only. It should
show gateway health, active MXAccess worker instances, session state, and basic
statistics in real time.

## Technology Choice

Decision: Blazor Server with Bootstrap CSS/JS.

Allowed UI stack:

- ASP.NET Core Blazor Server,
- Bootstrap CSS,
- Bootstrap JavaScript,
- small local CSS for layout and status styling,
- built-in Blazor components.

Not allowed for v1:

- MudBlazor,
- Radzen,
- Syncfusion,
- Telerik,
- other Blazor UI component libraries,
- client-side SPA framework replacement.

Rationale: Blazor Server keeps the dashboard in the gateway process, avoids a
separate frontend build, and gives real-time UI updates through the Blazor
SignalR circuit. Bootstrap is sufficient for a basic dashboard.

## Hosting Model

The dashboard is hosted by `MxGateway.Server` alongside the gRPC API.

Suggested endpoint layout:

```text
/dashboard
/dashboard/sessions
/dashboard/sessions/{sessionId}
/dashboard/workers
/dashboard/events
/dashboard/settings
/_blazor
```

The app should redirect `/` to `/dashboard` only if the deployment wants the
dashboard as the default web page. Otherwise leave gRPC/API hosting unaffected.

## High-Level Components

```text
MxGateway.Server
  Dashboard/
    Components/
      App.razor
      Routes.razor
      Layout/
        DashboardLayout.razor
        NavMenu.razor
      Pages/
        DashboardHome.razor
        SessionsPage.razor
        SessionDetailsPage.razor
        WorkersPage.razor
        EventsPage.razor
        SettingsPage.razor
      Components/
        MetricCard.razor
        SessionTable.razor
        WorkerTable.razor
        EventRatePanel.razor
        FaultList.razor
    Services/
      DashboardSnapshotService.cs
      DashboardUpdateHub.cs
      DashboardAuthorization.cs
    Models/
      DashboardSnapshot.cs
      SessionSummary.cs
      WorkerSummary.cs
      MetricSummary.cs
```

`DashboardUpdateHub` here means an internal application update service, not a
separate public SignalR hub unless implementation proves one is needed. Blazor
Server already uses SignalR for UI circuits.

## Dashboard Data Source

The dashboard should consume read-only snapshots from gateway services:

- `SessionRegistry`,
- `SessionManager`,
- `WorkerClient`,
- `GatewayMetrics`,
- health checks,
- structured fault/event counters.

Do not let Razor components directly mutate gateway session or worker objects.
Create a small read-only dashboard service that projects gateway state into
plain DTOs.

`GatewayMetrics.GetSnapshot()` is the metrics input for the first dashboard
projection. It carries current session and worker gauges, command and event
counters, queue depth, and fault totals. The dashboard reads that snapshot
instead of reading raw `Meter` instruments because exporter configuration is an
operations concern, not a UI dependency.

Suggested service:

```csharp
public interface IDashboardSnapshotService
{
    DashboardSnapshot GetSnapshot();

    IAsyncEnumerable<DashboardSnapshot> WatchSnapshotsAsync(
        CancellationToken cancellationToken);
}
```

Snapshot updates can be driven by:

- periodic timer, default every 1 second,
- session lifecycle notifications,
- worker heartbeat updates,
- event counter updates,
- fault notifications.

Use immutable snapshot DTOs so Razor components can render without locking
gateway internals.

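A minimal, timer-only sketch of the suggested service may help make the contract concrete. It assumes .NET 6 or later (for `PeriodicTimer`). The trimmed `DashboardSnapshot` record and the `Func<DashboardSnapshot>` projection delegate are illustrative stand-ins, not the final model. Lifecycle and fault notifications would need an additional signal, such as a bounded channel, which is left out here.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical, trimmed-down DTO. Records are immutable by default, so a
// component can render one without locking gateway internals.
public sealed record DashboardSnapshot(
    DateTimeOffset CapturedAt, int OpenSessions, int WorkersRunning);

public interface IDashboardSnapshotService
{
    DashboardSnapshot GetSnapshot();

    IAsyncEnumerable<DashboardSnapshot> WatchSnapshotsAsync(
        CancellationToken cancellationToken);
}

public sealed class DashboardSnapshotService : IDashboardSnapshotService
{
    // Assumption: the projection reads gateway state and returns a fresh DTO.
    private readonly Func<DashboardSnapshot> _project;
    private readonly TimeSpan _interval;

    public DashboardSnapshotService(Func<DashboardSnapshot> project, TimeSpan interval)
    {
        _project = project;
        _interval = interval;
    }

    public DashboardSnapshot GetSnapshot() => _project();

    public async IAsyncEnumerable<DashboardSnapshot> WatchSnapshotsAsync(
        [EnumeratorCancellation] CancellationToken cancellationToken)
    {
        // Emit one snapshot immediately so the page is never blank.
        yield return GetSnapshot();

        using var timer = new PeriodicTimer(_interval);

        // Throws OperationCanceledException when the subscriber cancels.
        while (await timer.WaitForNextTickAsync(cancellationToken))
        {
            yield return GetSnapshot();
        }
    }
}
```

Each subscriber gets its own timer in this sketch. If many circuits watch at once, a single shared timer that fans out to subscribers would avoid projecting the same state repeatedly.
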
## Realtime Updates

Use Blazor Server component state updates for real-time dashboard refresh.

Recommended pattern:

1. Page/component subscribes to `WatchSnapshotsAsync`.
2. Snapshot service emits updates from a bounded channel or timer.
3. Component stores the latest snapshot.
4. Component calls `InvokeAsync(StateHasChanged)`.
5. Component cancels subscription on dispose.

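The component side of the steps above can be sketched without any Blazor dependency. `SnapshotSubscription<T>` is a hypothetical helper, not an existing type. In a real component the `watch` argument would presumably be `WatchSnapshotsAsync` and the `onChanged` callback would be `() => InvokeAsync(StateHasChanged)`.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: owns the subscription loop so the component only keeps
// the latest value and supplies a render callback.
public sealed class SnapshotSubscription<T> : IDisposable where T : class
{
    private readonly CancellationTokenSource _cts = new();

    // Step 3: the most recent snapshot, or null before the first one arrives.
    public T? Latest { get; private set; }

    // Step 1: start watching. The returned task completes when the stream
    // ends or the subscription is disposed.
    public Task RunAsync(
        Func<CancellationToken, IAsyncEnumerable<T>> watch,
        Func<Task> onChanged)
    {
        return PumpAsync(watch(_cts.Token), onChanged, _cts.Token);
    }

    private async Task PumpAsync(
        IAsyncEnumerable<T> source, Func<Task> onChanged, CancellationToken token)
    {
        try
        {
            await foreach (var snapshot in source.WithCancellation(token))
            {
                Latest = snapshot;   // step 3
                await onChanged();   // step 4
            }
        }
        catch (OperationCanceledException)
        {
            // Expected when the owning component is disposed.
        }
    }

    // Step 5: cancel the subscription when the component goes away.
    public void Dispose()
    {
        _cts.Cancel();
        _cts.Dispose();
    }
}
```

A page would typically create the subscription when it initializes, start `RunAsync` without awaiting it, and call `Dispose` from its own `Dispose` method.
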
Default update cadence:

- immediate update on session create/close/fault,
- immediate update on worker fault,
- periodic metrics refresh every 1 second,
- event-rate windows updated every 1 second.

Avoid pushing every MXAccess data-change event to the dashboard. Aggregate event
counts and rates instead.

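One way to aggregate instead of pushing each event is a counter that the event path increments and the snapshot timer samples. `EventRateCounter` below is a hypothetical name, and the sketch assumes a single sampling caller (the snapshot timer).

```csharp
using System;
using System.Threading;

// Hypothetical aggregator: the hot path only bumps a counter, and the
// dashboard derives a rate from the difference between two samples.
public sealed class EventRateCounter
{
    private long _total;
    private long _lastSampledTotal;

    // Called once per MXAccess event. Lock-free and safe from any thread.
    public void Increment() => Interlocked.Increment(ref _total);

    // Total events since start.
    public long Total => Interlocked.Read(ref _total);

    // Called by the snapshot timer only. Returns events per second over the
    // time elapsed since the previous sample.
    public double SampleRatePerSecond(TimeSpan elapsed)
    {
        if (elapsed <= TimeSpan.Zero)
        {
            throw new ArgumentOutOfRangeException(nameof(elapsed));
        }

        long total = Interlocked.Read(ref _total);
        long delta = total - _lastSampledTotal;
        _lastSampledTotal = total;
        return delta / elapsed.TotalSeconds;
    }
}
```

With a 1-second snapshot interval, the dashboard sees at most one rate value per counter per second regardless of how many events the workers produce.
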
## Pages

### Dashboard Home

Show top-level status:

- gateway status,
- gateway version,
- uptime,
- open sessions,
- workers running,
- sessions faulted,
- command rate,
- command failure count,
- event rate,
- event queue depth,
- worker restart/kill count.

Use Bootstrap cards for individual metric summaries. Keep the layout compact
and operational.

### Sessions Page

Show active and recent sessions in a table:

- session id,
- client identity or API key display name,
- state,
- backend,
- worker process id,
- open time,
- last client activity,
- last worker heartbeat,
- active event subscribers,
- pending commands,
- event queue depth,
- last fault summary.

Rows should link to session details.

### Session Details Page

Show:

- session metadata,
- worker metadata,
- command counters by method,
- event counters by family,
- active server handles and item counts if gateway shadow state has them,
- latest faults,
- last heartbeat payload,
- close/kill controls only if admin actions are later enabled.

For v1, details should be read-only unless an explicit admin action design is
added.

### Workers Page

Show:

- worker process id,
- session id,
- executable path/version,
- state,
- startup duration,
- memory and CPU if available,
- last heartbeat,
- current command correlation id,
- pending command count,
- event queue depth,
- restart/kill reason if terminal.

### Events Page

Show aggregate event diagnostics:

- event rate by session,
- event rate by event family,
- total events since start,
- queue overflow count,
- stream disconnect count,
- recent terminal faults.

Do not display full tag values by default. If value display is later added, make
it opt-in and redacted.

### Settings Page

Show read-only effective configuration:

- worker executable path,
- configured timeouts,
- queue capacities,
- auth mode,
- SQLite auth database path with sensitive parts redacted if needed,
- dashboard enabled state,
- protocol version.

Do not show API key secrets or pepper values.

## Authentication And Authorization

Dashboard access should use the same API-key authentication model as gRPC where
practical.

Recommended v1 behavior:

- dashboard disabled by default unless configured,
- when enabled, require API key auth,
- require `admin` scope for dashboard access,
- accept API key through a secure cookie established by a simple login form, or
  through reverse-proxy/header configuration for local deployments,
- do not put API keys in query strings.

Simplest implementation path:

1. Add `/dashboard/login`.
2. User submits API key over HTTPS.
3. Gateway validates key and `admin` scope.
4. Gateway issues an HTTP-only secure auth cookie for the dashboard.
5. Dashboard pages require that cookie.
6. Logout clears the cookie.

For local development, allow an explicit `Dashboard:AllowAnonymousLocalhost`
option. It must default to false.

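The access rules above can be expressed as a pure function, which keeps them easy to unit test apart from cookie handling. The sketch below is illustrative only: `DashboardAccess`, `DashboardAccessRequest`, and `DashboardAccessPolicy` are hypothetical names, and the ordering of the checks is an assumption about how the rules combine.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public enum DashboardAccess { NotFound, Allow, Challenge, Forbid }

// Hypothetical request shape. KeyScopes is null when the caller presented no
// valid API key or dashboard cookie.
public sealed record DashboardAccessRequest(
    bool DashboardEnabled,
    bool AllowAnonymousLocalhost,
    bool RequireAdminScope,
    bool IsLoopback,
    IReadOnlyCollection<string>? KeyScopes);

public static class DashboardAccessPolicy
{
    public static DashboardAccess Evaluate(DashboardAccessRequest request)
    {
        // A disabled dashboard should look like it does not exist.
        if (!request.DashboardEnabled) return DashboardAccess.NotFound;

        // Explicit opt-in for local development only.
        if (request.AllowAnonymousLocalhost && request.IsLoopback)
        {
            return DashboardAccess.Allow;
        }

        // No valid key: send the user to the login form.
        if (request.KeyScopes is null) return DashboardAccess.Challenge;

        // Valid key without the admin scope is denied.
        if (request.RequireAdminScope
            && !request.KeyScopes.Contains("admin", StringComparer.Ordinal))
        {
            return DashboardAccess.Forbid;
        }

        return DashboardAccess.Allow;
    }
}
```

The cookie middleware or authorization handler would then only translate the result into a response: 404, a redirect to `/dashboard/login`, 403, or normal rendering.
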
## Configuration

Suggested configuration:

```json
{
  "MxGateway": {
    "Dashboard": {
      "Enabled": true,
      "PathBase": "/dashboard",
      "RequireAdminScope": true,
      "AllowAnonymousLocalhost": false,
      "SnapshotIntervalMilliseconds": 1000,
      "RecentFaultLimit": 100,
      "RecentSessionLimit": 200,
      "ShowTagValues": false
    }
  }
}
```

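A strongly typed options class for this section might look like the sketch below. `DashboardOptions`, its `SectionName` constant, and the `Validate` method are hypothetical. The property names mirror the JSON keys, and the defaults follow the values stated in this document (`Enabled` is false here because the dashboard is disabled unless configured).

```csharp
using System;

// Hypothetical options type mirroring the "MxGateway:Dashboard" section.
public sealed class DashboardOptions
{
    public const string SectionName = "MxGateway:Dashboard";

    public bool Enabled { get; set; }                  // off unless configured
    public string PathBase { get; set; } = "/dashboard";
    public bool RequireAdminScope { get; set; } = true;
    public bool AllowAnonymousLocalhost { get; set; }  // must default to false
    public int SnapshotIntervalMilliseconds { get; set; } = 1000;
    public int RecentFaultLimit { get; set; } = 100;
    public int RecentSessionLimit { get; set; } = 200;
    public bool ShowTagValues { get; set; }            // off by default

    public TimeSpan SnapshotInterval =>
        TimeSpan.FromMilliseconds(SnapshotIntervalMilliseconds);

    // Fails fast at startup instead of misbehaving at runtime.
    public void Validate()
    {
        if (string.IsNullOrEmpty(PathBase) || PathBase[0] != '/')
        {
            throw new InvalidOperationException("PathBase must start with '/'.");
        }

        if (SnapshotIntervalMilliseconds <= 0)
        {
            throw new InvalidOperationException(
                "SnapshotIntervalMilliseconds must be positive.");
        }

        if (RecentFaultLimit < 0 || RecentSessionLimit < 0)
        {
            throw new InvalidOperationException("Limits must not be negative.");
        }
    }
}
```

In the server, this type could be bound with the standard options pattern, for example `builder.Services.Configure<DashboardOptions>(builder.Configuration.GetSection(DashboardOptions.SectionName))`.
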
## Security Rules

- Do not display API key secrets.
- Do not display credential-bearing MXAccess command values.
- Do not display full tag values by default.
- Do not expose worker pipe names with nonce or sensitive details.
- Protect dashboard auth cookies with `HttpOnly`, `Secure`, and `SameSite`.
- Require TLS for remote dashboard access.
- Use anti-forgery protection for login/logout and any future admin actions.

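The display rules above are easiest to enforce if redaction happens while building the DTOs, before anything reaches a Razor component. The helper below is a sketch under that assumption. `DashboardRedaction` and the list of sensitive field names are hypothetical and would need to match the real command and configuration fields.

```csharp
using System;
using System.Collections.Generic;

public static class DashboardRedaction
{
    public const string Placeholder = "[redacted]";

    // Assumption: illustrative field names only, matched case-insensitively.
    private static readonly HashSet<string> SensitiveNames =
        new(StringComparer.OrdinalIgnoreCase)
        {
            "password", "apiKey", "secret", "pepper",
        };

    // Returns a copy in which the values of sensitive fields are replaced by
    // a fixed placeholder. The input dictionary is not modified.
    public static IReadOnlyDictionary<string, string> RedactFields(
        IReadOnlyDictionary<string, string> fields)
    {
        var result = new Dictionary<string, string>(fields.Count, StringComparer.Ordinal);

        foreach (var (name, value) in fields)
        {
            result[name] = SensitiveNames.Contains(name) ? Placeholder : value;
        }

        return result;
    }
}
```

A fixed placeholder avoids leaking the length or any prefix of the secret, which partial masking would still reveal.
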
## Styling

Use Bootstrap utility classes and a small local stylesheet.

Recommended visual language:

- compact tables,
- status badges,
- metric cards,
- Bootstrap alerts for faults,
- restrained colors,
- no decorative hero sections,
- no charting dependency for v1.

If charts are added later, prefer simple server-generated data tables first. Do
not add a JavaScript charting dependency without a specific need.

## Testing

Dashboard unit/component tests should cover:

- snapshot projection,
- dashboard authorization decisions,
- login API-key validation behavior,
- pages render with empty state,
- pages render with active sessions,
- pages render with faulted sessions,
- realtime subscription disposal,
- redaction of API keys and credential values.

Use bUnit if component testing is added. Otherwise keep the first tests focused
on snapshot services and authorization logic.

Integration tests should verify:

- dashboard disabled returns not found or configured fallback,
- dashboard requires auth when enabled,
- admin-scoped key can access dashboard,
- non-admin key is denied,
- live snapshot updates when a fake session changes state.

## Initial Implementation Slice

The first dashboard slice should implement:

1. Blazor Server hosting in `MxGateway.Server`.
2. Bootstrap static assets.
3. Dashboard configuration binding.
4. Dashboard auth using API key login and HTTP-only cookie.
5. Read-only `DashboardSnapshotService`.
6. Home page with metric cards.
7. Sessions page with active session table.
8. Workers page with worker table.
9. 1-second realtime refresh through Blazor Server.
10. Redaction tests for secrets.