Add gateway implementation planning docs
@@ -0,0 +1,364 @@
# Gateway Dashboard Detailed Design

## Purpose

The gateway should host a basic web dashboard for operators and developers. For
v1 the dashboard provides diagnostic and operational visibility only. It should
show gateway health, active MXAccess worker instances, session state, and basic
statistics in real time.

## Technology Choice

Decision: Blazor Server with Bootstrap CSS/JS.

Allowed UI stack:

- ASP.NET Core Blazor Server,
- Bootstrap CSS,
- Bootstrap JavaScript,
- small local CSS for layout and status styling,
- built-in Blazor components.

Not allowed for v1:

- MudBlazor,
- Radzen,
- Syncfusion,
- Telerik,
- other Blazor UI component libraries,
- client-side SPA framework replacement.

Rationale: Blazor Server keeps the dashboard in the gateway process, avoids a
separate frontend build, and gives real-time UI updates through the Blazor
SignalR circuit. Bootstrap is sufficient for a basic dashboard.

## Hosting Model

The dashboard is hosted by `MxGateway.Server` alongside the gRPC API.

Suggested endpoint layout:

```text
/dashboard
/dashboard/sessions
/dashboard/sessions/{sessionId}
/dashboard/workers
/dashboard/events
/dashboard/settings
/_blazor
```

The app should redirect `/` to `/dashboard` only if the deployment wants the
dashboard as the default web page. Otherwise, leave gRPC/API hosting unaffected.

## High-Level Components

```text
MxGateway.Server
  Dashboard/
    Components/
      App.razor
      Routes.razor
      Layout/
        DashboardLayout.razor
        NavMenu.razor
      Pages/
        DashboardHome.razor
        SessionsPage.razor
        SessionDetailsPage.razor
        WorkersPage.razor
        EventsPage.razor
        SettingsPage.razor
      Components/
        MetricCard.razor
        SessionTable.razor
        WorkerTable.razor
        EventRatePanel.razor
        FaultList.razor
    Services/
      DashboardSnapshotService.cs
      DashboardUpdateHub.cs
      DashboardAuthorization.cs
    Models/
      DashboardSnapshot.cs
      SessionSummary.cs
      WorkerSummary.cs
      MetricSummary.cs
```

`DashboardUpdateHub` here means an internal application update service, not a
separate public SignalR hub unless implementation proves one is needed. Blazor
Server already uses SignalR for UI circuits.

## Dashboard Data Source

The dashboard should consume read-only snapshots from gateway services:

- `SessionRegistry`,
- `SessionManager`,
- `WorkerClient`,
- `GatewayMetrics`,
- health checks,
- structured fault/event counters.

Do not let Razor components directly mutate gateway session or worker objects.
Create a small read-only dashboard service that projects gateway state into
plain DTOs.

Suggested service:

```csharp
public interface IDashboardSnapshotService
{
    DashboardSnapshot GetSnapshot();

    IAsyncEnumerable<DashboardSnapshot> WatchSnapshotsAsync(
        CancellationToken cancellationToken);
}
```

Snapshot updates can be driven by:

- periodic timer, default every 1 second,
- session lifecycle notifications,
- worker heartbeat updates,
- event counter updates,
- fault notifications.

Use immutable snapshot DTOs so Razor components can render without locking
gateway internals.

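One possible shape for the service is sketched below, assuming a per-subscriber bounded channel of capacity 1 that drops the oldest snapshot, so a slow page only ever sees the newest state. The three-field `DashboardSnapshot` record and the `Publish` method are illustrative assumptions, not part of the design; the real DTO would carry session, worker, and metric summaries.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Threading;
using System.Threading.Channels;

// Hypothetical minimal DTO used only for this sketch.
public sealed record DashboardSnapshot(
    DateTimeOffset CapturedAt, int OpenSessions, int WorkersRunning);

public interface IDashboardSnapshotService
{
    DashboardSnapshot GetSnapshot();

    IAsyncEnumerable<DashboardSnapshot> WatchSnapshotsAsync(
        CancellationToken cancellationToken);
}

public sealed class DashboardSnapshotService : IDashboardSnapshotService
{
    private readonly object _gate = new();
    private readonly List<Channel<DashboardSnapshot>> _subscribers = new();
    private DashboardSnapshot _latest = new(DateTimeOffset.UtcNow, 0, 0);

    public DashboardSnapshot GetSnapshot() => Volatile.Read(ref _latest);

    // Assumed entry point: called by the periodic timer and by
    // lifecycle, heartbeat, and fault notifications.
    public void Publish(DashboardSnapshot snapshot)
    {
        lock (_gate)
        {
            Volatile.Write(ref _latest, snapshot);
            foreach (var channel in _subscribers)
            {
                // DropOldest means TryWrite never blocks the publisher.
                channel.Writer.TryWrite(snapshot);
            }
        }
    }

    public async IAsyncEnumerable<DashboardSnapshot> WatchSnapshotsAsync(
        [EnumeratorCancellation] CancellationToken cancellationToken)
    {
        var channel = Channel.CreateBounded<DashboardSnapshot>(
            new BoundedChannelOptions(1)
            {
                FullMode = BoundedChannelFullMode.DropOldest,
            });

        lock (_gate)
        {
            // Seed with the current state so the page renders immediately.
            channel.Writer.TryWrite(_latest);
            _subscribers.Add(channel);
        }

        try
        {
            await foreach (var snapshot in
                channel.Reader.ReadAllAsync(cancellationToken))
            {
                yield return snapshot;
            }
        }
        finally
        {
            // Runs on cancellation or when the consumer stops enumerating.
            lock (_gate)
            {
                _subscribers.Remove(channel);
            }
        }
    }
}
```

Because the record is immutable, the same instance can be handed to every subscriber without copying or locking on the read side.
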
## Realtime Updates

Use Blazor Server component state updates for real-time dashboard refresh.

Recommended pattern:

1. Page/component subscribes to `WatchSnapshotsAsync`.
2. Snapshot service emits updates from a bounded channel or timer.
3. Component stores the latest snapshot.
4. Component calls `InvokeAsync(StateHasChanged)`.
5. Component cancels subscription on dispose.

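The subscribe, store, notify, and cancel steps can be sketched without any Blazor types, which keeps the logic unit-testable. `SnapshotSubscription` is a hypothetical helper and the one-field `DashboardSnapshot` record is a placeholder; in a real page, `watch` would be the service's `WatchSnapshotsAsync` and `onChanged` would be `() => InvokeAsync(StateHasChanged)`.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Placeholder DTO for illustration only.
public sealed record DashboardSnapshot(int OpenSessions);

// Hypothetical helper a component could create in OnInitialized and
// dispose in its own Dispose method.
public sealed class SnapshotSubscription : IDisposable
{
    private readonly CancellationTokenSource _cts = new();
    private DashboardSnapshot? _latest;

    public SnapshotSubscription(
        Func<CancellationToken, IAsyncEnumerable<DashboardSnapshot>> watch,
        Func<Task> onChanged)
    {
        // Step 1: subscribe. The pump runs until the source completes
        // or the subscription is disposed.
        Completion = PumpAsync(watch, onChanged, _cts.Token);
    }

    // Step 3: the most recent snapshot, or null before the first update.
    public DashboardSnapshot? Latest => Volatile.Read(ref _latest);

    // Completes when the pump has stopped; useful in tests.
    public Task Completion { get; }

    private async Task PumpAsync(
        Func<CancellationToken, IAsyncEnumerable<DashboardSnapshot>> watch,
        Func<Task> onChanged,
        CancellationToken token)
    {
        try
        {
            await foreach (var snapshot in watch(token))
            {
                Volatile.Write(ref _latest, snapshot);
                // Step 4: ask the owner to re-render.
                await onChanged();
            }
        }
        catch (OperationCanceledException)
        {
            // Expected when the owner disposes the subscription.
        }
    }

    // Step 5: cancel the subscription when the component goes away.
    public void Dispose()
    {
        _cts.Cancel();
        _cts.Dispose();
    }
}
```

Routing the re-render through a callback matters because snapshots arrive off the circuit's synchronization context; `InvokeAsync` is what marshals the `StateHasChanged` call back onto it.
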
Default update cadence:

- immediate update on session create/close/fault,
- immediate update on worker fault,
- periodic metrics refresh every 1 second,
- event-rate windows updated every 1 second.

Avoid pushing every MXAccess data-change event to the dashboard. Aggregate event
counts and rates instead.

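A minimal sketch of that aggregation, assuming one counter per session or event family and a caller (for example the 1-second snapshot timer) that closes the window. `EventRateCounter` and its member names are hypothetical.

```csharp
using System;
using System.Threading;

// Hypothetical aggregate counter: the event path only increments two
// longs, and the dashboard reads a rate once per window.
public sealed class EventRateCounter
{
    private long _total;
    private long _window;

    // Called on the hot path for every event; lock-free.
    public void Record()
    {
        Interlocked.Increment(ref _total);
        Interlocked.Increment(ref _window);
    }

    // Total events since start.
    public long Total => Interlocked.Read(ref _total);

    // Returns events per second for the window that just elapsed and
    // atomically starts a new window.
    public double CloseWindow(TimeSpan elapsed)
    {
        long count = Interlocked.Exchange(ref _window, 0);
        return elapsed > TimeSpan.Zero ? count / elapsed.TotalSeconds : 0.0;
    }
}
```

With this shape the dashboard's cost is independent of event volume: a burst of 10,000 data changes still produces one number per second per counter.
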
## Pages

### Dashboard Home

Show top-level status:

- gateway status,
- gateway version,
- uptime,
- open sessions,
- workers running,
- sessions faulted,
- command rate,
- command failure count,
- event rate,
- event queue depth,
- worker restart/kill count.

Use Bootstrap cards for individual metric summaries. Keep the layout compact
and operational.

### Sessions Page

Show active and recent sessions in a table:

- session id,
- client identity or API key display name,
- state,
- backend,
- worker process id,
- open time,
- last client activity,
- last worker heartbeat,
- active event subscribers,
- pending commands,
- event queue depth,
- last fault summary.

Rows should link to session details.

### Session Details Page

Show:

- session metadata,
- worker metadata,
- command counters by method,
- event counters by family,
- active server handles and item counts if gateway shadow state has them,
- latest faults,
- last heartbeat payload,
- close/kill controls only if admin actions are later enabled.

For v1, details should be read-only unless an explicit admin action design is
added.

### Workers Page

Show:

- worker process id,
- session id,
- executable path/version,
- state,
- startup duration,
- memory and CPU if available,
- last heartbeat,
- current command correlation id,
- pending command count,
- event queue depth,
- restart/kill reason if terminal.

### Events Page

Show aggregate event diagnostics:

- event rate by session,
- event rate by event family,
- total events since start,
- queue overflow count,
- stream disconnect count,
- recent terminal faults.

Do not display full tag values by default. If value display is later added, make
it opt-in and redacted.

### Settings Page

Show read-only effective configuration:

- worker executable path,
- configured timeouts,
- queue capacities,
- auth mode,
- SQLite auth database path with sensitive parts redacted if needed,
- dashboard enabled state,
- protocol version.

Do not show API key secrets or pepper values.

## Authentication And Authorization

Dashboard access should use the same API-key authentication model as gRPC where
practical.

Recommended v1 behavior:

- dashboard disabled by default unless configured,
- when enabled, require API key auth,
- require `admin` scope for dashboard access,
- accept API key through a secure cookie established by a simple login form, or
  through reverse-proxy/header configuration for local deployments,
- do not put API keys in query strings.

Simplest implementation path:

1. Add `/dashboard/login`.
2. User submits API key over HTTPS.
3. Gateway validates key and `admin` scope.
4. Gateway issues an HTTP-only secure auth cookie for the dashboard.
5. Dashboard pages require that cookie.
6. Logout clears the cookie.

For local development, allow an explicit `Dashboard:AllowAnonymousLocalhost`
option. It must default to false.

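The access rules above reduce to a small pure function, which is easy to cover with unit tests before any cookie plumbing exists. The sketch below is one possible reading of those rules; `DashboardAccess`, `DashboardAuthOptions`, and `DashboardAccessPolicy` are hypothetical names, and a `null` scope list stands for a request with no validated API key.

```csharp
#nullable enable
using System.Collections.Generic;
using System.Linq;

// Hypothetical outcome of an access check.
public enum DashboardAccess
{
    NotFound,   // dashboard disabled
    Challenge,  // no validated key: send to /dashboard/login
    Forbidden,  // key is valid but lacks the admin scope
    Allowed,
}

// Hypothetical subset of the dashboard configuration.
public sealed record DashboardAuthOptions(
    bool Enabled, bool RequireAdminScope, bool AllowAnonymousLocalhost);

public static class DashboardAccessPolicy
{
    public const string AdminScope = "admin";

    public static DashboardAccess Evaluate(
        DashboardAuthOptions options,
        bool isLoopbackRequest,
        IReadOnlyCollection<string>? keyScopes)
    {
        if (!options.Enabled)
        {
            return DashboardAccess.NotFound;
        }

        // Development escape hatch; only honored for loopback requests.
        if (options.AllowAnonymousLocalhost && isLoopbackRequest)
        {
            return DashboardAccess.Allowed;
        }

        if (keyScopes is null)
        {
            return DashboardAccess.Challenge;
        }

        if (options.RequireAdminScope && !keyScopes.Contains(AdminScope))
        {
            return DashboardAccess.Forbidden;
        }

        return DashboardAccess.Allowed;
    }
}
```

Keeping the decision separate from the HTTP layer means the same function can back the login handler, the page authorization check, and the integration tests.
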
## Configuration

Suggested configuration:

```json
{
  "MxGateway": {
    "Dashboard": {
      "Enabled": true,
      "PathBase": "/dashboard",
      "RequireAdminScope": true,
      "AllowAnonymousLocalhost": false,
      "SnapshotIntervalMilliseconds": 1000,
      "RecentFaultLimit": 100,
      "RecentSessionLimit": 200,
      "ShowTagValues": false
    }
  }
}
```

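A matching options class might look like the sketch below. The class name, `SectionName` constant, and `Validate` method are assumptions for illustration; the property names mirror the JSON keys so the standard configuration binder can populate them. `Enabled` defaults to false here, on the assumption that the dashboard stays off unless configured, and the other defaults are taken from the suggested values.

```csharp
using System;

// Hypothetical options type bound from the "MxGateway:Dashboard" section.
public sealed class DashboardOptions
{
    public const string SectionName = "MxGateway:Dashboard";

    // Off unless the deployment opts in.
    public bool Enabled { get; set; }

    public string PathBase { get; set; } = "/dashboard";

    public bool RequireAdminScope { get; set; } = true;

    // Development-only escape hatch; must stay false by default.
    public bool AllowAnonymousLocalhost { get; set; }

    public int SnapshotIntervalMilliseconds { get; set; } = 1000;

    public int RecentFaultLimit { get; set; } = 100;

    public int RecentSessionLimit { get; set; } = 200;

    public bool ShowTagValues { get; set; }

    public TimeSpan SnapshotInterval =>
        TimeSpan.FromMilliseconds(SnapshotIntervalMilliseconds);

    // Fails fast at startup on values that would break the dashboard.
    public void Validate()
    {
        if (string.IsNullOrEmpty(PathBase) || !PathBase.StartsWith('/'))
        {
            throw new InvalidOperationException(
                "Dashboard PathBase must start with '/'.");
        }

        if (SnapshotIntervalMilliseconds <= 0)
        {
            throw new InvalidOperationException(
                "Dashboard SnapshotIntervalMilliseconds must be positive.");
        }

        if (RecentFaultLimit < 0 || RecentSessionLimit < 0)
        {
            throw new InvalidOperationException(
                "Dashboard recent-item limits must not be negative.");
        }
    }
}
```
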
## Security Rules

- Do not display API key secrets.
- Do not display credential-bearing MXAccess command values.
- Do not display full tag values by default.
- Do not expose worker pipe names with nonce or sensitive details.
- Protect dashboard auth cookies with `HttpOnly`, `Secure`, and `SameSite`.
- Require TLS for remote dashboard access.
- Use anti-forgery protection for login/logout and any future admin actions.

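One way to enforce the display rules is to redact at projection time, before values ever reach a DTO. The sketch below is deliberately conservative and name-based; `SettingsRedactor` and its marker list are assumptions, and a real implementation would more likely use an explicit allow-list of displayable settings.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: masks any setting whose name suggests a secret.
public static class SettingsRedactor
{
    public const string Mask = "[redacted]";

    // Assumed markers; over-redaction is preferred to leaking a value.
    private static readonly string[] SensitiveMarkers =
    {
        "Key", "Secret", "Pepper", "Password", "Nonce",
    };

    public static bool IsSensitive(string settingName) =>
        SensitiveMarkers.Any(marker =>
            settingName.Contains(marker, StringComparison.OrdinalIgnoreCase));

    // Returns a copy of the settings that is safe to show on a page.
    public static IReadOnlyDictionary<string, string> Redact(
        IReadOnlyDictionary<string, string> settings) =>
        settings.ToDictionary(
            pair => pair.Key,
            pair => IsSensitive(pair.Key) ? Mask : pair.Value);
}
```

Because the check is by name, a harmless setting such as a key's display name would also be masked; that trade-off is intentional in this sketch.
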
## Styling

Use Bootstrap utility classes and a small local stylesheet.

Recommended visual language:

- compact tables,
- status badges,
- metric cards,
- Bootstrap alerts for faults,
- restrained colors,
- no decorative hero sections,
- no charting dependency for v1.

If charts are added later, prefer simple server-generated data tables first. Do
not add a JavaScript charting dependency without a specific need.

## Testing

Dashboard unit/component tests should cover:

- snapshot projection,
- dashboard authorization decisions,
- login API-key validation behavior,
- pages render with empty state,
- pages render with active sessions,
- pages render with faulted sessions,
- realtime subscription disposal,
- redaction of API keys and credential values.

Use bUnit if component testing is added. Otherwise keep the first tests focused
on snapshot services and authorization logic.

Integration tests should verify:

- dashboard disabled returns not found or configured fallback,
- dashboard requires auth when enabled,
- admin-scoped key can access dashboard,
- non-admin key is denied,
- live snapshot updates when a fake session changes state.

## Initial Implementation Slice

The first dashboard slice should implement:

1. Blazor Server hosting in `MxGateway.Server`.
2. Bootstrap static assets.
3. Dashboard configuration binding.
4. Dashboard auth using API key login and HTTP-only cookie.
5. Read-only `DashboardSnapshotService`.
6. Home page with metric cards.
7. Sessions page with active session table.
8. Workers page with worker table.
9. 1-second realtime refresh through Blazor Server.
10. Redaction tests for secrets.