Doc refresh (task #205) — requirements updated for multi-driver OtOpcUa three-process deploy

Per-file summary:

- docs/reqs/OpcUaServerReqs.md — rewritten driver-agnostic. OPC-001..OPC-013 re-scoped to multi-driver address-space composition + capability dispatch; OPC-014 AuthorizationGate + permission trie; OPC-015 dynamic ServiceLevel via RedundancyCoordinator; OPC-017 surgical generation-apply rebuild; OPC-012 capability dispatch via CapabilityInvoker (decision #143 idempotence-aware retry); OPC-013 per-host Polly isolation (decision #144); OPC-019 OpenTelemetry metrics. Transport-security profile matrix (OPC-010) + UserName/LDAP (OPC-011) preserved.

- docs/reqs/GalaxyRepositoryReqs.md — scope clarified as Galaxy-driver-only (not platform). GR-001..GR-004 tied to ITagDiscovery.DiscoverAsync + IRediscoverable; all SQL runs inside OtOpcUa.Galaxy.Host and streams to Proxy via named pipe. GR-008 capability wrapping via CapabilityInvoker added. Cross-links to docs/v2/driver-specs.md + docs/GalaxyRepository.md.

- docs/reqs/MxAccessClientReqs.md — scope clarified as Galaxy-Host-only. MXA-001..MXA-009 preserved (STA pump, register/unregister, subscription refcount, auto-reconnect, probe, COM cleanup, operation metrics, error translation). MXA-010 Proxy-side capability wrapping + MXA-011 pipe ACL + per-process shared secret (OTOPCUA_ALLOWED_SID / OTOPCUA_GALAXY_SECRET) added.

- docs/reqs/ServiceHostReqs.md — rewritten for three-process deployment. Shared section (SVC-SHARED-001/002) for Serilog + bootstrap-only appsettings. SRV-* for OtOpcUa.Server (net10 x64, Microsoft.Extensions.Hosting + AddWindowsService, in-process driver hosting, redundancy-node bootstrap). ADM-* for OtOpcUa.Admin (Blazor Server, cookie+LDAP auth, CanEdit/CanPublish policies, sole DB writer, Prometheus /metrics, audit logging). GHX-* for OtOpcUa.Galaxy.Host (Topshelf, net48 x86, named-pipe IPC bootstrap, STA backend lifecycle, crash handling tied to supervisor).

- docs/reqs/ClientRequirements.md — restructured as numbered, verifiable requirements. SHR-* for Client.Shared (single IOpcUaClientService, ConnectionSettings, failover, cross-platform certs, type-coercing write, UI-thread neutrality). CLI-001..CLI-011 cover connect/read/write/browse/subscribe/historyread/alarms/redundancy. UI-001..UI-008 cover connection panel, tree browser, each tab, connection-state reflection, cross-platform build. Reference design content (IOpcUaClientService shape, models, view-model map, mock layout) preserved.

- docs/reqs/StatusDashboardReqs.md — retired cleanly. Replaced with a pointer to docs/v2/admin-ui.md + HLR-015 / HLR-016 / HLR-017 / ADM-*. Mapping table shows each retired DASH-001..DASH-009 requirement's replacement (live cluster-node view via SignalR, Prometheus metrics, driver-instance detail views, etc.). Note that a formal AdminUiReqs.md can be written later if needed for cert compliance.

HighLevelReqs.md was already at the target shape (HLR-001..HLR-018 with Revision header noting retired HLR-009) as of commit f217636; verified identical and no additional edit required.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Joseph Doherty
2026-04-20 01:31:58 -04:00
parent f217636467
commit 48970af416
6 changed files with 739 additions and 644 deletions

@@ -1,157 +1,29 @@
-# Status Dashboard — Component Requirements
+# Status Dashboard — Retired
-Parent: [HLR-009](HighLevelReqs.md#hlr-009-status-dashboard)
+> **Revision** — Retired 2026-04-19 (task #205). The embedded HTTP Status Dashboard hosted inside the v1 LmxOpcUa service (`Dashboard:Port 8081`) has been **superseded by the Admin UI** introduced in OtOpcUa v2. The requirements formerly numbered DASH-001 through DASH-009 no longer apply.
-Reference: LmxProxy Status Dashboard (see `dashboard.JPG` in project root).
+## What replaces it
-## DASH-001: Embedded HTTP Endpoint
+Operator surface is now the **OtOpcUa Admin** Blazor Server web app:
-The service shall host a lightweight HTTP listener on a configurable port serving a self-contained HTML status dashboard page (no external dependencies).
+- Canonical design doc: `docs/v2/admin-ui.md`
+- High-level operator surface requirement: [HLR-015](HighLevelReqs.md#hlr-015-admin-ui-operator-surface)
+- Service-host requirements for the Admin process: [ServiceHostReqs.md → ADM-*](ServiceHostReqs.md#otopcua-admin---service-host-requirements-adm-)
+- Cross-cluster metrics endpoint: `/metrics` on the Admin app — see [HLR-017](HighLevelReqs.md#hlr-017-prometheus-metrics).
+- Audit log: see [HLR-016](HighLevelReqs.md#hlr-016-audit-logging) and `AuditLogService`.
-### Acceptance Criteria
+## Mapping from retired DASH-* requirements to today's surface
-- Uses `System.Net.HttpListener` on a configurable port (`Dashboard:Port`, default 8081).
-- Routes:
-  - `GET /` → HTML dashboard
-  - `GET /api/status` → JSON status report
-  - `GET /api/health` → 200 OK if healthy, 503 if unhealthy
-- Only GET requests accepted; other methods return 405.
-- Unknown paths return 404.
-- All responses include `Cache-Control: no-cache, no-store, must-revalidate` headers.
-- Dashboard can be disabled via config (`Dashboard:Enabled`, default true).
+| Retired requirement | Replacement |
+|---------------------|-------------|
+| DASH-001 Embedded HTTP listener | Admin UI (Blazor Server) hosted in the `OtOpcUa.Admin` process. |
+| DASH-002 Connection panel | Admin UI cluster-node view (live via SignalR) shows per-driver connection state. |
+| DASH-003 Health panel | Admin UI renders `DriverHealth` + Polly circuit state per driver instance; cluster-level rollup on the cluster dashboard. |
+| DASH-004 Subscriptions panel | Prometheus gauges (session count, monitored-item count, driver-subscription count) exposed via `/metrics`. |
+| DASH-005 Operations table | Capability-call duration histograms + counts exposed via `/metrics`; Admin UI renders latency summaries per `DriverInstanceId`. |
+| DASH-006 Footer (last-updated + version) | Admin UI footer; version stamped from the assembly version of the Admin app. |
+| DASH-007 Auto-refresh | Admin UI uses SignalR push for live updates — no meta-refresh. |
+| DASH-008 JSON status API | Prometheus `/metrics` endpoint is the programmatic surface. |
+| DASH-009 Galaxy info panel | Admin UI Galaxy-driver-instance detail view (driver config, last discovery time, Galaxy DB connection state, MXAccess pipe health). |
-### Details
-- HTTP prefix: `http://+:{port}/` to bind to all interfaces.
-- If HttpListener fails to start (port conflict, missing URL reservation), log Error and continue service startup without the dashboard.
-- HTML page is self-contained: inline CSS, no external resources (no CDN, no JavaScript frameworks).
----
-## DASH-002: Connection Panel
-The dashboard shall display a Connection panel showing MXAccess connection state.
-### Acceptance Criteria
-- Shows: **Connected** (True/False), **State** (Connected/Disconnected/Reconnecting/Error), **Connected Since** (UTC timestamp).
-- Green left border when Connected, red when Disconnected/Error, yellow when Reconnecting.
-- "Connected Since" shows "N/A" when not connected.
-- Data sourced from MXAccess client's connection state properties.
-### Details
-- Timestamp format: `yyyy-MM-dd HH:mm:ss UTC`.
-- Panel title: "Connection".
----
-## DASH-003: Health Panel
-The dashboard shall display a Health panel showing overall service health.
-### Acceptance Criteria
-- Three states: **Healthy** (green text), **Degraded** (yellow text), **Unhealthy** (red text).
-- Includes a health message string explaining the status.
-- Health rules:
-  - Not connected to MXAccess → Unhealthy
-  - Success rate < 50% with > 100 total operations → Degraded
-  - Connected with acceptable success rate → Healthy
-### Details
-- Health message examples: "LmxOpcUa is healthy", "MXAccess client is not connected", "Average success rate is below 50%".
-- Green left border for Healthy, yellow for Degraded, red for Unhealthy.
----
-## DASH-004: Subscriptions Panel
-The dashboard shall display a Subscriptions panel showing subscription statistics.
-### Acceptance Criteria
-- Shows: **Clients** (connected OPC UA client count), **Tags** (total variable nodes in address space), **Active** (active MXAccess subscriptions), **Delivered** (cumulative data change notifications delivered).
-- Values update on each dashboard refresh.
-- Zero values shown as "0", not blank.
-### Details
-- "Tags" is the count of variable nodes, not object/folder nodes.
-- "Active" is the count of distinct MXAccess item subscriptions (after ref-counting — the number of actual AdviseSupervisory calls, not the number of OPC UA monitored items).
-- "Delivered" is a running counter since service start (not reset on reconnect).
----
-## DASH-005: Operations Table
-The dashboard shall display an operations metrics table showing performance statistics.
-### Acceptance Criteria
-- Table with columns: **Operation**, **Count**, **Success Rate**, **Avg (ms)**, **Min (ms)**, **Max (ms)**, **P95 (ms)**.
-- Rows: Read, Write, Subscribe, Browse.
-- Empty cells show em-dash ("—") when no data available (count = 0).
-- Success rate displayed as percentage (e.g., "99.8%").
-- Latency values rounded to 1 decimal place.
-### Details
-- Metrics sourced from the PerformanceMetrics component (1000-entry rolling buffer for percentile calculation).
-- "Browse" row tracks OPC UA browse operations.
-- "Subscribe" row tracks OPC UA CreateMonitoredItems operations.
----
-## DASH-006: Footer
-The dashboard shall display a footer with last-updated time and service identification.
-### Acceptance Criteria
-- Format: "Last updated: {timestamp} UTC | Service: ZB.MOM.WW.OtOpcUa.Host v{version}".
-- Timestamp is the server-side UTC time when the HTML was generated.
-- Version is read from the assembly version (`Assembly.GetExecutingAssembly().GetName().Version`).
----
-## DASH-007: Auto-Refresh
-The dashboard page shall auto-refresh to show current status without manual reload.
-### Acceptance Criteria
-- HTML page includes `<meta http-equiv="refresh" content="10">` for 10-second auto-refresh.
-- No JavaScript required for refresh (pure HTML meta-refresh).
-- Refresh interval: configurable via `Dashboard:RefreshIntervalSeconds`, default 10 seconds.
----
-## DASH-008: JSON Status API
-The `/api/status` endpoint shall return a JSON object with all dashboard data for programmatic consumption.
-### Acceptance Criteria
-- Response Content-Type: `application/json`.
-- JSON structure includes: connection state, health status, subscription statistics, and operation metrics.
-- Same data as the HTML dashboard, structured for machine consumption.
-- Suitable for integration with external monitoring tools.
----
-## DASH-009: Galaxy Info Panel
-The dashboard shall display a Galaxy Info panel showing Galaxy Repository state.
-### Acceptance Criteria
-- Shows: **Galaxy Name** (e.g., ZB), **DB Status** (Connected/Disconnected), **Last Deploy** (timestamp from `galaxy.time_of_last_deploy`), **Objects** (count), **Attributes** (count), **Last Rebuild** (timestamp of last address space rebuild).
-- Provides visibility into the Galaxy Repository component's state independently of MXAccess connection status.
-### Details
-- "DB Status" reflects whether the most recent change detection poll succeeded.
-- "Last Deploy" shows the raw `time_of_last_deploy` value from the Galaxy database.
-- "Objects" and "Attributes" show counts from the most recent successful hierarchy/attribute query.
+A formal requirements-level doc for the Admin UI (AdminUiReqs.md) is not yet written — the design doc at `docs/v2/admin-ui.md` serves as the authoritative reference until formal cert-compliance requirements are needed.