Phase 3 PR 34 — Host-status publisher (Server) + /hosts drill-down page (Admin). Closes LMX follow-up #7 by wiring together the data layer from PR 33. Server.HostStatusPublisher is a BackgroundService that walks every driver registered in DriverHost every 10 seconds, skips drivers that don't implement IHostConnectivityProbe, calls GetHostStatuses() on each probe-capable driver, and upserts one DriverHostStatus row per (NodeId, DriverInstanceId, HostName) into the central config DB. Upsert path: SingleOrDefaultAsync on the composite PK; if no row exists, Add a new one; if a row exists, LastSeenUtc advances unconditionally (heartbeat) and State + StateChangedUtc update only on transitions so Admin UI can distinguish 'still reporting, still Running' from 'freshly transitioned to Running'. MapState translates Core.Abstractions.HostState to Configuration.Enums.DriverHostState (intentional duplicate enum — Configuration project stays free of driver-runtime deps per PR 33's choice). If a driver's GetHostStatuses throws, log warning and skip that driver this tick — never take down the Server on a publisher failure. If the DB is unreachable, log warning + retry next heartbeat (no buffering — next tick's current-state snapshot is more useful than replaying stale transitions after a long outage). 2-second startup delay so NodeBootstrap's RegisterAsync calls land before the first publish tick, then tick runs immediately so a freshly-started Server surfaces its host topology in the Admin UI without waiting a full interval.
Polling chosen over event-driven for initial scope: simpler, matches Admin UI consumer cadence, avoids DriverHost lifecycle-event plumbing that doesn't exist today. Event-driven push for sub-heartbeat latency is a straightforward follow-up. Admin.Services.HostStatusService left-joins DriverHostStatus against ClusterNode on NodeId so rows persist even when the ClusterNode entry doesn't exist yet (first-boot bootstrap case). StaleThreshold = 30s — covers one missed publisher heartbeat plus a generous buffer for clock skew and GC pauses. Admin Components/Pages/Hosts.razor — FleetAdmin-visible page grouped by cluster (handles the '(unassigned)' case for rows without a matching ClusterNode). Four summary cards (Hosts / Running / Stale / Faulted); per-cluster table with Node / Driver / Host / State + Stale-badge / Last-transition / Last-seen / Detail columns; 10s auto-refresh via IServiceScopeFactory timer pattern matching FleetStatusPoller + Fleet dashboard (PR 27). Row-class highlighting: Faulted → table-danger, Stale → table-warning, else default. State badge maps DriverHostState enum to bootstrap color classes. Sidebar link added between 'Fleet status' and 'Clusters'. Server csproj adds Microsoft.EntityFrameworkCore.SqlServer 10.0.0 + registers OtOpcUaConfigDbContext in Program.cs scoped via NodeOptions.ConfigDbConnectionString (no Admin-style manual SQL raw — the DbContext is the only access path, keeps migrations owner-of-record). Tests — HostStatusPublisherTests (4 new Integration cases, uses per-run throwaway DB matching the FleetStatusPollerTests pattern): publisher upserts one row per host from each probe-capable driver and skips non-probe drivers; second tick advances LastSeenUtc without creating duplicate rows (upsert pattern verified end-to-end); state change between ticks updates State AND StateChangedUtc (datetime2(3) rounds to millisecond precision so comparison uses 1ms tolerance — documented inline); MapState translates every HostState enum member. Server.Tests Integration: 4 new tests pass. Admin build clean, Admin.Tests Unit still 23 / 0. docs/v2/lmx-followups.md item #7 marked DONE with three explicit deferred items (event-driven push, failure-count column, SignalR fan-out). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
@@ -108,13 +108,30 @@ condition node). Alarm tracking already has its own integration test
|
||||
(`AlarmSubscription*`); the multi-driver alarm case would need a stub
|
||||
`IAlarmSource` that's worth its own focused PR.
|
||||
|
||||
## 7. Host-status per-AppEngine granularity → Admin UI dashboard
|
||||
## 7. Host-status per-AppEngine granularity → Admin UI dashboard — **DONE (PRs 33 + 34)**
|
||||
|
||||
**Status**: PR 13 ships per-platform/per-AppEngine `ScanState` probing; PR 17
|
||||
surfaces the resulting `OnHostStatusChanged` events through OPC UA. Admin
|
||||
UI doesn't render a per-host dashboard yet.
|
||||
**PR 33** landed the data layer: `DriverHostStatus` entity + migration with
|
||||
composite key `(NodeId, DriverInstanceId, HostName)` and two query-supporting
|
||||
indexes (per-cluster drill-down on `NodeId`, stale-row detection on
|
||||
`LastSeenUtc`).
|
||||
|
||||
**To do**:
|
||||
- SignalR hub push of `HostStatusChangedEventArgs` to the Admin UI.
|
||||
- Dashboard page showing each tracked host, current state, last transition
|
||||
time, failure count.
|
||||
**PR 34** wired the publisher + consumer. `HostStatusPublisher` is a
|
||||
`BackgroundService` in the Server process that walks every registered
|
||||
`IHostConnectivityProbe`-capable driver every 10s, calls
|
||||
`GetHostStatuses()`, and upserts rows (`LastSeenUtc` advances each tick;
|
||||
`State` + `StateChangedUtc` update on transitions). Admin UI `/hosts` page
|
||||
groups by cluster, shows four summary cards (Hosts / Running / Stale /
|
||||
Faulted), and flags rows whose `LastSeenUtc` is older than 30s as Stale so
|
||||
operators see crashed Servers without waiting for a state change.
|
||||
|
||||
Deferred as follow-ups:
|
||||
|
||||
- Event-driven push (subscribe to `OnHostStatusChanged` per driver for
|
||||
sub-heartbeat latency). Adds DriverHost lifecycle-event plumbing;
|
||||
10s polling is fine for operator-scale use.
|
||||
- Failure-count column — needs the publisher to track a transition history
|
||||
per host, not just current-state.
|
||||
- SignalR fan-out to the Admin page (currently the page polls the DB, not
|
||||
a hub). The DB-polled version is fine at current cadence but a hub push
|
||||
would eliminate the 10s race where a new row sits in the DB before the
|
||||
Admin page notices.
|
||||
|
||||
Reference in New Issue
Block a user