fix(health-monitoring): resolve HealthMonitoring-001/002 — populate S&F buffer depth, make SiteHealthState immutable
@@ -8,7 +8,7 @@
 | Last reviewed | 2026-05-16 |
 | Reviewer | claude-agent |
 | Commit reviewed | `9c60592` |
-| Open findings | 12 |
+| Open findings | 10 |
 
 ## Summary
 
@@ -55,7 +55,7 @@ design-adherence gap.
 |--|--|
 | Severity | High |
 | Category | Design-document adherence |
-| Status | Open |
+| Status | Resolved |
 | Location | `src/ScadaLink.HealthMonitoring/SiteHealthCollector.cs:104`, `src/ScadaLink.HealthMonitoring/HealthReportSender.cs:79` |
 
 **Description**
@@ -79,7 +79,17 @@ the dead setter. Update the placeholder test accordingly once implemented.
 
 **Resolution**
 
-_Unresolved._
+Resolved 2026-05-16 (commit `<pending>`). `HealthReportSender.ExecuteAsync` now
+queries the existing public `StoreAndForwardStorage.GetBufferDepthByCategoryAsync()`
+API alongside the parked-count call and feeds the per-category depths into
+`SiteHealthCollector.SetStoreAndForwardDepths` (category enum names as keys), so the
+documented store-and-forward buffer depth metric is populated in every emitted
+report. Regression test `HealthReportSenderTests.ReportsIncludeStoreAndForwardBufferDepthsFromStorage`
+verifies populated per-category depths. The obsolete placeholder test
+`SiteHealthCollectorTests.StoreAndForwardBufferDepths_IsEmptyPlaceholder` continues
+to pass: it only exercises the collector with no setter call and still correctly
+asserts the empty default; it was left in place as the collector-level default-state
+test. No StoreAndForward source was modified (existing public API only).
 
 ### HealthMonitoring-002 — `SiteHealthState` mutable fields written from multiple threads without synchronization
 
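Reviewer note on the HealthMonitoring-001 hunk above: the resolution describes querying per-category buffer depths and handing them to the collector keyed by enum name. The following is a minimal sketch of that data flow only, not the committed code. `BufferCategory`, `IBufferDepthSource`, `InMemoryDepthSource`, `Collector`, and `ReportStep` are hypothetical stand-ins, and the signatures given to `GetBufferDepthByCategoryAsync` and `SetStoreAndForwardDepths` are assumptions, since the diff does not show them.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical category enum; the real members are not shown in the review.
public enum BufferCategory { Alarms, Telemetry }

// Assumed shape of the storage query named in the resolution.
public interface IBufferDepthSource
{
    Task<IReadOnlyDictionary<BufferCategory, int>> GetBufferDepthByCategoryAsync();
}

// In-memory stand-in so the sketch can run without a real store.
public sealed class InMemoryDepthSource : IBufferDepthSource
{
    private readonly Dictionary<BufferCategory, int> _depths;

    public InMemoryDepthSource(Dictionary<BufferCategory, int> depths) => _depths = depths;

    public Task<IReadOnlyDictionary<BufferCategory, int>> GetBufferDepthByCategoryAsync() =>
        Task.FromResult<IReadOnlyDictionary<BufferCategory, int>>(_depths);
}

// Simplified stand-in for the collector: empty by default, replaced wholesale by the setter.
public sealed class Collector
{
    public IReadOnlyDictionary<string, int> StoreAndForwardBufferDepths { get; private set; } =
        new Dictionary<string, int>();

    public void SetStoreAndForwardDepths(IReadOnlyDictionary<string, int> depths) =>
        StoreAndForwardBufferDepths = depths;
}

public static class ReportStep
{
    // Query depths, re-key them by enum name, and publish them to the collector.
    public static async Task RefreshDepthsAsync(IBufferDepthSource storage, Collector collector)
    {
        var byCategory = await storage.GetBufferDepthByCategoryAsync();
        var byName = byCategory.ToDictionary(kv => kv.Key.ToString(), kv => kv.Value);
        collector.SetStoreAndForwardDepths(byName);
    }
}
```

A collector that never receives a setter call still reports an empty dictionary, which is consistent with the placeholder test continuing to pass.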
@@ -87,7 +97,7 @@ _Unresolved._
 |--|--|
 | Severity | High |
 | Category | Concurrency & thread safety |
-| Status | Open |
+| Status | Resolved |
 | Location | `src/ScadaLink.HealthMonitoring/SiteHealthState.cs:11`, `src/ScadaLink.HealthMonitoring/CentralHealthAggregator.cs:86`, `src/ScadaLink.HealthMonitoring/CentralHealthAggregator.cs:137` |
 
 **Description**
@@ -112,7 +122,22 @@ a single atomic reference swap.
 
 **Resolution**
 
-_Unresolved._
+Resolved 2026-05-16 (commit `<pending>`). `SiteHealthState` is now a `sealed record`
+with `init`-only properties. `CentralHealthAggregator.ProcessReport`,
+`MarkHeartbeat`, and `CheckForOfflineSites` were rewritten to perform every state
+transition as an atomic compare-and-swap (`TryAdd`/`TryUpdate`) producing a new
+record instance; no field of a stored state is ever mutated in place. `ProcessReport`
+uses an explicit CAS retry loop instead of the `AddOrUpdate` update delegate so the
+sequence-number guard and the field writes are evaluated against the value actually
+installed (this also closes the root cause behind HealthMonitoring-003). Reads via
+`GetAllSiteStates`/`GetSiteState` now hand out immutable snapshots, so a concurrent
+reader can never observe a torn or half-applied state. `LatestReport` was changed
+from `SiteHealthReport` (`null!`) to `SiteHealthReport?`, making the contract honest;
+all existing consumers (CentralUI, integration/perf tests) already null-checked it
+and continue to build clean. Regression test
+`CentralHealthAggregatorTests.ProcessReport_ConcurrentUpdates_NeverLoseSequenceOrTearState`
+exercises concurrent report/heartbeat/read threads and asserts snapshot consistency
+and no lost updates.
 
 ### HealthMonitoring-003 — Shared state mutated inside `ConcurrentDictionary.AddOrUpdate` update delegate
 
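Reviewer note on the HealthMonitoring-002 hunk above: the resolution describes an explicit compare-and-swap retry loop over immutable records. The sketch below illustrates that general pattern under stated assumptions; it is not the committed `CentralHealthAggregator`. `SiteState` and `Aggregator` are hypothetical, simplified types, and the property names are invented for illustration.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Hypothetical, simplified stand-in for the real SiteHealthState record.
public sealed record SiteState
{
    public long LastSequence { get; init; }
    public DateTimeOffset LastSeen { get; init; }
    public bool IsOnline { get; init; }
}

public sealed class Aggregator
{
    private readonly ConcurrentDictionary<string, SiteState> _sites = new();

    // Returns true if the report was applied, false if it was stale.
    public bool ProcessReport(string siteId, long sequence, DateTimeOffset now)
    {
        while (true)
        {
            if (!_sites.TryGetValue(siteId, out var current))
            {
                var created = new SiteState { LastSequence = sequence, LastSeen = now, IsOnline = true };
                if (_sites.TryAdd(siteId, created)) return true;
                continue; // Another writer added the site first; re-read and retry.
            }

            // The guard is checked against the snapshot that TryUpdate will compare below.
            if (sequence <= current.LastSequence) return false;

            // Build a new instance; the stored one is never mutated in place.
            var updated = current with { LastSequence = sequence, LastSeen = now, IsOnline = true };

            // TryUpdate swaps only if the stored value still equals `current`.
            // It compares with the default equality comparer, which for records is value equality.
            if (_sites.TryUpdate(siteId, updated, current)) return true;
            // Lost the race to another writer; loop and re-evaluate the guard.
        }
    }

    // Readers receive an immutable snapshot, or null for an unknown site.
    public SiteState? GetSiteState(string siteId) =>
        _sites.TryGetValue(siteId, out var state) ? state : null;
}
```

Unlike an `AddOrUpdate` update delegate, which may run more than once and whose result may be discarded, this loop decides staleness against the exact value it then attempts to replace, so a concurrent writer cannot slip in between the check and the write.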