6 Commits

66 changed files with 3267 additions and 366 deletions
+56 -13
View File
@@ -8,7 +8,7 @@
| Last reviewed | 2026-05-16 | | Last reviewed | 2026-05-16 |
| Reviewer | claude-agent | | Reviewer | claude-agent |
| Commit reviewed | `9c60592` | | Commit reviewed | `9c60592` |
| Open findings | 12 | | Open findings | 6 |
## Summary ## Summary
@@ -92,7 +92,7 @@ now call `ResolveFormat`. Regression tests added in `FormatResolutionTests`.
|--|--| |--|--|
| Severity | Medium | | Severity | Medium |
| Category | Correctness & logic bugs | | Category | Correctness & logic bugs |
| Status | Open | | Status | Resolved |
| Location | `src/ScadaLink.CLI/Commands/CommandHelpers.cs:59-68`, `src/ScadaLink.CLI/Commands/CommandHelpers.cs:78-80` | | Location | `src/ScadaLink.CLI/Commands/CommandHelpers.cs:59-68`, `src/ScadaLink.CLI/Commands/CommandHelpers.cs:78-80` |
**Description** **Description**
@@ -112,7 +112,13 @@ output" case (print nothing or `(ok)`), and return 0 before attempting to parse.
**Resolution** **Resolution**
_Unresolved._ Resolved 2026-05-16 (commit pending). Root cause confirmed against source —
`HandleResponse` tested `JsonData != null`, so an empty success body fell through to
`WriteAsTable``JsonDocument.Parse("")` and threw an uncaught `JsonException`.
`HandleResponse` now treats a null-or-whitespace `JsonData` as a "succeeded, no output"
case, prints `(ok)`, and returns 0 before any parse. Regression tests added in
`ResponseRenderingTests` (`HandleResponse_EmptyBody_TableFormat_DoesNotThrow_ReturnsZero`,
`HandleResponse_EmptyBody_JsonFormat_DoesNotThrow_ReturnsZero`).
### CLI-003 — Non-JSON success body crashes table rendering ### CLI-003 — Non-JSON success body crashes table rendering
@@ -120,7 +126,7 @@ _Unresolved._
|--|--| |--|--|
| Severity | Medium | | Severity | Medium |
| Category | Error handling & resilience | | Category | Error handling & resilience |
| Status | Open | | Status | Resolved |
| Location | `src/ScadaLink.CLI/Commands/CommandHelpers.cs:80` | | Location | `src/ScadaLink.CLI/Commands/CommandHelpers.cs:80` |
**Description** **Description**
@@ -139,7 +145,11 @@ printing the raw body verbatim (as the JSON path already does at line 66).
**Resolution** **Resolution**
_Unresolved._ Resolved 2026-05-16 (commit pending). Root cause confirmed — `WriteAsTable` parsed the
body with no `try/catch`. The `JsonDocument.Parse` call is now wrapped in a
`try/catch (JsonException)` that prints the raw body verbatim on failure, mirroring the
raw-body fallback on the JSON path. Regression test
`ResponseRenderingTests.HandleResponse_NonJsonBody_TableFormat_FallsBackToRaw_ReturnsZero`.
### CLI-004 — Malformed `--url` throws an unhandled `UriFormatException` ### CLI-004 — Malformed `--url` throws an unhandled `UriFormatException`
@@ -147,7 +157,7 @@ _Unresolved._
|--|--| |--|--|
| Severity | Medium | | Severity | Medium |
| Category | Error handling & resilience | | Category | Error handling & resilience |
| Status | Open | | Status | Resolved |
| Location | `src/ScadaLink.CLI/ManagementHttpClient.cs:13` | | Location | `src/ScadaLink.CLI/ManagementHttpClient.cs:13` |
**Description** **Description**
@@ -166,7 +176,12 @@ clean `INVALID_URL` error with exit code 1 on failure.
**Resolution** **Resolution**
_Unresolved._ Resolved 2026-05-16 (commit pending). Root cause confirmed — the
`ManagementHttpClient` constructor's `new Uri(...)` ran outside the `SendCommandAsync`
`try/catch`. Added `CommandHelpers.IsValidManagementUrl`, which checks for an absolute
http/https URL via `Uri.TryCreate`. Both `CommandHelpers.ExecuteCommandAsync` and
`DebugCommands.BuildStream` now validate the resolved URL up front and emit a clean
`INVALID_URL` error with exit code 1. Regression tests in `UrlValidationTests`.
### CLI-005 — Malformed `--bindings` / `--overrides` JSON throws unhandled exceptions ### CLI-005 — Malformed `--bindings` / `--overrides` JSON throws unhandled exceptions
@@ -174,7 +189,7 @@ _Unresolved._
|--|--| |--|--|
| Severity | Medium | | Severity | Medium |
| Category | Error handling & resilience | | Category | Error handling & resilience |
| Status | Open | | Status | Resolved |
| Location | `src/ScadaLink.CLI/Commands/InstanceCommands.cs:55-58`, `src/ScadaLink.CLI/Commands/InstanceCommands.cs:181-182` | | Location | `src/ScadaLink.CLI/Commands/InstanceCommands.cs:55-58`, `src/ScadaLink.CLI/Commands/InstanceCommands.cs:181-182` |
**Description** **Description**
@@ -195,7 +210,15 @@ code and return 1.
**Resolution** **Resolution**
_Unresolved._ Resolved 2026-05-16 (commit pending). Root cause confirmed — both `set-bindings` and
`set-overrides` deserialized and indexed JSON inline with no `try/catch`. Extracted the
parsing into testable `InstanceCommands.TryParseBindings` / `TryParseOverrides` helpers
that catch `JsonException`, guard against null results, and (for bindings) validate pair
arity and element kinds (`JsonValueKind`) instead of letting `ArgumentOutOfRangeException`
/ `InvalidOperationException` escape. The command actions now emit a clean
`INVALID_ARGUMENT` error and return 1 on failure. Regression tests in
`InstanceArgumentParsingTests` (8 tests covering valid input, malformed JSON, short pairs,
wrong element types, and JSON null).
### CLI-006 — Password is passed as a command-line argument with no safer alternative ### CLI-006 — Password is passed as a command-line argument with no safer alternative
@@ -203,7 +226,7 @@ _Unresolved._
|--|--| |--|--|
| Severity | Medium | | Severity | Medium |
| Category | Security | | Category | Security |
| Status | Open | | Status | Resolved |
| Location | `src/ScadaLink.CLI/Program.cs:9`, `src/ScadaLink.CLI/Commands/CommandHelpers.cs:36-44` | | Location | `src/ScadaLink.CLI/Program.cs:9`, `src/ScadaLink.CLI/Commands/CommandHelpers.cs:36-44` |
**Description** **Description**
@@ -225,7 +248,17 @@ supplied.
**Resolution** **Resolution**
_Unresolved._ Resolved 2026-05-16 (commit pending). Root cause confirmed — credentials had no
non-command-line source. Added `SCADALINK_USERNAME` / `SCADALINK_PASSWORD` environment
fallbacks: `CliConfig.Load` now reads them into new `CliConfig.Username` / `Password`
properties (credentials are sourced from environment variables only, never the config
file, so they are not persisted). `CommandHelpers.ResolveCredential` resolves precedence
(explicit `--username`/`--password` → env var); both `ExecuteCommandAsync` and
`DebugCommands.BuildStream` use it. The design doc and the in-repo `README.md` now
document that `--password` on the command line is discouraged. The `--password-stdin`
option / interactive prompt was not added — the env-var fallback fully satisfies the
CI/CD safe-credential need; a stdin/prompt variant can be a follow-up if interactive use
demands it. Regression tests in `CredentialResolutionTests`.
### CLI-007 — `Component-CLI.md` command surface is substantially stale ### CLI-007 — `Component-CLI.md` command surface is substantially stale
@@ -233,7 +266,7 @@ _Unresolved._
|--|--| |--|--|
| Severity | Medium | | Severity | Medium |
| Category | Design-document adherence | | Category | Design-document adherence |
| Status | Open | | Status | Resolved |
| Location | `docs/requirements/Component-CLI.md:51-211` (vs. all files under `src/ScadaLink.CLI/Commands/`) | | Location | `docs/requirements/Component-CLI.md:51-211` (vs. all files under `src/ScadaLink.CLI/Commands/`) |
**Description** **Description**
@@ -267,7 +300,17 @@ authoritative.
**Resolution** **Resolution**
_Unresolved._ Resolved 2026-05-16 (commit pending). Drift confirmed against every file under
`src/ScadaLink.CLI/Commands/`. Regenerated the entire "Command Structure" section of
`Component-CLI.md` from the actual command tree: all entities are now keyed by integer
`--id`; the non-existent `--file` option is removed; create/update commands list their
real individual flags; non-existent commands (`template diff`, `instance
bind-connections`/`assign-area`, `data-connection assign/unassign`, `security api-key
enable/disable`) are removed; previously-omitted commands (`instance alarm-override
set/delete/list`, `external-system method` subgroup, `site deploy-artifacts`) are added.
A note now points to `src/ScadaLink.CLI/README.md` as the authoritative reference. The
Configuration section also documents the new `SCADALINK_USERNAME`/`SCADALINK_PASSWORD`
env vars (see CLI-006).
### CLI-008 — `--format` value is not validated ### CLI-008 — `--format` value is not validated
+136 -21
View File
@@ -8,7 +8,7 @@
| Last reviewed | 2026-05-16 | | Last reviewed | 2026-05-16 |
| Reviewer | claude-agent | | Reviewer | claude-agent |
| Commit reviewed | `9c60592` | | Commit reviewed | `9c60592` |
| Open findings | 15 | | Open findings | 7 |
## Summary ## Summary
@@ -269,7 +269,17 @@ fixed 30-minute model. The code and the documented decision must agree.
**Resolution** **Resolution**
_Unresolved._ _Unresolved — requires a cross-module change plus a design decision, both out of
scope for a CentralUI-only fix._ Verified 2026-05-16: the discrepancy is real.
The sliding-expiration mechanism, however, is owned by the cookie
authentication middleware configured in **`ScadaLink.Security`**
(`ServiceCollectionExtensions.AddCookie` — currently sets neither
`ExpireTimeSpan` nor `SlidingExpiration`); `AuthEndpoints` (CentralUI) only sets
the absolute `ExpiresUtc`/`expires_at`. Implementing "15-minute sliding token"
means editing `ScadaLink.Security`, which this module's review cannot touch, and
the alternative — amending the documented decision to a fixed 30-minute model —
is a design decision, not a code fix. Left Open and surfaced for a follow-up
that spans CentralUI + Security, or a design-doc amendment.
### CentralUI-006 — Deployment status page polls every 10s despite the documented SignalR-push design ### CentralUI-006 — Deployment status page polls every 10s despite the documented SignalR-push design
@@ -300,7 +310,16 @@ If polling is kept as a fallback, fetch only changed/in-progress records.
**Resolution** **Resolution**
_Unresolved._ _Unresolved — a genuine SignalR-push fix requires an event source in another
module._ Verified 2026-05-16: `Deployments.razor` does poll every 10s, contrary
to the design doc. But a real push implementation needs the **Deployment
Manager** module (`ScadaLink.DeploymentManager``DeploymentService` /
`ArtifactDeploymentService` write the `DeploymentRecord` rows) to raise a
status-change event/observable that the page subscribes to; there is no such
event today and no CentralUI-only seam to subscribe to. Building that event
source is out of scope for a CentralUI-only review. Left Open and surfaced for a
follow-up that adds a deployment-status broadcaster in DeploymentManager (or a
design-doc amendment acknowledging the polling fallback).
### CentralUI-007 — Monitoring nav links to Deployment-only pages are shown to all roles ### CentralUI-007 — Monitoring nav links to Deployment-only pages are shown to all roles
@@ -308,7 +327,7 @@ _Unresolved._
|--|--| |--|--|
| Severity | Medium | | Severity | Medium |
| Category | Correctness & logic bugs | | Category | Correctness & logic bugs |
| Status | Open | | Status | Resolved |
| Location | `src/ScadaLink.CentralUI/Components/Layout/NavMenu.razor:69-78`; `src/ScadaLink.CentralUI/Components/Pages/Monitoring/EventLogs.razor:2`; `src/ScadaLink.CentralUI/Components/Pages/Monitoring/ParkedMessages.razor:2` | | Location | `src/ScadaLink.CentralUI/Components/Layout/NavMenu.razor:69-78`; `src/ScadaLink.CentralUI/Components/Pages/Monitoring/EventLogs.razor:2`; `src/ScadaLink.CentralUI/Components/Pages/Monitoring/ParkedMessages.razor:2` |
**Description** **Description**
@@ -333,7 +352,17 @@ all-roles (it is, per the design).
**Resolution** **Resolution**
_Unresolved._ Resolved 2026-05-16 (commit pending). Confirmed: both `EventLogs.razor` and
`ParkedMessages.razor` carried a bare `[Authorize]`, so any authenticated user
could query site event logs and retry/discard parked messages — contrary to the
design doc's Deployment-Role classification. Both pages now use
`[Authorize(Policy = AuthorizationPolicies.RequireDeployment)]`, and the
"Event Logs" / "Parked Messages" nav links were moved out of the all-roles
Monitoring block into an `<AuthorizeView Policy="RequireDeployment">` (Health
Dashboard stays all-roles, as the design intends). Regression tests
`MonitoringAuthorizationTests.{EventLogsPage,ParkedMessagesPage}_RequiresDeploymentPolicy`
fail against the pre-fix code and pass after;
`HealthDashboard_IsIntentionallyAllAuthenticatedRoles` guards the all-roles page.
### CentralUI-008 — Audit-log date filters treat browser-local datetimes as UTC ### CentralUI-008 — Audit-log date filters treat browser-local datetimes as UTC
@@ -341,7 +370,7 @@ _Unresolved._
|--|--| |--|--|
| Severity | Medium | | Severity | Medium |
| Category | Correctness & logic bugs | | Category | Correctness & logic bugs |
| Status | Open | | Status | Resolved |
| Location | `src/ScadaLink.CentralUI/Components/Pages/Monitoring/AuditLog.razor:242-243` | | Location | `src/ScadaLink.CentralUI/Components/Pages/Monitoring/AuditLog.razor:242-243` |
**Description** **Description**
@@ -364,7 +393,19 @@ time-range filters.
**Resolution** **Resolution**
_Unresolved._ Resolved 2026-05-16 (commit pending). Confirmed: `FetchPage` wrapped the
`datetime-local` value with `new DateTimeOffset(value, TimeSpan.Zero)`,
relabelling the browser-local wall-clock value as UTC and shifting the audit
query window by the user's offset. Added a pure helper
`Components/BrowserTime.LocalInputToUtc(DateTime?, int)` that converts a
local-input value to UTC using the browser's `Date.getTimezoneOffset()`;
`AuditLog.razor` now fetches that offset once via JS interop in
`OnAfterRenderAsync` (defaulting to 0/UTC on prerender or a disconnected
circuit) and runs both `from`/`to` filters through the helper. Regression suite
`BrowserTimeTests` (5 tests) fails against the naive relabelling and passes
after — including `LocalInputToUtc_NonUtcBrowser_DoesNotEqualNaiveRelabelling`,
which pins the exact pre-fix bug. `EventLogs.razor` was checked and has no
time-range filters, so it is unaffected.
### CentralUI-009 — `DebugView` stream callbacks touch a possibly-disposed `ToastNotification` ### CentralUI-009 — `DebugView` stream callbacks touch a possibly-disposed `ToastNotification`
@@ -372,7 +413,7 @@ _Unresolved._
|--|--| |--|--|
| Severity | Medium | | Severity | Medium |
| Category | Concurrency & thread safety | | Category | Concurrency & thread safety |
| Status | Open | | Status | Resolved |
| Location | `src/ScadaLink.CentralUI/Components/Pages/Deployment/DebugView.razor:400-409,538-544` | | Location | `src/ScadaLink.CentralUI/Components/Pages/Deployment/DebugView.razor:400-409,538-544` |
**Description** **Description**
@@ -395,7 +436,21 @@ callbacks should no-op once disposed.
**Resolution** **Resolution**
_Unresolved._ Resolved 2026-05-16 (commit pending). Confirmed: the `onEvent`/`onTerminated`
callbacks captured `this` and `_toast` and ran on an Akka/gRPC thread with no
disposal guard. Added a `volatile bool _disposed` flag, set first thing in
`Dispose()` before the stream is stopped. Every callback now checks `_disposed`
and no-ops if set; the render dispatch goes through a new `SafeInvokeAsync`
helper that re-checks the flag and swallows `ObjectDisposedException` should the
component be disposed between the guard and the dispatch. Regression tests
`DebugViewDisposalTests.{DebugView_HasDisposalGuardField,
DebugView_Dispose_SetsDisposedFlag_AndIsIdempotent}` pin the observable contract
(the guard field exists; `Dispose()` sets it and is idempotent) — the first
fails against the pre-fix code, which had no `_disposed` field. The Akka-thread
timing race itself is not deterministically reproducible in a unit test:
`DebugStreamService` is a non-virtual concrete class with no seam to inject and
later fire the callbacks, so the closest meaningful tests pin the guard
mechanism rather than the race window.
### CentralUI-010 — `ToastNotification` auto-dismiss continuation runs after component disposal ### CentralUI-010 — `ToastNotification` auto-dismiss continuation runs after component disposal
@@ -403,7 +458,7 @@ _Unresolved._
|--|--| |--|--|
| Severity | Medium | | Severity | Medium |
| Category | Error handling & resilience | | Category | Error handling & resilience |
| Status | Open | | Status | Resolved |
| Location | `src/ScadaLink.CentralUI/Components/Shared/ToastNotification.razor:62-71,90` | | Location | `src/ScadaLink.CentralUI/Components/Shared/ToastNotification.razor:62-71,90` |
**Description** **Description**
@@ -424,7 +479,21 @@ body in a try/catch for `ObjectDisposedException`.
**Resolution** **Resolution**
_Unresolved._ Resolved 2026-05-16 (commit pending). Confirmed: `AddToast` scheduled
`Task.Delay(...).ContinueWith(...)` with no cancellation and `Dispose()` was an
empty body, so the continuation ran `InvokeAsync(StateHasChanged)` against a
disposed component. Added a `CancellationTokenSource _disposalCts` cancelled in
`Dispose()`; the auto-dismiss is now an `AutoDismissAsync` method that awaits
`Task.Delay(dismissMs, token)`, returns on `OperationCanceledException`, and
wraps the post-delay `InvokeAsync(StateHasChanged)` in a try/catch for
`ObjectDisposedException`. `AddToast` also short-circuits if the component is
already disposed. Regression tests:
`ToastNotificationTests.ShowToast_AfterDisposal_IsNoOp_AndSchedulesNothing`
fails against the pre-fix code (which still added the toast / mis-scheduled
after disposal) and passes after;
`AutoDismiss_AfterDisposal_DoesNotThrowUnobservedException` and
`AutoDismiss_BeforeDisposal_StillRemovesToast` guard the no-throw and
still-works behaviours.
### CentralUI-011 — `DiffDialog` leaves a dangling `TaskCompletionSource` when disposed while open ### CentralUI-011 — `DiffDialog` leaves a dangling `TaskCompletionSource` when disposed while open
@@ -432,7 +501,7 @@ _Unresolved._
|--|--| |--|--|
| Severity | Medium | | Severity | Medium |
| Category | Error handling & resilience | | Category | Error handling & resilience |
| Status | Open | | Status | Resolved |
| Location | `src/ScadaLink.CentralUI/Components/Shared/DiffDialog.razor:89-95,151-157` | | Location | `src/ScadaLink.CentralUI/Components/Shared/DiffDialog.razor:89-95,151-157` |
**Description** **Description**
@@ -452,7 +521,15 @@ awaiter completes deterministically.
**Resolution** **Resolution**
_Unresolved._ Resolved 2026-05-16 (commit pending). Confirmed: `OpenAsync` returned
`_tcs.Task`, completed only by `Close()`; `DisposeAsync` never touched the TCS,
so disposing the dialog while open left the awaiting caller suspended forever.
`DisposeAsync` now calls `_tcs?.TrySetResult(false)` before unlocking the body,
so a dialog disposed while open resolves its caller to `false` (not confirmed).
Regression test `DiffDialogTests.DisposeAsync_WhileOpen_CompletesPendingTask`
fails against the pre-fix code (the pending task stays `WaitingForActivation`)
and passes after; `Close_CompletesPendingTaskWithTrue` guards the normal close
path.
### CentralUI-012 — N+1 query loading data connections for the Sites page ### CentralUI-012 — N+1 query loading data connections for the Sites page
@@ -460,7 +537,7 @@ _Unresolved._
|--|--| |--|--|
| Severity | Medium | | Severity | Medium |
| Category | Performance & resource management | | Category | Performance & resource management |
| Status | Open | | Status | Resolved |
| Location | `src/ScadaLink.CentralUI/Components/Pages/Admin/Sites.razor:196-205` | | Location | `src/ScadaLink.CentralUI/Components/Pages/Admin/Sites.razor:196-205` |
**Description** **Description**
@@ -479,7 +556,15 @@ summary in a single query.
**Resolution** **Resolution**
_Unresolved._ Resolved 2026-05-16 (commit pending). Confirmed: `LoadDataAsync` looped
`GetDataConnectionsBySiteIdAsync(site.Id)` once per site (N+1). `ISiteRepository`
already exposes `GetAllDataConnectionsAsync()` and `DataConnection` carries a
`SiteId`, so the loop was replaced with a single `GetAllDataConnectionsAsync()`
call grouped client-side by `SiteId` — one query regardless of site count, on
every load and post-delete refresh. Regression tests
`SitesPageTests.{LoadData_FetchesAllConnectionsInOneQuery_NoPerSiteQueries,
LoadData_GroupsConnectionsBySite_AndRendersThem}` fail against the pre-fix code
(`GetDataConnectionsBySiteIdAsync` was called per site) and pass after.
### CentralUI-013 — `ScriptAnalysisService` blocks on async shared-script lookups ### CentralUI-013 — `ScriptAnalysisService` blocks on async shared-script lookups
@@ -487,8 +572,8 @@ _Unresolved._
|--|--| |--|--|
| Severity | Medium | | Severity | Medium |
| Category | Concurrency & thread safety | | Category | Concurrency & thread safety |
| Status | Open | | Status | Resolved |
| Location | `src/ScadaLink.CentralUI/ScriptAnalysis/ScriptAnalysisService.cs:951-952` | | Location | `src/ScadaLink.CentralUI/ScriptAnalysis/ScriptAnalysisService.cs:951-952` (actual call at `:975`) |
**Description** **Description**
@@ -509,15 +594,27 @@ them synchronously would remove the blocking call.
**Resolution** **Resolution**
_Unresolved._ Resolved 2026-05-16 (commit pending). Confirmed (the sync-over-async call is at
`:975`, not `:951-952` as originally cited — `ResolveCalledShape`'s
`Scripts.CallShared` branch). Took the recommended root-cause fix: `Hover` and
`SignatureHelp` are now `async Task<...>` and `ResolveCalledShape` is
`async Task<ScriptShape?>` which `await`s `_sharedScripts.GetShapesAsync()`
instead of `.GetAwaiter().GetResult()`. The two minimal-API endpoints
(`/hover`, `/signature-help`) were updated to `await` the methods. Regression
suite `ScriptAnalysisAsyncResolveTests` (3 tests): the structural test
`HoverAndSignatureHelp_AreAsync_NotSyncOverAsync` fails against the pre-fix
synchronous signatures, and two behavioural tests resolve shared-script shapes
through a catalog that only completes after `Task.Yield()` (a genuinely async
source). The five existing `Hover`/`SignatureHelp` tests in
`ScriptAnalysisServiceTests` were updated to `await` the now-async methods.
### CentralUI-014 — Test Run side effects (HTTP/SQL/SMTP) fire against production services ### CentralUI-014 — Test Run side effects (HTTP/SQL/SMTP) fire against production services
| | | | | |
|--|--| |--|--|
| Severity | Medium | | Severity | Low (re-triaged from Medium 2026-05-16 — see Resolution) |
| Category | Error handling & resilience | | Category | Error handling & resilience |
| Status | Open | | Status | Resolved |
| Location | `src/ScadaLink.CentralUI/ScriptAnalysis/ScriptAnalysisService.cs:254-259`; `src/ScadaLink.CentralUI/ScriptAnalysis/SandboxHostHelpers.cs:26-117` | | Location | `src/ScadaLink.CentralUI/ScriptAnalysis/ScriptAnalysisService.cs:254-259`; `src/ScadaLink.CentralUI/ScriptAnalysis/SandboxHostHelpers.cs:26-117` |
**Description** **Description**
@@ -541,7 +638,25 @@ dry-run mode that stubs the helpers, defaulting to dry-run.
**Resolution** **Resolution**
_Unresolved._ Resolved 2026-05-16 (commit pending) — **re-triaged**. Re-verified against the
reviewed commit `9c60592`: the finding's premise that "the blast radius is not
surfaced to the user" is **inaccurate**. Both Test Run surfaces that can produce
real side effects — `SharedScriptForm.razor` and the script Test Run in
`TemplateEdit.razor` — already carry a prominent `Real I/O` badge on the panel
header and an `alert-warning` block stating `External`/`Database`/`Notify` calls
"fire for real … real HTTP, real SQL, real emails. Side effects are permanent"
(present since commit `2951507`, an ancestor of the reviewed commit, confirmed
via `git merge-base`). `ApiMethodForm.razor` (Inbound API kind) has **no**
real-I/O surface at all — `SandboxInboundScriptHost` exposes only
`Parameters`/`Route` (Route throws) — and correctly omits the badge while still
warning. Revealing the panel ("Test Run" toggle) then clicking "Run" is itself a
two-step explicit opt-in. The minimum recommendation is therefore already met;
the optional dry-run mode is a separate feature decision the design doc does not
mandate. Severity re-triaged Medium → Low (intentional, documented, clearly
warned behaviour — not a bug). Regression suite `TestRunWarningTests` (3 tests)
pins the `Real I/O` badge + warning text in `SharedScriptForm`/`TemplateEdit`
and the deliberate absence of the badge in `ApiMethodForm`, so the warning
cannot silently regress.
### CentralUI-015 — `DialogService` continuations resolve off the render thread ### CentralUI-015 — `DialogService` continuations resolve off the render thread
+85 -9
View File
@@ -8,7 +8,7 @@
| Last reviewed | 2026-05-16 | | Last reviewed | 2026-05-16 |
| Reviewer | claude-agent | | Reviewer | claude-agent |
| Commit reviewed | `9c60592` | | Commit reviewed | `9c60592` |
| Open findings | 7 | | Open findings | 3 |
## Summary ## Summary
@@ -144,7 +144,7 @@ module-ownership claim was wrong. Module test suite green (3 passed).
|--|--| |--|--|
| Severity | Medium | | Severity | Medium |
| Category | Correctness & logic bugs | | Category | Correctness & logic bugs |
| Status | Open | | Status | Resolved |
| Location | `src/ScadaLink.ClusterInfrastructure/ServiceCollectionExtensions.cs:7-17` | | Location | `src/ScadaLink.ClusterInfrastructure/ServiceCollectionExtensions.cs:7-17` |
**Description** **Description**
@@ -167,7 +167,23 @@ with the genuine registration when CI-001 is addressed.
**Resolution** **Resolution**
_Unresolved._ Confirmed against the source: both methods returned the `IServiceCollection`
unchanged. Verified the consumers — `ScadaLink.Host` calls `AddClusterInfrastructure()`
(`Program.cs:68`, `SiteServiceRegistration.cs:24`); `AddClusterInfrastructureActors`
is dead — it is called nowhere in the solution.
**Resolved** — fixing commit `commit pending`, date 2026-05-16.
`AddClusterInfrastructure` now does real work: it registers the
`ClusterOptionsValidator` (CI-004) via `TryAddEnumerable`, so the method is no longer a
no-op and a misconfigured `ScadaLink:Cluster` section fails fast on the first
`IOptions<ClusterOptions>` resolution. `AddClusterInfrastructureActors` — which this
component never had any actors to register, as CI-001 established the Akka bootstrap
lives in `ScadaLink.Host` — now throws `NotImplementedException` with a message
pointing the caller to the Host, rather than masquerading as a completed registration.
Covered by `ServiceCollectionExtensionsTests`
(`AddClusterInfrastructure_RegistersOptionsValidator`,
`AddClusterInfrastructure_ValidatorRejectsBadOptionsAtResolution`,
`AddClusterInfrastructureActors_ThrowsRatherThanSilentlySucceeding`).
### ClusterInfrastructure-003 — ClusterOptions omits several documented node-configuration settings ### ClusterInfrastructure-003 — ClusterOptions omits several documented node-configuration settings
@@ -175,7 +191,7 @@ _Unresolved._
|--|--| |--|--|
| Severity | Medium | | Severity | Medium |
| Category | Design-document adherence | | Category | Design-document adherence |
| Status | Open | | Status | Resolved |
| Location | `src/ScadaLink.ClusterInfrastructure/ClusterOptions.cs:3-11` | | Location | `src/ScadaLink.ClusterInfrastructure/ClusterOptions.cs:3-11` |
**Description** **Description**
@@ -202,7 +218,27 @@ agree on where each value lives.
**Resolution** **Resolution**
_Unresolved._ Partially re-triaged. Verified against the source: most of the "missing" settings are
**deliberately owned by `ScadaLink.Host.NodeOptions`** — `NodeOptions` already carries
`Role`, `NodeHostname`, `SiteId`, `RemotingPort` and `GrpcPort`, and `AkkaHostedService`
builds the HOCON from `NodeOptions` for exactly those values. Local SQLite storage paths
live in the database / store-and-forward options. This is the ownership split CI-001
established (the Host owns node identity and bootstrap; this project owns the
cluster-formation contract), so those settings do **not** belong in `ClusterOptions`.
The one genuine gap the finding identifies is `down-if-alone`, which the design doc
puts with the split-brain settings.
**Resolved** — fixing commit `commit pending`, date 2026-05-16. Added the
`DownIfAlone` boolean (default `true`) to `ClusterOptions` so the split-brain
configuration contract is complete, and added a class-level XML doc that records the
deliberate ownership split — node identity/remoting/gRPC in `Host.NodeOptions`, storage
paths in the database options, cluster-formation settings here — so the design doc and
the options classes now agree on where each value lives. (`AkkaHostedService` currently
hard-codes `down-if-alone = on` in HOCON; wiring it to read `DownIfAlone` is a one-line
`ScadaLink.Host` change, outside this module's permitted edit scope, and is noted for
the Host's review.) Covered by `ClusterOptionsTests.DefaultValues_AreCorrect` and
`ClusterOptionsTests.DownIfAlone_CanBeSet`.
### ClusterInfrastructure-004 — ClusterOptions has no validation despite safety-critical values ### ClusterInfrastructure-004 — ClusterOptions has no validation despite safety-critical values
@@ -210,7 +246,7 @@ _Unresolved._
|--|--| |--|--|
| Severity | Medium | | Severity | Medium |
| Category | Code organization & conventions | | Category | Code organization & conventions |
| Status | Open | | Status | Resolved |
| Location | `src/ScadaLink.ClusterInfrastructure/ClusterOptions.cs:3-11` | | Location | `src/ScadaLink.ClusterInfrastructure/ClusterOptions.cs:3-11` |
**Description** **Description**
@@ -239,7 +275,26 @@ FailureDetectionThreshold` and positive `StableAfter`. Register it with
**Resolution** **Resolution**
_Unresolved._ Confirmed: `ClusterOptions` had no validation of any kind, and the design doc's
catastrophic-misconfiguration values (`MinNrOfMembers: 2`, a quorum split-brain
strategy) would have been bound silently.
**Resolved** — fixing commit `commit pending`, date 2026-05-16. Added
`ClusterOptionsValidator : IValidateOptions<ClusterOptions>`, which enforces
`MinNrOfMembers == 1`, restricts `SplitBrainResolverStrategy` to the
`keep-oldest`-only allowed set, requires a non-empty `SeedNodes`, requires positive
`StableAfter` / `HeartbeatInterval` / `FailureDetectionThreshold`, and asserts
`HeartbeatInterval < FailureDetectionThreshold`. It accumulates every failure into one
result. It is registered by `AddClusterInfrastructure()` (CI-002) as a singleton
`IValidateOptions<ClusterOptions>`, so a misconfigured section throws
`OptionsValidationException` on the first `IOptions<ClusterOptions>.Value` resolution
— which `AkkaHostedService` performs during startup, giving the fail-fast-at-boot
behaviour the recommendation asks for without the src project taking a dependency on
the full `Microsoft.Extensions.DependencyInjection` package needed for the
`ValidateOnStart()` overload. Data annotations were not used — a single
`IValidateOptions` implementation expresses the interdependent timing rules that
attributes cannot. Covered by `ClusterOptionsValidatorTests` (8 cases) and
`ServiceCollectionExtensionsTests.AddClusterInfrastructure_ValidatorRejectsBadOptionsAtResolution`.
### ClusterInfrastructure-005 — No configuration section name constant for the Options pattern binding ### ClusterInfrastructure-005 — No configuration section name constant for the Options pattern binding
@@ -276,7 +331,7 @@ _Unresolved._
|--|--| |--|--|
| Severity | Medium | | Severity | Medium |
| Category | Testing coverage | | Category | Testing coverage |
| Status | Open | | Status | Resolved |
| Location | `tests/ScadaLink.ClusterInfrastructure.Tests/ClusterOptionsTests.cs:1-51` | | Location | `tests/ScadaLink.ClusterInfrastructure.Tests/ClusterOptionsTests.cs:1-51` |
**Description** **Description**
@@ -301,7 +356,28 @@ from `ClusterOptions` and for the options validation from CI-004.
**Resolution** **Resolution**
_Unresolved._ Re-triaged in light of CI-001's resolution. The Akka bootstrap, HOCON generation,
cluster formation, failover and singleton handover are owned by `ScadaLink.Host`, not
this project — multi-node `Akka.Cluster.TestKit` tests for that behaviour belong in the
Host's test suite, outside this module's scope. What this module legitimately owns is
`ClusterOptions`, its validator, and the DI registration, and the testing gap there is
now closed.
**Resolved** — fixing commit `commit pending`, date 2026-05-16. Added two test classes
to `tests/ScadaLink.ClusterInfrastructure.Tests`: `ClusterOptionsValidatorTests`
(8 cases — valid defaults pass; `MinNrOfMembers != 1`, unsupported split-brain
strategies, empty seed nodes, heartbeat not below the failure threshold, non-positive
`StableAfter` all fail; and a multi-failure accumulation case) and
`ServiceCollectionExtensionsTests` (3 cases — `AddClusterInfrastructure` registers the
validator, the validator rejects bad options at `IOptions` resolution, and
`AddClusterInfrastructureActors` throws). The pre-existing `ClusterOptionsTests` was
extended with `DownIfAlone` coverage. The test project gained references to
`Microsoft.Extensions.DependencyInjection` and `Microsoft.Extensions.Options`. Module
test suite green: 16 passed (was 3). Note: the `keep-majority` value used in the
pre-existing `ClusterOptionsTests.Properties_CanBeSetToCustomValues` is intentionally
left — that test exercises the POCO's property setter (the POCO accepts any string by
design); `ClusterOptionsValidator` is the layer that now rejects `keep-majority`, and
`UnsupportedSplitBrainStrategy_FailsValidation` proves it.
### ClusterInfrastructure-007 — ClusterOptions lacks XML documentation comments ### ClusterInfrastructure-007 — ClusterOptions lacks XML documentation comments
+43 -9
View File
@@ -8,7 +8,7 @@
| Last reviewed | 2026-05-16 | | Last reviewed | 2026-05-16 |
| Reviewer | claude-agent | | Reviewer | claude-agent |
| Commit reviewed | `9c60592` | | Commit reviewed | `9c60592` |
| Open findings | 12 | | Open findings | 8 |
## Summary ## Summary
@@ -55,7 +55,7 @@ wire command.
|--|--| |--|--|
| Severity | Medium | | Severity | Medium |
| Category | Concurrency & thread safety | | Category | Concurrency & thread safety |
| Status | Open | | Status | Resolved |
| Location | `src/ScadaLink.Commons/Types/StaleTagMonitor.cs:42-46`, `:62-67` | | Location | `src/ScadaLink.Commons/Types/StaleTagMonitor.cs:42-46`, `:62-67` |
**Description** **Description**
@@ -81,7 +81,14 @@ only then reschedule the timer.
**Resolution** **Resolution**
_Unresolved._ Resolved 2026-05-16 (commit pending) — confirmed the race against the source. Replaced
the `volatile bool` guard with a lock-protected monotonic generation token: `Start`,
`OnValueReceived` and `Stop` each bump the generation under a gate, and the timer
callback only raises `Stale` if its scheduled generation still matches. `OnValueReceived`
now recreates the timer (rather than `Change`-ing it) so the rescheduled callback carries
the new token. A superseded or stopped period can no longer emit a spurious staleness
signal. Regression tests added in `StaleTagMonitorRaceTests` (deterministic via an
internal `CallbackEnteredHook` test seam).
### Commons-002 — `DynamicJsonElement` retains a `JsonElement` whose `JsonDocument` lifetime it does not own ### Commons-002 — `DynamicJsonElement` retains a `JsonElement` whose `JsonDocument` lifetime it does not own
@@ -89,7 +96,7 @@ _Unresolved._
|--|--| |--|--|
| Severity | Medium | | Severity | Medium |
| Category | Performance & resource management | | Category | Performance & resource management |
| Status | Open | | Status | Resolved |
| Location | `src/ScadaLink.Commons/Types/DynamicJsonElement.cs:10-17` | | Location | `src/ScadaLink.Commons/Types/DynamicJsonElement.cs:10-17` |
**Description** **Description**
@@ -113,7 +120,13 @@ regardless.
**Resolution** **Resolution**
_Unresolved._ Resolved 2026-05-16 (commit pending) — confirmed the hazard: `ExternalCallResult.Response`
constructs the wrapper from `JsonDocument.Parse(...).RootElement` with no reference kept
to the document, so deferred script-time access could fault. Fixed at the root by cloning
the element with `JsonElement.Clone()` in the `DynamicJsonElement` constructor, detaching
it from the owning document; the public constructor signature is unchanged. Added a
remarks block documenting the lifetime contract. Regression tests added in
`DynamicJsonElementTests` (access after the source document is disposed / GC-collected).
### Commons-003 — `ScriptParameters.GetNullable` silently swallows conversion failures ### Commons-003 — `ScriptParameters.GetNullable` silently swallows conversion failures
@@ -121,7 +134,7 @@ _Unresolved._
|--|--| |--|--|
| Severity | Medium | | Severity | Medium |
| Category | Error handling & resilience | | Category | Error handling & resilience |
| Status | Open | | Status | Resolved |
| Location | `src/ScadaLink.Commons/Types/ScriptParameters.cs:72-86` | | Location | `src/ScadaLink.Commons/Types/ScriptParameters.cs:72-86` |
**Description** **Description**
@@ -146,7 +159,15 @@ element handling. If the swallowing must stay for compatibility, at minimum surf
**Resolution** **Resolution**
_Unresolved._ Resolved 2026-05-16 (commit pending) — confirmed the silent-swallow path against the
source. Removed the `catch (ScriptParameterException)` block in `GetNullable<T>`: an
absent or explicitly-null parameter still returns `null`, but a parameter that is
*present but holds an unconvertible value* now throws `ScriptParameterException` with a
descriptive message, consistent with `Get<T>()` and the array/list element paths. The
`Get<T>` XML doc was corrected accordingly. This is a deliberate behavioral change toward
correctness — the previous behavior masked caller/script bugs; the type-level public
contract is unchanged. Regression tests added in `ScriptParametersTests`
(`Get_NullableInt_PresentButUnparsable_Throws` and siblings).
### Commons-004 — `ManagementCommandRegistry` name mapping is asymmetric and namespace-scoped ### Commons-004 — `ManagementCommandRegistry` name mapping is asymmetric and namespace-scoped
@@ -154,7 +175,7 @@ _Unresolved._
|--|--| |--|--|
| Severity | Medium | | Severity | Medium |
| Category | Code organization & conventions | | Category | Code organization & conventions |
| Status | Open | | Status | Resolved |
| Location | `src/ScadaLink.Commons/Messages/Management/ManagementCommandRegistry.cs:14-35` | | Location | `src/ScadaLink.Commons/Messages/Management/ManagementCommandRegistry.cs:14-35` |
**Description** **Description**
@@ -187,7 +208,20 @@ scope, and reconsider whether the `Mgmt*` prefixed duplicates are still needed.
**Resolution** **Resolution**
_Unresolved._ Resolved 2026-05-16 (commit pending) — confirmed the asymmetry: `GetCommandName` stripped
`Command` from any type while `BuildRegistry` only registered the `Messages.Management`
namespace. In practice no defect was observed because every command type the CLI and
ManagementService actually use is in `Messages.Management` (a round-trip test over all
registered commands confirms no name collision). Closed the asymmetry by making
`GetCommandName` registry-bound: it now looks up a reverse `Type→name` frozen dictionary
built from the same registry and throws `ArgumentException` for any unregistered type, so
`Resolve(GetCommandName(t)) == t` holds for every type it accepts. Added an XML remarks
block documenting the registry scope and the symmetry guarantee. The `Mgmt*` prefixed
records were left in place — they are the genuine Management-namespace command types the
CLI constructs and renaming them would change wire command names (out of scope for a
behavior-preserving fix; noted for a future cleanup). CLI, ManagementService, and
SiteRuntime all build clean against the change. Regression tests added in
`ManagementCommandRegistryTests`.
### Commons-005 — `OpcUaEndpointConfigSerializer.Deserialize` discards malformed legacy input and over-reports `IsLegacy` ### Commons-005 — `OpcUaEndpointConfigSerializer.Deserialize` discards malformed legacy input and over-reports `IsLegacy`
+69 -11
View File
@@ -8,7 +8,7 @@
| Last reviewed | 2026-05-16 | | Last reviewed | 2026-05-16 |
| Reviewer | claude-agent | | Reviewer | claude-agent |
| Commit reviewed | `9c60592` | | Commit reviewed | `9c60592` |
| Open findings | 8 | | Open findings | 3 |
## Summary ## Summary
@@ -187,7 +187,7 @@ fail against the pre-fix logic and pass after.
|--|--| |--|--|
| Severity | Medium | | Severity | Medium |
| Category | Akka.NET conventions | | Category | Akka.NET conventions |
| Status | Open | | Status | Resolved |
| Location | `src/ScadaLink.Communication/Actors/CentralCommunicationActor.cs:42`, `src/ScadaLink.Communication/Actors/SiteCommunicationActor.cs:22` | | Location | `src/ScadaLink.Communication/Actors/CentralCommunicationActor.cs:42`, `src/ScadaLink.Communication/Actors/SiteCommunicationActor.cs:22` |
**Description** **Description**
@@ -212,7 +212,17 @@ strategy), matching the documented decision and other coordinator actors.
**Resolution** **Resolution**
_Unresolved._ Resolved 2026-05-16 (commit pending). Root cause confirmed: neither
`CentralCommunicationActor` nor `SiteCommunicationActor` overrode `SupervisorStrategy`,
so child faults fell back to the Akka default (`Restart`). Note that an actor's own
`SupervisorStrategy` governs its *children* — a transient child fault would `Restart`
the child and discard its in-memory state, contrary to the CLAUDE.md "Resume for
coordinator actors" decision. Fix: both actors now override `SupervisorStrategy()` to
return a `OneForOneStrategy` with an unbounded `Decider` resolving to `Directive.Resume`
(mirroring `DataConnectionManagerActor`). Regression tests
`CoordinatorSupervisionTests.CentralCommunicationActor_SupervisorStrategy_IsResume` and
`CoordinatorSupervisionTests.SiteCommunicationActor_SupervisorStrategy_IsResume` fail
against the pre-fix code (decider yields `Restart`) and pass after.
### Communication-005 — gRPC keepalive and max-stream-lifetime options are defined but never applied ### Communication-005 — gRPC keepalive and max-stream-lifetime options are defined but never applied
@@ -220,7 +230,7 @@ _Unresolved._
|--|--| |--|--|
| Severity | Medium | | Severity | Medium |
| Category | Design-document adherence | | Category | Design-document adherence |
| Status | Open | | Status | Resolved |
| Location | `src/ScadaLink.Communication/Grpc/SiteStreamGrpcClient.cs:25`, `src/ScadaLink.Communication/CommunicationOptions.cs:36` | | Location | `src/ScadaLink.Communication/Grpc/SiteStreamGrpcClient.cs:25`, `src/ScadaLink.Communication/CommunicationOptions.cs:36` |
**Description** **Description**
@@ -249,7 +259,22 @@ option and update the design doc.
**Resolution** **Resolution**
_Unresolved._ Resolved 2026-05-16 (commit pending). Root cause confirmed: `SiteStreamGrpcClient`
hard-coded the keepalive values, `GrpcMaxStreamLifetime` was referenced nowhere, and
`GrpcMaxConcurrentStreams` was never bound to the server. Fix (scoped to
`src/ScadaLink.Communication`): `SiteStreamGrpcClient` gained a constructor taking
`CommunicationOptions` and now applies `GrpcKeepAlivePingDelay`/`GrpcKeepAlivePingTimeout`
to its `SocketsHttpHandler`; `SiteStreamGrpcClientFactory` gained an
`IOptions<CommunicationOptions>` DI constructor and flows the options into every client
it creates; `SiteStreamGrpcServer` gained an `IOptions<CommunicationOptions>` DI
constructor that binds `GrpcMaxConcurrentStreams` and implements the documented 4-hour
session timeout via `CancellationTokenSource.CancelAfter(GrpcMaxStreamLifetime)` on the
per-stream CTS. The Host's existing `AddSingleton<SiteStreamGrpcServer>()` registration
resolves the new DI constructor via greedy resolution — no Host change required.
Regression tests `GrpcOptionsWiringTests.SiteStreamGrpcClient_AppliesKeepAliveFromOptions`,
`GrpcOptionsWiringTests.SiteStreamGrpcClientFactory_FlowsOptionsToCreatedClients`, and
`GrpcOptionsWiringTests.SiteStreamGrpcServer_BindsMaxConcurrentStreamsAndLifetimeFromOptions`
exercise the wiring (they require the new members to even compile).
### Communication-006 — Site address load failures are silently swallowed, leaving a stale cache ### Communication-006 — Site address load failures are silently swallowed, leaving a stale cache
@@ -257,7 +282,7 @@ _Unresolved._
|--|--| |--|--|
| Severity | Medium | | Severity | Medium |
| Category | Error handling & resilience | | Category | Error handling & resilience |
| Status | Open | | Status | Resolved |
| Location | `src/ScadaLink.Communication/Actors/CentralCommunicationActor.cs:204` | | Location | `src/ScadaLink.Communication/Actors/CentralCommunicationActor.cs:204` |
**Description** **Description**
@@ -279,7 +304,16 @@ is down". Optionally surface a health metric for repeated load failures.
**Resolution** **Resolution**
_Unresolved._ Resolved 2026-05-16 (commit pending). Root cause confirmed: a faulted
`LoadSiteAddressesFromDb` task is piped to `Self` as a `Status.Failure`, but the actor
had no handler for it — the failure became an unhandled message (debug-level only) and
the periodic refresh failed silently. Fix: added a `Receive<Status.Failure>` handler
that logs the load failure at `Warning` with the underlying exception as the cause, so
operators can distinguish a missing-addresses configuration from a database outage.
Regression test
`CentralCommunicationActorTests.LoadSiteAddressesFailure_IsLoggedNotSilentlySwallowed`
(repository query throws) asserts the Warning is emitted — it produces no warning
against the pre-fix code and passes after.
### Communication-007 — `SiteStreamGrpcClientFactory.Dispose` blocks on async work (sync-over-async) ### Communication-007 — `SiteStreamGrpcClientFactory.Dispose` blocks on async work (sync-over-async)
@@ -287,7 +321,7 @@ _Unresolved._
|--|--| |--|--|
| Severity | Medium | | Severity | Medium |
| Category | Performance & resource management | | Category | Performance & resource management |
| Status | Open | | Status | Resolved |
| Location | `src/ScadaLink.Communication/Grpc/SiteStreamGrpcClientFactory.cs:53` | | Location | `src/ScadaLink.Communication/Grpc/SiteStreamGrpcClientFactory.cs:53` |
**Description** **Description**
@@ -308,7 +342,17 @@ path, or document why blocking is safe here.
**Resolution** **Resolution**
_Unresolved._ Resolved 2026-05-16 (commit pending). Root cause confirmed: `Dispose()` called
`DisposeAsync().AsTask().GetAwaiter().GetResult()`, the classic sync-over-async pattern.
Fix: `SiteStreamGrpcClient` now also implements `IDisposable` with a synchronous
`Dispose()` that releases its CancellationTokenSources and underlying `GrpcChannel`
directly (all of that teardown is inherently synchronous); `SiteStreamGrpcClientFactory.Dispose()`
now disposes each cached client via that synchronous path with no blocking on the async
path. A `CreateClient` seam was extracted so the test can substitute a tracking client
while still exercising the factory's real caching/disposal machinery. Regression test
`SiteStreamGrpcClientFactoryDisposeTests.Dispose_DisposesClientsSynchronously_NotViaAsyncPath`
fails against the pre-fix code (clients disposed via `DisposeAsync`) and passes after;
`Dispose_DoesNotDeadlock_UnderSingleThreadedSynchronizationContext` guards the stall path.
### Communication-008 — Reconnect retry-count reset can mask a flapping stream indefinitely ### Communication-008 — Reconnect retry-count reset can mask a flapping stream indefinitely
@@ -316,7 +360,7 @@ _Unresolved._
|--|--| |--|--|
| Severity | Medium | | Severity | Medium |
| Category | Correctness & logic bugs | | Category | Correctness & logic bugs |
| Status | Open | | Status | Resolved |
| Location | `src/ScadaLink.Communication/Actors/DebugStreamBridgeActor.cs:71`, `src/ScadaLink.Communication/Actors/DebugStreamBridgeActor.cs:174` | | Location | `src/ScadaLink.Communication/Actors/DebugStreamBridgeActor.cs:71`, `src/ScadaLink.Communication/Actors/DebugStreamBridgeActor.cs:174` |
**Description** **Description**
@@ -339,7 +383,21 @@ reconnects regardless of intervening events.
**Resolution** **Resolution**
_Unresolved._ Resolved 2026-05-16 (commit pending). Root cause confirmed: `_retryCount` was reset to
0 on every received `AttributeValueChanged`/`AlarmStateChanged`, so a stream that
connected, delivered one event, then failed — repeatedly — never tripped `MaxRetries`.
Fix (recommendation option a): the per-event reset was removed; instead `OpenGrpcStream`
arms a single `StabilityWindow` timer (60s default, internal-settable for tests), and
only when it fires (`GrpcStreamStable`) — i.e. the stream stayed up long enough to be
considered recovered — is `_retryCount` reset. `HandleGrpcError` cancels that timer, so
a stream that fails before the window elapses does not recover its retry budget. A
flapping stream therefore terminates after `MaxRetries` regardless of intervening
events. Regression test
`DebugStreamBridgeActorTests.FlappingStream_DeliveringEventsBetweenFailures_StillTerminatesAfterMaxRetries`
fails against the pre-fix code (actor never terminates) and passes after;
`RetryCount_RecoveredOnlyAfterStreamStaysStableForStabilityWindow` verifies the budget
is recovered after a stable interval. The pre-existing test that codified the buggy
per-event reset (`Grpc_Error_Resets_RetryCount_On_Successful_Event`) was replaced.
### Communication-009 — `_siteClients` field is mutable and reassignable; cache update is not atomic on failure ### Communication-009 — `_siteClients` field is mutable and reassignable; cache update is not atomic on failure
+8 -35
View File
@@ -41,19 +41,19 @@ module file and counted in **Total**.
|----------|---------------| |----------|---------------|
| Critical | 0 | | Critical | 0 |
| High | 0 | | High | 0 |
| Medium | 100 | | Medium | 73 |
| Low | 90 | | Low | 90 |
| **Total** | **190** | | **Total** | **163** |
## Module Status ## Module Status
| Module | Last reviewed | Commit | Open (C/H/M/L) | Open | Total | | Module | Last reviewed | Commit | Open (C/H/M/L) | Open | Total |
|--------|---------------|--------|----------------|------|-------| |--------|---------------|--------|----------------|------|-------|
| [CLI](CLI/findings.md) | 2026-05-16 | `9c60592` | 0/0/6/6 | 12 | 13 | | [CLI](CLI/findings.md) | 2026-05-16 | `9c60592` | 0/0/0/6 | 6 | 13 |
| [CentralUI](CentralUI/findings.md) | 2026-05-16 | `9c60592` | 0/0/10/5 | 15 | 19 | | [CentralUI](CentralUI/findings.md) | 2026-05-16 | `9c60592` | 0/0/2/5 | 7 | 19 |
| [ClusterInfrastructure](ClusterInfrastructure/findings.md) | 2026-05-16 | `9c60592` | 0/0/4/3 | 7 | 8 | | [ClusterInfrastructure](ClusterInfrastructure/findings.md) | 2026-05-16 | `9c60592` | 0/0/0/3 | 3 | 8 |
| [Commons](Commons/findings.md) | 2026-05-16 | `9c60592` | 0/0/4/8 | 12 | 12 | | [Commons](Commons/findings.md) | 2026-05-16 | `9c60592` | 0/0/0/8 | 8 | 12 |
| [Communication](Communication/findings.md) | 2026-05-16 | `9c60592` | 0/0/5/3 | 8 | 11 | | [Communication](Communication/findings.md) | 2026-05-16 | `9c60592` | 0/0/0/3 | 3 | 11 |
| [ConfigurationDatabase](ConfigurationDatabase/findings.md) | 2026-05-16 | `9c60592` | 0/0/4/6 | 10 | 11 | | [ConfigurationDatabase](ConfigurationDatabase/findings.md) | 2026-05-16 | `9c60592` | 0/0/4/6 | 10 | 11 |
| [DataConnectionLayer](DataConnectionLayer/findings.md) | 2026-05-16 | `9c60592` | 0/0/6/2 | 8 | 13 | | [DataConnectionLayer](DataConnectionLayer/findings.md) | 2026-05-16 | `9c60592` | 0/0/6/2 | 8 | 13 |
| [DeploymentManager](DeploymentManager/findings.md) | 2026-05-16 | `9c60592` | 0/0/6/5 | 11 | 14 | | [DeploymentManager](DeploymentManager/findings.md) | 2026-05-16 | `9c60592` | 0/0/6/5 | 11 | 14 |
@@ -84,39 +84,12 @@ _None open._
_None open._ _None open._
### Medium (100) ### Medium (73)
| ID | Module | Title | | ID | Module | Title |
|----|--------|-------| |----|--------|-------|
| CLI-002 | [CLI](CLI/findings.md) | Empty success body crashes table rendering with an unhandled exception |
| CLI-003 | [CLI](CLI/findings.md) | Non-JSON success body crashes table rendering |
| CLI-004 | [CLI](CLI/findings.md) | Malformed `--url` throws an unhandled `UriFormatException` |
| CLI-005 | [CLI](CLI/findings.md) | Malformed `--bindings` / `--overrides` JSON throws unhandled exceptions |
| CLI-006 | [CLI](CLI/findings.md) | Password is passed as a command-line argument with no safer alternative |
| CLI-007 | [CLI](CLI/findings.md) | `Component-CLI.md` command surface is substantially stale |
| CentralUI-005 | [CentralUI](CentralUI/findings.md) | Session expiry implementation diverges from the documented policy | | CentralUI-005 | [CentralUI](CentralUI/findings.md) | Session expiry implementation diverges from the documented policy |
| CentralUI-006 | [CentralUI](CentralUI/findings.md) | Deployment status page polls every 10s despite the documented SignalR-push design | | CentralUI-006 | [CentralUI](CentralUI/findings.md) | Deployment status page polls every 10s despite the documented SignalR-push design |
| CentralUI-007 | [CentralUI](CentralUI/findings.md) | Monitoring nav links to Deployment-only pages are shown to all roles |
| CentralUI-008 | [CentralUI](CentralUI/findings.md) | Audit-log date filters treat browser-local datetimes as UTC |
| CentralUI-009 | [CentralUI](CentralUI/findings.md) | `DebugView` stream callbacks touch a possibly-disposed `ToastNotification` |
| CentralUI-010 | [CentralUI](CentralUI/findings.md) | `ToastNotification` auto-dismiss continuation runs after component disposal |
| CentralUI-011 | [CentralUI](CentralUI/findings.md) | `DiffDialog` leaves a dangling `TaskCompletionSource` when disposed while open |
| CentralUI-012 | [CentralUI](CentralUI/findings.md) | N+1 query loading data connections for the Sites page |
| CentralUI-013 | [CentralUI](CentralUI/findings.md) | `ScriptAnalysisService` blocks on async shared-script lookups |
| CentralUI-014 | [CentralUI](CentralUI/findings.md) | Test Run side effects (HTTP/SQL/SMTP) fire against production services |
| ClusterInfrastructure-002 | [ClusterInfrastructure](ClusterInfrastructure/findings.md) | No-op DI extension methods report success while doing nothing |
| ClusterInfrastructure-003 | [ClusterInfrastructure](ClusterInfrastructure/findings.md) | ClusterOptions omits several documented node-configuration settings |
| ClusterInfrastructure-004 | [ClusterInfrastructure](ClusterInfrastructure/findings.md) | ClusterOptions has no validation despite safety-critical values |
| ClusterInfrastructure-006 | [ClusterInfrastructure](ClusterInfrastructure/findings.md) | No tests for any cluster behaviour; only the options POCO is covered |
| Commons-001 | [Commons](Commons/findings.md) | `StaleTagMonitor` stale-fire race between timer and `OnValueReceived` |
| Commons-002 | [Commons](Commons/findings.md) | `DynamicJsonElement` retains a `JsonElement` whose `JsonDocument` lifetime it does not own |
| Commons-003 | [Commons](Commons/findings.md) | `ScriptParameters.GetNullable` silently swallows conversion failures |
| Commons-004 | [Commons](Commons/findings.md) | `ManagementCommandRegistry` name mapping is asymmetric and namespace-scoped |
| Communication-004 | [Communication](Communication/findings.md) | Coordinator actors declare no SupervisorStrategy (design requires Resume) |
| Communication-005 | [Communication](Communication/findings.md) | gRPC keepalive and max-stream-lifetime options are defined but never applied |
| Communication-006 | [Communication](Communication/findings.md) | Site address load failures are silently swallowed, leaving a stale cache |
| Communication-007 | [Communication](Communication/findings.md) | `SiteStreamGrpcClientFactory.Dispose` blocks on async work (sync-over-async) |
| Communication-008 | [Communication](Communication/findings.md) | Reconnect retry-count reset can mask a flapping stream indefinitely |
| ConfigurationDatabase-002 | [ConfigurationDatabase](ConfigurationDatabase/findings.md) | Hardcoded `sa` connection string with embedded password literal | | ConfigurationDatabase-002 | [ConfigurationDatabase](ConfigurationDatabase/findings.md) | Hardcoded `sa` connection string with embedded password literal |
| ConfigurationDatabase-003 | [ConfigurationDatabase](ConfigurationDatabase/findings.md) | No-arg `AddConfigurationDatabase()` silently registers nothing | | ConfigurationDatabase-003 | [ConfigurationDatabase](ConfigurationDatabase/findings.md) | No-arg `AddConfigurationDatabase()` silently registers nothing |
| ConfigurationDatabase-004 | [ConfigurationDatabase](ConfigurationDatabase/findings.md) | Secret-bearing columns stored in plaintext with no protection | | ConfigurationDatabase-004 | [ConfigurationDatabase](ConfigurationDatabase/findings.md) | Secret-bearing columns stored in plaintext with no protection |
+109 -89
View File
@@ -48,127 +48,142 @@ The CLI uses a hierarchical subcommand structure mirroring the Management Servic
scadalink <group> <action> [options] scadalink <group> <action> [options]
``` ```
All entities are identified by their integer **ID** (via `--id`, `--template-id`,
`--site-id`, etc.), not by name. Create/update commands take individual flags — there
is no `--file` option. The authoritative, always-current reference is the in-repo
`src/ScadaLink.CLI/README.md`; the command lists below mirror the implemented command
tree at the time of writing.
### Template Commands ### Template Commands
``` ```
scadalink template list [--format json|table] scadalink template list
scadalink template get <name> [--format json|table] scadalink template get --id <id>
scadalink template create --name <name> [--parent <parent>] --file <path> scadalink template create --name <name> [--description <desc>] [--parent-id <id>]
scadalink template update <name> --file <path> scadalink template update --id <id> [--name <name>] [--description <desc>] [--parent-id <id>]
scadalink template delete <name> scadalink template validate --id <id>
scadalink template validate <name> scadalink template delete --id <id>
scadalink template diff <instance-code> scadalink template attribute add --template-id <id> --name <name> --data-type <type> [--value <value>] [--description <desc>] [--data-source <ref>] [--locked <bool>]
scadalink template attribute add --template-id <id> --name <name> --data-type <type> [--default-value <value>] [--tag-path <path>] scadalink template attribute update --id <id> [--name <name>] [--data-type <type>] [--value <value>] [--description <desc>] [--data-source <ref>] [--locked <bool>]
scadalink template attribute update --template-id <id> --name <name> [--data-type <type>] [--default-value <value>] [--tag-path <path>] scadalink template attribute delete --id <id>
scadalink template attribute delete --template-id <id> --name <name> scadalink template alarm add --template-id <id> --name <name> --trigger-type <type> --priority <n> [--description <desc>] [--trigger-config <json>] [--locked <bool>]
scadalink template alarm add --template-id <id> --name <name> --trigger-attribute <attr> --condition <cond> --setpoint <value> [--severity <level>] [--notification-list <name>] scadalink template alarm update --id <id> [--name <name>] [--trigger-type <type>] [--priority <n>] [--description <desc>] [--trigger-config <json>] [--locked <bool>]
scadalink template alarm update --template-id <id> --name <name> [--condition <cond>] [--setpoint <value>] [--severity <level>] [--notification-list <name>] scadalink template alarm delete --id <id>
scadalink template alarm delete --template-id <id> --name <name> scadalink template script add --template-id <id> --name <name> --code <code> --trigger-type <type> [--trigger-config <json>] [--locked <bool>] [--parameters <json>] [--return-def <json>]
scadalink template script add --template-id <id> --name <name> --trigger-type <type> [--trigger-attribute <attr>] [--interval <ms>] --code <code> scadalink template script update --id <id> [--name <name>] [--code <code>] [--trigger-type <type>] [--trigger-config <json>] [--locked <bool>] [--parameters <json>] [--return-def <json>]
scadalink template script update --template-id <id> --name <name> [--trigger-type <type>] [--trigger-attribute <attr>] [--interval <ms>] [--code <code>] scadalink template script delete --id <id>
scadalink template script delete --template-id <id> --name <name> scadalink template composition add --template-id <id> --instance-name <name> --composed-template-id <id>
scadalink template composition add --template-id <id> --module-template-id <id> --instance-name <name>
scadalink template composition delete --template-id <id> --instance-name <name> scadalink template composition delete --template-id <id> --instance-name <name>
``` ```
### Instance Commands ### Instance Commands
``` ```
scadalink instance list [--site <site>] [--area <area>] [--format json|table] scadalink instance list [--site-id <id>] [--template-id <id>] [--search <term>]
scadalink instance get <code> [--format json|table] scadalink instance get --id <id>
scadalink instance create --template <name> --site <site> --code <code> [--area <area>] scadalink instance create --name <name> --template-id <id> --site-id <id> [--area-id <id>]
scadalink instance set-overrides <code> --file <path> scadalink instance set-bindings --id <id> --bindings <json>
scadalink instance set-bindings <code> --bindings <json> scadalink instance set-overrides --id <id> --overrides <json>
scadalink instance bind-connections <code> --file <path> scadalink instance alarm-override set --instance-id <id> --alarm <name> [--trigger-config <json>] [--priority <n>]
scadalink instance assign-area <code> --area <area> scadalink instance alarm-override delete --instance-id <id> --alarm <name>
scadalink instance enable <code> scadalink instance alarm-override list --instance-id <id>
scadalink instance disable <code> scadalink instance set-area --id <id> [--area-id <id>]
scadalink instance delete <code> scadalink instance diff --id <id>
scadalink instance deploy --id <id>
scadalink instance enable --id <id>
scadalink instance disable --id <id>
scadalink instance delete --id <id>
``` ```
`--bindings` is a JSON array of `[attributeName, dataConnectionId]` pairs, e.g.
`[["Speed", 5], ["Mode", 7]]`. `--overrides` is a JSON object of attribute name to
value, e.g. `{"Speed": "100", "Mode": null}`.
### Site Commands ### Site Commands
``` ```
scadalink site list [--format json|table] scadalink site list
scadalink site get <site-id> [--format json|table] scadalink site get --id <id>
scadalink site create --name <name> --id <site-id> [--node-a-address <addr>] [--node-b-address <addr>] [--grpc-node-a-address <addr>] [--grpc-node-b-address <addr>] scadalink site create --identifier <id> --name <name> [--description <desc>] [--node-a-address <addr>] [--node-b-address <addr>] [--grpc-node-a-address <addr>] [--grpc-node-b-address <addr>]
scadalink site update <site-id> --file <path> scadalink site update --id <id> [--name <name>] [--description <desc>] [--node-a-address <addr>] [--node-b-address <addr>] [--grpc-node-a-address <addr>] [--grpc-node-b-address <addr>]
scadalink site delete <site-id> scadalink site delete --id <id>
scadalink site area list <site-id> scadalink site area list --site-id <id>
scadalink site area create <site-id> --name <name> [--parent <parent-area>] scadalink site area create --site-id <id> --name <name> [--parent-id <id>]
scadalink site area update <site-id> --name <name> [--new-name <name>] [--parent <parent-area>] scadalink site area update --id <id> --name <name>
scadalink site area delete <site-id> --name <name> scadalink site area delete --id <id>
scadalink site deploy-artifacts [--site-id <id>]
``` ```
### Deployment Commands ### Deployment Commands
``` ```
scadalink deploy instance <code> scadalink deploy instance --id <id>
scadalink deploy artifacts [--site <site>] [--type <artifact-type>] scadalink deploy artifacts [--site-id <id>]
scadalink deploy status [--format json|table] scadalink deploy status [--instance-id <id>] [--status <status>] [--page <n>] [--page-size <n>]
``` ```
### Data Connection Commands ### Data Connection Commands
``` ```
scadalink data-connection list [--format json|table] scadalink data-connection list [--site-id <id>]
scadalink data-connection get <name> [--format json|table] scadalink data-connection get --id <id>
scadalink data-connection create --file <path> scadalink data-connection create --site-id <id> --name <name> --protocol <protocol> [--backup-config <json>] [--failover-retry-count <n>]
scadalink data-connection update <name> --file <path> scadalink data-connection update --id <id> [--name <name>] [--protocol <protocol>] [--backup-config <json>] [--failover-retry-count <n>]
scadalink data-connection delete <name> scadalink data-connection delete --id <id>
scadalink data-connection assign <name> --site <site-id>
scadalink data-connection unassign <name> --site <site-id>
``` ```
### External System Commands ### External System Commands
``` ```
scadalink external-system list [--format json|table] scadalink external-system list
scadalink external-system get <name> [--format json|table] scadalink external-system get --id <id>
scadalink external-system create --file <path> scadalink external-system create --name <name> --endpoint-url <url> --auth-type <type> [--auth-config <json>]
scadalink external-system update <name> --file <path> scadalink external-system update --id <id> [--name <name>] [--endpoint-url <url>] [--auth-type <type>] [--auth-config <json>]
scadalink external-system delete <name> scadalink external-system delete --id <id>
scadalink external-system method list --external-system-id <id>
scadalink external-system method get --id <id>
scadalink external-system method create --external-system-id <id> --name <name> --http-method <verb> --path <path> [--params <json>] [--return <json>]
scadalink external-system method update --id <id> [--name <name>] [--http-method <verb>] [--path <path>] [--params <json>] [--return <json>]
scadalink external-system method delete --id <id>
``` ```
### Notification Commands ### Notification Commands
``` ```
scadalink notification list [--format json|table] scadalink notification list
scadalink notification get <name> [--format json|table] scadalink notification get --id <id>
scadalink notification create --file <path> scadalink notification create --name <name> --emails <comma-separated>
scadalink notification update <name> --file <path> scadalink notification update --id <id> [--name <name>] [--emails <comma-separated>]
scadalink notification delete <name> scadalink notification delete --id <id>
scadalink notification smtp list [--format json|table] scadalink notification smtp list
scadalink notification smtp update --file <path> scadalink notification smtp update --id <id> --server <host> --port <n> --auth-mode <mode> --from-address <email>
``` ```
### Security Commands ### Security Commands
``` ```
scadalink security api-key list [--format json|table] scadalink security api-key list
scadalink security api-key create --name <name> scadalink security api-key create --name <name>
scadalink security api-key update <name> [--name <new-name>] [--enabled <bool>] scadalink security api-key update --id <id> --enabled <bool>
scadalink security api-key enable <name> scadalink security api-key delete --id <id>
scadalink security api-key disable <name> scadalink security role-mapping list
scadalink security api-key delete <name> scadalink security role-mapping create --ldap-group <group> --role <role>
scadalink security role-mapping list [--format json|table] scadalink security role-mapping update --id <id> [--ldap-group <group>] [--role <role>]
scadalink security role-mapping create --group <ldap-group> --role <role> [--site <site>] scadalink security role-mapping delete --id <id>
scadalink security role-mapping update --id <id> [--group <ldap-group>] [--role <role>] scadalink security scope-rule list [--mapping-id <id>]
scadalink security role-mapping delete --group <ldap-group> --role <role> scadalink security scope-rule add --mapping-id <id> --site-id <id>
scadalink security scope-rule list [--role-mapping-id <id>] [--format json|table]
scadalink security scope-rule add --role-mapping-id <id> --site-id <site-id>
scadalink security scope-rule delete --id <id> scadalink security scope-rule delete --id <id>
``` ```
### Audit Log Commands ### Audit Log Commands
``` ```
scadalink audit-log query [--user <username>] [--entity-type <type>] [--from <date>] [--to <date>] [--format json|table] scadalink audit-log query [--user <username>] [--entity-type <type>] [--action <action>] [--from <date>] [--to <date>] [--page <n>] [--page-size <n>]
``` ```
### Health Commands ### Health Commands
``` ```
scadalink health summary [--format json|table] scadalink health summary
scadalink health site <site-id> [--format json|table] scadalink health site --identifier <site-identifier>
scadalink health event-log --site-identifier <site-id> [--from <date>] [--to <date>] [--search <term>] [--page <n>] [--page-size <n>] [--format json|table] scadalink health event-log --site <site-identifier> [--event-type <type>] [--severity <level>] [--keyword <term>] [--from <date>] [--to <date>] [--page <n>] [--page-size <n>] [--instance-name <name>]
scadalink health parked-messages --site-identifier <site-id> [--page <n>] [--page-size <n>] [--format json|table] scadalink health parked-messages --site <site-identifier> [--page <n>] [--page-size <n>]
``` ```
### Debug Commands ### Debug Commands
``` ```
scadalink debug snapshot --id <id> [--format json|table] scadalink debug snapshot --id <id>
scadalink debug stream --id <instanceId> [--url ...] [--username ...] [--password ...] scadalink debug stream --id <id>
``` ```
The `debug snapshot` command retrieves a point-in-time snapshot via the HTTP Management API. The `debug snapshot` command retrieves a point-in-time snapshot via the HTTP Management API.
@@ -185,31 +200,33 @@ Unlike `debug snapshot` (which uses the HTTP Management API), `debug stream` use
### Shared Script Commands ### Shared Script Commands
``` ```
scadalink shared-script list [--format json|table] scadalink shared-script list
scadalink shared-script get --id <id> [--format json|table] scadalink shared-script get --id <id>
scadalink shared-script create --name <name> --code <code> scadalink shared-script create --name <name> --code <code> [--parameters <json>] [--return-def <json>]
scadalink shared-script update --id <id> [--name <name>] [--code <code>] scadalink shared-script update --id <id> [--name <name>] [--code <code>] [--parameters <json>] [--return-def <json>]
scadalink shared-script delete --id <id> scadalink shared-script delete --id <id>
``` ```
### Database Connection Commands ### Database Connection Commands
``` ```
scadalink db-connection list [--format json|table] scadalink db-connection list
scadalink db-connection get --id <id> [--format json|table] scadalink db-connection get --id <id>
scadalink db-connection create --name <name> --connection-string <string> [--provider <provider>] scadalink db-connection create --name <name> --connection-string <string>
scadalink db-connection update --id <id> [--name <name>] [--connection-string <string>] [--provider <provider>] scadalink db-connection update --id <id> [--name <name>] [--connection-string <string>]
scadalink db-connection delete --id <id> scadalink db-connection delete --id <id>
``` ```
### Inbound API Method Commands ### Inbound API Method Commands
``` ```
scadalink api-method list [--format json|table] scadalink api-method list
scadalink api-method get --id <id> [--format json|table] scadalink api-method get --id <id>
scadalink api-method create --name <name> --code <code> [--description <desc>] scadalink api-method create --name <name> --script <code> [--timeout <seconds>] [--parameters <json>] [--return-def <json>]
scadalink api-method update --id <id> [--name <name>] [--code <code>] [--description <desc>] scadalink api-method update --id <id> [--script <code>] [--timeout <seconds>] [--parameters <json>] [--return-def <json>]
scadalink api-method delete --id <id> scadalink api-method delete --id <id>
``` ```
The `--format json|table` option is recursive and accepted on every command above.
## Configuration ## Configuration
Configuration is resolved in the following priority order (highest wins): Configuration is resolved in the following priority order (highest wins):
@@ -218,7 +235,10 @@ Configuration is resolved in the following priority order (highest wins):
2. **Environment variables**: 2. **Environment variables**:
- `SCADALINK_MANAGEMENT_URL` — Management API URL (e.g., `http://central-host:5000`). - `SCADALINK_MANAGEMENT_URL` — Management API URL (e.g., `http://central-host:5000`).
- `SCADALINK_FORMAT` — Default output format (`json` or `table`). - `SCADALINK_FORMAT` — Default output format (`json` or `table`).
3. **Configuration file**: `~/.scadalink/config.json` — Persistent defaults for management URL and output format. - `SCADALINK_USERNAME` / `SCADALINK_PASSWORD` — LDAP credentials. Preferred over
`--password` on the command line, which is visible in process listings and shell
history. Credentials are never read from the config file.
3. **Configuration file**: `~/.scadalink/config.json` — Persistent defaults for management URL and output format only (never credentials).
### Configuration File Format ### Configuration File Format
+23
View File
@@ -7,6 +7,20 @@ public class CliConfig
public string? ManagementUrl { get; set; } public string? ManagementUrl { get; set; }
public string DefaultFormat { get; set; } = "json"; public string DefaultFormat { get; set; } = "json";
/// <summary>
/// LDAP username from the <c>SCADALINK_USERNAME</c> environment variable, if set.
/// Credentials are intentionally only sourced from environment variables (or the
/// command line) — never from the config file — so they are not persisted to disk.
/// </summary>
public string? Username { get; set; }
/// <summary>
/// LDAP password from the <c>SCADALINK_PASSWORD</c> environment variable, if set.
/// Provides a safer alternative to <c>--password</c>, which leaks into process
/// listings and shell history.
/// </summary>
public string? Password { get; set; }
public static CliConfig Load() public static CliConfig Load()
{ {
var config = new CliConfig(); var config = new CliConfig();
@@ -38,6 +52,15 @@ public class CliConfig
if (!string.IsNullOrEmpty(envFormat)) if (!string.IsNullOrEmpty(envFormat))
config.DefaultFormat = envFormat; config.DefaultFormat = envFormat;
// Credentials from environment variables only (never the config file).
var envUsername = Environment.GetEnvironmentVariable("SCADALINK_USERNAME");
if (!string.IsNullOrEmpty(envUsername))
config.Username = envUsername;
var envPassword = Environment.GetEnvironmentVariable("SCADALINK_PASSWORD");
if (!string.IsNullOrEmpty(envPassword))
config.Password = envPassword;
return config; return config;
} }
+60 -5
View File
@@ -31,14 +31,23 @@ internal static class CommandHelpers
return 1; return 1;
} }
// Validate credentials if (!IsValidManagementUrl(url))
var username = result.GetValue(usernameOption); {
var password = result.GetValue(passwordOption); OutputFormatter.WriteError(
$"Invalid management URL '{url}'. Expected an absolute http/https URL (e.g. http://localhost:9001).",
"INVALID_URL");
return 1;
}
// Resolve credentials: command-line options take precedence, then the
// SCADALINK_USERNAME / SCADALINK_PASSWORD environment variables.
var username = ResolveCredential(result.GetValue(usernameOption), config.Username);
var password = ResolveCredential(result.GetValue(passwordOption), config.Password);
if (string.IsNullOrWhiteSpace(username) || string.IsNullOrWhiteSpace(password)) if (string.IsNullOrWhiteSpace(username) || string.IsNullOrWhiteSpace(password))
{ {
OutputFormatter.WriteError( OutputFormatter.WriteError(
"Credentials required. Use --username and --password options.", "Credentials required. Use --username/--password or set SCADALINK_USERNAME/SCADALINK_PASSWORD.",
"NO_CREDENTIALS"); "NO_CREDENTIALS");
return 1; return 1;
} }
@@ -74,10 +83,40 @@ internal static class CommandHelpers
return string.IsNullOrWhiteSpace(config.DefaultFormat) ? "json" : config.DefaultFormat; return string.IsNullOrWhiteSpace(config.DefaultFormat) ? "json" : config.DefaultFormat;
} }
/// <summary>
/// Resolves a single credential: an explicit command-line value wins, otherwise the
/// environment-variable fallback (from <see cref="CliConfig"/>) is used.
/// </summary>
internal static string? ResolveCredential(string? commandLineValue, string? envValue)
=> string.IsNullOrWhiteSpace(commandLineValue) ? envValue : commandLineValue;
/// <summary>
/// Validates that a management URL is an absolute http/https URL. A malformed URL
/// (missing scheme, empty, or a non-http scheme) would otherwise reach
/// <c>new Uri(...)</c> in the <see cref="ManagementHttpClient"/> constructor and throw
/// an unhandled <see cref="UriFormatException"/>.
/// </summary>
internal static bool IsValidManagementUrl(string? url)
{
if (string.IsNullOrWhiteSpace(url))
return false;
return Uri.TryCreate(url, UriKind.Absolute, out var uri)
&& (uri.Scheme == Uri.UriSchemeHttp || uri.Scheme == Uri.UriSchemeHttps);
}
internal static int HandleResponse(ManagementResponse response, string format) internal static int HandleResponse(ManagementResponse response, string format)
{ {
if (response.JsonData != null) if (response.JsonData != null)
{ {
// A success status with an empty/whitespace body (e.g. a 204 from a delete)
// is a "command succeeded, no output" case — do not attempt to parse it.
if (string.IsNullOrWhiteSpace(response.JsonData))
{
Console.WriteLine("(ok)");
return 0;
}
if (string.Equals(format, "table", StringComparison.OrdinalIgnoreCase)) if (string.Equals(format, "table", StringComparison.OrdinalIgnoreCase))
{ {
WriteAsTable(response.JsonData); WriteAsTable(response.JsonData);
@@ -98,7 +137,22 @@ internal static class CommandHelpers
private static void WriteAsTable(string json) private static void WriteAsTable(string json)
{ {
using var doc = JsonDocument.Parse(json); JsonDocument doc;
try
{
doc = JsonDocument.Parse(json);
}
catch (JsonException)
{
// The server returned a success status but a non-JSON body (e.g. a proxy
// HTML error page, or a plain-text message). Print it verbatim rather than
// crashing — mirrors the raw-body fallback on the JSON path.
Console.WriteLine(json);
return;
}
using (doc)
{
var root = doc.RootElement; var root = doc.RootElement;
if (root.ValueKind == JsonValueKind.Array) if (root.ValueKind == JsonValueKind.Array)
@@ -141,3 +195,4 @@ internal static class CommandHelpers
} }
} }
} }
}
+11 -3
View File
@@ -57,13 +57,21 @@ public static class DebugCommands
return 1; return 1;
} }
var username = result.GetValue(usernameOption); if (!CommandHelpers.IsValidManagementUrl(url))
var password = result.GetValue(passwordOption); {
OutputFormatter.WriteError(
$"Invalid management URL '{url}'. Expected an absolute http/https URL (e.g. http://localhost:9001).",
"INVALID_URL");
return 1;
}
var username = CommandHelpers.ResolveCredential(result.GetValue(usernameOption), config.Username);
var password = CommandHelpers.ResolveCredential(result.GetValue(passwordOption), config.Password);
if (string.IsNullOrWhiteSpace(username) || string.IsNullOrWhiteSpace(password)) if (string.IsNullOrWhiteSpace(username) || string.IsNullOrWhiteSpace(password))
{ {
OutputFormatter.WriteError( OutputFormatter.WriteError(
"Credentials required. Use --username and --password options.", "Credentials required. Use --username/--password or set SCADALINK_USERNAME/SCADALINK_PASSWORD.",
"NO_CREDENTIALS"); "NO_CREDENTIALS");
return 1; return 1;
} }
+100 -8
View File
@@ -52,17 +52,106 @@ public static class InstanceCommands
{ {
var id = result.GetValue(idOption); var id = result.GetValue(idOption);
var bindingsJson = result.GetValue(bindingsOption)!; var bindingsJson = result.GetValue(bindingsOption)!;
var pairs = System.Text.Json.JsonSerializer.Deserialize<List<List<System.Text.Json.JsonElement>>>(bindingsJson) if (!TryParseBindings(bindingsJson, out var bindings, out var error))
?? throw new InvalidOperationException("Invalid bindings JSON"); {
var bindings = pairs.Select(p => OutputFormatter.WriteError(error!, "INVALID_ARGUMENT");
(p[0].GetString()!, p[1].GetInt32())).ToList(); return 1;
}
return await CommandHelpers.ExecuteCommandAsync( return await CommandHelpers.ExecuteCommandAsync(
result, urlOption, formatOption, usernameOption, passwordOption, result, urlOption, formatOption, usernameOption, passwordOption,
new SetConnectionBindingsCommand(id, bindings)); new SetConnectionBindingsCommand(id, bindings!));
}); });
return cmd; return cmd;
} }
/// <summary>
/// Parses the <c>--bindings</c> argument — a JSON array of
/// <c>[attributeName, dataConnectionId]</c> pairs — into a typed list.
/// Returns <c>false</c> with a descriptive <paramref name="error"/> instead of
/// throwing when the JSON is malformed, a pair has the wrong arity, or an element
/// has the wrong type.
/// </summary>
internal static bool TryParseBindings(
string json,
out List<(string, int)>? bindings,
out string? error)
{
bindings = null;
error = null;
try
{
var pairs = System.Text.Json.JsonSerializer
.Deserialize<List<List<System.Text.Json.JsonElement>>>(json);
if (pairs == null)
{
error = "Bindings JSON must be a non-null array of [attributeName, dataConnectionId] pairs.";
return false;
}
var result = new List<(string, int)>(pairs.Count);
foreach (var pair in pairs)
{
if (pair.Count != 2)
{
error = "Each binding must be a [attributeName, dataConnectionId] pair of exactly two elements.";
return false;
}
if (pair[0].ValueKind != System.Text.Json.JsonValueKind.String)
{
error = "The first element of each binding (attributeName) must be a string.";
return false;
}
if (pair[1].ValueKind != System.Text.Json.JsonValueKind.Number
|| !pair[1].TryGetInt32(out var connectionId))
{
error = "The second element of each binding (dataConnectionId) must be an integer.";
return false;
}
result.Add((pair[0].GetString()!, connectionId));
}
bindings = result;
return true;
}
catch (System.Text.Json.JsonException ex)
{
error = $"Invalid bindings JSON: {ex.Message}";
return false;
}
}
/// <summary>
/// Parses the <c>--overrides</c> argument — a JSON object of
/// <c>attributeName -> value</c> pairs — into a typed dictionary. Returns
/// <c>false</c> with a descriptive <paramref name="error"/> instead of throwing
/// when the JSON is malformed or null.
/// </summary>
internal static bool TryParseOverrides(
string json,
out Dictionary<string, string?>? overrides,
out string? error)
{
overrides = null;
error = null;
try
{
var parsed = System.Text.Json.JsonSerializer
.Deserialize<Dictionary<string, string?>>(json);
if (parsed == null)
{
error = "Overrides JSON must be a non-null object of attribute name -> value pairs.";
return false;
}
overrides = parsed;
return true;
}
catch (System.Text.Json.JsonException ex)
{
error = $"Invalid overrides JSON: {ex.Message}";
return false;
}
}
private static Command BuildList(Option<string> urlOption, Option<string> formatOption, Option<string> usernameOption, Option<string> passwordOption) private static Command BuildList(Option<string> urlOption, Option<string> formatOption, Option<string> usernameOption, Option<string> passwordOption)
{ {
var siteIdOption = new Option<int?>("--site-id") { Description = "Filter by site ID" }; var siteIdOption = new Option<int?>("--site-id") { Description = "Filter by site ID" };
@@ -178,11 +267,14 @@ public static class InstanceCommands
{ {
var id = result.GetValue(idOption); var id = result.GetValue(idOption);
var overridesJson = result.GetValue(overridesOption)!; var overridesJson = result.GetValue(overridesOption)!;
var overrides = System.Text.Json.JsonSerializer.Deserialize<Dictionary<string, string?>>(overridesJson) if (!TryParseOverrides(overridesJson, out var overrides, out var error))
?? throw new InvalidOperationException("Invalid overrides JSON"); {
OutputFormatter.WriteError(error!, "INVALID_ARGUMENT");
return 1;
}
return await CommandHelpers.ExecuteCommandAsync( return await CommandHelpers.ExecuteCommandAsync(
result, urlOption, formatOption, usernameOption, passwordOption, result, urlOption, formatOption, usernameOption, passwordOption,
new SetInstanceOverridesCommand(id, overrides)); new SetInstanceOverridesCommand(id, overrides!));
}); });
return cmd; return cmd;
} }
+2
View File
@@ -59,6 +59,8 @@ For the Docker test environment, see `docker/README.md` for a ready-to-use confi
|----------|-------------| |----------|-------------|
| `SCADALINK_MANAGEMENT_URL` | Management API URL (overrides config file) | | `SCADALINK_MANAGEMENT_URL` | Management API URL (overrides config file) |
| `SCADALINK_FORMAT` | Default output format (overrides config file) | | `SCADALINK_FORMAT` | Default output format (overrides config file) |
| `SCADALINK_USERNAME` | LDAP username (fallback when `--username` is not supplied) |
| `SCADALINK_PASSWORD` | LDAP password (fallback when `--password` is not supplied). Preferred over `--password` on the command line, which leaks into process listings and shell history. |
## Output ## Output
@@ -0,0 +1,41 @@
namespace ScadaLink.CentralUI.Components;
/// <summary>
/// Converts <c>&lt;input type="datetime-local"&gt;</c> values — which are always
/// expressed in the user's <i>browser-local</i> time zone — into UTC
/// <see cref="DateTimeOffset"/>s for querying.
/// <para>
/// CLAUDE.md mandates UTC throughout the system, but a <c>datetime-local</c>
/// value carries no offset, so it must be <i>converted</i> to UTC, not relabelled
/// as UTC. Relabelling (the CentralUI-008 bug) shifts every query window by the
/// user's offset for any non-UTC browser.
/// </para>
/// </summary>
public static class BrowserTime
{
/// <summary>
/// Converts a browser-local <paramref name="localValue"/> to UTC using the
/// browser's <c>Date.getTimezoneOffset()</c> result.
/// </summary>
/// <param name="localValue">
/// The wall-clock value from a <c>datetime-local</c> input, or <c>null</c>.
/// </param>
/// <param name="browserUtcOffsetMinutes">
/// The value of JavaScript <c>new Date().getTimezoneOffset()</c>: the number
/// of minutes that, <b>added</b> to local time, yields UTC. It is positive
/// for time zones behind UTC (e.g. +300 for UTC-5) and negative for zones
/// ahead (e.g. -120 for UTC+2).
/// </param>
/// <returns>The equivalent instant in UTC, or <c>null</c> when the input is null.</returns>
public static DateTimeOffset? LocalInputToUtc(DateTime? localValue, int browserUtcOffsetMinutes)
{
if (localValue is not { } local)
return null;
// getTimezoneOffset() is defined as (UTC - local) in minutes, so
// UTC = local + offset.
var utc = DateTime.SpecifyKind(local, DateTimeKind.Unspecified)
.AddMinutes(browserUtcOffsetMinutes);
return new DateTimeOffset(utc, TimeSpan.Zero);
}
}
@@ -65,17 +65,22 @@
</Authorized> </Authorized>
</AuthorizeView> </AuthorizeView>
@* Monitoring — visible to all authenticated users *@ @* Monitoring — Health Dashboard is all-roles; Event Logs and
Parked Messages are Deployment-role only (Component-CentralUI). *@
<div role="presentation" class="nav-section-header">Monitoring</div> <div role="presentation" class="nav-section-header">Monitoring</div>
<li class="nav-item"> <li class="nav-item">
<NavLink class="nav-link" href="/monitoring/health">Health Dashboard</NavLink> <NavLink class="nav-link" href="/monitoring/health">Health Dashboard</NavLink>
</li> </li>
<AuthorizeView Policy="@AuthorizationPolicies.RequireDeployment">
<Authorized Context="monitoringContext">
<li class="nav-item"> <li class="nav-item">
<NavLink class="nav-link" href="/monitoring/event-logs">Event Logs</NavLink> <NavLink class="nav-link" href="/monitoring/event-logs">Event Logs</NavLink>
</li> </li>
<li class="nav-item"> <li class="nav-item">
<NavLink class="nav-link" href="/monitoring/parked-messages">Parked Messages</NavLink> <NavLink class="nav-link" href="/monitoring/parked-messages">Parked Messages</NavLink>
</li> </li>
</Authorized>
</AuthorizeView>
@* Audit Log — Admin only *@ @* Audit Log — Admin only *@
<AuthorizeView Policy="@AuthorizationPolicies.RequireAdmin"> <AuthorizeView Policy="@AuthorizationPolicies.RequireAdmin">
@@ -194,15 +194,12 @@
try try
{ {
_sites = (await SiteRepository.GetAllSitesAsync()).ToList(); _sites = (await SiteRepository.GetAllSitesAsync()).ToList();
_siteConnections.Clear();
foreach (var site in _sites) // CentralUI-012: fetch all data connections in one query and group
{ // them by site, instead of issuing one query per site (N+1).
var connections = await SiteRepository.GetDataConnectionsBySiteIdAsync(site.Id); _siteConnections = (await SiteRepository.GetAllDataConnectionsAsync())
if (connections.Count > 0) .GroupBy(c => c.SiteId)
{ .ToDictionary(g => g.Key, g => g.ToList());
_siteConnections[site.Id] = connections.ToList();
}
}
} }
catch (Exception ex) catch (Exception ex)
{ {
@@ -293,6 +293,13 @@
private string? _initError; private string? _initError;
// CentralUI-009: the stream callbacks (onEvent/onTerminated) run on an
// Akka/gRPC thread and capture `this` and `_toast`. Once the component is
// disposed, an in-flight callback must no-op rather than touch a disposed
// component (InvokeAsync would throw ObjectDisposedException) or a disposed
// ToastNotification.
private volatile bool _disposed;
protected override async Task OnInitializedAsync() protected override async Task OnInitializedAsync()
{ {
try try
@@ -396,15 +403,18 @@
_selectedInstanceId, _selectedInstanceId,
onEvent: evt => onEvent: evt =>
{ {
// CentralUI-009: the component may have been disposed while
// this event was in flight on the Akka/gRPC thread.
if (_disposed) return;
switch (evt) switch (evt)
{ {
case AttributeValueChanged av: case AttributeValueChanged av:
UpsertWithCap(_attributeValues, av.AttributeName, av); UpsertWithCap(_attributeValues, av.AttributeName, av);
_ = InvokeAsync(StateHasChanged); SafeInvokeStateHasChanged();
break; break;
case AlarmStateChanged al: case AlarmStateChanged al:
UpsertWithCap(_alarmStates, al.AlarmName, al); UpsertWithCap(_alarmStates, al.AlarmName, al);
_ = InvokeAsync(StateHasChanged); SafeInvokeStateHasChanged();
break; break;
} }
}, },
@@ -412,8 +422,11 @@
{ {
_connected = false; _connected = false;
_session = null; _session = null;
_ = InvokeAsync(() => // CentralUI-009: skip the toast/render if already disposed.
if (_disposed) return;
_ = SafeInvokeAsync(() =>
{ {
if (_disposed) return;
_toast.ShowError("Debug stream terminated (site disconnected)."); _toast.ShowError("Debug stream terminated (site disconnected).");
StateHasChanged(); StateHasChanged();
}); });
@@ -546,8 +559,31 @@
_ => "—" _ => "—"
}; };
/// <summary>
/// Runs <paramref name="action"/> on the render thread, guarded against the
/// component being disposed mid-flight (CentralUI-009): <c>InvokeAsync</c>
/// throws <see cref="ObjectDisposedException"/> once the circuit is gone.
/// </summary>
private async Task SafeInvokeAsync(Action action)
{
if (_disposed) return;
try
{
await InvokeAsync(action);
}
catch (ObjectDisposedException)
{
// Component disposed between the guard and the dispatch — ignore.
}
}
private void SafeInvokeStateHasChanged() => _ = SafeInvokeAsync(StateHasChanged);
public void Dispose() public void Dispose()
{ {
// CentralUI-009: mark disposed first so any in-flight stream callback
// sees the flag and no-ops, then stop the stream synchronously.
_disposed = true;
if (_session != null) if (_session != null)
{ {
DebugStreamService.StopStream(_session.SessionId); DebugStreamService.StopStream(_session.SessionId);
@@ -1,5 +1,6 @@
@page "/monitoring/audit-log" @page "/monitoring/audit-log"
@using ScadaLink.Security @using ScadaLink.Security
@using ScadaLink.CentralUI.Components
@using ScadaLink.Commons.Entities.Audit @using ScadaLink.Commons.Entities.Audit
@using ScadaLink.Commons.Interfaces.Repositories @using ScadaLink.Commons.Interfaces.Repositories
@attribute [Authorize(Policy = AuthorizationPolicies.RequireAdmin)] @attribute [Authorize(Policy = AuthorizationPolicies.RequireAdmin)]
@@ -195,6 +196,12 @@
private DateTime? _filterFrom; private DateTime? _filterFrom;
private DateTime? _filterTo; private DateTime? _filterTo;
// The datetime-local filter inputs are in the browser's local time zone.
// This holds new Date().getTimezoneOffset() so the values are converted to
// UTC (CentralUI-008) rather than relabelled. Until JS interop runs it is 0
// (UTC), which is a safe default for a UTC server/browser.
private int _browserUtcOffsetMinutes;
private List<AuditLogEntry>? _entries; private List<AuditLogEntry>? _entries;
private int _totalCount; private int _totalCount;
private int _page = 1; private int _page = 1;
@@ -209,6 +216,23 @@
private int TotalPages => _pageSize > 0 ? Math.Max(1, (_totalCount + _pageSize - 1) / _pageSize) : 1; private int TotalPages => _pageSize > 0 ? Math.Max(1, (_totalCount + _pageSize - 1) / _pageSize) : 1;
private bool HasMore => _page * _pageSize < _totalCount; private bool HasMore => _page * _pageSize < _totalCount;
protected override async Task OnAfterRenderAsync(bool firstRender)
{
if (!firstRender) return;
try
{
// Date.getTimezoneOffset() returns (UTC - local) in minutes.
_browserUtcOffsetMinutes = await JS.InvokeAsync<int>(
"eval", "new Date().getTimezoneOffset()");
}
catch (Exception ex) when (ex is JSException or JSDisconnectedException
or InvalidOperationException or TaskCanceledException)
{
// Prerender or a disconnected circuit: fall back to UTC (offset 0).
_browserUtcOffsetMinutes = 0;
}
}
private async Task Search() private async Task Search()
{ {
_page = 1; _page = 1;
@@ -239,8 +263,8 @@
user: string.IsNullOrWhiteSpace(_filterUser) ? null : _filterUser.Trim(), user: string.IsNullOrWhiteSpace(_filterUser) ? null : _filterUser.Trim(),
entityType: string.IsNullOrWhiteSpace(_filterEntityType) ? null : _filterEntityType.Trim(), entityType: string.IsNullOrWhiteSpace(_filterEntityType) ? null : _filterEntityType.Trim(),
action: string.IsNullOrWhiteSpace(_filterAction) ? null : _filterAction.Trim(), action: string.IsNullOrWhiteSpace(_filterAction) ? null : _filterAction.Trim(),
from: _filterFrom.HasValue ? new DateTimeOffset(_filterFrom.Value, TimeSpan.Zero) : null, from: BrowserTime.LocalInputToUtc(_filterFrom, _browserUtcOffsetMinutes),
to: _filterTo.HasValue ? new DateTimeOffset(_filterTo.Value, TimeSpan.Zero) : null, to: BrowserTime.LocalInputToUtc(_filterTo, _browserUtcOffsetMinutes),
page: _page, page: _page,
pageSize: _pageSize); pageSize: _pageSize);
@@ -1,5 +1,5 @@
@page "/monitoring/event-logs" @page "/monitoring/event-logs"
@attribute [Authorize] @attribute [Authorize(Policy = ScadaLink.Security.AuthorizationPolicies.RequireDeployment)]
@using ScadaLink.Commons.Entities.Sites @using ScadaLink.Commons.Entities.Sites
@using ScadaLink.Commons.Interfaces.Repositories @using ScadaLink.Commons.Interfaces.Repositories
@using ScadaLink.Commons.Messages.RemoteQuery @using ScadaLink.Commons.Messages.RemoteQuery
@@ -1,5 +1,5 @@
@page "/monitoring/parked-messages" @page "/monitoring/parked-messages"
@attribute [Authorize] @attribute [Authorize(Policy = ScadaLink.Security.AuthorizationPolicies.RequireDeployment)]
@using ScadaLink.Commons.Entities.Sites @using ScadaLink.Commons.Entities.Sites
@using ScadaLink.Commons.Interfaces.Repositories @using ScadaLink.Commons.Interfaces.Repositories
@using ScadaLink.Commons.Messages.RemoteQuery @using ScadaLink.Commons.Messages.RemoteQuery
@@ -150,6 +150,11 @@
public async ValueTask DisposeAsync() public async ValueTask DisposeAsync()
{ {
// CentralUI-011: if the dialog is disposed while still open (the user
// navigated away), complete the pending task so the awaiting caller
// resumes deterministically instead of hanging forever.
_tcs?.TrySetResult(false);
if (_bodyLocked) if (_bodyLocked)
{ {
await TryUnlockBodyAsync(); await TryUnlockBodyAsync();
@@ -30,6 +30,16 @@
private readonly List<ToastItem> _toasts = new(); private readonly List<ToastItem> _toasts = new();
private readonly object _lock = new(); private readonly object _lock = new();
// Cancels all pending auto-dismiss delays when the component is disposed
// (CentralUI-010) so their continuations never touch a disposed component.
private readonly CancellationTokenSource _disposalCts = new();
/// <summary>Number of toasts currently displayed.</summary>
public int ToastCount
{
get { lock (_lock) { return _toasts.Count; } }
}
public void ShowSuccess(string message, string title = "Success", int? autoDismissMs = null) public void ShowSuccess(string message, string title = "Success", int? autoDismissMs = null)
{ {
AddToast(title, message, ToastType.Success, autoDismissMs); AddToast(title, message, ToastType.Success, autoDismissMs);
@@ -52,6 +62,9 @@
private void AddToast(string title, string message, ToastType type, int? autoDismissMs) private void AddToast(string title, string message, ToastType type, int? autoDismissMs)
{ {
// If the component is already disposed, do not add or schedule anything.
if (_disposalCts.IsCancellationRequested) return;
var toast = new ToastItem { Title = title, Message = message, Type = type }; var toast = new ToastItem { Title = title, Message = message, Type = type };
lock (_lock) lock (_lock)
{ {
@@ -60,14 +73,41 @@
StateHasChanged(); StateHasChanged();
var dismissMs = autoDismissMs ?? DefaultAutoDismissMs; var dismissMs = autoDismissMs ?? DefaultAutoDismissMs;
_ = Task.Delay(dismissMs).ContinueWith(_ => _ = AutoDismissAsync(toast, dismissMs, _disposalCts.Token);
}
/// <summary>
/// Removes a toast after its dismiss delay. The delay is bound to the
/// component's disposal token (CentralUI-010): if the host page is disposed
/// first, the delay is cancelled and the continuation never touches the
/// disposed component — no <see cref="ObjectDisposedException"/> escapes.
/// </summary>
private async Task AutoDismissAsync(ToastItem toast, int dismissMs, CancellationToken token)
{ {
try
{
await Task.Delay(dismissMs, token);
}
catch (OperationCanceledException)
{
return;
}
if (token.IsCancellationRequested) return;
lock (_lock) lock (_lock)
{ {
_toasts.Remove(toast); _toasts.Remove(toast);
} }
InvokeAsync(StateHasChanged);
}); try
{
await InvokeAsync(StateHasChanged);
}
catch (ObjectDisposedException)
{
// Component disposed between the token check and the render — ignore.
}
} }
private void Dismiss(ToastItem toast) private void Dismiss(ToastItem toast)
@@ -87,7 +127,11 @@
_ => "bg-secondary text-white" _ => "bg-secondary text-white"
}; };
public void Dispose() { } public void Dispose()
{
_disposalCts.Cancel();
_disposalCts.Dispose();
}
private enum ToastType { Success, Error, Warning, Info } private enum ToastType { Success, Error, Warning, Info }
@@ -19,11 +19,11 @@ public static class ScriptAnalysisEndpoints
group.MapPost("/completions", async (CompletionsRequest req, ScriptAnalysisService svc) => group.MapPost("/completions", async (CompletionsRequest req, ScriptAnalysisService svc) =>
Results.Ok(await svc.CompleteAsync(req))); Results.Ok(await svc.CompleteAsync(req)));
group.MapPost("/hover", (HoverRequest req, ScriptAnalysisService svc) => group.MapPost("/hover", async (HoverRequest req, ScriptAnalysisService svc) =>
Results.Ok(svc.Hover(req))); Results.Ok(await svc.Hover(req)));
group.MapPost("/signature-help", (SignatureHelpRequest req, ScriptAnalysisService svc) => group.MapPost("/signature-help", async (SignatureHelpRequest req, ScriptAnalysisService svc) =>
Results.Ok(svc.SignatureHelp(req))); Results.Ok(await svc.SignatureHelp(req)));
group.MapPost("/format", (FormatRequest req, ScriptAnalysisService svc) => group.MapPost("/format", (FormatRequest req, ScriptAnalysisService svc) =>
Results.Ok(svc.Format(req))); Results.Ok(svc.Format(req)));
@@ -722,7 +722,7 @@ public class ScriptAnalysisService
public InlayHintsResponse InlayHints(InlayHintsRequest request) => public InlayHintsResponse InlayHints(InlayHintsRequest request) =>
new(Array.Empty<InlayHint>()); new(Array.Empty<InlayHint>());
public HoverResponse Hover(HoverRequest request) public async Task<HoverResponse> Hover(HoverRequest request)
{ {
var script = TryParse(request.CodeText); var script = TryParse(request.CodeText);
if (script == null) return new HoverResponse(null); if (script == null) return new HoverResponse(null);
@@ -762,13 +762,13 @@ public class ScriptAnalysisService
var rawName = token.ValueText; var rawName = token.ValueText;
if (string.IsNullOrEmpty(rawName)) return new HoverResponse(null); if (string.IsNullOrEmpty(rawName)) return new HoverResponse(null);
var shape = ResolveCalledShape( var shape = await ResolveCalledShape(
call, rawName, request.SiblingScripts, request.Children, request.Parent); call, rawName, request.SiblingScripts, request.Children, request.Parent);
if (shape == null) return new HoverResponse(null); if (shape == null) return new HoverResponse(null);
return new HoverResponse(FormatHover(shape, call)); return new HoverResponse(FormatHover(shape, call));
} }
public SignatureHelpResponse SignatureHelp(SignatureHelpRequest request) public async Task<SignatureHelpResponse> SignatureHelp(SignatureHelpRequest request)
{ {
var empty = new SignatureHelpResponse(null, null, 0); var empty = new SignatureHelpResponse(null, null, 0);
var script = TryParse(request.CodeText); var script = TryParse(request.CodeText);
@@ -803,7 +803,7 @@ public class ScriptAnalysisService
var nameArg = inv.ArgumentList.Arguments[0].Expression as LiteralExpressionSyntax; var nameArg = inv.ArgumentList.Arguments[0].Expression as LiteralExpressionSyntax;
var scriptName = nameArg?.Token.ValueText ?? ""; var scriptName = nameArg?.Token.ValueText ?? "";
var shape = ResolveCalledShape( var shape = await ResolveCalledShape(
call, scriptName, request.SiblingScripts, request.Children, request.Parent); call, scriptName, request.SiblingScripts, request.Children, request.Parent);
if (shape == null) return empty; if (shape == null) return empty;
@@ -964,22 +964,35 @@ public class ScriptAnalysisService
_ => "script" _ => "script"
}; };
/// <summary>Resolves the called script's shape from the metadata in scope for its kind.</summary> /// <summary>
private ScriptShape? ResolveCalledShape( /// Resolves the called script's shape from the metadata in scope for its kind.
/// CentralUI-013: the shared-script catalog is awaited rather than blocked on
/// with <c>.GetAwaiter().GetResult()</c>, so this method is async — and
/// <see cref="Hover"/> / <see cref="SignatureHelp"/> are async with it.
/// </summary>
private async Task<ScriptShape?> ResolveCalledShape(
ScriptCallInfo call, ScriptCallInfo call,
string scriptName, string scriptName,
IReadOnlyList<ScriptShape>? siblings, IReadOnlyList<ScriptShape>? siblings,
IReadOnlyList<CompositionContext>? children, IReadOnlyList<CompositionContext>? children,
CompositionContext? parent) => call.Kind switch CompositionContext? parent)
{ {
ScriptCallKind.Shared => _sharedScripts.GetShapesAsync().GetAwaiter().GetResult() switch (call.Kind)
.FirstOrDefault(s => s.Name == scriptName), {
ScriptCallKind.Sibling => siblings?.FirstOrDefault(s => s.Name == scriptName), case ScriptCallKind.Shared:
ScriptCallKind.Parent => parent?.Scripts.FirstOrDefault(s => s.Name == scriptName), var shapes = await _sharedScripts.GetShapesAsync();
ScriptCallKind.Child => children?.FirstOrDefault(c => c.Name == call.CompositionName) return shapes.FirstOrDefault(s => s.Name == scriptName);
?.Scripts.FirstOrDefault(s => s.Name == scriptName), case ScriptCallKind.Sibling:
_ => null return siblings?.FirstOrDefault(s => s.Name == scriptName);
}; case ScriptCallKind.Parent:
return parent?.Scripts.FirstOrDefault(s => s.Name == scriptName);
case ScriptCallKind.Child:
return children?.FirstOrDefault(c => c.Name == call.CompositionName)
?.Scripts.FirstOrDefault(s => s.Name == scriptName);
default:
return null;
}
}
/// <summary> /// <summary>
/// SCADA006 — flag <c>Attributes["typo"]</c>, /// SCADA006 — flag <c>Attributes["typo"]</c>,
@@ -1,11 +1,75 @@
namespace ScadaLink.ClusterInfrastructure; namespace ScadaLink.ClusterInfrastructure;
/// <summary>
/// Cluster configuration model, bound from the <c>ScadaLink:Cluster</c> section
/// of <c>appsettings.json</c> via the Options pattern.
/// <para>
/// This project owns the cluster <em>configuration contract</em>. The actual
/// Akka.NET bootstrap — building the HOCON from these values, starting the
/// <c>ActorSystem</c>, configuring the split-brain resolver and wiring
/// <c>CoordinatedShutdown</c> — lives in <c>ScadaLink.Host</c>
/// (see <c>Component-ClusterInfrastructure.md</c> → "Implementation Note — Code Placement").
/// </para>
/// <para>
/// Node-identity settings (remoting hostname/port, cluster role, site identifier,
/// gRPC port) are deliberately <em>not</em> here — they are owned by
/// <c>ScadaLink.Host.NodeOptions</c> (<c>ScadaLink:Node</c> section). Local SQLite
/// storage paths are owned by the database / store-and-forward options. This class
/// holds only the cluster-formation and failure-detection settings shared by every node.
/// </para>
/// </summary>
public class ClusterOptions public class ClusterOptions
{ {
/// <summary>
/// The <c>appsettings.json</c> section name this options class binds from.
/// Single source of truth so binding sites do not hard-code the magic string.
/// </summary>
public const string SectionName = "ScadaLink:Cluster";
/// <summary>
/// Akka.NET cluster seed nodes. Both nodes are seed nodes — each node lists
/// itself and its partner — so either can start first and form the cluster.
/// Must contain at least one entry.
/// </summary>
public List<string> SeedNodes { get; set; } = new(); public List<string> SeedNodes { get; set; } = new();
/// <summary>
/// Split-brain resolver strategy. Must be <c>keep-oldest</c> for the two-node
/// clusters ScadaLink uses: quorum strategies (<c>keep-majority</c>,
/// <c>static-quorum</c>) cannot distinguish a crash from a partition with only
/// two nodes and would shut down the whole cluster.
/// </summary>
public string SplitBrainResolverStrategy { get; set; } = "keep-oldest"; public string SplitBrainResolverStrategy { get; set; } = "keep-oldest";
/// <summary>
/// Time the cluster membership must remain stable before the split-brain
/// resolver acts to down unreachable nodes. Must be positive. Default 15s.
/// </summary>
public TimeSpan StableAfter { get; set; } = TimeSpan.FromSeconds(15); public TimeSpan StableAfter { get; set; } = TimeSpan.FromSeconds(15);
/// <summary>
/// Frequency of cluster failure-detector heartbeat messages between nodes.
/// Must be well below <see cref="FailureDetectionThreshold"/>. Default 2s.
/// </summary>
public TimeSpan HeartbeatInterval { get; set; } = TimeSpan.FromSeconds(2); public TimeSpan HeartbeatInterval { get; set; } = TimeSpan.FromSeconds(2);
/// <summary>
/// Time without a heartbeat before a node is considered unreachable
/// (Akka's <c>acceptable-heartbeat-pause</c>). Default 10s.
/// </summary>
public TimeSpan FailureDetectionThreshold { get; set; } = TimeSpan.FromSeconds(10); public TimeSpan FailureDetectionThreshold { get; set; } = TimeSpan.FromSeconds(10);
/// <summary>
/// Akka's <c>min-nr-of-members</c>. Must be <c>1</c>: after failover only one
/// node runs, and a value of <c>2</c> blocks the cluster singleton (Site Runtime
/// Deployment Manager) — and therefore all data collection — indefinitely.
/// </summary>
public int MinNrOfMembers { get; set; } = 1; public int MinNrOfMembers { get; set; } = 1;
/// <summary>
/// The keep-oldest resolver's <c>down-if-alone</c> flag. When <c>true</c> (the
/// design-doc requirement), the oldest node downs itself if it finds it has no
/// other reachable members, rather than running as an isolated single-node cluster.
/// </summary>
public bool DownIfAlone { get; set; } = true;
} }
@@ -0,0 +1,72 @@
using Microsoft.Extensions.Options;
namespace ScadaLink.ClusterInfrastructure;
/// <summary>
/// CI-004: Validates <see cref="ClusterOptions"/> at startup. The values it
/// guards carry cluster-wide consequences — the design doc
/// (<c>Component-ClusterInfrastructure.md</c>) is emphatic that misconfiguring
/// them produces a total cluster shutdown or an indefinitely blocked singleton.
/// Registered with <c>ValidateOnStart()</c> so a bad <c>appsettings.json</c>
/// fails fast at boot rather than failing far from the cause.
/// </summary>
public sealed class ClusterOptionsValidator : IValidateOptions<ClusterOptions>
{
/// <summary>Split-brain resolver strategies safe for ScadaLink's two-node clusters.</summary>
private static readonly HashSet<string> AllowedStrategies = new(StringComparer.OrdinalIgnoreCase)
{
"keep-oldest"
};
public ValidateOptionsResult Validate(string? name, ClusterOptions options)
{
var failures = new List<string>();
if (options.SeedNodes is null || options.SeedNodes.Count == 0)
{
failures.Add("ClusterOptions.SeedNodes must contain at least one seed node.");
}
if (string.IsNullOrWhiteSpace(options.SplitBrainResolverStrategy)
|| !AllowedStrategies.Contains(options.SplitBrainResolverStrategy))
{
failures.Add(
$"ClusterOptions.SplitBrainResolverStrategy must be 'keep-oldest' for a two-node cluster; " +
$"'{options.SplitBrainResolverStrategy}' would risk a total cluster shutdown on a partition.");
}
if (options.MinNrOfMembers != 1)
{
failures.Add(
$"ClusterOptions.MinNrOfMembers must be 1 (was {options.MinNrOfMembers}); " +
"any other value blocks the cluster singleton after failover and halts all data collection.");
}
if (options.StableAfter <= TimeSpan.Zero)
{
failures.Add("ClusterOptions.StableAfter must be a positive duration.");
}
if (options.HeartbeatInterval <= TimeSpan.Zero)
{
failures.Add("ClusterOptions.HeartbeatInterval must be a positive duration.");
}
if (options.FailureDetectionThreshold <= TimeSpan.Zero)
{
failures.Add("ClusterOptions.FailureDetectionThreshold must be a positive duration.");
}
if (options.HeartbeatInterval >= options.FailureDetectionThreshold)
{
failures.Add(
$"ClusterOptions.HeartbeatInterval ({options.HeartbeatInterval}) must be well below " +
$"FailureDetectionThreshold ({options.FailureDetectionThreshold}); otherwise nodes are " +
"declared unreachable before a heartbeat can arrive.");
}
return failures.Count > 0
? ValidateOptionsResult.Fail(failures)
: ValidateOptionsResult.Success;
}
}
@@ -1,18 +1,47 @@
using Microsoft.Extensions.DependencyInjection; using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.DependencyInjection.Extensions;
using Microsoft.Extensions.Options;
namespace ScadaLink.ClusterInfrastructure; namespace ScadaLink.ClusterInfrastructure;
/// <summary>
/// DI registration for the Cluster Infrastructure component.
/// </summary>
public static class ServiceCollectionExtensions public static class ServiceCollectionExtensions
{ {
/// <summary>
/// Registers the Cluster Infrastructure services. This component owns the
/// cluster <em>configuration contract</em> (<see cref="ClusterOptions"/>); the
/// Akka.NET bootstrap itself lives in <c>ScadaLink.Host</c>
/// (see <c>Component-ClusterInfrastructure.md</c>).
/// <para>
/// Registering the <see cref="ClusterOptionsValidator"/> means a misconfigured
/// <c>ScadaLink:Cluster</c> section (e.g. <c>MinNrOfMembers: 2</c> or a quorum
/// split-brain strategy) throws an <see cref="OptionsValidationException"/> the
/// first time <see cref="IOptions{TOptions}"/> is resolved, rather than booting
/// into a broken cluster.
/// </para>
/// </summary>
public static IServiceCollection AddClusterInfrastructure(this IServiceCollection services) public static IServiceCollection AddClusterInfrastructure(this IServiceCollection services)
{ {
// Phase 0: skeleton only services.TryAddEnumerable(
ServiceDescriptor.Singleton<IValidateOptions<ClusterOptions>, ClusterOptionsValidator>());
return services; return services;
} }
/// <summary>
/// Reserved for cluster-infrastructure actor registration. This component does
/// not register any actors — the Akka.NET bootstrap and actor wiring live in
/// <c>ScadaLink.Host</c>. The method throws rather than silently returning
/// success so that any caller assuming this component registers actors fails
/// fast with a clear cause instead of failing later, far from here.
/// </summary>
/// <exception cref="NotImplementedException">Always thrown.</exception>
public static IServiceCollection AddClusterInfrastructureActors(this IServiceCollection services) public static IServiceCollection AddClusterInfrastructureActors(this IServiceCollection services)
{ {
// Phase 0: placeholder for Akka actor registration throw new NotImplementedException(
return services; "ScadaLink.ClusterInfrastructure registers no actors. The Akka.NET actor system " +
"bootstrap and all cluster actor registration live in ScadaLink.Host " +
"(AkkaHostedService). Do not call AddClusterInfrastructureActors().");
} }
} }
@@ -2,21 +2,56 @@ using System.Collections.Frozen;
namespace ScadaLink.Commons.Messages.Management; namespace ScadaLink.Commons.Messages.Management;
/// <summary>
/// Bidirectional name registry for management command records. The registry contains
/// exactly the non-abstract <c>*Command</c> types declared in the
/// <c>ScadaLink.Commons.Messages.Management</c> namespace; these are the commands that
/// travel over the HTTP / ClusterClient management boundary.
/// </summary>
/// <remarks>
/// <see cref="Resolve"/> and <see cref="GetCommandName"/> are symmetric:
/// <c>Resolve(GetCommandName(t))</c> returns <c>t</c> for every type
/// <see cref="GetCommandName"/> accepts. <see cref="GetCommandName"/> rejects any type
/// the registry does not contain rather than computing an unresolvable name.
/// </remarks>
public static class ManagementCommandRegistry public static class ManagementCommandRegistry
{ {
private static readonly FrozenDictionary<string, Type> Commands = BuildRegistry(); private static readonly FrozenDictionary<string, Type> Commands = BuildRegistry();
/// <summary>
/// Names keyed by command type, for the reverse lookup. Keeps
/// <see cref="GetCommandName"/> in lock-step with the forward registry.
/// </summary>
private static readonly FrozenDictionary<Type, string> NamesByType =
Commands.ToFrozenDictionary(kv => kv.Value, kv => kv.Key);
public static Type? Resolve(string commandName) public static Type? Resolve(string commandName)
{ {
return Commands.GetValueOrDefault(commandName); return Commands.GetValueOrDefault(commandName);
} }
/// <summary>
/// Returns the registered wire name for a management command type.
/// </summary>
/// <exception cref="ArgumentException">
/// Thrown when <paramref name="commandType"/> is not a registered management
/// command — i.e. not a non-abstract <c>*Command</c> type in the
/// <c>ScadaLink.Commons.Messages.Management</c> namespace. This keeps the method
/// symmetric with <see cref="Resolve"/>: it never yields a name that
/// <see cref="Resolve"/> cannot turn back into the same type.
/// </exception>
public static string GetCommandName(Type commandType) public static string GetCommandName(Type commandType)
{ {
var name = commandType.Name; ArgumentNullException.ThrowIfNull(commandType);
return name.EndsWith("Command", StringComparison.Ordinal)
? name[..^"Command".Length] if (NamesByType.TryGetValue(commandType, out var name))
: name; return name;
throw new ArgumentException(
$"'{commandType.FullName}' is not a registered management command. " +
$"Management commands must be non-abstract '*Command' records declared in " +
$"the '{typeof(ManagementEnvelope).Namespace}' namespace.",
nameof(commandType));
} }
private static FrozenDictionary<string, Type> BuildRegistry() private static FrozenDictionary<string, Type> BuildRegistry()
@@ -7,13 +7,22 @@ namespace ScadaLink.Commons.Types;
/// Wraps a JsonElement as a dynamic object for convenient property access in scripts. /// Wraps a JsonElement as a dynamic object for convenient property access in scripts.
/// Supports property access (obj.name), indexing (obj.items[0]), and ToString(). /// Supports property access (obj.name), indexing (obj.items[0]), and ToString().
/// </summary> /// </summary>
/// <remarks>
/// The element passed to the constructor is <see cref="JsonElement.Clone()">cloned</see>
/// so the wrapper owns a self-contained copy. This decouples its lifetime from the
/// <see cref="JsonDocument"/> the element originated from: a wrapper built from an
/// element inside a <c>using</c> block remains valid for deferred (e.g. script-time)
/// access after that document has been disposed.
/// </remarks>
public class DynamicJsonElement : DynamicObject public class DynamicJsonElement : DynamicObject
{ {
private readonly JsonElement _element; private readonly JsonElement _element;
public DynamicJsonElement(JsonElement element) public DynamicJsonElement(JsonElement element)
{ {
_element = element; // Clone detaches the element from its owning JsonDocument so accessing it
// later cannot throw ObjectDisposedException once that document is disposed.
_element = element.Clone();
} }
public override bool TryGetMember(GetMemberBinder binder, out object? result) public override bool TryGetMember(GetMemberBinder binder, out object? result)
@@ -24,7 +24,8 @@ public class ScriptParameters : IReadOnlyDictionary<string, object?>
/// Gets a parameter value with typed conversion. /// Gets a parameter value with typed conversion.
/// <list type="bullet"> /// <list type="bullet">
/// <item><c>Get&lt;int&gt;("key")</c> — throws if missing, null, or unconvertible.</item> /// <item><c>Get&lt;int&gt;("key")</c> — throws if missing, null, or unconvertible.</item>
/// <item><c>Get&lt;int?&gt;("key")</c> — returns null if missing, null, or unconvertible.</item> /// <item><c>Get&lt;int?&gt;("key")</c> — returns null if the parameter is missing or null;
/// throws if it is present but holds an unconvertible value.</item>
/// <item><c>Get&lt;int[]&gt;("key")</c> — converts list to typed array; throws on first bad element.</item> /// <item><c>Get&lt;int[]&gt;("key")</c> — converts list to typed array; throws on first bad element.</item>
/// <item><c>Get&lt;List&lt;int&gt;&gt;("key")</c> — converts list to typed List; throws on first bad element.</item> /// <item><c>Get&lt;List&lt;int&gt;&gt;("key")</c> — converts list to typed List; throws on first bad element.</item>
/// </list> /// </list>
@@ -71,19 +72,18 @@ public class ScriptParameters : IReadOnlyDictionary<string, object?>
private T GetNullable<T>(string key, Type underlyingType) private T GetNullable<T>(string key, Type underlyingType)
{ {
// Absent or explicitly-null parameter — the caller did not supply a value.
if (!_inner.TryGetValue(key, out var value) || value is null) if (!_inner.TryGetValue(key, out var value) || value is null)
return default!; // null for Nullable<T> return default!; // null for Nullable<T>
try // A parameter that is *present but non-null* must be convertible. A value
{ // that cannot be converted is a caller/script bug, not "not supplied":
// throw with a descriptive message rather than silently returning null
// (which a script would misread as absent). This mirrors Get<T>() and the
// array/list element paths. See Commons-003.
var converted = ConvertScalar(value, underlyingType, key); var converted = ConvertScalar(value, underlyingType, key);
return (T)converted; return (T)converted;
} }
catch (ScriptParameterException)
{
return default!; // null on conversion failure for nullable
}
}
private Array ConvertToArray(string key, Type elementType) private Array ConvertToArray(string key, Type elementType)
{ {
+74 -7
View File
@@ -1,3 +1,7 @@
using System.Runtime.CompilerServices;
[assembly: InternalsVisibleTo("ScadaLink.Commons.Tests")]
namespace ScadaLink.Commons.Types; namespace ScadaLink.Commons.Types;
/// <summary> /// <summary>
@@ -5,11 +9,29 @@ namespace ScadaLink.Commons.Types;
/// within <see cref="MaxSilence"/>, the <see cref="Stale"/> event fires. /// within <see cref="MaxSilence"/>, the <see cref="Stale"/> event fires.
/// Composable into any IDataConnection adapter. /// Composable into any IDataConnection adapter.
/// </summary> /// </summary>
/// <remarks>
/// Thread-safe: <see cref="Start"/>, <see cref="OnValueReceived"/> and <see cref="Stop"/>
/// may be called from any thread and race the internal timer callback. Each call to
/// <see cref="Start"/> or <see cref="OnValueReceived"/> begins a new monitoring period
/// identified by a generation token; a timer callback only raises <see cref="Stale"/>
/// if it still belongs to the current period. A fresh value, a restart, or a
/// <see cref="Stop"/> arriving while a previous-period callback is in flight bumps the
/// generation, so that callback observes the mismatch and declines to fire — no spurious
/// staleness signal is emitted after the period it was scheduled for has ended.
/// </remarks>
public sealed class StaleTagMonitor : IDisposable public sealed class StaleTagMonitor : IDisposable
{ {
private readonly TimeSpan _maxSilence; private readonly TimeSpan _maxSilence;
private readonly object _gate = new();
private Timer? _timer; private Timer? _timer;
private volatile bool _staleFired;
/// <summary>
/// Monotonically increasing token identifying the current monitoring period.
/// Bumped on every <see cref="Start"/>, <see cref="OnValueReceived"/> and
/// <see cref="Stop"/> so that a timer callback scheduled for an earlier period
/// can detect that it is stale and decline to fire.
/// </summary>
private long _generation;
public StaleTagMonitor(TimeSpan maxSilence) public StaleTagMonitor(TimeSpan maxSilence)
{ {
@@ -26,14 +48,25 @@ public sealed class StaleTagMonitor : IDisposable
public TimeSpan MaxSilence => _maxSilence; public TimeSpan MaxSilence => _maxSilence;
/// <summary>
/// Test-only seam invoked by the timer callback after it has been entered but
/// before it acquires the synchronization gate. Allows a test to deterministically
/// interleave a <see cref="Stop"/> / <see cref="OnValueReceived"/> with an in-flight
/// callback to exercise the stale-fire race. Never set in production.
/// </summary>
internal Action? CallbackEnteredHook { get; set; }
/// <summary> /// <summary>
/// Start monitoring. The timer begins counting from now. /// Start monitoring. The timer begins counting from now.
/// </summary> /// </summary>
public void Start() public void Start()
{ {
_staleFired = false; lock (_gate)
{
_generation++;
_timer?.Dispose(); _timer?.Dispose();
_timer = new Timer(OnTimerElapsed, null, _maxSilence, Timeout.InfiniteTimeSpan); _timer = new Timer(OnTimerElapsed, _generation, _maxSilence, Timeout.InfiniteTimeSpan);
}
} }
/// <summary> /// <summary>
@@ -41,8 +74,20 @@ public sealed class StaleTagMonitor : IDisposable
/// </summary> /// </summary>
public void OnValueReceived() public void OnValueReceived()
{ {
_staleFired = false; lock (_gate)
_timer?.Change(_maxSilence, Timeout.InfiniteTimeSpan); {
// No active monitoring — nothing to reset.
if (_timer is null)
return;
// Bump the generation: any timer callback for the previous period that
// has already been entered will see a generation mismatch and decline to
// raise Stale. The timer is recreated rather than re-armed with
// Change(...) so the new callback carries the new generation token.
_generation++;
_timer.Dispose();
_timer = new Timer(OnTimerElapsed, _generation, _maxSilence, Timeout.InfiniteTimeSpan);
}
} }
/// <summary> /// <summary>
@@ -50,9 +95,15 @@ public sealed class StaleTagMonitor : IDisposable
/// </summary> /// </summary>
public void Stop() public void Stop()
{ {
lock (_gate)
{
// Bumping the generation invalidates any in-flight callback so a stopped
// monitor cannot deliver a Stale signal.
_generation++;
_timer?.Dispose(); _timer?.Dispose();
_timer = null; _timer = null;
} }
}
public void Dispose() public void Dispose()
{ {
@@ -61,8 +112,24 @@ public sealed class StaleTagMonitor : IDisposable
private void OnTimerElapsed(object? state) private void OnTimerElapsed(object? state)
{ {
if (_staleFired) return; var scheduledGeneration = (long)state!;
_staleFired = true;
CallbackEnteredHook?.Invoke();
// Only fire if this callback still represents the current period. The check
// and the generation bump happen under the gate, so a concurrent
// OnValueReceived / Stop / Start either completes before this guard (its
// generation bump makes this callback decline) or serializes after it.
lock (_gate)
{
if (_generation != scheduledGeneration)
return;
// Consume this period so a duplicate callback for the same generation
// cannot fire twice; the next Start/OnValueReceived issues a new token.
_generation++;
}
Stale?.Invoke(); Stale?.Invoke();
} }
} }
@@ -84,6 +84,15 @@ public class CentralCommunicationActor : ReceiveActor
// Periodic refresh trigger // Periodic refresh trigger
Receive<RefreshSiteAddresses>(_ => LoadSiteAddressesFromDb()); Receive<RefreshSiteAddresses>(_ => LoadSiteAddressesFromDb());
// Communication-006: a faulted LoadSiteAddressesFromDb task is piped here as a
// Status.Failure. Without this handler the failure was an unhandled message
// (debug-level only) and the refresh failed silently — operators could not
// distinguish "no sites configured" from "database is down". Log at Warning.
Receive<Status.Failure>(failure =>
_log.Warning(failure.Cause,
"Failed to load site addresses from the database; the site ClusterClient "
+ "cache was not refreshed and may be stale or empty"));
// Health monitoring: heartbeats and health reports from sites // Health monitoring: heartbeats and health reports from sites
Receive<HeartbeatMessage>(HandleHeartbeat); Receive<HeartbeatMessage>(HandleHeartbeat);
Receive<SiteHealthReport>(HandleSiteHealthReport); Receive<SiteHealthReport>(HandleSiteHealthReport);
@@ -296,6 +305,25 @@ public class CentralCommunicationActor : ReceiveActor
} }
} }
/// <summary>
/// Coordinator supervision strategy (CLAUDE.md: "Resume for coordinator actors").
/// CentralCommunicationActor is a long-lived coordinator that owns the per-site
/// ClusterClient map; a transient fault in a child (e.g. a ClusterClient child)
/// must Resume so the child's connection state is preserved rather than wiped by
/// a Restart.
/// </summary>
protected override SupervisorStrategy SupervisorStrategy()
{
return new OneForOneStrategy(
maxNrOfRetries: -1,
withinTimeRange: Timeout.InfiniteTimeSpan,
decider: Decider.From(ex =>
{
_log.Warning(ex, "Child actor of CentralCommunicationActor faulted, resuming (state preserved)");
return Directive.Resume;
}));
}
protected override void PreStart() protected override void PreStart()
{ {
_log.Info("CentralCommunicationActor started"); _log.Info("CentralCommunicationActor started");
@@ -28,7 +28,19 @@ public class DebugStreamBridgeActor : ReceiveActor, IWithTimers
private const int MaxRetries = 3; private const int MaxRetries = 3;
private const string ReconnectTimerKey = "grpc-reconnect"; private const string ReconnectTimerKey = "grpc-reconnect";
private const string StabilityTimerKey = "grpc-stability";
internal static TimeSpan ReconnectDelay { get; set; } = TimeSpan.FromSeconds(5); internal static TimeSpan ReconnectDelay { get; set; } = TimeSpan.FromSeconds(5);
/// <summary>
/// How long a freshly-opened gRPC stream must stay up before its retry budget
/// is considered "recovered" and <see cref="_retryCount"/> is reset to 0.
/// Communication-008: the retry count must NOT be reset by individual events —
/// a stream that connects, delivers one event, then fails repeatedly would
/// otherwise reconnect forever and never trip <see cref="MaxRetries"/>. Resetting
/// only after a stable interval bounds a flapping stream.
/// </summary>
internal static TimeSpan StabilityWindow { get; set; } = TimeSpan.FromSeconds(60);
private int _retryCount; private int _retryCount;
private bool _useNodeA = true; private bool _useNodeA = true;
private bool _stopped; private bool _stopped;
@@ -66,16 +78,21 @@ public class DebugStreamBridgeActor : ReceiveActor, IWithTimers
OpenGrpcStream(); OpenGrpcStream();
}); });
// Domain events arriving via Self.Tell from gRPC callback // Domain events arriving via Self.Tell from gRPC callback.
Receive<AttributeValueChanged>(changed => // Communication-008: receiving an event must NOT reset _retryCount — a
{ // flapping stream that delivers a single event between failures would
_retryCount = 0; // Successful event resets retry count // otherwise never trip MaxRetries. The retry budget is recovered only by
_onEvent(changed); // GrpcStreamStable (a stream that has stayed up for StabilityWindow).
}); Receive<AttributeValueChanged>(changed => _onEvent(changed));
Receive<AlarmStateChanged>(changed => Receive<AlarmStateChanged>(changed => _onEvent(changed));
// Stream has been stably connected for StabilityWindow — recover the
// retry budget so a future transient fault gets a fresh set of retries.
Receive<GrpcStreamStable>(_ =>
{ {
if (_stopped) return;
_retryCount = 0; _retryCount = 0;
_onEvent(changed); _log.Debug("gRPC stream for {0} stable, retry count reset", _instanceUniqueName);
}); });
// gRPC stream error — attempt reconnection // gRPC stream error — attempt reconnection
@@ -151,6 +168,10 @@ public class DebugStreamBridgeActor : ReceiveActor, IWithTimers
_grpcCts?.Dispose(); _grpcCts?.Dispose();
_grpcCts = new CancellationTokenSource(); _grpcCts = new CancellationTokenSource();
// Arm the stability timer: if the stream stays up for StabilityWindow the
// retry budget is recovered (Communication-008). Cancelled by HandleGrpcError.
Timers.StartSingleTimer(StabilityTimerKey, new GrpcStreamStable(), StabilityWindow);
var client = _grpcFactory.GetOrCreate(_siteIdentifier, endpoint); var client = _grpcFactory.GetOrCreate(_siteIdentifier, endpoint);
var self = Self; var self = Self;
var ct = _grpcCts.Token; var ct = _grpcCts.Token;
@@ -171,6 +192,10 @@ public class DebugStreamBridgeActor : ReceiveActor, IWithTimers
{ {
if (_stopped) return; if (_stopped) return;
// The stream failed before reaching the stability window — its retry
// budget is NOT recovered (Communication-008).
Timers.Cancel(StabilityTimerKey);
_retryCount++; _retryCount++;
if (_retryCount > MaxRetries) if (_retryCount > MaxRetries)
@@ -239,3 +264,10 @@ internal record GrpcStreamError(Exception Exception);
/// Internal message to trigger gRPC stream reconnection. /// Internal message to trigger gRPC stream reconnection.
/// </summary> /// </summary>
internal record ReconnectGrpcStream; internal record ReconnectGrpcStream;
/// <summary>
/// Internal message indicating the current gRPC stream has been connected long
/// enough (<see cref="DebugStreamBridgeActor.StabilityWindow"/>) to be considered
/// stable, so the reconnect retry budget can be recovered.
/// </summary>
internal record GrpcStreamStable;
@@ -177,6 +177,24 @@ public class SiteCommunicationActor : ReceiveActor, IWithTimers
} }
/// <summary>
/// Coordinator supervision strategy (CLAUDE.md: "Resume for coordinator actors").
/// SiteCommunicationActor is a long-lived coordinator routing all message
/// patterns to local handlers; a transient fault in a child must Resume so the
/// child's in-memory state is preserved rather than discarded by a Restart.
/// </summary>
protected override SupervisorStrategy SupervisorStrategy()
{
return new OneForOneStrategy(
maxNrOfRetries: -1,
withinTimeRange: Timeout.InfiniteTimeSpan,
decider: Decider.From(ex =>
{
_log.Warning(ex, "Child actor of SiteCommunicationActor faulted, resuming (state preserved)");
return Directive.Resume;
}));
}
protected override void PreStart() protected override void PreStart()
{ {
_log.Info("SiteCommunicationActor started for site {0}", _siteId); _log.Info("SiteCommunicationActor started for site {0}", _siteId);
@@ -13,21 +13,45 @@ namespace ScadaLink.Communication.Grpc;
/// SiteStreamGrpcServer. The central-side DebugStreamBridgeActor uses this /// SiteStreamGrpcServer. The central-side DebugStreamBridgeActor uses this
/// to open server-streaming calls for individual instances. /// to open server-streaming calls for individual instances.
/// </summary> /// </summary>
public class SiteStreamGrpcClient : IAsyncDisposable public class SiteStreamGrpcClient : IAsyncDisposable, IDisposable
{ {
private readonly GrpcChannel? _channel; private readonly GrpcChannel? _channel;
private readonly SiteStreamService.SiteStreamServiceClient? _client; private readonly SiteStreamService.SiteStreamServiceClient? _client;
private readonly ILogger? _logger; private readonly ILogger? _logger;
private readonly ConcurrentDictionary<string, CancellationTokenSource> _subscriptions = new(); private readonly ConcurrentDictionary<string, CancellationTokenSource> _subscriptions = new();
/// <summary>
/// The HTTP/2 keepalive ping delay actually applied to this client's channel.
/// Exposed for tests verifying that <see cref="CommunicationOptions"/> is honoured.
/// </summary>
internal TimeSpan KeepAlivePingDelay { get; }
/// <summary>
/// The HTTP/2 keepalive ping timeout actually applied to this client's channel.
/// Exposed for tests verifying that <see cref="CommunicationOptions"/> is honoured.
/// </summary>
internal TimeSpan KeepAlivePingTimeout { get; }
public SiteStreamGrpcClient(string endpoint, ILogger logger) public SiteStreamGrpcClient(string endpoint, ILogger logger)
: this(endpoint, logger, new CommunicationOptions())
{ {
}
/// <summary>
/// Creates a client whose HTTP/2 keepalive is taken from <see cref="CommunicationOptions"/>
/// rather than hard-coded, satisfying the design doc's "gRPC Connection Keepalive"
/// section which states these values are configurable.
/// </summary>
public SiteStreamGrpcClient(string endpoint, ILogger logger, CommunicationOptions options)
{
KeepAlivePingDelay = options.GrpcKeepAlivePingDelay;
KeepAlivePingTimeout = options.GrpcKeepAlivePingTimeout;
_channel = GrpcChannel.ForAddress(endpoint, new GrpcChannelOptions _channel = GrpcChannel.ForAddress(endpoint, new GrpcChannelOptions
{ {
HttpHandler = new SocketsHttpHandler HttpHandler = new SocketsHttpHandler
{ {
KeepAlivePingDelay = TimeSpan.FromSeconds(15), KeepAlivePingDelay = options.GrpcKeepAlivePingDelay,
KeepAlivePingTimeout = TimeSpan.FromSeconds(10), KeepAlivePingTimeout = options.GrpcKeepAlivePingTimeout,
KeepAlivePingPolicy = HttpKeepAlivePingPolicy.Always KeepAlivePingPolicy = HttpKeepAlivePingPolicy.Always
} }
}); });
@@ -205,7 +229,13 @@ public class SiteStreamGrpcClient : IAsyncDisposable
_ => AlarmLevel.None _ => AlarmLevel.None
}; };
public async ValueTask DisposeAsync() /// <summary>
/// Releases all subscription CancellationTokenSources and the underlying
/// gRPC channel. All teardown here is synchronous (CTS disposal and
/// <see cref="GrpcChannel.Dispose"/>), so a synchronous <see cref="Dispose"/>
/// can release everything without sync-over-async blocking.
/// </summary>
private void ReleaseResources()
{ {
foreach (var cts in _subscriptions.Values) foreach (var cts in _subscriptions.Values)
{ {
@@ -214,9 +244,22 @@ public class SiteStreamGrpcClient : IAsyncDisposable
} }
_subscriptions.Clear(); _subscriptions.Clear();
if (_channel is not null) _channel?.Dispose();
_channel.Dispose(); }
await ValueTask.CompletedTask; public virtual ValueTask DisposeAsync()
{
ReleaseResources();
return ValueTask.CompletedTask;
}
/// <summary>
/// Synchronous disposal. All resources held by this client are released
/// synchronously, so callers (e.g. <see cref="SiteStreamGrpcClientFactory.Dispose"/>)
/// need not block on the async disposal path.
/// </summary>
public virtual void Dispose()
{
ReleaseResources();
} }
} }
@@ -1,5 +1,6 @@
using System.Collections.Concurrent; using System.Collections.Concurrent;
using Microsoft.Extensions.Logging; using Microsoft.Extensions.Logging;
using Microsoft.Extensions.Options;
namespace ScadaLink.Communication.Grpc; namespace ScadaLink.Communication.Grpc;
@@ -12,22 +13,43 @@ public class SiteStreamGrpcClientFactory : IAsyncDisposable, IDisposable
{ {
private readonly ConcurrentDictionary<string, SiteStreamGrpcClient> _clients = new(); private readonly ConcurrentDictionary<string, SiteStreamGrpcClient> _clients = new();
private readonly ILoggerFactory _loggerFactory; private readonly ILoggerFactory _loggerFactory;
private readonly CommunicationOptions _options;
public SiteStreamGrpcClientFactory(ILoggerFactory loggerFactory) public SiteStreamGrpcClientFactory(ILoggerFactory loggerFactory)
: this(loggerFactory, Options.Create(new CommunicationOptions()))
{ {
_loggerFactory = loggerFactory;
} }
/// <summary> /// <summary>
/// Returns an existing client for the site or creates a new one. /// DI constructor — flows <see cref="CommunicationOptions"/> into every created
/// <see cref="SiteStreamGrpcClient"/> so the configured gRPC keepalive settings
/// are applied rather than hard-coded defaults.
/// </summary>
public SiteStreamGrpcClientFactory(ILoggerFactory loggerFactory, IOptions<CommunicationOptions> options)
{
_loggerFactory = loggerFactory;
_options = options.Value;
}
/// <summary>
/// Returns an existing client for the site or creates a new one. The new
/// client is created via <see cref="CreateClient"/> and tracked so the
/// factory's <see cref="Dispose"/> / <see cref="DisposeAsync"/> release it.
/// </summary> /// </summary>
public virtual SiteStreamGrpcClient GetOrCreate(string siteIdentifier, string grpcEndpoint) public virtual SiteStreamGrpcClient GetOrCreate(string siteIdentifier, string grpcEndpoint)
{ {
return _clients.GetOrAdd(siteIdentifier, _ => return _clients.GetOrAdd(siteIdentifier, _ => CreateClient(grpcEndpoint));
}
/// <summary>
/// Creates a single <see cref="SiteStreamGrpcClient"/>. Overridable so tests
/// can substitute a tracking client while still exercising the factory's real
/// caching and disposal machinery.
/// </summary>
protected virtual SiteStreamGrpcClient CreateClient(string grpcEndpoint)
{ {
var logger = _loggerFactory.CreateLogger<SiteStreamGrpcClient>(); var logger = _loggerFactory.CreateLogger<SiteStreamGrpcClient>();
return new SiteStreamGrpcClient(grpcEndpoint, logger); return new SiteStreamGrpcClient(grpcEndpoint, logger, _options);
});
} }
/// <summary> /// <summary>
@@ -50,8 +72,19 @@ public class SiteStreamGrpcClientFactory : IAsyncDisposable, IDisposable
_clients.Clear(); _clients.Clear();
} }
/// <summary>
/// Synchronous disposal. Communication-007: this used to block on
/// <c>DisposeAsync().AsTask().GetAwaiter().GetResult()</c> (sync-over-async,
/// a stall/deadlock risk during host shutdown). Each
/// <see cref="SiteStreamGrpcClient"/> releases all of its resources
/// synchronously, so we dispose them directly with no async path.
/// </summary>
public void Dispose() public void Dispose()
{ {
DisposeAsync().AsTask().GetAwaiter().GetResult(); foreach (var client in _clients.Values)
{
client.Dispose();
}
_clients.Clear();
} }
} }
@@ -3,6 +3,7 @@ using System.Threading.Channels;
using Akka.Actor; using Akka.Actor;
using Grpc.Core; using Grpc.Core;
using Microsoft.Extensions.Logging; using Microsoft.Extensions.Logging;
using Microsoft.Extensions.Options;
using GrpcStatus = Grpc.Core.Status; using GrpcStatus = Grpc.Core.Status;
namespace ScadaLink.Communication.Grpc; namespace ScadaLink.Communication.Grpc;
@@ -19,6 +20,7 @@ public class SiteStreamGrpcServer : SiteStreamService.SiteStreamServiceBase
private readonly ILogger<SiteStreamGrpcServer> _logger; private readonly ILogger<SiteStreamGrpcServer> _logger;
private readonly ConcurrentDictionary<string, StreamEntry> _activeStreams = new(); private readonly ConcurrentDictionary<string, StreamEntry> _activeStreams = new();
private readonly int _maxConcurrentStreams; private readonly int _maxConcurrentStreams;
private readonly TimeSpan _maxStreamLifetime;
private volatile bool _ready; private volatile bool _ready;
private long _actorCounter; private long _actorCounter;
@@ -26,10 +28,36 @@ public class SiteStreamGrpcServer : SiteStreamService.SiteStreamServiceBase
ISiteStreamSubscriber streamSubscriber, ISiteStreamSubscriber streamSubscriber,
ILogger<SiteStreamGrpcServer> logger, ILogger<SiteStreamGrpcServer> logger,
int maxConcurrentStreams = 100) int maxConcurrentStreams = 100)
: this(streamSubscriber, logger, maxConcurrentStreams, TimeSpan.FromHours(4))
{
}
/// <summary>
/// DI constructor — binds <see cref="CommunicationOptions.GrpcMaxConcurrentStreams"/>
/// and <see cref="CommunicationOptions.GrpcMaxStreamLifetime"/> so the documented
/// concurrency limit and the 4-hour zombie-stream session timeout are honoured
/// rather than hard-coded.
/// </summary>
public SiteStreamGrpcServer(
ISiteStreamSubscriber streamSubscriber,
ILogger<SiteStreamGrpcServer> logger,
IOptions<CommunicationOptions> options)
: this(streamSubscriber, logger,
options.Value.GrpcMaxConcurrentStreams,
options.Value.GrpcMaxStreamLifetime)
{
}
private SiteStreamGrpcServer(
ISiteStreamSubscriber streamSubscriber,
ILogger<SiteStreamGrpcServer> logger,
int maxConcurrentStreams,
TimeSpan maxStreamLifetime)
{ {
_streamSubscriber = streamSubscriber; _streamSubscriber = streamSubscriber;
_logger = logger; _logger = logger;
_maxConcurrentStreams = maxConcurrentStreams; _maxConcurrentStreams = maxConcurrentStreams;
_maxStreamLifetime = maxStreamLifetime;
} }
/// <summary> /// <summary>
@@ -49,6 +77,12 @@ public class SiteStreamGrpcServer : SiteStreamService.SiteStreamServiceBase
/// </summary> /// </summary>
public int ActiveStreamCount => _activeStreams.Count; public int ActiveStreamCount => _activeStreams.Count;
/// <summary>Effective max concurrent stream limit. Exposed for tests.</summary>
internal int MaxConcurrentStreams => _maxConcurrentStreams;
/// <summary>Effective per-stream session lifetime. Exposed for tests.</summary>
internal TimeSpan MaxStreamLifetime => _maxStreamLifetime;
public override async Task SubscribeInstance( public override async Task SubscribeInstance(
InstanceStreamRequest request, InstanceStreamRequest request,
IServerStreamWriter<SiteStreamEvent> responseStream, IServerStreamWriter<SiteStreamEvent> responseStream,
@@ -69,6 +103,11 @@ public class SiteStreamGrpcServer : SiteStreamService.SiteStreamServiceBase
throw new RpcException(new GrpcStatus(StatusCode.ResourceExhausted, "Max concurrent streams reached")); throw new RpcException(new GrpcStatus(StatusCode.ResourceExhausted, "Max concurrent streams reached"));
using var streamCts = CancellationTokenSource.CreateLinkedTokenSource(context.CancellationToken); using var streamCts = CancellationTokenSource.CreateLinkedTokenSource(context.CancellationToken);
// Session timeout (design doc "gRPC Connection Keepalive": 4-hour third layer
// of dead-client detection) — forces a long-lived zombie stream to terminate
// even if keepalive PINGs never detect the loss.
if (_maxStreamLifetime > TimeSpan.Zero && _maxStreamLifetime != Timeout.InfiniteTimeSpan)
streamCts.CancelAfter(_maxStreamLifetime);
var entry = new StreamEntry(streamCts); var entry = new StreamEntry(streamCts);
_activeStreams[request.CorrelationId] = entry; _activeStreams[request.CorrelationId] = entry;
@@ -2,6 +2,7 @@ using ScadaLink.CLI;
namespace ScadaLink.CLI.Tests; namespace ScadaLink.CLI.Tests;
[Collection("Environment")]
public class CliConfigTests public class CliConfigTests
{ {
[Fact] [Fact]
@@ -3,6 +3,7 @@ using ScadaLink.CLI.Commands;
namespace ScadaLink.CLI.Tests; namespace ScadaLink.CLI.Tests;
[Collection("Console")]
public class CommandHelpersTests public class CommandHelpersTests
{ {
[Fact] [Fact]
@@ -0,0 +1,71 @@
using ScadaLink.CLI;
namespace ScadaLink.CLI.Tests;
/// <summary>
/// Regression tests for CLI-006 — credentials could only be supplied via the
/// <c>--password</c> command-line option, which leaks into process listings and
/// shell history. A <c>SCADALINK_PASSWORD</c> / <c>SCADALINK_USERNAME</c> environment
/// fallback gives CI/CD a safer alternative.
/// </summary>
[Collection("Environment")]
public class CredentialResolutionTests
{
[Fact]
public void Load_Password_FromEnvironment()
{
var orig = Environment.GetEnvironmentVariable("SCADALINK_PASSWORD");
try
{
Environment.SetEnvironmentVariable("SCADALINK_PASSWORD", "s3cret");
var config = CliConfig.Load();
Assert.Equal("s3cret", config.Password);
}
finally
{
Environment.SetEnvironmentVariable("SCADALINK_PASSWORD", orig);
}
}
[Fact]
public void Load_Username_FromEnvironment()
{
var orig = Environment.GetEnvironmentVariable("SCADALINK_USERNAME");
try
{
Environment.SetEnvironmentVariable("SCADALINK_USERNAME", "ci-user");
var config = CliConfig.Load();
Assert.Equal("ci-user", config.Username);
}
finally
{
Environment.SetEnvironmentVariable("SCADALINK_USERNAME", orig);
}
}
[Fact]
public void Load_NoCredentialEnvVars_LeavesCredentialsNull()
{
var origUser = Environment.GetEnvironmentVariable("SCADALINK_USERNAME");
var origPass = Environment.GetEnvironmentVariable("SCADALINK_PASSWORD");
try
{
Environment.SetEnvironmentVariable("SCADALINK_USERNAME", null);
Environment.SetEnvironmentVariable("SCADALINK_PASSWORD", null);
var config = CliConfig.Load();
Assert.Null(config.Username);
Assert.Null(config.Password);
}
finally
{
Environment.SetEnvironmentVariable("SCADALINK_USERNAME", origUser);
Environment.SetEnvironmentVariable("SCADALINK_PASSWORD", origPass);
}
}
}
@@ -0,0 +1,98 @@
using ScadaLink.CLI.Commands;
namespace ScadaLink.CLI.Tests;
/// <summary>
/// Regression tests for CLI-005 — malformed <c>--bindings</c> / <c>--overrides</c> JSON
/// previously threw unhandled exceptions instead of producing a clean validation error.
/// </summary>
public class InstanceArgumentParsingTests
{
[Fact]
public void ParseBindings_ValidJson_ReturnsPairs()
{
var ok = InstanceCommands.TryParseBindings(
"[[\"Speed\", 5], [\"Mode\", 7]]", out var bindings, out var error);
Assert.True(ok);
Assert.Null(error);
Assert.Equal(2, bindings!.Count);
Assert.Equal(("Speed", 5), bindings[0]);
Assert.Equal(("Mode", 7), bindings[1]);
}
[Fact]
public void ParseBindings_MalformedJson_ReturnsErrorNotException()
{
var ok = InstanceCommands.TryParseBindings("not json", out var bindings, out var error);
Assert.False(ok);
Assert.Null(bindings);
Assert.NotNull(error);
}
[Fact]
public void ParseBindings_ShortPair_ReturnsErrorNotException()
{
// A pair with fewer than two elements previously threw ArgumentOutOfRangeException.
var ok = InstanceCommands.TryParseBindings("[[\"Speed\"]]", out var bindings, out var error);
Assert.False(ok);
Assert.Null(bindings);
Assert.NotNull(error);
}
[Fact]
public void ParseBindings_WrongElementTypes_ReturnsErrorNotException()
{
// A non-string name / non-int id previously threw InvalidOperationException.
var ok = InstanceCommands.TryParseBindings("[[5, \"Speed\"]]", out var bindings, out var error);
Assert.False(ok);
Assert.Null(bindings);
Assert.NotNull(error);
}
[Fact]
public void ParseBindings_JsonNull_ReturnsErrorNotException()
{
var ok = InstanceCommands.TryParseBindings("null", out var bindings, out var error);
Assert.False(ok);
Assert.Null(bindings);
Assert.NotNull(error);
}
[Fact]
public void ParseOverrides_ValidJson_ReturnsDictionary()
{
var ok = InstanceCommands.TryParseOverrides(
"{\"Speed\": \"100\", \"Mode\": null}", out var overrides, out var error);
Assert.True(ok);
Assert.Null(error);
Assert.Equal(2, overrides!.Count);
Assert.Equal("100", overrides["Speed"]);
Assert.Null(overrides["Mode"]);
}
[Fact]
public void ParseOverrides_MalformedJson_ReturnsErrorNotException()
{
var ok = InstanceCommands.TryParseOverrides("{bad json", out var overrides, out var error);
Assert.False(ok);
Assert.Null(overrides);
Assert.NotNull(error);
}
[Fact]
public void ParseOverrides_JsonNull_ReturnsErrorNotException()
{
var ok = InstanceCommands.TryParseOverrides("null", out var overrides, out var error);
Assert.False(ok);
Assert.Null(overrides);
Assert.NotNull(error);
}
}
@@ -2,6 +2,7 @@ using ScadaLink.CLI;
namespace ScadaLink.CLI.Tests; namespace ScadaLink.CLI.Tests;
[Collection("Console")]
public class OutputFormatterTests public class OutputFormatterTests
{ {
[Fact] [Fact]
@@ -0,0 +1,73 @@
using ScadaLink.CLI;
using ScadaLink.CLI.Commands;
namespace ScadaLink.CLI.Tests;
/// <summary>
/// Regression tests for CLI-002 (empty success body) and CLI-003 (non-JSON success
/// body) — both previously crashed table rendering with an unhandled exception.
/// </summary>
[Collection("Console")]
public class ResponseRenderingTests
{
[Fact]
public void HandleResponse_EmptyBody_TableFormat_DoesNotThrow_ReturnsZero()
{
// CLI-002: a 200/204 with an empty body must be treated as "succeeded, no output".
var writer = new StringWriter();
Console.SetOut(writer);
try
{
var response = new ManagementResponse(204, "", null, null);
var exitCode = CommandHelpers.HandleResponse(response, "table");
Assert.Equal(0, exitCode);
}
finally
{
Console.SetOut(new StreamWriter(Console.OpenStandardOutput()) { AutoFlush = true });
}
}
[Fact]
public void HandleResponse_EmptyBody_JsonFormat_DoesNotThrow_ReturnsZero()
{
var writer = new StringWriter();
Console.SetOut(writer);
try
{
var response = new ManagementResponse(200, " ", null, null);
var exitCode = CommandHelpers.HandleResponse(response, "json");
Assert.Equal(0, exitCode);
}
finally
{
Console.SetOut(new StreamWriter(Console.OpenStandardOutput()) { AutoFlush = true });
}
}
[Fact]
public void HandleResponse_NonJsonBody_TableFormat_FallsBackToRaw_ReturnsZero()
{
// CLI-003: a success status with a non-JSON body (e.g. proxy HTML error page)
// must not crash; it should print the raw body verbatim.
var writer = new StringWriter();
Console.SetOut(writer);
try
{
var response = new ManagementResponse(200, "<html>Service Unavailable</html>", null, null);
var exitCode = CommandHelpers.HandleResponse(response, "table");
Assert.Equal(0, exitCode);
Assert.Contains("<html>Service Unavailable</html>", writer.ToString());
}
finally
{
Console.SetOut(new StreamWriter(Console.OpenStandardOutput()) { AutoFlush = true });
}
}
}
@@ -0,0 +1,16 @@
namespace ScadaLink.CLI.Tests;
/// <summary>
/// xUnit runs test classes in parallel by default. Several CLI test classes redirect
/// the process-global <see cref="System.Console.Out"/> / <see cref="System.Console.Error"/>,
/// which races when they run concurrently. Tests in this collection run serially.
/// </summary>
[CollectionDefinition("Console")]
public sealed class ConsoleCollection { }
/// <summary>
/// Test classes that mutate process-global environment variables run serially so they
/// do not observe each other's changes.
/// </summary>
[CollectionDefinition("Environment")]
public sealed class EnvironmentCollection { }
@@ -0,0 +1,31 @@
using ScadaLink.CLI.Commands;
namespace ScadaLink.CLI.Tests;
/// <summary>
/// Regression tests for CLI-004 — a malformed <c>--url</c> previously reached
/// <c>new Uri(...)</c> in the <see cref="ScadaLink.CLI.ManagementHttpClient"/> constructor
/// and threw an unhandled <see cref="UriFormatException"/>.
/// </summary>
public class UrlValidationTests
{
[Theory]
[InlineData("http://localhost:9001")]
[InlineData("https://central-host:5000/")]
[InlineData("http://central")]
public void IsValidManagementUrl_AcceptsAbsoluteHttpUrls(string url)
{
Assert.True(CommandHelpers.IsValidManagementUrl(url));
}
[Theory]
[InlineData("localhost:9001")] // no scheme
[InlineData("")] // empty
[InlineData(" ")] // whitespace
[InlineData("not a url")]
[InlineData("ftp://central:21")] // non-http scheme
public void IsValidManagementUrl_RejectsMalformedOrNonHttpUrls(string url)
{
Assert.False(CommandHelpers.IsValidManagementUrl(url));
}
}
@@ -0,0 +1,105 @@
using System.Security.Claims;
using Bunit;
using Microsoft.AspNetCore.Components.Authorization;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Logging.Abstractions;
using Microsoft.Extensions.Options;
using NSubstitute;
using ScadaLink.CentralUI.Components.Shared;
using ScadaLink.Commons.Entities.Sites;
using ScadaLink.Commons.Interfaces.Repositories;
using ScadaLink.Commons.Interfaces.Services;
using ScadaLink.Communication;
using ScadaLink.DeploymentManager;
using SitesPage = ScadaLink.CentralUI.Components.Pages.Admin.Sites;
namespace ScadaLink.CentralUI.Tests.Admin;
/// <summary>
/// Regression tests for CentralUI-012. The Sites page loaded all sites and then
/// issued <c>GetDataConnectionsBySiteIdAsync</c> once per site (N+1 database
/// round-trips on every load and post-delete refresh). The fix fetches all
/// data connections in a single <c>GetAllDataConnectionsAsync</c> call and
/// groups them client-side.
/// </summary>
public class SitesPageTests : BunitContext
{
private readonly ISiteRepository _siteRepo = Substitute.For<ISiteRepository>();
private void RegisterServices()
{
Services.AddSingleton(_siteRepo);
var comms = new CommunicationService(
Options.Create(new CommunicationOptions()),
NullLogger<CommunicationService>.Instance);
Services.AddSingleton(comms);
var artifactSvc = new ArtifactDeploymentService(
_siteRepo,
Substitute.For<IDeploymentManagerRepository>(),
Substitute.For<ITemplateEngineRepository>(),
Substitute.For<IExternalSystemRepository>(),
Substitute.For<INotificationRepository>(),
comms,
Substitute.For<IAuditService>(),
Options.Create(new DeploymentManagerOptions()),
NullLogger<ArtifactDeploymentService>.Instance);
Services.AddSingleton(artifactSvc);
Services.AddSingleton<IDialogService>(Substitute.For<IDialogService>());
var identity = new ClaimsIdentity(
new[] { new Claim(ClaimTypes.Name, "admin") }, "TestCookie");
var authState = new AuthenticationState(new ClaimsPrincipal(identity));
Services.AddSingleton<AuthenticationStateProvider>(
new StubAuthStateProvider(authState));
}
private sealed class StubAuthStateProvider : AuthenticationStateProvider
{
private readonly AuthenticationState _state;
public StubAuthStateProvider(AuthenticationState state) => _state = state;
public override Task<AuthenticationState> GetAuthenticationStateAsync()
=> Task.FromResult(_state);
}
private static List<Site> Sites(params int[] ids)
=> ids.Select(id => new Site($"Site{id}", $"SITE-{id}") { Id = id }).ToList();
private static DataConnection Conn(int siteId, string name)
=> new(name, "OpcUa", siteId);
[Fact]
public void LoadData_FetchesAllConnectionsInOneQuery_NoPerSiteQueries()
{
RegisterServices();
_siteRepo.GetAllSitesAsync().Returns(Sites(1, 2, 3));
_siteRepo.GetAllDataConnectionsAsync().Returns(new List<DataConnection>
{
Conn(1, "c1"), Conn(2, "c2"), Conn(3, "c3"),
});
Render<SitesPage>();
// Regression: exactly one bulk query, and zero per-site queries.
_siteRepo.Received(1).GetAllDataConnectionsAsync();
_siteRepo.DidNotReceive().GetDataConnectionsBySiteIdAsync(Arg.Any<int>());
}
[Fact]
public void LoadData_GroupsConnectionsBySite_AndRendersThem()
{
RegisterServices();
_siteRepo.GetAllSitesAsync().Returns(Sites(1, 2));
_siteRepo.GetAllDataConnectionsAsync().Returns(new List<DataConnection>
{
Conn(1, "alpha-conn"), Conn(2, "beta-conn"),
});
var cut = Render<SitesPage>();
Assert.Contains("alpha-conn", cut.Markup);
Assert.Contains("beta-conn", cut.Markup);
}
}
@@ -0,0 +1,104 @@
using System.Reflection;
using System.Security.Claims;
using Bunit;
using Microsoft.AspNetCore.Components.Authorization;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Logging.Abstractions;
using Microsoft.Extensions.Options;
using NSubstitute;
using ScadaLink.CentralUI.Auth;
using ScadaLink.Commons.Entities.Sites;
using ScadaLink.Commons.Interfaces.Repositories;
using ScadaLink.Communication;
using ScadaLink.Communication.Grpc;
using DebugViewPage = ScadaLink.CentralUI.Components.Pages.Deployment.DebugView;
namespace ScadaLink.CentralUI.Tests.Deployment;
/// <summary>
/// Regression tests for CentralUI-009. The <c>DebugView</c> stream callbacks
/// (<c>onEvent</c>/<c>onTerminated</c>) run on an Akka/gRPC thread and capture
/// <c>this</c> and <c>_toast</c>. If the user navigates away, an in-flight
/// callback could still call <c>_toast.ShowError(...)</c> /
/// <c>InvokeAsync(StateHasChanged)</c> on a disposed component. The fix adds a
/// <c>_disposed</c> flag checked at the top of every callback, set in
/// <c>Dispose()</c> before the stream is stopped.
/// <para>
/// The Akka-thread timing race itself is not deterministically reproducible in
/// a unit test (<see cref="DebugStreamService"/> is a non-virtual concrete
/// class with no seam to inject and later fire the callbacks). These tests pin
/// the observable parts of the fix: the component exposes a disposal guard, and
/// disposal is clean and idempotent.
/// </para>
/// </summary>
public class DebugViewDisposalTests : BunitContext
{
private void RegisterServices()
{
// DebugView touches localStorage on render; let bUnit answer loosely.
JSInterop.Mode = JSRuntimeMode.Loose;
var repo = Substitute.For<ITemplateEngineRepository>();
var siteRepo = Substitute.For<ISiteRepository>();
siteRepo.GetAllSitesAsync().Returns(new List<Site>());
Services.AddSingleton(repo);
Services.AddSingleton(siteRepo);
var comms = new CommunicationService(
Options.Create(new CommunicationOptions()),
NullLogger<CommunicationService>.Instance);
Services.AddSingleton(comms);
var grpcFactory = new SiteStreamGrpcClientFactory(NullLoggerFactory.Instance);
var debugStream = new DebugStreamService(
comms, Services.BuildServiceProvider(), grpcFactory,
NullLogger<DebugStreamService>.Instance);
Services.AddSingleton(debugStream);
var identity = new ClaimsIdentity(
new[] { new Claim(ClaimTypes.Name, "deployer") }, "TestCookie");
var authState = new AuthenticationState(new ClaimsPrincipal(identity));
var stubAuth = new StubAuthStateProvider(authState);
Services.AddSingleton<AuthenticationStateProvider>(stubAuth);
Services.AddScoped(_ => new SiteScopeService(stubAuth));
}
private sealed class StubAuthStateProvider : AuthenticationStateProvider
{
private readonly AuthenticationState _state;
public StubAuthStateProvider(AuthenticationState state) => _state = state;
public override Task<AuthenticationState> GetAuthenticationStateAsync()
=> Task.FromResult(_state);
}
[Fact]
public void DebugView_HasDisposalGuardField()
{
// The fix introduces a `_disposed` flag that every stream callback
// checks before touching component state.
var field = typeof(DebugViewPage).GetField(
"_disposed", BindingFlags.Instance | BindingFlags.NonPublic);
Assert.NotNull(field);
Assert.Equal(typeof(bool), field!.FieldType);
}
[Fact]
public void DebugView_Dispose_SetsDisposedFlag_AndIsIdempotent()
{
RegisterServices();
var cut = Render<DebugViewPage>();
var field = typeof(DebugViewPage).GetField(
"_disposed", BindingFlags.Instance | BindingFlags.NonPublic)!;
Assert.False((bool)field.GetValue(cut.Instance)!);
cut.Instance.Dispose();
Assert.True((bool)field.GetValue(cut.Instance)!,
"Dispose() must set the guard so in-flight callbacks no-op.");
// Disposing again must not throw (idempotent).
var ex = Record.Exception(() => cut.Instance.Dispose());
Assert.Null(ex);
}
}
@@ -0,0 +1,58 @@
namespace ScadaLink.CentralUI.Tests.Design;
/// <summary>
/// Regression tests for CentralUI-014. Test Run wires <c>External</c>,
/// <c>Database</c>, and <c>Notify</c> to central's real services, so a Test Run
/// has production-equivalent side effects. The finding asked, at minimum, that
/// this blast radius be surfaced to the user. The Test Run panels in
/// <c>SharedScriptForm</c> and <c>TemplateEdit</c> carry a prominent
/// <c>Real I/O</c> badge and an <c>alert-warning</c> block stating the side
/// effects are real and permanent; <c>ApiMethodForm</c> (Inbound API kind) has
/// no real-I/O surface at all and correctly omits the badge. These tests pin
/// that warning so it cannot silently regress.
/// </summary>
public class TestRunWarningTests
{
private static string SrcRoot
{
get
{
// tests/ScadaLink.CentralUI.Tests/bin/Debug/net10.0 → repo root.
var dir = AppContext.BaseDirectory;
for (var i = 0; i < 6 && dir is not null; i++)
dir = Directory.GetParent(dir)?.FullName;
return Path.Combine(dir!, "src", "ScadaLink.CentralUI",
"Components", "Pages", "Design");
}
}
private static string Read(string fileName)
=> File.ReadAllText(Path.Combine(SrcRoot, fileName));
[Theory]
[InlineData("SharedScriptForm.razor")]
[InlineData("TemplateEdit.razor")]
public void TestRunPanel_WithRealIoSurface_ShowsRealIoBadgeAndWarning(string razorFile)
{
var markup = Read(razorFile);
// The "Real I/O" badge on the Test Run panel header.
Assert.Contains("Real I/O", markup);
// The explicit warning that side effects hit real systems and are permanent.
Assert.Contains("alert-warning", markup);
Assert.Contains("fire for real", markup);
Assert.Contains("Side effects are permanent", markup);
}
[Fact]
public void ApiMethodForm_TestRun_HasNoRealIoBadge_BecauseInboundApiHasNoSideEffectSurface()
{
// The Inbound API sandbox host exposes only Parameters / Route (Route
// throws) — there is no External/Database/Notify, so no "Real I/O".
var markup = Read("ApiMethodForm.razor");
Assert.DoesNotContain("Real I/O", markup);
// It still warns that Route calls throw.
Assert.Contains("alert-warning", markup);
}
}
@@ -0,0 +1,67 @@
using ScadaLink.CentralUI.Components;
namespace ScadaLink.CentralUI.Tests.Monitoring;
/// <summary>
/// Regression tests for CentralUI-008. <c>&lt;input type="datetime-local"&gt;</c>
/// yields the value the user typed in their <i>browser-local</i> time zone. The
/// audit-log filter converted it with <c>new DateTimeOffset(value, TimeSpan.Zero)</c>
/// — relabelling the local wall-clock value as UTC, shifting the query window by
/// the user's offset. <see cref="BrowserTime.LocalInputToUtc"/> performs the
/// correct conversion: it applies the browser offset from <c>getTimezoneOffset()</c>.
/// </summary>
public class BrowserTimeTests
{
[Fact]
public void LocalInputToUtc_Null_ReturnsNull()
{
Assert.Null(BrowserTime.LocalInputToUtc(null, 0));
}
[Fact]
public void LocalInputToUtc_UtcBrowser_LeavesTimeUnchanged()
{
// getTimezoneOffset() == 0 for a UTC browser.
var local = new DateTime(2026, 5, 16, 9, 30, 0);
var utc = BrowserTime.LocalInputToUtc(local, 0);
Assert.Equal(new DateTimeOffset(2026, 5, 16, 9, 30, 0, TimeSpan.Zero), utc);
}
[Fact]
public void LocalInputToUtc_PositiveUtcOffsetBrowser_SubtractsOffset()
{
// A browser at UTC+2 reports getTimezoneOffset() == -120.
// The user typing 09:30 local means 07:30 UTC.
var local = new DateTime(2026, 5, 16, 9, 30, 0);
var utc = BrowserTime.LocalInputToUtc(local, -120);
Assert.Equal(new DateTimeOffset(2026, 5, 16, 7, 30, 0, TimeSpan.Zero), utc);
}
[Fact]
public void LocalInputToUtc_NegativeUtcOffsetBrowser_AddsOffset()
{
// A browser at UTC-5 (US Eastern, standard time) reports getTimezoneOffset() == 300.
// The user typing 09:30 local means 14:30 UTC.
var local = new DateTime(2026, 5, 16, 9, 30, 0);
var utc = BrowserTime.LocalInputToUtc(local, 300);
Assert.Equal(new DateTimeOffset(2026, 5, 16, 14, 30, 0, TimeSpan.Zero), utc);
}
[Fact]
public void LocalInputToUtc_NonUtcBrowser_DoesNotEqualNaiveRelabelling()
{
// The pre-fix bug: naive new DateTimeOffset(value, TimeSpan.Zero).
var local = new DateTime(2026, 5, 16, 9, 30, 0);
var naive = new DateTimeOffset(local, TimeSpan.Zero);
var correct = BrowserTime.LocalInputToUtc(local, 300);
Assert.NotEqual(naive, correct);
}
}
@@ -0,0 +1,50 @@
using Microsoft.AspNetCore.Authorization;
using ScadaLink.CentralUI.Components.Pages.Monitoring;
using ScadaLink.Security;
namespace ScadaLink.CentralUI.Tests.Monitoring;
/// <summary>
/// Regression tests for CentralUI-007. The design doc classifies the Site Event
/// Log Viewer and Parked Message Management as <b>Deployment Role</b>, but both
/// pages were annotated only <c>[Authorize]</c> (any authenticated user) — a
/// non-Deployment user who followed the nav link could query event logs and
/// retry/discard parked messages. The Health Dashboard is intentionally
/// all-roles per the design.
/// </summary>
public class MonitoringAuthorizationTests
{
private static AuthorizeAttribute? AuthorizeOf<TPage>()
=> typeof(TPage)
.GetCustomAttributes(typeof(AuthorizeAttribute), true)
.Cast<AuthorizeAttribute>()
.FirstOrDefault();
[Fact]
public void EventLogsPage_RequiresDeploymentPolicy()
{
var attr = AuthorizeOf<EventLogs>();
Assert.NotNull(attr);
Assert.Equal(AuthorizationPolicies.RequireDeployment, attr!.Policy);
}
[Fact]
public void ParkedMessagesPage_RequiresDeploymentPolicy()
{
var attr = AuthorizeOf<ParkedMessages>();
Assert.NotNull(attr);
Assert.Equal(AuthorizationPolicies.RequireDeployment, attr!.Policy);
}
[Fact]
public void HealthDashboard_IsIntentionallyAllAuthenticatedRoles()
{
// Health Dashboard stays all-roles (no policy) per the design doc.
var attr = AuthorizeOf<Health>();
Assert.NotNull(attr);
Assert.Null(attr!.Policy);
}
}
@@ -0,0 +1,91 @@
using Microsoft.Extensions.Caching.Memory;
using NSubstitute;
using ScadaLink.CentralUI.ScriptAnalysis;
using ScadaLink.TemplateEngine;
namespace ScadaLink.CentralUI.Tests.ScriptAnalysis;
/// <summary>
/// Regression tests for CentralUI-013. <c>ResolveCalledShape</c> resolved shared
/// script shapes with <c>_sharedScripts.GetShapesAsync().GetAwaiter().GetResult()</c>
/// — a sync-over-async block on a request thread that risks thread-pool
/// starvation and deadlock. <c>Hover</c> and <c>SignatureHelp</c> were synchronous
/// purely to accommodate that block. The fix makes both methods async and
/// <c>await</c>s the catalog.
/// </summary>
public class ScriptAnalysisAsyncResolveTests
{
private readonly ISharedScriptCatalog _catalog = Substitute.For<ISharedScriptCatalog>();
private readonly IMemoryCache _cache = new MemoryCache(new MemoryCacheOptions { SizeLimit = 100 });
private readonly IServiceProvider _services = Substitute.For<IServiceProvider>();
private readonly ScriptAnalysisService _svc;
public ScriptAnalysisAsyncResolveTests()
{
_catalog.GetShapesAsync().Returns(Array.Empty<ScriptShape>());
_svc = new ScriptAnalysisService(_catalog, _cache, _services);
}
private static ScriptShape Shape(string name, params ParameterShape[] ps) => new(name, ps, null);
private static ParameterShape Param(string name, string type) => new(name, type, true);
[Fact]
public async Task HoverAsync_OnSharedCallName_AwaitsCatalog_AndResolvesShape()
{
// The catalog only completes after yielding — a truly asynchronous
// source. The fixed Hover awaits it instead of blocking.
_catalog.GetShapesAsync().Returns(async _ =>
{
await Task.Yield();
return (IReadOnlyList<ScriptShape>)new[]
{
Shape("Aggregate", Param("window", "Integer")),
};
});
var resp = await _svc.Hover(new HoverRequest(
CodeText: "var r = Scripts.CallShared(\"Aggregate\");",
Line: 1,
Column: 30));
Assert.NotNull(resp.Markdown);
Assert.Contains("shared script", resp.Markdown);
Assert.Contains("Aggregate", resp.Markdown);
}
[Fact]
public async Task SignatureHelpAsync_InsideSharedCall_AwaitsCatalog_AndResolvesParameters()
{
_catalog.GetShapesAsync().Returns(async _ =>
{
await Task.Yield();
return (IReadOnlyList<ScriptShape>)new[]
{
Shape("Aggregate", Param("window", "Integer"), Param("mode", "String")),
};
});
var resp = await _svc.SignatureHelp(new SignatureHelpRequest(
CodeText: "var r = Scripts.CallShared(\"Aggregate\", ",
Line: 1,
Column: 41));
Assert.NotNull(resp.Label);
Assert.Contains("Aggregate", resp.Label!);
Assert.Equal(2, resp.Parameters!.Count);
}
[Fact]
public void HoverAndSignatureHelp_AreAsync_NotSyncOverAsync()
{
// Structural guard: the methods must return Task so the catalog can be
// awaited rather than blocked with .GetAwaiter().GetResult().
var hover = typeof(ScriptAnalysisService).GetMethod(nameof(ScriptAnalysisService.Hover));
var sigHelp = typeof(ScriptAnalysisService).GetMethod(nameof(ScriptAnalysisService.SignatureHelp));
Assert.NotNull(hover);
Assert.NotNull(sigHelp);
Assert.Equal(typeof(Task<HoverResponse>), hover!.ReturnType);
Assert.Equal(typeof(Task<SignatureHelpResponse>), sigHelp!.ReturnType);
}
}
@@ -200,11 +200,11 @@ public class ScriptAnalysisServiceTests
// ── Hover ───────────────────────────────────────────────────────────── // ── Hover ─────────────────────────────────────────────────────────────
[Fact] [Fact]
public void Hover_OnSiblingName_ReturnsSignature() public async Task Hover_OnSiblingName_ReturnsSignature()
{ {
var siblings = new[] { Shape("Calc", Param("x", "Integer"), Param("y", "Float")) }; var siblings = new[] { Shape("Calc", Param("x", "Integer"), Param("y", "Float")) };
// Cursor inside the "Calc" name literal of Instance.CallScript("Calc", ...). // Cursor inside the "Calc" name literal of Instance.CallScript("Calc", ...).
var resp = _svc.Hover(new HoverRequest( var resp = await _svc.Hover(new HoverRequest(
CodeText: "var r = Instance.CallScript(\"Calc\", 1, 2);", CodeText: "var r = Instance.CallScript(\"Calc\", 1, 2);",
Line: 1, Line: 1,
Column: 32, Column: 32,
@@ -215,9 +215,9 @@ public class ScriptAnalysisServiceTests
} }
[Fact] [Fact]
public void Hover_OnUnrelatedToken_ReturnsNull() public async Task Hover_OnUnrelatedToken_ReturnsNull()
{ {
var resp = _svc.Hover(new HoverRequest( var resp = await _svc.Hover(new HoverRequest(
CodeText: "var r = 1 + 2;", CodeText: "var r = 1 + 2;",
Line: 1, Line: 1,
Column: 5)); Column: 5));
@@ -227,10 +227,10 @@ public class ScriptAnalysisServiceTests
// ── Signature help ──────────────────────────────────────────────────── // ── Signature help ────────────────────────────────────────────────────
[Fact] [Fact]
public void SignatureHelp_InsideCallScript_ReturnsParameterStrip() public async Task SignatureHelp_InsideCallScript_ReturnsParameterStrip()
{ {
var siblings = new[] { Shape("Calc", Param("x", "Integer"), Param("y", "Float")) }; var siblings = new[] { Shape("Calc", Param("x", "Integer"), Param("y", "Float")) };
var resp = _svc.SignatureHelp(new SignatureHelpRequest( var resp = await _svc.SignatureHelp(new SignatureHelpRequest(
CodeText: "var r = Instance.CallScript(\"Calc\", 1, ", CodeText: "var r = Instance.CallScript(\"Calc\", 1, ",
Line: 1, Line: 1,
Column: 40, Column: 40,
@@ -243,9 +243,9 @@ public class ScriptAnalysisServiceTests
} }
[Fact] [Fact]
public void SignatureHelp_OutsideCall_ReturnsNull() public async Task SignatureHelp_OutsideCall_ReturnsNull()
{ {
var resp = _svc.SignatureHelp(new SignatureHelpRequest( var resp = await _svc.SignatureHelp(new SignatureHelpRequest(
CodeText: "var r = 1 + 2;", CodeText: "var r = 1 + 2;",
Line: 1, Line: 1,
Column: 5)); Column: 5));
@@ -394,9 +394,9 @@ public class ScriptAnalysisServiceTests
// ── Hover on Parameters["name"] ─────────────────────────────────────── // ── Hover on Parameters["name"] ───────────────────────────────────────
[Fact] [Fact]
public void Hover_OnParametersKey_ShowsDeclaredType() public async Task Hover_OnParametersKey_ShowsDeclaredType()
{ {
var resp = _svc.Hover(new HoverRequest( var resp = await _svc.Hover(new HoverRequest(
CodeText: "var x = Parameters[\"name\"];", CodeText: "var x = Parameters[\"name\"];",
Line: 1, Line: 1,
Column: 22, Column: 22,
@@ -0,0 +1,60 @@
using Bunit;
using ScadaLink.CentralUI.Components.Shared;
namespace ScadaLink.CentralUI.Tests.Shared;
/// <summary>
/// Regression tests for CentralUI-011. <c>DiffDialog.OpenAsync</c> returns the
/// <c>TaskCompletionSource</c>'s task, completed only by <c>Close()</c>. If the
/// user navigated away while the dialog was open, <c>DisposeAsync</c> ran but
/// never completed the TCS — the awaiting caller was suspended forever and any
/// cleanup after the await was skipped. The fix completes the TCS in
/// <c>DisposeAsync</c>.
/// </summary>
public class DiffDialogTests : BunitContext
{
[Fact]
public async Task DisposeAsync_WhileOpen_CompletesPendingTask()
{
var cut = Render<DiffDialog>();
// Open the dialog; the returned task represents the caller's await.
// Block-bodied lambda so InvokeAsync sees a void delegate — it must NOT
// await the dialog's own (deliberately long-lived) task.
Task<bool> pending = null!;
await cut.InvokeAsync(
() => { pending = cut.Instance.ShowAsync("Compare", "before", "after"); });
Assert.False(pending.IsCompleted, "Dialog task should be pending while open.");
// Simulate navigating away while the dialog is still open.
await cut.InvokeAsync(async () => await cut.Instance.DisposeAsync());
// The awaiter must complete deterministically rather than hang forever.
var completed = await Task.WhenAny(pending, Task.Delay(TimeSpan.FromSeconds(2)));
Assert.Same(pending, completed);
Assert.True(pending.IsCompletedSuccessfully);
var result = await pending;
Assert.False(result, "Dismiss-on-dispose should resolve to false (not confirmed).");
}
[Fact]
public async Task Close_CompletesPendingTaskWithTrue()
{
var cut = Render<DiffDialog>();
// Block-bodied lambda so InvokeAsync sees a void delegate — it must NOT
// await the dialog's own (deliberately long-lived) task.
Task<bool> pending = null!;
await cut.InvokeAsync(
() => { pending = cut.Instance.ShowAsync("Compare", "before", "after"); });
// Closing via the Close button completes the task with true.
await cut.InvokeAsync(() => cut.Find("button.btn-secondary").Click());
var completed = await Task.WhenAny(pending, Task.Delay(TimeSpan.FromSeconds(2)));
Assert.Same(pending, completed);
var result = await pending;
Assert.True(result);
}
}
@@ -0,0 +1,80 @@
using Bunit;
using ScadaLink.CentralUI.Components.Shared;
namespace ScadaLink.CentralUI.Tests.Shared;
/// <summary>
/// Regression tests for CentralUI-010. <c>ToastNotification.AddToast</c>
/// scheduled <c>Task.Delay(dismissMs).ContinueWith(...)</c> with the result
/// discarded; the continuation called <c>InvokeAsync(StateHasChanged)</c>. When
/// the host page is disposed before the delay elapses, the continuation ran
/// against a disposed component and threw <c>ObjectDisposedException</c> on a
/// thread-pool thread with no catch (an unobserved task exception). The fix
/// holds a <c>CancellationTokenSource</c> cancelled in <c>Dispose()</c>.
/// </summary>
public class ToastNotificationTests : BunitContext
{
[Fact]
public async Task ShowToast_AfterDisposal_IsNoOp_AndSchedulesNothing()
{
// Regression: the pre-fix AddToast always added the toast and scheduled
// a Task.Delay continuation, even after Dispose() — the continuation
// then ran InvokeAsync(StateHasChanged) against the disposed component.
// The fix short-circuits AddToast once the disposal token is cancelled.
var cut = Render<ToastNotification>();
await cut.InvokeAsync(() => cut.Instance.Dispose());
await cut.InvokeAsync(() => cut.Instance.ShowError("after dispose", autoDismissMs: 20));
Assert.Equal(0, cut.Instance.ToastCount);
}
[Fact]
public async Task AutoDismiss_AfterDisposal_DoesNotThrowUnobservedException()
{
var unobserved = new List<Exception>();
void Handler(object? s, UnobservedTaskExceptionEventArgs e)
{
unobserved.Add(e.Exception);
e.SetObserved();
}
TaskScheduler.UnobservedTaskException += Handler;
try
{
var cut = Render<ToastNotification>();
// Auto-dismiss after a very short delay so the continuation is
// guaranteed to fire well after we dispose the component.
await cut.InvokeAsync(() => cut.Instance.ShowSuccess("hello", autoDismissMs: 20));
// Dispose the component while the auto-dismiss is still pending.
await cut.InvokeAsync(() => cut.Instance.Dispose());
// Give the (now-cancelled) auto-dismiss well past its delay.
await Task.Delay(250);
GC.Collect();
GC.WaitForPendingFinalizers();
GC.Collect();
}
finally
{
TaskScheduler.UnobservedTaskException -= Handler;
}
Assert.Empty(unobserved);
}
[Fact]
public async Task AutoDismiss_BeforeDisposal_StillRemovesToast()
{
var cut = Render<ToastNotification>();
await cut.InvokeAsync(() => cut.Instance.ShowInfo("transient", autoDismissMs: 20));
// The toast is visible immediately.
Assert.Contains("transient", cut.Markup);
// After the dismiss delay it is removed (auto-dismiss still works).
cut.WaitForAssertion(
() => Assert.DoesNotContain("transient", cut.Markup),
timeout: TimeSpan.FromSeconds(2));
}
}
@@ -15,6 +15,15 @@ public class ClusterOptionsTests
Assert.Equal(TimeSpan.FromSeconds(2), options.HeartbeatInterval); Assert.Equal(TimeSpan.FromSeconds(2), options.HeartbeatInterval);
Assert.Equal(TimeSpan.FromSeconds(10), options.FailureDetectionThreshold); Assert.Equal(TimeSpan.FromSeconds(10), options.FailureDetectionThreshold);
Assert.Equal(1, options.MinNrOfMembers); Assert.Equal(1, options.MinNrOfMembers);
Assert.True(options.DownIfAlone);
}
[Fact]
public void DownIfAlone_CanBeSet()
{
var options = new ClusterOptions { DownIfAlone = false };
Assert.False(options.DownIfAlone);
} }
[Fact] [Fact]
@@ -0,0 +1,119 @@
using Microsoft.Extensions.Options;
namespace ScadaLink.ClusterInfrastructure.Tests;
/// <summary>
/// CI-004: Tests that <see cref="ClusterOptionsValidator"/> rejects the
/// catastrophic misconfigurations the design doc warns against — a
/// <c>MinNrOfMembers</c> other than 1, an unsupported split-brain strategy,
/// empty seed nodes, and timings where the heartbeat is not below the
/// failure-detection threshold.
/// </summary>
public class ClusterOptionsValidatorTests
{
private static ClusterOptions ValidOptions() => new()
{
SeedNodes = new List<string> { "akka.tcp://scadalink@node1:8081", "akka.tcp://scadalink@node2:8081" },
SplitBrainResolverStrategy = "keep-oldest",
StableAfter = TimeSpan.FromSeconds(15),
HeartbeatInterval = TimeSpan.FromSeconds(2),
FailureDetectionThreshold = TimeSpan.FromSeconds(10),
MinNrOfMembers = 1,
DownIfAlone = true
};
[Fact]
public void DefaultOptions_AreValid()
{
var result = new ClusterOptionsValidator().Validate(null, ValidOptions());
Assert.True(result.Succeeded, result.FailureMessage);
}
[Fact]
public void MinNrOfMembers_NotOne_FailsValidation()
{
var options = ValidOptions();
options.MinNrOfMembers = 2;
var result = new ClusterOptionsValidator().Validate(null, options);
Assert.True(result.Failed);
Assert.Contains("MinNrOfMembers", result.FailureMessage);
}
[Theory]
[InlineData("keep-majority")]
[InlineData("static-quorum")]
[InlineData("nonsense")]
public void UnsupportedSplitBrainStrategy_FailsValidation(string strategy)
{
var options = ValidOptions();
options.SplitBrainResolverStrategy = strategy;
var result = new ClusterOptionsValidator().Validate(null, options);
Assert.True(result.Failed);
Assert.Contains("SplitBrainResolverStrategy", result.FailureMessage);
}
[Fact]
public void EmptySeedNodes_FailsValidation()
{
var options = ValidOptions();
options.SeedNodes = new List<string>();
var result = new ClusterOptionsValidator().Validate(null, options);
Assert.True(result.Failed);
Assert.Contains("SeedNodes", result.FailureMessage);
}
[Fact]
public void HeartbeatNotBelowFailureThreshold_FailsValidation()
{
var options = ValidOptions();
options.HeartbeatInterval = TimeSpan.FromSeconds(10);
options.FailureDetectionThreshold = TimeSpan.FromSeconds(10);
var result = new ClusterOptionsValidator().Validate(null, options);
Assert.True(result.Failed);
Assert.Contains("HeartbeatInterval", result.FailureMessage);
}
[Fact]
public void NonPositiveStableAfter_FailsValidation()
{
var options = ValidOptions();
options.StableAfter = TimeSpan.Zero;
var result = new ClusterOptionsValidator().Validate(null, options);
Assert.True(result.Failed);
Assert.Contains("StableAfter", result.FailureMessage);
}
[Fact]
public void Validate_AccumulatesAllFailures()
{
var options = new ClusterOptions
{
SeedNodes = new List<string>(),
SplitBrainResolverStrategy = "keep-majority",
MinNrOfMembers = 2,
StableAfter = TimeSpan.Zero,
HeartbeatInterval = TimeSpan.FromSeconds(20),
FailureDetectionThreshold = TimeSpan.FromSeconds(10)
};
var result = new ClusterOptionsValidator().Validate(null, options);
Assert.True(result.Failed);
Assert.Contains("SeedNodes", result.FailureMessage);
Assert.Contains("SplitBrainResolverStrategy", result.FailureMessage);
Assert.Contains("MinNrOfMembers", result.FailureMessage);
Assert.Contains("StableAfter", result.FailureMessage);
Assert.Contains("HeartbeatInterval", result.FailureMessage);
}
}
@@ -10,6 +10,8 @@
<ItemGroup> <ItemGroup>
<PackageReference Include="coverlet.collector" /> <PackageReference Include="coverlet.collector" />
<PackageReference Include="Microsoft.Extensions.DependencyInjection" />
<PackageReference Include="Microsoft.Extensions.Options" />
<PackageReference Include="Microsoft.NET.Test.Sdk" /> <PackageReference Include="Microsoft.NET.Test.Sdk" />
<PackageReference Include="xunit" /> <PackageReference Include="xunit" />
<PackageReference Include="xunit.runner.visualstudio" /> <PackageReference Include="xunit.runner.visualstudio" />
@@ -0,0 +1,58 @@
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Options;
namespace ScadaLink.ClusterInfrastructure.Tests;
/// <summary>
/// CI-002: Tests that the DI extension methods do real work rather than
/// silently returning success. <see cref="ServiceCollectionExtensions.AddClusterInfrastructure"/>
/// must register the <see cref="ClusterOptionsValidator"/> so misconfiguration
/// fails fast, and the unimplemented actor-registration placeholder must fail
/// loudly rather than masquerade as a completed registration.
/// </summary>
public class ServiceCollectionExtensionsTests
{
[Fact]
public void AddClusterInfrastructure_RegistersOptionsValidator()
{
var services = new ServiceCollection();
services.AddClusterInfrastructure();
var validators = services
.Where(d => d.ServiceType == typeof(IValidateOptions<ClusterOptions>))
.ToList();
Assert.NotEmpty(validators);
using var provider = services.BuildServiceProvider();
var validator = provider.GetService<IValidateOptions<ClusterOptions>>();
Assert.IsType<ClusterOptionsValidator>(validator);
}
[Fact]
public void AddClusterInfrastructure_ValidatorRejectsBadOptionsAtResolution()
{
var services = new ServiceCollection();
services.AddClusterInfrastructure();
// A MinNrOfMembers of 2 blocks the cluster singleton after failover.
services.Configure<ClusterOptions>(o =>
{
o.SeedNodes = new List<string> { "akka.tcp://scadalink@node1:8081" };
o.MinNrOfMembers = 2;
});
using var provider = services.BuildServiceProvider();
var ex = Assert.Throws<OptionsValidationException>(
() => provider.GetRequiredService<IOptions<ClusterOptions>>().Value);
Assert.Contains("MinNrOfMembers", ex.Message);
}
[Fact]
public void AddClusterInfrastructureActors_ThrowsRatherThanSilentlySucceeding()
{
var services = new ServiceCollection();
Assert.Throws<NotImplementedException>(() => services.AddClusterInfrastructureActors());
}
}
@@ -0,0 +1,78 @@
using System.Reflection;
using ScadaLink.Commons.Messages.Management;
namespace ScadaLink.Commons.Tests.Messages;
/// <summary>
/// Tests for <see cref="ManagementCommandRegistry"/>, including the Commons-004
/// regression: <c>GetCommandName</c> and <c>Resolve</c> must be symmetric — every
/// type for which <c>GetCommandName</c> yields a name must round-trip back to the
/// same type via <c>Resolve</c>.
/// </summary>
public class ManagementCommandRegistryTests
{
private static IEnumerable<Type> RegisteredCommandTypes() =>
typeof(ManagementEnvelope).Assembly.GetTypes()
.Where(t => t.Namespace == typeof(ManagementEnvelope).Namespace
&& t.Name.EndsWith("Command", StringComparison.Ordinal)
&& !t.IsAbstract);
[Fact]
public void GetCommandName_Resolve_RoundTrips_ForEveryRegisteredCommand()
{
foreach (var type in RegisteredCommandTypes())
{
var name = ManagementCommandRegistry.GetCommandName(type);
var resolved = ManagementCommandRegistry.Resolve(name);
Assert.Equal(type, resolved);
}
}
[Fact]
public void Resolve_KnownCommand_ReturnsType()
{
var type = ManagementCommandRegistry.Resolve("CreateSite");
Assert.Equal(typeof(CreateSiteCommand), type);
}
[Fact]
public void Resolve_UnknownCommand_ReturnsNull()
{
Assert.Null(ManagementCommandRegistry.Resolve("NoSuchCommand"));
}
[Fact]
public void Resolve_IsCaseInsensitive()
{
Assert.Equal(typeof(CreateSiteCommand), ManagementCommandRegistry.Resolve("createsite"));
}
/// <summary>
/// Commons-004: <c>GetCommandName</c> previously stripped a <c>Command</c> suffix
/// from <em>any</em> type, producing names the registry cannot resolve. It must
/// only return a name for a command type the registry actually contains.
/// </summary>
[Fact]
public void GetCommandName_UnregisteredCommandType_Throws()
{
// A *Command type that is not in the Messages.Management namespace.
Assert.Throws<ArgumentException>(
() => ManagementCommandRegistry.GetCommandName(typeof(UnregisteredFakeCommand)));
}
[Fact]
public void GetCommandName_NonCommandType_Throws()
{
Assert.Throws<ArgumentException>(
() => ManagementCommandRegistry.GetCommandName(typeof(string)));
}
[Fact]
public void GetCommandName_RegisteredCommand_ReturnsStrippedName()
{
Assert.Equal("CreateSite", ManagementCommandRegistry.GetCommandName(typeof(CreateSiteCommand)));
}
/// <summary>A *Command record outside the Management namespace, for the negative test.</summary>
private record UnregisteredFakeCommand(int Id);
}
@@ -0,0 +1,103 @@
using System.Text.Json;
using ScadaLink.Commons.Types;
namespace ScadaLink.Commons.Tests.Types;
/// <summary>
/// Tests for <see cref="DynamicJsonElement"/>, including the Commons-002 regression:
/// a wrapped element must remain valid for deferred (script-time) access even after
/// the <see cref="JsonDocument"/> it was parsed from has been disposed.
/// </summary>
public class DynamicJsonElementTests
{
private static DynamicJsonElement Wrap(string json)
{
using var doc = JsonDocument.Parse(json);
return new DynamicJsonElement(doc.RootElement);
// doc is disposed here — a wrapper that retained a non-cloned element would
// now throw ObjectDisposedException on the first member access.
}
// ── Commons-002 regression: lifetime independence from the source document ──
[Fact]
public void MemberAccess_WorksAfterSourceDocumentDisposed()
{
dynamic obj = Wrap("""{ "name": "pump", "id": 7 }""");
Assert.Equal("pump", (string)obj.name);
Assert.Equal(7, (int)obj.id);
}
[Fact]
public void IndexAccess_WorksAfterSourceDocumentDisposed()
{
dynamic obj = Wrap("""{ "items": [ "a", "b", "c" ] }""");
Assert.Equal("b", obj.items[1]);
}
[Fact]
public void NestedAccess_WorksAfterSourceDocumentDisposed()
{
dynamic obj = Wrap("""{ "outer": { "inner": { "value": 42 } } }""");
Assert.Equal(42, (int)obj.outer.inner.value);
}
[Fact]
public void ToString_WorksAfterSourceDocumentDisposed()
{
dynamic obj = Wrap("""{ "label": "site-1" }""");
Assert.Equal("site-1", obj.label.ToString());
}
[Fact]
public void Access_SurvivesGarbageCollection_OfSourceDocument()
{
// No reference to the source document is held anywhere; force collection
// and finalization to prove the wrapper does not depend on it.
var obj = MakeWrapperAndDropDocument();
GC.Collect();
GC.WaitForPendingFinalizers();
GC.Collect();
dynamic d = obj;
Assert.Equal("ok", (string)d.status);
}
private static DynamicJsonElement MakeWrapperAndDropDocument()
{
var doc = JsonDocument.Parse("""{ "status": "ok" }""");
var wrapper = new DynamicJsonElement(doc.RootElement);
doc.Dispose();
return wrapper;
}
// ── Basic conversion / access behavior ──────────────────────────
[Fact]
public void Convert_NumberToInt()
{
dynamic obj = Wrap("""{ "n": 123 }""");
Assert.Equal(123, (int)obj.n);
}
[Fact]
public void Convert_BoolFromJson()
{
dynamic obj = Wrap("""{ "flag": true }""");
Assert.True((bool)obj.flag);
}
[Fact]
public void MissingMember_Throws()
{
// TryGetMember returns false for an absent property, so the dynamic binder
// surfaces a RuntimeBinderException — the standard DynamicObject contract.
dynamic obj = Wrap("""{ "a": 1 }""");
Assert.Throws<Microsoft.CSharp.RuntimeBinder.RuntimeBinderException>(
() => { var _ = obj.doesNotExist; });
}
}
@@ -148,11 +148,30 @@ public class ScriptParametersTests
Assert.Equal(42, p.Get<int?>("x")); Assert.Equal(42, p.Get<int?>("x"));
} }
// Commons-003: a parameter that is *present but unconvertible* is a caller/script
// bug and must throw — not be silently mapped to null (which a script would
// misread as "not supplied"). Genuinely absent/null still returns null.
[Fact] [Fact]
public void Get_NullableInt_Unparsable_ReturnsNull() public void Get_NullableInt_PresentButUnparsable_Throws()
{ {
var p = new ScriptParameters(new Dictionary<string, object?> { ["x"] = "abc" }); var p = new ScriptParameters(new Dictionary<string, object?> { ["x"] = "banana" });
Assert.Null(p.Get<int?>("x")); var ex = Assert.Throws<ScriptParameterException>(() => p.Get<int?>("x"));
Assert.Contains("could not be parsed as Int32", ex.Message);
}
[Fact]
public void Get_NullableInt_PresentButOverflowing_Throws()
{
var p = new ScriptParameters(new Dictionary<string, object?> { ["x"] = long.MaxValue });
var ex = Assert.Throws<ScriptParameterException>(() => p.Get<int?>("x"));
Assert.Contains("could not be parsed as Int32", ex.Message);
}
[Fact]
public void Get_NullableDateTime_PresentButUnparsable_Throws()
{
var p = new ScriptParameters(new Dictionary<string, object?> { ["dt"] = "not-a-date" });
Assert.Throws<ScriptParameterException>(() => p.Get<DateTime?>("dt"));
} }
[Fact] [Fact]
@@ -0,0 +1,119 @@
using ScadaLink.Commons.Types;
namespace ScadaLink.Commons.Tests.Types;
/// <summary>
/// Regression tests for Commons-001: the check-then-act race between the timer
/// callback (<c>OnTimerElapsed</c>) and <c>OnValueReceived</c> / <c>Stop</c> / <c>Start</c>.
///
/// The original implementation guarded firing with a single <c>volatile bool</c> that
/// was both read by the callback and reset by the caller threads. Because the
/// check-then-set was not atomic with the timer reschedule, a callback that had
/// already entered could raise <c>Stale</c> after the period it was scheduled for
/// had been cancelled or restarted — a spurious staleness signal that, for a
/// connection-health monitor, triggers an unnecessary reconnect.
///
/// These tests use the internal <c>CallbackEnteredHook</c> seam to deterministically
/// interleave a caller-thread operation with an in-flight callback.
/// </summary>
public class StaleTagMonitorRaceTests
{
/// <summary>
/// A value arrives (<c>OnValueReceived</c>) while a previous-period timer callback
/// is in flight, before that callback decides whether to fire. The old period has
/// been superseded, so the in-flight callback must not raise <c>Stale</c>
/// immediately; <c>Stale</c> may only fire later, for the fresh period, after a
/// full <c>MaxSilence</c> with no further values.
///
/// With the original single-volatile-bool guard the in-flight callback fired
/// <c>Stale</c> right after the value arrived (a spurious, wrong-moment signal);
/// this test detects that by checking how soon the fire lands after the value.
/// </summary>
[Fact]
public void Stale_DoesNotFirePromptly_WhenValueArrivesWhileCallbackInFlight()
{
var maxSilence = TimeSpan.FromMilliseconds(60);
using var monitor = new StaleTagMonitor(maxSilence);
DateTime? valueArrivedAt = null;
DateTime? staleFiredAt = null;
monitor.Stale += () => staleFiredAt ??= DateTime.UtcNow;
// When the (old-period) callback is entered, simulate a fresh value arriving
// on another thread before the callback's fire decision.
monitor.CallbackEnteredHook = () =>
{
monitor.CallbackEnteredHook = null; // only intercept the first callback
valueArrivedAt = DateTime.UtcNow;
monitor.OnValueReceived();
};
monitor.Start();
// Wait well past the intercepted callback and the fresh period's deadline.
Thread.Sleep(300);
monitor.Stop();
// The fresh period legitimately goes stale, so a fire is expected — but it
// must land roughly MaxSilence after the value, not immediately. A spurious
// wrong-moment fire from the superseded callback would land within a few ms.
Assert.NotNull(valueArrivedAt);
Assert.NotNull(staleFiredAt);
var delay = staleFiredAt.Value - valueArrivedAt.Value;
Assert.True(delay >= maxSilence * 0.5,
$"Stale fired only {delay.TotalMilliseconds:F0}ms after the value arrived; " +
$"expected at least {maxSilence.TotalMilliseconds * 0.5:F0}ms — the in-flight " +
"callback fired spuriously for the superseded period.");
}
/// <summary>
/// <c>Stop</c> races an in-flight timer callback. Once monitoring is stopped no
/// <c>Stale</c> signal may be delivered, even for a callback that had already
/// been entered.
/// </summary>
[Fact]
public void Stale_DoesNotFire_WhenStopRacesInFlightCallback()
{
using var monitor = new StaleTagMonitor(TimeSpan.FromMilliseconds(30));
var staleCount = 0;
monitor.Stale += () => Interlocked.Increment(ref staleCount);
monitor.CallbackEnteredHook = () =>
{
monitor.CallbackEnteredHook = null;
monitor.Stop();
};
monitor.Start();
Thread.Sleep(200);
Assert.Equal(0, staleCount);
}
/// <summary>
/// <c>Start</c> (a restart) races an in-flight callback from the prior run. The
/// old callback belongs to a superseded period and must not fire; the new period
/// fires exactly once on its own deadline.
/// </summary>
[Fact]
public void Stale_FiresOnceForNewPeriod_WhenRestartRacesInFlightCallback()
{
using var monitor = new StaleTagMonitor(TimeSpan.FromMilliseconds(30));
var staleCount = 0;
monitor.Stale += () => Interlocked.Increment(ref staleCount);
monitor.CallbackEnteredHook = () =>
{
monitor.CallbackEnteredHook = null;
monitor.Start(); // restart — supersedes the in-flight callback's period
};
monitor.Start();
// Old callback must be suppressed; the restarted period fires exactly once.
Thread.Sleep(250);
monitor.Stop();
Assert.Equal(1, staleCount);
}
}
@@ -197,6 +197,31 @@ public class CentralCommunicationActorTests : TestKit
Assert.Equal("dep2", ((DeployInstanceCommand)msg2.Message).DeploymentId); Assert.Equal("dep2", ((DeployInstanceCommand)msg2.Message).DeploymentId);
} }
[Fact]
public void LoadSiteAddressesFailure_IsLoggedNotSilentlySwallowed()
{
// Regression test for Communication-006. When the repository query throws,
// PipeTo delivers a Status.Failure to the actor. Without a Receive<Status.Failure>
// handler the failure becomes an unhandled message (debug-level only) and the
// periodic refresh fails silently — operators cannot tell "no addresses
// configured" from "database is down". The fix logs the failure at Warning.
var mockRepo = Substitute.For<ISiteRepository>();
mockRepo.GetAllSitesAsync(Arg.Any<CancellationToken>())
.Returns<Task<IReadOnlyList<Site>>>(_ => throw new InvalidOperationException("database is down"));
var services = new ServiceCollection();
services.AddScoped(_ => mockRepo);
var sp = services.BuildServiceProvider();
var mockFactory = Substitute.For<ISiteClientFactory>();
// The fix logs a Warning carrying the InvalidOperationException as the cause.
EventFilter.Warning(contains: "Failed to load site addresses from the database").ExpectOne(() =>
{
Sys.ActorOf(Props.Create(() => new CentralCommunicationActor(sp, mockFactory)));
});
}
[Fact] [Fact]
public void BothContactPoints_UsedInSingleClient() public void BothContactPoints_UsedInSingleClient()
{ {
@@ -0,0 +1,83 @@
using Akka.Actor;
using Akka.TestKit.Xunit2;
using Microsoft.Extensions.DependencyInjection;
using NSubstitute;
using ScadaLink.Commons.Interfaces.Repositories;
using ScadaLink.Communication.Actors;
namespace ScadaLink.Communication.Tests;
/// <summary>
/// Regression tests for Communication-004 — coordinator actors must declare an
/// explicit <c>Resume</c> supervision strategy per the CLAUDE.md decision
/// ("Resume for coordinator actors"). A child fault under the default
/// (Restart) strategy would wipe a child's in-memory state; the long-lived
/// coordinators own per-site ClusterClients and must not silently discard
/// their children on a transient fault.
/// </summary>
public class CoordinatorSupervisionTests : TestKit
{
/// <summary>
/// Test-only subclass that exposes the protected <see cref="SupervisorStrategy"/>
/// so the configured directive can be asserted directly.
/// </summary>
private sealed class CentralCommunicationActorProbe : CentralCommunicationActor
{
public CentralCommunicationActorProbe(IServiceProvider sp, ISiteClientFactory factory)
: base(sp, factory) { }
public SupervisorStrategy GetSupervisorStrategy() => SupervisorStrategy();
}
/// <summary>
/// Test-only subclass that exposes the protected <see cref="SupervisorStrategy"/>.
/// </summary>
private sealed class SiteCommunicationActorProbe : SiteCommunicationActor
{
public SiteCommunicationActorProbe(string siteId, CommunicationOptions options, IActorRef dm)
: base(siteId, options, dm) { }
public SupervisorStrategy GetSupervisorStrategy() => SupervisorStrategy();
}
private static IServiceProvider EmptyServiceProvider()
{
var mockRepo = Substitute.For<ISiteRepository>();
mockRepo.GetAllSitesAsync(Arg.Any<CancellationToken>())
.Returns(new List<Commons.Entities.Sites.Site>());
var services = new ServiceCollection();
services.AddScoped(_ => mockRepo);
return services.BuildServiceProvider();
}
[Fact]
public void CentralCommunicationActor_SupervisorStrategy_IsResume()
{
var sp = EmptyServiceProvider();
var factory = Substitute.For<ISiteClientFactory>();
var actorRef = new Akka.TestKit.TestActorRef<CentralCommunicationActorProbe>(
Sys, Props.Create(() => new CentralCommunicationActorProbe(sp, factory)));
var strategy = actorRef.UnderlyingActor.GetSupervisorStrategy();
var oneForOne = Assert.IsType<OneForOneStrategy>(strategy);
var directive = oneForOne.Decider.Decide(new InvalidOperationException("transient child fault"));
Assert.Equal(Directive.Resume, directive);
}
[Fact]
public void SiteCommunicationActor_SupervisorStrategy_IsResume()
{
var dmProbe = CreateTestProbe();
var actorRef = new Akka.TestKit.TestActorRef<SiteCommunicationActorProbe>(
Sys, Props.Create(() => new SiteCommunicationActorProbe("site1", new CommunicationOptions(), dmProbe.Ref)));
var strategy = actorRef.UnderlyingActor.GetSupervisorStrategy();
var oneForOne = Assert.IsType<OneForOneStrategy>(strategy);
var directive = oneForOne.Decider.Decide(new InvalidOperationException("transient child fault"));
Assert.Equal(Directive.Resume, directive);
}
}
@@ -22,6 +22,9 @@ public class DebugStreamBridgeActorTests : TestKit
{ {
// Use a very short reconnect delay for testing // Use a very short reconnect delay for testing
DebugStreamBridgeActor.ReconnectDelay = TimeSpan.FromMilliseconds(100); DebugStreamBridgeActor.ReconnectDelay = TimeSpan.FromMilliseconds(100);
// Long stability window so streams are never considered "stable" mid-test
// unless a test deliberately waits it out.
DebugStreamBridgeActor.StabilityWindow = TimeSpan.FromSeconds(30);
} }
private record TestContext( private record TestContext(
@@ -264,7 +267,55 @@ public class DebugStreamBridgeActorTests : TestKit
} }
[Fact] [Fact]
public void Grpc_Error_Resets_RetryCount_On_Successful_Event() public void FlappingStream_DeliveringEventsBetweenFailures_StillTerminatesAfterMaxRetries()
{
// Communication-008 regression: a stream that connects, delivers an event,
// then fails — repeatedly — must still trip MaxRetries. The retry count is
// NO LONGER reset by a received event (only by the stability window). The
// previous behaviour reset _retryCount on every event, so a flapping site
// reconnected forever and the debug session lived on indefinitely.
var ctx = CreateBridgeActor();
ctx.CommProbe.ExpectMsg<SiteEnvelope>();
var snapshot = new DebugViewSnapshot(
InstanceName,
new List<AttributeValueChanged>(),
new List<AlarmStateChanged>(),
DateTimeOffset.UtcNow);
Watch(ctx.BridgeActor);
ctx.BridgeActor.Tell(snapshot);
AwaitCondition(() => ctx.MockGrpcClient.SubscribeCalls.Count == 1, TimeSpan.FromSeconds(3));
var attrChange = new AttributeValueChanged(InstanceName, "IO", "Temp", 42.5, "Good", DateTimeOffset.UtcNow);
// Flap: deliver one event then fail, three times. Each event would, under
// the old buggy logic, reset the retry budget and prevent termination.
for (var i = 0; i < 3; i++)
{
var call = ctx.MockGrpcClient.SubscribeCalls[i];
call.OnEvent(attrChange);
call.OnError(new Exception($"Flap {i + 1}"));
var expected = i + 2;
AwaitCondition(() => ctx.MockGrpcClient.SubscribeCalls.Count == expected, TimeSpan.FromSeconds(5));
}
// Fourth error (after the 3 retries) must exceed MaxRetries and terminate.
ctx.MockGrpcClient.SubscribeCalls[3].OnEvent(attrChange);
ctx.MockGrpcClient.SubscribeCalls[3].OnError(new Exception("Flap 4"));
ExpectTerminated(ctx.BridgeActor, TimeSpan.FromSeconds(5));
Assert.True(ctx.TerminatedFlag[0]);
}
[Fact]
public void RetryCount_RecoveredOnlyAfterStreamStaysStableForStabilityWindow()
{
// Communication-008: after a stream has been connected for the stability
// window, the retry budget is recovered — a later transient fault then gets
// a fresh set of retries rather than being counted against the old budget.
DebugStreamBridgeActor.StabilityWindow = TimeSpan.FromMilliseconds(300);
try
{ {
var ctx = CreateBridgeActor(); var ctx = CreateBridgeActor();
ctx.CommProbe.ExpectMsg<SiteEnvelope>(); ctx.CommProbe.ExpectMsg<SiteEnvelope>();
@@ -275,30 +326,29 @@ public class DebugStreamBridgeActorTests : TestKit
new List<AlarmStateChanged>(), new List<AlarmStateChanged>(),
DateTimeOffset.UtcNow); DateTimeOffset.UtcNow);
Watch(ctx.BridgeActor);
ctx.BridgeActor.Tell(snapshot); ctx.BridgeActor.Tell(snapshot);
AwaitCondition(() => ctx.MockGrpcClient.SubscribeCalls.Count == 1, TimeSpan.FromSeconds(3)); AwaitCondition(() => ctx.MockGrpcClient.SubscribeCalls.Count == 1, TimeSpan.FromSeconds(3));
// First error → retry 1 // Two failures — but each new stream stays up long enough (the mock
ctx.MockGrpcClient.SubscribeCalls[0].OnError(new Exception("Error 1")); // stream only completes on cancel) for the stability window to elapse
AwaitCondition(() => ctx.MockGrpcClient.SubscribeCalls.Count == 2, TimeSpan.FromSeconds(5)); // and reset the retry budget before the next failure.
for (var i = 0; i < 5; i++)
{
Thread.Sleep(450); // exceed the 300ms stability window
ctx.MockGrpcClient.SubscribeCalls[i].OnError(new Exception($"Error {i + 1}"));
var expected = i + 2;
AwaitCondition(() => ctx.MockGrpcClient.SubscribeCalls.Count == expected, TimeSpan.FromSeconds(5));
}
// Simulate successful event (resets retry count) // Five well-spaced failures did NOT terminate the actor because each
var attrChange = new AttributeValueChanged(InstanceName, "IO", "Temp", 42.5, "Good", DateTimeOffset.UtcNow); // reconnect recovered its retry budget after the stability window.
ctx.MockGrpcClient.SubscribeCalls[1].OnEvent(attrChange); Assert.False(ctx.TerminatedFlag[0]);
AwaitCondition(() => { lock (ctx.ReceivedEvents) { return ctx.ReceivedEvents.Count == 2; } }, }
TimeSpan.FromSeconds(3)); finally
{
// Now another 3 errors should be tolerated (retry count was reset) DebugStreamBridgeActor.StabilityWindow = TimeSpan.FromSeconds(30);
ctx.MockGrpcClient.SubscribeCalls[1].OnError(new Exception("Error 2")); }
AwaitCondition(() => ctx.MockGrpcClient.SubscribeCalls.Count == 3, TimeSpan.FromSeconds(5));
ctx.MockGrpcClient.SubscribeCalls[2].OnError(new Exception("Error 3"));
AwaitCondition(() => ctx.MockGrpcClient.SubscribeCalls.Count == 4, TimeSpan.FromSeconds(5));
ctx.MockGrpcClient.SubscribeCalls[3].OnError(new Exception("Error 4"));
AwaitCondition(() => ctx.MockGrpcClient.SubscribeCalls.Count == 5, TimeSpan.FromSeconds(5));
// Still alive — 3 retries from the second failure point succeeded
} }
} }
@@ -0,0 +1,67 @@
using Microsoft.Extensions.Logging.Abstractions;
using Microsoft.Extensions.Options;
using NSubstitute;
using ScadaLink.Communication;
using ScadaLink.Communication.Grpc;
namespace ScadaLink.Communication.Tests.Grpc;
/// <summary>
/// Regression tests for Communication-005 — the gRPC keepalive and
/// max-stream-lifetime / max-concurrent-stream options defined on
/// <see cref="CommunicationOptions"/> must actually be applied to the
/// gRPC client and server rather than hard-coded.
/// </summary>
public class GrpcOptionsWiringTests
{
[Fact]
public void SiteStreamGrpcClient_AppliesKeepAliveFromOptions()
{
var options = new CommunicationOptions
{
GrpcKeepAlivePingDelay = TimeSpan.FromSeconds(42),
GrpcKeepAlivePingTimeout = TimeSpan.FromSeconds(7)
};
var client = new SiteStreamGrpcClient(
"http://localhost:9999", NullLogger<SiteStreamGrpcClient>.Instance, options);
Assert.Equal(TimeSpan.FromSeconds(42), client.KeepAlivePingDelay);
Assert.Equal(TimeSpan.FromSeconds(7), client.KeepAlivePingTimeout);
}
[Fact]
public void SiteStreamGrpcClientFactory_FlowsOptionsToCreatedClients()
{
var options = new CommunicationOptions
{
GrpcKeepAlivePingDelay = TimeSpan.FromSeconds(33),
GrpcKeepAlivePingTimeout = TimeSpan.FromSeconds(11)
};
using var factory = new SiteStreamGrpcClientFactory(
NullLoggerFactory.Instance, Options.Create(options));
var client = factory.GetOrCreate("site1", "http://localhost:9999");
Assert.Equal(TimeSpan.FromSeconds(33), client.KeepAlivePingDelay);
Assert.Equal(TimeSpan.FromSeconds(11), client.KeepAlivePingTimeout);
}
[Fact]
public void SiteStreamGrpcServer_BindsMaxConcurrentStreamsAndLifetimeFromOptions()
{
var options = new CommunicationOptions
{
GrpcMaxConcurrentStreams = 250,
GrpcMaxStreamLifetime = TimeSpan.FromHours(2)
};
var subscriber = Substitute.For<ISiteStreamSubscriber>();
var server = new SiteStreamGrpcServer(
subscriber, NullLogger<SiteStreamGrpcServer>.Instance, Options.Create(options));
Assert.Equal(250, server.MaxConcurrentStreams);
Assert.Equal(TimeSpan.FromHours(2), server.MaxStreamLifetime);
}
}
@@ -0,0 +1,129 @@
using System.Collections.Concurrent;
using Microsoft.Extensions.Logging.Abstractions;
using ScadaLink.Communication.Grpc;
namespace ScadaLink.Communication.Tests.Grpc;
/// <summary>
/// Regression tests for Communication-007 — the factory's synchronous
/// <see cref="SiteStreamGrpcClientFactory.Dispose"/> must not block on the
/// async disposal path (sync-over-async). It must dispose each client through
/// the client's synchronous <see cref="SiteStreamGrpcClient.Dispose"/>.
/// </summary>
public class SiteStreamGrpcClientFactoryDisposeTests
{
/// <summary>
/// Test client that records whether it was disposed via the sync or async path.
/// </summary>
private sealed class TrackingClient : SiteStreamGrpcClient
{
public bool SyncDisposeCalled { get; private set; }
public bool AsyncDisposeCalled { get; private set; }
public override void Dispose() => SyncDisposeCalled = true;
public override ValueTask DisposeAsync()
{
AsyncDisposeCalled = true;
return ValueTask.CompletedTask;
}
}
/// <summary>
/// Test factory that hands out <see cref="TrackingClient"/> instances while
/// still exercising the base factory's real caching and disposal machinery.
/// </summary>
private sealed class TrackingFactory : SiteStreamGrpcClientFactory
{
private readonly ConcurrentBag<TrackingClient> _created = new();
public TrackingFactory() : base(NullLoggerFactory.Instance) { }
public IReadOnlyCollection<TrackingClient> Created => _created.ToList();
protected override SiteStreamGrpcClient CreateClient(string grpcEndpoint)
{
var client = new TrackingClient();
_created.Add(client);
return client;
}
}
[Fact]
public void Dispose_DisposesClientsSynchronously_NotViaAsyncPath()
{
var factory = new TrackingFactory();
factory.GetOrCreate("site-a", "http://localhost:5100");
factory.GetOrCreate("site-b", "http://localhost:5200");
factory.Dispose();
Assert.NotEmpty(factory.Created);
Assert.All(factory.Created, c =>
{
Assert.True(c.SyncDisposeCalled, "client should be disposed via synchronous Dispose()");
Assert.False(c.AsyncDisposeCalled, "synchronous Dispose() must not route through DisposeAsync()");
});
}
[Fact]
public void Dispose_DoesNotDeadlock_UnderSingleThreadedSynchronizationContext()
{
// A strict single-threaded SynchronizationContext: continuations posted to
// it are only pumped by the worker loop. Sync-over-async (blocking the only
// thread on an async continuation that needs that same thread) deadlocks here.
using var ctx = new SingleThreadSyncContext();
Exception? captured = null;
var done = new ManualResetEventSlim();
ctx.Post(_ =>
{
try
{
var factory = new SiteStreamGrpcClientFactory(NullLoggerFactory.Instance);
factory.GetOrCreate("site-a", "http://localhost:5100");
factory.Dispose();
}
catch (Exception ex)
{
captured = ex;
}
finally
{
done.Set();
}
}, null);
Assert.True(done.Wait(TimeSpan.FromSeconds(5)),
"factory.Dispose() did not complete — likely a sync-over-async deadlock");
Assert.Null(captured);
}
/// <summary>Minimal single-threaded synchronization context for the deadlock test.</summary>
private sealed class SingleThreadSyncContext : SynchronizationContext, IDisposable
{
private readonly BlockingCollection<(SendOrPostCallback cb, object? state)> _queue = new();
private readonly Thread _thread;
public SingleThreadSyncContext()
{
_thread = new Thread(Run) { IsBackground = true };
_thread.Start();
}
private void Run()
{
SetSynchronizationContext(this);
foreach (var (cb, state) in _queue.GetConsumingEnumerable())
cb(state);
}
public override void Post(SendOrPostCallback d, object? state) => _queue.Add((d, state));
public void Dispose()
{
_queue.CompleteAdding();
_thread.Join(TimeSpan.FromSeconds(2));
}
}
}