5 Commits

46 changed files with 2957 additions and 409 deletions
+7 -28
View File
@@ -41,9 +41,9 @@ module file and counted in **Total**.
|----------|---------------| |----------|---------------|
| Critical | 0 | | Critical | 0 |
| High | 0 | | High | 0 |
| Medium | 25 | | Medium | 4 |
| Low | 90 | | Low | 90 |
| **Total** | **115** | | **Total** | **94** |
## Module Status ## Module Status
@@ -64,10 +64,10 @@ module file and counted in **Total**.
| [ManagementService](ManagementService/findings.md) | 2026-05-16 | `9c60592` | 0/0/0/5 | 5 | 13 | | [ManagementService](ManagementService/findings.md) | 2026-05-16 | `9c60592` | 0/0/0/5 | 5 | 13 |
| [NotificationService](NotificationService/findings.md) | 2026-05-16 | `9c60592` | 0/0/0/3 | 3 | 13 | | [NotificationService](NotificationService/findings.md) | 2026-05-16 | `9c60592` | 0/0/0/3 | 3 | 13 |
| [Security](Security/findings.md) | 2026-05-16 | `9c60592` | 0/0/0/4 | 4 | 11 | | [Security](Security/findings.md) | 2026-05-16 | `9c60592` | 0/0/0/4 | 4 | 11 |
| [SiteEventLogging](SiteEventLogging/findings.md) | 2026-05-16 | `9c60592` | 0/0/4/3 | 7 | 11 | | [SiteEventLogging](SiteEventLogging/findings.md) | 2026-05-16 | `9c60592` | 0/0/0/3 | 3 | 11 |
| [SiteRuntime](SiteRuntime/findings.md) | 2026-05-16 | `9c60592` | 0/0/8/5 | 13 | 16 | | [SiteRuntime](SiteRuntime/findings.md) | 2026-05-16 | `9c60592` | 0/0/0/5 | 5 | 16 |
| [StoreAndForward](StoreAndForward/findings.md) | 2026-05-16 | `9c60592` | 0/0/4/7 | 11 | 14 | | [StoreAndForward](StoreAndForward/findings.md) | 2026-05-16 | `9c60592` | 0/0/0/7 | 7 | 14 |
| [TemplateEngine](TemplateEngine/findings.md) | 2026-05-16 | `9c60592` | 0/0/5/4 | 9 | 14 | | [TemplateEngine](TemplateEngine/findings.md) | 2026-05-16 | `9c60592` | 0/0/0/4 | 4 | 14 |
## Pending Findings ## Pending Findings
@@ -84,7 +84,7 @@ _None open._
_None open._ _None open._
### Medium (25) ### Medium (4)
| ID | Module | Title | | ID | Module | Title |
|----|--------|-------| |----|--------|-------|
@@ -92,27 +92,6 @@ _None open._
| CentralUI-006 | [CentralUI](CentralUI/findings.md) | Deployment status page polls every 10s despite the documented SignalR-push design | | CentralUI-006 | [CentralUI](CentralUI/findings.md) | Deployment status page polls every 10s despite the documented SignalR-push design |
| Host-002 | [Host](Host/findings.md) | Akka.Persistence required by REQ-HOST-6 is not configured and not used | | Host-002 | [Host](Host/findings.md) | Akka.Persistence required by REQ-HOST-6 is not configured and not used |
| InboundAPI-007 | [InboundAPI](InboundAPI/findings.md) | `Database.Connection()` script API from the design doc is not implemented | | InboundAPI-007 | [InboundAPI](InboundAPI/findings.md) | `Database.Connection()` script API from the design doc is not implemented |
| SiteEventLogging-005 | [SiteEventLogging](SiteEventLogging/findings.md) | `LogEventAsync` performs synchronous disk I/O on the caller's thread |
| SiteEventLogging-007 | [SiteEventLogging](SiteEventLogging/findings.md) | `ISiteEventLogger` consumers downcast to the concrete type and reach into the DB connection |
| SiteEventLogging-008 | [SiteEventLogging](SiteEventLogging/findings.md) | Event-recording write failures are silently swallowed |
| SiteEventLogging-010 | [SiteEventLogging](SiteEventLogging/findings.md) | Test coverage gaps: actor bridge, purge/write concurrency, vacuum effectiveness, query error path |
| SiteRuntime-004 | [SiteRuntime](SiteRuntime/findings.md) | `_totalDeployedCount` is incremented on redeployment of an existing instance |
| SiteRuntime-005 | [SiteRuntime](SiteRuntime/findings.md) | Deployment reports `Success` to central before persistence completes |
| SiteRuntime-006 | [SiteRuntime](SiteRuntime/findings.md) | Site-local repositories read `SiteStorageService` private field via reflection |
| SiteRuntime-007 | [SiteRuntime](SiteRuntime/findings.md) | Synthetic entity IDs use the non-deterministic `string.GetHashCode()` |
| SiteRuntime-008 | [SiteRuntime](SiteRuntime/findings.md) | Blocking `.GetAwaiter().GetResult()` on the actor thread during startup |
| SiteRuntime-009 | [SiteRuntime](SiteRuntime/findings.md) | Script execution actors run scripts on the default thread pool, not a dedicated dispatcher |
| SiteRuntime-010 | [SiteRuntime](SiteRuntime/findings.md) | `EnsureDclConnections` never updates a connection whose configuration changed |
| SiteRuntime-011 | [SiteRuntime](SiteRuntime/findings.md) | Trust-model validation is a substring scan and is both over- and under-inclusive |
| StoreAndForward-004 | [StoreAndForward](StoreAndForward/findings.md) | `RegisterDeliveryHandler` XML doc contradicts the implemented contract |
| StoreAndForward-005 | [StoreAndForward](StoreAndForward/findings.md) | Parked-message retry/discard can race with the in-progress retry sweep |
| StoreAndForward-010 | [StoreAndForward](StoreAndForward/findings.md) | Retry of a parked message does not reset `LastAttemptAt`, so its retry timing is unspecified |
| StoreAndForward-013 | [StoreAndForward](StoreAndForward/findings.md) | Critical paths lack test coverage: retry-due timing, replication-from-active, and the actor bridge |
| TemplateEngine-006 | [TemplateEngine](TemplateEngine/findings.md) | Forbidden-API enforcement is a naive substring scan (bypassable and false-positive prone) |
| TemplateEngine-007 | [TemplateEngine](TemplateEngine/findings.md) | Brace-balance "compilation" misjudges verbatim / interpolated / raw strings |
| TemplateEngine-008 | [TemplateEngine](TemplateEngine/findings.md) | `SetAlarmOverrideAsync` accepts overrides for unknown / composed alarms with no validation |
| TemplateEngine-009 | [TemplateEngine](TemplateEngine/findings.md) | N+1 query in `TemplateDeletionService.CanDeleteTemplateAsync` |
| TemplateEngine-010 | [TemplateEngine](TemplateEngine/findings.md) | `InstanceService` documents optimistic concurrency that is not implemented |
### Low (90) ### Low (90)
+66 -24
View File
@@ -8,7 +8,7 @@
| Last reviewed | 2026-05-16 | | Last reviewed | 2026-05-16 |
| Reviewer | claude-agent | | Reviewer | claude-agent |
| Commit reviewed | `9c60592` | | Commit reviewed | `9c60592` |
| Open findings | 7 | | Open findings | 3 |
## Summary ## Summary
@@ -241,7 +241,7 @@ on the active node. No code change made; see the re-triage note above.
|--|--| |--|--|
| Severity | Medium | | Severity | Medium |
| Category | Performance & resource management | | Category | Performance & resource management |
| Status | Open | | Status | Resolved |
| Location | `src/ScadaLink.SiteEventLogging/SiteEventLogger.cs:57-99` | | Location | `src/ScadaLink.SiteEventLogging/SiteEventLogger.cs:57-99` |
**Description** **Description**
@@ -267,7 +267,16 @@ background flush is preferable.
**Resolution** **Resolution**
_Unresolved._ Resolved 2026-05-16 (commit `pending`): event recording is now offloaded to a
dedicated background writer. `SiteEventLogger` owns an unbounded `Channel<T>` and a
single background consumer thread; `LogEventAsync` only validates its arguments and
enqueues, so caller threads (Akka actor threads on hot paths) never block on the
SQLite write or on contention for the write lock. The returned `Task` completes once
the event is durably persisted (so `await` callers still observe write ordering) and
faults if the write fails. `Dispose` completes the channel and drains the writer.
Regression test `LogEventAsync_DoesNotBlockCaller_WhenWriteIsSlow` (verifies the
caller returns in <500 ms while the database is held busy) plus
`LogEventAsync_TaskCompletes_AfterEventIsPersisted`.
### SiteEventLogging-006 — Missing indexes for severity and keyword-search query paths ### SiteEventLogging-006 — Missing indexes for severity and keyword-search query paths
@@ -299,36 +308,51 @@ cost.
_Unresolved._ _Unresolved._
### SiteEventLogging-007 — `ISiteEventLogger` consumers downcast to the concrete type and reach into the DB connection ### SiteEventLogging-007 — `ISiteEventLogger` consumers downcast to the concrete type
| | | | | |
|--|--| |--|--|
| Severity | Medium | | Severity | Medium (partially re-triaged 2026-05-16 — see Re-triage note) |
| Category | Code organization & conventions | | Category | Code organization & conventions |
| Status | Open | | Status | Resolved |
| Location | `src/ScadaLink.SiteEventLogging/EventLogPurgeService.cs:25`, `src/ScadaLink.SiteEventLogging/EventLogQueryService.cs:26`, `src/ScadaLink.SiteEventLogging/SiteEventLogger.cs:34` | | Location | `src/ScadaLink.SiteEventLogging/EventLogPurgeService.cs:21-30`, `src/ScadaLink.SiteEventLogging/EventLogQueryService.cs:20-28`, `src/ScadaLink.SiteEventLogging/ServiceCollectionExtensions.cs:10-23` |
**Description** **Description**
Both `EventLogPurgeService` and `EventLogQueryService` take `ISiteEventLogger` via Both `EventLogPurgeService` and `EventLogQueryService` took `ISiteEventLogger` via
DI and immediately downcast it: `_eventLogger = (SiteEventLogger)eventLogger;`. They DI and immediately downcast it: `_eventLogger = (SiteEventLogger)eventLogger;`. This
then access the `internal SqliteConnection Connection` property to run arbitrary SQL. made the registration fragile — any `ISiteEventLogger` that is not exactly
This defeats the purpose of the interface abstraction, makes the registration `SiteEventLogger` (a test double, a decorator) caused an `InvalidCastException` at
fragile (any `ISiteEventLogger` that is not exactly `SiteEventLogger` causes an construction — and defeated the purpose of the interface abstraction.
`InvalidCastException` at construction), and leaks the database handle and raw SQL
surface out of the recorder. It is also the root cause of the unsynchronised **Re-triage note (2026-05-16)**
connection sharing in SiteEventLogging-003.
The finding as originally written also claimed the services "access the `internal
SqliteConnection Connection` property to run arbitrary SQL" and called itself "the
root cause of the unsynchronised connection sharing in SiteEventLogging-003". That
part is stale: the resolution of SiteEventLogging-003 had already removed the
`internal Connection` property and replaced it with lock-guarded `WithConnection`
overloads. At the time this finding was actioned, the only remaining defect was the
concrete-type downcast itself. Severity stays Medium; the description is corrected to
the downcast-only scope.
**Recommendation** **Recommendation**
Introduce a proper data-access abstraction (e.g. an `IEventLogStore` with Have the purge and query services depend on the concrete `SiteEventLogger` directly
`Insert`, `Query`, `PurgeOlderThan`, `PurgeToSize`, `GetSizeBytes`) that owns the (it is the type that owns the lock-guarded `WithConnection`), and register the
connection and its locking, and inject that into the recorder, query, and purge concrete type in DI with the interface forwarded to the same singleton. Remove the
services. Remove the `internal Connection` property and the concrete-type downcasts. fragile downcasts.
**Resolution** **Resolution**
_Unresolved._ Resolved 2026-05-16 (commit `pending`): `AddSiteEventLogging` now registers
`SiteEventLogger` as a concrete singleton and forwards `ISiteEventLogger` to that
same instance. `EventLogPurgeService` and `EventLogQueryService` take a
`SiteEventLogger` constructor parameter directly, eliminating the
`(SiteEventLogger)eventLogger` downcast and its `InvalidCastException` risk. All
three services still share one connection/lock. Regression tests
`AddSiteEventLogging_ResolvesAllServices_SharingOneRecorderInstance` and
`PurgeAndQueryServices_AcceptConcreteRecorder_WithoutDowncast`.
### SiteEventLogging-008 — Event-recording write failures are silently swallowed ### SiteEventLogging-008 — Event-recording write failures are silently swallowed
@@ -336,7 +360,7 @@ _Unresolved._
|--|--| |--|--|
| Severity | Medium | | Severity | Medium |
| Category | Error handling & resilience | | Category | Error handling & resilience |
| Status | Open | | Status | Resolved |
| Location | `src/ScadaLink.SiteEventLogging/SiteEventLogger.cs:92-95` | | Location | `src/ScadaLink.SiteEventLogging/SiteEventLogger.cs:92-95` |
**Description** **Description**
@@ -359,7 +383,13 @@ a Warning/Error health metric rather than only a local log line.
**Resolution** **Resolution**
_Unresolved._ Resolved 2026-05-16 (commit `pending`): write failures are no longer swallowed. The
background writer (introduced for SiteEventLogging-005) now, on an `INSERT` failure,
(a) increments a new `Interlocked`-guarded counter exposed as the public
`SiteEventLogger.FailedWriteCount` property — which Health Monitoring can poll to
detect a logging outage — and (b) faults the `Task` returned by `LogEventAsync` with
the exception instead of returning `Task.CompletedTask`. The error is still logged.
Regression test `LogEventAsync_FaultsTask_AndCountsFailure_OnWriteError`.
### SiteEventLogging-009 — XML doc on `LogEventAsync` claims asynchronous behaviour ### SiteEventLogging-009 — XML doc on `LogEventAsync` claims asynchronous behaviour
@@ -395,7 +425,7 @@ _Unresolved._
|--|--| |--|--|
| Severity | Medium | | Severity | Medium |
| Category | Testing coverage | | Category | Testing coverage |
| Status | Open | | Status | Resolved |
| Location | `tests/ScadaLink.SiteEventLogging.Tests/` | | Location | `tests/ScadaLink.SiteEventLogging.Tests/` |
**Description** **Description**
@@ -423,7 +453,19 @@ oldest-first deletion, and a query-error-path test (e.g. corrupt/closed connecti
**Resolution** **Resolution**
_Unresolved._ Resolved 2026-05-16 (commit `pending`): the remaining coverage gaps are now filled.
`EventLogHandlerActorTests` (using `Akka.TestKit.Xunit2`) exercises the actor message
contract — `EventLogQueryRequest` -> `EventLogQueryResponse` via `Sender.Tell`, for
both success and error responses. `EventLogCoverageTests` covers the previously
untested `EventLogQueryService.ExecuteQuery` catch block
(`ExecuteQuery_ReturnsFailureResponse_WhenDatabaseUnavailable`) and the recorder's
disposed-state semantics (`LogEventAsync_AfterDispose_CompletesWithoutThrowing`,
`Dispose_IsIdempotent`). The purge/write concurrency, realistic-cap, and
oldest-first behaviours were already covered by the tests added when
SiteEventLogging-001/-002/-003 were resolved
(`PurgeByStorageCap_ConcurrentWritesDoNotCorruptConnection`,
`PurgeByStorageCap_StopsWhenUnderCap_DoesNotEmptyTable`,
`PurgeByStorageCap_RemovesOldestEventsFirst`).
### SiteEventLogging-011 — Stale "Phase 4+" placeholder in `ServiceCollectionExtensions` ### SiteEventLogging-011 — Stale "Phase 4+" placeholder in `ServiceCollectionExtensions`
+114 -31
View File
@@ -8,7 +8,7 @@
| Last reviewed | 2026-05-16 | | Last reviewed | 2026-05-16 |
| Reviewer | claude-agent | | Reviewer | claude-agent |
| Commit reviewed | `9c60592` | | Commit reviewed | `9c60592` |
| Open findings | 13 | | Open findings | 5 |
## Summary ## Summary
@@ -176,10 +176,10 @@ longer drifts (this additionally addresses the root cause behind SiteRuntime-004
| | | | | |
|--|--| |--|--|
| Severity | Medium | | Severity | Medium — re-triaged: already fixed by the SiteRuntime-003 resolution. |
| Category | Correctness & logic bugs | | Category | Correctness & logic bugs |
| Status | Open | | Status | Resolved |
| Location | `src/ScadaLink.SiteRuntime/Actors/DeploymentManagerActor.cs:239` | | Location | `src/ScadaLink.SiteRuntime/Actors/DeploymentManagerActor.cs` (`ApplyDeployment`) |
**Description** **Description**
@@ -193,16 +193,24 @@ grow, but the in-memory `_totalDeployedCount` (reported to the health collector
`UpdateInstanceCounts`) drifts upward and the reported "disabled" count becomes `UpdateInstanceCounts`) drifts upward and the reported "disabled" count becomes
wrong. wrong.
**Recommendation** **Re-triage (2026-05-16)**
Only increment `_totalDeployedCount` when the instance is genuinely new. Either Verified against the current source: this is **already fixed**. The SiteRuntime-003
track whether this deploy replaced an existing config, or derive the deployed count resolution replaced the fixed-delay reschedule with a shared `ApplyDeployment` helper
from storage / the union of running actors and disabled configs rather than that takes an `isRedeploy` flag and guards the counter with `if (!isRedeploy)
maintaining a hand-incremented counter. _totalDeployedCount++;`. The redeploy path (`HandleTerminated`) always calls
`ApplyDeployment(..., isRedeploy: true)`, so the counter is no longer bumped on
redeployment. The regression test
`DeploymentManagerRedeployTests.Redeploy_ExistingInstance_DoesNotOverCountDeployedInstances`
already covers this and passes. No further code change was required.
**Resolution** **Resolution**
_Unresolved._ Resolved 2026-05-16 (`commit pending`): no new change needed — the root cause was
eliminated by the SiteRuntime-003 fix (the `isRedeploy` guard in `ApplyDeployment`).
Confirmed by the existing passing regression test
`Redeploy_ExistingInstance_DoesNotOverCountDeployedInstances`. Re-triaged from Open to
Resolved.
### SiteRuntime-005 — Deployment reports `Success` to central before persistence completes ### SiteRuntime-005 — Deployment reports `Success` to central before persistence completes
@@ -210,8 +218,8 @@ _Unresolved._
|--|--| |--|--|
| Severity | Medium | | Severity | Medium |
| Category | Error handling & resilience | | Category | Error handling & resilience |
| Status | Open | | Status | Resolved |
| Location | `src/ScadaLink.SiteRuntime/Actors/DeploymentManagerActor.cs:272` | | Location | `src/ScadaLink.SiteRuntime/Actors/DeploymentManagerActor.cs` (`ApplyDeployment`, `HandleDeployPersistenceResult`) |
**Description** **Description**
@@ -232,7 +240,16 @@ At minimum, do not report `Success` until the config row is committed.
**Resolution** **Resolution**
_Unresolved._ Resolved 2026-05-16 (`commit pending`): root cause confirmed — `ApplyDeployment` sent
`DeploymentStatusResponse(Success)` synchronously before the persistence `Task.Run`
completed. The `Success` reply is now sent from `HandleDeployPersistenceResult` only
once the persistence result is known: on success it replies `Success`; on a
persistence failure it logs the error, stops the optimistically-created Instance
Actor, rolls back the deployed-instance counter, and replies
`DeploymentStatus.Failed` with the error message. `DeployPersistenceResult` carries an
`IsRedeploy` flag so the counter rollback is skipped for redeployments. Regression
tests: `DeploymentManagerMediumFindingsTests.Deploy_PersistenceFailure_ReportsFailedNotSuccess`
and `Deploy_Success_ReportsSuccessAndPersistsConfig`.
### SiteRuntime-006 — Site-local repositories read `SiteStorageService` private field via reflection ### SiteRuntime-006 — Site-local repositories read `SiteStorageService` private field via reflection
@@ -240,8 +257,8 @@ _Unresolved._
|--|--| |--|--|
| Severity | Medium | | Severity | Medium |
| Category | Code organization & conventions | | Category | Code organization & conventions |
| Status | Open | | Status | Resolved |
| Location | `src/ScadaLink.SiteRuntime/Repositories/SiteExternalSystemRepository.cs:183`, `src/ScadaLink.SiteRuntime/Repositories/SiteNotificationRepository.cs:181` | | Location | `src/ScadaLink.SiteRuntime/Repositories/SiteExternalSystemRepository.cs`, `src/ScadaLink.SiteRuntime/Repositories/SiteNotificationRepository.cs` |
**Description** **Description**
@@ -263,7 +280,16 @@ repositories. Remove the reflection entirely.
**Resolution** **Resolution**
_Unresolved._ Resolved 2026-05-16 (`commit pending`): root cause confirmed — both repositories
reflected into `SiteStorageService._connectionString`. `SiteStorageService` now
exposes a public `CreateConnection()` factory method that returns an unopened
`SqliteConnection` against the site database. Both `SiteExternalSystemRepository` and
`SiteNotificationRepository` now obtain connections via `_storage.CreateConnection()`;
all reflection (`Type.GetField` / `BindingFlags`) and the contradictory XML comments
have been removed. This is a fully in-module refactor — no cross-module design
decision was needed. Regression test:
`SiteRepositoryTests.ExternalSystemRepository_RoundTripsStoredDefinition` exercises
the repository's connection path end-to-end.
### SiteRuntime-007 — Synthetic entity IDs use the non-deterministic `string.GetHashCode()` ### SiteRuntime-007 — Synthetic entity IDs use the non-deterministic `string.GetHashCode()`
@@ -271,8 +297,8 @@ _Unresolved._
|--|--| |--|--|
| Severity | Medium | | Severity | Medium |
| Category | Correctness & logic bugs | | Category | Correctness & logic bugs |
| Status | Open | | Status | Resolved |
| Location | `src/ScadaLink.SiteRuntime/Repositories/SiteExternalSystemRepository.cs:241`, `src/ScadaLink.SiteRuntime/Repositories/SiteNotificationRepository.cs:254` | | Location | `src/ScadaLink.SiteRuntime/Repositories/SiteExternalSystemRepository.cs`, `src/ScadaLink.SiteRuntime/Repositories/SiteNotificationRepository.cs`, `src/ScadaLink.SiteRuntime/Repositories/SyntheticId.cs` |
**Description** **Description**
@@ -294,7 +320,18 @@ rather than synthesising integer IDs.
**Resolution** **Resolution**
_Unresolved._ Resolved 2026-05-16 (`commit pending`): root cause confirmed — both repositories used
`name.GetHashCode()`, which is per-process randomized on .NET Core. A new internal
`SyntheticId` helper computes a deterministic, process-stable 31-bit ID using the
FNV-1a hash over the name's UTF-8 bytes. Both `GenerateSyntheticId` methods now
delegate to `SyntheticId.From(name)`. (The integer-keyed lookups are kept because
they are mandated by the shared `IExternalSystemRepository`/`INotificationRepository`
contracts in Commons — changing those contracts to name-keyed would be a cross-module
change outside this module's scope; the deterministic hash resolves the correctness
defect within scope.) Regression tests:
`SiteRepositoryTests.ExternalSystemRepository_SyntheticId_IsStableAcrossRestart` and
`NotificationRepository_SyntheticId_IsStableAcrossRestart` re-create the service to
simulate a process restart and confirm by-ID lookups still resolve.
### SiteRuntime-008 — Blocking `.GetAwaiter().GetResult()` on the actor thread during startup ### SiteRuntime-008 — Blocking `.GetAwaiter().GetResult()` on the actor thread during startup
@@ -302,8 +339,8 @@ _Unresolved._
|--|--| |--|--|
| Severity | Medium | | Severity | Medium |
| Category | Akka.NET conventions | | Category | Akka.NET conventions |
| Status | Open | | Status | Resolved |
| Location | `src/ScadaLink.SiteRuntime/Actors/DeploymentManagerActor.cs:479` | | Location | `src/ScadaLink.SiteRuntime/Actors/DeploymentManagerActor.cs` (`HandleStartupConfigsLoaded`, `LoadSharedScriptsFromStorage`, `HandleSharedScriptsLoaded`) |
**Description** **Description**
@@ -327,7 +364,18 @@ back.
**Resolution** **Resolution**
_Unresolved._ Resolved 2026-05-16 (`commit pending`): root cause confirmed — the blocking
`.GetAwaiter().GetResult()` and Roslyn compilation ran on the singleton's mailbox
thread inside `HandleStartupConfigsLoaded`. `LoadSharedScriptsFromStorage` now runs
the SQLite read **and** the Roslyn compilation on a background `Task.Run` and pipes a
new internal `SharedScriptsLoaded` message back to the actor. A new
`HandleSharedScriptsLoaded` handler then begins staggered Instance Actor creation, so
the compilation→creation ordering is preserved without ever blocking the mailbox. A
shared-script load failure is logged and startup proceeds (scripts needing a missing
shared script fail at execution time). Regression test:
`DeploymentManagerMediumFindingsTests.Startup_WithSharedScripts_LoadsConfigsAndStaysResponsive`
(confirms startup completes and the actor stays responsive with shared scripts
present).
### SiteRuntime-009 — Script execution actors run scripts on the default thread pool, not a dedicated dispatcher ### SiteRuntime-009 — Script execution actors run scripts on the default thread pool, not a dedicated dispatcher
@@ -335,8 +383,8 @@ _Unresolved._
|--|--| |--|--|
| Severity | Medium | | Severity | Medium |
| Category | Akka.NET conventions | | Category | Akka.NET conventions |
| Status | Open | | Status | Resolved |
| Location | `src/ScadaLink.SiteRuntime/Actors/ScriptExecutionActor.cs:72`, `src/ScadaLink.SiteRuntime/Actors/ScriptActor.cs:289`, `src/ScadaLink.SiteRuntime/Actors/AlarmExecutionActor.cs:57` | | Location | `src/ScadaLink.SiteRuntime/Actors/ScriptExecutionActor.cs`, `src/ScadaLink.SiteRuntime/Actors/AlarmExecutionActor.cs`, `src/ScadaLink.SiteRuntime/Scripts/ScriptExecutionScheduler.cs` |
**Description** **Description**
@@ -359,7 +407,19 @@ way, remove the "in production, configure…" comments by actually configuring i
**Resolution** **Resolution**
_Unresolved._ Resolved 2026-05-16 (`commit pending`): root cause confirmed — script and alarm
on-trigger bodies ran inside a bare `Task.Run` on the shared `ThreadPool`. The
recommendation's `TaskScheduler` option was taken because it is fully in-module (a
HOCON dispatcher would require editing the Host's ActorSystem config, out of scope).
A new `ScriptExecutionScheduler` provides a bounded set of dedicated background
threads (count from the new `SiteRuntimeOptions.ScriptExecutionThreadCount`, default
8). `ScriptExecutionActor` and `AlarmExecutionActor` now run their bodies via
`Task.Factory.StartNew(..., ScriptExecutionScheduler.Shared(options)).Unwrap()`
instead of `Task.Run`, so blocking script I/O is contained to those dedicated threads
and cannot starve the global pool. The misleading "in production, configure a
dedicated dispatcher" comments were removed. Regression tests:
`ScriptExecutionSchedulerTests` (`Scheduler_RunsWork_OffTheThreadPool`,
`Scheduler_RespectsConfiguredThreadCount`, `Scheduler_Shared_ReturnsSameInstanceForOptions`).
### SiteRuntime-010 — `EnsureDclConnections` never updates a connection whose configuration changed ### SiteRuntime-010 — `EnsureDclConnections` never updates a connection whose configuration changed
@@ -367,8 +427,8 @@ _Unresolved._
|--|--| |--|--|
| Severity | Medium | | Severity | Medium |
| Category | Correctness & logic bugs | | Category | Correctness & logic bugs |
| Status | Open | | Status | Resolved |
| Location | `src/ScadaLink.SiteRuntime/Actors/DeploymentManagerActor.cs:413` | | Location | `src/ScadaLink.SiteRuntime/Actors/DeploymentManagerActor.cs` (`EnsureDclConnections`, `ComputeConnectionConfigHash`) |
**Description** **Description**
@@ -390,7 +450,15 @@ the name.
**Resolution** **Resolution**
_Unresolved._ Resolved 2026-05-16 (`commit pending`): root cause confirmed — the cache was a
name-only `HashSet`, so a changed connection config was silently dropped.
`_createdConnections` is now a `Dictionary<string,string>` mapping connection name to
a SHA-256 hash of its protocol/primary-config/backup-config/failover-retry-count
(`ComputeConnectionConfigHash`). A connection whose hash is unchanged is still
skipped; a connection whose config changed re-issues a `CreateConnectionCommand` so
the DCL adopts the new configuration. Regression tests:
`DeploymentManagerMediumFindingsTests.EnsureDclConnections_ConnectionConfigChanged_ReissuesCreateCommand`
and `EnsureDclConnections_UnchangedConfig_DoesNotReissueCreateCommand`.
### SiteRuntime-011 — Trust-model validation is a substring scan and is both over- and under-inclusive ### SiteRuntime-011 — Trust-model validation is a substring scan and is both over- and under-inclusive
@@ -398,8 +466,8 @@ _Unresolved._
|--|--| |--|--|
| Severity | Medium | | Severity | Medium |
| Category | Security | | Category | Security |
| Status | Open | | Status | Resolved |
| Location | `src/ScadaLink.SiteRuntime/Scripts/ScriptCompilationService.cs:52` | | Location | `src/ScadaLink.SiteRuntime/Scripts/ScriptCompilationService.cs` (`ValidateTrustModel`) |
**Description** **Description**
@@ -430,7 +498,22 @@ unused `isAllowed` variable.
**Resolution** **Resolution**
_Unresolved._ Resolved 2026-05-16 (`commit pending`): root cause confirmed — `ValidateTrustModel`
was a raw `string.Contains`/`IndexOf` scan of the source text, with a dead `isAllowed`
variable. It is now Roslyn semantic analysis: the script is parsed and a
`CSharpCompilation` + `SemanticModel` are built; every name/member/object-creation
node is resolved to its symbol and the symbol's containing namespace and
fully-qualified containing type are checked against the forbidden roots. Bare
namespace symbols are ignored (so the `System.Threading` qualifier of the allowed
`System.Threading.Tasks.Task` no longer false-positives). A name that cannot be
resolved (a type from an assembly deliberately absent from the script's references)
falls back to a syntactic fully-qualified-name check, so e.g. `System.Net.Http`
references are still rejected. The dead `isAllowed` variable was removed. This fixes
both the bypass (`global::`/alias-qualified forbidden types) and the false positives
(forbidden namespace string in a comment, string literal, or unrelated identifier).
Regression tests: new `TrustModelSemanticTests` (alias/`global::` detection, comment/
literal/identifier non-detection, allowed-exception resolution); all 39 existing
`SandboxTests` + `ScriptCompilationServiceTests` continue to pass.
### SiteRuntime-012 — `AttributeAccessor`/`ScopeAccessors` block the script on a synchronous Ask ### SiteRuntime-012 — `AttributeAccessor`/`ScopeAccessors` block the script on a synchronous Ask
+75 -10
View File
@@ -8,7 +8,7 @@
| Last reviewed | 2026-05-16 | | Last reviewed | 2026-05-16 |
| Reviewer | claude-agent | | Reviewer | claude-agent |
| Commit reviewed | `9c60592` | | Commit reviewed | `9c60592` |
| Open findings | 11 | | Open findings | 7 |
## Summary ## Summary
@@ -212,7 +212,7 @@ corrected semantics.
|--|--| |--|--|
| Severity | Medium | | Severity | Medium |
| Category | Documentation & comments | | Category | Documentation & comments |
| Status | Open | | Status | Resolved |
| Location | `src/ScadaLink.StoreAndForward/StoreAndForwardService.cs:38`, `:60` | | Location | `src/ScadaLink.StoreAndForward/StoreAndForwardService.cs:38`, `:60` |
**Description** **Description**
@@ -233,15 +233,25 @@ retry); exception = transient failure (buffer / increment retry).
**Resolution** **Resolution**
_Unresolved._ Resolved 2026-05-16 (commit pending). Confirmed the root cause against the source: the
old XML comment's "Permanent failures should return false (message will NOT be
buffered)" describes only the `EnqueueAsync` immediate path; on the retry path a
`false` return parks the already-buffered message. Reworded the `_deliveryHandlers`
field doc into an explicit three-way contract (`true` = delivered/removed-or-never-
buffered; `false` = permanent failure = not-buffered-on-immediate / parked-on-retry;
throws = transient = buffered-for-retry / retry-count-incremented), noting it applies
identically on both paths, and pointed `RegisterDeliveryHandler`'s summary at it.
Documentation-only change — no behavioural code touched, so no regression test
(an XML comment is not test-observable).
### StoreAndForward-005 — Parked-message retry/discard can race with the in-progress retry sweep ### StoreAndForward-005 — Parked-message retry/discard can race with the in-progress retry sweep
| | | | | |
|--|--| |--|--|
| Severity | Medium | | Severity | Low — re-triaged down from Medium on 2026-05-16; the described data-loss race is not actually reachable, see Re-triage note |
| Original severity | Medium |
| Category | Concurrency & thread safety | | Category | Concurrency & thread safety |
| Status | Open | | Status | Resolved |
| Location | `src/ScadaLink.StoreAndForward/StoreAndForwardService.cs:184`, `:266`, `:280` | | Location | `src/ScadaLink.StoreAndForward/StoreAndForwardService.cs:184`, `:266`, `:280` |
**Description** **Description**
@@ -266,9 +276,42 @@ so the actor mailbox serialises them, or make status transitions conditional in
(e.g. `UPDATE ... WHERE id = @id AND status = @expected`) and re-check the affected (e.g. `UPDATE ... WHERE id = @id AND status = @expected`) and re-check the affected
row count. row count.
**Re-triage note (2026-05-16)**
Verified against the source: the *specific* race the finding describes — *"an operator
`RetryParkedMessageAsync` resets a row to Pending while the sweep simultaneously parks
the same in-flight message"* — **cannot occur** at the reviewed code. The two operator
operations (`StoreAndForwardStorage.RetryParkedMessageAsync` /
`DiscardParkedMessageAsync`) are already SQL-conditional on `status = Parked`, and the
retry sweep only ever loads and processes rows that are `Pending`. For the operator and
the sweep to touch the *same* message the row would have to be simultaneously `Parked`
(operator-actionable) and in the sweep's in-flight `Pending` snapshot — mutually
exclusive states. Overlapping sweeps are additionally prevented by the `Interlocked`
guard, so there is only ever one sweep writer. The described data-loss outcome is
therefore not reachable, which makes the original Medium ("risky behaviour") severity
an over-statement — **re-triaged to Low**.
What *is* real is a latent fragility: the sweep's own state-changing writes used the
unconditional `UpdateMessageAsync` (`WHERE id = @id`), so the no-clobber guarantee
rested entirely on the `Interlocked` invariant with zero defence-in-depth if that
invariant were ever broken (e.g. `RetryMessageAsync` called outside the sweep, or the
guard removed). That residual Low-severity weakness is worth closing, so the
recommendation's conditional-SQL option was applied.
**Resolution** **Resolution**
_Unresolved._ Resolved 2026-05-16 (commit pending). Re-triaged Medium → Low (the described race is
unreachable; see Re-triage note) and the residual latent fragility fixed: added
`StoreAndForwardStorage.UpdateMessageIfStatusAsync`, a conditional update
(`UPDATE ... WHERE id = @id AND status = @expectedStatus`) returning whether a row was
written. `RetryMessageAsync`'s three state-changing writes (park-on-permanent-failure,
park-on-max-retries, retry-count increment) now use it with `expectedStatus = Pending`,
so the sweep can never overwrite a row whose status changed underneath it; a skipped
write is logged and the message left for the next sweep. Regression test
`RetryMessageAsync_StatusChangedDuringDelivery_SweepParkWriteIsSkipped` drives a writer
that moves the row out of `Pending` mid-delivery and asserts the sweep's stale park
write is skipped (it failed against the pre-fix unconditional update, clobbering the
other writer's `RetryCount`).
### StoreAndForward-006 — `GetParkedMessagesAsync` count and page run without a transaction ### StoreAndForward-006 — `GetParkedMessagesAsync` count and page run without a transaction
@@ -395,7 +438,7 @@ _Unresolved._
|--|--| |--|--|
| Severity | Medium | | Severity | Medium |
| Category | Correctness & logic bugs | | Category | Correctness & logic bugs |
| Status | Open | | Status | Resolved |
| Location | `src/ScadaLink.StoreAndForward/StoreAndForwardStorage.cs:203`, `:101` | | Location | `src/ScadaLink.StoreAndForward/StoreAndForwardStorage.cs:203`, `:101` |
**Description** **Description**
@@ -419,7 +462,16 @@ interval. Document the chosen behaviour in the method's XML comment.
**Resolution** **Resolution**
_Unresolved._ Resolved 2026-05-16 (commit pending). Confirmed against the source: `RetryParkedMessage
Async` reset `status`, `retry_count` and `last_error` but left `last_attempt_at` stale,
so the operator-retried message's retry timing depended on the elapsed time since the
original pre-park attempt. Encoded the intent explicitly — an operator retry means
"attempt this again now" — by also setting `last_attempt_at = NULL` in the UPDATE, so
the re-queued message is unambiguously due on the next sweep regardless of the
configured `retry_interval_ms`. The method's XML comment now documents this. Regression
test `RetryParkedMessageAsync_ClearsLastAttemptAt_SoMessageIsImmediatelyDue` uses a
1-hour retry interval and a recent `last_attempt_at`; it failed pre-fix (timestamp not
cleared, message excluded from the retry-due set) and passes post-fix.
### StoreAndForward-011 — `StoreAndForwardMessageStatus.InFlight` is unused and the doc's "retrying" status is unmodelled ### StoreAndForward-011 — `StoreAndForwardMessageStatus.InFlight` is unused and the doc's "retrying" status is unmodelled
@@ -488,7 +540,7 @@ _Unresolved._
|--|--| |--|--|
| Severity | Medium | | Severity | Medium |
| Category | Testing coverage | | Category | Testing coverage |
| Status | Open | | Status | Resolved |
| Location | `tests/ScadaLink.StoreAndForward.Tests/` (whole directory); `src/ScadaLink.StoreAndForward/StoreAndForwardStorage.cs:101`; `src/ScadaLink.StoreAndForward/ParkedMessageHandlerActor.cs` | | Location | `tests/ScadaLink.StoreAndForward.Tests/` (whole directory); `src/ScadaLink.StoreAndForward/StoreAndForwardStorage.cs:101`; `src/ScadaLink.StoreAndForward/ParkedMessageHandlerActor.cs` |
**Description** **Description**
@@ -514,7 +566,20 @@ invalid-JSON payloads.
**Resolution** **Resolution**
_Unresolved._ Resolved 2026-05-16 (commit pending). All three gaps closed.
(1) Retry-due timing: `GetMessagesForRetryAsync_NonZeroInterval_ExcludesNotYetDue
IncludesDue` exercises the `julianday` elapsed-time SQL with non-zero intervals — a
just-attempted message (1-hour interval) is excluded, an old one (attempted 2h ago,
5-min interval) and a never-attempted one are included.
(2) Active-side replication: this sub-claim was **already stale** at the reviewed code —
`StoreAndForwardReplicationTests` (added with finding 001's fix) asserts the Add, Remove
and Park replication operations; no new test was needed.
(3) `ParkedMessageHandlerActor`: new `ParkedMessageHandlerActorTests` using
`Akka.TestKit.Xunit2` (package reference added to the test project) covers Query/Retry/
Discard request-to-response mapping, correlation-ID propagation, the unknown-message
failure responses, and `ExtractMethodName` for `MethodName`, `Subject`, missing-property
and malformed-JSON payloads (all falling back to the category name without throwing).
These are coverage-gap tests over already-correct code, so they pass on first run.
### StoreAndForward-014 — Storage does not create its SQLite database directory ### StoreAndForward-014 — Storage does not create its SQLite database directory
+69 -12
View File
@@ -8,7 +8,7 @@
| Last reviewed | 2026-05-16 | | Last reviewed | 2026-05-16 |
| Reviewer | claude-agent | | Reviewer | claude-agent |
| Commit reviewed | `9c60592` | | Commit reviewed | `9c60592` |
| Open findings | 9 | | Open findings | 4 |
## Summary ## Summary
@@ -263,7 +263,7 @@ Regression test: `CreateTemplate_WithParent_DoesNotRunDeadCollisionQuery`.
|--|--| |--|--|
| Severity | Medium | | Severity | Medium |
| Category | Security | | Category | Security |
| Status | Open | | Status | Resolved |
| Location | `src/ScadaLink.TemplateEngine/Validation/ScriptCompiler.cs:21`, `src/ScadaLink.TemplateEngine/Validation/ValidationService.cs:318` | | Location | `src/ScadaLink.TemplateEngine/Validation/ScriptCompiler.cs:21`, `src/ScadaLink.TemplateEngine/Validation/ValidationService.cs:318` |
**Description** **Description**
@@ -289,7 +289,20 @@ limitation prominently and treat the substring scan as advisory, not authoritati
**Resolution** **Resolution**
_Unresolved._ Resolved 2026-05-16 (commit `pending`): fixed the false-positive half and
documented the (deferred) bypass half. Added `CSharpDelimiterScanner.ContainsInCode`,
a code-region-aware substring search that blanks out string/char-literal/comment
spans before matching, so the inert text `System.IO.` inside a string or comment is
no longer flagged. `ScriptCompiler.TryCompile` and `ValidationService.CheckExpressionSyntax`
now use it. The bypass half (namespace aliases, `using static`, `global::`) genuinely
requires Roslyn semantic symbol analysis, which the TemplateEngine project deliberately
does not reference — that authoritative check is deferred to the real script compiler /
Site Runtime sandbox. The limitation is now documented prominently as a `SECURITY
LIMITATION` note in the `ScriptCompiler` class summary and the `ForbiddenPatterns`
doc, and the scan is explicitly labelled advisory. Regression tests:
`TryCompile_ForbiddenApiTextInsideStringLiteral_NotFlagged`,
`TryCompile_ForbiddenApiTextInsideComment_NotFlagged`,
`TryCompile_ForbiddenApiInRealCode_StillFlagged`.
### TemplateEngine-007 — Brace-balance "compilation" misjudges verbatim / interpolated / raw strings ### TemplateEngine-007 — Brace-balance "compilation" misjudges verbatim / interpolated / raw strings
@@ -297,7 +310,7 @@ _Unresolved._
|--|--| |--|--|
| Severity | Medium | | Severity | Medium |
| Category | Correctness & logic bugs | | Category | Correctness & logic bugs |
| Status | Open | | Status | Resolved |
| Location | `src/ScadaLink.TemplateEngine/Validation/ScriptCompiler.cs:54`, `src/ScadaLink.TemplateEngine/SharedScriptService.cs:124` | | Location | `src/ScadaLink.TemplateEngine/Validation/ScriptCompiler.cs:54`, `src/ScadaLink.TemplateEngine/SharedScriptService.cs:124` |
**Description** **Description**
@@ -321,7 +334,18 @@ to something that cannot false-positive.
**Resolution** **Resolution**
_Unresolved._ Resolved 2026-05-16 (commit `pending`): replaced both hand-rolled string trackers
with `CSharpDelimiterScanner`, a single string-/comment-aware scanner that correctly
skips regular strings (with `\` escapes), verbatim strings (`@"..."`, `""` escape),
interpolated strings (`$"..."` / `$@"..."`, interpolation holes `{...}` treated as
code, `{{`/`}}` as escaped braces), C# 11 raw string literals (`"""..."""`), char
literals, and line/block comments while tracking `{}`/`[]`/`()` depth. `ScriptCompiler
.TryCompile` and `SharedScriptService.ValidateSyntax` now delegate to it, so a valid
script containing a delimiter inside a literal/comment is no longer falsely rejected;
genuine mismatches are still caught. Regression tests in `ScriptCompilerTests`
(`TryCompile_VerbatimStringWithBrace_*`, `_VerbatimStringWithEscapedQuote_*`,
`_InterpolatedStringWithBraces_*`, `_RawStringLiteralWithBraces_*`, `_CharLiteralWithBrace_*`,
`_GenuineMismatchedBraces_StillDetected`) and `SharedScriptServiceTests.ValidateSyntax_DelimiterInsideStringOrComment_ReturnsNull`.
### TemplateEngine-008 — `SetAlarmOverrideAsync` accepts overrides for unknown / composed alarms with no validation ### TemplateEngine-008 — `SetAlarmOverrideAsync` accepts overrides for unknown / composed alarms with no validation
@@ -329,7 +353,7 @@ _Unresolved._
|--|--| |--|--|
| Severity | Medium | | Severity | Medium |
| Category | Error handling & resilience | | Category | Error handling & resilience |
| Status | Open | | Status | Resolved |
| Location | `src/ScadaLink.TemplateEngine/Services/InstanceService.cs:178` | | Location | `src/ScadaLink.TemplateEngine/Services/InstanceService.cs:178` |
**Description** **Description**
@@ -351,7 +375,16 @@ the lock check to composed alarms too.
**Resolution** **Resolution**
_Unresolved._ Resolved 2026-05-16 (commit `pending`): `SetAlarmOverrideAsync` now resolves the
instance template's full effective alarm set via `TemplateResolver.ResolveAllMembers`
(loaded from `GetAllTemplatesAsync`) instead of looking up only the template's direct
alarms. An override whose canonical name is absent from that set is rejected with a
"does not exist" failure (mirroring `SetAttributeOverrideAsync`); the `IsLocked` check
now also applies to composed (path-qualified) and inherited alarms, closing the
lock-bypass. Regression tests: `SetAlarmOverride_NonExistentAlarm_ReturnsFailure`,
`SetAlarmOverride_ComposedLockedAlarm_ReturnsFailure`,
`SetAlarmOverride_ComposedUnlockedAlarm_ReturnsSuccess`,
`SetAlarmOverride_DirectLockedAlarm_ReturnsFailure`.
### TemplateEngine-009 — N+1 query in `TemplateDeletionService.CanDeleteTemplateAsync` ### TemplateEngine-009 — N+1 query in `TemplateDeletionService.CanDeleteTemplateAsync`
@@ -359,7 +392,7 @@ _Unresolved._
|--|--| |--|--|
| Severity | Medium | | Severity | Medium |
| Category | Performance & resource management | | Category | Performance & resource management |
| Status | Open | | Status | Resolved |
| Location | `src/ScadaLink.TemplateEngine/Services/TemplateDeletionService.cs:75` | | Location | `src/ScadaLink.TemplateEngine/Services/TemplateDeletionService.cs:75` |
**Description** **Description**
@@ -380,15 +413,24 @@ template.
**Resolution** **Resolution**
_Unresolved._ Resolved 2026-05-16 (commit `pending`): `CanDeleteTemplateAsync` Check 3 now reads
the `Compositions` navigation already loaded by `GetAllTemplatesAsync` (a single
`SelectMany`) instead of issuing one `GetCompositionsByTemplateIdAsync` round-trip
per template — the same source `TemplateService.DeleteTemplateAsync` uses for the
equivalent check. The per-delete cost no longer scales with template count.
Regression test: `CanDeleteTemplate_DoesNotIssuePerTemplateCompositionQuery`
(verifies `GetCompositionsByTemplateIdAsync` is never called); the existing
`CanDeleteTemplate_ComposedByOthers_ReturnsFailure` and
`CanDeleteTemplate_MultipleConstraints_AllErrorsReported` tests were updated to seed
the `Compositions` navigation, matching how EF's `GetAllTemplatesAsync` loads it.
### TemplateEngine-010 — `InstanceService` documents optimistic concurrency that is not implemented ### TemplateEngine-010 — `InstanceService` documents optimistic concurrency that is not implemented
| | | | | |
|--|--| |--|--|
| Severity | Medium | | Severity | Low — re-triaged from Medium: this is a stale XML comment, not a behavioural defect. The code matches the design (last-write-wins); only the doc string was wrong. |
| Category | Documentation & comments | | Category | Documentation & comments |
| Status | Open | | Status | Resolved |
| Location | `src/ScadaLink.TemplateEngine/Services/InstanceService.cs:9` | | Location | `src/ScadaLink.TemplateEngine/Services/InstanceService.cs:9` |
**Description** **Description**
@@ -407,9 +449,24 @@ If last-write-wins is acceptable for instance state, correct the XML doc. If opt
concurrency is required, add a concurrency token to `Instance` and surface a conflict concurrency is required, add a concurrency token to `Instance` and surface a conflict
result. result.
**Re-triage**
Verified against the design: `docs/requirements/Component-TemplateEngine.md` states
"Concurrent editing uses **last-write-wins** — no pessimistic locking or conflict
detection." The system's optimistic-concurrency decision (per CLAUDE.md) applies to
*deployment status records*, not instance state. The code is therefore correct — a
plain read-modify-write is the intended behaviour — and the only defect is the stale
"with optimistic concurrency" phrase in the class XML summary. Re-triaged from
Medium (Error handling) to Low (Documentation): doc-only fix, no behaviour change.
**Resolution** **Resolution**
_Unresolved._ Resolved 2026-05-16 (commit `pending`): corrected the `InstanceService` class XML
summary — replaced "Enabled/disabled state with optimistic concurrency" with an
explicit statement that instance-state edits are last-write-wins (no version token /
conflict detection), citing the design decision and noting that optimistic concurrency
in the system applies to deployment status records, not instance state. No code or
behaviour change; no regression test (documentation-only).
### TemplateEngine-011 — `SortedPropertiesConverterFactory` is dead code with a misleading comment ### TemplateEngine-011 — `SortedPropertiesConverterFactory` is dead code with a misleading comment
@@ -19,12 +19,14 @@ public class EventLogPurgeService : BackgroundService
private readonly ILogger<EventLogPurgeService> _logger; private readonly ILogger<EventLogPurgeService> _logger;
public EventLogPurgeService( public EventLogPurgeService(
ISiteEventLogger eventLogger, SiteEventLogger eventLogger,
IOptions<SiteEventLogOptions> options, IOptions<SiteEventLogOptions> options,
ILogger<EventLogPurgeService> logger) ILogger<EventLogPurgeService> logger)
{ {
// We need the concrete type to funnel access through its shared lock. // Depend on the concrete recorder directly: purge must funnel database access
_eventLogger = (SiteEventLogger)eventLogger; // through its lock-guarded WithConnection. Taking ISiteEventLogger and
// downcasting would throw InvalidCastException for any other implementation.
_eventLogger = eventLogger;
_options = options.Value; _options = options.Value;
_logger = logger; _logger = logger;
} }
@@ -18,11 +18,14 @@ public class EventLogQueryService : IEventLogQueryService
private readonly ILogger<EventLogQueryService> _logger; private readonly ILogger<EventLogQueryService> _logger;
public EventLogQueryService( public EventLogQueryService(
ISiteEventLogger eventLogger, SiteEventLogger eventLogger,
IOptions<SiteEventLogOptions> options, IOptions<SiteEventLogOptions> options,
ILogger<EventLogQueryService> logger) ILogger<EventLogQueryService> logger)
{ {
_eventLogger = (SiteEventLogger)eventLogger; // Depend on the concrete recorder directly: queries must funnel database
// access through its lock-guarded WithConnection. Taking ISiteEventLogger and
// downcasting would throw InvalidCastException for any other implementation.
_eventLogger = eventLogger;
_options = options.Value; _options = options.Value;
_logger = logger; _logger = logger;
} }
@@ -9,7 +9,13 @@ public static class ServiceCollectionExtensions
/// </summary> /// </summary>
public static IServiceCollection AddSiteEventLogging(this IServiceCollection services) public static IServiceCollection AddSiteEventLogging(this IServiceCollection services)
{ {
services.AddSingleton<ISiteEventLogger, SiteEventLogger>(); // The recorder is registered as a concrete singleton and the interface is
// forwarded to the same instance. The purge and query services depend on the
// concrete SiteEventLogger directly (they need its lock-guarded WithConnection)
// rather than downcasting an ISiteEventLogger, which would throw
// InvalidCastException for any other ISiteEventLogger implementation.
services.AddSingleton<SiteEventLogger>();
services.AddSingleton<ISiteEventLogger>(sp => sp.GetRequiredService<SiteEventLogger>());
services.AddSingleton<IEventLogQueryService, EventLogQueryService>(); services.AddSingleton<IEventLogQueryService, EventLogQueryService>();
services.AddHostedService<EventLogPurgeService>(); services.AddHostedService<EventLogPurgeService>();
return services; return services;
+119 -23
View File
@@ -1,3 +1,4 @@
using System.Threading.Channels;
using Microsoft.Data.Sqlite; using Microsoft.Data.Sqlite;
using Microsoft.Extensions.Logging; using Microsoft.Extensions.Logging;
using Microsoft.Extensions.Options; using Microsoft.Extensions.Options;
@@ -10,15 +11,28 @@ namespace ScadaLink.SiteEventLogging;
/// On failover, the new active node starts a fresh log. /// On failover, the new active node starts a fresh log.
/// </summary> /// </summary>
/// <remarks> /// <remarks>
/// <para>
/// A single <see cref="SqliteConnection"/> is owned here and is NOT thread-safe. /// A single <see cref="SqliteConnection"/> is owned here and is NOT thread-safe.
/// All access — recording, querying, purging — must be funnelled through /// All access — recording, querying, purging — must be funnelled through
/// <see cref="WithConnection"/>, which serialises callers on a shared lock. /// <see cref="WithConnection"/>, which serialises callers on a shared lock.
/// </para>
/// <para>
/// Event recording is offloaded to a dedicated background writer thread (fed by an
/// unbounded <see cref="Channel{T}"/>). <see cref="LogEventAsync"/> only validates
/// its arguments and enqueues, so callers — typically Akka actor threads on hot
/// paths — never block on disk I/O or on contention for the write lock. The
/// returned <see cref="Task"/> completes once the event is durably persisted and
/// faults if the write fails, so failures are observable rather than swallowed.
/// </para>
/// </remarks> /// </remarks>
public class SiteEventLogger : ISiteEventLogger, IDisposable public class SiteEventLogger : ISiteEventLogger, IDisposable
{ {
private readonly SqliteConnection _connection; private readonly SqliteConnection _connection;
private readonly ILogger<SiteEventLogger> _logger; private readonly ILogger<SiteEventLogger> _logger;
private readonly object _writeLock = new(); private readonly object _writeLock = new();
private readonly Channel<PendingEvent> _writeQueue;
private readonly Task _writerLoop;
private long _failedWriteCount;
private bool _disposed; private bool _disposed;
public SiteEventLogger( public SiteEventLogger(
@@ -34,8 +48,22 @@ public class SiteEventLogger : ISiteEventLogger, IDisposable
_connection.Open(); _connection.Open();
InitializeSchema(); InitializeSchema();
_writeQueue = Channel.CreateUnbounded<PendingEvent>(new UnboundedChannelOptions
{
SingleReader = true,
SingleWriter = false,
});
_writerLoop = Task.Run(ProcessWriteQueueAsync);
} }
/// <summary>
/// Number of event writes that have failed (SQLite error, disk full, etc.)
/// since this logger was created. Surfaced so Health Monitoring can detect a
/// logging outage instead of relying on a local log line nobody is watching.
/// </summary>
public long FailedWriteCount => Interlocked.Read(ref _failedWriteCount);
/// <summary> /// <summary>
/// Runs <paramref name="action"/> against the shared connection while holding the /// Runs <paramref name="action"/> against the shared connection while holding the
/// write lock, so purge / query / record callers on different threads never use /// write lock, so purge / query / record callers on different threads never use
@@ -112,42 +140,110 @@ public class SiteEventLogger : ISiteEventLogger, IDisposable
ArgumentException.ThrowIfNullOrWhiteSpace(source); ArgumentException.ThrowIfNullOrWhiteSpace(source);
ArgumentException.ThrowIfNullOrWhiteSpace(message); ArgumentException.ThrowIfNullOrWhiteSpace(message);
var timestamp = DateTimeOffset.UtcNow.ToString("o"); var pending = new PendingEvent(
DateTimeOffset.UtcNow.ToString("o"),
eventType,
severity,
instanceId,
source,
message,
details);
try // Enqueue only — the actual SQLite write happens on the background writer
// thread so the caller (an Akka actor thread on a hot path) never blocks
// on disk I/O or on contention for the write lock.
if (!_writeQueue.Writer.TryWrite(pending))
{ {
WithConnection(connection => // The channel is unbounded, so the only way TryWrite fails is that the
// writer has been completed (logger disposed). Drop silently — there is
// nowhere to persist the event.
pending.Completion.TrySetResult();
}
return pending.Completion.Task;
}
private async Task ProcessWriteQueueAsync()
{
await foreach (var pending in _writeQueue.Reader.ReadAllAsync().ConfigureAwait(false))
{
try
{ {
using var cmd = connection.CreateCommand(); var written = WithConnection(connection =>
cmd.CommandText = """ {
INSERT INTO site_events (timestamp, event_type, severity, instance_id, source, message, details) using var cmd = connection.CreateCommand();
VALUES ($timestamp, $event_type, $severity, $instance_id, $source, $message, $details) cmd.CommandText = """
"""; INSERT INTO site_events (timestamp, event_type, severity, instance_id, source, message, details)
cmd.Parameters.AddWithValue("$timestamp", timestamp); VALUES ($timestamp, $event_type, $severity, $instance_id, $source, $message, $details)
cmd.Parameters.AddWithValue("$event_type", eventType); """;
cmd.Parameters.AddWithValue("$severity", severity); cmd.Parameters.AddWithValue("$timestamp", pending.Timestamp);
cmd.Parameters.AddWithValue("$instance_id", (object?)instanceId ?? DBNull.Value); cmd.Parameters.AddWithValue("$event_type", pending.EventType);
cmd.Parameters.AddWithValue("$source", source); cmd.Parameters.AddWithValue("$severity", pending.Severity);
cmd.Parameters.AddWithValue("$message", message); cmd.Parameters.AddWithValue("$instance_id", (object?)pending.InstanceId ?? DBNull.Value);
cmd.Parameters.AddWithValue("$details", (object?)details ?? DBNull.Value); cmd.Parameters.AddWithValue("$source", pending.Source);
cmd.ExecuteNonQuery(); cmd.Parameters.AddWithValue("$message", pending.Message);
}); cmd.Parameters.AddWithValue("$details", (object?)pending.Details ?? DBNull.Value);
} cmd.ExecuteNonQuery();
catch (Exception ex) });
{
_logger.LogError(ex, "Failed to record event: {EventType} from {Source}", eventType, source);
}
return Task.CompletedTask; // WithConnection returns false only when the logger has been
// disposed; the event simply cannot be persisted in that case.
pending.Completion.TrySetResult();
}
catch (Exception ex)
{
// SiteEventLogging-008: a write failure must be observable. Count it
// (Health Monitoring reads FailedWriteCount) and fault the caller's
// Task instead of silently discarding the exception.
Interlocked.Increment(ref _failedWriteCount);
_logger.LogError(ex, "Failed to record event: {EventType} from {Source}",
pending.EventType, pending.Source);
pending.Completion.TrySetException(ex);
}
}
} }
public void Dispose() public void Dispose()
{ {
Task? writerLoop = null;
lock (_writeLock) lock (_writeLock)
{ {
if (_disposed) return; if (_disposed) return;
_disposed = true; _disposed = true;
// Stop accepting new events and let the writer loop drain.
_writeQueue.Writer.TryComplete();
writerLoop = _writerLoop;
}
// Wait for the writer loop to finish outside the lock — the loop itself
// acquires the lock for each write.
try
{
writerLoop?.Wait(TimeSpan.FromSeconds(5));
}
catch (AggregateException)
{
// A faulted writer loop has already been logged per event; nothing more
// to do during disposal.
}
lock (_writeLock)
{
_connection.Dispose(); _connection.Dispose();
} }
} }
/// <summary>An event awaiting persistence by the background writer.</summary>
private sealed record PendingEvent(
string Timestamp,
string EventType,
string Severity,
string? InstanceId,
string Source,
string Message,
string? Details)
{
public TaskCompletionSource Completion { get; } =
new(TaskCreationOptions.RunContinuationsAsynchronously);
}
} }
@@ -460,7 +460,8 @@ public class AlarmActor : ReceiveActor
var executionId = $"{_alarmName}-alarm-exec-{_executionCounter++}"; var executionId = $"{_alarmName}-alarm-exec-{_executionCounter++}";
// NOTE: In production, configure a dedicated blocking I/O dispatcher via HOCON. // SiteRuntime-009: the on-trigger script body runs on the dedicated
// ScriptExecutionScheduler, not the shared .NET thread pool.
var props = Props.Create(() => new AlarmExecutionActor( var props = Props.Create(() => new AlarmExecutionActor(
_alarmName, _alarmName,
_instanceName, _instanceName,
@@ -54,7 +54,11 @@ public class AlarmExecutionActor : ReceiveActor
{ {
var timeout = TimeSpan.FromSeconds(options.ScriptExecutionTimeoutSeconds); var timeout = TimeSpan.FromSeconds(options.ScriptExecutionTimeoutSeconds);
_ = Task.Run(async () => // SiteRuntime-009: run the alarm on-trigger body on the dedicated
// script-execution scheduler, not the shared .NET thread pool.
var scheduler = ScriptExecutionScheduler.Shared(options);
_ = Task.Factory.StartNew(async () =>
{ {
using var cts = new CancellationTokenSource(timeout); using var cts = new CancellationTokenSource(timeout);
try try
@@ -108,6 +112,6 @@ public class AlarmExecutionActor : ReceiveActor
{ {
self.Tell(PoisonPill.Instance); self.Tell(PoisonPill.Instance);
} }
}); }, CancellationToken.None, TaskCreationOptions.DenyChildAttach, scheduler).Unwrap();
} }
} }
@@ -99,6 +99,7 @@ public class DeploymentManagerActor : ReceiveActor, IWithTimers
// Internal startup messages // Internal startup messages
Receive<StartupConfigsLoaded>(HandleStartupConfigsLoaded); Receive<StartupConfigsLoaded>(HandleStartupConfigsLoaded);
Receive<SharedScriptsLoaded>(HandleSharedScriptsLoaded);
Receive<StartNextBatch>(HandleStartNextBatch); Receive<StartNextBatch>(HandleStartNextBatch);
// Internal enable result // Internal enable result
@@ -156,7 +157,14 @@ public class DeploymentManagerActor : ReceiveActor, IWithTimers
} }
/// <summary> /// <summary>
/// Processes the loaded configs from SQLite and begins staggered Instance Actor creation. /// Processes the loaded configs from SQLite.
///
/// SiteRuntime-008: shared scripts must be compiled before Instance Actors are
/// created, but the SQLite read and Roslyn compilation must not block the
/// singleton's mailbox. The compilation is run on a background task and a
/// <see cref="SharedScriptsLoaded"/> message is piped back; only then does
/// staggered Instance Actor creation begin. The deployed configs are stashed on the
/// actor field in the meantime.
/// </summary> /// </summary>
private void HandleStartupConfigsLoaded(StartupConfigsLoaded msg) private void HandleStartupConfigsLoaded(StartupConfigsLoaded msg)
{ {
@@ -166,9 +174,6 @@ public class DeploymentManagerActor : ReceiveActor, IWithTimers
return; return;
} }
// Load and compile shared scripts from SQLite before creating Instance Actors
LoadSharedScriptsFromStorage();
var enabledConfigs = msg.Configs.Where(c => c.IsEnabled).ToList(); var enabledConfigs = msg.Configs.Where(c => c.IsEnabled).ToList();
_totalDeployedCount = msg.Configs.Count; _totalDeployedCount = msg.Configs.Count;
_logger.LogInformation( _logger.LogInformation(
@@ -176,11 +181,25 @@ public class DeploymentManagerActor : ReceiveActor, IWithTimers
msg.Configs.Count, enabledConfigs.Count); msg.Configs.Count, enabledConfigs.Count);
UpdateInstanceCounts(); UpdateInstanceCounts();
if (enabledConfigs.Count == 0) // Load and compile shared scripts off the actor thread, then resume startup.
LoadSharedScriptsFromStorage(enabledConfigs);
}
/// <summary>
/// SiteRuntime-008: once shared scripts have been compiled off-thread, begins
/// staggered Instance Actor creation for the enabled configs captured at startup.
/// </summary>
private void HandleSharedScriptsLoaded(SharedScriptsLoaded msg)
{
_logger.LogInformation(
"Loaded {Compiled}/{Total} shared scripts from SQLite",
msg.CompiledCount, msg.TotalCount);
if (msg.EnabledConfigs.Count == 0)
return; return;
// Start the first batch immediately // Start the first batch immediately
var batchState = new BatchState(enabledConfigs, 0); var batchState = new BatchState(msg.EnabledConfigs, 0);
Self.Tell(new StartNextBatch(batchState)); Self.Tell(new StartNextBatch(batchState));
} }
@@ -275,6 +294,13 @@ public class DeploymentManagerActor : ReceiveActor, IWithTimers
/// Creates the Instance Actor, persists the config, and replies to the deployer. /// Creates the Instance Actor, persists the config, and replies to the deployer.
/// A redeployment is an update of an existing instance, so the deployed-instance /// A redeployment is an update of an existing instance, so the deployed-instance
/// counter is only incremented for genuinely new deployments. /// counter is only incremented for genuinely new deployments.
///
/// SiteRuntime-005: the deployer is <b>not</b> told <see cref="DeploymentStatus.Success"/>
/// until SQLite persistence has committed. The site's deployed-config store is the
/// durable source of truth — a config that was never persisted would be silently lost
/// on the next restart/failover, so reporting Success before the row is committed is
/// incorrect. The reply is sent from <see cref="HandleDeployPersistenceResult"/> once
/// the persistence outcome is known.
/// </summary> /// </summary>
private void ApplyDeployment(DeployInstanceCommand command, IActorRef sender, bool isRedeploy) private void ApplyDeployment(DeployInstanceCommand command, IActorRef sender, bool isRedeploy)
{ {
@@ -307,33 +333,56 @@ public class DeploymentManagerActor : ReceiveActor, IWithTimers
instanceName, command.FlattenedConfigurationJson, instanceName, command.FlattenedConfigurationJson,
command.DeploymentId, command.RevisionHash, true)); command.DeploymentId, command.RevisionHash, true));
return new DeployPersistenceResult(command.DeploymentId, instanceName, true, null, sender); return new DeployPersistenceResult(
command.DeploymentId, instanceName, true, null, sender, isRedeploy);
}).ContinueWith(t => }).ContinueWith(t =>
{ {
if (t.IsCompletedSuccessfully) if (t.IsCompletedSuccessfully)
return t.Result; return t.Result;
return new DeployPersistenceResult( return new DeployPersistenceResult(
command.DeploymentId, instanceName, false, command.DeploymentId, instanceName, false,
t.Exception?.GetBaseException().Message, sender); t.Exception?.GetBaseException().Message, sender, isRedeploy);
}).PipeTo(Self); }).PipeTo(Self);
// Reply immediately — deployment is applied (actor is running)
sender.Tell(new DeploymentStatusResponse(
command.DeploymentId,
instanceName,
DeploymentStatus.Success,
null,
DateTimeOffset.UtcNow));
} }
/// <summary>
/// SiteRuntime-005: reports the deployment outcome to central only after the
/// persistence result is known. On a persistence failure the Instance Actor that was
/// created optimistically is stopped and the deployed-instance counter rolled back,
/// so the in-memory state stays consistent with durable storage, and central is told
/// the deployment <see cref="DeploymentStatus.Failed"/>.
/// </summary>
private void HandleDeployPersistenceResult(DeployPersistenceResult result) private void HandleDeployPersistenceResult(DeployPersistenceResult result)
{ {
if (!result.Success) if (result.Success)
{ {
_logger.LogError( result.OriginalSender.Tell(new DeploymentStatusResponse(
"Failed to persist deployment {DeploymentId} for {Instance}: {Error}", result.DeploymentId,
result.DeploymentId, result.InstanceName, result.Error); result.InstanceName,
DeploymentStatus.Success,
null,
DateTimeOffset.UtcNow));
return;
} }
_logger.LogError(
"Failed to persist deployment {DeploymentId} for {Instance}: {Error}",
result.DeploymentId, result.InstanceName, result.Error);
// Persistence failed — undo the optimistic actor creation and counter bump so
// the site does not advertise an instance it cannot durably recover.
if (_instanceActors.Remove(result.InstanceName, out var orphan))
Context.Stop(orphan);
if (!result.IsRedeploy)
_totalDeployedCount = Math.Max(0, _totalDeployedCount - 1);
UpdateInstanceCounts();
result.OriginalSender.Tell(new DeploymentStatusResponse(
result.DeploymentId,
result.InstanceName,
DeploymentStatus.Failed,
result.Error ?? "Deployment persistence failed",
DateTimeOffset.UtcNow));
} }
/// <summary> /// <summary>
@@ -492,10 +541,20 @@ public class DeploymentManagerActor : ReceiveActor, IWithTimers
// ── DCL connection management ── // ── DCL connection management ──
private readonly HashSet<string> _createdConnections = new(); /// <summary>
/// Tracks the configuration last sent to the DCL for each connection name, keyed by
/// a hash of the connection's protocol/endpoints/credentials/failover count
/// (SiteRuntime-010). A name whose hash is unchanged is skipped; a name whose config
/// changed re-issues a <c>CreateConnectionCommand</c> so the DCL adopts the new
/// configuration instead of keeping a stale connection after a redeployment.
/// </summary>
private readonly Dictionary<string, string> _createdConnections = new();
/// <summary> /// <summary>
/// Sets up DCL connections from the flattened config (idempotent: tracks created connections). /// Sets up DCL connections from the flattened config. Idempotent on unchanged
/// configuration, but re-issues the create command when a connection's endpoint,
/// credentials, backup endpoint, or failover retry count has changed since it was
/// last sent (SiteRuntime-010).
/// </summary> /// </summary>
private void EnsureDclConnections(string configJson) private void EnsureDclConnections(string configJson)
{ {
@@ -508,7 +567,8 @@ public class DeploymentManagerActor : ReceiveActor, IWithTimers
foreach (var (name, connConfig) in config.Connections) foreach (var (name, connConfig) in config.Connections)
{ {
if (_createdConnections.Contains(name)) var configHash = ComputeConnectionConfigHash(connConfig);
if (_createdConnections.TryGetValue(name, out var lastHash) && lastHash == configHash)
continue; continue;
var primaryDetails = FlattenConnectionConfig(connConfig.Protocol, connConfig.ConfigurationJson); var primaryDetails = FlattenConnectionConfig(connConfig.Protocol, connConfig.ConfigurationJson);
@@ -519,10 +579,11 @@ public class DeploymentManagerActor : ReceiveActor, IWithTimers
_dclManager.Tell(new Commons.Messages.DataConnection.CreateConnectionCommand( _dclManager.Tell(new Commons.Messages.DataConnection.CreateConnectionCommand(
name, connConfig.Protocol, primaryDetails, backupDetails, connConfig.FailoverRetryCount)); name, connConfig.Protocol, primaryDetails, backupDetails, connConfig.FailoverRetryCount));
_createdConnections.Add(name); var changed = _createdConnections.ContainsKey(name);
_createdConnections[name] = configHash;
_logger.LogInformation( _logger.LogInformation(
"Created DCL connection {Connection} (protocol={Protocol})", "{Action} DCL connection {Connection} (protocol={Protocol})",
name, connConfig.Protocol); changed ? "Updated" : "Created", name, connConfig.Protocol);
} }
} }
catch (Exception ex) catch (Exception ex)
@@ -531,6 +592,26 @@ public class DeploymentManagerActor : ReceiveActor, IWithTimers
} }
} }
/// <summary>
/// Computes a stable hash over the configuration fields that affect how the DCL
/// connects, so a changed endpoint/credential/backup/failover count is detected
/// (SiteRuntime-010).
/// </summary>
private static string ComputeConnectionConfigHash(
Commons.Types.Flattening.ConnectionConfig connConfig)
{
var material = string.Join(
"",
connConfig.Protocol,
connConfig.ConfigurationJson ?? string.Empty,
connConfig.BackupConfigurationJson ?? string.Empty,
connConfig.FailoverRetryCount.ToString());
var bytes = System.Security.Cryptography.SHA256.HashData(
System.Text.Encoding.UTF8.GetBytes(material));
return Convert.ToHexString(bytes);
}
private static IDictionary<string, string> FlattenConnectionConfig(string protocol, string? json) private static IDictionary<string, string> FlattenConnectionConfig(string protocol, string? json)
{ {
if (string.IsNullOrEmpty(json)) if (string.IsNullOrEmpty(json))
@@ -559,25 +640,35 @@ public class DeploymentManagerActor : ReceiveActor, IWithTimers
// ── Shared Script Loading ── // ── Shared Script Loading ──
private void LoadSharedScriptsFromStorage() /// <summary>
/// SiteRuntime-008: reads and compiles all shared scripts on a background task so the
/// SQLite read and Roslyn compilation never block the singleton's mailbox thread. The
/// result is piped back as a <see cref="SharedScriptsLoaded"/> message, carrying the
/// enabled configs to resume staggered Instance Actor creation on the actor thread.
/// </summary>
private void LoadSharedScriptsFromStorage(List<DeployedInstance> enabledConfigs)
{ {
try Task.Run(async () =>
{ {
var scripts = _storage.GetAllSharedScriptsAsync().GetAwaiter().GetResult(); var scripts = await _storage.GetAllSharedScriptsAsync();
var compiled = 0; var compiled = 0;
foreach (var script in scripts) foreach (var script in scripts)
{ {
if (_sharedScriptLibrary.CompileAndRegister(script.Name, script.Code)) if (_sharedScriptLibrary.CompileAndRegister(script.Name, script.Code))
compiled++; compiled++;
} }
_logger.LogInformation( return new SharedScriptsLoaded(enabledConfigs, compiled, scripts.Count);
"Loaded {Compiled}/{Total} shared scripts from SQLite", }).ContinueWith(t =>
compiled, scripts.Count);
}
catch (Exception ex)
{ {
_logger.LogError(ex, "Failed to load shared scripts from SQLite"); if (t.IsCompletedSuccessfully)
} return t.Result;
_logger.LogError(
t.Exception?.GetBaseException(), "Failed to load shared scripts from SQLite");
// A shared-script load failure must not abandon startup — proceed with
// Instance Actor creation; scripts that need a missing shared script fail
// at execution time and are recorded to the site event log.
return new SharedScriptsLoaded(enabledConfigs, 0, 0);
}).PipeTo(Self);
} }
// ── Debug View routing ── // ── Debug View routing ──
@@ -891,12 +982,22 @@ public class DeploymentManagerActor : ReceiveActor, IWithTimers
// ── Internal messages ── // ── Internal messages ──
internal record StartupConfigsLoaded(List<DeployedInstance> Configs, string? Error); internal record StartupConfigsLoaded(List<DeployedInstance> Configs, string? Error);
/// <summary>
/// Internal message piped back once shared scripts have been compiled off-thread
/// (SiteRuntime-008). Carries the enabled configs so staggered Instance Actor
/// creation resumes on the actor thread.
/// </summary>
internal record SharedScriptsLoaded(
List<DeployedInstance> EnabledConfigs, int CompiledCount, int TotalCount);
internal record StartNextBatch(BatchState State); internal record StartNextBatch(BatchState State);
internal record BatchState(List<DeployedInstance> Configs, int NextIndex); internal record BatchState(List<DeployedInstance> Configs, int NextIndex);
internal record EnableResult( internal record EnableResult(
EnableInstanceCommand Command, DeployedInstance? Config, string? Error, IActorRef OriginalSender); EnableInstanceCommand Command, DeployedInstance? Config, string? Error, IActorRef OriginalSender);
internal record DeployPersistenceResult( internal record DeployPersistenceResult(
string DeploymentId, string InstanceName, bool Success, string? Error, IActorRef OriginalSender); string DeploymentId, string InstanceName, bool Success, string? Error,
IActorRef OriginalSender, bool IsRedeploy);
/// <summary> /// <summary>
/// A redeployment command buffered until the previous Instance Actor terminates. /// A redeployment command buffered until the previous Instance Actor terminates.
@@ -286,9 +286,10 @@ public class ScriptActor : ReceiveActor, IWithTimers
{ {
var executionId = $"{_scriptName}-exec-{_executionCounter++}"; var executionId = $"{_scriptName}-exec-{_executionCounter++}";
// NOTE: In production, configure a dedicated blocking I/O dispatcher via HOCON: // SiteRuntime-009: the actor's mailbox stays on the default dispatcher, but the
// akka.actor.script-execution-dispatcher { type = PinnedDispatcher } // script body itself runs on the dedicated ScriptExecutionScheduler (a bounded
// and chain .WithDispatcher("akka.actor.script-execution-dispatcher") below. // set of dedicated threads), so blocking script I/O is contained there and
// cannot starve the shared .NET thread pool.
var props = Props.Create(() => new ScriptExecutionActor( var props = Props.Create(() => new ScriptExecutionActor(
_scriptName, _scriptName,
_instanceName, _instanceName,
@@ -68,8 +68,13 @@ public class ScriptExecutionActor : ReceiveActor
{ {
var timeout = TimeSpan.FromSeconds(options.ScriptExecutionTimeoutSeconds); var timeout = TimeSpan.FromSeconds(options.ScriptExecutionTimeoutSeconds);
// SiteRuntime-009: run the script body on the dedicated script-execution
// scheduler, not the shared .NET thread pool, so blocking script I/O cannot
// starve the global pool and stall Akka dispatchers / HTTP handling.
var scheduler = ScriptExecutionScheduler.Shared(options);
// CTS must be created inside the async lambda so it outlives this method // CTS must be created inside the async lambda so it outlives this method
_ = Task.Run(async () => _ = Task.Factory.StartNew(async () =>
{ {
IServiceScope? serviceScope = null; IServiceScope? serviceScope = null;
// ISiteEventLogger is a singleton; resolve from the root provider so // ISiteEventLogger is a singleton; resolve from the root provider so
@@ -164,6 +169,6 @@ public class ScriptExecutionActor : ReceiveActor
// Stop self after execution completes // Stop self after execution completes
self.Tell(PoisonPill.Instance); self.Tell(PoisonPill.Instance);
} }
}); }, CancellationToken.None, TaskCreationOptions.DenyChildAttach, scheduler).Unwrap();
} }
} }
@@ -19,6 +19,14 @@ public class SiteStorageService
_logger = logger; _logger = logger;
} }
/// <summary>
/// Creates a new (unopened) SQLite connection against the site database.
/// Exposed so site-local repositories can open their own connections without
/// reaching into private state via reflection (SiteRuntime-006). The caller owns
/// the connection and is responsible for opening and disposing it.
/// </summary>
public SqliteConnection CreateConnection() => new(_connectionString);
/// <summary> /// <summary>
/// Creates the SQLite tables if they do not exist. /// Creates the SQLite tables if they do not exist.
/// Called once on site startup. /// Called once on site startup.
@@ -169,27 +169,12 @@ public class SiteExternalSystemRepository : IExternalSystemRepository
// ── Private helpers ── // ── Private helpers ──
/// <summary> /// <summary>
/// Creates a new SQLite connection using the same connection string as <see cref="SiteStorageService"/>. /// Creates a new SQLite connection against the site database via
/// We access the connection string via reflection-free approach: the storage service /// <see cref="SiteStorageService.CreateConnection"/> (SiteRuntime-006). The
/// exposes it through a known field. Since it doesn't, we derive it from the injected service /// connection string is owned by <see cref="SiteStorageService"/>; the repository
/// by using a shared connection string provider pattern. For now, we accept the connection /// no longer reaches into its private state via reflection.
/// string via a secondary constructor path or expose it from storage.
///
/// Implementation note: We use the SiteStorageService's internal connection string.
/// This field is accessed via a package-internal helper since SiteStorageService
/// doesn't expose it directly. As a pragmatic solution, we pass the connection string
/// separately at DI registration time.
/// </summary> /// </summary>
private SqliteConnection CreateConnection() private SqliteConnection CreateConnection() => _storage.CreateConnection();
{
// Access the connection string from SiteStorageService via its internal field.
// This uses reflection as a pragmatic choice — the alternative is modifying
// SiteStorageService to expose the connection string, which is out of scope.
var field = typeof(SiteStorageService).GetField("_connectionString",
System.Reflection.BindingFlags.NonPublic | System.Reflection.BindingFlags.Instance);
var connectionString = (string)field!.GetValue(_storage)!;
return new SqliteConnection(connectionString);
}
private static ExternalSystemDefinition MapExternalSystem(SqliteDataReader reader) private static ExternalSystemDefinition MapExternalSystem(SqliteDataReader reader)
{ {
@@ -233,12 +218,13 @@ public class SiteExternalSystemRepository : IExternalSystemRepository
} }
/// <summary> /// <summary>
/// Generates a stable positive integer ID from a string name. /// Generates a stable positive integer ID from a string name (SiteRuntime-007).
/// Uses a hash to produce a deterministic synthetic ID since the SQLite /// Uses a deterministic FNV-1a hash rather than <see cref="string.GetHashCode()"/>,
/// tables are keyed by name rather than auto-increment integer. /// which is randomized per process on .NET Core and would therefore change every
/// time the process restarts — breaking any caller that stored an ID and later
/// looks the entity up by that ID.
/// </summary> /// </summary>
private static int GenerateSyntheticId(string name) private static int GenerateSyntheticId(string name) => SyntheticId.From(name);
=> name.GetHashCode() & 0x7FFFFFFF;
/// <summary> /// <summary>
/// DTO for deserializing individual method entries from the method_definitions JSON column. /// DTO for deserializing individual method entries from the method_definitions JSON column.
@@ -178,13 +178,12 @@ public class SiteNotificationRepository : INotificationRepository
// ── Private helpers ── // ── Private helpers ──
private SqliteConnection CreateConnection() /// <summary>
{ /// Creates a new SQLite connection against the site database via
var field = typeof(SiteStorageService).GetField("_connectionString", /// <see cref="SiteStorageService.CreateConnection"/> (SiteRuntime-006) instead of
System.Reflection.BindingFlags.NonPublic | System.Reflection.BindingFlags.Instance); /// reaching into its private connection-string field via reflection.
var connectionString = (string)field!.GetValue(_storage)!; /// </summary>
return new SqliteConnection(connectionString); private SqliteConnection CreateConnection() => _storage.CreateConnection();
}
private static NotificationList MapNotificationList(SqliteDataReader reader) private static NotificationList MapNotificationList(SqliteDataReader reader)
{ {
@@ -246,10 +245,9 @@ public class SiteNotificationRepository : INotificationRepository
} }
/// <summary> /// <summary>
/// Generates a stable positive integer ID from a string name. /// Generates a stable positive integer ID from a string name (SiteRuntime-007).
/// Uses a hash to produce a deterministic synthetic ID since the SQLite /// Uses a deterministic FNV-1a hash rather than <see cref="string.GetHashCode()"/>,
/// tables are keyed by name rather than auto-increment integer. /// which is randomized per process on .NET Core and would change every restart.
/// </summary> /// </summary>
private static int GenerateSyntheticId(string name) private static int GenerateSyntheticId(string name) => SyntheticId.From(name);
=> name.GetHashCode() & 0x7FFFFFFF;
} }
@@ -0,0 +1,35 @@
namespace ScadaLink.SiteRuntime.Repositories;
/// <summary>
/// SiteRuntime-007: deterministic synthetic-ID generation for site-local artifacts.
///
/// The site SQLite tables are keyed by name rather than an auto-increment integer, but
/// the shared repository contracts (<c>IExternalSystemRepository</c>,
/// <c>INotificationRepository</c>) expose integer-keyed lookups. A synthetic integer ID
/// is therefore derived from the entity name. It MUST be stable across process restarts
/// — <see cref="string.GetHashCode()"/> is randomized per process on .NET Core and so
/// cannot be used.
/// </summary>
internal static class SyntheticId
{
// FNV-1a 32-bit constants.
private const uint FnvOffsetBasis = 2166136261;
private const uint FnvPrime = 16777619;
/// <summary>
/// Computes a deterministic, process-stable positive 31-bit integer ID for the
/// given name using the FNV-1a hash over its UTF-8 bytes.
/// </summary>
public static int From(string name)
{
var hash = FnvOffsetBasis;
foreach (var b in System.Text.Encoding.UTF8.GetBytes(name))
{
hash ^= b;
hash *= FnvPrime;
}
// Mask to a positive 31-bit value so the ID is always non-negative.
return (int)(hash & 0x7FFFFFFF);
}
}
@@ -1,6 +1,7 @@
using Microsoft.CodeAnalysis; using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp; using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Scripting; using Microsoft.CodeAnalysis.CSharp.Scripting;
using Microsoft.CodeAnalysis.CSharp.Syntax;
using Microsoft.CodeAnalysis.Scripting; using Microsoft.CodeAnalysis.Scripting;
using Microsoft.Extensions.Logging; using Microsoft.Extensions.Logging;
using ScadaLink.Commons.Types; using ScadaLink.Commons.Types;
@@ -17,7 +18,10 @@ public class ScriptCompilationService
private readonly ILogger<ScriptCompilationService> _logger; private readonly ILogger<ScriptCompilationService> _logger;
/// <summary> /// <summary>
/// Namespaces that are forbidden in user scripts for security. /// Forbidden API roots. Each entry is matched as a prefix against both the resolved
/// symbol's containing namespace and its fully-qualified containing type name, so an
/// entry may name a whole namespace ("System.IO") or a single type
/// ("System.Diagnostics.Process").
/// </summary> /// </summary>
private static readonly string[] ForbiddenNamespaces = private static readonly string[] ForbiddenNamespaces =
[ [
@@ -30,8 +34,8 @@ public class ScriptCompilationService
]; ];
/// <summary> /// <summary>
/// Specific types/members allowed even within forbidden namespaces. /// Specific namespaces/types allowed even though they sit under a forbidden root.
/// async/await is OK despite System.Threading being blocked. /// async/await and cancellation tokens are OK despite System.Threading being blocked.
/// </summary> /// </summary>
private static readonly string[] AllowedExceptions = private static readonly string[] AllowedExceptions =
[ [
@@ -46,58 +50,184 @@ public class ScriptCompilationService
} }
/// <summary> /// <summary>
/// Validates that the script source code does not reference forbidden APIs. /// SiteRuntime-011: validates that the script does not reference forbidden APIs.
///
/// Validation is performed with Roslyn semantic analysis rather than a raw substring
/// scan of the source text. The script is parsed and a semantic model is built; every
/// identifier, type reference, member access, and object creation is resolved to its
/// symbol and the symbol's containing namespace is checked against the forbidden list.
///
/// This is reliable in both directions a textual scan was not:
/// - it catches forbidden types regardless of how they are written (<c>global::</c>
/// prefixes, aliases, transitively-imported namespaces) because it inspects the
/// resolved symbol, not the spelling;
/// - it does not raise false positives for the namespace string appearing in a
/// comment, a string literal, or an unrelated identifier.
///
/// Returns a list of violation messages, empty if clean. /// Returns a list of violation messages, empty if clean.
/// </summary> /// </summary>
public IReadOnlyList<string> ValidateTrustModel(string code) public IReadOnlyList<string> ValidateTrustModel(string code)
{ {
var violations = new List<string>(); var tree = CSharpSyntaxTree.ParseText(
var tree = CSharpSyntaxTree.ParseText(code); code, new CSharpParseOptions(kind: SourceCodeKind.Script));
var compilation = CSharpCompilation.CreateScriptCompilation(
"TrustValidation",
tree,
ScriptReferences,
new CSharpCompilationOptions(OutputKind.DynamicallyLinkedLibrary));
var model = compilation.GetSemanticModel(tree);
var root = tree.GetRoot(); var root = tree.GetRoot();
var text = root.ToFullString();
foreach (var ns in ForbiddenNamespaces) // Deduplicate so a forbidden symbol used many times is reported once but
// distinct forbidden symbols are all reported.
var violations = new SortedSet<string>(StringComparer.Ordinal);
foreach (var node in root.DescendantNodes())
{ {
if (text.Contains(ns, StringComparison.Ordinal)) // Only inspect nodes that name a type or member; skip declarations,
// string literals and comments entirely. Member-access and qualified-name
// parents are evaluated as a whole, so their nested name parts are skipped.
if (node is not (SimpleNameSyntax or MemberAccessExpressionSyntax
or QualifiedNameSyntax or ObjectCreationExpressionSyntax))
{ {
// Check if it matches an allowed exception continue;
var isAllowed = AllowedExceptions.Any(allowed =>
text.Contains(allowed, StringComparison.Ordinal) &&
ns != allowed &&
allowed.StartsWith(ns, StringComparison.Ordinal));
// More precise: check each occurrence
var idx = 0;
while ((idx = text.IndexOf(ns, idx, StringComparison.Ordinal)) >= 0)
{
var remainder = text.Substring(idx);
var matchesAllowed = AllowedExceptions.Any(a =>
remainder.StartsWith(a, StringComparison.Ordinal));
if (!matchesAllowed)
{
violations.Add($"Forbidden API reference: '{ns}' at position {idx}");
break;
}
idx += ns.Length;
}
} }
var info = model.GetSymbolInfo(node);
var symbol = info.Symbol ?? info.CandidateSymbols.FirstOrDefault();
// The set of fully-qualified scopes this reference touches: the resolved
// symbol's containing namespace and type, or — when the symbol could not
// be resolved (a type from an unreferenced assembly) — the syntactic
// fully-qualified name written in source as a safe fallback.
var scopes = symbol != null
? GetSymbolScopes(symbol)
: GetSyntacticScopes(node);
if (scopes.Count == 0)
continue;
var forbidden = ForbiddenNamespaces.FirstOrDefault(
f => scopes.Any(s => IsUnderScope(s, f)));
if (forbidden == null)
continue;
// Allow specific exception namespaces/types (async/await, cancellation).
if (scopes.Any(s => AllowedExceptions.Any(a => IsUnderScope(s, a))))
continue;
var name = symbol?.Name ?? node.ToString();
violations.Add($"Forbidden API reference: '{forbidden}' ({scopes[0]}.{name})");
} }
return violations; return violations.ToList();
} }
/// <summary>
/// Returns the fully-qualified scopes a resolved symbol belongs to — its containing
/// namespace and, for a type or member, the fully-qualified containing type. A bare
/// namespace symbol is intentionally ignored: a namespace name on its own performs
/// no action; harm requires referencing a type or a member.
/// </summary>
private static List<string> GetSymbolScopes(ISymbol symbol)
{
var scopes = new List<string>();
switch (symbol)
{
case INamespaceSymbol:
// A namespace reference alone is harmless — skip it. (This avoids a
// false positive on the "System.Threading" qualifier of the allowed
// "System.Threading.Tasks.Task".)
break;
case ITypeSymbol typeSymbol:
scopes.Add(typeSymbol.ToDisplayString());
if (typeSymbol.ContainingNamespace is { IsGlobalNamespace: false } typeNs)
scopes.Add(typeNs.ToDisplayString());
break;
default:
if (symbol.ContainingType != null)
{
scopes.Add(symbol.ContainingType.ToDisplayString());
if (symbol.ContainingType.ContainingNamespace is { IsGlobalNamespace: false } memberNs)
scopes.Add(memberNs.ToDisplayString());
}
else if (symbol.ContainingNamespace is { IsGlobalNamespace: false } ns)
{
scopes.Add(ns.ToDisplayString());
}
break;
}
return scopes;
}
/// <summary>
/// Fallback used when a name could not be resolved to a symbol (e.g. a type from an
/// assembly the script is not allowed to reference). The fully-qualified name as
/// written in source is used directly — a script that names
/// <c>System.Net.Http.HttpClient</c> is still rejected even though that assembly is
/// deliberately absent from the script's metadata references.
/// </summary>
private static List<string> GetSyntacticScopes(SyntaxNode node)
{
// A dotted name written in source is itself the fully-qualified scope. Only
// consider names that actually contain a dot — bare local identifiers cannot
// reach a forbidden namespace.
var text = node switch
{
QualifiedNameSyntax q => q.ToString(),
MemberAccessExpressionSyntax m => m.ToString(),
_ => string.Empty
};
// Strip whitespace/newlines that a multi-line member-access chain may contain.
text = new string(text.Where(c => !char.IsWhiteSpace(c)).ToArray());
return string.IsNullOrEmpty(text) || !text.Contains('.')
? []
: [text];
}
/// <summary>
/// True if <paramref name="actual"/> is exactly, or nested within,
/// <paramref name="root"/> (e.g. "System.IO.Compression" is under "System.IO",
/// "System.Diagnostics.Process" is under "System.Diagnostics.Process").
/// </summary>
private static bool IsUnderScope(string actual, string root)
=> actual.Equals(root, StringComparison.Ordinal)
|| actual.StartsWith(root + ".", StringComparison.Ordinal);
/// <summary>
/// Assemblies referenced by compiled scripts. Shared between the Roslyn scripting
/// options and the semantic-analysis compilation built for trust validation
/// (SiteRuntime-011), so the validator resolves symbols against exactly the same
/// metadata the script is compiled against.
/// </summary>
private static readonly System.Reflection.Assembly[] ScriptAssemblies =
[
typeof(object).Assembly,
typeof(Enumerable).Assembly,
typeof(Math).Assembly,
typeof(Microsoft.CSharp.RuntimeBinder.CSharpArgumentInfo).Assembly,
typeof(Commons.Types.DynamicJsonElement).Assembly
];
/// <summary>
/// Metadata references for the trust-validation semantic compilation.
/// </summary>
private static readonly MetadataReference[] ScriptReferences =
ScriptAssemblies
.Select(a => (MetadataReference)MetadataReference.CreateFromFile(a.Location))
.ToArray();
/// <summary> /// <summary>
/// Shared Roslyn scripting options (references + imports) used by both full /// Shared Roslyn scripting options (references + imports) used by both full
/// script compilation and trigger-expression compilation. /// script compilation and trigger-expression compilation.
/// </summary> /// </summary>
private static ScriptOptions BuildScriptOptions() => ScriptOptions.Default private static ScriptOptions BuildScriptOptions() => ScriptOptions.Default
.WithReferences( .WithReferences(ScriptAssemblies)
typeof(object).Assembly,
typeof(Enumerable).Assembly,
typeof(Math).Assembly,
typeof(Microsoft.CSharp.RuntimeBinder.CSharpArgumentInfo).Assembly,
typeof(Commons.Types.DynamicJsonElement).Assembly)
.WithImports( .WithImports(
"System", "System",
"System.Collections.Generic", "System.Collections.Generic",
@@ -0,0 +1,107 @@
using System.Collections.Concurrent;
namespace ScadaLink.SiteRuntime.Scripts;
/// <summary>
/// SiteRuntime-009: a dedicated, bounded <see cref="TaskScheduler"/> for running script
/// and alarm on-trigger bodies.
///
/// Script bodies may perform synchronous blocking I/O (a database connection, a
/// synchronous external-system call). Running them on the shared .NET
/// <see cref="ThreadPool"/> lets a burst of blocking scripts starve the pool and stall
/// unrelated Akka dispatchers and HTTP request handling. This scheduler owns a fixed set
/// of dedicated threads, so script blocking is contained to those threads and cannot
/// exhaust the global pool.
///
/// The scheduler is process-wide (one set of threads for all instances) and is sized
/// from <see cref="SiteRuntimeOptions"/> the first time it is configured.
/// </summary>
public sealed class ScriptExecutionScheduler : TaskScheduler, IDisposable
{
private readonly BlockingCollection<Task> _queue = new();
private readonly List<Thread> _threads;
private int _disposed;
private static volatile ScriptExecutionScheduler? _shared;
private static readonly object SharedLock = new();
/// <summary>
/// The process-wide script-execution scheduler. Lazily created on first use with the
/// thread count from <see cref="SiteRuntimeOptions.ScriptExecutionThreadCount"/>; the
/// first caller wins, subsequent calls reuse the existing instance.
/// </summary>
public static ScriptExecutionScheduler Shared(SiteRuntimeOptions options)
{
if (_shared != null)
return _shared;
lock (SharedLock)
{
return _shared ??= new ScriptExecutionScheduler(options.ScriptExecutionThreadCount);
}
}
/// <summary>
/// Creates a scheduler backed by <paramref name="threadCount"/> dedicated threads.
/// </summary>
public ScriptExecutionScheduler(int threadCount)
{
if (threadCount < 1)
threadCount = 1;
_threads = new List<Thread>(threadCount);
for (var i = 0; i < threadCount; i++)
{
var thread = new Thread(WorkerLoop)
{
IsBackground = true,
Name = $"script-execution-{i}"
};
_threads.Add(thread);
thread.Start();
}
}
/// <summary>The number of dedicated worker threads.</summary>
public override int MaximumConcurrencyLevel => _threads.Count;
private void WorkerLoop()
{
try
{
foreach (var task in _queue.GetConsumingEnumerable())
{
TryExecuteTask(task);
}
}
catch (ObjectDisposedException)
{
// Scheduler disposed — worker exits.
}
}
protected override void QueueTask(Task task) => _queue.Add(task);
protected override bool TryExecuteTaskInline(Task task, bool taskWasPreviouslyQueued)
{
// Only inline if we are already on one of this scheduler's worker threads,
// so script work never escapes onto a thread-pool thread.
if (Thread.CurrentThread.Name?.StartsWith("script-execution-", StringComparison.Ordinal) != true)
return false;
return TryExecuteTask(task);
}
protected override IEnumerable<Task> GetScheduledTasks() => _queue.ToArray();
public void Dispose()
{
if (Interlocked.Exchange(ref _disposed, 1) != 0)
return;
_queue.CompleteAdding();
foreach (var thread in _threads)
thread.Join(TimeSpan.FromSeconds(5));
_queue.Dispose();
}
}
@@ -36,4 +36,12 @@ public class SiteRuntimeOptions
/// Default: 1000. /// Default: 1000.
/// </summary> /// </summary>
public int StreamBufferSize { get; set; } = 1000; public int StreamBufferSize { get; set; } = 1000;
/// <summary>
/// SiteRuntime-009: number of dedicated threads in the script-execution scheduler.
/// Script and alarm on-trigger bodies run on these threads instead of the shared
/// .NET thread pool, so blocking script I/O cannot starve the global pool.
/// Default: 8.
/// </summary>
public int ScriptExecutionThreadCount { get; set; } = 8;
} }
@@ -36,8 +36,20 @@ public class StoreAndForwardService
private int _retryInProgress; private int _retryInProgress;
/// <summary> /// <summary>
/// WP-10: Delivery handler delegate. Returns true on success, throws on transient failure. /// WP-10: Delivery handler delegate. The return value / exception is interpreted
/// Permanent failures should return false (message will NOT be buffered). /// the same way on both the immediate-delivery path (<see cref="EnqueueAsync"/>)
/// and the background retry path (<c>RetryMessageAsync</c>):
/// <list type="bullet">
/// <item><description><c>true</c> — delivered successfully. The message is
/// removed from the buffer (or, on the immediate path, never buffered).</description></item>
/// <item><description><c>false</c> — permanent failure. On the immediate path
/// the message is NOT buffered; on a retry the message is already buffered and
/// is parked immediately (no further retries).</description></item>
/// <item><description>throws — transient failure. On the immediate path the
/// message is buffered for retry; on a retry the retry count is incremented and
/// the message is parked once <see cref="StoreAndForwardMessage.MaxRetries"/> is
/// reached.</description></item>
/// </list>
/// </summary> /// </summary>
private readonly Dictionary<StoreAndForwardCategory, Func<StoreAndForwardMessage, Task<bool>>> _deliveryHandlers = new(); private readonly Dictionary<StoreAndForwardCategory, Func<StoreAndForwardMessage, Task<bool>>> _deliveryHandlers = new();
@@ -59,7 +71,9 @@ public class StoreAndForwardService
} }
/// <summary> /// <summary>
/// Registers a delivery handler for a given message category. /// Registers a delivery handler for a given message category. See the
/// <c>_deliveryHandlers</c> field documentation for the true/false/throws contract,
/// which applies identically on the immediate and retry paths.
/// </summary> /// </summary>
public void RegisterDeliveryHandler( public void RegisterDeliveryHandler(
StoreAndForwardCategory category, StoreAndForwardCategory category,
@@ -241,11 +255,22 @@ public class StoreAndForwardService
return; return;
} }
// Permanent failure on retry — park immediately // Permanent failure on retry — park immediately.
// StoreAndForward-005: the sweep observed this row as Pending; only commit
// the park if it is still Pending so a concurrent operator action that
// moved it (retry/discard) is not silently overwritten.
message.Status = StoreAndForwardMessageStatus.Parked; message.Status = StoreAndForwardMessageStatus.Parked;
message.LastAttemptAt = DateTimeOffset.UtcNow; message.LastAttemptAt = DateTimeOffset.UtcNow;
message.LastError = "Permanent failure (handler returned false)"; message.LastError = "Permanent failure (handler returned false)";
await _storage.UpdateMessageAsync(message); var parked = await _storage.UpdateMessageIfStatusAsync(
message, StoreAndForwardMessageStatus.Pending);
if (!parked)
{
_logger.LogDebug(
"Message {MessageId} changed status during delivery; sweep park skipped",
message.Id);
return;
}
_replication?.ReplicatePark(message); _replication?.ReplicatePark(message);
RaiseActivity("Parked", message.Category, RaiseActivity("Parked", message.Category,
$"Permanent failure for {message.Target}: handler returned false"); $"Permanent failure for {message.Target}: handler returned false");
@@ -259,8 +284,18 @@ public class StoreAndForwardService
if (message.MaxRetries > 0 && message.RetryCount >= message.MaxRetries) if (message.MaxRetries > 0 && message.RetryCount >= message.MaxRetries)
{ {
// StoreAndForward-005: conditional park — see the permanent-failure
// branch above for rationale.
message.Status = StoreAndForwardMessageStatus.Parked; message.Status = StoreAndForwardMessageStatus.Parked;
await _storage.UpdateMessageAsync(message); var parked = await _storage.UpdateMessageIfStatusAsync(
message, StoreAndForwardMessageStatus.Pending);
if (!parked)
{
_logger.LogDebug(
"Message {MessageId} changed status during delivery; sweep park skipped",
message.Id);
return;
}
_replication?.ReplicatePark(message); _replication?.ReplicatePark(message);
RaiseActivity("Parked", message.Category, RaiseActivity("Parked", message.Category,
$"Max retries ({message.MaxRetries}) reached for {message.Target}"); $"Max retries ({message.MaxRetries}) reached for {message.Target}");
@@ -270,7 +305,17 @@ public class StoreAndForwardService
} }
else else
{ {
await _storage.UpdateMessageAsync(message); // StoreAndForward-005: the retry-count increment is also conditional
// on the row still being Pending so it cannot clobber an operator
// action that ran during the failed delivery.
if (!await _storage.UpdateMessageIfStatusAsync(
message, StoreAndForwardMessageStatus.Pending))
{
_logger.LogDebug(
"Message {MessageId} changed status during delivery; sweep retry-count update skipped",
message.Id);
return;
}
RaiseActivity("Retried", message.Category, RaiseActivity("Retried", message.Category,
$"Retry {message.RetryCount}/{message.MaxRetries} for {message.Target}: {ex.Message}"); $"Retry {message.RetryCount}/{message.MaxRetries} for {message.Target}: {ex.Message}");
} }
@@ -164,6 +164,47 @@ public class StoreAndForwardStorage
await cmd.ExecuteNonQueryAsync(); await cmd.ExecuteNonQueryAsync();
} }
/// <summary>
/// WP-10: Updates a message after a delivery attempt, but only if the row is still
/// in the expected status. Returns true if the row was updated, false if it had
/// already been changed (e.g. an operator retried or discarded the message) and so
/// was skipped.
///
/// StoreAndForward-005: the retry sweep uses this for its state-changing writes so
/// it cannot clobber a concurrent operator action (RetryParkedMessageAsync /
/// DiscardParkedMessageAsync). Those operator operations are themselves SQL-
/// conditional on <c>status = Parked</c>; making the sweep's writes conditional on
/// the status the sweep observed closes the sweep-vs-management race rather than
/// relying only on the in-process overlapping-sweep guard.
/// </summary>
public async Task<bool> UpdateMessageIfStatusAsync(
StoreAndForwardMessage message,
StoreAndForwardMessageStatus expectedStatus)
{
await using var connection = new SqliteConnection(_connectionString);
await connection.OpenAsync();
await using var cmd = connection.CreateCommand();
cmd.CommandText = @"
UPDATE sf_messages
SET retry_count = @retryCount,
last_attempt_at = @lastAttempt,
status = @status,
last_error = @lastError
WHERE id = @id AND status = @expectedStatus";
cmd.Parameters.AddWithValue("@id", message.Id);
cmd.Parameters.AddWithValue("@retryCount", message.RetryCount);
cmd.Parameters.AddWithValue("@lastAttempt", message.LastAttemptAt.HasValue
? message.LastAttemptAt.Value.ToString("O") : DBNull.Value);
cmd.Parameters.AddWithValue("@status", (int)message.Status);
cmd.Parameters.AddWithValue("@lastError", (object?)message.LastError ?? DBNull.Value);
cmd.Parameters.AddWithValue("@expectedStatus", (int)expectedStatus);
var rows = await cmd.ExecuteNonQueryAsync();
return rows > 0;
}
/// <summary> /// <summary>
/// WP-10: Removes a successfully delivered message. /// WP-10: Removes a successfully delivered message.
/// </summary> /// </summary>
@@ -221,6 +262,13 @@ public class StoreAndForwardStorage
/// <summary> /// <summary>
/// WP-12: Moves a parked message back to pending for retry. /// WP-12: Moves a parked message back to pending for retry.
///
/// StoreAndForward-010: <c>last_attempt_at</c> is reset to NULL so the re-queued
/// message is unambiguously due on the next retry sweep. An operator-initiated
/// retry means "attempt this again now"; leaving the stale parked timestamp in
/// place would make the message's retry timing depend on the configured retry
/// interval relative to the original (pre-park) attempt — "try immediately" only
/// by accident, and a long interval would instead delay the operator's retry.
/// </summary> /// </summary>
public async Task<bool> RetryParkedMessageAsync(string messageId) public async Task<bool> RetryParkedMessageAsync(string messageId)
{ {
@@ -230,7 +278,7 @@ public class StoreAndForwardStorage
await using var cmd = connection.CreateCommand(); await using var cmd = connection.CreateCommand();
cmd.CommandText = @" cmd.CommandText = @"
UPDATE sf_messages UPDATE sf_messages
SET status = @pending, retry_count = 0, last_error = NULL SET status = @pending, retry_count = 0, last_error = NULL, last_attempt_at = NULL
WHERE id = @id AND status = @parked"; WHERE id = @id AND status = @parked";
cmd.Parameters.AddWithValue("@id", messageId); cmd.Parameters.AddWithValue("@id", messageId);
@@ -3,6 +3,7 @@ using ScadaLink.Commons.Interfaces.Repositories;
using ScadaLink.Commons.Interfaces.Services; using ScadaLink.Commons.Interfaces.Services;
using ScadaLink.Commons.Types; using ScadaLink.Commons.Types;
using ScadaLink.Commons.Types.Enums; using ScadaLink.Commons.Types.Enums;
using ScadaLink.TemplateEngine;
namespace ScadaLink.TemplateEngine.Services; namespace ScadaLink.TemplateEngine.Services;
@@ -13,7 +14,11 @@ namespace ScadaLink.TemplateEngine.Services;
/// - Override non-locked attribute values /// - Override non-locked attribute values
/// - Cannot add or remove attributes (only override existing ones) /// - Cannot add or remove attributes (only override existing ones)
/// - Per-attribute connection binding (bulk assignment support) /// - Per-attribute connection binding (bulk assignment support)
/// - Enabled/disabled state with optimistic concurrency /// - Enabled/disabled state. Concurrent edits are last-write-wins — there is no
/// version token or conflict detection on instance state, matching the design
/// decision (Component-TemplateEngine.md: "Concurrent editing uses
/// last-write-wins — no pessimistic locking or conflict detection"). Optimistic
/// concurrency in the system applies to deployment status records, not here.
/// - Audit logging /// - Audit logging
/// </summary> /// </summary>
public class InstanceService public class InstanceService
@@ -170,10 +175,11 @@ public class InstanceService
} }
/// <summary> /// <summary>
/// Sets a per-instance alarm override. The alarm must exist on the /// Sets a per-instance alarm override. The alarm must exist in the
/// template and must not be locked. For HiLo alarms, the override JSON /// instance's effective alarm set (direct, inherited, or composed) and
/// merges into the inherited TriggerConfiguration setpoint-by-setpoint; /// must not be locked. For HiLo alarms, the override JSON merges into the
/// for binary trigger types, it replaces the whole config. /// inherited TriggerConfiguration setpoint-by-setpoint; for binary trigger
/// types, it replaces the whole config.
/// </summary> /// </summary>
public async Task<Result<InstanceAlarmOverride>> SetAlarmOverrideAsync( public async Task<Result<InstanceAlarmOverride>> SetAlarmOverrideAsync(
int instanceId, int instanceId,
@@ -187,17 +193,25 @@ public class InstanceService
if (instance == null) if (instance == null)
return Result<InstanceAlarmOverride>.Failure($"Instance with ID {instanceId} not found."); return Result<InstanceAlarmOverride>.Failure($"Instance with ID {instanceId} not found.");
// Verify alarm exists in the template and is not locked. Only direct // Verify the alarm exists in the instance's effective alarm set and is
// template alarms are checked here — composed-member overrides go // not locked. The effective set is resolved via TemplateResolver so that
// through but are silently ignored at runtime if the name doesn't // composed (path-qualified) and inherited alarms are found — a lookup
// match (same behavior as attribute overrides). // against the template's direct alarms alone would miss them, silently
var templateAlarms = await _repository.GetAlarmsByTemplateIdAsync(instance.TemplateId, cancellationToken); // accepting an override for a non-existent name or bypassing the lock
var templateAlarm = templateAlarms.FirstOrDefault(a => a.Name == alarmCanonicalName); // rule for a composed alarm. Mirrors SetAttributeOverrideAsync.
if (templateAlarm != null && templateAlarm.IsLocked) var allTemplates = await _repository.GetAllTemplatesAsync(cancellationToken);
{ var resolvedAlarm = TemplateResolver
.ResolveAllMembers(instance.TemplateId, allTemplates)
.FirstOrDefault(m => m.MemberType == "Alarm" && m.CanonicalName == alarmCanonicalName);
if (resolvedAlarm == null)
return Result<InstanceAlarmOverride>.Failure(
$"Alarm '{alarmCanonicalName}' does not exist in template {instance.TemplateId}. " +
"Cannot override an unknown alarm.");
if (resolvedAlarm.IsLocked)
return Result<InstanceAlarmOverride>.Failure( return Result<InstanceAlarmOverride>.Failure(
$"Alarm '{alarmCanonicalName}' is locked and cannot be overridden."); $"Alarm '{alarmCanonicalName}' is locked and cannot be overridden.");
}
var existingOverride = await _repository.GetAlarmOverrideAsync( var existingOverride = await _repository.GetAlarmOverrideAsync(
instanceId, alarmCanonicalName, cancellationToken); instanceId, alarmCanonicalName, cancellationToken);
@@ -72,16 +72,15 @@ public class TemplateDeletionService
} }
// Check 3: Other templates compose it directly (e.g., pre-Phase-3 data). // Check 3: Other templates compose it directly (e.g., pre-Phase-3 data).
var composingTemplates = new List<(string TemplateName, string InstanceName)>(); // Read the Compositions navigation already loaded by GetAllTemplatesAsync
foreach (var t in allTemplates) // rather than issuing one GetCompositionsByTemplateIdAsync round-trip per
{ // template (TemplateEngine-009) — this is the same source TemplateService
var compositions = await _repository.GetCompositionsByTemplateIdAsync(t.Id, cancellationToken); // .DeleteTemplateAsync uses for the equivalent check.
foreach (var comp in compositions) var composingTemplates = allTemplates
{ .SelectMany(t => t.Compositions
if (comp.ComposedTemplateId == templateId) .Where(comp => comp.ComposedTemplateId == templateId)
composingTemplates.Add((t.Name, comp.InstanceName)); .Select(comp => (TemplateName: t.Name, comp.InstanceName)))
} .ToList();
}
if (composingTemplates.Count > 0) if (composingTemplates.Count > 0)
{ {
@@ -118,7 +118,10 @@ public class SharedScriptService
/// <summary> /// <summary>
/// Basic structural validation of C# script code. /// Basic structural validation of C# script code.
/// Checks for balanced braces and basic syntax structure. /// Checks for balanced braces/brackets/parentheses. The scan is string- and
/// comment-aware (see <see cref="Validation.CSharpDelimiterScanner"/>) so a
/// delimiter inside a regular/verbatim/interpolated/raw string literal, a
/// char literal, or a comment does not produce a false syntax error.
/// Full Roslyn compilation would be added in a later phase when the scripting sandbox is available. /// Full Roslyn compilation would be added in a later phase when the scripting sandbox is available.
/// </summary> /// </summary>
internal static string? ValidateSyntax(string code) internal static string? ValidateSyntax(string code)
@@ -126,38 +129,28 @@ public class SharedScriptService
if (string.IsNullOrWhiteSpace(code)) if (string.IsNullOrWhiteSpace(code))
return "Script code cannot be empty."; return "Script code cannot be empty.";
// Check for balanced braces return Validation.CSharpDelimiterScanner.Scan(code) switch
int braceCount = 0;
int bracketCount = 0;
int parenCount = 0;
foreach (var ch in code)
{ {
switch (ch) Validation.CSharpDelimiterScanner.Mismatch.None => null,
{ Validation.CSharpDelimiterScanner.Mismatch.UnexpectedCloseBrace =>
case '{': braceCount++; break; "Syntax error: unmatched closing brace '}'.",
case '}': braceCount--; break; Validation.CSharpDelimiterScanner.Mismatch.UnclosedBrace =>
case '[': bracketCount++; break; "Syntax error: unmatched opening brace '{'.",
case ']': bracketCount--; break; Validation.CSharpDelimiterScanner.Mismatch.UnexpectedCloseBracket =>
case '(': parenCount++; break; "Syntax error: unmatched closing bracket ']'.",
case ')': parenCount--; break; Validation.CSharpDelimiterScanner.Mismatch.UnclosedBracket =>
} "Syntax error: unmatched opening bracket '['.",
Validation.CSharpDelimiterScanner.Mismatch.UnexpectedCloseParen =>
if (braceCount < 0) "Syntax error: unmatched closing parenthesis ')'.",
return "Syntax error: unmatched closing brace '}'."; Validation.CSharpDelimiterScanner.Mismatch.UnclosedParen =>
if (bracketCount < 0) "Syntax error: unmatched opening parenthesis '('.",
return "Syntax error: unmatched closing bracket ']'."; Validation.CSharpDelimiterScanner.Mismatch.UnclosedBlockComment =>
if (parenCount < 0) "Syntax error: unclosed block comment.",
return "Syntax error: unmatched closing parenthesis ')'."; Validation.CSharpDelimiterScanner.Mismatch.UnterminatedString =>
} "Syntax error: unterminated string literal.",
Validation.CSharpDelimiterScanner.Mismatch.UnterminatedChar =>
if (braceCount != 0) "Syntax error: unterminated character literal.",
return "Syntax error: unmatched opening brace '{'."; _ => null,
if (bracketCount != 0) };
return "Syntax error: unmatched opening bracket '['.";
if (parenCount != 0)
return "Syntax error: unmatched opening parenthesis '('.";
return null;
} }
} }
@@ -0,0 +1,413 @@
namespace ScadaLink.TemplateEngine.Validation;
/// <summary>
/// String/comment-aware scanner for the balanced-delimiter ("does it look like
/// valid C#") checks used by <see cref="ScriptCompiler"/> and
/// <c>SharedScriptService.ValidateSyntax</c>.
///
/// <para>
/// This is <b>not</b> a compiler. It is an interim structural check that walks
/// the source once and tracks <c>{}</c>, <c>[]</c> and <c>()</c> depth while
/// correctly skipping over the C# lexical constructs in which a delimiter is
/// inert: line/block comments, regular string literals (with <c>\</c> escapes),
/// verbatim strings (<c>@"..."</c>, where <c>""</c> escapes a quote and <c>\</c>
/// is literal), interpolated strings (<c>$"..."</c> / <c>$@"..."</c> — the holes
/// <c>{...}</c> are code and <c>{{</c>/<c>}}</c> are escaped braces), raw string
/// literals (<c>"""..."""</c>), and char literals (<c>'}'</c>).
/// </para>
///
/// <para>
/// It is intentionally conservative: when the real Roslyn-based compiler is
/// wired in (see <see cref="ScriptCompiler"/>) this hand-rolled scan should be
/// replaced by <c>CSharpSyntaxTree.ParseText</c> diagnostics. Until then this
/// scanner removes the false positives that a naive character count produced
/// for valid scripts containing a delimiter inside a string or comment.
/// </para>
/// </summary>
internal static class CSharpDelimiterScanner
{
/// <summary>The kind of delimiter mismatch found, if any.</summary>
internal enum Mismatch
{
None,
UnexpectedCloseBrace,
UnexpectedCloseBracket,
UnexpectedCloseParen,
UnclosedBrace,
UnclosedBracket,
UnclosedParen,
UnclosedBlockComment,
UnterminatedString,
UnterminatedChar,
}
/// <summary>
/// Returns true when <paramref name="pattern"/> occurs in a <b>code</b>
/// region of <paramref name="code"/> — i.e. not wholly inside a string
/// literal, char literal, or comment. Used by the interim forbidden-API
/// scan so that the inert text <c>System.IO.</c> in a comment or string
/// literal is not flagged as a forbidden API call (TemplateEngine-006).
///
/// <para>
/// This removes the false-positive half of the substring scan. It does
/// <b>not</b> close the bypass half: namespace aliases, <c>using static</c>,
/// and <c>global::</c>-qualified references still evade a pure text match.
/// Authoritative forbidden-API enforcement requires Roslyn semantic symbol
/// analysis and is deferred to the real script compiler / Site Runtime
/// sandbox; this check is advisory only.
/// </para>
/// </summary>
internal static bool ContainsInCode(string code, string pattern)
{
if (string.IsNullOrEmpty(pattern))
return false;
// Blank out every string/char-literal/comment span, then do an ordinary
// substring search over what remains (the code regions).
var codeOnly = BlankNonCodeSpans(code);
return codeOnly.Contains(pattern, StringComparison.Ordinal);
}
/// <summary>
/// Replaces the content of every comment, string literal, and char literal
/// with spaces (newlines preserved), leaving only code regions intact.
/// Delimiter characters themselves are also blanked so a pattern cannot
/// straddle a literal boundary.
/// </summary>
private static string BlankNonCodeSpans(string code)
{
var buffer = code.ToCharArray();
int n = code.Length;
int i = 0;
void Blank(int from, int to)
{
for (int k = from; k < to && k < n; k++)
if (buffer[k] != '\n' && buffer[k] != '\r')
buffer[k] = ' ';
}
while (i < n)
{
char c = code[i];
char next = i + 1 < n ? code[i + 1] : '\0';
int start = i;
if (c == '/' && next == '/')
{
i += 2;
while (i < n && code[i] != '\n') i++;
Blank(start, i);
continue;
}
if (c == '/' && next == '*')
{
i += 2;
while (i < n && !(code[i] == '*' && i + 1 < n && code[i + 1] == '/')) i++;
if (i < n) i += 2;
Blank(start, i);
continue;
}
if (c == '"' && next == '"' && i + 2 < n && code[i + 2] == '"')
{
SkipRawString(code, ref i);
Blank(start, i);
continue;
}
if (c == '$')
{
int j = i + 1;
bool verbatim = false;
if (j < n && code[j] == '@') { verbatim = true; j++; }
if (j < n && code[j] == '"')
{
i = j;
SkipInterpolatedString(code, ref i, verbatim);
Blank(start, i);
continue;
}
}
if (c == '@' && next == '"')
{
i++;
SkipVerbatimString(code, ref i);
Blank(start, i);
continue;
}
if (c == '"')
{
SkipRegularString(code, ref i);
Blank(start, i);
continue;
}
if (c == '\'')
{
SkipCharLiteral(code, ref i);
Blank(start, i);
continue;
}
i++;
}
return new string(buffer);
}
/// <summary>
/// Walks <paramref name="code"/> once and reports the first structural
/// delimiter problem, or <see cref="Mismatch.None"/> when the source is
/// balanced. Delimiters inside comments, strings, and char literals are
/// ignored.
/// </summary>
internal static Mismatch Scan(string code)
{
int brace = 0, bracket = 0, paren = 0;
int i = 0;
int n = code.Length;
while (i < n)
{
char c = code[i];
char next = i + 1 < n ? code[i + 1] : '\0';
// Line comment.
if (c == '/' && next == '/')
{
i += 2;
while (i < n && code[i] != '\n') i++;
continue;
}
// Block comment.
if (c == '/' && next == '*')
{
i += 2;
bool closed = false;
while (i < n)
{
if (code[i] == '*' && i + 1 < n && code[i + 1] == '/')
{
i += 2;
closed = true;
break;
}
i++;
}
if (!closed) return Mismatch.UnclosedBlockComment;
continue;
}
// Raw string literal: three or more consecutive quotes open it; the
// same number of quotes closes it. Detected before $/@-prefixed and
// plain strings.
if (c == '"' && next == '"' && i + 2 < n && code[i + 2] == '"')
{
if (!SkipRawString(code, ref i)) return Mismatch.UnterminatedString;
continue;
}
// Interpolated string ($"..." or $@"..." / @$"...").
if (c == '$')
{
int j = i + 1;
bool verbatim = false;
if (j < n && code[j] == '@') { verbatim = true; j++; }
if (j < n && code[j] == '"')
{
i = j;
if (!SkipInterpolatedString(code, ref i, verbatim)) return Mismatch.UnterminatedString;
continue;
}
}
// Verbatim string (@"...").
if (c == '@' && next == '"')
{
i++; // now on the opening quote
if (!SkipVerbatimString(code, ref i)) return Mismatch.UnterminatedString;
continue;
}
// Regular string literal.
if (c == '"')
{
if (!SkipRegularString(code, ref i)) return Mismatch.UnterminatedString;
continue;
}
// Char literal.
if (c == '\'')
{
if (!SkipCharLiteral(code, ref i)) return Mismatch.UnterminatedChar;
continue;
}
switch (c)
{
case '{': brace++; break;
case '}':
brace--;
if (brace < 0) return Mismatch.UnexpectedCloseBrace;
break;
case '[': bracket++; break;
case ']':
bracket--;
if (bracket < 0) return Mismatch.UnexpectedCloseBracket;
break;
case '(': paren++; break;
case ')':
paren--;
if (paren < 0) return Mismatch.UnexpectedCloseParen;
break;
}
i++;
}
if (brace != 0) return Mismatch.UnclosedBrace;
if (bracket != 0) return Mismatch.UnclosedBracket;
if (paren != 0) return Mismatch.UnclosedParen;
return Mismatch.None;
}
/// <summary>
/// Advances <paramref name="i"/> past a regular <c>"..."</c> string literal.
/// On entry <c>code[i] == '"'</c>. Returns false if the string is unterminated.
/// </summary>
private static bool SkipRegularString(string code, ref int i)
{
int n = code.Length;
i++; // past opening quote
while (i < n)
{
char c = code[i];
if (c == '\\') { i += 2; continue; } // escape — skip next char
if (c == '\n') return false; // unterminated (no multi-line)
if (c == '"') { i++; return true; }
i++;
}
return false;
}
/// <summary>
/// Advances past a verbatim <c>@"..."</c> string. On entry <c>code[i] == '"'</c>.
/// Inside, <c>\</c> is literal and <c>""</c> is an escaped quote.
/// </summary>
private static bool SkipVerbatimString(string code, ref int i)
{
int n = code.Length;
i++; // past opening quote
while (i < n)
{
if (code[i] == '"')
{
if (i + 1 < n && code[i + 1] == '"') { i += 2; continue; } // escaped quote
i++;
return true;
}
i++;
}
return false;
}
/// <summary>
/// Advances past an interpolated string. <paramref name="verbatim"/> selects
/// the <c>$@"..."</c> escaping rules. Interpolation holes <c>{...}</c> are
/// skipped over (their braces are code, not literal text); <c>{{</c>/<c>}}</c>
/// are escaped braces. On entry <c>code[i] == '"'</c>.
/// </summary>
private static bool SkipInterpolatedString(string code, ref int i, bool verbatim)
{
int n = code.Length;
i++; // past opening quote
while (i < n)
{
char c = code[i];
if (!verbatim && c == '\\') { i += 2; continue; }
if (c == '"')
{
if (verbatim && i + 1 < n && code[i + 1] == '"') { i += 2; continue; }
i++;
return true;
}
if (c == '{')
{
if (i + 1 < n && code[i + 1] == '{') { i += 2; continue; } // escaped brace
// Interpolation hole — skip to the matching '}', tracking nested
// braces so a hole containing an object initializer is handled.
i++;
int depth = 1;
while (i < n && depth > 0)
{
char h = code[i];
if (h == '{') depth++;
else if (h == '}') depth--;
else if (h == '"')
{
// A nested string inside the hole.
if (!SkipRegularString(code, ref i)) return false;
continue;
}
i++;
}
continue;
}
if (c == '}' && i + 1 < n && code[i + 1] == '}') { i += 2; continue; } // escaped brace
i++;
}
return false;
}
/// <summary>
/// Advances past a raw string literal <c>"""..."""</c> (C# 11). On entry
/// <c>code[i]</c> is the first of three or more opening quotes.
/// </summary>
private static bool SkipRawString(string code, ref int i)
{
int n = code.Length;
int openCount = 0;
while (i < n && code[i] == '"') { openCount++; i++; }
// Look for a run of the same number of quotes.
while (i < n)
{
if (code[i] == '"')
{
int closeCount = 0;
int start = i;
while (i < n && code[i] == '"') { closeCount++; i++; }
if (closeCount >= openCount) return true;
// Fewer quotes than the opener — they are literal content; keep scanning.
if (closeCount == 0) i = start + 1;
}
else
{
i++;
}
}
return false;
}
/// <summary>
/// Advances past a <c>'x'</c> char literal. On entry <c>code[i] == '\''</c>.
/// </summary>
private static bool SkipCharLiteral(string code, ref int i)
{
int n = code.Length;
i++; // past opening quote
while (i < n)
{
char c = code[i];
if (c == '\\') { i += 2; continue; }
if (c == '\n') return false;
if (c == '\'') { i++; return true; }
i++;
}
return false;
}
}
@@ -9,6 +9,18 @@ namespace ScadaLink.TemplateEngine.Validation;
/// and enforces the forbidden API list (System.IO, Process, Threading, Reflection, raw network). /// and enforces the forbidden API list (System.IO, Process, Threading, Reflection, raw network).
/// ///
/// For now, this implementation performs basic syntax validation. /// For now, this implementation performs basic syntax validation.
///
/// <para>
/// <b>SECURITY LIMITATION (TemplateEngine-006):</b> the forbidden-API check below
/// is an interim, <i>advisory</i> text scan — it is NOT an authoritative trust-model
/// boundary. <see cref="CSharpDelimiterScanner.ContainsInCode"/> removes the
/// false-positive half (forbidden text inside a string/comment is ignored), but a
/// determined script can still bypass the literal patterns via namespace aliases,
/// <c>using static</c>, or <c>global::</c>-qualified references. Authoritative
/// enforcement requires Roslyn semantic symbol analysis of the referenced
/// types/namespaces and is the responsibility of the real script compiler and the
/// Site Runtime sandbox. Do not rely on this class as the sole trust-model gate.
/// </para>
/// </summary> /// </summary>
public class ScriptCompiler public class ScriptCompiler
{ {
@@ -17,6 +29,13 @@ public class ScriptCompiler
/// <see cref="ValidationService"/>) must not use these. Trigger expressions run /// <see cref="ValidationService"/>) must not use these. Trigger expressions run
/// under the same trust model as scripts, so the list is shared from here rather /// under the same trust model as scripts, so the list is shared from here rather
/// than duplicated. /// than duplicated.
///
/// <para>
/// Matched with <see cref="CSharpDelimiterScanner.ContainsInCode"/> against code
/// regions only. This is advisory — see the class summary's SECURITY LIMITATION
/// note; the substring patterns are bypassable and the authoritative check is
/// deferred to Roslyn semantic analysis.
/// </para>
/// </summary> /// </summary>
internal static readonly string[] ForbiddenPatterns = internal static readonly string[] ForbiddenPatterns =
[ [
@@ -39,10 +58,12 @@ public class ScriptCompiler
if (string.IsNullOrWhiteSpace(code)) if (string.IsNullOrWhiteSpace(code))
return Result<bool>.Failure($"Script '{scriptName}' has empty code."); return Result<bool>.Failure($"Script '{scriptName}' has empty code.");
// Check for forbidden APIs // Check for forbidden APIs. Advisory only (see class summary): the scan is
// code-region-aware so forbidden text inside a string/comment is ignored,
// but it remains a substring match and is not an authoritative boundary.
foreach (var pattern in ForbiddenPatterns) foreach (var pattern in ForbiddenPatterns)
{ {
if (code.Contains(pattern, StringComparison.Ordinal)) if (CSharpDelimiterScanner.ContainsInCode(code, pattern))
{ {
return Result<bool>.Failure( return Result<bool>.Failure(
$"Script '{scriptName}' uses forbidden API: '{pattern.TrimEnd('.')}'. " + $"Script '{scriptName}' uses forbidden API: '{pattern.TrimEnd('.')}'. " +
@@ -50,76 +71,35 @@ public class ScriptCompiler
} }
} }
// Basic brace matching validation // Basic structural validation: balanced braces/brackets/parens. The scan
var braceDepth = 0; // is string- and comment-aware (see CSharpDelimiterScanner) so a delimiter
var inString = false; // inside a regular/verbatim/interpolated/raw string, a char literal, or a
var inLineComment = false; // comment does not produce a false mismatch. This remains an interim check
var inBlockComment = false; // until the Roslyn-based compiler is wired in.
var mismatch = CSharpDelimiterScanner.Scan(code);
for (int i = 0; i < code.Length; i++) return mismatch switch
{ {
var c = code[i]; CSharpDelimiterScanner.Mismatch.None =>
var next = i + 1 < code.Length ? code[i + 1] : '\0'; Result<bool>.Success(true),
CSharpDelimiterScanner.Mismatch.UnexpectedCloseBrace =>
if (inLineComment) Result<bool>.Failure($"Script '{scriptName}' has mismatched braces (unexpected closing brace)."),
{ CSharpDelimiterScanner.Mismatch.UnclosedBrace =>
if (c == '\n') inLineComment = false; Result<bool>.Failure($"Script '{scriptName}' has mismatched braces (unclosed opening brace)."),
continue; CSharpDelimiterScanner.Mismatch.UnexpectedCloseBracket =>
} Result<bool>.Failure($"Script '{scriptName}' has mismatched brackets (unexpected closing bracket)."),
CSharpDelimiterScanner.Mismatch.UnclosedBracket =>
if (inBlockComment) Result<bool>.Failure($"Script '{scriptName}' has mismatched brackets (unclosed opening bracket)."),
{ CSharpDelimiterScanner.Mismatch.UnexpectedCloseParen =>
if (c == '*' && next == '/') Result<bool>.Failure($"Script '{scriptName}' has mismatched parentheses (unexpected closing parenthesis)."),
{ CSharpDelimiterScanner.Mismatch.UnclosedParen =>
inBlockComment = false; Result<bool>.Failure($"Script '{scriptName}' has mismatched parentheses (unclosed opening parenthesis)."),
i++; CSharpDelimiterScanner.Mismatch.UnclosedBlockComment =>
} Result<bool>.Failure($"Script '{scriptName}' has an unclosed block comment."),
continue; CSharpDelimiterScanner.Mismatch.UnterminatedString =>
} Result<bool>.Failure($"Script '{scriptName}' has an unterminated string literal."),
CSharpDelimiterScanner.Mismatch.UnterminatedChar =>
if (c == '/' && next == '/') Result<bool>.Failure($"Script '{scriptName}' has an unterminated character literal."),
{ _ => Result<bool>.Success(true),
inLineComment = true; };
i++;
continue;
}
if (c == '/' && next == '*')
{
inBlockComment = true;
i++;
continue;
}
if (c == '"' && !inString)
{
inString = true;
continue;
}
if (c == '"' && inString)
{
// Check for escaped quote
if (i > 0 && code[i - 1] != '\\')
inString = false;
continue;
}
if (inString) continue;
if (c == '{') braceDepth++;
else if (c == '}') braceDepth--;
if (braceDepth < 0)
return Result<bool>.Failure($"Script '{scriptName}' has mismatched braces (unexpected closing brace).");
}
if (braceDepth != 0)
return Result<bool>.Failure($"Script '{scriptName}' has mismatched braces ({braceDepth} unclosed).");
if (inBlockComment)
return Result<bool>.Failure($"Script '{scriptName}' has an unclosed block comment.");
return Result<bool>.Success(true);
} }
} }
@@ -317,9 +317,12 @@ public class ValidationService
/// </summary> /// </summary>
internal static string? CheckExpressionSyntax(string expression) internal static string? CheckExpressionSyntax(string expression)
{ {
// Advisory forbidden-API scan (TemplateEngine-006): code-region-aware so
// the inert text inside a string/comment is not flagged, but still a
// substring match — not an authoritative boundary. See ScriptCompiler.
foreach (var pattern in ScriptCompiler.ForbiddenPatterns) foreach (var pattern in ScriptCompiler.ForbiddenPatterns)
{ {
if (expression.Contains(pattern, StringComparison.Ordinal)) if (CSharpDelimiterScanner.ContainsInCode(expression, pattern))
{ {
return $"uses forbidden API '{pattern.TrimEnd('.')}'. " + return $"uses forbidden API '{pattern.TrimEnd('.')}'. " +
"Trigger expressions cannot use System.IO, Process, Threading, Reflection, or raw network APIs."; "Trigger expressions cannot use System.IO, Process, Threading, Reflection, or raw network APIs.";
@@ -0,0 +1,86 @@
using Microsoft.Extensions.Logging.Abstractions;
using Microsoft.Extensions.Options;
using ScadaLink.Commons.Messages.RemoteQuery;
namespace ScadaLink.SiteEventLogging.Tests;
/// <summary>
/// Regression tests for SiteEventLogging-010: previously untested behaviours —
/// the query service error path and the recorder's disposed-state semantics.
/// </summary>
public class EventLogCoverageTests : IDisposable
{
private readonly string _dbPath;
public EventLogCoverageTests()
{
_dbPath = Path.Combine(Path.GetTempPath(), $"test_coverage_{Guid.NewGuid()}.db");
}
public void Dispose()
{
if (File.Exists(_dbPath)) File.Delete(_dbPath);
}
private SiteEventLogger NewLogger() => new(
Options.Create(new SiteEventLogOptions { DatabasePath = _dbPath }),
NullLogger<SiteEventLogger>.Instance);
private static EventLogQueryRequest MakeRequest() => new(
CorrelationId: "corr-err",
SiteId: "site-1",
From: null,
To: null,
EventType: null,
Severity: null,
InstanceId: null,
KeywordFilter: null,
ContinuationToken: null,
PageSize: 500,
Timestamp: DateTimeOffset.UtcNow);
[Fact]
public void ExecuteQuery_ReturnsFailureResponse_WhenDatabaseUnavailable()
{
// The catch block in EventLogQueryService.ExecuteQuery was untested.
// Disposing the recorder makes WithConnection throw ObjectDisposedException;
// the query service must convert that into a Success=false response rather
// than letting the exception escape to the actor.
var logger = NewLogger();
var queryService = new EventLogQueryService(
logger,
Options.Create(new SiteEventLogOptions { DatabasePath = _dbPath }),
NullLogger<EventLogQueryService>.Instance);
logger.Dispose();
var response = queryService.ExecuteQuery(MakeRequest());
Assert.False(response.Success);
Assert.NotNull(response.ErrorMessage);
Assert.Empty(response.Entries);
Assert.Null(response.ContinuationToken);
Assert.Equal("corr-err", response.CorrelationId);
}
[Fact]
public async Task LogEventAsync_AfterDispose_CompletesWithoutThrowing()
{
// Logging after disposal must not throw or hang — the event is simply
// dropped because the background writer has been completed.
var logger = NewLogger();
logger.Dispose();
await logger.LogEventAsync("script", "Info", null, "Source", "After dispose")
.WaitAsync(TimeSpan.FromSeconds(5));
}
[Fact]
public void Dispose_IsIdempotent()
{
// Re-entrant / repeated Dispose must be a safe no-op.
var logger = NewLogger();
logger.Dispose();
logger.Dispose();
}
}
@@ -0,0 +1,81 @@
using Akka.Actor;
using Akka.TestKit.Xunit2;
using ScadaLink.Commons.Messages.RemoteQuery;
namespace ScadaLink.SiteEventLogging.Tests;
/// <summary>
/// Regression tests for SiteEventLogging-010: the actor message contract of
/// <see cref="EventLogHandlerActor"/> was previously untested.
/// </summary>
public class EventLogHandlerActorTests : TestKit
{
/// <summary>Test double returning a canned response and recording the request.</summary>
private sealed class FakeQueryService : IEventLogQueryService
{
private readonly Func<EventLogQueryRequest, EventLogQueryResponse> _handler;
public EventLogQueryRequest? LastRequest { get; private set; }
public FakeQueryService(Func<EventLogQueryRequest, EventLogQueryResponse> handler)
=> _handler = handler;
public EventLogQueryResponse ExecuteQuery(EventLogQueryRequest request)
{
LastRequest = request;
return _handler(request);
}
}
private static EventLogQueryRequest MakeRequest(string correlationId) => new(
CorrelationId: correlationId,
SiteId: "site-1",
From: null,
To: null,
EventType: null,
Severity: null,
InstanceId: null,
KeywordFilter: null,
ContinuationToken: null,
PageSize: 500,
Timestamp: DateTimeOffset.UtcNow);
private static EventLogQueryResponse MakeResponse(EventLogQueryRequest req, bool success = true) => new(
CorrelationId: req.CorrelationId,
SiteId: req.SiteId,
Entries: [],
ContinuationToken: null,
HasMore: false,
Success: success,
ErrorMessage: success ? null : "boom",
Timestamp: DateTimeOffset.UtcNow);
[Fact]
public void Actor_RepliesToSender_WithQueryResponse()
{
var fake = new FakeQueryService(req => MakeResponse(req));
var actor = Sys.ActorOf(Props.Create(() => new EventLogHandlerActor(fake)));
var request = MakeRequest("corr-1");
actor.Tell(request, TestActor);
var response = ExpectMsg<EventLogQueryResponse>();
Assert.Equal("corr-1", response.CorrelationId);
Assert.Equal("site-1", response.SiteId);
Assert.True(response.Success);
Assert.Same(request, fake.LastRequest);
}
[Fact]
public void Actor_PropagatesQueryServiceErrorResponse_ToSender()
{
var fake = new FakeQueryService(req => MakeResponse(req, success: false));
var actor = Sys.ActorOf(Props.Create(() => new EventLogHandlerActor(fake)));
actor.Tell(MakeRequest("corr-2"), TestActor);
var response = ExpectMsg<EventLogQueryResponse>();
Assert.False(response.Success);
Assert.Equal("corr-2", response.CorrelationId);
Assert.Equal("boom", response.ErrorMessage);
}
}
@@ -9,8 +9,11 @@
</PropertyGroup> </PropertyGroup>
<ItemGroup> <ItemGroup>
<PackageReference Include="Akka.TestKit.Xunit2" />
<PackageReference Include="coverlet.collector" /> <PackageReference Include="coverlet.collector" />
<PackageReference Include="Microsoft.Data.Sqlite" /> <PackageReference Include="Microsoft.Data.Sqlite" />
<PackageReference Include="Microsoft.Extensions.DependencyInjection" />
<PackageReference Include="Microsoft.Extensions.Logging" />
<PackageReference Include="Microsoft.Extensions.Logging.Abstractions" /> <PackageReference Include="Microsoft.Extensions.Logging.Abstractions" />
<PackageReference Include="Microsoft.Extensions.Options" /> <PackageReference Include="Microsoft.Extensions.Options" />
<PackageReference Include="Microsoft.NET.Test.Sdk" /> <PackageReference Include="Microsoft.NET.Test.Sdk" />
@@ -0,0 +1,71 @@
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Logging;
using Microsoft.Extensions.Options;
namespace ScadaLink.SiteEventLogging.Tests;
/// <summary>
/// Regression tests for SiteEventLogging-007: the purge and query services must not
/// downcast an injected <see cref="ISiteEventLogger"/> to the concrete type. They now
/// depend on the concrete <see cref="SiteEventLogger"/> directly, and DI must resolve
/// the recorder, query service, and purge service to a single shared instance.
/// </summary>
public class ServiceWiringTests : IDisposable
{
private readonly string _dbPath;
private ServiceProvider? _provider;
public ServiceWiringTests()
{
_dbPath = Path.Combine(Path.GetTempPath(), $"test_wiring_{Guid.NewGuid()}.db");
}
public void Dispose()
{
_provider?.Dispose();
if (File.Exists(_dbPath)) File.Delete(_dbPath);
}
[Fact]
public void AddSiteEventLogging_ResolvesAllServices_SharingOneRecorderInstance()
{
var services = new ServiceCollection();
services.AddLogging();
services.Configure<SiteEventLogOptions>(o => o.DatabasePath = _dbPath);
services.AddSiteEventLogging();
_provider = services.BuildServiceProvider();
var recorderViaInterface = _provider.GetRequiredService<ISiteEventLogger>();
var recorderConcrete = _provider.GetRequiredService<SiteEventLogger>();
var queryService = _provider.GetRequiredService<IEventLogQueryService>();
// The interface registration must forward to the concrete singleton — purge
// and query depend on that same instance, so all three share one connection.
Assert.Same(recorderConcrete, recorderViaInterface);
Assert.NotNull(queryService);
// The hosted purge service must also resolve without an InvalidCastException.
var hosted = _provider.GetServices<Microsoft.Extensions.Hosting.IHostedService>();
Assert.Contains(hosted, h => h is EventLogPurgeService);
}
[Fact]
public void PurgeAndQueryServices_AcceptConcreteRecorder_WithoutDowncast()
{
// The services no longer take ISiteEventLogger and downcast it; they take the
// concrete SiteEventLogger. Constructing them directly must compile and work.
var options = Options.Create(new SiteEventLogOptions { DatabasePath = _dbPath });
using var loggerFactory = LoggerFactory.Create(_ => { });
using var recorder = new SiteEventLogger(
options, loggerFactory.CreateLogger<SiteEventLogger>());
var purge = new EventLogPurgeService(
recorder, options, loggerFactory.CreateLogger<EventLogPurgeService>());
var query = new EventLogQueryService(
recorder, options, loggerFactory.CreateLogger<EventLogQueryService>());
Assert.NotNull(purge);
Assert.NotNull(query);
}
}
@@ -0,0 +1,116 @@
using System.Diagnostics;
using Microsoft.Data.Sqlite;
using Microsoft.Extensions.Logging.Abstractions;
using Microsoft.Extensions.Options;
namespace ScadaLink.SiteEventLogging.Tests;
/// <summary>
/// Regression tests for SiteEventLogging-005 (synchronous I/O on caller thread)
/// and SiteEventLogging-008 (write failures silently swallowed).
/// </summary>
public class SiteEventLoggerAsyncTests : IDisposable
{
private readonly SiteEventLogger _logger;
private readonly string _dbPath;
public SiteEventLoggerAsyncTests()
{
_dbPath = Path.Combine(Path.GetTempPath(), $"test_async_{Guid.NewGuid()}.db");
var options = Options.Create(new SiteEventLogOptions { DatabasePath = _dbPath });
_logger = new SiteEventLogger(options, NullLogger<SiteEventLogger>.Instance);
}
public void Dispose()
{
_logger.Dispose();
if (File.Exists(_dbPath)) File.Delete(_dbPath);
}
[Fact]
public async Task LogEventAsync_DoesNotBlockCaller_WhenWriteIsSlow()
{
// SiteEventLogging-005: LogEventAsync must not perform the SQLite write
// inline on the caller's thread. We hold the recorder busy on a long
// database operation from one thread, then time how long it takes the
// caller of LogEventAsync to get control back. If recording is offloaded
// to a background writer, the caller returns essentially immediately even
// though the database is busy. If recording is synchronous (the bug), the
// caller blocks on the write lock for the full busy period.
var busyStarted = new ManualResetEventSlim(false);
var releaseBusy = new ManualResetEventSlim(false);
var busyThread = new Thread(() =>
{
_logger.WithConnection(_ =>
{
busyStarted.Set();
// Hold the connection lock until the test releases it.
releaseBusy.Wait(TimeSpan.FromSeconds(10));
});
});
busyThread.Start();
Assert.True(busyStarted.Wait(TimeSpan.FromSeconds(5)), "Busy thread did not start.");
var sw = Stopwatch.StartNew();
var recordTask = _logger.LogEventAsync("script", "Info", null, "Caller", "Should not block");
sw.Stop();
// The CALL must return quickly even though the database is busy for ~1s.
Assert.True(sw.ElapsedMilliseconds < 500,
$"LogEventAsync blocked the caller for {sw.ElapsedMilliseconds} ms — recording is not offloaded.");
// Release the busy thread; the record must still complete successfully.
releaseBusy.Set();
busyThread.Join(TimeSpan.FromSeconds(10));
await recordTask.WaitAsync(TimeSpan.FromSeconds(10));
}
[Fact]
public async Task LogEventAsync_TaskCompletes_AfterEventIsPersisted()
{
// Awaiting the returned Task must guarantee the row is durably written.
await _logger.LogEventAsync("script", "Info", "inst-1", "Source", "Persisted event");
var count = _logger.WithConnection(connection =>
{
using var cmd = connection.CreateCommand();
cmd.CommandText = "SELECT COUNT(*) FROM site_events";
return (long)cmd.ExecuteScalar()!;
});
Assert.Equal(1, count);
}
[Fact]
public async Task LogEventAsync_FaultsTask_AndCountsFailure_OnWriteError()
{
// SiteEventLogging-008: a write failure must not be silently swallowed.
// The returned Task must fault and the failure counter must increment so
// Health Monitoring can surface a logging outage.
using var failingConnection = new SqliteConnection("Data Source=:memory:");
failingConnection.Open();
var options = Options.Create(new SiteEventLogOptions
{
DatabasePath = Path.Combine(Path.GetTempPath(), $"test_fail_{Guid.NewGuid()}.db")
});
var logger = new SiteEventLogger(options, NullLogger<SiteEventLogger>.Instance);
// Drop the table so every write fails with "no such table".
logger.WithConnection(connection =>
{
using var cmd = connection.CreateCommand();
cmd.CommandText = "DROP TABLE site_events";
cmd.ExecuteNonQuery();
});
await Assert.ThrowsAnyAsync<SqliteException>(() =>
logger.LogEventAsync("script", "Error", null, "Source", "Failing write"));
Assert.True(logger.FailedWriteCount > 0,
"FailedWriteCount must increment when an event write fails.");
logger.Dispose();
}
}
@@ -0,0 +1,248 @@
using Akka.Actor;
using Akka.TestKit.Xunit2;
using Microsoft.Extensions.Logging.Abstractions;
using ScadaLink.Commons.Messages.Deployment;
using ScadaLink.Commons.Types.Enums;
using ScadaLink.Commons.Types.Flattening;
using ScadaLink.SiteRuntime.Actors;
using ScadaLink.SiteRuntime.Persistence;
using ScadaLink.SiteRuntime.Scripts;
using System.Text.Json;
namespace ScadaLink.SiteRuntime.Tests.Actors;
/// <summary>
/// Regression tests for the Medium-severity DeploymentManagerActor findings:
/// SiteRuntime-005 (Success reported before persistence completes) and
/// SiteRuntime-008 (blocking shared-script load on the actor thread).
/// </summary>
public class DeploymentManagerMediumFindingsTests : TestKit, IDisposable
{
private readonly ScriptCompilationService _compilationService;
private readonly SharedScriptLibrary _sharedScriptLibrary;
private readonly string _dbFile;
public DeploymentManagerMediumFindingsTests()
{
_dbFile = Path.Combine(Path.GetTempPath(), $"dm-medium-test-{Guid.NewGuid():N}.db");
_compilationService = new ScriptCompilationService(
NullLogger<ScriptCompilationService>.Instance);
_sharedScriptLibrary = new SharedScriptLibrary(
_compilationService, NullLogger<SharedScriptLibrary>.Instance);
}
void IDisposable.Dispose()
{
Shutdown();
try { File.Delete(_dbFile); } catch { /* cleanup */ }
}
private SiteStorageService NewStorage(string connectionString)
=> new(connectionString, NullLogger<SiteStorageService>.Instance);
private IActorRef CreateDeploymentManager(SiteStorageService storage, IActorRef? dclManager = null)
{
return ActorOf(Props.Create(() => new DeploymentManagerActor(
storage,
_compilationService,
_sharedScriptLibrary,
null,
new SiteRuntimeOptions(),
NullLogger<DeploymentManagerActor>.Instance,
dclManager,
null,
null,
null)));
}
private static string MakeConfigJsonWithConnection(
string instanceName, string endpoint, int failoverRetryCount)
{
var config = new FlattenedConfiguration
{
InstanceUniqueName = instanceName,
Attributes =
[
new ResolvedAttribute { CanonicalName = "TestAttr", Value = "1", DataType = "Int32" }
],
Connections = new Dictionary<string, ConnectionConfig>
{
["Conn1"] = new ConnectionConfig
{
Protocol = "Custom",
ConfigurationJson = $"{{\"endpoint\":\"{endpoint}\"}}",
FailoverRetryCount = failoverRetryCount
}
}
};
return JsonSerializer.Serialize(config);
}
private static string MakeConfigJson(string instanceName)
{
var config = new FlattenedConfiguration
{
InstanceUniqueName = instanceName,
Attributes =
[
new ResolvedAttribute { CanonicalName = "TestAttr", Value = "1", DataType = "Int32" }
]
};
return JsonSerializer.Serialize(config);
}
/// <summary>
/// SiteRuntime-005: when SQLite persistence of the deployed config fails, the
/// Deployment Manager must report <see cref="DeploymentStatus.Failed"/> to central,
/// not <see cref="DeploymentStatus.Success"/>. Reporting Success on a persistence
/// failure silently loses the deployment on the next restart/failover.
/// </summary>
[Fact]
public async Task Deploy_PersistenceFailure_ReportsFailedNotSuccess()
{
// A connection string pointing at an unwritable path makes every storage
// write throw, so StoreDeployedConfigAsync fails.
var badPath = Path.Combine(
Path.GetTempPath(), $"no-such-dir-{Guid.NewGuid():N}", "site.db");
var storage = NewStorage($"Data Source={badPath}");
var actor = CreateDeploymentManager(storage);
await Task.Delay(500); // empty startup
actor.Tell(new DeployInstanceCommand(
"dep-fail", "FailPump", "h1", MakeConfigJson("FailPump"), "admin", DateTimeOffset.UtcNow));
var response = ExpectMsg<DeploymentStatusResponse>(TimeSpan.FromSeconds(10));
Assert.Equal("FailPump", response.InstanceUniqueName);
Assert.Equal(DeploymentStatus.Failed, response.Status);
Assert.False(string.IsNullOrEmpty(response.ErrorMessage));
}
/// <summary>
/// SiteRuntime-005: a successful deployment must still report
/// <see cref="DeploymentStatus.Success"/>, and only after the config row is
/// committed to SQLite (so a restart re-creates the instance).
/// </summary>
[Fact]
public async Task Deploy_Success_ReportsSuccessAndPersistsConfig()
{
var storage = NewStorage($"Data Source={_dbFile}");
await storage.InitializeAsync();
var actor = CreateDeploymentManager(storage);
await Task.Delay(500);
actor.Tell(new DeployInstanceCommand(
"dep-ok", "OkPump", "h1", MakeConfigJson("OkPump"), "admin", DateTimeOffset.UtcNow));
var response = ExpectMsg<DeploymentStatusResponse>(TimeSpan.FromSeconds(10));
Assert.Equal(DeploymentStatus.Success, response.Status);
// By the time Success is reported, the config must be durable.
var configs = await storage.GetAllDeployedConfigsAsync();
Assert.Contains(configs, c => c.InstanceUniqueName == "OkPump");
}
/// <summary>
/// SiteRuntime-010: when a redeployment changes a connection's configuration
/// (here the failover retry count and endpoint), the Deployment Manager must
/// re-issue a <see cref="ScadaLink.Commons.Messages.DataConnection.CreateConnectionCommand"/>
/// so the DCL adopts the new configuration rather than keeping the stale one.
/// </summary>
[Fact]
public async Task EnsureDclConnections_ConnectionConfigChanged_ReissuesCreateCommand()
{
var storage = NewStorage($"Data Source={_dbFile}");
await storage.InitializeAsync();
var dcl = CreateTestProbe();
var actor = CreateDeploymentManager(storage, dcl.Ref);
await Task.Delay(500);
// Initial deploy with one connection.
actor.Tell(new DeployInstanceCommand(
"dep-c1", "ConnPump", "h1",
MakeConfigJsonWithConnection("ConnPump", "opc.tcp://host-a:4840", 3),
"admin", DateTimeOffset.UtcNow));
var firstCreate = dcl.ExpectMsg<ScadaLink.Commons.Messages.DataConnection.CreateConnectionCommand>(
TimeSpan.FromSeconds(5));
Assert.Equal("Conn1", firstCreate.ConnectionName);
Assert.Equal(3, firstCreate.FailoverRetryCount);
ExpectMsg<DeploymentStatusResponse>(TimeSpan.FromSeconds(5));
await Task.Delay(500);
// Redeploy with a CHANGED connection configuration.
actor.Tell(new DeployInstanceCommand(
"dep-c2", "ConnPump", "h2",
MakeConfigJsonWithConnection("ConnPump", "opc.tcp://host-b:4840", 7),
"admin", DateTimeOffset.UtcNow));
// The DCL must receive a fresh create command reflecting the new config.
var secondCreate = dcl.ExpectMsg<ScadaLink.Commons.Messages.DataConnection.CreateConnectionCommand>(
TimeSpan.FromSeconds(10));
Assert.Equal("Conn1", secondCreate.ConnectionName);
Assert.Equal(7, secondCreate.FailoverRetryCount);
}
/// <summary>
/// SiteRuntime-010: an unchanged connection configuration must still be skipped —
/// re-sending an identical create command on every deploy is wasteful.
/// </summary>
[Fact]
public async Task EnsureDclConnections_UnchangedConfig_DoesNotReissueCreateCommand()
{
var storage = NewStorage($"Data Source={_dbFile}");
await storage.InitializeAsync();
var dcl = CreateTestProbe();
var actor = CreateDeploymentManager(storage, dcl.Ref);
await Task.Delay(500);
var json = MakeConfigJsonWithConnection("StablePump", "opc.tcp://host-a:4840", 3);
actor.Tell(new DeployInstanceCommand(
"dep-s1", "StablePump", "h1", json, "admin", DateTimeOffset.UtcNow));
dcl.ExpectMsg<ScadaLink.Commons.Messages.DataConnection.CreateConnectionCommand>(
TimeSpan.FromSeconds(5));
ExpectMsg<DeploymentStatusResponse>(TimeSpan.FromSeconds(5));
await Task.Delay(500);
// Redeploy with the IDENTICAL connection configuration.
actor.Tell(new DeployInstanceCommand(
"dep-s2", "StablePump", "h2", json, "admin", DateTimeOffset.UtcNow));
ExpectMsg<DeploymentStatusResponse>(TimeSpan.FromSeconds(10));
// No further create command for an unchanged connection.
dcl.ExpectNoMsg(TimeSpan.FromMilliseconds(800));
}
/// <summary>
/// SiteRuntime-008: startup must not block the Deployment Manager mailbox on a
/// synchronous shared-script load. With shared scripts present, the actor must
/// still load deployed configs, create Instance Actors, and remain responsive.
/// </summary>
[Fact]
public async Task Startup_WithSharedScripts_LoadsConfigsAndStaysResponsive()
{
var storage = NewStorage($"Data Source={_dbFile}");
await storage.InitializeAsync();
// Several shared scripts to compile during startup.
for (var i = 0; i < 5; i++)
{
await storage.StoreSharedScriptAsync(
$"Shared{i}", "return 1 + 1;", null, null);
}
await storage.StoreDeployedConfigAsync(
"StartupPump", MakeConfigJson("StartupPump"), "d1", "h1", true);
var actor = CreateDeploymentManager(storage);
await Task.Delay(2000);
// The instance loaded at startup must be operable — proves startup completed
// and the actor processed messages after the shared-script load.
actor.Tell(new DeploymentStateQueryRequest("corr-1", "StartupPump", DateTimeOffset.UtcNow));
var response = ExpectMsg<DeploymentStateQueryResponse>(TimeSpan.FromSeconds(5));
Assert.True(response.IsDeployed);
}
}
@@ -0,0 +1,108 @@
using Microsoft.Extensions.Logging.Abstractions;
using ScadaLink.SiteRuntime.Persistence;
using ScadaLink.SiteRuntime.Repositories;
namespace ScadaLink.SiteRuntime.Tests.Repositories;
/// <summary>
/// SiteRuntime-006 / SiteRuntime-007 regression tests for the site-local repositories.
///
/// SiteRuntime-006: the repositories must obtain a SQLite connection through
/// <see cref="SiteStorageService.CreateConnection"/>, not by reading a private field
/// via reflection.
///
/// SiteRuntime-007: the synthetic integer IDs derived from entity names must be stable
/// across process restarts (a freshly-constructed service/repository), so an ID handed
/// to a caller still resolves the same entity later.
/// </summary>
public class SiteRepositoryTests : IDisposable
{
private readonly string _dbFile;
public SiteRepositoryTests()
{
_dbFile = Path.Combine(Path.GetTempPath(), $"site-repo-test-{Guid.NewGuid():N}.db");
}
public void Dispose()
{
try { File.Delete(_dbFile); } catch { /* cleanup */ }
GC.SuppressFinalize(this);
}
private SiteStorageService NewStorage()
=> new($"Data Source={_dbFile}", NullLogger<SiteStorageService>.Instance);
/// <summary>
/// SiteRuntime-006: an external system stored via <see cref="SiteStorageService"/>
/// can be read back through the repository — proving the repository's connection
/// (now obtained from <see cref="SiteStorageService.CreateConnection"/>) is valid.
/// </summary>
[Fact]
public async Task ExternalSystemRepository_RoundTripsStoredDefinition()
{
var storage = NewStorage();
await storage.InitializeAsync();
await storage.StoreExternalSystemAsync(
"WeatherApi", "https://api.example.com", "ApiKey", "{\"key\":\"x\"}", null);
var repo = new SiteExternalSystemRepository(storage);
var all = await repo.GetAllExternalSystemsAsync();
Assert.Single(all);
Assert.Equal("WeatherApi", all[0].Name);
Assert.Equal("https://api.example.com", all[0].EndpointUrl);
}
/// <summary>
/// SiteRuntime-007: the synthetic ID for an external system must be identical when
/// the storage service and repository are re-created (simulating a process restart).
/// With the old <see cref="string.GetHashCode()"/> the ID was randomized per process
/// and a by-ID lookup after a restart would fail.
/// </summary>
[Fact]
public async Task ExternalSystemRepository_SyntheticId_IsStableAcrossRestart()
{
var storage1 = NewStorage();
await storage1.InitializeAsync();
await storage1.StoreExternalSystemAsync(
"StableSystem", "https://x", "None", null, null);
var repo1 = new SiteExternalSystemRepository(storage1);
var idBeforeRestart = (await repo1.GetAllExternalSystemsAsync())[0].Id;
// Simulate a process restart — brand-new service + repository instances.
var storage2 = NewStorage();
var repo2 = new SiteExternalSystemRepository(storage2);
var idAfterRestart = (await repo2.GetAllExternalSystemsAsync())[0].Id;
Assert.Equal(idBeforeRestart, idAfterRestart);
// And the by-ID lookup must succeed using the pre-restart ID.
var found = await repo2.GetExternalSystemByIdAsync(idBeforeRestart);
Assert.NotNull(found);
Assert.Equal("StableSystem", found.Name);
}
/// <summary>
/// SiteRuntime-007: the same stability guarantee for notification lists.
/// </summary>
[Fact]
public async Task NotificationRepository_SyntheticId_IsStableAcrossRestart()
{
var storage1 = NewStorage();
await storage1.InitializeAsync();
await storage1.StoreNotificationListAsync(
"OnCall", new[] { "a@example.com", "b@example.com" });
var repo1 = new SiteNotificationRepository(storage1);
var idBeforeRestart = (await repo1.GetAllNotificationListsAsync())[0].Id;
var storage2 = NewStorage();
var repo2 = new SiteNotificationRepository(storage2);
var found = await repo2.GetNotificationListByIdAsync(idBeforeRestart);
Assert.NotNull(found);
Assert.Equal("OnCall", found.Name);
}
}
@@ -0,0 +1,47 @@
using ScadaLink.SiteRuntime;
using ScadaLink.SiteRuntime.Scripts;
namespace ScadaLink.SiteRuntime.Tests.Scripts;
/// <summary>
/// SiteRuntime-009: the dedicated script-execution scheduler must run script bodies on
/// its own dedicated threads, not on the shared .NET thread pool, so blocking script
/// I/O cannot starve the global pool.
/// </summary>
public class ScriptExecutionSchedulerTests
{
[Fact]
public async Task Scheduler_RunsWork_OffTheThreadPool()
{
using var scheduler = new ScriptExecutionScheduler(2);
bool wasThreadPoolThread = true;
string? threadName = null;
await Task.Factory.StartNew(() =>
{
wasThreadPoolThread = Thread.CurrentThread.IsThreadPoolThread;
threadName = Thread.CurrentThread.Name;
}, CancellationToken.None, TaskCreationOptions.DenyChildAttach, scheduler);
Assert.False(wasThreadPoolThread,
"Script work must not run on a shared thread-pool thread.");
Assert.StartsWith("script-execution-", threadName);
}
[Fact]
public void Scheduler_RespectsConfiguredThreadCount()
{
using var scheduler = new ScriptExecutionScheduler(4);
Assert.Equal(4, scheduler.MaximumConcurrencyLevel);
}
[Fact]
public void Scheduler_Shared_ReturnsSameInstanceForOptions()
{
var options = new SiteRuntimeOptions { ScriptExecutionThreadCount = 3 };
var a = ScriptExecutionScheduler.Shared(options);
var b = ScriptExecutionScheduler.Shared(options);
Assert.Same(a, b);
}
}
@@ -0,0 +1,92 @@
using Microsoft.Extensions.Logging.Abstractions;
using ScadaLink.SiteRuntime.Scripts;
namespace ScadaLink.SiteRuntime.Tests.Scripts;
/// <summary>
/// SiteRuntime-011: regression tests for the semantic-analysis trust-model validation.
/// The previous implementation was a raw substring scan of the source text — it both
/// missed forbidden APIs (no literal namespace string) and raised false positives on
/// the namespace string appearing in comments, string literals or unrelated identifiers.
/// </summary>
public class TrustModelSemanticTests
{
private readonly ScriptCompilationService _service =
new(NullLogger<ScriptCompilationService>.Instance);
// ── Bypass cases (under-inclusive substring scan would MISS these) ──
[Fact]
public void TrustModel_GlobalQualifiedForbiddenType_IsDetected()
{
// `global::`-prefixed name — the literal "System.IO" substring is still present
// here, but the resolved-symbol approach catches it regardless of spelling.
var violations = _service.ValidateTrustModel(
"global::System.IO.File.ReadAllText(\"/etc/passwd\")");
Assert.NotEmpty(violations);
}
[Fact]
public void TrustModel_ForbiddenTypeViaUsingAlias_IsDetected()
{
// A using-alias hides the forbidden namespace from a substring scan entirely:
// the script body never writes "System.IO". Semantic resolution still sees that
// the alias resolves to System.IO.File.
var code = """
using F = System.IO.File;
F.ReadAllText("/etc/passwd");
""";
var violations = _service.ValidateTrustModel(code);
Assert.NotEmpty(violations);
Assert.Contains(violations, v => v.Contains("System.IO"));
}
// ── False-positive cases (over-inclusive substring scan would WRONGLY flag these) ──
[Fact]
public void TrustModel_ForbiddenNamespaceInStringLiteral_IsNotFlagged()
{
// "System.IO" appears only inside a string literal — not an API reference.
var violations = _service.ValidateTrustModel(
"var label = \"System.IO is blocked\"; return label;");
Assert.Empty(violations);
}
[Fact]
public void TrustModel_ForbiddenNamespaceInComment_IsNotFlagged()
{
var code = """
// This script does not use System.IO or System.Reflection at all.
var x = 1 + 2;
return x;
""";
var violations = _service.ValidateTrustModel(code);
Assert.Empty(violations);
}
[Fact]
public void TrustModel_UnrelatedIdentifierContainingForbiddenSubstring_IsNotFlagged()
{
// A local variable whose name merely contains "Threading" is harmless.
var code = """
var ProcessThreadingCount = 5;
return ProcessThreadingCount + 1;
""";
var violations = _service.ValidateTrustModel(code);
Assert.Empty(violations);
}
// ── Allowed exceptions still resolve correctly ──
[Fact]
public void TrustModel_TaskAndCancellationToken_RemainAllowed()
{
var code = """
var cts = new System.Threading.CancellationTokenSource();
await System.Threading.Tasks.Task.Delay(1, cts.Token);
return 0;
""";
var violations = _service.ValidateTrustModel(code);
Assert.Empty(violations);
}
}
@@ -0,0 +1,173 @@
using Akka.Actor;
using Akka.TestKit.Xunit2;
using Microsoft.Data.Sqlite;
using Microsoft.Extensions.Logging.Abstractions;
using ScadaLink.Commons.Messages.RemoteQuery;
using ScadaLink.Commons.Types.Enums;
namespace ScadaLink.StoreAndForward.Tests;
/// <summary>
/// StoreAndForward-013: tests for the <see cref="ParkedMessageHandlerActor"/> actor
/// bridge — the Query/Retry/Discard request-to-response mapping and the
/// <c>ExtractMethodName</c> payload-JSON parsing (including the malformed-JSON branch).
/// </summary>
public class ParkedMessageHandlerActorTests : TestKit, IAsyncLifetime, IDisposable
{
private readonly SqliteConnection _keepAlive;
private readonly StoreAndForwardStorage _storage;
private readonly StoreAndForwardService _service;
public ParkedMessageHandlerActorTests()
{
var connStr = $"Data Source=ActorTests_{Guid.NewGuid():N};Mode=Memory;Cache=Shared";
_keepAlive = new SqliteConnection(connStr);
_keepAlive.Open();
_storage = new StoreAndForwardStorage(connStr, NullLogger<StoreAndForwardStorage>.Instance);
var options = new StoreAndForwardOptions
{
DefaultRetryInterval = TimeSpan.Zero,
DefaultMaxRetries = 1,
RetryTimerInterval = TimeSpan.FromMinutes(10),
ReplicationEnabled = false,
};
_service = new StoreAndForwardService(
_storage, options, NullLogger<StoreAndForwardService>.Instance);
}
public async Task InitializeAsync() => await _storage.InitializeAsync();
public Task DisposeAsync() => Task.CompletedTask;
protected override void Dispose(bool disposing)
{
if (disposing) _keepAlive.Dispose();
base.Dispose(disposing);
}
/// <summary>Enqueues a message and parks it via the retry sweep (MaxRetries = 1).</summary>
private async Task<string> ParkMessageAsync(StoreAndForwardCategory category, string payloadJson)
{
_service.RegisterDeliveryHandler(category, _ => throw new HttpRequestException("always fails"));
var result = await _service.EnqueueAsync(category, "target", payloadJson, maxRetries: 1);
await _service.RetryPendingMessagesAsync();
return result.MessageId;
}
[Fact]
public async Task Query_ReturnsParkedEntries_WithExtractedMethodName()
{
await ParkMessageAsync(StoreAndForwardCategory.ExternalSystem,
"""{"MethodName":"StartPump","Args":{}}""");
var actor = Sys.ActorOf(Props.Create(() => new ParkedMessageHandlerActor(_service, "site-1")));
actor.Tell(new ParkedMessageQueryRequest("corr-1", "site-1", 1, 50, DateTimeOffset.UtcNow));
var response = ExpectMsg<ParkedMessageQueryResponse>();
Assert.True(response.Success);
Assert.Equal("corr-1", response.CorrelationId);
Assert.Equal("site-1", response.SiteId);
Assert.Single(response.Messages);
Assert.Equal("StartPump", response.Messages[0].MethodName);
Assert.Equal(StoreAndForwardCategory.ExternalSystem, response.Messages[0].Category);
}
[Fact]
public async Task Query_NotificationPayload_UsesSubjectAsMethodName()
{
await ParkMessageAsync(StoreAndForwardCategory.Notification,
"""{"Subject":"Tank overflow","Body":"..."}""");
var actor = Sys.ActorOf(Props.Create(() => new ParkedMessageHandlerActor(_service, "site-1")));
actor.Tell(new ParkedMessageQueryRequest("corr-2", "site-1", 1, 50, DateTimeOffset.UtcNow));
var response = ExpectMsg<ParkedMessageQueryResponse>();
Assert.True(response.Success);
Assert.Equal("Tank overflow", response.Messages[0].MethodName);
}
[Fact]
public async Task Query_MalformedJsonPayload_FallsBackToCategoryName()
{
await ParkMessageAsync(StoreAndForwardCategory.CachedDbWrite, "not-valid-json{");
var actor = Sys.ActorOf(Props.Create(() => new ParkedMessageHandlerActor(_service, "site-1")));
actor.Tell(new ParkedMessageQueryRequest("corr-3", "site-1", 1, 50, DateTimeOffset.UtcNow));
var response = ExpectMsg<ParkedMessageQueryResponse>();
Assert.True(response.Success);
// Malformed JSON must not throw — ExtractMethodName falls back to the category.
Assert.Equal(StoreAndForwardCategory.CachedDbWrite.ToString(), response.Messages[0].MethodName);
}
[Fact]
public async Task Query_PayloadWithoutMethodNameOrSubject_FallsBackToCategoryName()
{
await ParkMessageAsync(StoreAndForwardCategory.ExternalSystem, """{"Unrelated":"value"}""");
var actor = Sys.ActorOf(Props.Create(() => new ParkedMessageHandlerActor(_service, "site-1")));
actor.Tell(new ParkedMessageQueryRequest("corr-4", "site-1", 1, 50, DateTimeOffset.UtcNow));
var response = ExpectMsg<ParkedMessageQueryResponse>();
Assert.Equal(StoreAndForwardCategory.ExternalSystem.ToString(), response.Messages[0].MethodName);
}
[Fact]
public async Task Retry_ParkedMessage_ReturnsSuccess()
{
var messageId = await ParkMessageAsync(StoreAndForwardCategory.ExternalSystem, """{}""");
var actor = Sys.ActorOf(Props.Create(() => new ParkedMessageHandlerActor(_service, "site-1")));
actor.Tell(new ParkedMessageRetryRequest("corr-5", "site-1", messageId, DateTimeOffset.UtcNow));
var response = ExpectMsg<ParkedMessageRetryResponse>();
Assert.True(response.Success);
Assert.Equal("corr-5", response.CorrelationId);
Assert.Null(response.ErrorMessage);
var msg = await _storage.GetMessageByIdAsync(messageId);
Assert.Equal(StoreAndForwardMessageStatus.Pending, msg!.Status);
}
[Fact]
public void Retry_UnknownMessage_ReturnsFailureWithMessage()
{
var actor = Sys.ActorOf(Props.Create(() => new ParkedMessageHandlerActor(_service, "site-1")));
actor.Tell(new ParkedMessageRetryRequest("corr-6", "site-1", "does-not-exist", DateTimeOffset.UtcNow));
var response = ExpectMsg<ParkedMessageRetryResponse>();
Assert.False(response.Success);
Assert.Equal("corr-6", response.CorrelationId);
Assert.NotNull(response.ErrorMessage);
}
[Fact]
public async Task Discard_ParkedMessage_ReturnsSuccessAndRemovesMessage()
{
var messageId = await ParkMessageAsync(StoreAndForwardCategory.ExternalSystem, """{}""");
var actor = Sys.ActorOf(Props.Create(() => new ParkedMessageHandlerActor(_service, "site-1")));
actor.Tell(new ParkedMessageDiscardRequest("corr-7", "site-1", messageId, DateTimeOffset.UtcNow));
var response = ExpectMsg<ParkedMessageDiscardResponse>();
Assert.True(response.Success);
Assert.Equal("corr-7", response.CorrelationId);
var msg = await _storage.GetMessageByIdAsync(messageId);
Assert.Null(msg);
}
[Fact]
public void Discard_UnknownMessage_ReturnsFailureWithMessage()
{
var actor = Sys.ActorOf(Props.Create(() => new ParkedMessageHandlerActor(_service, "site-1")));
actor.Tell(new ParkedMessageDiscardRequest("corr-8", "site-1", "does-not-exist", DateTimeOffset.UtcNow));
var response = ExpectMsg<ParkedMessageDiscardResponse>();
Assert.False(response.Success);
Assert.NotNull(response.ErrorMessage);
}
}
@@ -9,6 +9,7 @@
</PropertyGroup> </PropertyGroup>
<ItemGroup> <ItemGroup>
<PackageReference Include="Akka.TestKit.Xunit2" />
<PackageReference Include="coverlet.collector" /> <PackageReference Include="coverlet.collector" />
<PackageReference Include="Microsoft.Data.Sqlite" /> <PackageReference Include="Microsoft.Data.Sqlite" />
<PackageReference Include="Microsoft.NET.Test.Sdk" /> <PackageReference Include="Microsoft.NET.Test.Sdk" />
@@ -177,6 +177,49 @@ public class StoreAndForwardServiceTests : IAsyncLifetime, IDisposable
Assert.Equal(1, msg.RetryCount); // one sweep retry recorded Assert.Equal(1, msg.RetryCount); // one sweep retry recorded
} }
// ── StoreAndForward-005: sweep-vs-management race hardening ──
[Fact]
public async Task RetryMessageAsync_StatusChangedDuringDelivery_SweepParkWriteIsSkipped()
{
// StoreAndForward-005: the retry sweep's state-changing writes must be
// conditional on the status it observed, so a concurrent operator action that
// moved the row out of Pending (e.g. between the sweep's snapshot load and its
// park write) is not silently overwritten by the sweep's stale view.
var result = await _service.EnqueueAsync(
StoreAndForwardCategory.ExternalSystem, "api", """{}""",
attemptImmediateDelivery: false, maxRetries: 1);
_service.RegisterDeliveryHandler(StoreAndForwardCategory.ExternalSystem,
async msg =>
{
// Simulate an operator action winning the race: the row leaves Pending
// (here: parked) while the sweep is still mid-delivery. The sweep would
// otherwise unconditionally re-write this row from its stale snapshot.
var parkedOutFromUnderTheSweep = new StoreAndForwardMessage
{
Id = msg.Id, Category = msg.Category, Target = msg.Target,
PayloadJson = msg.PayloadJson, RetryCount = 7,
MaxRetries = msg.MaxRetries, RetryIntervalMs = msg.RetryIntervalMs,
CreatedAt = msg.CreatedAt, LastAttemptAt = DateTimeOffset.UtcNow,
Status = StoreAndForwardMessageStatus.Parked,
LastError = "operator/other writer"
};
await _storage.UpdateMessageAsync(parkedOutFromUnderTheSweep);
throw new HttpRequestException("transient — sweep will try to park");
});
await _service.RetryPendingMessagesAsync();
// The sweep observed Pending; the row is now Parked with the other writer's
// RetryCount (7), not the sweep's (1). The sweep's conditional write was skipped.
var msg = await _storage.GetMessageByIdAsync(result.MessageId);
Assert.NotNull(msg);
Assert.Equal(StoreAndForwardMessageStatus.Parked, msg!.Status);
Assert.Equal(7, msg.RetryCount);
Assert.Equal("operator/other writer", msg.LastError);
}
[Fact] [Fact]
public async Task RetryPendingMessagesAsync_PermanentFailureOnRetry_ParksMessage() public async Task RetryPendingMessagesAsync_PermanentFailureOnRetry_ParksMessage()
{ {
@@ -106,6 +106,35 @@ public class StoreAndForwardStorageTests : IAsyncLifetime, IDisposable
Assert.All(forRetry, m => Assert.Equal(StoreAndForwardMessageStatus.Pending, m.Status)); Assert.All(forRetry, m => Assert.Equal(StoreAndForwardMessageStatus.Pending, m.Status));
} }
[Fact]
public async Task GetMessagesForRetryAsync_NonZeroInterval_ExcludesNotYetDueIncludesDue()
{
// StoreAndForward-013: exercise the julianday elapsed-time comparison with a
// non-zero retry interval. A message attempted just now must NOT be due; one
// attempted long ago must be due.
var notDue = CreateMessage("notdue", StoreAndForwardCategory.ExternalSystem);
notDue.RetryIntervalMs = (long)TimeSpan.FromHours(1).TotalMilliseconds;
notDue.LastAttemptAt = DateTimeOffset.UtcNow;
await _storage.EnqueueAsync(notDue);
var due = CreateMessage("due", StoreAndForwardCategory.ExternalSystem);
due.RetryIntervalMs = (long)TimeSpan.FromMinutes(5).TotalMilliseconds;
due.LastAttemptAt = DateTimeOffset.UtcNow.AddHours(-2);
await _storage.EnqueueAsync(due);
var neverAttempted = CreateMessage("never", StoreAndForwardCategory.ExternalSystem);
neverAttempted.RetryIntervalMs = (long)TimeSpan.FromHours(1).TotalMilliseconds;
neverAttempted.LastAttemptAt = null;
await _storage.EnqueueAsync(neverAttempted);
var forRetry = await _storage.GetMessagesForRetryAsync();
var ids = forRetry.Select(m => m.Id).ToHashSet();
Assert.DoesNotContain("notdue", ids);
Assert.Contains("due", ids);
Assert.Contains("never", ids);
}
[Fact] [Fact]
public async Task GetParkedMessagesAsync_ReturnsParkedOnly() public async Task GetParkedMessagesAsync_ReturnsParkedOnly()
{ {
@@ -136,6 +165,31 @@ public class StoreAndForwardStorageTests : IAsyncLifetime, IDisposable
Assert.Equal(0, retrieved.RetryCount); Assert.Equal(0, retrieved.RetryCount);
} }
[Fact]
public async Task RetryParkedMessageAsync_ClearsLastAttemptAt_SoMessageIsImmediatelyDue()
{
// StoreAndForward-010: a re-queued parked message must be unambiguously due
// for the next sweep regardless of its (stale) last_attempt_at. Use a large
// retry interval so a leftover timestamp would otherwise exclude the message.
var msg = CreateMessage("requeue1", StoreAndForwardCategory.ExternalSystem);
msg.RetryIntervalMs = (long)TimeSpan.FromHours(1).TotalMilliseconds;
msg.LastAttemptAt = DateTimeOffset.UtcNow; // recent attempt
msg.Status = StoreAndForwardMessageStatus.Parked;
await _storage.EnqueueAsync(msg);
await _storage.UpdateMessageAsync(msg);
var requeued = await _storage.RetryParkedMessageAsync("requeue1");
Assert.True(requeued);
var retrieved = await _storage.GetMessageByIdAsync("requeue1");
Assert.Null(retrieved!.LastAttemptAt);
// It must appear in the retry-due set even though the configured interval
// (1 hour) has not elapsed since the original attempt.
var due = await _storage.GetMessagesForRetryAsync();
Assert.Contains(due, m => m.Id == "requeue1");
}
[Fact] [Fact]
public async Task DiscardParkedMessageAsync_RemovesMessage() public async Task DiscardParkedMessageAsync_RemovesMessage()
{ {
@@ -119,6 +119,106 @@ public class InstanceServiceTests
Assert.Equal("99", result.Value.OverrideValue); Assert.Equal("99", result.Value.OverrideValue);
} }
// --- TemplateEngine-008 regression: SetAlarmOverrideAsync validation ---
private static Template TemplateWithAlarms(int id, params TemplateAlarm[] alarms)
{
var t = new Template($"T{id}") { Id = id };
foreach (var a in alarms)
{
a.TemplateId = id;
t.Alarms.Add(a);
}
return t;
}
[Fact]
public async Task SetAlarmOverride_NonExistentAlarm_ReturnsFailure()
{
var instance = new Instance("Inst1") { Id = 1, TemplateId = 1 };
_repoMock.Setup(r => r.GetInstanceByIdAsync(1, It.IsAny<CancellationToken>()))
.ReturnsAsync(instance);
_repoMock.Setup(r => r.GetAllTemplatesAsync(It.IsAny<CancellationToken>()))
.ReturnsAsync(new List<Template>
{
TemplateWithAlarms(1, new TemplateAlarm("HighTemp") { Id = 10 })
});
var result = await _sut.SetAlarmOverrideAsync(1, "Missing", "{}", null, "admin");
Assert.True(result.IsFailure);
Assert.Contains("does not exist", result.Error);
_repoMock.Verify(r => r.AddInstanceAlarmOverrideAsync(
It.IsAny<InstanceAlarmOverride>(), It.IsAny<CancellationToken>()), Times.Never);
}
[Fact]
public async Task SetAlarmOverride_ComposedLockedAlarm_ReturnsFailure()
{
// The locked alarm lives in a composed module, so it is NOT a direct
// alarm of the instance's template — the old code skipped the lock check.
var instance = new Instance("Inst1") { Id = 1, TemplateId = 1 };
_repoMock.Setup(r => r.GetInstanceByIdAsync(1, It.IsAny<CancellationToken>()))
.ReturnsAsync(instance);
var module = TemplateWithAlarms(2, new TemplateAlarm("Fault") { Id = 20, IsLocked = true });
var host = new Template("Host") { Id = 1 };
host.Compositions.Add(new TemplateComposition("Pump") { Id = 1, ComposedTemplateId = 2 });
_repoMock.Setup(r => r.GetAllTemplatesAsync(It.IsAny<CancellationToken>()))
.ReturnsAsync(new List<Template> { host, module });
var result = await _sut.SetAlarmOverrideAsync(1, "Pump.Fault", "{}", null, "admin");
Assert.True(result.IsFailure);
Assert.Contains("locked", result.Error, StringComparison.OrdinalIgnoreCase);
_repoMock.Verify(r => r.AddInstanceAlarmOverrideAsync(
It.IsAny<InstanceAlarmOverride>(), It.IsAny<CancellationToken>()), Times.Never);
}
[Fact]
public async Task SetAlarmOverride_ComposedUnlockedAlarm_ReturnsSuccess()
{
var instance = new Instance("Inst1") { Id = 1, TemplateId = 1 };
_repoMock.Setup(r => r.GetInstanceByIdAsync(1, It.IsAny<CancellationToken>()))
.ReturnsAsync(instance);
var module = TemplateWithAlarms(2, new TemplateAlarm("Fault") { Id = 20, IsLocked = false });
var host = new Template("Host") { Id = 1 };
host.Compositions.Add(new TemplateComposition("Pump") { Id = 1, ComposedTemplateId = 2 });
_repoMock.Setup(r => r.GetAllTemplatesAsync(It.IsAny<CancellationToken>()))
.ReturnsAsync(new List<Template> { host, module });
_repoMock.Setup(r => r.GetAlarmOverrideAsync(1, "Pump.Fault", It.IsAny<CancellationToken>()))
.ReturnsAsync((InstanceAlarmOverride?)null);
_repoMock.Setup(r => r.SaveChangesAsync(It.IsAny<CancellationToken>()))
.ReturnsAsync(1);
var result = await _sut.SetAlarmOverrideAsync(1, "Pump.Fault", "{}", 2, "admin");
Assert.True(result.IsSuccess);
_repoMock.Verify(r => r.AddInstanceAlarmOverrideAsync(
It.IsAny<InstanceAlarmOverride>(), It.IsAny<CancellationToken>()), Times.Once);
}
[Fact]
public async Task SetAlarmOverride_DirectLockedAlarm_ReturnsFailure()
{
var instance = new Instance("Inst1") { Id = 1, TemplateId = 1 };
_repoMock.Setup(r => r.GetInstanceByIdAsync(1, It.IsAny<CancellationToken>()))
.ReturnsAsync(instance);
_repoMock.Setup(r => r.GetAllTemplatesAsync(It.IsAny<CancellationToken>()))
.ReturnsAsync(new List<Template>
{
TemplateWithAlarms(1, new TemplateAlarm("HighTemp") { Id = 10, IsLocked = true })
});
var result = await _sut.SetAlarmOverrideAsync(1, "HighTemp", "{}", null, "admin");
Assert.True(result.IsFailure);
Assert.Contains("locked", result.Error, StringComparison.OrdinalIgnoreCase);
}
[Fact] [Fact]
public async Task Enable_ExistingInstance_SetsEnabled() public async Task Enable_ExistingInstance_SetsEnabled()
{ {
@@ -82,6 +82,9 @@ public class TemplateDeletionServiceTests
[Fact] [Fact]
public async Task CanDeleteTemplate_ComposedByOthers_ReturnsFailure() public async Task CanDeleteTemplate_ComposedByOthers_ReturnsFailure()
{ {
var composer = new Template("Composer") { Id = 2 };
composer.Compositions.Add(new TemplateComposition("PumpModule") { ComposedTemplateId = 1 });
_repoMock.Setup(r => r.GetTemplateByIdAsync(1, It.IsAny<CancellationToken>())) _repoMock.Setup(r => r.GetTemplateByIdAsync(1, It.IsAny<CancellationToken>()))
.ReturnsAsync(new Template("Module") { Id = 1 }); .ReturnsAsync(new Template("Module") { Id = 1 });
_repoMock.Setup(r => r.GetInstancesByTemplateIdAsync(1, It.IsAny<CancellationToken>())) _repoMock.Setup(r => r.GetInstancesByTemplateIdAsync(1, It.IsAny<CancellationToken>()))
@@ -90,15 +93,8 @@ public class TemplateDeletionServiceTests
.ReturnsAsync(new List<Template> .ReturnsAsync(new List<Template>
{ {
new("Module") { Id = 1 }, new("Module") { Id = 1 },
new("Composer") { Id = 2 } composer
}); });
_repoMock.Setup(r => r.GetCompositionsByTemplateIdAsync(2, It.IsAny<CancellationToken>()))
.ReturnsAsync(new List<TemplateComposition>
{
new("PumpModule") { ComposedTemplateId = 1 }
});
_repoMock.Setup(r => r.GetCompositionsByTemplateIdAsync(1, It.IsAny<CancellationToken>()))
.ReturnsAsync(new List<TemplateComposition>());
var result = await _sut.CanDeleteTemplateAsync(1); var result = await _sut.CanDeleteTemplateAsync(1);
@@ -107,6 +103,34 @@ public class TemplateDeletionServiceTests
Assert.Contains("Composer", result.Error); Assert.Contains("Composer", result.Error);
} }
[Fact]
public async Task CanDeleteTemplate_DoesNotIssuePerTemplateCompositionQuery()
{
// TemplateEngine-009: Check 3 must read the Compositions navigation
// already loaded by GetAllTemplatesAsync rather than issuing one
// GetCompositionsByTemplateIdAsync round-trip per template.
var templates = new List<Template>
{
new("Module") { Id = 1 },
new("A") { Id = 2 },
new("B") { Id = 3 },
new("C") { Id = 4 },
};
_repoMock.Setup(r => r.GetTemplateByIdAsync(1, It.IsAny<CancellationToken>()))
.ReturnsAsync(new Template("Module") { Id = 1 });
_repoMock.Setup(r => r.GetInstancesByTemplateIdAsync(1, It.IsAny<CancellationToken>()))
.ReturnsAsync(new List<Instance>());
_repoMock.Setup(r => r.GetAllTemplatesAsync(It.IsAny<CancellationToken>()))
.ReturnsAsync(templates);
var result = await _sut.CanDeleteTemplateAsync(1);
Assert.True(result.IsSuccess);
_repoMock.Verify(r => r.GetCompositionsByTemplateIdAsync(
It.IsAny<int>(), It.IsAny<CancellationToken>()), Times.Never);
}
[Fact] [Fact]
public async Task CanDeleteTemplate_NotFound_ReturnsFailure() public async Task CanDeleteTemplate_NotFound_ReturnsFailure()
{ {
@@ -146,22 +170,15 @@ public class TemplateDeletionServiceTests
.ReturnsAsync(new Template("Busy") { Id = 1 }); .ReturnsAsync(new Template("Busy") { Id = 1 });
_repoMock.Setup(r => r.GetInstancesByTemplateIdAsync(1, It.IsAny<CancellationToken>())) _repoMock.Setup(r => r.GetInstancesByTemplateIdAsync(1, It.IsAny<CancellationToken>()))
.ReturnsAsync(new List<Instance> { new("Inst1") { Id = 1 } }); .ReturnsAsync(new List<Instance> { new("Inst1") { Id = 1 } });
var composer = new Template("Composer") { Id = 3 };
composer.Compositions.Add(new TemplateComposition("Module") { ComposedTemplateId = 1 });
_repoMock.Setup(r => r.GetAllTemplatesAsync(It.IsAny<CancellationToken>())) _repoMock.Setup(r => r.GetAllTemplatesAsync(It.IsAny<CancellationToken>()))
.ReturnsAsync(new List<Template> .ReturnsAsync(new List<Template>
{ {
new("Busy") { Id = 1 }, new("Busy") { Id = 1 },
new("Child") { Id = 2, ParentTemplateId = 1 }, new("Child") { Id = 2, ParentTemplateId = 1 },
new("Composer") { Id = 3 } composer
}); });
_repoMock.Setup(r => r.GetCompositionsByTemplateIdAsync(3, It.IsAny<CancellationToken>()))
.ReturnsAsync(new List<TemplateComposition>
{
new("Module") { ComposedTemplateId = 1 }
});
_repoMock.Setup(r => r.GetCompositionsByTemplateIdAsync(1, It.IsAny<CancellationToken>()))
.ReturnsAsync(new List<TemplateComposition>());
_repoMock.Setup(r => r.GetCompositionsByTemplateIdAsync(2, It.IsAny<CancellationToken>()))
.ReturnsAsync(new List<TemplateComposition>());
var result = await _sut.CanDeleteTemplateAsync(1); var result = await _sut.CanDeleteTemplateAsync(1);
@@ -141,4 +141,19 @@ public class SharedScriptServiceTests
Assert.NotNull(result); Assert.NotNull(result);
Assert.Contains("Syntax error", result); Assert.Contains("Syntax error", result);
} }
// --- TemplateEngine-007 regression: string/comment-literal awareness ---
[Theory]
[InlineData("var s = \"a } brace\"; { }")] // brace inside a normal string
[InlineData("var s = \"a ) paren ] bracket\";")] // paren/bracket inside a string
[InlineData("var s = @\"verbatim } brace\"; { }")] // brace inside a verbatim string
[InlineData("var x = 1; var s = $\"hole {x} literal}}\"; { }")] // interpolated string with braces
[InlineData("var c = '}'; if (true) { }")] // char literal containing a brace
[InlineData("// a stray } here\nvar x = 1;")] // brace inside a line comment
[InlineData("/* a stray ) here */ var x = 1;")] // paren inside a block comment
public void ValidateSyntax_DelimiterInsideStringOrComment_ReturnsNull(string code)
{
Assert.Null(SharedScriptService.ValidateSyntax(code));
}
} }
@@ -71,4 +71,85 @@ public class ScriptCompilerTests
var result = _sut.TryCompile("/* { } */ var x = 1;", "Test"); var result = _sut.TryCompile("/* { } */ var x = 1;", "Test");
Assert.True(result.IsSuccess); Assert.True(result.IsSuccess);
} }
// --- TemplateEngine-007 regression: string-literal awareness ---
[Fact]
public void TryCompile_VerbatimStringWithBrace_NotFlaggedAsMismatched()
{
// @"..." — backslash is literal, "" is the escape. The closing brace
// inside the verbatim string must not affect the brace balance.
var result = _sut.TryCompile("var s = @\"a brace } and a \\ slash\"; if (true) { }", "Test");
Assert.True(result.IsSuccess);
}
[Fact]
public void TryCompile_VerbatimStringWithEscapedQuote_NotFlaggedAsMismatched()
{
// The "" inside a verbatim string is an escaped quote, not a string end.
var result = _sut.TryCompile("var s = @\"he said \"\"hi}\"\"\"; { }", "Test");
Assert.True(result.IsSuccess);
}
[Fact]
public void TryCompile_InterpolatedStringWithBraces_NotFlaggedAsMismatched()
{
// The braces in $"{x}" are interpolation holes; the literal "}}" is an
// escaped brace. Neither should unbalance the real braces.
var result = _sut.TryCompile("var x = 1; var s = $\"val={x} literal}}\"; if (x>0) { x++; }", "Test");
Assert.True(result.IsSuccess);
}
[Fact]
public void TryCompile_RawStringLiteralWithBraces_NotFlaggedAsMismatched()
{
// C# 11 raw string literal — the triple quotes delimit, braces inside are text.
var result = _sut.TryCompile("var s = \"\"\"a } brace { in raw\"\"\"; { }", "Test");
Assert.True(result.IsSuccess);
}
[Fact]
public void TryCompile_CharLiteralWithBrace_NotFlaggedAsMismatched()
{
// A '}' char literal must not decrement the brace depth.
var result = _sut.TryCompile("var c = '}'; if (true) { }", "Test");
Assert.True(result.IsSuccess);
}
[Fact]
public void TryCompile_GenuineMismatchedBraces_StillDetected()
{
// Sanity check that the string-aware scan still catches real mismatches.
var result = _sut.TryCompile("var s = \"ok\"; if (true) { x++;", "Test");
Assert.True(result.IsFailure);
Assert.Contains("braces", result.Error, StringComparison.OrdinalIgnoreCase);
}
// --- TemplateEngine-006 regression: forbidden-API scan false positives ---
[Fact]
public void TryCompile_ForbiddenApiTextInsideStringLiteral_NotFlagged()
{
// "System.IO." appears only inside a string literal — it is inert text,
// not a use of the forbidden API, and must not be rejected.
var result = _sut.TryCompile("var msg = \"see System.IO.File docs\"; var x = 1;", "Test");
Assert.True(result.IsSuccess);
}
[Fact]
public void TryCompile_ForbiddenApiTextInsideComment_NotFlagged()
{
// "System.Threading." appears only inside a comment — inert.
var result = _sut.TryCompile("// avoid System.Threading.Thread here\nvar x = 1;", "Test");
Assert.True(result.IsSuccess);
}
[Fact]
public void TryCompile_ForbiddenApiInRealCode_StillFlagged()
{
// Sanity check: a genuine use in code is still rejected.
var result = _sut.TryCompile("var x = System.IO.File.ReadAllText(\"a\");", "Test");
Assert.True(result.IsFailure);
Assert.Contains("forbidden", result.Error, StringComparison.OrdinalIgnoreCase);
}
} }