2 Commits

Author SHA1 Message Date
Joseph Doherty e3ca9af1be fix(transport): Overwrite resolution now syncs child collections (2 findings)
Transport-001: template Overwrite now diff-and-merges the bundle's
Attributes / Alarms / Scripts onto the target template via three private
helpers (SyncTemplateAttributesAsync / SyncTemplateAlarmsAsync /
SyncTemplateScriptsAsync). Each helper emits one audit row per detected
add / update / delete and feeds the post-merge state into the existing
ResolveAlarmScriptLinks and ResolveCompositionEdges passes.

Transport-002: external-system Overwrite now syncs the Methods collection
via a parallel SyncExternalSystemMethodsAsync helper mirroring the T-001
shape, with ExternalSystemMethodAdded / Updated / Deleted audit rows.

Both fixes are covered by new integration tests in BundleImporterApplyTests.
README regenerated — open findings dropped from 146 to 136; all 10 open
High findings are now closed (0 Critical, 0 High, 46 Medium, 90 Low
remaining).
2026-05-28 05:54:03 -04:00
Joseph Doherty ac96b83b08 fix(high-severity): close 9 of 10 open High findings across 8 modules
Comm-016: delete dead HandleConnectionStateChanged + _debugSubscriptions /
_inProgressDeployments tracking + ConnectionStateChanged message record.
Disconnect detection is owned by the transport layers (gRPC keepalive PING
~25s; Ask-timeout at CommunicationService). Updates the
Component-Communication.md design doc to make that explicit.

SnF-018: NotificationForwarder.DeliverAsync now discards a corrupt buffered
payload (Warning log + return true) instead of returning false and parking
the row — honoring the design's "notifications do not park" invariant.

DM-018: reconciliation no longer force-sets Enabled, preserving an
intentional Disabled state after central failover.

ESG-018: DeliverBufferedAsync (both ExternalSystemClient + DatabaseGateway)
catches JsonException and returns false, turning a corrupt buffered row
into a parked operation instead of a retry-forever poison message.

InboundAPI-022: register ActiveNodeGate as IActiveNodeGate in the Central
DI branch so standby-node gating is actually wired up in production.

NS-019: remove orphaned NotificationDeliveryService /
INotificationDeliveryService / NotificationResult; central notification
delivery now lives entirely in NotificationOutbox.

SEL-016: normalise From/To filters to UTC before ISO-string compare so
non-UTC DateTimeOffset clients no longer get spuriously excluded events.

TE-017: include Description on attributes/alarms and a HashableConnections
projection (protocol, endpoint JSON, failover count) in the revision hash
and DiffService; staleness detection now catches description-only and
connection-endpoint edits.

Transport-001 and Transport-002 (also High) remain Open — they're being
handled in a follow-up batch because both touch BundleImporter.cs and
must serialise.
2026-05-28 05:40:15 -04:00
42 changed files with 1610 additions and 1764 deletions
+17 -1
View File
@@ -774,9 +774,25 @@ than being masked by an endpoint-agnostic mock.
|--|--|
| Severity | High |
| Category | Design-document adherence |
| Status | Open |
| Status | Resolved |
| Location | `src/ScadaLink.Communication/Actors/CentralCommunicationActor.cs:169`, `src/ScadaLink.Communication/Actors/CentralCommunicationActor.cs:338-375` |
**Resolution** — deleted the dead code path in favour of the keepalive-based
detection that is the actual production behaviour: removed the
`Receive<ConnectionStateChanged>` handler, the `HandleConnectionStateChanged`
method, the `_debugSubscriptions` / `_inProgressDeployments` tracking dicts
+ the `TrackMessageForCleanup` helper that fed them, and the dead message
record `src/ScadaLink.Commons/Messages/Communication/ConnectionStateChanged.cs`.
The two dead tests (`ConnectionLost_DebugStreamsKilled` in
CentralCommunicationActorTests, `RoundTrip_ConnectionStateChanged_Succeeds`
in CompatibilityTests) were removed alongside. The design doc
`docs/requirements/Component-Communication.md` "Connection Failure Behavior"
section was updated to state explicitly that disconnect is detected at the
transport layer (gRPC keepalive PING ~25 s for debug streams; Ask-timeout
at the CommunicationService layer for command/control), with no
application-level signal. `DebugStreamTerminated` survives because
`DebugStreamBridgeActor` uses it for an unrelated intra-actor stop signal.
**Description**
`CentralCommunicationActor.HandleConnectionStateChanged` is wired to
+3 -1
View File
@@ -912,9 +912,11 @@ would be meaningless).
|--|--|
| Severity | High |
| Category | Correctness & logic bugs |
| Status | Open |
| Status | Resolved |
| Location | `src/ScadaLink.DeploymentManager/DeploymentService.cs:675-682,721-748` |
**Resolution** — Added a `forceEnabledState` parameter to `ApplyPostSuccessSideEffectsAsync`. The normal deploy path passes `true` (fresh apply legitimately ends in `Enabled`); the reconciliation path passes `false`, so the helper only promotes `NotDeployed → Enabled` and leaves an existing `Disabled` (or `Enabled`) untouched. Regression test `DeployInstanceAsync_Reconciled_DisabledInstance_PreservesDisabledState` exercises the failover scenario and asserts the prior record still flips to `Success` while `Instance.State` stays `Disabled`.
**Description**
`TryReconcileWithSiteAsync` calls `ApplyPostSuccessSideEffectsAsync` whenever
@@ -1003,9 +1003,11 @@ captured request URI has no trailing `?`; it was verified to fail before the fix
|--|--|
| Severity | High |
| Category | Error handling & resilience |
| Status | Open |
| Status | Resolved |
| Location | `src/ScadaLink.ExternalSystemGateway/ExternalSystemClient.cs:176`, `src/ScadaLink.ExternalSystemGateway/DatabaseGateway.cs:151` |
**Resolution** — Wrapped the `JsonSerializer.Deserialize<...>(message.PayloadJson)` call in both `ExternalSystemClient.DeliverBufferedAsync` and `DatabaseGateway.DeliverBufferedAsync` in a `try`/`catch (JsonException)` block. A `JsonException` is by definition permanent (the same payload bytes always deserialize identically), so the catch branch logs at `LogError` and returns `false`, parking the message via the S&F engine instead of letting it throw and be retried as a transient failure. Regression tests `DeliverBuffered_MalformedJsonPayload_ReturnsFalseSoMessageParks` were added to both `ExternalSystemClientTests` and `DatabaseGatewayTests` — each feeds a truncated `PayloadJson` to the handler and asserts `delivered == false` and that no exception escapes.
**Description**
Both `ExternalSystemClient.DeliverBufferedAsync` and `DatabaseGateway.DeliverBufferedAsync`
+3 -1
View File
@@ -1061,9 +1061,11 @@ that an attribute read/write carries the inherited `ParentExecutionId`.
|--|--|
| Severity | High |
| Category | Security |
| Status | Open |
| Status | Resolved |
| Location | `src/ScadaLink.InboundAPI/IActiveNodeGate.cs`, `src/ScadaLink.InboundAPI/InboundApiEndpointFilter.cs:52-60`; absent from `src/ScadaLink.Host/Program.cs` |
**Resolution** — Added `src/ScadaLink.Host/Health/ActiveNodeGate.cs`, a production `IActiveNodeGate` implementation backed by `AkkaHostedService` that mirrors `ActiveNodeHealthCheck`'s leadership probe (member status `Up` AND `Cluster.State.Leader == SelfAddress`), and registered it as a singleton in the central-role branch of `Program.cs`. A structural regression test (`CentralCompositionRootTests.Central_IActiveNodeGate_IsRegisteredAsActiveNodeGate`) reflects over the built `IServiceProvider` to assert the registration's existence and concrete type — failing on `main` and passing after the fix. The `InboundApiEndpointFilter`'s fall-through-to-allow behaviour is retained as the documented safe default for non-clustered hosts and tests.
**Description**
InboundAPI-008's resolution adds `IActiveNodeGate` (lines 1724 of
+3 -1
View File
@@ -647,9 +647,11 @@ Resolved 2026-05-17. All three issues confirmed against source. The hand-rolled
|--|--|
| Severity | High |
| Category | Design-document adherence |
| Status | Open |
| Status | Resolved |
| Location | `src/ScadaLink.NotificationService/NotificationDeliveryService.cs:18-442`, `src/ScadaLink.NotificationService/ServiceCollectionExtensions.cs:20-21`, `src/ScadaLink.Commons/Interfaces/Services/INotificationDeliveryService.cs:1-33`, `src/ScadaLink.Host/Program.cs:77` |
**Resolution** — Executed option 1. Deleted `src/ScadaLink.NotificationService/NotificationDeliveryService.cs`, `src/ScadaLink.Commons/Interfaces/Services/INotificationDeliveryService.cs` (also retires `NotificationResult` + `BufferedNotification`), and the orphaned `tests/ScadaLink.NotificationService.Tests/NotificationDeliveryServiceTests.cs` suite; reduced `AddNotificationService` to the shared SMTP primitives (`OAuth2TokenService`, `Func<ISmtpClientWrapper>`, `NotificationOptions`), updated `CompositionRootTests` (assert the primitives instead of the dead types), and removed the `Notification_Send_MockSmtp_Delivers` assertion in `IntegrationSurfaceTests` (central delivery is covered by `EmailNotificationDeliveryAdapterTests`). Grep-verified `grep -rn "INotificationDeliveryService\|NotificationDeliveryService\|NotificationResult\|BufferedNotification\|DeliverBufferedAsync" --include="*.cs" src/ tests/` before delete: zero production callers (only XML-doc cross-references in NS, MailKit wrapper, NotificationOptions and `EmailNotificationDeliveryAdapter`, plus the dead test files); cross-reference comments updated to remove the stale class references. `dotnet build ScadaLink.slnx` succeeds (0 warnings, 0 errors); affected test projects all pass (`NotificationService.Tests` 52/52, `NotificationOutbox.Tests` 86/86 on rerun — one flaky timing-sensitive Akka.TestKit test unrelated to NS-019, `Host.Tests` 205/205); `IntegrationTests` 64/66 with two pre-existing failures in `NotificationOutboxFlowTests` (SQLite "near IF: syntax error", reproducible on pristine `main`, unrelated to NS-019).
**Description**
The updated `Component-NotificationService.md` (re-read in full at this commit) makes the new design unambiguous: "The Notification Service is the central component that manages notification-list and SMTP definitions and provides the per-type delivery adapters used to send notifications. … Notification delivery has been inverted: a site script's notification is store-and-forwarded to the central cluster, and the central **Notification Outbox** owns dispatch and delivery, calling an `INotificationDeliveryAdapter` supplied by this component." The doc explicitly states the service is "central cluster only", "no longer present at site clusters", and "no longer delivers notifications from sites".
+13 -24
View File
@@ -40,10 +40,10 @@ module file and counted in **Total**.
| Severity | Open findings |
|----------|---------------|
| Critical | 0 |
| High | 10 |
| High | 0 |
| Medium | 46 |
| Low | 90 |
| **Total** | **146** |
| **Total** | **136** |
## Module Status
@@ -54,24 +54,24 @@ module file and counted in **Total**.
| [CentralUI](CentralUI/findings.md) | 2026-05-28 | `1eb6e97` | 0/0/2/5 | 7 | 33 |
| [ClusterInfrastructure](ClusterInfrastructure/findings.md) | 2026-05-28 | `1eb6e97` | 0/0/0/4 | 4 | 14 |
| [Commons](Commons/findings.md) | 2026-05-28 | `1eb6e97` | 0/0/3/6 | 9 | 23 |
| [Communication](Communication/findings.md) | 2026-05-28 | `1eb6e97` | 0/1/1/5 | 7 | 22 |
| [Communication](Communication/findings.md) | 2026-05-28 | `1eb6e97` | 0/0/1/5 | 6 | 22 |
| [ConfigurationDatabase](ConfigurationDatabase/findings.md) | 2026-05-28 | `1eb6e97` | 0/0/4/5 | 9 | 24 |
| [DataConnectionLayer](DataConnectionLayer/findings.md) | 2026-05-28 | `1eb6e97` | 0/0/0/0 | 0 | 22 |
| [DeploymentManager](DeploymentManager/findings.md) | 2026-05-28 | `1eb6e97` | 0/1/1/5 | 7 | 24 |
| [ExternalSystemGateway](ExternalSystemGateway/findings.md) | 2026-05-28 | `1eb6e97` | 0/1/2/3 | 6 | 23 |
| [DeploymentManager](DeploymentManager/findings.md) | 2026-05-28 | `1eb6e97` | 0/0/1/5 | 6 | 24 |
| [ExternalSystemGateway](ExternalSystemGateway/findings.md) | 2026-05-28 | `1eb6e97` | 0/0/2/3 | 5 | 23 |
| [HealthMonitoring](HealthMonitoring/findings.md) | 2026-05-28 | `1eb6e97` | 0/0/2/5 | 7 | 23 |
| [Host](Host/findings.md) | 2026-05-28 | `1eb6e97` | 0/0/2/5 | 7 | 22 |
| [InboundAPI](InboundAPI/findings.md) | 2026-05-28 | `1eb6e97` | 0/1/3/4 | 8 | 25 |
| [InboundAPI](InboundAPI/findings.md) | 2026-05-28 | `1eb6e97` | 0/0/3/4 | 7 | 25 |
| [ManagementService](ManagementService/findings.md) | 2026-05-28 | `1eb6e97` | 0/0/2/2 | 4 | 23 |
| [NotificationOutbox](NotificationOutbox/findings.md) | 2026-05-28 | `1eb6e97` | 0/0/2/3 | 5 | 10 |
| [NotificationService](NotificationService/findings.md) | 2026-05-28 | `1eb6e97` | 0/1/2/3 | 6 | 25 |
| [NotificationService](NotificationService/findings.md) | 2026-05-28 | `1eb6e97` | 0/0/2/3 | 5 | 25 |
| [Security](Security/findings.md) | 2026-05-28 | `1eb6e97` | 0/0/0/2 | 2 | 21 |
| [SiteCallAudit](SiteCallAudit/findings.md) | 2026-05-28 | `1eb6e97` | 0/0/2/4 | 6 | 6 |
| [SiteEventLogging](SiteEventLogging/findings.md) | 2026-05-28 | `1eb6e97` | 0/1/2/6 | 9 | 23 |
| [SiteEventLogging](SiteEventLogging/findings.md) | 2026-05-28 | `1eb6e97` | 0/0/2/6 | 8 | 23 |
| [SiteRuntime](SiteRuntime/findings.md) | 2026-05-28 | `1eb6e97` | 0/0/2/3 | 5 | 26 |
| [StoreAndForward](StoreAndForward/findings.md) | 2026-05-28 | `1eb6e97` | 0/1/3/3 | 7 | 24 |
| [TemplateEngine](TemplateEngine/findings.md) | 2026-05-28 | `1eb6e97` | 0/1/4/1 | 6 | 22 |
| [Transport](Transport/findings.md) | 2026-05-28 | `1eb6e97` | 0/2/2/4 | 8 | 12 |
| [StoreAndForward](StoreAndForward/findings.md) | 2026-05-28 | `1eb6e97` | 0/0/3/3 | 6 | 24 |
| [TemplateEngine](TemplateEngine/findings.md) | 2026-05-28 | `1eb6e97` | 0/0/4/1 | 5 | 22 |
| [Transport](Transport/findings.md) | 2026-05-28 | `1eb6e97` | 0/0/2/4 | 6 | 12 |
## Pending Findings
@@ -84,20 +84,9 @@ description, location, recommendation — lives in the module's `findings.md`.
_None open._
### High (10)
### High (0)
| ID | Module | Title |
|----|--------|-------|
| Communication-016 | [Communication](Communication/findings.md) | `HandleConnectionStateChanged` is dead code — the documented disconnect-cleanup workflow never fires |
| DeploymentManager-018 | [DeploymentManager](DeploymentManager/findings.md) | Reconciliation force-sets `Enabled`, overwriting an intentional `Disabled` after central failover |
| ExternalSystemGateway-018 | [ExternalSystemGateway](ExternalSystemGateway/findings.md) | `DeliverBufferedAsync` lets `JsonException` propagate, turning a corrupt buffered row into a permanent retry-forever poison message |
| InboundAPI-022 | [InboundAPI](InboundAPI/findings.md) | `IActiveNodeGate` has no production registration in Host — standby-node gating is silently disabled in production |
| NotificationService-019 | [NotificationService](NotificationService/findings.md) | `NotificationDeliveryService` and `INotificationDeliveryService` are orphaned by the central-only redesign |
| SiteEventLogging-016 | [SiteEventLogging](SiteEventLogging/findings.md) | `From`/`To` filters compare non-normalised ISO 8601 strings against UTC-stored timestamps |
| StoreAndForward-018 | [StoreAndForward](StoreAndForward/findings.md) | Notification corrupt-payload parks the buffered message, contradicting the "notifications do not park" design invariant |
| TemplateEngine-017 | [TemplateEngine](TemplateEngine/findings.md) | Revision hash and diff both ignore `Description` and `Connections`, defeating staleness detection for real deployment changes |
| Transport-001 | [Transport](Transport/findings.md) | Template Overwrite never syncs attributes / alarms / scripts |
| Transport-002 | [Transport](Transport/findings.md) | ExternalSystem Overwrite never syncs methods |
_None open._
### Medium (46)
+3 -1
View File
@@ -793,9 +793,11 @@ chosen policy on `ISiteEventLogger.LogEventAsync`.
|--|--|
| Severity | High |
| Category | Correctness & logic bugs |
| Status | Open |
| Status | Resolved |
| Location | `src/ScadaLink.SiteEventLogging/EventLogQueryService.cs:67-77`, `src/ScadaLink.SiteEventLogging/SiteEventLogger.cs:159`, `src/ScadaLink.SiteEventLogging/EventLogPurgeService.cs:72-78` |
**Resolution** — `EventLogQueryService.ExecuteQuery` now calls `.ToUniversalTime()` on `request.From`/`request.To` before `ToString("o")`, so the produced ISO 8601 string always ends in `+00:00` and lexicographically matches the UTC timestamps written by `SiteEventLogger`. `EventLogPurgeService.PurgeByRetention` was also made defensive with an explicit `.ToUniversalTime()` on the cutoff. A regression test (`Query_FiltersByTimeRange_HandlesNonUtcOffset`) constructs a `+05:00` `DateTimeOffset` and asserts the matching UTC-stored events are returned and out-of-range ones are excluded.
**Description**
Event rows are persisted with `timestamp` = `DateTimeOffset.UtcNow.ToString("o")`,
+11 -1
View File
@@ -991,9 +991,19 @@ the StoreAndForward-016 replication) — and pass it to `RaiseActivity` (falling
|--|--|
| Severity | High |
| Category | Design-document adherence |
| Status | Open |
| Status | Resolved |
| Location | `src/ScadaLink.StoreAndForward/NotificationForwarder.cs:62``:69`, `:105``:122`; `src/ScadaLink.StoreAndForward/StoreAndForwardService.cs:369``:397` |
**Resolution** — `NotificationForwarder.DeliverAsync` now discards a corrupt
buffered payload instead of returning false. The corrupt path logs a Warning
with the buffered row id + length-capped payload preview via an injected
`ILogger<NotificationForwarder>` (NullLogger by default for back-compat),
then returns true so the S&F engine clears the row via its standard
success-path cleanup — honoring the "notifications do not park" design
invariant. Two regression tests in `NotificationForwarderTests` cover the
two corrupt shapes (invalid JSON, `null` deserialisation) and pin that
nothing is forwarded to central in either case.
**Description**
The Component design doc explicitly carves out notifications from the parking lifecycle:
+16 -2
View File
@@ -843,9 +843,11 @@ resolves against the real parent module. Regression test:
|--|--|
| Severity | High |
| Category | Design-document adherence |
| Status | Open |
| Status | Resolved |
| Location | `src/ScadaLink.TemplateEngine/Flattening/RevisionHashService.cs:128`, `src/ScadaLink.TemplateEngine/Flattening/RevisionHashService.cs:156`, `src/ScadaLink.TemplateEngine/Flattening/RevisionHashService.cs:42`, `src/ScadaLink.TemplateEngine/Flattening/DiffService.cs:110`, `src/ScadaLink.TemplateEngine/Flattening/DiffService.cs:118` |
**Resolution** — Added `Description` to `HashableAttribute` and `HashableAlarm` (placed alphabetically per the determinism contract) and introduced a `HashableConnection` projection plus a `SortedDictionary<string, HashableConnection> Connections` field on `HashableConfiguration` that captures protocol, primary/backup JSON, and failover retry count for every deployed connection. `DiffService.AttributesEqual` and `AlarmsEqual` now compare `Description`, and a new public `ConnectionsEqual` helper covers connection-endpoint drift so callers can detect the change in the same shape used by the other entity comparators. Regression tests `ComputeHash_AttributeDescriptionEdit_ChangesHash`, `ComputeHash_AlarmDescriptionEdit_ChangesHash`, `ComputeHash_ConnectionEndpointEdit_ChangesHash`, and `ConnectionsEqual_EndpointEdit_ReturnsFalse` lock the behaviour in.
**Description**
The design states the revision hash is "computed from the resolved content" and
@@ -891,7 +893,19 @@ it in `DiffService`. Add tests:
**Resolution**
_Unresolved._
Resolved (commit `pending`): `RevisionHashService` now folds `Description` into
the `HashableAttribute` / `HashableAlarm` projections (alphabetical placement
preserved) and adds a sorted `Connections` map of `HashableConnection`
(Protocol, ConfigurationJson, BackupConfigurationJson, FailoverRetryCount) on
`HashableConfiguration`. `DiffService.AttributesEqual` / `AlarmsEqual` compare
`Description`, and a public `ConnectionsEqual` helper covers connection drift
in the same shape as the other entity comparators. Regression tests:
`ComputeHash_AttributeDescriptionEdit_ChangesHash`,
`ComputeHash_AlarmDescriptionEdit_ChangesHash`,
`ComputeHash_ConnectionEndpointEdit_ChangesHash`,
`ConnectionsEqual_EndpointEdit_ReturnsFalse`. The diff-shape extension that
surfaces added/removed/changed connections in the UI remains tracked under
TemplateEngine-018.
### TemplateEngine-018 — `DiffService` reports no entries for added/removed/changed connections
+25 -10
View File
@@ -53,9 +53,23 @@ entry-count / per-entry decompression cap).
|--|--|
| Severity | High |
| Category | Correctness & logic bugs |
| Status | Open |
| Status | Resolved |
| Location | `src/ScadaLink.Transport/Import/BundleImporter.cs:844-851` |
**Resolution** — Extended `ApplyTemplatesAsync`'s Overwrite branch with three
new private diff-and-merge helpers (`SyncTemplateAttributesAsync`,
`SyncTemplateAlarmsAsync`, `SyncTemplateScriptsAsync`) that compare the bundle
DTO's children against the tracked existing template's collections by name and
stage add / update / delete via the audited repository methods. Each detected
change emits one of the per-field audit events the design doc enumerates
(`TemplateAttributeAdded` / `TemplateAttributeUpdated` /
`TemplateAttributeDeleted` and the alarm / script analogues); the existing
`ResolveAlarmScriptLinksAsync` and `ResolveCompositionEdgesAsync` passes rewire
the alarm→script FK and composition graph against the post-merge state with no
changes — Overwrite-on-alarms resets `OnTriggerScriptId` so Pass A is
authoritative. Regression test:
`BundleImporterApplyTests.ApplyAsync_Overwrite_synchronises_attributes_alarms_and_scripts_to_bundle`.
**Description**
The `ResolutionAction.Overwrite` branch in `ApplyTemplatesAsync` only writes
@@ -80,19 +94,24 @@ composition rewire passes against the post-merge state. Emit the per-field audit
rows the design doc enumerates. Add an integration test that overwrites a
template whose Scripts / Attributes / Alarms differ.
**Resolution**
_Unresolved._
### Transport-002 — ExternalSystem Overwrite never syncs methods
| | |
|--|--|
| Severity | High |
| Category | Correctness & logic bugs |
| Status | Open |
| Status | Resolved |
| Location | `src/ScadaLink.Transport/Import/BundleImporter.cs:1213-1221` |
**Resolution** — Added a private `SyncExternalSystemMethodsAsync` helper to
`BundleImporter` modeled on the T-001 `SyncTemplate*Async` helpers (dictionary-
by-name diff, repo Add / Update / Delete, scalar-field-compare gating, one
audit row per change). The `ApplyExternalSystemsAsync` Overwrite branch now
calls it after the parent scalar update; the helper emits
`ExternalSystemMethodAdded` / `ExternalSystemMethodUpdated` /
`ExternalSystemMethodDeleted` per change. Covered by
`ApplyAsync_Overwrite_synchronises_external_system_methods_to_bundle`.
**Description**
`ApplyExternalSystemsAsync` Overwrite path writes `EndpointUrl`, `AuthType`, and
@@ -111,10 +130,6 @@ diff classification in `ArtifactDiff.CompareExternalSystem`) and emit the
per-method audit rows. Add a test that overwrites an external system whose
methods differ.
**Resolution**
_Unresolved._
### Transport-003 — Unlock lockout is enforced only client-side; server session is never marked Locked
| | |
+4 -2
View File
@@ -224,8 +224,10 @@ The ManagementActor is registered at the well-known path `/user/management` on c
## Connection Failure Behavior
- **In-flight messages**: When a connection drops while a request is in flight (e.g., deployment sent but no response received), the Akka ask pattern times out and the caller receives a failure. There is **no automatic retry or buffering at central** — the engineer sees the failure in the UI and re-initiates the action. This is consistent with the design principle that central does not buffer messages.
- **Debug streams**: Any gRPC stream interruption triggers reconnection logic in the `DebugStreamBridgeActor`. The bridge actor attempts to reconnect to the other site node endpoint (NodeB if NodeA failed, or vice versa), with up to 3 retries and 5-second backoff. If all retries fail, the consumer is notified via `OnStreamTerminated` and the bridge actor is stopped. Events during the reconnection gap are lost (acceptable for real-time debug view). On successful reconnection, the consumer can request a fresh snapshot to re-sync state.
Disconnect is detected at the **transport layer**, never via an application-level signal from central. There is no `ConnectionStateChanged`-style synchronous notification: the central coordinator does not maintain a model of "this site is up / down" because the two transports already report unavailability at their natural cadence.
- **In-flight command/control messages (ClusterClient + Ask)**: When a connection drops while a request is in flight (e.g., a deployment sent but no response received), the Akka ask pattern times out and the caller receives a failure. There is **no automatic retry or buffering at central** — the engineer sees the failure in the UI and re-initiates the action. This is consistent with the design principle that central does not buffer messages. An in-progress deployment whose round-trip exceeds the Ask timeout (default 120 s at `CommunicationService.DeployInstanceAsync`) surfaces as `DeploymentStatus.Failed` to the caller.
- **Debug streams (gRPC)**: Any gRPC stream interruption is detected by the HTTP/2 keepalive PING (~25 s) and triggers reconnection logic in the `DebugStreamBridgeActor`. The bridge actor attempts to reconnect to the other site node endpoint (NodeB if NodeA failed, or vice versa), with up to 3 retries and 5-second backoff. If all retries fail, the consumer is notified via `OnStreamTerminated` and the bridge actor is stopped. Events during the reconnection gap are lost (acceptable for real-time debug view). On successful reconnection, the consumer can request a fresh snapshot to re-sync state.
## Failover Behavior
@@ -1,32 +0,0 @@
namespace ScadaLink.Commons.Interfaces.Services;
/// <summary>
/// Interface for sending notifications.
/// Implemented by NotificationService, consumed by ScriptRuntimeContext.
/// </summary>
public interface INotificationDeliveryService
{
/// <summary>
/// Sends a notification to a named list. Transient failures go to S&amp;F.
/// Permanent failures returned to caller.
/// </summary>
/// <param name="listName">Name of the notification list to deliver to.</param>
/// <param name="subject">Subject line of the notification.</param>
/// <param name="message">Plain-text body of the notification.</param>
/// <param name="originInstanceName">Optional name of the instance that triggered the send.</param>
/// <param name="cancellationToken">Cancellation token for the async operation.</param>
Task<NotificationResult> SendAsync(
string listName,
string subject,
string message,
string? originInstanceName = null,
CancellationToken cancellationToken = default);
}
/// <summary>
/// Result of a notification send attempt.
/// </summary>
public record NotificationResult(
bool Success,
string? ErrorMessage,
bool WasBuffered = false);
@@ -1,6 +0,0 @@
namespace ScadaLink.Commons.Messages.Communication;
public record ConnectionStateChanged(
string SiteId,
bool IsConnected,
DateTimeOffset Timestamp);
@@ -60,17 +60,18 @@ public class CentralCommunicationActor : ReceiveActor
/// </summary>
private readonly Dictionary<string, (IActorRef Client, ImmutableHashSet<string> ContactAddresses)> _siteClients = new();
/// <summary>
/// Tracks active debug view subscriptions: correlationId → (siteId, subscriber).
/// Used to kill debug streams on site disconnection (WP-5).
/// </summary>
private readonly Dictionary<string, (string SiteId, IActorRef Subscriber)> _debugSubscriptions = new();
/// <summary>
/// Tracks in-progress deployments: deploymentId → siteId.
/// On central failover, in-progress deployments are treated as failed (WP-5).
/// </summary>
private readonly Dictionary<string, string> _inProgressDeployments = new();
// Communication-016: the previous _debugSubscriptions / _inProgressDeployments
// dictionaries existed solely to support a documented "synchronous kill streams +
// mark deployments failed on site disconnect" workflow triggered by
// ConnectionStateChanged. No production code ever emitted that message — only
// the unit test did — so the workflow was dead from end to end. Disconnect
// detection is owned by the underlying transports: the gRPC keepalive PING
// signals stream interruption in ~25s (handled by DebugStreamBridgeActor's own
// reconnection logic), and an Ask round-trip for a deploy times out at the
// CommunicationService layer (caller sees failure). The tracking dicts +
// ConnectionStateChanged record + HandleConnectionStateChanged handler are
// removed; see docs/requirements/Component-Communication.md "Connection
// Failure Behavior" for the keepalive-based contract that survives.
private ICancelable? _refreshSchedule;
@@ -165,9 +166,6 @@ public class CentralCommunicationActor : ReceiveActor
Receive<SiteHealthReportReplica>(r => ProcessLocally(r.Report));
Receive<SubscribeAck>(_ => { /* DistributedPubSub subscribe confirmation */ });
// Connection state changes
Receive<ConnectionStateChanged>(HandleConnectionStateChanged);
// Route enveloped messages to sites
Receive<SiteEnvelope>(HandleSiteEnvelope);
@@ -335,44 +333,10 @@ public class CentralCommunicationActor : ReceiveActor
}
}
private void HandleConnectionStateChanged(ConnectionStateChanged msg)
{
if (!msg.IsConnected)
{
_log.Warning("Site {0} disconnected at {1}", msg.SiteId, msg.Timestamp);
// WP-5: Kill active debug streams for the disconnected site
var toRemove = _debugSubscriptions
.Where(kvp => kvp.Value.SiteId == msg.SiteId)
.ToList();
foreach (var kvp in toRemove)
{
_log.Info("Killing debug stream {0} for disconnected site {1}", kvp.Key, msg.SiteId);
kvp.Value.Subscriber.Tell(new DebugStreamTerminated(msg.SiteId, kvp.Key));
_debugSubscriptions.Remove(kvp.Key);
}
// WP-5: Mark in-progress deployments as failed
var failedDeployments = _inProgressDeployments
.Where(kvp => kvp.Value == msg.SiteId)
.Select(kvp => kvp.Key)
.ToList();
foreach (var deploymentId in failedDeployments)
{
_log.Warning("Deployment {0} to site {1} treated as failed due to disconnection",
deploymentId, msg.SiteId);
_inProgressDeployments.Remove(deploymentId);
}
// Note: Do NOT stop the ClusterClient — it handles reconnection internally
}
else
{
_log.Info("Site {0} connected at {1}", msg.SiteId, msg.Timestamp);
}
}
// Communication-016: HandleConnectionStateChanged removed — no production
// caller emitted ConnectionStateChanged, so the workflow ran only in tests.
// Disconnect detection is owned by the transport layers (gRPC keepalive +
// ClusterClient/Ask timeout).
private void HandleSiteEnvelope(SiteEnvelope envelope)
{
@@ -385,9 +349,6 @@ public class CentralCommunicationActor : ReceiveActor
return;
}
// Track debug subscriptions for cleanup on disconnect
TrackMessageForCleanup(envelope);
// Route via ClusterClient — Sender is preserved for Ask response routing
entry.Client.Tell(
new ClusterClient.Send("/user/site-communication", envelope.Message),
@@ -485,23 +446,8 @@ public class CentralCommunicationActor : ReceiveActor
_log.Info("Site ClusterClient cache refreshed with {0} site(s)", _siteClients.Count);
}
private void TrackMessageForCleanup(SiteEnvelope envelope)
{
switch (envelope.Message)
{
case Commons.Messages.DebugView.SubscribeDebugViewRequest sub:
_debugSubscriptions[sub.CorrelationId] = (envelope.SiteId, Sender);
break;
case Commons.Messages.DebugView.UnsubscribeDebugViewRequest unsub:
_debugSubscriptions.Remove(unsub.CorrelationId);
break;
case Commons.Messages.Deployment.DeployInstanceCommand deploy:
_inProgressDeployments[deploy.DeploymentId] = envelope.SiteId;
break;
}
}
// Communication-016: TrackMessageForCleanup removed — the dicts it fed
// existed solely to support the dead ConnectionStateChanged workflow.
/// <inheritdoc />
protected override SupervisorStrategy SupervisorStrategy()
@@ -547,11 +493,8 @@ public class CentralCommunicationActor : ReceiveActor
/// <inheritdoc />
protected override void PostStop()
{
_log.Info("CentralCommunicationActor stopped. In-progress deployments treated as failed (WP-5).");
_log.Info("CentralCommunicationActor stopped");
_refreshSchedule?.Cancel();
// On central failover, all in-progress deployments are failed
_inProgressDeployments.Clear();
_debugSubscriptions.Clear();
}
}
@@ -228,7 +228,8 @@ public class DeploymentService
// logged loudly for operator reconciliation but must not flip
// the already-committed Success record back to Failed.
await ApplyPostSuccessSideEffectsAsync(
instance, deploymentId, revisionHash, configJson, cancellationToken);
instance, deploymentId, revisionHash, configJson,
forceEnabledState: true, cancellationToken);
}
// Audit log
@@ -677,8 +678,22 @@ public class DeploymentService
// the instance State to Enabled and store/refresh the deployed
// config snapshot — otherwise the central state machine and the
// deployed-snapshot invariant diverge from what the site is running.
//
// DeploymentManager-018: the reconciliation path runs only when the
// prior record is InProgress or timeout-Failed — exactly the cases
// that survive a central failover. The in-memory operation lock is
// lost on failover, so an operator may have legitimately invoked
// Disable on the instance between the original timed-out deploy and
// this redeploy. Disable does not change the deployed config, so the
// site still reports the target revision hash. Reconciliation must
// therefore PRESERVE an intentional Disabled state instead of
// silently flipping it back to Enabled — pass forceEnabledState:
// false so the helper only promotes NotDeployed → Enabled (the
// first-deploy-timed-out case) and leaves an explicit Disabled
// alone.
await ApplyPostSuccessSideEffectsAsync(
instance, prior.DeploymentId, targetRevisionHash, configJson, cancellationToken);
instance, prior.DeploymentId, targetRevisionHash, configJson,
forceEnabledState: false, cancellationToken);
await _auditService.LogAsync(prior.DeployedBy, "DeployReconciled", "Instance",
instance.Id.ToString(), instance.UniqueName,
@@ -713,6 +728,19 @@ public class DeploymentService
/// deployed config snapshot (WP-8). Factored into one helper so the two
/// paths cannot drift (DeploymentManager-015).
///
/// DeploymentManager-018: <paramref name="forceEnabledState"/> distinguishes
/// the two callers. The normal deploy path passes <c>true</c> — a fresh
/// successful apply legitimately puts the instance into <see cref="InstanceState.Enabled"/>
/// (the documented "Deploy on a Disabled instance also enables it" semantics
/// of <see cref="StateTransitionValidator"/>). The reconciliation path
/// passes <c>false</c>: it is reconciling a *prior* deployment that may
/// have completed before the current operator session (central failover
/// loses the in-memory operation lock, so an operator may have legitimately
/// Disabled the instance in between). On that path we only promote
/// <see cref="InstanceState.NotDeployed"/> → <see cref="InstanceState.Enabled"/>
/// (the first-deploy-timed-out case) and leave an explicit Disabled alone,
/// so reconciliation never silently undoes a Disable.
///
/// Best-effort: the deployment record's terminal <see cref="DeploymentStatus.Success"/>
/// status is already committed by the caller before this runs. A failure
/// here is logged loudly for operator reconciliation but is NOT propagated —
@@ -723,12 +751,20 @@ public class DeploymentService
string deploymentId,
string revisionHash,
string configJson,
bool forceEnabledState,
CancellationToken cancellationToken)
{
try
{
// WP-4: Update instance state to Enabled on successful deployment
instance.State = InstanceState.Enabled;
// WP-4: Update instance state to Enabled on successful deployment.
// DeploymentManager-018: on the reconciliation path
// (forceEnabledState=false) only promote NotDeployed → Enabled,
// preserving an intentional Disabled state set between the original
// timed-out deploy and the redeploy.
if (forceEnabledState || instance.State == InstanceState.NotDeployed)
{
instance.State = InstanceState.Enabled;
}
await _repository.UpdateInstanceAsync(instance, cancellationToken);
// WP-8: Store deployed config snapshot
@@ -148,7 +148,26 @@ public class DatabaseGateway : IDatabaseGateway
public async Task<bool> DeliverBufferedAsync(
StoreAndForwardMessage message, CancellationToken cancellationToken = default)
{
var payload = JsonSerializer.Deserialize<CachedWritePayload>(message.PayloadJson);
// ExternalSystemGateway-018: a malformed (not just empty/null-fielded)
// PayloadJson would otherwise throw `JsonException` here, which the S&F
// engine treats as a transient failure and retries forever (poison
// message). Re-running the same deserialization against the same payload
// will throw deterministically, so JsonException is permanent — log,
// and return false so the S&F engine parks the message instead.
CachedWritePayload? payload;
try
{
payload = JsonSerializer.Deserialize<CachedWritePayload>(message.PayloadJson);
}
catch (JsonException ex)
{
_logger.LogError(
ex,
"Buffered CachedDbWrite message {Id} has malformed JSON payload; parking.",
message.Id);
return false;
}
if (payload == null || string.IsNullOrEmpty(payload.ConnectionName) || string.IsNullOrEmpty(payload.Sql))
{
_logger.LogError("Buffered CachedDbWrite message {Id} has an unreadable payload; parking.", message.Id);
@@ -173,7 +173,26 @@ public class ExternalSystemClient : IExternalSystemClient
public async Task<bool> DeliverBufferedAsync(
StoreAndForwardMessage message, CancellationToken cancellationToken = default)
{
var payload = JsonSerializer.Deserialize<CachedCallPayload>(message.PayloadJson);
// ExternalSystemGateway-018: a malformed (not just empty/null-fielded)
// PayloadJson would otherwise throw `JsonException` here, which the S&F
// engine treats as a transient failure and retries forever (poison
// message). Re-running the same deserialization against the same payload
// will throw deterministically, so JsonException is permanent — log,
// and return false so the S&F engine parks the message instead.
CachedCallPayload? payload;
try
{
payload = JsonSerializer.Deserialize<CachedCallPayload>(message.PayloadJson);
}
catch (JsonException ex)
{
_logger.LogError(
ex,
"Buffered ExternalSystem message {Id} has malformed JSON payload; parking.",
message.Id);
return false;
}
if (payload == null || string.IsNullOrEmpty(payload.SystemName) || string.IsNullOrEmpty(payload.MethodName))
{
_logger.LogError("Buffered ExternalSystem message {Id} has an unreadable payload; parking.", message.Id);
@@ -0,0 +1,57 @@
using Akka.Cluster;
using ScadaLink.Host.Actors;
using ScadaLink.InboundAPI;
namespace ScadaLink.Host.Health;
/// <summary>
/// InboundAPI-008 / InboundAPI-022: production implementation of
/// <see cref="IActiveNodeGate"/> backed by the running Akka.NET cluster.
///
/// The inbound API is "Central cluster only (active node)" — a standby central
/// node must not execute method scripts or <c>Route.To()</c> calls. This gate
/// mirrors the leadership check in <see cref="ActiveNodeHealthCheck"/> (the
/// node is the cluster leader, <see cref="MemberStatus.Up"/>), so
/// <see cref="InboundApiEndpointFilter"/> can return HTTP 503 on a standby.
///
/// Registered only in the Central-role branch of <c>Program.cs</c>. The gate
/// is resolved per request from <c>HttpContext.RequestServices</c>; while the
/// <c>AkkaHostedService</c> is still warming up (<c>ActorSystem == null</c>)
/// or the node has not yet reached <see cref="MemberStatus.Up"/>, this
/// implementation reports <c>IsActiveNode == false</c> — the safe-by-default
/// answer matching the standby case.
/// </summary>
public sealed class ActiveNodeGate : IActiveNodeGate
{
private readonly AkkaHostedService _akkaService;
/// <summary>Initializes a new <see cref="ActiveNodeGate"/> bound to the given Akka hosted service.</summary>
/// <param name="akkaService">The Akka hosted service exposing the cluster's <see cref="Akka.Actor.ActorSystem"/>.</param>
public ActiveNodeGate(AkkaHostedService akkaService)
{
_akkaService = akkaService;
}
/// <summary>
/// <c>true</c> only when this node has joined the cluster (<see cref="MemberStatus.Up"/>)
/// AND is the current cluster leader; <c>false</c> in every other state
/// (actor system not yet started, node still joining, node is a standby).
/// </summary>
public bool IsActiveNode
{
get
{
var system = _akkaService.ActorSystem;
if (system == null)
return false;
var cluster = Cluster.Get(system);
var self = cluster.SelfMember;
if (self.Status != MemberStatus.Up)
return false;
var leader = cluster.State.Leader;
return leader != null && leader == self.Address;
}
}
}
+10
View File
@@ -120,6 +120,16 @@ try
builder.Services.AddSingleton<AkkaHostedService>();
builder.Services.AddHostedService(sp => sp.GetRequiredService<AkkaHostedService>());
// InboundAPI-022: register the production IActiveNodeGate implementation so
// standby-node gating is actually enforced (the InboundApiEndpointFilter
// consults IActiveNodeGate and defaults to "allow" when none is registered,
// which leaves the design's "central cluster only (active node)" guarantee
// unenforced in deployed binaries). The gate is backed by the same Akka
// cluster-leadership check as ActiveNodeHealthCheck above, so the inbound
// API and the /health/active endpoint Traefik routes against agree on
// which node is active.
builder.Services.AddSingleton<ScadaLink.InboundAPI.IActiveNodeGate, ActiveNodeGate>();
// Cluster node status provider scoped to the Central role — feeds the
// CentralHealthReportLoop so the central cluster appears on /monitoring/health.
builder.Services.AddSingleton<IClusterNodeProvider>(sp =>
@@ -34,8 +34,7 @@ public static class SiteServiceRegistration
// Sites no longer deliver notifications over SMTP — a buffered notification is
// forwarded to the central cluster (via NotificationForwarder / SiteCommunicationActor),
// and central owns SMTP delivery through the Notification Outbox. The SMTP machinery
// (OAuth2TokenService, ISmtpClientWrapper, INotificationDeliveryService) has no
// consumer on a site node.
// (OAuth2TokenService, ISmtpClientWrapper) has no consumer on a site node.
// Health report transport: sends SiteHealthReport to SiteCommunicationActor via Akka
services.AddSingleton<ISiteIdentityProvider, SiteIdentityProvider>();
@@ -10,14 +10,14 @@ namespace ScadaLink.NotificationOutbox.Delivery;
/// <summary>
/// Task 12: Email channel delivery adapter for the central notification outbox.
///
/// Reuses the <see cref="ScadaLink.NotificationService"/> SMTP machinery
/// Reuses the <see cref="ScadaLink.NotificationService"/> SMTP primitives
/// <see cref="ISmtpClientWrapper"/>, <see cref="SmtpTlsModeParser"/>,
/// <see cref="OAuth2TokenService"/> and the typed <see cref="SmtpPermanentException"/>.
/// The connect/auth/send/disconnect sequence and error classification mirror
/// <c>NotificationDeliveryService.DeliverAsync</c>; this adapter, however, maps the
/// result to the outbox's three-way <see cref="DeliveryOutcome"/> (Success / Permanent
/// / Transient) rather than the S&amp;F-coupled <c>NotificationResult</c>, which cannot
/// distinguish a permanent failure from a buffered transient one.
/// This adapter owns the full connect/auth/send/disconnect sequence and maps the
/// outcome to the outbox's three-way <see cref="DeliveryOutcome"/> (Success / Permanent /
/// Transient) — the canonical central-side email delivery path. NS-019: the prior
/// site-shaped <c>NotificationDeliveryService</c> was deleted with sites no longer
/// delivering notifications.
/// </summary>
public sealed class EmailNotificationDeliveryAdapter : INotificationDeliveryAdapter
{
@@ -44,9 +44,8 @@ public sealed class EmailNotificationDeliveryAdapter : INotificationDeliveryAdap
_smtpClientFactory = smtpClientFactory;
_logger = logger;
_tokenService = tokenService;
// Mirrors NotificationDeliveryService: NotificationOptions supplies the
// documented fallback values used when a deployed SmtpConfiguration row
// leaves a field unset (non-positive).
// NotificationOptions supplies the documented fallback values used when a
// deployed SmtpConfiguration row leaves a field unset (non-positive).
_options = options?.Value ?? new NotificationOptions();
}
@@ -81,7 +80,7 @@ public sealed class EmailNotificationDeliveryAdapter : INotificationDeliveryAdap
}
// An unknown TLS mode is a configuration error that retrying cannot fix —
// surface it as a permanent failure (mirrors NS-005 in NotificationDeliveryService).
// surface it as a permanent failure (NS-005 SMTP TLS validation policy).
SmtpTlsMode tlsMode;
try
{
@@ -154,11 +153,9 @@ public sealed class EmailNotificationDeliveryAdapter : INotificationDeliveryAdap
}
/// <summary>
/// Delivers the plain-text BCC email via SMTP. Mirrors the connect/auth/send/
/// disconnect sequence of <c>NotificationDeliveryService.DeliverAsync</c>: a
/// permanent failure surfaces as <see cref="SmtpPermanentException"/>; transient
/// failures propagate for the caller's classifier; the connection is always torn
/// down in the finally block.
/// Delivers the plain-text BCC email via SMTP. A permanent failure surfaces as
/// <see cref="SmtpPermanentException"/>; transient failures propagate for the
/// caller's classifier; the connection is always torn down in the finally block.
/// </summary>
private async Task SendAsync(
SmtpConfiguration config,
@@ -57,7 +57,7 @@ public class MailKitSmtpClientWrapper : ISmtpClientWrapper, IDisposable
// worst, sending where authentication was required. Authentication being
// skipped must never be silent: each of these is a permanent configuration
// fault, surfaced as SmtpPermanentException so SendAsync returns a clean
// failure and DeliverBufferedAsync parks the buffered message.
// failure that the central Notification Outbox dispatcher classifies as permanent.
if (string.IsNullOrEmpty(credentials))
{
throw new SmtpPermanentException(
@@ -1,448 +0,0 @@
using System.Text.Json;
using Microsoft.Extensions.Logging;
using Microsoft.Extensions.Options;
using ScadaLink.Commons.Entities.Notifications;
using ScadaLink.Commons.Interfaces.Repositories;
using ScadaLink.Commons.Interfaces.Services;
using ScadaLink.Commons.Types.Enums;
using ScadaLink.StoreAndForward;
namespace ScadaLink.NotificationService;
/// <summary>
/// WP-11: Notification delivery via SMTP.
/// WP-12: Error classification and S&amp;F integration.
/// Transient: connection refused, timeout, SMTP 4xx → hand to S&amp;F.
/// Permanent: SMTP 5xx → returned to script.
/// </summary>
public class NotificationDeliveryService : INotificationDeliveryService, IDisposable
{
private readonly INotificationRepository _repository;
private readonly Func<ISmtpClientWrapper> _smtpClientFactory;
private readonly OAuth2TokenService? _tokenService;
private readonly StoreAndForwardService? _storeAndForward;
private readonly ILogger<NotificationDeliveryService> _logger;
private readonly NotificationOptions _options;
/// <summary>
/// Initializes a new instance of the NotificationDeliveryService with the specified dependencies.
/// </summary>
/// <param name="repository">The notification repository for data access.</param>
/// <param name="smtpClientFactory">Factory for creating SMTP client instances.</param>
/// <param name="logger">Logger for diagnostic messages.</param>
/// <param name="tokenService">Optional OAuth2 token service for authentication.</param>
/// <param name="storeAndForward">Optional store-and-forward service for handling transient failures.</param>
/// <param name="options">Optional notification options with fallback values.</param>
public NotificationDeliveryService(
INotificationRepository repository,
Func<ISmtpClientWrapper> smtpClientFactory,
ILogger<NotificationDeliveryService> logger,
OAuth2TokenService? tokenService = null,
StoreAndForwardService? storeAndForward = null,
IOptions<NotificationOptions>? options = null)
{
_repository = repository;
_smtpClientFactory = smtpClientFactory;
_logger = logger;
_tokenService = tokenService;
_storeAndForward = storeAndForward;
// NS-017: NotificationOptions supplies the documented fallback values used
// when a deployed SmtpConfiguration row leaves a field unset (non-positive).
_options = options?.Value ?? new NotificationOptions();
}
/// <inheritdoc />
public async Task<NotificationResult> SendAsync(
string listName,
string subject,
string message,
string? originInstanceName = null,
CancellationToken cancellationToken = default)
{
ObjectDisposedException.ThrowIf(_disposed, this);
var list = await _repository.GetListByNameAsync(listName, cancellationToken);
if (list == null)
{
return new NotificationResult(false, $"Notification list '{listName}' not found");
}
var recipients = await _repository.GetRecipientsByListIdAsync(list.Id, cancellationToken);
if (recipients.Count == 0)
{
return new NotificationResult(false, $"Notification list '{listName}' has no recipients");
}
var smtpConfigs = await _repository.GetAllSmtpConfigurationsAsync(cancellationToken);
var smtpConfig = smtpConfigs.FirstOrDefault();
if (smtpConfig == null)
{
return new NotificationResult(false, "No SMTP configuration available");
}
// NS-005: validate the configured TLS mode up front — an unknown value is a
// configuration error and must surface as a clean result, not a silent
// fallback to opportunistic TLS negotiation.
try
{
SmtpTlsModeParser.Parse(smtpConfig.TlsMode);
}
catch (ArgumentException ex)
{
_logger.LogError("Invalid SMTP TLS mode for list {List}: {Reason}", listName, ex.Message);
return new NotificationResult(false, ex.Message);
}
// NS-008: validate every email address before attempting delivery. A single
// malformed address previously caused MailboxAddress.Parse to throw a
// ParseException that escaped SendAsync unhandled; it must instead produce a
// clean NotificationResult the calling script can handle.
var addressError = EmailAddressValidator.ValidateAddresses(smtpConfig.FromAddress, recipients);
if (addressError != null)
{
_logger.LogWarning("Notification to list {List} has invalid addresses: {Reason}", listName, addressError);
return new NotificationResult(false, addressError);
}
try
{
await DeliverAsync(smtpConfig, recipients, subject, message, cancellationToken);
return new NotificationResult(true, null);
}
catch (SmtpPermanentException ex)
{
// WP-12: Permanent SMTP failure — returned to script.
// NS-009: scrub credential fragments out of the server-supplied message
// before logging or returning it.
var detail = CredentialRedactor.Scrub(ex.Message, smtpConfig.Credentials);
_logger.LogError(
"Permanent SMTP failure sending to list {List}: {Detail}", listName, detail);
return new NotificationResult(false, $"Permanent SMTP error: {detail}");
}
catch (OperationCanceledException) when (cancellationToken.IsCancellationRequested)
{
// NS-002: a caller-requested cancellation propagates; it is not buffered.
throw;
}
catch (Exception ex) when (SmtpErrorClassifier.IsTransient(ex, cancellationToken))
{
// WP-12: Transient SMTP failure — hand to S&F.
// NS-009: scrub credential fragments before logging.
_logger.LogWarning(
"Transient SMTP failure sending to list {List} ({ExceptionType}): {Detail}; buffering for retry",
listName, ex.GetType().Name, CredentialRedactor.Scrub(ex.Message, smtpConfig.Credentials));
if (_storeAndForward == null)
{
return new NotificationResult(false, "Transient SMTP error and store-and-forward not available");
}
var payload = JsonSerializer.Serialize(new
{
ListName = listName,
Subject = subject,
Message = message
});
// attemptImmediateDelivery: false — DeliverAsync was already attempted
// above; letting EnqueueAsync re-invoke the handler would send twice.
await _storeAndForward.EnqueueAsync(
StoreAndForwardCategory.Notification,
listName,
payload,
originInstanceName,
smtpConfig.MaxRetries > 0 ? smtpConfig.MaxRetries : null,
smtpConfig.RetryDelay > TimeSpan.Zero ? smtpConfig.RetryDelay : null,
attemptImmediateDelivery: false);
return new NotificationResult(true, null, WasBuffered: true);
}
catch (Exception ex)
{
// NS-015: a failure that SmtpErrorClassifier does not recognise (Unknown) —
// most importantly an OAuth2 token-fetch failure (HttpRequestException
// from EnsureSuccessStatusCode, or InvalidOperationException from a
// malformed credential triple) — used to fall through all the catch
// clauses above and escape SendAsync as a raw exception to the calling
// script, which the INotificationDeliveryService contract never
// advertises. Convert any otherwise-unhandled exception into a clean,
// credential-scrubbed permanent NotificationResult: returning control to
// the script is the safe default. (A caller-requested cancellation is
// already re-thrown by the filter above and never reaches here.)
var detail = CredentialRedactor.Scrub(ex.Message, smtpConfig.Credentials);
_logger.LogError(
"Unclassified failure sending to list {List} ({ExceptionType}): {Detail}",
listName, ex.GetType().Name, detail);
return new NotificationResult(false, $"Notification delivery failed: {detail}");
}
}
/// <summary>
/// WP-11/12: Delivers a buffered notification during a store-and-forward retry
/// sweep — re-resolves the list, recipients and SMTP config and re-attempts
/// delivery. Returns true on success, false on permanent failure (the message
/// is parked); throws on a transient failure so the engine retries.
/// </summary>
/// <param name="message">The buffered store-and-forward message to deliver.</param>
/// <param name="cancellationToken">Cancellation token for the delivery attempt.</param>
public async Task<bool> DeliverBufferedAsync(
StoreAndForwardMessage message, CancellationToken cancellationToken = default)
{
var payload = JsonSerializer.Deserialize<BufferedNotification>(message.PayloadJson);
if (payload == null || string.IsNullOrEmpty(payload.ListName))
{
_logger.LogError("Buffered notification message {Id} has an unreadable payload; parking.", message.Id);
return false;
}
var list = await _repository.GetListByNameAsync(payload.ListName, cancellationToken);
if (list == null)
{
_logger.LogError(
"Buffered notification to list '{List}' cannot be delivered — the list no longer exists; parking.",
payload.ListName);
return false;
}
var recipients = await _repository.GetRecipientsByListIdAsync(list.Id, cancellationToken);
if (recipients.Count == 0)
{
_logger.LogError("Buffered notification to list '{List}' has no recipients; parking.", payload.ListName);
return false;
}
var smtpConfig = (await _repository.GetAllSmtpConfigurationsAsync(cancellationToken)).FirstOrDefault();
if (smtpConfig == null)
{
_logger.LogError("Buffered notification cannot be delivered — no SMTP configuration available; parking.");
return false;
}
// NS-005: an unknown TLS mode is a configuration error that retrying cannot
// fix — park the buffered message rather than throwing on every sweep.
try
{
SmtpTlsModeParser.Parse(smtpConfig.TlsMode);
}
catch (ArgumentException ex)
{
_logger.LogError(
"Buffered notification to list '{List}' cannot be delivered — {Reason}; parking.",
payload.ListName, ex.Message);
return false;
}
// NS-008: a malformed address cannot be fixed by retrying — park it.
var addressError = EmailAddressValidator.ValidateAddresses(smtpConfig.FromAddress, recipients);
if (addressError != null)
{
_logger.LogError(
"Buffered notification to list '{List}' has invalid addresses ({Reason}); parking.",
payload.ListName, addressError);
return false;
}
try
{
await DeliverAsync(smtpConfig, recipients, payload.Subject, payload.Message, cancellationToken);
return true;
}
catch (SmtpPermanentException ex)
{
// NS-009: scrub credential fragments out of the message before logging.
_logger.LogError(
"Buffered notification to list '{List}' failed permanently ({Detail}); parking.",
payload.ListName, CredentialRedactor.Scrub(ex.Message, smtpConfig.Credentials));
return false;
}
catch (OperationCanceledException) when (cancellationToken.IsCancellationRequested)
{
// A handler shutdown cancellation is neither a delivery success nor a
// permanent failure — let it propagate so the engine does not park.
throw;
}
catch (Exception ex) when (SmtpErrorClassifier.IsTransient(ex, cancellationToken))
{
// A typed transient SMTP error: re-throw so the S&F engine retries.
throw;
}
catch (Exception ex)
{
// NS-014: an exception SmtpErrorClassifier does not recognise (Unknown) —
// chiefly an OAuth2 token-fetch failure — used to escape this handler.
// The S&F engine treats ANY thrown exception as transient, so a
// permanently-broken config (bad client secret, malformed credential
// triple) was retried on every sweep until MaxRetries, burning token
// endpoint calls. Decide deliberately rather than letting it leak:
// - an HttpRequestException with a 5xx token-endpoint status is a
// transient outage → re-throw so the engine retries;
// - everything else (a 4xx/401 token rejection, a malformed credential
// InvalidOperationException, any other unclassified fault) is not
// fixable by retrying → return false so the message is parked.
if (ex is HttpRequestException { StatusCode: { } status } && (int)status is >= 500 and < 600)
{
_logger.LogWarning(
"Buffered notification to list '{List}' hit a transient OAuth2 token-endpoint error ({Status}); will retry.",
payload.ListName, (int)status);
throw;
}
_logger.LogError(
"Buffered notification to list '{List}' failed with a non-retryable error ({ExceptionType}: {Detail}); parking.",
payload.ListName, ex.GetType().Name,
CredentialRedactor.Scrub(ex.Message, smtpConfig.Credentials));
return false;
}
}
private sealed record BufferedNotification(string ListName, string Subject, string Message);
/// <summary>
/// NS-007: throttles concurrent SMTP deliveries to the configured
/// <c>MaxConcurrentConnections</c>. One SMTP config is deployed per site, so the
/// limit is a stable per-site invariant; it is captured lazily on first use.
/// NS-018: a <see cref="Lazy{T}"/> replaces the hand-rolled double-checked
/// init — its publication is correctly synchronised (no lock-free read of a
/// non-volatile field) and it is disposed in <see cref="Dispose"/>.
/// </summary>
private Lazy<SemaphoreSlim>? _concurrencyLimiter;
private readonly object _limiterLock = new();
private bool _disposed;
private SemaphoreSlim GetConcurrencyLimiter(SmtpConfiguration config)
{
// NS-018: the limiter is sized once; capture the size now so the Lazy
// factory does not close over a value that could change between calls.
var configured = config.MaxConcurrentConnections > 0
? config.MaxConcurrentConnections
// NS-017: fall back to the NotificationOptions value, then the
// design-doc default of 5, when the deployed row leaves it unset.
: _options.MaxConcurrentConnections > 0 ? _options.MaxConcurrentConnections : 5;
lock (_limiterLock)
{
ObjectDisposedException.ThrowIf(_disposed, this);
_concurrencyLimiter ??= new Lazy<SemaphoreSlim>(
() => new SemaphoreSlim(configured, configured));
return _concurrencyLimiter.Value;
}
}
/// <summary>
/// NS-018: disposes the lazily-created concurrency limiter. The service is a
/// scoped DI service; without this the <see cref="SemaphoreSlim"/> leaked a
/// handle per scope.
/// </summary>
public void Dispose()
{
lock (_limiterLock)
{
if (_disposed)
{
return;
}
_disposed = true;
if (_concurrencyLimiter is { IsValueCreated: true } limiter)
{
limiter.Value.Dispose();
}
}
GC.SuppressFinalize(this);
}
/// <summary>
/// Delivers an email via SMTP. Throws on failure (transient errors and
/// <see cref="SmtpPermanentException"/> propagate; the caller classifies them).
/// </summary>
/// <param name="config">The SMTP configuration to use for the connection.</param>
/// <param name="recipients">The list of recipients to deliver to.</param>
/// <param name="subject">The email subject line.</param>
/// <param name="body">The plain-text email body.</param>
/// <param name="cancellationToken">Cancellation token for the delivery.</param>
internal async Task DeliverAsync(
SmtpConfiguration config,
IReadOnlyList<NotificationRecipient> recipients,
string subject,
string body,
CancellationToken cancellationToken)
{
var tlsMode = SmtpTlsModeParser.Parse(config.TlsMode);
// NS-007: bound the number of concurrent SMTP connections per site.
var limiter = GetConcurrencyLimiter(config);
await limiter.WaitAsync(cancellationToken);
// NS-004: create exactly one client and dispose the one actually used.
var smtp = _smtpClientFactory();
using var disposable = smtp as IDisposable;
try
{
// NS-005/NS-007: explicit TLS mode and the configured connection timeout.
// NS-017: when the deployed SmtpConfiguration row leaves the timeout
// unset (non-positive), fall back to the NotificationOptions value.
var timeoutSeconds = config.ConnectionTimeoutSeconds > 0
? config.ConnectionTimeoutSeconds
: _options.ConnectionTimeoutSeconds;
await smtp.ConnectAsync(
config.Host, config.Port, tlsMode, timeoutSeconds, cancellationToken);
// Resolve credentials (OAuth2 token fetched/cached by the token service).
var credentials = config.Credentials;
if (config.AuthType.Equals("oauth2", StringComparison.OrdinalIgnoreCase) && _tokenService != null && credentials != null)
{
var token = await _tokenService.GetTokenAsync(credentials, cancellationToken);
credentials = token;
}
// NS-021: OAuth2 XOAUTH2 requires the user identity (FromAddress) to be
// sent alongside the access token; an empty user is rejected by M365.
await smtp.AuthenticateAsync(
config.AuthType,
credentials,
oauth2UserName: config.FromAddress,
cancellationToken: cancellationToken);
var bccAddresses = recipients.Select(r => r.EmailAddress).ToList();
await smtp.SendAsync(config.FromAddress, bccAddresses, subject, body, cancellationToken);
}
catch (OperationCanceledException) when (cancellationToken.IsCancellationRequested)
{
// NS-002: A deliberately cancelled token must propagate as a cancellation,
// not be misclassified as a transient SMTP failure and buffered for retry.
throw;
}
catch (Exception ex) when (SmtpErrorClassifier.Classify(ex, cancellationToken) == SmtpErrorClass.Permanent
&& ex is not SmtpPermanentException)
{
// NS-003: Permanent SMTP failure (5xx) — surface a typed permanent exception.
throw new SmtpPermanentException(ex.Message, ex);
}
// Transient and SmtpPermanentException both propagate unchanged: SendAsync's
// catch filters (SmtpPermanentException / SmtpErrorClassifier.IsTransient) handle them.
finally
{
// NS-010: always tear the connection down, regardless of outcome. The
// SMTP QUIT used to run only on the success path inside the try block,
// so a failed Connect/Authenticate/Send left an open, authenticated
// connection until finalization reclaimed the socket — exhausting the
// server's connection slots under sustained transient failures.
// Disconnect is best-effort: a disconnect failure (e.g. the connection
// is already dead) must not mask the original delivery exception.
try
{
await smtp.DisconnectAsync(cancellationToken);
}
catch (Exception disconnectEx)
{
_logger.LogDebug(
"Ignoring SMTP disconnect failure during cleanup: {Reason}", disconnectEx.Message);
}
// NS-007: always release the concurrency slot, even on failure.
limiter.Release();
}
}
}
@@ -5,10 +5,10 @@ namespace ScadaLink.NotificationService;
/// <c>ScadaLink:Notification</c> configuration section.
///
/// SMTP settings are primarily carried by the deployed <c>SmtpConfiguration</c>
/// entity. NS-017: these values are the fallback used by
/// <see cref="NotificationDeliveryService"/> when the corresponding
/// <c>SmtpConfiguration</c> field is left unset (non-positive) on a partially
/// deployed row — a value present on the row always takes precedence.
/// entity. NS-017: these values are the fallback used by the central
/// Notification Outbox's <c>EmailNotificationDeliveryAdapter</c> when the
/// corresponding <c>SmtpConfiguration</c> field is left unset (non-positive) on a
/// partially deployed row — a value present on the row always takes precedence.
/// </summary>
public class NotificationOptions
{
@@ -1,12 +1,17 @@
using Microsoft.Extensions.DependencyInjection;
using ScadaLink.Commons.Interfaces.Services;
namespace ScadaLink.NotificationService;
public static class ServiceCollectionExtensions
{
/// <summary>
/// Registers the notification delivery services (SMTP, OAuth2 token, delivery adapter).
/// Registers the shared SMTP delivery primitives consumed by the central Notification
/// Outbox's <c>EmailNotificationDeliveryAdapter</c>: <see cref="NotificationOptions"/>,
/// <see cref="OAuth2TokenService"/>, and the <see cref="ISmtpClientWrapper"/> factory.
/// Central-only — sites no longer deliver notifications (see
/// <c>Component-NotificationService.md</c>), and the orphaned site-shaped
/// <c>NotificationDeliveryService</c> + <c>INotificationDeliveryService</c> contract
/// was removed (NS-019). Notification dispatch lives in <c>ScadaLink.NotificationOutbox</c>.
/// </summary>
/// <param name="services">The service collection to register into.</param>
public static IServiceCollection AddNotificationService(this IServiceCollection services)
@@ -17,8 +22,6 @@ public static class ServiceCollectionExtensions
services.AddHttpClient();
services.AddSingleton<OAuth2TokenService>();
services.AddSingleton<Func<ISmtpClientWrapper>>(_ => () => new MailKitSmtpClientWrapper());
services.AddScoped<NotificationDeliveryService>();
services.AddScoped<INotificationDeliveryService>(sp => sp.GetRequiredService<NotificationDeliveryService>());
return services;
}
@@ -26,10 +26,10 @@ public enum SmtpErrorClass
/// the numeric <see cref="SmtpStatusCode"/> rather than locale-dependent substring
/// matching on the exception message.
/// <para>
/// Public and shared: both <see cref="NotificationDeliveryService"/> (store-and-forward
/// delivery) and the central Notification Outbox's <c>EmailNotificationDeliveryAdapter</c>
/// route every SMTP failure through this single policy, so a transient/permanent
/// boundary change cannot diverge between the two delivery paths.
/// Public and shared: the central Notification Outbox's <c>EmailNotificationDeliveryAdapter</c>
/// routes every SMTP failure through this single policy. (NS-019: the orphaned site-side
/// <c>NotificationDeliveryService</c> that previously co-used this classifier was removed
/// when sites stopped delivering notifications.)
/// </para>
/// </summary>
public static class SmtpErrorClassifier
@@ -1,5 +1,7 @@
using System.Text.Json;
using Akka.Actor;
using Microsoft.Extensions.Logging;
using Microsoft.Extensions.Logging.Abstractions;
using ScadaLink.Commons.Messages.Notification;
namespace ScadaLink.StoreAndForward;
@@ -31,6 +33,7 @@ public sealed class NotificationForwarder
private readonly IActorRef _siteCommunicationActor;
private readonly string _sourceSiteId;
private readonly TimeSpan _forwardTimeout;
private readonly ILogger<NotificationForwarder> _logger;
/// <param name="siteCommunicationActor">
/// The site communication actor. It forwards a <see cref="NotificationSubmit"/> to
@@ -42,14 +45,21 @@ public sealed class NotificationForwarder
/// How long to wait for central's ack before treating the forward as a transient
/// failure. Sourced from host configuration.
/// </param>
/// <param name="logger">
/// Optional logger. StoreAndForward-018: a corrupt buffered payload is logged at
/// Warning before being discarded so an operator has a forensic trail of the row
/// that vanished from the buffer.
/// </param>
public NotificationForwarder(
IActorRef siteCommunicationActor,
string sourceSiteId,
TimeSpan forwardTimeout)
TimeSpan forwardTimeout,
ILogger<NotificationForwarder>? logger = null)
{
_siteCommunicationActor = siteCommunicationActor;
_sourceSiteId = sourceSiteId;
_forwardTimeout = forwardTimeout;
_logger = logger ?? NullLogger<NotificationForwarder>.Instance;
}
/// <summary>
@@ -61,11 +71,26 @@ public sealed class NotificationForwarder
/// <param name="message">The buffered store-and-forward message to deliver to central.</param>
public async Task<bool> DeliverAsync(StoreAndForwardMessage message)
{
// An unreadable payload cannot be fixed by retrying — park it (return false),
// mirroring how the former SMTP handler treated a corrupt buffered payload.
// StoreAndForward-018: an unreadable payload cannot be fixed by retrying.
// The design doc explicitly forbids parking notifications ("notifications do
// not park — they are retried at the fixed forward interval until central
// acks"; Component-StoreAndForward.md). The earlier behaviour returned false
// here, which the S&F engine interprets as a permanent failure and parks
// the row — contradicting the invariant and surfacing the row in the
// central UI's parked-message list. The correct outcome for a corrupt-payload
// notification is to DISCARD: log a Warning with the buffered row id +
// payload preview for forensics, then return true so the engine clears the
// buffer via its standard success-path cleanup. The buffered row is
// unrecoverable; retrying or parking would both make the queue worse, not
// better.
if (!TryBuildSubmit(message, out var submit))
{
return false;
_logger.LogWarning(
"Discarding corrupt buffered notification {NotificationId} (payload is not deserialisable as NotificationSubmit). " +
"Payload preview: {PayloadPreview}",
message.Id,
PreviewPayload(message.PayloadJson));
return true;
}
// The reply may legitimately be a non-accepted ack, so it is not requested as
@@ -140,6 +165,25 @@ public sealed class NotificationForwarder
};
return true;
}
private const int CorruptPayloadPreviewMaxLength = 200;
/// <summary>
/// Returns a length-capped preview of a corrupt buffered payload for the Warning
/// log line emitted on discard. The full payload may be megabytes and is not
/// suitable for the structured log; the preview retains the leading characters,
/// which is what an operator typically uses to identify the producing script.
/// </summary>
private static string PreviewPayload(string? payloadJson)
{
if (string.IsNullOrEmpty(payloadJson))
{
return "<empty>";
}
return payloadJson.Length <= CorruptPayloadPreviewMaxLength
? payloadJson
: payloadJson.Substring(0, CorruptPayloadPreviewMaxLength) + "…";
}
}
/// <summary>
@@ -111,12 +111,14 @@ public class DiffService
a.CanonicalName == b.CanonicalName &&
a.Value == b.Value &&
a.DataType == b.DataType &&
a.Description == b.Description &&
a.IsLocked == b.IsLocked &&
a.DataSourceReference == b.DataSourceReference &&
a.BoundDataConnectionId == b.BoundDataConnectionId;
private static bool AlarmsEqual(ResolvedAlarm a, ResolvedAlarm b) =>
a.CanonicalName == b.CanonicalName &&
a.Description == b.Description &&
a.PriorityLevel == b.PriorityLevel &&
a.IsLocked == b.IsLocked &&
a.TriggerType == b.TriggerType &&
@@ -132,4 +134,27 @@ public class DiffService
a.ParameterDefinitions == b.ParameterDefinitions &&
a.ReturnDefinition == b.ReturnDefinition &&
a.MinTimeBetweenRuns == b.MinTimeBetweenRuns;
/// <summary>
/// Compares two <see cref="ConnectionConfig"/> instances for equality across
/// the fields that travel in the deployment package: protocol, primary and
/// backup configuration JSON, and failover retry count. Used by callers that
/// need to detect connection-endpoint drift; the public diff shape only
/// exposes attribute / alarm / script changes today (see TemplateEngine-018
/// for the diff-shape extension that surfaces added / removed / changed
/// connections in the UI).
/// </summary>
/// <param name="a">First connection configuration.</param>
/// <param name="b">Second connection configuration.</param>
/// <returns>True when both configurations are equal.</returns>
public static bool ConnectionsEqual(ConnectionConfig a, ConnectionConfig b)
{
ArgumentNullException.ThrowIfNull(a);
ArgumentNullException.ThrowIfNull(b);
return a.Protocol == b.Protocol &&
a.ConfigurationJson == b.ConfigurationJson &&
a.BackupConfigurationJson == b.BackupConfigurationJson &&
a.FailoverRetryCount == b.FailoverRetryCount;
}
}
@@ -52,6 +52,7 @@ public class RevisionHashService
CanonicalName = a.CanonicalName,
Value = a.Value,
DataType = a.DataType,
Description = a.Description,
IsLocked = a.IsLocked,
DataSourceReference = a.DataSourceReference,
BoundDataConnectionId = a.BoundDataConnectionId
@@ -62,6 +63,7 @@ public class RevisionHashService
.Select(a => new HashableAlarm
{
CanonicalName = a.CanonicalName,
Description = a.Description,
PriorityLevel = a.PriorityLevel,
IsLocked = a.IsLocked,
TriggerType = a.TriggerType,
@@ -82,7 +84,20 @@ public class RevisionHashService
ReturnDefinition = s.ReturnDefinition,
MinTimeBetweenRunsTicks = s.MinTimeBetweenRuns?.Ticks
})
.ToList()
.ToList(),
Connections = configuration.Connections is { Count: > 0 }
? new SortedDictionary<string, HashableConnection>(
configuration.Connections.ToDictionary(
kvp => kvp.Key,
kvp => new HashableConnection
{
BackupConfigurationJson = kvp.Value.BackupConfigurationJson,
ConfigurationJson = kvp.Value.ConfigurationJson,
FailoverRetryCount = kvp.Value.FailoverRetryCount,
Protocol = kvp.Value.Protocol
}),
StringComparer.Ordinal)
: null
};
var json = JsonSerializer.Serialize(hashInput, CanonicalJsonOptions);
@@ -108,6 +123,12 @@ public class RevisionHashService
/// </summary>
public List<HashableAttribute> Attributes { get; init; } = [];
/// <summary>
/// Data connection configurations keyed by connection name. Sorted by key
/// (ordinal) to keep serialization deterministic. Null when the deployment
/// package carries no connections.
/// </summary>
public SortedDictionary<string, HashableConnection>? Connections { get; init; }
/// <summary>
/// The unique instance name.
/// </summary>
public string InstanceUniqueName { get; init; } = string.Empty;
@@ -144,6 +165,11 @@ public class RevisionHashService
/// </summary>
public string DataType { get; init; } = string.Empty;
/// <summary>
/// The attribute description (authoring-time documentation that still
/// travels with the deployed payload).
/// </summary>
public string? Description { get; init; }
/// <summary>
/// Whether the attribute is locked.
/// </summary>
public bool IsLocked { get; init; }
@@ -160,6 +186,11 @@ public class RevisionHashService
/// </summary>
public string CanonicalName { get; init; } = string.Empty;
/// <summary>
/// The alarm description (authoring-time documentation that still
/// travels with the deployed payload).
/// </summary>
public string? Description { get; init; }
/// <summary>
/// Whether the alarm is locked.
/// </summary>
public bool IsLocked { get; init; }
@@ -181,6 +212,26 @@ public class RevisionHashService
public string TriggerType { get; init; } = string.Empty;
}
private sealed record HashableConnection
{
/// <summary>
/// Backup connection configuration JSON, if any.
/// </summary>
public string? BackupConfigurationJson { get; init; }
/// <summary>
/// Primary connection configuration JSON.
/// </summary>
public string? ConfigurationJson { get; init; }
/// <summary>
/// Number of failover retries before giving up.
/// </summary>
public int FailoverRetryCount { get; init; }
/// <summary>
/// Protocol name (e.g. "OpcUa").
/// </summary>
public string Protocol { get; init; } = string.Empty;
}
private sealed record HashableScript
{
/// <summary>
@@ -975,6 +975,17 @@ public sealed class BundleImporter : IBundleImporter
await _templateRepo.UpdateTemplateAsync(ex, ct).ConfigureAwait(false);
await _auditService.LogAsync(user, "Update", "Template", ex.Id.ToString(), ex.Name,
new { ex.Name, ex.Description, ex.FolderId }, ct).ConfigureAwait(false);
// T-001: Overwrite must also synchronise child collections —
// attributes / alarms / scripts diverging between the bundle
// and the target must round-trip. Composition rewire is
// handled by ResolveCompositionEdgesAsync after the global
// flush; alarm→script FKs are rewired by
// ResolveAlarmScriptLinksAsync. The helpers below stage the
// child diffs (add / update / delete) onto the tracked
// entity and emit one audit row per detected change.
await SyncTemplateAttributesAsync(ex, dto, user, ct).ConfigureAwait(false);
await SyncTemplateAlarmsAsync(ex, dto, user, ct).ConfigureAwait(false);
await SyncTemplateScriptsAsync(ex, dto, user, ct).ConfigureAwait(false);
summary.Overwritten++;
break;
case ResolutionAction.Add:
@@ -1055,6 +1066,329 @@ public sealed class BundleImporter : IBundleImporter
return t;
}
/// <summary>
/// T-001 — Overwrite child sync (attributes). Diffs the DTO's
/// <c>Attributes</c> against the existing template's attribute collection
/// by name and stages add / update / delete on the tracked entity. Emits
/// one <c>TemplateAttributeAdded</c> / <c>TemplateAttributeUpdated</c> /
/// <c>TemplateAttributeDeleted</c> audit row per detected change — the
/// per-field rows the design doc's Configuration Audit Trail table
/// enumerates for the "Template overwritten" action.
/// <para>
/// Update detection compares every scalar field (Value, DataType,
/// IsLocked, Description, DataSourceReference) — no field change → no
/// audit row, so an idempotent overwrite produces no noise.
/// </para>
/// </summary>
private async Task SyncTemplateAttributesAsync(
Template ex,
TemplateDto dto,
string user,
CancellationToken ct)
{
var existingByName = ex.Attributes.ToDictionary(a => a.Name, a => a, StringComparer.Ordinal);
var dtoByName = dto.Attributes.ToDictionary(a => a.Name, a => a, StringComparer.Ordinal);
// Deletes — attributes present on the target but not in the bundle.
foreach (var existing in existingByName.Values.ToList())
{
if (dtoByName.ContainsKey(existing.Name)) continue;
await _templateRepo.DeleteTemplateAttributeAsync(existing.Id, ct).ConfigureAwait(false);
ex.Attributes.Remove(existing);
await _auditService.LogAsync(
user,
"TemplateAttributeDeleted",
"TemplateAttribute",
existing.Id.ToString(),
$"{ex.Name}.{existing.Name}",
new { TemplateName = ex.Name, AttributeName = existing.Name },
ct).ConfigureAwait(false);
}
// Adds + Updates.
foreach (var attrDto in dto.Attributes)
{
if (existingByName.TryGetValue(attrDto.Name, out var current))
{
// Update only if any field actually changed.
bool changed =
!string.Equals(current.Value, attrDto.Value, StringComparison.Ordinal) ||
current.DataType != attrDto.DataType ||
current.IsLocked != attrDto.IsLocked ||
!string.Equals(current.Description, attrDto.Description, StringComparison.Ordinal) ||
!string.Equals(current.DataSourceReference, attrDto.DataSourceReference, StringComparison.Ordinal);
if (!changed) continue;
current.Value = attrDto.Value;
current.DataType = attrDto.DataType;
current.IsLocked = attrDto.IsLocked;
current.Description = attrDto.Description;
current.DataSourceReference = attrDto.DataSourceReference;
await _templateRepo.UpdateTemplateAttributeAsync(current, ct).ConfigureAwait(false);
await _auditService.LogAsync(
user,
"TemplateAttributeUpdated",
"TemplateAttribute",
current.Id.ToString(),
$"{ex.Name}.{current.Name}",
new
{
TemplateName = ex.Name,
AttributeName = current.Name,
current.Value,
current.DataType,
current.IsLocked,
current.Description,
current.DataSourceReference,
},
ct).ConfigureAwait(false);
}
else
{
var newAttr = new TemplateAttribute(attrDto.Name)
{
Value = attrDto.Value,
DataType = attrDto.DataType,
IsLocked = attrDto.IsLocked,
Description = attrDto.Description,
DataSourceReference = attrDto.DataSourceReference,
};
ex.Attributes.Add(newAttr);
await _auditService.LogAsync(
user,
"TemplateAttributeAdded",
"TemplateAttribute",
"0",
$"{ex.Name}.{newAttr.Name}",
new
{
TemplateName = ex.Name,
AttributeName = newAttr.Name,
newAttr.Value,
newAttr.DataType,
newAttr.IsLocked,
newAttr.Description,
newAttr.DataSourceReference,
},
ct).ConfigureAwait(false);
}
}
}
/// <summary>
/// T-001 — Overwrite child sync (alarms). Mirrors
/// <see cref="SyncTemplateAttributesAsync"/> for the alarm collection.
/// Updated / added alarms have their <c>OnTriggerScriptId</c> cleared so
/// the post-flush <see cref="ResolveAlarmScriptLinksAsync"/> pass re-binds
/// the FK from the DTO's <c>OnTriggerScriptName</c> against the synced
/// script collection. Audit rows: <c>TemplateAlarmAdded</c> /
/// <c>TemplateAlarmUpdated</c> / <c>TemplateAlarmDeleted</c>.
/// </summary>
private async Task SyncTemplateAlarmsAsync(
Template ex,
TemplateDto dto,
string user,
CancellationToken ct)
{
var existingByName = ex.Alarms.ToDictionary(a => a.Name, a => a, StringComparer.Ordinal);
var dtoByName = dto.Alarms.ToDictionary(a => a.Name, a => a, StringComparer.Ordinal);
foreach (var existing in existingByName.Values.ToList())
{
if (dtoByName.ContainsKey(existing.Name)) continue;
await _templateRepo.DeleteTemplateAlarmAsync(existing.Id, ct).ConfigureAwait(false);
ex.Alarms.Remove(existing);
await _auditService.LogAsync(
user,
"TemplateAlarmDeleted",
"TemplateAlarm",
existing.Id.ToString(),
$"{ex.Name}.{existing.Name}",
new { TemplateName = ex.Name, AlarmName = existing.Name },
ct).ConfigureAwait(false);
}
foreach (var alarmDto in dto.Alarms)
{
if (existingByName.TryGetValue(alarmDto.Name, out var current))
{
bool changed =
!string.Equals(current.Description, alarmDto.Description, StringComparison.Ordinal) ||
current.PriorityLevel != alarmDto.PriorityLevel ||
current.TriggerType != alarmDto.TriggerType ||
!string.Equals(current.TriggerConfiguration, alarmDto.TriggerConfiguration, StringComparison.Ordinal) ||
current.IsLocked != alarmDto.IsLocked;
if (!changed)
{
// Always reset the script FK on Overwrite so the post-flush
// resolve pass owns the binding (the DTO's script name is
// the authoritative reference); leaving a stale FK would
// silently survive Overwrite when the user expected the
// bundle to be the source of truth.
if ((current.OnTriggerScriptId is not null) ||
!string.IsNullOrEmpty(alarmDto.OnTriggerScriptName))
{
current.OnTriggerScriptId = null;
await _templateRepo.UpdateTemplateAlarmAsync(current, ct).ConfigureAwait(false);
}
continue;
}
current.Description = alarmDto.Description;
current.PriorityLevel = alarmDto.PriorityLevel;
current.TriggerType = alarmDto.TriggerType;
current.TriggerConfiguration = alarmDto.TriggerConfiguration;
current.IsLocked = alarmDto.IsLocked;
current.OnTriggerScriptId = null; // re-resolved post-flush.
await _templateRepo.UpdateTemplateAlarmAsync(current, ct).ConfigureAwait(false);
await _auditService.LogAsync(
user,
"TemplateAlarmUpdated",
"TemplateAlarm",
current.Id.ToString(),
$"{ex.Name}.{current.Name}",
new
{
TemplateName = ex.Name,
AlarmName = current.Name,
current.Description,
current.PriorityLevel,
current.TriggerType,
current.TriggerConfiguration,
current.IsLocked,
OnTriggerScriptName = alarmDto.OnTriggerScriptName,
},
ct).ConfigureAwait(false);
}
else
{
var newAlarm = new TemplateAlarm(alarmDto.Name)
{
Description = alarmDto.Description,
PriorityLevel = alarmDto.PriorityLevel,
TriggerType = alarmDto.TriggerType,
TriggerConfiguration = alarmDto.TriggerConfiguration,
IsLocked = alarmDto.IsLocked,
};
ex.Alarms.Add(newAlarm);
await _auditService.LogAsync(
user,
"TemplateAlarmAdded",
"TemplateAlarm",
"0",
$"{ex.Name}.{newAlarm.Name}",
new
{
TemplateName = ex.Name,
AlarmName = newAlarm.Name,
newAlarm.Description,
newAlarm.PriorityLevel,
newAlarm.TriggerType,
newAlarm.TriggerConfiguration,
newAlarm.IsLocked,
OnTriggerScriptName = alarmDto.OnTriggerScriptName,
},
ct).ConfigureAwait(false);
}
}
}
/// <summary>
/// T-001 — Overwrite child sync (scripts). Mirrors
/// <see cref="SyncTemplateAttributesAsync"/> for the script collection.
/// Audit rows: <c>TemplateScriptAdded</c> / <c>TemplateScriptUpdated</c> /
/// <c>TemplateScriptDeleted</c>.
/// </summary>
private async Task SyncTemplateScriptsAsync(
Template ex,
TemplateDto dto,
string user,
CancellationToken ct)
{
var existingByName = ex.Scripts.ToDictionary(s => s.Name, s => s, StringComparer.Ordinal);
var dtoByName = dto.Scripts.ToDictionary(s => s.Name, s => s, StringComparer.Ordinal);
foreach (var existing in existingByName.Values.ToList())
{
if (dtoByName.ContainsKey(existing.Name)) continue;
await _templateRepo.DeleteTemplateScriptAsync(existing.Id, ct).ConfigureAwait(false);
ex.Scripts.Remove(existing);
await _auditService.LogAsync(
user,
"TemplateScriptDeleted",
"TemplateScript",
existing.Id.ToString(),
$"{ex.Name}.{existing.Name}",
new { TemplateName = ex.Name, ScriptName = existing.Name },
ct).ConfigureAwait(false);
}
foreach (var scriptDto in dto.Scripts)
{
if (existingByName.TryGetValue(scriptDto.Name, out var current))
{
bool changed =
!string.Equals(current.Code, scriptDto.Code, StringComparison.Ordinal) ||
!string.Equals(current.TriggerType, scriptDto.TriggerType, StringComparison.Ordinal) ||
!string.Equals(current.TriggerConfiguration, scriptDto.TriggerConfiguration, StringComparison.Ordinal) ||
!string.Equals(current.ParameterDefinitions, scriptDto.ParameterDefinitions, StringComparison.Ordinal) ||
!string.Equals(current.ReturnDefinition, scriptDto.ReturnDefinition, StringComparison.Ordinal) ||
current.IsLocked != scriptDto.IsLocked;
if (!changed) continue;
current.Code = scriptDto.Code;
current.TriggerType = scriptDto.TriggerType;
current.TriggerConfiguration = scriptDto.TriggerConfiguration;
current.ParameterDefinitions = scriptDto.ParameterDefinitions;
current.ReturnDefinition = scriptDto.ReturnDefinition;
current.IsLocked = scriptDto.IsLocked;
await _templateRepo.UpdateTemplateScriptAsync(current, ct).ConfigureAwait(false);
await _auditService.LogAsync(
user,
"TemplateScriptUpdated",
"TemplateScript",
current.Id.ToString(),
$"{ex.Name}.{current.Name}",
new
{
TemplateName = ex.Name,
ScriptName = current.Name,
current.TriggerType,
current.TriggerConfiguration,
current.IsLocked,
},
ct).ConfigureAwait(false);
}
else
{
var newScript = new TemplateScript(scriptDto.Name, scriptDto.Code)
{
TriggerType = scriptDto.TriggerType,
TriggerConfiguration = scriptDto.TriggerConfiguration,
ParameterDefinitions = scriptDto.ParameterDefinitions,
ReturnDefinition = scriptDto.ReturnDefinition,
IsLocked = scriptDto.IsLocked,
};
ex.Scripts.Add(newScript);
await _auditService.LogAsync(
user,
"TemplateScriptAdded",
"TemplateScript",
"0",
$"{ex.Name}.{newScript.Name}",
new
{
TemplateName = ex.Name,
ScriptName = newScript.Name,
newScript.TriggerType,
newScript.TriggerConfiguration,
newScript.IsLocked,
},
ct).ConfigureAwait(false);
}
}
}
/// <summary>
/// FU-B / remainder of #37 — Pass A of the post-template-flush rewire.
/// For every imported template (Add / Overwrite / Rename) whose bundle DTO
@@ -1345,6 +1679,12 @@ public sealed class BundleImporter : IBundleImporter
await _externalRepo.UpdateExternalSystemAsync(existing, ct).ConfigureAwait(false);
await _auditService.LogAsync(user, "Update", "ExternalSystem", existing.Id.ToString(), existing.Name,
new { existing.Name, existing.EndpointUrl }, ct).ConfigureAwait(false);
// T-002: Overwrite must also synchronise the Methods child
// collection — added / removed / modified methods on the
// bundle DTO must round-trip. Mirrors the T-001 template
// child-sync helpers (attributes / alarms / scripts): each
// helper emits one audit row per detected change.
await SyncExternalSystemMethodsAsync(existing, dto, user, ct).ConfigureAwait(false);
summary.Overwritten++;
break;
case ResolutionAction.Add:
@@ -1371,6 +1711,114 @@ public sealed class BundleImporter : IBundleImporter
return sys;
}
/// <summary>
/// T-002 — Overwrite child sync (ExternalSystem methods). Mirrors the
/// T-001 <c>SyncTemplate*Async</c> helpers: name-keyed diff between the
/// bundle DTO and the persisted children, then add / update / delete via
/// the repository with one audit row per detected change. Methods are
/// NOT a navigation on <see cref="ExternalSystemDefinition"/> (the FK
/// runs from <see cref="ExternalSystemMethod.ExternalSystemDefinitionId"/>
/// to the parent) so the helper drives the repo directly rather than
/// mutating a tracked collection like the template helpers do.
/// <para>
/// Audit rows: <c>ExternalSystemMethodAdded</c> /
/// <c>ExternalSystemMethodUpdated</c> / <c>ExternalSystemMethodDeleted</c>.
/// Idempotent: scalar-field comparison gates the Update audit row, so an
/// Overwrite against an already-matching method produces no noise.
/// </para>
/// </summary>
private async Task SyncExternalSystemMethodsAsync(
ExternalSystemDefinition ex,
ExternalSystemDto dto,
string user,
CancellationToken ct)
{
var existingMethods = await _externalRepo
.GetMethodsByExternalSystemIdAsync(ex.Id, ct)
.ConfigureAwait(false);
var existingByName = existingMethods.ToDictionary(m => m.Name, m => m, StringComparer.Ordinal);
var dtoByName = dto.Methods.ToDictionary(m => m.Name, m => m, StringComparer.Ordinal);
// Deletes — methods present on the target but not in the bundle.
foreach (var existing in existingByName.Values.ToList())
{
if (dtoByName.ContainsKey(existing.Name)) continue;
await _externalRepo.DeleteExternalSystemMethodAsync(existing.Id, ct).ConfigureAwait(false);
await _auditService.LogAsync(
user,
"ExternalSystemMethodDeleted",
"ExternalSystemMethod",
existing.Id.ToString(),
$"{ex.Name}.{existing.Name}",
new { ExternalSystemName = ex.Name, MethodName = existing.Name },
ct).ConfigureAwait(false);
}
// Adds + Updates.
foreach (var methodDto in dto.Methods)
{
if (existingByName.TryGetValue(methodDto.Name, out var current))
{
// Update only if any field actually changed — mirrors the
// ArtifactDiff.ExternalSystemMethodsEqual comparator.
bool changed =
!string.Equals(current.HttpMethod, methodDto.HttpMethod, StringComparison.Ordinal) ||
!string.Equals(current.Path, methodDto.Path, StringComparison.Ordinal) ||
!string.Equals(current.ParameterDefinitions, methodDto.ParameterDefinitions, StringComparison.Ordinal) ||
!string.Equals(current.ReturnDefinition, methodDto.ReturnDefinition, StringComparison.Ordinal);
if (!changed) continue;
current.HttpMethod = methodDto.HttpMethod;
current.Path = methodDto.Path;
current.ParameterDefinitions = methodDto.ParameterDefinitions;
current.ReturnDefinition = methodDto.ReturnDefinition;
await _externalRepo.UpdateExternalSystemMethodAsync(current, ct).ConfigureAwait(false);
await _auditService.LogAsync(
user,
"ExternalSystemMethodUpdated",
"ExternalSystemMethod",
current.Id.ToString(),
$"{ex.Name}.{current.Name}",
new
{
ExternalSystemName = ex.Name,
MethodName = current.Name,
current.HttpMethod,
current.Path,
current.ParameterDefinitions,
current.ReturnDefinition,
},
ct).ConfigureAwait(false);
}
else
{
var newMethod = new ExternalSystemMethod(methodDto.Name, methodDto.HttpMethod, methodDto.Path)
{
ExternalSystemDefinitionId = ex.Id,
ParameterDefinitions = methodDto.ParameterDefinitions,
ReturnDefinition = methodDto.ReturnDefinition,
};
await _externalRepo.AddExternalSystemMethodAsync(newMethod, ct).ConfigureAwait(false);
await _auditService.LogAsync(
user,
"ExternalSystemMethodAdded",
"ExternalSystemMethod",
"0",
$"{ex.Name}.{newMethod.Name}",
new
{
ExternalSystemName = ex.Name,
MethodName = newMethod.Name,
newMethod.HttpMethod,
newMethod.Path,
newMethod.ParameterDefinitions,
newMethod.ReturnDefinition,
},
ct).ConfigureAwait(false);
}
}
}
private async Task ApplyDatabaseConnectionsAsync(
IReadOnlyList<DatabaseConnectionDto> dtos,
Dictionary<(string, string), ImportResolution> map,
@@ -230,17 +230,10 @@ public class CompatibilityTests
// ── Round-trip serialization for all key message types ──
[Fact]
public void RoundTrip_ConnectionStateChanged_Succeeds()
{
var msg = new ConnectionStateChanged("site-01", true, DateTimeOffset.UtcNow);
var json = JsonSerializer.Serialize(msg);
var deserialized = JsonSerializer.Deserialize<ConnectionStateChanged>(json, Options);
Assert.NotNull(deserialized);
Assert.Equal("site-01", deserialized!.SiteId);
Assert.True(deserialized.IsConnected);
}
// Communication-016: RoundTrip_ConnectionStateChanged_Succeeds removed
// alongside the dead ConnectionStateChanged message record. No production
// code emits or receives this message — disconnect detection is owned by
// the gRPC keepalive and the Ask-timeout path.
[Fact]
public void RoundTrip_AlarmStateChanged_Succeeds()
@@ -116,30 +116,13 @@ public class CentralCommunicationActorTests : TestKit
ExpectNoMsg(TimeSpan.FromMilliseconds(200));
}
[Fact]
public void ConnectionLost_DebugStreamsKilled()
{
var site = CreateSite("site1", "akka.tcp://scadalink@host:8082");
var (actor, _, siteProbes) = CreateActorWithMockRepo(new[] { site });
// Wait for auto-refresh
Thread.Sleep(1000);
// Subscribe to debug view (tracks the subscription)
var subscriberProbe = CreateTestProbe();
var subRequest = new SubscribeDebugViewRequest("inst1", "corr-123");
actor.Tell(new SiteEnvelope("site1", subRequest), subscriberProbe.Ref);
// The ClusterClient probe receives the routed message
siteProbes["site1"].ExpectMsg<ClusterClient.Send>();
// Simulate site disconnection
actor.Tell(new ConnectionStateChanged("site1", false, DateTimeOffset.UtcNow));
// The subscriber should receive a DebugStreamTerminated notification
subscriberProbe.ExpectMsg<DebugStreamTerminated>(
msg => msg.SiteId == "site1" && msg.CorrelationId == "corr-123");
}
// Communication-016: the prior `ConnectionLost_DebugStreamsKilled` test was
// removed alongside the dead HandleConnectionStateChanged handler. No
// production code ever emitted ConnectionStateChanged, so the test was
// exercising a workflow that never ran. Disconnect detection is owned by
// the gRPC keepalive (DebugStreamBridgeActor self-terminates) and by the
// Ask-timeout path at the CommunicationService layer (deploy callers see
// a failure).
[Fact]
public void Heartbeat_BumpsAggregatorTimestamp()
@@ -820,6 +820,56 @@ public class DeploymentServiceTests : TestKit
Assert.Equal("sha256:target", storedSnapshot.RevisionHash);
}
// ── DeploymentManager-018: reconciliation must preserve an intentional Disabled state ──
[Fact]
public async Task DeployInstanceAsync_Reconciled_DisabledInstance_PreservesDisabledState()
{
// DeploymentManager-018: after a central failover, the in-memory
// OperationLockManager is lost (by design — in-progress treated as
// failed). The prior deployment record remains InProgress in the DB.
// The operator can legitimately invoke Disable on the instance between
// the timed-out deploy and the redeploy. Disable does not change the
// deployed config, so the site still reports the target revision hash.
// When the operator retries the deploy, the reconciliation branch must
// NOT silently overwrite Instance.State back to Enabled — that would
// undo the explicit operator action with no audit trail.
var instance = new Instance("ReconcileDisabled")
{
Id = 72, SiteId = 1, State = InstanceState.Disabled
};
_repo.GetInstanceByIdAsync(72, Arg.Any<CancellationToken>()).Returns(instance);
SetupValidPipeline(72, "ReconcileDisabled", "sha256:target");
var prior = new DeploymentRecord("dep-prior-72", "admin")
{
InstanceId = 72,
Status = DeploymentStatus.InProgress,
RevisionHash = "sha256:target"
};
_repo.GetCurrentDeploymentStatusAsync(72, Arg.Any<CancellationToken>()).Returns(prior);
_repo.GetDeployedSnapshotByInstanceIdAsync(72, Arg.Any<CancellationToken>())
.Returns((DeployedConfigSnapshot?)null);
var commActor = Sys.ActorOf(Props.Create(() =>
new ReconcileProbeActor(siteHash: "sha256:target", failQuery: false)));
var service = CreateServiceWithCommActor(commActor);
var result = await service.DeployInstanceAsync(72, "admin");
// The reconciliation still succeeds and the prior record is marked
// Success — central and site agree on the applied config.
Assert.True(result.IsSuccess);
Assert.Equal(DeploymentStatus.Success, prior.Status);
Assert.Equal(1, ReconcileProbeActor.QueryCount);
Assert.Equal(0, ReconcileProbeActor.DeployCount);
// DeploymentManager-018: the operator's explicit Disable must survive
// the reconciliation — Instance.State stays Disabled, not silently
// flipped to Enabled.
Assert.Equal(InstanceState.Disabled, instance.State);
}
// ── DeploymentManager-016: reconciled record must carry the target revision hash ──
[Fact]
@@ -207,6 +207,31 @@ public class DatabaseGatewayTests
Assert.False(delivered); // permanent — the S&F engine parks the message
}
// ── ExternalSystemGateway-018: malformed JSON payload must park, not retry-forever ──
[Fact]
public async Task DeliverBuffered_MalformedJsonPayload_ReturnsFalseSoMessageParks()
{
// No connection stub needed — deserialization fails before any
// resolution or SQL execution. If the JsonException were to escape (the
// pre-018 behaviour) the S&F engine would treat it as transient and
// retry the same poison row forever.
var gateway = new DatabaseGateway(_repository, NullLogger<DatabaseGateway>.Instance);
var poisonMessage = new ScadaLink.StoreAndForward.StoreAndForwardMessage
{
Id = Guid.NewGuid().ToString("N"),
Category = ScadaLink.Commons.Types.Enums.StoreAndForwardCategory.CachedDbWrite,
Target = "someDb",
// Truncated mid-write — `{` opens an object that never closes.
PayloadJson = "{\"ConnectionName\":\"someDb\",\"Sql\":\"INSERT",
};
var delivered = await gateway.DeliverBufferedAsync(poisonMessage);
Assert.False(delivered); // permanent — the S&F engine parks the message
}
// ── ExternalSystemGateway-010: SqlConnection must not leak when OpenAsync fails ──
[Fact]
@@ -234,6 +234,32 @@ public class ExternalSystemClientTests
() => client.DeliverBufferedAsync(BufferedCall("TestAPI", "failMethod")));
}
// ── ExternalSystemGateway-018: malformed JSON payload must park, not retry-forever ──
[Fact]
public async Task DeliverBuffered_MalformedJsonPayload_ReturnsFalseSoMessageParks()
{
// No repository / HTTP stubs needed — deserialization fails before any
// resolution or HTTP call. If the JsonException were to escape (the
// pre-018 behaviour) the S&F engine would treat it as transient and
// retry the same poison row forever.
var client = new ExternalSystemClient(
_httpClientFactory, _repository, NullLogger<ExternalSystemClient>.Instance);
var poisonMessage = new StoreAndForwardMessage
{
Id = Guid.NewGuid().ToString("N"),
Category = ScadaLink.Commons.Types.Enums.StoreAndForwardCategory.ExternalSystem,
Target = "TestAPI",
// Truncated mid-write — `{` opens an object that never closes.
PayloadJson = "{\"SystemName\":\"TestAPI\",\"MethodName\":\"get",
};
var delivered = await client.DeliverBufferedAsync(poisonMessage);
Assert.False(delivered); // permanent — the S&F engine parks the message
}
// ── ExternalSystemGateway-003: CachedCall must not double-dispatch ──
[Fact]
@@ -17,6 +17,7 @@ using ScadaLink.ExternalSystemGateway;
using ScadaLink.HealthMonitoring;
using ScadaLink.Host;
using ScadaLink.Host.Actors;
using ScadaLink.Host.Health;
using ScadaLink.InboundAPI;
using ScadaLink.ManagementService;
using ScadaLink.NotificationService;
@@ -204,9 +205,13 @@ public class CentralCompositionRootTests : IDisposable
new object[] { typeof(IExternalSystemClient) },
new object[] { typeof(DatabaseGateway) },
new object[] { typeof(IDatabaseGateway) },
// NotificationService
new object[] { typeof(NotificationDeliveryService) },
new object[] { typeof(INotificationDeliveryService) },
// NotificationService — central-only SMTP primitives. The orphan
// NotificationDeliveryService + INotificationDeliveryService were removed
// (NS-019) when sites stopped delivering notifications; the central
// EmailNotificationDeliveryAdapter is now the only resolver of these
// primitives.
new object[] { typeof(Func<ISmtpClientWrapper>) },
new object[] { typeof(OAuth2TokenService) },
// ConfigurationDatabase repositories
new object[] { typeof(ScadaLinkDbContext) },
new object[] { typeof(ISecurityRepository) },
@@ -277,6 +282,30 @@ public class CentralCompositionRootTests : IDisposable
var hostedServices = _factory.Services.GetServices<IHostedService>();
Assert.Contains(hostedServices, s => s.GetType() == typeof(CentralHealthAggregator));
}
// --- InboundAPI-022 regression ---
/// <summary>
/// InboundAPI-022 regression: the Central composition root MUST register a
/// concrete <see cref="IActiveNodeGate"/> implementation. Without it,
/// <see cref="InboundApiEndpointFilter"/> falls through to "allow" and a
/// standby central node continues to serve the inbound API, racing the
/// active node and executing scripts against stale singleton state. The
/// design's "central cluster only (active node)" guarantee is enforced only
/// when the production gate is wired here.
///
/// Structural check on the built provider (not just <see cref="IServiceCollection"/>)
/// — a registration the framework cannot resolve to a concrete instance is
/// indistinguishable from "missing" at runtime, which is the failure mode
/// the finding describes.
/// </summary>
[Fact]
public void Central_IActiveNodeGate_IsRegisteredAsActiveNodeGate()
{
var gate = _factory.Services.GetService<IActiveNodeGate>();
Assert.NotNull(gate);
Assert.IsType<ActiveNodeGate>(gate);
}
}
/// <summary>
@@ -5,11 +5,9 @@ using System.Text.Json;
using Microsoft.Extensions.DependencyInjection;
using NSubstitute;
using ScadaLink.Commons.Entities.InboundApi;
using ScadaLink.Commons.Entities.Notifications;
using ScadaLink.Commons.Interfaces.Repositories;
using ScadaLink.Commons.Interfaces.Services;
using ScadaLink.InboundAPI;
using ScadaLink.NotificationService;
namespace ScadaLink.IntegrationTests;
@@ -98,42 +96,11 @@ public class IntegrationSurfaceTests
}
// ── Notification: mock SMTP delivery ──
[Fact]
public async Task Notification_Send_MockSmtp_Delivers()
{
var repository = Substitute.For<INotificationRepository>();
var smtpClient = Substitute.For<ISmtpClientWrapper>();
var list = new NotificationList("alerts") { Id = 1 };
var recipients = new List<NotificationRecipient>
{
new("Admin", "admin@example.com") { Id = 1, NotificationListId = 1 }
};
var smtpConfig = new SmtpConfiguration("smtp.example.com", "basic", "noreply@example.com")
{
Id = 1, Port = 587, Credentials = "user:pass"
};
repository.GetListByNameAsync("alerts").Returns(list);
repository.GetRecipientsByListIdAsync(1).Returns(recipients);
repository.GetAllSmtpConfigurationsAsync().Returns(new List<SmtpConfiguration> { smtpConfig });
var service = new NotificationDeliveryService(
repository,
() => smtpClient,
Microsoft.Extensions.Logging.Abstractions.NullLogger<NotificationDeliveryService>.Instance);
var result = await service.SendAsync("alerts", "Test Alert", "Something happened");
Assert.True(result.Success);
await smtpClient.Received(1).SendAsync(
"noreply@example.com",
Arg.Is<IEnumerable<string>>(r => r.Contains("admin@example.com")),
"Test Alert",
"Something happened",
Arg.Any<CancellationToken>());
}
// NS-019: the site-shaped NotificationDeliveryService that this case exercised
// was removed when sites stopped delivering notifications. The central SMTP
// delivery path is now covered end-to-end by
// ScadaLink.NotificationOutbox.Tests.Delivery.EmailNotificationDeliveryAdapterTests;
// no equivalent integration-surface assertion is needed here.
// ── Script Context: integration API wiring ──
File diff suppressed because it is too large Load Diff
@@ -197,4 +197,58 @@ public class NotificationForwarderTests : TestKit
Assert.Equal(submit1.NotificationId, submit2.NotificationId);
Assert.Equal("stable-msg-id", submit1.NotificationId);
}
[Fact]
public async Task Deliver_CorruptJsonPayload_ReturnsTrue_AndDoesNotForwardAnything()
{
// Regression test for StoreAndForward-018. The design doc forbids parking
// notifications ("notifications do not park — they are retried at the fixed
// forward interval until central acks"; Component-StoreAndForward.md). The
// previous implementation returned false on a corrupt payload, which the S&F
// engine interprets as a permanent failure and parks the row — contradicting
// the invariant. The fix: discard a corrupt buffered notification by
// returning true (engine clears the buffer via its normal success path),
// with a Warning log line carrying the row id and a payload preview.
var centralProbe = CreateTestProbe();
var forwarder = new NotificationForwarder(
centralProbe.Ref, "site-7", ForwardTimeout);
var corrupt = new StoreAndForwardMessage
{
Id = "msg-corrupt",
Category = StoreAndForwardCategory.Notification,
Target = "Operators",
PayloadJson = "{not-valid-json",
OriginInstanceName = "Plant.Pump3",
};
Assert.True(await forwarder.DeliverAsync(corrupt));
// The corrupt-payload path must NOT round-trip to central — no
// NotificationSubmit / no Ask. ExpectNoMsg confirms nothing was forwarded.
centralProbe.ExpectNoMsg(TimeSpan.FromMilliseconds(200));
}
[Fact]
public async Task Deliver_NullDeserializedPayload_ReturnsTrue_AndDoesNotForwardAnything()
{
// The companion case to corrupt JSON: the payload is valid JSON but
// deserialises to null (e.g. "null"). Same treatment per StoreAndForward-018
// — discard rather than park.
var centralProbe = CreateTestProbe();
var forwarder = new NotificationForwarder(
centralProbe.Ref, "site-7", ForwardTimeout);
var nullPayload = new StoreAndForwardMessage
{
Id = "msg-null",
Category = StoreAndForwardCategory.Notification,
Target = "Operators",
PayloadJson = "null",
OriginInstanceName = "Plant.Pump3",
};
Assert.True(await forwarder.DeliverAsync(nullPayload));
centralProbe.ExpectNoMsg(TimeSpan.FromMilliseconds(200));
}
}
@@ -133,4 +133,122 @@ public class DiffServiceTests
Assert.Single(diff.ScriptChanges);
Assert.Equal(DiffChangeType.Changed, diff.ScriptChanges[0].ChangeType);
}
[Fact]
public void ComputeDiff_AttributeDescriptionChange_DetectedAsChanged()
{
// TemplateEngine-017: AttributesEqual must compare Description.
var oldConfig = new FlattenedConfiguration
{
InstanceUniqueName = "Instance1",
Attributes =
[
new ResolvedAttribute { CanonicalName = "Temp", Value = "25", DataType = "Double", Description = "Original" }
]
};
var newConfig = new FlattenedConfiguration
{
InstanceUniqueName = "Instance1",
Attributes =
[
new ResolvedAttribute { CanonicalName = "Temp", Value = "25", DataType = "Double", Description = "Updated" }
]
};
var diff = _sut.ComputeDiff(oldConfig, newConfig);
Assert.True(diff.HasChanges);
Assert.Single(diff.AttributeChanges);
Assert.Equal(DiffChangeType.Changed, diff.AttributeChanges[0].ChangeType);
}
[Fact]
public void ComputeDiff_AlarmDescriptionChange_DetectedAsChanged()
{
// TemplateEngine-017: AlarmsEqual must compare Description.
var oldConfig = new FlattenedConfiguration
{
InstanceUniqueName = "Instance1",
Alarms =
[
new ResolvedAlarm { CanonicalName = "HighTemp", TriggerType = "RangeViolation", Description = "Original" }
]
};
var newConfig = new FlattenedConfiguration
{
InstanceUniqueName = "Instance1",
Alarms =
[
new ResolvedAlarm { CanonicalName = "HighTemp", TriggerType = "RangeViolation", Description = "Updated" }
]
};
var diff = _sut.ComputeDiff(oldConfig, newConfig);
Assert.True(diff.HasChanges);
Assert.Single(diff.AlarmChanges);
Assert.Equal(DiffChangeType.Changed, diff.AlarmChanges[0].ChangeType);
}
[Fact]
public void ConnectionsEqual_IdenticalConfigs_ReturnsTrue()
{
// TemplateEngine-017: ConnectionsEqual is the comparator callers use
// to detect connection-endpoint drift (the diff-view extension that
// surfaces this in the UI is tracked under TemplateEngine-018).
var a = new ConnectionConfig
{
Protocol = "OpcUa",
ConfigurationJson = "{\"endpoint\":\"opc.tcp://host-a\"}",
BackupConfigurationJson = "{\"endpoint\":\"opc.tcp://host-b\"}",
FailoverRetryCount = 3
};
var b = a with { };
Assert.True(DiffService.ConnectionsEqual(a, b));
}
[Fact]
public void ConnectionsEqual_EndpointEdit_ReturnsFalse()
{
// TemplateEngine-017: primary endpoint JSON edit must surface as a
// change. Without this, deployment redeploys ship a different
// ConnectionConfig with no visible drift signal.
var a = new ConnectionConfig
{
Protocol = "OpcUa",
ConfigurationJson = "{\"endpoint\":\"opc.tcp://host-a:4840\"}",
FailoverRetryCount = 3
};
var b = a with { ConfigurationJson = "{\"endpoint\":\"opc.tcp://host-b:4840\"}" };
Assert.False(DiffService.ConnectionsEqual(a, b));
}
[Fact]
public void ConnectionsEqual_BackupConfigurationEdit_ReturnsFalse()
{
var a = new ConnectionConfig { Protocol = "OpcUa", ConfigurationJson = "{}", BackupConfigurationJson = null, FailoverRetryCount = 3 };
var b = a with { BackupConfigurationJson = "{\"endpoint\":\"opc.tcp://backup\"}" };
Assert.False(DiffService.ConnectionsEqual(a, b));
}
[Fact]
public void ConnectionsEqual_FailoverRetryCountEdit_ReturnsFalse()
{
var a = new ConnectionConfig { Protocol = "OpcUa", ConfigurationJson = "{}", FailoverRetryCount = 3 };
var b = a with { FailoverRetryCount = 5 };
Assert.False(DiffService.ConnectionsEqual(a, b));
}
[Fact]
public void ConnectionsEqual_ProtocolEdit_ReturnsFalse()
{
var a = new ConnectionConfig { Protocol = "OpcUa", ConfigurationJson = "{}", FailoverRetryCount = 3 };
var b = a with { Protocol = "Modbus" };
Assert.False(DiffService.ConnectionsEqual(a, b));
}
}
@@ -131,6 +131,154 @@ public class RevisionHashServiceTests
}
}
[Fact]
public void ComputeHash_AttributeDescriptionEdit_ChangesHash()
{
// TemplateEngine-017: Description must be folded into the hash so that
// edits to authoring-time documentation (which still travels in the
// deployed payload) flow through the staleness indicator.
var baseAttr = new ResolvedAttribute
{
CanonicalName = "Temperature",
Value = "25",
DataType = "Double",
Description = "Original description"
};
var editedAttr = baseAttr with { Description = "Updated description" };
var configBefore = new FlattenedConfiguration
{
InstanceUniqueName = "Instance1",
TemplateId = 1,
SiteId = 1,
Attributes = [baseAttr]
};
var configAfter = configBefore with { Attributes = [editedAttr] };
Assert.NotEqual(_sut.ComputeHash(configBefore), _sut.ComputeHash(configAfter));
}
[Fact]
public void ComputeHash_AlarmDescriptionEdit_ChangesHash()
{
// TemplateEngine-017: same Description contract applies to alarms.
var baseAlarm = new ResolvedAlarm
{
CanonicalName = "HighTemp",
TriggerType = "RangeViolation",
Description = "Original"
};
var editedAlarm = baseAlarm with { Description = "Updated" };
var configBefore = new FlattenedConfiguration
{
InstanceUniqueName = "Instance1",
TemplateId = 1,
SiteId = 1,
Alarms = [baseAlarm]
};
var configAfter = configBefore with { Alarms = [editedAlarm] };
Assert.NotEqual(_sut.ComputeHash(configBefore), _sut.ComputeHash(configAfter));
}
[Fact]
public void ComputeHash_ConnectionEndpointEdit_ChangesHash()
{
// TemplateEngine-017: a Deployment user editing the primary endpoint
// JSON of a data connection bound to an instance must produce a
// different revision hash. The connection's protocol, primary/backup
// configuration JSON, and failover retry count are all part of the
// deployment package and therefore part of the hash input.
var connectionsBefore = new Dictionary<string, ConnectionConfig>
{
["plc1"] = new ConnectionConfig
{
Protocol = "OpcUa",
ConfigurationJson = "{\"endpoint\":\"opc.tcp://host-a:4840\"}",
BackupConfigurationJson = null,
FailoverRetryCount = 3
}
};
var connectionsAfter = new Dictionary<string, ConnectionConfig>
{
["plc1"] = connectionsBefore["plc1"] with
{
ConfigurationJson = "{\"endpoint\":\"opc.tcp://host-b:4840\"}"
}
};
var configBefore = new FlattenedConfiguration
{
InstanceUniqueName = "Instance1",
TemplateId = 1,
SiteId = 1,
Connections = connectionsBefore
};
var configAfter = configBefore with { Connections = connectionsAfter };
Assert.NotEqual(_sut.ComputeHash(configBefore), _sut.ComputeHash(configAfter));
}
[Fact]
public void ComputeHash_ConnectionProtocolEdit_ChangesHash()
{
// TemplateEngine-017: changing protocol must change the hash.
var connectionsBefore = new Dictionary<string, ConnectionConfig>
{
["plc1"] = new ConnectionConfig
{
Protocol = "OpcUa",
ConfigurationJson = "{}",
FailoverRetryCount = 3
}
};
var connectionsAfter = new Dictionary<string, ConnectionConfig>
{
["plc1"] = connectionsBefore["plc1"] with { Protocol = "Modbus" }
};
var configBefore = new FlattenedConfiguration
{
InstanceUniqueName = "Instance1",
TemplateId = 1,
SiteId = 1,
Connections = connectionsBefore
};
var configAfter = configBefore with { Connections = connectionsAfter };
Assert.NotEqual(_sut.ComputeHash(configBefore), _sut.ComputeHash(configAfter));
}
[Fact]
public void ComputeHash_ConnectionsSameContent_SameHash()
{
// TemplateEngine-017: equal Connections maps must yield the same hash,
// regardless of dictionary iteration order (the SortedDictionary
// projection guards this).
var connections1 = new Dictionary<string, ConnectionConfig>
{
["b"] = new ConnectionConfig { Protocol = "OpcUa", ConfigurationJson = "{\"k\":2}", FailoverRetryCount = 3 },
["a"] = new ConnectionConfig { Protocol = "OpcUa", ConfigurationJson = "{\"k\":1}", FailoverRetryCount = 3 }
};
var connections2 = new Dictionary<string, ConnectionConfig>
{
["a"] = new ConnectionConfig { Protocol = "OpcUa", ConfigurationJson = "{\"k\":1}", FailoverRetryCount = 3 },
["b"] = new ConnectionConfig { Protocol = "OpcUa", ConfigurationJson = "{\"k\":2}", FailoverRetryCount = 3 }
};
var config1 = new FlattenedConfiguration
{
InstanceUniqueName = "Instance1",
TemplateId = 1,
SiteId = 1,
Connections = connections1
};
var config2 = config1 with { Connections = connections2 };
Assert.Equal(_sut.ComputeHash(config1), _sut.ComputeHash(config2));
}
private static FlattenedConfiguration CreateConfig(string instanceName, string tempValue)
{
return new FlattenedConfiguration
@@ -2,11 +2,13 @@ using Microsoft.EntityFrameworkCore;
using Microsoft.EntityFrameworkCore.Diagnostics;
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.DependencyInjection;
using ScadaLink.Commons.Entities.ExternalSystems;
using ScadaLink.Commons.Entities.Scripts;
using ScadaLink.Commons.Entities.Templates;
using ScadaLink.Commons.Interfaces.Repositories;
using ScadaLink.Commons.Interfaces.Services;
using ScadaLink.Commons.Interfaces.Transport;
using ScadaLink.Commons.Types.Enums;
using ScadaLink.Commons.Types.Transport;
using ScadaLink.ConfigurationDatabase;
using ScadaLink.ConfigurationDatabase.Repositories;
@@ -86,10 +88,11 @@ public sealed class BundleImporterApplyTests : IDisposable
var ctx = scope.ServiceProvider.GetRequiredService<ScadaLinkDbContext>();
var templateIds = await ctx.Templates.Select(t => t.Id).ToListAsync();
var sharedScriptIds = await ctx.SharedScripts.Select(s => s.Id).ToListAsync();
var externalSystemIds = await ctx.ExternalSystemDefinitions.Select(e => e.Id).ToListAsync();
var selection = new ExportSelection(
TemplateIds: templateIds,
SharedScriptIds: sharedScriptIds,
ExternalSystemIds: Array.Empty<int>(),
ExternalSystemIds: externalSystemIds,
DatabaseConnectionIds: Array.Empty<int>(),
NotificationListIds: Array.Empty<int>(),
SmtpConfigurationIds: Array.Empty<int>(),
@@ -476,4 +479,272 @@ public sealed class BundleImporterApplyTests : IDisposable
Assert.Null(row.BundleImportId);
}
}
[Fact]
public async Task ApplyAsync_Overwrite_synchronises_attributes_alarms_and_scripts_to_bundle()
{
// T-001 regression. The Overwrite branch used to write only Description
// / FolderId on the existing template; the bundle's Attributes / Alarms
// / Scripts collections were silently dropped on the floor. This test
// seeds a template with one shape, exports it, mutates the target to a
// divergent shape, then asserts that Overwrite restores every child
// collection AND emits per-field audit rows.
//
// Bundle shape (exported from "Pump"):
// Attributes: [SetPoint (Float, 50.0), Pressure (Float, 100.0)]
// Alarms: [HiAlarm (PriorityLevel=1)]
// Scripts: [Init (Code="return 1;")]
await using (var scope = _provider.CreateAsyncScope())
{
var ctx = scope.ServiceProvider.GetRequiredService<ScadaLinkDbContext>();
var t = new Template("Pump") { Description = "from-bundle" };
t.Attributes.Add(new TemplateAttribute("SetPoint") { DataType = DataType.Float, Value = "50.0" });
t.Attributes.Add(new TemplateAttribute("Pressure") { DataType = DataType.Float, Value = "100.0" });
t.Alarms.Add(new TemplateAlarm("HiAlarm") { PriorityLevel = 1, TriggerType = AlarmTriggerType.ValueMatch });
t.Scripts.Add(new TemplateScript("Init", "return 1;"));
ctx.Templates.Add(t);
await ctx.SaveChangesAsync();
}
var sessionId = await ExportAndLoadAsync();
// Mutate the target so every child collection diverges from the bundle.
// Attributes: SetPoint value mutated, Pressure DELETED, NewAttr ADDED
// Alarms: HiAlarm PriorityLevel mutated, ExtraAlarm ADDED
// Scripts: Init code mutated, ExtraScript ADDED
// Description also mutated so the scalar field still flips.
await using (var scope = _provider.CreateAsyncScope())
{
var ctx = scope.ServiceProvider.GetRequiredService<ScadaLinkDbContext>();
var pump = await ctx.Templates
.Include(t => t.Attributes)
.Include(t => t.Alarms)
.Include(t => t.Scripts)
.SingleAsync(t => t.Name == "Pump");
pump.Description = "target-mutated";
var setPoint = pump.Attributes.Single(a => a.Name == "SetPoint");
setPoint.Value = "999.0"; // mutated
var pressure = pump.Attributes.Single(a => a.Name == "Pressure");
pump.Attributes.Remove(pressure);
ctx.TemplateAttributes.Remove(pressure);
pump.Attributes.Add(new TemplateAttribute("NewAttr")
{
DataType = DataType.String,
Value = "should-be-deleted-by-overwrite",
});
var hiAlarm = pump.Alarms.Single(a => a.Name == "HiAlarm");
hiAlarm.PriorityLevel = 99; // mutated
pump.Alarms.Add(new TemplateAlarm("ExtraAlarm")
{
PriorityLevel = 5,
TriggerType = AlarmTriggerType.RangeViolation,
});
var initScript = pump.Scripts.Single(s => s.Name == "Init");
initScript.Code = "return 999;"; // mutated
pump.Scripts.Add(new TemplateScript("ExtraScript", "return 0;"));
await ctx.SaveChangesAsync();
}
// Capture the audit baseline so we can scope assertions to rows
// emitted by THIS apply.
int beforeMaxAuditId;
await using (var scope = _provider.CreateAsyncScope())
{
var ctx = scope.ServiceProvider.GetRequiredService<ScadaLinkDbContext>();
beforeMaxAuditId = await ctx.AuditLogEntries.MaxAsync(a => (int?)a.Id) ?? 0;
}
// Act — apply Overwrite.
ImportResult result;
await using (var scope = _provider.CreateAsyncScope())
{
var importer = scope.ServiceProvider.GetRequiredService<IBundleImporter>();
result = await importer.ApplyAsync(sessionId,
new List<ImportResolution> { new("Template", "Pump", ResolutionAction.Overwrite, null) },
user: "bob");
}
// Assert — children mirror the bundle.
await using (var scope = _provider.CreateAsyncScope())
{
var ctx = scope.ServiceProvider.GetRequiredService<ScadaLinkDbContext>();
var pump = await ctx.Templates
.Include(t => t.Attributes)
.Include(t => t.Alarms)
.Include(t => t.Scripts)
.SingleAsync(t => t.Name == "Pump");
Assert.Equal("from-bundle", pump.Description);
// Attributes — exactly { SetPoint, Pressure }, values restored.
Assert.Equal(2, pump.Attributes.Count);
var setPoint = pump.Attributes.Single(a => a.Name == "SetPoint");
Assert.Equal("50.0", setPoint.Value);
Assert.Equal(DataType.Float, setPoint.DataType);
var pressure = pump.Attributes.Single(a => a.Name == "Pressure");
Assert.Equal("100.0", pressure.Value);
Assert.DoesNotContain(pump.Attributes, a => a.Name == "NewAttr");
// Alarms — exactly { HiAlarm }, PriorityLevel restored.
Assert.Single(pump.Alarms);
var hi = pump.Alarms.Single();
Assert.Equal("HiAlarm", hi.Name);
Assert.Equal(1, hi.PriorityLevel);
Assert.DoesNotContain(pump.Alarms, a => a.Name == "ExtraAlarm");
// Scripts — exactly { Init }, code restored.
Assert.Single(pump.Scripts);
var init = pump.Scripts.Single();
Assert.Equal("Init", init.Name);
Assert.Equal("return 1;", init.Code);
Assert.DoesNotContain(pump.Scripts, s => s.Name == "ExtraScript");
// Per-field audit rows — design doc enumerates Added / Updated /
// Deleted shapes; all of these should appear, all stamped with
// the BundleImportId from the result.
var newRows = await ctx.AuditLogEntries
.Where(a => a.Id > beforeMaxAuditId)
.ToListAsync();
Assert.All(newRows, r => Assert.Equal(result.BundleImportId, r.BundleImportId));
// Attribute audit events.
Assert.Contains(newRows, r => r.Action == "TemplateAttributeUpdated" && r.EntityName == "Pump.SetPoint");
Assert.Contains(newRows, r => r.Action == "TemplateAttributeAdded" && r.EntityName == "Pump.Pressure");
Assert.Contains(newRows, r => r.Action == "TemplateAttributeDeleted" && r.EntityName == "Pump.NewAttr");
// Alarm audit events.
Assert.Contains(newRows, r => r.Action == "TemplateAlarmUpdated" && r.EntityName == "Pump.HiAlarm");
Assert.Contains(newRows, r => r.Action == "TemplateAlarmDeleted" && r.EntityName == "Pump.ExtraAlarm");
// Script audit events.
Assert.Contains(newRows, r => r.Action == "TemplateScriptUpdated" && r.EntityName == "Pump.Init");
Assert.Contains(newRows, r => r.Action == "TemplateScriptDeleted" && r.EntityName == "Pump.ExtraScript");
}
Assert.Equal(1, result.Overwritten);
}
[Fact]
public async Task ApplyAsync_Overwrite_synchronises_external_system_methods_to_bundle()
{
// T-002 regression. The ExternalSystem Overwrite branch used to write
// only EndpointUrl / AuthType / AuthConfiguration on the existing
// definition; the bundle's Methods collection was silently dropped on
// the floor. This test seeds an external system with one method
// shape, exports it, mutates the target's methods to diverge, then
// asserts that Overwrite restores every method AND emits per-field
// audit rows.
//
// Bundle shape (exported from "Erp"):
// Methods: [
// GetUser (GET /users/{id}, param=A, return=R),
// PostJob (POST /jobs, param=B, return=S),
// ]
await using (var scope = _provider.CreateAsyncScope())
{
var ctx = scope.ServiceProvider.GetRequiredService<ScadaLinkDbContext>();
var sys = new ExternalSystemDefinition("Erp", "https://erp.example", "ApiKey");
ctx.ExternalSystemDefinitions.Add(sys);
await ctx.SaveChangesAsync();
ctx.ExternalSystemMethods.Add(new ExternalSystemMethod("GetUser", "GET", "/users/{id}")
{
ExternalSystemDefinitionId = sys.Id,
ParameterDefinitions = "A",
ReturnDefinition = "R",
});
ctx.ExternalSystemMethods.Add(new ExternalSystemMethod("PostJob", "POST", "/jobs")
{
ExternalSystemDefinitionId = sys.Id,
ParameterDefinitions = "B",
ReturnDefinition = "S",
});
await ctx.SaveChangesAsync();
}
var sessionId = await ExportAndLoadAsync();
// Mutate the target so the methods diverge from the bundle:
// GetUser — Path mutated (UPDATE expected on Overwrite)
// PostJob — DELETED (ADD expected on Overwrite to restore)
// ExtraOp — ADDED (DELETE expected on Overwrite to remove)
// EndpointUrl / AuthType also mutated so the scalar update still fires.
await using (var scope = _provider.CreateAsyncScope())
{
var ctx = scope.ServiceProvider.GetRequiredService<ScadaLinkDbContext>();
var sys = await ctx.ExternalSystemDefinitions.SingleAsync(e => e.Name == "Erp");
sys.EndpointUrl = "https://wrong.example";
sys.AuthType = "Basic";
var getUser = await ctx.ExternalSystemMethods
.SingleAsync(m => m.ExternalSystemDefinitionId == sys.Id && m.Name == "GetUser");
getUser.Path = "/wrong/path";
var postJob = await ctx.ExternalSystemMethods
.SingleAsync(m => m.ExternalSystemDefinitionId == sys.Id && m.Name == "PostJob");
ctx.ExternalSystemMethods.Remove(postJob);
ctx.ExternalSystemMethods.Add(new ExternalSystemMethod("ExtraOp", "DELETE", "/extra")
{
ExternalSystemDefinitionId = sys.Id,
ParameterDefinitions = "X",
ReturnDefinition = "Y",
});
await ctx.SaveChangesAsync();
}
// Capture the audit baseline so we can scope assertions to rows
// emitted by THIS apply.
int beforeMaxAuditId;
await using (var scope = _provider.CreateAsyncScope())
{
var ctx = scope.ServiceProvider.GetRequiredService<ScadaLinkDbContext>();
beforeMaxAuditId = await ctx.AuditLogEntries.MaxAsync(a => (int?)a.Id) ?? 0;
}
// Act — apply Overwrite on the external system.
ImportResult result;
await using (var scope = _provider.CreateAsyncScope())
{
var importer = scope.ServiceProvider.GetRequiredService<IBundleImporter>();
result = await importer.ApplyAsync(sessionId,
new List<ImportResolution> { new("ExternalSystem", "Erp", ResolutionAction.Overwrite, null) },
user: "bob");
}
// Assert — methods mirror the bundle exactly.
await using (var scope = _provider.CreateAsyncScope())
{
var ctx = scope.ServiceProvider.GetRequiredService<ScadaLinkDbContext>();
var sys = await ctx.ExternalSystemDefinitions.SingleAsync(e => e.Name == "Erp");
Assert.Equal("https://erp.example", sys.EndpointUrl);
var methods = await ctx.ExternalSystemMethods
.Where(m => m.ExternalSystemDefinitionId == sys.Id)
.ToListAsync();
Assert.Equal(2, methods.Count);
var getUser = methods.Single(m => m.Name == "GetUser");
Assert.Equal("GET", getUser.HttpMethod);
Assert.Equal("/users/{id}", getUser.Path);
Assert.Equal("A", getUser.ParameterDefinitions);
Assert.Equal("R", getUser.ReturnDefinition);
var postJob = methods.Single(m => m.Name == "PostJob");
Assert.Equal("POST", postJob.HttpMethod);
Assert.Equal("/jobs", postJob.Path);
Assert.Equal("B", postJob.ParameterDefinitions);
Assert.Equal("S", postJob.ReturnDefinition);
Assert.DoesNotContain(methods, m => m.Name == "ExtraOp");
// Per-field audit rows — design doc enumerates Added / Updated /
// Deleted shapes; all of these should appear, all stamped with
// the BundleImportId from the result.
var newRows = await ctx.AuditLogEntries
.Where(a => a.Id > beforeMaxAuditId)
.ToListAsync();
Assert.All(newRows, r => Assert.Equal(result.BundleImportId, r.BundleImportId));
Assert.Contains(newRows, r => r.Action == "ExternalSystemMethodUpdated" && r.EntityName == "Erp.GetUser");
Assert.Contains(newRows, r => r.Action == "ExternalSystemMethodAdded" && r.EntityName == "Erp.PostJob");
Assert.Contains(newRows, r => r.Action == "ExternalSystemMethodDeleted" && r.EntityName == "Erp.ExtraOp");
}
Assert.Equal(1, result.Overwritten);
}
}