Resolves StoreAndForward-001, ExternalSystemGateway-001, NotificationService-001 — one systemic gap where buffered messages were persisted but never delivered, and the active node never replicated its buffer to the standby. Delivery handlers (ExternalSystemGateway-001 / NotificationService-001): - AkkaHostedService registers delivery handlers for the ExternalSystem, CachedDbWrite and Notification categories after StoreAndForwardService starts; each resolves its scoped consumer in a fresh DI scope. - ExternalSystemClient, DatabaseGateway and NotificationDeliveryService each gain a DeliverBufferedAsync method: re-resolve the target and re-attempt delivery, returning true/false/throwing per the transient-vs-permanent contract. - EnqueueAsync gains an attemptImmediateDelivery flag; CachedCallAsync and NotificationDeliveryService.SendAsync pass false (they already attempted delivery themselves) so registering a handler does not dispatch twice. Replication (StoreAndForward-001): - ReplicationService is injected into StoreAndForwardService; a new BufferAsync helper replicates every enqueue, and successful-retry removes and parks are replicated too. Fire-and-forget, no-op when replication is disabled. Tests: StoreAndForwardReplicationTests (Add/Remove/Park observed), attemptImmediateDelivery behaviour, and DeliverBufferedAsync paths for each consumer. Full solution builds; StoreAndForward/ExternalSystemGateway/ NotificationService suites green.
525 lines
24 KiB
Markdown
525 lines
24 KiB
Markdown
# Code Review — ExternalSystemGateway
|
||
|
||
| Field | Value |
|
||
|-------|-------|
|
||
| Module | `src/ScadaLink.ExternalSystemGateway` |
|
||
| Design doc | `docs/requirements/Component-ExternalSystemGateway.md` |
|
||
| Status | Reviewed |
|
||
| Last reviewed | 2026-05-16 |
|
||
| Reviewer | claude-agent |
|
||
| Commit reviewed | `9c60592` |
|
||
| Open findings | 13 |
|
||
|
||
## Summary
|
||
|
||
The External System Gateway is a small module (five source files plus options) that
|
||
implements the HTTP/REST client (`ExternalSystemClient`), the database access surface
|
||
(`DatabaseGateway`), and error classification (`ErrorClassifier`). The structure is
|
||
clean and the dual call-mode semantics broadly match the design doc. However, the
|
||
review surfaced several substantive problems that prevent the module from behaving as
|
||
designed. The most serious is that **no store-and-forward delivery handler is ever
|
||
registered** for the `ExternalSystem` or `CachedDbWrite` categories, so cached calls
|
||
and cached writes are buffered but can never actually be delivered on retry — a silent
|
||
data-loss path. Two further high-impact issues are that the **per-system call timeout
|
||
is never applied** to the HTTP client (the design's central error-handling guarantee
|
||
is absent), and that **`CachedCall` double-dispatches the HTTP request** because
|
||
`StoreAndForwardService.EnqueueAsync` itself re-attempts immediate delivery, breaking
|
||
the idempotency expectations. A cluster of medium issues concern resource leaks,
|
||
classification gaps (cancellation conflation), and the dropped `StoreAndForwardResult`.
|
||
Test coverage is thin — `CachedCall` transient/buffering paths and `DatabaseGateway`
|
||
are entirely untested. Themes: incomplete wiring against the S&F engine, and design-doc
|
||
requirements (timeout, retry settings) that are declared but not implemented.
|
||
|
||
## Checklist coverage
|
||
|
||
| # | Category | Examined | Notes |
|
||
|---|----------|----------|-------|
|
||
| 1 | Correctness & logic bugs | ☑ | URL building edge cases, dropped S&F result, classification gaps — findings 003, 006, 009. |
|
||
| 2 | Akka.NET conventions | ☑ | No actors in this module; `AddExternalSystemGatewayActors` is a no-op. Blocking-I/O isolation is delegated to Site Runtime. No issues found in this module. |
|
||
| 3 | Concurrency & thread safety | ☑ | Services are stateless and DI-scoped; `ExternalCallResult.Response` lazy-parse is not thread-safe but instances are single-use. No findings raised. |
|
||
| 4 | Error handling & resilience | ☑ | S&F handler never registered, double-dispatch, timeout not applied, cancellation conflation — findings 001, 002, 003, 008. |
|
||
| 5 | Security | ☑ | Auth secrets logged-safe, but error bodies echoed verbatim — finding 007. |
|
||
| 6 | Performance & resource management | ☑ | `HttpRequestMessage`/`HttpResponseMessage` and failed `SqlConnection` not disposed; full repository scan per call — findings 005, 010, 011. |
|
||
| 7 | Design-document adherence | ☑ | Timeout, retry settings, audit logging gaps — findings 002, 004, 012. |
|
||
| 8 | Code organization & conventions | ☑ | Options class correctly owned by module; `MaxConcurrentConnectionsPerSystem` unused — finding 013. |
|
||
| 9 | Testing coverage | ☑ | CachedCall buffering and DatabaseGateway untested — finding 014. |
|
||
| 10 | Documentation & comments | ☑ | XML docs reference WP numbers; permanent-failure logging requirement unverified — folded into finding 012. |
|
||
|
||
## Findings
|
||
|
||
### ExternalSystemGateway-001 — No S&F delivery handler registered; cached calls and writes can never be delivered
|
||
|
||
| | |
|
||
|--|--|
|
||
| Severity | Critical |
|
||
| Category | Error handling & resilience |
|
||
| Status | Resolved |
|
||
| Location | `src/ScadaLink.ExternalSystemGateway/ExternalSystemClient.cs:109`, `src/ScadaLink.ExternalSystemGateway/DatabaseGateway.cs:81` |
|
||
|
||
**Description**
|
||
|
||
`CachedCallAsync` and `CachedWriteAsync` enqueue messages under
|
||
`StoreAndForwardCategory.ExternalSystem` and `StoreAndForwardCategory.CachedDbWrite`.
|
||
`StoreAndForwardService.RegisterDeliveryHandler` is the only mechanism that lets the
|
||
S&F engine actually deliver a buffered message, and a repository-wide search shows it
|
||
is **never called for either category** anywhere in the codebase. Consequences:
|
||
|
||
1. On a transient failure, `EnqueueAsync` falls through to the "No handler registered
|
||
— buffer for later" branch (`StoreAndForwardService.cs:163`) and the message is
|
||
persisted.
|
||
2. During the retry sweep, `AttemptDeliveryAsync` (`StoreAndForwardService.cs:201`)
|
||
logs `"No delivery handler for category {Category}"` and returns without ever
|
||
removing or delivering the message.
|
||
|
||
The result is that every cached external call and cached DB write is silently
|
||
buffered forever and never delivered — a data-loss path for the exact "deferred
|
||
delivery is acceptable" use cases the design doc calls out (posting production data,
|
||
quality reports). The script also receives `WasBuffered: true` / a successful
|
||
`CachedWriteAsync` completion, so the failure is completely invisible.
|
||
|
||
**Recommendation**
|
||
|
||
Register delivery handlers for `StoreAndForwardCategory.ExternalSystem` and
|
||
`StoreAndForwardCategory.CachedDbWrite` during host/site startup. The `ExternalSystem`
|
||
handler should deserialize the payload, re-resolve the system/method, and re-invoke
|
||
`InvokeHttpAsync`, returning `true`/`false`/throwing per the transient-vs-permanent
|
||
contract `EnqueueAsync` expects. The `CachedDbWrite` handler should execute the SQL
|
||
against the named connection. Add an integration test that buffers a message and
|
||
verifies it is delivered by a retry sweep.
|
||
|
||
**Resolution**
|
||
|
||
Resolved 2026-05-16. Delivery handlers for `StoreAndForwardCategory.ExternalSystem` and
|
||
`CachedDbWrite` are now registered at site startup in `AkkaHostedService`, after
|
||
`StoreAndForwardService.StartAsync()`. Each handler resolves its consumer in a fresh DI
|
||
scope and calls a new `DeliverBufferedAsync`: `ExternalSystemClient.DeliverBufferedAsync`
|
||
re-resolves the system/method and re-invokes `InvokeHttpAsync`, and
|
||
`DatabaseGateway.DeliverBufferedAsync` executes the buffered SQL — each returning `true`
|
||
on success, `false` (park) when the target no longer exists or fails permanently, and
|
||
throwing on transient failure so the engine retries. `EnqueueAsync` gained an
|
||
`attemptImmediateDelivery` parameter; `CachedCallAsync` passes `false` so registering the
|
||
handler does not dispatch the request twice (the double-dispatch noted in
|
||
`ExternalSystemGateway-003`). Regression tests cover the success, target-removed and
|
||
transient-retry paths. Fixed by the commit whose message references
|
||
`ExternalSystemGateway-001`.
|
||
|
||
### ExternalSystemGateway-002 — Per-system call timeout is never applied to HTTP requests
|
||
|
||
| | |
|
||
|--|--|
|
||
| Severity | High |
|
||
| Category | Error handling & resilience |
|
||
| Status | Open |
|
||
| Location | `src/ScadaLink.ExternalSystemGateway/ExternalSystemClient.cs:130`, `src/ScadaLink.ExternalSystemGateway/ServiceCollectionExtensions.cs:13` |
|
||
|
||
**Description**
|
||
|
||
The design doc states each external system definition specifies a timeout that
|
||
"applies to all method calls on that system" and "applies to the HTTP request
|
||
round-trip", and `ExternalSystemGatewayOptions.DefaultHttpTimeout` exists as a
|
||
fallback. In practice no timeout is ever configured. `ServiceCollectionExtensions`
|
||
calls `services.AddHttpClient()` with no per-named-client configuration, and
|
||
`InvokeHttpAsync` calls `_httpClientFactory.CreateClient($"ExternalSystem_{system.Name}")`
|
||
without setting `client.Timeout` or passing a `CancellationToken` derived from a
|
||
timeout. `SendAsync` is therefore subject only to `HttpClient`'s default 100-second
|
||
timeout, regardless of the system definition or the configured `DefaultHttpTimeout`.
|
||
A slow or hung external system will block the calling Script Execution Actor far
|
||
longer than the operator configured, and the design's core error-handling guarantee
|
||
(timeout → transient classification) does not hold within the intended window.
|
||
|
||
There is also no `Timeout` field on `ExternalSystemDefinition` at all, so even a
|
||
correct implementation has nowhere to read the per-system value from — the entity is
|
||
missing the field the design requires.
|
||
|
||
**Recommendation**
|
||
|
||
Add a `Timeout` (TimeSpan) field to `ExternalSystemDefinition` and have
|
||
`InvokeHttpAsync` enforce it — either by setting `client.Timeout` via a typed/named
|
||
`HttpClient` registration, or by linking a `CancellationTokenSource` with the
|
||
per-system (or `DefaultHttpTimeout`) timeout to the supplied `cancellationToken`
|
||
before `SendAsync`. Ensure the resulting `TaskCanceledException`/`TimeoutException`
|
||
is classified as transient.
|
||
|
||
**Resolution**
|
||
|
||
_Unresolved._
|
||
|
||
### ExternalSystemGateway-003 — `CachedCall` double-dispatches the HTTP request
|
||
|
||
| | |
|
||
|--|--|
|
||
| Severity | High |
|
||
| Category | Correctness & logic bugs |
|
||
| Status | Open |
|
||
| Location | `src/ScadaLink.ExternalSystemGateway/ExternalSystemClient.cs:84-117` |
|
||
|
||
**Description**
|
||
|
||
`CachedCallAsync` first calls `InvokeHttpAsync` directly (line 86). On a
|
||
`TransientExternalSystemException` it then calls `_storeAndForward.EnqueueAsync(...)`
|
||
(line 109). `StoreAndForwardService.EnqueueAsync` is **not** a pure enqueue — it
|
||
"Attempts immediate delivery" by invoking the registered delivery handler
|
||
(`StoreAndForwardService.cs:128-159`). If a delivery handler for the `ExternalSystem`
|
||
category is registered (as finding 001 recommends), the HTTP request will be executed
|
||
a **second time** synchronously inside `EnqueueAsync`, immediately after the first
|
||
attempt failed. For a transient failure that is actually a slow/overloaded system,
|
||
this doubles the load and — critically — if the original request did reach the
|
||
external system, the immediate retry produces a duplicate delivery before the script
|
||
even returns, worsening the idempotency hazard the design doc explicitly warns about.
|
||
|
||
**Recommendation**
|
||
|
||
Decide on one dispatch path. Either (a) have `CachedCall` not pre-invoke
|
||
`InvokeHttpAsync` and instead let `EnqueueAsync`'s immediate-delivery attempt be the
|
||
single first attempt (requires the handler to exist and to surface permanent vs
|
||
transient correctly); or (b) add an enqueue-only entry point to
|
||
`StoreAndForwardService` that skips the immediate-delivery attempt, and have
|
||
`CachedCall` use it after its own first attempt. Approach (a) is cleaner and removes
|
||
the duplicated logic.
|
||
|
||
**Resolution**
|
||
|
||
_Unresolved._
|
||
|
||
### ExternalSystemGateway-004 — System retry settings are not honoured for cached calls/writes
|
||
|
||
| | |
|
||
|--|--|
|
||
| Severity | Medium |
|
||
| Category | Design-document adherence |
|
||
| Status | Open |
|
||
| Location | `src/ScadaLink.ExternalSystemGateway/ExternalSystemClient.cs:114-115`, `src/ScadaLink.ExternalSystemGateway/DatabaseGateway.cs:86-87` |
|
||
|
||
**Description**
|
||
|
||
`CachedCallAsync` and `CachedWriteAsync` pass the definition's `MaxRetries` /
|
||
`RetryDelay` to `EnqueueAsync` only when they are non-default
|
||
(`MaxRetries > 0 ? ... : null`, `RetryDelay > TimeSpan.Zero ? ... : null`), otherwise
|
||
falling back to the S&F defaults. The site-side repository that supplies these
|
||
definitions, `SiteExternalSystemRepository.MapExternalSystem`
|
||
(`src/ScadaLink.SiteRuntime/Repositories/SiteExternalSystemRepository.cs:194`), never
|
||
reads `MaxRetries`/`RetryDelay` from SQLite at all — the constructed entities always
|
||
have `MaxRetries == 0` and `RetryDelay == TimeSpan.Zero`. As a result, at sites the
|
||
per-system retry settings the design doc requires are *always* discarded and the
|
||
global S&F defaults are silently used instead. The `> 0` guard in the ESG also makes
|
||
a legitimately-configured `MaxRetries` of 0 ("never retry") indistinguishable from
|
||
"unset", so an operator cannot express "do not retry".
|
||
|
||
**Recommendation**
|
||
|
||
Within this module, drop the `> 0` / `> Zero` guards and pass the definition values
|
||
through directly (or use nullable fields on the entity to distinguish "unset"). The
|
||
companion fix in `SiteExternalSystemRepository` to actually map the retry columns
|
||
should be tracked against the SiteRuntime module.
|
||
|
||
**Resolution**
|
||
|
||
_Unresolved._
|
||
|
||
### ExternalSystemGateway-005 — `HttpRequestMessage` and `HttpResponseMessage` are not disposed
|
||
|
||
| | |
|
||
|--|--|
|
||
| Severity | Medium |
|
||
| Category | Performance & resource management |
|
||
| Status | Open |
|
||
| Location | `src/ScadaLink.ExternalSystemGateway/ExternalSystemClient.cs:133-167` |
|
||
|
||
**Description**
|
||
|
||
`InvokeHttpAsync` creates an `HttpRequestMessage` (line 133) and receives an
|
||
`HttpResponseMessage` from `SendAsync` (line 155); neither is wrapped in a `using` nor
|
||
explicitly disposed. Both are `IDisposable` and own resources (the request's
|
||
`StringContent`, the response's content stream). Under the per-invocation call volume
|
||
of a busy site this produces avoidable pressure on the finalizer queue and can hold
|
||
socket/stream resources longer than necessary. The success path reads the content but
|
||
never disposes the response; the error path likewise reads `errorBody` and then throws
|
||
without disposing.
|
||
|
||
**Recommendation**
|
||
|
||
Wrap the request in `using var request = ...` and the response in
|
||
`using var response = ...` (or call `Dispose()` in a `finally`). Ensure disposal still
|
||
occurs on the exception paths.
|
||
|
||
**Resolution**
|
||
|
||
_Unresolved._
|
||
|
||
### ExternalSystemGateway-006 — `BuildUrl` ignores path templates and appends a trailing slash for empty paths
|
||
|
||
| | |
|
||
|--|--|
|
||
| Severity | Medium |
|
||
| Category | Correctness & logic bugs |
|
||
| Status | Open |
|
||
| Location | `src/ScadaLink.ExternalSystemGateway/ExternalSystemClient.cs:180-196` |
|
||
|
||
**Description**
|
||
|
||
`BuildUrl` does `baseUrl.TrimEnd('/') + "/" + path.TrimStart('/')`. When `method.Path`
|
||
is empty (a method that targets the base URL itself), this still appends a `/`,
|
||
producing `https://host/api/` which some servers treat as a different resource than
|
||
`https://host/api`. More importantly, the design doc shows method paths as templates
|
||
like `/recipes/{id}`, but `BuildUrl` performs no placeholder substitution — a `{id}`
|
||
token is sent literally in the URL and the corresponding parameter is instead appended
|
||
as a query-string entry (for GET/DELETE) or placed in the JSON body (POST/PUT). Either
|
||
the design's path-template feature is unimplemented, or the doc is stale; in the
|
||
current code a method defined as `/recipes/{id}` will never produce a correct URL.
|
||
|
||
**Recommendation**
|
||
|
||
Decide whether path templating is in scope. If yes, implement `{name}` substitution
|
||
from `parameters` in `BuildUrl` and exclude substituted parameters from the query
|
||
string/body. If no, update the component design doc to remove the `/recipes/{id}`
|
||
example and state that paths are literal. Also avoid appending a trailing `/` when
|
||
`path` is empty.
|
||
|
||
**Resolution**
|
||
|
||
_Unresolved._
|
||
|
||
### ExternalSystemGateway-007 — External error response bodies are echoed verbatim into script-visible error messages
|
||
|
||
| | |
|
||
|--|--|
|
||
| Severity | Medium |
|
||
| Category | Security |
|
||
| Status | Open |
|
||
| Location | `src/ScadaLink.ExternalSystemGateway/ExternalSystemClient.cs:167-177` |
|
||
|
||
**Description**
|
||
|
||
On a non-success HTTP response, the full response body is read into `errorBody` and
|
||
embedded verbatim into the exception message (`$"HTTP {code} from {name}: {errorBody}"`),
|
||
which then flows into `ExternalCallResult.ErrorMessage` and back to the calling script,
|
||
and into Site Event Logging. An external system error page can be arbitrarily large
|
||
(an HTML stack trace, a multi-megabyte body) and may contain sensitive detail. There
|
||
is no size cap, so a hostile or misbehaving endpoint can inflate every error log entry
|
||
and error string returned to scripts. There is also no content-type check before
|
||
treating the body as text.
|
||
|
||
**Recommendation**
|
||
|
||
Truncate `errorBody` to a bounded length (e.g. 1–2 KB) before embedding it, and
|
||
consider logging the full body separately at debug level rather than returning it to
|
||
the script. Optionally only include the body when the content type is textual.
|
||
|
||
**Resolution**
|
||
|
||
_Unresolved._
|
||
|
||
### ExternalSystemGateway-008 — Cancellation is conflated with transient timeout failure
|
||
|
||
| | |
|
||
|--|--|
|
||
| Severity | Medium |
|
||
| Category | Error handling & resilience |
|
||
| Status | Open |
|
||
| Location | `src/ScadaLink.ExternalSystemGateway/ErrorClassifier.cs:24-30`, `src/ScadaLink.ExternalSystemGateway/ExternalSystemClient.cs:157-159` |
|
||
|
||
**Description**
|
||
|
||
`ErrorClassifier.IsTransient(Exception)` returns `true` for `TaskCanceledException`
|
||
and `OperationCanceledException`. `HttpClient.SendAsync` throws `TaskCanceledException`
|
||
both when its internal timeout elapses *and* when the supplied `CancellationToken` is
|
||
cancelled (e.g. the Script Execution Actor is stopped, or the actor system is shutting
|
||
down). Because `InvokeHttpAsync`'s `catch` filter treats all of these as transient, a
|
||
caller-initiated cancellation during a `CachedCall` will be misclassified as a
|
||
transient failure and the message will be buffered for retry — work the caller
|
||
explicitly asked to abandon. For a `Call`, a shutdown-time cancellation is reported to
|
||
the script as a "Transient error" rather than an `OperationCanceledException`.
|
||
|
||
**Recommendation**
|
||
|
||
In `InvokeHttpAsync`, check `cancellationToken.IsCancellationRequested` first and
|
||
rethrow `OperationCanceledException` (or let it propagate) before applying transient
|
||
classification. Only treat a cancellation as a timeout when the supplied token is
|
||
*not* the one that was cancelled.
|
||
|
||
**Resolution**
|
||
|
||
_Unresolved._
|
||
|
||
### ExternalSystemGateway-009 — `StoreAndForwardResult` from `EnqueueAsync` is discarded; permanent failures during buffering are swallowed
|
||
|
||
| | |
|
||
|--|--|
|
||
| Severity | Medium |
|
||
| Category | Correctness & logic bugs |
|
||
| Status | Open |
|
||
| Location | `src/ScadaLink.ExternalSystemGateway/ExternalSystemClient.cs:109-117` |
|
||
|
||
**Description**
|
||
|
||
`CachedCallAsync` assigns the result of `_storeAndForward.EnqueueAsync(...)` to
|
||
`sfResult` and then never reads it — it unconditionally returns
|
||
`new ExternalCallResult(true, null, null, WasBuffered: true)`. `EnqueueAsync` can
|
||
return `Success == false` (a permanent failure encountered during its
|
||
immediate-delivery attempt — `StoreAndForwardService.cs:142`) or `Buffered == false`
|
||
(delivered immediately). In both cases the ESG still reports the call as buffered and
|
||
successful to the script. A permanent failure surfaced by the S&F immediate attempt is
|
||
therefore silently lost instead of being returned to the script as the design requires
|
||
("On permanent failure (HTTP 4xx), the error is returned synchronously").
|
||
|
||
**Recommendation**
|
||
|
||
Inspect `sfResult`: if `Success == false` return an error `ExternalCallResult`; set
|
||
`WasBuffered` from `sfResult.Buffered` rather than hard-coding `true`. (This finding is
|
||
partly subsumed by the dispatch redesign in finding 003.)
|
||
|
||
**Resolution**
|
||
|
||
_Unresolved._
|
||
|
||
### ExternalSystemGateway-010 — `GetConnectionAsync` leaks the `SqlConnection` when `OpenAsync` fails
|
||
|
||
| | |
|
||
|--|--|
|
||
| Severity | Medium |
|
||
| Category | Performance & resource management |
|
||
| Status | Open |
|
||
| Location | `src/ScadaLink.ExternalSystemGateway/DatabaseGateway.cs:48-50` |
|
||
|
||
**Description**
|
||
|
||
`GetConnectionAsync` constructs `new SqlConnection(...)` and calls `await
|
||
connection.OpenAsync(...)`. If `OpenAsync` throws (unreachable server, bad
|
||
credentials, cancellation) the just-created `SqlConnection` instance is never disposed
|
||
— the exception propagates and the local reference is lost. While an unopened
|
||
`SqlConnection` is lightweight, over many failing calls this is an avoidable leak. The
|
||
design doc says `Database.Connection()` failures return an error to the script; the
|
||
current code lets a raw `SqlException` escape, which is acceptable, but the leak is
|
||
not.
|
||
|
||
**Recommendation**
|
||
|
||
Wrap the open in a try/catch that disposes the connection before rethrowing:
|
||
`try { await connection.OpenAsync(ct); } catch { connection.Dispose(); throw; }`.
|
||
|
||
**Resolution**
|
||
|
||
_Unresolved._
|
||
|
||
### ExternalSystemGateway-011 — Every call performs a full repository scan of all systems and methods
|
||
|
||
| | |
|
||
|--|--|
|
||
| Severity | Low |
|
||
| Category | Performance & resource management |
|
||
| Status | Open |
|
||
| Location | `src/ScadaLink.ExternalSystemGateway/ExternalSystemClient.cs:231-245`, `src/ScadaLink.ExternalSystemGateway/DatabaseGateway.cs:90-97` |
|
||
|
||
**Description**
|
||
|
||
`ResolveSystemAndMethodAsync` calls `GetAllExternalSystemsAsync()` and then
|
||
`GetMethodsByExternalSystemIdAsync()` and filters in memory on every single call;
|
||
`ResolveConnectionAsync` calls `GetAllDatabaseConnectionsAsync()` and filters in memory
|
||
on every cached write / connection request. At sites this hits the SQLite repository,
|
||
and `SiteExternalSystemRepository` re-reads and re-parses the methods JSON each time.
|
||
For a hot script path this is unnecessary repeated I/O and allocation. Definitions only
|
||
change on deployment, so they are eminently cacheable.
|
||
|
||
**Recommendation**
|
||
|
||
Add an in-memory cache of system/method/connection definitions keyed by name,
|
||
invalidated on artifact deployment. Alternatively use a name-keyed repository lookup
|
||
rather than fetch-all-then-filter.
|
||
|
||
**Resolution**
|
||
|
||
_Unresolved._
|
||
|
||
### ExternalSystemGateway-012 — Permanent-failure logging requirement is not met; `_logger` is injected but unused
|
||
|
||
| | |
|
||
|--|--|
|
||
| Severity | Low |
|
||
| Category | Design-document adherence |
|
||
| Status | Open |
|
||
| Location | `src/ScadaLink.ExternalSystemGateway/ExternalSystemClient.cs:24,169-177`, `src/ScadaLink.ExternalSystemGateway/DatabaseGateway.cs:22` |
|
||
|
||
**Description**
|
||
|
||
The design doc states permanent failures are "Logged to Site Event Logging", but
|
||
`InvokeHttpAsync` performs no logging on the permanent-failure path. In fact the
|
||
injected `ILogger<ExternalSystemClient>` and `ILogger<DatabaseGateway>` fields are
|
||
never used at all in either class. Either the logging is expected to happen in the
|
||
caller (Script Execution Actor) — in which case the design doc is imprecise about
|
||
where — or it is missing. Separately, `IsTransient(HttpStatusCode)` treats any
|
||
non-success, non-(5xx/408/429) status as permanent without an explicit comment, which
|
||
is a reasonable default but undocumented.
|
||
|
||
**Recommendation**
|
||
|
||
Add a `_logger.LogWarning` on the permanent-failure path (and a debug log on
|
||
transient), or clarify in the design doc that Site Event Logging capture is the
|
||
caller's responsibility and remove the unused `_logger` fields. Add a comment in
|
||
`ErrorClassifier` documenting the "default to permanent" behaviour.
|
||
|
||
**Resolution**
|
||
|
||
_Unresolved._
|
||
|
||
### ExternalSystemGateway-013 — `MaxConcurrentConnectionsPerSystem` and `DefaultHttpTimeout` options are defined but never used
|
||
|
||
| | |
|
||
|--|--|
|
||
| Severity | Low |
|
||
| Category | Code organization & conventions |
|
||
| Status | Open |
|
||
| Location | `src/ScadaLink.ExternalSystemGateway/ExternalSystemGatewayOptions.cs:9,12`, `src/ScadaLink.ExternalSystemGateway/ServiceCollectionExtensions.cs:13` |
|
||
|
||
**Description**
|
||
|
||
`ExternalSystemGatewayOptions.MaxConcurrentConnectionsPerSystem` (default 10) and
|
||
`DefaultHttpTimeout` (default 30s) are bound from configuration but neither is read
|
||
anywhere. `AddHttpClient()` registers the default factory with no
|
||
`ConfigurePrimaryHttpMessageHandler`/`SocketsHttpHandler` `MaxConnectionsPerServer` and
|
||
no `Timeout`, so both options have no effect. An operator setting these values gets
|
||
them silently ignored — a misleading configuration surface (`DefaultHttpTimeout` is
|
||
also referenced by finding 002).
|
||
|
||
**Recommendation**
|
||
|
||
Either wire the options into a named/typed `HttpClient` registration (set
|
||
`MaxConnectionsPerServer` on the primary handler, set `Timeout`), or remove the unused
|
||
options to avoid implying behaviour that does not exist.
|
||
|
||
**Resolution**
|
||
|
||
_Unresolved._
|
||
|
||
### ExternalSystemGateway-014 — Cached-call buffering path and `DatabaseGateway` are untested
|
||
|
||
| | |
|
||
|--|--|
|
||
| Severity | Low |
|
||
| Category | Testing coverage |
|
||
| Status | Open |
|
||
| Location | `tests/ScadaLink.ExternalSystemGateway.Tests/ExternalSystemClientTests.cs:1`, (no `DatabaseGatewayTests.cs`) |
|
||
|
||
**Description**
|
||
|
||
`ExternalSystemClientTests` covers system/method not-found, success, transient 500 and
|
||
permanent 400 for `CallAsync`, plus `CachedCall` not-found and success. It does **not**
|
||
cover: the `CachedCall` transient-failure → S&F buffering branch (the most
|
||
behaviour-rich path, including the `_storeAndForward == null` fallback and `WasBuffered`
|
||
semantics), the `CachedCall` permanent-failure branch, connection-exception
|
||
classification (`HttpRequestException` thrown by the handler), `BuildUrl` query-string
|
||
construction, and `ApplyAuth` for the apikey/basic variants. There is **no test file
|
||
for `DatabaseGateway`** at all — `GetConnectionAsync` not-found, `CachedWriteAsync`
|
||
not-found, and the `_storeAndForward == null` guard are entirely uncovered. The
|
||
`MockHttpMessageHandler` also does not assert request URL/headers/body, so auth and
|
||
URL construction are unverified.
|
||
|
||
**Recommendation**
|
||
|
||
Add tests for the `CachedCall` transient/buffering paths (with a substituted S&F
|
||
service), `DatabaseGateway` not-found and null-S&F guards, and `BuildUrl`/`ApplyAuth`
|
||
by asserting on the captured `HttpRequestMessage` in the mock handler.
|
||
|
||
**Resolution**
|
||
|
||
_Unresolved._
|