fix(external-system-gateway): resolve ExternalSystemGateway-015..017 — treat MaxRetries=0 as unset, scope HTTP connection cap to gateway clients, no bare trailing '?'

Joseph Doherty
2026-05-17 03:18:24 -04:00
parent 4fa6f0e774
commit da8c9f171b
7 changed files with 211 additions and 35 deletions


@@ -8,7 +8,7 @@
| Last reviewed | 2026-05-17 |
| Reviewer | claude-agent |
| Commit reviewed | `39d737e` |
-| Open findings | 3 |
+| Open findings | 0 |
## Summary
@@ -788,7 +788,7 @@ against regression.
|--|--|
| Severity | High |
| Category | Correctness & logic bugs |
-| Status | Open |
+| Status | Resolved |
| Location | `src/ScadaLink.ExternalSystemGateway/ExternalSystemClient.cs:120-127`, `src/ScadaLink.ExternalSystemGateway/DatabaseGateway.cs:102-108` |
**Description**
@@ -842,7 +842,18 @@ outcome (parked / not retried), not just the stored column value.
**Resolution**
-_Unresolved._
+Resolved 2026-05-17. Root cause confirmed: the S&F engine treats a stored
+`MaxRetries == 0` as "no limit / retry forever" (`StoreAndForwardMessage.MaxRetries`
+is documented as "0 = no limit"; sweep guard `MaxRetries > 0 && RetryCount >= MaxRetries`),
+while the entity's non-nullable `int MaxRetries` defaults to `0`, so passing it verbatim
+made every buffered cached call/write retry without bound. Fix (ESG-side only,
+recommendation (a)): `CachedCallAsync` and `CachedWriteAsync` now pass
+`MaxRetries > 0 ? MaxRetries : null`, so an entity `0` is treated as "unset" and the
+bounded S&F `DefaultMaxRetries` applies; the misleading "0 = never retry" inline
+comments were corrected. The two `ZeroMaxRetries...` tests were rewritten to
+`CachedCall_TransientFailure_ZeroMaxRetriesIsTreatedAsUnsetNotRetryForever` /
+`CachedWrite_ZeroMaxRetriesIsTreatedAsUnsetNotRetryForever`, asserting the buffered
+message carries the bounded default (99) and never `0`.
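The interaction between the ESG-side normalization and the S&F sweep guard can be sketched as follows. This is a minimal illustration, not the project's code: `RetryPolicySketch` and its three method names are hypothetical, the fallback to a default on `null` is assumed from the description above, and the value `99` is taken from the test assertion in the resolution.

```csharp
#nullable enable
using System;

public static class RetryPolicySketch
{
    // Hypothetical stand-in for the S&F engine's bounded default; 99 is the
    // value the rewritten tests are described as asserting.
    public const int DefaultMaxRetries = 99;

    // ESG side: the entity's non-nullable int defaults to 0, so 0 (or any
    // non-positive value) is treated as "unset" and forwarded as null.
    public static int? NormalizeMaxRetries(int entityMaxRetries) =>
        entityMaxRetries > 0 ? entityMaxRetries : (int?)null;

    // S&F side (assumed behaviour): a null request falls back to the default.
    public static int ResolveStoredMaxRetries(int? requested) =>
        requested ?? DefaultMaxRetries;

    // Sweep guard as quoted in the resolution: a stored 0 never satisfies it,
    // which is why a verbatim 0 meant "retry forever".
    public static bool IsExhausted(int storedMaxRetries, int retryCount) =>
        storedMaxRetries > 0 && retryCount >= storedMaxRetries;
}
```

With these definitions, `IsExhausted(0, 1_000_000)` is `false` (the pre-fix unbounded case), whereas `ResolveStoredMaxRetries(NormalizeMaxRetries(0))` yields the bounded default, so the guard eventually trips.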
### ExternalSystemGateway-016 — `ConfigureHttpClientDefaults` applies the ESG connection cap to every `HttpClient` in the host process
@@ -850,7 +861,7 @@ _Unresolved._
|--|--|
| Severity | Medium |
| Category | Code organization & conventions |
-| Status | Open |
+| Status | Resolved |
| Location | `src/ScadaLink.ExternalSystemGateway/ServiceCollectionExtensions.cs:21-29` |
**Description**
@@ -891,7 +902,18 @@ preferred fix is to stop using `ConfigureHttpClientDefaults`.
**Resolution**
-_Unresolved._
+Resolved 2026-05-17. Root cause confirmed: `ConfigureHttpClientDefaults` is
+process-global and replaced the primary handler of every `IHttpClientFactory` client
+in the host, leaking the ESG connection cap onto unrelated clients. Fix: the global
+`ConfigureHttpClientDefaults` registration was replaced with an
+`IConfigureNamedOptions<HttpClientFactoryOptions>` (`GatewayHttpClientConfigurator`)
+that applies the `SocketsHttpHandler`/`MaxConnectionsPerServer` cap only to clients
+whose name starts with `ExternalSystem_` (the gateway's own per-system clients), so
+clients owned by other components keep their own (or the framework default) primary
+handler. Regression test
+`ServiceWiringTests.MaxConcurrentConnectionsPerSystem_IsNotAppliedToNonGatewayHttpClients`
+asserts a non-gateway client does not inherit the cap while the gateway client still
+does; it was verified to fail before the fix.
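The name-prefix scoping at the heart of this fix can be illustrated with a small BCL-only sketch. It is not the actual `GatewayHttpClientConfigurator`: the class and method names below are hypothetical, the ordinal comparison is an assumption, and the wiring into `IConfigureNamedOptions<HttpClientFactoryOptions>` is deliberately left out so the example has no package dependencies.

```csharp
#nullable enable
using System;
using System.Net.Http;

public static class GatewayHandlerSketch
{
    // Prefix quoted in the resolution text above.
    public const string GatewayClientPrefix = "ExternalSystem_";

    // True only for the gateway's own per-system clients. Ordinal comparison
    // is an assumption; the source only says "starts with".
    public static bool IsGatewayClient(string? clientName) =>
        clientName != null &&
        clientName.StartsWith(GatewayClientPrefix, StringComparison.Ordinal);

    // Returns a capped primary handler for gateway clients, or null to signal
    // "leave this client's existing primary handler untouched".
    public static SocketsHttpHandler? CreatePrimaryHandlerOrNull(
        string? clientName, int maxConnectionsPerServer) =>
        IsGatewayClient(clientName)
            ? new SocketsHttpHandler { MaxConnectionsPerServer = maxConnectionsPerServer }
            : null;
}
```

A named-options configurator would presumably call something like `CreatePrimaryHandlerOrNull(name, cap)` from its `Configure(name, options)` method and only replace the primary handler when the result is non-null, which is what keeps the cap off clients owned by other components.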
### ExternalSystemGateway-017 — `BuildUrl` appends a bare trailing `?` when a GET method's parameters are all null
@@ -899,7 +921,7 @@ _Unresolved._
|--|--|
| Severity | Low |
| Category | Correctness & logic bugs |
-| Status | Open |
+| Status | Resolved |
| Location | `src/ScadaLink.ExternalSystemGateway/ExternalSystemClient.cs:324-333` |
**Description**
@@ -921,4 +943,11 @@ produces a clean URL identical to the no-parameters case.
**Resolution**
-_Unresolved._
+Resolved 2026-05-17. Root cause confirmed: `BuildUrl` appended `"?" + queryString`
+whenever the GET/DELETE parameter dictionary was non-empty, even when every value
+was null and `queryString` was the empty string, yielding a bare trailing `?`. Fix:
+`BuildUrl` now appends `"?" + queryString` only when `queryString.Length > 0`, so a
+method whose effective parameter set is empty produces a URL identical to the
+no-parameters case. Regression test
+`Call_GetWithAllNullParameters_DoesNotAppendTrailingQuestionMark` asserts the
+captured request URI has no trailing `?`; it was verified to fail before the fix.
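The guard described above can be sketched in isolation. This is a hedged reconstruction rather than the real `BuildUrl`: the signature, the skipping of null values, and the use of `Uri.EscapeDataString` for encoding are assumptions made for illustration; only the `queryString.Length > 0` check is taken from the resolution.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public static class UrlSketch
{
    // Hypothetical signature; the real method's parameters are not shown in the diff.
    public static string BuildUrl(
        string baseUrl, IReadOnlyDictionary<string, object?>? parameters)
    {
        if (parameters == null || parameters.Count == 0)
        {
            return baseUrl;
        }

        // Null-valued parameters are skipped (assumed), so a non-empty
        // dictionary can still produce an empty query string.
        var queryString = string.Join("&", parameters
            .Where(p => p.Value != null)
            .Select(p =>
                Uri.EscapeDataString(p.Key) + "=" +
                Uri.EscapeDataString(
                    Convert.ToString(p.Value, CultureInfo.InvariantCulture) ?? string.Empty)));

        // The fix: append "?" only when there is something to put after it.
        return queryString.Length > 0 ? baseUrl + "?" + queryString : baseUrl;
    }
}
```

For a dictionary whose values are all null, this returns `baseUrl` unchanged, matching the no-parameters case; a dictionary with one non-null entry such as `id = 42` yields `baseUrl + "?id=42"`.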