fix(external-system-gateway): resolve ExternalSystemGateway-015..017 — treat MaxRetries=0 as unset, scope HTTP connection cap to gateway clients, no bare trailing '?'

This commit is contained in:
Joseph Doherty
2026-05-17 03:18:24 -04:00
parent 4fa6f0e774
commit da8c9f171b
7 changed files with 211 additions and 35 deletions

View File

@@ -8,7 +8,7 @@
| Last reviewed | 2026-05-17 |
| Reviewer | claude-agent |
| Commit reviewed | `39d737e` |
| Open findings | 3 |
| Open findings | 0 |
## Summary
@@ -788,7 +788,7 @@ against regression.
|--|--|
| Severity | High |
| Category | Correctness & logic bugs |
| Status | Open |
| Status | Resolved |
| Location | `src/ScadaLink.ExternalSystemGateway/ExternalSystemClient.cs:120-127`, `src/ScadaLink.ExternalSystemGateway/DatabaseGateway.cs:102-108` |
**Description**
@@ -842,7 +842,18 @@ outcome (parked / not retried), not just the stored column value.
**Resolution**
_Unresolved._
Resolved 2026-05-17. Root cause confirmed: the S&F engine treats a stored
`MaxRetries == 0` as "no limit / retry forever" (`StoreAndForwardMessage.MaxRetries`
doc "0 = no limit"; sweep guard `MaxRetries > 0 && RetryCount >= MaxRetries`), while
the entity's non-nullable `int MaxRetries` defaults to `0` — so passing it verbatim
buffered every cached call/write as an unbounded retry loop. Fix (ESG-side only,
recommendation (a)): `CachedCallAsync` and `CachedWriteAsync` now pass
`MaxRetries > 0 ? MaxRetries : null`, so an entity `0` is treated as "unset" and the
bounded S&F `DefaultMaxRetries` applies; the misleading "0 = never retry" inline
comments were corrected. The two `ZeroMaxRetries...` tests were rewritten to
`CachedCall_TransientFailure_ZeroMaxRetriesIsTreatedAsUnsetNotRetryForever` /
`CachedWrite_ZeroMaxRetriesIsTreatedAsUnsetNotRetryForever`, asserting the buffered
message carries the bounded default (99) and never `0`.
### ExternalSystemGateway-016 — `ConfigureHttpClientDefaults` applies the ESG connection cap to every `HttpClient` in the host process
@@ -850,7 +861,7 @@ _Unresolved._
|--|--|
| Severity | Medium |
| Category | Code organization & conventions |
| Status | Open |
| Status | Resolved |
| Location | `src/ScadaLink.ExternalSystemGateway/ServiceCollectionExtensions.cs:21-29` |
**Description**
@@ -891,7 +902,18 @@ preferred fix is to stop using `ConfigureHttpClientDefaults`.
**Resolution**
_Unresolved._
Resolved 2026-05-17. Root cause confirmed: `ConfigureHttpClientDefaults` is
process-global and replaced the primary handler of every `IHttpClientFactory` client
in the host, leaking the ESG connection cap onto unrelated clients. Fix: the global
`ConfigureHttpClientDefaults` registration was replaced with an
`IConfigureNamedOptions<HttpClientFactoryOptions>` (`GatewayHttpClientConfigurator`)
that applies the `SocketsHttpHandler`/`MaxConnectionsPerServer` cap only to clients
whose name starts with `ExternalSystem_` (the gateway's own per-system clients), so
clients owned by other components keep their own (or the framework default) primary
handler. Regression test
`ServiceWiringTests.MaxConcurrentConnectionsPerSystem_IsNotAppliedToNonGatewayHttpClients`
asserts a non-gateway client does not inherit the cap while the gateway client still
does; it was verified to fail before the fix.
### ExternalSystemGateway-017 — `BuildUrl` appends a bare trailing `?` when a GET method's parameters are all null
@@ -899,7 +921,7 @@ _Unresolved._
|--|--|
| Severity | Low |
| Category | Correctness & logic bugs |
| Status | Open |
| Status | Resolved |
| Location | `src/ScadaLink.ExternalSystemGateway/ExternalSystemClient.cs:324-333` |
**Description**
@@ -921,4 +943,11 @@ produces a clean URL identical to the no-parameters case.
**Resolution**
_Unresolved._
Resolved 2026-05-17. Root cause confirmed: `BuildUrl` appended `"?" + queryString`
whenever the GET/DELETE parameter dictionary was non-empty, even when every value
was null and `queryString` was the empty string, yielding a bare trailing `?`. Fix:
`BuildUrl` now appends `"?" + queryString` only when `queryString.Length > 0`, so a
method whose effective parameter set is empty produces a URL identical to the
no-parameters case. Regression test
`Call_GetWithAllNullParameters_DoesNotAppendTrailingQuestionMark` asserts the
captured request URI has no trailing `?`; it was verified to fail before the fix.

View File

@@ -96,15 +96,20 @@ public class DatabaseGateway : IDatabaseGateway
Parameters = parameters
});
// The per-connection retry settings are passed through verbatim — a
// configured MaxRetries of 0 means "never retry" and must NOT be
// collapsed to the S&F default (ExternalSystemGateway-004).
// ExternalSystemGateway-015: the entity's MaxRetries is a non-nullable int
// whose default is 0, and the Store-and-Forward engine interprets a stored
// MaxRetries of 0 as "no limit" (retry forever) — see
// StoreAndForwardMessage.MaxRetries ("0 = no limit") and the retry-sweep
// guard `MaxRetries > 0 && ...`. Passing 0 verbatim would turn every
// unconfigured cached write into an unbounded retry loop. A 0 is treated as
// "unset" and passed as null so the bounded S&F default applies; the
// RetryDelay default of TimeSpan.Zero is likewise unset.
await _storeAndForward.EnqueueAsync(
StoreAndForwardCategory.CachedDbWrite,
connectionName,
payload,
originInstanceName,
definition.MaxRetries,
definition.MaxRetries > 0 ? definition.MaxRetries : null,
definition.RetryDelay > TimeSpan.Zero ? definition.RetryDelay : null);
}

View File

@@ -114,15 +114,20 @@ public class ExternalSystemClient : IExternalSystemClient
// attempt above; letting EnqueueAsync re-invoke the handler would
// dispatch the same request a second time.
//
// The per-system retry settings are passed through verbatim — a
// configured MaxRetries of 0 means "never retry" and must NOT be
// collapsed to the S&F default (ExternalSystemGateway-004).
// ExternalSystemGateway-015: the entity's MaxRetries is a non-nullable
// int whose default is 0, and the Store-and-Forward engine interprets a
// stored MaxRetries of 0 as "no limit" (retry forever) — see
// StoreAndForwardMessage.MaxRetries ("0 = no limit") and the retry-sweep
// guard `MaxRetries > 0 && ...`. Passing 0 verbatim would therefore turn
// every unconfigured cached call into an unbounded retry loop. A 0 is
// treated as "unset" and passed as null so the bounded S&F default
// applies; the RetryDelay default of TimeSpan.Zero is likewise unset.
await _storeAndForward.EnqueueAsync(
StoreAndForwardCategory.ExternalSystem,
systemName,
payload,
originInstanceName,
system.MaxRetries,
system.MaxRetries > 0 ? system.MaxRetries : null,
system.RetryDelay > TimeSpan.Zero ? system.RetryDelay : null,
attemptImmediateDelivery: false);
@@ -329,8 +334,16 @@ public class ExternalSystemClient : IExternalSystemClient
var queryString = string.Join("&",
parameters.Where(p => p.Value != null)
.Select(p => $"{Uri.EscapeDataString(p.Key)}={Uri.EscapeDataString(p.Value?.ToString() ?? "")}"));
// Only append "?" when the effective query string is non-empty — a method
// whose parameter values are all null produces no query string, and the
// URL must then be identical to the no-parameters case rather than ending
// in a bare "?" (ExternalSystemGateway-017).
if (queryString.Length > 0)
{
url += "?" + queryString;
}
}
return url;
}

View File

@@ -1,4 +1,5 @@
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Http;
using Microsoft.Extensions.Options;
using ScadaLink.Commons.Interfaces.Services;
@@ -6,6 +7,12 @@ namespace ScadaLink.ExternalSystemGateway;
public static class ServiceCollectionExtensions
{
/// <summary>
/// Name prefix of the per-system <see cref="System.Net.Http.HttpClient"/> clients
/// created by <see cref="ExternalSystemClient"/> (<c>ExternalSystem_{systemName}</c>).
/// </summary>
internal const string GatewayClientNamePrefix = "ExternalSystem_";
public static IServiceCollection AddExternalSystemGateway(this IServiceCollection services)
{
services.AddOptions<ExternalSystemGatewayOptions>()
@@ -13,20 +20,18 @@ public static class ServiceCollectionExtensions
services.AddHttpClient();
// ExternalSystemGateway-013: wire MaxConcurrentConnectionsPerSystem into the
// primary handler of every per-system named client ("ExternalSystem_{name}"),
// so the option an operator configures actually bounds concurrent connections
// instead of being silently ignored. ConfigureHttpClientDefaults applies to
// the dynamically-named clients created by ExternalSystemClient.
services.ConfigureHttpClientDefaults(builder =>
builder.ConfigurePrimaryHttpMessageHandler(sp =>
{
var options = sp.GetRequiredService<IOptions<ExternalSystemGatewayOptions>>().Value;
return new SocketsHttpHandler
{
MaxConnectionsPerServer = options.MaxConcurrentConnectionsPerSystem,
};
}));
// ExternalSystemGateway-013 / -016: wire MaxConcurrentConnectionsPerSystem
// into the primary handler of the gateway's per-system named clients
// ("ExternalSystem_{name}") only. The names are created dynamically, so a
// static AddHttpClient("name") registration is not possible; instead a
// post-configure on HttpClientFactoryOptions is applied, filtered by the
// client-name prefix. ConfigureHttpClientDefaults is deliberately NOT used —
// it is process-global and would replace the primary handler of every
// HttpClient in the host (e.g. the Notification Service's OAuth2 token
// client), silently capping and overriding unrelated components.
services.AddSingleton<IConfigureOptions<HttpClientFactoryOptions>>(sp =>
new GatewayHttpClientConfigurator(
sp.GetRequiredService<IOptionsMonitor<ExternalSystemGatewayOptions>>()));
services.AddScoped<ExternalSystemClient>();
services.AddScoped<IExternalSystemClient>(sp => sp.GetRequiredService<ExternalSystemClient>());
@@ -42,4 +47,41 @@ public static class ServiceCollectionExtensions
// Script Execution Actors run on dedicated blocking I/O dispatcher.
return services;
}
/// <summary>
/// ExternalSystemGateway-016: configures the primary HTTP message handler with the
/// gateway's <see cref="ExternalSystemGatewayOptions.MaxConcurrentConnectionsPerSystem"/>
/// cap, but only for the gateway's own named clients
/// (<see cref="GatewayClientNamePrefix"/>). Clients owned by other host components
/// are left untouched, so the cap does not leak process-wide.
/// </summary>
private sealed class GatewayHttpClientConfigurator
: IConfigureNamedOptions<HttpClientFactoryOptions>
{
private readonly IOptionsMonitor<ExternalSystemGatewayOptions> _options;
public GatewayHttpClientConfigurator(IOptionsMonitor<ExternalSystemGatewayOptions> options)
{
_options = options;
}
public void Configure(HttpClientFactoryOptions options)
{
// The default (unnamed) client is not a gateway client — do nothing.
}
public void Configure(string? name, HttpClientFactoryOptions options)
{
if (name == null || !name.StartsWith(GatewayClientNamePrefix, StringComparison.Ordinal))
{
return;
}
options.HttpMessageHandlerBuilderActions.Add(builder =>
builder.PrimaryHandler = new SocketsHttpHandler
{
MaxConnectionsPerServer = _options.CurrentValue.MaxConcurrentConnectionsPerSystem,
});
}
}
}

View File

@@ -113,8 +113,13 @@ public class DatabaseGatewayTests
}
[Fact]
public async Task CachedWrite_ZeroMaxRetriesIsHonouredNotTreatedAsUnset()
public async Task CachedWrite_ZeroMaxRetriesIsTreatedAsUnsetNotRetryForever()
{
// ExternalSystemGateway-015: a stored MaxRetries of 0 is interpreted by the
// Store-and-Forward retry sweep as "no limit" (retry forever). The entity's
// non-nullable int default is also 0, so the gateway must treat the
// connection's MaxRetries == 0 as "unset" and pass null — the bounded S&F
// default must apply, never 0.
var conn = new DatabaseConnectionDefinition("testDb", "Server=localhost;Database=test")
{
Id = 1,
@@ -144,7 +149,9 @@ public class DatabaseGatewayTests
await gateway.CachedWriteAsync("testDb", "INSERT INTO t VALUES (1)");
var (maxRetries, _) = ReadBufferedRetrySettings(connStr);
Assert.Equal(0, maxRetries); // honoured — not the S&F default of 99
// Must be the bounded S&F default, never 0 — a stored 0 would mean retry-forever.
Assert.Equal(99, maxRetries);
Assert.NotEqual(0, maxRetries);
}
private static (int MaxRetries, long RetryIntervalMs) ReadBufferedRetrySettings(string connStr)

View File

@@ -396,9 +396,14 @@ public class ExternalSystemClientTests
}
[Fact]
public async Task CachedCall_TransientFailure_ZeroMaxRetriesIsHonouredNotTreatedAsUnset()
public async Task CachedCall_TransientFailure_ZeroMaxRetriesIsTreatedAsUnsetNotRetryForever()
{
// MaxRetries == 0 must mean "never retry", not "fall back to the S&F default".
// ExternalSystemGateway-015: the Store-and-Forward engine interprets a stored
// MaxRetries of 0 as "no limit" (retry forever) — see StoreAndForwardMessage.cs
// and the retry-sweep guard `MaxRetries > 0 && ...`. The entity's non-nullable
// int default is also 0, so passing 0 verbatim would buffer every cached call
// as an unbounded retry loop. The ESG must therefore treat the entity's
// MaxRetries == 0 as "unset" and pass null, so the bounded S&F default applies.
var system = new ExternalSystemDefinition("TestAPI", "https://api.example.com", "none")
{
Id = 1,
@@ -432,7 +437,9 @@ public class ExternalSystemClientTests
await client.CachedCallAsync("TestAPI", "postData");
var (maxRetries, _) = ReadBufferedRetrySettings(connStr);
Assert.Equal(0, maxRetries); // honoured — not the default of 99
// Must be the bounded S&F default, never 0 — a stored 0 would mean retry-forever.
Assert.Equal(99, maxRetries);
Assert.NotEqual(0, maxRetries);
}
// ── ExternalSystemGateway-005: HttpRequestMessage / HttpResponseMessage disposal ──
@@ -615,6 +622,33 @@ public class ExternalSystemClientTests
Assert.Contains("page=2", uri);
}
[Fact]
public async Task Call_GetWithAllNullParameters_DoesNotAppendTrailingQuestionMark()
{
// ExternalSystemGateway-017: a GET method invoked with a non-empty parameter
// dictionary whose values are all null has an effectively empty query string.
// The URL must be identical to the no-parameters case — no bare trailing '?'.
var system = new ExternalSystemDefinition("TestAPI", "https://api.example.com", "none") { Id = 1 };
var method = new ExternalSystemMethod("search", "GET", "/search") { Id = 1, ExternalSystemDefinitionId = 1 };
StubResolution(system, method);
var handler = new RequestCapturingHandler(HttpStatusCode.OK, "{}");
_httpClientFactory.CreateClient(Arg.Any<string>()).Returns(new HttpClient(handler));
var client = new ExternalSystemClient(
_httpClientFactory, _repository, NullLogger<ExternalSystemClient>.Instance);
await client.CallAsync("TestAPI", "search", new Dictionary<string, object?>
{
["q"] = null,
["page"] = null,
});
var uri = handler.LastUri!.AbsoluteUri;
Assert.Equal("https://api.example.com/search", uri);
Assert.DoesNotContain("?", uri);
}
[Fact]
public async Task Call_PostWithParameters_SendsJsonBody()
{

View File

@@ -40,6 +40,52 @@ public class ServiceWiringTests
Assert.Equal(4, sockets.MaxConnectionsPerServer);
}
[Fact]
public void MaxConcurrentConnectionsPerSystem_IsNotAppliedToNonGatewayHttpClients()
{
// ExternalSystemGateway-016: the gateway's connection cap must be scoped to
// its own per-system clients ("ExternalSystem_{name}"). It must NOT leak onto
// unrelated HttpClient consumers in the same host process (e.g. the
// Notification Service's OAuth2 token client) — that would silently throttle
// and override the primary-handler configuration of another component.
var config = new ConfigurationBuilder()
.AddInMemoryCollection(new Dictionary<string, string?>
{
["ScadaLink:ExternalSystemGateway:MaxConcurrentConnectionsPerSystem"] = "4",
})
.Build();
var services = new ServiceCollection();
services.AddLogging();
services.AddSingleton<IConfiguration>(config);
services.AddSingleton(Substitute.For<IExternalSystemRepository>());
services.AddExternalSystemGateway();
// A client owned by a different component, registered the way the
// Notification Service registers its OAuth2 token client — a plain
// AddHttpClient with no custom primary handler. Its primary handler must
// remain the framework default (uncapped), not the gateway's SocketsHttpHandler.
services.AddHttpClient("NotificationService_OAuth2");
using var provider = services.BuildServiceProvider();
var handlerFactory = provider.GetRequiredService<IHttpMessageHandlerFactory>();
// The gateway's own client must still get the gateway cap.
var gatewayPrimary = FindPrimaryHandler(handlerFactory.CreateHandler("ExternalSystem_AnySystem"));
Assert.Equal(4, Assert.IsType<SocketsHttpHandler>(gatewayPrimary).MaxConnectionsPerServer);
// The unrelated component's client must NOT inherit the gateway's connection
// cap. With ConfigureHttpClientDefaults the primary handler is a
// SocketsHttpHandler capped at the gateway's value (the leak); with a scoped
// registration it is the framework default whose MaxConnectionsPerServer is
// int.MaxValue.
var otherPrimary = FindPrimaryHandler(handlerFactory.CreateHandler("NotificationService_OAuth2"));
if (otherPrimary is SocketsHttpHandler otherSockets)
{
Assert.NotEqual(4, otherSockets.MaxConnectionsPerServer);
}
}
private static HttpMessageHandler FindPrimaryHandler(HttpMessageHandler handler)
{
var current = handler;