fix(error-handling): close Theme 4 — 18 cancellation / fire-and-forget findings
Async cancellation hygiene, fire-and-forget observability, retry/shutdown semantics, and audit-row coverage across 9 modules. Highlights: Cancellation & lifecycle: - AuditLog-006: SqliteAuditWriter.Dispose hops to thread pool, escaping the captured SyncContext that risked sync-over-async deadlock. - AuditLog-010: SiteAuditTelemetryActor owns a private lifecycle CTS, threaded through drain paths instead of CancellationToken.None. - Comm-019: CentralCommunicationActor adds lifecycle CTS for repo calls. - Host-019: Migration StartupRetry forwards ApplicationStopping so SIGTERM during the bounded-retry window aborts cleanly. Cursor / retry / counter correctness: - AuditLog-004: SiteAuditReconciliationActor's cursor now holds at `since` when any row's idempotent insert is still being retried (per-EventId retry counter, MaxPermanentInsertAttempts=5 escape valve with LogCritical abandon). No more silent abandonment of permanently-failing rows. - ConfigDB-019: Dropped the catch-and-continue on EnsureLookaheadAsync's SPLIT loop — by class-doc construction the catch could only mask real failures and let the next iteration create permanent partition holes. - HM-017/018: HealthReportSender + CentralHealthReportLoop snapshot per-interval counters before sending, restore via new ISiteHealthCollector.AddIntervalCounters on transport failure so counts aren't silently lost. Fire-and-forget / shutdown waits: - InboundAPI-018: AuditWriteMiddleware observes faulted audit-write tasks via OnlyOnFaulted continuation (Warning log; response unchanged). - SnF-024: StoreAndForwardService.StopAsync awaits in-flight retry sweep with a bounded SweepShutdownWaitTimeout (10s). Leak / refactor: - Comm-021: SiteStreamGrpcServer.SubscribeInstance wraps Subscribe in its own try/catch so a throw doesn't leak the relay actor or _activeStreams entry. - Comm-022: VERIFIED already-closed by Comm-016's dead-code purge. - CLI-017: BundleCommands' three subcommands delegate to ExecuteCommandAsync (auth-failure exit-code contract unified). Defensive / validation: - CLI-021: CliConfig.Load wraps file-read/JSON parse so malformed config prints a warning and returns defaults instead of crashing the CLI. - Host-022: ParseLevel emits stderr one-shot warning for unrecognised MinimumLevel instead of silently coercing to Information. - ESG-019: ExternalSystemClient sets HttpClient.Timeout=Infinite so the per-call CTS is the sole timeout source (was clipped to 100s by .NET). - Security-020: New SecurityOptionsValidator (IValidateOptions) rejects empty LdapServer/LdapSearchBase with ValidateOnStart. - DM-019: Lifecycle command timeouts now emit DisableTimedOut/EnableTimedOut/ DeleteTimedOut audit entries (mirrors DeployFailed pattern). Plus reconciled stale per-module Open-findings counters that had drifted from prior sessions. 20+ new regression tests across 11 test projects; build clean; affected suites all green. README regenerated: 75 open (was 93).
This commit is contained in:
@@ -17,13 +17,28 @@ internal static class CommandHelpers
|
||||
/// <param name="usernameOption">Option that supplies the username override.</param>
|
||||
/// <param name="passwordOption">Option that supplies the password override.</param>
|
||||
/// <param name="command">The management command object to send.</param>
|
||||
/// <param name="timeout">
|
||||
/// Optional per-command HTTP timeout. Defaults to 30s, matching the management API's
|
||||
/// own request timeout. Larger payloads (e.g. Transport bundles) should supply a
|
||||
/// longer value.
|
||||
/// </param>
|
||||
/// <param name="onSuccess">
|
||||
/// Optional success handler. When supplied, the helper invokes it with the success
|
||||
/// body instead of running the default <see cref="HandleResponse"/> rendering path —
|
||||
/// useful when the caller needs to capture the response (e.g. write a file) rather
|
||||
/// than print it. The authorization-failure exit-code contract
|
||||
/// (<see cref="IsAuthorizationFailure"/>) is preserved on the error path either way,
|
||||
/// closing CLI-017's regression.
|
||||
/// </param>
|
||||
internal static async Task<int> ExecuteCommandAsync(
|
||||
ParseResult result,
|
||||
Option<string> urlOption,
|
||||
Option<string> formatOption,
|
||||
Option<string> usernameOption,
|
||||
Option<string> passwordOption,
|
||||
object command)
|
||||
object command,
|
||||
TimeSpan? timeout = null,
|
||||
Func<string, int>? onSuccess = null)
|
||||
{
|
||||
var config = CliConfig.Load();
|
||||
var format = ResolveFormat(result, formatOption, config);
|
||||
@@ -67,7 +82,20 @@ internal static class CommandHelpers
|
||||
|
||||
// Send via HTTP
|
||||
using var client = new ManagementHttpClient(url, username, password);
|
||||
var response = await client.SendCommandAsync(commandName, command, TimeSpan.FromSeconds(30));
|
||||
var response = await client.SendCommandAsync(commandName, command, timeout ?? TimeSpan.FromSeconds(30));
|
||||
|
||||
// Caller-supplied success handler short-circuits the default rendering — but
|
||||
// the error path still routes through IsAuthorizationFailure so the documented
|
||||
// exit-2 contract holds whether or not a custom handler is provided
|
||||
// (CLI-017 unification of the bundle path).
|
||||
if (onSuccess is not null)
|
||||
{
|
||||
if (response.JsonData is not null)
|
||||
return onSuccess(response.JsonData);
|
||||
|
||||
OutputFormatter.WriteError(response.Error ?? "Unknown error", response.ErrorCode ?? "ERROR");
|
||||
return IsAuthorizationFailure(response) ? 2 : 1;
|
||||
}
|
||||
|
||||
return HandleResponse(response, format);
|
||||
}
|
||||
|
||||
Reference in New Issue
Block a user