fix(driver-galaxy): wire event-stream faults to the reconnect supervisor (Driver.Galaxy-001)

The ReconnectSupervisor was constructed but its trigger ReportTransportFailure was never called. When the gateway StreamEvents stream faulted, EventPump just logged and exited without notifying the supervisor, so a transient gateway drop permanently stopped data-change notifications while GetHealth() still reported Healthy.

EventPump gains an optional onStreamFault callback invoked from its stream-fault catch block (not on clean shutdown). GalaxyDriver wires it to ReconnectSupervisor.ReportTransportFailure so a transport drop drives reopen → replay. This is the minimal fix for -001; the pump-restart-on-reopen gap remains tracked as Driver.Galaxy-008.

Regression tests cover the callback being invoked on fault, the end-to-end supervisor reopen/replay, and that a clean shutdown does not fire it. Driver.Galaxy suite: 206/206 pass.

Resolves code-review finding Driver.Galaxy-001 (Critical).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@@ -732,13 +732,43 @@ public sealed class GalaxyDriver
             _eventPump = new EventPump(
                 _subscriber!, _subscriptions, _logger,
                 channelCapacity: _options.MxAccess.EventPumpChannelCapacity,
-                clientName: _options.MxAccess.ClientName);
+                clientName: _options.MxAccess.ClientName,
+                onStreamFault: OnEventPumpStreamFault);
             _eventPump.OnDataChange += OnPumpDataChange;
             _eventPump.Start();
             return _eventPump;
         }
     }

+    /// <summary>
+    /// Stream-fault callback for the <see cref="EventPump"/>. The gw StreamEvents
+    /// stream faulted (transient gateway drop, network blip, gw restart). Forward
+    /// the cause to the <see cref="ReconnectSupervisor"/> so it drives reopen →
+    /// replay; without this hand-off a transient transport drop permanently kills
+    /// the event stream and <c>GetHealth()</c> keeps reporting Healthy.
+    /// </summary>
+    private void OnEventPumpStreamFault(Exception cause)
+    {
+        var supervisor = _supervisor;
+        if (supervisor is null)
+        {
+            // No production runtime (skeleton / injected-seam path) — nothing to drive.
+            _logger.LogWarning(cause,
+                "GalaxyDriver {InstanceId} event stream faulted but no reconnect supervisor is wired.",
+                _driverInstanceId);
+            return;
+        }
+
+        try
+        {
+            supervisor.ReportTransportFailure(cause);
+        }
+        catch (ObjectDisposedException)
+        {
+            // Driver is being disposed — the stream fault is just shutdown noise.
+        }
+    }
+
     // ===== IAlarmSource =====

     /// <summary>