metrics(alarms): expose provider-switch count in snapshot, bound the reason tag
B1: add AlarmProviderSwitchCount to GatewayMetricsSnapshot so the switch total is readable without scraping the OTEL counter.
B2: replace the free-text reason tag on mxgateway.alarms.provider_switches with a bounded AlarmProviderSwitchReason enum (failover/failback/unknown); the human-readable reason stays in the structured log.
@@ -399,7 +399,13 @@ public sealed class GatewayAlarmMonitor : BackgroundService, IGatewayAlarmServic
             BroadcastToAll(new AlarmFeedMessage { ProviderStatus = status });
         }
 
-        _metrics.AlarmProviderSwitched(fromModeInt, ModeToInt(toMode), reason);
+        AlarmProviderSwitchReason switchReason = toMode switch
+        {
+            AlarmProviderMode.Subtag => AlarmProviderSwitchReason.Failover,
+            AlarmProviderMode.Alarmmgr => AlarmProviderSwitchReason.Failback,
+            _ => AlarmProviderSwitchReason.Unknown,
+        };
+        _metrics.AlarmProviderSwitched(fromModeInt, ModeToInt(toMode), switchReason);
 
         _logger.LogInformation(
             "Alarm provider mode changed to {Mode} (degraded={Degraded}): {Reason}",
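The receiving side of this change is not part of the hunk, so the following is only a sketch of how a metrics sink could satisfy both B1 and B2. Only the `AlarmProviderSwitched(int, int, AlarmProviderSwitchReason)` call shape, the counter name, and the three reason values come from the commit. The class name `GatewayMetricsSketch`, the meter name, the tag keys (`from_mode`, `to_mode`, `reason`), the enum member order, and the one-field snapshot record are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics.Metrics;
using System.Threading;

// Bounded reason set named in the commit message (member order is assumed).
public enum AlarmProviderSwitchReason { Unknown, Failover, Failback }

// Hypothetical, trimmed-down snapshot; the real GatewayMetricsSnapshot has more members.
public sealed record GatewayMetricsSnapshot(long AlarmProviderSwitchCount);

// Hypothetical metrics sink, not the gateway's actual implementation.
public sealed class GatewayMetricsSketch : IDisposable
{
    private readonly Meter _meter = new("mxgateway.sketch"); // meter name is assumed
    private readonly Counter<long> _providerSwitches;
    private long _providerSwitchCount;

    public GatewayMetricsSketch()
    {
        _providerSwitches = _meter.CreateCounter<long>("mxgateway.alarms.provider_switches");
    }

    public void AlarmProviderSwitched(int fromMode, int toMode, AlarmProviderSwitchReason reason)
    {
        // B1: keep a plain in-process total so a snapshot can report it directly.
        Interlocked.Increment(ref _providerSwitchCount);

        // B2: the reason tag can only take one of three fixed string values.
        _providerSwitches.Add(
            1,
            new KeyValuePair<string, object?>("from_mode", fromMode),
            new KeyValuePair<string, object?>("to_mode", toMode),
            new KeyValuePair<string, object?>("reason", ReasonTag(reason)));
    }

    // Maps the enum to its tag value; anything unrecognised collapses to "unknown".
    public static string ReasonTag(AlarmProviderSwitchReason reason) => reason switch
    {
        AlarmProviderSwitchReason.Failover => "failover",
        AlarmProviderSwitchReason.Failback => "failback",
        _ => "unknown",
    };

    public GatewayMetricsSnapshot Snapshot() =>
        new(Interlocked.Read(ref _providerSwitchCount));

    public void Dispose() => _meter.Dispose();
}
```

Under these assumptions, the number of distinct `reason` values on the counter is fixed at three regardless of what the free-text reason says, and the snapshot total can be read without attaching a metrics listener.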