docs(alarms): clarify resolver cancellation contract; mark design doc superseded
C6b: IAlarmWatchListResolver.ResolveAsync doc now notes that while discovery being unavailable never throws, a triggered cancellation token still propagates.
C7: annotate the original design doc where it drifted from the shipped code: metric names / unimplemented watch-list gauges, and the proto-type location (gateway proto, not worker proto).
@@ -1,7 +1,10 @@
 # Alarm Subtag-Monitoring Fallback — Design
 
 **Date:** 2026-06-13
-**Status:** Approved (brainstorming), ready for implementation planning
+**Status:** Superseded by implementation (merged to `main`). This is the original
+brainstorming design; a few details below were refined during implementation —
+see the inline **Superseded** notes. The shipped behaviour is documented in
+`docs/AlarmClientDiscovery.md`, the client READMEs, and the contracts.
 **Branch:** `feat/alarm-subtag-fallback`
 
 ## Problem
@@ -162,6 +165,11 @@ reconcile cadence and pushes an updated watch-list when the model changes.
 
 **`mxaccess_worker.proto`:**
 
+> **Superseded:** these additions shipped in `mxaccess_gateway.proto`, not
+> `mxaccess_worker.proto` — the worker imports the gateway proto and the alarm
+> commands/events live there (`AlarmSubtagTarget`,
+> `OnAlarmProviderModeChangedEvent`, the extended subscribe command).
+
 - Extend the alarm-subscribe command with: `AlarmProviderMode forced_mode`
 (`UNSPECIFIED` = auto), `int32 consecutive_failure_threshold`,
 `int32 failback_probe_interval_seconds`, `int32 failback_stable_probes`, and
@@ -240,6 +248,12 @@ to `/hubs/alarms`, (c) update metrics, (d) force a reconcile.
 - `mxgateway_alarm_provider_switch_total{from,to,reason}` (counter)
 - `mxgateway_alarm_fallback_watchlist_size` (gauge)
 
+> **Superseded:** the shipped meter names are `mxgateway.alarms.provider_mode`
+> (gauge) and `mxgateway.alarms.provider_switches{from,to,reason}` (counter,
+> `reason` bounded to `failover`/`failback`/`unknown`). The watch-list-size /
+> watch-list-empty gauges were not implemented; an empty watch-list is surfaced
+> via a warning log and the feed's degraded `ProviderStatus` instead.
+
 ## Configuration
 
 ```jsonc
@@ -19,8 +19,10 @@ public interface IAlarmWatchListResolver
 /// <param name="cancellationToken">Token to cancel the asynchronous operation.</param>
 /// <returns>
 /// The resolved <see cref="AlarmSubtagTarget"/> watch-list, possibly empty.
-/// Discovery being unavailable never throws; the caller decides what to do
-/// with an empty list.
+/// Discovery being unavailable never throws — it yields an empty (or
+/// config-only) list and the caller decides what to do with it. Cancellation
+/// is the one exception: a triggered <paramref name="cancellationToken"/>
+/// still propagates an <see cref="OperationCanceledException"/>.
 /// </returns>
 Task<IReadOnlyList<AlarmSubtagTarget>> ResolveAsync(
 AlarmsOptions options,