docs: implementation plan for alarm subtag-monitoring fallback
18 TDD tasks across contracts, worker (SubtagAlarmConsumer + FailoverAlarmConsumer), gateway (GR-SQL watch-list discovery, monitor mode reflection, metrics, dashboard), and docs. Grounded in current signatures; parity-preserving (worker-side synthesis).
@@ -0,0 +1,858 @@
# Alarm Subtag-Monitoring Fallback: Implementation Plan

> **For Claude:** REQUIRED SUB-SKILL: Use superpowers-extended-cc:executing-plans (or subagent-driven-development) to implement this plan task-by-task.

**Goal:** Add a second alarm source, direct MXAccess subtag monitoring, that the gateway fails over to automatically when the wnwrap alarmmgr provider breaks, fails back from automatically when that provider recovers, and can be forced on by config.

**Architecture:** Worker-side synthesis (parity rule preserved). A new `SubtagAlarmConsumer` (own `LMXProxyServerClass`, `AddItem`/`Advise` on alarm subtags) and a `FailoverAlarmConsumer` composite (state machine over the wnwrap primary + subtag standby) both implement the existing `IMxAccessAlarmConsumer` seam. The gateway resolves the subtag watch-list (Galaxy Repository SQL + config override), arms the worker at subscribe time, and reflects the live provider mode into the gRPC alarm feed, the dashboard hub, and metrics.

**Tech Stack:** .NET 10 (gateway, x64) + .NET Framework 4.8 (worker, x86, STA), protobuf/gRPC, `Microsoft.Data.SqlClient` (Galaxy Repository), SignalR (dashboard), `System.Diagnostics.Metrics`, xUnit (plain `Assert`, no FluentAssertions).

**Design source:** `docs/plans/2026-06-13-alarm-subtag-fallback-design.md`

**Branch:** `feat/alarm-subtag-fallback` (already created)

---

## Conventions for every task

- **TDD:** write the failing test, run it red, implement, run it green, commit.
- **xUnit, plain `Assert.*`**, naming `Subject_Condition_Expected`. Worker fakes are sealed private nested classes that raise events.
- **Build/test commands:**
  - Contracts regen: `dotnet build src/ZB.MOM.WW.MxGateway.Contracts/ZB.MOM.WW.MxGateway.Contracts.csproj`
  - Gateway: `dotnet build src/ZB.MOM.WW.MxGateway.Server` ; `dotnet test src/ZB.MOM.WW.MxGateway.Tests/ZB.MOM.WW.MxGateway.Tests.csproj`
  - Worker (x86): `dotnet build src/ZB.MOM.WW.MxGateway.Worker/ZB.MOM.WW.MxGateway.Worker.csproj -p:Platform=x86` ; `dotnet test src/ZB.MOM.WW.MxGateway.Worker.Tests/ZB.MOM.WW.MxGateway.Worker.Tests.csproj -p:Platform=x86`
  - Single test: append `--filter FullyQualifiedName~<ClassOrMethod>`
- **Build is strict:** `TreatWarningsAsErrors=true`, nullable enabled. Add XML doc comments on public members (the repo runs a doc checker).
- **Generated code** under `Generated/` is never hand-edited; rebuild the contracts project to regenerate.
- **Namespaces:** worker MxAccess types live in `ZB.MOM.WW.MxGateway.Worker.MxAccess`; proto C# types in `ZB.MOM.WW.MxGateway.Contracts.Proto`.

---

## Phase 0: Contracts

### Task 1: Worker proto (subtag watch-list, failover config, provider-mode enum)

**Classification:** high-risk
**Estimated implement time:** ~4 min
**Parallelizable with:** none (Task 2 imports these types)

**Files:**
- Modify: `src/ZB.MOM.WW.MxGateway.Contracts/Protos/mxaccess_worker.proto` (alarm command block, ~lines 318-346)

**Step 1: Add the enum and messages.** In `mxaccess_worker.proto`, replace the `SubscribeAlarmsCommand` message with the block below, which also adds the new enum and the two new messages around it:

```protobuf
// Provider selection / current provider for the alarm feed. Defined here in
// the worker contract because the worker SubscribeAlarmsCommand references it;
// mxaccess_gateway.proto imports this file and reuses the same enum.
enum AlarmProviderMode {
  ALARM_PROVIDER_MODE_UNSPECIFIED = 0;  // auto: alarmmgr primary, subtag fallback
  ALARM_PROVIDER_MODE_ALARMMGR = 1;
  ALARM_PROVIDER_MODE_SUBTAG = 2;
}

message SubscribeAlarmsCommand {
  string subscription_expression = 1;
  // UNSPECIFIED = auto-failover/failback. ALARMMGR/SUBTAG force one provider.
  AlarmProviderMode forced_mode = 2;
  // Subtag watch-list resolved by the gateway (GR SQL + config). Empty in pure
  // alarmmgr mode; in subtag mode it bounds what the consumer can observe.
  repeated AlarmSubtagTarget watch_list = 3;
  AlarmFailoverConfig failover = 4;
}

// One alarm attribute the subtag consumer advises. Addresses are full MXAccess
// item references the worker passes straight to AddItem.
message AlarmSubtagTarget {
  string alarm_full_reference = 1;     // e.g. "Galaxy!Area.Tank01.Level.HiHi"
  string source_object_reference = 2;  // e.g. "Tank01"
  string active_subtag = 3;            // item address of the in-alarm boolean
  string acked_subtag = 4;             // item address of the acknowledged boolean
  string ack_comment_subtag = 5;       // writable ack-comment attribute (ack write target)
  string priority_subtag = 6;          // optional severity source; empty if absent
}

message AlarmFailoverConfig {
  int32 consecutive_failure_threshold = 1;    // wnwrap COM failures before switching (>=1)
  int32 failback_probe_interval_seconds = 2;  // probe cadence while degraded (>=1)
  int32 failback_stable_probes = 3;           // clean probes before switching back (>=1)
}
```

`UnsubscribeAlarmsCommand` and `AcknowledgeAlarmCommand` are unchanged.
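
The `>=1` bounds noted in the `AlarmFailoverConfig` comments are documentation only: proto3 does not enforce them, and an unset `int32` field decodes as 0. A minimal sketch of a guard follows, assuming a hypothetical `FailoverConfigGuard` helper and a plain `FailoverConfigValues` stand-in for the generated message. Neither name is in the plan, and where such a check would live (gateway or worker) is left open here.

```csharp
using System;
using System.Collections.Generic;

// Stand-in for the generated AlarmFailoverConfig message (hypothetical; the
// real type comes out of the contracts build).
public sealed class FailoverConfigValues
{
    public int ConsecutiveFailureThreshold { get; set; }
    public int FailbackProbeIntervalSeconds { get; set; }
    public int FailbackStableProbes { get; set; }
}

// Hypothetical helper, not part of the plan: lists every violated >=1 bound.
public static class FailoverConfigGuard
{
    public static IReadOnlyList<string> Validate(FailoverConfigValues config)
    {
        if (config == null) throw new ArgumentNullException(nameof(config));

        var errors = new List<string>();
        if (config.ConsecutiveFailureThreshold < 1)
            errors.Add("consecutive_failure_threshold must be >= 1");
        if (config.FailbackProbeIntervalSeconds < 1)
            errors.Add("failback_probe_interval_seconds must be >= 1");
        if (config.FailbackStableProbes < 1)
            errors.Add("failback_stable_probes must be >= 1");
        return errors;
    }
}
```

Returning a list instead of throwing on the first violation lets a caller report every bad field at once; an all-zero (unset) message yields three errors.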

**Step 2: Regenerate & verify it compiles.**
Run: `dotnet build src/ZB.MOM.WW.MxGateway.Contracts/ZB.MOM.WW.MxGateway.Contracts.csproj`
Expected: build succeeds; generated `AlarmProviderMode`, `AlarmSubtagTarget`, `AlarmFailoverConfig` types appear.

**Step 3: Commit.**
```bash
git add src/ZB.MOM.WW.MxGateway.Contracts/Protos/mxaccess_worker.proto
git commit -m "contracts(worker): subtag watch-list + failover config + AlarmProviderMode"
```

---

### Task 2: Gateway proto (provider status on the feed, degraded provenance, mode-changed event)

**Classification:** high-risk
**Estimated implement time:** ~5 min
**Parallelizable with:** none (depends on Task 1; Task 3 tests both)

**Files:**
- Modify: `src/ZB.MOM.WW.MxGateway.Contracts/Protos/mxaccess_gateway.proto` (`OnAlarmTransitionEvent` ~719-771, `ActiveAlarmSnapshot` ~783-803, `AlarmFeedMessage` ~860-870, `MxEventFamily` enum, `MxEvent` body oneof)

**Step 1: Add degraded provenance to the two alarm payloads.** Append to `OnAlarmTransitionEvent` (next free field 14):

```protobuf
  // True when this transition came from the subtag-monitoring fallback rather
  // than the native alarmmgr provider; i.e. it was synthesized from data
  // changes and carries reduced fidelity (synthetic GUID, no native raise time).
  bool degraded = 14;
  // Which provider produced this transition.
  AlarmProviderMode source_provider = 15;
```

Append the identical two fields to `ActiveAlarmSnapshot` (next free field 14):
```protobuf
  bool degraded = 14;
  AlarmProviderMode source_provider = 15;
```

**Step 2: Add provider status to the feed oneof.** Add a new oneof case to `AlarmFeedMessage` (next free field 4) and a new message:

```protobuf
message AlarmFeedMessage {
  oneof payload {
    ActiveAlarmSnapshot active_alarm = 1;
    bool snapshot_complete = 2;
    OnAlarmTransitionEvent transition = 3;
    // Provider-mode status. Emitted once on stream open and again on every
    // failover/failback so late joiners learn the current mode immediately.
    AlarmProviderStatus provider_status = 4;
  }
}

message AlarmProviderStatus {
  AlarmProviderMode mode = 1;
  bool degraded = 2;                    // true whenever mode == SUBTAG
  string reason = 3;                    // human-readable switch reason
  google.protobuf.Timestamp since = 4;
}
```

**Step 3: Add the worker→gateway mode-changed event to `MxEvent`.** Find the `MxEventFamily` enum and the `MxEvent` body oneof. Add a family member, a body message, and a oneof case. Use the next free family value and the next free `MxEvent` body field number; check the file for both:

```protobuf
// in MxEventFamily enum:
MX_EVENT_FAMILY_ON_ALARM_PROVIDER_MODE_CHANGED = <next>;

// new message near OnAlarmTransitionEvent:
message OnAlarmProviderModeChangedEvent {
  AlarmProviderMode mode = 1;
  string reason = 2;
  int32 hresult = 3;  // COM HRESULT that triggered failover; 0 on failback
  google.protobuf.Timestamp at = 4;
}

// in MxEvent body oneof:
OnAlarmProviderModeChangedEvent on_alarm_provider_mode_changed = <next>;
```

`AlarmProviderMode` is defined in `mxaccess_worker.proto`; confirm `mxaccess_gateway.proto` already has `import "mxaccess_worker.proto";` (it references `SubscribeAlarmsCommand`, so it does) and reference the enum unqualified or via its package as the existing references do.

**Step 4: Regenerate & verify.**
Run: `dotnet build src/ZB.MOM.WW.MxGateway.Contracts/ZB.MOM.WW.MxGateway.Contracts.csproj`
Expected: build succeeds.

**Step 5: Commit.**
```bash
git add src/ZB.MOM.WW.MxGateway.Contracts/Protos/mxaccess_gateway.proto
git commit -m "contracts(gateway): AlarmProviderStatus feed case, degraded provenance, mode-changed event"
```

---

### Task 3: Proto round-trip tests for the new alarm fields

**Classification:** small
**Estimated implement time:** ~3 min
**Parallelizable with:** none (depends on Tasks 1-2)

**Files:**
- Modify: `src/ZB.MOM.WW.MxGateway.Tests/Contracts/ProtobufContractRoundTripTests.cs`

**Step 1: Add tests** mirroring the existing `Event_RoundTripsOnAlarmTransitionWithFullPayload` style:

```csharp
[Fact]
public void Feed_RoundTripsProviderStatus()
{
    var since = Timestamp.FromDateTime(new DateTime(2026, 6, 13, 9, 0, 0, DateTimeKind.Utc));
    var original = new AlarmFeedMessage
    {
        ProviderStatus = new AlarmProviderStatus
        {
            Mode = AlarmProviderMode.Subtag,
            Degraded = true,
            Reason = "wnwrap poll failed 3x (HRESULT 0x80004005)",
            Since = since,
        },
    };

    var parsed = AlarmFeedMessage.Parser.ParseFrom(original.ToByteArray());

    Assert.Equal(original, parsed);
    Assert.Equal(AlarmFeedMessage.PayloadOneofCase.ProviderStatus, parsed.PayloadCase);
    Assert.True(parsed.ProviderStatus.Degraded);
    Assert.Equal(AlarmProviderMode.Subtag, parsed.ProviderStatus.Mode);
}

[Fact]
public void Transition_RoundTripsDegradedProvenance()
{
    var t = new OnAlarmTransitionEvent
    {
        AlarmFullReference = "Galaxy!Area.Tank01.Level.HiHi",
        TransitionKind = AlarmTransitionKind.Raise,
        Degraded = true,
        SourceProvider = AlarmProviderMode.Subtag,
    };

    var parsed = OnAlarmTransitionEvent.Parser.ParseFrom(t.ToByteArray());

    Assert.True(parsed.Degraded);
    Assert.Equal(AlarmProviderMode.Subtag, parsed.SourceProvider);
}
```

**Step 2: Run red→green.**
Run: `dotnet test src/ZB.MOM.WW.MxGateway.Tests/ZB.MOM.WW.MxGateway.Tests.csproj --filter FullyQualifiedName~ProtobufContractRoundTripTests`
Expected: PASS.

**Step 3: Commit.**
```bash
git add src/ZB.MOM.WW.MxGateway.Tests/Contracts/ProtobufContractRoundTripTests.cs
git commit -m "test(contracts): round-trip provider status + degraded provenance"
```

---

## Phase 1: Worker subtag consumer + failover

### Task 4: Subtag value-source abstraction + synthesis state holder

**Classification:** standard
**Estimated implement time:** ~5 min
**Parallelizable with:** none (Task 5 builds on it)

This task adds a testable seam so the synthesis logic can be unit-tested without COM. The COM wiring lands in Task 6.

**Files:**
- Create: `src/ZB.MOM.WW.MxGateway.Worker/MxAccess/ISubtagAlarmSource.cs`
- Create: `src/ZB.MOM.WW.MxGateway.Worker/MxAccess/SubtagAlarmStateMachine.cs`
- Test: `src/ZB.MOM.WW.MxGateway.Worker.Tests/MxAccess/SubtagAlarmStateMachineTests.cs`

**Step 1: Define the source abstraction.** `ISubtagAlarmSource` advises subtag addresses and raises a normalized value-change callback on the STA:

```csharp
namespace ZB.MOM.WW.MxGateway.Worker.MxAccess;

/// <summary>A change in one advised subtag value, normalized off the COM boundary.</summary>
public sealed class SubtagValueChange
{
    // Note: `init` accessors compile on .NET Framework 4.8 only when an
    // IsExternalInit polyfill is present; switch to `set` if the worker has none.

    /// <summary>The full item address that changed (matches an AlarmSubtagTarget subtag).</summary>
    public string ItemAddress { get; init; } = string.Empty;
    /// <summary>The new value (boolean for .active/.acked, numeric for priority).</summary>
    public object? Value { get; init; }
    /// <summary>The change timestamp in UTC.</summary>
    public DateTime TimestampUtc { get; init; }
}

/// <summary>
/// Advises a set of MXAccess subtag addresses and surfaces value changes.
/// The production implementation (Task 6) owns its own LMXProxyServerClass;
/// tests substitute a fake that pushes <see cref="SubtagValueChange"/>s.
/// </summary>
public interface ISubtagAlarmSource : IDisposable
{
    /// <summary>Raised on the STA when an advised subtag's value changes.</summary>
    event EventHandler<SubtagValueChange>? ValueChanged;

    /// <summary>Advises every subtag in the supplied addresses; idempotent per address.</summary>
    void Advise(IReadOnlyCollection<string> itemAddresses);

    /// <summary>Writes a value to an item address (used for the ack-comment write).</summary>
    void Write(string itemAddress, object? value);
}
```

**Step 2: Write the state-machine tests first.** `SubtagAlarmStateMachine` maps `(active, acked)` changes per target to `MxAlarmTransitionEvent`s. Test the three core transitions (raise, ack, return-to-normal) plus the active snapshot:

```csharp
namespace ZB.MOM.WW.MxGateway.Worker.Tests.MxAccess;

public sealed class SubtagAlarmStateMachineTests
{
    private static AlarmSubtagTarget Target() => new()
    {
        AlarmFullReference = "Galaxy!Area.Tank01.Level.HiHi",
        SourceObjectReference = "Tank01",
        ActiveSubtag = "Tank01.Level.HiHi.active",
        AckedSubtag = "Tank01.Level.HiHi.acked",
        AckCommentSubtag = "Tank01.Level.HiHi.ackmsg",
    };

    [Fact]
    public void ActiveFalseToTrue_EmitsRaise_FlaggedDegraded()
    {
        var sm = new SubtagAlarmStateMachine(new[] { Target() });
        var ts = new DateTime(2026, 6, 13, 9, 0, 0, DateTimeKind.Utc);

        var events = sm.Apply("Tank01.Level.HiHi.active", true, ts);

        var e = Assert.Single(events);
        Assert.Equal(MxAlarmStateKind.UnackAlm, e.Record.State);
        Assert.Equal(MxAlarmStateKind.Unspecified, e.PreviousState);
        Assert.Equal("Tank01.Level.HiHi", e.Record.TagName); // reference minus provider/area
        // The degraded flag itself only becomes assertable once Task 8 adds the field.
    }

    [Fact]
    public void AckedTrueWhileActive_EmitsAckTransition()
    {
        var sm = new SubtagAlarmStateMachine(new[] { Target() });
        var ts = new DateTime(2026, 6, 13, 9, 0, 0, DateTimeKind.Utc);
        sm.Apply("Tank01.Level.HiHi.active", true, ts);

        var events = sm.Apply("Tank01.Level.HiHi.acked", true, ts.AddSeconds(5));

        var e = Assert.Single(events);
        Assert.Equal(MxAlarmStateKind.AckAlm, e.Record.State);
        Assert.Equal(MxAlarmStateKind.UnackAlm, e.PreviousState);
    }

    [Fact]
    public void ActiveTrueToFalse_WhileUnacked_EmitsUnackRtn()
    {
        var sm = new SubtagAlarmStateMachine(new[] { Target() });
        var ts = new DateTime(2026, 6, 13, 9, 0, 0, DateTimeKind.Utc);
        sm.Apply("Tank01.Level.HiHi.active", true, ts);

        var events = sm.Apply("Tank01.Level.HiHi.active", false, ts.AddSeconds(10));

        var e = Assert.Single(events);
        Assert.Equal(MxAlarmStateKind.UnackRtn, e.Record.State);
    }

    [Fact]
    public void Snapshot_ReflectsActiveAndAckedState()
    {
        var sm = new SubtagAlarmStateMachine(new[] { Target() });
        var ts = new DateTime(2026, 6, 13, 9, 0, 0, DateTimeKind.Utc);
        sm.Apply("Tank01.Level.HiHi.active", true, ts);
        sm.Apply("Tank01.Level.HiHi.acked", true, ts);

        var snap = Assert.Single(sm.SnapshotActive());
        Assert.Equal(MxAlarmStateKind.AckAlm, snap.State);
    }
}
```

Run: `dotnet test ...Worker.Tests... -p:Platform=x86 --filter FullyQualifiedName~SubtagAlarmStateMachineTests` → FAIL (type missing).

**Step 3: Implement `SubtagAlarmStateMachine`.** Build an address→target index (active/acked/priority/comment addresses), hold per-reference `(bool active, bool acked, DateTime firstRaiseUtc, int priority)`, and emit on change:
- active `false→true` ⇒ `UnackAlm`, set `firstRaiseUtc`, `PreviousState` from prior state.
- acked `false→true` while active ⇒ `AckAlm`.
- active `true→false` ⇒ `AckRtn` if currently acked else `UnackRtn`; then reset acked.
- priority change ⇒ update stored priority, no transition.
- `TagName` = `alarm_full_reference` with any `Provider!Area.` prefix stripped (match `WnWrapAlarmConsumer`'s reference shape so `GatewayAlarmMonitor` keys align). Set `ProviderName`, `Group`, `Priority`, `AlarmComment` from the target/last values. Mark a `Degraded`/source flag (carried via the new field that Task 8 adds; Task 5 wires the consumer side).
- `SnapshotActive()` returns `MxAlarmSnapshotRecord` for references whose active is true.
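
The transition rules above can be sketched for a single alarm reference as follows. This is an illustration only, not the planned `SubtagAlarmStateMachine`: `AlarmStateSketch` is a local stand-in for `MxAlarmStateKind`, `AlarmLatchSketch` and `TagNameSketch` are hypothetical names, acked changes that arrive while the alarm is inactive are ignored by assumption, and the prefix strip assumes the area is the single segment that follows `!`.

```csharp
using System;

// Local stand-in for the worker's MxAlarmStateKind (hypothetical copy).
public enum AlarmStateSketch { Unspecified, UnackAlm, AckAlm, UnackRtn, AckRtn }

// Tracks one alarm reference. Each Apply* call returns the new state when the
// change produces a transition, or null when nothing should be emitted.
public sealed class AlarmLatchSketch
{
    private bool _active;
    private bool _acked;

    public AlarmStateSketch State { get; private set; } = AlarmStateSketch.Unspecified;

    public AlarmStateSketch? ApplyActive(bool active)
    {
        if (active == _active) return null;   // no edge, nothing to emit
        _active = active;
        if (active)
        {
            State = AlarmStateSketch.UnackAlm;
        }
        else
        {
            State = _acked ? AlarmStateSketch.AckRtn : AlarmStateSketch.UnackRtn;
            _acked = false;                    // reset acked after return-to-normal
        }
        return State;
    }

    public AlarmStateSketch? ApplyAcked(bool acked)
    {
        if (!_active) return null;             // assumption: ignored while inactive
        if (acked == _acked) return null;
        _acked = acked;
        if (!acked) return null;               // only false->true emits
        State = AlarmStateSketch.AckAlm;
        return State;
    }
}

public static class TagNameSketch
{
    // "Galaxy!Area.Tank01.Level.HiHi" -> "Tank01.Level.HiHi"; a reference
    // without a "Provider!" prefix is returned unchanged.
    public static string StripProviderArea(string fullReference)
    {
        if (fullReference == null) throw new ArgumentNullException(nameof(fullReference));
        int bang = fullReference.IndexOf('!');
        if (bang < 0) return fullReference;
        int dot = fullReference.IndexOf('.', bang + 1);
        return dot < 0 ? fullReference : fullReference.Substring(dot + 1);
    }
}
```

The real state machine additionally tracks `firstRaiseUtc`, priority, and `PreviousState`, and keys many of these latches by subtag address.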

**Step 4: Run green.** Expected: PASS.

**Step 5: Commit.**
```bash
git add src/ZB.MOM.WW.MxGateway.Worker/MxAccess/ISubtagAlarmSource.cs \
        src/ZB.MOM.WW.MxGateway.Worker/MxAccess/SubtagAlarmStateMachine.cs \
        src/ZB.MOM.WW.MxGateway.Worker.Tests/MxAccess/SubtagAlarmStateMachineTests.cs
git commit -m "worker(alarms): subtag value-source seam + synthesis state machine"
```

---

### Task 5: `SubtagAlarmConsumer` over the source seam (no COM yet)

**Classification:** standard
**Estimated implement time:** ~5 min
**Parallelizable with:** none (depends on Task 4)

**Files:**
- Create: `src/ZB.MOM.WW.MxGateway.Worker/MxAccess/SubtagAlarmConsumer.cs`
- Test: `src/ZB.MOM.WW.MxGateway.Worker.Tests/MxAccess/SubtagAlarmConsumerTests.cs`

**Step 1: Test with a fake `ISubtagAlarmSource`.** Drive value changes through the source, assert `AlarmTransitionEmitted` fires with synthesized records and that ack writes the comment to the ack-comment subtag:

```csharp
public sealed class SubtagAlarmConsumerTests
{
    private sealed class FakeSource : ISubtagAlarmSource
    {
        public event EventHandler<SubtagValueChange>? ValueChanged;
        public List<string> Advised { get; } = new();
        public (string Address, object? Value)? LastWrite { get; private set; }
        public void Advise(IReadOnlyCollection<string> a) => Advised.AddRange(a);
        public void Write(string a, object? v) => LastWrite = (a, v);
        public void Raise(string addr, object? val, DateTime ts) =>
            ValueChanged?.Invoke(this, new SubtagValueChange { ItemAddress = addr, Value = val, TimestampUtc = ts });
        public void Dispose() { }
    }

    private static AlarmSubtagTarget Target() => new()
    {
        AlarmFullReference = "Galaxy!Area.Tank01.Level.HiHi",
        ActiveSubtag = "Tank01.Level.HiHi.active",
        AckedSubtag = "Tank01.Level.HiHi.acked",
        AckCommentSubtag = "Tank01.Level.HiHi.ackmsg",
    };

    [Fact]
    public void Subscribe_AdvisesAllSubtags()
    {
        var src = new FakeSource();
        using var c = new SubtagAlarmConsumer(src, new[] { Target() });
        c.Subscribe("ignored-in-subtag-mode");
        Assert.Contains("Tank01.Level.HiHi.active", src.Advised);
        Assert.Contains("Tank01.Level.HiHi.acked", src.Advised);
    }

    [Fact]
    public void ValueChange_RaisesSynthesizedTransition()
    {
        var src = new FakeSource();
        using var c = new SubtagAlarmConsumer(src, new[] { Target() });
        c.Subscribe("x");
        MxAlarmTransitionEvent? seen = null;
        c.AlarmTransitionEmitted += (_, e) => seen = e;

        src.Raise("Tank01.Level.HiHi.active", true, new DateTime(2026, 6, 13, 9, 0, 0, DateTimeKind.Utc));

        Assert.NotNull(seen);
        Assert.Equal(MxAlarmStateKind.UnackAlm, seen!.Record.State);
    }

    [Fact]
    public void AcknowledgeByName_WritesCommentToAckCommentSubtag()
    {
        var src = new FakeSource();
        using var c = new SubtagAlarmConsumer(src, new[] { Target() });
        c.Subscribe("x");

        int rc = c.AcknowledgeByName("Tank01.Level.HiHi", "Galaxy", "Area",
            "ack from HMI", "op1", "node", "dom", "Op One");

        Assert.Equal(0, rc);
        Assert.Equal(("Tank01.Level.HiHi.ackmsg", (object?)"ack from HMI"), src.LastWrite);
    }
}
```

**Step 2: Implement `SubtagAlarmConsumer : IMxAccessAlarmConsumer`.**
- Constructor `(ISubtagAlarmSource source, IReadOnlyList<AlarmSubtagTarget> watchList)`; build a `SubtagAlarmStateMachine`; index `alarm_full_reference`→target for ack routing.
- `Subscribe(_)`: subscribe to `source.ValueChanged` first, so no initial value is missed, then call `source.Advise(<all subtag addresses>)`; feed each change into the state machine and re-raise each produced `MxAlarmTransitionEvent` via `AlarmTransitionEmitted` (mark degraded).
- `AcknowledgeByName(alarmName, …, comment, …)`: resolve the target by reference; if no `AckCommentSubtag`, return a non-zero failure code; else `source.Write(target.AckCommentSubtag, comment)` and return 0.
- `AcknowledgeByGuid(guid, …)`: map the synthetic GUID (deterministic hash of the reference; see the Task 8 helper, or use a local copy) back to a reference, then delegate to the name path; unknown GUID ⇒ non-zero.
- `SnapshotActiveAlarms()`: from the state machine.
- `PollOnce()`: no-op.
- `Dispose()`: unsubscribe + dispose source.
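
The name-based ack path above reduces to a lookup plus one write. A hedged sketch follows, assuming a hypothetical `AckRouterSketch` keyed by the stripped tag name and a delegate standing in for `ISubtagAlarmSource.Write`. The specific non-zero codes are invented for illustration; the plan only requires that failures be non-zero.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: routes an ack comment to the alarm's ack-comment subtag.
public sealed class AckRouterSketch
{
    public const int Ok = 0;
    public const int UnknownAlarm = 1;        // illustrative value only
    public const int NoAckCommentSubtag = 2;  // illustrative value only

    private readonly Dictionary<string, string> _ackCommentByTagName;
    private readonly Action<string, object> _write;

    // ackCommentByTagName: tag name -> ack-comment item address ("" when absent).
    // write: stand-in for ISubtagAlarmSource.Write(address, value).
    public AckRouterSketch(IDictionary<string, string> ackCommentByTagName, Action<string, object> write)
    {
        if (ackCommentByTagName == null) throw new ArgumentNullException(nameof(ackCommentByTagName));
        if (write == null) throw new ArgumentNullException(nameof(write));
        _ackCommentByTagName = new Dictionary<string, string>(ackCommentByTagName, StringComparer.Ordinal);
        _write = write;
    }

    public int Acknowledge(string tagName, string comment)
    {
        string address;
        if (tagName == null || !_ackCommentByTagName.TryGetValue(tagName, out address))
            return UnknownAlarm;
        if (string.IsNullOrEmpty(address))
            return NoAckCommentSubtag;         // target has no writable ack attribute
        _write(address, comment);
        return Ok;
    }
}
```

The GUID path can reuse the same router once the synthetic GUID has been mapped back to a tag name.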

**Step 3: Run green.** `dotnet test ...Worker.Tests... -p:Platform=x86 --filter FullyQualifiedName~SubtagAlarmConsumerTests`.

**Step 4: Commit.**
```bash
git add src/ZB.MOM.WW.MxGateway.Worker/MxAccess/SubtagAlarmConsumer.cs \
        src/ZB.MOM.WW.MxGateway.Worker.Tests/MxAccess/SubtagAlarmConsumerTests.cs
git commit -m "worker(alarms): SubtagAlarmConsumer synthesizing transitions over the source seam"
```

---

### Task 6: COM-backed `LmxSubtagAlarmSource` (own LMXProxyServerClass)

**Classification:** high-risk
**Estimated implement time:** ~5 min
**Parallelizable with:** none

The only piece that touches live COM. Like `WnWrapAlarmConsumer`, it owns its own MXAccess server object so the subtag source is self-contained and isolated from the session's item pipeline. Logic stays thin (advise/write/marshal); real verification is the live smoke test in Task 17.

**Files:**
- Create: `src/ZB.MOM.WW.MxGateway.Worker/MxAccess/LmxSubtagAlarmSource.cs`
- Test: `src/ZB.MOM.WW.MxGateway.Worker.Tests/MxAccess/LmxSubtagAlarmSourceTests.cs` (constructor/guard tests only; COM path is live-gated)

**Step 1: Implement `LmxSubtagAlarmSource : ISubtagAlarmSource`.**
- Own an `LMXProxyServerClass` (reuse the worker's `IMxAccessServer`/`MxAccessComServer` wrapper + `IMxAccessComObjectFactory` so it is fakeable; constructor takes the factory).
- `Advise(addresses)`: `RegisterServer` (topic) once; per address `AddItem`→`itemHandle`, `Advise`, and record `itemHandle→address`. Subscribe to the proxy's `OnDataChange`; in the handler, look up the address by `phItemHandle`, normalize `pvItemValue` (VARIANT→bool/double) and `pftItemTimeStamp`→UTC, and raise `ValueChanged`. All calls run on the STA (the worker STA pumps messages, so `OnDataChange` delivers).
- `Write(address, value)`: resolve/create the item handle, `server.Write(serverHandle, itemHandle, value, userId: 0)`.
- `Dispose()`: `UnAdvise`/`RemoveItem`/`Unregister`/release COM.
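
The value-normalization step in the `OnDataChange` handler can be sketched without COM, because the interop marshaler already hands managed code boxed CLR values (for example `VT_BOOL` arrives as `bool` and `VT_I4` as `int`). The helper name `SubtagValueNormalizerSketch` is hypothetical, and collapsing every numeric type to `double` is an assumption drawn from the "bool/double" wording above. Timestamp conversion is left out because the runtime shape of `pftItemTimeStamp` is not stated in this plan.

```csharp
using System;
using System.Globalization;

// Hypothetical helper: coerces a marshaled VARIANT payload into the small set
// of shapes the state machine expects (bool, double, or the value unchanged).
public static class SubtagValueNormalizerSketch
{
    public static object Normalize(object raw)
    {
        // VT_EMPTY marshals as null and VT_NULL as DBNull; treat both as "no value".
        if (raw == null || raw is DBNull) return null;
        if (raw is bool) return raw;

        var convertible = raw as IConvertible;
        if (convertible != null && IsNumeric(convertible.GetTypeCode()))
            return convertible.ToDouble(CultureInfo.InvariantCulture);

        return raw; // strings and anything else pass through untouched
    }

    private static bool IsNumeric(TypeCode code)
    {
        switch (code)
        {
            case TypeCode.SByte:
            case TypeCode.Byte:
            case TypeCode.Int16:
            case TypeCode.UInt16:
            case TypeCode.Int32:
            case TypeCode.UInt32:
            case TypeCode.Int64:
            case TypeCode.UInt64:
            case TypeCode.Single:
            case TypeCode.Double:
            case TypeCode.Decimal:
                return true;
            default:
                return false;
        }
    }
}
```

Keeping this logic in a static, COM-free helper means it can be covered by ordinary unit tests even though the advise path itself is live-gated.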

**Step 2: Tests.** Cover only the non-COM guards (null factory throws; `Write` before `Advise` resolves a handle or throws a clear error). Mark the COM round-trip `[LiveMxAccessFact]` and `Skip` per the `AlarmsLiveSmokeTests` precedent.

**Step 3: Build x86 + run unit tests.**
`dotnet build src/ZB.MOM.WW.MxGateway.Worker/ZB.MOM.WW.MxGateway.Worker.csproj -p:Platform=x86`
`dotnet test ...Worker.Tests... -p:Platform=x86 --filter FullyQualifiedName~LmxSubtagAlarmSourceTests`

**Step 4: Commit.**
```bash
git add src/ZB.MOM.WW.MxGateway.Worker/MxAccess/LmxSubtagAlarmSource.cs \
        src/ZB.MOM.WW.MxGateway.Worker.Tests/MxAccess/LmxSubtagAlarmSourceTests.cs
git commit -m "worker(alarms): COM-backed LmxSubtagAlarmSource advising alarm subtags"
```

---

### Task 7: `FailoverAlarmConsumer` state machine

**Classification:** high-risk
**Estimated implement time:** ~5 min
**Parallelizable with:** none (depends on Task 5)

**Files:**
- Create: `src/ZB.MOM.WW.MxGateway.Worker/MxAccess/FailoverAlarmConsumer.cs`
- Create: `src/ZB.MOM.WW.MxGateway.Worker/MxAccess/AlarmProviderModeChange.cs` (small EventArgs)
- Test: `src/ZB.MOM.WW.MxGateway.Worker.Tests/MxAccess/FailoverAlarmConsumerTests.cs`

**Step 1: Test the switch/failback with a fake primary that throws.**

```csharp
public sealed class FailoverAlarmConsumerTests
{
    private sealed class FlakyPrimary : IMxAccessAlarmConsumer
    {
        // Empty accessors: the fake never raises this event, and an unused
        // field-like event trips CS0067 under TreatWarningsAsErrors.
        public event EventHandler<MxAlarmTransitionEvent>? AlarmTransitionEmitted { add { } remove { } }
        public int PollsUntilHeal = int.MaxValue; // becomes healthy after N polls while degraded
        public bool ThrowOnPoll = true;
        private int _polls;
        public void Subscribe(string s) { if (ThrowOnPoll) throw new COMException("boom", unchecked((int)0x80004005)); }
        public void PollOnce()
        {
            _polls++;
            if (ThrowOnPoll && _polls < PollsUntilHeal) throw new COMException("boom", unchecked((int)0x80004005));
        }
        public int AcknowledgeByGuid(Guid g, string c, string a, string b, string d, string e) => 0;
        public int AcknowledgeByName(string n, string p, string gr, string c, string a, string b, string d, string e) => 0;
        public IReadOnlyList<MxAlarmSnapshotRecord> SnapshotActiveAlarms() => Array.Empty<MxAlarmSnapshotRecord>();
        public void Dispose() { }
    }

    // Records Subscribe; every other member is a no-op.
    private sealed class StubStandby : IMxAccessAlarmConsumer
    {
        public event EventHandler<MxAlarmTransitionEvent>? AlarmTransitionEmitted { add { } remove { } }
        public bool Subscribed { get; private set; }
        public void Subscribe(string s) => Subscribed = true;
        public void PollOnce() { }
        public int AcknowledgeByGuid(Guid g, string c, string a, string b, string d, string e) => 0;
        public int AcknowledgeByName(string n, string p, string gr, string c, string a, string b, string d, string e) => 0;
        public IReadOnlyList<MxAlarmSnapshotRecord> SnapshotActiveAlarms() => Array.Empty<MxAlarmSnapshotRecord>();
        public void Dispose() { }
    }

    [Fact]
    public void Primary_FailsThresholdTimes_SwitchesToSubtagAndEmitsModeChange()
    {
        var primary = new FlakyPrimary();
        var standby = new StubStandby();
        using var c = new FailoverAlarmConsumer(primary, standby,
            new FailoverSettings(threshold: 3, probeIntervalSeconds: 30, stableProbes: 3));
        AlarmProviderModeChange? change = null;
        c.ProviderModeChanged += (_, e) => change = e;

        c.Subscribe("\\\\host\\Galaxy!Area"); // primary.Subscribe throws -> counts as failure 1
        c.PollOnce(); // failure 2
        c.PollOnce(); // failure 3 -> switch

        Assert.NotNull(change);
        Assert.Equal(AlarmProviderMode.Subtag, change!.Mode);
    }

    [Fact]
    public void WhileDegraded_PrimaryHeals_FailsBackAfterStableProbes()
    {
        var primary = new FlakyPrimary { PollsUntilHeal = 0 }; // will heal once we stop throwing
        var standby = new StubStandby();
        using var c = new FailoverAlarmConsumer(primary, standby,
            new FailoverSettings(threshold: 1, probeIntervalSeconds: 0, stableProbes: 2));
        var modes = new List<AlarmProviderMode>();
        c.ProviderModeChanged += (_, e) => modes.Add(e.Mode);

        c.Subscribe("x"); // failure -> switch to subtag
        primary.ThrowOnPoll = false;
        c.ProbeOnce(); // clean probe 1
        c.ProbeOnce(); // clean probe 2 -> failback

        Assert.Equal(AlarmProviderMode.Subtag, modes[0]);
        // Plain index instead of modes[^1]: System.Index is not available on
        // .NET Framework 4.8 without a polyfill.
        Assert.Equal(AlarmProviderMode.Alarmmgr, modes[modes.Count - 1]);
    }
}
```

**Step 2: Implement.**
- `record FailoverSettings(int threshold, int probeIntervalSeconds, int stableProbes)`; `AlarmProviderModeChange : EventArgs { AlarmProviderMode Mode; string Reason; int HResult; DateTime AtUtc; }`. Positional records generate `init` setters, which on .NET Framework 4.8 need an `IsExternalInit` polyfill; use a sealed class with the same constructor if the worker has none.
- Constructor `(IMxAccessAlarmConsumer primary, IMxAccessAlarmConsumer standby, FailoverSettings settings)`; forced-mode variants handled in Task 9 wiring (forced ⇒ skip the other consumer).
- Forward `AlarmTransitionEmitted` from the **active** child only (swap the subscription on switch).
- Wrap `Subscribe`/`PollOnce` on the primary: on `COMException` (or a failure HRESULT) while `PrimaryActive`, increment a counter; at `threshold`, ensure standby `Subscribe`d, set active=standby, snapshot standby for hand-off, raise `ProviderModeChanged(Subtag, reason, hresult, now)`. Reset counter on any clean primary poll.
- `ProbeOnce()` (driven by the poll loop while degraded, gated by `probeIntervalSeconds`): try primary `Subscribe`+`PollOnce`; count consecutive clean probes; at `stableProbes`, set active=primary, return the subtag consumer to standby, raise `ProviderModeChanged(Alarmmgr, "recovered", 0, now)`.
- `Acknowledge*` / `SnapshotActiveAlarms` delegate to the **active** child.
- `PollOnce()` drives the active child's poll and, while degraded, also drives the failback probe cadence.
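
The counting rules above (consecutive failures up to `threshold`, consecutive clean probes up to `stableProbes`) can be isolated from the consumer plumbing. The sketch below is an assumption-laden illustration rather than the planned `FailoverAlarmConsumer`: `FailoverCounterSketch` and `ProviderSketch` are hypothetical names, and the probe interval gating, event raising, and child subscription swaps are all omitted.

```csharp
using System;

// Local stand-in for AlarmProviderMode (hypothetical copy for the sketch).
public enum ProviderSketch { Alarmmgr, Subtag }

// Counts consecutive primary failures and consecutive clean probes, and
// reports when the active provider should change.
public sealed class FailoverCounterSketch
{
    private readonly int _threshold;
    private readonly int _stableProbes;
    private int _failures;
    private int _cleanProbes;

    public FailoverCounterSketch(int threshold, int stableProbes)
    {
        if (threshold < 1) throw new ArgumentOutOfRangeException(nameof(threshold));
        if (stableProbes < 1) throw new ArgumentOutOfRangeException(nameof(stableProbes));
        _threshold = threshold;
        _stableProbes = stableProbes;
    }

    public ProviderSketch Active { get; private set; } = ProviderSketch.Alarmmgr;

    // Outcome of one primary Subscribe/PollOnce while the primary is active.
    // Returns true when this call switched to the subtag standby.
    public bool RecordPrimaryResult(bool succeeded)
    {
        if (Active != ProviderSketch.Alarmmgr) return false;
        if (succeeded) { _failures = 0; return false; }   // any clean poll resets
        if (++_failures < _threshold) return false;
        Active = ProviderSketch.Subtag;
        _cleanProbes = 0;
        return true;
    }

    // Outcome of one failback probe while degraded.
    // Returns true when this call switched back to the primary.
    public bool RecordProbeResult(bool succeeded)
    {
        if (Active != ProviderSketch.Subtag) return false;
        if (!succeeded) { _cleanProbes = 0; return false; } // probes must be consecutive
        if (++_cleanProbes < _stableProbes) return false;
        Active = ProviderSketch.Alarmmgr;
        _failures = 0;
        return true;
    }
}
```

A `true` return is the point at which the real consumer would swap the forwarded child and raise `ProviderModeChanged`.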

**Step 3: Run green** (x86 filter `FailoverAlarmConsumerTests`).

**Step 4: Commit.**
```bash
git add src/ZB.MOM.WW.MxGateway.Worker/MxAccess/FailoverAlarmConsumer.cs \
        src/ZB.MOM.WW.MxGateway.Worker/MxAccess/AlarmProviderModeChange.cs \
        src/ZB.MOM.WW.MxGateway.Worker.Tests/MxAccess/FailoverAlarmConsumerTests.cs
git commit -m "worker(alarms): FailoverAlarmConsumer auto-failover/failback state machine"
```

---
|
||||
|
||||
### Task 8: Synthetic-GUID helper + degraded flag on the event sink path

**Classification:** standard
**Estimated implement time:** ~4 min
**Parallelizable with:** Task 9

Carry `degraded` + `source_provider` from the worker synthesis into the emitted `OnAlarmTransitionEvent`.

**Files:**
- Modify: `src/ZB.MOM.WW.MxGateway.Worker/MxAccess/MxAlarmSnapshot.cs` (add `bool Degraded`)
- Modify: `src/ZB.MOM.WW.MxGateway.Worker/MxAccess/MxAccessAlarmEventSink.cs` (`EnqueueTransition` carries degraded)
- Modify: `src/ZB.MOM.WW.MxGateway.Worker/MxAccess/MxAccessEventMapper.cs` (`CreateOnAlarmTransition` sets `Degraded`/`SourceProvider`)
- Create: `src/ZB.MOM.WW.MxGateway.Worker/MxAccess/SyntheticAlarmGuid.cs`
- Test: add cases to `src/ZB.MOM.WW.MxGateway.Worker.Tests/MxAccess/AlarmDispatcherTests.cs` and a new `SyntheticAlarmGuidTests.cs`

**Step 1: `SyntheticAlarmGuid.ForReference(string reference)`** — deterministic GUID from a stable hash (e.g. MD5 of the UTF-8 reference → `new Guid(bytes)`), so subtag-mode acks resolve by GUID. Test determinism + difference:

```csharp
[Fact] public void SameReference_SameGuid() =>
    Assert.Equal(SyntheticAlarmGuid.ForReference("A.B.C"), SyntheticAlarmGuid.ForReference("A.B.C"));
[Fact] public void DifferentReference_DifferentGuid() =>
    Assert.NotEqual(SyntheticAlarmGuid.ForReference("A.B.C"), SyntheticAlarmGuid.ForReference("A.B.D"));
```
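One possible shape for the helper, following the MD5 suggestion in Step 1. This is a sketch under assumptions: the plan only fixes the name `ForReference` and the "stable hash" requirement, so the null check and the decision not to normalize casing are illustrative choices. It uses `MD5.Create()` rather than newer static helpers so that it also compiles for the .NET Framework 4.8 worker.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class SyntheticAlarmGuid
{
    // Maps an alarm reference to the same GUID on every call and in every process.
    public static Guid ForReference(string reference)
    {
        if (reference == null) throw new ArgumentNullException(nameof(reference));

        using (var md5 = MD5.Create())
        {
            // MD5 produces exactly 16 bytes, which is the length Guid(byte[]) requires.
            byte[] hash = md5.ComputeHash(Encoding.UTF8.GetBytes(reference));
            return new Guid(hash);
        }
    }
}
```

MD5 is used here purely as a stable 128-bit digest, not for security. Two points are worth confirming before adopting this: whether references that differ only by case should map to the same GUID (this sketch treats them as different), and whether the target hosts enforce a FIPS policy, which can block MD5 on .NET Framework.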

**Step 2: Thread `degraded`** through `MxAlarmSnapshotRecord.Degraded`, `EnqueueTransition(... bool degraded)`, and `CreateOnAlarmTransition(... bool degraded, AlarmProviderMode sourceProvider)`. Default `degraded=false`, `sourceProvider=Alarmmgr` so the wnwrap path is unchanged (regression: existing `AlarmDispatcherTests` still pass with `Degraded=false`).

**Step 3: Tests** — extend `AlarmDispatcherTests` with a subtag-style transition asserting `body.Degraded == true` and `SourceProvider == Subtag`.

**Step 4: Build x86 + run** worker tests for `AlarmDispatcherTests`, `SyntheticAlarmGuidTests`.

**Step 5: Commit.**

```bash
git add src/ZB.MOM.WW.MxGateway.Worker/MxAccess/MxAlarmSnapshot.cs \
  src/ZB.MOM.WW.MxGateway.Worker/MxAccess/MxAccessAlarmEventSink.cs \
  src/ZB.MOM.WW.MxGateway.Worker/MxAccess/MxAccessEventMapper.cs \
  src/ZB.MOM.WW.MxGateway.Worker/MxAccess/SyntheticAlarmGuid.cs \
  src/ZB.MOM.WW.MxGateway.Worker.Tests/MxAccess/
git commit -m "worker(alarms): synthetic GUID + degraded provenance on emitted transitions"
```

---

### Task 9: Wire watch-list + failover config through `AlarmCommandHandler`; emit mode-changed event

**Classification:** high-risk
**Estimated implement time:** ~5 min
**Parallelizable with:** none (depends on Tasks 5, 7, 8)

**Files:**
- Modify: `src/ZB.MOM.WW.MxGateway.Worker/MxAccess/AlarmCommandHandler.cs`
- Modify: `src/ZB.MOM.WW.MxGateway.Worker/MxAccess/IAlarmCommandHandler.cs`
- Modify: `src/ZB.MOM.WW.MxGateway.Worker/MxAccess/MxAccessCommandExecutor.cs` (`ExecuteSubscribeAlarms`, ~lines 588-616)
- Modify: `src/ZB.MOM.WW.MxGateway.Worker/MxAccess/MxAccessStaSession.cs` (consumer factory wiring; mode-change → event queue)
- Test: `src/ZB.MOM.WW.MxGateway.Worker.Tests/MxAccess/AlarmCommandHandlerTests.cs` (extend or create)

**Step 1: Carry the subscribe payload.** Change the alarm subscribe entry point from `Subscribe(string subscription)` to `Subscribe(SubscribeAlarmsCommand command)` (the command now has `ForcedMode`, `WatchList`, `Failover`). In `AlarmCommandHandler.Subscribe`:
- Build the active provider per `ForcedMode`:
  - `ALARMMGR` ⇒ `WnWrapAlarmConsumer` only.
  - `SUBTAG` ⇒ `SubtagAlarmConsumer(new LmxSubtagAlarmSource(factory), watchList)` only.
  - `UNSPECIFIED` ⇒ `FailoverAlarmConsumer(primary: wnwrap, standby: subtag, settings-from-Failover)`.
- Use the existing `consumerFactory` seam but widen it to `Func<SubscribeAlarmsCommand, IMxAccessAlarmConsumer>` so tests inject fakes and production builds the failover composite. Subscribe to `FailoverAlarmConsumer.ProviderModeChanged` and enqueue an `OnAlarmProviderModeChangedEvent` MxEvent via the event queue (new mapper method `CreateOnAlarmProviderModeChanged`).
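The `ForcedMode` selection above can be illustrated with stand-in types. Everything in this sketch is hypothetical (`ForcedModeSketch`, `ISketchConsumer`, `NamedConsumer`, `AlarmConsumerSelector`); the real code would return the plan's `IMxAccessAlarmConsumer` implementations from the widened `consumerFactory`.

```csharp
using System;

// Hypothetical mirror of the command's ForcedMode values.
public enum ForcedModeSketch { Unspecified, Alarmmgr, Subtag }

// Hypothetical stand-in for IMxAccessAlarmConsumer.
public interface ISketchConsumer { string Name { get; } }

public sealed class NamedConsumer : ISketchConsumer
{
    public NamedConsumer(string name) { Name = name; }
    public string Name { get; }
}

public static class AlarmConsumerSelector
{
    // Providers are passed as factories so a forced mode never constructs the provider it does not use.
    public static ISketchConsumer Build(
        ForcedModeSketch forcedMode,
        Func<ISketchConsumer> wnwrap,
        Func<ISketchConsumer> subtag,
        Func<ISketchConsumer, ISketchConsumer, ISketchConsumer> failover)
    {
        switch (forcedMode)
        {
            case ForcedModeSketch.Alarmmgr:
                return wnwrap();
            case ForcedModeSketch.Subtag:
                return subtag();
            case ForcedModeSketch.Unspecified:
                // Auto mode: wnwrap is the primary, subtag is the standby.
                return failover(wnwrap(), subtag());
            default:
                throw new ArgumentOutOfRangeException(nameof(forcedMode));
        }
    }
}
```

Passing factories rather than instances is a design choice of this sketch: in a forced mode it avoids creating the other provider's COM objects at all, which seems in keeping with the "only" wording in the bullets.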

**Step 2: Executor + STA wiring.** `ExecuteSubscribeAlarms` passes the full `SubscribeAlarmsCommand` (not just the expression). In `MxAccessStaSession`, the `alarmCommandHandlerFactory` must give the handler access to the `IMxAccessComObjectFactory` so the subtag source can create its own proxy server on the STA; keep the `EnsureOnAlarmConsumerThread` affinity guard on every path.

**Step 3: Test** — use a fake consumer factory. Assert that a forced `SUBTAG` command builds the subtag consumer and advises, and that when an auto command builds a fake failover composite that then raises `ProviderModeChanged`, an `OnAlarmProviderModeChangedEvent` is enqueued on the event queue.

**Step 4: Build x86 + worker tests.**

**Step 5: Commit.**

```bash
git add src/ZB.MOM.WW.MxGateway.Worker/MxAccess/ src/ZB.MOM.WW.MxGateway.Worker.Tests/MxAccess/
git commit -m "worker(alarms): route watch-list/failover config; emit provider-mode-changed event"
```

---

## Phase 2 — Gateway: discovery, options, monitor, metrics, dashboard

### Task 10: `AlarmsOptions.Fallback` + validation

**Classification:** standard
**Estimated implement time:** ~4 min
**Parallelizable with:** Task 11, Task 13

**Files:**
- Modify: `src/ZB.MOM.WW.MxGateway.Server/Configuration/AlarmsOptions.cs`
- Create: `src/ZB.MOM.WW.MxGateway.Server/Configuration/AlarmFallbackOptions.cs`
- Modify: `src/ZB.MOM.WW.MxGateway.Server/Configuration/GatewayOptionsValidator.cs` (`ValidateAlarms`, ~lines 234-258)
- Test: `src/ZB.MOM.WW.MxGateway.Tests/Configuration/GatewayOptionsValidatorTests.cs` (extend)

**Step 1:** Add `AlarmFallbackOptions Fallback { get; init; } = new();` to `AlarmsOptions`. `AlarmFallbackOptions` carries:
- `string Mode = "Auto"` (`Auto|ForceAlarmManager|ForceSubtag`)
- `int ConsecutiveFailureThreshold = 3`
- `int FailbackProbeIntervalSeconds = 30`
- `int FailbackStableProbes = 3`
- a `Discovery` sub-object: `bool UseGalaxyRepository = true`, `string Area = ""`, `string[] IncludeAttributes = []`, `string[] ExcludeAttributes = []`
- a `Subtags` sub-object: `Active="active"`, `Acked="acked"`, `AckComment=""`, `Priority="priority"`

**Step 2:** In `ValidateAlarms`, when `Enabled` and `Mode == "ForceSubtag"` and `Discovery.UseGalaxyRepository == false` and `IncludeAttributes` is empty, add a validation error ("ForceSubtag requires Galaxy Repository discovery or an explicit IncludeAttributes list"). Clamp the three numeric values to a minimum of 1. Validate that `Mode` is one of the three literals.
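The Step 2 rules might look roughly like the following. This is a sketch with a flattened, hypothetical `FallbackSettingsSketch` type instead of the real `AlarmsOptions`/`AlarmFallbackOptions`, and a hypothetical `FallbackValidationSketch` helper instead of `GatewayOptionsValidator`. Applying the `Enabled` gate only to the `ForceSubtag` rule is an assumption about how the plan should be read.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical flattened view of the options described in Step 1.
public sealed class FallbackSettingsSketch
{
    public bool Enabled { get; set; } = true;
    public string Mode { get; set; } = "Auto";
    public bool UseGalaxyRepository { get; set; } = true;
    public string[] IncludeAttributes { get; set; } = Array.Empty<string>();
}

public static class FallbackValidationSketch
{
    private static readonly string[] Modes = { "Auto", "ForceAlarmManager", "ForceSubtag" };

    public static IReadOnlyList<string> Validate(FallbackSettingsSketch s)
    {
        var errors = new List<string>();

        // Mode must be one of the three literals.
        if (Array.IndexOf(Modes, s.Mode) < 0)
            errors.Add("Fallback.Mode must be one of: " + string.Join(", ", Modes));

        // ForceSubtag needs some source for the watch-list.
        bool noIncludes = s.IncludeAttributes == null || s.IncludeAttributes.Length == 0;
        if (s.Enabled && s.Mode == "ForceSubtag" && !s.UseGalaxyRepository && noIncludes)
            errors.Add("ForceSubtag requires Galaxy Repository discovery or an explicit IncludeAttributes list");

        return errors;
    }

    // Clamp used for the threshold, probe interval, and stable-probe count.
    public static int AtLeastOne(int value) => Math.Max(1, value);
}
```

Returning a list of messages instead of throwing is a choice made only for this sketch; the existing `ValidateAlarms` pattern takes precedence.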

**Steps 3-5:** Test the new validation cases (red→green), build the server, commit.

---

### Task 11: Galaxy Repository "alarm attributes" discovery query

**Classification:** standard
**Estimated implement time:** ~5 min
**Parallelizable with:** Task 10, Task 13

**Files:**
- Modify: `src/ZB.MOM.WW.MxGateway.Server/Galaxy/GalaxyRepository.cs` (add `GetAlarmAttributesAsync` + SQL constant, following `GetAttributesAsync` ~lines 86-115 and `AttributesSql` ~line 176)
- Modify: `src/ZB.MOM.WW.MxGateway.Server/Galaxy/IGalaxyRepository.cs`
- Create: `src/ZB.MOM.WW.MxGateway.Server/Galaxy/GalaxyAlarmAttributeRow.cs`
- Test: `src/ZB.MOM.WW.MxGateway.Tests/Galaxy/` (projection unit test; live SQL gated)

**Step 1:** `GalaxyAlarmAttributeRow { string FullTagReference; string SourceObjectReference; string AckCommentSubtag; }` (and any priority subtag). `GetAlarmAttributesAsync` reuses the existing `is_alarm` detection (the `AlarmExtension` primitive join already in `AttributesSql`) filtered to `is_alarm = 1`, projecting the alarm reference + its ack-comment attribute. Follow the exact `SqlConnection`/`SqlCommand`/`SqlDataReader` pattern from `GetAttributesAsync`.

**Step 2:** Unit-test the row→`AlarmSubtagTarget` mapping (a pure mapper function); gate any live-DB test like the existing Galaxy live tests (or `Skip` with a note, matching `AlarmsLiveSmokeTests`).

**Step 3-5:** red→green, build server, commit.

---

### Task 12: Watch-list resolver (GR SQL + config override → `AlarmSubtagTarget[]`)

**Classification:** standard
**Estimated implement time:** ~4 min
**Parallelizable with:** none (depends on Tasks 10, 11)

**Files:**
- Create: `src/ZB.MOM.WW.MxGateway.Server/Alarms/AlarmWatchListResolver.cs`
- Create: `src/ZB.MOM.WW.MxGateway.Server/Alarms/IAlarmWatchListResolver.cs`
- Test: `src/ZB.MOM.WW.MxGateway.Tests/Alarms/AlarmWatchListResolverTests.cs`

**Step 1: Test the merge** with a fake `IGalaxyRepository`:
- discovery rows + `IncludeAttributes` are unioned;
- `ExcludeAttributes` are removed;
- each result becomes an `AlarmSubtagTarget` with `.active`/`.acked`/`.ackmsg` addresses composed from the configured `Subtags` names (`<reference>.<Active>`, etc.);
- empty config subtag names fall back to defaults;
- GR unavailable + no includes ⇒ empty list + a logged warning flag.

**Step 2: Implement** `ResolveAsync(AlarmsOptions, CancellationToken) → IReadOnlyList<AlarmSubtagTarget>`.
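The merge rules from Step 1 can be sketched as a pure function over reference strings. This is an illustration under assumptions: `WatchTargetSketch` and `WatchListMerge` are hypothetical names, only the active and acked addresses are composed, references are compared case-insensitively, and results are sorted for deterministic tests. None of those choices is fixed by the plan.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, trimmed-down stand-in for AlarmSubtagTarget.
public sealed class WatchTargetSketch
{
    public string Reference { get; set; }
    public string ActiveAddress { get; set; }
    public string AckedAddress { get; set; }
}

public static class WatchListMerge
{
    public static List<WatchTargetSketch> Merge(
        IEnumerable<string> discovered,
        IEnumerable<string> include,
        IEnumerable<string> exclude,
        string activeName,
        string ackedName)
    {
        var comparer = StringComparer.OrdinalIgnoreCase; // assumption: references are case-insensitive
        var excluded = new HashSet<string>(exclude, comparer);

        // Empty configured subtag names fall back to the defaults.
        string active = string.IsNullOrWhiteSpace(activeName) ? "active" : activeName;
        string acked = string.IsNullOrWhiteSpace(ackedName) ? "acked" : ackedName;

        return discovered.Concat(include)                // union of discovery and explicit includes
            .Where(r => !string.IsNullOrWhiteSpace(r))
            .Distinct(comparer)
            .Where(r => !excluded.Contains(r))           // excludes win over both sources
            .OrderBy(r => r, comparer)                   // stable order for tests
            .Select(r => new WatchTargetSketch
            {
                Reference = r,
                ActiveAddress = r + "." + active,
                AckedAddress = r + "." + acked,
            })
            .ToList();
    }
}
```

With this shape, the "GR unavailable + no includes" case is simply two empty inputs producing an empty list; the warning flag would be raised by the caller that knows discovery failed.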

**Step 3-5:** red→green, build, commit.

---

### Task 13: Gateway metrics — provider-mode gauge + switch counter

**Classification:** small
**Estimated implement time:** ~3 min
**Parallelizable with:** Task 10, Task 11

**Files:**
- Modify: `src/ZB.MOM.WW.MxGateway.Server/Metrics/GatewayMetrics.cs` (ctor ~lines 55-79; add counter + observable gauge following the existing pattern)
- Test: `src/ZB.MOM.WW.MxGateway.Tests/Metrics/GatewayMetricsTests.cs` (if present; else assert via a `MeterListener`)

**Step 1:** Add `mxgateway.alarms.provider_switches` counter (tagged `from`,`to`,`reason`) and `mxgateway.alarms.provider_mode` observable gauge (1=alarmmgr, 2=subtag), plus `AlarmProviderSwitched(int from, int to, string reason)` and a private `GetAlarmProviderMode()` (lock on `_syncRoot` like the others).
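A standalone sketch of the two instruments using `System.Diagnostics.Metrics`. The class name `AlarmProviderMetricsSketch` and the meter-name constructor parameter are hypothetical; the real members would be added to `GatewayMetrics` following its existing constructor pattern, which is not reproduced here.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics.Metrics;

public sealed class AlarmProviderMetricsSketch : IDisposable
{
    private readonly object _syncRoot = new object();
    private readonly Meter _meter;
    private readonly Counter<long> _switches;
    private int _mode = 1; // 1 = alarmmgr, 2 = subtag

    public AlarmProviderMetricsSketch(string meterName)
    {
        _meter = new Meter(meterName);
        _switches = _meter.CreateCounter<long>("mxgateway.alarms.provider_switches");
        // The gauge callback is invoked by listeners/exporters when they collect.
        _meter.CreateObservableGauge<int>("mxgateway.alarms.provider_mode", GetAlarmProviderMode);
    }

    public void AlarmProviderSwitched(int from, int to, string reason)
    {
        lock (_syncRoot) { _mode = to; }
        _switches.Add(1,
            new KeyValuePair<string, object?>("from", from),
            new KeyValuePair<string, object?>("to", to),
            new KeyValuePair<string, object?>("reason", reason));
    }

    private int GetAlarmProviderMode()
    {
        lock (_syncRoot) { return _mode; }
    }

    public void Dispose() => _meter.Dispose();
}
```

If `GatewayMetricsTests.cs` is absent, a `MeterListener` that enables the instruments by meter name, registers `long` and `int` measurement callbacks, and calls `RecordObservableInstruments()` after a switch is one way to assert both the counter increment and the gauge value.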

**Step 2-4:** test, build, commit.

---

### Task 14: `GatewayAlarmMonitor` — arm watch-list, reflect provider mode, reconcile on switch

**Classification:** high-risk
**Estimated implement time:** ~5 min
**Parallelizable with:** none (depends on Tasks 9, 12, 13)

**Files:**
- Modify: `src/ZB.MOM.WW.MxGateway.Server/Alarms/GatewayAlarmMonitor.cs` (ctor ~41-49; `SubscribeAlarmsAsync` ~210-233; event-drain loop; `StreamAsync` ~386-434)
- Test: `src/ZB.MOM.WW.MxGateway.Tests/Alarms/GatewayAlarmMonitorProviderModeTests.cs` (new, using `FakeWorkerHarness`)

**Step 1:** Inject `IAlarmWatchListResolver` and `GatewayMetrics`. In `SubscribeAlarmsAsync`, resolve the watch-list and build the `SubscribeAlarmsCommand` with `ForcedMode` (from `Fallback.Mode`), `WatchList`, and `Failover` populated from options — instead of the bare `{ SubscriptionExpression }`.

**Step 2:** In the worker-event drain path, handle `OnAlarmProviderModeChangedEvent`: update a `_providerStatus` field (mode/degraded/reason/since), `Broadcast(new AlarmFeedMessage { ProviderStatus = … })` to every subscriber, call `metrics.AlarmProviderSwitched(...)`, and force a `ReconcileAsync` so the cache re-seeds from the now-active provider (avoids raise/clear storms).

**Step 3:** In `StreamAsync`, emit the current `provider_status` as the **first** message (before the snapshot) so a late joiner immediately knows the mode.

**Step 4: Test** — stand up the monitor with `FakeWorkerHarness`; emit an `OnAlarmProviderModeChangedEvent(Subtag)`; assert a `StreamAsync` subscriber receives a `ProviderStatus{ Mode=Subtag, Degraded=true }` and that the switch counter incremented. Also assert a transition emitted in subtag mode flows through with `Degraded=true`.

**Step 5:** build server, run the new test, commit.

---

### Task 15: Dashboard — push provider status to `/hubs/alarms` + UI indicator

**Classification:** standard
**Estimated implement time:** ~5 min
**Parallelizable with:** none (depends on Task 14)

**Files:**
- Modify: `src/ZB.MOM.WW.MxGateway.Server/Dashboard/Hubs/AlarmsHubPublisher.cs` (forward `ProviderStatus` messages — they already flow through `StreamAsync`, so confirm the existing `SendAsync(AlarmMessage, message)` carries them; add a dedicated `"ProviderModeChanged"` client method if the dashboard needs a distinct channel)
- Modify: the alarms dashboard page/component (Bootstrap-only badge: green "alarmmgr" / amber "degraded — subtag") — find under `src/ZB.MOM.WW.MxGateway.Server/Dashboard/`
- Test: `src/ZB.MOM.WW.MxGateway.Tests/` dashboard model test (e.g. a `DashboardAlarmProviderStatus.FromFeed` mapper, mirroring `DashboardActiveAlarm.FromSnapshot`)

**Constraint:** Bootstrap CSS/JS only — no MudBlazor/Radzen/FluentUI.

**Steps:** TDD the model mapper, wire the publisher + badge, build, commit.

---

## Phase 3 — Integration, docs, live smoke

### Task 16: End-to-end fake-worker failover test

**Classification:** standard
**Estimated implement time:** ~5 min
**Parallelizable with:** Task 18

**Files:**
- Test: `src/ZB.MOM.WW.MxGateway.Tests/Alarms/AlarmFailoverEndToEndTests.cs`

Drive the full gateway path with `FakeWorkerHarness`: subscribe (assert the `SubscribeAlarmsCommand` carries a watch-list), emit a wnwrap-style transition (assert `Degraded=false`), emit `OnAlarmProviderModeChangedEvent(Subtag)`, emit a synthesized transition (assert `Degraded=true`, `SourceProvider=Subtag`), then `OnAlarmProviderModeChangedEvent(Alarmmgr)` and assert the feed reports recovery. Build, run, commit.

---

### Task 17: Live subtag smoke test (opt-in)

**Classification:** small
**Estimated implement time:** ~4 min
**Parallelizable with:** Task 18

**Files:**
- Test: `src/ZB.MOM.WW.MxGateway.IntegrationTests/...AlarmSubtagLiveSmokeTests.cs` (or the worker live suite)

A `[LiveMxAccessFact]` test, skipped by default (per the `AlarmsLiveSmokeTests` precedent), that runs against a live Galaxy plus an alarm flip script. It advises the real `.active`/`.acked` subtags via `LmxSubtagAlarmSource`, asserts a synthesized raise/clear, and performs an ack via the ack-comment write. Document the exact subtag names discovered (this resolves the design's open item). Commit.

---

### Task 18: Documentation

**Classification:** trivial
**Estimated implement time:** ~5 min
**Parallelizable with:** Task 16, Task 17

**Files:**
- Modify: `gateway.md` (alarm provider section: dual provider + auto-failover/failback)
- Modify: `docs/DesignDecisions.md` (record the fallback decision + parity rationale)
- Modify: `docs/GatewayConfiguration.md` (the `MxGateway:Alarms:Fallback` block)
- Modify: `docs/AlarmClientDiscovery.md` (subtag provider, synthesis rules, ack-comment write)
- Modify: `docs/Grpc.md` (new `provider_status` feed case + `degraded`/`source_provider` fields)

Follow `StyleGuide.md` (PascalCase filenames, present tense, explain *why*). No code; commit.

---

## Execution order & parallelism summary

- **Serial spine:** 1 → 2 → 3 → 4 → 5 → 6 → 7 → 8/9 → 10/11 → 12 → 13 → 14 → 15 → 16 → 17/18.
- **Parallelizable clusters:** {8, 9 partially}, {10, 11, 13}, {16, 17, 18}.
- **High-risk tasks** (full review chain): 1, 2, 6, 7, 9, 14. **Standard:** 4, 5, 8, 10, 11, 12, 15, 16. **Small/trivial:** 3, 13, 17, 18.

## Risk notes for the executor

- **Field-number collisions:** Task 2 must read the live `MxEvent`/`MxEventFamily` numbers before adding — the agent map gave alarm-payload maxima but not `MxEvent`'s. Verify before editing.
- **STA discipline:** every COM call in `LmxSubtagAlarmSource` and every consumer swap runs on the worker STA; keep the `EnsureOnAlarmConsumerThread` guard. The worker STA already pumps Windows messages, which is required for the subtag `OnDataChange` to deliver.
- **Parity regression:** alarmmgr-mode output must be byte-for-byte unchanged. Existing `AlarmDispatcherTests` and `ProtobufContractRoundTripTests` are the guardrail — they must stay green with `Degraded=false` defaults.
- **Subtag names unverified:** the design leaves exact AVEVA subtag names (`.active`, `.acked`, ack-comment) to confirm against `C:\Users\dohertj2\Desktop\mxaccess` + a live Galaxy (Task 17). The config `Subtags` block exists so names are not hard-coded.

@@ -0,0 +1,24 @@
{
  "planPath": "docs/plans/2026-06-13-alarm-subtag-fallback.md",
  "tasks": [
    {"id": 54, "subject": "Task 1: Worker proto — watch-list, failover config, AlarmProviderMode", "status": "pending"},
    {"id": 55, "subject": "Task 2: Gateway proto — provider status, degraded provenance, mode-changed event", "status": "pending", "blockedBy": [54]},
    {"id": 56, "subject": "Task 3: Proto round-trip tests for new alarm fields", "status": "pending", "blockedBy": [54, 55]},
    {"id": 57, "subject": "Task 4: Subtag value-source abstraction + synthesis state machine", "status": "pending", "blockedBy": [54]},
    {"id": 58, "subject": "Task 5: SubtagAlarmConsumer over the source seam", "status": "pending", "blockedBy": [57]},
    {"id": 59, "subject": "Task 6: COM-backed LmxSubtagAlarmSource", "status": "pending", "blockedBy": [57]},
    {"id": 60, "subject": "Task 7: FailoverAlarmConsumer state machine", "status": "pending", "blockedBy": [58]},
    {"id": 61, "subject": "Task 8: Synthetic GUID + degraded flag on event sink path", "status": "pending", "blockedBy": [55]},
    {"id": 62, "subject": "Task 9: Wire watch-list/failover through AlarmCommandHandler; emit mode-changed", "status": "pending", "blockedBy": [58, 60, 61]},
    {"id": 63, "subject": "Task 10: AlarmsOptions.Fallback + validation", "status": "pending"},
    {"id": 64, "subject": "Task 11: Galaxy Repository alarm-attributes discovery query", "status": "pending"},
    {"id": 65, "subject": "Task 12: Watch-list resolver (GR SQL + config override)", "status": "pending", "blockedBy": [54, 63, 64]},
    {"id": 66, "subject": "Task 13: Metrics — provider-mode gauge + switch counter", "status": "pending"},
    {"id": 67, "subject": "Task 14: GatewayAlarmMonitor — arm watch-list, reflect mode, reconcile on switch", "status": "pending", "blockedBy": [55, 62, 65, 66]},
    {"id": 68, "subject": "Task 15: Dashboard — push provider status + UI badge", "status": "pending", "blockedBy": [67]},
    {"id": 69, "subject": "Task 16: End-to-end fake-worker failover test", "status": "pending", "blockedBy": [67]},
    {"id": 70, "subject": "Task 17: Live subtag smoke test (opt-in)", "status": "pending", "blockedBy": [59, 62]},
    {"id": 71, "subject": "Task 18: Documentation", "status": "pending", "blockedBy": [67]}
  ],
  "lastUpdated": "2026-06-13T12:40:00Z"
}