lmxopcua/src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared/Contracts/HistorianAlarms.cs
Joseph Doherty 25ad4b1929 Phase 7 Stream D — Historian alarm sink (SQLite store-and-forward + Galaxy.Host IPC contracts)
Implements Phase 7 plan decisions #16, #17, #19, #21. A durable local SQLite queue
absorbs every qualifying alarm event; a drain worker forwards batches to Galaxy.Host
(reusing the already-loaded 32-bit aahClientManaged DLLs) on an exponential-backoff
cadence; operator acks never block on the historian being reachable.

## New project Core.AlarmHistorian (net10)

- AlarmHistorianEvent — source-agnostic event shape (scripted alarms + Galaxy-native +
  AB CIP ALMD + any future IAlarmSource)
- IAlarmHistorianSink / NullAlarmHistorianSink — interface + disabled default
- IAlarmHistorianWriter — per-event outcome (Ack / RetryPlease / PermanentFail); Stream G
  wires the Galaxy.Host IPC client implementation
- SqliteStoreAndForwardSink — full implementation:
  - Queue table with AttemptCount / LastError / DeadLettered columns
  - DrainOnceAsync serialised via SemaphoreSlim
  - BackoffLadder 1s → 2s → 5s → 15s → 60s (cap)
  - DefaultCapacity 1,000,000 rows — overflow evicts oldest non-dead-lettered
  - DefaultDeadLetterRetention 30 days — sweeper purges on every drain tick
  - RetryDeadLettered operator action reattaches dead-letters to the regular queue
  - Writer-side exceptions treated as whole-batch RetryPlease (no data loss)
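
The BackoffLadder bullet above can be read as a lookup that clamps at its last rung. The following is a minimal sketch under that reading, not the SqliteStoreAndForwardSink code: the names `ladder` and `DelayFor` are hypothetical, and mapping retry count 0 to the 1s rung is an assumption. Only the five durations come from this commit message.

```csharp
using System;

// Rungs taken from the commit message: 1s, 2s, 5s, 15s, 60s (cap).
TimeSpan[] ladder =
{
    TimeSpan.FromSeconds(1),
    TimeSpan.FromSeconds(2),
    TimeSpan.FromSeconds(5),
    TimeSpan.FromSeconds(15),
    TimeSpan.FromSeconds(60),
};

// Hypothetical helper: maps a count of consecutive retries to a delay.
// Counts past the end of the ladder stay on the last rung (the 60s cap).
TimeSpan DelayFor(int consecutiveRetries)
{
    if (consecutiveRetries < 0)
        throw new ArgumentOutOfRangeException(nameof(consecutiveRetries));

    int index = Math.Min(consecutiveRetries, ladder.Length - 1);
    return ladder[index];
}

Console.WriteLine(DelayFor(0));   // 00:00:01
Console.WriteLine(DelayFor(3));   // 00:00:15
Console.WriteLine(DelayFor(10));  // 00:01:00 (clamped at the cap)
```

Under this sketch an Ack would reset the retry count to zero, which matches the "Ack after Retry resets backoff" test listed below.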

## New IPC contracts in Driver.Galaxy.Shared

- HistorianAlarmEventRequest — batched up to 100 events/request per plan Stream D.5
- HistorianAlarmEventResponse — per-event outcome (1:1 with request order)
- HistorianAlarmEventOutcomeDto enum (byte on the wire — Ack/RetryPlease/PermanentFail)
- HistorianAlarmEventDto — mirrors Core.AlarmHistorian.AlarmHistorianEvent
- HistorianConnectivityStatusNotification — Host pushes proactively when the SDK
  session drops so /alarms/historian flips red without waiting for the next drain
- MessageKind additions: 0x80 HistorianAlarmEventRequest / 0x81 HistorianAlarmEventResponse
  / 0x82 HistorianConnectivityStatus
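
The batching limit and the 1:1 outcome ordering described above might be consumed roughly as follows. This is an illustrative sketch only: `ToBatches`, `Apply`, and the string action names are hypothetical, queued rows are represented by bare `long` ids, and rejecting a response whose length differs from the request is an assumption rather than documented behaviour. The batch size of 100 and the byte values 0/1/2 come from this commit.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

const int MaxBatchSize = 100; // "batched up to 100 events/request"

// Byte values as declared on HistorianAlarmEventOutcomeDto.
const byte Ack = 0, RetryPlease = 1, PermanentFail = 2;

// Splits queued row ids into request-sized batches, preserving order.
List<long[]> ToBatches(IEnumerable<long> queuedRowIds) =>
    queuedRowIds.Chunk(MaxBatchSize).ToList();

// Pairs each row with its outcome by position and picks a queue action.
Dictionary<long, string> Apply(long[] batch, byte[] outcomes)
{
    // Assumption: a length mismatch means the response cannot be trusted.
    if (outcomes.Length != batch.Length)
        throw new InvalidOperationException("Expected one outcome per event.");

    var actions = new Dictionary<long, string>();
    for (int i = 0; i < batch.Length; i++)
    {
        actions[batch[i]] = outcomes[i] switch
        {
            Ack => "delete",                // persisted, remove from queue
            RetryPlease => "keep",          // transient, retry after backoff
            PermanentFail => "dead-letter", // do not block neighbors
            _ => throw new ArgumentOutOfRangeException(nameof(outcomes)),
        };
    }
    return actions;
}

var batches = ToBatches(Enumerable.Range(1, 250).Select(i => (long)i));
Console.WriteLine(batches.Count); // 3 (100 + 100 + 50)

var result = Apply(new long[] { 7, 8, 9 }, new byte[] { Ack, PermanentFail, RetryPlease });
Console.WriteLine(result[8]); // dead-letter
```

Because outcomes are matched by index, a single PermanentFail affects only its own row, which is what lets one malformed event be dead-lettered while its neighbors proceed.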

## Tests — 14/14

SqliteStoreAndForwardSinkTests covers: enqueue→drain→Ack round-trip, empty-queue no-op,
RetryPlease bumps backoff + keeps row, Ack after Retry resets backoff, PermanentFail
dead-letters one row without blocking neighbors, writer exception treated as whole-batch
retry with error surfaced in status, capacity eviction drops oldest non-dead-lettered,
dead-letters purged past retention window, RetryDeadLettered requeues, ladder caps at
60s after 10 retries, Null sink reports Disabled status, Null sink swallows enqueue,
ctor argument validation, disposed sink rejects enqueue.

## Totals
Full Phase 7 tests: 160 green (63 Scripting + 36 VirtualTags + 47 ScriptedAlarms +
14 AlarmHistorian). Stream G wires this into the real Galaxy.Host IPC pipe.
2026-04-20 19:11:17 -04:00


using System;
using MessagePack;

namespace ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Shared.Contracts;

/// <summary>
/// Phase 7 Stream D — IPC contracts for routing Part 9 alarm transitions from the
/// main .NET 10 server into Galaxy.Host's already-loaded <c>aahClientManaged</c>
/// DLLs. Reuses the Tier-C isolation + licensing pathway rather than loading 32-bit
/// native historian code into the main server.
/// </summary>
/// <remarks>
/// <para>
/// Batched on the wire to amortize IPC overhead — the main server's SqliteStoreAndForwardSink
/// ships up to 100 events per request per Phase 7 plan Stream D.5.
/// </para>
/// <para>
/// Per-event outcomes (Ack / RetryPlease / PermanentFail) let the drain worker
/// dead-letter malformed events without blocking neighbors in the batch.
/// <see cref="HistorianConnectivityStatusNotification"/> fires proactively from
/// the Host when the SDK session drops so the /hosts + /alarms/historian Admin
/// diagnostics pages flip to red promptly instead of waiting for the next
/// drain cycle.
/// </para>
/// </remarks>
[MessagePackObject]
public sealed class HistorianAlarmEventRequest
{
    [Key(0)] public HistorianAlarmEventDto[] Events { get; set; } = Array.Empty<HistorianAlarmEventDto>();
}
[MessagePackObject]
public sealed class HistorianAlarmEventResponse
{
    /// <summary>Per-event outcome, same order as the request.</summary>
    [Key(0)] public HistorianAlarmEventOutcomeDto[] Outcomes { get; set; } = Array.Empty<HistorianAlarmEventOutcomeDto>();
}
/// <summary>Outcome enum — bytes on the wire so it stays compact.</summary>
public enum HistorianAlarmEventOutcomeDto : byte
{
    /// <summary>Successfully persisted to the historian; remove from queue.</summary>
    Ack = 0,
    /// <summary>Transient failure (historian disconnected, timeout, busy); retry after backoff.</summary>
    RetryPlease = 1,
    /// <summary>Permanent failure (malformed, unrecoverable SDK error); move to dead-letter.</summary>
    PermanentFail = 2,
}
/// <summary>One alarm-transition payload. Fields mirror <c>Core.AlarmHistorian.AlarmHistorianEvent</c>.</summary>
[MessagePackObject]
public sealed class HistorianAlarmEventDto
{
    [Key(0)] public string AlarmId { get; set; } = string.Empty;
    [Key(1)] public string EquipmentPath { get; set; } = string.Empty;
    [Key(2)] public string AlarmName { get; set; } = string.Empty;
    /// <summary>Concrete Part 9 subtype name: "LimitAlarm" / "OffNormalAlarm" / "AlarmCondition" / "DiscreteAlarm".</summary>
    [Key(3)] public string AlarmTypeName { get; set; } = string.Empty;
    /// <summary>Numeric severity the Host maps to the historian's priority scale.</summary>
    [Key(4)] public int Severity { get; set; }
    /// <summary>Which transition this event represents: "Activated" / "Cleared" / "Acknowledged" / etc.</summary>
    [Key(5)] public string EventKind { get; set; } = string.Empty;
    /// <summary>Pre-rendered message; template tokens resolved upstream.</summary>
    [Key(6)] public string Message { get; set; } = string.Empty;
    /// <summary>Operator who triggered the transition. "system" for engine-driven events.</summary>
    [Key(7)] public string User { get; set; } = "system";
    /// <summary>Operator-supplied free-form comment, if any.</summary>
    [Key(8)] public string? Comment { get; set; }
    /// <summary>Source timestamp (UTC Unix milliseconds).</summary>
    [Key(9)] public long TimestampUtcUnixMs { get; set; }
}
/// <summary>
/// Proactive notification — Galaxy.Host pushes this when the historian SDK session
/// transitions (connected / disconnected / degraded). The main server reflects this
/// into the historian sink status so Admin UI surfaces the problem without the
/// operator having to scrutinize drain cadence.
/// </summary>
[MessagePackObject]
public sealed class HistorianConnectivityStatusNotification
{
    [Key(0)] public string Status { get; set; } = "unknown"; // connected | disconnected | degraded
    [Key(1)] public string? Detail { get; set; }
    [Key(2)] public long ObservedAtUtcUnixMs { get; set; }
}