diff --git a/code-reviews/Core.AlarmHistorian/findings.md b/code-reviews/Core.AlarmHistorian/findings.md
index cc731587..01bc78e2 100644
--- a/code-reviews/Core.AlarmHistorian/findings.md
+++ b/code-reviews/Core.AlarmHistorian/findings.md
@@ -4,10 +4,10 @@
|---|---|
| Module | `src/Core/ZB.MOM.WW.OtOpcUa.Core.AlarmHistorian` |
| Reviewer | Claude Code |
-| Review date | 2026-05-22 |
-| Commit reviewed | `76d35d1` |
+| Review date | 2026-06-19 |
+| Commit reviewed | `621d00e4` |
| Status | Reviewed |
-| Open findings | 0 |
+| Open findings | 1 |
## Checklist coverage
@@ -190,3 +190,69 @@
**Recommendation:** Update the doc-comments to refer to the Wonderware historian sidecar / `WonderwareHistorianClient` (`IAlarmHistorianWriter` implementation) instead of `Galaxy.Host`, consistent with `docs/AlarmTracking.md`'s "Historian write-back" section.
**Resolution:** Resolved 2026-05-23 — the three stale `Galaxy.Host` references were already replaced ahead of this resolution by earlier commits (`bdca772` rewrote the `IAlarmHistorianSink` summary + `IAlarmHistorianWriter` summary to name the Wonderware historian sidecar / `WonderwareHistorianClient`; `f6d487b` rewrote the `AlarmHistorianEvent.EventKind` doc-comment). A fresh grep across the project confirms no remaining `Galaxy.Host` / "Stream G wires this" strings — only the legitimate `Galaxy-native` alarm-source label survives. Status flipped to Resolved during the -008 pass; no new source change was needed.
+
+---
+
+## Re-review 2026-06-19 (commit 621d00e4)
+
+All 11 prior findings confirmed resolved at HEAD. The re-review covered all 10 checklist categories against the 3 source files (874 LOC) and the single test file (820 LOC, 28 tests) at HEAD. Two new findings: one resolved in-session (Medium), one deferred (Low).
+
+## Checklist coverage — re-review (621d00e4, 2026-06-19)
+
+| # | Category | Result |
+|---|---|---|
+| 1 | Correctness & logic bugs | Core.AlarmHistorian-012 (Resolved) |
+| 2 | OtOpcUa conventions | No issues found |
+| 3 | Concurrency & thread safety | Core.AlarmHistorian-013 (Resolved) |
+| 4 | Error handling & resilience | Core.AlarmHistorian-014 (Open-deferred) |
+| 5 | Security | No issues found |
+| 6 | Performance & resource management | No issues found |
+| 7 | Design-document adherence | No issues found |
+| 8 | Code organization & conventions | No issues found |
+| 9 | Testing coverage | No issues found |
+| 10 | Documentation & comments | No issues found |
+
+### Core.AlarmHistorian-012
+
+| Field | Value |
+|---|---|
+| Severity | Medium |
+| Category | Correctness & logic bugs |
+| Location | `src/Core/ZB.MOM.WW.OtOpcUa.Core.AlarmHistorian/SqliteStoreAndForwardSink.cs:412-415` |
+| Status | Resolved |
+
+**Description:** When `WriteBatchAsync` throws `OperationCanceledException`, the catch at line 412–415 re-throws without resetting `_drainState`. The drain tick set `_drainState = Draining` at the top of `DrainOnceAsync` (inside `lock (_statusLock)`), and the `OperationCanceledException` path exits the method without setting it back to `Idle` or `BackingOff`. The status surface therefore permanently shows `Draining` until the next drain tick begins — any Admin UI poll or health check in that window sees a stale, misleading state. In production the drain timer always passes `CancellationToken.None` so this is not triggerable via the timer path; however, any code that calls `DrainOnceAsync` directly with a real cancellable token (future test code, operator tooling, or a shutdown path) will hit this. The `_drainState` invariant — only `Draining` while the semaphore is held — is violated.
+
+**Recommendation:** Before re-throwing, set `_drainState = BackingOff` under `_statusLock` so the status surface accurately reflects that the drain stopped without applying any outcomes.
+
+**Resolution:** Resolved 2026-06-19 — added `lock (_statusLock) { _drainState = HistorianDrainState.BackingOff; }` in the `OperationCanceledException` catch block before re-throwing. Regression test `Drain_cancelled_mid_write_resets_drain_state_to_not_draining` added; confirmed fail-before / pass-after. All 28 tests green.
+
+### Core.AlarmHistorian-013
+
+| Field | Value |
+|---|---|
+| Severity | Low |
+| Category | Concurrency & thread safety |
+| Location | `src/Core/ZB.MOM.WW.OtOpcUa.Core.AlarmHistorian/SqliteStoreAndForwardSink.cs:79` |
+| Status | Resolved |
+
+**Description:** `_disposed` is a plain `bool` field read on multiple threads (`EnqueueAsync` on the emitting thread, `DrainOnceAsync` / `RescheduleDrain` on the timer thread, `StartDrainLoop` on the caller's thread) and written by `Dispose()` on the owner's thread — with no memory barrier. Per the C# language specification this is a data race. On the .NET/x64 JIT the weak read is unlikely to manifest as a bug in practice, but it is technically undefined behaviour: the compiler or JIT is permitted to cache the stale `false` value in a register across repeated checks. `_backoffIndex` was already correctly marked `volatile` for the same cross-thread-visibility reason (commit `c20d2283`); `_disposed` was not updated at the same time.
+
+**Recommendation:** Mark `_disposed` as `volatile bool` so all reads see the most recently written value without requiring an additional lock or `Interlocked` operation.
+
+**Resolution:** Resolved 2026-06-19 — `_disposed` changed to `private volatile bool _disposed` with an explanatory comment. No test added (the race is non-deterministic and the existing `Disposed_sink_rejects_enqueue` test already exercises the dispose path; a data-race test would require a memory-model fuzzer). All 28 tests green.
+
+### Core.AlarmHistorian-014
+
+| Field | Value |
+|---|---|
+| Severity | Low |
+| Category | Error handling & resilience |
+| Location | `src/Core/ZB.MOM.WW.OtOpcUa.Core.AlarmHistorian/SqliteStoreAndForwardSink.cs:515-555,558-568` |
+| Status | Open |
+
+**Description:** `GetStatus()` and `RetryDeadLettered()` have no `_disposed` guard. After `Dispose()` is called both methods still open new `SqliteConnection` objects against the database file (which persists on disk after disposal). This works because SQLite is file-backed, but the behavior is surprising and inconsistent with the convention that post-disposal calls on an `IDisposable` object throw `ObjectDisposedException`. `RetryDeadLettered()` in particular performs a write (`UPDATE Queue …`) against a sink that the owner has declared finished. If a lingering Admin UI polling loop calls `GetStatus()` after the host shuts down and disposes the sink, no error is surfaced — the method silently succeeds.
+
+**Recommendation:** Add `if (_disposed) throw new ObjectDisposedException(nameof(SqliteStoreAndForwardSink));` guards at the top of `GetStatus()` and `RetryDeadLettered()`, consistent with the guards already present on `EnqueueAsync` and `StartDrainLoop`.
+
+**Deferred:** The current behavior (silently succeeds post-disposal) is benign in production because the Admin UI is also shutting down when the host disposes the sink. Adding the guard is a one-line change per method but requires coordinating with the Admin UI callers to confirm they tolerate `ObjectDisposedException` after shutdown — that is a cross-module concern. Deferring until the Admin UI shutdown sequence is reviewed.
diff --git a/src/Core/ZB.MOM.WW.OtOpcUa.Core.AlarmHistorian/SqliteStoreAndForwardSink.cs b/src/Core/ZB.MOM.WW.OtOpcUa.Core.AlarmHistorian/SqliteStoreAndForwardSink.cs
index f00ea542..5127bf71 100644
--- a/src/Core/ZB.MOM.WW.OtOpcUa.Core.AlarmHistorian/SqliteStoreAndForwardSink.cs
+++ b/src/Core/ZB.MOM.WW.OtOpcUa.Core.AlarmHistorian/SqliteStoreAndForwardSink.cs
@@ -76,7 +76,9 @@ public sealed class SqliteStoreAndForwardSink : IAlarmHistorianSink, IDisposable
private Timer? _drainTimer;
private TimeSpan _tickInterval;
private volatile int _backoffIndex;
- private bool _disposed;
+ // volatile: read by EnqueueAsync, DrainOnceAsync, RescheduleDrain, and StartDrainLoop
+ // from different threads; Dispose() writes it from the owner's thread (Core.AlarmHistorian-013).
+ private volatile bool _disposed;
// Core.AlarmHistorian-005: status fields written by the drain timer thread and
// read concurrently by GetStatus() / health-check threads. Guard all reads and
@@ -411,6 +413,10 @@ public sealed class SqliteStoreAndForwardSink : IAlarmHistorianSink, IDisposable
}
catch (OperationCanceledException)
{
+ // Core.AlarmHistorian-012: reset _drainState so the status surface does
+ // not stay stuck at Draining after the cancellation unwinds the method.
+ // The row stays queued; the next drain tick will retry it.
+ lock (_statusLock) { _drainState = HistorianDrainState.BackingOff; }
throw;
}
catch (Exception ex)
diff --git a/tests/Core/ZB.MOM.WW.OtOpcUa.Core.AlarmHistorian.Tests/SqliteStoreAndForwardSinkTests.cs b/tests/Core/ZB.MOM.WW.OtOpcUa.Core.AlarmHistorian.Tests/SqliteStoreAndForwardSinkTests.cs
index d0348822..4096f275 100644
--- a/tests/Core/ZB.MOM.WW.OtOpcUa.Core.AlarmHistorian.Tests/SqliteStoreAndForwardSinkTests.cs
+++ b/tests/Core/ZB.MOM.WW.OtOpcUa.Core.AlarmHistorian.Tests/SqliteStoreAndForwardSinkTests.cs
@@ -786,6 +786,55 @@ public sealed class SqliteStoreAndForwardSinkTests : IDisposable
sink.GetStatus().QueueDepth.ShouldBe(0, "every event drained at the end of the run");
}
+ ///
+ /// Regression for Core.AlarmHistorian-012: when WriteBatchAsync throws
+ /// the drain exits via re-throw
+ /// without resetting _drainState, leaving the status surface permanently
+ /// showing Draining. The next call to
+ /// must reflect Idle or BackingOff, not the stale Draining
+ /// state that was set at the top of the drain tick.
+ ///
+ [Fact]
+ public async Task Drain_cancelled_mid_write_resets_drain_state_to_not_draining()
+ {
+ var writer = new CancellableWriter();
+ using var sink = new SqliteStoreAndForwardSink(_dbPath, writer, _log);
+ await sink.EnqueueAsync(Event("A1"), CancellationToken.None);
+
+ using var cts = new CancellationTokenSource();
+ writer.CancelOn(cts);
+
+ // The drain will cancel inside WriteBatchAsync — OperationCanceledException re-throws.
+ await Should.ThrowAsync(
+ () => sink.DrainOnceAsync(cts.Token));
+
+ // DrainState must NOT be left as Draining after the cancellation escapes.
+ sink.GetStatus().DrainState.ShouldNotBe(HistorianDrainState.Draining,
+ "cancellation must clear the Draining status; row stays queued for next tick");
+
+ // The row must still be in the queue — the cancelled drain applied no outcomes.
+ sink.GetStatus().QueueDepth.ShouldBe(1, "row stays queued when drain is cancelled");
+ }
+
+ /// A writer that cancels its own CancellationTokenSource on the first call.
+ private sealed class CancellableWriter : IAlarmHistorianWriter
+ {
+ private CancellationTokenSource? _cts;
+
+ /// Register the token source to cancel on first write.
+ public void CancelOn(CancellationTokenSource cts) => _cts = cts;
+
+ /// Writes a batch; cancels via the registered CTS on the first call.
+ public Task> WriteBatchAsync(
+ IReadOnlyList batch, CancellationToken ct)
+ {
+ _cts?.Cancel();
+ ct.ThrowIfCancellationRequested();
+ var outcomes = Enumerable.Repeat(HistorianWriteOutcome.Ack, batch.Count).ToList();
+ return Task.FromResult>(outcomes);
+ }
+ }
+
///
/// Helper that confirms the queue depth surfaced by GetStatus matches a fresh
/// COUNT(*) read directly from storage — proves the in-memory counter has not