code-review: 2026-05-28 baseline re-review of all 23 modules at 1eb6e97
Re-applies the full 10-category checklist to every src/ project — including
first-time reviews of the four newer components (AuditLog, NotificationOutbox,
SiteCallAudit, Transport) — so the code-reviews/ index reflects today's
codebase rather than the 2026-05-16 baseline. 172 new Open findings (0
Critical, 18 High, 62 Medium, 92 Low); 481 findings total across 23 modules.
regen-readme.py now derives each module's Last reviewed + Commit from its
findings.md header instead of hard-coding 2026-05-16 / 9c60592, so future
single-module re-reviews show their own date in the Module Status table.
@@ -0,0 +1,488 @@
|
||||
# Code Review — NotificationOutbox
|
||||
|
||||
| Field | Value |
|-------|-------|
| Module | `src/ScadaLink.NotificationOutbox` |
| Design doc | `docs/requirements/Component-NotificationOutbox.md` |
| Status | Reviewed |
| Last reviewed | 2026-05-28 |
| Reviewer | claude-agent |
| Commit reviewed | `1eb6e97` |
| Open findings | 10 |
|
||||
|
||||
## Summary
|
||||
|
||||
NotificationOutbox is a small, focused module — one ~985-line actor
(`NotificationOutboxActor`), a strongly-typed options class, an
`INotificationDeliveryAdapter` seam, and the single concrete `EmailNotificationDeliveryAdapter`.
The Akka.NET conventions are textbook: every async path is wrapped with `PipeTo`, the
dispatcher uses an in-flight guard cleared on `DispatchComplete`, the sender is captured
before crossing the await, and the actor isolates per-notification failures so one bad row
never aborts a batch. Test coverage is broad — ingest, dispatch, query, retry/discard,
purge, KPI, and the new audit-emission paths (B2 attempts + B3 terminals) all have
dedicated test files — and the audit-write-failure-never-aborts-delivery contract is
explicitly asserted.

The dominant theme is **trust-boundary leakage between Outbox, NotificationService, and
ConfigurationDatabase**. The outbox inherits two known defects from its sibling modules
that are reachable through `EmailNotificationDeliveryAdapter`: the OAuth2 SASL empty-user
bug (NS-021) ships every M365 send with `user=""`, and the
`InsertIfNotExistsAsync` check-then-act race (CD-015) lives on the outbox's ack-after-persist
hot path. Neither is a defect of code under `src/ScadaLink.NotificationOutbox/`, but both
are surfaced here because production dispatch and ingest go through these exact lines.
|
||||
A secondary theme is **dispatcher-fire-and-forget audit writes** (`_ = _auditWriter.WriteAsync(...)`)
that can race the per-sweep scope dispose under the wrong DI graph, and a few smaller
drifts: the dispatcher passes `CancellationToken.None` to adapter delivery (no graceful
shutdown for in-flight SMTP sends), the `StuckAgeThreshold` XML-doc describes a behavior
the design explicitly forbids (display-only, never reclaim), the `MaxRetries` boundary check
uses `>=` against a config value that can be zero (immediate park on first transient
failure), and several `NotificationOutboxOptions` fields are documented in code but absent
from `Component-NotificationOutbox.md`. No Critical findings; two High, five Medium, three Low.
|
||||
|
||||
## Checklist coverage
|
||||
|
||||
| # | Category | Examined | Notes |
|---|----------|----------|-------|
| 1 | Correctness & logic bugs | Yes | `MaxRetries` zero/negative immediately parks (NotificationOutbox-002); `StuckAgeThreshold` XML doc contradicts design (NotificationOutbox-009); `Guid.TryParse` accepts compact `"N"` ids emitted by sites. |
| 2 | Akka.NET conventions | Yes | `PipeTo` / sender-capture / in-flight guard pattern is correctly applied throughout. Fire-and-forget `_ = _auditWriter.WriteAsync(...)` raises a scope-lifetime concern (NotificationOutbox-004). |
| 3 | Concurrency & thread safety | Yes | Actor state mutated only on actor thread. Inherited CD-015 race on `InsertIfNotExistsAsync` (NotificationOutbox-005) is the only race; the dispatcher's in-flight guard correctly serializes sweeps. |
| 4 | Error handling & resilience | Yes | Outer try/catch on `RunDispatchPass`/`RunPurgePass` keeps the in-flight guard sane; per-notification isolation is correct. CT not threaded into delivery (NotificationOutbox-003). |
| 5 | Security | Yes | Inherited OAuth2 empty-user (NotificationOutbox-001) reachable through the adapter. No new credential or trust-boundary issues introduced by the outbox itself. |
| 6 | Performance & resource management | Yes | Dispatch interval & batch size are simple polling; `ResolveAdapters` rebuilds the lookup per sweep (NotificationOutbox-006). No leaks. |
| 7 | Design-document adherence | Yes | `NotificationOutboxOptions.DispatchBatchSize`, `DeliveredKpiWindow`, `PurgeInterval` are not in the design doc (NotificationOutbox-007). |
| 8 | Code organization & conventions | Yes | Options class lives in the component project (correct); DI extension lives in the component (correct); adapter is `scoped`, actor singleton — interaction correctly documented in `ServiceCollectionExtensions`. No issues. |
| 9 | Testing coverage | Yes | Solid actor-behaviour coverage. Missing tests for `FallbackMaxRetries` / empty-SMTP-config dispatch path (NotificationOutbox-008). |
| 10 | Documentation & comments | Yes | XML on `StuckAgeThreshold` misleading (NotificationOutbox-009); XML on dispatcher's audit `_ =` fire-and-forget says "writer never throws" but `EmitAttemptAudit` still wraps in try/catch — comment contradicts itself (NotificationOutbox-010). |
|
||||
|
||||
## Findings
|
||||
|
||||
### NotificationOutbox-001 — `EmailNotificationDeliveryAdapter` inherits the OAuth2 empty-user SASL bug (NS-021) on the M365 send path
|
||||
|
||||
| | |
|--|--|
| Severity | High |
| Category | Correctness & logic bugs |
| Status | Open |
| Location | `src/ScadaLink.NotificationOutbox/Delivery/EmailNotificationDeliveryAdapter.cs:185-191` (calls `smtp.AuthenticateAsync("oauth2", token)`); root cause in `src/ScadaLink.NotificationService/MailKitSmtpClientWrapper.cs:76-79` |
|
||||
|
||||
**Description**

`EmailNotificationDeliveryAdapter.SendAsync` resolves an OAuth2 access token via
`_tokenService.GetTokenAsync(...)` and then calls
`await smtp.AuthenticateAsync(config.AuthType, credentials, cancellationToken);`
on `ISmtpClientWrapper`. The production implementation (`MailKitSmtpClientWrapper`)
constructs `new SaslMechanismOAuth2("", credentials)` — an empty user-name field —
which Microsoft 365 SMTP rejects with `535 5.7.3 Authentication unsuccessful`. The
sibling NotificationService finding NS-021 documents this in full; the outbox is the
*new home* for delivery on central, so every OAuth2 send that the outbox dispatches
hits this code path. The defect is therefore reachable here even though the offending
constructor lives in the NotificationService project, and the central-only redesign
means this is now the only delivery path in production. Existing outbox tests do not
catch it because they all substitute `ISmtpClientWrapper` and assert only that
`AuthenticateAsync` is invoked with `("oauth2", "<token>")` — the real
`SaslMechanismOAuth2` is never instantiated. `OAuth2TokenService.GetTokenAsync` is
explicitly wired to `login.microsoftonline.com/.../oauth2/v2.0/token` with
`scope=https://outlook.office365.com/.default`, so M365 SMTP is the intended target —
and is precisely the relay that requires the user field to be populated.
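
To illustrate why an empty user name matters, here is a minimal sketch of the SASL XOAUTH2 initial client response, assuming the standard `user=...^Aauth=Bearer ...^A^A` layout. The `XOAuth2` class and its method names are hypothetical and exist only for this illustration; it does not reproduce MailKit's internals.

```csharp
using System;
using System.Text;

// Hypothetical helper, for illustration only.
public static class XOAuth2
{
    // Builds the base64-encoded XOAUTH2 initial client response:
    // "user=" {userName} ^A "auth=Bearer " {accessToken} ^A ^A
    public static string BuildInitialResponse(string userName, string accessToken)
    {
        var raw = $"user={userName}\u0001auth=Bearer {accessToken}\u0001\u0001";
        return Convert.ToBase64String(Encoding.UTF8.GetBytes(raw));
    }

    // Decodes a response and renders the ^A separators as '|' for readability.
    public static string Decode(string response) =>
        Encoding.UTF8.GetString(Convert.FromBase64String(response)).Replace('\u0001', '|');
}
```

With an empty user name the decoded payload begins with a bare `user=`, which is the shape the finding describes as `user=""`; passing the sender mailbox instead fills that field.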
|
||||
|
||||
**Recommendation**

Track the NS-021 fix and add an outbox-side regression test once the wrapper signature
is widened. Concretely, when `ISmtpClientWrapper.AuthenticateAsync` is extended to
accept the sender mailbox (or a dedicated `oauth2UserName` parameter), update
`EmailNotificationDeliveryAdapter.SendAsync` to pass `config.FromAddress`, and add a
test in `EmailNotificationDeliveryAdapterTests` that asserts the OAuth2 path forwards
the sender identity. Until then, surface the same finding here so the outbox is not
treated as resolved when NS-021 fires.
|
||||
|
||||
**Resolution**

_Unresolved._
|
||||
|
||||
### NotificationOutbox-002 — Dispatcher parks on first transient failure when `SmtpConfiguration.MaxRetries == 0`
|
||||
|
||||
| | |
|--|--|
| Severity | High |
| Category | Correctness & logic bugs |
| Status | Open |
| Location | `src/ScadaLink.NotificationOutbox/NotificationOutboxActor.cs:348-360` |
|
||||
|
||||
**Description**

The transient-failure branch increments `RetryCount` then evaluates
`if (notification.RetryCount >= maxRetries) notification.Status = NotificationStatus.Parked;`.
`maxRetries` is read from the central `SmtpConfiguration.MaxRetries` column, which has
no enforced lower bound and is not validated by the outbox. A row whose `MaxRetries`
is `0` (or any negative value) immediately satisfies `1 >= 0` on the very first
transient failure, so the notification is parked without a single retry — directly
contradicting the design doc's "fixed retry interval, reuse central SMTP
max-retry-count" intent, where a configured value of zero would naturally read as
"never retry, fail straight to permanent". `SetupSmtpRetryPolicy` in the dispatch
tests always supplies a positive value, so this path is not exercised.

Additionally, an operator who clears the SMTP config row drops into the
`FallbackMaxRetries = 10` / `FallbackRetryDelay = 1 min` path
(`ResolveRetryPolicyAsync` line 251); that path is also untested — see
NotificationOutbox-008. The operational result is that a single bad SMTP config
value silently weakens the outbox's delivery guarantees: transient failures are
parked without a retry.
|
||||
|
||||
**Recommendation**

Validate `MaxRetries` at the read point: treat a non-positive value as either the
configured fallback (current `FallbackMaxRetries = 10`) or — preferred — surface the
mis-configuration to the operator via a health metric and refuse to dispatch until
the row is corrected. Either way, add a test that asserts the dispatcher's behaviour
for `MaxRetries == 0` and `MaxRetries < 0`.
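
The fallback variant of this validation could look roughly like the sketch below. It is a standalone illustration, not the actor's code: `RetryPolicyGuard` and both method names are hypothetical, and `ShouldPark` only mirrors the `>=` boundary check quoted in the description.

```csharp
using System;

// Hypothetical helper, for illustration only.
public static class RetryPolicyGuard
{
    // Returns the configured value when it is positive; otherwise the fallback.
    public static int NormalizeMaxRetries(int configured, int fallback)
    {
        if (fallback <= 0)
            throw new ArgumentOutOfRangeException(nameof(fallback), "Fallback must be positive.");
        return configured > 0 ? configured : fallback;
    }

    // Mirrors the boundary check from the finding: park once the retry budget is used up.
    public static bool ShouldPark(int retryCount, int maxRetries) => retryCount >= maxRetries;
}
```

With a fallback of 10, a configured `0` or negative value normalizes to 10, so the first transient failure (`RetryCount == 1`) no longer parks the row. The preferred option in the recommendation (refuse to dispatch and raise a health metric) would replace the fallback branch with a reported error.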
|
||||
|
||||
**Resolution**

_Unresolved._
|
||||
|
||||
### NotificationOutbox-003 — Dispatcher does not propagate a `CancellationToken` into delivery; in-flight SMTP sends cannot be cancelled on shutdown
|
||||
|
||||
| | |
|--|--|
| Severity | Medium |
| Category | Error handling & resilience |
| Status | Open |
| Location | `src/ScadaLink.NotificationOutbox/NotificationOutboxActor.cs:334`, `src/ScadaLink.NotificationOutbox/Delivery/INotificationDeliveryAdapter.cs:22` |
|
||||
|
||||
**Description**

`DeliverOneAsync` calls `var outcome = await adapter.DeliverAsync(notification);` —
the second `CancellationToken` parameter on `INotificationDeliveryAdapter.DeliverAsync`
is left at its `default(CancellationToken)` value, meaning `CancellationToken.None`.
`EmailNotificationDeliveryAdapter.SendAsync` then threads that `None` token into
`smtp.ConnectAsync`, `smtp.AuthenticateAsync`, and `smtp.SendAsync`. The consequence
is that during a coordinated cluster shutdown (singleton handover, drain) any
in-flight SMTP send is uncancellable and the dispatcher's sweep must wait for the
underlying socket/SMTP timeout (`SmtpConfiguration.ConnectionTimeoutSeconds`) before
the sweep's task completes and `DispatchComplete` lowers the in-flight guard. With
the default connect-timeout values this is on the order of tens of seconds per
notification in the in-progress batch, blocking `CoordinatedShutdown`.

The adapter implementations clearly *expect* a token — the contract type is
`CancellationToken cancellationToken = default` everywhere — so this is a wiring
gap, not a missing interface.
|
||||
|
||||
**Recommendation**

Wire a per-sweep `CancellationTokenSource` linked to the actor's lifecycle (cancel
in `PostStop`) and pass its token into `DeliverAsync`. A linked source per sweep
can also bound individual deliveries by the configured connection timeout if a more
explicit per-attempt budget is wanted. Add a test that cancels mid-`DeliverAsync` and
asserts the dispatcher completes promptly and the row is left non-terminal
(`Pending`/`Retrying` unchanged) for the next sweep.
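
A minimal sketch of that wiring, independent of Akka.NET, might look like the following. `SweepCancellation`, `Stop`, and `TryDeliverAsync` are hypothetical names; in the actor, `Stop()` would correspond to the `PostStop` hook and the `deliver` delegate would stand in for `adapter.DeliverAsync(notification, token)`.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper, for illustration only.
public sealed class SweepCancellation : IDisposable
{
    // Lifecycle source: cancelled once when the owner shuts down.
    private readonly CancellationTokenSource _lifecycle = new();

    // Would be called from the actor's PostStop so in-flight deliveries observe cancellation.
    public void Stop() => _lifecycle.Cancel();

    // Runs one delivery with a token that fires on Stop() or when the per-attempt budget elapses.
    // Returns false when the attempt was cancelled, so the caller can leave the row non-terminal.
    public async Task<bool> TryDeliverAsync(Func<CancellationToken, Task> deliver, TimeSpan perAttemptBudget)
    {
        using var attempt = CancellationTokenSource.CreateLinkedTokenSource(_lifecycle.Token);
        attempt.CancelAfter(perAttemptBudget);
        try
        {
            await deliver(attempt.Token);
            return true;
        }
        catch (OperationCanceledException)
        {
            return false;
        }
    }

    public void Dispose() => _lifecycle.Dispose();
}
```

Treating cancellation as "not attempted" rather than as a transient failure is a design choice of this sketch: it keeps `RetryCount` untouched so a shutdown does not consume retry budget.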
|
||||
|
||||
**Resolution**

_Unresolved._
|
||||
|
||||
### NotificationOutbox-004 — `EmitAttemptAudit`/`EmitTerminalAudit` fire-and-forget pattern can outlive the per-sweep DI scope
|
||||
|
||||
| | |
|--|--|
| Severity | Medium |
| Category | Akka.NET conventions |
| Status | Open |
| Location | `src/ScadaLink.NotificationOutbox/NotificationOutboxActor.cs:425-435`, `463-485` |
|
||||
|
||||
**Description**

Both emission helpers issue `_ = _auditWriter.WriteAsync(evt);` — discarding the
returned task. `CentralAuditWriter.WriteAsync` opens its own `await using var scope =
_services.CreateAsyncScope();` and resolves a scoped `IAuditLogRepository` (verified
at `src/ScadaLink.AuditLog/Central/CentralAuditWriter.cs:118-121`), so the writer is
defensively scope-independent. However the dispatcher already holds a per-sweep
`using var scope = _serviceProvider.CreateScope();` and the per-notification
`UpdateAsync` runs in that scope. The fire-and-forget pattern means:

1. The dispatcher's outer scope can be disposed (sweep done, `DispatchComplete`
   piped) while the audit `WriteAsync` task is still running on a *different*
   scope it owns — works today only because the writer creates its own scope.
2. A faulted unobserved task is silently lost: if `CentralAuditWriter.WriteAsync`
   itself were ever made `async void` or refactored to not internally try/catch,
   the dispatcher would never see the fault and the audit row would vanish without
   the `_logger.LogWarning` reaching the operator.
3. The XML-doc above `EmitAttemptAudit` says "PipeTo is not used because the writer
   never throws" — but the surrounding `try { _ = _auditWriter.WriteAsync(evt); }
   catch (Exception ex)` will only catch a synchronous throw from the *task
   construction*, not the awaited body of `WriteAsync`. The comment understates the
   risk: the catch is structurally unreachable for the documented failure mode.

The system actually wants the *invariant* "audit write never affects delivery"
(verified by the `AuditWriter_Throws_…StillSucceeds` tests). That invariant is
better expressed by `await`-ing the writer inside the actor's outer try/catch (the
dispatcher already swallows per-notification exceptions) than by a discard-task,
which couples the lifetime of the dispatcher's scope to that of the audit task
through whatever scope graph the writer happens to use today.
|
||||
|
||||
**Recommendation**

Either `await _auditWriter.WriteAsync(evt)` inside the existing `try`/`catch` (the
preferred fix — preserves the invariant, plays well with the per-sweep scope, and
makes the catch block actually reachable), or — if a true fire-and-forget remains
desired — capture the returned task and attach a continuation that calls
`_logger.LogWarning` when the task faults, to keep diagnostics intact. Either way, fix the
"writer never throws" XML-doc to match the implementation.
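
The two options can be sketched side by side as below. This is an illustration under assumptions, not the actor's code: `AuditEmission` and its methods are hypothetical, `write` stands in for `_auditWriter.WriteAsync(evt)`, and `logWarning` stands in for `_logger.LogWarning`.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper, for illustration only.
public static class AuditEmission
{
    // Option A: await inside the try/catch, so faults from the awaited body reach the catch.
    public static async Task EmitAwaitedAsync(Func<Task> write, Action<Exception> logWarning)
    {
        try
        {
            await write();
        }
        catch (Exception ex)
        {
            logWarning(ex); // audit failure is logged and never rethrown
        }
    }

    // Option B: keep fire-and-forget, but observe faults through a continuation.
    public static void EmitObserved(Func<Task> write, Action<Exception> logWarning)
    {
        Task task;
        try
        {
            task = write(); // only a synchronous throw can land in this catch
        }
        catch (Exception ex)
        {
            logWarning(ex);
            return;
        }

        // The continuation runs only when the task faults; its own task is deliberately discarded.
        _ = task.ContinueWith(
            t => logWarning(t.Exception!.GetBaseException()),
            CancellationToken.None,
            TaskContinuationOptions.OnlyOnFaulted,
            TaskScheduler.Default);
    }
}
```

Option A keeps the audit write inside the sweep's lifetime at the cost of waiting on audit IO; Option B keeps the dispatcher loop unblocked but still leaves the write free to outlive the sweep.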
|
||||
**Resolution**

_Unresolved._
|
||||
|
||||
### NotificationOutbox-005 — Ingest persistence inherits the CD-015 check-then-act race; under contention the second writer throws and the site retries
|
||||
|
||||
| | |
|--|--|
| Severity | Medium |
| Category | Concurrency & thread safety |
| Status | Open |
| Location | `src/ScadaLink.NotificationOutbox/NotificationOutboxActor.cs:127-132` (caller); root cause in `src/ScadaLink.ConfigurationDatabase/Repositories/NotificationOutboxRepository.cs:33-45` |
|
||||
|
||||
**Description**

`HandleSubmit` → `PersistAsync` calls `repository.InsertIfNotExistsAsync(notification)`
on `INotificationOutboxRepository`. The current implementation
(`src/ScadaLink.ConfigurationDatabase/Repositories/NotificationOutboxRepository.cs`)
does a check-then-act with no duplicate-key catch — documented as CD-015 (High,
Open). The Notification Outbox's documented contract is "at-least-once handoff with
ack-after-persist plus insert-if-not-exists on `NotificationId`" (CLAUDE.md,
Component-NotificationOutbox.md §Ingest & Idempotency), and the duplicate-insert
race is the **expected contention pattern** — the site retries the same submission
after a lost ack. As written, the loser surfaces a `SqlException` (2627 PK
violation) wrapped in `DbUpdateException`, which propagates through `PipeTo`'s failure
projection as a `NotificationSubmitAck { Accepted: false, Error: "... PRIMARY KEY ..." }`;
the site treats the ack as a forwarding failure and forwards the message **again**,
re-entering the same race. If the contending pair keeps racing this can livelock.

The actor side is fine — `PipeTo`'s success/failure projection correctly forwards
the exception message. The repository side needs the standard `2601/2627 → no-op`
pattern that AuditLog and SiteCall already use. This finding tracks the outbox-side
visibility of the CD-015 defect so a re-review of NotificationOutbox surfaces it
even if the reader has not yet read the ConfigurationDatabase findings.
|
||||
|
||||
**Recommendation**

Track CD-015 to resolution. As a defense-in-depth complement here, consider
treating a duplicate-key `DbUpdateException` in the actor's ingest failure
projection as `Accepted: true` so a lost ack between persisted-by-the-first-writer
and ack-back does not produce a permanent re-forward loop — but the cleanest fix
remains the CD-015 raw-SQL `IF NOT EXISTS … INSERT` with `2601/2627` catch in
`NotificationOutboxRepository`.
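
The defense-in-depth projection could be sketched as follows. To stay self-contained it uses stand-in types: `SubmitAck` approximates `NotificationSubmitAck`, and `FakeSqlException` is a hypothetical substitute for the provider's `SqlException` (which exposes the server error number). Real code would inspect the exception nested inside `DbUpdateException` instead.

```csharp
using System;

// Stand-in for NotificationSubmitAck (assumed shape, for illustration only).
public sealed record SubmitAck(bool Accepted, string? Error);

// Hypothetical substitute for SqlException, carrying the SQL Server error number.
public sealed class FakeSqlException : Exception
{
    public FakeSqlException(int number, string message) : base(message) => Number = number;
    public int Number { get; }
}

public static class IngestAckProjection
{
    // SQL Server reports duplicate keys as 2601 (unique index) or 2627 (unique/PK constraint).
    private static bool IsDuplicateKey(Exception? ex)
    {
        for (var current = ex; current is not null; current = current.InnerException)
        {
            if (current is FakeSqlException { Number: 2601 or 2627 })
                return true;
        }
        return false;
    }

    // A duplicate key means another writer already persisted the row, so the submit is acknowledged.
    public static SubmitAck FromFailure(Exception ex) =>
        IsDuplicateKey(ex) ? new SubmitAck(true, null) : new SubmitAck(false, ex.Message);
}
```

Any other failure still yields `Accepted: false`, so the site keeps retrying genuine persistence errors.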
|
||||
|
||||
**Resolution**

_Unresolved._
|
||||
|
||||
### NotificationOutbox-006 — `ResolveAdapters` rebuilds the `NotificationType → adapter` dictionary on every dispatch sweep
|
||||
|
||||
| | |
|--|--|
| Severity | Low |
| Category | Performance & resource management |
| Status | Open |
| Location | `src/ScadaLink.NotificationOutbox/NotificationOutboxActor.cs:267-277` |
|
||||
|
||||
**Description**

Every dispatch sweep calls `ResolveAdapters(scope.ServiceProvider)` which enumerates
`scopedServices.GetServices<INotificationDeliveryAdapter>()` and builds a fresh
`Dictionary<NotificationType, INotificationDeliveryAdapter>`. Adapter registration
is decided at startup (`AddNotificationOutbox` registers
`EmailNotificationDeliveryAdapter`); the registration set does not change at
runtime. With a default `DispatchInterval = 10s` and only ever one entry today, the
allocation overhead is trivial — but the comment "the last adapter registered for a
given type wins, mirroring DI's last-wins resolution semantics" elevates this to a
behaviour contract, and the per-sweep dictionary construction obscures the lookup's
identity from one sweep to the next, making any future stateful adapter (rate
limiter, circuit breaker) silently lose its state.

The per-sweep resolution is needed because `EmailNotificationDeliveryAdapter` is
*scoped*: it holds a scoped `INotificationRepository`. A trivial
cache-the-types-but-resolve-the-instance fix is possible: cache the set of declared
`NotificationType` values and, per sweep, look up each adapter from
`GetServices<INotificationDeliveryAdapter>()` filtered by `Type`.
|
||||
|
||||
**Recommendation**

Document the per-sweep contract explicitly ("each sweep gets a fresh adapter
instance per the scoped DI contract — adapters must not carry state across
sweeps") in the actor XML, or — preferred — cache only the *types* at startup
(`PreStart`) and resolve the scoped instance per sweep, so future adapters with
stateful intent (timeouts, circuit breakers) cannot accidentally lose state.
|
||||
|
||||
**Resolution**

_Unresolved._
|
||||
|
||||
### NotificationOutbox-007 — `NotificationOutboxOptions.DispatchBatchSize`, `DeliveredKpiWindow`, and `PurgeInterval` are not in the design document
|
||||
|
||||
| | |
|--|--|
| Severity | Medium |
| Category | Design-document adherence |
| Status | Open |
| Location | `src/ScadaLink.NotificationOutbox/NotificationOutboxOptions.cs:13`, `:22`, `:25`; `docs/requirements/Component-NotificationOutbox.md:152-160` |
|
||||
|
||||
**Description**

`Component-NotificationOutbox.md` §Configuration enumerates three options: dispatch
interval, stuck-age threshold, and terminal-row retention window. The implemented
`NotificationOutboxOptions` adds three additional fields:

- `DispatchBatchSize` (default `100`) — caps the per-sweep claim size, but is invisible
  to anyone reading only the spec.
- `PurgeInterval` (default `1 day`) — the design doc says "daily purge" as if the
  cadence is fixed; in code it is configurable.
- `DeliveredKpiWindow` (default `1 min`) — the KPI section says "Delivered (last
  interval)" without saying how long "last interval" is or that it is configurable.

The design doc also asserts "Delivery max-retry-count and retry interval are not
part of `NotificationOutboxOptions` — they are reused from the central SMTP
configuration" (line 160) — implementation honours this. But the three additions
above are missing from the design doc. The KPI dashboard cadence and the dispatch
batch size are both operationally important values an operator/engineer will hunt
for; their absence from the spec is design drift.
|
||||
|
||||
**Recommendation**

Add the three fields to `Component-NotificationOutbox.md §Configuration` with their
defaults, or remove them from the implementation if they were meant to be fixed
constants. Cross-link `DeliveredKpiWindow` from the §Monitoring "Delivered (last
interval)" KPI bullet so a reader sees what controls the bucket length.
|
||||
|
||||
**Resolution**

_Unresolved._
|
||||
|
||||
### NotificationOutbox-008 — `FallbackMaxRetries` / `FallbackRetryDelay` path is effectively dead code in production and untested
|
||||
|
||||
| | |
|--|--|
| Severity | Low |
| Category | Testing coverage |
| Status | Open |
| Location | `src/ScadaLink.NotificationOutbox/NotificationOutboxActor.cs:29-31`, `:251-259`; tests in `tests/ScadaLink.NotificationOutbox.Tests/NotificationOutboxActorDispatchTests.cs` |
|
||||
|
||||
**Description**

`ResolveRetryPolicyAsync` falls back to `FallbackMaxRetries = 10` and
`FallbackRetryDelay = 1 min` when `notificationRepository.GetAllSmtpConfigurationsAsync()`
returns an empty list (no SMTP configuration row). The comment correctly observes
that delivery itself will then return `Permanent("No SMTP configuration available")`
from `EmailNotificationDeliveryAdapter.cs:78-81`, so the fallback retry policy
never actually retries anything — the row is permanently parked on first attempt
regardless of retry count or delay.

This produces three concerns. (1) The fallback is essentially dead code — the retry
policy values are never consulted in practice because delivery always fails
permanently before the retry branch is reached. (2) The fallback can be reached
*after* a previously-deployed SMTP config is deleted, which is precisely the
moment an operator needs accurate audit trails; the row will say `Parked` with
`LastError = "No SMTP configuration available"` but the audit signal "retry policy
fell back to defaults" is invisible. (3) Tests never exercise either the fallback
path or the empty-SMTP-config dispatch path: `SetupSmtpRetryPolicy` always supplies
a config in every dispatch test.
|
||||
|
||||
**Recommendation**

Add a regression test that runs a dispatch sweep with no SMTP config row and
asserts the row is parked with the documented error. Optionally remove the fallback
constants if parking-with-no-config is the *intended* operational signal; document
the choice in the actor XML so a maintainer does not "fix" the unreachable code.
|
||||
|
||||
**Resolution**

_Unresolved._
|
||||
|
||||
### NotificationOutbox-009 — `StuckAgeThreshold` XML-doc says "in-progress notification is re-claimed" — contradicts the design's display-only stuck detection
|
||||
|
||||
| | |
|--|--|
| Severity | Low |
| Category | Documentation & comments |
| Status | Open |
| Location | `src/ScadaLink.NotificationOutbox/NotificationOutboxOptions.cs:15-16` |
|
||||
|
||||
**Description**

```csharp
/// <summary>Age past which an in-progress notification is considered stuck and re-claimed.</summary>
public TimeSpan StuckAgeThreshold { get; set; } = TimeSpan.FromMinutes(10);
```

The implementation never reclaims anything based on `StuckAgeThreshold`. It is used
only as a cutoff for the stuck-count KPI (`StuckCutoff`/`IsStuck` in
`NotificationOutboxActor.cs:932-942`) and as a `StuckCutoff` filter on paginated
queries. The design doc is explicit: "A notification is **stuck** if it is `Pending`
or `Retrying` and older than a configurable age threshold (default 10 minutes).
Detection is **display-only** — a count KPI and a row badge. There is no automated
escalation or alerting" (`Component-NotificationOutbox.md:143-145`). A maintainer
reading the XML and expecting "re-claim" behaviour will be surprised twice — once
when no re-claim happens, and once when they go looking for the re-claim code and
find none.
|
||||
|
||||
**Recommendation**

Rewrite the XML to match the design: "Age past which a still-`Pending`/`Retrying`
notification is counted as stuck on the KPI tile and the per-row badge.
Display-only — does not affect dispatch."
|
||||
|
||||
**Resolution**

_Unresolved._
|
||||
|
||||
### NotificationOutbox-010 — Comment claims `PipeTo` is not used "because the writer never throws"; the surrounding try/catch is dead code for the documented failure mode
|
||||
|
||||
| | |
|--|--|
| Severity | Medium |
| Category | Documentation & comments |
| Status | Open |
| Location | `src/ScadaLink.NotificationOutbox/NotificationOutboxActor.cs:469-477` |
|
||||
|
||||
**Description**

```csharp
try
{
    var evt = BuildNotifyDeliverEvent(notification, now, AuditStatus.Attempted, errorMessage)
        with { DurationMs = durationMs };
    // Fire-and-forget — we do NOT await: the dispatcher loop must not
    // be blocked by audit IO, and the writer swallows its own faults.
    // PipeTo is not used because the writer never throws.
    _ = _auditWriter.WriteAsync(evt);
}
catch (Exception ex)
{
    _logger.LogWarning(ex, "Failed to emit Attempted audit row …");
}
```

The XML-doc on `EmitAttemptAudit` is internally inconsistent and structurally
incorrect: (1) if "the writer never throws" then the surrounding try/catch is
unreachable and dead code; (2) if the writer *can* throw (and the catch is
meaningful) then "never throws" is wrong. In practice the catch only ever fires
on a synchronous throw from the writer's *task construction* — never on a fault
in the awaited body — because the discarded task is not observed. The current
behaviour matches the design intent ("audit failure NEVER aborts delivery"), but
the comment misleads the next reader on the *why*.

This is the same root cause as NotificationOutbox-004 — they target the same lines
from different angles (NotificationOutbox-004 is the scope-lifetime /
fire-and-forget Akka concern, NotificationOutbox-010 is the doc/comment-clarity
concern). Closing NotificationOutbox-004 by switching to `await` resolves both.
|
||||
|
||||
**Recommendation**

If `await`-ing the writer (recommended fix per NotificationOutbox-004): delete the
"PipeTo is not used because the writer never throws" line entirely and let
the try/catch's behaviour speak for itself. If keeping fire-and-forget: rewrite
the comment to "fire-and-forget by design (the writer is responsible for its
own failure handling); the surrounding try/catch only catches the synchronous
task-construction throw and is otherwise unreachable."
|
||||
|
||||
**Resolution**

_Unresolved._