Joseph Doherty 9175b0c013 docs(components): accuracy fixes from deep review (batch 3)
NotificationService (Notify.Send returns string not NotificationId;
MaxConcurrentConnections unenforced; AddHttpClient), NotificationOutbox
(one Attempted row always, terminal row only on terminal status), SiteCallAudit
(direct dual-write, no Tell; KPI tiles consumed by CentralUI), HealthMonitoring
(CentralOfflineTimeout 180s = 6x ReportInterval; HealthReportSender gates on
IsActiveNode), SiteEventLogging (active-node purge seam not wired; runs on both
nodes), InboundAPI (whole System.Diagnostics namespace forbidden).
2026-06-03 16:37:15 -04:00

Notification Service

The Notification Service is the central-only component that owns notification-list and SMTP definitions, and supplies the per-channel INotificationDeliveryAdapter implementations that the Notification Outbox invokes at delivery time. Sites never deliver notifications; they store-and-forward notification payloads to central, where this component's adapters perform all actual SMTP sends.

Overview

Notification Service (#8) runs on the central cluster only. Its responsibilities split cleanly into two layers:

  • Definitions: NotificationList and SmtpConfiguration entities stored in the central Configuration Database. Notification lists carry a NotificationType discriminator (Email now; additional types such as Teams are planned). Lists and SMTP config are never deployed to sites.
  • Delivery adapters: per-type implementations of INotificationDeliveryAdapter. The Notification Outbox selects the adapter matching a notification's Type, calls DeliverAsync, and receives a three-way DeliveryOutcome (Success / TransientFailure / PermanentFailure). The adapter owns the full recipient-resolution, connection, authentication, send, and disconnect sequence. EmailNotificationDeliveryAdapter is registered as scoped (it holds a scoped INotificationRepository) and the outbox actor caches a single instance for its lifetime.

The component code lives in src/ZB.MOM.WW.ScadaBridge.NotificationService/. The EmailNotificationDeliveryAdapter that consumes these primitives lives in src/ZB.MOM.WW.ScadaBridge.NotificationOutbox/Delivery/.

Key Concepts

Central-only delivery

Before the current design, site nodes delivered notifications directly over SMTP. That arrangement required SMTP credentials and notification lists to be deployed to every site. The redesign inverts the path: a site script calls Notify.To("list").Send(subject, body), receives a string notification id immediately, and the notification is store-and-forwarded to central. The Notification Outbox on central ingests it and calls the delivery adapter. Sites never open an SMTP connection.

This means:

  • Credential exposure is limited to the central cluster.
  • List membership is resolved at delivery time, so a list change takes effect for all future deliveries without redeploying to sites.
  • The SMTP MaxConcurrentConnections value is configured at a single point, though it is not currently enforced (no connection gate or semaphore).

NotificationType discriminator

NotificationList.Type is a NotificationType enum value (Email currently). The script API Notify.To("listName") is type-agnostic — the calling script does not reference a type. The Notification Outbox reads the type from the central database when it picks up the notification, then selects the matching adapter by INotificationDeliveryAdapter.Type. Adding a new delivery channel means adding a new adapter; existing scripts continue to work.
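The selection step can be pictured as a lookup over the registered adapters. The following is a minimal sketch, not the outbox's actual code: IDeliveryAdapterSketch, EmailAdapterSketch, and AdapterSelector are hypothetical stand-ins, and the Teams enum member is included only to illustrate a type with no adapter registered.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the real types; only the Type property matters here.
public enum NotificationType { Email, Teams }

public interface IDeliveryAdapterSketch
{
    NotificationType Type { get; }
}

public sealed class EmailAdapterSketch : IDeliveryAdapterSketch
{
    public NotificationType Type => NotificationType.Email;
}

public static class AdapterSelector
{
    // Returns the adapter whose Type matches, or null when none is registered for it.
    public static IDeliveryAdapterSketch? Select(
        IEnumerable<IDeliveryAdapterSketch> adapters, NotificationType type)
        => adapters.FirstOrDefault(a => a.Type == type);
}
```

Under this shape, supporting a new channel would only require registering one more adapter; the selector and the calling scripts stay unchanged.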

Per-delivery SMTP client lifetime

MailKitSmtpClientWrapper wraps a single MailKit.Net.Smtp.SmtpClient. MailKit's client is not thread-safe and holds one TCP/TLS connection. The DI registration is therefore a factory, not a singleton wrapper:

services.AddSingleton<Func<ISmtpClientWrapper>>(_ => () => new MailKitSmtpClientWrapper());

EmailNotificationDeliveryAdapter.SendAsync invokes the factory at the top of each delivery attempt, runs connect → authenticate → send → disconnect on the fresh wrapper, and disposes it in a finally block. Each delivery pays a full TCP+TLS handshake; this is the deliberate cost of avoiding shared connection state between concurrent outbox dispatches. The factory shape allows a future pooled implementation to be slotted in without changing callers.
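The per-attempt lifetime described above can be sketched as follows. This is an illustration under assumptions: ISmtpClientSketch is a hypothetical, reduced wrapper surface (the real ISmtpClientWrapper members are not shown in this document, and the authenticate step is omitted), and RecordingSmtpClient is an in-memory fake used only to show the call order.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical minimal wrapper surface, not the real ISmtpClientWrapper.
public interface ISmtpClientSketch : IDisposable
{
    Task ConnectAsync();
    Task SendAsync(string message);
    Task DisconnectAsync();
}

// In-memory fake that records the order of calls.
public sealed class RecordingSmtpClient : ISmtpClientSketch
{
    public List<string> Calls { get; } = new();
    public Task ConnectAsync() { Calls.Add("connect"); return Task.CompletedTask; }
    public Task SendAsync(string message) { Calls.Add("send"); return Task.CompletedTask; }
    public Task DisconnectAsync() { Calls.Add("disconnect"); return Task.CompletedTask; }
    public void Dispose() => Calls.Add("dispose");
}

public static class PerAttemptSender
{
    // Creates a fresh client per attempt and always disposes it, even when a step throws.
    public static async Task SendOnceAsync(Func<ISmtpClientSketch> factory, string message)
    {
        var client = factory();
        try
        {
            await client.ConnectAsync();
            await client.SendAsync(message);
            await client.DisconnectAsync();
        }
        finally
        {
            client.Dispose();
        }
    }
}
```

Because callers depend only on the Func factory, a pooled implementation could later hand out leased clients without changing SendOnceAsync.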

Architecture

Primitives registered by AddNotificationService

ServiceCollectionExtensions.AddNotificationService is the single DI entry point, called on the central composition root only:

public static IServiceCollection AddNotificationService(this IServiceCollection services)
{
    services.AddOptions<NotificationOptions>()
        .BindConfiguration("ScadaBridge:Notification");

    services.AddHttpClient();
    services.AddSingleton<OAuth2TokenService>();
    services.AddSingleton<Func<ISmtpClientWrapper>>(_ => () => new MailKitSmtpClientWrapper());

    return services;
}

Four things are registered: the NotificationOptions fallback values, the HttpClient infrastructure (required by OAuth2TokenService), the OAuth2TokenService token cache, and the ISmtpClientWrapper factory. The EmailNotificationDeliveryAdapter itself is registered by ZB.MOM.WW.ScadaBridge.NotificationOutbox, which depends on this project.

INotificationDeliveryAdapter

public interface INotificationDeliveryAdapter
{
    NotificationType Type { get; }
    Task<DeliveryOutcome> DeliverAsync(
        Notification notification,
        CancellationToken cancellationToken = default);
}

The DeliveryOutcome record carries a DeliveryResult (Success / TransientFailure / PermanentFailure), ResolvedTargets (a snapshotted string of the concrete recipients, written to the Notifications audit row on success), and an Error string on failure.

Email delivery sequence

EmailNotificationDeliveryAdapter.DeliverAsync runs this sequence, classifying every failure before returning:

  1. Resolve list: calls INotificationRepository.GetListByNameAsync. An unknown list returns Permanent immediately (the list was deleted; retrying cannot fix it).
  2. Resolve recipients: calls GetRecipientsByListIdAsync. An empty list returns Permanent.
  3. Resolve SMTP config: calls GetAllSmtpConfigurationsAsync and takes the first row. No config returns Permanent.
  4. Parse TLS mode: calls SmtpTlsModeParser.Parse(smtpConfig.TlsMode). An unrecognised string throws ArgumentException; DeliverAsync catches it and returns Permanent (config fault, not a transient network condition).
  5. Validate addresses: calls EmailAddressValidator.ValidateAddresses(fromAddress, recipients). A malformed address returns Permanent.
  6. Send: calls the private SendAsync, which connects, authenticates, sends, and disconnects via a fresh ISmtpClientWrapper.

SendAsync maps SmtpCommandException 5xx responses to SmtpPermanentException, then lets it propagate. DeliverAsync maps the failures as follows: SmtpPermanentException → Permanent; SMTP 4xx / socket / protocol / timeout exceptions → Transient (via SmtpErrorClassifier); unclassified exceptions (e.g., OAuth2 token fetch failure) → Permanent (retrying a broken credential wastes token-endpoint calls).
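The three-way mapping can be summarised in a small pure function. This is a sketch under assumptions, not the adapter's code: DeliveryResultSketch and SmtpPermanentExceptionSketch are hypothetical stand-ins, and the isTransient delegate stands in for the SmtpErrorClassifier verdict.

```csharp
#nullable enable
using System;

public enum DeliveryResultSketch { Success, TransientFailure, PermanentFailure }

// Hypothetical stand-in for the SmtpPermanentException named above.
public sealed class SmtpPermanentExceptionSketch : Exception
{
    public SmtpPermanentExceptionSketch(string message) : base(message) { }
}

public static class OutcomeMapper
{
    // ex is null when the send completed; isTransient stands in for the classifier.
    public static DeliveryResultSketch Map(Exception? ex, Func<Exception, bool> isTransient)
    {
        if (ex is null) return DeliveryResultSketch.Success;
        if (ex is SmtpPermanentExceptionSketch) return DeliveryResultSketch.PermanentFailure;
        if (isTransient(ex)) return DeliveryResultSketch.TransientFailure;
        // Unclassified failures (for example a token fetch error) are not retried.
        return DeliveryResultSketch.PermanentFailure;
    }
}
```

The default branch is the notable design choice: anything the classifier cannot place is treated as permanent rather than retried.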

SMTP error classification

SmtpErrorClassifier.Classify uses MailKit's typed exceptions and the numeric SmtpStatusCode rather than message substring matching:

public static SmtpErrorClass Classify(Exception ex, CancellationToken cancellationToken)
{
    if (ex is OperationCanceledException && cancellationToken.IsCancellationRequested)
        return SmtpErrorClass.Unknown;

    if (ex is SmtpCommandException command)
    {
        var code = (int)command.StatusCode;
        if (code >= 400 && code < 500) return SmtpErrorClass.Transient;
        if (code >= 500 && code < 600) return SmtpErrorClass.Permanent;
        return SmtpErrorClass.Unknown;
    }

    if (ex is SmtpProtocolException
        or ServiceNotConnectedException
        or SocketException
        or TimeoutException)
        return SmtpErrorClass.Transient;

    return SmtpErrorClass.Unknown;
}

A Permanent classification inside SendAsync is wrapped in SmtpPermanentException so the outer DeliverAsync catch filter can identify it cleanly.

OAuth2 token lifecycle

OAuth2TokenService.GetTokenAsync fetches tokens for Microsoft 365 Client Credentials SMTP. Credentials are supplied as tenantId:clientId:clientSecret. Tokens are cached in a ConcurrentDictionary keyed by a SHA-256 hash of the credential string (NS-006), so distinct SMTP configurations never share a token. A per-credential SemaphoreSlim prevents thundering-herd refreshes. Tokens are refreshed 60 seconds before the reported expires_in expiry. Only the tenant is logged — the client secret and token value are never written to logs.
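A cache with these properties might look like the sketch below. It is illustrative only: TokenCacheSketch, its fetch delegate, and the explicit now parameter are assumptions made so the example is self-contained and deterministic; the real service calls the token endpoint over HTTP.

```csharp
using System;
using System.Collections.Concurrent;
using System.Security.Cryptography;
using System.Text;
using System.Threading;
using System.Threading.Tasks;

public sealed class TokenCacheSketch
{
    private sealed record Entry(string Token, DateTimeOffset ExpiresAt);

    // Refresh this long before the reported expiry, as described above.
    private static readonly TimeSpan RefreshSkew = TimeSpan.FromSeconds(60);
    private readonly ConcurrentDictionary<string, Entry> _tokens = new();
    private readonly ConcurrentDictionary<string, SemaphoreSlim> _locks = new();

    // Hashes the packed credential so the raw secret is never used as a dictionary key.
    public static string KeyFor(string credentials) =>
        Convert.ToHexString(SHA256.HashData(Encoding.UTF8.GetBytes(credentials)));

    // fetch is a hypothetical delegate returning (token, lifetime) from the token endpoint.
    public async Task<string> GetAsync(
        string credentials,
        Func<Task<(string Token, TimeSpan Lifetime)>> fetch,
        DateTimeOffset now)
    {
        var key = KeyFor(credentials);
        if (_tokens.TryGetValue(key, out var hit) && now < hit.ExpiresAt - RefreshSkew)
            return hit.Token;

        // One gate per credential, so only one caller refreshes a given token.
        var gate = _locks.GetOrAdd(key, _ => new SemaphoreSlim(1, 1));
        await gate.WaitAsync();
        try
        {
            // Re-check: another caller may have refreshed while this one waited.
            if (_tokens.TryGetValue(key, out hit) && now < hit.ExpiresAt - RefreshSkew)
                return hit.Token;

            var (token, lifetime) = await fetch();
            _tokens[key] = new Entry(token, now + lifetime);
            return token;
        }
        finally
        {
            gate.Release();
        }
    }
}
```

The second lookup inside the gate is what prevents a thundering herd: callers that queued behind the first refresh reuse its result instead of fetching again.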

Credential redaction

CredentialRedactor.Scrub(text, credentials) masks the full packed credential string and its trailing colon-component (password or clientSecret) in any text before it reaches a log line. Components shorter than 12 characters are not masked — a short username such as root would otherwise mask unrelated diagnostic text. All SMTP error paths in EmailNotificationDeliveryAdapter pass exception messages through Scrub before logging.
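The masking rule can be sketched roughly as below. This is not the real CredentialRedactor: RedactorSketch, the "***" mask text, and the decision to apply the 12-character threshold only to the trailing component are assumptions made for illustration.

```csharp
#nullable enable
using System;

public static class RedactorSketch
{
    private const int MinMaskLength = 12; // threshold stated above
    private const string Mask = "***";    // hypothetical replacement text

    public static string Scrub(string text, string? credentials)
    {
        if (string.IsNullOrEmpty(text) || string.IsNullOrEmpty(credentials)) return text;

        // Mask the whole packed string first so its parts cannot leak through it.
        var scrubbed = text.Replace(credentials, Mask, StringComparison.Ordinal);

        // Then mask the last colon-separated component (password or client secret),
        // but only when it is long enough not to collide with ordinary words.
        var secret = credentials[(credentials.LastIndexOf(':') + 1)..];
        if (secret.Length >= MinMaskLength)
            scrubbed = scrubbed.Replace(secret, Mask, StringComparison.Ordinal);

        return scrubbed;
    }
}
```

With these assumptions, a message that echoes the secret on its own is masked, while a short component such as root leaves unrelated text untouched.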

Usage

Script API

Site scripts do not interact with this component directly. The script surface is:

// Returns a string notification id immediately — does not block for delivery.
string id = await Notify.To("Shift-Supervisors").Send("Tank overflow", "Tank T-03 is at 98%");

// Site-local while still in the S&F buffer; round-trips to central once forwarded.
NotificationDeliveryStatus status = await Notify.Status(id);

Notify.To("list") is type-agnostic. The notification id is a 32-character "N"-format GUID string (Guid.NewGuid().ToString("N")) generated at the site. Notify.Status(string notificationId) returns a NotificationDeliveryStatus record with Status, RetryCount, LastError, and DeliveredAt. Status is Forwarding while the notification is still site-local in the S&F buffer; Unknown if there is no central row and the notification is not in the S&F buffer; otherwise one of Pending, Retrying, Delivered, Parked, or Discarded as reported by central.
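The id format is easy to reproduce and to validate. The helper below is a hypothetical illustration (NotificationIdSketch is not part of the script API); it uses only the standard Guid formatting and parsing calls.

```csharp
using System;

public static class NotificationIdSketch
{
    // "N" format: 32 hexadecimal digits, no hyphens or braces.
    public static string NewId() => Guid.NewGuid().ToString("N");

    // True only for strings that parse as an "N"-format GUID.
    public static bool IsWellFormed(string id) => Guid.TryParseExact(id, "N", out _);
}
```

A check like IsWellFormed could be used to reject obviously malformed ids before a status query is sent to central.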

Registering the adapter

On the central host, both projects are registered. The Notification Outbox registers EmailNotificationDeliveryAdapter as a scoped concrete type and as a scoped INotificationDeliveryAdapter; the outbox actor resolves adapters by enumerating IEnumerable<INotificationDeliveryAdapter> (no keyed/named registration). AddNotificationService is called to register the shared SMTP primitives:

// Central composition root (simplified)
services.AddNotificationService();
services.AddNotificationOutbox();   // registers EmailNotificationDeliveryAdapter

Configuration

NotificationOptions is bound from ScadaBridge:Notification. These values are fallbacks — when a SmtpConfiguration row has a non-positive value for a field, the adapter uses the option value instead. A positive value on the row always takes precedence.

| Section | Key | Default | Description |
| --- | --- | --- | --- |
| ScadaBridge:Notification | ConnectionTimeoutSeconds | 30 | SMTP connection/operation timeout in seconds. Applied when SmtpConfiguration.ConnectionTimeoutSeconds is zero or negative. |
| ScadaBridge:Notification | MaxConcurrentConnections | 5 | Maximum concurrent SMTP connections. Used as the documented fallback default when the SmtpConfiguration row is unset; this limit is not currently enforced by a connection gate or semaphore. |
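The fallback rule for these options reduces to one comparison. The helper below is a hypothetical sketch (OptionFallback is not a type from the codebase) showing that a positive row value wins and anything else defers to the bound option.

```csharp
public static class OptionFallback
{
    // A positive value on the SmtpConfiguration row takes precedence;
    // zero or negative falls back to the NotificationOptions value.
    public static int Resolve(int rowValue, int optionValue) =>
        rowValue > 0 ? rowValue : optionValue;
}
```

For example, a row with ConnectionTimeoutSeconds of 0 and the default option of 30 would resolve to 30 under this rule, while a row value of 10 would resolve to 10.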

SMTP retry settings (MaxRetries, RetryDelay) live on the SmtpConfiguration entity and are read by the Notification Outbox dispatcher — they are not part of NotificationOptions.

SmtpConfiguration entity fields

| Field | Type | Notes |
| --- | --- | --- |
| Host | string | SMTP server hostname or IP. |
| Port | int | e.g., 587 for StartTLS, 465 for SSL. |
| AuthType | string | basic or oauth2. |
| Credentials | string? | Basic: username:password. OAuth2: tenantId:clientId:clientSecret. |
| TlsMode | string? | None, StartTLS, or SSL. Null/empty defaults to StartTls. |
| FromAddress | string | Sender address in the From header. Also the XOAUTH2 user= identity for M365. |
| ConnectionTimeoutSeconds | int | 0 → falls back to NotificationOptions. |
| MaxConcurrentConnections | int | 0 → falls back to NotificationOptions. |
| MaxRetries | int | Read by Notification Outbox. |
| RetryDelay | TimeSpan | Read by Notification Outbox. |

NotificationList entity fields

| Field | Type | Notes |
| --- | --- | --- |
| Name | string | Unique list name. Passed as Notify.To("name"). |
| Type | NotificationType | Enum discriminator. Currently Email only. |
| Recipients | ICollection<NotificationRecipient> | Resolved at delivery time by the adapter. |

Each NotificationRecipient carries Name (display) and EmailAddress.

Dependencies & Interactions

  • Commons (#16) — owns NotificationList, NotificationRecipient, SmtpConfiguration, Notification, NotificationType, NotificationStatus, INotificationRepository, and the NotificationSubmit / NotificationSubmitAck / NotificationStatusQuery / NotificationStatusResponse / NotificationDeliveryStatus message contracts.
  • Configuration Database (#17) — persists NotificationList, NotificationRecipient, and SmtpConfiguration. Implements INotificationRepository. The EmailNotificationDeliveryAdapter resolves lists and recipients via this repository at delivery time.
  • Notification Outbox (#21) — the central dispatch counterpart. The Notification Outbox registers EmailNotificationDeliveryAdapter, drives retry and parking, and owns the Notifications audit table. Notification Service supplies the SMTP primitives (ISmtpClientWrapper factory, OAuth2TokenService, SmtpErrorClassifier, CredentialRedactor, EmailAddressValidator); Notification Outbox owns when and how often DeliverAsync is called.
  • Store-and-Forward Engine (#6) — site-side buffer. Site scripts hand notifications to the S&F engine, which forwards them to central. The Notification Service has no direct interaction with the site S&F engine; by the time DeliverAsync is called, the notification has already been ingested by the Notification Outbox.
  • Security & Auth (#10) — Design role is required to manage notification lists and SMTP configuration.
  • Design spec: Component-NotificationService.md.

Troubleshooting

A notification is Parked with a permanent failure

A PermanentFailure outcome means EmailNotificationDeliveryAdapter determined that retrying cannot fix the failure. Common root causes:

| Symptom | Cause | Fix |
| --- | --- | --- |
| "Notification list '…' not found" | List was renamed or deleted after the notification was submitted. | Recreate the list or discard the notification in the Central UI Outbox page. |
| "Notification list '…' has no recipients" | List exists but has no recipient rows. | Add recipients to the list. |
| "No SMTP configuration available" | No SmtpConfiguration row exists. | Add an SMTP configuration in Central UI. |
| "Unknown SMTP TLS mode '…'" | TlsMode field contains a value other than None, StartTLS, or SSL. | Correct the TlsMode value. |
| "Invalid sender (from) email address" or "Invalid recipient email address(es)" | Malformed address in the SmtpConfiguration.FromAddress or in a NotificationRecipient.EmailAddress. | Correct the address; the adapter validates via MailboxAddress.TryParse. |
| SMTP 5xx reply | Server rejected the message permanently (e.g., mailbox not found, policy block). | Check the LastError field on the Notifications row. The error text has credentials redacted. |
| OAuth2 credential parse error | Credentials field is not in tenantId:clientId:clientSecret format. | Correct the credentials on the SMTP configuration. |

Transient failures retrying indefinitely

The retry count and delay come from SmtpConfiguration.MaxRetries and RetryDelay, enforced by the Notification Outbox. Once MaxRetries is exhausted, the Notification Outbox moves the row to Parked. If a notification stays in Retrying longer than expected, check whether MaxRetries is set to a non-zero value on the SmtpConfiguration row and that the Notification Outbox actor is running on the active central node.

OAuth2 token not refreshing

OAuth2TokenService caches tokens per credential hash. Because the service is registered as a singleton, the cache lives as long as the host process; restarting the host clears it, and the next GetTokenAsync call fetches a fresh token. If token fetches fail repeatedly (network partition to login.microsoftonline.com, wrong tenant/client/secret), the failure surfaces as an unclassified exception in DeliverAsync and the notification is parked with a permanent failure. The log line includes the tenant ID but not the secret.