# Notification Service

The Notification Service is the central-only component that owns notification-list and SMTP definitions and supplies the per-channel `INotificationDeliveryAdapter` implementations that the Notification Outbox invokes at delivery time. Sites never deliver notifications; they store-and-forward notification payloads to central, where this component's adapters perform all actual SMTP sends.

## Overview

Notification Service (#8) runs on the central cluster only. Its responsibilities split cleanly into two layers:

- **Definitions**: `NotificationList` and `SmtpConfiguration` entities stored in the central Configuration Database. Notification lists carry a `NotificationType` discriminator (`Email` now; additional types such as `Teams` are planned). Lists and SMTP config are never deployed to sites.
- **Delivery adapters**: stateless, per-type implementations of `INotificationDeliveryAdapter`. The Notification Outbox selects the adapter matching a notification's `Type`, calls `DeliverAsync`, and receives a three-way `DeliveryOutcome` (`Success` / `TransientFailure` / `PermanentFailure`). The adapter owns the full recipient-resolution, connection, authentication, send, and disconnect sequence.

The component code lives in `src/ZB.MOM.WW.ScadaBridge.NotificationService/`. The `EmailNotificationDeliveryAdapter` that consumes these primitives lives in `src/ZB.MOM.WW.ScadaBridge.NotificationOutbox/Delivery/`.

## Key Concepts

### Central-only delivery

Before the current design, site nodes delivered notifications directly over SMTP. That arrangement required SMTP credentials and notification lists to be deployed to every site. The redesign inverts the path: a site script calls `Notify.To("list").Send(subject, body)`, receives a `NotificationId` GUID immediately, and the notification is store-and-forwarded to central. The Notification Outbox on central ingests it and calls the delivery adapter. Sites never open an SMTP connection.

This means:

- Credential exposure is limited to the central cluster.
- List membership is resolved at delivery time, so a list change takes effect for all future deliveries without redeploying to sites.
- The SMTP `MaxConcurrentConnections` limit is enforced at a single point.

### `NotificationType` discriminator

`NotificationList.Type` is a `NotificationType` enum value (`Email` currently). The script API `Notify.To("listName")` is type-agnostic: the calling script does not reference a type. The Notification Outbox reads the type from the central database when it picks up the notification, then selects the matching adapter by `INotificationDeliveryAdapter.Type`. Adding a new delivery channel means adding a new adapter; existing scripts continue to work.

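The selection step can be sketched as a lookup over the registered adapters. This is a minimal illustration, not the Notification Outbox's actual dispatch code: `ITypedAdapter`, `FakeEmailAdapter`, and `AdapterSelector` are hypothetical stand-ins, and the `Teams` member is included only because the type is described as planned.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the real Commons types; only the Type property matters here.
public enum NotificationType { Email, Teams }

public interface ITypedAdapter
{
    NotificationType Type { get; }
}

public sealed class FakeEmailAdapter : ITypedAdapter
{
    public NotificationType Type => NotificationType.Email;
}

public static class AdapterSelector
{
    // Returns the first adapter whose Type matches, or throws if none is registered.
    public static ITypedAdapter Select(IEnumerable<ITypedAdapter> adapters, NotificationType type)
    {
        return adapters.FirstOrDefault(a => a.Type == type)
            ?? throw new InvalidOperationException($"No delivery adapter registered for type '{type}'.");
    }
}
```

Because the lookup is keyed on the adapter's own `Type`, registering an additional adapter is enough to support a new channel; nothing on the script side changes.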
### Per-delivery SMTP client lifetime

`MailKitSmtpClientWrapper` wraps a single `MailKit.Net.Smtp.SmtpClient`. MailKit's client is not thread-safe and holds one TCP/TLS connection. The DI registration is therefore a **factory**, not a singleton wrapper:

```csharp
services.AddSingleton<Func<ISmtpClientWrapper>>(_ => () => new MailKitSmtpClientWrapper());
```

`EmailNotificationDeliveryAdapter.SendAsync` invokes the factory at the top of each delivery attempt, runs connect → authenticate → send → disconnect on the fresh wrapper, and disposes it in a `finally` block. Each delivery pays a full TCP+TLS handshake; this is the deliberate cost of avoiding shared connection state between concurrent outbox dispatches. The factory shape allows a future pooled implementation to be slotted in without changing callers.

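The per-attempt lifetime described above can be sketched as follows. The members of the real `ISmtpClientWrapper` are not shown in this document, so `ISketchSmtpClient`, `RecordingSmtpClient`, and `PerDeliverySender` are hypothetical names with an assumed method shape; the point is only the factory call, the fixed call order, and disposal in `finally`.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical wrapper shape (assumption); not the real ISmtpClientWrapper.
public interface ISketchSmtpClient : IDisposable
{
    Task ConnectAsync(CancellationToken ct);
    Task AuthenticateAsync(CancellationToken ct);
    Task SendAsync(string subject, string body, CancellationToken ct);
    Task DisconnectAsync(CancellationToken ct);
}

// In-memory test double that records the calls it receives.
public sealed class RecordingSmtpClient : ISketchSmtpClient
{
    public List<string> Calls { get; } = new List<string>();
    public bool FailOnSend { get; set; }

    public Task ConnectAsync(CancellationToken ct) { Calls.Add("connect"); return Task.CompletedTask; }
    public Task AuthenticateAsync(CancellationToken ct) { Calls.Add("authenticate"); return Task.CompletedTask; }
    public Task SendAsync(string subject, string body, CancellationToken ct)
    {
        Calls.Add("send");
        if (FailOnSend) throw new InvalidOperationException("simulated send failure");
        return Task.CompletedTask;
    }
    public Task DisconnectAsync(CancellationToken ct) { Calls.Add("disconnect"); return Task.CompletedTask; }
    public void Dispose() => Calls.Add("dispose");
}

public static class PerDeliverySender
{
    public static async Task SendOnceAsync(
        Func<ISketchSmtpClient> clientFactory, string subject, string body,
        CancellationToken ct = default)
    {
        // A fresh client per attempt: no connection state is shared between concurrent deliveries.
        ISketchSmtpClient client = clientFactory();
        try
        {
            await client.ConnectAsync(ct);
            await client.AuthenticateAsync(ct);
            await client.SendAsync(subject, body, ct);
            await client.DisconnectAsync(ct);
        }
        finally
        {
            // Dispose even when a step throws, so the connection is always released.
            client.Dispose();
        }
    }
}
```

With this shape, a failed send skips the disconnect step but still reaches `Dispose`, which is the behaviour the `finally` block exists to guarantee.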
## Architecture

### Primitives registered by `AddNotificationService`

`ServiceCollectionExtensions.AddNotificationService` is the single DI entry point, called on the central composition root only:

```csharp
public static IServiceCollection AddNotificationService(this IServiceCollection services)
{
    // Fallback values, bound from the ScadaBridge:Notification configuration section.
    services.AddOptions<NotificationOptions>()
        .BindConfiguration("ScadaBridge:Notification");

    // Registers IHttpClientFactory and its related services.
    services.AddHttpClient();
    services.AddSingleton<OAuth2TokenService>();
    // Factory, not a singleton wrapper: each delivery attempt gets its own client.
    services.AddSingleton<Func<ISmtpClientWrapper>>(_ => () => new MailKitSmtpClientWrapper());

    return services;
}
```

Apart from the `AddHttpClient()` call, which registers `IHttpClientFactory`, three things are registered: the `NotificationOptions` fallback values, the `OAuth2TokenService` token cache, and the `ISmtpClientWrapper` factory. The `EmailNotificationDeliveryAdapter` itself is registered by `ZB.MOM.WW.ScadaBridge.NotificationOutbox`, which depends on this project.

### `INotificationDeliveryAdapter`

```csharp
public interface INotificationDeliveryAdapter
{
    NotificationType Type { get; }
    Task<DeliveryOutcome> DeliverAsync(
        Notification notification,
        CancellationToken cancellationToken = default);
}
```

The `DeliveryOutcome` record carries a `DeliveryResult` (`Success` / `TransientFailure` / `PermanentFailure`), `ResolvedTargets` (a snapshotted string of the concrete recipients, written to the `Notifications` audit row on success), and an `Error` string on failure.

### Email delivery sequence

`EmailNotificationDeliveryAdapter.DeliverAsync` runs this sequence, classifying every failure before returning:

1. **Resolve list**: calls `INotificationRepository.GetListByNameAsync`. An unknown list returns `Permanent` immediately (the list was deleted; retrying cannot fix it).
2. **Resolve recipients**: calls `GetRecipientsByListIdAsync`. An empty list returns `Permanent`.
3. **Resolve SMTP config**: calls `GetAllSmtpConfigurationsAsync` and takes the first row. No config returns `Permanent`.
4. **Parse TLS mode**: `SmtpTlsModeParser.Parse(smtpConfig.TlsMode)`. An unrecognised string returns `Permanent` (a config fault, not a transient network condition).
5. **Validate addresses**: `EmailAddressValidator.ValidateAddresses(fromAddress, recipients)`. A malformed address returns `Permanent`.
6. **Send**: calls the private `SendAsync`, which connects, authenticates, sends, and disconnects via a fresh `ISmtpClientWrapper`.

`SendAsync` maps `SmtpCommandException` 5xx responses to `SmtpPermanentException`, then lets it propagate. `DeliverAsync` maps the resulting exceptions as follows:

- `SmtpPermanentException` becomes `Permanent`.
- SMTP 4xx, socket, protocol, and timeout exceptions become `Transient` (via `SmtpErrorClassifier`).
- Unclassified exceptions (for example, an OAuth2 token fetch failure) become `Permanent`, because retrying a broken credential wastes token-endpoint calls.

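The exception-to-outcome mapping above can be sketched with catch filters. This is an illustration under stated assumptions rather than the adapter's code: `SketchResult`, `SketchErrorClass`, `SketchPermanentException`, and `OutcomeMapper` are hypothetical names, the classifier is passed in as a delegate, and cancellation handling is deliberately left out because this document does not describe it.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical stand-ins for DeliveryResult, SmtpErrorClass, and SmtpPermanentException.
public enum SketchResult { Success, TransientFailure, PermanentFailure }
public enum SketchErrorClass { Transient, Permanent, Unknown }

public sealed class SketchPermanentException : Exception
{
    public SketchPermanentException(string message) : base(message) { }
}

public static class OutcomeMapper
{
    public static async Task<SketchResult> DeliverAsync(
        Func<Task> send, Func<Exception, SketchErrorClass> classify)
    {
        try
        {
            await send();
            return SketchResult.Success;
        }
        catch (SketchPermanentException)
        {
            // The send step already decided this cannot succeed on retry (for example, SMTP 5xx).
            return SketchResult.PermanentFailure;
        }
        catch (Exception ex) when (classify(ex) == SketchErrorClass.Transient)
        {
            // 4xx, socket, protocol, and timeout errors are worth retrying.
            return SketchResult.TransientFailure;
        }
        catch (Exception)
        {
            // Anything the classifier does not recognise is treated as permanent.
            return SketchResult.PermanentFailure;
        }
    }
}
```

The order of the `catch` clauses matters: the dedicated permanent exception is checked first, so a classifier that does not know about it can never downgrade it to a retry.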
### SMTP error classification

`SmtpErrorClassifier.Classify` uses MailKit's typed exceptions and the numeric `SmtpStatusCode` rather than message substring matching:

```csharp
// Requires: using MailKit; using MailKit.Net.Smtp; using System.Net.Sockets;
public static SmtpErrorClass Classify(Exception ex, CancellationToken cancellationToken)
{
    // A caller-requested cancellation is neither transient nor permanent.
    if (ex is OperationCanceledException && cancellationToken.IsCancellationRequested)
        return SmtpErrorClass.Unknown;

    if (ex is SmtpCommandException command)
    {
        // SmtpStatusCode values are the numeric SMTP reply codes.
        var code = (int)command.StatusCode;
        if (code >= 400 && code < 500) return SmtpErrorClass.Transient;
        if (code >= 500 && code < 600) return SmtpErrorClass.Permanent;
        return SmtpErrorClass.Unknown;
    }

    // Connection-level failures are worth retrying.
    if (ex is SmtpProtocolException
        or ServiceNotConnectedException
        or SocketException
        or TimeoutException)
        return SmtpErrorClass.Transient;

    return SmtpErrorClass.Unknown;
}
```

A `Permanent` classification inside `SendAsync` is wrapped in `SmtpPermanentException` so the outer `DeliverAsync` catch filter can identify it cleanly.

### OAuth2 token lifecycle

`OAuth2TokenService.GetTokenAsync` fetches tokens for Microsoft 365 Client Credentials SMTP. Credentials are supplied as `tenantId:clientId:clientSecret`. Tokens are cached in a `ConcurrentDictionary` keyed by a SHA-256 hash of the credential string (NS-006), so distinct SMTP configurations never share a token. A per-credential `SemaphoreSlim` prevents thundering-herd refreshes. Tokens are refreshed 60 seconds before the reported `expires_in` expiry. Only the tenant is logged; the client secret and token value are never written to logs.

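The caching scheme described above can be sketched as below. This is not the `OAuth2TokenService` source: `TokenCacheSketch` is a hypothetical class, the token endpoint call is abstracted behind an injected `fetch` delegate, and the clock is injected so the 60-second margin can be exercised without waiting.

```csharp
using System;
using System.Collections.Concurrent;
using System.Security.Cryptography;
using System.Text;
using System.Threading;
using System.Threading.Tasks;

public sealed class TokenCacheSketch
{
    private sealed record Entry(string Token, DateTimeOffset ExpiresAt);

    private static readonly TimeSpan RefreshMargin = TimeSpan.FromSeconds(60);
    private readonly ConcurrentDictionary<string, Entry> _tokens = new();
    private readonly ConcurrentDictionary<string, SemaphoreSlim> _locks = new();
    private readonly Func<string, CancellationToken, Task<(string Token, int ExpiresInSeconds)>> _fetch;
    private readonly Func<DateTimeOffset> _now;

    public TokenCacheSketch(
        Func<string, CancellationToken, Task<(string Token, int ExpiresInSeconds)>> fetch,
        Func<DateTimeOffset> now)
    {
        _fetch = fetch;
        _now = now;
    }

    public async Task<string> GetTokenAsync(string credentials, CancellationToken ct = default)
    {
        // Key by a hash so the raw credential string is never used as a dictionary key.
        string key = Convert.ToHexString(SHA256.HashData(Encoding.UTF8.GetBytes(credentials)));

        if (_tokens.TryGetValue(key, out var cached) && _now() < cached.ExpiresAt - RefreshMargin)
            return cached.Token;

        // One gate per credential: concurrent callers wait instead of all hitting the endpoint.
        SemaphoreSlim gate = _locks.GetOrAdd(key, _ => new SemaphoreSlim(1, 1));
        await gate.WaitAsync(ct);
        try
        {
            // Re-check: another caller may have refreshed while this one waited.
            if (_tokens.TryGetValue(key, out cached) && _now() < cached.ExpiresAt - RefreshMargin)
                return cached.Token;

            var (token, expiresIn) = await _fetch(credentials, ct);
            _tokens[key] = new Entry(token, _now().AddSeconds(expiresIn));
            return token;
        }
        finally
        {
            gate.Release();
        }
    }
}
```

The second cache check inside the gate is what prevents the thundering herd: callers that queued behind a refresh find the new token and return without fetching again.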
### Credential redaction

`CredentialRedactor.Scrub(text, credentials)` masks the full packed credential string and its trailing colon-component (password or `clientSecret`) in any text before it reaches a log line. Components shorter than 12 characters are not masked, because a short username such as `root` would otherwise mask unrelated diagnostic text. All SMTP error paths in `EmailNotificationDeliveryAdapter` pass exception messages through `Scrub` before logging.

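A minimal sketch of the masking rule follows. `RedactorSketch` is a hypothetical class, not the real `CredentialRedactor`; the `***` placeholder and the choice to apply the 12-character threshold to the full packed string as well as the trailing component are assumptions of this sketch.

```csharp
using System;

public static class RedactorSketch
{
    private const int MinMaskLength = 12; // threshold stated in this document
    private const string Mask = "***";    // placeholder text is an assumption

    public static string Scrub(string text, string credentials)
    {
        if (string.IsNullOrEmpty(text) || string.IsNullOrEmpty(credentials))
            return text;

        // Mask the full packed string first so it is not left half-redacted.
        string result = MaskIfLongEnough(text, credentials);

        // Then mask the trailing colon-separated component (password or clientSecret).
        int lastColon = credentials.LastIndexOf(':');
        if (lastColon >= 0 && lastColon < credentials.Length - 1)
            result = MaskIfLongEnough(result, credentials.Substring(lastColon + 1));

        return result;
    }

    // Short values are skipped so that common words in log text are not masked by accident.
    private static string MaskIfLongEnough(string text, string secret) =>
        secret.Length >= MinMaskLength
            ? text.Replace(secret, Mask, StringComparison.Ordinal)
            : text;
}
```

For example, scrubbing `"bad password SuperSecretPassword123"` against `"user:SuperSecretPassword123"` masks the password, while scrubbing `"login as root failed"` against `"admin:root"` leaves the text unchanged because both candidates are under the threshold.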
## Usage

### Script API

Site scripts do not interact with this component directly. The script surface is:

```csharp
// Returns a NotificationId immediately; does not block for delivery.
NotificationId id = Notify.To("Shift-Supervisors").Send("Tank overflow", "Tank T-03 is at 98%");

// Site-local while still in the S&F buffer; round-trips to central once forwarded.
NotificationDeliveryStatus status = Notify.Status(id);
```

`Notify.To("list")` is type-agnostic. The `NotificationId` is a GUID generated at the site. `Notify.Status` returns a `NotificationDeliveryStatus` record with `Status` (`Forwarding` while site-local, or `Pending` / `Retrying` / `Delivered` / `Parked` / `Discarded` from central), `RetryCount`, `LastError`, and `DeliveredAt`.

### Registering the adapter

On the central host, both projects are registered. The Notification Outbox registers `EmailNotificationDeliveryAdapter` as a keyed or enumerated `INotificationDeliveryAdapter` and calls `AddNotificationService` to get its dependencies:

```csharp
// Central composition root (simplified)
services.AddNotificationService();
services.AddNotificationOutbox(); // registers EmailNotificationDeliveryAdapter
```

## Configuration

`NotificationOptions` is bound from `ScadaBridge:Notification`. These values are **fallbacks**: when a `SmtpConfiguration` row has a non-positive value for a field, the adapter uses the option value instead. A positive value on the row always takes precedence.

| Section | Key | Default | Description |
|---------|-----|---------|-------------|
| `ScadaBridge:Notification` | `ConnectionTimeoutSeconds` | `30` | SMTP connection/operation timeout in seconds. Applied when `SmtpConfiguration.ConnectionTimeoutSeconds` is zero or negative. |
| `ScadaBridge:Notification` | `MaxConcurrentConnections` | `5` | Maximum concurrent SMTP connections. Applied when `SmtpConfiguration.MaxConcurrentConnections` is zero or negative; the limit itself is enforced in `EmailNotificationDeliveryAdapter`. |

SMTP retry settings (`MaxRetries`, `RetryDelay`) live on the `SmtpConfiguration` entity and are read by the Notification Outbox dispatcher; they are not part of `NotificationOptions`.

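The fallback rule for `ConnectionTimeoutSeconds` and `MaxConcurrentConnections` reduces to a single comparison. `FallbackSketch.Resolve` is a hypothetical helper written for illustration; the adapter's own code may express the same rule differently.

```csharp
public static class FallbackSketch
{
    // A positive row value wins; zero or negative falls back to the option value.
    public static int Resolve(int rowValue, int optionValue) =>
        rowValue > 0 ? rowValue : optionValue;
}
```

With the documented defaults, `Resolve(0, 30)` yields the 30-second option value, while `Resolve(10, 30)` keeps the row's own 10 seconds.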
### `SmtpConfiguration` entity fields

| Field | Type | Notes |
|-------|------|-------|
| `Host` | `string` | SMTP server hostname or IP. |
| `Port` | `int` | e.g., 587 for StartTLS, 465 for SSL. |
| `AuthType` | `string` | `basic` or `oauth2`. |
| `Credentials` | `string?` | Basic: `username:password`. OAuth2: `tenantId:clientId:clientSecret`. |
| `TlsMode` | `string?` | `None`, `StartTLS`, or `SSL`. Null/empty defaults to `StartTLS`. |
| `FromAddress` | `string` | Sender address in the From header. Also the XOAUTH2 `user=` identity for M365. |
| `ConnectionTimeoutSeconds` | `int` | Zero or negative → falls back to `NotificationOptions`. |
| `MaxConcurrentConnections` | `int` | Zero or negative → falls back to `NotificationOptions`. |
| `MaxRetries` | `int` | Read by Notification Outbox. |
| `RetryDelay` | `TimeSpan` | Read by Notification Outbox. |

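The `TlsMode` rules in the table (three accepted strings, null/empty defaulting to StartTLS, anything else rejected) could be sketched as below. `TlsModeSketch` and `TlsModeParserSketch` are hypothetical; the real `SmtpTlsModeParser` return type is not shown in this document, and the case-insensitive comparison is an assumption of this sketch.

```csharp
using System;

// Hypothetical enum; the real parser's return type is not shown in this document.
public enum TlsModeSketch { None, StartTls, Ssl }

public static class TlsModeParserSketch
{
    // Returns false for an unrecognised string so the caller can report a permanent config fault.
    public static bool TryParse(string value, out TlsModeSketch mode)
    {
        mode = TlsModeSketch.StartTls; // null/empty defaults to StartTLS
        if (string.IsNullOrEmpty(value)) return true;

        // Case-insensitive matching is an assumption of this sketch.
        switch (value.ToUpperInvariant())
        {
            case "NONE": mode = TlsModeSketch.None; return true;
            case "STARTTLS": mode = TlsModeSketch.StartTls; return true;
            case "SSL": mode = TlsModeSketch.Ssl; return true;
            default: return false;
        }
    }
}
```

Returning `false` rather than silently defaulting keeps a typo in the configuration visible as a configuration fault instead of letting it downgrade or change the transport security.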
### `NotificationList` entity fields

| Field | Type | Notes |
|-------|------|-------|
| `Name` | `string` | Unique list name. Passed as `Notify.To("name")`. |
| `Type` | `NotificationType` | Enum discriminator. Currently `Email` only. |
| `Recipients` | `ICollection<NotificationRecipient>` | Resolved at delivery time by the adapter. |

Each `NotificationRecipient` carries `Name` (display) and `EmailAddress`.

## Dependencies & Interactions

- [Commons (#16)](./Commons.md): owns `NotificationList`, `NotificationRecipient`, `SmtpConfiguration`, `Notification`, `NotificationType`, `NotificationStatus`, `INotificationRepository`, and the `NotificationSubmit` / `NotificationSubmitAck` / `NotificationStatusQuery` / `NotificationStatusResponse` / `NotificationDeliveryStatus` message contracts.
- [Configuration Database (#17)](./ConfigurationDatabase.md): persists `NotificationList`, `NotificationRecipient`, and `SmtpConfiguration`. Implements `INotificationRepository`. The `EmailNotificationDeliveryAdapter` resolves lists and recipients via this repository at delivery time.
- [Notification Outbox (#21)](./NotificationOutbox.md): the central dispatch counterpart. The Notification Outbox registers `EmailNotificationDeliveryAdapter`, drives retry and parking, and owns the `Notifications` audit table. Notification Service supplies the SMTP primitives (`ISmtpClientWrapper` factory, `OAuth2TokenService`, `SmtpErrorClassifier`, `CredentialRedactor`, `EmailAddressValidator`); Notification Outbox owns when and how often `DeliverAsync` is called.
- [Store-and-Forward Engine (#6)](./StoreAndForward.md): site-side buffer. Site scripts hand notifications to the S&F engine, which forwards them to central. The Notification Service has no direct interaction with the site S&F engine; by the time `DeliverAsync` is called, the notification has already been ingested by the Notification Outbox.
- [Security & Auth (#10)](./Security.md): the Design role is required to manage notification lists and SMTP configuration.
- Design spec: [Component-NotificationService.md](../requirements/Component-NotificationService.md).

## Troubleshooting

### A notification is Parked with a permanent failure

A `PermanentFailure` outcome means `EmailNotificationDeliveryAdapter` determined that retrying cannot fix the failure. Common root causes:

| Symptom | Cause | Fix |
|---------|-------|-----|
| "Notification list '…' not found" | List was renamed or deleted after the notification was submitted. | Recreate the list or discard the notification in the Central UI Outbox page. |
| "Notification list '…' has no recipients" | List exists but has no recipient rows. | Add recipients to the list. |
| "No SMTP configuration available" | No `SmtpConfiguration` row exists. | Add an SMTP configuration in Central UI. |
| "Unknown SMTP TLS mode '…'" | `TlsMode` field contains a value other than `None`, `StartTLS`, or `SSL`. | Correct the `TlsMode` value. |
| "Invalid sender (from) email address" or "Invalid recipient email address(es)" | Malformed address in the `SmtpConfiguration.FromAddress` or in a `NotificationRecipient.EmailAddress`. | Correct the address; the adapter validates via `MailboxAddress.TryParse`. |
| SMTP 5xx reply | Server rejected the message permanently (e.g., mailbox not found, policy block). | Check the `LastError` field on the `Notifications` row. The error text has credentials redacted. |
| OAuth2 credential parse error | `Credentials` field is not in `tenantId:clientId:clientSecret` format. | Correct the credentials on the SMTP configuration. |

### Transient failures retrying indefinitely

The retry count and delay come from `SmtpConfiguration.MaxRetries` and `RetryDelay`, enforced by the Notification Outbox. Once `MaxRetries` is exhausted, the Notification Outbox moves the row to `Parked`. If a notification stays in `Retrying` longer than expected, check whether `MaxRetries` is set to a non-zero value on the `SmtpConfiguration` row and that the Notification Outbox actor is running on the active central node.

### OAuth2 token not refreshing

`OAuth2TokenService` caches tokens per credential hash. Because the service is a singleton, restarting the host process resets the cache; the next `GetTokenAsync` call fetches a fresh token. If token fetches fail repeatedly (network partition to `login.microsoftonline.com`, wrong tenant/client/secret), the failure surfaces as an unclassified exception in `DeliverAsync` and the notification is parked as permanent. The log line includes the tenant ID but not the secret.

## Related Documentation

- [Notification Service design specification](../requirements/Component-NotificationService.md)
- [Notification Outbox](./NotificationOutbox.md)
- [Commons](./Commons.md)
- [Configuration Database](./ConfigurationDatabase.md)
- [Store-and-Forward Engine](./StoreAndForward.md)
- [Security & Auth](./Security.md)