docs(plans): fold refinement decisions into notification outbox design

Resolves the six open questions: host-level forward-retry config,
Notify.Status returns a status record, 10-min stuck threshold, a
site-local Forwarding state, site-side logging of forward failures
only, and point-in-time KPIs computed from the Notifications table.
Joseph Doherty
2026-05-18 22:57:45 -04:00
parent d4e86c1b1d
commit bbfa0c515e


@@ -49,8 +49,10 @@ Central Notification Outbox actor (singleton, active central node)
/ retries exhausted → Parked
```
`Notify.Status(notificationId)` round-trips site→central and reads the table. Before central
has ingested the row, status reads as `Pending` (in transit).
`Notify.Status(notificationId)` returns a small **status record** — status, retry count,
last error, and key timestamps (enqueued, delivered). While the notification is still in the
site S&F buffer the site answers the query **locally** (status `Forwarding`); once forwarded,
the query round-trips to central and reads the `Notifications` table.
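
The lookup described above could be sketched roughly as follows, in Python purely for illustration. The names `StatusRecord`, `notify_status`, `site_buffer`, and `query_central` are hypothetical and not part of the design; only the field list (status, retry count, last error, enqueued and delivered timestamps) comes from the text.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, Dict, Optional


@dataclass(frozen=True)
class StatusRecord:
    """Hypothetical shape of the record returned by Notify.Status."""
    status: str                       # e.g. "Forwarding", "Pending", "Delivered"
    retry_count: int
    last_error: Optional[str]
    enqueued_at: datetime             # UTC
    delivered_at: Optional[datetime]  # UTC, None until delivered


def notify_status(
    notification_id: str,
    site_buffer: Dict[str, datetime],
    query_central: Callable[[str], StatusRecord],
) -> StatusRecord:
    """Answer locally while the site still holds the notification,
    otherwise round-trip to central."""
    enqueued_at = site_buffer.get(notification_id)
    if enqueued_at is not None:
        # Still in the site S&F buffer: answer without contacting central.
        # Assumption: retry count and last error are reported as empty here,
        # since central has not attempted delivery yet.
        return StatusRecord("Forwarding", 0, None, enqueued_at, None)
    # Already forwarded: central reads the Notifications table.
    return query_central(notification_id)
```

Passing `query_central` as a callable keeps the sketch independent of whatever transport the Communication Layer actually uses.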
## Component design
@@ -81,7 +83,8 @@ Notifications and SMTP config are **no longer deployed to sites**. Sites never t
Keeps its notification category, but the delivery *target* changes from SMTP to **central**.
"Delivering" a buffered notification now means handing it to the Communication Layer for the
central cluster and clearing it on central's ack. The site→central forward uses a fixed
retry interval (host-level config, since it concerns reaching central, not any list).
retry interval configured in the host `appsettings.json`, since it concerns reaching the
central cluster rather than any notification list.
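
As a rough illustration of the forward-and-clear-on-ack behaviour with a fixed retry interval, a minimal Python sketch follows. All names (`forward_buffered`, `run_forwarder`, `send_to_central`, `max_passes`) are hypothetical, and the interval would in practice be read from host configuration rather than passed in directly.

```python
import time
from typing import Callable, Dict


def forward_buffered(
    buffer: Dict[str, dict],
    send_to_central: Callable[[str, dict], bool],
) -> int:
    """Make one forwarding pass over the buffer.

    A notification is cleared only when central acknowledges it
    (send_to_central returns True); everything else stays buffered
    for the next pass. Returns the number of notifications cleared.
    """
    cleared = 0
    for notification_id in list(buffer):  # copy the keys: the dict is mutated
        try:
            acked = send_to_central(notification_id, buffer[notification_id])
        except ConnectionError:
            acked = False  # central unreachable: keep the notification
        if acked:
            del buffer[notification_id]
            cleared += 1
    return cleared


def run_forwarder(
    buffer: Dict[str, dict],
    send_to_central: Callable[[str, dict], bool],
    retry_interval_seconds: float,
    sleep: Callable[[float], None] = time.sleep,
    max_passes: int = 3,
) -> None:
    """Repeat passes at a fixed interval (no backoff) until the buffer is
    empty or max_passes is reached. max_passes exists only to keep this
    sketch finite; a real forwarder would presumably run continuously."""
    for _ in range(max_passes):
        forward_buffered(buffer, send_to_central)
        if not buffer:
            return
        sleep(retry_interval_seconds)
```

The `sleep` parameter is injectable so the loop can be exercised in tests without real delays.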
## Typed notification lists
@@ -121,7 +124,10 @@ All timestamps are UTC.
### Status lifecycle
- `Pending` — ingested, awaiting first dispatch.
- `Forwarding` — in the site S&F buffer, not yet received by central. **Site-local only**
(never stored in the central `Notifications` table); reported by `Notify.Status` while the
site still holds the notification.
- `Pending` — ingested by central, awaiting first dispatch.
- `Retrying` — a transient failure occurred; `NextAttemptAt` schedules the next attempt.
- `Delivered` — terminal, success.
- `Parked` — terminal-not-delivered: a permanent failure, or retries exhausted. `LastError`
@@ -202,21 +208,29 @@ escalation or alerting, consistent with the current system-wide no-alerting poli
| `Component-NotificationService.md` | Delivery moves central; lists gain a `Type`; no deploy-to-sites; async script API; delivery adapters. |
| `Component-StoreAndForward.md` | Notification category retargeted from SMTP to central. |
| `Component-HealthMonitoring.md` | Outbox KPIs added as central-computed headline metrics. |
| `Component-SiteEventLogging.md` | New Notification event category — logs site→central forward failures and long-buffered notifications. |
| `Component-CentralUI.md` | New Notification Outbox page. |
| Central-Site Communication | New `NotificationSubmit` + ack message pair. |
| Configuration Database / Commons | `Notifications` table, entity POCO, repository interface + implementation, EF migration, message contracts. |
| `README.md` | Component table 20 → 21. |
| `CLAUDE.md` | Component list 20 → 21; new key design decisions. |
## Open questions for refinement
## Refinement decisions (2026-05-18)
- **Site→central forward retry config** — where the fixed forward-retry interval lives
(host appsettings vs a deployed setting).
- **`Notify.Status` payload** — whether status queries also return retry count / last error
to scripts, or just the status enum.
- **Stuck threshold default** — 10 minutes is a placeholder.
- **Pre-ingest status** — confirm `Pending` is the right reading for a notification still
in the site S&F buffer (vs a distinct "Forwarding" state).
- **Site-side diagnostics** — whether to keep a lightweight Site Event Logging entry for
"notification enqueued / forwarded," now that central holds the authoritative record.
- **KPI history** — KPIs are currently point-in-time; whether any trend/history is wanted.
- **Site→central forward retry config** — the fixed forward-retry interval lives in the host
`appsettings.json` (infrastructure config, not a deployed artifact).
- **`Notify.Status` payload** — returns a status record: status, retry count, last error,
and key timestamps (enqueued, delivered).
- **Stuck threshold default** — 10 minutes, configurable.
- **Pre-ingest status** — a distinct site-local `Forwarding` state; the site answers
`Notify.Status` from its own S&F buffer without a round-trip to central.
- **Site-side diagnostics** — Site Event Logging records site→central **forward failures**
and long-buffered notifications only, not routine enqueue/forward success events.
- **KPI history** — point-in-time only, computed on demand from the `Notifications` table;
the ~1-year row retention answers historical questions directly, so no separate
time-series store is added.
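
To make the point-in-time KPI decision concrete, here is a minimal Python sketch that computes counts on demand from a set of rows. It is an illustration under assumptions: the row shape `(status, created_at)`, the function name `outbox_kpis`, and the definition of "stuck" (a non-terminal row older than the threshold) are hypothetical and not taken from the design.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone
from typing import Dict, Iterable, Tuple

# Each row is (status, created_at_utc): a stand-in for a Notifications table query.
Row = Tuple[str, datetime]

STUCK_THRESHOLD = timedelta(minutes=10)  # default from the decision above
NON_TERMINAL = {"Pending", "Retrying"}


def outbox_kpis(
    rows: Iterable[Row],
    now: datetime,
    stuck_threshold: timedelta = STUCK_THRESHOLD,
) -> Dict[str, int]:
    """Compute KPIs for the instant `now`; nothing is stored or aggregated."""
    counts: Counter = Counter()
    stuck = 0
    for status, created_at in rows:
        counts[status] += 1
        # Assumption: "stuck" means non-terminal and older than the threshold.
        if status in NON_TERMINAL and now - created_at > stuck_threshold:
            stuck += 1
    return {
        "pending": counts["Pending"],
        "retrying": counts["Retrying"],
        "delivered": counts["Delivered"],
        "parked": counts["Parked"],
        "stuck": stuck,
    }
```

Because the function takes `now` as an argument, the same computation can be replayed against retained rows for a past instant, which is in the spirit of using row retention instead of a separate time-series store.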
## Open questions
None outstanding — the basic design is fully specified. The next step is an implementation
plan against the cross-document impact table.