docs(notification-outbox): update component list and design decisions in CLAUDE.md

 CLAUDE.md | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -36,7 +36,7 @@ This project contains design documentation for a distributed SCADA system built
 - Use `git diff` to review changes before committing.
 - Commit related changes together with a descriptive message summarizing the design decision.
 
-## Current Component List (20 components)
+## Current Component List (21 components)
 
 1. Template Engine — Template modeling, inheritance, composition, validation, flattening, diffs.
 2. Deployment Manager — Central-side deployment pipeline, system-wide artifact deployment, instance lifecycle.
@@ -45,7 +45,7 @@ This project contains design documentation for a distributed SCADA system built
 5. Central–Site Communication — Akka.NET ClusterClient (command/control) + gRPC server-streaming (real-time data), message patterns, debug streaming.
 6. Store-and-Forward Engine — Buffering, fixed-interval retry, parking, SQLite persistence, replication.
 7. External System Gateway — External system definitions, API method invocation, database connections.
-8. Notification Service — Notification lists, email delivery, store-and-forward integration.
+8. Notification Service — Central-only notification-list and SMTP definitions, per-type delivery adapters (sites no longer deliver notifications).
 9. Central UI — Web-based management interface, all workflows.
 10. Security & Auth — LDAP/AD authentication, role-based authorization, site-scoped permissions.
 11. Health Monitoring — Site health metrics collection and central reporting.
@@ -58,6 +58,7 @@ This project contains design documentation for a distributed SCADA system built
 18. Management Service — Akka.NET actor providing programmatic access to all admin operations, ClusterClientReceptionist registration.
 19. CLI — Command-line tool using HTTP Management API, System.CommandLine, JSON/table output.
 20. Traefik Proxy — Reverse proxy/load balancer fronting central cluster, active node routing via `/health/active`, automatic failover.
+21. Notification Outbox — Central component ingesting store-and-forwarded notifications, `Notifications` audit table, dispatcher loop, retry/parking, delivery KPIs.
 
 ## Key Design Decisions (for context across sessions)
 
@@ -88,6 +89,9 @@ This project contains design documentation for a distributed SCADA system built
 - Dual call modes: `ExternalSystem.Call()` (synchronous) and `ExternalSystem.CachedCall()` (store-and-forward on transient failure).
 - Error classification: HTTP 5xx/408/429/connection errors = transient; other 4xx = permanent (returned to script).
 - Notification Service: SMTP with OAuth2 Client Credentials (Microsoft 365) or Basic Auth. BCC delivery, plain text.
+- Notification delivery is central-only: sites store-and-forward notifications to the central cluster (target = central, not SMTP); sites never talk to SMTP. Notification lists and SMTP config are no longer deployed to sites; recipient resolution happens at central, at delivery time.
+- Notification lists carry a `Type` discriminator (`Email` now; `Teams` and others later). `Notify.To("list")` is type-agnostic; delivery is via per-type `INotificationDeliveryAdapter` (success/transient/permanent classification, same pattern as External System Gateway).
+- `Notify.Send` is async — returns a `NotificationId` (GUID, idempotency key) status handle immediately. `Notify.Status(notificationId)` returns a status record (status, retry count, last error, key timestamps); answered site-locally as `Forwarding` while still in the site S&F buffer, otherwise round-trips to central.
 - Inbound API: `POST /api/{methodName}`, `X-API-Key` header, flat JSON, extended type system (Object, List).
 
 ### Templates & Deployment
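The success/transient/permanent classification that the new delivery adapters share with the External System Gateway can be sketched as below. This is an illustrative Python sketch, not the project's code: the names `Outcome` and `classify` are hypothetical, `None` stands in for "no HTTP response (connection error)", and treating every status below 400 as success is an assumption the diff does not state.

```python
from enum import Enum
from typing import Optional


class Outcome(Enum):
    SUCCESS = "success"
    TRANSIENT = "transient"   # retry later (store-and-forward)
    PERMANENT = "permanent"   # do not retry; surface the error


def classify(status_code: Optional[int]) -> Outcome:
    """Classify one attempt using the rule quoted in the diff.

    status_code is None when no response arrived (connection error).
    """
    if status_code is None:
        return Outcome.TRANSIENT
    if 500 <= status_code < 600 or status_code in (408, 429):
        return Outcome.TRANSIENT
    if 400 <= status_code < 500:
        return Outcome.PERMANENT
    # Assumption: anything below 400 counts as a successful delivery.
    return Outcome.SUCCESS
```

A caller would retry only on `Outcome.TRANSIENT`; per the diff, permanent failures are returned to the script (gateway) or parked immediately (outbox).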
@@ -109,6 +113,7 @@ This project contains design documentation for a distributed SCADA system built
 - Async best-effort replication to standby (no ack wait).
 - Messages not cleared on instance deletion.
 - CachedCall idempotency is the caller's responsibility.
+- Notification Outbox: central `NotificationOutboxActor` singleton on the active central node — the first centrally-hosted outbox (S&F Engine remains site-only). Owns the durable `Notifications` table in central MS SQL — the single source of audit truth (one row per notification). Dispatcher loop polls due rows, resolves the list, delivers via the typed adapter; transient failures retry to `Parked`, permanent failures park immediately. `Notifications` table is type-agnostic via the `Type` discriminator; status lifecycle `Pending → Retrying → Delivered / Parked / Discarded` (plus site-local `Forwarding`, never persisted centrally). Site→central handoff is at-least-once with ack-after-persist and insert-if-not-exists on `NotificationId`. No Akka replication — MS SQL is the HA store; daily purge of terminal rows after a configurable window (default 365 days). Retry reuses central SMTP max-retry-count and fixed interval.
 
 ### Security & Auth
 - Authentication: direct LDAP bind (username/password), no Kerberos/NTLM. LDAPS/StartTLS required.
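The Notification Outbox handoff and retry rules added in the hunk above (insert-if-not-exists on `NotificationId`, ack-after-persist, transient failures retrying until parked, permanent failures parking immediately) can be illustrated with a small sketch. Everything here is an assumption made for illustration: an in-memory SQLite database stands in for central MS SQL, the `RetryCount` column, the function names, and `MAX_RETRIES = 3` are hypothetical, and the exact meaning of the max-retry count is not specified in the diff.

```python
import sqlite3

MAX_RETRIES = 3  # hypothetical value; the design reuses the central SMTP setting


def open_store() -> sqlite3.Connection:
    """Create an in-memory stand-in for the central Notifications table."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE Notifications ("
        " NotificationId TEXT PRIMARY KEY,"
        " Type TEXT NOT NULL,"
        " Status TEXT NOT NULL,"
        " RetryCount INTEGER NOT NULL DEFAULT 0)"
    )
    return db


def ingest(db: sqlite3.Connection, notification_id: str, list_type: str) -> bool:
    """Insert-if-not-exists, then ack. A redelivered message is a no-op."""
    with db:  # commits on success, so the ack below follows the persist
        db.execute(
            "INSERT OR IGNORE INTO Notifications (NotificationId, Type, Status)"
            " VALUES (?, ?, 'Pending')",
            (notification_id, list_type),
        )
    return True  # ack to the site only after the row is durable


def record_attempt(db: sqlite3.Connection, notification_id: str, outcome: str) -> str:
    """Apply one delivery outcome: 'success', 'transient' or 'permanent'."""
    (retries,) = db.execute(
        "SELECT RetryCount FROM Notifications WHERE NotificationId = ?",
        (notification_id,),
    ).fetchone()
    if outcome == "success":
        status = "Delivered"
    elif outcome == "permanent":
        status = "Parked"  # permanent failures park immediately
    else:
        retries += 1
        status = "Parked" if retries >= MAX_RETRIES else "Retrying"
    with db:
        db.execute(
            "UPDATE Notifications SET Status = ?, RetryCount = ?"
            " WHERE NotificationId = ?",
            (status, retries, notification_id),
        )
    return status
```

Because the site may resend a notification it never saw acknowledged, the primary key on `NotificationId` is what turns at-least-once delivery into exactly one audit row.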
@@ -130,6 +135,7 @@ This project contains design documentation for a distributed SCADA system built
 - Health reports: 30s interval, 60s offline threshold, monotonic sequence numbers, raw error counts per interval.
 - Dead letter monitoring as a health metric.
 - Site Event Logging: 30-day retention, 1GB storage cap, daily purge, paginated queries with keyword search.
+- Notification Outbox KPIs are central-computed point-in-time from the `Notifications` table (global + per-source-site): queue depth, stuck count, parked count, delivered-last-interval, oldest-pending age. Stuck = `Pending`/`Retrying` older than a configurable age threshold (default 10 min) — display-only (KPI count + row badge), no escalation/alerting. Headline KPI tiles surface on the Health dashboard; a new Central UI Notification Outbox page offers a queryable list with Retry/Discard actions on parked notifications.
 
 ### Code Organization
 - Entity classes are persistence-ignorant POCOs in Commons; EF mappings in Configuration Database.
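The point-in-time KPI computation described in the hunk above might look roughly like the following. This is a hedged sketch: the row shape (`status`, `created_at`), the function name `outbox_kpis`, measuring age from creation time, and defining queue depth as the count of `Pending` plus `Retrying` rows are all assumptions. The delivered-last-interval KPI is omitted because the diff does not define the interval.

```python
from datetime import datetime, timedelta, timezone

STUCK_AFTER = timedelta(minutes=10)  # default threshold quoted in the diff


def outbox_kpis(rows, now, stuck_after=STUCK_AFTER):
    """Compute display-only KPIs from a snapshot of notification rows."""
    open_rows = [r for r in rows if r["status"] in ("Pending", "Retrying")]
    pending_ages = [now - r["created_at"] for r in rows if r["status"] == "Pending"]
    return {
        # Assumption: queue depth counts every row still awaiting delivery.
        "queue_depth": len(open_rows),
        "stuck_count": sum(1 for r in open_rows if now - r["created_at"] > stuck_after),
        "parked_count": sum(1 for r in rows if r["status"] == "Parked"),
        # None when no row is currently Pending.
        "oldest_pending_age": max(pending_ages, default=None),
    }


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
snapshot = [
    {"status": "Pending", "created_at": now - timedelta(minutes=15)},
    {"status": "Retrying", "created_at": now - timedelta(minutes=5)},
    {"status": "Parked", "created_at": now - timedelta(hours=1)},
    {"status": "Delivered", "created_at": now - timedelta(minutes=30)},
]
kpis = outbox_kpis(snapshot, now)
```

Filtering the snapshot by source site before calling the function would give the per-site variant; since the design marks these KPIs as display-only, nothing here triggers alerts.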