docs(requirements): document Failed terminal state for permanent cached-call failures

Joseph Doherty
2026-05-19 11:44:23 -04:00
parent 17ef5f85de
commit 320e4d7479


@@ -46,7 +46,9 @@ Attempt immediate delivery
 For notifications, "delivery" means forwarding the message to the central cluster via CentralSite Communication; "success" is central's ack, on which the message is cleared. Notifications do not park; they are retried at the fixed forward interval until central acks. Parking applies only to the external-system-call and cached-database-write categories.
-For the cached-call categories (`ExternalCall` and `DatabaseWrite`), the operation tracking table is the status record and the S&F buffer is purely the retry mechanism. A cached call that succeeds on its first immediate attempt is written directly as a terminal `Delivered` tracking row and never enters the S&F buffer. When immediate delivery fails transiently, the message is buffered and its tracking row moves to `Pending`/`Retrying`; the buffered message carries its `TrackedOperationId` so the tracking row and the retry record stay linked. On every tracking-table status transition the site emits `CachedCallTelemetry` to central.
+For the cached-call categories (`ExternalCall` and `DatabaseWrite`), the operation tracking table is the status record and the S&F buffer is purely the retry mechanism. A cached call that succeeds on its first immediate attempt is written directly as a terminal `Delivered` tracking row and never enters the S&F buffer. When immediate delivery fails transiently, the message is buffered and its tracking row moves to `Pending`/`Retrying`; the buffered message carries its `TrackedOperationId` so the tracking row and the retry record stay linked. When immediate delivery fails **permanently** (e.g. HTTP 4xx), the message is not buffered (the error is returned synchronously to the calling script as before), but the tracking row is written directly as a terminal `Failed` row capturing the error. On every tracking-table status transition the site emits `CachedCallTelemetry` to central.
+Every cached-call outcome maps to a tracking-table state: immediate success → `Delivered`; transient failure → `Pending`/`Retrying`, eventually `Delivered` or `Parked`; permanent failure → terminal `Failed`; operator discard of a parked row → terminal `Discarded`.
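
The outcome-to-state mapping for the first immediate attempt can be sketched roughly as follows. This is an illustrative Python sketch, not the project's implementation: `AttemptOutcome`, `ImmediateResult`, and `on_immediate_attempt` are hypothetical names, and choosing `Pending` (rather than `Retrying`) as the first state after a transient failure is an assumption, since the text only says `Pending`/`Retrying`.

```python
from dataclasses import dataclass
from enum import Enum


class TrackedOperationStatus(Enum):
    """Tracking-table states named in the requirements text."""
    PENDING = "Pending"
    RETRYING = "Retrying"
    DELIVERED = "Delivered"
    PARKED = "Parked"
    FAILED = "Failed"
    DISCARDED = "Discarded"


class AttemptOutcome(Enum):
    """Hypothetical classification of the first immediate delivery attempt."""
    SUCCESS = "success"
    TRANSIENT_FAILURE = "transient"
    PERMANENT_FAILURE = "permanent"


@dataclass(frozen=True)
class ImmediateResult:
    """Hypothetical result: the tracking row state and whether S&F buffering happens."""
    status: TrackedOperationStatus
    buffered: bool


def on_immediate_attempt(outcome: AttemptOutcome) -> ImmediateResult:
    """Map the first attempt's outcome to a tracking state and a buffering decision."""
    if outcome is AttemptOutcome.SUCCESS:
        # Written directly as terminal Delivered; never enters the S&F buffer.
        return ImmediateResult(TrackedOperationStatus.DELIVERED, buffered=False)
    if outcome is AttemptOutcome.TRANSIENT_FAILURE:
        # Buffered for retry. Starting at Pending is an assumption of this sketch.
        return ImmediateResult(TrackedOperationStatus.PENDING, buffered=True)
    # Permanent failure: not buffered, tracking row written as terminal Failed.
    return ImmediateResult(TrackedOperationStatus.FAILED, buffered=False)
```

Under these assumptions, only the transient branch ever produces a buffer message; the other two branches write a terminal tracking row and stop.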
 ## Retry Policy
@@ -58,7 +60,7 @@ The **notification** category retries differently: it has no source-entity setti
 The retry interval is **fixed** (not exponential backoff). Fixed interval is sufficient for the expected use cases.
-**Note**: Only **transient failures** are eligible for store-and-forward buffering. For external system calls, transient failures are connection errors, timeouts, and HTTP 5xx responses. Permanent failures (HTTP 4xx) are returned directly to the calling script and are **not** queued for retry. This prevents the buffer from accumulating requests that will never succeed.
+**Note**: Only **transient failures** are eligible for store-and-forward buffering. For external system calls, transient failures are connection errors, timeouts, and HTTP 5xx responses. Permanent failures (HTTP 4xx) are returned directly to the calling script and are **not** queued for retry. This prevents the buffer from accumulating requests that will never succeed. For the cached-call categories, a permanent failure additionally sets the operation's tracking-table row to terminal `Failed`, capturing the error, so even a never-buffered cached call has an authoritative status record. `Failed` rows are not operator-actionable: a permanent failure would only fail again, and the error was already returned to the script.
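
The transient/permanent split in the note above could be expressed as a small classifier. The sketch below is a hedged illustration in Python: `FailureKind` and `classify_failure` are hypothetical names, and raising `ValueError` for inputs the note does not cover (for example a 3xx status) is an assumption of the sketch, not documented behavior.

```python
from enum import Enum
from typing import Optional


class FailureKind(Enum):
    TRANSIENT = "transient"   # eligible for store-and-forward buffering
    PERMANENT = "permanent"   # returned to the calling script, never queued


def classify_failure(
    http_status: Optional[int] = None,
    error: Optional[BaseException] = None,
) -> FailureKind:
    """Classify a failed external system call per the note's rules."""
    # Connection errors and timeouts are transient.
    if isinstance(error, (ConnectionError, TimeoutError)):
        return FailureKind.TRANSIENT
    if http_status is not None:
        if 500 <= http_status <= 599:
            return FailureKind.TRANSIENT
        if 400 <= http_status <= 499:
            return FailureKind.PERMANENT
    # Anything else is outside what the note specifies (assumption: reject it).
    raise ValueError("failure not covered by the documented rules")
```

For instance, `classify_failure(http_status=503)` yields `FailureKind.TRANSIENT`, while `classify_failure(http_status=404)` yields `FailureKind.PERMANENT`.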
 ## Buffer Size
@@ -109,7 +111,7 @@ Each buffered message stores:
 - **Retry Count**: Number of attempts so far.
 - **Created At**: Timestamp when the message was first queued.
 - **Last Attempt At**: Timestamp of the most recent delivery attempt.
-- **Status**: Pending, retrying, or parked.
+- **Status**: Pending, retrying, or parked. This is the **buffer message's** retry state, distinct from the operation's `TrackedOperationStatus` lifecycle in the operation tracking table. A buffer message exists only while a cached call is mid-retry, so it never carries the terminal `Delivered`, `Failed`, or `Discarded` states; those live solely on the tracking row.
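
To make the separation between the buffer message's status and the tracking row's status concrete, here is a minimal Python sketch of the record. It covers only the fields visible in this hunk plus the `TrackedOperationId` link mentioned earlier; the full field list is not shown in the diff. The names `BufferMessageStatus` and `BufferedMessage`, the field types, and the nullable `last_attempt_at` are assumptions.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum
from typing import Optional


class BufferMessageStatus(Enum):
    """Retry state of a buffered message; deliberately has no terminal states."""
    PENDING = "Pending"
    RETRYING = "Retrying"
    PARKED = "Parked"


# Terminal states that, per the text, live only on the tracking row.
TRACKING_ONLY_TERMINAL_STATES = frozenset({"Delivered", "Failed", "Discarded"})


@dataclass
class BufferedMessage:
    """Partial, hypothetical shape of a buffered cached-call message."""
    tracked_operation_id: str            # links the retry record to its tracking row
    retry_count: int                     # number of attempts so far
    created_at: datetime                 # when the message was first queued
    last_attempt_at: Optional[datetime]  # most recent delivery attempt, if any
    status: BufferMessageStatus


message = BufferedMessage(
    tracked_operation_id="op-123",  # illustrative value only
    retry_count=0,
    created_at=datetime.now(timezone.utc),
    last_attempt_at=None,
    status=BufferMessageStatus.PENDING,
)
```

Because `BufferMessageStatus` has no `Delivered`, `Failed`, or `Discarded` member, a buffer message in this sketch cannot represent a terminal outcome at all, which mirrors the rule that those states belong to the tracking row alone.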
 ## Dependencies