docs(audit): design — full request/response capture for inbound API rows
Carve-out from Payload Capture Policy: ApiInbound rows capture RequestSummary and ResponseSummary in full up to a configurable 1 MB per-body ceiling (AuditLog:InboundMaxBytes), instead of the global 8 KB / 64 KB caps. No schema change; existing redaction (headers + per-target body redactors) still applies before persistence.
docs/plans/2026-05-23-inbound-api-full-response-audit-design.md
@@ -0,0 +1,145 @@
# Inbound API: Full Request/Response Capture in Audit Log

**Date:** 2026-05-23
**Status:** Approved (brainstorming complete)
**Affects:** Component-AuditLog (#23), Component-InboundAPI (#14)

## Problem

Today the centralized Audit Log captures inbound API request and response bodies
into `RequestSummary` / `ResponseSummary`, but subject to the global Payload
Capture Policy cap (8 KB by default, 64 KB on error rows). For inbound API
traffic this is too tight: operators routinely need to replay exactly what an
external caller sent and exactly what we returned. Truncation defeats both replay
and the most common debugging questions: "why did this script see that input?"
and "what did the caller actually receive?"

## Decision

For `Channel = ApiInbound` rows only, capture `RequestSummary` and
`ResponseSummary` verbatim up to a hard per-body ceiling of **1 MB**
(configurable). The 8 KB / 64 KB default/error caps that apply to other channels
do not apply here. All other channels (`ApiOutbound`, `DbOutbound`,
`Notification`, cached-call lifecycle, `InboundAuthFailure`) keep the existing
policy unchanged.

## Capture Policy Change

The Payload Capture Policy in `Component-AuditLog.md` gains an Inbound API
carve-out:

> **Inbound API exception.** For `Channel = ApiInbound`, `RequestSummary` and
> `ResponseSummary` are captured in full up to a per-body hard ceiling of 1 MB
> (configurable via `AuditLog:InboundMaxBytes`; default 1 048 576 bytes; min
> 8 192; max 16 777 216). The 8 KB / 64 KB default/error caps that apply to
> other channels do not apply here. `PayloadTruncated = 1` is set only when the
> configured ceiling is hit; verbatim capture is the normal case.

The rest of the policy is unchanged:

- Header redact list (`Authorization`, `Cookie`, `Set-Cookie`, `X-API-Key`,
  configured regex) still applies.
- Per-target body redactors (regex → replacement, keyed by inbound method name)
  still run before persistence.
- The redactor-error safety net (`<redacted: redactor error>` plus
  `AuditRedactionFailure` health metric increment) still applies.
- UTF-8 byte-safe truncation still applies when the ceiling *is* hit.

The ceiling applies independently to the request body and the response body:
each gets its own budget (1 MB by default) on a given audit row.
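
As a rough illustration of the byte-safe truncation and the independent per-body budgets described above, here is a minimal Python sketch. It is not the production implementation: `truncate_utf8`, `cap_inbound_bodies`, and `CappedBodies` are hypothetical names introduced only for this example, and the sketch assumes the ceiling is measured in UTF-8 bytes of the already-redacted body.

```python
from dataclasses import dataclass
from typing import Tuple

INBOUND_MAX_BYTES = 1_048_576  # documented default for AuditLog:InboundMaxBytes


def truncate_utf8(text: str, max_bytes: int) -> Tuple[str, bool]:
    """Cap text at max_bytes of UTF-8 without splitting a code point.

    Returns the (possibly shortened) text and whether truncation happened.
    """
    data = text.encode("utf-8")
    if len(data) <= max_bytes:
        return text, False
    cut = max_bytes
    # data[cut] is the first byte that would be dropped. If it is a
    # continuation byte (0b10xxxxxx), the cut falls inside a code point,
    # so step back until the cut sits on a code-point boundary.
    while cut > 0 and (data[cut] & 0xC0) == 0x80:
        cut -= 1
    return data[:cut].decode("utf-8"), True


@dataclass
class CappedBodies:
    # Hypothetical container mirroring the three audit-row fields involved.
    request_summary: str
    response_summary: str
    payload_truncated: bool


def cap_inbound_bodies(request_body: str, response_body: str,
                       max_bytes: int = INBOUND_MAX_BYTES) -> CappedBodies:
    """Apply the ceiling to each body separately; flag the row if either hit it."""
    req, req_cut = truncate_utf8(request_body, max_bytes)
    resp, resp_cut = truncate_utf8(response_body, max_bytes)
    return CappedBodies(req, resp, req_cut or resp_cut)
```

Because each body is capped on its own, an oversized response never eats into the request's budget, and `payload_truncated` is the logical OR of the two per-body results.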

## Schema

No schema change. `RequestSummary` and `ResponseSummary` are already
`nvarchar(max)`; SQL Server stores large values of that type out-of-row as LOB
data, so queries that do not select these columns do not pay to read the larger
payloads. Only the column description text changes to reflect the inbound
carve-out.

## Ingestion Path

Unchanged. Inbound rows are already a central direct-write from the
request-handler middleware via `ICentralAuditWriter` before the HTTP response is
flushed, and audit-write failure is already fail-soft (logged + increments
`CentralAuditWriteFailures`, never fails the user-facing request).

The only code change at the write site is the cap selection:

```text
maxBytes = channel == ApiInbound
    ? options.InboundMaxBytes          // default 1 MB
    : isErrorRow ? 64*1024 : 8*1024;   // existing policy
```

Redactors run before the cap; the cap is the final byte-budget step before the
INSERT.
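
The ordering above (redact first, cap last) can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the actual write-site code: `select_max_bytes`, `prepare_summary`, and the `on_redaction_failure` callback are hypothetical names, redactors are modelled as plain `(pattern, replacement)` pairs, and the sketch assumes a redactor failure replaces the whole body with the safety-net marker.

```python
import re
from typing import Callable, Iterable, Tuple

REDACTOR_ERROR_MARKER = "<redacted: redactor error>"


def select_max_bytes(channel: str, is_error_row: bool,
                     inbound_max_bytes: int = 1_048_576) -> int:
    """Mirror of the cap selection: inbound rows use the configurable ceiling."""
    if channel == "ApiInbound":
        return inbound_max_bytes
    return 64 * 1024 if is_error_row else 8 * 1024  # existing policy


def prepare_summary(body: str,
                    redactors: Iterable[Tuple[str, str]],
                    max_bytes: int,
                    on_redaction_failure: Callable[[], None]) -> Tuple[str, bool]:
    """Redact, then cap. Returns (summary, truncated)."""
    try:
        for pattern, replacement in redactors:
            body = re.sub(pattern, replacement, body)
    except Exception:
        # Safety net (assumed behaviour): never persist an unredacted body.
        on_redaction_failure()  # stands in for the AuditRedactionFailure metric
        body = REDACTOR_ERROR_MARKER

    data = body.encode("utf-8")
    if len(data) <= max_bytes:
        return body, False
    # errors="ignore" drops a trailing partial code point left by the byte cut.
    return data[:max_bytes].decode("utf-8", errors="ignore"), True
```

Running the cap after redaction means the byte budget is spent on what will actually be stored, and a secret that straddles the cut cannot survive half-redacted.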

## Configuration

New option on the existing `AuditLog` options class:

| Key | Default | Min | Max | Description |
|---|---|---|---|---|
| `AuditLog:InboundMaxBytes` | `1048576` | `8192` | `16777216` | Per-body ceiling, in bytes, for `ApiInbound` `RequestSummary` / `ResponseSummary`. Truncation past this is the only case where `PayloadTruncated` is set on an inbound row. |

Bounds are enforced at options binding; out-of-range values fail startup through
the same "options validation" path used for other AuditLog settings.
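
A minimal Python sketch of the bounds check, to make the fail-at-startup behaviour concrete. The class and constant names here are hypothetical stand-ins for the real options class, and the sketch assumes both bounds are inclusive.

```python
from dataclasses import dataclass

INBOUND_MAX_BYTES_MIN = 8_192
INBOUND_MAX_BYTES_MAX = 16_777_216


@dataclass(frozen=True)
class AuditLogOptions:
    # Hypothetical stand-in for the AuditLog options class.
    inbound_max_bytes: int = 1_048_576  # AuditLog:InboundMaxBytes default

    def __post_init__(self) -> None:
        # Reject out-of-range values when the options object is built,
        # which models failing at startup rather than at first use.
        if not (INBOUND_MAX_BYTES_MIN <= self.inbound_max_bytes <= INBOUND_MAX_BYTES_MAX):
            raise ValueError(
                "AuditLog:InboundMaxBytes must be between "
                f"{INBOUND_MAX_BYTES_MIN} and {INBOUND_MAX_BYTES_MAX}, "
                f"got {self.inbound_max_bytes}"
            )
```

Constructing `AuditLogOptions(inbound_max_bytes=4096)` raises `ValueError`, while the default and both boundary values are accepted.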

## Doc Edits

1. **`Component-AuditLog.md`**
   - `RequestSummary` and `ResponseSummary` rows in the schema table: amend
     descriptions to note the `ApiInbound` carve-out (full capture up to
     `InboundMaxBytes`, default 1 MB).
   - Payload Capture Policy section: add the **Inbound API exception**
     paragraph above; add `AuditLog:InboundMaxBytes` to the configuration knobs
     list.
2. **`Component-InboundAPI.md`**
   - Line ~119 (audit row description): "truncated request/response bodies per
     the Audit Log capture policy" → "request/response bodies captured in full
     up to the configured `AuditLog:InboundMaxBytes` ceiling (default 1 MB);
     `PayloadTruncated = 1` only when that ceiling is hit".
   - Line ~202 (Dependencies → Audit Log): mirror the wording adjustment.

## Operational Trade-offs

- **Storage growth.** At 365-day retention, full-body capture on every inbound
  request can grow `AuditLog` significantly compared to today's 8 KB cap.
  Operators tune by lowering `InboundMaxBytes`, shortening retention via
  `AuditLog:RetentionDays`, or (once per-target redaction is configured for
  chatty methods) applying body redactors to drop noise. Monthly partition
  purge keeps reclamation cheap regardless of row size.
- **No new health metric.** Hitting the ceiling is reflected in the existing
  `PayloadTruncated` bit; no separate counter in v1. If ceiling-hits become a
  real operational signal, an `AuditInboundCeilingHits` metric can be added
  later without schema change.
- **Append-only and audit role.** The `scadalink_audit_writer` role already
  permits `INSERT` only, so full-body rows don't change the security model.
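
To put the storage-growth trade-off in numbers, a back-of-the-envelope estimate can be sketched in Python. Everything here is an assumption for illustration: `estimate_payload_bytes` is a hypothetical helper, the traffic figures in the usage lines are made up, and the result counts captured payload bytes only (it ignores column encoding, row overhead, and redaction savings, and it treats every body as having the same typical size).

```python
def estimate_payload_bytes(requests_per_day: int,
                           typical_request_bytes: int,
                           typical_response_bytes: int,
                           retention_days: int,
                           max_bytes: int) -> int:
    """Rough retained payload volume, with each body capped at max_bytes."""
    per_row = (min(typical_request_bytes, max_bytes)
               + min(typical_response_bytes, max_bytes))
    return requests_per_day * retention_days * per_row


# Hypothetical workload: 10,000 requests/day, 20 KB requests, 50 KB responses.
full_capture = estimate_payload_bytes(10_000, 20_000, 50_000, 365, 1_048_576)
old_cap = estimate_payload_bytes(10_000, 20_000, 50_000, 365, 8_192)
print(f"full capture: {full_capture / 1e9:.1f} GB, 8 KB cap: {old_cap / 1e9:.1f} GB")
```

Re-running the estimate with a lower `max_bytes` or `retention_days` shows how much each tuning knob is worth for a given workload.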

## Not in Scope (Deferred)

- **Structured response capture.** `ResponseSummary` stays a single string;
  response status code remains in `HttpStatus`. No separate columns for response
  headers or content type. Inbound request headers remain uncaptured.
- **Per-method opt-out** from full capture. If specific methods produce
  routinely huge responses, operators use the existing per-target body redactor
  to cut them down, or lower the global ceiling.
- **Changes to other channels' caps.** `ApiOutbound`, `DbOutbound`,
  `Notification`, cached-call lifecycle rows, and `InboundAuthFailure` keep the
  existing 8 KB / 64 KB policy.

## Acceptance Criteria

- [ ] `AuditLog:InboundMaxBytes` option exists on the AuditLog options class,
  with the documented default and bounds, validated at startup.
- [ ] Inbound request middleware writes `RequestSummary` and `ResponseSummary`
  using the inbound ceiling instead of the 8 KB / 64 KB defaults.
- [ ] Other channels' rows (e.g. an `ApiOutbound.ApiCall` over the limit) still
  truncate at 8 KB (64 KB on error rows); this is regression-tested.
- [ ] `PayloadTruncated = 1` on an inbound row iff request body or response
  body exceeded `InboundMaxBytes`.
- [ ] Header redaction list and per-target body redactors still apply to
  inbound rows.
- [ ] Redactor failure on an inbound row still produces `<redacted: redactor
  error>` and increments `AuditRedactionFailure`.
- [ ] `Component-AuditLog.md` and `Component-InboundAPI.md` updated as
  described in **Doc Edits**.