scadalink-design/docs/plans/2026-05-23-inbound-api-full-response-audit-design.md
Joseph Doherty 0670864160 docs(audit): design — full request/response capture for inbound API rows
Carve-out from Payload Capture Policy: ApiInbound rows capture
RequestSummary and ResponseSummary in full up to a configurable 1 MB
per-body ceiling (AuditLog:InboundMaxBytes), instead of the global 8 KB /
64 KB caps. No schema change; existing redaction (headers + per-target
body redactors) still applies before persistence.
2026-05-23 05:28:34 -04:00


# Inbound API: Full Request/Response Capture in Audit Log
**Date:** 2026-05-23
**Status:** Approved (brainstorming complete)
**Affects:** Component-AuditLog (#23), Component-InboundAPI (#14)
## Problem
Today the centralized Audit Log captures inbound API request and response bodies
into `RequestSummary` / `ResponseSummary`, but with the global Payload Capture
Policy cap — 8 KB by default, 64 KB on error rows. For inbound API traffic this
is too tight: operators routinely need to replay exactly what an external caller
sent and exactly what we returned. Truncation defeats both replay and the most
common "why did this script see that input / what did the caller actually
receive" debugging path.
## Decision
For `Channel = ApiInbound` rows only, capture `RequestSummary` and
`ResponseSummary` verbatim up to a hard per-body ceiling of **1 MB**
(configurable). The 8 KB / 64 KB default/error caps that apply to other channels
do not apply here. All other channels (`ApiOutbound`, `DbOutbound`,
`Notification`, cached-call lifecycle, `InboundAuthFailure`) keep the existing
policy unchanged.
## Capture Policy Change
The Payload Capture Policy in `Component-AuditLog.md` gains an Inbound API
carve-out:
> **Inbound API exception.** For `Channel = ApiInbound`, `RequestSummary` and
> `ResponseSummary` are captured in full up to a per-body hard ceiling of 1 MB
> (configurable via `AuditLog:InboundMaxBytes`; default 1 048 576 bytes; min
> 8 192; max 16 777 216). The 8 KB / 64 KB default/error caps that apply to
> other channels do not apply here. `PayloadTruncated = 1` is set only when the
> 1 MB ceiling is hit — verbatim capture is the normal case.
The rest of the policy is unchanged:
- Header redact list (`Authorization`, `Cookie`, `Set-Cookie`, `X-API-Key`,
configured regex) still applies.
- Per-target body redactors (regex → replacement, keyed by inbound method name)
still run before persistence.
- The redactor-error safety net (`<redacted: redactor error>` plus
`AuditRedactionFailure` health metric increment) still applies.
- Truncation stays UTF-8 byte-safe when the ceiling *is* hit: the cut never
splits a multi-byte character.
The ceiling applies independently to the request body and the response body —
each gets its own 1 MB budget on a given audit row.
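
As an illustration of the byte-safe truncation and the independent per-body budgets, here is a minimal Python sketch. The function names `truncate_utf8` and `cap_bodies` are hypothetical and not part of the design; the real implementation lives in the audit write path and may differ.

```python
def truncate_utf8(text: str, max_bytes: int) -> tuple[str, bool]:
    """Return (text cut to at most max_bytes of UTF-8, whether it was cut)."""
    data = text.encode("utf-8")
    if len(data) <= max_bytes:
        return text, False
    cut = max_bytes
    # data[cut] is the first byte that would be dropped. If it is a UTF-8
    # continuation byte (0b10xxxxxx), the cut would split a character, so
    # back up until the cut sits on a character boundary.
    while cut > 0 and (data[cut] & 0xC0) == 0x80:
        cut -= 1
    return data[:cut].decode("utf-8"), True


def cap_bodies(request: str, response: str, max_bytes: int = 1_048_576):
    """Apply the ceiling to each body separately (hypothetical helper)."""
    req, req_cut = truncate_utf8(request, max_bytes)
    resp, resp_cut = truncate_utf8(response, max_bytes)
    # A single truncation flag covers the row: set if either body was cut.
    return req, resp, req_cut or resp_cut
```

Because each body is measured on its own, a large request does not eat into the response's budget.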
## Schema
No schema change. `RequestSummary` and `ResponseSummary` are already
`nvarchar(max)`. SQL Server moves `nvarchar(max)` values that do not fit in-row
to out-of-row LOB storage, so queries that do not select these columns do not
pay to read the larger payloads. Only the column description text changes to
reflect the inbound carve-out.
## Ingestion Path
Unchanged. Inbound rows are already a central direct-write from the request-
handler middleware via `ICentralAuditWriter` before the HTTP response is
flushed, and audit-write failure is already fail-soft (logged + increments
`CentralAuditWriteFailures`, never fails the user-facing request).
The only code change at the write site is the cap selection:
```text
maxBytes = channel == ApiInbound
    ? options.InboundMaxBytes            // default 1 MB
    : isErrorRow ? 64*1024 : 8*1024;     // existing policy
```
Redactors run before the cap; the cap is the final byte-budget step before the
INSERT.
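
The ordering (redact first, then apply the selected cap) can be sketched in Python as follows. This is an illustrative sketch only: `select_max_bytes`, `prepare_summary`, and the `redactor` callable are hypothetical names, and the real write site also increments the `AuditRedactionFailure` metric, which is only noted in a comment here.

```python
from typing import Callable, Optional

DEFAULT_CAP = 8 * 1024    # existing policy, non-error rows
ERROR_CAP = 64 * 1024     # existing policy, error rows
REDACTOR_ERROR = "<redacted: redactor error>"


def select_max_bytes(channel: str, is_error_row: bool,
                     inbound_max_bytes: int = 1_048_576) -> int:
    """Pick the byte budget: inbound ceiling for ApiInbound, else 8/64 KB."""
    if channel == "ApiInbound":
        return inbound_max_bytes
    return ERROR_CAP if is_error_row else DEFAULT_CAP


def prepare_summary(body: str, channel: str, is_error_row: bool,
                    redactor: Optional[Callable[[str], str]] = None,
                    inbound_max_bytes: int = 1_048_576) -> tuple[str, bool]:
    """Return (summary text to persist, PayloadTruncated flag)."""
    # Step 1: redact. A failing redactor must never leak the raw body.
    if redactor is not None:
        try:
            body = redactor(body)
        except Exception:
            body = REDACTOR_ERROR  # real system also bumps AuditRedactionFailure
    # Step 2: the cap is the final byte-budget step before the INSERT.
    max_bytes = select_max_bytes(channel, is_error_row, inbound_max_bytes)
    data = body.encode("utf-8")
    if len(data) <= max_bytes:
        return body, False
    # errors="ignore" drops a trailing partial character left by the slice.
    return data[:max_bytes].decode("utf-8", errors="ignore"), True
```

Running the redactor before the cap means the truncation flag reflects the size of the redacted body, not the raw one.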
## Configuration
New option on the existing `AuditLog` options class:
| Key | Default | Min | Max | Description |
|---|---|---|---|---|
| `AuditLog:InboundMaxBytes` | `1048576` | `8192` | `16777216` | Per-body ceiling for `ApiInbound` `RequestSummary` / `ResponseSummary`. Truncation past this is the only case where `PayloadTruncated` is set on an inbound row. |
Bounds enforced on options binding; out-of-range values fail startup with the
same "options validation" path used for other AuditLog settings.
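
A minimal Python sketch of the bounds check, assuming a hypothetical `AuditLogOptions` stand-in for the real options class. It shows only the fail-fast behaviour; the actual validation goes through the existing options-validation path.

```python
from dataclasses import dataclass

INBOUND_MIN_BYTES = 8_192
INBOUND_MAX_BYTES = 16_777_216


@dataclass(frozen=True)
class AuditLogOptions:
    """Hypothetical stand-in for the AuditLog options class."""
    inbound_max_bytes: int = 1_048_576  # AuditLog:InboundMaxBytes default

    def __post_init__(self) -> None:
        # Reject out-of-range values at construction, mirroring the
        # "fail startup" behaviour described above.
        if not INBOUND_MIN_BYTES <= self.inbound_max_bytes <= INBOUND_MAX_BYTES:
            raise ValueError(
                "AuditLog:InboundMaxBytes must be between "
                f"{INBOUND_MIN_BYTES} and {INBOUND_MAX_BYTES}, "
                f"got {self.inbound_max_bytes}"
            )
```

Both bounds are inclusive in this sketch, so `8192` and `16777216` themselves are accepted.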
## Doc Edits
1. **`Component-AuditLog.md`**
   - `RequestSummary` and `ResponseSummary` rows in the schema table: amend
     descriptions to note the `ApiInbound` carve-out (full capture up to
     `InboundMaxBytes`, default 1 MB).
   - Payload Capture Policy section: add the **Inbound API exception**
     paragraph above; add `AuditLog:InboundMaxBytes` to the configuration knobs
     list.
2. **`Component-InboundAPI.md`**
   - Line ~119 (audit row description): "truncated request/response bodies per
     the Audit Log capture policy" → "request/response bodies captured in full
     up to the configured `AuditLog:InboundMaxBytes` ceiling (default 1 MB);
     `PayloadTruncated = 1` only when that ceiling is hit".
   - Line ~202 (Dependencies → Audit Log): mirror the wording adjustment.
## Operational Trade-offs
- **Storage growth.** At 365-day retention, full-body capture on every inbound
request can grow `AuditLog` significantly compared to today's 8 KB cap.
Operators tune by lowering `InboundMaxBytes`, shortening retention via
`AuditLog:RetentionDays`, or — once per-target redaction is configured for
chatty methods — applying body redactors to drop noise. Monthly partition
purge keeps reclamation cheap regardless of row size.
- **No new health metric.** Hitting the 1 MB ceiling is reflected in the
existing `PayloadTruncated` bit; no separate counter in v1. If ceiling-hits
become a real operational signal, an `AuditInboundCeilingHits` metric can be
added later without schema change.
- **Append-only and audit role.** The `scadalink_audit_writer` role already
permits `INSERT` only — full-body rows don't change the security model.
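
To put rough numbers on the storage-growth trade-off, a back-of-envelope estimate can be sketched as below. All traffic figures are made up for illustration, `estimate_body_bytes` is a hypothetical helper, and the result ignores row overhead, indexes, compression, and column encoding.

```python
def estimate_body_bytes(requests_per_day: int,
                        avg_request_bytes: int,
                        avg_response_bytes: int,
                        per_body_cap: int,
                        retention_days: int = 365) -> int:
    """Approximate body bytes retained for inbound rows.

    Applying the cap to average sizes is a simplification: the real cap is
    applied per row, so skewed size distributions will differ.
    """
    per_row = (min(avg_request_bytes, per_body_cap)
               + min(avg_response_bytes, per_body_cap))
    return requests_per_day * retention_days * per_row


# Hypothetical workload: 10,000 requests/day, 20 KB requests, 50 KB responses.
full = estimate_body_bytes(10_000, 20 * 1024, 50 * 1024, per_body_cap=1_048_576)
capped = estimate_body_bytes(10_000, 20 * 1024, 50 * 1024, per_body_cap=8 * 1024)
print(f"full capture: {full / 2**30:.1f} GiB, 8 KB cap: {capped / 2**30:.1f} GiB")
```

Swapping in measured request rates and body sizes gives a first-order input for choosing `InboundMaxBytes` and `AuditLog:RetentionDays`.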
## Not in Scope (Deferred)
- **Structured response capture.** `ResponseSummary` stays a single string;
response status code remains in `HttpStatus`. No separate columns for response
headers or content type. Inbound request headers remain uncaptured.
- **Per-method opt-out** from full capture. If specific methods routinely
produce huge responses, operators use the existing per-target body redactor
to cut them down, or lower the global ceiling.
- **Changes to other channels' caps.** `ApiOutbound`, `DbOutbound`,
`Notification`, cached-call lifecycle rows, and `InboundAuthFailure` keep the
existing 8 KB / 64 KB policy.
## Acceptance Criteria
- [ ] `AuditLog:InboundMaxBytes` option exists on the AuditLog options class,
with the documented default and bounds, validated at startup.
- [ ] Inbound request middleware writes `RequestSummary` and `ResponseSummary`
using the inbound ceiling instead of the 8 KB / 64 KB defaults.
- [ ] Other channels' rows (e.g. an `ApiOutbound.ApiCall` over the limit) still
truncate at 8 KB (64 KB on error rows) — regression-tested.
- [ ] `PayloadTruncated = 1` on an inbound row if and only if the redacted
request body or response body exceeded `InboundMaxBytes`.
- [ ] Header redaction list and per-target body redactors still apply to
inbound rows.
- [ ] Redactor failure on an inbound row still produces `<redacted: redactor
error>` and increments `AuditRedactionFailure`.
- [ ] `Component-AuditLog.md` and `Component-InboundAPI.md` updated as
described in **Doc Edits**.