feat(inboundapi): bound audit capture at InboundMaxBytes (memory safety)

AuditWriteMiddleware previously buffered the FULL request and response
bodies into memory and only let DefaultAuditPayloadFilter trim them
after persistence. A 500 MiB upload allocated 500 MiB of MemoryStream
plus 1 GiB of UTF-16 string transiently before the filter pulled it
back to the 1 MiB inbound ceiling — the cap was real on the persisted
row but not at the capture site.

Inject IOptionsMonitor<AuditLogOptions> and read InboundMaxBytes
per-request (same convention as DefaultAuditPayloadFilter so a live
config change picks up the next request). The request reader now pulls
at most cap + 1 bytes into a UTF-8 byte-safe-truncated string and
rewinds the stream so the endpoint handler still sees the full body.
The response wrap is a new CapturedResponseStream that forwards every
Write / WriteAsync to the real sink (the client still receives all
bytes) while capturing at most cap + 1 bytes for the audit copy. The
middleware now sets PayloadTruncated itself when either body hit the
cap; the filter still OR's its own determination on top.

Adds a project reference from ScadaLink.InboundAPI to
ScadaLink.AuditLog so AuditLogOptions resolves. AuditLog does NOT
reference InboundAPI back, so no cycle is introduced.

Tests:
 - All 21 existing AuditWriteMiddlewareTests still pass (the helper
   gains an optional AuditLogOptions argument; default is the standard
   1 MiB ceiling so existing small-body tests are unaffected).
 - MiddlewareOrderTests' construction site updated for the new ctor
   arg; a StaticAuditLogOptionsMonitor file-local double mirrors the
   InboundChannelCapTests pattern.
 - New RequestBody_AboveInboundMaxBytes_TruncatedToCap_PayloadTruncatedTrue
   pins a 4 KiB cap against a 20 KB body: audit copy <= 4 KiB,
   PayloadTruncated = true, downstream handler reads the full 20 KB.
 - New ResponseBody_AboveInboundMaxBytes_TruncatedToCap_ClientStillReceivesAllBytes_PayloadTruncatedTrue
   pins the same shape on the response side: client sink receives
   20 KB, audit copy <= 4 KiB, PayloadTruncated = true.

InboundAPI test count: 133 -> 135.
This commit is contained in:
Joseph Doherty
2026-05-23 09:25:00 -04:00
parent 651c4b6833
commit 7d87994ac0
4 changed files with 399 additions and 74 deletions

View File

@@ -3,6 +3,8 @@ using System.Text;
using System.Text.Json;
using Microsoft.AspNetCore.Http;
using Microsoft.Extensions.Logging;
using Microsoft.Extensions.Options;
using ScadaLink.AuditLog.Configuration;
using ScadaLink.Commons.Entities.Audit;
using ScadaLink.Commons.Interfaces.Services;
using ScadaLink.Commons.Types.Enums;
@@ -43,13 +45,21 @@ namespace ScadaLink.InboundAPI.Middleware;
/// <b>Body capture.</b> The request body is buffered via
/// <see cref="HttpRequestRewindExtensions.EnableBuffering(HttpRequest)"/> then
/// rewound so the downstream endpoint handler still sees the full payload. The
/// response body is captured by swapping <see cref="HttpResponse.Body"/> for a
/// <see cref="MemoryStream"/> before the pipeline runs; after the pipeline
/// returns, the buffered bytes are copied to the original stream (transparent
/// to the real client) and read into <see cref="AuditEvent.ResponseSummary"/>.
/// Truncation to the configured inbound ceiling happens in
/// <see cref="ScadaLink.AuditLog.Payload.DefaultAuditPayloadFilter"/>; the
/// middleware itself stores the full buffered content.
/// response body is captured by wrapping <see cref="HttpResponse.Body"/> in a
/// forwarding stream that mirrors writes to the original sink (transparent to
/// the real client) while capturing a bounded copy for audit.
/// </para>
///
/// <para>
/// <b>Bounded capture at the source.</b> Both the request- and response-body
/// audit copies are bounded at <see cref="AuditLogOptions.InboundMaxBytes"/>
/// (default 1 MiB) AT THE CAPTURE SITE — we never buffer more than
/// <c>cap + 1</c> bytes per body even when the client streams hundreds of MiB.
/// The downstream handler and the real client still see every byte; only the
/// audit copy is bounded. The cap is also enforced again by
/// <see cref="ScadaLink.AuditLog.Payload.DefaultAuditPayloadFilter"/> (which OR's
/// in its own <see cref="AuditEvent.PayloadTruncated"/> determination), so a
/// row truncated here remains truncated even if the filter is bypassed.
/// </para>
/// </summary>
public sealed class AuditWriteMiddleware
@@ -77,21 +87,29 @@ public sealed class AuditWriteMiddleware
private readonly RequestDelegate _next;
private readonly ICentralAuditWriter _auditWriter;
private readonly ILogger<AuditWriteMiddleware> _logger;
private readonly IOptionsMonitor<AuditLogOptions> _options;
public AuditWriteMiddleware(
RequestDelegate next,
ICentralAuditWriter auditWriter,
ILogger<AuditWriteMiddleware> logger)
ILogger<AuditWriteMiddleware> logger,
IOptionsMonitor<AuditLogOptions> options)
{
_next = next ?? throw new ArgumentNullException(nameof(next));
_auditWriter = auditWriter ?? throw new ArgumentNullException(nameof(auditWriter));
_logger = logger ?? throw new ArgumentNullException(nameof(logger));
_options = options ?? throw new ArgumentNullException(nameof(options));
}
public async Task InvokeAsync(HttpContext ctx)
{
var sw = Stopwatch.StartNew();
// Per-request hot read of the inbound cap — mirrors the convention used
// by DefaultAuditPayloadFilter so a live config change picks up on the
// next request without re-resolving the singleton.
var cap = _options.CurrentValue.InboundMaxBytes;
// Audit Log #23 (ParentExecutionId): mint the inbound request's per-request
// ExecutionId ONCE, here at the start of the request, and stash it on
// HttpContext.Items. Two consumers share this single id:
@@ -109,18 +127,17 @@ public sealed class AuditWriteMiddleware
// of the pipeline for us — but we also rewind to position 0 after our
// own read so the very next reader starts from the top.
ctx.Request.EnableBuffering();
var requestBody = await ReadBufferedRequestBodyAsync(ctx.Request).ConfigureAwait(false);
var (requestBody, requestTruncated) =
await ReadBufferedRequestBodyAsync(ctx.Request, cap).ConfigureAwait(false);
// Response body — swap in a MemoryStream so the pipeline writes are
// captured. The original Response.Body is restored in the finally block,
// and the captured bytes are copied back to it so the real client still
// receives every byte (transparent wrap). The captured string is then
// available for the audit row.
// Response body — wrap Response.Body in a forwarding stream that mirrors
// every write to the original sink (transparent to the real client)
// while capturing AT MOST `cap + 1` bytes for the audit copy. The
// original Response.Body is restored in the finally block.
var originalResponseBody = ctx.Response.Body;
using var responseBuffer = new MemoryStream();
ctx.Response.Body = responseBuffer;
using var captureStream = new CapturedResponseStream(originalResponseBody, cap);
ctx.Response.Body = captureStream;
string? responseBody = null;
Exception? thrown = null;
try
{
@@ -137,14 +154,19 @@ public sealed class AuditWriteMiddleware
{
sw.Stop();
// Whatever the handler managed to write — full success, partial
// success before throwing, or nothing at all — copy back to the
// original stream and read for audit.
responseBody = await DrainResponseBufferAsync(responseBuffer, originalResponseBody)
.ConfigureAwait(false);
// Restore the original stream and resolve the captured audit copy.
// The forwarding wrapper has already written every byte to the
// original sink; this just pulls back the bounded UTF-8 string.
ctx.Response.Body = originalResponseBody;
var (responseBody, responseTruncated) = captureStream.GetCapturedBody();
EmitInboundAudit(ctx, sw.ElapsedMilliseconds, thrown, requestBody, responseBody);
EmitInboundAudit(
ctx,
sw.ElapsedMilliseconds,
thrown,
requestBody,
responseBody,
requestTruncated || responseTruncated);
}
}
@@ -158,7 +180,8 @@ public sealed class AuditWriteMiddleware
long durationMs,
Exception? thrown,
string? requestBody,
string? responseBody)
string? responseBody,
bool payloadTruncated)
{
try
{
@@ -210,7 +233,7 @@ public sealed class AuditWriteMiddleware
ErrorMessage = thrown?.Message,
RequestSummary = requestBody,
ResponseSummary = responseBody,
PayloadTruncated = false,
PayloadTruncated = payloadTruncated,
Extra = extra,
// Central direct-write — no site-local forwarding state.
ForwardState = null,
@@ -231,80 +254,101 @@ public sealed class AuditWriteMiddleware
}
/// <summary>
/// Reads the buffered request body fully into a string and rewinds the
/// stream so the downstream handler sees the unconsumed payload. Returns
/// null for empty/missing bodies so the audit row's
/// Reads the buffered request body up to <paramref name="capBytes"/> bytes
/// into a string for the audit copy and rewinds the stream so the
/// downstream handler sees the unconsumed payload. Returns
/// <c>(null, false)</c> for empty/missing bodies so the audit row's
/// <see cref="AuditEvent.RequestSummary"/> stays null rather than
/// containing an empty string.
/// </summary>
private static async Task<string?> ReadBufferedRequestBodyAsync(HttpRequest request)
/// <remarks>
/// Reads AT MOST <c>cap + 1</c> bytes from the request stream into a
/// scratch buffer; if the extra byte arrives the body is over the cap and
/// the returned string is UTF-8 byte-safe truncated to exactly
/// <c>cap</c> bytes with <c>truncated = true</c>. The cap applies only to
/// the audit copy — the request stream is always rewound to position 0
/// afterwards so the framework's next reader (the endpoint handler's
/// JSON parser) sees the full body.
/// </remarks>
private static async Task<(string? body, bool truncated)> ReadBufferedRequestBodyAsync(
HttpRequest request,
int capBytes)
{
if (request.ContentLength is 0)
{
return null;
return (null, false);
}
try
{
request.Body.Position = 0;
using var reader = new StreamReader(
request.Body,
Encoding.UTF8,
detectEncodingFromByteOrderMarks: false,
bufferSize: 1024,
leaveOpen: true);
var content = await reader.ReadToEndAsync().ConfigureAwait(false);
// Read AT MOST cap + 1 bytes — the extra byte tells us the body was
// over the cap without forcing us to allocate the whole payload.
var limit = capBytes + 1;
var buffer = new byte[limit];
var total = 0;
while (total < limit)
{
var read = await request.Body
.ReadAsync(buffer.AsMemory(total, limit - total))
.ConfigureAwait(false);
if (read == 0)
{
break;
}
total += read;
}
request.Body.Position = 0;
return string.IsNullOrEmpty(content) ? null : content;
if (total == 0)
{
return (null, false);
}
var truncated = total > capBytes;
var bytesForString = truncated ? capBytes : total;
var content = DecodeUtf8Bounded(buffer, bytesForString, cutAtValidBytes: truncated);
return (string.IsNullOrEmpty(content) ? null : content, truncated);
}
catch
{
// A failed body read must not abort the request — fall through
// with a null RequestSummary; the audit row still records the
// outcome.
return null;
return (null, false);
}
}
/// <summary>
/// Copies the bytes buffered in <paramref name="buffer"/> to
/// <paramref name="originalBody"/> (so the real client still receives them)
/// and returns a UTF-8 string copy for <see cref="AuditEvent.ResponseSummary"/>.
/// Returns null when no bytes were written, mirroring the
/// <see cref="ReadBufferedRequestBodyAsync"/> empty-body contract.
/// UTF-8 byte-safe decode of <paramref name="validBytes"/> bytes from
/// <paramref name="bytes"/>. When <paramref name="cutAtValidBytes"/> is
/// <c>true</c> the input is the result of a hard byte-count truncation, so
/// we walk back from <c>validBytes</c> while the byte is a continuation
/// byte (<c>byte &amp; 0xC0 == 0x80</c>) to avoid splitting a multi-byte
/// codepoint. When <c>false</c> the caller is decoding the full payload
/// and the boundary stands as-is.
/// </summary>
private static async Task<string?> DrainResponseBufferAsync(
MemoryStream buffer,
Stream originalBody)
/// <remarks>
/// Mirrors the algorithm in <c>DefaultAuditPayloadFilter.TruncateUtf8</c>;
/// kept local to avoid a backwards project reference from
/// ScadaLink.AuditLog into ScadaLink.InboundAPI.
/// </remarks>
private static string DecodeUtf8Bounded(byte[] bytes, int validBytes, bool cutAtValidBytes)
{
if (buffer.Length == 0)
if (validBytes <= 0)
{
return null;
return string.Empty;
}
buffer.Position = 0;
// Copy first so the client never misses bytes even if the read for audit
// throws somehow (defensive — MemoryStream.CopyToAsync to a sink shouldn't
// throw on its own, but the original body may).
try
var boundary = validBytes;
if (cutAtValidBytes)
{
await buffer.CopyToAsync(originalBody).ConfigureAwait(false);
while (boundary > 0 && (bytes[boundary] & 0xC0) == 0x80)
{
boundary--;
}
}
catch
{
// Best-effort: a sink that refuses our copy is the sink's problem;
// the audit still records what the handler produced. Do NOT rethrow.
}
buffer.Position = 0;
using var reader = new StreamReader(
buffer,
Encoding.UTF8,
detectEncodingFromByteOrderMarks: false,
bufferSize: 1024,
leaveOpen: true);
var content = await reader.ReadToEndAsync().ConfigureAwait(false);
return string.IsNullOrEmpty(content) ? null : content;
return Encoding.UTF8.GetString(bytes, 0, boundary);
}
/// <summary>
@@ -383,4 +427,153 @@ public sealed class AuditWriteMiddleware
return path[(lastSlash + 1)..];
}
/// <summary>
/// Write-only forwarding <see cref="Stream"/> wrapper that mirrors every
/// write to the inner ASP.NET <see cref="HttpResponse.Body"/> (so the real
/// client receives all bytes) while capturing AT MOST <c>cap + 1</c> bytes
/// into a private bounded <see cref="MemoryStream"/> for the audit copy.
/// </summary>
/// <remarks>
/// <para>
/// The inner sink is owned by the framework and is NOT disposed when this
/// wrapper is disposed — we only own the capture <see cref="MemoryStream"/>.
/// </para>
/// <para>
/// All Write overloads forward to the inner stream FIRST, then capture the
/// remaining quota. If the inner sink throws (e.g. the client disconnects),
/// the exception is allowed to propagate — capture is best-effort, the
/// real I/O is authoritative. The handler-throws-mid-response test
/// (<c>ResponseBody_OnHandlerThrow_BodyCapturedUpToTheThrow</c>) verifies
/// that captured bytes up to the throw are still recoverable.
/// </para>
/// </remarks>
private sealed class CapturedResponseStream : Stream
{
private readonly Stream _inner;
private readonly int _capBytes;
private readonly MemoryStream _captured;
private bool _disposed;
public CapturedResponseStream(Stream inner, int capBytes)
{
_inner = inner ?? throw new ArgumentNullException(nameof(inner));
_capBytes = Math.Max(0, capBytes);
// Capture up to cap + 1 bytes so we can detect the over-cap case
// without growing the buffer further.
_captured = new MemoryStream();
}
public override bool CanRead => false;
public override bool CanSeek => false;
public override bool CanWrite => true;
public override long Length =>
throw new NotSupportedException("CapturedResponseStream is write-only.");
public override long Position
{
get => throw new NotSupportedException("CapturedResponseStream is write-only.");
set => throw new NotSupportedException("CapturedResponseStream is write-only.");
}
public override void Flush() => _inner.Flush();
public override Task FlushAsync(CancellationToken cancellationToken) =>
_inner.FlushAsync(cancellationToken);
public override int Read(byte[] buffer, int offset, int count) =>
throw new NotSupportedException("CapturedResponseStream is write-only.");
public override long Seek(long offset, SeekOrigin origin) =>
throw new NotSupportedException("CapturedResponseStream is write-only.");
public override void SetLength(long value) =>
throw new NotSupportedException("CapturedResponseStream is write-only.");
public override void Write(byte[] buffer, int offset, int count)
{
// Forward to the real sink FIRST — the client must never miss
// bytes if capture throws.
_inner.Write(buffer, offset, count);
CaptureBytes(buffer.AsSpan(offset, count));
}
public override void Write(ReadOnlySpan<byte> buffer)
{
_inner.Write(buffer);
CaptureBytes(buffer);
}
public override async Task WriteAsync(
byte[] buffer, int offset, int count, CancellationToken cancellationToken)
{
await _inner.WriteAsync(buffer.AsMemory(offset, count), cancellationToken)
.ConfigureAwait(false);
CaptureBytes(buffer.AsSpan(offset, count));
}
public override async ValueTask WriteAsync(
ReadOnlyMemory<byte> buffer, CancellationToken cancellationToken = default)
{
await _inner.WriteAsync(buffer, cancellationToken).ConfigureAwait(false);
CaptureBytes(buffer.Span);
}
/// <summary>
/// Capture up to <c>cap + 1</c> bytes total into the private
/// <see cref="MemoryStream"/>. Once the cap quota is reached, further
/// bytes are silently dropped from the audit copy (the real sink has
/// already received them upstream of this call).
/// </summary>
private void CaptureBytes(ReadOnlySpan<byte> span)
{
if (span.Length == 0)
{
return;
}
var quota = (_capBytes + 1) - (int)_captured.Length;
if (quota <= 0)
{
return;
}
var take = Math.Min(quota, span.Length);
_captured.Write(span[..take]);
}
/// <summary>
/// Returns the captured response body as a UTF-8 string (byte-safe
/// truncated to <c>cap</c> bytes) and a flag indicating whether the
/// audit copy hit the cap. Returns <c>(null, false)</c> when no bytes
/// were captured, mirroring the request-body empty contract.
/// </summary>
public (string? body, bool truncated) GetCapturedBody()
{
var length = (int)_captured.Length;
if (length == 0)
{
return (null, false);
}
var truncated = length > _capBytes;
var bytes = _captured.GetBuffer();
var bytesForString = truncated ? _capBytes : length;
var content = DecodeUtf8Bounded(bytes, bytesForString, cutAtValidBytes: truncated);
return (string.IsNullOrEmpty(content) ? null : content, truncated);
}
protected override void Dispose(bool disposing)
{
if (!_disposed)
{
if (disposing)
{
// Own only the capture stream; the inner sink belongs to
// the framework's response pipeline.
_captured.Dispose();
}
_disposed = true;
}
base.Dispose(disposing);
}
}
}

View File

@@ -14,6 +14,9 @@
<ItemGroup>
<ProjectReference Include="../ScadaLink.Commons/ScadaLink.Commons.csproj" />
<ProjectReference Include="../ScadaLink.Communication/ScadaLink.Communication.csproj" />
<!-- AuditWriteMiddleware reads AuditLogOptions.InboundMaxBytes to bound
per-request request/response audit capture at the source. -->
<ProjectReference Include="../ScadaLink.AuditLog/ScadaLink.AuditLog.csproj" />
</ItemGroup>
<ItemGroup>