Files
ScadaBridge/code-reviews/InboundAPI/findings.md
T
Joseph Doherty 46cb6965ac fix(security): close Theme 7 — 8 secrets / redaction / append-only findings
Security-sensitive batch, handled main-thread for careful judgment on
secret-leak and pepper-bypass paths.

Secret leak / pepper bypass:
- CD-016 (pepper bypass): InboundApiRepository's GetApiKeyByValueAsync no
  longer hashes the candidate with the unpeppered ApiKeyHasher.Default —
  ctor takes a lazy Func<IApiKeyHasher> accessor (lazy so test composition
  roots without a pepper still bring up the repository), and the DI
  registration wires sp.GetService<IApiKeyHasher>() so the production
  peppered hasher matches the stored KeyHash. Regression test asserts
  positive (peppered roundtrip) AND negative (Default hasher misses the
  same key — proving the lookup uses the injected hasher).
- MgmtSvc-020 (SMTP credential leak): UpdateSmtpConfig/ListSmtpConfigs
  now project through SmtpConfigPublicShape so the response payload and
  audit-row afterState never carry the Credentials field — only a
  HasCredentials bool. The SMTP password / OAuth2 client secret no
  longer leaves the Admin-only UpdateSmtpConfig boundary the caller
  already supplied it to.

Redaction:
- AuditLog-008 (test-fixture under-redact): new
  SafeDefaultAuditPayloadFilter (stateless singleton) does HTTP header
  redaction for the always-sensitive defaults (Authorization, X-Api-Key,
  Cookie, Set-Cookie). FallbackAuditWriter, CentralAuditWriter, and
  AuditLogIngestActor (both ingest paths) default to it instead of null
  — composition roots that bypass AddAuditLog can no longer write
  unredacted auth headers to the audit store.
- NotifService-025 (over-mask): CredentialRedactor.Scrub now only masks
  the last colon-separated component (password / clientSecret) AND only
  if it's >= 12 chars (typical password heuristic). Short user names
  like "root" no longer become global redaction tokens that eat unrelated
  diagnostic text. The full packed string is always masked regardless of
  length. 3 new negative tests pin the no-over-mask contract.

Audit-row correctness / fail-loud:
- InboundAPI-025: Program.cs UseWhen predicate now excludes /api/audit,
  /api/management, /api/centralui, /api/script-analysis AND requires POST
  — the AuditWriteMiddleware no longer emits spurious ApiInbound rows
  for audit-log query/export endpoints (write-on-read recursion broken).
- ESG-021: ApplyAuth now logs Warning (not silent) on empty
  AuthConfiguration for apikey/basic, unknown AuthType, and malformed
  Basic config. AuthConfiguration value NEVER logged. AuthType=none
  remains silent (documented unauthenticated sentinel).
- Security-021: AddSecurity now logs a startup Warning when
  RequireHttpsCookie=false — an HTTP-only deployment that previously
  transmitted the cookie-embedded JWT silently in cleartext is now
  audible in the log.

Defensive:
- CD-021: SwitchOutPartitionAsync's monthBoundary format string now
  yyyy-MM-dd HH:mm:ss.fffffff (datetime2(7) precision) so a future
  sub-second / non-midnight boundary doesn't silently round to the
  wrong partition.

Plus reconciled stale per-module Open-findings counters that had drifted
from earlier sessions (AuditLog, CD, ESG, IAPI, MgmtSvc, NotifService,
Security).

Build clean; all affected test projects green (Host 208, ConfigDB 242,
ESG 69, IAPI 151, MgmtSvc 100, NotifService 55, Security 85, AuditLog
247/248 — 1 pre-existing date-sensitive integration test flake on
PartitionPurgeTests, unrelated). README regenerated: 46 open (was 54).
2026-05-28 08:04:10 -04:00

1300 lines
74 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# Code Review — InboundAPI
| Field | Value |
|-------|-------|
| Module | `src/ScadaLink.InboundAPI` |
| Design doc | `docs/requirements/Component-InboundAPI.md` |
| Status | Reviewed |
| Last reviewed | 2026-05-28 |
| Reviewer | claude-agent |
| Commit reviewed | `1eb6e97` |
| Open findings | 1 |
## Summary
The InboundAPI module is small (8 source files) and the happy-path flow — extract
key, validate, deserialize parameters, execute script, serialize result — is clean
and readable. However the review surfaced several real problems concentrated in two
themes: **concurrency** and **security**. The `InboundScriptExecutor` is a singleton
that mutates a plain `Dictionary` from concurrent ASP.NET request threads with no
synchronization, which can corrupt the handler cache or crash the process under load.
On the security side, API-key comparison is a non-constant-time database string
match (timing oracle), compiled scripts run with no enforcement of the documented
script trust model (forbidden APIs such as `System.IO`/`Process`/`Reflection` are
fully reachable), there is no request-body size limit, and the executor's catch-all
swallows `OperationCanceledException` from genuine client disconnects as a "timeout".
Design-doc adherence is also incomplete: the `Database.Connection()` script API
described in the design doc is entirely absent from `InboundScriptContext`, and the
endpoint never enforces that the API is central-only. Testing covers the validators
well but there is no coverage of the HTTP endpoint, concurrency, or recompilation.
None of the findings are data-loss-class, but the concurrency and trust-model issues
are High severity and should be addressed before production use.
#### Re-review 2026-05-17 (commit `39d737e`)
All 13 findings from the initial review remain `Resolved`; the module source under
`src/ScadaLink.InboundAPI` is unchanged since the last InboundAPI fix commit
(`8dd7412`), which precedes `39d737e`. This re-review re-walked all 10 checklist
categories against the resolved code and surfaced **4 new findings** — none touching
the previously-fixed concurrency/trust-model code, but all in areas the first pass
did not probe deeply: (1) the `ReturnDefinition` column is loaded onto `ApiMethod`
but is never consulted — script return values are serialized verbatim with no shaping
or validation against the declared return structure (InboundAPI-014); (2) the new
`ForbiddenApiChecker` is a purely textual syntax walker and can be bypassed by
reaching forbidden functionality through member access that never spells a forbidden
namespace, e.g. `typeof(x).Assembly.GetType("System.IO.File")` (InboundAPI-015);
(3) routed `Route.To().Call()` invocations are not bound by the method timeout unless
the script explicitly threads `Parameters`-side cancellation, contradicting the design
statement that the timeout covers routed calls (InboundAPI-016); and (4) `RouteHelper`
/ `RouteTarget` — the entire WP-4 cross-site routing surface — has no test coverage
(InboundAPI-017). New findings are one Medium-trio plus one Low; no Critical or High.
## Checklist coverage
| # | Category | Examined | Notes |
|---|----------|----------|-------|
| 1 | Correctness & logic bugs | ☑ | `CoerceValue` returns `null` for legitimately-null/`String` values indistinguishably; parameter-definition edge cases noted. |
| 2 | Akka.NET conventions | ☑ | Module is ASP.NET-hosted, no actors of its own; routes to actors via `CommunicationService`. No correlation-ID issues — IDs are set in `RouteHelper`. |
| 3 | Concurrency & thread safety | ☑ | Singleton `InboundScriptExecutor` mutates a non-thread-safe `Dictionary` from concurrent request threads — see InboundAPI-001/002. |
| 4 | Error handling & resilience | ☑ | Catch-all conflates client cancellation with timeout (InboundAPI-004); compilation-failure path repeats work on every request (InboundAPI-009). |
| 5 | Security | ☑ | Prior items resolved. Re-review: `ForbiddenApiChecker` is a textual deny-list bypassable via reflection without a forbidden namespace token (InboundAPI-015). |
| 6 | Performance & resource management | ☑ | Up to 3 separate DB round-trips per request in `ApiKeyValidator`; uncapped lazy recompilation. |
| 7 | Design-document adherence | ☑ | Re-review: `ReturnDefinition` loaded but never used (InboundAPI-014); routed-call timeout not enforced (InboundAPI-016). Prior `Database.Connection()`/central-only items resolved. |
| 8 | Code organization & conventions | ☑ | `ParameterDefinition` moved to Commons (InboundAPI-012 resolved); no new issues. |
| 9 | Testing coverage | ☑ | Re-review: `RouteHelper`/`RouteTarget` (WP-4 routing) entirely untested (InboundAPI-017); validators/executor/filter well covered. |
| 10 | Documentation & comments | ☑ | `ApiKeyValidationResult.NotFound` XML/name says "NotFound" but returns HTTP 400 — misleading (InboundAPI-013). |
#### Re-review 2026-05-28 (commit `1eb6e97`)
All 17 prior findings remain `Resolved`. The module has grown materially since the
last pass — a new `AuditWriteMiddleware` (Audit Log #23 M4 Bundle D) now lives under
`src/ScadaLink.InboundAPI/Middleware/`, the `ApiKeyValidator` was rewired to hash the
candidate with `IApiKeyHasher` (ConfigurationDatabase-012), and an `IInstanceRouter`
seam was introduced. This re-review re-walked all 10 checklist categories against
`1eb6e97` and surfaced **8 new findings** concentrated on the new audit middleware
and a stranded follow-up from InboundAPI-008:
1. The InboundAPI-008 resolution explicitly deferred registering an `IActiveNodeGate`
implementation in `ScadaLink.Host` as a "follow-up outside this module's scope" —
that follow-up is still unfulfilled (no production registration anywhere in
`src/ScadaLink.Host/`), so the design-mandated standby-node gating is silently
disabled in production today (`InboundAPI-022`, High).
2. `AuditWriteMiddleware` is wired in `Program.cs` against `/api/*` rather than the
specific `POST /api/{methodName}` route, so GETs against `/api/audit/query` and
`/api/audit/export` (audit query endpoints — themselves not script invocations)
now emit spurious `AuditChannel.ApiInbound`/`InboundRequest` rows back into the
audit log with `Target` set to the last path segment (`InboundAPI-025`, Medium).
3. The middleware fires its audit write as `_ = _auditWriter.WriteAsync(evt)` — the
wrapping try/catch only catches synchronous throws, so a faulted async writer
task is unobserved and the row silently disappears with no log line
(`InboundAPI-018`, Low/Medium).
4. `ParentExecutionId` correlation flows only through `RouteToCallRequest`
`RouteToGetAttributesRequest`/`RouteToSetAttributesRequest` have no
`ParentExecutionId` field, so attribute reads/writes from inbound scripts lose
the inbound→site execution-tree link the Audit Log decision in CLAUDE.md
describes (`InboundAPI-021`, Medium).
5. `EndpointExtensions.HandleInboundApiRequest` — the entire wiring composition
that ties validator/executor/route/audit together — has no test coverage; only
the components it composes are tested (`InboundAPI-023`, Low).
6. `EndpointExtensions.HandleInboundApiRequest` does
`ContentType?.Contains("json")` (case-sensitive) so a request with
`application/JSON` and no Content-Length silently skips JSON body parsing
(`InboundAPI-020`, Low).
7. `AuditWriteMiddleware.InvokeAsync` calls `EnableBuffering()` unconditionally
before the empty-body short-circuit, allocating a `FileBufferingReadStream` for
every request including bodyless ones (`InboundAPI-019`, Low).
Severity mix: 1 High, 3 Medium, 4 Low — no Critical. (The eighth finding —
`InboundAPI-024`, Low — is a defensive watch-list item flagging that
`_knownBadMethods` is unbounded; it is bounded *in practice* today by the
configuration database, but the invariant is undocumented.)
## Checklist coverage — 2026-05-28 (commit `1eb6e97`)
| # | Category | Examined | Notes |
|---|----------|----------|-------|
| 1 | Correctness & logic bugs | ☑ | `ContentType?.Contains("json")` is case-sensitive (InboundAPI-020). |
| 2 | Akka.NET conventions | ☑ | ASP.NET-hosted, no actors of its own; routes via `IInstanceRouter`/`CommunicationService`. No new issues. |
| 3 | Concurrency & thread safety | ☑ | `ConcurrentDictionary` handler cache (post-001/002 fix). New audit middleware is per-request scoped, no shared mutable state. No new issues. |
| 4 | Error handling & resilience | ☑ | Audit `WriteAsync` is fire-and-forget; async faults are unobserved (InboundAPI-018). |
| 5 | Security | ☑ | `IActiveNodeGate` not registered in Host — standby-node gating disabled in production (InboundAPI-022). |
| 6 | Performance & resource management | ☑ | `EnableBuffering()` unconditional on bodyless requests (InboundAPI-019); audit middleware wraps `Response.Body` and mints `ExecutionId` for non-script /api routes (InboundAPI-025). |
| 7 | Design-document adherence | ☑ | `ParentExecutionId` not stamped on attribute-read/write routed messages (InboundAPI-021). InboundAPI-008's deferred Host registration still unfulfilled (InboundAPI-022). |
| 8 | Code organization & conventions | ☑ | No new issues. |
| 9 | Testing coverage | ☑ | `EndpointExtensions.HandleInboundApiRequest` composition wiring has no test (InboundAPI-023); middleware/filter/validator/executor/route are individually covered. |
| 10 | Documentation & comments | ☑ | No new issues. |
## Findings
### InboundAPI-001 — Singleton script handler cache mutated without synchronization
| | |
|--|--|
| Severity | High |
| Category | Concurrency & thread safety |
| Status | Resolved |
| Location | `src/ScadaLink.InboundAPI/InboundScriptExecutor.cs:17`, `:32`, `:40`, `:89`, `:123-128` |
**Description**
`InboundScriptExecutor` is registered as a singleton (`ServiceCollectionExtensions.cs:11`)
and its handler cache is a plain `Dictionary<string, Func<...>>` (`InboundScriptExecutor.cs:17`).
`RegisterHandler`, `RemoveHandler`, `CompileAndRegister`, and the lazy-compile path in
`ExecuteAsync` all read and write this dictionary with no lock. ASP.NET serves inbound
API requests on concurrent thread-pool threads, so two requests for an as-yet-uncompiled
method (or a request racing a CLI-triggered `CompileAndRegister`) can mutate the
dictionary concurrently. `Dictionary` is explicitly not safe for concurrent
read/write — this can corrupt internal buckets, throw `InvalidOperationException`,
or return a torn/`null` handler, crashing the request or the process.
**Recommendation**
Replace the `Dictionary` with a `ConcurrentDictionary<string, Func<...>>`, or guard all
access with a lock. For the lazy-compile path use `GetOrAdd` so concurrent first-callers
compile at most once.
**Resolution**
Resolved 2026-05-16 (commit `<pending>`): replaced the plain `Dictionary` handler
cache with a `ConcurrentDictionary`; `RemoveHandler` now uses `TryRemove`; the
lazy-compile path in `ExecuteAsync` compiles outside the cache and inserts atomically
via `GetOrAdd` so concurrent first-callers share one handler. Regression tests
`ConcurrentLazyCompile_SameMethod_DoesNotCorruptCache` and
`ConcurrentRegisterAndExecute_DoesNotThrow` added.
### InboundAPI-002 — Lazy compilation is a check-then-act race with no atomicity
| | |
|--|--|
| Severity | Medium — re-triaged: already fixed by the InboundAPI-001 fix; verified and closed |
| Category | Concurrency & thread safety |
| Status | Resolved |
| Location | `src/ScadaLink.InboundAPI/InboundScriptExecutor.cs:152-161` |
**Description**
`ExecuteAsync` does `if (!_scriptHandlers.TryGetValue(...)) { CompileAndRegister(method); handler = _scriptHandlers[method.Name]; }`.
Even setting aside the unsynchronized dictionary (InboundAPI-001), this is a
check-then-act sequence: between `TryGetValue` failing and the re-read on line 128,
another thread could `RemoveHandler` the entry, causing the indexer on line 128 to
throw `KeyNotFoundException` — an unhandled-in-context exception that is then caught
only by the broad catch on line 143 and reported to the caller as "Internal script
error". Multiple concurrent first-callers will also each compile the same script
redundantly (wasted Roslyn work).
**Recommendation**
Make compile-and-fetch a single atomic operation (`ConcurrentDictionary.GetOrAdd`
with a lazily-evaluated factory, or a per-method lock), and have `CompileAndRegister`
return the handler it produced rather than requiring a separate dictionary read.
**Resolution**
Resolved 2026-05-16 (commit `pending`): re-triage — verified against the current
source, this finding was **already fixed** by the InboundAPI-001 fix. The
`InboundScriptExecutor.cs:152-161` lazy-compile path no longer does check-then-act
re-read: `Compile(method)` runs unconditionally (it never reads the cache) and the
result is published via the atomic `_scriptHandlers.GetOrAdd(method.Name, compiled)`.
There is no separate dictionary indexer read, so the `KeyNotFoundException` race the
finding describes cannot occur, and concurrent first-callers all share the single
handler that `GetOrAdd` keeps. Regression test
`LazyCompile_RacingRemoveHandler_NeverThrowsKeyNotFound` added (asserts a concurrent
`RemoveHandler` storm against lazy-compiling callers never yields the catch-all
"Internal script error"); it passes against the current code, confirming the fix.
### InboundAPI-003 — API key compared with non-constant-time string equality
| | |
|--|--|
| Severity | High |
| Category | Security |
| Status | Resolved |
| Location | `src/ScadaLink.ConfigurationDatabase/Repositories/InboundApiRepository.cs:22-23`, consumed by `src/ScadaLink.InboundAPI/ApiKeyValidator.cs:33` |
**Description**
API-key authentication resolves the key with
`FirstOrDefaultAsync(k => k.KeyValue == keyValue)` — an ordinary equality match
translated to a SQL `WHERE KeyValue = @p` comparison. The secret is matched with
ordinary (early-exit) string/SQL comparison rather than a constant-time comparison,
which is a classic timing side-channel for secret material. Combined with the design's
explicit "no rate limiting" decision, an attacker with network access to the central
API can mount a timing attack to recover valid keys. The API key is the *sole*
credential for the inbound API, so this is the primary authentication path.
**Recommendation**
Look the key up by a non-secret indexed identifier (e.g. a key prefix/id) or fetch
candidate rows, then verify the secret in-process using
`CryptographicOperations.FixedTimeEquals` over the UTF-8 bytes. Preferably store only
a salted hash of the key value and compare hashes. Avoid leaking secret-length and
match-position timing.
**Resolution**
Resolved 2026-05-16 (commit `<pending>`): `ApiKeyValidator` no longer calls the
secret-equality lookup `GetApiKeyByValueAsync` (the SQL `WHERE KeyValue = @secret`
timing oracle). It now fetches all keys via `GetAllApiKeysAsync` and matches the
secret in-process with `CryptographicOperations.FixedTimeEquals` over the UTF-8 bytes,
scanning every candidate so neither match position nor secret length is observable.
Regression tests `ValidateAsync_DoesNotUseSecretEqualityLookup`,
`ValidateAsync_WrongKey_ConstantTimePath_Returns401`, and
`ValidateAsync_KeyOfDifferentLength_Returns401` added. Note: the timing-oracle method
`GetApiKeyByValueAsync` remains on `IInboundApiRepository` (it is outside this module);
removing it from the repository is left as separate follow-up since the validator no
longer depends on it.
### InboundAPI-004 — Client disconnect is misreported as a script timeout
| | |
|--|--|
| Severity | Medium |
| Category | Error handling & resilience |
| Status | Resolved |
| Location | `src/ScadaLink.InboundAPI/InboundScriptExecutor.cs:117-141` |
**Description**
`ExecuteAsync` creates a linked CTS from `httpContext.RequestAborted` and the method
timeout, then catches `OperationCanceledException` and unconditionally returns
"Script execution timed out". When the *client* aborts the request (`RequestAborted`
fires), the same exception type is thrown, so a normal client disconnect is logged as
a timeout (`_logger.LogWarning("Script execution timed out ...")`) and an attempt is
made to write a 500 timeout body to an already-gone connection. This pollutes the
failure log (which the design says is reserved for genuine script errors) and obscures
real timeout incidents.
**Recommendation**
Distinguish the two cancellation sources: if `cancellationToken` (the request token)
is cancelled, treat it as a client abort — do not log a timeout and do not attempt to
write a response. Only when the timeout CTS fired should the result be "timed out".
Check `cts.Token.IsCancellationRequested && !cancellationToken.IsCancellationRequested`
or use a dedicated timeout `CancellationTokenSource` so the two are separable.
**Resolution**
Resolved 2026-05-16 (commit `pending`): `ExecuteAsync` now uses a dedicated timeout
`CancellationTokenSource` (`new CancellationTokenSource(timeout)`) linked with the
request-abort token, so the two cancellation sources are separable. The
`OperationCanceledException` handler reports "Script execution timed out" (and logs a
warning) **only** when the timeout CTS fired and the request token did not; a client
abort instead returns "Request cancelled by client" and logs at Debug — the failure
log stays reserved for genuine script-execution timeouts. `HandleInboundApiRequest`
additionally short-circuits with `Results.Empty` (no warning log, no 500 body write)
when `RequestAborted` is cancelled, since the connection is already gone. Regression
tests `ClientDisconnect_IsNotReportedAsTimeout` and `GenuineTimeout_StillReportedAsTimeout`
added.
### InboundAPI-005 — Compiled API scripts run with no script-trust-model enforcement
| | |
|--|--|
| Severity | High |
| Category | Security |
| Status | Resolved |
| Location | `src/ScadaLink.InboundAPI/InboundScriptExecutor.cs:56-93` |
**Description**
CLAUDE.md's Akka.NET conventions state the script trust model forbids `System.IO`,
`Process`, `Threading`, `Reflection`, and raw network access. `CompileAndRegister`
compiles arbitrary C# with `CSharpScript.Create` and only restricts the *default
imports* (`WithImports("System", ...)`). Imports are a convenience, not a sandbox — a
script can still fully-qualify any type (`System.IO.File.Delete(...)`,
`System.Diagnostics.Process.Start(...)`, `System.Reflection`, raw `Socket`) because
the core framework assemblies are referenced and Roslyn scripting performs no API
allow/deny-listing. Inbound API scripts execute on the central node with the host
process's privileges, so a malicious or buggy method definition has full host access.
Note the Design role authors these scripts (less trusted than Admin), making
enforcement material.
**Recommendation**
Add a compile-time analyzer/`SyntaxWalker` (as the Site Runtime does for instance
scripts) that rejects forbidden namespaces/types before registering a handler, and/or
run scripts under a constrained boundary. At minimum, share the Site Runtime's
forbidden-API checker so the trust model is enforced consistently. Reject the method
(and log) when a violation is found instead of registering it.
**Resolution**
Resolved 2026-05-16 (commit `<pending>`): added `ForbiddenApiChecker`, a Roslyn
`CSharpSyntaxWalker` that statically rejects scripts referencing forbidden namespaces
(`System.IO`, `System.Diagnostics`, `System.Threading` except `Tasks`,
`System.Reflection`, `System.Net`, `System.Runtime.InteropServices`, `Microsoft.Win32`)
whether reached via a `using` directive or a fully-qualified name. `CompileAndRegister`
now runs the check before Roslyn compilation and refuses to register (and logs) a
violating method; `ExecuteAsync`'s lazy-compile path is gated by the same check.
Regression tests `CompileAndRegister_ForbiddenApi_RejectsScript` (theory),
`ExecuteAsync_ForbiddenApiScript_DoesNotRunAndReturnsFailure`, and
`CompileAndRegister_PermittedScript_StillRegisters` added.
### InboundAPI-006 — No request body size limit on the inbound endpoint
| | |
|--|--|
| Severity | Medium |
| Category | Security |
| Status | Resolved |
| Location | `src/ScadaLink.InboundAPI/EndpointExtensions.cs:54-62` |
**Description**
`HandleInboundApiRequest` calls `JsonDocument.ParseAsync(httpContext.Request.Body, ...)`
with no explicit body-size cap and no `[RequestSizeLimit]`/endpoint metadata. Although
Kestrel has a default max request body size, this endpoint accepts arbitrary JSON from
external systems, fully buffers it into a `JsonDocument`, and then `Clone()`s the
root element (`:61`) which materializes the entire document on the heap. With no rate
limiting (a deliberate design choice) a single caller can drive large allocations.
Deep/wide JSON also makes the `CoerceValue` `object`/`list` deserialization
(`ParameterValidator.cs:113,117`) expensive.
**Recommendation**
Set an explicit, modest body-size limit on the endpoint
(`.WithMetadata(new RequestSizeLimitAttribute(...))` or
`IHttpMaxRequestBodySizeFeature`) and consider a `JsonDocumentOptions` `MaxDepth`.
Reject oversized bodies with 413 before buffering.
**Resolution**
Resolved 2026-05-16 (commit `pending`): added `InboundApiEndpointFilter`, an
`IEndpointFilter` applied to `POST /api/{methodName}` via `.AddEndpointFilter<>()`.
It rejects requests whose declared `Content-Length` exceeds `InboundApiOptions.
MaxRequestBodyBytes` (default 1 MiB) with HTTP 413 *before* the handler buffers the
body into a `JsonDocument`, and also lowers the per-request `IHttpMaxRequestBodySizeFeature`
cap so a chunked/unknown-length stream is cut off by Kestrel while being read. The
limit is configurable via the bound `ScadaLink:InboundApi` options section. Regression
tests `OversizedBody_ShortCircuitsWith413_AndDoesNotRunHandler`, `BodyAtLimit_RunsHandler`,
and `FilterCapsMaxRequestBodySizeFeature` added.
### InboundAPI-007 — `Database.Connection()` script API from the design doc is not implemented
| | |
|--|--|
| Severity | Medium |
| Category | Design-document adherence |
| Status | Resolved |
| Location | `src/ScadaLink.InboundAPI/InboundScriptExecutor.cs:188-203` |
**Description**
`Component-InboundAPI.md` ("Script Runtime API -> Database Access") specifies
`Database.Connection("connectionName")` as an available script capability for
querying the configuration/machine-data databases. `InboundScriptContext` exposes only
`Parameters`, `Route`, and `CancellationToken` — there is no `Database` member. Any
method script that follows the documented API will fail to compile. Either the code
is incomplete or the design doc is stale; the two must be reconciled.
**Recommendation**
If database access is in scope, add a `Database` property to `InboundScriptContext`
backed by a connection-factory service. If it is not, remove the "Database Access"
section from `Component-InboundAPI.md` so the design doc stops advertising an absent
API.
**Resolution**
Resolved 2026-05-16 (commit `<pending>`). The drift was confirmed real:
`InboundScriptContext` (`InboundScriptExecutor.cs:188-203`) exposes only
`Parameters`, `Route`, and `CancellationToken` — there is no `Database` member,
so a method script following the documented `Database.Connection("name")` API
would fail to compile. Resolution direction: the design doc is stale, not the
code. Implementing `Database.Connection()` would hand inbound API scripts a
*raw* MS SQL client, in direct tension with the ScadaLink script trust model
(scripts are forbidden `System.IO`, raw network, etc.; `ForbiddenApiChecker`
statically enforces this). Rather than carve a hole in the trust model, the
"Database Access" section was removed from `docs/requirements/Component-InboundAPI.md`
and replaced with an explicit "No direct database access" note explaining that
scripts interact only through the curated `Route`/`Parameters` surfaces, and
that any future data access must go behind a dedicated scoped helper added as an
explicit design change. Code and doc now agree; no code or test change required.
### InboundAPI-008 — Inbound API endpoint not restricted to the active central node
| | |
|--|--|
| Severity | Medium |
| Category | Design-document adherence |
| Status | Resolved |
| Location | `src/ScadaLink.InboundAPI/EndpointExtensions.cs:19-23`, `src/ScadaLink.Host/Program.cs:149` |
**Description**
The design states the Inbound API is "Central cluster only (active node)" and "fails
over with it". `MapInboundAPI` registers `POST /api/{methodName}` unconditionally, and
`Program.cs` maps it inside the central-role branch but with no active-node gating —
unlike `/health/active` which has an `active-node` predicate. A standby central node
will happily serve inbound API calls, executing scripts and `Route.To()` calls from a
non-leader, which can race the active node or run against stale singleton state.
**Recommendation**
Gate the endpoint on active-node status (reuse the cluster `active-node` health check
or a leader-state check) and return 503 on the standby, so Traefik/clients only reach
the live node — consistent with how the Management API and `/health/active` are
treated.
**Resolution**
Resolved 2026-05-16 (commit `pending`): introduced `IActiveNodeGate`, an abstraction
the inbound API uses to ask whether this node is the active (cluster-leader) central
node. The new `InboundApiEndpointFilter` (applied to `POST /api/{methodName}`)
consults the gate and short-circuits a standby node with HTTP 503 before any
auth/script work, so Traefik/clients only reach the live node — consistent with
`/health/active`. The gate is resolved optionally: when no implementation is
registered (non-clustered host / tests) the endpoint defaults to "allow", preserving
prior behaviour. Regression tests `StandbyNode_ShortCircuitsWith503_AndDoesNotRunHandler`,
`ActiveNode_PassesGate_RunsHandler`, and `NoGateRegistered_PassesGate_RunsHandler`
added. **Follow-up (outside this module's scope):** `ScadaLink.Host` should register
an `IActiveNodeGate` implementation backed by `ActiveNodeHealthCheck` /
`Cluster.State.Leader` in the central-role branch of `Program.cs` so the gate is
actually enforced in production; until then the endpoint defaults to "allow".
### InboundAPI-009 — Failed compilation is retried on every subsequent request
| | |
|--|--|
| Severity | Low |
| Category | Performance & resource management |
| Status | Resolved |
| Location | `src/ScadaLink.InboundAPI/InboundScriptExecutor.cs:123-128` |
**Description**
When a method's script fails to compile, `CompileAndRegister` returns `false` and
nothing is stored in `_scriptHandlers`. Every subsequent call to that method re-enters
the lazy-compile branch and recompiles the broken script via Roslyn from scratch.
Roslyn compilation is expensive; a single broken method definition repeatedly invoked
by an external caller (no rate limiting) becomes a CPU amplification vector.
**Recommendation**
Cache the compilation *failure* (e.g. store a sentinel handler that immediately
returns the compile error, or keep a `HashSet` of known-bad method names with the
diagnostic) so a broken script is compiled at most once until the definition is
updated via `CompileAndRegister`.
**Resolution**
Resolved 2026-05-16 (commit pending): confirmed the root cause — a failed `Compile`
stored nothing in `_scriptHandlers`, so every subsequent request re-entered the
lazy-compile branch and re-ran Roslyn. Added a `_knownBadMethods` `ConcurrentDictionary`
of method names whose compilation failed; `ExecuteAsync`'s lazy-compile path
short-circuits before Roslyn when the method is already known-bad, and records the
failure on a fresh failed compile. `CompileAndRegister` also records failures and
clears the record on a successful (re)compile, so a fixed method definition is
re-evaluated. Regression tests `FailedCompilation_IsNotRetriedOnEveryRequest`
(asserts the compile-failure log fires exactly once across 5 requests) and
`FailedCompilation_RecompilesAfterCompileAndRegisterCalledAgain` added.
### InboundAPI-010 — `ParameterValidator` ignores extra body fields and cannot validate Object/List element types
| | |
|--|--|
| Severity | Low |
| Category | Correctness & logic bugs |
| Status | Resolved |
| Location | `src/ScadaLink.InboundAPI/ParameterValidator.cs:64-90`, `:112-118` |
**Description**
Two related correctness gaps: (1) The validator iterates only over *defined*
parameters; any extra top-level fields in the request body are silently ignored
rather than reported, so callers get no feedback on typo'd parameter names. (2) For
`Object` and `List` types the validator only checks the JSON *kind* (`Object`/`Array`)
and then blindly `JsonSerializer.Deserialize`s the raw text — the design's extended
type system describes Objects as "named structure with typed fields" and Lists as
collections "of objects or primitive types", but no field-level or element-level type
validation is performed. Invalid nested structures pass validation and surface only
as runtime script errors.
**Recommendation**
Optionally warn/400 on unexpected body fields. For the extended types, either parse a
richer `ParameterDefinition` (with nested field definitions / element type) and
validate recursively, or document explicitly that Object/List are validated only for
shape — and update the design doc to match.
**Resolution**
Resolved 2026-05-16 (commit pending): both gaps addressed along the lines the
recommendation offers. (1) `ParameterValidator.Validate` now enumerates the request
body's top-level fields and returns an Invalid result naming any field that does not
match a defined parameter (`"Unexpected parameter(s): ..."`), so a typo'd parameter
name is reported instead of silently ignored. (2) For `Object`/`List`, recursive
field/element-level type validation is **deliberately not** added — it requires a
richer nested `ParameterDefinition` schema, a design decision; instead the
shape-only behaviour is now explicitly documented in the `CoerceValue` XML comment so
the code's contract is unambiguous. Re-triage note: the design doc
(`Component-InboundAPI.md` line 43) lists only Boolean/Integer/Float/String as method
parameter types — the Object/List extended types are a CLAUDE.md decision; reconciling
the design doc is out of this module's editable scope and left as a doc-owner
follow-up. Regression tests `UnexpectedBodyField_ReturnsInvalid` and
`OnlyDefinedFields_StillValid` added.
### InboundAPI-011 — Method-existence check leaks to unapproved callers (enumeration oracle)
| | |
|--|--|
| Severity | Low |
| Category | Security |
| Status | Resolved |
| Location | `src/ScadaLink.InboundAPI/ApiKeyValidator.cs:39-52` |
**Description**
`ValidateAsync` returns 400 `Method '{methodName}' not found` when the method does not
exist, but 403 `API key not approved for this method` when it exists but the key is
not approved. A caller holding any valid enabled key can therefore enumerate which
method names exist on the central API by observing 400-vs-403 responses. The error
message also echoes the caller-supplied `methodName` back verbatim into the JSON
response (`EndpointExtensions.cs:47`), a minor reflected-input concern.
**Recommendation**
Return an indistinguishable response (e.g. 403/404) for both "method not found" and
"key not approved" so existence is not observable to unapproved callers. Avoid echoing
raw caller input in error bodies, or sanitize it.
**Resolution**
Resolved 2026-05-16 (commit pending): confirmed — `ValidateAsync` returned 400
`Method '{methodName}' not found` for a missing method but 403
`API key not approved for this method` for an existing-but-unapproved one, an
enumeration oracle, and echoed the caller-supplied method name verbatim. Both cases
now return an identical response: HTTP 403 with the single shared message
`API key not approved for this method` (a `NotApprovedMessage` constant); the
method name is no longer interpolated into any error body, removing both the
existence oracle and the reflected-input concern. Regression tests
`ValidKey_MethodNotFound_IsIndistinguishableFromNotApproved` and
`ValidKey_MethodNotFound_ErrorMessageDoesNotEchoMethodName` added; the pre-existing
`ValidKey_MethodNotFound_Returns400` test was updated to assert the new
indistinguishable contract.
### InboundAPI-012 — `ParameterDefinition` POCO declared in the component project, not Commons
| | |
|--|--|
| Severity | Low |
| Category | Code organization & conventions |
| Status | Resolved |
| Location | `src/ScadaLink.InboundAPI/ParameterValidator.cs:128-133` |
**Description**
`ParameterDefinition` is a persistence-/contract-shaped POCO: it is the deserialized
form of `ApiMethod.ParameterDefinitions` (a column in the configuration database) and
describes the public API contract. CLAUDE.md's code-organization rules place
persistence-ignorant entity/contract types in `ScadaLink.Commons`. Defining it inside
the InboundAPI project means any other component that needs to read or produce method
parameter definitions (e.g. Central UI's method editor, CLI, Management Service)
cannot share the type and will duplicate it.
**Recommendation**
Move `ParameterDefinition` (and a matching return-definition type, if added) to
`ScadaLink.Commons` under the InboundApi entity/types namespace so it is shared by all
components that work with method definitions.
**Resolution**
Resolved 2026-05-16 (commit `<pending>`): root cause confirmed against the source —
`ParameterDefinition` was a persistence-ignorant, API-contract-shaped POCO (the
deserialized form of the `ApiMethod.ParameterDefinitions` configuration-database
column) declared inside the component project, contrary to CLAUDE.md's
code-organization rule that such shared contract types live in `ScadaLink.Commons`.
The type was moved to `src/ScadaLink.Commons/Types/InboundApi/ParameterDefinition.cs`
(namespace `ScadaLink.Commons.Types.InboundApi`) — placed under `Types/` with an
`InboundApi` domain subfolder, matching the existing `Types/Scripts/` precedent, since
the column itself is the persisted form and this type is its deserialized contract
shape (not an EF-mapped entity). It remains a pure POCO with no EF attributes and no
behaviour. `ParameterValidator` now imports the moved type via a `using
ScadaLink.Commons.Types.InboundApi;` directive; a tree-wide search confirmed
`ParameterValidator.cs` was the type's only declaration and only direct consumer (all
other `ParameterDefinition*` matches are the unrelated `ParameterDefinitions` string
property). No return-definition type exists in the codebase — only a `ReturnDefinition`
string column — so none was invented. No behavioural change, so no new runtime
regression test: this is a compile-level type move, and the existing 52
`ScadaLink.InboundAPI.Tests` (including the `ParameterValidator` suite) act as the
regression guard. `dotnet test` for `ScadaLink.InboundAPI.Tests` (52 passed) and
`ScadaLink.Commons.Tests` (226 passed) are green; `dotnet build ScadaLink.slnx`
succeeds with 0 warnings / 0 errors.
### InboundAPI-013 — `ApiKeyValidationResult.NotFound` factory returns HTTP 400, contradicting its name
| | |
|--|--|
| Severity | Low |
| Category | Documentation & comments |
| Status | Resolved |
| Location | `src/ScadaLink.InboundAPI/ApiKeyValidator.cs:78-79` |
**Description**
The static factory is named `NotFound` and is used for the "method not found" case,
but it builds a result with `StatusCode = 400` (Bad Request), not 404. The name
strongly implies 404 and will mislead future maintainers; `EndpointExtensions`
faithfully propagates whatever status code the factory sets, so the misnaming directly
affects the wire contract.
**Recommendation**
Rename the factory to match its behaviour (e.g. `BadRequest`) or change the status
code to 404 if that is the intended contract — and document the chosen "method not
found" status in `Component-InboundAPI.md`'s Error Handling section, which currently
does not list it.
**Resolution**
Resolved 2026-05-16 (commit pending): the misnamed `NotFound` factory (which built a
`StatusCode = 400` result) was the only producer of the "method not found" result,
and the InboundAPI-011 fix made "method not found" return 403 via the existing
`Forbidden` factory instead. The misleading `NotFound` factory is therefore now
**removed entirely** — it has no remaining callers in or out of the module
(`ApiKeyValidationResult` is InboundAPI-internal), eliminating the name/behaviour
contradiction. No separate regression test is needed: the InboundAPI-011 tests cover
the new method-not-found status, and removing dead code cannot regress. Doc-owner
follow-up: `Component-InboundAPI.md`'s Error Handling section still does not list a
"method not found" status; it should note that it is reported as 403 (indistinguishable
from "key not approved"), but that doc edit is outside this module's editable scope.
### InboundAPI-014 — `ReturnDefinition` is loaded but never used; script return value is unshaped/unvalidated
| | |
|--|--|
| Severity | Medium |
| Category | Design-document adherence |
| Status | Resolved |
| Location | `src/ScadaLink.InboundAPI/InboundScriptExecutor.cs:201-205`, `src/ScadaLink.Commons/Entities/InboundApi/ApiMethod.cs:10` |
**Description**
`Component-InboundAPI.md` ("API Method Definition → Return Value Definition" and the
"Response Format" section) specifies that each method has a declared return structure
— "Field names and data types … Supports returning lists of objects" — and that the
success response body is "the method's return value as JSON, with fields matching the
return value definition". The `ApiMethod` entity carries a `ReturnDefinition` column
to hold exactly this. However, nothing in the module ever reads `ReturnDefinition`:
`ExecuteAsync` takes whatever object the script happens to return and does a blind
`JsonSerializer.Serialize(result)`. There is no validation that the script's return
value matches the declared shape, no coercion to the declared field types, and no
error when a method returns a structure inconsistent with its definition. A method
whose script returns the wrong shape (or `null` where a structure is required) will
silently emit a malformed 200 response, and the documented return-definition contract
is effectively unenforced. This is the response-side mirror of the parameter
validation that `ParameterValidator` does perform, leaving the two halves of the
method contract asymmetric.
**Recommendation**
Either (a) implement return-value validation/shaping: parse `ReturnDefinition` with
the same extended-type machinery used for parameters and validate/coerce the script
result before serializing, returning a 500 (or logging) when the script result does
not match; or (b) if return shaping is deliberately out of scope, remove the "Return
Value Definition" / "fields matching the return value definition" language from
`Component-InboundAPI.md` and document that the response is the script's raw return
value serialized as-is. Code and design doc must be reconciled.
**Resolution**
Resolved 2026-05-17 (commit `<pending>`): root cause confirmed — `ApiMethod` carries
`ReturnDefinition` but the executor did a blind `JsonSerializer.Serialize(result)`,
so a script returning the wrong shape silently emitted a malformed 200. Took
option (a): added `ReturnValueValidator`, the response-side mirror of
`ParameterValidator`. It parses `ReturnDefinition` (a JSON array of `{name,type}`
field definitions, same extended-type set as parameters), validates the serialized
script result against it — declared fields must be present with a compatible JSON
type, primitives type-checked, `Object`/`List` shape-checked — and a `null`/non-object
result is rejected when a structure is declared. `InboundScriptExecutor.ExecuteAsync`
now runs the validator after serialization and, on mismatch, logs and returns a
script failure (`"Method return value did not match its return definition"`, → 500)
instead of a malformed 200. A method with no `ReturnDefinition` stays unconstrained
(backward compatible). Doc-owner follow-up (outside this module's editable scope):
the `Component-InboundAPI.md` "Response Format" section may note that return shaping
is validation-only (no coercion). Regression tests: `ReturnValueValidatorTests`
(12 cases) plus executor-level `ReturnValue_MatchingReturnDefinition_Succeeds`,
`ReturnValue_NotMatchingReturnDefinition_ReturnsFailureNotMalformed200`, and
`ReturnValue_NoReturnDefinition_IsUnconstrained`.
### InboundAPI-015 — `ForbiddenApiChecker` is purely textual and is bypassable via reflection reachable without a forbidden namespace token
| | |
|--|--|
| Severity | Medium |
| Category | Security |
| Status | Resolved |
| Location | `src/ScadaLink.InboundAPI/ForbiddenApiChecker.cs:63-119`, `src/ScadaLink.InboundAPI/InboundScriptExecutor.cs:109-126` |
**Description**
`ForbiddenApiChecker` walks the script syntax tree and rejects any `using` directive,
`QualifiedNameSyntax`, or `MemberAccessExpressionSyntax` whose textual dotted name
starts with a forbidden namespace prefix (`System.IO`, `System.Diagnostics`,
`System.Reflection`, `System.Net`, etc.). This is a textual match, not a semantic
one, and the trust model it enforces (per InboundAPI-005) is explicitly meant to keep
*untrusted* Design-role scripts away from host APIs. The check can be bypassed because
forbidden functionality is reachable through member access that never spells a
forbidden namespace:
- `typeof(string).Assembly.GetType("System.IO.File")``typeof(string)` is permitted,
`.Assembly` is a `System.Type` property, `.GetType(string)` is a `System.Reflection.Assembly`
method. The string literal `"System.IO.File"` is a string, not a `QualifiedNameSyntax`
or `MemberAccessExpressionSyntax`, so `IsForbidden` never sees it. The script obtains
a `System.IO.File` `Type` and can `InvokeMember`/`GetMethod(...).Invoke(...)` on it —
all via members of permitted types — with no forbidden namespace ever appearing in
the source. `CompileAndRegister` references `typeof(object).Assembly`
(System.Private.CoreLib) in `ScriptOptions`, so every framework type is loadable at
runtime.
- The executor also references the `Microsoft.CSharp.RuntimeBinder` assembly
(`InboundScriptExecutor.cs:116`), enabling the `dynamic` keyword, which further
widens late-bound member access that the static walker cannot see through.
Because the inbound API script runs on the central node with the host process's
privileges and is authored by the (less-trusted-than-Admin) Design role, a static
textual deny-list gives a false sense of containment.
**Recommendation**
Treat the syntax walker as defence-in-depth, not the boundary. Strengthen it where
cheap (flag `Assembly.GetType`, `Type.GetType`, `Activator.CreateInstance`,
`InvokeMember`, and `dynamic` usage), but for real enforcement run compiled scripts
under a genuine boundary — a restricted `AssemblyLoadContext`/AppDomain-equivalent, a
curated reference set that does not expose reflection-to-arbitrary-type, or an
out-of-process sandbox — consistent with however the Site Runtime ultimately enforces
its instance-script trust model. At minimum, document in `Component-InboundAPI.md`
that the current check is best-effort and does not stop a determined script.
**Resolution**
Resolved 2026-05-17 (commit `<pending>`): the specific reflection-via-permitted-member
vector was confirmed and the textual checker materially hardened against it (full
sandboxing remains a separate, larger design effort — see below). `ForbiddenApiWalker`
now, in addition to the namespace deny-list, rejects a curated set of reflection-gateway
**member names** (`GetType`, `GetTypeInfo`, `Assembly`, `Module`, `CreateInstance`,
`InvokeMember`, `GetMethod(s)`, `GetConstructor(s)`, `GetField(s)`, `GetProperty(ies)`,
`GetMember(s)`, `GetRuntimeMethod(s)`, `MethodHandle`, `TypeHandle`) regardless of the
receiver expression — so `typeof(string).Assembly.GetType("System.IO.File")` is now
caught because `.Assembly` and `.GetType` appear as accessed member names. It also
rejects a bare `Activator` identifier and the `dynamic` keyword (which widens
late-bound access the static walker cannot see through). `Invoke` is deliberately
**not** flagged so legitimate `Action`/`Func` delegate invocation still compiles —
the reflection `MethodInfo.Invoke` path is cut off by rejecting the `GetMethod` that
produces the `MethodInfo`. **Documented limitation:** this is hardened defence-in-depth,
not a true sandbox — a determined author may still find a vector the syntax walker
cannot see (e.g. via `Microsoft.CSharp.RuntimeBinder` internals or generics tricks).
Genuine containment needs a runtime boundary (restricted `AssemblyLoadContext` /
curated reference set that does not expose reflection-to-arbitrary-type / out-of-process
sandbox); that is tracked as a future design change and noted in the `ForbiddenApiChecker`
XML summary. Regression tests: new `ForbiddenApiCheckerTests` suite (19 cases) covering
the `Assembly`/`GetType`/`Type.GetType`/`Activator.CreateInstance`/`InvokeMember`/
`GetMethod`/`GetTypeInfo`/`dynamic` bypass vectors plus permitted-script and
namespace-deny-list regression guards.
### InboundAPI-016 — Routed `Route.To().Call()` invocations are not bound by the method timeout
| | |
|--|--|
| Severity | Medium |
| Category | Design-document adherence |
| Status | Resolved |
| Location | `src/ScadaLink.InboundAPI/RouteHelper.cs:59-152`, `src/ScadaLink.InboundAPI/InboundScriptExecutor.cs:177`, `:199` |
**Description**
`Component-InboundAPI.md` states the per-method timeout "defines the maximum time the
method is allowed to execute (**including any routed calls to sites**)", and the
Routing Behavior section says a routed call "blocks until the site responds or the
**method-level timeout** is reached". The executor builds a linked
`CancellationTokenSource` (`cts`) combining the request-abort token and a dedicated
timeout CTS, and exposes `cts.Token` to the script as `InboundScriptContext.CancellationToken`.
However, every `RouteTarget` method (`Call`, `GetAttribute(s)`, `SetAttribute(s)`)
takes `CancellationToken cancellationToken = default` and the script must *explicitly*
pass the context token for the routed call to honour the timeout. A natural script —
`Route.To("inst").Call("doWork", parameters)` — invokes the routed call with
`CancellationToken.None`. That request flows into `CommunicationService.RouteToCallAsync`
with no cancellation, so the routed call is not bounded by the method timeout at all.
The only timeout guard left is `handler(context).WaitAsync(cts.Token)` in
`ExecuteAsync`: when the method timeout fires, `WaitAsync` returns a cancellation to
the caller, but the underlying script `Task` — and the in-flight `RouteToCallAsync`
awaiting a remote site — keeps running orphaned with no cancellation, holding the
correlation/communication resources until the site eventually responds or its own
transport timeout (if any) fires. The design's guarantee that the method timeout
covers routed calls is therefore not met, and a slow/hung site can leak background
work past the timeout the caller was told bounds the request.
**Recommendation**
Make routed calls inherit the method deadline without relying on script discipline:
have `RouteHelper`/`RouteTarget` carry the executing method's `CancellationToken`
(injected by `InboundScriptExecutor` when it constructs the context, e.g. a
`RouteHelper` bound to `cts.Token`) and pass it into every `CommunicationService`
call by default, so `Route.To("x").Call("s", p)` is timeout-bounded with no token
argument. Keep the explicit-token overload for callers that want a tighter bound.
Verify `RouteToCallAsync` and the attribute-routing calls actually observe the token
and abandon the in-flight request when it fires.
**Resolution**
Resolved 2026-05-17 (commit `<pending>`): root cause confirmed — every `RouteTarget`
method took `CancellationToken cancellationToken = default`, so a natural script
`Route.To("inst").Call("doWork", p)` routed with `CancellationToken.None` and was not
bound by the method timeout at all. `RouteHelper` now carries the executing method's
deadline token: `InboundScriptExecutor.ExecuteAsync` calls the new
`RouteHelper.WithDeadline(cts.Token)` when it builds the script context, so the route
helper handed to the script is bound to the method-level timeout CTS. Each
`RouteTarget` method resolves an *effective* token — the explicitly-supplied token if
the caller passed one (tighter bound preserved), otherwise the method deadline — and
forwards it into both `IInstanceLocator` site resolution and the routed call. The
deadline token therefore flows through to `CommunicationService.RouteTo*Async`, so
an in-flight routed call observes cancellation when the method timeout fires instead
of running orphaned. Regression tests (in the new `RouteHelperTests`):
`Call_WithNoExplicitToken_InheritsMethodDeadlineToken`,
`Call_WhenMethodDeadlineCancelled_RoutedCallObservesCancellation`,
`Call_ExplicitToken_OverridesDeadlineToken`,
`GetAttributes_WithNoExplicitToken_InheritsMethodDeadlineToken`,
`SetAttributes_WithNoExplicitToken_InheritsMethodDeadlineToken`.
### InboundAPI-017 — `RouteHelper` / `RouteTarget` has no test coverage
| | |
|--|--|
| Severity | Low |
| Category | Testing coverage |
| Status | Resolved |
| Location | `src/ScadaLink.InboundAPI/RouteHelper.cs:1-165`, `tests/ScadaLink.InboundAPI.Tests/` |
**Description**
`RouteHelper`/`RouteTarget` is the entire WP-4 cross-site routing surface — the
`Route.To().Call()/GetAttribute(s)/SetAttribute(s)` API that inbound API scripts use
to reach instances at any site. It has zero tests: the `ScadaLink.InboundAPI.Tests`
project covers `ApiKeyValidator`, `ParameterValidator`, `InboundScriptExecutor`, and
`InboundApiEndpointFilter`, but no test file exercises `RouteHelper`. Untested
behaviours include site resolution via `IInstanceLocator` (including the
"instance not found / no assigned site" `InvalidOperationException` path at
`RouteHelper.cs:154-164`), the `!response.Success``InvalidOperationException`
translation in each routed method, `GetAttribute` delegating to the batch
`GetAttributes` and returning `null` for an absent key, correlation-ID generation,
and `SetAttribute` delegating to `SetAttributes`. These are non-trivial branches
whose failure modes (a thrown exception inside a script) surface to the caller as a
500, so regressions would be silent.
**Recommendation**
Add a `RouteHelperTests` suite using substituted `IInstanceLocator` and
`CommunicationService` (the executor tests already substitute `CommunicationService`):
cover the happy path of each routed method, the unresolved-instance throw, the
`!Success``InvalidOperationException` mapping, and `GetAttribute` returning `null`
for a missing key. This also gives InboundAPI-016 a regression home if the timeout
wiring is added.
**Resolution**
Resolved 2026-05-17 (commit `<pending>`): confirmed — `ScadaLink.InboundAPI.Tests` had
no file exercising `RouteHelper`/`RouteTarget`. To make the surface testable without a
live actor system, an `IInstanceRouter` seam was introduced in the module (the routing
transport `RouteHelper` depends on); the production `CommunicationServiceInstanceRouter`
delegates to `CommunicationService` and is registered by `AddInboundAPI`. `RouteHelper`
now depends on `IInstanceLocator` + `IInstanceRouter` (both substitutable). Added the
`RouteHelperTests` suite (15 cases) covering: the happy path of `Call`/`GetAttribute(s)`/
`SetAttribute(s)`, correlation-ID generation, the unresolved-instance
`InvalidOperationException` path, the `!Success``InvalidOperationException` mapping
for each routed method, `GetAttribute` delegating to the batch `GetAttributes` and
returning `null` for an absent key, `SetAttribute` delegating to `SetAttributes`, and
the InboundAPI-016 deadline-token inheritance behaviour. All 15 pass.
### InboundAPI-018 — `AuditWriteMiddleware` fires `WriteAsync` as `_ = task` — faulted async writes are unobserved
| | |
|--|--|
| Severity | Medium |
| Category | Error handling & resilience |
| Status | Resolved |
| Location | `src/ScadaLink.InboundAPI/Middleware/AuditWriteMiddleware.cs:257` |
**Resolution (2026-05-28):** kept the fire-and-forget (audit emission must
never block or alter the user-facing response per alog.md §13) but added
`ObserveAuditWriteFault`, a small helper that attaches an
`OnlyOnFaulted` `ContinueWith` to the writer task — an asynchronously-faulted
audit write now logs a `Warning` (with the captured exception, method, path,
and HTTP status) instead of vanishing into `TaskScheduler.UnobservedTaskException`.
The continuation runs off-thread on `TaskScheduler.Default` so the response
hot path is unchanged. Regression test
`AuditWriter_AsyncFault_IsObserved_AsWarning_AndDoesNotAlterResponse` uses an
async-yielding throwing writer to prove the post-async fault is logged and the
response stays 200.
**Description**
`EmitInboundAudit` calls `_ = _auditWriter.WriteAsync(evt);` — the returned `Task` is
discarded with the discard operator inside a synchronous `try` block. The wrapping
`try/catch (Exception ex)` (lines 198266) only catches a *synchronous* throw before
the writer returns a task. Once `WriteAsync` returns a task, any exception that
faults that task (e.g. a DB timeout in the central audit writer, a serialization
failure, a cancellation that bubbles up) is never observed: it is not logged, it
does not increment the `CentralAuditWriteFailures` health-monitoring counter the
design doc references ("Fail-soft semantics" paragraph), and the audit row is
silently lost. In .NET, unobserved task exceptions are eventually surfaced via
`TaskScheduler.UnobservedTaskException` and may also be logged by the runtime —
either way, the middleware itself has no control over what (if anything) happens
on a fault. The XML doc comment at line 255 claims "the writer itself swallows"
but this is an implicit cross-component contract: the abstraction
`ICentralAuditWriter.WriteAsync` returns `Task` and makes no such guarantee, and
the only test that exercises a throwing writer (`AuditWriter_Throws_*` in
`AuditWriteMiddlewareTests.cs`) uses an `OnWrite` callback that throws
*synchronously*, not asynchronously — so the async fault path is not covered by
tests either.
This matters because Component-InboundAPI.md states that audit-emission failures
must increment `CentralAuditWriteFailures` (Health Monitoring #11) — a counter
that, with the current fire-and-forget, will under-count async-faulted writes.
**Recommendation**
Either (a) await the write and rely on the surrounding try/catch to log the
failure, accepting an extra await on the request hot path; or (b) keep the
fire-and-forget for latency but attach a `ContinueWith(t => ..., OnlyOnFaulted)`
that logs the fault and increments the failure counter, so a faulted async write
is at least observed. Option (b) preserves "audit emission never blocks the HTTP
response" while restoring the visibility the design assumes. Add a regression
test using a writer whose `WriteAsync` returns a faulted `Task` (not a
synchronous throw) to pin the new contract.
### InboundAPI-019 — `EnableBuffering()` called unconditionally on every request, including bodyless requests
| | |
|--|--|
| Severity | Low |
| Category | Performance & resource management |
| Location | `src/ScadaLink.InboundAPI/Middleware/AuditWriteMiddleware.cs:141` |
| Status | Resolved |
**Resolution (2026-05-28):** Added a `RequestHasBody(HttpRequest)` guard that returns `true` only when `ContentLength > 0` or the method is POST / PUT / PATCH (the `HttpMethods.IsPost/Put/Patch` helpers); the `EnableBuffering` + `ReadBufferedRequestBodyAsync` call now sits behind that guard. Bodyless GET / HEAD / DELETE / TRACE / OPTIONS requests (and any explicit `Content-Length: 0`) skip the `FileBufferingReadStream` allocation; body-carrying methods with no Content-Length (chunked POST etc.) still buffer. Regression tests `BodylessMethod_SkipsEnableBuffering_RequestStreamIsNotReplaced` (GET/HEAD/DELETE theory), `BodylessPost_ContentLengthZero_SkipsEnableBuffering`, and `PostWithBody_StillEnablesBuffering_AndCapturesRequestSummary` (anti-regression).
**Description**
`InvokeAsync` always calls `ctx.Request.EnableBuffering()` before the empty-body
short-circuit at `ReadBufferedRequestBodyAsync` line 289 (`if (request.ContentLength
is 0) return (null, false);`). `EnableBuffering()` swaps the request stream for a
`FileBufferingReadStream` whose construction allocates an internal buffer (default
threshold ~30 KB before spilling to a temp file) regardless of whether the request
actually has a body. The /api scope this middleware lives under will see at least
some bodyless requests (e.g. GET `/api/audit/query` once that route is in the same
branch — see InboundAPI-025; future health checks; misbehaving clients) and each
one pays the buffering allocation cost for no benefit.
**Recommendation**
Defer the `EnableBuffering()` call into `ReadBufferedRequestBodyAsync` after the
`ContentLength is 0` check, or short-circuit in `InvokeAsync` before enabling
buffering when `ContentLength is 0` and `Method is "GET" or "HEAD" or "DELETE"`.
The win is a per-request `FileBufferingReadStream` allocation avoided on every
bodyless request through the middleware.
### InboundAPI-020 — `ContentType.Contains("json")` is case-sensitive; `application/JSON` with no Content-Length skips body parsing
| | |
|--|--|
| Severity | Low |
| Category | Correctness & logic bugs |
| Status | Resolved |
| Location | `src/ScadaLink.InboundAPI/EndpointExtensions.cs:70` |
**Resolution (2026-05-28):** swapped the case-sensitive `Contains("json")`
substring match for `Contains("json", StringComparison.OrdinalIgnoreCase)` so
`application/JSON` / `Application/Json` / `APPLICATION/JSON` all enter the
JSON-deserialization path. Regression test
`EndpointContentTypeTests.ContentTypeCheck_IsCaseInsensitive_ParsesBodyForAnyCasing`
(theory, 4 casings) drives a TestServer-hosted `MapInboundAPI` end-to-end with
a required Integer parameter — proving body parsing actually ran by asserting
the parameter reaches the handler.
**Description**
`HandleInboundApiRequest` parses the JSON body only when
`httpContext.Request.ContentLength > 0 || httpContext.Request.ContentType?.Contains("json") == true`.
The `string.Contains(string)` overload used here is case-sensitive — a perfectly
valid HTTP header `Content-Type: application/JSON` (uppercase) would yield
`false` (`"application/JSON".Contains("json")` is `false`). With no
Content-Length (e.g. chunked transfer-encoding) and an uppercase content type,
the handler then leaves `body = null` and `ParameterValidator.Validate` runs
against a missing body — so a method that declares any required parameter is
rejected with "Missing required parameters" even though the caller did send a
well-formed JSON body. HTTP RFC 7230 §3.2 makes header field names case-insensitive
but is silent on values; in practice clients do sometimes uppercase media-type
tokens, and the framework's own `MediaTypeHeaderValue` is case-insensitive.
**Recommendation**
Use the case-insensitive overload —
`httpContext.Request.ContentType?.Contains("json", StringComparison.OrdinalIgnoreCase) == true`
— or rely on the framework's `IsJson` check via
`MediaTypeHeaderValue.TryParse`/`HttpRequest.HasJsonContentType()`. Add a
regression test posting with `application/JSON` and Transfer-Encoding: chunked.
### InboundAPI-021 — `ParentExecutionId` correlation flows only through `Call`; attribute reads/writes lose the inbound→site execution-tree link
| | |
|--|--|
| Severity | Medium |
| Category | Design-document adherence |
| Status | Resolved |
| Location | `src/ScadaLink.InboundAPI/RouteHelper.cs:141-143`, `:182-183`, `:225-226`; `src/ScadaLink.Commons/Messages/InboundApi/RouteToInstanceRequest.cs:15-21`, `:36-40`, `:55-59` |
**Description**
CLAUDE.md's Centralized Audit Log section describes `ParentExecutionId` as the
cross-execution spawn pointer that "every row of a spawned run carries" and
specifically calls out "the inbound API → routed-site-script case". The current
implementation honours this only on `RouteToCallRequest` — which carries
`ParentExecutionId` as its trailing additive field (line 21 of
`RouteToInstanceRequest.cs`) and is stamped by `RouteTarget.Call` with the
inbound request's execution id at line 143 of `RouteHelper.cs`.
`RouteToGetAttributesRequest` and `RouteToSetAttributesRequest`, however, have
**no `ParentExecutionId` field** and the matching `RouteTarget.GetAttributes` /
`SetAttributes` methods (`RouteHelper.cs:182-183`, `:225-226`) never reference
`_parentExecutionId`. So when an inbound API script reads or writes a site
attribute via `Route.To("inst").GetAttribute(...)` /
`Route.To("inst").SetAttribute(...)`, the site-side audit row for that
trust-boundary action (an outbound-by-the-script DB / OPC write at the site) is
emitted with `ParentExecutionId = null` and the execution-tree walk
`IX_AuditLog_ParentExecution` cannot link it back to the spawning inbound
request. The two-row pair (inbound + spawned site work) reverts to the
"top-level / null" state the design says is the *fallback* for non-spawned runs.
The asymmetry between `Call` and `GetAttributes`/`SetAttributes` is also surprising
— a script author would reasonably expect the same correlation across all
`Route.To(...)` calls.
**Recommendation**
Add a trailing `Guid? ParentExecutionId = null` field to
`RouteToGetAttributesRequest` and `RouteToSetAttributesRequest` (additive
trailing member, matches the message-evolution rule in CLAUDE.md); stamp it
from `_parentExecutionId` in `RouteTarget.GetAttributes` and
`RouteTarget.SetAttributes`; have the site-side handlers thread the field onto
their emitted audit rows. Add a `RouteHelperTests` regression case asserting
that an attribute read/write carries the inherited `ParentExecutionId`.
**Resolution (2026-05-28):**
Wire fix landed — `RouteToGetAttributesRequest` and `RouteToSetAttributesRequest`
now carry a trailing additive `Guid? ParentExecutionId = null`, mirroring
`RouteToCallRequest`; `RouteTarget.GetAttributes` and `RouteTarget.SetAttributes`
stamp `_parentExecutionId` onto the request, so the field travels off the
inbound API box symmetrically across all three `Route.To()` verbs. Four
regression tests added to `RouteHelperTests` (with/without parent id for both
verbs).
Site-side audit emission for routed reads/writes is NOT currently wired —
`DeploymentManagerActor.RouteInboundApiGetAttributes` / `…SetAttributes` and
`InstanceActor.HandleGetAttribute` / `HandleSetStaticAttribute` /
`HandleSetDataAttribute` do not call `IAuditWriter` today (only script-driven
`AuditingDbConnection` / `AuditingDbCommand` paths emit audit rows on the site
side). The `ParentExecutionId` is now available on the wire so once those
audit emissions land (tracked separately under the Audit Log site-wiring
backlog), they can stamp the parent id without any further plumbing.
### InboundAPI-022 — `IActiveNodeGate` has no production registration in Host — standby-node gating is silently disabled in production
| | |
|--|--|
| Severity | High |
| Category | Security |
| Status | Resolved |
| Location | `src/ScadaLink.InboundAPI/IActiveNodeGate.cs`, `src/ScadaLink.InboundAPI/InboundApiEndpointFilter.cs:52-60`; absent from `src/ScadaLink.Host/Program.cs` |
**Resolution** — Added `src/ScadaLink.Host/Health/ActiveNodeGate.cs`, a production `IActiveNodeGate` implementation backed by `AkkaHostedService` that mirrors `ActiveNodeHealthCheck`'s leadership probe (member status `Up` AND `Cluster.State.Leader == SelfAddress`), and registered it as a singleton in the central-role branch of `Program.cs`. A structural regression test (`CentralCompositionRootTests.Central_IActiveNodeGate_IsRegisteredAsActiveNodeGate`) reflects over the built `IServiceProvider` to assert the registration's existence and concrete type — failing on `main` and passing after the fix. The `InboundApiEndpointFilter`'s fall-through-to-allow behaviour is retained as the documented safe default for non-clustered hosts and tests.
**Description**
InboundAPI-008's resolution adds `IActiveNodeGate` (lines 1724 of
`IActiveNodeGate.cs`) so a standby central node can refuse to serve the inbound
API. `InboundApiEndpointFilter.InvokeAsync` consults the gate at line 52
(`var gate = httpContext.RequestServices.GetService<IActiveNodeGate>();`), and
when `gate is { IsActiveNode: false }` returns HTTP 503. The filter's behaviour
when **no implementation is registered** (line 51 comment) is to fall through and
serve the request — the resolution paragraph for InboundAPI-008 closes with:
> "Follow-up (outside this module's scope): `ScadaLink.Host` should register an
> `IActiveNodeGate` implementation backed by `ActiveNodeHealthCheck` /
> `Cluster.State.Leader` in the central-role branch of `Program.cs` so the gate is
> actually enforced in production; until then the endpoint defaults to "allow"."
A grep of the entire `src/ScadaLink.Host/` tree at `1eb6e97` finds **zero**
`IActiveNodeGate` registrations: `grep -rn "IActiveNodeGate\|AddSingleton.*ActiveNode"
src/ScadaLink.Host/` returns no matches. The follow-up was never carried out. So
in production today the standby central node still serves the inbound API exactly
as InboundAPI-008 described — executes method scripts, runs `Route.To()` calls,
races the active node, and may operate against stale singleton state. The new
infrastructure (interface + filter check) is present but unwired; from the user's
perspective the original High-severity issue is unresolved in deployed binaries.
The design says the inbound API is "Central cluster only (active node)" and
"fails over with it" — this guarantee is not currently enforced in production.
**Recommendation**
Register an `IActiveNodeGate` implementation in the central-role branch of
`ScadaLink.Host/Program.cs`. The natural backing is the existing
`ActiveNodeHealthCheck` (already wired for `/health/active`) or a direct read of
`Cluster.Get(actorSystem).State.Leader == Cluster.Get(actorSystem).SelfAddress`.
Add an integration test in the Host that spins up the central role and asserts
that the gate is resolvable and returns `IsActiveNode` consistent with cluster
leader state. Until that wiring lands, this finding is the user-facing
realisation of the InboundAPI-008 vulnerability.
### InboundAPI-023 — `EndpointExtensions.HandleInboundApiRequest` composition wiring has no test coverage
| | |
|--|--|
| Severity | Low |
| Category | Testing coverage |
| Status | Open |
| Location | `src/ScadaLink.InboundAPI/EndpointExtensions.cs:31-140`, `tests/ScadaLink.InboundAPI.Tests/` |
**Description**
The endpoint handler `HandleInboundApiRequest` is the wiring composition that
ties the validator → JSON parse → `ParameterValidator``InboundScriptExecutor`
result-serialization path together; it is the single piece of code that maps
validator status codes to HTTP responses, threads the `parentExecutionId` from
`HttpContext.Items` into the executor, stashes the resolved API key name as
`AuditActorItemKey`, and emits the request-aborted short-circuit. The test
project covers each composed component (`ApiKeyValidatorTests`,
`ParameterValidatorTests`, `InboundScriptExecutorTests`, `RouteHelperTests`,
`InboundApiEndpointFilter`, `AuditWriteMiddlewareTests`,
`MiddlewareOrderTests`) but no test exercises `HandleInboundApiRequest` itself —
so regressions in the wiring (e.g. forgetting to stash the actor name on
`HttpContext.Items`, the `Contains("json")` case sensitivity from
InboundAPI-020, or accidentally swapping `validationResult.StatusCode` for a
literal) are not caught.
**Recommendation**
Add an `EndpointExtensionsTests` suite using `TestServer` (the same pattern
`MiddlewareOrderTests` uses) covering: the happy path (200 + body), invalid
JSON (400), validator 401, validator 403, parameter-validation failure (400),
script-failure 500, client-aborted short-circuit (`Results.Empty`), and the
actor-stash invariant (HttpContext.Items[AuditActorItemKey] is set with the
resolved key name after successful auth, but is absent on auth failures).
### InboundAPI-024 — `_knownBadMethods` is unbounded — an attacker can grow the cache by spamming distinct method names against the audit middleware path
| | |
|--|--|
| Severity | Low |
| Category | Performance & resource management |
| Status | Resolved |
| Location | `src/ScadaLink.InboundAPI/InboundScriptExecutor.cs:30`, `:77`, `:223`, `:233` |
**Resolution (2026-05-28):** capped the cache at `KnownBadMethodsCap = 1000`
entries via a new `TryRecordBadMethod` helper that short-circuits when the cap
is reached — both `CompileAndRegister` and the `ExecuteAsync` lazy-compile path
now route through it. Once full, new bad-method records are dropped; the cache
is just a fast-fail optimisation and the per-request DB lookup remains the
correctness path. Regression tests
`KnownBadMethodsCache_SizeNeverExceedsCap_UnderUniqueNameFlood` and
`KnownBadMethodsCache_LazyCompilePath_AlsoCappedUnderUniqueNameFlood` flood
`cap + 500` / `cap + 250` unique broken methods and assert the cache size never
exceeds the cap. Internal `KnownBadMethodCount` exposed for testability only.
**Description**
The InboundAPI-009 fix introduced `_knownBadMethods`, a `ConcurrentDictionary<string, byte>`
of method names whose Roslyn compilation failed, to short-circuit lazy
recompilation. It is keyed by `method.Name` and entries are only ever removed
when `CompileAndRegister` succeeds for the same name (line 83). Practically the
key space is bounded by the configured method definitions in the database, so
this is bounded in normal operation. But because the cache is mutated from the
lazy-compile path at `ExecuteAsync.cs:233`, and `ExecuteAsync` is called from
`HandleInboundApiRequest` only **after** `ApiKeyValidator.ValidateAsync` has
returned `Valid` (i.e. a real method exists), the entry is keyed by a name that
must have already been resolved through `GetMethodByNameAsync` — so this attack
surface is gated by the configuration database. The finding is therefore mostly
defensive: there is no rate limit on inbound API calls (deliberate design), so
if a future change ever causes `ExecuteAsync` to be called for an unvalidated
caller-supplied method name (e.g. a refactor that moves method-existence
checking later), this cache would become attacker-controllable.
**Recommendation**
Optional / defensive: cap `_knownBadMethods` (e.g. an LRU with a fixed size, or
clear it periodically). At minimum, document the invariant in the executor's
XML comment that `_knownBadMethods` keys must come from validated
`ApiMethod.Name` values, so the safety property survives future refactors. No
immediate change required; this is a watch-list item.
### InboundAPI-025 — `AuditWriteMiddleware` runs against the entire `/api/*` branch — emits spurious `ApiInbound` audit rows for `/api/audit/query` and `/api/audit/export`
| | |
|--|--|
| Severity | Medium |
| Category | Correctness & logic bugs |
| Status | Resolved |
| Location | `src/ScadaLink.Host/Program.cs:183-185`; consumers: `src/ScadaLink.ManagementService/AuditEndpoints.cs:93-94`; emitter: `src/ScadaLink.InboundAPI/Middleware/AuditWriteMiddleware.cs:175-252` |
**Resolution (2026-05-28):** Took the defensive path-exclusion in
`Program.cs` (Option 1 from the recommendation). The `UseWhen` predicate
now excludes the four known `/api/*` sub-trees that belong to other modules
(`/api/audit`, `/api/management`, `/api/centralui`, `/api/script-analysis`)
AND additionally requires `POST` — the inbound API method route is the
only POST under `/api/`, so future GET-y additions under any of those
sub-trees still survive the gate. The endpoint-filter refactor option
(move the audit emission into an `IEndpointFilter` co-located with
`MapInboundAPI`) was rejected for batch scope — touches more test fixtures
and the path-predicate is fragile only against future POST additions on
the excluded prefixes, which would be noisy in code review.
**Description**
`Program.cs` wires the audit middleware as
`app.UseWhen(ctx => ctx.Request.Path.StartsWithSegments("/api"), branch => branch.UseAuditWriteMiddleware())`
— scoped to the `/api` *prefix*, not to the `POST /api/{methodName}` route.
Meanwhile, `ScadaLink.ManagementService/AuditEndpoints.cs` maps
`MapGet("/api/audit/query", ...)` (line 93) and `MapGet("/api/audit/export", ...)`
(line 94). Both routes therefore inherit `AuditWriteMiddleware`, which emits an
`AuditEvent { Channel = AuditChannel.ApiInbound, Kind = AuditKind.InboundRequest, ... }`
row for every call. The middleware's `ResolveMethodName` falls back to the last
path segment (lines 446452), so a GET `/api/audit/query?...` is recorded as if a
caller had invoked an inbound API method named "query"; an export is recorded
as method "export". Effects:
1. **Audit log is polluted with non-script rows.** The audit log is now
recording its own query traffic as if it were inbound script invocations,
contradicting Component-AuditLog.md's scope ("script trust boundary actions").
2. **Audit reads recursively emit audit writes.** Every audit-log query (e.g.
from the Central UI Audit Log page or the CLI `audit query` command) writes
an additional row into `AuditLog`, growing the table on read.
3. **`Target` is meaningless.** `/api/audit/query` has no method definition, so
the recorded `Target = "query"` is not joinable to any `ApiMethod` row in
audit-log drill-ins.
4. **Wasted resources on health probes / management calls.** Any future routes
added under `/api/` will inherit the middleware and pay the
`EnableBuffering`, `CapturedResponseStream`, and `JsonSerializer.Serialize`
costs even though they are not inbound script invocations.
Tests for the audit middleware (`AuditWriteMiddlewareTests`) and pipeline order
(`MiddlewareOrderTests`) wire the middleware only against the
`POST /api/{methodName}` route in test hosts, so this production-only
mis-scoping is not exercised.
**Recommendation**
Tighten the predicate so the middleware runs only on the inbound API method
route, not on the `/api/` prefix. Options:
- `app.UseWhen(ctx => ctx.Request.Path.StartsWithSegments("/api") && !ctx.Request.Path.StartsWithSegments("/api/audit") && !ctx.Request.Path.StartsWithSegments("/api/management"), ...)`
— defensive, but fragile to future route additions.
- Move the audit emission from a pipeline middleware to an `IEndpointFilter`
applied via `.AddEndpointFilter<>()` on the `MapInboundAPI` registration
(alongside `InboundApiEndpointFilter`). This makes the scope explicit on the
one route that needs it and survives future `/api/...` route additions
unchanged.
The endpoint-filter form is the recommended fix — it co-locates the audit-emission
scope with the route definition and matches how InboundAPI-006/008 gating is
already wired.