f93b7b99bb
Re-applies the full 10-category checklist to every src/ project — including
first-time reviews of the four newer components (AuditLog, NotificationOutbox,
SiteCallAudit, Transport) — so the code-reviews/ index reflects today's
codebase rather than the 2026-05-16 baseline. 172 new Open findings (0
Critical, 18 High, 62 Medium, 92 Low); 481 findings total across 23 modules.
regen-readme.py now derives each module's Last reviewed + Commit from its
findings.md header instead of hard-coding 2026-05-16 / 9c60592, so future
single-module re-reviews show their own date in the Module Status table.
1233 lines
68 KiB
Markdown
1233 lines
68 KiB
Markdown
# Code Review — InboundAPI
|
||
|
||
| Field | Value |
|
||
|-------|-------|
|
||
| Module | `src/ScadaLink.InboundAPI` |
|
||
| Design doc | `docs/requirements/Component-InboundAPI.md` |
|
||
| Status | Reviewed |
|
||
| Last reviewed | 2026-05-28 |
|
||
| Reviewer | claude-agent |
|
||
| Commit reviewed | `1eb6e97` |
|
||
| Open findings | 8 |
|
||
|
||
## Summary
|
||
|
||
The InboundAPI module is small (8 source files) and the happy-path flow — extract
|
||
key, validate, deserialize parameters, execute script, serialize result — is clean
|
||
and readable. However the review surfaced several real problems concentrated in two
|
||
themes: **concurrency** and **security**. The `InboundScriptExecutor` is a singleton
|
||
that mutates a plain `Dictionary` from concurrent ASP.NET request threads with no
|
||
synchronization, which can corrupt the handler cache or crash the process under load.
|
||
On the security side, API-key comparison is a non-constant-time database string
|
||
match (timing oracle), compiled scripts run with no enforcement of the documented
|
||
script trust model (forbidden APIs such as `System.IO`/`Process`/`Reflection` are
|
||
fully reachable), there is no request-body size limit, and the executor's catch-all
|
||
swallows `OperationCanceledException` from genuine client disconnects as a "timeout".
|
||
Design-doc adherence is also incomplete: the `Database.Connection()` script API
|
||
described in the design doc is entirely absent from `InboundScriptContext`, and the
|
||
endpoint never enforces that the API is central-only. Testing covers the validators
|
||
well but there is no coverage of the HTTP endpoint, concurrency, or recompilation.
|
||
None of the findings are data-loss-class, but the concurrency and trust-model issues
|
||
are High severity and should be addressed before production use.
|
||
|
||
#### Re-review 2026-05-17 (commit `39d737e`)
|
||
|
||
All 13 findings from the initial review remain `Resolved`; the module source under
|
||
`src/ScadaLink.InboundAPI` is unchanged since the last InboundAPI fix commit
|
||
(`8dd7412`), which precedes `39d737e`. This re-review re-walked all 10 checklist
|
||
categories against the resolved code and surfaced **4 new findings** — none touching
|
||
the previously-fixed concurrency/trust-model code, but all in areas the first pass
|
||
did not probe deeply: (1) the `ReturnDefinition` column is loaded onto `ApiMethod`
|
||
but is never consulted — script return values are serialized verbatim with no shaping
|
||
or validation against the declared return structure (InboundAPI-014); (2) the new
|
||
`ForbiddenApiChecker` is a purely textual syntax walker and can be bypassed by
|
||
reaching forbidden functionality through member access that never spells a forbidden
|
||
namespace, e.g. `typeof(x).Assembly.GetType("System.IO.File")` (InboundAPI-015);
|
||
(3) routed `Route.To().Call()` invocations are not bound by the method timeout unless
|
||
the script explicitly threads `Parameters`-side cancellation, contradicting the design
|
||
statement that the timeout covers routed calls (InboundAPI-016); and (4) `RouteHelper`
|
||
/ `RouteTarget` — the entire WP-4 cross-site routing surface — has no test coverage
|
||
(InboundAPI-017). New findings are one Medium-trio plus one Low; no Critical or High.
|
||
|
||
## Checklist coverage
|
||
|
||
| # | Category | Examined | Notes |
|
||
|---|----------|----------|-------|
|
||
| 1 | Correctness & logic bugs | ☑ | `CoerceValue` returns `null` for legitimately-null/`String` values indistinguishably; parameter-definition edge cases noted. |
|
||
| 2 | Akka.NET conventions | ☑ | Module is ASP.NET-hosted, no actors of its own; routes to actors via `CommunicationService`. No correlation-ID issues — IDs are set in `RouteHelper`. |
|
||
| 3 | Concurrency & thread safety | ☑ | Singleton `InboundScriptExecutor` mutates a non-thread-safe `Dictionary` from concurrent request threads — see InboundAPI-001/002. |
|
||
| 4 | Error handling & resilience | ☑ | Catch-all conflates client cancellation with timeout (InboundAPI-004); compilation-failure path repeats work on every request (InboundAPI-009). |
|
||
| 5 | Security | ☑ | Prior items resolved. Re-review: `ForbiddenApiChecker` is a textual deny-list bypassable via reflection without a forbidden namespace token (InboundAPI-015). |
|
||
| 6 | Performance & resource management | ☑ | Up to 3 separate DB round-trips per request in `ApiKeyValidator`; uncapped lazy recompilation. |
|
||
| 7 | Design-document adherence | ☑ | Re-review: `ReturnDefinition` loaded but never used (InboundAPI-014); routed-call timeout not enforced (InboundAPI-016). Prior `Database.Connection()`/central-only items resolved. |
|
||
| 8 | Code organization & conventions | ☑ | `ParameterDefinition` moved to Commons (InboundAPI-012 resolved); no new issues. |
|
||
| 9 | Testing coverage | ☑ | Re-review: `RouteHelper`/`RouteTarget` (WP-4 routing) entirely untested (InboundAPI-017); validators/executor/filter well covered. |
|
||
| 10 | Documentation & comments | ☑ | `ApiKeyValidationResult.NotFound` XML/name says "NotFound" but returns HTTP 400 — misleading (InboundAPI-013). |
|
||
|
||
#### Re-review 2026-05-28 (commit `1eb6e97`)
|
||
|
||
All 17 prior findings remain `Resolved`. The module has grown materially since the
|
||
last pass — a new `AuditWriteMiddleware` (Audit Log #23 M4 Bundle D) now lives under
|
||
`src/ScadaLink.InboundAPI/Middleware/`, the `ApiKeyValidator` was rewired to hash the
|
||
candidate with `IApiKeyHasher` (ConfigurationDatabase-012), and an `IInstanceRouter`
|
||
seam was introduced. This re-review re-walked all 10 checklist categories against
|
||
`1eb6e97` and surfaced **8 new findings** concentrated on the new audit middleware
|
||
and a stranded follow-up from InboundAPI-008:
|
||
|
||
1. The InboundAPI-008 resolution explicitly deferred registering an `IActiveNodeGate`
|
||
implementation in `ScadaLink.Host` as a "follow-up outside this module's scope" —
|
||
that follow-up is still unfulfilled (no production registration anywhere in
|
||
`src/ScadaLink.Host/`), so the design-mandated standby-node gating is silently
|
||
disabled in production today (`InboundAPI-022`, High).
|
||
2. `AuditWriteMiddleware` is wired in `Program.cs` against `/api/*` rather than the
|
||
specific `POST /api/{methodName}` route, so GETs against `/api/audit/query` and
|
||
`/api/audit/export` (audit query endpoints — themselves not script invocations)
|
||
now emit spurious `AuditChannel.ApiInbound`/`InboundRequest` rows back into the
|
||
audit log with `Target` set to the last path segment (`InboundAPI-025`, Medium).
|
||
3. The middleware fires its audit write as `_ = _auditWriter.WriteAsync(evt)` — the
|
||
wrapping try/catch only catches synchronous throws, so a faulted async writer
|
||
task is unobserved and the row silently disappears with no log line
|
||
(`InboundAPI-018`, Low/Medium).
|
||
4. `ParentExecutionId` correlation flows only through `RouteToCallRequest` —
|
||
`RouteToGetAttributesRequest`/`RouteToSetAttributesRequest` have no
|
||
`ParentExecutionId` field, so attribute reads/writes from inbound scripts lose
|
||
the inbound→site execution-tree link the Audit Log decision in CLAUDE.md
|
||
describes (`InboundAPI-021`, Medium).
|
||
5. `EndpointExtensions.HandleInboundApiRequest` — the entire wiring composition
|
||
that ties validator/executor/route/audit together — has no test coverage; only
|
||
the components it composes are tested (`InboundAPI-023`, Low).
|
||
6. `EndpointExtensions.HandleInboundApiRequest` does
|
||
`ContentType?.Contains("json")` (case-sensitive) so a request with
|
||
`application/JSON` and no Content-Length silently skips JSON body parsing
|
||
(`InboundAPI-020`, Low).
|
||
7. `AuditWriteMiddleware.InvokeAsync` calls `EnableBuffering()` unconditionally
|
||
before the empty-body short-circuit, allocating a `FileBufferingReadStream` for
|
||
every request including bodyless ones (`InboundAPI-019`, Low).
|
||
|
||
Severity mix: 1 High, 3 Medium, 4 Low — no Critical. (The eighth finding —
|
||
`InboundAPI-024`, Low — is a defensive watch-list item flagging that
|
||
`_knownBadMethods` is unbounded; it is bounded *in practice* today by the
|
||
configuration database, but the invariant is undocumented.)
|
||
|
||
## Checklist coverage — 2026-05-28 (commit `1eb6e97`)
|
||
|
||
| # | Category | Examined | Notes |
|
||
|---|----------|----------|-------|
|
||
| 1 | Correctness & logic bugs | ☑ | `ContentType?.Contains("json")` is case-sensitive (InboundAPI-020). |
|
||
| 2 | Akka.NET conventions | ☑ | ASP.NET-hosted, no actors of its own; routes via `IInstanceRouter`/`CommunicationService`. No new issues. |
|
||
| 3 | Concurrency & thread safety | ☑ | `ConcurrentDictionary` handler cache (post-001/002 fix). New audit middleware is per-request scoped, no shared mutable state. No new issues. |
|
||
| 4 | Error handling & resilience | ☑ | Audit `WriteAsync` is fire-and-forget; async faults are unobserved (InboundAPI-018). |
|
||
| 5 | Security | ☑ | `IActiveNodeGate` not registered in Host — standby-node gating disabled in production (InboundAPI-022). |
|
||
| 6 | Performance & resource management | ☑ | `EnableBuffering()` unconditional on bodyless requests (InboundAPI-019); audit middleware wraps `Response.Body` and mints `ExecutionId` for non-script /api routes (InboundAPI-025). |
|
||
| 7 | Design-document adherence | ☑ | `ParentExecutionId` not stamped on attribute-read/write routed messages (InboundAPI-021). InboundAPI-008's deferred Host registration still unfulfilled (InboundAPI-022). |
|
||
| 8 | Code organization & conventions | ☑ | No new issues. |
|
||
| 9 | Testing coverage | ☑ | `EndpointExtensions.HandleInboundApiRequest` composition wiring has no test (InboundAPI-023); middleware/filter/validator/executor/route are individually covered. |
|
||
| 10 | Documentation & comments | ☑ | No new issues. |
|
||
|
||
## Findings
|
||
|
||
### InboundAPI-001 — Singleton script handler cache mutated without synchronization
|
||
|
||
| | |
|
||
|--|--|
|
||
| Severity | High |
|
||
| Category | Concurrency & thread safety |
|
||
| Status | Resolved |
|
||
| Location | `src/ScadaLink.InboundAPI/InboundScriptExecutor.cs:17`, `:32`, `:40`, `:89`, `:123-128` |
|
||
|
||
**Description**
|
||
|
||
`InboundScriptExecutor` is registered as a singleton (`ServiceCollectionExtensions.cs:11`)
|
||
and its handler cache is a plain `Dictionary<string, Func<...>>` (`InboundScriptExecutor.cs:17`).
|
||
`RegisterHandler`, `RemoveHandler`, `CompileAndRegister`, and the lazy-compile path in
|
||
`ExecuteAsync` all read and write this dictionary with no lock. ASP.NET serves inbound
|
||
API requests on concurrent thread-pool threads, so two requests for an as-yet-uncompiled
|
||
method (or a request racing a CLI-triggered `CompileAndRegister`) can mutate the
|
||
dictionary concurrently. `Dictionary` is explicitly not safe for concurrent
|
||
read/write — this can corrupt internal buckets, throw `InvalidOperationException`,
|
||
or return a torn/`null` handler, crashing the request or the process.
|
||
|
||
**Recommendation**
|
||
|
||
Replace the `Dictionary` with a `ConcurrentDictionary<string, Func<...>>`, or guard all
|
||
access with a lock. For the lazy-compile path use `GetOrAdd` so concurrent first-callers
|
||
compile at most once.
|
||
|
||
**Resolution**
|
||
|
||
Resolved 2026-05-16 (commit `<pending>`): replaced the plain `Dictionary` handler
|
||
cache with a `ConcurrentDictionary`; `RemoveHandler` now uses `TryRemove`; the
|
||
lazy-compile path in `ExecuteAsync` compiles outside the cache and inserts atomically
|
||
via `GetOrAdd` so concurrent first-callers share one handler. Regression tests
|
||
`ConcurrentLazyCompile_SameMethod_DoesNotCorruptCache` and
|
||
`ConcurrentRegisterAndExecute_DoesNotThrow` added.
|
||
|
||
### InboundAPI-002 — Lazy compilation is a check-then-act race with no atomicity
|
||
|
||
| | |
|
||
|--|--|
|
||
| Severity | Medium — re-triaged: already fixed by the InboundAPI-001 fix; verified and closed |
|
||
| Category | Concurrency & thread safety |
|
||
| Status | Resolved |
|
||
| Location | `src/ScadaLink.InboundAPI/InboundScriptExecutor.cs:152-161` |
|
||
|
||
**Description**
|
||
|
||
`ExecuteAsync` does `if (!_scriptHandlers.TryGetValue(...)) { CompileAndRegister(method); handler = _scriptHandlers[method.Name]; }`.
|
||
Even setting aside the unsynchronized dictionary (InboundAPI-001), this is a
|
||
check-then-act sequence: between `TryGetValue` failing and the re-read on line 128,
|
||
another thread could `RemoveHandler` the entry, causing the indexer on line 128 to
|
||
throw `KeyNotFoundException` — an unhandled-in-context exception that is then caught
|
||
only by the broad catch on line 143 and reported to the caller as "Internal script
|
||
error". Multiple concurrent first-callers will also each compile the same script
|
||
redundantly (wasted Roslyn work).
|
||
|
||
**Recommendation**
|
||
|
||
Make compile-and-fetch a single atomic operation (`ConcurrentDictionary.GetOrAdd`
|
||
with a lazily-evaluated factory, or a per-method lock), and have `CompileAndRegister`
|
||
return the handler it produced rather than requiring a separate dictionary read.
|
||
|
||
**Resolution**
|
||
|
||
Resolved 2026-05-16 (commit `pending`): re-triage — verified against the current
|
||
source, this finding was **already fixed** by the InboundAPI-001 fix. The
|
||
`InboundScriptExecutor.cs:152-161` lazy-compile path no longer does check-then-act
|
||
re-read: `Compile(method)` runs unconditionally (it never reads the cache) and the
|
||
result is published via the atomic `_scriptHandlers.GetOrAdd(method.Name, compiled)`.
|
||
There is no separate dictionary indexer read, so the `KeyNotFoundException` race the
|
||
finding describes cannot occur, and concurrent first-callers all share the single
|
||
handler that `GetOrAdd` keeps. Regression test
|
||
`LazyCompile_RacingRemoveHandler_NeverThrowsKeyNotFound` added (asserts a concurrent
|
||
`RemoveHandler` storm against lazy-compiling callers never yields the catch-all
|
||
"Internal script error"); it passes against the current code, confirming the fix.
|
||
|
||
### InboundAPI-003 — API key compared with non-constant-time string equality
|
||
|
||
| | |
|
||
|--|--|
|
||
| Severity | High |
|
||
| Category | Security |
|
||
| Status | Resolved |
|
||
| Location | `src/ScadaLink.ConfigurationDatabase/Repositories/InboundApiRepository.cs:22-23`, consumed by `src/ScadaLink.InboundAPI/ApiKeyValidator.cs:33` |
|
||
|
||
**Description**
|
||
|
||
API-key authentication resolves the key with
|
||
`FirstOrDefaultAsync(k => k.KeyValue == keyValue)` — an ordinary equality match
|
||
translated to a SQL `WHERE KeyValue = @p` comparison. The secret is matched with
|
||
ordinary (early-exit) string/SQL comparison rather than a constant-time comparison,
|
||
which is a classic timing side-channel for secret material. Combined with the design's
|
||
explicit "no rate limiting" decision, an attacker with network access to the central
|
||
API can mount a timing attack to recover valid keys. The API key is the *sole*
|
||
credential for the inbound API, so this is the primary authentication path.
|
||
|
||
**Recommendation**
|
||
|
||
Look the key up by a non-secret indexed identifier (e.g. a key prefix/id) or fetch
|
||
candidate rows, then verify the secret in-process using
|
||
`CryptographicOperations.FixedTimeEquals` over the UTF-8 bytes. Preferably store only
|
||
a salted hash of the key value and compare hashes. Avoid leaking secret-length and
|
||
match-position timing.
|
||
|
||
**Resolution**
|
||
|
||
Resolved 2026-05-16 (commit `<pending>`): `ApiKeyValidator` no longer calls the
|
||
secret-equality lookup `GetApiKeyByValueAsync` (the SQL `WHERE KeyValue = @secret`
|
||
timing oracle). It now fetches all keys via `GetAllApiKeysAsync` and matches the
|
||
secret in-process with `CryptographicOperations.FixedTimeEquals` over the UTF-8 bytes,
|
||
scanning every candidate so neither match position nor secret length is observable.
|
||
Regression tests `ValidateAsync_DoesNotUseSecretEqualityLookup`,
|
||
`ValidateAsync_WrongKey_ConstantTimePath_Returns401`, and
|
||
`ValidateAsync_KeyOfDifferentLength_Returns401` added. Note: the timing-oracle method
|
||
`GetApiKeyByValueAsync` remains on `IInboundApiRepository` (it is outside this module);
|
||
removing it from the repository is left as separate follow-up since the validator no
|
||
longer depends on it.
|
||
|
||
### InboundAPI-004 — Client disconnect is misreported as a script timeout
|
||
|
||
| | |
|
||
|--|--|
|
||
| Severity | Medium |
|
||
| Category | Error handling & resilience |
|
||
| Status | Resolved |
|
||
| Location | `src/ScadaLink.InboundAPI/InboundScriptExecutor.cs:117-141` |
|
||
|
||
**Description**
|
||
|
||
`ExecuteAsync` creates a linked CTS from `httpContext.RequestAborted` and the method
|
||
timeout, then catches `OperationCanceledException` and unconditionally returns
|
||
"Script execution timed out". When the *client* aborts the request (`RequestAborted`
|
||
fires), the same exception type is thrown, so a normal client disconnect is logged as
|
||
a timeout (`_logger.LogWarning("Script execution timed out ...")`) and an attempt is
|
||
made to write a 500 timeout body to an already-gone connection. This pollutes the
|
||
failure log (which the design says is reserved for genuine script errors) and obscures
|
||
real timeout incidents.
|
||
|
||
**Recommendation**
|
||
|
||
Distinguish the two cancellation sources: if `cancellationToken` (the request token)
|
||
is cancelled, treat it as a client abort — do not log a timeout and do not attempt to
|
||
write a response. Only when the timeout CTS fired should the result be "timed out".
|
||
Check `cts.Token.IsCancellationRequested && !cancellationToken.IsCancellationRequested`
|
||
or use a dedicated timeout `CancellationTokenSource` so the two are separable.
|
||
|
||
**Resolution**
|
||
|
||
Resolved 2026-05-16 (commit `pending`): `ExecuteAsync` now uses a dedicated timeout
|
||
`CancellationTokenSource` (`new CancellationTokenSource(timeout)`) linked with the
|
||
request-abort token, so the two cancellation sources are separable. The
|
||
`OperationCanceledException` handler reports "Script execution timed out" (and logs a
|
||
warning) **only** when the timeout CTS fired and the request token did not; a client
|
||
abort instead returns "Request cancelled by client" and logs at Debug — the failure
|
||
log stays reserved for genuine script-execution timeouts. `HandleInboundApiRequest`
|
||
additionally short-circuits with `Results.Empty` (no warning log, no 500 body write)
|
||
when `RequestAborted` is cancelled, since the connection is already gone. Regression
|
||
tests `ClientDisconnect_IsNotReportedAsTimeout` and `GenuineTimeout_StillReportedAsTimeout`
|
||
added.
|
||
|
||
### InboundAPI-005 — Compiled API scripts run with no script-trust-model enforcement
|
||
|
||
| | |
|
||
|--|--|
|
||
| Severity | High |
|
||
| Category | Security |
|
||
| Status | Resolved |
|
||
| Location | `src/ScadaLink.InboundAPI/InboundScriptExecutor.cs:56-93` |
|
||
|
||
**Description**
|
||
|
||
CLAUDE.md's Akka.NET conventions state the script trust model forbids `System.IO`,
|
||
`Process`, `Threading`, `Reflection`, and raw network access. `CompileAndRegister`
|
||
compiles arbitrary C# with `CSharpScript.Create` and only restricts the *default
|
||
imports* (`WithImports("System", ...)`). Imports are a convenience, not a sandbox — a
|
||
script can still fully-qualify any type (`System.IO.File.Delete(...)`,
|
||
`System.Diagnostics.Process.Start(...)`, `System.Reflection`, raw `Socket`) because
|
||
the core framework assemblies are referenced and Roslyn scripting performs no API
|
||
allow/deny-listing. Inbound API scripts execute on the central node with the host
|
||
process's privileges, so a malicious or buggy method definition has full host access.
|
||
Note the Design role authors these scripts (less trusted than Admin), making
|
||
enforcement material.
|
||
|
||
**Recommendation**
|
||
|
||
Add a compile-time analyzer/`SyntaxWalker` (as the Site Runtime does for instance
|
||
scripts) that rejects forbidden namespaces/types before registering a handler, and/or
|
||
run scripts under a constrained boundary. At minimum, share the Site Runtime's
|
||
forbidden-API checker so the trust model is enforced consistently. Reject the method
|
||
(and log) when a violation is found instead of registering it.
|
||
|
||
**Resolution**
|
||
|
||
Resolved 2026-05-16 (commit `<pending>`): added `ForbiddenApiChecker`, a Roslyn
|
||
`CSharpSyntaxWalker` that statically rejects scripts referencing forbidden namespaces
|
||
(`System.IO`, `System.Diagnostics`, `System.Threading` except `Tasks`,
|
||
`System.Reflection`, `System.Net`, `System.Runtime.InteropServices`, `Microsoft.Win32`)
|
||
whether reached via a `using` directive or a fully-qualified name. `CompileAndRegister`
|
||
now runs the check before Roslyn compilation and refuses to register (and logs) a
|
||
violating method; `ExecuteAsync`'s lazy-compile path is gated by the same check.
|
||
Regression tests `CompileAndRegister_ForbiddenApi_RejectsScript` (theory),
|
||
`ExecuteAsync_ForbiddenApiScript_DoesNotRunAndReturnsFailure`, and
|
||
`CompileAndRegister_PermittedScript_StillRegisters` added.
|
||
|
||
### InboundAPI-006 — No request body size limit on the inbound endpoint
|
||
|
||
| | |
|
||
|--|--|
|
||
| Severity | Medium |
|
||
| Category | Security |
|
||
| Status | Resolved |
|
||
| Location | `src/ScadaLink.InboundAPI/EndpointExtensions.cs:54-62` |
|
||
|
||
**Description**
|
||
|
||
`HandleInboundApiRequest` calls `JsonDocument.ParseAsync(httpContext.Request.Body, ...)`
|
||
with no explicit body-size cap and no `[RequestSizeLimit]`/endpoint metadata. Although
|
||
Kestrel has a default max request body size, this endpoint accepts arbitrary JSON from
|
||
external systems, fully buffers it into a `JsonDocument`, and then `Clone()`s the
|
||
root element (`:61`) which materializes the entire document on the heap. With no rate
|
||
limiting (a deliberate design choice) a single caller can drive large allocations.
|
||
Deep/wide JSON also makes the `CoerceValue` `object`/`list` deserialization
|
||
(`ParameterValidator.cs:113,117`) expensive.
|
||
|
||
**Recommendation**
|
||
|
||
Set an explicit, modest body-size limit on the endpoint
|
||
(`.WithMetadata(new RequestSizeLimitAttribute(...))` or
|
||
`IHttpMaxRequestBodySizeFeature`) and consider a `JsonDocumentOptions` `MaxDepth`.
|
||
Reject oversized bodies with 413 before buffering.
|
||
|
||
**Resolution**
|
||
|
||
Resolved 2026-05-16 (commit `pending`): added `InboundApiEndpointFilter`, an
|
||
`IEndpointFilter` applied to `POST /api/{methodName}` via `.AddEndpointFilter<>()`.
|
||
It rejects requests whose declared `Content-Length` exceeds `InboundApiOptions.
|
||
MaxRequestBodyBytes` (default 1 MiB) with HTTP 413 *before* the handler buffers the
|
||
body into a `JsonDocument`, and also lowers the per-request `IHttpMaxRequestBodySizeFeature`
|
||
cap so a chunked/unknown-length stream is cut off by Kestrel while being read. The
|
||
limit is configurable via the bound `ScadaLink:InboundApi` options section. Regression
|
||
tests `OversizedBody_ShortCircuitsWith413_AndDoesNotRunHandler`, `BodyAtLimit_RunsHandler`,
|
||
and `FilterCapsMaxRequestBodySizeFeature` added.
|
||
|
||
### InboundAPI-007 — `Database.Connection()` script API from the design doc is not implemented
|
||
|
||
| | |
|
||
|--|--|
|
||
| Severity | Medium |
|
||
| Category | Design-document adherence |
|
||
| Status | Resolved |
|
||
| Location | `src/ScadaLink.InboundAPI/InboundScriptExecutor.cs:188-203` |
|
||
|
||
**Description**
|
||
|
||
`Component-InboundAPI.md` ("Script Runtime API -> Database Access") specifies
|
||
`Database.Connection("connectionName")` as an available script capability for
|
||
querying the configuration/machine-data databases. `InboundScriptContext` exposes only
|
||
`Parameters`, `Route`, and `CancellationToken` — there is no `Database` member. Any
|
||
method script that follows the documented API will fail to compile. Either the code
|
||
is incomplete or the design doc is stale; the two must be reconciled.
|
||
|
||
**Recommendation**
|
||
|
||
If database access is in scope, add a `Database` property to `InboundScriptContext`
|
||
backed by a connection-factory service. If it is not, remove the "Database Access"
|
||
section from `Component-InboundAPI.md` so the design doc stops advertising an absent
|
||
API.
|
||
|
||
**Resolution**
|
||
|
||
Resolved 2026-05-16 (commit `<pending>`). The drift was confirmed real:
|
||
`InboundScriptContext` (`InboundScriptExecutor.cs:188-203`) exposes only
|
||
`Parameters`, `Route`, and `CancellationToken` — there is no `Database` member,
|
||
so a method script following the documented `Database.Connection("name")` API
|
||
would fail to compile. Resolution direction: the design doc is stale, not the
|
||
code. Implementing `Database.Connection()` would hand inbound API scripts a
|
||
*raw* MS SQL client, in direct tension with the ScadaLink script trust model
|
||
(scripts are forbidden `System.IO`, raw network, etc.; `ForbiddenApiChecker`
|
||
statically enforces this). Rather than carve a hole in the trust model, the
|
||
"Database Access" section was removed from `docs/requirements/Component-InboundAPI.md`
|
||
and replaced with an explicit "No direct database access" note explaining that
|
||
scripts interact only through the curated `Route`/`Parameters` surfaces, and
|
||
that any future data access must go behind a dedicated scoped helper added as an
|
||
explicit design change. Code and doc now agree; no code or test change required.
|
||
|
||
### InboundAPI-008 — Inbound API endpoint not restricted to the active central node
|
||
|
||
| | |
|
||
|--|--|
|
||
| Severity | Medium |
|
||
| Category | Design-document adherence |
|
||
| Status | Resolved |
|
||
| Location | `src/ScadaLink.InboundAPI/EndpointExtensions.cs:19-23`, `src/ScadaLink.Host/Program.cs:149` |
|
||
|
||
**Description**
|
||
|
||
The design states the Inbound API is "Central cluster only (active node)" and "fails
|
||
over with it". `MapInboundAPI` registers `POST /api/{methodName}` unconditionally, and
|
||
`Program.cs` maps it inside the central-role branch but with no active-node gating —
|
||
unlike `/health/active` which has an `active-node` predicate. A standby central node
|
||
will happily serve inbound API calls, executing scripts and `Route.To()` calls from a
|
||
non-leader, which can race the active node or run against stale singleton state.
|
||
|
||
**Recommendation**
|
||
|
||
Gate the endpoint on active-node status (reuse the cluster `active-node` health check
|
||
or a leader-state check) and return 503 on the standby, so Traefik/clients only reach
|
||
the live node — consistent with how the Management API and `/health/active` are
|
||
treated.
|
||
|
||
**Resolution**
|
||
|
||
Resolved 2026-05-16 (commit `pending`): introduced `IActiveNodeGate`, an abstraction
|
||
the inbound API uses to ask whether this node is the active (cluster-leader) central
|
||
node. The new `InboundApiEndpointFilter` (applied to `POST /api/{methodName}`)
|
||
consults the gate and short-circuits a standby node with HTTP 503 before any
|
||
auth/script work, so Traefik/clients only reach the live node — consistent with
|
||
`/health/active`. The gate is resolved optionally: when no implementation is
|
||
registered (non-clustered host / tests) the endpoint defaults to "allow", preserving
|
||
prior behaviour. Regression tests `StandbyNode_ShortCircuitsWith503_AndDoesNotRunHandler`,
|
||
`ActiveNode_PassesGate_RunsHandler`, and `NoGateRegistered_PassesGate_RunsHandler`
|
||
added. **Follow-up (outside this module's scope):** `ScadaLink.Host` should register
|
||
an `IActiveNodeGate` implementation backed by `ActiveNodeHealthCheck` /
|
||
`Cluster.State.Leader` in the central-role branch of `Program.cs` so the gate is
|
||
actually enforced in production; until then the endpoint defaults to "allow".
|
||
|
||
### InboundAPI-009 — Failed compilation is retried on every subsequent request
|
||
|
||
| | |
|
||
|--|--|
|
||
| Severity | Low |
|
||
| Category | Performance & resource management |
|
||
| Status | Resolved |
|
||
| Location | `src/ScadaLink.InboundAPI/InboundScriptExecutor.cs:123-128` |
|
||
|
||
**Description**
|
||
|
||
When a method's script fails to compile, `CompileAndRegister` returns `false` and
|
||
nothing is stored in `_scriptHandlers`. Every subsequent call to that method re-enters
|
||
the lazy-compile branch and recompiles the broken script via Roslyn from scratch.
|
||
Roslyn compilation is expensive; a single broken method definition repeatedly invoked
|
||
by an external caller (no rate limiting) becomes a CPU amplification vector.
|
||
|
||
**Recommendation**
|
||
|
||
Cache the compilation *failure* (e.g. store a sentinel handler that immediately
|
||
returns the compile error, or keep a `HashSet` of known-bad method names with the
|
||
diagnostic) so a broken script is compiled at most once until the definition is
|
||
updated via `CompileAndRegister`.
|
||
|
||
**Resolution**
|
||
|
||
Resolved 2026-05-16 (commit pending): confirmed the root cause — a failed `Compile`
|
||
stored nothing in `_scriptHandlers`, so every subsequent request re-entered the
|
||
lazy-compile branch and re-ran Roslyn. Added a `_knownBadMethods` `ConcurrentDictionary`
|
||
of method names whose compilation failed; `ExecuteAsync`'s lazy-compile path
|
||
short-circuits before Roslyn when the method is already known-bad, and records the
|
||
failure on a fresh failed compile. `CompileAndRegister` also records failures and
|
||
clears the record on a successful (re)compile, so a fixed method definition is
|
||
re-evaluated. Regression tests `FailedCompilation_IsNotRetriedOnEveryRequest`
|
||
(asserts the compile-failure log fires exactly once across 5 requests) and
|
||
`FailedCompilation_RecompilesAfterCompileAndRegisterCalledAgain` added.
|
||
|
||
### InboundAPI-010 — `ParameterValidator` ignores extra body fields and cannot validate Object/List element types
|
||
|
||
| | |
|
||
|--|--|
|
||
| Severity | Low |
|
||
| Category | Correctness & logic bugs |
|
||
| Status | Resolved |
|
||
| Location | `src/ScadaLink.InboundAPI/ParameterValidator.cs:64-90`, `:112-118` |
|
||
|
||
**Description**
|
||
|
||
Two related correctness gaps: (1) The validator iterates only over *defined*
|
||
parameters; any extra top-level fields in the request body are silently ignored
|
||
rather than reported, so callers get no feedback on typo'd parameter names. (2) For
|
||
`Object` and `List` types the validator only checks the JSON *kind* (`Object`/`Array`)
|
||
and then blindly `JsonSerializer.Deserialize`s the raw text — the design's extended
|
||
type system describes Objects as "named structure with typed fields" and Lists as
|
||
collections "of objects or primitive types", but no field-level or element-level type
|
||
validation is performed. Invalid nested structures pass validation and surface only
|
||
as runtime script errors.
|
||
|
||
**Recommendation**
|
||
|
||
Optionally warn/400 on unexpected body fields. For the extended types, either parse a
|
||
richer `ParameterDefinition` (with nested field definitions / element type) and
|
||
validate recursively, or document explicitly that Object/List are validated only for
|
||
shape — and update the design doc to match.
|
||
|
||
**Resolution**
|
||
|
||
Resolved 2026-05-16 (commit pending): both gaps addressed along the lines the
|
||
recommendation offers. (1) `ParameterValidator.Validate` now enumerates the request
|
||
body's top-level fields and returns an Invalid result naming any field that does not
|
||
match a defined parameter (`"Unexpected parameter(s): ..."`), so a typo'd parameter
|
||
name is reported instead of silently ignored. (2) For `Object`/`List`, recursive
|
||
field/element-level type validation is **deliberately not** added — it requires a
|
||
richer nested `ParameterDefinition` schema, a design decision; instead the
|
||
shape-only behaviour is now explicitly documented in the `CoerceValue` XML comment so
|
||
the code's contract is unambiguous. Re-triage note: the design doc
|
||
(`Component-InboundAPI.md` line 43) lists only Boolean/Integer/Float/String as method
|
||
parameter types — the Object/List extended types are a CLAUDE.md decision; reconciling
|
||
the design doc is out of this module's editable scope and left as a doc-owner
|
||
follow-up. Regression tests `UnexpectedBodyField_ReturnsInvalid` and
|
||
`OnlyDefinedFields_StillValid` added.
|
||
|
||
### InboundAPI-011 — Method-existence check leaks to unapproved callers (enumeration oracle)
|
||
|
||
| | |
|
||
|--|--|
|
||
| Severity | Low |
|
||
| Category | Security |
|
||
| Status | Resolved |
|
||
| Location | `src/ScadaLink.InboundAPI/ApiKeyValidator.cs:39-52` |
|
||
|
||
**Description**
|
||
|
||
`ValidateAsync` returns 400 `Method '{methodName}' not found` when the method does not
|
||
exist, but 403 `API key not approved for this method` when it exists but the key is
|
||
not approved. A caller holding any valid enabled key can therefore enumerate which
|
||
method names exist on the central API by observing 400-vs-403 responses. The error
|
||
message also echoes the caller-supplied `methodName` back verbatim into the JSON
|
||
response (`EndpointExtensions.cs:47`), a minor reflected-input concern.
|
||
|
||
**Recommendation**
|
||
|
||
Return an indistinguishable response (e.g. 403/404) for both "method not found" and
|
||
"key not approved" so existence is not observable to unapproved callers. Avoid echoing
|
||
raw caller input in error bodies, or sanitize it.
|
||
|
||
**Resolution**
|
||
|
||
Resolved 2026-05-16 (commit pending): confirmed — `ValidateAsync` returned 400
|
||
`Method '{methodName}' not found` for a missing method but 403
|
||
`API key not approved for this method` for an existing-but-unapproved one, an
|
||
enumeration oracle, and echoed the caller-supplied method name verbatim. Both cases
|
||
now return an identical response: HTTP 403 with the single shared message
|
||
`API key not approved for this method` (a `NotApprovedMessage` constant); the
|
||
method name is no longer interpolated into any error body, removing both the
|
||
existence oracle and the reflected-input concern. Regression tests
|
||
`ValidKey_MethodNotFound_IsIndistinguishableFromNotApproved` and
|
||
`ValidKey_MethodNotFound_ErrorMessageDoesNotEchoMethodName` added; the pre-existing
|
||
`ValidKey_MethodNotFound_Returns400` test was updated to assert the new
|
||
indistinguishable contract.
|
||
|
||
### InboundAPI-012 — `ParameterDefinition` POCO declared in the component project, not Commons
|
||
|
||
| | |
|
||
|--|--|
|
||
| Severity | Low |
|
||
| Category | Code organization & conventions |
|
||
| Status | Resolved |
|
||
| Location | `src/ScadaLink.InboundAPI/ParameterValidator.cs:128-133` |
|
||
|
||
**Description**
|
||
|
||
`ParameterDefinition` is a persistence-/contract-shaped POCO: it is the deserialized
|
||
form of `ApiMethod.ParameterDefinitions` (a column in the configuration database) and
|
||
describes the public API contract. CLAUDE.md's code-organization rules place
|
||
persistence-ignorant entity/contract types in `ScadaLink.Commons`. Defining it inside
|
||
the InboundAPI project means any other component that needs to read or produce method
|
||
parameter definitions (e.g. Central UI's method editor, CLI, Management Service)
|
||
cannot share the type and will duplicate it.
|
||
|
||
**Recommendation**
|
||
|
||
Move `ParameterDefinition` (and a matching return-definition type, if added) to
|
||
`ScadaLink.Commons` under the InboundApi entity/types namespace so it is shared by all
|
||
components that work with method definitions.
|
||
|
||
**Resolution**
|
||
|
||
Resolved 2026-05-16 (commit `<pending>`): root cause confirmed against the source —
|
||
`ParameterDefinition` was a persistence-ignorant, API-contract-shaped POCO (the
|
||
deserialized form of the `ApiMethod.ParameterDefinitions` configuration-database
|
||
column) declared inside the component project, contrary to CLAUDE.md's
|
||
code-organization rule that such shared contract types live in `ScadaLink.Commons`.
|
||
The type was moved to `src/ScadaLink.Commons/Types/InboundApi/ParameterDefinition.cs`
|
||
(namespace `ScadaLink.Commons.Types.InboundApi`) — placed under `Types/` with an
|
||
`InboundApi` domain subfolder, matching the existing `Types/Scripts/` precedent, since
|
||
the column itself is the persisted form and this type is its deserialized contract
|
||
shape (not an EF-mapped entity). It remains a pure POCO with no EF attributes and no
|
||
behaviour. `ParameterValidator` now imports the moved type via a `using
|
||
ScadaLink.Commons.Types.InboundApi;` directive; a tree-wide search confirmed
|
||
`ParameterValidator.cs` was the type's only declaration and only direct consumer (all
|
||
other `ParameterDefinition*` matches are the unrelated `ParameterDefinitions` string
|
||
property). No return-definition type exists in the codebase — only a `ReturnDefinition`
|
||
string column — so none was invented. No behavioural change, so no new runtime
|
||
regression test: this is a compile-level type move, and the existing 52
|
||
`ScadaLink.InboundAPI.Tests` (including the `ParameterValidator` suite) act as the
|
||
regression guard. `dotnet test` for `ScadaLink.InboundAPI.Tests` (52 passed) and
|
||
`ScadaLink.Commons.Tests` (226 passed) are green; `dotnet build ScadaLink.slnx`
|
||
succeeds with 0 warnings / 0 errors.
|
||
|
||
### InboundAPI-013 — `ApiKeyValidationResult.NotFound` factory returns HTTP 400, contradicting its name
|
||
|
||
| | |
|
||
|--|--|
|
||
| Severity | Low |
|
||
| Category | Documentation & comments |
|
||
| Status | Resolved |
|
||
| Location | `src/ScadaLink.InboundAPI/ApiKeyValidator.cs:78-79` |
|
||
|
||
**Description**
|
||
|
||
The static factory is named `NotFound` and is used for the "method not found" case,
|
||
but it builds a result with `StatusCode = 400` (Bad Request), not 404. The name
|
||
strongly implies 404 and will mislead future maintainers; `EndpointExtensions`
|
||
faithfully propagates whatever status code the factory sets, so the misnaming directly
|
||
affects the wire contract.
|
||
|
||
**Recommendation**
|
||
|
||
Rename the factory to match its behaviour (e.g. `BadRequest`) or change the status
|
||
code to 404 if that is the intended contract — and document the chosen "method not
|
||
found" status in `Component-InboundAPI.md`'s Error Handling section, which currently
|
||
does not list it.
|
||
|
||
**Resolution**
|
||
|
||
Resolved 2026-05-16 (commit pending): the misnamed `NotFound` factory (which built a
|
||
`StatusCode = 400` result) was the only producer of the "method not found" result,
|
||
and the InboundAPI-011 fix made "method not found" return 403 via the existing
|
||
`Forbidden` factory instead. The misleading `NotFound` factory is therefore now
|
||
**removed entirely** — it has no remaining callers in or out of the module
|
||
(`ApiKeyValidationResult` is InboundAPI-internal), eliminating the name/behaviour
|
||
contradiction. No separate regression test is needed: the InboundAPI-011 tests cover
|
||
the new method-not-found status, and removing dead code cannot regress. Doc-owner
|
||
follow-up: `Component-InboundAPI.md`'s Error Handling section still does not list a
|
||
"method not found" status; it should note that it is reported as 403 (indistinguishable
|
||
from "key not approved"), but that doc edit is outside this module's editable scope.
|
||
|
||
### InboundAPI-014 — `ReturnDefinition` is loaded but never used; script return value is unshaped/unvalidated
|
||
|
||
| | |
|
||
|--|--|
|
||
| Severity | Medium |
|
||
| Category | Design-document adherence |
|
||
| Status | Resolved |
|
||
| Location | `src/ScadaLink.InboundAPI/InboundScriptExecutor.cs:201-205`, `src/ScadaLink.Commons/Entities/InboundApi/ApiMethod.cs:10` |
|
||
|
||
**Description**
|
||
|
||
`Component-InboundAPI.md` ("API Method Definition → Return Value Definition" and the
|
||
"Response Format" section) specifies that each method has a declared return structure
|
||
— "Field names and data types … Supports returning lists of objects" — and that the
|
||
success response body is "the method's return value as JSON, with fields matching the
|
||
return value definition". The `ApiMethod` entity carries a `ReturnDefinition` column
|
||
to hold exactly this. However, nothing in the module ever reads `ReturnDefinition`:
|
||
`ExecuteAsync` takes whatever object the script happens to return and does a blind
|
||
`JsonSerializer.Serialize(result)`. There is no validation that the script's return
|
||
value matches the declared shape, no coercion to the declared field types, and no
|
||
error when a method returns a structure inconsistent with its definition. A method
|
||
whose script returns the wrong shape (or `null` where a structure is required) will
|
||
silently emit a malformed 200 response, and the documented return-definition contract
|
||
is effectively unenforced. This is the response-side mirror of the parameter
|
||
validation that `ParameterValidator` does perform, leaving the two halves of the
|
||
method contract asymmetric.
|
||
|
||
**Recommendation**
|
||
|
||
Either (a) implement return-value validation/shaping: parse `ReturnDefinition` with
|
||
the same extended-type machinery used for parameters and validate/coerce the script
|
||
result before serializing, returning a 500 (or logging) when the script result does
|
||
not match; or (b) if return shaping is deliberately out of scope, remove the "Return
|
||
Value Definition" / "fields matching the return value definition" language from
|
||
`Component-InboundAPI.md` and document that the response is the script's raw return
|
||
value serialized as-is. Code and design doc must be reconciled.
|
||
|
||
**Resolution**
|
||
|
||
Resolved 2026-05-17 (commit `<pending>`): root cause confirmed — `ApiMethod` carries
|
||
`ReturnDefinition` but the executor did a blind `JsonSerializer.Serialize(result)`,
|
||
so a script returning the wrong shape silently emitted a malformed 200. Took
|
||
option (a): added `ReturnValueValidator`, the response-side mirror of
|
||
`ParameterValidator`. It parses `ReturnDefinition` (a JSON array of `{name,type}`
|
||
field definitions, same extended-type set as parameters), validates the serialized
|
||
script result against it — declared fields must be present with a compatible JSON
|
||
type, primitives type-checked, `Object`/`List` shape-checked — and a `null`/non-object
|
||
result is rejected when a structure is declared. `InboundScriptExecutor.ExecuteAsync`
|
||
now runs the validator after serialization and, on mismatch, logs and returns a
|
||
script failure (`"Method return value did not match its return definition"`, → 500)
|
||
instead of a malformed 200. A method with no `ReturnDefinition` stays unconstrained
|
||
(backward compatible). Doc-owner follow-up (outside this module's editable scope):
|
||
the `Component-InboundAPI.md` "Response Format" section may note that return shaping
|
||
is validation-only (no coercion). Regression tests: `ReturnValueValidatorTests`
|
||
(12 cases) plus executor-level `ReturnValue_MatchingReturnDefinition_Succeeds`,
|
||
`ReturnValue_NotMatchingReturnDefinition_ReturnsFailureNotMalformed200`, and
|
||
`ReturnValue_NoReturnDefinition_IsUnconstrained`.
|
||
|
||
### InboundAPI-015 — `ForbiddenApiChecker` is purely textual and is bypassable via reflection reachable without a forbidden namespace token
|
||
|
||
| | |
|
||
|--|--|
|
||
| Severity | Medium |
|
||
| Category | Security |
|
||
| Status | Resolved |
|
||
| Location | `src/ScadaLink.InboundAPI/ForbiddenApiChecker.cs:63-119`, `src/ScadaLink.InboundAPI/InboundScriptExecutor.cs:109-126` |
|
||
|
||
**Description**
|
||
|
||
`ForbiddenApiChecker` walks the script syntax tree and rejects any `using` directive,
|
||
`QualifiedNameSyntax`, or `MemberAccessExpressionSyntax` whose textual dotted name
|
||
starts with a forbidden namespace prefix (`System.IO`, `System.Diagnostics`,
|
||
`System.Reflection`, `System.Net`, etc.). This is a textual match, not a semantic
|
||
one, and the trust model it enforces (per InboundAPI-005) is explicitly meant to keep
|
||
*untrusted* Design-role scripts away from host APIs. The check can be bypassed because
|
||
forbidden functionality is reachable through member access that never spells a
|
||
forbidden namespace:
|
||
|
||
- `typeof(string).Assembly.GetType("System.IO.File")` — `typeof(string)` is permitted,
|
||
`.Assembly` is a `System.Type` property, `.GetType(string)` is a `System.Reflection.Assembly`
|
||
method. The string literal `"System.IO.File"` is a string, not a `QualifiedNameSyntax`
|
||
or `MemberAccessExpressionSyntax`, so `IsForbidden` never sees it. The script obtains
|
||
a `System.IO.File` `Type` and can `InvokeMember`/`GetMethod(...).Invoke(...)` on it —
|
||
all via members of permitted types — with no forbidden namespace ever appearing in
|
||
the source. `CompileAndRegister` references `typeof(object).Assembly`
|
||
(System.Private.CoreLib) in `ScriptOptions`, so every framework type is loadable at
|
||
runtime.
|
||
- The executor also references the `Microsoft.CSharp.RuntimeBinder` assembly
|
||
(`InboundScriptExecutor.cs:116`), enabling the `dynamic` keyword, which further
|
||
widens late-bound member access that the static walker cannot see through.
|
||
|
||
Because the inbound API script runs on the central node with the host process's
|
||
privileges and is authored by the (less-trusted-than-Admin) Design role, a static
|
||
textual deny-list gives a false sense of containment.
|
||
|
||
**Recommendation**
|
||
|
||
Treat the syntax walker as defence-in-depth, not the boundary. Strengthen it where
|
||
cheap (flag `Assembly.GetType`, `Type.GetType`, `Activator.CreateInstance`,
|
||
`InvokeMember`, and `dynamic` usage), but for real enforcement run compiled scripts
|
||
under a genuine boundary — a restricted `AssemblyLoadContext`/AppDomain-equivalent, a
|
||
curated reference set that does not expose reflection-to-arbitrary-type, or an
|
||
out-of-process sandbox — consistent with however the Site Runtime ultimately enforces
|
||
its instance-script trust model. At minimum, document in `Component-InboundAPI.md`
|
||
that the current check is best-effort and does not stop a determined script.
|
||
|
||
**Resolution**
|
||
|
||
Resolved 2026-05-17 (commit `<pending>`): the specific reflection-via-permitted-member
|
||
vector was confirmed and the textual checker materially hardened against it (full
|
||
sandboxing remains a separate, larger design effort — see below). `ForbiddenApiWalker`
|
||
now, in addition to the namespace deny-list, rejects a curated set of reflection-gateway
|
||
**member names** (`GetType`, `GetTypeInfo`, `Assembly`, `Module`, `CreateInstance`,
|
||
`InvokeMember`, `GetMethod(s)`, `GetConstructor(s)`, `GetField(s)`, `GetProperty(ies)`,
|
||
`GetMember(s)`, `GetRuntimeMethod(s)`, `MethodHandle`, `TypeHandle`) regardless of the
|
||
receiver expression — so `typeof(string).Assembly.GetType("System.IO.File")` is now
|
||
caught because `.Assembly` and `.GetType` appear as accessed member names. It also
|
||
rejects a bare `Activator` identifier and the `dynamic` keyword (which widens
|
||
late-bound access the static walker cannot see through). `Invoke` is deliberately
|
||
**not** flagged so legitimate `Action`/`Func` delegate invocation still compiles —
|
||
the reflection `MethodInfo.Invoke` path is cut off by rejecting the `GetMethod` that
|
||
produces the `MethodInfo`. **Documented limitation:** this is hardened defence-in-depth,
|
||
not a true sandbox — a determined author may still find a vector the syntax walker
|
||
cannot see (e.g. via `Microsoft.CSharp.RuntimeBinder` internals or generics tricks).
|
||
Genuine containment needs a runtime boundary (restricted `AssemblyLoadContext` /
|
||
curated reference set that does not expose reflection-to-arbitrary-type / out-of-process
|
||
sandbox); that is tracked as a future design change and noted in the `ForbiddenApiChecker`
|
||
XML summary. Regression tests: new `ForbiddenApiCheckerTests` suite (19 cases) covering
|
||
the `Assembly`/`GetType`/`Type.GetType`/`Activator.CreateInstance`/`InvokeMember`/
|
||
`GetMethod`/`GetTypeInfo`/`dynamic` bypass vectors plus permitted-script and
|
||
namespace-deny-list regression guards.
|
||
|
||
### InboundAPI-016 — Routed `Route.To().Call()` invocations are not bound by the method timeout
|
||
|
||
| | |
|
||
|--|--|
|
||
| Severity | Medium |
|
||
| Category | Design-document adherence |
|
||
| Status | Resolved |
|
||
| Location | `src/ScadaLink.InboundAPI/RouteHelper.cs:59-152`, `src/ScadaLink.InboundAPI/InboundScriptExecutor.cs:177`, `:199` |
|
||
|
||
**Description**
|
||
|
||
`Component-InboundAPI.md` states the per-method timeout "defines the maximum time the
|
||
method is allowed to execute (**including any routed calls to sites**)", and the
|
||
Routing Behavior section says a routed call "blocks until the site responds or the
|
||
**method-level timeout** is reached". The executor builds a linked
|
||
`CancellationTokenSource` (`cts`) combining the request-abort token and a dedicated
|
||
timeout CTS, and exposes `cts.Token` to the script as `InboundScriptContext.CancellationToken`.
|
||
However, every `RouteTarget` method (`Call`, `GetAttribute(s)`, `SetAttribute(s)`)
|
||
takes `CancellationToken cancellationToken = default` and the script must *explicitly*
|
||
pass the context token for the routed call to honour the timeout. A natural script —
|
||
`Route.To("inst").Call("doWork", parameters)` — invokes the routed call with
|
||
`CancellationToken.None`. That request flows into `CommunicationService.RouteToCallAsync`
|
||
with no cancellation, so the routed call is not bounded by the method timeout at all.
|
||
The only timeout guard left is `handler(context).WaitAsync(cts.Token)` in
|
||
`ExecuteAsync`: when the method timeout fires, `WaitAsync` returns a cancellation to
|
||
the caller, but the underlying script `Task` — and the in-flight `RouteToCallAsync`
|
||
awaiting a remote site — keeps running orphaned with no cancellation, holding the
|
||
correlation/communication resources until the site eventually responds or its own
|
||
transport timeout (if any) fires. The design's guarantee that the method timeout
|
||
covers routed calls is therefore not met, and a slow/hung site can leak background
|
||
work past the timeout the caller was told bounds the request.
|
||
|
||
**Recommendation**
|
||
|
||
Make routed calls inherit the method deadline without relying on script discipline:
|
||
have `RouteHelper`/`RouteTarget` carry the executing method's `CancellationToken`
|
||
(injected by `InboundScriptExecutor` when it constructs the context, e.g. a
|
||
`RouteHelper` bound to `cts.Token`) and pass it into every `CommunicationService`
|
||
call by default, so `Route.To("x").Call("s", p)` is timeout-bounded with no token
|
||
argument. Keep the explicit-token overload for callers that want a tighter bound.
|
||
Verify `RouteToCallAsync` and the attribute-routing calls actually observe the token
|
||
and abandon the in-flight request when it fires.
|
||
|
||
**Resolution**
|
||
|
||
Resolved 2026-05-17 (commit `<pending>`): root cause confirmed — every `RouteTarget`
|
||
method took `CancellationToken cancellationToken = default`, so a natural script
|
||
`Route.To("inst").Call("doWork", p)` routed with `CancellationToken.None` and was not
|
||
bound by the method timeout at all. `RouteHelper` now carries the executing method's
|
||
deadline token: `InboundScriptExecutor.ExecuteAsync` calls the new
|
||
`RouteHelper.WithDeadline(cts.Token)` when it builds the script context, so the route
|
||
helper handed to the script is bound to the method-level timeout CTS. Each
|
||
`RouteTarget` method resolves an *effective* token — the explicitly-supplied token if
|
||
the caller passed one (tighter bound preserved), otherwise the method deadline — and
|
||
forwards it into both `IInstanceLocator` site resolution and the routed call. The
|
||
deadline token therefore flows through to `CommunicationService.RouteTo*Async`, so
|
||
an in-flight routed call observes cancellation when the method timeout fires instead
|
||
of running orphaned. Regression tests (in the new `RouteHelperTests`):
|
||
`Call_WithNoExplicitToken_InheritsMethodDeadlineToken`,
|
||
`Call_WhenMethodDeadlineCancelled_RoutedCallObservesCancellation`,
|
||
`Call_ExplicitToken_OverridesDeadlineToken`,
|
||
`GetAttributes_WithNoExplicitToken_InheritsMethodDeadlineToken`,
|
||
`SetAttributes_WithNoExplicitToken_InheritsMethodDeadlineToken`.
|
||
|
||
### InboundAPI-017 — `RouteHelper` / `RouteTarget` has no test coverage
|
||
|
||
| | |
|
||
|--|--|
|
||
| Severity | Low |
|
||
| Category | Testing coverage |
|
||
| Status | Resolved |
|
||
| Location | `src/ScadaLink.InboundAPI/RouteHelper.cs:1-165`, `tests/ScadaLink.InboundAPI.Tests/` |
|
||
|
||
**Description**
|
||
|
||
`RouteHelper`/`RouteTarget` is the entire WP-4 cross-site routing surface — the
|
||
`Route.To().Call()/GetAttribute(s)/SetAttribute(s)` API that inbound API scripts use
|
||
to reach instances at any site. It has zero tests: the `ScadaLink.InboundAPI.Tests`
|
||
project covers `ApiKeyValidator`, `ParameterValidator`, `InboundScriptExecutor`, and
|
||
`InboundApiEndpointFilter`, but no test file exercises `RouteHelper`. Untested
|
||
behaviours include site resolution via `IInstanceLocator` (including the
|
||
"instance not found / no assigned site" `InvalidOperationException` path at
|
||
`RouteHelper.cs:154-164`), the `!response.Success` → `InvalidOperationException`
|
||
translation in each routed method, `GetAttribute` delegating to the batch
|
||
`GetAttributes` and returning `null` for an absent key, correlation-ID generation,
|
||
and `SetAttribute` delegating to `SetAttributes`. These are non-trivial branches
|
||
whose failure modes (a thrown exception inside a script) surface to the caller as a
|
||
500, so regressions would be silent.
|
||
|
||
**Recommendation**
|
||
|
||
Add a `RouteHelperTests` suite using substituted `IInstanceLocator` and
|
||
`CommunicationService` (the executor tests already substitute `CommunicationService`):
|
||
cover the happy path of each routed method, the unresolved-instance throw, the
|
||
`!Success` → `InvalidOperationException` mapping, and `GetAttribute` returning `null`
|
||
for a missing key. This also gives InboundAPI-016 a regression home if the timeout
|
||
wiring is added.
|
||
|
||
**Resolution**
|
||
|
||
Resolved 2026-05-17 (commit `<pending>`): confirmed — `ScadaLink.InboundAPI.Tests` had
|
||
no file exercising `RouteHelper`/`RouteTarget`. To make the surface testable without a
|
||
live actor system, an `IInstanceRouter` seam was introduced in the module (the routing
|
||
transport `RouteHelper` depends on); the production `CommunicationServiceInstanceRouter`
|
||
delegates to `CommunicationService` and is registered by `AddInboundAPI`. `RouteHelper`
|
||
now depends on `IInstanceLocator` + `IInstanceRouter` (both substitutable). Added the
|
||
`RouteHelperTests` suite (15 cases) covering: the happy path of `Call`/`GetAttribute(s)`/
|
||
`SetAttribute(s)`, correlation-ID generation, the unresolved-instance
|
||
`InvalidOperationException` path, the `!Success` → `InvalidOperationException` mapping
|
||
for each routed method, `GetAttribute` delegating to the batch `GetAttributes` and
|
||
returning `null` for an absent key, `SetAttribute` delegating to `SetAttributes`, and
|
||
the InboundAPI-016 deadline-token inheritance behaviour. All 15 pass.
|
||
|
||
### InboundAPI-018 — `AuditWriteMiddleware` fires `WriteAsync` as `_ = task` — faulted async writes are unobserved
|
||
|
||
| | |
|
||
|--|--|
|
||
| Severity | Medium |
|
||
| Category | Error handling & resilience |
|
||
| Status | Open |
|
||
| Location | `src/ScadaLink.InboundAPI/Middleware/AuditWriteMiddleware.cs:257` |
|
||
|
||
**Description**
|
||
|
||
`EmitInboundAudit` calls `_ = _auditWriter.WriteAsync(evt);` — the returned `Task` is
|
||
discarded with the discard operator inside a synchronous `try` block. The wrapping
|
||
`try/catch (Exception ex)` (lines 198–266) only catches a *synchronous* throw before
|
||
the writer returns a task. Once `WriteAsync` returns a task, any exception that
|
||
faults that task (e.g. a DB timeout in the central audit writer, a serialization
|
||
failure, a cancellation that bubbles up) is never observed: it is not logged, it
|
||
does not increment the `CentralAuditWriteFailures` health-monitoring counter the
|
||
design doc references ("Fail-soft semantics" paragraph), and the audit row is
|
||
silently lost. In .NET, unobserved task exceptions are eventually surfaced via
|
||
`TaskScheduler.UnobservedTaskException` and may also be logged by the runtime —
|
||
either way, the middleware itself has no control over what (if anything) happens
|
||
on a fault. The XML doc comment at line 255 claims "the writer itself swallows"
|
||
but this is an implicit cross-component contract: the abstraction
|
||
`ICentralAuditWriter.WriteAsync` returns `Task` and makes no such guarantee, and
|
||
the only test that exercises a throwing writer (`AuditWriter_Throws_*` in
|
||
`AuditWriteMiddlewareTests.cs`) uses an `OnWrite` callback that throws
|
||
*synchronously*, not asynchronously — so the async fault path is not covered by
|
||
tests either.
|
||
|
||
This matters because Component-InboundAPI.md states that audit-emission failures
|
||
must increment `CentralAuditWriteFailures` (Health Monitoring #11) — a counter
|
||
that, with the current fire-and-forget, will under-count async-faulted writes.
|
||
|
||
**Recommendation**
|
||
|
||
Either (a) await the write and rely on the surrounding try/catch to log the
|
||
failure, accepting an extra await on the request hot path; or (b) keep the
|
||
fire-and-forget for latency but attach a `ContinueWith(t => ..., OnlyOnFaulted)`
|
||
that logs the fault and increments the failure counter, so a faulted async write
|
||
is at least observed. Option (b) preserves "audit emission never blocks the HTTP
|
||
response" while restoring the visibility the design assumes. Add a regression
|
||
test using a writer whose `WriteAsync` returns a faulted `Task` (not a
|
||
synchronous throw) to pin the new contract.
|
||
|
||
### InboundAPI-019 — `EnableBuffering()` called unconditionally on every request, including bodyless requests
|
||
|
||
| | |
|
||
|--|--|
|
||
| Severity | Low |
|
||
| Category | Performance & resource management |
|
||
| Location | `src/ScadaLink.InboundAPI/Middleware/AuditWriteMiddleware.cs:141` |
|
||
| Status | Open |
|
||
|
||
**Description**
|
||
|
||
`InvokeAsync` always calls `ctx.Request.EnableBuffering()` before the empty-body
|
||
short-circuit at `ReadBufferedRequestBodyAsync` line 289 (`if (request.ContentLength
|
||
is 0) return (null, false);`). `EnableBuffering()` swaps the request stream for a
|
||
`FileBufferingReadStream` whose construction allocates an internal buffer (default
|
||
threshold ~30 KB before spilling to a temp file) regardless of whether the request
|
||
actually has a body. The /api scope this middleware lives under will see at least
|
||
some bodyless requests (e.g. GET `/api/audit/query` once that route is in the same
|
||
branch — see InboundAPI-025; future health checks; misbehaving clients) and each
|
||
one pays the buffering allocation cost for no benefit.
|
||
|
||
**Recommendation**
|
||
|
||
Defer the `EnableBuffering()` call into `ReadBufferedRequestBodyAsync` after the
|
||
`ContentLength is 0` check, or short-circuit in `InvokeAsync` before enabling
|
||
buffering when `ContentLength is 0` and `Method is "GET" or "HEAD" or "DELETE"`.
|
||
The win is a per-request `FileBufferingReadStream` allocation avoided on every
|
||
bodyless request through the middleware.
|
||
|
||
### InboundAPI-020 — `ContentType.Contains("json")` is case-sensitive; `application/JSON` with no Content-Length skips body parsing
|
||
|
||
| | |
|
||
|--|--|
|
||
| Severity | Low |
|
||
| Category | Correctness & logic bugs |
|
||
| Status | Open |
|
||
| Location | `src/ScadaLink.InboundAPI/EndpointExtensions.cs:70` |
|
||
|
||
**Description**
|
||
|
||
`HandleInboundApiRequest` parses the JSON body only when
|
||
`httpContext.Request.ContentLength > 0 || httpContext.Request.ContentType?.Contains("json") == true`.
|
||
The `string.Contains(string)` overload used here is case-sensitive — a perfectly
|
||
valid HTTP header `Content-Type: application/JSON` (uppercase) would yield
|
||
`false` (`"application/JSON".Contains("json")` is `false`). With no
|
||
Content-Length (e.g. chunked transfer-encoding) and an uppercase content type,
|
||
the handler then leaves `body = null` and `ParameterValidator.Validate` runs
|
||
against a missing body — so a method that declares any required parameter is
|
||
rejected with "Missing required parameters" even though the caller did send a
|
||
well-formed JSON body. HTTP RFC 7230 §3.2 makes header field names case-insensitive
|
||
but is silent on values; in practice clients do sometimes uppercase media-type
|
||
tokens, and the framework's own `MediaTypeHeaderValue` is case-insensitive.
|
||
|
||
**Recommendation**
|
||
|
||
Use the case-insensitive overload —
|
||
`httpContext.Request.ContentType?.Contains("json", StringComparison.OrdinalIgnoreCase) == true`
|
||
— or rely on the framework's `IsJson` check via
|
||
`MediaTypeHeaderValue.TryParse`/`HttpRequest.HasJsonContentType()`. Add a
|
||
regression test posting with `application/JSON` and Transfer-Encoding: chunked.
|
||
|
||
### InboundAPI-021 — `ParentExecutionId` correlation flows only through `Call`; attribute reads/writes lose the inbound→site execution-tree link
|
||
|
||
| | |
|
||
|--|--|
|
||
| Severity | Medium |
|
||
| Category | Design-document adherence |
|
||
| Status | Open |
|
||
| Location | `src/ScadaLink.InboundAPI/RouteHelper.cs:141-143`, `:182-183`, `:225-226`; `src/ScadaLink.Commons/Messages/InboundApi/RouteToInstanceRequest.cs:15-21`, `:36-40`, `:55-59` |
|
||
|
||
**Description**
|
||
|
||
CLAUDE.md's Centralized Audit Log section describes `ParentExecutionId` as the
|
||
cross-execution spawn pointer that "every row of a spawned run carries" and
|
||
specifically calls out "the inbound API → routed-site-script case". The current
|
||
implementation honours this only on `RouteToCallRequest` — which carries
|
||
`ParentExecutionId` as its trailing additive field (line 21 of
|
||
`RouteToInstanceRequest.cs`) and is stamped by `RouteTarget.Call` with the
|
||
inbound request's execution id at line 143 of `RouteHelper.cs`.
|
||
|
||
`RouteToGetAttributesRequest` and `RouteToSetAttributesRequest`, however, have
|
||
**no `ParentExecutionId` field** and the matching `RouteTarget.GetAttributes` /
|
||
`SetAttributes` methods (`RouteHelper.cs:182-183`, `:225-226`) never reference
|
||
`_parentExecutionId`. So when an inbound API script reads or writes a site
|
||
attribute via `Route.To("inst").GetAttribute(...)` /
|
||
`Route.To("inst").SetAttribute(...)`, the site-side audit row for that
|
||
trust-boundary action (an outbound-by-the-script DB / OPC write at the site) is
|
||
emitted with `ParentExecutionId = null` and the execution-tree walk
|
||
`IX_AuditLog_ParentExecution` cannot link it back to the spawning inbound
|
||
request. The two-row pair (inbound + spawned site work) reverts to the
|
||
"top-level / null" state the design says is the *fallback* for non-spawned runs.
|
||
The asymmetry between `Call` and `GetAttributes`/`SetAttributes` is also surprising
|
||
— a script author would reasonably expect the same correlation across all
|
||
`Route.To(...)` calls.
|
||
|
||
**Recommendation**
|
||
|
||
Add a trailing `Guid? ParentExecutionId = null` field to
|
||
`RouteToGetAttributesRequest` and `RouteToSetAttributesRequest` (additive
|
||
trailing member, matches the message-evolution rule in CLAUDE.md); stamp it
|
||
from `_parentExecutionId` in `RouteTarget.GetAttributes` and
|
||
`RouteTarget.SetAttributes`; have the site-side handlers thread the field onto
|
||
their emitted audit rows. Add a `RouteHelperTests` regression case asserting
|
||
that an attribute read/write carries the inherited `ParentExecutionId`.
|
||
|
||
### InboundAPI-022 — `IActiveNodeGate` has no production registration in Host — standby-node gating is silently disabled in production
|
||
|
||
| | |
|
||
|--|--|
|
||
| Severity | High |
|
||
| Category | Security |
|
||
| Status | Open |
|
||
| Location | `src/ScadaLink.InboundAPI/IActiveNodeGate.cs`, `src/ScadaLink.InboundAPI/InboundApiEndpointFilter.cs:52-60`; absent from `src/ScadaLink.Host/Program.cs` |
|
||
|
||
**Description**
|
||
|
||
InboundAPI-008's resolution adds `IActiveNodeGate` (lines 17–24 of
|
||
`IActiveNodeGate.cs`) so a standby central node can refuse to serve the inbound
|
||
API. `InboundApiEndpointFilter.InvokeAsync` consults the gate at line 52
|
||
(`var gate = httpContext.RequestServices.GetService<IActiveNodeGate>();`), and
|
||
when `gate is { IsActiveNode: false }` returns HTTP 503. The filter's behaviour
|
||
when **no implementation is registered** (line 51 comment) is to fall through and
|
||
serve the request — the resolution paragraph for InboundAPI-008 closes with:
|
||
|
||
> "Follow-up (outside this module's scope): `ScadaLink.Host` should register an
|
||
> `IActiveNodeGate` implementation backed by `ActiveNodeHealthCheck` /
|
||
> `Cluster.State.Leader` in the central-role branch of `Program.cs` so the gate is
|
||
> actually enforced in production; until then the endpoint defaults to "allow"."
|
||
|
||
A grep of the entire `src/ScadaLink.Host/` tree at `1eb6e97` finds **zero**
|
||
`IActiveNodeGate` registrations: `grep -rn "IActiveNodeGate\|AddSingleton.*ActiveNode"
|
||
src/ScadaLink.Host/` returns no matches. The follow-up was never carried out. So
|
||
in production today the standby central node still serves the inbound API exactly
|
||
as InboundAPI-008 described — executes method scripts, runs `Route.To()` calls,
|
||
races the active node, and may operate against stale singleton state. The new
|
||
infrastructure (interface + filter check) is present but unwired; from the user's
|
||
perspective the original High-severity issue is unresolved in deployed binaries.
|
||
|
||
The design says the inbound API is "Central cluster only (active node)" and
|
||
"fails over with it" — this guarantee is not currently enforced in production.
|
||
|
||
**Recommendation**
|
||
|
||
Register an `IActiveNodeGate` implementation in the central-role branch of
|
||
`ScadaLink.Host/Program.cs`. The natural backing is the existing
|
||
`ActiveNodeHealthCheck` (already wired for `/health/active`) or a direct read of
|
||
`Cluster.Get(actorSystem).State.Leader == Cluster.Get(actorSystem).SelfAddress`.
|
||
Add an integration test in the Host that spins up the central role and asserts
|
||
that the gate is resolvable and returns `IsActiveNode` consistent with cluster
|
||
leader state. Until that wiring lands, this finding is the user-facing
|
||
realisation of the InboundAPI-008 vulnerability.
|
||
|
||
### InboundAPI-023 — `EndpointExtensions.HandleInboundApiRequest` composition wiring has no test coverage
|
||
|
||
| | |
|
||
|--|--|
|
||
| Severity | Low |
|
||
| Category | Testing coverage |
|
||
| Status | Open |
|
||
| Location | `src/ScadaLink.InboundAPI/EndpointExtensions.cs:31-140`, `tests/ScadaLink.InboundAPI.Tests/` |
|
||
|
||
**Description**
|
||
|
||
The endpoint handler `HandleInboundApiRequest` is the wiring composition that
|
||
ties the validator → JSON parse → `ParameterValidator` → `InboundScriptExecutor` →
|
||
result-serialization path together; it is the single piece of code that maps
|
||
validator status codes to HTTP responses, threads the `parentExecutionId` from
|
||
`HttpContext.Items` into the executor, stashes the resolved API key name as
|
||
`AuditActorItemKey`, and emits the request-aborted short-circuit. The test
|
||
project covers each composed component (`ApiKeyValidatorTests`,
|
||
`ParameterValidatorTests`, `InboundScriptExecutorTests`, `RouteHelperTests`,
|
||
`InboundApiEndpointFilter`, `AuditWriteMiddlewareTests`,
|
||
`MiddlewareOrderTests`) but no test exercises `HandleInboundApiRequest` itself —
|
||
so regressions in the wiring (e.g. forgetting to stash the actor name on
|
||
`HttpContext.Items`, the `Contains("json")` case sensitivity from
|
||
InboundAPI-020, or accidentally swapping `validationResult.StatusCode` for a
|
||
literal) are not caught.
|
||
|
||
**Recommendation**
|
||
|
||
Add an `EndpointExtensionsTests` suite using `TestServer` (the same pattern
|
||
`MiddlewareOrderTests` uses) covering: the happy path (200 + body), invalid
|
||
JSON (400), validator 401, validator 403, parameter-validation failure (400),
|
||
script-failure 500, client-aborted short-circuit (`Results.Empty`), and the
|
||
actor-stash invariant (HttpContext.Items[AuditActorItemKey] is set with the
|
||
resolved key name after successful auth, but is absent on auth failures).
|
||
|
||
### InboundAPI-024 — `_knownBadMethods` is unbounded — an attacker can grow the cache by spamming distinct method names against the audit middleware path
|
||
|
||
| | |
|
||
|--|--|
|
||
| Severity | Low |
|
||
| Category | Performance & resource management |
|
||
| Status | Open |
|
||
| Location | `src/ScadaLink.InboundAPI/InboundScriptExecutor.cs:30`, `:77`, `:223`, `:233` |
|
||
|
||
**Description**
|
||
|
||
The InboundAPI-009 fix introduced `_knownBadMethods`, a `ConcurrentDictionary<string, byte>`
|
||
of method names whose Roslyn compilation failed, to short-circuit lazy
|
||
recompilation. It is keyed by `method.Name` and entries are only ever removed
|
||
when `CompileAndRegister` succeeds for the same name (line 83). Practically the
|
||
key space is bounded by the configured method definitions in the database, so
|
||
this is bounded in normal operation. But because the cache is mutated from the
|
||
lazy-compile path at `ExecuteAsync.cs:233`, and `ExecuteAsync` is called from
|
||
`HandleInboundApiRequest` only **after** `ApiKeyValidator.ValidateAsync` has
|
||
returned `Valid` (i.e. a real method exists), the entry is keyed by a name that
|
||
must have already been resolved through `GetMethodByNameAsync` — so this attack
|
||
surface is gated by the configuration database. The finding is therefore mostly
|
||
defensive: there is no rate limit on inbound API calls (deliberate design), so
|
||
if a future change ever causes `ExecuteAsync` to be called for an unvalidated
|
||
caller-supplied method name (e.g. a refactor that moves method-existence
|
||
checking later), this cache would become attacker-controllable.
|
||
|
||
**Recommendation**
|
||
|
||
Optional / defensive: cap `_knownBadMethods` (e.g. an LRU with a fixed size, or
|
||
clear it periodically). At minimum, document the invariant in the executor's
|
||
XML comment that `_knownBadMethods` keys must come from validated
|
||
`ApiMethod.Name` values, so the safety property survives future refactors. No
|
||
immediate change required; this is a watch-list item.
|
||
|
||
### InboundAPI-025 — `AuditWriteMiddleware` runs against the entire `/api/*` branch — emits spurious `ApiInbound` audit rows for `/api/audit/query` and `/api/audit/export`
|
||
|
||
| | |
|
||
|--|--|
|
||
| Severity | Medium |
|
||
| Category | Correctness & logic bugs |
|
||
| Status | Open |
|
||
| Location | `src/ScadaLink.Host/Program.cs:183-185`; consumers: `src/ScadaLink.ManagementService/AuditEndpoints.cs:93-94`; emitter: `src/ScadaLink.InboundAPI/Middleware/AuditWriteMiddleware.cs:175-252` |
|
||
|
||
**Description**
|
||
|
||
`Program.cs` wires the audit middleware as
|
||
`app.UseWhen(ctx => ctx.Request.Path.StartsWithSegments("/api"), branch => branch.UseAuditWriteMiddleware())`
|
||
— scoped to the `/api` *prefix*, not to the `POST /api/{methodName}` route.
|
||
Meanwhile, `ScadaLink.ManagementService/AuditEndpoints.cs` maps
|
||
`MapGet("/api/audit/query", ...)` (line 93) and `MapGet("/api/audit/export", ...)`
|
||
(line 94). Both routes therefore inherit `AuditWriteMiddleware`, which emits an
|
||
`AuditEvent { Channel = AuditChannel.ApiInbound, Kind = AuditKind.InboundRequest, ... }`
|
||
row for every call. The middleware's `ResolveMethodName` falls back to the last
|
||
path segment (lines 446–452), so a GET `/api/audit/query?...` is recorded as if a
|
||
caller had invoked an inbound API method named "query"; an export is recorded
|
||
as method "export". Effects:
|
||
|
||
1. **Audit log is polluted with non-script rows.** The audit log is now
|
||
recording its own query traffic as if it were inbound script invocations,
|
||
contradicting Component-AuditLog.md's scope ("script trust boundary actions").
|
||
2. **Audit reads recursively emit audit writes.** Every audit-log query (e.g.
|
||
from the Central UI Audit Log page or the CLI `audit query` command) writes
|
||
an additional row into `AuditLog`, growing the table on read.
|
||
3. **`Target` is meaningless.** `/api/audit/query` has no method definition, so
|
||
the recorded `Target = "query"` is not joinable to any `ApiMethod` row in
|
||
audit-log drill-ins.
|
||
4. **Wasted resources on health probes / management calls.** Any future routes
|
||
added under `/api/` will inherit the middleware and pay the
|
||
`EnableBuffering`, `CapturedResponseStream`, and `JsonSerializer.Serialize`
|
||
costs even though they are not inbound script invocations.
|
||
|
||
Tests for the audit middleware (`AuditWriteMiddlewareTests`) and pipeline order
|
||
(`MiddlewareOrderTests`) wire the middleware only against the
|
||
`POST /api/{methodName}` route in test hosts, so this production-only
|
||
mis-scoping is not exercised.
|
||
|
||
**Recommendation**
|
||
|
||
Tighten the predicate so the middleware runs only on the inbound API method
|
||
route, not on the `/api/` prefix. Options:
|
||
|
||
- `app.UseWhen(ctx => ctx.Request.Path.StartsWithSegments("/api") && !ctx.Request.Path.StartsWithSegments("/api/audit") && !ctx.Request.Path.StartsWithSegments("/api/management"), ...)`
|
||
— defensive, but fragile to future route additions.
|
||
- Move the audit emission from a pipeline middleware to an `IEndpointFilter`
|
||
applied via `.AddEndpointFilter<>()` on the `MapInboundAPI` registration
|
||
(alongside `InboundApiEndpointFilter`). This makes the scope explicit on the
|
||
one route that needs it and survives future `/api/...` route additions
|
||
unchanged.
|
||
|
||
The endpoint-filter form is the recommended fix — it co-locates the audit-emission
|
||
scope with the route definition and matches how InboundAPI-006/008 gating is
|
||
already wired.
|