docs(code-reviews): re-review batch 3 at 39d737e — Host, InboundAPI, ManagementService, NotificationService, Security
21 new findings: Host-012..015, InboundAPI-014..017, ManagementService-014..017, NotificationService-014..018, Security-012..015.
@@ -5,10 +5,10 @@
|
||||
| Module | `src/ScadaLink.InboundAPI` |
|
||||
| Design doc | `docs/requirements/Component-InboundAPI.md` |
|
||||
| Status | Reviewed |
|
||||
| Last reviewed | 2026-05-16 |
|
||||
| Last reviewed | 2026-05-17 |
|
||||
| Reviewer | claude-agent |
|
||||
| Commit reviewed | `9c60592` |
|
||||
| Open findings | 0 |
|
||||
| Commit reviewed | `39d737e` |
|
||||
| Open findings | 4 |
|
||||
|
||||
## Summary
|
||||
|
||||
@@ -30,6 +30,25 @@ well but there is no coverage of the HTTP endpoint, concurrency, or recompilatio
|
||||
None of the findings are data-loss-class, but the concurrency and trust-model issues
|
||||
are High severity and should be addressed before production use.
|
||||
|
||||
#### Re-review 2026-05-17 (commit `39d737e`)
|
||||
|
||||
All 13 findings from the initial review remain `Resolved`; the module source under
|
||||
`src/ScadaLink.InboundAPI` is unchanged since the last InboundAPI fix commit
|
||||
(`8dd7412`), which precedes `39d737e`. This re-review re-walked all 10 checklist
|
||||
categories against the resolved code and surfaced **4 new findings** — none touching
|
||||
the previously-fixed concurrency/trust-model code, but all in areas the first pass
|
||||
did not probe deeply: (1) the `ReturnDefinition` column is loaded onto `ApiMethod`
|
||||
but is never consulted — script return values are serialized verbatim with no shaping
|
||||
or validation against the declared return structure (InboundAPI-014); (2) the new
|
||||
`ForbiddenApiChecker` is a purely textual syntax walker and can be bypassed by
|
||||
reaching forbidden functionality through member access that never spells a forbidden
|
||||
namespace, e.g. `typeof(x).Assembly.GetType("System.IO.File")` (InboundAPI-015);
|
||||
(3) routed `Route.To().Call()` invocations are not bound by the method timeout unless
|
||||
the script explicitly passes the context's `CancellationToken`, contradicting the design
|
||||
statement that the timeout covers routed calls (InboundAPI-016); and (4) `RouteHelper`
|
||||
/ `RouteTarget` — the entire WP-4 cross-site routing surface — has no test coverage
|
||||
(InboundAPI-017). The new findings are three Medium and one Low; none are Critical or High.
|
||||
|
||||
## Checklist coverage
|
||||
|
||||
| # | Category | Examined | Notes |
|
||||
@@ -38,11 +57,11 @@ are High severity and should be addressed before production use.
|
||||
| 2 | Akka.NET conventions | ☑ | Module is ASP.NET-hosted, no actors of its own; routes to actors via `CommunicationService`. No correlation-ID issues — IDs are set in `RouteHelper`. |
|
||||
| 3 | Concurrency & thread safety | ☑ | Singleton `InboundScriptExecutor` mutates a non-thread-safe `Dictionary` from concurrent request threads — see InboundAPI-001/002. |
|
||||
| 4 | Error handling & resilience | ☑ | Catch-all conflates client cancellation with timeout (InboundAPI-004); compilation-failure path repeats work on every request (InboundAPI-009). |
|
||||
| 5 | Security | ☑ | Non-constant-time key comparison, no trust-model enforcement, no body-size limit, missing-method enumeration oracle — see InboundAPI-003/005/006/011. |
|
||||
| 5 | Security | ☑ | Prior items resolved. Re-review: `ForbiddenApiChecker` is a textual deny-list bypassable via reflection without a forbidden namespace token (InboundAPI-015). |
|
||||
| 6 | Performance & resource management | ☑ | Up to 3 separate DB round-trips per request in `ApiKeyValidator`; uncapped lazy recompilation. |
|
||||
| 7 | Design-document adherence | ☑ | `Database.Connection()` script API missing; central-only hosting not enforced; lazy-compile diverges from "compiled at startup". |
|
||||
| 8 | Code organization & conventions | ☑ | `ParameterDefinition` is an API-shaped POCO declared in the component project rather than Commons; otherwise conventions followed. |
|
||||
| 9 | Testing coverage | ☑ | Good unit coverage of the two validators; no endpoint, concurrency, recompilation, or timeout-vs-cancel tests. |
|
||||
| 7 | Design-document adherence | ☑ | Re-review: `ReturnDefinition` loaded but never used (InboundAPI-014); routed-call timeout not enforced (InboundAPI-016). Prior `Database.Connection()`/central-only items resolved. |
|
||||
| 8 | Code organization & conventions | ☑ | `ParameterDefinition` moved to Commons (InboundAPI-012 resolved); no new issues. |
|
||||
| 9 | Testing coverage | ☑ | Re-review: `RouteHelper`/`RouteTarget` (WP-4 routing) entirely untested (InboundAPI-017); validators/executor/filter well covered. |
|
||||
| 10 | Documentation & comments | ☑ | `ApiKeyValidationResult.NotFound` XML/name says "NotFound" but returns HTTP 400 — misleading (InboundAPI-013). |
|
||||
|
||||
## Findings
|
||||
@@ -580,3 +599,181 @@ the new method-not-found status, and removing dead code cannot regress. Doc-owne
|
||||
follow-up: `Component-InboundAPI.md`'s Error Handling section still does not list a
|
||||
"method not found" status; it should note that it is reported as 403 (indistinguishable
|
||||
from "key not approved"), but that doc edit is outside this module's editable scope.
|
||||
|
||||
### InboundAPI-014 — `ReturnDefinition` is loaded but never used; script return value is unshaped/unvalidated
|
||||
|
||||
| | |
|--|--|
| Severity | Medium |
| Category | Design-document adherence |
| Status | Open |
| Location | `src/ScadaLink.InboundAPI/InboundScriptExecutor.cs:201-205`, `src/ScadaLink.Commons/Entities/InboundApi/ApiMethod.cs:10` |
|
||||
|
||||
**Description**
|
||||
|
||||
`Component-InboundAPI.md` ("API Method Definition → Return Value Definition" and the
|
||||
"Response Format" section) specifies that each method has a declared return structure
|
||||
— "Field names and data types … Supports returning lists of objects" — and that the
|
||||
success response body is "the method's return value as JSON, with fields matching the
|
||||
return value definition". The `ApiMethod` entity carries a `ReturnDefinition` column
|
||||
to hold exactly this. However, nothing in the module ever reads `ReturnDefinition`:
|
||||
`ExecuteAsync` takes whatever object the script happens to return and does a blind
|
||||
`JsonSerializer.Serialize(result)`. There is no validation that the script's return
|
||||
value matches the declared shape, no coercion to the declared field types, and no
|
||||
error when a method returns a structure inconsistent with its definition. A method
|
||||
whose script returns the wrong shape (or `null` where a structure is required) will
|
||||
silently emit a malformed 200 response, and the documented return-definition contract
|
||||
is effectively unenforced. This is the response-side mirror of the parameter
|
||||
validation that `ParameterValidator` does perform, leaving the two halves of the
|
||||
method contract asymmetric.
|
||||
|
||||
**Recommendation**
|
||||
|
||||
Either (a) implement return-value validation/shaping: parse `ReturnDefinition` with
|
||||
the same extended-type machinery used for parameters and validate/coerce the script
|
||||
result before serializing, returning a 500 (or logging) when the script result does
|
||||
not match; or (b) if return shaping is deliberately out of scope, remove the "Return
|
||||
Value Definition" / "fields matching the return value definition" language from
|
||||
`Component-InboundAPI.md` and document that the response is the script's raw return
|
||||
value serialized as-is. Code and design doc must be reconciled.
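
A minimal sketch of option (a), assuming the declared return structure can be reduced to a map of field name to expected JSON kind. The storage format of `ReturnDefinition` is not shown in this review, so `ReturnShapeValidator` and the `declaredFields` map are hypothetical names for illustration only, and the extended-type machinery used by `ParameterValidator` is not modelled here.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical helper: checks a script result against a declared flat shape.
public static class ReturnShapeValidator
{
    // declaredFields is an assumed, simplified view of ReturnDefinition:
    // field name -> expected JSON value kind. Booleans would need extra
    // handling because JSON distinguishes the True and False kinds.
    public static IReadOnlyList<string> Validate(
        object? scriptResult,
        IReadOnlyDictionary<string, JsonValueKind> declaredFields)
    {
        var errors = new List<string>();

        // Serialize once so validation sees exactly what the client would see.
        JsonElement root = JsonSerializer.SerializeToElement(scriptResult);

        if (root.ValueKind != JsonValueKind.Object)
        {
            errors.Add($"expected an object, got {root.ValueKind}");
            return errors;
        }

        foreach (var (name, expectedKind) in declaredFields)
        {
            if (!root.TryGetProperty(name, out JsonElement value))
                errors.Add($"missing field '{name}'");
            else if (value.ValueKind != expectedKind)
                errors.Add($"field '{name}': expected {expectedKind}, got {value.ValueKind}");
        }

        return errors;
    }
}
```

Under these assumptions the executor could call `Validate` just before serializing and return a 500 (or log) when the list is non-empty. List-valued returns and nested objects, which the design doc also mentions, would need a recursive variant.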
|
||||
|
||||
**Resolution**
|
||||
|
||||
_Unresolved._
|
||||
|
||||
### InboundAPI-015 — `ForbiddenApiChecker` is purely textual and is bypassable via reflection reachable without a forbidden namespace token
|
||||
|
||||
| | |
|--|--|
| Severity | Medium |
| Category | Security |
| Status | Open |
| Location | `src/ScadaLink.InboundAPI/ForbiddenApiChecker.cs:63-119`, `src/ScadaLink.InboundAPI/InboundScriptExecutor.cs:109-126` |
|
||||
|
||||
**Description**
|
||||
|
||||
`ForbiddenApiChecker` walks the script syntax tree and rejects any `using` directive,
|
||||
`QualifiedNameSyntax`, or `MemberAccessExpressionSyntax` whose textual dotted name
|
||||
starts with a forbidden namespace prefix (`System.IO`, `System.Diagnostics`,
|
||||
`System.Reflection`, `System.Net`, etc.). This is a textual match, not a semantic
|
||||
one, and the trust model it enforces (per InboundAPI-005) is explicitly meant to keep
|
||||
*untrusted* Design-role scripts away from host APIs. The check can be bypassed because
|
||||
forbidden functionality is reachable through member access that never spells a
|
||||
forbidden namespace:
|
||||
|
||||
- `typeof(string).Assembly.GetType("System.IO.File")` — `typeof(string)` is permitted,
|
||||
`.Assembly` is a `System.Type` property, `.GetType(string)` is a `System.Reflection.Assembly`
|
||||
method. The string literal `"System.IO.File"` is a string, not a `QualifiedNameSyntax`
|
||||
or `MemberAccessExpressionSyntax`, so `IsForbidden` never sees it. The script obtains
|
||||
a `System.IO.File` `Type` and can `InvokeMember`/`GetMethod(...).Invoke(...)` on it —
|
||||
all via members of permitted types — with no forbidden namespace ever appearing in
|
||||
the source. `CompileAndRegister` references `typeof(object).Assembly`
|
||||
(System.Private.CoreLib) in `ScriptOptions`, so every framework type is loadable at
|
||||
runtime.
|
||||
- The executor also references the `Microsoft.CSharp.RuntimeBinder` assembly
|
||||
(`InboundScriptExecutor.cs:116`), enabling the `dynamic` keyword, which further
|
||||
widens late-bound member access that the static walker cannot see through.
|
||||
|
||||
Because the inbound API script runs on the central node with the host process's
|
||||
privileges and is authored by the (less-trusted-than-Admin) Design role, a static
|
||||
textual deny-list gives a false sense of containment.
|
||||
|
||||
**Recommendation**
|
||||
|
||||
Treat the syntax walker as defence-in-depth, not the boundary. Strengthen it where
|
||||
cheap (flag `Assembly.GetType`, `Type.GetType`, `Activator.CreateInstance`,
|
||||
`InvokeMember`, and `dynamic` usage), but for real enforcement run compiled scripts
|
||||
under a genuine boundary — a restricted `AssemblyLoadContext`/AppDomain-equivalent, a
|
||||
curated reference set that does not expose reflection-to-arbitrary-type, or an
|
||||
out-of-process sandbox — consistent with however the Site Runtime ultimately enforces
|
||||
its instance-script trust model. At minimum, document in `Component-InboundAPI.md`
|
||||
that the current check is best-effort and does not stop a determined script.
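
As an illustration of the "strengthen it where cheap" step, the sketch below flags reflection entry points by member name rather than by namespace prefix. It assumes the `Microsoft.CodeAnalysis.CSharp` package (Roslyn) is referenced, which seems likely given that the module compiles scripts. `ReflectionEntryPointScanner` and its member list are hypothetical and not part of the reviewed code. Name matching of this kind produces false positives (for example a harmless `obj.GetType()`) and remains defence-in-depth, not a boundary.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

// Hypothetical add-on check: flags member names commonly used to reach
// arbitrary types via reflection, regardless of the spelled namespace.
public static class ReflectionEntryPointScanner
{
    private static readonly HashSet<string> SuspectMembers = new(StringComparer.Ordinal)
    {
        "Assembly", "GetType", "CreateInstance", "InvokeMember", "GetMethod",
    };

    public static IReadOnlyList<string> Scan(string scriptSource)
    {
        var root = CSharpSyntaxTree.ParseText(scriptSource).GetRoot();
        var hits = new List<string>();

        // x.Assembly, x.GetType(...), Activator.CreateInstance(...), etc.
        foreach (var access in root.DescendantNodes().OfType<MemberAccessExpressionSyntax>())
        {
            string name = access.Name.Identifier.ValueText;
            if (SuspectMembers.Contains(name))
                hits.Add(name);
        }

        // 'dynamic' is a contextual keyword and parses as an identifier name.
        foreach (var id in root.DescendantNodes().OfType<IdentifierNameSyntax>())
        {
            if (id.Identifier.ValueText == "dynamic")
                hits.Add("dynamic");
        }

        return hits;
    }
}
```

Run against the bypass from the description, `Scan("var t = typeof(string).Assembly.GetType(\"System.IO.File\");")` reports both `Assembly` and `GetType`, even though no forbidden namespace appears as a syntax node.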
|
||||
|
||||
**Resolution**
|
||||
|
||||
_Unresolved._
|
||||
|
||||
### InboundAPI-016 — Routed `Route.To().Call()` invocations are not bound by the method timeout
|
||||
|
||||
| | |
|--|--|
| Severity | Medium |
| Category | Design-document adherence |
| Status | Open |
| Location | `src/ScadaLink.InboundAPI/RouteHelper.cs:59-152`, `src/ScadaLink.InboundAPI/InboundScriptExecutor.cs:177`, `:199` |
|
||||
|
||||
**Description**
|
||||
|
||||
`Component-InboundAPI.md` states the per-method timeout "defines the maximum time the
|
||||
method is allowed to execute (**including any routed calls to sites**)", and the
|
||||
Routing Behavior section says a routed call "blocks until the site responds or the
|
||||
**method-level timeout** is reached". The executor builds a linked
|
||||
`CancellationTokenSource` (`cts`) combining the request-abort token and a dedicated
|
||||
timeout CTS, and exposes `cts.Token` to the script as `InboundScriptContext.CancellationToken`.
|
||||
However, every `RouteTarget` method (`Call`, `GetAttribute(s)`, `SetAttribute(s)`)
|
||||
takes `CancellationToken cancellationToken = default` and the script must *explicitly*
|
||||
pass the context token for the routed call to honour the timeout. A natural script —
|
||||
`Route.To("inst").Call("doWork", parameters)` — invokes the routed call with
|
||||
`CancellationToken.None`. That request flows into `CommunicationService.RouteToCallAsync`
|
||||
with no cancellation, so the routed call is not bounded by the method timeout at all.
|
||||
The only timeout guard left is `handler(context).WaitAsync(cts.Token)` in
|
||||
`ExecuteAsync`: when the method timeout fires, `WaitAsync` returns a cancellation to
|
||||
the caller, but the underlying script `Task` — and the in-flight `RouteToCallAsync`
|
||||
awaiting a remote site — keeps running orphaned with no cancellation, holding the
|
||||
correlation/communication resources until the site eventually responds or its own
|
||||
transport timeout (if any) fires. The design's guarantee that the method timeout
|
||||
covers routed calls is therefore not met, and a slow/hung site can leak background
|
||||
work past the timeout the caller was told bounds the request.
|
||||
|
||||
**Recommendation**
|
||||
|
||||
Make routed calls inherit the method deadline without relying on script discipline:
|
||||
have `RouteHelper`/`RouteTarget` carry the executing method's `CancellationToken`
|
||||
(injected by `InboundScriptExecutor` when it constructs the context, e.g. a
|
||||
`RouteHelper` bound to `cts.Token`) and pass it into every `CommunicationService`
|
||||
call by default, so `Route.To("x").Call("s", p)` is timeout-bounded with no token
|
||||
argument. Keep the explicit-token overload for callers that want a tighter bound.
|
||||
Verify `RouteToCallAsync` and the attribute-routing calls actually observe the token
|
||||
and abandon the in-flight request when it fires.
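
One possible shape for this, sketched with stand-in types because the real `RouteTarget` and `CommunicationService` signatures are not reproduced in this review. `RoutedCall` and `BoundRouteTarget` are hypothetical names, and the routed call's parameter argument is omitted for brevity. The point is only that the target captures the method token at construction, so the no-token call is bounded by default.

```csharp
#nullable enable
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for the CommunicationService routed-call entry point.
public delegate Task<object?> RoutedCall(string instance, string script, CancellationToken ct);

// Hypothetical route target bound to the executing method's deadline.
public sealed class BoundRouteTarget
{
    private readonly string _instance;
    private readonly RoutedCall _send;
    private readonly CancellationToken _methodToken;

    public BoundRouteTarget(string instance, RoutedCall send, CancellationToken methodToken)
    {
        _instance = instance;
        _send = send;
        _methodToken = methodToken; // e.g. cts.Token from the executor
    }

    // Default path: no token argument, still bounded by the method timeout.
    public Task<object?> Call(string script) =>
        _send(_instance, script, _methodToken);

    // Explicit overload: a caller-supplied token can only tighten the bound,
    // because it is linked with the method token rather than replacing it.
    public async Task<object?> Call(string script, CancellationToken tighter)
    {
        using var linked = CancellationTokenSource.CreateLinkedTokenSource(_methodToken, tighter);
        return await _send(_instance, script, linked.Token);
    }
}
```

With this arrangement a script that writes `Route.To("inst").Call("doWork", ...)` without a token would still have its in-flight request cancelled when the method deadline fires, provided the underlying transport call observes the token.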
|
||||
|
||||
**Resolution**
|
||||
|
||||
_Unresolved._
|
||||
|
||||
### InboundAPI-017 — `RouteHelper` / `RouteTarget` has no test coverage
|
||||
|
||||
| | |
|--|--|
| Severity | Low |
| Category | Testing coverage |
| Status | Open |
| Location | `src/ScadaLink.InboundAPI/RouteHelper.cs:1-165`, `tests/ScadaLink.InboundAPI.Tests/` |
|
||||
|
||||
**Description**
|
||||
|
||||
`RouteHelper`/`RouteTarget` is the entire WP-4 cross-site routing surface — the
|
||||
`Route.To().Call()/GetAttribute(s)/SetAttribute(s)` API that inbound API scripts use
|
||||
to reach instances at any site. It has zero tests: the `ScadaLink.InboundAPI.Tests`
|
||||
project covers `ApiKeyValidator`, `ParameterValidator`, `InboundScriptExecutor`, and
|
||||
`InboundApiEndpointFilter`, but no test file exercises `RouteHelper`. Untested
|
||||
behaviours include site resolution via `IInstanceLocator` (including the
|
||||
"instance not found / no assigned site" `InvalidOperationException` path at
|
||||
`RouteHelper.cs:154-164`), the `!response.Success` → `InvalidOperationException`
|
||||
translation in each routed method, `GetAttribute` delegating to the batch
|
||||
`GetAttributes` and returning `null` for an absent key, correlation-ID generation,
|
||||
and `SetAttribute` delegating to `SetAttributes`. These are non-trivial branches
|
||||
whose failure modes (a thrown exception inside a script) surface to the caller as a
|
||||
500, so regressions would be silent.
|
||||
|
||||
**Recommendation**
|
||||
|
||||
Add a `RouteHelperTests` suite using substituted `IInstanceLocator` and
|
||||
`CommunicationService` (the executor tests already substitute `CommunicationService`):
|
||||
cover the happy path of each routed method, the unresolved-instance throw, the
|
||||
`!Success` → `InvalidOperationException` mapping, and `GetAttribute` returning `null`
|
||||
for a missing key. This also gives InboundAPI-016 a regression home if the timeout
|
||||
wiring is added.
|
||||
|
||||
**Resolution**
|
||||
|
||||
_Unresolved._
|
||||
|
||||