scadalink-design/code-reviews/InboundAPI/findings.md
Joseph Doherty 977d7369a7 docs: add code review process and baseline review of all 19 modules
Establishes a per-module code review workflow under code-reviews/ and
records the 2026-05-16 baseline review (commit 9c60592): 241 findings
across all src/ modules (6 Critical, 46 High, 100 Medium, 89 Low).
This is the clean starting point for remediation work.
2026-05-16 18:09:09 -04:00

Code Review — InboundAPI

Field Value
Module src/ScadaLink.InboundAPI
Design doc docs/requirements/Component-InboundAPI.md
Status Reviewed
Last reviewed 2026-05-16
Reviewer claude-agent
Commit reviewed 9c60592
Open findings 13

Summary

The InboundAPI module is small (8 source files) and the happy-path flow (extract key, validate, deserialize parameters, execute script, serialize result) is clean and readable. However, the review surfaced several real problems concentrated in two themes: concurrency and security. The InboundScriptExecutor is a singleton that mutates a plain Dictionary from concurrent ASP.NET request threads with no synchronization, which can corrupt the handler cache or crash the process under load. On the security side, API-key comparison is a non-constant-time database string match (timing oracle), compiled scripts run with no enforcement of the documented script trust model (forbidden APIs such as System.IO/Process/Reflection are fully reachable), and there is no request-body size limit. Error handling has a related gap: the executor's OperationCanceledException handler reports genuine client disconnects as a "timeout". Design-doc adherence is also incomplete: the Database.Connection() script API described in the design doc is entirely absent from InboundScriptContext, and the endpoint never enforces that the API is central-only. Testing covers the validators well, but there is no coverage of the HTTP endpoint, concurrency, or recompilation. None of the findings are data-loss-class, but the unsynchronized handler cache (InboundAPI-001), the key comparison (InboundAPI-003), and the trust-model gap (InboundAPI-005) are High severity and should be addressed before production use.

Checklist coverage

| # | Category | Examined | Notes |
|---|----------|----------|-------|
| 1 | Correctness & logic bugs | | CoerceValue returns null for legitimately-null/String values indistinguishably; parameter-definition edge cases noted. |
| 2 | Akka.NET conventions | | Module is ASP.NET-hosted, no actors of its own; routes to actors via CommunicationService. No correlation-ID issues; IDs are set in RouteHelper. |
| 3 | Concurrency & thread safety | | Singleton InboundScriptExecutor mutates a non-thread-safe Dictionary from concurrent request threads; see InboundAPI-001/002. |
| 4 | Error handling & resilience | | Catch-all conflates client cancellation with timeout (InboundAPI-004); compilation-failure path repeats work on every request (InboundAPI-009). |
| 5 | Security | | Non-constant-time key comparison, no trust-model enforcement, no body-size limit, missing-method enumeration oracle; see InboundAPI-003/005/006/011. |
| 6 | Performance & resource management | | Up to 3 separate DB round-trips per request in ApiKeyValidator; uncapped lazy recompilation. |
| 7 | Design-document adherence | | Database.Connection() script API missing; central-only hosting not enforced; lazy-compile diverges from "compiled at startup". |
| 8 | Code organization & conventions | | ParameterDefinition is an API-shaped POCO declared in the component project rather than Commons; otherwise conventions followed. |
| 9 | Testing coverage | | Good unit coverage of the two validators; no endpoint, concurrency, recompilation, or timeout-vs-cancel tests. |
| 10 | Documentation & comments | | ApiKeyValidationResult.NotFound XML/name says "NotFound" but returns HTTP 400, which is misleading (InboundAPI-013). |

Findings

InboundAPI-001 — Singleton script handler cache mutated without synchronization

Severity High
Category Concurrency & thread safety
Status Open
Location src/ScadaLink.InboundAPI/InboundScriptExecutor.cs:17, :32, :40, :89, :123-128

Description

InboundScriptExecutor is registered as a singleton (ServiceCollectionExtensions.cs:11) and its handler cache is a plain Dictionary<string, Func<...>> (InboundScriptExecutor.cs:17). RegisterHandler, RemoveHandler, CompileAndRegister, and the lazy-compile path in ExecuteAsync all read and write this dictionary with no lock. ASP.NET serves inbound API requests on concurrent thread-pool threads, so two requests for an as-yet-uncompiled method (or a request racing a CLI-triggered CompileAndRegister) can mutate the dictionary concurrently. Dictionary is explicitly not safe for concurrent read/write — this can corrupt internal buckets, throw InvalidOperationException, or return a torn/null handler, crashing the request or the process.

Recommendation

Replace the Dictionary with a ConcurrentDictionary<string, Func<...>>, or guard all access with a lock. For the lazy-compile path use GetOrAdd so concurrent first-callers compile at most once.

Resolution

Unresolved.

InboundAPI-002 — Lazy compilation is a check-then-act race with no atomicity

Severity Medium
Category Concurrency & thread safety
Status Open
Location src/ScadaLink.InboundAPI/InboundScriptExecutor.cs:123-129

Description

ExecuteAsync does if (!_scriptHandlers.TryGetValue(...)) { CompileAndRegister(method); handler = _scriptHandlers[method.Name]; }. Even setting aside the unsynchronized dictionary (InboundAPI-001), this is a check-then-act sequence: between TryGetValue failing and the re-read on line 128, another thread could call RemoveHandler for the entry, causing the indexer on line 128 to throw KeyNotFoundException. Nothing handles that exception locally; it is caught only by the broad catch on line 143 and reported to the caller as "Internal script error". Multiple concurrent first-callers will also each compile the same script redundantly (wasted Roslyn work).

Recommendation

Make compile-and-fetch a single atomic operation (ConcurrentDictionary.GetOrAdd with a lazily-evaluated factory, or a per-method lock), and have CompileAndRegister return the handler it produced rather than requiring a separate dictionary read.
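
The atomic compile-and-fetch recommended here could look roughly like the sketch below. HandlerCache, GetOrCompile, and the Func<string, string> handler shape are hypothetical stand-ins, not the module's real types (the actual handler signature is elided in this review). The value is wrapped in Lazy<T> because ConcurrentDictionary.GetOrAdd may invoke its value factory more than once under contention, whereas only one Lazy instance is ever published and its factory runs at most once.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Hypothetical stand-in for the executor's handler cache; all names are illustrative.
public sealed class HandlerCache
{
    private readonly ConcurrentDictionary<string, Lazy<Func<string, string>>> _handlers = new();

    // Exposed only so a test can observe how often compilation actually ran.
    public int CompileCount;

    // Returns the cached handler, compiling it at most once per method name.
    public Func<string, string> GetOrCompile(
        string methodName,
        Func<string, Func<string, string>> compile)
    {
        var lazy = _handlers.GetOrAdd(
            methodName,
            name => new Lazy<Func<string, string>>(
                () =>
                {
                    Interlocked.Increment(ref CompileCount);
                    return compile(name);
                },
                LazyThreadSafetyMode.ExecutionAndPublication));

        // Concurrent first-callers block here until the single compilation finishes.
        return lazy.Value;
    }

    // Drops the cached entry so the next call recompiles (e.g. after a definition update).
    public bool Remove(string methodName) => _handlers.TryRemove(methodName, out _);
}
```

One design consequence worth weighing: Lazy<T> in ExecutionAndPublication mode also caches an exception thrown by its factory, so under this sketch a failing compilation would not be retried until Remove is called.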

Resolution

Unresolved.

InboundAPI-003 — API key compared with non-constant-time string equality

Severity High
Category Security
Status Open
Location src/ScadaLink.ConfigurationDatabase/Repositories/InboundApiRepository.cs:22-23, consumed by src/ScadaLink.InboundAPI/ApiKeyValidator.cs:33

Description

API-key authentication resolves the key with FirstOrDefaultAsync(k => k.KeyValue == keyValue), which is translated to a SQL WHERE KeyValue = @p comparison. The secret is therefore matched with an ordinary (early-exit) string/SQL comparison rather than a constant-time comparison, a classic timing side-channel for secret material. Combined with the design's explicit "no rate limiting" decision, an attacker with network access to the central API can make unlimited guesses and use response timing to try to recover valid keys. The API key is the sole credential for the inbound API, so this is the primary authentication path.

Recommendation

Look the key up by a non-secret indexed identifier (e.g. a key prefix/id) or fetch candidate rows, then verify the secret in-process using CryptographicOperations.FixedTimeEquals over the UTF-8 bytes. Preferably store only a salted hash of the key value and compare hashes. Avoid leaking secret-length and match-position timing.
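
A minimal sketch of the in-process verification step, assuming keys are high-entropy random strings and that the stored column holds a SHA-256 digest rather than the key itself. ApiKeyHasher and its members are hypothetical names; the per-key salt mentioned above is omitted for brevity and would be mixed into Hash in a fuller version.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper; not part of the reviewed module.
public static class ApiKeyHasher
{
    // Digest of the presented key over its UTF-8 bytes (unsalted in this sketch).
    public static byte[] Hash(string key) =>
        SHA256.HashData(Encoding.UTF8.GetBytes(key));

    // Compares digests without an early exit on the first differing byte.
    // FixedTimeEquals returns false immediately only when the lengths differ,
    // which cannot happen here because both sides are SHA-256 digests.
    public static bool Matches(string presentedKey, byte[] storedHash) =>
        CryptographicOperations.FixedTimeEquals(Hash(presentedKey), storedHash);
}
```

The row would still be located by a non-secret identifier first; this helper only replaces the secret comparison itself.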

Resolution

Unresolved.

InboundAPI-004 — Client disconnect is misreported as a script timeout

Severity Medium
Category Error handling & resilience
Status Open
Location src/ScadaLink.InboundAPI/InboundScriptExecutor.cs:117-141

Description

ExecuteAsync creates a linked CTS from httpContext.RequestAborted and the method timeout, then catches OperationCanceledException and unconditionally returns "Script execution timed out". When the client aborts the request (RequestAborted fires), the same exception type is thrown, so a normal client disconnect is logged as a timeout (_logger.LogWarning("Script execution timed out ...")) and an attempt is made to write a 500 timeout body to an already-gone connection. This pollutes the failure log (which the design says is reserved for genuine script errors) and obscures real timeout incidents.

Recommendation

Distinguish the two cancellation sources: if cancellationToken (the request token) is cancelled, treat it as a client abort — do not log a timeout and do not attempt to write a response. Only when the timeout CTS fired should the result be "timed out". Check cts.Token.IsCancellationRequested && !cancellationToken.IsCancellationRequested or use a dedicated timeout CancellationTokenSource so the two are separable.
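
One way to keep the two cancellation sources separable is sketched below, using a dedicated timeout CancellationTokenSource alongside the linked one. ScriptOutcome, CancellationClassifier, and RunAsync are hypothetical names, and the script is reduced to a delegate taking a token; the real executor's types differ.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public enum ScriptOutcome { Completed, TimedOut, ClientAborted }

// Hypothetical helper illustrating the classification only.
public static class CancellationClassifier
{
    public static async Task<ScriptOutcome> RunAsync(
        Func<CancellationToken, Task> script,
        TimeSpan timeout,
        CancellationToken requestAborted)
    {
        // The timeout has its own source so it can be inspected independently.
        using var timeoutCts = new CancellationTokenSource(timeout);
        using var linked = CancellationTokenSource.CreateLinkedTokenSource(
            requestAborted, timeoutCts.Token);
        try
        {
            await script(linked.Token);
            return ScriptOutcome.Completed;
        }
        catch (OperationCanceledException) when (requestAborted.IsCancellationRequested)
        {
            // Client went away: no timeout log entry, no response body.
            return ScriptOutcome.ClientAborted;
        }
        catch (OperationCanceledException) when (timeoutCts.IsCancellationRequested)
        {
            return ScriptOutcome.TimedOut;
        }
        // An OperationCanceledException from neither source propagates to the caller.
    }
}
```

If both sources have fired by the time the exception is observed, this sketch reports a client abort, on the assumption that writing a response is pointless in that case.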

Resolution

Unresolved.

InboundAPI-005 — Compiled API scripts run with no script-trust-model enforcement

Severity High
Category Security
Status Open
Location src/ScadaLink.InboundAPI/InboundScriptExecutor.cs:56-93

Description

CLAUDE.md's Akka.NET conventions state the script trust model forbids System.IO, Process, Threading, Reflection, and raw network access. CompileAndRegister compiles arbitrary C# with CSharpScript.Create and only restricts the default imports (WithImports("System", ...)). Imports are a convenience, not a sandbox — a script can still fully-qualify any type (System.IO.File.Delete(...), System.Diagnostics.Process.Start(...), System.Reflection, raw Socket) because the core framework assemblies are referenced and Roslyn scripting performs no API allow/deny-listing. Inbound API scripts execute on the central node with the host process's privileges, so a malicious or buggy method definition has full host access. Note the Design role authors these scripts (less trusted than Admin), making enforcement material.

Recommendation

Add a compile-time analyzer/SyntaxWalker (as the Site Runtime does for instance scripts) that rejects forbidden namespaces/types before registering a handler, and/or run scripts under a constrained boundary. At minimum, share the Site Runtime's forbidden-API checker so the trust model is enforced consistently. Reject the method (and log) when a violation is found instead of registering it.

Resolution

Unresolved.

InboundAPI-006 — No request body size limit on the inbound endpoint

Severity Medium
Category Security
Status Open
Location src/ScadaLink.InboundAPI/EndpointExtensions.cs:54-62

Description

HandleInboundApiRequest calls JsonDocument.ParseAsync(httpContext.Request.Body, ...) with no explicit body-size cap and no [RequestSizeLimit]/endpoint metadata. Although Kestrel has a default max request body size (30,000,000 bytes, roughly 28.6 MB), this endpoint accepts arbitrary JSON from external systems, fully buffers it into a JsonDocument, and then Clone()s the root element (:61), which materializes the entire document on the heap. With no rate limiting (a deliberate design choice), a single caller can drive large allocations. Deep/wide JSON also makes the CoerceValue object/list deserialization (ParameterValidator.cs:113,117) expensive.

Recommendation

Set an explicit, modest body-size limit on the endpoint (.WithMetadata(new RequestSizeLimitAttribute(...)) or IHttpMaxRequestBodySizeFeature) and consider a JsonDocumentOptions MaxDepth. Reject oversized bodies with 413 before buffering.
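
The sketch below illustrates the two limits in a framework-independent way: it reads at most maxBytes + 1 bytes from the body stream and parses with an explicit MaxDepth. BoundedJsonReader and TryParseAsync are hypothetical names; in the real endpoint the size cap would more naturally come from the ASP.NET Core mechanisms named above, with the null result here mapped to a 413 response.

```csharp
using System;
using System.IO;
using System.Text.Json;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper; not part of the reviewed module.
public static class BoundedJsonReader
{
    // Returns null when the body exceeds maxBytes; throws JsonException for
    // malformed JSON or nesting deeper than maxDepth.
    public static async Task<JsonElement?> TryParseAsync(
        Stream body, int maxBytes, int maxDepth, CancellationToken ct = default)
    {
        // One extra byte lets us detect "more than maxBytes" without reading further.
        var buffer = new byte[maxBytes + 1];
        int total = 0;
        while (total < buffer.Length)
        {
            int read = await body.ReadAsync(buffer.AsMemory(total, buffer.Length - total), ct);
            if (read == 0) break; // end of stream
            total += read;
        }

        if (total > maxBytes) return null; // oversized: reject before parsing

        var options = new JsonDocumentOptions { MaxDepth = maxDepth };
        using var doc = JsonDocument.Parse(new ReadOnlyMemory<byte>(buffer, 0, total), options);
        return doc.RootElement.Clone(); // detach from the pooled document
    }
}
```

Allocating the full buffer up front is acceptable only because the limit is assumed to be modest; a larger limit would call for incremental buffering.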

Resolution

Unresolved.

InboundAPI-007 — Database.Connection() script API from the design doc is not implemented

Severity Medium
Category Design-document adherence
Status Open
Location src/ScadaLink.InboundAPI/InboundScriptExecutor.cs:155-170

Description

Component-InboundAPI.md ("Script Runtime API -> Database Access") specifies Database.Connection("connectionName") as an available script capability for querying the configuration/machine-data databases. InboundScriptContext exposes only Parameters, Route, and CancellationToken — there is no Database member. Any method script that follows the documented API will fail to compile. Either the code is incomplete or the design doc is stale; the two must be reconciled.

Recommendation

If database access is in scope, add a Database property to InboundScriptContext backed by a connection-factory service. If it is not, remove the "Database Access" section from Component-InboundAPI.md so the design doc stops advertising an absent API.

Resolution

Unresolved.

InboundAPI-008 — Inbound API endpoint not restricted to the active central node

Severity Medium
Category Design-document adherence
Status Open
Location src/ScadaLink.InboundAPI/EndpointExtensions.cs:19-23, src/ScadaLink.Host/Program.cs:149

Description

The design states the Inbound API is "Central cluster only (active node)" and "fails over with it". MapInboundAPI registers POST /api/{methodName} unconditionally, and Program.cs maps it inside the central-role branch but with no active-node gating — unlike /health/active which has an active-node predicate. A standby central node will happily serve inbound API calls, executing scripts and Route.To() calls from a non-leader, which can race the active node or run against stale singleton state.

Recommendation

Gate the endpoint on active-node status (reuse the cluster active-node health check or a leader-state check) and return 503 on the standby, so Traefik/clients only reach the live node — consistent with how the Management API and /health/active are treated.

Resolution

Unresolved.

InboundAPI-009 — Failed compilation is retried on every subsequent request

Severity Low
Category Performance & resource management
Status Open
Location src/ScadaLink.InboundAPI/InboundScriptExecutor.cs:123-128

Description

When a method's script fails to compile, CompileAndRegister returns false and nothing is stored in _scriptHandlers. Every subsequent call to that method re-enters the lazy-compile branch and recompiles the broken script via Roslyn from scratch. Roslyn compilation is expensive; a single broken method definition repeatedly invoked by an external caller (no rate limiting) becomes a CPU amplification vector.

Recommendation

Cache the compilation failure (e.g. store a sentinel handler that immediately returns the compile error, or keep a HashSet of known-bad method names with the diagnostic) so a broken script is compiled at most once until the definition is updated via CompileAndRegister.

Resolution

Unresolved.

InboundAPI-010 — ParameterValidator ignores extra body fields and cannot validate Object/List element types

Severity Low
Category Correctness & logic bugs
Status Open
Location src/ScadaLink.InboundAPI/ParameterValidator.cs:64-90, :112-118

Description

Two related correctness gaps: (1) The validator iterates only over defined parameters; any extra top-level fields in the request body are silently ignored rather than reported, so callers get no feedback on mistyped parameter names. (2) For Object and List types the validator only checks the JSON kind (Object/Array) and then deserializes the raw text with JsonSerializer.Deserialize without further checks. The design's extended type system describes Objects as "named structure with typed fields" and Lists as collections "of objects or primitive types", but no field-level or element-level type validation is performed. Invalid nested structures pass validation and surface only as runtime script errors.

Recommendation

Optionally warn/400 on unexpected body fields. For the extended types, either parse a richer ParameterDefinition (with nested field definitions / element type) and validate recursively, or document explicitly that Object/List are validated only for shape — and update the design doc to match.
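
Detecting unexpected top-level fields could be as small as the following sketch. ExtraFieldCheck and FindUnexpected are hypothetical names, and the ordinal (case-sensitive) name comparison is an assumption; the validator's actual matching rules are not stated in this review.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;

// Hypothetical helper; not part of the reviewed module.
public static class ExtraFieldCheck
{
    // Returns the names of top-level body properties that no parameter definition covers.
    public static IReadOnlyList<string> FindUnexpected(
        JsonElement body, IEnumerable<string> definedNames)
    {
        // Non-object bodies have no named fields to compare.
        if (body.ValueKind != JsonValueKind.Object) return Array.Empty<string>();

        var defined = new HashSet<string>(definedNames, StringComparer.Ordinal);
        return body.EnumerateObject()
                   .Select(property => property.Name)
                   .Where(name => !defined.Contains(name))
                   .ToList();
    }
}
```

The caller can then decide whether a non-empty result is logged as a warning or returned as a 400 listing the offending names.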

Resolution

Unresolved.

InboundAPI-011 — Method-existence check leaks to unapproved callers (enumeration oracle)

Severity Low
Category Security
Status Open
Location src/ScadaLink.InboundAPI/ApiKeyValidator.cs:39-52

Description

ValidateAsync returns 400 Method '{methodName}' not found when the method does not exist, but 403 API key not approved for this method when it exists but the key is not approved. A caller holding any valid enabled key can therefore enumerate which method names exist on the central API by observing 400-vs-403 responses. The error message also echoes the caller-supplied methodName back verbatim into the JSON response (EndpointExtensions.cs:47), a minor reflected-input concern.

Recommendation

Return an indistinguishable response (e.g. 403/404) for both "method not found" and "key not approved" so existence is not observable to unapproved callers. Avoid echoing raw caller input in error bodies, or sanitize it.
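
As a sketch of the indistinguishable response, both denial reasons can be collapsed into one status code and one fixed message that contains no caller input. MethodAccess, AccessResponses, the choice of 403, and the message text are all illustrative assumptions, not the module's contract.

```csharp
using System;

public enum MethodAccess { Approved, MethodNotFound, KeyNotApproved }

// Hypothetical mapping from validation outcome to wire response.
public static class AccessResponses
{
    public static (int StatusCode, string Message) ToResponse(MethodAccess access) => access switch
    {
        MethodAccess.Approved => (200, "OK"),
        // Same status and same fixed body for both cases, so existence is not observable
        // and the caller-supplied method name is never echoed back.
        MethodAccess.MethodNotFound or MethodAccess.KeyNotApproved => (403, "Not permitted"),
        _ => throw new ArgumentOutOfRangeException(nameof(access)),
    };
}
```

The specific reason can still be written to the server-side log; only the response needs to be uniform. Response timing for the two cases should also be kept similar, which this mapping alone does not address.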

Resolution

Unresolved.

InboundAPI-012 — ParameterDefinition POCO declared in the component project, not Commons

Severity Low
Category Code organization & conventions
Status Open
Location src/ScadaLink.InboundAPI/ParameterValidator.cs:128-133

Description

ParameterDefinition is a persistence-/contract-shaped POCO: it is the deserialized form of ApiMethod.ParameterDefinitions (a column in the configuration database) and describes the public API contract. CLAUDE.md's code-organization rules place persistence-ignorant entity/contract types in ScadaLink.Commons. Defining it inside the InboundAPI project means any other component that needs to read or produce method parameter definitions (e.g. Central UI's method editor, CLI, Management Service) cannot share the type and will duplicate it.

Recommendation

Move ParameterDefinition (and a matching return-definition type, if added) to ScadaLink.Commons under the InboundApi entity/types namespace so it is shared by all components that work with method definitions.

Resolution

Unresolved.

InboundAPI-013 — ApiKeyValidationResult.NotFound factory returns HTTP 400, contradicting its name

Severity Low
Category Documentation & comments
Status Open
Location src/ScadaLink.InboundAPI/ApiKeyValidator.cs:78-79

Description

The static factory is named NotFound and is used for the "method not found" case, but it builds a result with StatusCode = 400 (Bad Request), not 404. The name strongly implies 404 and will mislead future maintainers; EndpointExtensions faithfully propagates whatever status code the factory sets, so the misnaming directly affects the wire contract.

Recommendation

Rename the factory to match its behaviour (e.g. BadRequest) or change the status code to 404 if that is the intended contract — and document the chosen "method not found" status in Component-InboundAPI.md's Error Handling section, which currently does not list it.

Resolution

Unresolved.