docs(code-review): full review at 4307c381 — 18 modules, 67 findings recorded + remediation tracked

Full per-module re-review of the 16 stale modules (last seen 1eb6e97 / 2026-05-28)
plus first-ever reviews of KpiHistory (#26) and ScriptAnalysis (#25), at HEAD 4307c381.

67 new findings (0 Critical, 6 High, 27 Medium, 34 Low). Remediation in commit
fd618cf1 closed 5 of the 6 Highs and ~33 Medium/Low; the rest are Deferred/Won't Fix
with rationale. Remaining pending (4) are all InboundAPI's Database-helper findings
(IA-026 High .. IA-029), left to the active feat/ipsen-movein effort per owner decision.

Highlights: caught a central-only-delivery security drift (SMTP creds broadcast to
sites — DM-025/SR-031), a never-committed 'Resolved' fix (SiteEventLogging-016 → -024),
an unguarded KPI recorder tick (KH-001), a trust-analyzer fallback weakening (SA-001),
and a native-alarm subscribe-path leak (DCL-023). ScriptAnalysis verdict: trust boundary
is semantically sound (symbol-based) in the production cluster config.

README regenerated; regen-readme.py --check passes (4 pending / 567 total).
This commit is contained in:
Joseph Doherty
2026-06-20 18:02:32 -04:00
parent fd618cf1dc
commit d39089f4ed
19 changed files with 4031 additions and 69 deletions
+407
View File
@@ -0,0 +1,407 @@
# Code Review — ScriptAnalysis
| Field | Value |
|-------|-------|
| Module | `src/ZB.MOM.WW.ScadaBridge.ScriptAnalysis` |
| Design doc | `docs/requirements/Component-ScriptAnalysis.md` |
| Status | Reviewed |
| Last reviewed | 2026-06-20 |
| Reviewer | claude-agent |
| Commit reviewed | `4307c381` |
| Open findings | 0 |
## Summary
This is the first formal review of the ScriptAnalysis component (#25), the single
authoritative source of truth for the ScadaBridge script trust boundary. The module
is small (5 source files, ~700 lines) and well-documented, and its core design is
**sound**: the trust verdict is genuinely **semantic** (Roslyn symbol resolution in
Pass 1), not a string/regex deny-list, so the classic spelling-based evasions —
aliases, `using static`, `global::`, partial-namespace qualification — are correctly
defeated by resolving the symbol rather than the text. The reflection-gateway
hardening (Pass 2) closes the `typeof(x).Assembly.GetType("System.IO.File")` class of
bypass that a namespace-only deny-list cannot see. All four documented consumers
(Template Engine, Site Runtime, Inbound API, Central UI) genuinely delegate to
`ScriptTrustValidator.FindViolations` and **hard-reject** on a non-empty result; none
re-implements or relaxes the policy. A reflection-based parity guard for the
compile-only surfaces exists in `SiteRuntime.Tests`. **No working trust-boundary
bypass was found at HEAD against the production (cluster-hosted) configuration.**
That said, the review surfaced several real issues. The highest-impact one is a
**silent degradation of the semantic pass when `TRUSTED_PLATFORM_ASSEMBLIES` is
unavailable** (single-file/AOT/trimmed host): `AnalysisReferences` falls back to the
5-assembly minimal set, after which a *bare* forbidden type inside an *allowed*
namespace (the documented `Process` via `using System.Diagnostics;` case) no longer
resolves and slips Pass 1 entirely — and Pass 2 never flags bare identifiers, only
dotted spellings (ScriptAnalysis-001). This is the only finding that can weaken the
verdict, and it is narrow (the production hosts carry a TPA list, and the downstream
real `Compile` gate catches it for two of the four consumers), so it is rated High
rather than Critical. The remaining findings are robustness/correctness/doc issues:
the semantic pass swallows nothing but the unresolved-symbol fallback is namespace-
spelling-dependent (Medium); `nameof(...)` over-flags legitimate non-forbidden uses
inconsistently (Low, documented-intentional but worth pinning); `ParseDiagnostics`/
`Compile` drop warnings despite the doc promising "errors and warnings"; there is no
caching/throttling of the Roslyn compilations (performance); the `extraReferences`
XML-doc on `FindViolations` is contradicted by the Central UI consumer; and the
adversarial test corpus, though good, has gaps (no extension-method, no
TPA-fallback, no `file`-scoped/`unsafe`, no verbatim/Unicode-escape identifier
cases). The trust boundary is **semantically sound in production**; the open items
harden the edges and align the spec.
## Checklist coverage
| # | Category | Examined | Notes |
|---|----------|----------|-------|
| 1 | Correctness & logic bugs | ☑ | `nameof` over-flags inconsistently (ScriptAnalysis-005); `GetSyntacticScopes` only returns text for dotted names so a bare unresolved forbidden identifier yields no scope (root of ScriptAnalysis-001/002). |
| 2 | Akka.NET conventions | ☑ | No issues found — module has no actors, no Akka dependency (by design; csproj references only Roslyn + Commons). Stateless static APIs, thread-safe. |
| 3 | Concurrency & thread safety | ☑ | No issues found — all public surfaces are static/stateless; the shared `static readonly` policy collections are read-only after init; `HardeningWalker` is per-call. `AnalysisReferences` eager-built once at type init. |
| 4 | Error handling & resilience | ☑ | `RoslynScriptCompiler.Compile` converts exceptions into a fail-the-gate error entry (fail-safe). `BuildAnalysisReferences` swallows per-assembly read errors and silently falls back to the minimal set — the silent fallback is the resilience gap (ScriptAnalysis-001). |
| 5 | Security | ☑ | Semantic (symbol-based) verdict — no spelling bypass found in production. One conditional degradation (TPA-absent host) weakens Pass 1 to spelling-dependent (ScriptAnalysis-001); `extraReferences` can never cause a false allow (verified). Defence-in-depth only, as documented — not a runtime sandbox (ScriptAnalysis-008). |
| 6 | Performance & resource management | ☑ | No compilation caching/throttling; every gate call re-parses, rebuilds a compilation, and re-runs Roslyn — `FindViolations` builds a fresh `CSharpCompilation` against the *full framework* reference set on every call (ScriptAnalysis-006). |
| 7 | Design-document adherence | ☑ | `ParseDiagnostics`/`Compile` drop warnings though doc says "errors and warnings" (ScriptAnalysis-004); `AnalysisReferences` (full-framework set) is not described in REQ-SA-2, which says DefaultReferences (ScriptAnalysis-007). |
| 8 | Code organization & conventions | ☑ | No issues found — POCO/static layout is clean, namespace correct, depends only on Commons + Roslyn. `ScriptCompileSurface` lives in ScriptAnalysis (not Commons) — acceptable as it mirrors a runtime surface and is consumed here. |
| 9 | Testing coverage | ☑ | Good adversarial reject/allow corpus, but missing: TPA-fallback degradation, extension methods, verbatim/Unicode-escape identifiers, `unsafe`/pointer, comment/trivia, and `AnalysisReferences`/exception paths (ScriptAnalysis-003). |
| 10 | Documentation & comments | ☑ | `FindViolations` XML-doc claim "Forbidden references are NOT added here" is contradicted by the Central UI consumer forwarding `compilation.References` (ScriptAnalysis-007, doc half). Otherwise thorough and accurate. |
## Findings
### ScriptAnalysis-001 — Semantic trust pass silently degrades to spelling-dependent when `TRUSTED_PLATFORM_ASSEMBLIES` is unavailable
| | |
|--|--|
| Severity | High |
| Category | Security |
| Status | Resolved |
| Location | `src/ZB.MOM.WW.ScadaBridge.ScriptAnalysis/ScriptTrustPolicy.cs:146-184`; interacts with `ScriptTrustValidator.cs:111-136`, `:195-213` |
**Description**
`AnalysisReferences` is the reference set that lets Pass 1 (the semantic symbol pass —
the *authoritative* guard) resolve every type a script names to its true namespace.
It is built from `AppContext.GetData("TRUSTED_PLATFORM_ASSEMBLIES")`. When that value
is absent — a single-file deployment, an AOT/trimmed host, or any runtime that does
not publish the TPA list — `BuildAnalysisReferences` falls back to the 5-assembly
`DefaultReferences` set (`ScriptTrustPolicy.cs:183`) **silently** (the per-assembly
`catch { }` at `:165`/`:178` and the `Count > 0` fallback at `:183` emit nothing).
After that fallback, a *bare* forbidden type that lives inside an *allowed* namespace
no longer resolves. The policy's own documented example is exactly this case:
`using System.Diagnostics; var p = Process.Start("x");`. `System.Diagnostics` is an
allowed namespace (so `VisitUsingDirective` does not flag the import), and bare
`Process` is an `IdentifierName`. With `Process.dll` absent from the minimal
reference set, `model.GetSymbolInfo` returns no symbol, so Pass 1 falls to
`GetSyntacticScopes`, which returns an empty scope list for a bare (dotless)
identifier (`ScriptTrustValidator.cs:210-212`). Pass 2's `HardeningWalker` only flags
forbidden *dotted* names and forbidden `using` imports — never a bare identifier
token (`VisitIdentifierName`, `:373-395`, only checks `dynamic`/`Activator`). The
forbidden `Process.Start` therefore slips the trust validator entirely. The same
degradation reopens the broader "forbidden type in an allowed namespace reached as a
bare identifier" class that the full-framework reference set was specifically added
(commit `06975720`, M3.6) to close.
This is rated High not Critical because: (a) the production hosts (Akka.NET cluster,
docker topology) run on a normal framework-dependent runtime that publishes the TPA
list, so the full set is loaded; and (b) for two of the four consumers (Template
Engine, Site Runtime) the downstream real `Compile` gate against a curated reference
set independently rejects an undefined `Process` symbol. But Inbound API and the
Central UI run gate rely on `FindViolations` for the reflection/namespace verdict,
and a deployment-mode change that drops the TPA list would silently weaken the
boundary with no diagnostic.
**Recommendation**
Make the degradation loud and fail-safe: when the TPA list is unavailable, log a
warning (or expose a diagnostic flag the consumers can surface), and — more
importantly — add the BCL assemblies that host the forbidden-type-in-allowed-namespace
cases (at minimum `System.Diagnostics.Process`) to `DefaultAssemblies` so they resolve
even in the minimal fallback. Alternatively, extend Pass 2 to flag bare identifiers
whose name matches the simple name of a known forbidden type (e.g. `Process`,
`WebClient`), independent of symbol resolution, so the verdict never depends solely on
the reference set being complete.
**Resolution**
Resolved 2026-06-20 (commit `fd618cf1`): the TPA-absent fallback now folds the
assemblies hosting the forbidden anchor types (Process, Socket, File, Thread, Assembly,
Marshal) into the minimal validation reference set so a bare forbidden type still
resolves and is flagged; an `AnalysisReferencesDegraded` flag + a `Trace.TraceWarning`
make the degraded mode observable. CRITICAL: anchors are added ONLY to the validation
`AnalysisReferences`, never to `DefaultReferences` — keeping the compile gate from being
able to bind `Process`. Verified by a TPA-fallback test that a bare `Process.Start` is
still rejected.
### ScriptAnalysis-002 — Pass 2 (syntactic hardening) is spelling-dependent and cannot catch a bare forbidden identifier on its own
| | |
|--|--|
| Severity | Medium |
| Category | Security |
| Status | Deferred |
| Location | `src/ZB.MOM.WW.ScadaBridge.ScriptAnalysis/ScriptTrustValidator.cs:289-395`; `:195-213` |
**Description**
Pass 2 (`HardeningWalker`) is purely textual: `VisitUsingDirective` flags forbidden
`using` imports (`:289-295`), `VisitQualifiedName`/`VisitMemberAccessExpression` flag
forbidden *dotted* names (`:298-369`), and `VisitIdentifierName` flags only the bare
identifiers `dynamic`/`Activator` (`:373-395`). It cannot independently flag any other
bare forbidden identifier — it relies entirely on Pass 1's semantic resolution for
that. The two passes are therefore not the "two independent guards" the design implies
for the bare-identifier case: for an unresolved bare identifier (see
ScriptAnalysis-001) **neither** pass fires. `GetSyntacticScopes` deliberately returns
nothing for dotless names (`:210-212`, "bare local identifiers cannot reach a forbidden
namespace") — which is true only while Pass 1 resolves them; once resolution fails the
assumption breaks.
This is the structural root of ScriptAnalysis-001 and is recorded separately because
it is a design property (the syntactic pass is not self-sufficient against bare names)
that should be acknowledged even after the TPA-fallback hole is closed: the defence is
single-layered, not double-layered, for the bare-identifier vector.
**Recommendation**
Document explicitly that Pass 1 is the sole guard for bare (single-identifier)
forbidden references and that Pass 2 only backstops *spelled* forbidden namespaces. If
genuine defence-in-depth is wanted for bare names, add a simple-name deny-set to Pass 2
(the simple names of the forbidden root types — `Process`, `WebClient`, `Marshal`,
`Registry`, `Thread`, `File`, `Directory`, etc.), accepting the small false-positive
risk on a script's own identically-named local, which a security gate should tolerate.
**Resolution**
Deferred 2026-06-20: strengthening Pass-2 (syntactic) so a bare forbidden identifier is
caught without Pass-1 symbol resolution is a defence-in-depth enhancement; the
production path (TPA present, two-layer compile gate on most call sites) is sound.
Recorded for a follow-up.
### ScriptAnalysis-003 — Adversarial test corpus has bypass-relevant gaps
| | |
|--|--|
| Severity | Medium |
| Category | Testing coverage |
| Status | Resolved |
| Location | `tests/ZB.MOM.WW.ScadaBridge.ScriptAnalysis.Tests/ScriptTrustValidatorTests.cs`; `RoslynScriptCompilerTests.cs` |
**Description**
The reject/allow corpus is solid for the spelled-namespace, alias, `using static`,
`global::`, reflection-gateway, generic-argument, and nested-lambda vectors. For a
security-critical trust boundary, however, several bypass-relevant cases are untested:
- **TPA-fallback degradation** (ScriptAnalysis-001) — no test forces
`AnalysisReferences` onto the minimal set and asserts that bare `Process`/`WebClient`
is still rejected. This is the highest-value missing test.
- **Extension methods** — no test that a forbidden-namespace extension method invoked
in receiver-position (`x.SomeForbiddenExt()`) is resolved and flagged.
- **Verbatim / Unicode-escaped identifiers** — `@dynamic`, `dynamic`,
`Activator`, or `GetType` are not exercised; `VisitIdentifierName` compares
`Identifier.ValueText`, which decodes Unicode escapes (so it likely holds), but the
case is not pinned, and `dynamic` is matched via `text == "dynamic"` (`:380`) which is
a separate path from the `ForbiddenIdentifiers` set.
- **`unsafe`/pointer/`stackalloc`** — relies implicitly on `ScriptOptions` defaulting
`AllowUnsafe=false`; not asserted.
- **Comment/trivia and string-literal payloads** — no test that a forbidden namespace
inside a comment or string is *not* flagged (false-positive guard) and that a
forbidden token reconstructed only at runtime from strings is correctly *not* the
validator's concern (documents the sandbox caveat).
**Recommendation**
Add the cases above, prioritising the TPA-fallback test (inject a minimal-reference
build of `AnalysisReferences` or factor the reference set so a test can force the
fallback) and the extension-method test. Add a verbatim/Unicode-escape identifier
reject test for `dynamic`/`Activator`/`GetType`.
**Resolution**
Resolved 2026-06-20 (commit `fd618cf1`): added 18 adversarial tests — TPA-fallback
bare-Process/Socket, extension-method invocation, verbatim/Unicode-escaped identifiers,
`unsafe` blocks (benign vs forbidden-API-inside), and comment/string-literal
false-positive guards.
### ScriptAnalysis-004 — `ParseDiagnostics`/`Compile` drop warnings though the design doc promises "errors and warnings"
| | |
|--|--|
| Severity | Low |
| Category | Design-document adherence |
| Status | Resolved |
| Location | `src/ZB.MOM.WW.ScadaBridge.ScriptAnalysis/RoslynScriptCompiler.cs:30-33`, `:73-76` |
**Description**
REQ-SA-3 states `Compile` returns the diagnostics "filtered to errors and warnings"
and `ParseDiagnostics` returns "syntax-level diagnostics (errors and warnings)". The
code filters `d.Severity == DiagnosticSeverity.Error` only in both methods
(`RoslynScriptCompiler.cs:31` and `:74`), so warnings are silently dropped. Filtering
to errors-only is arguably the safer choice for a compile *gate* (a warning should not
block a deploy), but the code and the spec disagree, and a future maintainer reading
the doc will expect warnings in the returned list.
**Recommendation**
Reconcile: either change the doc (REQ-SA-3) to say "errors only", or include warnings
and let callers decide. Given the gate semantics, updating the doc to "error-severity
diagnostics" is the cleaner fix.
**Resolution**
Resolved 2026-06-20 (commit `fd618cf1`): corrected the design doc (REQ-SA-3) to
'error-severity only', matching the compile wrapper's existing behaviour (a warning must
not block a deploy gate). Code unchanged.
### ScriptAnalysis-005 — `nameof(forbiddenType)` flagging is inconsistent between passes and across resolution states
| | |
|--|--|
| Severity | Low |
| Category | Correctness & logic bugs |
| Status | Deferred |
| Location | `src/ZB.MOM.WW.ScadaBridge.ScriptAnalysis/ScriptTrustValidator.cs:42-48`, `:99-136`, `:298-325` |
**Description**
The class comment (`:42-48`) states that a forbidden type inside `nameof(...)` is
"deliberately flagged" as a conservative fail-safe, and `Rejects_NameOf_ForbiddenType`
pins `nameof(System.IO.File)`. However the flagging is incidental, not deliberate, and
inconsistent: `nameof(System.IO.File)` is caught because the inner `System.IO.File` is
a `QualifiedNameSyntax`/member access that both passes inspect directly — the
`nameof` wrapper is irrelevant. A bare `nameof(Process)` (after `using
System.Diagnostics;`, with full TPA loaded) resolves `Process` to a real symbol and
*is* flagged by Pass 1; but the same `nameof(Process)` under the minimal-reference
fallback resolves to nothing and is *not* flagged. So the "deliberate fail-safe"
behaviour actually varies with the reference set, and `nameof` — which produces only a
compile-time string and reaches no API — is a genuine false positive that will reject a
legitimate script doing `nameof(System.Net.IPAddress)` for a log message.
**Recommendation**
Decide the intended semantics explicitly. If `nameof` should be allowed (it reaches no
runtime API), special-case `NameOfExpression`/`InvocationExpression` named `nameof` to
suppress the inner-reference report. If it should be rejected, make that uniform
regardless of resolution state (tie to a simple-name check), and keep the test. Either
way, align the comment with the actual, deterministic behaviour.
**Resolution**
Deferred 2026-06-20: the `nameof(forbiddenType)` over-flagging is a documented
deliberate fail-safe; whether `nameof` (which reaches no API) should be allowed vs
uniformly rejected is a design-owner decision. Recorded; no behaviour change this pass.
### ScriptAnalysis-006 — No caching or throttling of Roslyn compilations; full-framework compilation rebuilt per call
| | |
|--|--|
| Severity | Low |
| Category | Performance & resource management |
| Status | Deferred |
| Location | `src/ZB.MOM.WW.ScadaBridge.ScriptAnalysis/ScriptTrustValidator.cs:73-97`; `RoslynScriptCompiler.cs:50-71` |
**Description**
`FindViolations` parses the script and builds a fresh `CSharpCompilation` against the
**entire framework reference set** (`AnalysisReferences`, potentially hundreds of
`MetadataReference`s) on every call (`:87-95`). `RoslynScriptCompiler.Compile`
similarly creates a new `CSharpScript` and calls `.Compile()` per call. Roslyn
compilation is the expensive part of these calls. The metadata references are cached
(built once in `AnalysisReferences`), which is the main cost, so this is Low rather
than Medium — but for the Central UI editor path (which can call the validator on every
keystroke-debounced analysis) and for batch template validation (many scripts per
deploy), the absence of any per-source memoisation or a bounded compilation cache means
repeated work. There is no back-pressure or concurrency cap either; the Inbound API
review (InboundAPI-009) already noted uncapped recompilation downstream.
**Recommendation**
Consider a small bounded cache keyed on a hash of `(code, globalsType)` for `Compile`
results and on `code` for `FindViolations` verdicts (the verdict is a pure function of
the source and the static policy). At minimum, document that callers on hot paths
(editor live-analysis) should debounce and cache, and confirm the Central UI debounces
before calling.
**Resolution**
Deferred 2026-06-20: compilation caching/throttling (the full-framework
`CSharpCompilation` is rebuilt per `FindViolations` call) is a performance enhancement,
not a correctness defect. Recorded for a follow-up.
### ScriptAnalysis-007 — `FindViolations` `extraReferences` XML-doc contradicts the Central UI consumer; `AnalysisReferences` undocumented in REQ-SA-2
| | |
|--|--|
| Severity | Low |
| Category | Documentation & comments |
| Status | Resolved |
| Location | `src/ZB.MOM.WW.ScadaBridge.ScriptAnalysis/ScriptTrustValidator.cs:57-64`; `ScriptTrustPolicy.cs:125-144`; `docs/requirements/Component-ScriptAnalysis.md:74-80` |
**Description**
Two doc/code mismatches, neither a vulnerability:
1. The `extraReferences` XML-doc on `FindViolations` (`:57-64`) says "Forbidden
references are NOT added here". But the Central UI run gate calls
`FindViolations(tree.ToString(), compilation.References)`
(`CentralUI/ScriptAnalysis/ScriptAnalysisService.cs:1319-1320`), forwarding the full
BCL surface as `extraReferences`. This is in fact *safe* (the deny-list is by
namespace/type, so adding references can only improve resolution, never produce a
false allow — and the policy comment at `ScriptTrustPolicy.cs:138-142` says exactly
that), but the `FindViolations` doc as written implies the caller must not pass
forbidden assemblies, which is both unnecessary and contradicted in practice.
2. REQ-SA-2 in the design doc describes Pass 1 as using "the full trusted-platform
reference set from `ScriptTrustPolicy.DefaultReferences`". The code uses a separate
`ScriptTrustPolicy.AnalysisReferences` (the full framework), and `DefaultReferences`
is explicitly the *minimal* set used by `RoslynScriptCompiler`. The doc conflates the
two and never mentions `AnalysisReferences` or the TPA mechanism — material because
that mechanism is the heart of the security verdict (and of ScriptAnalysis-001).
**Recommendation**
Rewrite the `extraReferences` doc to state that extra references only widen symbol
resolution and never whitelist a forbidden API. Update REQ-SA-2 to describe
`AnalysisReferences` (full-framework, TPA-sourced) versus `DefaultReferences` (minimal,
runtime-fidelity), and note the TPA-fallback behaviour.
**Resolution**
Resolved 2026-06-20 (commit `fd618cf1`): corrected the `FindViolations`
`extraReferences` XML doc (extra references only widen resolution, never whitelist) and
documented `AnalysisReferences` vs `DefaultReferences` in REQ-SA-2.
### ScriptAnalysis-008 — Static trust gate is defence-in-depth, not a sandbox; correctly documented but worth recording as a standing risk
| | |
|--|--|
| Severity | Low |
| Category | Security |
| Status | Won't Fix |
| Location | `src/ZB.MOM.WW.ScadaBridge.ScriptAnalysis/ScriptTrustValidator.cs:34-40`; `docs/requirements/Component-ScriptAnalysis.md:151` |
**Description**
The component is explicit (both in code comments and REQ-SA-5) that it is a
compile-time deny-list, not a runtime sandbox: scripts execute in-process, and a
sufficiently determined script could reconstruct a forbidden API at runtime via paths
the static walker cannot see (e.g. building type names from runtime strings, or any
gap in the deny-list). The reflection-gateway hardening narrows this, and the
`dynamic`/`Activator` bans close the obvious late-binding routes, but the residual risk
is real and inherent to in-process execution of partially-trusted C#. This is recorded
(not as a defect of this module, which behaves as designed) so the standing risk is
visible in the audit trail and is revisited if/when the deferred runtime boundary
(restricted `AssemblyLoadContext` / curated reference set / out-of-process sandbox) is
scoped.
**Recommendation**
Keep the caveat prominent. Track the runtime-boundary work as a roadmap item; in the
meantime ensure the curated *execution* reference sets (SiteRuntime `ScriptAssemblies`,
`RoslynScriptCompiler` `DefaultReferences`) stay minimal so that even a deny-list miss
cannot bind a forbidden type at execution time (this is the real second layer of
defence for the compile-running consumers).
**Resolution**
Won't Fix 2026-06-20: the static gate is intentionally defence-in-depth, not a runtime
sandbox — this is the explicitly-documented design posture, not a defect. Recorded as a
standing risk for visibility.