docs(code-reviews): comprehensive per-module review pass at 76d35d1
Reviewed all 31 src/ production projects against the 10-category checklist in REVIEW-PROCESS.md. Each module gets its own findings.md; code-reviews/README.md is regenerated from them. 334 findings: 6 Critical, 46 High, 126 Medium, 156 Low. Critical findings: - Server-001: WriteNodeIdUnknown recurses unconditionally — a HistoryRead on an unresolvable node crashes the process (remote DoS). - Admin-001/002: app-wide auth bypass (RouteView not AuthorizeRouteView) plus unauthenticated mutating routes. - Core.Scripting-001: System.Environment reachable from operator scripts; Environment.Exit() terminates the server. - Core.AlarmHistorian-001: rowIds/events parallel-list desync on a corrupt payload misapplies outcomes — silent alarm-event data loss. - Driver.Galaxy-001: ReconnectSupervisor is built but never triggered, so a transient gateway drop permanently kills the event stream. All findings are Status=Open; resolution is tracked per REVIEW-PROCESS.md section 4. Review only — no source code changed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
323
code-reviews/Core.Scripting/findings.md
Normal file
323
code-reviews/Core.Scripting/findings.md
Normal file
@@ -0,0 +1,323 @@
|
||||
# Code Review — Core.Scripting
|
||||
|
||||
| Field | Value |
|
||||
|---|---|
|
||||
| Module | `src/Core/ZB.MOM.WW.OtOpcUa.Core.Scripting` |
|
||||
| Reviewer | Claude Code |
|
||||
| Review date | 2026-05-22 |
|
||||
| Commit reviewed | `76d35d1` |
|
||||
| Status | Reviewed |
|
||||
| Open findings | 11 |
|
||||
|
||||
## Checklist coverage
|
||||
|
||||
A comprehensive review completes every category, recording "No issues found" where
|
||||
a category produced nothing rather than leaving it blank.
|
||||
|
||||
| # | Category | Result |
|
||||
|---|---|---|
|
||||
| 1 | Correctness & logic bugs | Core.Scripting-004, Core.Scripting-005 |
|
||||
| 2 | OtOpcUa conventions | No issues found |
|
||||
| 3 | Concurrency & thread safety | Core.Scripting-006 |
|
||||
| 4 | Error handling & resilience | Core.Scripting-007 |
|
||||
| 5 | Security | Core.Scripting-001, Core.Scripting-002, Core.Scripting-003 |
|
||||
| 6 | Performance & resource management | Core.Scripting-008 |
|
||||
| 7 | Design-document adherence | Core.Scripting-009 |
|
||||
| 8 | Code organization & conventions | No issues found |
|
||||
| 9 | Testing coverage | Core.Scripting-010, Core.Scripting-011 |
|
||||
| 10 | Documentation & comments | No issues found |
|
||||
|
||||
## Findings
|
||||
|
||||
### Core.Scripting-001
|
||||
|
||||
| Field | Value |
|
||||
|---|---|
|
||||
| Severity | Critical |
|
||||
| Category | Security |
|
||||
| Location | `ForbiddenTypeAnalyzer.cs:45`, `ScriptSandbox.cs:54` |
|
||||
| Status | Open |
|
||||
|
||||
**Description:** `System.Environment` lives in the allowed `System` namespace (it is in
|
||||
`System.Private.CoreLib`, which is allow-listed for primitives) and is not on the
|
||||
forbidden-namespace deny-list. Nothing prevents an operator-authored script from calling
|
||||
`System.Environment.Exit(0)` or `System.Environment.FailFast("...")`. Both terminate the
|
||||
host process immediately. Because scripted-alarm predicates and virtual-tag scripts run
|
||||
in-process in the main OPC UA server (decision: "Scripting engine runs in the main .NET 10
|
||||
server process"), a single malicious or buggy predicate brings down the entire server —
|
||||
an outage affecting every connected client and every driver. `ScriptSandboxTests` only
|
||||
pins the *read* path (`Environment.GetEnvironmentVariable`) as an accepted compromise; the
|
||||
process-killing members are not considered. The whole-process kill far exceeds the
|
||||
"read-only process state" justification the test comments rely on.
|
||||
|
||||
**Recommendation:** The forbidden surface must be member-granular, not namespace-granular,
|
||||
for types in allowed namespaces. Add an explicit forbidden-member deny-list to
|
||||
`ForbiddenTypeAnalyzer` covering at minimum `System.Environment.Exit`,
|
||||
`System.Environment.FailFast`, `System.AppDomain`, `System.GC` (e.g. `GC.Collect`,
|
||||
`GC.AddMemoryPressure`), and `System.Activator.CreateInstance` (a reflection-equivalent
|
||||
escape). Reject these in `CheckSymbol` by resolved method symbol, with a test for each.
|
||||
|
||||
**Resolution:** _(open)_
|
||||
|
||||
### Core.Scripting-002
|
||||
|
||||
| Field | Value |
|
||||
|---|---|
|
||||
| Severity | High |
|
||||
| Category | Security |
|
||||
| Location | `ForbiddenTypeAnalyzer.cs:70` |
|
||||
| Status | Open |
|
||||
|
||||
**Description:** The syntax walker only inspects four node kinds:
|
||||
`ObjectCreationExpressionSyntax`, `InvocationExpressionSyntax` with a member-access target,
|
||||
`MemberAccessExpressionSyntax`, and bare `IdentifierNameSyntax`. It never visits
|
||||
`TypeOfExpressionSyntax`, generic type-argument lists (`GenericNameSyntax` /
|
||||
`TypeArgumentListSyntax`), cast expressions (`CastExpressionSyntax`), `is`/`as` type
|
||||
patterns, `default(T)` expressions, array-creation element types, or `using`/local
|
||||
declared types. A script such as `typeof(System.IO.File)`,
|
||||
`new System.Collections.Generic.List<System.IO.FileInfo>()`,
|
||||
`(System.IO.Stream)null`, or `default(System.Reflection.Assembly)` references a forbidden
|
||||
type without ever producing a node the walker examines, so the forbidden-type check is
|
||||
bypassed. The Phase 7 plan A.6 explicitly calls out `typeof` as a sandbox-escape attempt
|
||||
that "must fail at compile" — it currently does not.
|
||||
|
||||
**Recommendation:** Walk every `TypeSyntax` node (handle `TypeOfExpressionSyntax`,
|
||||
`CastExpressionSyntax`, generic argument lists, and the type operand of
|
||||
`IsPatternExpression` / binary `as`). The simplest robust fix is to enumerate all
|
||||
`DescendantNodes()` and, for any node, resolve both `GetSymbolInfo` and `GetTypeInfo`,
|
||||
then check the resolved type plus every type argument. Add tests covering `typeof`,
|
||||
generic arguments, casts, and `default(T)` with forbidden types.
|
||||
|
||||
**Resolution:** _(open)_
|
||||
|
||||
### Core.Scripting-003
|
||||
|
||||
| Field | Value |
|
||||
|---|---|
|
||||
| Severity | Medium |
|
||||
| Category | Security |
|
||||
| Location | `TimedScriptEvaluator.cs:9`, `ScriptSandbox.cs:30` |
|
||||
| Status | Open |
|
||||
|
||||
**Description:** There is no bound on memory a script may allocate or on the number of
|
||||
threads/tasks a script may spawn. The class docs acknowledge unbounded memory as "a budget
|
||||
concern" deferred to v3, but in-process execution means a script doing
|
||||
`new byte[int.MaxValue]` repeatedly (or `Enumerable.Range(0,int.MaxValue).ToList()` — LINQ
|
||||
is allow-listed) can drive the whole server to `OutOfMemoryException`, an outage. The
|
||||
timeout does not help: the allocation can exhaust memory well before 250ms elapses, and
|
||||
the orphaned thread-pool thread documented in `TimedScriptEvaluator` keeps the allocation
|
||||
rooted. `System.Threading.Tasks` is not on the deny-list, so a script can also
|
||||
`Task.Run` an unbounded fan-out of background work that outlives the timeout entirely.
|
||||
|
||||
**Recommendation:** At minimum, document this as a known accepted risk in
|
||||
`docs/ScriptedAlarms.md` / `docs/VirtualTags.md` rather than only in a code comment, and
|
||||
add the `Task`/`Parallel` namespaces to the forbidden list (scripts are synchronous
|
||||
predicates — they have no legitimate need to start background tasks). For memory, gate
|
||||
script authoring behind an Admin permission and treat the test-harness preview as the
|
||||
control point, or track an explicit v3 issue for out-of-process execution. Record the
|
||||
decision so it is not silently lost.
|
||||
|
||||
**Resolution:** _(open)_
|
||||
|
||||
### Core.Scripting-004
|
||||
|
||||
| Field | Value |
|
||||
|---|---|
|
||||
| Severity | Medium |
|
||||
| Category | Correctness & logic bugs |
|
||||
| Location | `DependencyExtractor.cs:73` |
|
||||
| Status | Open |
|
||||
|
||||
**Description:** The walker matches tag-access calls purely by spelling — any
|
||||
`InvocationExpressionSyntax` whose member name is `GetTag` or `SetVirtualTag` is treated as
|
||||
a `ScriptContext` tag access, regardless of the receiver. A script that defines a local
|
||||
type with a `GetTag(string)` method and calls `other.GetTag("X")`, or calls
|
||||
`this.GetTag(...)` on a script-defined helper, has spurious dependencies harvested (or, if
|
||||
the literal arg is non-literal, spurious rejections raised). The XML remarks claim "as long
|
||||
as it's not on the ctx instance, the extractor doesn't pick it up", but the code does not
|
||||
check that the receiver is the `ctx` identifier — it accepts any member access with the
|
||||
matching name. The `DependencyExtractorTests.Ignores_non_ctx_method_named_GetTag` test
|
||||
passes only because the helper there is a *free* function (not member-access form); a
|
||||
member-access call to a non-ctx `GetTag` is untested and would be misattributed.
|
||||
|
||||
**Recommendation:** In `VisitInvocationExpression`, additionally require that
|
||||
`member.Expression` is an `IdentifierNameSyntax` with `Identifier.ValueText == "ctx"`
|
||||
(matching the `ScriptGlobals<TContext>.ctx` field name). Add a test for
|
||||
`someOtherObject.GetTag("X")` asserting it is ignored.
|
||||
|
||||
**Resolution:** _(open)_
|
||||
|
||||
### Core.Scripting-005
|
||||
|
||||
| Field | Value |
|
||||
|---|---|
|
||||
| Severity | Low |
|
||||
| Category | Correctness & logic bugs |
|
||||
| Location | `DependencyExtractor.cs:97` |
|
||||
| Status | Open |
|
||||
|
||||
**Description:** A raw string literal token passed as the tag path (a raw triple-quote
|
||||
literal) tokenizes as `SingleLineRawStringLiteralToken` /
|
||||
`MultiLineRawStringLiteralToken`, not `StringLiteralToken`. The check
|
||||
`literal.Token.IsKind(SyntaxKind.StringLiteralToken)` therefore rejects an
|
||||
otherwise-static raw-string path as a non-literal "dynamic path", producing a misleading
|
||||
rejection message. This is an edge case (operators rarely write raw strings for tag
|
||||
paths) but the error text would confuse anyone who does.
|
||||
|
||||
**Recommendation:** Accept all string-literal token kinds — check
|
||||
`literal.IsKind(SyntaxKind.StringLiteralExpression)` on the expression node, or include
|
||||
the raw-string token kinds, so a static raw string is harvested rather than rejected.
|
||||
|
||||
**Resolution:** _(open)_
|
||||
|
||||
### Core.Scripting-006
|
||||
|
||||
| Field | Value |
|
||||
|---|---|
|
||||
| Severity | Low |
|
||||
| Category | Concurrency & thread safety |
|
||||
| Location | `CompiledScriptCache.cs:55` |
|
||||
| Status | Open |
|
||||
|
||||
**Description:** On a failed compile the `catch` block calls
|
||||
`_cache.TryRemove(key, out _)` without a value comparison. If two threads race a miss for
|
||||
the same bad source, both observe the same faulted `Lazy` and throw, and both call
|
||||
`TryRemove(key)`. If a concurrent retry re-adds a new `Lazy` for that key between the two
|
||||
removals, the second unconditional `TryRemove` could evict the in-flight retry entry. The
|
||||
window is small and the consequence is only a redundant recompile, so severity is Low —
|
||||
but the removal should be key+value scoped for correctness.
|
||||
|
||||
**Recommendation:** Use the `ConcurrentDictionary.TryRemove(KeyValuePair<,>)` overload to
|
||||
remove only the specific faulted `Lazy` instance, so a concurrently re-added entry is not
|
||||
evicted.
|
||||
|
||||
**Resolution:** _(open)_
|
||||
|
||||
### Core.Scripting-007
|
||||
|
||||
| Field | Value |
|
||||
|---|---|
|
||||
| Severity | Medium |
|
||||
| Category | Error handling & resilience |
|
||||
| Location | `TimedScriptEvaluator.cs:60` |
|
||||
| Status | Open |
|
||||
|
||||
**Description:** `RunAsync` wraps the inner run in `Task.Run(...)` and then awaits
|
||||
`WaitAsync(Timeout, ct)`. If the caller-supplied `ct` cancels at roughly the same time the
|
||||
timeout elapses, the order in which `WaitAsync` observes the timeout vs. the cancellation
|
||||
is non-deterministic, so the same shutdown can sometimes surface as
|
||||
`ScriptTimeoutException` and sometimes as `OperationCanceledException`. The class docs
|
||||
assert "the caller's cancel wins" as a hard guarantee that the virtual-tag engine shutdown
|
||||
path depends on to avoid misclassifying shutdown as a script fault — but the
|
||||
implementation does not guarantee it when both fire close together.
|
||||
|
||||
**Recommendation:** After catching `TimeoutException`, check `ct.IsCancellationRequested`
|
||||
and throw `OperationCanceledException(ct)` instead of `ScriptTimeoutException` when the
|
||||
caller's token is cancelled, so caller cancellation deterministically wins regardless of
|
||||
race ordering.
|
||||
|
||||
**Resolution:** _(open)_
|
||||
|
||||
### Core.Scripting-008
|
||||
|
||||
| Field | Value |
|
||||
|---|---|
|
||||
| Severity | Low |
|
||||
| Category | Performance & resource management |
|
||||
| Location | `CompiledScriptCache.cs:34`, `ScriptEvaluator.cs:34` |
|
||||
| Status | Open |
|
||||
|
||||
**Description:** `CompiledScriptCache` has no capacity bound (acknowledged in the class
|
||||
remarks) and no eviction. Each cached `ScriptEvaluator` holds a Roslyn `ScriptRunner<T>`
|
||||
delegate, which keeps the dynamically emitted script assembly loaded for the process
|
||||
lifetime — emitted assemblies in the default `AssemblyLoadContext` cannot be unloaded.
|
||||
`Clear()` drops the dictionary entries but does **not** unload the emitted assemblies;
|
||||
they leak. Across many config-generation publishes (each `Clear()` followed by recompiling
|
||||
every script), the process accumulates dead script assemblies. For the expected "low
|
||||
thousands" of scripts this is benign, but a long-running server with frequent publishes
|
||||
will see steady managed-memory growth that never returns.
|
||||
|
||||
**Recommendation:** Document the per-publish assembly accretion as a known limitation, or
|
||||
compile scripts into a collectible `AssemblyLoadContext` so `Clear()` can unload prior
|
||||
generations. At minimum add a note to `docs/ScriptedAlarms.md` so operators with
|
||||
high-publish-frequency deployments are aware.
|
||||
|
||||
**Resolution:** _(open)_
|
||||
|
||||
### Core.Scripting-009
|
||||
|
||||
| Field | Value |
|
||||
|---|---|
|
||||
| Severity | Low |
|
||||
| Category | Design-document adherence |
|
||||
| Location | `ForbiddenTypeAnalyzer.cs:45` |
|
||||
| Status | Open |
|
||||
|
||||
**Description:** The Phase 7 plan decision #6
|
||||
(`docs/v2/implementation/phase-7-scripting-and-alarming.md`) enumerates the forbidden
|
||||
surface as "No HttpClient / File / Process / reflection". `ForbiddenTypeAnalyzer` actually
|
||||
denies a broader set — `System.Threading.Thread`, `System.Runtime.InteropServices`, and
|
||||
`Microsoft.Win32` (registry) — which is sensible hardening but is undocumented in the plan
|
||||
and in `docs/ScriptedAlarms.md` (which defers sandbox rules to `VirtualTags.md`). An
|
||||
operator reading the design docs cannot predict that a registry or interop reference will
|
||||
be rejected. Conversely the plan does not record the `System.Environment` /
|
||||
`System.Diagnostics` decisions. The code and the design document have drifted.
|
||||
|
||||
**Recommendation:** Update the plan's decision #6 (or `docs/VirtualTags.md`) to list the
|
||||
authoritative deny-list exactly as `ForbiddenTypeAnalyzer.ForbiddenNamespacePrefixes`
|
||||
defines it, including the `System.Environment` allowed-compromise, so the docs match the
|
||||
code.
|
||||
|
||||
**Resolution:** _(open)_
|
||||
|
||||
### Core.Scripting-010
|
||||
|
||||
| Field | Value |
|
||||
|---|---|
|
||||
| Severity | Medium |
|
||||
| Category | Testing coverage |
|
||||
| Location | `tests/Core/ZB.MOM.WW.OtOpcUa.Core.Scripting.Tests/ScriptSandboxTests.cs:54` |
|
||||
| Status | Open |
|
||||
|
||||
**Description:** The sandbox-escape test suite covers only the four obvious vectors
|
||||
(File / Http / Process / Reflection) as direct member-access calls. It does not test:
|
||||
`typeof(forbidden)`, generic type arguments (`List<FileInfo>`), cast expressions to
|
||||
forbidden types, `System.Environment.Exit` / `FailFast`, `System.Threading.Thread`,
|
||||
`System.Runtime.InteropServices`, `Microsoft.Win32` registry access, `Activator`, or
|
||||
`System.AppDomain`. Given that the analyzer is the sole security boundary for in-process
|
||||
untrusted-script execution, the gaps in Core.Scripting-001 and Core.Scripting-002 went
|
||||
undetected precisely because no test exercises those forms. The Phase 7 plan A.6 mandated
|
||||
"sandbox escape tests" but the implemented set is materially narrower than the threat
|
||||
surface.
|
||||
|
||||
**Recommendation:** Add a parameterised escape-test covering every node form in
|
||||
Core.Scripting-002 and every forbidden namespace/member in Core.Scripting-001. Each must
|
||||
assert a `ScriptSandboxViolationException` (or `CompilationErrorException`) at compile.
|
||||
|
||||
**Resolution:** _(open)_
|
||||
|
||||
### Core.Scripting-011
|
||||
|
||||
| Field | Value |
|
||||
|---|---|
|
||||
| Severity | Low |
|
||||
| Category | Testing coverage |
|
||||
| Location | `tests/Core/ZB.MOM.WW.OtOpcUa.Core.Scripting.Tests/` |
|
||||
| Status | Open |
|
||||
|
||||
**Description:** Two source files have no direct test coverage: `ScriptContext`
|
||||
(`Deadband` static helper is exercised only indirectly through `ScriptSandboxTests`, and
|
||||
not for its boundary `tolerance` behaviour) and `ScriptSandbox.Build` itself (the
|
||||
`ArgumentNullException` / `ArgumentException` guards on `contextType` at
|
||||
`ScriptSandbox.cs:45-48` are never asserted). `ScriptLogCompanionSink` and
|
||||
`ScriptLoggerFactory` have tests, but there is no test that a script's `ctx.Logger` Error
|
||||
emission surfaces via the companion sink end-to-end (factory + sink integration is
|
||||
untested). These are minor gaps but leave guard clauses and the logging integration
|
||||
unverified.
|
||||
|
||||
**Recommendation:** Add unit tests for `ScriptSandbox.Build` argument validation, for
|
||||
`ScriptContext.Deadband` at and around the tolerance boundary, and an end-to-end test that
|
||||
a script logging at Error level produces both a `scripts-*.log` event and a companion
|
||||
Warning event.
|
||||
|
||||
**Resolution:** _(open)_
|
||||
Reference in New Issue
Block a user