# Component: Script Analysis
## Purpose
The Script Analysis component is the single authoritative source of truth for the ScadaBridge script trust model. It provides a unified forbidden-API deny-list, a fused semantic and syntactic trust validator, a Roslyn compile wrapper, and compile-only globals stubs used by the design-time deploy gate. All four call sites that enforce the script trust boundary — Template Engine, Site Runtime, Inbound API, and Central UI — delegate to this component rather than maintaining their own divergent implementations.
## Location
`src/ZB.MOM.WW.ScadaBridge.ScriptAnalysis/`
Referenced by: Template Engine, Site Runtime, Inbound API, Central UI.
## Responsibilities
- Define the canonical forbidden-API deny-list (`ScriptTrustPolicy`) as the single source of truth for all trust enforcement decisions across the system.
- Provide an authoritative forbidden-API verdict (`ScriptTrustValidator.FindViolations`) that fuses semantic symbol resolution with syntactic reflection-gateway hardening.
- Wrap Roslyn `CSharpScript` compilation (`RoslynScriptCompiler`) so callers share one implementation of compile + diagnostics extraction.
- Provide compile-only globals stubs (`ScriptCompileSurface`, `TriggerCompileSurface`) that mirror the real execution-time globals member-for-member, allowing the design-time deploy gate to do a real type-checking compile without depending on the execution-time projects.
---
## Requirements
### REQ-SA-1: Trust Policy (`ScriptTrustPolicy`)
`ScriptTrustPolicy` is a static class (or record) that publishes the complete, authoritative forbidden-API policy used at every call site.
#### Forbidden scopes
The following namespace/type prefixes are forbidden in all scripts:
| Scope | Rationale |
|-------|-----------|
| `System.IO` | File system access — forbidden entirely |
| `System.Diagnostics.Process` | Process spawning — forbidden; `Stopwatch`, `Debug`, `Activity`, and other `System.Diagnostics` types are **allowed** |
| `System.Threading` | Raw thread manipulation — forbidden, with the exceptions below |
| `System.Reflection` | Reflection — forbidden entirely |
| `System.Net` | Raw network access — forbidden entirely (scripts must use `ExternalSystem.Call`) |
| `System.Runtime.InteropServices` | Native interop — forbidden entirely |
| `Microsoft.Win32` | Win32 API access — forbidden entirely |
#### Allowed exceptions within forbidden scopes
The following namespaces and types are explicitly allowed despite falling within a forbidden scope:
- `System.Threading.Tasks` (and all subtypes) — async/await support
- `System.Threading.CancellationToken` — cooperative cancellation
- `System.Threading.CancellationTokenSource` — cooperative cancellation
The scoping rationale: `System.Diagnostics.Process` is the dangerous type (spawns processes); `Stopwatch`, `Debug`, and `Activity` are harmless diagnostic utilities. Forbidding the whole `System.Diagnostics` namespace, as some earlier call sites did, was overly broad.
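The scope and exception rules above amount to a prefix match with an allow-list override. The following is a minimal sketch, assuming the policy is held as plain string prefixes. `ForbiddenScopes` and `AllowedExceptions` are names used elsewhere in this document; the class name `ScriptTrustPolicySketch`, the `IsForbidden` helper, and the dot-boundary matching rule are hypothetical and may not match the real `ScriptTrustPolicy`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Illustrative sketch only; the real ScriptTrustPolicy may be shaped differently.
public static class ScriptTrustPolicySketch
{
    public static readonly IReadOnlyList<string> ForbiddenScopes = new[]
    {
        "System.IO",
        "System.Diagnostics.Process",
        "System.Threading",
        "System.Reflection",
        "System.Net",
        "System.Runtime.InteropServices",
        "Microsoft.Win32",
    };

    public static readonly IReadOnlyList<string> AllowedExceptions = new[]
    {
        "System.Threading.Tasks",
        "System.Threading.CancellationToken",
        "System.Threading.CancellationTokenSource",
    };

    // True when `name` equals `prefix` or sits beneath it at a dot boundary,
    // so "System.IOx" does not match "System.IO" (assumed matching rule).
    private static bool IsWithin(string name, string prefix) =>
        name == prefix || name.StartsWith(prefix + ".", StringComparison.Ordinal);

    // Hypothetical helper: allowed exceptions take precedence over forbidden scopes.
    public static bool IsForbidden(string fullyQualifiedName)
    {
        if (AllowedExceptions.Any(e => IsWithin(fullyQualifiedName, e)))
            return false;
        return ForbiddenScopes.Any(s => IsWithin(fullyQualifiedName, s));
    }
}
```

Under these assumptions, `System.Diagnostics.Stopwatch` and `System.Threading.Tasks.Task` pass, while `System.Diagnostics.Process` and `System.Threading.Thread` are rejected.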
#### Reflection-gateway members
The following member names are blocked regardless of the receiver type, to prevent reflection-based bypasses such as `typeof(x).Assembly.GetType("System.IO.File")`:
`GetType`, `GetTypeInfo`, `Assembly`, `Module`, `CreateInstance`, `InvokeMember`, `GetMethod`, `GetMethods`, `GetConstructor`, `GetConstructors`, `GetField`, `GetFields`, `GetProperty`, `GetProperties`, `GetMember`, `GetMembers`, `GetRuntimeMethod`, `GetRuntimeMethods`, `MethodHandle`, `TypeHandle`.
#### Forbidden identifiers
The identifiers `dynamic` and `Activator` are forbidden at any scope, as they provide type-system escape hatches equivalent to reflection.
#### Default references and imports
`ScriptTrustPolicy` also publishes `DefaultReferences` (the canonical set of trusted-platform `MetadataReference` entries used when constructing the Roslyn script compilation context) and `DefaultImports` (the default `using` directives injected into every script). These are consumed by `RoslynScriptCompiler` and by the compile-only surfaces below.
---
### REQ-SA-2: Trust Validator (`ScriptTrustValidator`)
`ScriptTrustValidator.FindViolations(string code, IEnumerable<MetadataReference>? extraReferences = null)` is the **authoritative forbidden-API gate**. It returns a list of violation messages; an empty list means the script is clean.
#### Two-pass design
**Pass 1 — semantic symbol resolution (adapted from Site Runtime)**
- Builds a Roslyn compilation using the full trusted-platform reference set from `ScriptTrustPolicy.DefaultReferences` (plus any `extraReferences`).
- For each identifier in the syntax tree, resolves the underlying symbol to its fully qualified containing namespace and type name.
- Flags any symbol whose containing namespace or type matches a forbidden scope in `ScriptTrustPolicy.ForbiddenScopes`, taking `AllowedExceptions` into account.
- Correctly handles aliases (`using X = System.IO.File`), `using static`, and `global::` prefixes — the resolved symbol is checked, not the spelling.
- Because the full reference set is loaded, this pass also catches a forbidden type accessed inside an otherwise-allowed namespace (e.g., bare `Process` after `using System.Diagnostics;`).
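The semantic pass described above can be sketched with the public Roslyn APIs. This is a sketch under stated assumptions, not the component's actual code: the class `SemanticPassSketch`, the method `FindForbiddenSymbols`, the `isForbidden` callback (standing in for the policy lookup), and the message format are all hypothetical. Because the resolved symbol is checked rather than its spelling, an alias such as `using F = System.IO.File;` is still caught.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

// Illustrative sketch of a semantic (symbol-resolving) pass.
public static class SemanticPassSketch
{
    // Fully qualified names without the "global::" prefix, e.g. "System.IO.File".
    private static readonly SymbolDisplayFormat Qualified = new SymbolDisplayFormat(
        typeQualificationStyle: SymbolDisplayTypeQualificationStyle.NameAndContainingTypesAndNamespaces);

    public static IReadOnlyList<string> FindForbiddenSymbols(
        string code,
        IEnumerable<MetadataReference> references,
        Func<string, bool> isForbidden) // hypothetical stand-in for the policy check
    {
        var tree = CSharpSyntaxTree.ParseText(
            code, CSharpParseOptions.Default.WithKind(SourceCodeKind.Script));
        var compilation = CSharpCompilation.CreateScriptCompilation(
            "TrustCheck", tree, references);
        var model = compilation.GetSemanticModel(tree);

        var violations = new List<string>();
        // SimpleNameSyntax covers both plain identifiers and generic names.
        foreach (var name in tree.GetRoot().DescendantNodes().OfType<SimpleNameSyntax>())
        {
            var symbol = model.GetSymbolInfo(name).Symbol;
            if (symbol is null) continue; // unresolved names are left to the compiler

            // Types and namespaces are checked directly; members are checked
            // through their containing type (or namespace as a fallback).
            ISymbol scope = symbol is INamespaceOrTypeSymbol
                ? symbol
                : (ISymbol)symbol.ContainingType ?? symbol.ContainingNamespace;
            if (scope is null) continue;

            var qualified = scope.ToDisplayString(Qualified);
            if (isForbidden(qualified))
                violations.Add($"Forbidden API '{qualified}' (written as '{name.Identifier.ValueText}')");
        }
        return violations.Distinct().ToList();
    }
}
```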
**Pass 2 — syntactic reflection-gateway and identifier hardening (adapted from Inbound API)**
- Walks the syntax tree for member-access expressions and simple name references.
- Flags any member name found in `ScriptTrustPolicy.ReflectionGatewayMembers`, regardless of receiver type.
- Flags any identifier token found in `ScriptTrustPolicy.ForbiddenIdentifiers` (`dynamic`, `Activator`).
Violations from both passes are merged and deduplicated before being returned.
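The syntactic pass needs no symbol resolution, so it can run on a parse tree alone. A minimal sketch follows, assuming that every simple name in the tree is compared against the two deny-lists. The class `SyntacticPassSketch` and its message format are hypothetical, and the member set here is only a subset of the list in REQ-SA-1.

```csharp
using System.Collections.Generic;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

// Illustrative sketch of a syntactic (name-based) hardening pass.
public static class SyntacticPassSketch
{
    // Subset of the reflection-gateway list, for brevity.
    private static readonly HashSet<string> GatewayMembers = new HashSet<string>
    {
        "GetType", "GetTypeInfo", "Assembly", "Module", "CreateInstance",
        "InvokeMember", "GetMethod", "GetField", "GetProperty",
    };

    private static readonly HashSet<string> ForbiddenIdentifiers = new HashSet<string>
    {
        "dynamic", "Activator",
    };

    public static IReadOnlyList<string> FindViolations(string code)
    {
        var root = CSharpSyntaxTree
            .ParseText(code, CSharpParseOptions.Default.WithKind(SourceCodeKind.Script))
            .GetRoot();

        var violations = new List<string>();
        // Walking every SimpleNameSyntax covers `x.GetType()`, `x?.GetType()`,
        // and bare `GetType()` alike, whatever the receiver is.
        foreach (var name in root.DescendantNodes().OfType<SimpleNameSyntax>())
        {
            var text = name.Identifier.ValueText;
            if (GatewayMembers.Contains(text))
                violations.Add($"Reflection-gateway member '{text}' is not allowed");
            else if (ForbiddenIdentifiers.Contains(text))
                violations.Add($"Identifier '{text}' is not allowed");
        }
        // Deduplicate so repeated uses of one name report once.
        return violations.Distinct().ToList();
    }
}
```

In this sketch the bypass quoted in REQ-SA-1, `typeof(x).Assembly.GetType("System.IO.File")`, is flagged on both `Assembly` and `GetType`, even though `System.IO.File` appears only inside a string literal that the semantic pass cannot see.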
#### Design notes
- `FindViolations` needs no globals type; it operates solely on the script text and the trusted-platform reference set.
- The function is stateless and thread-safe — callers share a single instance or call it as a static method.
- A violation does not abort compilation; callers may choose to report violations and continue, or treat any violation as a hard reject.
---
### REQ-SA-3: Roslyn Compile Wrapper (`RoslynScriptCompiler`)
`RoslynScriptCompiler` wraps `CSharpScript` to give callers a single implementation of compile + diagnostics extraction.
#### `Compile(string code, Type? globalsType = null, IEnumerable<MetadataReference>? extraReferences = null, IEnumerable<string>? extraImports = null)`
- Creates a `CSharpScript` with the given code, `globalsType`, references (defaults from `ScriptTrustPolicy.DefaultReferences` plus `extraReferences`), and imports (defaults from `ScriptTrustPolicy.DefaultImports` plus `extraImports`).
- Calls `.Compile()` and returns the resulting `Diagnostic[]` filtered to errors and warnings.
- Each caller passes its own `globalsType`: `ScriptCompileSurface` for the design-time deploy gate, the real `ScriptGlobals` for Site Runtime execution, or `null` for pure syntax checks.
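A wrapper with this shape can be sketched on top of `CSharpScript.Create` and `ScriptOptions`. The sketch assumes stand-in defaults: the real contents of `ScriptTrustPolicy.DefaultReferences` and `DefaultImports` are not listed in this document, so the single core-library reference and the three imports below are placeholders, and the class name `RoslynScriptCompilerSketch` is hypothetical.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp.Scripting;
using Microsoft.CodeAnalysis.Scripting;

// Illustrative sketch of a compile wrapper around CSharpScript.
public static class RoslynScriptCompilerSketch
{
    // Placeholder for ScriptTrustPolicy.DefaultReferences (contents assumed).
    private static readonly MetadataReference[] DefaultReferences =
    {
        MetadataReference.CreateFromFile(typeof(object).Assembly.Location),
    };

    // Placeholder for ScriptTrustPolicy.DefaultImports (contents assumed).
    private static readonly string[] DefaultImports =
    {
        "System", "System.Collections.Generic", "System.Threading.Tasks",
    };

    public static Diagnostic[] Compile(
        string code,
        Type? globalsType = null,
        IEnumerable<MetadataReference>? extraReferences = null,
        IEnumerable<string>? extraImports = null)
    {
        var options = ScriptOptions.Default
            .WithReferences(DefaultReferences.Concat(
                extraReferences ?? Enumerable.Empty<MetadataReference>()))
            .WithImports(DefaultImports.Concat(
                extraImports ?? Enumerable.Empty<string>()));

        // Nothing is executed; Compile() only binds and type-checks the script.
        var script = CSharpScript.Create(code, options, globalsType);

        return script.Compile()
            .Where(d => d.Severity is DiagnosticSeverity.Error or DiagnosticSeverity.Warning)
            .ToArray();
    }
}
```

Passing a `globalsType` makes that type's public members visible to the script as top-level names, which is what lets one wrapper serve both the stub surface and the real globals.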
#### `ParseDiagnostics(string code)`
- Parses the script text using Roslyn's `CSharpSyntaxTree.ParseText` and returns syntax-level diagnostics (errors and warnings).
- No compilation is performed — useful for fast syntax checks where no globals type is available.
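The difference between a parse-only check and a full compile is easiest to see in code. The sketch below assumes the text is parsed with `SourceCodeKind.Script`, which this document does not state; the class name `ParseDiagnosticsSketch` is hypothetical.

```csharp
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;

// Illustrative sketch of a parse-only syntax check.
public static class ParseDiagnosticsSketch
{
    public static Diagnostic[] ParseDiagnostics(string code)
    {
        // Parsing builds a syntax tree only; no symbols are bound,
        // so unknown names and type errors are not reported here.
        var tree = CSharpSyntaxTree.ParseText(
            code, CSharpParseOptions.Default.WithKind(SourceCodeKind.Script));

        return tree.GetDiagnostics()
            .Where(d => d.Severity == DiagnosticSeverity.Error
                     || d.Severity == DiagnosticSeverity.Warning)
            .ToArray();
    }
}
```

With this sketch, `var x = ;` reports a syntax error, whereas `UndefinedThing.Do();` reports nothing because it is syntactically valid. Catching the latter requires a real compile against a globals type.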
---
### REQ-SA-4: Compile-Only Globals Stubs
The deploy gate in Template Engine must do a real type-checking compile (to catch undefined-symbol and type errors) but cannot depend on the execution-time projects (Site Runtime, Inbound API) that own the real globals. Two compile-only stubs solve this:
#### `ScriptCompileSurface`
Mirrors `ScriptGlobals` member-for-member (same public property names, same return types, same method signatures) but with no implementation bodies. All properties return `default` and all methods return `default` or `Task.CompletedTask`. Depends only on `Commons.Types` — no Akka.NET, no external system clients.
Used by the Template Engine deploy gate:
```csharp
// Returns errors and warnings; an empty result means the script compiled cleanly.
var diagnostics = RoslynScriptCompiler.Compile(code, typeof(ScriptCompileSurface));
```
This allows the compile to bind `Attributes["name"]`, `Notify.To("x").Send(...)`, `ExternalSystem.Call(...)`, and similar API calls against real types, catching undefined-symbol and type-mismatch errors before deployment.
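To make the stub idea concrete, here is a sketch of what such a surface could look like. Only the member names `Attributes`, `Notify`, and `ExternalSystem` and the call shapes `Notify.To("x").Send(...)` and `ExternalSystem.Call(...)` come from this document. Every interface, parameter list, and return type below is hypothetical, since the real signatures live on `ScriptGlobals`.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical API shapes, declared only so the surface has types to expose.
public interface INotifyTarget
{
    Task Send(string message);
}

public interface INotifyApi
{
    INotifyTarget To(string recipient);
}

public interface IExternalSystemApi
{
    Task<object> Call(string system, string operation);
}

// Compile-only surface: signatures without behaviour.
// Every member returns default, so nothing here can run real work.
public class ScriptCompileSurfaceSketch
{
    public IReadOnlyDictionary<string, object> Attributes => default!;
    public INotifyApi Notify => default!;
    public IExternalSystemApi ExternalSystem => default!;
}
```

A script compiled against this type binds `Attributes["name"]` to the dictionary indexer and `Notify.To("x").Send("hi")` to the interface methods, so a misspelled member or a wrong argument type surfaces as a compile error even though none of the members has an implementation.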
#### `TriggerCompileSurface`
Mirrors `TriggerExpressionGlobals` in the same way. Used by `ValidationService.CheckExpressionSyntax` in the Template Engine for conditional and expression trigger validation.
#### Parity guard
A reflection-based parity test in `SiteRuntime.Tests` compares the public member names on `ScriptCompileSurface` against `ScriptGlobals` (and `TriggerCompileSurface` against `TriggerExpressionGlobals`). Any drift between the stub and the real globals causes this test to fail, ensuring the stubs cannot silently fall out of sync.
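A name-level parity check of this kind can be sketched with plain reflection. The helper below is hypothetical (`SurfaceParity` and `FindDrift` are illustrative names). It compares only public instance property and method names declared on each type, which matches the name-based comparison described above but does not verify return types or parameter lists.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

// Illustrative sketch of a stub-versus-real parity check.
public static class SurfaceParity
{
    // Public instance properties and ordinary methods declared on the type itself.
    // Property accessors (get_X / set_X) are special-name methods and are skipped.
    private static HashSet<string> PublicMemberNames(Type type) =>
        type.GetMembers(BindingFlags.Public | BindingFlags.Instance | BindingFlags.DeclaredOnly)
            .Where(m => m is PropertyInfo || (m is MethodInfo mi && !mi.IsSpecialName))
            .Select(m => m.Name)
            .ToHashSet(StringComparer.Ordinal);

    // Returns one message per name present on only one side; empty means in sync.
    public static IReadOnlyList<string> FindDrift(Type stub, Type real)
    {
        var stubNames = PublicMemberNames(stub);
        var realNames = PublicMemberNames(real);

        return realNames.Except(stubNames).Select(n => $"missing on stub: {n}")
            .Concat(stubNames.Except(realNames).Select(n => $"extra on stub: {n}"))
            .OrderBy(s => s, StringComparer.Ordinal)
            .ToList();
    }
}
```

A test would then assert that `FindDrift(typeof(ScriptCompileSurface), typeof(ScriptGlobals))` is empty, and likewise for the trigger surface pair.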
---
### REQ-SA-5: Consumer Delegation
All four consumers that previously maintained their own script trust enforcement now delegate to this component. The key behavioral changes per entry point:
| Consumer | Before | After |
|----------|--------|-------|
| **Template Engine** `ScriptCompiler.TryCompile` | Substring scan + brace-balance (advisory, bypassable) | `FindViolations` + real `Compile` against `ScriptCompileSurface` — authoritative gate |
| **Template Engine** `ValidationService.CheckExpressionSyntax` | Regex / brace scan | `FindViolations` + `Compile` against `TriggerCompileSurface` |
| **Site Runtime** `ScriptCompilationService.ValidateTrustModel` | Semantic resolver, no reflection-gateway hardening | Delegates to `FindViolations`; retains `CSharpScript.Compile` against real `ScriptGlobals` for execution |
| **Inbound API** `ForbiddenApiChecker.FindViolations` | Syntactic walker, forbade all `System.Diagnostics` | Thin shim delegating to `ScriptTrustValidator.FindViolations`; `System.Diagnostics` loosened to `.Process`-only |
| **Central UI** `ScriptAnalysisService` | Semantic + full compile, lenient threading | Delegates forbidden-API verdict and sources editor-marker deny-list from `ScriptTrustPolicy`; retains Test-Run execution host |
The static enforcement is **defence-in-depth**, not a true runtime sandbox. Scripts execute in-process; the denied API list prevents obvious escapes at compile time but does not provide the isolation guarantees of an out-of-process sandbox or a restricted `AssemblyLoadContext`. This caveat applies to all consumers.
---
## Dependencies
- **Commons**: Shared types referenced by `ScriptCompileSurface` and `TriggerCompileSurface` (e.g., `DataType`, attribute access types).
- **Microsoft.CodeAnalysis.CSharp.Scripting**: Roslyn scripting APIs used by `RoslynScriptCompiler` and `ScriptTrustValidator`.
- **Microsoft.CodeAnalysis.CSharp.Workspaces**: Roslyn workspace/syntax APIs used by `ScriptTrustValidator`.
No dependency on Akka.NET, ASP.NET Core, Entity Framework, or any other ScadaBridge component above Commons.
## Interactions
- **Template Engine (#1)**: Consumes `ScriptTrustValidator.FindViolations`, `RoslynScriptCompiler.Compile`, `ScriptCompileSurface`, and `TriggerCompileSurface` in the design-time deploy gate (`ScriptCompiler.TryCompile`) and expression syntax validator (`ValidationService.CheckExpressionSyntax`).
- **Site Runtime (#3)**: `ScriptCompilationService.ValidateTrustModel` delegates the trust verdict to `ScriptTrustValidator.FindViolations`; retains its own `CSharpScript.Compile` against the real `ScriptGlobals` for execution-time compilation.
- **Inbound API (#14)**: `ForbiddenApiChecker.FindViolations` is a thin shim over `ScriptTrustValidator.FindViolations`.
- **Central UI (#9)**: `ScriptAnalysisService` delegates the run-gate forbidden-API verdict and sources the editor-marker deny-list from `ScriptTrustPolicy`; retains the Test-Run execution host (`SandboxScriptHost`).