Phase 7 Stream A.1 — Core.Scripting project scaffold + ScriptContext + sandbox + AST dependency extractor #177

Merged
dohertj2 merged 1 commit from phase-7-stream-a1-core-scripting into v2 2026-04-20 16:29:46 -04:00

First of 3 increments within Stream A. Ships the Roslyn-based script engine's foundation.

What lands

  • Core.Scripting project (net10) + Microsoft.CodeAnalysis.CSharp.Scripting 4.12
  • ScriptContext abstract base — GetTag(string) → DataValueSnapshot, SetVirtualTag(string, object?), Now, Logger, static Deadband(double, double, double) helper
  • ScriptGlobals<TContext> wrapper exposing ctx so scripts see ctx.GetTag(...) — matches the AST walker's parse shape
  • ScriptSandbox.Build(contextType) produces ScriptOptions with a six-assembly allow-list + import list
  • ForbiddenTypeAnalyzer, a post-compile semantic-model pass (defense in depth because .NET 10 type forwarding makes reference-level restriction leaky). Walks every ObjectCreationExpression / InvocationExpression / MemberAccessExpression / IdentifierName, resolves via SemanticModel, rejects namespaces matching System.IO / System.Net / System.Diagnostics / System.Reflection / System.Threading.Thread / System.Runtime.InteropServices / Microsoft.Win32. Throws ScriptSandboxViolationException with aggregated spans.
  • DependencyExtractor, a CSharpSyntaxWalker picking up ctx.GetTag("literal") + ctx.SetVirtualTag("literal", ...) invocations; rejects variable/concatenated/interpolated/method-returned paths with source spans
  • ScriptEvaluator<TContext, TResult> three-step gate: Roslyn compile → ForbiddenTypeAnalyzer → delegate materialization

Tests — 29/29 green

  • DependencyExtractorTests (14): literal extraction, read/write separation, deduplication, every rejection form (variable / concat / interpolation / method-returned / empty / whitespace), non-ctx GetTag ignored, empty/whitespace/null source no-op, rejection span carried, multiple bad paths in one pass, nested literal extraction
  • ScriptSandboxTests (15): happy-path compile+run, SetVirtualTag round-trip, rejection of File.IO / HttpClient / Process.Start / Reflection.Assembly.Load via ScriptSandboxViolationException, Environment.GetEnvironmentVariable explicitly allowed (pinned compromise), script-exception propagation unwrapped, ctx.Now reachable, Deadband reachable, LINQ reachable, DataValueSnapshot usable with quality branching, compile-error location carried in diagnostics

Deliberate compromise pinned by test

System.Environment (read-only process state, doesn't persist or leak outside the process) stays allowed. Tightening that later is a deliberate plan-level decision, not a silent creep.

Next

  • Stream A.2: compile cache (source-hash keyed) + per-evaluation timeout wrapper
  • Stream A.3: dedicated scripts-*.log Serilog rolling sink + companion-WARN enricher to main log
dohertj2 added 1 commit 2026-04-20 16:29:35 -04:00
ScriptContext abstract base defines the API user scripts see as ctx — GetTag(string) returns DataValueSnapshot so scripts branch on quality naturally, SetVirtualTag(string, object?) is the only write path virtual tags have (OPC UA client writes to virtual nodes rejected separately in DriverNodeManager per ADR-002), Now + Logger + Deadband static helper round out the surface. Concrete subclasses in Streams B + C plug in actual tag backends + per-script Serilog loggers.

ScriptSandbox.Build(contextType) produces the ScriptOptions for every compile — explicit allow-list of six assemblies (System.Private.CoreLib / System.Linq / Core.Abstractions / Core.Scripting / Serilog / the context type's own assembly), with a matching import list so scripts don't need using clauses. Allow-list is plan-level — expanding it is not a casual change.

DependencyExtractor uses CSharpSyntaxWalker to find every ctx.GetTag("literal") and ctx.SetVirtualTag("literal", ...) call, rejects every non-literal path (variable, concatenation, interpolation, method-returned). Rejections carry the exact TextSpan so the Admin UI can point at the offending token. Reads + writes are returned as two separate sets so the virtual-tag engine (Stream B) knows both the subscription targets and the write targets.
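The literal-only rule above can be illustrated with a small text-scanning approximation. This is a sketch in Python, not the project's C# and not the Roslyn `CSharpSyntaxWalker` the commit describes: the `extract` and `first_argument` names are hypothetical, and the scanner ignores C# verbatim and raw string forms. It only shows the accept/reject decision, the read/write split, and how a rejection can carry a source span.

```python
import re

# Matches ctx.GetTag( / ctx.SetVirtualTag( but not other.GetTag( or foo.ctx.GetTag(
CALL = re.compile(r"(?<![\w.])ctx\s*\.\s*(GetTag|SetVirtualTag)\s*\(")
# A plain double-quoted literal with backslash escapes (assumption: no @"..." forms)
LITERAL = re.compile(r'"((?:[^"\\]|\\.)*)"')


def first_argument(source, start):
    """Return (text, end) of the argument beginning at `start`, stopping at
    the first top-level ',' or ')'. Characters inside strings are skipped."""
    depth, i, in_string = 0, start, False
    while i < len(source):
        ch = source[i]
        if in_string:
            if ch == "\\":
                i += 1            # skip the escaped character
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch in "([{":
            depth += 1
        elif ch in ")]}":
            if depth == 0:
                break
            depth -= 1
        elif ch == "," and depth == 0:
            break
        i += 1
    return source[start:i], i


def extract(source):
    """Return (reads, writes, rejections); each rejection is
    (offset, length, text) pointing at the offending argument."""
    reads, writes, rejections = set(), set(), []
    source = source or ""         # null/empty source is a no-op
    for m in CALL.finditer(source):
        raw, _ = first_argument(source, m.end())
        arg = raw.strip()
        offset = m.end() + (len(raw) - len(raw.lstrip()))
        lit = LITERAL.fullmatch(arg)
        if lit is None or not lit.group(1).strip():
            # variable, concatenation, interpolation, method call,
            # or an empty/whitespace literal
            rejections.append((offset, len(arg), arg))
            continue
        (reads if m.group(1) == "GetTag" else writes).add(lit.group(1))
    return reads, writes, rejections
```

Because the scan looks for the call prefix rather than whole calls, a `ctx.GetTag("...")` nested inside a `ctx.SetVirtualTag(...)` argument is still picked up, and every bad path in the script is reported in one pass.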

Sandbox enforcement turned out to need a second-pass semantic analyzer because .NET 10's type forwarding makes assembly-level restriction leaky — System.Net.Http.HttpClient resolves even with WithReferences limited to six assemblies. ForbiddenTypeAnalyzer runs after Roslyn's Compile() against the SemanticModel, walks every ObjectCreationExpression / InvocationExpression / MemberAccessExpression / IdentifierName, resolves to the containing type's namespace, and rejects any prefix-match against the deny-list (System.IO, System.Net, System.Diagnostics, System.Reflection, System.Threading.Thread, System.Runtime.InteropServices, Microsoft.Win32). Rejections throw ScriptSandboxViolationException with the aggregated list + source spans so the Admin UI surfaces every violation in one round-trip instead of whack-a-mole. System.Environment explicitly stays allowed (read-only process state, doesn't persist or leak outside) and that compromise is pinned by a dedicated test.
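The deny-list check and the "report everything at once" behaviour can be sketched as below. This is Python rather than the project's C#, and it starts from already-resolved type names instead of a Roslyn SemanticModel. The `Violation`, `SandboxViolationError`, `is_forbidden`, and `check` names are hypothetical, and matching on whole dotted segments is an assumption about what "prefix-match" means here.

```python
from dataclasses import dataclass

# Deny-list entries as listed in the commit message above.
DENY_PREFIXES = (
    "System.IO",
    "System.Net",
    "System.Diagnostics",
    "System.Reflection",
    "System.Threading.Thread",
    "System.Runtime.InteropServices",
    "Microsoft.Win32",
)


@dataclass(frozen=True)
class Violation:
    type_name: str  # fully qualified name the symbol resolved to
    start: int      # span start offset in the script source
    length: int     # span length


class SandboxViolationError(Exception):
    """Hypothetical stand-in for ScriptSandboxViolationException."""

    def __init__(self, violations):
        super().__init__(f"{len(violations)} forbidden type reference(s)")
        self.violations = violations


def is_forbidden(type_name):
    # Segment-aware prefix match (assumption): "System.IO.File" matches
    # "System.IO", while a made-up "System.IOExtras.Foo" does not.
    return any(
        type_name == prefix or type_name.startswith(prefix + ".")
        for prefix in DENY_PREFIXES
    )


def check(resolved_symbols):
    """resolved_symbols: iterable of (type_name, start, length) tuples.
    Raises a single error that carries every violation found."""
    violations = [
        Violation(name, start, length)
        for name, start, length in resolved_symbols
        if is_forbidden(name)
    ]
    if violations:
        raise SandboxViolationError(violations)
```

Collecting all violations before raising is what lets a caller show every offending span after one compile attempt. `System.Environment` passes because no deny-list entry is a prefix of it.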

ScriptGlobals<TContext> wraps the context as a named field so scripts see ctx instead of the bare globalsType-member-access convention Roslyn defaults to — keeps script ergonomics (ctx.GetTag) consistent with the AST walker's parse shape and the Admin UI's hand-written type stub (coming in Stream F). Generic on TContext so Stream C's alarm-predicate context with an Alarm property inherits cleanly.

ScriptEvaluator<TContext, TResult>.Compile is the three-step gate: (1) Roslyn compile — throws CompilationErrorException on syntax/type errors with Location-carrying diagnostics; (2) ForbiddenTypeAnalyzer semantic pass — catches type-forwarding sandbox escapes; (3) delegate creation. Runtime exceptions from user code propagate unwrapped — the virtual-tag engine in Stream B catches + maps per-tag to BadInternalError quality per Phase 7 decision #11.

29 unit tests covering every surface: DependencyExtractorTests has 14 theories — single/multiple/deduplicated reads, separate write tracking, rejection of variable/concatenated/interpolated/method-returned/empty/whitespace paths, ignoring non-ctx methods named GetTag, empty-source no-op, source span carried in rejections, multiple bad paths reported in one pass, nested literal extraction. ScriptSandboxTests has 15 — happy-path compile + run, SetVirtualTag round-trip, rejection of File.IO + HttpClient + Process.Start + Reflection.Assembly.Load via ScriptSandboxViolationException, Environment.GetEnvironmentVariable explicitly allowed (pinned compromise), script-exception propagation, ctx.Now reachable, Deadband static reachable, LINQ Where/Sum reachable, DataValueSnapshot usable in scripts including quality branches, compile error carries source location.

Next two PRs within Stream A: A.2 adds the compile cache (source-hash keyed) + per-evaluation timeout wrapper; A.3 wires the dedicated scripts-*.log Serilog rolling sink with structured-property filtering + the companion-warning enricher to the main log.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
dohertj2 merged commit d2bfcd9f1e into v2 2026-04-20 16:29:46 -04:00
dohertj2 referenced this issue from a commit 2026-04-30 08:21:26 -04:00
Phase 2 official close-out. Closes task #209.

The 2026-04-18 exit-gate-phase-2-final.md captured Phase 2 state at PR 2 merge — four High/Medium adversarial findings still OPEN, Historian port + alarm subsystem + v1 archive deletion all deferred. Since then: PR 4 closed all four findings end-to-end (High 1 Read subscription-leak, High 2 no reconnect loop, Medium 3 SubscribeAsync doesn't push frames, Medium 4 WriteValuesAsync doesn't await OnWriteComplete — mapped + resolved inline in the new doc), PR 12 landed the richer historian quality mapper, PR 13 shipped GalaxyRuntimeProbeManager with per-Platform/AppEngine ScanState subscriptions + StateChanged events forwarded through the existing OnHostStatusChanged IPC frame, PR 14 wired the alarm subsystem (GalaxyAlarmTracker advising the four alarm-state attributes per IsAlarm=true attribute, raising AlarmTransition events forwarded through OnAlarmEvent IPC frames), Phase 3 PR 18 deleted the v1 source trees, and PR 61 closed V1_ARCHIVE_STATUS.md. Phase 2 is functionally done; this commit is the bookkeeping pass.

New exit-gate-phase-2-closed.md at docs/v2/implementation/ — five-stream status table (A/B/C/D/E all complete with the specific close commits named), full resolution table for every 2026-04-18 adversarial finding mapped to the PR 4 resolution, cross-cutting deferrals table marking every one resolved (Historian SDK plugin port → done, subscription push frames → done under Medium 3, Historian-backed HistoryRead → done, alarm subsystem wire-up → done, reconnect-without-recycle → done under High 2, v1 archive deletion → done).

Fresh 2026-04-20 test baseline captured from the current v2 tip: 1844 passing + 29 infra-gated skips across 21 test projects, including the net48 x86 Galaxy.Host.Tests suite (107 pass) that exercises the MXAccess COM path on the dev box. Flake observed — Configuration.Tests 70/71 on first full-solution run, 71/71 on retry; logged as a known non-stable flake rather than chased because it did not reproduce.

The prior exit-gate-phase-2-final.md is kept in place (historical record of the 2026-04-18 snapshot) but gets a superseded-by banner at the top pointing at the new close-out doc so future readers land on current status first. docs/v2/plan.md Phase 2 section header gains the ✅ CLOSED 2026-04-20 marker + a link to the close-out doc so the top-level plan index reflects reality.

"What Phase 2 closed means for Phase 3 and later" section in the new doc captures the downstream contract: Galaxy now runs as a first-class v2 driver with the same capability-interface shape as Modbus / S7 / AbCip / AbLegacy / TwinCAT / FOCAS / OpcUaClient; no v1 code path remains; the 2026-04-13 stability findings persist as named regression tests under tests/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.E2E/StabilityFindingsRegressionTests.cs so any future refactor reintroducing them trips the test.

"Outstanding — not Phase 2 blockers" section lists the four pending non-Phase-2 tasks (#177, #194, #195, #199) so nobody mistakes them for Phase 2 tail work.
dohertj2 referenced this issue from a commit 2026-04-30 08:21:26 -04:00
AbCip IAlarmSource via ALMD projection (#177) — feature-flagged OFF by default; when enabled, polls declared ALMD UDT member fields + raises OnAlarmEvent on 0→1 + 1→0 transitions. Closes task #177.

The AB CIP driver now implements IAlarmSource so the generic-driver alarm dispatch path (PR 14's sinks + the Server.Security.AuthorizationGate AlarmSubscribe/AlarmAck invoker wrapping) can treat AB-backed alarms uniformly with Galaxy + OpcUaClient + FOCAS. Projection is ALMD-only in this pass: the Logix ALMD (digital alarm) instruction's UDT shape is well-understood (InFaulted + Acked + Severity + In + Cfg_ProgTime at stable member names) so the polled-read + state-diff pattern fits without concessions. ALMA (analog alarm) deferred to a follow-up because its HHLimit/HLimit/LLimit/LLLimit threshold + In value semantics deserve their own design pass — raising on threshold-crossing is not the same shape as raising on InFaulted-edge.

AbCipDriverOptions gains two knobs: EnableAlarmProjection (default false) + AlarmPollInterval (default 1s). Explicit opt-in because projection semantics don't exactly mirror Rockwell FT Alarm & Events; shops running FT Live should leave this off + take alarms through the native A&E route.

AbCipAlarmProjection is the state machine: per-subscription background loop polls the source-node set via the driver's public ReadAsync — which gains the #194 whole-UDT optimization for free when ALMDs are declared with their standard member set, so one poll tick reads (N alarms × 2 members) = N libplctag round-trips rather than 2N. Per-tick state diff: compare InFaulted + Severity against last-seen, fire raise (0→1) / clear (1→0) with AlarmSeverity bucketed via the 1-1000 Logix severity scale (≤250 Low, ≤500 Medium, ≤750 High, rest Critical — matches OpcUaClient's MapSeverity shape). ConditionId is {sourceNode}#active — matches a single active-branch per alarm which is all ALMD supports; when Cfg_ProgTime-based branch identity becomes interesting (re-raise after ack with new timestamp), a richer ConditionId pass can land.

Subscribe-while-disabled returns a handle wrapping id=0 — capability negotiation (the server queries IAlarmSource presence at driver-load time) still succeeds, the alarm surface just never fires. Unsubscribe cancels the sub's CTS + awaits its loop; ShutdownAsync cancels every sub on its way out so a driver reload doesn't leak poll tasks.

AcknowledgeAsync routes through the driver's existing WriteAsync path — per-ack writes {SourceNodeId}.Acked = true (the simpler semantic; operators whose ladder watches AckCmd + rising-edge can wire a client-side pulse until a driver-level edge-mode knob lands). Best-effort — per-ack faults are swallowed so one bad ack doesn't poison the whole batch.

Six new AbCipAlarmProjectionTests: detector flags ALMD signature + skips non-signature UDTs + atomics; severity mapping matches OPC UA A&C bucket boundaries; feature-flag OFF returns a handle but never touches the fake runtime (proving no background polling happens); feature-flag ON fires a raise event on 0→1; clear event fires on 1→0 after a prior raise; unsubscribe stops the poll loop (ReadCount doesn't grow past cancel + at most one straggler read). Driver builds 0 errors; AbCip.Tests 233/233 (was 227, +6 new).

Task #177 closed — the last pending AB CIP follow-up is now #194 (already shipped). Remaining pending fleet-wide: #150 (Galaxy MXAccess failover hardware) + #199 (UnsTab Playwright smoke).
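The severity bucketing and the per-tick raise/clear diff described in this commit message can be sketched as follows, in Python rather than the driver's C#. The `bucket_severity` and `AlmdEdgeDetector` names are hypothetical. Two behaviours are assumptions the message does not settle: an alarm already faulted on the first poll is treated as a 0→1 raise, and a severity change without an InFaulted edge emits nothing.

```python
def bucket_severity(logix_severity):
    """Map the 1-1000 Logix severity scale to the four buckets named in
    the commit message: <=250 Low, <=500 Medium, <=750 High, else Critical."""
    if logix_severity <= 250:
        return "Low"
    if logix_severity <= 500:
        return "Medium"
    if logix_severity <= 750:
        return "High"
    return "Critical"


class AlmdEdgeDetector:
    """Keeps the last-seen InFaulted value per source node and reports
    only the 0->1 (raise) and 1->0 (clear) transitions."""

    def __init__(self):
        self._last_in_faulted = {}

    def tick(self, source_node, in_faulted, severity):
        # Assumption: an unseen node starts out as "not faulted".
        previous = self._last_in_faulted.get(source_node, False)
        self._last_in_faulted[source_node] = in_faulted
        if in_faulted == previous:
            return None  # no edge, so no event this tick
        kind = "raise" if in_faulted else "clear"
        condition_id = f"{source_node}#active"  # format from the message above
        return (kind, condition_id, bucket_severity(severity))
```

A poll loop under these assumptions would call `tick` once per alarm per interval and forward any non-`None` result as an alarm event; steady-state ticks produce nothing.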