docs: sync against recent code changes
Five doc-content updates after this session's code-review resolution sweep. No code touched; pure documentation drift correction.

1. docs/reqs/HighLevelReqs.md (HLR-007, Service Hosting): Refreshed the deployment description from "three cooperating processes (Server, Admin, Galaxy.Host)" to "two cooperating Windows services (Server, Admin)". The legacy x86 TopShelf Galaxy.Host process was retired in PR 7.2 (2026-04-30); Galaxy access now flows through the in-process Tier-A GalaxyDriver talking gRPC to the sibling mxaccessgw gateway. Also called out decision #30 (AddWindowsService replacing TopShelf) inline.

2. docs/VirtualTags.md:
   - Line 9: "compiled via Microsoft.CodeAnalysis.CSharp.Scripting" replaced with the current pipeline (Microsoft.CodeAnalysis.CSharp regular compiler; Core.Scripting-008 / -016 retired the CSharpScript/ScriptRunner path).
   - Line 39: orphan-thread leak description rewritten. The CSharp.Scripting-era "underlying ScriptRunner keeps running on its thread-pool thread until the Roslyn runtime returns" is no longer accurate: the new pipeline binds the script as a regular C# Func<> delegate, so the leak is now "synchronous CPU-bound work on a pool thread" (same operator-visible effect, different mechanism).

3. docs/v2/plan.md decision #29 ("Galaxy Host is a separate Windows service"): Annotated both the decision body and the decision-log table row with "Reversed PR 7.2, 2026-04-30" + a one-line summary of the replacement architecture. The original reasoning is preserved as audit trail per the decision-log convention.

4. docs/v2/implementation/phase-7-scripting-and-alarming.md A.1: Added an Implementation note describing the Core.Scripting-008 / -016 supersession of the original CSharpScript pipeline. The historical record stays; the note points future readers at docs/VirtualTags.md "Compile cache" for the current contract.

5. docs/plans/alarms-over-gateway.md "Files" section under client regeneration: Updated the .NET regeneration instructions to point at the new ZB.MOM.WW.MxGateway.Contracts.csproj path. The old clients/dotnet/MxGateway.Client.csproj no longer exists in the sibling repo (restructure after this plan was written) and the vendored-binaries situation in src/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Galaxy/libs/ is called out so a reader following the plan won't chase a deleted path.

Verification: grep against docs/ for the pre-fix wordings ("three cooperating processes", "Galaxy.Host (TopShelf)", "ScriptRunner", the wrong BadDeviceFailure hex code 0x80550000) returns no hits.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@@ -6,7 +6,7 @@ The runtime is split across two projects: `Core.Scripting` holds the Roslyn sand
## Roslyn script sandbox (`Core.Scripting`)
User scripts are compiled via `Microsoft.CodeAnalysis.CSharp.Scripting` against a `ScriptContext` subclass. `ScriptGlobals<TContext>` exposes the context as a field named `ctx`, so scripts read `ctx.GetTag("...")` / `ctx.SetVirtualTag("...", ...)` / `ctx.Now` / `ctx.Logger` and return a value.
User scripts are compiled via `Microsoft.CodeAnalysis.CSharp` (regular compiler, not the scripting variant — the original `CSharpScript` pipeline was retired by the Core.Scripting-008 / -016 rewrite, see "Compile cache" below). Each script's source is pasted as the body of a synthesized `CompiledScript.Run(ScriptGlobals<TContext>)` method against a `ScriptContext` subclass. `ScriptGlobals<TContext>` exposes the context as a field named `ctx`, so scripts read `ctx.GetTag("...")` / `ctx.SetVirtualTag("...", ...)` / `ctx.Now` / `ctx.Logger` and return a value.
### Compile pipeline (`ScriptEvaluator<TContext, TResult>`)
@@ -36,7 +36,7 @@ Similarly, **`System.Threading.Tasks` is now denied** (Core.Scripting-003), whic
### Per-evaluation timeout (`TimedScriptEvaluator<TContext, TResult>`)
Wraps `ScriptEvaluator` with a wall-clock budget. Default `DefaultTimeout = 250ms`. Implementation pushes the inner `RunAsync` onto `Task.Run` (so a CPU-bound script can't hog the calling thread before `WaitAsync` registers its timeout) then awaits `runTask.WaitAsync(Timeout, ct)`. A `TimeoutException` from `WaitAsync` is wrapped as `ScriptTimeoutException`. Caller-supplied `CancellationToken` cancellation wins over the timeout and propagates as `OperationCanceledException` — so a shutdown cancel is not misclassified. **Known leak:** when a CPU-bound script times out, the underlying `ScriptRunner` keeps running on its thread-pool thread until the Roslyn runtime returns (documented trade-off; out-of-process evaluation is a v3 concern).
Wraps `ScriptEvaluator` with a wall-clock budget. Default `DefaultTimeout = 250ms`. Implementation pushes the inner `RunAsync` onto `Task.Run` (so a CPU-bound script can't hog the calling thread before `WaitAsync` registers its timeout) then awaits `runTask.WaitAsync(Timeout, ct)`. A `TimeoutException` from `WaitAsync` is wrapped as `ScriptTimeoutException`. Caller-supplied `CancellationToken` cancellation wins over the timeout and propagates as `OperationCanceledException` — so a shutdown cancel is not misclassified. **Known leak:** when a CPU-bound script times out, the underlying compiled-script delegate keeps running on its `Task.Run` thread-pool thread until it returns of its own accord (the CT is checked only at evaluator entry; once the script body is running, only the script returning or throwing will release the thread). The post-rewrite delegate is a regular C# `Func<>` bound to the synthesized `CompiledScript.Run` method, so this is a vanilla "synchronous CPU-bound work on a pool thread" leak rather than anything Roslyn-specific. Documented trade-off; out-of-process evaluation is a v3 concern.
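
A minimal sketch of the timeout pattern described above, for illustration only and not taken from the repository. It assumes the script is available as a plain synchronous `Func<T>`; the names `TimedRunSketch`, `RunAsync`, and `budget` are hypothetical, and the `ScriptTimeoutException` defined here is a stand-in for the real type. `Thread.Sleep` stands in for synchronous CPU-bound work.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for the ScriptTimeoutException named in the text.
public sealed class ScriptTimeoutException : Exception
{
    public ScriptTimeoutException(TimeSpan budget, Exception inner)
        : base($"Script exceeded its {budget.TotalMilliseconds:F0} ms budget.", inner) { }
}

public static class TimedRunSketch
{
    // Assumed shape: run a synchronous script delegate under a wall-clock budget.
    public static async Task<T> RunAsync<T>(Func<T> script, TimeSpan budget, CancellationToken ct)
    {
        // The token is observed here and by WaitAsync; the script body never sees it.
        ct.ThrowIfCancellationRequested();

        // Task.Run moves the script off the caller's thread so a CPU-bound body
        // cannot block the caller before the timeout is armed.
        Task<T> runTask = Task.Run(script);
        try
        {
            // TimeoutException on budget expiry, OperationCanceledException on ct.
            return await runTask.WaitAsync(budget, ct);
        }
        catch (TimeoutException ex)
        {
            // WaitAsync only abandons the wait; runTask keeps its pool thread.
            throw new ScriptTimeoutException(budget, ex);
        }
    }
}

public static class Program
{
    public static async Task Main()
    {
        var finished = new ManualResetEventSlim(false);
        Func<int> slowScript = () =>
        {
            Thread.Sleep(1000); // stands in for synchronous CPU-bound work
            finished.Set();
            return 42;
        };

        try
        {
            await TimedRunSketch.RunAsync(
                slowScript, TimeSpan.FromMilliseconds(250), CancellationToken.None);
        }
        catch (ScriptTimeoutException e)
        {
            Console.WriteLine($"Timed out: {e.Message}");
            Console.WriteLine($"Script finished at timeout? {finished.IsSet}");
        }

        // The abandoned delegate still runs to completion on its pool thread.
        finished.Wait();
        Console.WriteLine("Script body completed after the caller had already timed out.");
    }
}
```

The caller gets its answer at the budget, but the pool thread stays occupied until the delegate returns or throws, which is the leak the paragraph documents. `Task.WaitAsync(TimeSpan, CancellationToken)` requires .NET 6 or later.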
### Script logger plumbing
@@ -583,10 +583,20 @@ language binding.
**Depends on:** A.1 merged (proto change live).
**Files** (`c:\Users\dohertj2\Desktop\mxaccessgw\clients\`):
**Files** (`c:\Users\dohertj2\Desktop\mxaccessgw\src\` for .NET — note the sibling
repo restructured after this plan was written; `clients/dotnet/MxGateway.Client.csproj`
no longer exists, the proto contracts now live in
`src/ZB.MOM.WW.MxGateway.Contracts/` under the new namespace
`ZB.MOM.WW.MxGateway.Contracts.Proto[.Galaxy]`; the OtOpcUa driver currently
consumes vendored binaries from the pre-restructure build — see
`src/Drivers/ZB.MOM.WW.OtOpcUa.Driver.Galaxy/libs/README.md`):
1. **.NET** — codegen runs on csproj rebuild via `Grpc.Tools`; just
rebuild `MxGateway.Client.csproj` after pulling A.1.
1. **.NET** — codegen runs on csproj rebuild via `Grpc.Tools`; rebuild
`src/ZB.MOM.WW.MxGateway.Contracts/ZB.MOM.WW.MxGateway.Contracts.csproj`
after pulling A.1. (If unwinding the driver's vendored binaries onto the
new contracts namespace as part of the alarm work, namespace-rename + a
reimplementation of the missing `MxGatewayClient` / `MxGatewaySession`
wrappers is also in scope.)
2. **Python** — run `clients\python\generate-proto.ps1`; commit the
regenerated `_pb2.py` + `_pb2_grpc.py` files under
`clients\python\src\`.
@@ -1,6 +1,6 @@
# High-Level Requirements
> **Revision** — Refreshed 2026-04-19 for the OtOpcUa v2 multi-driver platform (task #205). The original 2025 text described a single-process Galaxy/MXAccess server called LmxOpcUa. Today the project is the **OtOpcUa** multi-driver OPC UA platform deployed as three cooperating processes (Server, Admin, Galaxy.Host). The Galaxy integration is one of seven shipped drivers. HLR-001 through HLR-008 have been rewritten driver-agnostically; HLR-009 has been retired (the embedded Status Dashboard is superseded by the Admin UI). HLR-010 through HLR-017 are new and cover plug-in drivers, resilience, Config DB / draft-publish, cluster redundancy, fleet-wide identifier uniqueness, Admin UI, audit logging, metrics, and the Roslyn capability-wrapping analyzer.
> **Revision** — Refreshed 2026-05-23 for the OtOpcUa v2 multi-driver platform. The original 2025 text described a single-process Galaxy/MXAccess server called LmxOpcUa. Today the project is the **OtOpcUa** multi-driver OPC UA platform deployed as two cooperating processes (Server, Admin). The Galaxy integration is one of seven shipped drivers and is now an in-process Tier-A driver that talks gRPC to a separately installed `mxaccessgw` gateway (sibling repo) — PR 7.2 (2026-04-30) retired the legacy out-of-process `Galaxy.Host` Windows service. HLR-001 through HLR-008 have been rewritten driver-agnostically; HLR-009 has been retired (the embedded Status Dashboard is superseded by the Admin UI). HLR-010 through HLR-017 cover plug-in drivers, resilience, Config DB / draft-publish, cluster redundancy, fleet-wide identifier uniqueness, Admin UI, audit logging, metrics, and the Roslyn capability-wrapping analyzer.
## HLR-001: OPC UA Server
@@ -28,11 +28,10 @@ Drivers whose backend has a native change signal (e.g. Galaxy's `time_of_last_de
## HLR-007: Service Hosting
The system shall be deployed as three cooperating Windows services:
The system shall be deployed as two cooperating Windows services (the legacy `OtOpcUa.Galaxy.Host` x86 host was retired in PR 7.2 — Galaxy access now flows through the separately installed `mxaccessgw` gateway, which lives in a sibling repository and is not part of the OtOpcUa deployment):
- **OtOpcUa.Server** — .NET 10 x64, `Microsoft.Extensions.Hosting` + `AddWindowsService`, hosts all non-Galaxy drivers in-process and the OPC UA endpoint.
- **OtOpcUa.Server** — .NET 10 AnyCPU, `Microsoft.Extensions.Hosting` + `AddWindowsService` (decision #30 replaced the original TopShelf choice), hosts every driver in-process — including the new Tier-A `GalaxyDriver` that speaks gRPC to `mxaccessgw` — and the OPC UA endpoint.
- **OtOpcUa.Admin** — .NET 10 x64 Blazor Server web app, hosts the admin UI, SignalR hubs for live updates, `/metrics` Prometheus endpoint, and audit log writers.
- **OtOpcUa.Galaxy.Host** — .NET Framework 4.8 x86 (TopShelf), hosts MXAccess COM + Galaxy Repository SQL + Historian plugin. Talks to `Driver.Galaxy.Proxy` inside `OtOpcUa.Server` via a named pipe (MessagePack over length-prefixed frames, per-process shared secret, SID-restricted ACL).
## HLR-008: Logging
@@ -89,7 +89,7 @@ Tie-in capability — **historian alarm sink**:
### Stream A — `Core.Scripting` (Roslyn engine + sandbox + AST inference + logger) — **2 weeks**
1. **A.1** Project scaffold + NuGet `Microsoft.CodeAnalysis.CSharp.Scripting`. `ScriptOptions` allow-list (`typeof(object).Assembly`, `typeof(Enumerable).Assembly`, the Core.Scripting assembly itself — nothing else). Hand-written `ScriptContext` base class with `GetTag(string)` / `SetVirtualTag(string, object)` / `Logger` / `Now` / `Deadband(double, double, double)` helpers.
1. **A.1** Project scaffold + NuGet `Microsoft.CodeAnalysis.CSharp.Scripting`. `ScriptOptions` allow-list (`typeof(object).Assembly`, `typeof(Enumerable).Assembly`, the Core.Scripting assembly itself — nothing else). Hand-written `ScriptContext` base class with `GetTag(string)` / `SetVirtualTag(string, object)` / `Logger` / `Now` / `Deadband(double, double, double)` helpers. _(Implementation note 2026-05-23 — superseded by Core.Scripting-008 / -016: the `CSharpScript`/`ScriptRunner` path was replaced with a hand-rolled `CSharpCompilation.Create` → `Emit(MemoryStream)` → collectible `ScriptAssemblyLoadContext.LoadFromStream` pipeline so per-publish ALC accretion is reclaimable, and engines route compiles through `CompiledScriptCache` rather than calling `ScriptEvaluator.Compile` directly. The reference list was correspondingly widened from the narrow allow-list above to the full BCL `TRUSTED_PLATFORM_ASSEMBLIES` set (filtered to `System.*` + `netstandard` + `Microsoft.Win32.Registry`) because the new pipeline can't compile against the old narrow set; `ForbiddenTypeAnalyzer` is now the sole security gate, consistent with how Core.Scripting-001 / -002 established the analyzer must be the real boundary because type forwarding makes any references-list-only restriction porous. See `docs/VirtualTags.md` "Compile cache" for the current implementation contract.)_
2. **A.2** `DependencyExtractor : CSharpSyntaxWalker`. Visits every `InvocationExpressionSyntax` targeting `ctx.GetTag` / `ctx.SetVirtualTag`; accepts only a `LiteralExpressionSyntax` argument. Non-literal arguments (concat, variable, method call) → publish-time rejection with an actionable error pointing the operator at the exact span. Outputs `IReadOnlySet<string> Inputs` + `IReadOnlySet<string> Outputs`.
3. **A.3** Compile cache. `(source_hash) → compiled Script<T>`. Recompile only when source changes. Warm on `SealedBootstrap`.
4. **A.4** Per-evaluation timeout wrapper (default 250ms; configurable per tag). Timeout = tag quality `BadInternalError` + structured warning log. Keeps a single runaway script from starving the engine.
@@ -193,7 +193,7 @@ ConfigurationService
- Compact binary format, faster than JSON, good fit for high-frequency data change callbacks
- Simpler than gRPC on .NET 4.8 (which needs legacy `Grpc.Core` native library)
**Decided: Galaxy Host is a separate Windows service.**
**Decided: Galaxy Host is a separate Windows service.** _(Reversed by PR 7.2, 2026-04-30 — see PR 7.2's commit `ae7106d` and the project_galaxy_via_mxgateway memory entry. The legacy in-process `Galaxy.Host` / `Galaxy.Proxy` / `Galaxy.Shared` projects + the `OtOpcUaGalaxyHost` Windows service were retired; Galaxy access now flows through the in-process Tier-A `GalaxyDriver` talking gRPC to a separately installed `mxaccessgw` gateway sibling repo. The reasoning below was correct for the original LMX/x86-COM architecture; the gateway sibling repo now owns those constraints externally.)_
- Independent lifecycle from the OtOpcUa Server
- Can be restarted without affecting the main server or other drivers
- Galaxy.Proxy detects connection loss, sets Bad quality on Galaxy nodes, reconnects when Host comes back
@@ -801,7 +801,7 @@ aggregate runner (#253); server-side factory + seed SQL per driver (#210–#213)
| 26 | Admin deploys on same server (co-hosted) | Simplifies deployment; can also run on separate management host | 2026-04-16 |
| 27 | Admin scaffold early, driver-specific screens deferred | Core CRUD for instances/drivers first; per-driver config UI added with each driver | 2026-04-16 |
| 28 | Named pipes for Galaxy IPC | Fast, no port conflicts, native to both .NET 4.8 and .NET 10 | 2026-04-16 |
| 29 | Galaxy Host is a separate Windows service | Independent lifecycle, can restart without affecting main server or other drivers | 2026-04-16 |
| 29 | Galaxy Host is a separate Windows service | Independent lifecycle, can restart without affecting main server or other drivers | 2026-04-16 (**reversed PR 7.2, 2026-04-30** — Galaxy is now an in-process Tier-A driver talking gRPC to the sibling `mxaccessgw` gateway; see the decision body above) |
| 30 | Drop TopShelf, use Microsoft.Extensions.Hosting | Built-in Windows Service support in .NET 10, no third-party dependency | 2026-04-16 |
| 31 | Mono-repo for all drivers | Simpler dependency management, single CI pipeline, shared abstractions | 2026-04-16 |
| 32 | MessagePack serialization for Galaxy IPC | Binary, fast, works on .NET 4.8+ and .NET 10 via MessagePack-CSharp NuGet | 2026-04-16 |