Commit Graph

448 Commits

Author SHA1 Message Date
Joseph Doherty
0ae715cca4 Phase 7 Stream A.2 — compile cache + per-evaluation timeout wrapper. Second of 3 increments within Stream A. Adds two independent resilience primitives that the virtual-tag engine (Stream B) and scripted-alarm engine (Stream C) will compose with the base ScriptEvaluator. Both are generic on (TContext, TResult) so different engines get their own instances without cross-contamination.
CompiledScriptCache<TContext, TResult> — source-hash-keyed cache of compiled evaluators. Roslyn compilation is the most expensive step in the evaluator pipeline (5-20ms per script depending on size); re-compiling on every value-change event would starve the engine. ConcurrentDictionary of Lazy<ScriptEvaluator> with ExecutionAndPublication mode ensures concurrent callers never double-compile even on a cold cache race. Failed compiles evict the cache entry so an Admin UI retry with corrected source actually recompiles (otherwise the cached exception would persist). Whitespace-sensitive hash — reformatting a script misses the cache on purpose, simpler than AST-canonicalize and happens rarely. No capacity bound because virtual-tag + alarm scripts are config-DB bounded (thousands, not millions); if scale pushes past that in v3 an LRU eviction slots in behind the same API.

TimedScriptEvaluator<TContext, TResult> — wraps a ScriptEvaluator with a per-evaluation wall-clock timeout (default 250ms per Phase 7 plan Stream A.4, configurable per tag so slower backends can widen). Critical implementation detail: the underlying Roslyn ScriptRunner executes synchronously on the calling thread for CPU-bound user scripts, returning an already-completed Task before the caller can register a timeout. Naive `Task.WaitAsync(timeout)` would see the completed task and never fire. Fix: push evaluation to a thread-pool thread via Task.Run, so the caller's thread is free to wait and the timeout reliably fires after the configured budget. Known trade-off (documented in the class summary): when a script times out, the underlying evaluation task continues running on the thread-pool thread until Roslyn returns; in the CPU-bound-infinite-loop case it's effectively leaked until the runtime decides to unwind. Tighter CPU budgeting would require an out-of-process script runner (v3 concern). In practice the timeout + structured warning log surfaces the offending script so the operator fixes it, and the orphan thread is rare. Caller-supplied CancellationToken is honored and takes precedence over the timeout, so driver-shutdown paths see a clean OperationCanceledException rather than a misclassified ScriptTimeoutException.

ScriptTimeoutException carries the configured Timeout and a diagnostic message pointing the operator at ctx.Logger output around the failure plus suggesting widening the timeout, simplifying the script, or moving heavy work out of the evaluation path. The virtual-tag engine (Stream B) will catch this and map the owning tag's quality to BadInternalError per Phase 7 decision #11, logging a structured warning with the offending script name.

Tests: CompiledScriptCacheTests (10) — first-call compile, identical-source dedupe to same instance, different-source produces different evaluator, whitespace-sensitivity documented, cached evaluator still runs correctly, failed compile evicted for retry, Clear drops entries, concurrent GetOrCompile of the same source deduplicates to one instance, different TContext/TResult use separate cache instances, null source rejected. TimedScriptEvaluatorTests (9) — fast script completes under timeout, CPU-bound script throws ScriptTimeoutException, caller cancellation takes precedence over timeout (shutdown path correctness), default 250ms per plan, zero/negative timeout rejected at construction, null inner rejected, null context rejected, user-thrown exceptions propagate unwrapped (not conflated with timeout), timeout exception message contains diagnostic guidance. Full suite: 48/48 green (29 from A.1 + 19 new).

Next: Stream A.3 wires the dedicated scripts-*.log Serilog rolling sink + structured-property filtering + companion-WARN enricher to the main log, closing out Stream A.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 16:38:43 -04:00
d2bfcd9f1e Merge pull request 'Phase 7 Stream A.1 — Core.Scripting project scaffold + ScriptContext + sandbox + AST dependency extractor' (#177) from phase-7-stream-a1-core-scripting into v2 2026-04-20 16:29:44 -04:00
Joseph Doherty
e4dae01bac Phase 7 Stream A.1 — Core.Scripting project scaffold + ScriptContext + sandbox + AST dependency extractor. First of 3 increments within Stream A. Ships the Roslyn-based script engine's foundation: user C# snippets compile against a constrained ScriptOptions allow-list + get a post-compile sandbox guard, the static tag-dependency set is extracted from the AST at publish time, and the script sees a stable ctx.GetTag/SetVirtualTag/Now/Logger/Deadband API that later streams plug into concrete backends.
ScriptContext abstract base defines the API user scripts see as ctx — GetTag(string) returns DataValueSnapshot so scripts branch on quality naturally, SetVirtualTag(string, object?) is the only write path virtual tags have (OPC UA client writes to virtual nodes rejected separately in DriverNodeManager per ADR-002), Now + Logger + Deadband static helper round out the surface. Concrete subclasses in Streams B + C plug in actual tag backends + per-script Serilog loggers.

ScriptSandbox.Build(contextType) produces the ScriptOptions for every compile — explicit allow-list of six assemblies (System.Private.CoreLib / System.Linq / Core.Abstractions / Core.Scripting / Serilog / the context type's own assembly), with a matching import list so scripts don't need using clauses. Allow-list is plan-level — expanding it is not a casual change.

DependencyExtractor uses CSharpSyntaxWalker to find every ctx.GetTag("literal") and ctx.SetVirtualTag("literal", ...) call, rejects every non-literal path (variable, concatenation, interpolation, method-returned). Rejections carry the exact TextSpan so the Admin UI can point at the offending token. Reads + writes are returned as two separate sets so the virtual-tag engine (Stream B) knows both the subscription targets and the write targets.

Sandbox enforcement turned out needing a second-pass semantic analyzer because .NET 10's type forwarding makes assembly-level restriction leaky — System.Net.Http.HttpClient resolves even with WithReferences limited to six assemblies. ForbiddenTypeAnalyzer runs after Roslyn's Compile() against the SemanticModel, walks every ObjectCreationExpression / InvocationExpression / MemberAccessExpression / IdentifierName, resolves to the containing type's namespace, and rejects any prefix-match against the deny-list (System.IO, System.Net, System.Diagnostics, System.Reflection, System.Threading.Thread, System.Runtime.InteropServices, Microsoft.Win32). Rejections throw ScriptSandboxViolationException with the aggregated list + source spans so the Admin UI surfaces every violation in one round-trip instead of whack-a-mole. System.Environment explicitly stays allowed (read-only process state, doesn't persist or leak outside) and that compromise is pinned by a dedicated test.

ScriptGlobals<TContext> wraps the context as a named field so scripts see ctx instead of the bare globalsType-member-access convention Roslyn defaults to — keeps script ergonomics (ctx.GetTag) consistent with the AST walker's parse shape and the Admin UI's hand-written type stub (coming in Stream F). Generic on TContext so Stream C's alarm-predicate context with an Alarm property inherits cleanly.

ScriptEvaluator<TContext, TResult>.Compile is the three-step gate: (1) Roslyn compile — throws CompilationErrorException on syntax/type errors with Location-carrying diagnostics; (2) ForbiddenTypeAnalyzer semantic pass — catches type-forwarding sandbox escapes; (3) delegate creation. Runtime exceptions from user code propagate unwrapped — the virtual-tag engine in Stream B catches + maps per-tag to BadInternalError quality per Phase 7 decision #11.

29 unit tests covering every surface: DependencyExtractorTests has 14 theories — single/multiple/deduplicated reads, separate write tracking, rejection of variable/concatenated/interpolated/method-returned/empty/whitespace paths, ignoring non-ctx methods named GetTag, empty-source no-op, source span carried in rejections, multiple bad paths reported in one pass, nested literal extraction. ScriptSandboxTests has 15 — happy-path compile + run, SetVirtualTag round-trip, rejection of File.IO + HttpClient + Process.Start + Reflection.Assembly.Load via ScriptSandboxViolationException, Environment.GetEnvironmentVariable explicitly allowed (pinned compromise), script-exception propagation, ctx.Now reachable, Deadband static reachable, LINQ Where/Sum reachable, DataValueSnapshot usable in scripts including quality branches, compile error carries source location.

Next two PRs within Stream A: A.2 adds the compile cache (source-hash keyed) + per-evaluation timeout wrapper; A.3 wires the dedicated scripts-*.log Serilog rolling sink with structured-property filtering + the companion-warning enricher to the main log.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 16:27:07 -04:00
6ae638a6de Merge pull request 'ADR-002 — driver-vs-virtual dispatch for Phase 7 scripting' (#176) from adr-002-driver-vs-virtual-dispatch into v2 2026-04-20 16:10:30 -04:00
Joseph Doherty
2a74daf228 ADR-002 — driver-vs-virtual dispatch: DriverNodeManager routes reads/writes/subscriptions across driver tags and virtual (scripted) tags via a single NodeManager with a NodeSource tag on NodeScopeResolver's output. Locks the architecture decision Phase 7 Stream G was going to have to make anyway — documenting it up front so the stream implementation can reference the chosen shape instead of rediscovering it. Option A (separate VirtualTagNodeManager sibling) rejected because shared Equipment folders owning both driver and virtual children would force two NodeManagers to fight for ownership on every Equipment node — the common case, not the exception — defeating the separation. Option C (virtual engine registers as a synthetic IDriver through DriverTypeRegistry) rejected because DriverInstance shape is wrong for scripting config (no DriverType, no HostAddress, no connectivity probe, no NSSM wrapper), IDriver.InitializeAsync semantics don't match script compilation, Polly resilience wrappers calibrated for network calls would either passthrough pointlessly or tune wrong, and Admin UI would need special-casing everywhere to hide fields that don't apply. Option B (single DriverNodeManager, NodeScopeResolver returns NodeSource enum alongside ScopeId, dispatch branches on source) accepted because it preserves one address-space tree with one walker, ACL binding works identically for both kinds, Phase 6.1 resilience + Phase 6.2 audit apply uniformly to the driver branch without needing Roslyn analyzer exemptions, and adding future source kinds is a single-enum-case addition. NodeScopeResolver.Resolve returns NodeScope(ScopeId, NodeSource, DriverInstanceId?, VirtualTagId?); DriverNodeManager pattern-matches on scope.Source and routes to either the driver dictionary or IVirtualTagEngine. OPC UA client writes to a virtual node return BadUserAccessDenied before the dispatch branch because Phase 7 decision #6 restricts virtual-tag writes to scripts via ctx.SetVirtualTag. Dispatch test coverage specified for Stream G.4: mixed Equipment folders browsing correctly, read routing per source kind, subscription fan-out across both kinds, the BadUserAccessDenied guard on virtual writes, and script-driven writes firing subscription notifications. ADR-001's walker gains the VirtualTag config-DB table as an additional input channel alongside Tag; NodeScopeResolver's ScopeId return stays unchanged so Phase 6.2's ACL trie needs no modification. Consequences flagged: whether IVirtualTagEngine lives in Core.Abstractions vs Phase 7's Core.VirtualTags project, and whether future server-side methods on virtual nodes would route through this dispatch, both marked out-of-scope for ADR-002.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 16:08:01 -04:00
3eb5f1d9da Merge pull request 'Phase 7 plan doc — scripting runtime + virtual tags + scripted alarms + historian alarm sink' (#175) from phase-7-plan-doc into v2 2026-04-20 16:07:34 -04:00
Joseph Doherty
f2c1cc84e9 Phase 7 plan doc — scripting runtime + virtual tags + scripted alarms + historian alarm sink. Draft output from the 2026-04-20 interactive planning session. Phase 7 is the last phase before v2 release readiness; adds two additive runtime capabilities on top of the existing driver + Equipment address-space foundation: (1) virtual (calculated) tags — OPC UA variables whose values are computed by user-authored C# scripts against other tags, evaluated on change and/or timer, living in the existing Equipment tree alongside driver tags, behaving identically to clients; (2) Part 9 scripted alarms — full state machine (EnabledState/ActiveState/AckedState/ConfirmedState/ShelvingState) with persistent operator-supplied state across restarts, complementing (not replacing) the existing Galaxy-native and AB CIP ALMD alarm sources. A third tie-in capability — Aveva Historian as alarm system of record — routes every qualifying alarm transition from any IAlarmSource (scripted + Galaxy + ALMD) through a local SQLite store-and-forward queue to Galaxy.Host, which uses its already-loaded aahClientManaged DLLs to write to the Historian alarm schema; per-alarm HistorizeToAveva toggle gates which sources flow (default off for Galaxy-native to avoid duplicating the direct Galaxy historian path, default on for scripted).
Locks in 22 design decisions from the planning conversation: C# via Roslyn scripting; virtual tags in the Equipment tree (not a separate /Virtual/ namespace); change-driven + timer-driven triggers operator-configurable per tag; Shape A one-script-per-tag-or-alarm (no predicate/action split); full OPC UA Part 9 alarm fidelity; read-only sandbox (scripts read any tag, write only to virtual tags, no File/HttpClient/Process/reflection); AST-inferred dependencies via CSharpSyntaxWalker (non-literal tag paths rejected at publish); config DB storage with generation-sealed cache; ctx.GetTag returns a full DataValue {Value, StatusCode, Timestamp}; per-tag Historize checkbox; per-tag error isolation (throwing script sets tag quality BadInternalError, engine unaffected); dedicated scripts-*.log Serilog sink bound to ctx.Logger; alarm message as template with {TagPath} substitution resolved at event emission; ActiveState recomputed from tags on startup while EnabledState/AckedState/ConfirmedState/ShelvingState + audit persist to config DB; historian sink scope = all IAlarmSource impls with per-alarm toggle; SQLite store-and-forward on the node so operators are never blocked by Historian downtime; IPC to Galaxy.Host for ingestion reusing the already-loaded aahClientManaged DLLs; Monaco editor for Admin code editing; serial cascade evaluation for v1 (parallel as follow-up); shelving UX via OPC UA method calls only with no custom Admin controls (operator drives state transitions from plant HMIs or Client.CLI); 30-day dead-letter retention with manual retry button; test harness accepts only declared-input paths so the harness enforces dependency declaration.

Eight streams totaling ~10-12 weeks, scope-comparable to Phase 6: A - Core.Scripting (Roslyn engine + sandbox + AST inference + logger); B - virtual tag engine (dependency graph + change/timer schedulers + historize); C - scripted alarm engine (Part 9 state machine + template messages + startup recovery + OPC UA method binding); D - historian alarm sink (SQLite store-and-forward + Galaxy.Host IPC contract extension); E - config DB schema (four new tables under sp_PublishGeneration); F - Admin UI scripting tab (Monaco + test harness + dependency preview + script-log viewer + historian diagnostics); G - address-space integration (extend EquipmentNodeWalker for virtual source kind + extend DriverNodeManager dispatch); H - exit gate.

Compliance-check surface covers sandbox escape (typeof/Assembly.Load/File/HttpClient attempts must fail at compile), dependency inference (literal-only paths), change cascade (topological ordering), cycle rejection at publish, startup recovery (ack/confirm/shelve survive restart but ActiveState recomputed), ack audit trail persistence, historian queue durability (Galaxy.Host offline → online drains in-order), per-alarm historian toggle gating, script timeout isolation, log sink isolation, ACL binding (virtual tags inherit Equipment scope grants).

Follow-up artifacts tracked as tasks #231-#238 (stream placeholders). Supporting doc updates (plan.md §6 Migration Strategy, config-db-schema.md §§ for the four new tables, driver-specs.md §Alarm semantics clarification, new ADR-002 for driver-vs-virtual dispatch) will land alongside the streams that touch them, not in this doc.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 16:05:12 -04:00
8384e58655 Merge pull request 'Modbus exception-injection profile — wire-level coverage for codes 0x01/0x03/0x04/0x05/0x06/0x0A/0x0B' (#174) from modbus-exception-injection-profile into v2 2026-04-20 15:14:00 -04:00
Joseph Doherty
96940aeb24 Modbus exception-injection profile — closes the end-to-end test gap for exception codes 0x01/0x03/0x04/0x05/0x06/0x0A/0x0B. pymodbus simulator naturally emits only 0x02 (Illegal Data Address on reads outside configured ranges) + 0x03 (Illegal Data Value on over-length); the driver's MapModbusExceptionToStatus table translates eight codes, but only 0x02 had integration-level coverage (via DL205's unmapped-register test). Unit tests lock the translation function in isolation but an integration test was missing for everything else. This PR lands wire-level coverage for the remaining seven codes without depending on device-specific quirks to naturally produce them.
New exception_injector.py — standalone pure-Python-stdlib Modbus/TCP server shipped alongside the pymodbus image. Speaks the wire protocol directly (MBAP header parse + FC 01/02/03/04/05/06/15/16 dispatch + store-backed happy-path reads/writes + spec-enforced length caps) and looks up each (fc, starting-address) against a rules list loaded from JSON; a matching rule makes the server respond [fc|0x80, exception_code] instead of the normal response. Zero runtime dependencies outside the stdlib — the Dockerfile just COPY's the script into /fixtures/ alongside the pymodbus profile JSONs, no new pip install needed. ~200 lines. New exception_injection.json profile carries rules for every exception code on FC03 (addresses 1000-1007, one per code), FC06 (2000-2001 for CPU-PROGRAM-mode and busy), and FC16 (3000 for server failure). New exception_injection compose profile binds :5020 like every other service + runs python /fixtures/exception_injector.py --config /fixtures/exception_injection.json.

New ExceptionInjectionTests.cs in Modbus.IntegrationTests — 11 tests. Eight FC03-read theories exercise every exception code 0x01/0x02/0x03/0x04/0x05/0x06/0x0A/0x0B asserting the driver's expected OPC UA StatusCode mapping (BadNotSupported/BadOutOfRange/BadOutOfRange/BadDeviceFailure/BadDeviceFailure/BadDeviceFailure/BadCommunicationError/BadCommunicationError). Two FC06-write theories cover the write path for 0x04 (Server Failure, CPU in PROGRAM mode) + 0x06 (Server Busy). One sanity-check read at address 5 confirms the injector isn't globally broken + non-injected reads round-trip cleanly with Value=5/StatusCode=Good. All tests follow the MODBUS_SIM_PROFILE=exception_injection skip guard so they no-op on a fresh clone without Docker running.

Docker/README.md gains an §Exception injection section explaining what pymodbus can and cannot emit, what the injector does, where the rules live, and how to append new ones. docs/drivers/Modbus-Test-Fixture.md follow-up item #2 (extend pymodbus profiles to inject exceptions) gets a shipped strikethrough with the new coverage inventory; the unit-level section adds ExceptionInjectionTests next to DL205ExceptionCodeTests so the split-of-responsibilities is explicit (DL205 test = natural out-of-range via dl205 profile, ExceptionInjectionTests = every other code via the injector).

Test baselines: Modbus unit 182/182 green (unchanged); Modbus integration with exception_injection profile live 11/11 new tests green. Existing DL205/S7/Mitsubishi integration tests unaffected since they skip on MODBUS_SIM_PROFILE mismatch.

Found + fixed during validation: a stale native pymodbus simulator from April 18 was still listening on port 5020 on IPv6 localhost (Windows was load-balancing between it + the Docker IPv4 forward, making injected exceptions intermittently come back as pymodbus's default 0x02). Killed the leftover. Documented the debugging path in the commit as a note for anyone who hits the same "my tests see exception 0x02 but the injector log has no request" symptom.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 15:11:32 -04:00
340f580be0 Merge pull request 'FOCAS Tier-C PR E — ops glue: ProcessHostLauncher + post-mortem MMF + NSSM scripts' (#173) from focas-tier-c-pr-e-ops-glue into v2 2026-04-20 14:26:35 -04:00
Joseph Doherty
8d88ffa14d FOCAS Tier-C PR E — ops glue: ProcessHostLauncher + post-mortem MMF + NSSM install scripts + doc close-out. Final of the 5 PRs for #220. With this landing, the Tier-C architecture is fully shipped; the only remaining FOCAS work is the hardware-dependent FwlibHostedBackend (real Fwlib32.dll P/Invoke, gated on #222 lab rig).
Production IHostProcessLauncher (ProcessHostLauncher.cs): Process.Start spawns OtOpcUa.Driver.FOCAS.Host.exe with OTOPCUA_FOCAS_PIPE / OTOPCUA_ALLOWED_SID / OTOPCUA_FOCAS_SECRET / OTOPCUA_FOCAS_BACKEND in the environment (supervisor-owned, never disk), polls FocasIpcClient.ConnectAsync at 250ms cadence until the pipe is up or the Host exits or the ConnectTimeout deadline passes, then wraps the connected client in an IpcFocasClient. TerminateAsync kills the entire process tree + disposes the IPC stream. ProcessHostLauncherOptions carries HostExePath + PipeName + AllowedSid plus optional SharedSecret (auto-generated from a GUID when omitted so install scripts don't have to), Arguments, Backend (fwlib32/fake/unconfigured default-unconfigured), ConnectTimeout (15s), and Series for CNC pre-flight.

Post-mortem MMF (Host/Stability/PostMortemMmf.cs + Proxy/Supervisor/PostMortemReader.cs): ring-buffer of the last ~1000 IPC operations written by the Host into a memory-mapped file. On a Host crash the supervisor reads the MMF — which survives process death — to see what was in flight. File format: 16-byte header [magic 'OFPC' (0x4F465043) | version | capacity | writeIndex] + N × 256-byte entries [8-byte UTC unix ms | 8-byte opKind | 240-byte UTF-8 message + null terminator]. Magic distinguishes FOCAS MMFs from the Galaxy MMFs that ship the same format shape. Writer is single-producer (Host) with a lock_writeGate; reader is multi-consumer (Proxy + any diagnostic tool) using a separate MemoryMappedFile handle.

NSSM install wrappers (scripts/install/Install-FocasHost.ps1 + Uninstall-FocasHost.ps1): idempotent service registration for OtOpcUaFocasHost. Resolves SID from the ServiceAccount, generates a fresh shared secret per install if not supplied, stages OTOPCUA_FOCAS_PIPE/SID/SECRET/BACKEND in AppEnvironmentExtra so they never hit disk, rotates 10MB stdout/stderr logs under %ProgramData%\OtOpcUa, DependOnService=OtOpcUa so startup order is deterministic. Backend selector defaults to unconfigured so a fresh install doesn't accidentally load a half-configured Fwlib32.dll on first start.

Tests (7 new, 2 files): PostMortemMmfTests.cs in FOCAS.Host.Tests — round-trip write+read preserves order + content, ring-buffer wraps at capacity (writes 10 entries to a 3-slot buffer, asserts only op-7/8/9 survive in FIFO order), message truncation at the 240-byte cap is null-terminated + non-overflowing, reopening an existing file preserves entries. PostMortemReaderCompatibilityTests.cs in FOCAS.Tests — hand-writes a file in the exact host format (magic/entry layout) + asserts the Proxy reader decodes with correct ring-walk ordering when writeIndex != 0, empty-return on missing file + magic mismatch. Keeps the two codebases in format-lockstep without the net10 test project referencing the net48 Host assembly.

Docs updated: docs/v2/implementation/focas-isolation-plan.md promoted from DRAFT to PRs A-E shipped status with per-PR citations + post-ship test counts (189 + 24 + 13 = 226 FOCAS-family tests green). docs/drivers/FOCAS-Test-Fixture.md §5 updated from "architecture scoped but not implemented" to listing the shipped components with the FwlibHostedBackend gap explicitly labeled as hardware-gated. Install-FocasHost.ps1 documents the OTOPCUA_FOCAS_BACKEND selector + points at docs/v2/focas-deployment.md for Fwlib32.dll licensing.

What ISN'T in this PR: (1) the real FwlibHostedBackend implementing IFocasBackend with the P/Invoke — requires either a CNC on the bench or a licensed FANUC developer kit to validate, tracked under #220 as a single follow-up task; (2) Admin /hosts surface integration for FOCAS runtime status — Galaxy Tier-C already has the shape, FOCAS can slot in when someone wires ObservedCrashes/StickyAlertActive/BackoffAttempt to the FleetStatusHub; (3) a full integration test that actually spawns a real FOCAS Host process — ProcessHostLauncher is tested via its contract + the MMF is tested via round-trip, but no test spins up the real exe (the Galaxy Tier-C tests do this, but the FOCAS equivalent adds no new coverage over what's already in place).

Total FOCAS-family tests green after this PR: 189 driver + 24 Shared + 13 Host = 226.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 14:24:13 -04:00
446a5c022c Merge pull request 'FOCAS Tier-C PR D — supervisor + backoff + crash-loop breaker' (#172) from focas-tier-c-pr-d-supervisor into v2 2026-04-20 14:19:32 -04:00
Joseph Doherty
5033609944 FOCAS Tier-C PR D — supervisor + backoff + crash-loop breaker + heartbeat monitor. Fourth of 5 PRs for #220. Ships the resilience harness that sits between the driver's IFocasClient requests and the Tier-C Host process, so a crashing Fwlib32.dll takes down only the Host (not the main server), gets respawned on a backoff ladder, and opens a circuit with a sticky operator alert when the crash rate is pathological. Same shape as Galaxy Tier-C so the Admin /hosts surface has a single mental model. New Supervisor/ namespace in Driver.FOCAS (.NET 10, Proxy-side): Backoff with the 5s→15s→60s default ladder + StableRunThreshold that resets the index after a 2-min clean run (so a one-off crash after hours of steady-state doesn't restart from the top); CircuitBreaker with 3-crashes-in-5-min threshold + escalating 1h→4h→manual-reset cooldown ladder + StickyAlertActive flag that persists across cooldowns until AcknowledgeAndReset is called; HeartbeatMonitor tracking ConsecutiveMisses against the 3-misses-kill threshold + LastAckUtc for telemetry; IHostProcessLauncher abstraction over "spawn Host process + produce an IFocasClient connected to it" so the supervisor stays I/O-free and fully testable with a fake launcher that can be told to throw on specific attempts (production wiring over Process.Start + FocasIpcClient.ConnectAsync is the PR E ops-glue concern); FocasHostSupervisor orchestrating them — GetOrLaunchAsync cycles through backoff until either a client is returned or the breaker opens (surfaced as InvalidOperationException so the driver maps to BadDeviceFailure), NotifyHostDeadAsync fans out the unavailable event + terminates the current launcher + records the crash without blocking (so heartbeat-loss detection can short-circuit subscriber fan-out and let the next GetOrLaunchAsync handle the respawn), AcknowledgeAndReset is the operator-clear path, OnUnavailable event for Admin /hosts wiring + ObservedCrashes + BackoffAttempt + StickyAlertActive for telemetry. 14 new unit tests across SupervisorTests.cs: Backoff (default sequence, clamping, RecordStableRun resets), CircuitBreaker (below threshold allowed, opens at threshold, escalates cooldown after second open, ManualReset clears state), HeartbeatMonitor (3 consecutive misses declares dead, ack resets counter), FocasHostSupervisor (first-launch success, retry-with-backoff after transient failure, repeated failures open breaker + surface InvalidOperationException, NotifyHostDeadAsync terminates + fan-outs + increments crash count, AcknowledgeAndReset clears sticky, Dispose terminates). Full FOCAS driver tests now 186/186 green (172 + 14 new). No changes to IFocasClient DI contract; existing FakeFocasClient-based tests unaffected. PR E wires the real Process-based IHostProcessLauncher + NSSM install scripts + MMF post-mortem + docs.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 14:17:23 -04:00
9034294b77 Merge pull request 'FOCAS Tier-C PR C — IPC path end-to-end' (#171) from focas-tier-c-pr-c-ipc-proxy into v2 2026-04-20 14:13:33 -04:00
Joseph Doherty
3892555631 FOCAS Tier-C PR C — IPC path end-to-end: Proxy IpcFocasClient + Host FwlibFrameHandler + IFocasBackend abstraction. Third of 5 PRs for #220. Ships the wire path from IFocasClient calls in the .NET 10 driver, over a named-pipe (or in-memory stream) to the .NET 4.8 Host's FwlibFrameHandler, dispatched to an IFocasBackend. Keeps the existing IFocasClient DI seam intact so existing unit tests are unaffected (172/172 still pass). Proxy side adds Ipc/FocasIpcClient (owns one pipe stream + call gate so concurrent callers don't interleave frames, supports both real NamedPipeClientStream and arbitrary Stream for in-memory test loopback) and Ipc/IpcFocasClient (implements IFocasClient by forwarding every call as an IPC frame — Connect sends OpenSessionRequest and caches the SessionId; Read sends ReadRequest and decodes the typed value via FocasDataTypeCode; Write sends WriteRequest for non-bit data or PmcBitWriteRequest when it's a PMC bit so the RMW critical section stays on the Host; Probe sends ProbeRequest; Dispose best-effort sends CloseSessionRequest); plus FocasIpcException surfacing Host-side ErrorResponse frames as typed exceptions. Host side adds Backend/IFocasBackend (the Host's view of one FOCAS session — Open/Close/Read/Write/PmcBitWrite/Probe) with two implementations: FakeFocasBackend (in-memory, per-address value store, honors bit-write RMW semantics against the containing byte — used by tests and as an OTOPCUA_FOCAS_BACKEND=fake operational stub) and UnconfiguredFocasBackend (structured failure pointing at docs/v2/focas-deployment.md — the safe default when OTOPCUA_FOCAS_BACKEND is unset or hardware isn't configured). Ipc/FwlibFrameHandler replaces StubFrameHandler: deserializes each request DTO, delegates to the IFocasBackend, re-serializes into the matching response kind. Catches backend exceptions and surfaces them as ErrorResponse{backend-exception} rather than tearing down the pipe. Program.cs now picks the backend from OTOPCUA_FOCAS_BACKEND env var (fake/unconfigured/fwlib32; fwlib32 still maps to Unconfigured because the real Fwlib32 P/Invoke integration is a hardware-dependent follow-up — #220 captures it). Tests: 7 new IPC round-trip tests on the Proxy side (IpcFocasClient vs. an IpcLoopback fake server: connect happy path, connect rejection, read decode, write round-trip, PMC bit write routes to first-class RMW frame, probe, ErrorResponse surfaces as typed exception) + 6 new Host-side tests on FwlibFrameHandler (OpenSession allocates id, read-without-session fails, full open/write/read round-trip preserves value, PmcBitWrite sets the specified bit, Probe reports healthy with open session, UnconfiguredBackend returns pointed-at-docs error with ErrorCode=NoFwlibBackend). Existing 165 FOCAS unit tests + 24 Shared tests + 3 Host handshake tests all unchanged. Total post-PR: 172+24+9 = 205 FOCAS-family tests green. What's NOT in this PR: the actual Fwlib32.dll P/Invoke integration inside the Host (FwlibHostedBackend) lands as a hardware-dependent follow-up since no CNC is available for validation; supervisor + respawn + crash-loop breaker comes in PR D; MMF + NSSM install scripts in PR E.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 14:10:52 -04:00
3609a5c676 Merge pull request 'FOCAS Tier-C PR B � Driver.FOCAS.Host net48 x86 skeleton' (#170) from focas-tier-c-pr-b-host into v2 2026-04-20 14:02:56 -04:00
Joseph Doherty
a6f53e5b22 FOCAS Tier-C PR B — Driver.FOCAS.Host net48 x86 skeleton + pipe server. Second PR of the 5-PR #220 split. Stands up the Windows Service entry point + named-pipe scaffolding so PR C has a place to move the Fwlib32 calls into. New net48 x86 project (Fwlib32.dll is 32-bit-only, same bitness constraint as Galaxy.Host/MXAccess); references Driver.FOCAS.Shared for framing + DTOs so the wire format Proxy<->Host stays a single type system. Ships four files: PipeAcl creates a PipeSecurity where only the configured server principal SID gets ReadWrite+Synchronize + LocalSystem/BuiltinAdministrators are explicitly denied (so a compromised service account on the same host can't escalate via the pipe); IFrameHandler abstracts the dispatch surface with HandleAsync + AttachConnection for server-push event sinks; PipeServer accepts one connection at a time, verifies the peer SID via RunAsClient, reads the first Hello frame + matches the shared-secret and protocol major version, sends HelloAck, then hands off to the handler until EOF or cancel; StubFrameHandler fully handles Heartbeat/HeartbeatAck so a future supervisor's liveness detector stays happy, and returns ErrorResponse{Code=not-implemented,Message="Kind X is stubbed - Fwlib32 lift lands in PR C"} for every data-plane request. Program.cs mirrors the Galaxy.Host startup exactly: reads OTOPCUA_FOCAS_PIPE / OTOPCUA_ALLOWED_SID / OTOPCUA_FOCAS_SECRET from the env (supervisor passes these at spawn time), builds Serilog rolling-file logger into %ProgramData%\OtOpcUa\focas-host-*.log, constructs the pipe server with StubFrameHandler, loops on RunAsync until SIGINT. Log messages mark the backend as "stub" so it's visible in logs that Fwlib32 isn't actually connected yet. Driver.FOCAS.Host.Tests (net48 x86) ships three integration tests mirroring IpcHandshakeIntegrationTests from Galaxy.Host: correct-secret handshake + heartbeat round-trip, wrong-secret rejection with UnauthorizedAccessException, and a new test that sends a ReadRequest and asserts the StubFrameHandler returns ErrorResponse{not-implemented} mentioning PR C in the message so the wiring between frame dispatch + kind → error mapping is locked. Tests follow the same is-Administrator skip guard as Galaxy because PipeAcl denies BuiltinAdministrators. No changes to existing driver code; FOCAS unit tests still at 165/165 + Shared tests at 24/24. PR C wires the real Fwlib32 backend.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 14:00:56 -04:00
b968496471 Merge pull request 'FOCAS Tier-C PR A � Driver.FOCAS.Shared MessagePack contracts' (#169) from focas-tier-c-pr-a-shared into v2 2026-04-20 13:57:45 -04:00
Joseph Doherty
e6ff39148b FOCAS Tier-C PR A — Driver.FOCAS.Shared MessagePack IPC contracts. First PR of the 5-PR #220 split (isolation plan at docs/v2/implementation/focas-isolation-plan.md). Adds a new netstandard2.0 project consumable by both the .NET 10 Proxy and the future .NET 4.8 x86 Host, carrying every wire DTO the Proxy <-> Host pair will exchange: Hello/HelloAck + Heartbeat/HeartbeatAck + ErrorResponse for session negotiation (shared-secret + protocol major/minor mirroring Galaxy.Shared); OpenSessionRequest/Response + CloseSessionRequest carrying the declared FocasCncSeries so the Host picks up the pre-flight matrix; FocasAddressDto + FocasDataTypeCode for wire-compatible serialization of parsed addresses (0=Pmc/1=Param/2=Macro matches FocasAreaKind enum order so both sides cast (int)); ReadRequest/Response + WriteRequest/Response with MessagePack-serialized boxed values tagged by FocasDataTypeCode; PmcBitWriteRequest/Response as a first-class RMW operation so the critical section stays Host-side; Subscribe/Unsubscribe/OnDataChangeNotification for poll-loop-pushes-deltas model (FOCAS has no CNC-initiated callbacks); Probe + RuntimeStatusChange + Recycle surface for Tier-C supervision. Framing is [4-byte BE length][1-byte kind][body] with 16 MiB body cap matching Galaxy; FocasMessageKind byte values align with Galaxy ranges so an operator reading a hex dump doesn't have to context-switch. FrameReader/FrameWriter ported from Galaxy.Shared with thread-safe concurrent-write serialization. 24 new unit tests: 18 per-DTO round-trip tests covering every field + 6 framing tests (single-frame round-trip, clean-EOF returns null, oversized-length rejection, mid-frame EOF throws, 20-way concurrent-write ordering preserved, MessageKind byte values locked as wire-stable). No driver changes; existing 165 FOCAS unit tests still pass unchanged. PR B (Host skeleton) goes next.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 13:55:35 -04:00
4a6fe7fa7e Merge pull request 'FOCAS version-matrix stabilization (PR 1 of #220 split)' (#168) from focas-version-matrix-stabilization into v2 2026-04-20 13:46:53 -04:00
Joseph Doherty
a6be2f77b5 FOCAS version-matrix stabilization (PR 1 of #220 split) — ship the cheap half of the hardware-free stability gap ahead of the Tier-C out-of-process split. Without any CNC or simulator on the bench, the highest-leverage move is to catch operator config errors at init time instead of at steady-state per-read. Adds FocasCncSeries enum (Unknown/16i/0i-D/0i-F family/30i family/PowerMotion-i) + FocasCapabilityMatrix static class that encodes the per-series documented ranges for macro variables (cnc_rdmacro/wrmacro), parameters (cnc_rdparam/wrparam), and PMC letters + byte ceilings (pmc_rdpmcrng/wrpmcrng) straight from the Fanuc FOCAS Developer Kit. FocasDeviceOptions gains a Series knob (defaults Unknown = permissive so pre-matrix configs don't break on upgrade). FocasDriver.InitializeAsync now calls FocasAddress.TryParse on every tag + runs FocasCapabilityMatrix.Validate against the owning device's declared series, throwing InvalidOperationException with a reason string that names both the series and the documented limit ("Parameter #30000 is outside the documented range [0, 29999] for Thirty_i") so an operator can tell whether the mismatch is in the config or in their declared CNC model. Unknown series skips validation entirely. Ships 46 new theory cases in FocasCapabilityMatrixTests.cs — covering every boundary in the matrix (widen 16i->0i-F: macro ceiling 999->9999, param 9999->14999; widen 0i-F->30i: PMC letters +K+T; PMC-number 16i=999/0i-D=1999/0i-F=9999/30i=59999), permissive Unknown-series behavior, rejection-message content, and case-insensitive PMC-letter matching. Widening a range without updating docs/v2/focas-version-matrix.md fails a test because every InlineData cites the row it reflects. Full FOCAS test suite stays at 165/165 passing (119 existing + 46 new). Also authors docs/v2/focas-version-matrix.md as the authoritative range reference with per-function citations, CNC-series era context, error-surface shape, and the link back to the matrix code; docs/v2/implementation/focas-isolation-plan.md as the multi-PR plan for #220 Tier-C isolation (Shared contracts -> Host skeleton -> move Fwlib32 calls -> Supervisor+respawn -> MMF+ops glue, 2200-3200 LOC across 5 PRs mirroring the Galaxy Tier-C topology); and promotes docs/drivers/FOCAS-Test-Fixture.md from "version-matrix coverage = no" to explicit coverage via the new test file + cross-links to the matrix and isolation-plan docs. Leaves task #220 open since isolation itself (the expensive half) is still ahead.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 13:44:37 -04:00
64bbc12e8e Merge pull request 'AB Legacy ab_server PCCC Docker fixture scaffold (#224)' (#167) from ablegacy-docker-fixture into v2 2026-04-20 13:28:16 -04:00
Joseph Doherty
a0cf7c5860 AB Legacy ab_server PCCC Docker fixture scaffold (#224) — Docker infrastructure + test-class shape in place; wire-level round-trip currently blocked by an ab_server-side PCCC coverage gap documented honestly in the fixture + coverage docs. Closes the Docker-infrastructure piece of #224; the remaining work is upstream (patch ab_server's PCCC server opcodes) or sideways (RSEmulate 500 golden-box tier, lab rig).
New project tests/ZB.MOM.WW.OtOpcUa.Driver.AbLegacy.IntegrationTests/ with four pieces. AbLegacyServerFixture — TCP probe against localhost:44818 (or AB_LEGACY_ENDPOINT override), distinct from AB_SERVER_ENDPOINT so both CIP + PCCC containers can run simultaneously. Single-public-ctor to satisfy xunit collection-fixture constraint. AbLegacyServerProfile + KnownProfiles carry the per-family (SLC500 / MicroLogix / PLC-5) ComposeProfile + Notes; drives per-theory parameterisation. AbLegacyFactAttribute / AbLegacyTheoryAttribute match the AB CIP skip-attribute pattern.

Docker/docker-compose.yml reuses the AB CIP otopcua-ab-server:libplctag-release image — `build:` block points at ../../AbCip.IntegrationTests/Docker context so `docker compose build` from here produces / reuses the same multi-stage build. Three compose profiles (slc500 / micrologix / plc5) with per-family `--plc` + `--tag=<file>[<size>]` flags matching the PCCC tag syntax (different from CIP's `Name:Type[size]`).

AbLegacyReadSmokeTests — one parametric theory reading N7:0 across all three families + one SLC500 write-then-read on N7:5. Targets the shape the driver would use against real hardware. Verified 2026-04-20 against a live SLC500 container: TCP probe passes + container accepts connections + libplctag negotiates session, but read/write returns BadCommunicationError (libplctag status 0x80050000). Root-caused to ab_server's PCCC server-side opcode coverage being narrower than libplctag's PCCC client expects — not a driver-side bug, not a scaffold bug, just an ab_server upstream limitation. Documented honestly in Docker/README.md + AbLegacy-Test-Fixture.md rather than skipping the tests or weakening assertions; tests now skip cleanly when container is absent, fail with clear message when container is up but the protocol gap surfaces. Operator resolves by filing an ab_server upstream patch, pointing AB_LEGACY_ENDPOINT at real hardware, or scaffolding an RSEmulate 500 golden-box tier.

Docker/README.md — Known limitations section leads with the PCCC round-trip gap (test date, failure signature, possible root causes, three resolution paths) before the pre-existing limitations (T/C file decomposition, ST file quirks, indirect addressing, DF1 serial). Reader can't miss the "scaffolded but blocked on upstream" framing.

docs/drivers/AbLegacy-Test-Fixture.md — TL;DR flipped from "no integration fixture" to "Docker scaffold in place; wire-level round-trip currently blocked by ab_server PCCC gap". What-the-fixture-is gains an Integration section. Follow-up candidates rewritten: #1 is now "fix ab_server PCCC upstream", #2 is RSEmulate 500 golden-box (with cost callouts matching our existing Logix Emulate + TwinCAT XAR scaffolds — license + Hyper-V conflict + binary project format), #3 is lab rig. Key-files list adds the four new files. docs/drivers/README.md coverage-map row updated from "no integration fixture" to "Docker scaffold via ab_server PCCC; wire-level round-trip currently blocked, docs call out resolution paths".

Solution file picks up the new tests/.../AbLegacy.IntegrationTests entry. AbLegacyDataType.Int used throughout (not Int16 — the enum uses SLC file-type naming). Build 0 errors; 2 smoke tests skip cleanly without container + fail with clear errors when container up (proving the infrastructure works end-to-end + the gap is specifically the ab_server protocol coverage, not the scaffold).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 13:26:19 -04:00
2fe1a326dc Merge pull request 'TwinCAT XAR integration fixture scaffold (#221)' (#166) from twincat-xar-fixture-scaffold into v2 2026-04-20 13:11:53 -04:00
Joseph Doherty
7b49ea13c7 TwinCAT XAR integration fixture — scaffold the code + docs so the Hyper-V VM + .tsproj drop in without fixture-code changes. Mirrors the AB CIP Logix Emulate scaffold shipped in PR #165: tier-gated smoke tests that skip cleanly when the VM isn't reachable, a project README documenting exactly what the XAR needs to run, fixture-coverage doc promoting TwinCAT from "no integration fixture" to "scaffolded + needs operational setup". The actual Beckhoff-side work (provision VM, install XAR, author tsproj, rotate 7-day trial) lives in #221 + the new TwinCatProject/README.md walkthrough.
New project tests/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT.IntegrationTests/ with four pieces. TwinCATXarFixture — TCP probe against the ADS-over-TCP port 48898 on the host from TWINCAT_TARGET_HOST env var, requires TWINCAT_TARGET_NETID for the target AmsNetId, optional TWINCAT_TARGET_PORT for runtime 2+ (default 851 = PLC runtime 1). Doesn't own a lifecycle — XAR can't run in Docker because it bypasses the Windows kernel scheduler to hit real-time cycles, so the VM stays operator-managed. Explicit skip reasons surface the setup steps (start VM, set env vars, reactivate trial license) instead of a confusing hang. TwinCATFactAttribute + TwinCATTheoryAttribute — xunit skip gate matching AbServerFactAttribute / OpcPlcCollection patterns.

TwinCAT3SmokeTests — three smoke tests through the real AdsTwinCATClient + real ADS over TCP. Driver_reads_seeded_DINT_through_real_ADS reads GVL_Fixture.nCounter, asserts >= 1234 (MAIN increments every cycle so an exact match would race). Driver_write_then_read_round_trip_on_scratch_REAL writes 42.5 to GVL_Fixture.rSetpoint + reads back, catches the ADS write path regression that unit tests can't see. Driver_subscribe_receives_native_ADS_notifications_on_counter_changes validates the #189 native-notification path end-to-end — AddDeviceNotification fires OnDataChange at the PLC cycle boundary, the test observes one firing within 3 s. All three gated on TWINCAT_TARGET_HOST + NETID; skip via TwinCATFactAttribute when unset, verified in this commit with 3 clean [SKIP] results.

TwinCatProject/README.md — the tsproj state the smoke tests depend on. GVL_Fixture with nCounter:DINT:=1234 + rSetpoint:REAL:=0.0 + bFlag:BOOL:=TRUE; MAIN program with the single-line ladder `GVL_Fixture.nCounter := GVL_Fixture.nCounter + 1;`; PlcTask cyclic @ 10 ms priority 20; PLC runtime 1 (AMS port 851). Explains why tsproj over the compiled bootproject (text-diffable, rebuildable, no per-install state). Full XAR VM setup walkthrough — Hyper-V Gen 2 VM, TC3 XAE+XAR install, noting the AmsNetId from the tray icon, bilateral route configuration (VM System Manager → Routes + dev box StaticRoutes.xml), project import, Activate Configuration + Run Mode. License-rotation section walks through two options — scheduled TcActivate.exe /reactivate via Task Scheduler (not officially Beckhoff-supported, reportedly works on current builds) or paid runtime license (~$1k one-time per runtime per CPU). Final section shows the exact env-var recipe + dotnet test command on the dev box.

docs/drivers/TwinCAT-Test-Fixture.md — flipped TL;DR from "there is no integration fixture" to "scaffolding lives at tests/..., remaining operational work is VM + tsproj + license rotation". "What the fixture is" gains an Integration section describing the XAR VM target. "What it actually covers" gains an Integration subsection listing the three named smoke tests. Follow-up candidates rewritten — the #1 item used to be "TwinCAT 3 runtime on CI" as a speculative option; now it's concrete "XAR VM live-population" with a link to #221 + the project README for the operational walkthrough. License rotation becomes #2 with both automation paths. Key fixture / config files list adds the three new files + the project README. docs/drivers/README.md coverage-map row updated from "no integration fixture" to "XAR-VM integration scaffolding".

Solution file picks up the new tests/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT.IntegrationTests entry alongside the existing TwinCAT.Tests. xunit CollectionDefinition added to TwinCATXarFixture after the first build revealed the [Collection("TwinCATXar")] reference on TwinCAT3SmokeTests had no matching registration. Build 0 errors; 3 skip-clean test outcomes verified. #221 stays open as in_progress until the VM + tsproj land.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 13:11:17 -04:00
b820b9a05f Merge pull request 'AB CIP Logix Emulate golden-box tier � scaffolding' (#165) from abcip-emulate-tier-scaffold into v2 2026-04-20 12:56:38 -04:00
Joseph Doherty
58a0cccc67 AB CIP Logix Emulate golden-box tier — scaffold the code + docs so the L5X + Emulate PC drop in without fixture-code changes. Closes the initial design question the user raised; the actual Emulate-side work (author project, commit L5X, install Emulate on the dev box) is tracked as #223. Scaffolding ships everything that doesn't need the live Emulate instance: tier-gated test classes that skip cleanly when AB_SERVER_PROFILE is unset, the profile gate helper, the LogixProject/README.md documenting the exact project state the tests expect, the fixture coverage doc's new §Logix Emulate tier section with the when-to-trust table extended from 3 columns to 4, and the dev-environment.md integration-host row.
AbServerProfileGate — static helper that reads `AB_SERVER_PROFILE` env var (defaults to "abserver") + exposes `SkipUnless(params string[] requiredProfiles)` matching the MODBUS_SIM_PROFILE pattern the DL205StringQuirkTests uses one directory over. Emulate-only tests call `AbServerProfileGate.SkipUnless("emulate")` at the top of each fact body; ab_server-default runs see them skip with a clear message pointing at the Emulate setup steps.

AbCipEmulateUdtReadTests — one test proving the #194 whole-UDT read optimization works against the real Logix Template Object, not just the golden byte buffers the unit suite uses. Builds an `AbCipDriverOptions` with a Structure tag `Motor1 : Motor_UDT` that has three declared members (Speed : DINT, Torque : REAL, Status : DINT), reads them via the `.Speed / .Torque / .Status` dotted-tag syntax, asserts the driver gets the grouped whole-UDT path + decodes each at the right offset. Required seed values documented inline + in LogixProject/README.md: Speed=1800, Torque=42.5f, Status=0x0001.

AbCipEmulateAlmdTests — one test proving the #177 ALMD projection fires `OnAlarmEvent` when a real ALMD instruction's `In` edge rises, not just the fake `InFaulted` timer edges the unit suite drives. Needs a `SimulateAlarm : BOOL` tag routed through `MainRoutine` ladder (`XIC SimulateAlarm OTE HighTempAlarm.In`) so the test case can pulse the input via the existing `IWritable.WriteAsync` path instead of scripting Emulate via its own socket. Alarm-projection options carry `EnableAlarmProjection = true` + 200 ms poll interval; a `TaskCompletionSource` gates the raise-event assertion with a 5 s deadline. Cleanup writes SimulateAlarm=false so consecutive runs start from known state.

LogixProject/README.md — the Studio 5000 project state the Emulate-tier tests depend on. Explains why L5X over ACD (text diff, reproducible import, no per-install state), the UDT + tag + routine structure, how to bring it up on the Emulate PC. Ships as a stub pending actual author + L5X export + commit; the README itself keeps the requirements visible so the L5X author has a checklist.

docs/drivers/AbServer-Test-Fixture.md — new §Logix Emulate golden-box tier section with the coverage-promotion table (ab_server / Emulate / hardware per gap), the setup-env-var recipe, the costs to accept (license, Hyper-V conflict, manual lifecycle). "When to trust" table extended from 3 columns (ab_server / unit / rig) to 4 (ab_server / unit / Logix Emulate / rig); two new rows for EtherNet/IP embedded-switch + redundant-chassis failover that even Emulate can't help with. Follow-up candidates list gets Logix Emulate as option 1 ahead of the pre-existing "extend ab_server upstream" + "stand up a lab rig". See-also file list gains AbServerProfileGate.cs + Docker/ + Emulate/ + LogixProject/README.md entries.

docs/v2/dev-environment.md — §C Integration host gains a Rockwell Studio 5000 Logix Emulate row: purpose (AB CIP golden-box tier closing UDT/ALMD/AOI/safety/ConnectionSize gaps), type (Windows-only, Hyper-V conflict matching TwinCAT XAR's constraint), port 44818, credentials note, owner split between integration-host admin for license+install and developer for per-session runtime start.

Verified: Emulate tests skip cleanly when AB_SERVER_PROFILE is unset — both `[SKIP]` with the operator-facing message pointing at the env-var setup. Whole-solution build 0 errors. Tests will transition from skip → pass once the L5X + Emulate PC land per #223.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 12:54:39 -04:00
231148d7f0 Merge pull request 'Doc + code-comment sweep � finish the native-fallback removal' (#164) from docs-native-fallback-cleanup into v2 2026-04-20 12:38:11 -04:00
Joseph Doherty
fdb268cee0 Docs + code-comment sweep — remove stale Pymodbus/ + PythonSnap7/ + LocateBinary references left behind by the native-fallback removal PR. Answer to "is the dev inventory + documentation updated": it was partial; this PR finishes the job.
Files touched — docs/drivers/Modbus-Test-Fixture.md dropped the key-files pointer at deleted Pymodbus/ + flipped "primary launcher is Docker, native fallback retained" framing to "Docker is the only supported launch path" (matching the code). docs/v2/dev-environment.md dropped the "skips both Docker + native-binary paths" parenthetical from AB_SERVER_ENDPOINT + flipped the "Native fallbacks" subsection to a one-liner that says Docker is the only supported path. docs/v2/modbus-test-plan.md rewrote §Harness from "pip install pymodbus + serve.ps1" setup pattern to "docker compose --profile <…> up" + updated the §PR 43 status bullet to point at Docker/profiles/. docs/v2/test-data-sources.md §"CI fixture (task #180)" rewrote the AB CIP section from "LocateBinary() picks binary off PATH" + GitHub Actions zip-download step to "Docker is the only supported reproducible build path" + docker compose GitHub Actions step; dropped the pinned-version SHA256 table + lock-file reference because the Dockerfile's LIBPLCTAG_TAG build-arg is the new pin.

Code docstrings + error messages — these are developer-facing operational text too. ModbusSimulatorFixture SkipReason strings (both branches) now point at `docker compose -f Docker/docker-compose.yml --profile standard up -d` instead of the deleted `Pymodbus\serve.ps1`; doc-comment at the top references Docker/docker-compose.yml. Snap7ServerFixture SkipReason strings + doc-comment point at Docker/docker-compose.yml instead of PythonSnap7/serve.ps1. S7_1500Profile.cs docstring updated. Modbus Dockerfile comment pointing at deleted tests/.../Pymodbus/README.md redirected to docs/drivers/Modbus-Test-Fixture.md. DL205Profile.cs + DL205StringQuirkTests.cs + S7_1500Profile.cs (in Modbus project) docstrings flipped from Pymodbus/*.json references to Docker/profiles/*.json.

Left untouched deliberately: docs/v2/implementation/exit-gate-phase-2-closed.md — that's a historical as-of-2026-04-18 snapshot documenting what was skipped at Phase 2 closure; rewriting would lose the date-stamped context. Its "oitc/modbus-server Docker container not started" + "ab_server binary not on PATH" lines describe the fixture landscape that existed at close time, not current operational guidance.

Final sweep confirms zero remaining `Pymodbus/` / `PythonSnap7/` / `LocateBinary` / `AbServerSeedTag` / `BuildCliArgs` / `AbServerPlcArg` mentions anywhere in tracked files outside that historical exit-gate doc. Whole-solution build still 0 errors.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 12:36:19 -04:00
4473197cf5 Merge pull request 'Remove native-launcher fallbacks; Docker is the only path for Modbus / S7 / AB CIP / OpcUaClient' (#163) from remove-native-fallbacks into v2 2026-04-20 12:29:46 -04:00
Joseph Doherty
0e1dcc119e Remove native-launcher fallbacks for the four Dockerized fixtures — Docker is the only supported path for Modbus / S7 / AB CIP / OpcUaClient integration. Native paths stay in place only where Docker isn't compatible (Galaxy: MXAccess COM + Windows-only; TwinCAT: Beckhoff runtime vs Hyper-V; FOCAS: closed-source Fanuc Fwlib32.dll; AB Legacy: PCCC has no OSS simulator). Simplifies the fixture landscape + removes the "which path do I run" ambiguity; removes two full native-launcher directories + the AB CIP native-spawn path; removes the parallel profile-as-CLI-arg-builder code from AbServerFixture.
Modbus — deletes tests/.../Modbus.IntegrationTests/Pymodbus/ (serve.ps1, standard.json, dl205.json, mitsubishi.json, s7_1500.json, README.md). Profile JSONs live only under Docker/profiles/ now. Docker/README.md loses its "Native-Python fallback" section; docs/drivers/Modbus-Test-Fixture.md "What the fixture is" bullet flipped from "primary launcher is Docker, native fallback under Pymodbus/" to "Docker is the only supported launch path".

S7 — deletes tests/.../S7.IntegrationTests/PythonSnap7/ (server.py, s7_1500.json, serve.ps1, README.md). Docker/README.md loses "Native-Python fallback"; docs/drivers/S7-Test-Fixture.md updated to match.

AB CIP — the biggest simplification because the native-binary spawn had the most code. AbServerFixture.cs rewrites: drops Process management (no more Process _proc + Kill/WaitForExit), drops LocateBinary() PATH lookup, drops the IAsyncLifetime initialize-spawns-server behavior. Fixture is now a thin TCP probe against localhost:44818 (or AB_SERVER_ENDPOINT override) — same shape as Snap7ServerFixture / ModbusSimulatorFixture / OpcPlcFixture. IsServerAvailable() simplifies to a single 500 ms probe. AbServerProfile.cs drops AbServerPlcArg + SeedTags + BuildCliArgs + ToCliSpec + the entire AbServerSeedTag record — the compose file is the canonical source of truth for which tags + which --plc mode each family gets; the profile record now carries just Family + ComposeProfile (matches the docker-compose service key) + Notes. KnownProfiles.ForFamily + .All stay for tests that iterate families. AbServerProfileTests.cs rewrites to match: drops BuildCliArgs_* + ToCliSpec_* + SeedTags_* tests; keeps the family-coverage contract tests + verifies the ComposeProfile strings match compose-file service names (a typo in either surfaces as a unit-test failure, not a silent "wrong family booted" at runtime). Docker/README.md loses "Native-binary fallback" section; docs/drivers/AbServer-Test-Fixture.md "What the fixture is" flipped to Docker-only with clearer skip rules.

dev-environment.md §Docker fixtures — the "Native fallbacks" subsection goes away; replaced with a one-line note that Docker is the only supported path for these four fixtures + a fresh clone needs Docker Desktop and nothing else.

Verified: whole-solution build 0 errors, AB CIP profile unit tests 6/6, AB CIP Docker smoke 4/4 (all family theory rows), S7 Docker smoke 3/3. Container lifecycle clean. The deleted native code surface was already redundant — every fixture the native paths served is now covered by Docker; keeping them invited drift between the two paths (the original AB CIP native profile had three undetected bugs per the #162 commit message: case-sensitive --plc, bracket tag notation, --path=1,0 requirement — noise the Docker path now avoids by never running the buggy code).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 12:27:44 -04:00
27d135bd59 Merge pull request 'Dockerize Modbus + AB CIP + S7 test fixtures for reproducibility' (#162) from fixtures-all-docker into v2 2026-04-20 12:11:45 -04:00
Joseph Doherty
6609141493 Dockerize Modbus + AB CIP + S7 test fixtures for reproducibility. Every driver integration simulator now has a pinned Docker image alongside the existing native launcher — Docker is the primary path, native fallbacks kept for contributors who prefer them. Matches the already-Dockerized OpcUaClient/opc-plc pattern from #215 so every fixture in the fleet presents the same compose-up/test/compose-down loop. Reproducibility gain: what used to require a local pip/Python install (Modbus pymodbus, S7 python-snap7) or a per-OS C build from source (AB CIP ab_server from libplctag) now collapses to a Dockerfile + docker compose up. Modbus — new tests/ZB.MOM.WW.OtOpcUa.Driver.Modbus.IntegrationTests/Docker/ with Dockerfile (python:3.12-slim-bookworm + pymodbus[simulator]==3.13.0) + docker-compose.yml with four compose profiles (standard / dl205 / mitsubishi / s7_1500) backed by the existing profile JSONs copied under Docker/profiles/ as canonical; native fallback in Pymodbus/ retained with the same JSON set (symlink-equivalent — manual re-sync when profiles change, noted in both READMEs). Port 5020 unchanged so MODBUS_SIM_ENDPOINT + ModbusSimulatorFixture work without code change. Dropped the --no_http CLI arg the old serve.ps1 + compose draft passed — pymodbus 3.13 doesn't recognize it; the simulator's http ui just binds inside the container where nothing maps it out and costs nothing. S7 — new tests/ZB.MOM.WW.OtOpcUa.Driver.S7.IntegrationTests/Docker/ with Dockerfile (python:3.12-slim-bookworm + python-snap7>=2.0) + docker-compose.yml with one s7_1500 compose profile; copies the existing server.py shim + s7_1500.json seed profile; runs python -u server.py ... --port 1102. Native fallback in PythonSnap7/ retained. Port 1102 unchanged. AB CIP — hardest because ab_server is a source-only C tool in libplctag's src/tools/ab_server/. New tests/ZB.MOM.WW.OtOpcUa.Driver.AbCip.IntegrationTests/Docker/ Dockerfile is multi-stage: build stage (debian:bookworm-slim + build-essential + cmake) clones libplctag at a pinned tag + cmake --build build --target ab_server; runtime stage (debian:bookworm-slim) copies just the binary from /src/build/bin_dist/ab_server. docker-compose.yml ships four compose profiles (controllogix / compactlogix / micro800 / guardlogix) with per-family ab_server CLI args matching AbServerProfile.cs. AbServerFixture updated: tries TCP probe on 127.0.0.1:44818 first (Docker path) + spawns the native binary only as fallback when no listener is there. AB_SERVER_ENDPOINT env var supported for pointing at a real PLC. AbServerFact/Theory attributes updated to IsServerAvailable() which accepts any of: live listener on 44818, AB_SERVER_ENDPOINT set, or binary on PATH. Required two CLI-compat fixes to ab_server's argument expectations that the existing native profile never caught because it was never actually run at CI: --plc is case-sensitive (ControlLogix not controllogix), CIP tags need [size] bracket notation (DINT[1] not bare DINT), ControlLogix also requires --path=1,0. Compose files carry the corrected flags; the existing native-path AbServerProfile.cs was never invoked in practice so we don't rewrite it here. Micro800 now uses the --plc=Micro800 mode rather than falling back to ControlLogix emulation — ab_server does have the dedicated mode, the old Notes saying otherwise were wrong. Updated docs: three fixture coverage docs (Modbus-Test-Fixture.md, S7-Test-Fixture.md, AbServer-Test-Fixture.md) flip their "What the fixture is" section from native-only to Docker-primary-with-native-fallback; dev-environment.md §Resource Inventory replaces the old ambiguous "Docker Desktop + ab_server native" mix with four per-driver rows (each listing the image, compose file, compose profiles, port, credentials) + a new Docker fixtures — quick reference subsection giving the one-line docker compose -f <…> --profile <…> up for each driver + the env-var override names + the native fallback install recipes. drivers/README.md coverage map table updated — Modbus/AB CIP/S7 entries now read "Dockerized …" consistent with OpcUaClient's line. Verified end-to-end against live containers: Modbus DL205 smoke 1/1, S7 3/3, AB CIP ControlLogix 4/4 (all family theory rows). Container lifecycle clean (up/test/down, no leaked state). Every fixture keeps its skip-when-absent probe + env-var endpoint override so dotnet test on a fresh clone without Docker running still gets a green run.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 12:09:44 -04:00
3b3e814855 Merge pull request 'OpcUaClient integration fixture � opc-plc in Docker (#215)' (#161) from opcuaclient-opc-plc-fixture into v2 2026-04-20 11:45:24 -04:00
Joseph Doherty
c985c50a96 OpcUaClient integration fixture — opc-plc in Docker closes the wire-level gap (#215). Closes task #215. The OpcUaClient driver had the richest capability matrix in the fleet (reads/writes/subscribe/alarms/history across 11 unit-test classes) + zero wire-level coverage; every test mocked the Session surface. opc-plc is Microsoft Industrial IoT's OPC UA PLC simulator — already containerized, already on MCR, pinned to 2.14.10 here. Wins vs the loopback-against-our-own-server option we'd originally scoped: (a) independent cert chain + user-token handling catches interop bugs loopback can't because both endpoints would share our own cert store; (b) pinned image tag fixes the test surface in a way our evolving server wouldn't; (c) the --alm flag opens the door to real IAlarmSource coverage later without building a custom FakeAlarmDriver. Loss vs loopback: both use the OPCFoundation.NetStandard stack internally so bugs common to that stack don't surface — addressed by a follow-up to add open62541/open62541 as a second independent-stack image (tracked). Docker is the fixture launcher — no PowerShell/Python wrapper like Modbus/pymodbus or S7/python-snap7 because opc-plc ships containerized. Docker/docker-compose.yml pins 2.14.10 + maps port 50000 + command flags --pn=50000 --ut --aa --alm; the healthcheck TCP-probes 50000 so docker ps surfaces ready state. Fixture OpcPlcFixture follows the same shape as Snap7ServerFixture + ModbusSimulatorFixture: collection-scoped, parses OPCUA_SIM_ENDPOINT (default opc.tcp://localhost:50000) into host + port, 2-second TCP probe at init, SkipReason records the failure for Assert.Skip. Forced IPv4 on the probe socket for the same reason those two fixtures do — .NET's dual-stack "localhost" resolves IPv6 ::1 first + hangs the full connect timeout when the target binds 0.0.0.0 (IPv4). OpcPlcProfile holds well-known node identifiers opc-plc exposes (ns=3;s=StepUp, FastUInt1, RandomSignedInt32, AlternatingBoolean) + builds OpcUaClientDriverOptions with SecurityPolicy.None + AutoAcceptCertificates=true since opc-plc regenerates its server cert on every container spin-up + there's no meaningful chain to validate against in CI. Three smoke tests covering what the unit suite couldn't reach: (1) Client_connects_and_reads_StepUp_node_through_real_OPC_UA_stack — full Secure Channel + Session + Read on ns=3;s=StepUp (counter that ticks every 1 s); (2) Client_reads_batch_of_varied_types_from_live_simulator — batch Read of UInt32 / Int32 / Boolean to prove typed Variant decoding, with an explicit ShouldBeOfType<bool> assertion on AlternatingBoolean to catch the common "variant gets stringified" regression; (3) Client_subscribe_receives_StepUp_data_changes_from_live_server — real MonitoredItem subscription on FastUInt1 (100 ms cadence) with a SemaphoreSlim gate + 3 s deadline on the first OnDataChange fire, tolerating container warm-up. Driver ran end-to-end against a live 2.14.10 container: all 3 pass; unit suite 78/78 unchanged. Container lifecycle verified (compose up → tests → compose down) clean, no leaked state. Docker/README.md documents install (Docker Desktop already on the dev box per Phase 1 decision #134), run (compose up / compose up -d / compose down), endpoint override (OPCUA_SIM_ENDPOINT), what opc-plc advertises with the current command flags, what's tunable via compose-file tweaks (--daa for username auth tests; --fn/--fr/--ft for subscription-stress nodes), known limitation that opc-plc shares the OPCFoundation stack with our driver. OpcUaClient-Test-Fixture.md updated — TL;DR flipped from "there is no integration fixture" to the new reality; "What it actually covers" gains an Integration section listing the three smoke tests. Follow-up the doc flags: add open62541/open62541 as a second image for fully-independent-stack interop coverage; once #219 (server-side IAlarmSource/IHistoryProvider integration tests) lands, re-run the client-side suite against opc-plc's --alm nodes to close the alarm gap from the client side too.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 11:43:20 -04:00
820567bc2a Merge pull request 'S7 integration fixture via python-snap7 (#216) + per-driver test-fixture coverage docs' (#160) from s7-integration-fixture into v2 2026-04-20 11:31:14 -04:00
Joseph Doherty
1d3544f18e S7 integration fixture — python-snap7 server closes the wire-level coverage gap (#216) + per-driver fixture coverage docs for every driver in the fleet. Closes #216. Two shipments in one PR because the docs landed as I surveyed each driver's fixture + the S7 work is the first wire-level-gap closer pulled from that survey.
S7 integration — AbCip/Modbus already have real-simulator integration suites; S7 had zero wire-level coverage despite being a Tier-A driver (all unit tests mocked IS7Client). Picked python-snap7's `snap7.server.Server` over raw Snap7 C library because `pip install` beats per-OS binary-pin maintenance, the package ships a Python __main__ shim that mirrors our existing pymodbus serve.ps1 + *.json pattern structurally, and the python-snap7 project is actively maintained. New project `tests/ZB.MOM.WW.OtOpcUa.Driver.S7.IntegrationTests/` with four moving parts: (a) `Snap7ServerFixture` — collection-scoped TCP probe on `localhost:1102` that sets `SkipReason` when the simulator's not running, matching the `ModbusSimulatorFixture` shape one directory over (same S7_SIM_ENDPOINT env var override convention for pointing at a real S7 CPU on port 102); (b) `PythonSnap7/` — `serve.ps1` wrapper + `server.py` shim + `s7_1500.json` seed profile + `README.md` documenting install / run / known limitations; (c) `S7_1500/S7_1500Profile.cs` — driver-side `S7DriverOptions` whose tag addresses map 1:1 to the JSON profile's seed offsets (DB1.DBW0 u16, DB1.DBW10 i16, DB1.DBD20 i32, DB1.DBD30 f32, DB1.DBX50.3 bool, DB1.DBW100 scratch); (d) `S7_1500SmokeTests` — three tests proving typed reads + write-then-read round-trip work through real S7netplus + real ISO-on-TCP + real snap7 server. Picked port 1102 default instead of S7-standard 102 because 102 is privileged on Linux + triggers Windows Firewall prompt; S7netplus 0.20 has a 5-arg `Plc(CpuType, host, port, rack, slot)` ctor that lets the driver honour `S7DriverOptions.Port`, but the existing driver code called the 4-arg overload + silently hardcoded 102. One-line driver fix (S7Driver.cs:87) threads `_options.Port` through — the S7 unit suite (58/58) still passes unchanged because every unit test uses a fake IS7Client that never sees the real ctor. Server seed-type matrix in `server.py` covers u8 / i8 / u16 / i16 / u32 / i32 / f32 / bool-with-bit / ascii (S7 STRING with max_len header). register_area takes the SrvArea enum value, not the string name — a 15-minute debug after the first test run caught that; documented inline.

Per-driver test-fixture coverage docs — eight new files in `docs/drivers/` laying out what each driver's harness actually benchmarks vs. what's trusted from field deployments. Pattern mirrors the AbServer-Test-Fixture.md doc that shipped earlier in this arc: TL;DR → What the fixture is → What it actually covers → What it does NOT cover → When-to-trust table → Follow-up candidates → Key files. Ugly truth the survey made visible: Galaxy + Modbus + (now) S7 + AB CIP have real wire-level coverage; AB Legacy / TwinCAT / FOCAS / OpcUaClient are still contract-only because their libraries ship no fake + no open-source simulator exists (AB Legacy PCCC), no public simulator exists (FOCAS), the vendor SDK has no in-process fake (TwinCAT/ADS.NET), or the test wiring just hasn't happened yet (OpcUaClient could trivially loopback against this repo's own server — flagged as #215). Each doc names the specific follow-up route: Snap7 server for S7 (done), TwinCAT 3 developer-runtime auto-restart for TwinCAT, Tier-C out-of-process Host for FOCAS, lab rigs for AB Legacy + hardware-gated bits of the others. `docs/drivers/README.md` gains a coverage-map section linking all eight. Tracking tasks #215-#222 filed for each PR-able follow-up.

Build clean (driver + integration project + docs); S7.Tests 58/58 (unchanged); S7.IntegrationTests 3/3 (new, verified end-to-end against a live python-snap7 server: `driver_reads_seeded_u16_through_real_S7comm`, `driver_reads_seeded_typed_batch`, `driver_write_then_read_round_trip_on_scratch_word`). Next fixture follow-up is #215 (OpcUaClient loopback against own server) — highest ROI of the remaining set, zero external deps.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 11:29:15 -04:00
4fe96fca9b Merge pull request 'AbCip IAlarmSource via ALMD projection (#177, feature-flagged)' (#159) from abcip-alarm-source into v2 2026-04-20 04:26:39 -04:00
Joseph Doherty
4e80db4844 AbCip IAlarmSource via ALMD projection (#177) — feature-flagged OFF by default; when enabled, polls declared ALMD UDT member fields + raises OnAlarmEvent on 0→1 + 1→0 transitions. Closes task #177. The AB CIP driver now implements IAlarmSource so the generic-driver alarm dispatch path (PR 14's sinks + the Server.Security.AuthorizationGate AlarmSubscribe/AlarmAck invoker wrapping) can treat AB-backed alarms uniformly with Galaxy + OpcUaClient + FOCAS. Projection is ALMD-only in this pass: the Logix ALMD (digital alarm) instruction's UDT shape is well-understood (InFaulted + Acked + Severity + In + Cfg_ProgTime at stable member names) so the polled-read + state-diff pattern fits without concessions. ALMA (analog alarm) deferred to a follow-up because its HHLimit/HLimit/LLimit/LLLimit threshold + In value semantics deserve their own design pass — raising on threshold-crossing is not the same shape as raising on InFaulted-edge. AbCipDriverOptions gains two knobs: EnableAlarmProjection (default false) + AlarmPollInterval (default 1s). Explicit opt-in because projection semantics don't exactly mirror Rockwell FT Alarm & Events; shops running FT Live should leave this off + take alarms through the native A&E route. AbCipAlarmProjection is the state machine: per-subscription background loop polls the source-node set via the driver's public ReadAsync — which gains the #194 whole-UDT optimization for free when ALMDs are declared with their standard member set, so one poll tick reads (N alarms × 2 members) = N libplctag round-trips rather than 2N. Per-tick state diff: compare InFaulted + Severity against last-seen, fire raise (0→1) / clear (1→0) with AlarmSeverity bucketed via the 1-1000 Logix severity scale (≤250 Low, ≤500 Medium, ≤750 High, rest Critical — matches OpcUaClient's MapSeverity shape). ConditionId is {sourceNode}#active — matches a single active-branch per alarm which is all ALMD supports; when Cfg_ProgTime-based branch identity becomes interesting (re-raise after ack with new timestamp), a richer ConditionId pass can land. Subscribe-while-disabled returns a handle wrapping id=0 — capability negotiation (the server queries IAlarmSource presence at driver-load time) still succeeds, the alarm surface just never fires. Unsubscribe cancels the sub's CTS + awaits its loop; ShutdownAsync cancels every sub on its way out so a driver reload doesn't leak poll tasks. AcknowledgeAsync routes through the driver's existing WriteAsync path — per-ack writes {SourceNodeId}.Acked = true (the simpler semantic; operators whose ladder watches AckCmd + rising-edge can wire a client-side pulse until a driver-level edge-mode knob lands). Best-effort — per-ack faults are swallowed so one bad ack doesn't poison the whole batch. Six new AbCipAlarmProjectionTests: detector flags ALMD signature + skips non-signature UDTs + atomics; severity mapping matches OPC UA A&C bucket boundaries; feature-flag OFF returns a handle but never touches the fake runtime (proving no background polling happens); feature-flag ON fires a raise event on 0→1; clear event fires on 1→0 after a prior raise; unsubscribe stops the poll loop (ReadCount doesn't grow past cancel + at most one straggler read). Driver builds 0 errors; AbCip.Tests 233/233 (was 227, +6 new). Task #177 closed — the last pending AB CIP follow-up is now #194 (already shipped). Remaining pending fleet-wide: #150 (Galaxy MXAccess failover hardware) + #199 (UnsTab Playwright smoke).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 04:24:40 -04:00
f6d5763448 Merge pull request 'AbCip whole-UDT read optimization (#194)' (#158) from abcip-whole-udt-read into v2 2026-04-20 04:19:51 -04:00
Joseph Doherty
780358c790 AbCip whole-UDT read optimization (#194) — declaration-driven member grouping collapses N per-member reads into one parent-UDT read + client-side decode. Closes task #194. On a batch that includes multiple members of the same hand-declared UDT tag, ReadAsync now issues one libplctag read on the parent + decodes each member from the runtime's buffer at its computed byte offset. A 6-member Motor UDT read goes from 6 libplctag round-trips to 1 — the Rockwell-suggested pattern for minimizing CIP request overhead on batch reads of UDT state (decision #11's follow-through on what the template decoder from task #179 was meant to enable). AbCipUdtMemberLayout is a pure-function helper that computes declared-member byte offsets under Logix natural-alignment rules (SInt 1-byte / Int 2-byte / DInt + Real + Dt 4-byte / LInt + ULInt + LReal 8-byte; alignment pad inserted before each member as needed). Opts out for BOOL / String / Structure members — BOOL storage in Logix UDTs packs into a hidden host byte whose position can't be computed from declaration-only info, and String members need length-prefix + STRING[82] fan-out which libplctag already handles via a per-tag DecodeValue path. The CIP Template Object shape from task #179 (when populated via FetchUdtShapeAsync) carries real offsets for those members — layering that richer path on top of the planner is a separate follow-up and does not change this PR's conservative behaviour. AbCipUdtReadPlanner is the scheduling function ReadAsync consults each batch — pure over (requests, tagsByName), emits Groups + Fallbacks. A group is formed when (a) the reference resolves to "parent.member"; (b) parent is a Structure tag with declared Members; (c) the layout helper succeeds on those members; (d) the specific member appears in the computed offset map; (e) at least two members of the same parent appear in the batch — single-member groups demote to the fallback path because one whole-UDT read vs one per-member read is equivalent cost but more client-side work. Original batch indices are preserved through the plan so out-of-order batches write decoded values back at the right output slot; the caller's result array order is invariant. IAbCipTagRuntime.DecodeValueAt(AbCipDataType, int offset, int? bitIndex) is the new hot-path method — LibplctagTagRuntime delegates to libplctag's offset-aware Get*(offset) calls (GetInt32, GetFloat32, etc.) that were always there; previously every call passed offset 0. DecodeValue(type, bitIndex) stays as the shorthand + forwards to DecodeValueAt with offset 0, preserving the existing single-tag read path + every test that exercises it. FakeAbCipTag gains a ValuesByOffset dictionary so tests can drive multi-member decoding by setting offset→value before the read fires; unmapped offsets fall back to the existing Value field so the 200+ existing tests that never set ValuesByOffset keep working unchanged. AbCipDriver.ReadAsync refactored: planner splits the batch, ReadGroupAsync handles each UDT group (one EnsureTagRuntimeAsync on the parent + one ReadAsync + N DecodeValueAt calls), ReadSingleAsync handles each fallback (the pre-#194 per-tag path, now extracted + threaded through). A per-group failure stamps the mapped libplctag status across every grouped member only — sibling groups + fallback refs are unaffected. Health-surface updates happen once per successful group rather than once per member to avoid ping-ponging the DriverState bookkeeping. Five AbCipUdtMemberLayoutTests: packed atomics get natural-alignment offsets including 8-byte pad before LInt; SInts pack without padding; BOOL/String/Structure opt out + return null; empty member list returns null. Six AbCipUdtReadPlannerTests: two members group; single-member demotes to fallback; unknown references fall back without poisoning groups; atomic top-level tags fall back untouched; UDTs containing BOOL don't group; original indices survive out-of-order batches. Five AbCipDriverWholeUdtReadTests (real driver + fake runtime): two grouped members trigger exactly one parent read + one fake runtime (proving the optimization engages); each member decodes at its own offset via ValuesByOffset; parent-read non-zero status stamps Bad across the group; mixed UDT-member + atomic top-level batch produces 2 runtimes + 2 reads (not 3); single-member-of-UDT still uses the member-level runtime (proving demotion works). Driver builds 0 errors; AbCip.Tests 227/227 (was 211, +16 new).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 04:17:57 -04:00
1ac87f1fac Merge pull request 'ADR-001 last-mile � Program.cs composes walker into production boot (#214)' (#157) from equipment-content-wiring into v2 2026-04-20 03:52:30 -04:00
Joseph Doherty
432173c5c4 ADR-001 last-mile — Program.cs composes EquipmentNodeWalker into the production boot path. Closes task #214 + fully lands ADR-001 Option A as a live code path, not just a connected set of unit-tested primitives. After this PR a server booted against a real Config DB with Published Equipment rows materializes the UNS tree into the OPC UA address space on startup — the whole walker → wire-in → loader chain (PRs #153, #154, #155, #156) finally fires end-to-end in the production process. DriverEquipmentContentRegistry is the handoff between OpcUaServerService's bootstrap-time populate pass + OpcUaApplicationHost's StartAsync walker invocation. It's a singleton mutable holder with Get/Set/Count + Lock-guarded internal dictionary keyed OrdinalIgnoreCase to match the DriverInstanceId convention used by Equipment / Tag rows + walker grouping. Set-once-per-bootstrap semantics in practice though nothing enforces that at the type level — OpcUaServerService.PopulateEquipmentContentAsync is the only expected writer. Shared-mutable rather than immutable-passed-by-value because the DI graph builds OpcUaApplicationHost before NodeBootstrap has resolved the generation, so the registry must exist at compose time + fill at boot time. Program.cs now registers OpcUaApplicationHost via a factory lambda that threads registry.Get as the equipmentContentLookup delegate PR #155 added to the ctor seam — the one-line composition the earlier PR promised. EquipmentNamespaceContentLoader (from PR #156) is AddScoped since it takes the scoped OtOpcUaConfigDbContext; the populate pass in OpcUaServerService opens one IServiceScopeFactory scope + reuses the same loader + DbContext across every driver query rather than scoping-per-driver. OpcUaServerService.ExecuteAsync gets a new PopulateEquipmentContentAsync step between bootstrap + StartAsync: iterates DriverHost.RegisteredDriverIds, calls loader.LoadAsync per driver at the bootstrapped generationId, stashes non-null results in the registry. Null results are skipped — the wire-in's null-check treats absent registry entries as "this driver isn't Equipment-kind; let DiscoverAsync own the address space" which is the correct backward-compat path for Modbus / AB CIP / TwinCAT / FOCAS. Guarded on result.GenerationId being non-null — a fleet with no Published generation yet boots cleanly into a UNS-less address space and fills the registry on the next restart after first publish. Ctor on OpcUaServerService gained two new dependencies (DriverEquipmentContentRegistry + IServiceScopeFactory). No test file constructs OpcUaServerService directly so no downstream test breakage — the BackgroundService is only wired via DI in Program.cs. Four new DriverEquipmentContentRegistryTests: Get-null-for-unknown, Set-then-Get, case-insensitive driver-id lookup, Set-overwrites-existing. Server.Tests 190/190 (was 186, +4 new registry tests). Full ADR-001 Option A now lives at every layer: Core.OpcUa walker (#153) → ScopePathIndexBuilder (#154) → OpcUaApplicationHost wire-in (#155) → EquipmentNamespaceContentLoader (#156) → this PR's registry + Program.cs composition. The last pending loose end (full-integration smoke test that boots Program.cs against a seeded Config DB + verifies UNS tree via live OPC UA client) isn't strictly necessary because PR #155's OpcUaEquipmentWalkerIntegrationTests already proves the wire-in at the OPC UA client-browse level — the Program.cs composition added here is purely mechanical + well-covered by the four-file audit trail plus registry unit tests.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 03:50:37 -04:00
f6d98cfa6b Merge pull request 'EquipmentNamespaceContentLoader � Config-DB loader for walker wire-in' (#156) from equipment-content-loader into v2 2026-04-20 03:21:48 -04:00
Joseph Doherty
a29828e41e EquipmentNamespaceContentLoader — Config-DB loader that fills the (driverInstanceId, generationId) shape the walker wire-in from PR #155 consumes. Narrow follow-up to PR #155: the ctor plumbing on OpcUaApplicationHost already takes a Func<string, EquipmentNamespaceContent?>? lookup; this PR lands the loader that will back that lookup against the central Config DB at SealedBootstrap time. DI composition in Program.cs is a separate structural PR because it needs the generation-resolve chain restructured to run before OpcUaApplicationHost construction — this one just lands the loader + unit tests so the wiring PR reduces to one factory lambda. Loader scope is one driver instance at one generation: joins Equipment filtered by (DriverInstanceId == driver, GenerationId == gen, Enabled) first, then UnsLines reachable from those Equipment rows, then UnsAreas reachable from those lines, then Tags filtered by (DriverInstanceId == driver, GenerationId == gen). Returns null when the driver has no Equipment at the supplied generation — the wire-in's null-check treats that as "skip the walker; let DiscoverAsync own the whole address space" which is the correct backward-compat behavior for non-Equipment-kind drivers (Modbus / AB CIP / TwinCAT / FOCAS whose namespace-kind is native per decisions #116-#121). Only loads the UNS branches that actually host this driver's Equipment — skips pulling unrelated UNS folders from other drivers' regions of the cluster by deriving lineIds/areaIds from the filtered Equipment set rather than reloading the full UNS tree. Enabled=false Equipment are skipped at the query level so a decommissioned machine doesn't produce a phantom browse folder — Admin still sees it in the diff view via the regular Config-DB queries but the walker's browse output reflects the operational fleet. AsNoTracking on every query because the bootstrap flow is read-only + the result is handed off to a pure-function walker immediately; change tracking would pin rows in the DbContext for the full server lifetime with no corresponding write path. Five new EquipmentNamespaceContentLoaderTests using InMemoryDatabase: (a) null result when driver has no Equipment; (b) baseline happy-path loads the full shape correctly; (c) other driver's rows at the same generation don't leak into this driver's result (per-driver scope contract); (d) same-driver rows at a different generation are skipped (per-generation scope contract per decision #148); (e) Enabled=false Equipment are skipped. Server project builds 0 errors; Server.Tests 186/186 (was 181, +5 new loader tests). Once the wiring PR lands the factory lambda in Program.cs the loader closes over the SealedBootstrap-resolved generationId + the lookup delegate delegates to LoadAsync via IServiceScopeFactory — a one-line composition, no ctor-signature churn on OpcUaApplicationHost because PR #155 already established the seam.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 03:19:45 -04:00
f5076b4cdd Merge pull request 'ADR-001 wire-in � EquipmentNodeWalker in OpcUaApplicationHost (#212 + #213)' (#155) from equipment-walker-wire-in into v2 2026-04-20 03:11:35 -04:00
Joseph Doherty
2d97f241c0 ADR-001 wire-in — EquipmentNodeWalker runs inside OpcUaApplicationHost before driver DiscoverAsync, closing tasks #212 + #213. Completes the in-server half of the ADR-001 Option A story: Task A (PR #153) shipped the pure-function walker in Core.OpcUa; Task B (PR #154) shipped the NodeScopeResolver + ScopePathIndexBuilder + evaluator-level authz proof. This PR lands the BuildAddressSpaceAsync wire-in the walker was always meant to plug into + a full-stack OPC UA client-browse integration test that proves the UNS folder skeleton is actually visible to real UA clients end-to-end, not just to the RecordingBuilder test double. OpcUaApplicationHost gains an optional ctor parameter equipmentContentLookup of type Func<string, EquipmentNamespaceContent?>? — when supplied + non-null for a driver instance, EquipmentNodeWalker.Walk is invoked against that driver's node manager BEFORE GenericDriverNodeManager.BuildAddressSpaceAsync streams the driver's native DiscoverAsync output on top. Walker-first ordering matters: the UNS Area/Line/Equipment folder skeleton + Identification sub-folders + the five identifier properties (decision #121) are in place so driver-native references (driver-specific tag paths) land ALONGSIDE the UNS tree rather than racing it. Callers that don't supply a lookup (every existing pre-ADR-001 test + the v1 upgrade path) get identical behavior — the null-check is the backward-compat seam per the opt-in design sketched in ADR-001. The lookup delegate is driver-instance-scoped, not server-scoped, so a single server with multiple drivers can serve e.g. one Equipment-kind namespace (Galaxy proxy with a full UNS) alongside several native-kind namespaces (Modbus / AB CIP / TwinCAT / FOCAS that do not have their own UNS because decisions #116-#121 scope UNS to Equipment-kind only). SealedBootstrap.Start will wire this lookup against the Config-DB snapshot loader in a follow-up — the lookup plumbing lands first so that wiring reduces to one-line composition rather than a ctor-signature churn. New OpcUaEquipmentWalkerIntegrationTests spins up a real OtOpcUaServer on a non-default port with an EmptyDriver that registers with zero native content + a lookup that returns a seeded EquipmentNamespaceContent (one area warsaw / one line line-a / one equipment oven-3 / one tag Temperature). An OPC UA client session connects anonymously against the un-secured endpoint, browses the standard hierarchy, + asserts: (a) area folder warsaw contains line-a folder as a child; (b) line folder line-a contains oven-3 folder as a child; (c) equipment folder oven-3 contains EquipmentId + EquipmentUuid + MachineCode identifier properties — ZTag + SAPID correctly absent because the fixture leaves them null per decision #121 skip-when-null behavior; (d) the bound Tag emits a Variable node under the equipment folder with NodeId == Tag.TagConfig (the wire-level driver address) + the client can ReadValue against it end-to-end through the DriverNodeManager dispatch path. Because the EmptyDriver's DiscoverAsync is a no-op the test proves UNS content came from the walker, not the driver — the original ADR-001 question "what actually owns the browse tree" now has a mechanical answer visible at the OPC UA wire level. Test class uses its own port (48500+rand) + per-test PKI root so it runs in parallel with the existing OpcUaServerIntegrationTests fixture (48400+rand) without binding or cert collisions. Server project builds 0 errors; Server.Tests 181/181 (was 179, +2 new full-stack walker tests). Task #212 + #213 closed; the follow-up SealedBootstrap wiring is the natural next pickup because the ctor plumbing lands here + that becomes a narrow downstream PR.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 03:09:37 -04:00
5811ede744 Merge pull request (#154) - ADR-001 Task B + #195 close-out 2026-04-20 02:52:26 -04:00
Joseph Doherty
1bf3938cdf ADR-001 Task B — NodeScopeResolver full-path + ScopePathIndexBuilder + evaluator-level ACL test closing #195. Two production additions + one end-to-end authz regression test proving the Identification ACL contract the IdentificationFolderBuilder docstring promises. Task A (PR #153) shipped the walker as a pure function that materializes the UNS → Equipment → Tag browse tree + IdentificationFolderBuilder.Build per Equipment. This PR lands the authz half of the walker's story — the resolver side that turns a driver-side full reference into a full NodeScope path (NamespaceId + UnsAreaId + UnsLineId + EquipmentId + TagId) so the permission trie can walk the UNS hierarchy + apply Equipment-scope grants correctly at dispatch time. The actual in-server wiring (load snapshot → call walker during BuildAddressSpaceAsync → swap in the full-path resolver) is split into follow-up task #212 because it's a bigger surface (Server bootstrap + DriverNodeManager override + real OPC UA client-browse integration test). NodeScopeResolver extended with a second constructor taking IReadOnlyDictionary<string, NodeScope> pathIndex — when supplied, Resolve looks up the full reference in the index + returns the indexed scope with every UNS level populated; when absent or on miss, falls back to the pre-ADR-001 cluster-only scope so driver-discovered tags that haven't been indexed yet (between a DiscoverAsync result + the next generation publish) stay addressable without crashing the resolver. Index is frozen into a FrozenDictionary<string, NodeScope> under Ordinal comparer for O(1) hot-path lookups. Thread-safety by immutability — callers swap atomically on generation change via the server's publish pipeline. New ScopePathIndexBuilder.Build in Server.Security takes (clusterId, namespaceId, EquipmentNamespaceContent) + produces the fullReference → NodeScope dictionary by joining Tag → Equipment → UnsLine → UnsArea through up-front dictionaries keyed Ordinal-ignoring-case. Tag rows with null EquipmentId (SystemPlatform-namespace Galaxy tags per decision #120) are excluded from the index; cluster-only fallback path covers them. Broken FKs (Tag references missing Equipment row, or Equipment references missing UnsLine) are skipped rather than crashing — sp_ValidateDraft should have caught these at publish, any drift here is unexpected but non-fatal. Duplicate keys throw InvalidOperationException at bootstrap so corrupt-data drift surfaces up-front instead of producing silently-last-wins scopes at dispatch. End-to-end authz regression test in EquipmentIdentificationAuthzTests walks the full dispatch flow against a Config-DB-style fixture: ScopePathIndexBuilder.Build from the same EquipmentNamespaceContent the EquipmentNodeWalker consumes → NodeScopeResolver with that index → AuthorizationGate + TriePermissionEvaluator → PermissionTrieBuilder with one Equipment-scope NodeAcl grant + a NodeAclPath resolving Equipment ScopeId to (namespace, area, line, equipment). Four tests prove the contract: (a) authorized group Read granted on Identification property; (b) unauthorized group Read denied on Identification property — the #195 contract the IdentificationFolderBuilder docstring promises (the BadUserAccessDenied surfacing happens at the DriverNodeManager dispatch layer which is already wired to AuthorizationGate.IsAllowed → StatusCodes.BadUserAccessDenied in PR #94); (c) Equipment-scope grant cascades to both the Equipment's tag + its Identification properties because they share the Equipment ScopeId — no new scope level for Identification per the builder's Remarks section; (d) grant on oven-3 does NOT leak to press-7 (different equipment under the same UnsLine) proving per-Equipment isolation at dispatch when the resolver populates the full path. NodeScopeResolverTests extended with two new tests covering the indexed-lookup path + fallback-on-miss path; renamed the existing "_For_Phase1" test to "_When_NoIndexSupplied" to match the current framing. Server project builds 0 errors; Server.Tests 179/179 (was 173, +6 new across the two test files). Task #212 captures the remaining in-server wiring work — Server.SealedBootstrap load of EquipmentNamespaceContent, DriverNodeManager override that calls EquipmentNodeWalker during BuildAddressSpaceAsync for Equipment-kind namespaces, and a real OPC UA client-browse integration test. With that wiring + this PR's authz-layer proof, #195's "ACL integration test" line is satisfied at two layers (evaluator + live endpoint) which is stronger than the task originally asked for.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 02:50:27 -04:00
7a42f6d84c Merge pull request (#153) - EquipmentNodeWalker (ADR-001 Task A) 2026-04-20 02:40:58 -04:00