Commit Graph

465 Commits

Author SHA1 Message Date
4e41f196b2 Merge pull request 'Phase 7 Stream G follow-up — DriverNodeManager dispatch routing by NodeSourceKind' (#186) from phase-7-stream-g-followup-dispatch into v2 2026-04-20 20:14:25 -04:00
Joseph Doherty
f0851af6b5 Phase 7 Stream G follow-up — DriverNodeManager dispatch routing by NodeSourceKind
Honors the ADR-002 discriminator at OPC UA Read/Write dispatch time. Virtual tag
reads route to the VirtualTagEngine-backed IReadable; scripted alarm reads route
to the ScriptedAlarmEngine-backed IReadable; driver reads continue to route to the
driver's own IReadable (no regression for any existing driver test).

## Changes

DriverNodeManager ctor gains optional `virtualReadable` + `scriptedAlarmReadable`
parameters. When callers omit them (every existing driver test) the manager behaves
exactly as before. SealedBootstrap wires the engines' IReadable adapters once the
Phase 7 composition root is added.

Per-variable NodeSourceKind tracked in `_sourceByFullRef` during Variable() registration
alongside the existing `_writeIdempotentByFullRef` / `_securityByFullRef` maps.

OnReadValue now picks the IReadable by source kind via the new internal
SelectReadable helper. When the engine-backed IReadable isn't wired (virtual tag
node but no engine provided), returns BadNotFound rather than silently falling
back to the driver — surfaces a misconfiguration instead of masking it.

OnWriteValue gates on IsWriteAllowedBySource which returns true only for Driver.
Plan decision #6: virtual tags + scripted alarms reject direct OPC UA writes with
BadUserAccessDenied. Scripts write virtual tags via `ctx.SetVirtualTag`; operators
ack alarms via the Part 9 method nodes.

## Tests — 7/7 (internal helpers exposed via InternalsVisibleTo)

DriverNodeManagerSourceDispatchTests covers:
- Driver source routes to driver IReadable
- Virtual source routes to virtual IReadable
- ScriptedAlarm source routes to alarm IReadable
- Virtual source with null virtual IReadable returns null (→ BadNotFound)
- ScriptedAlarm source with null alarm IReadable returns null
- Driver source with null driver IReadable returns null (preserves BadNotReadable)
- IsWriteAllowedBySource: only Driver=true (Virtual=false, ScriptedAlarm=false)

Full solution builds clean. Phase 7 test total now 197 green.
2026-04-20 20:12:17 -04:00
6df069b083 Merge pull request 'Phase 7 Stream F — Admin UI for scripts + test harness + historian diagnostics' (#185) from phase-7-stream-f-admin-ui into v2 2026-04-20 20:01:40 -04:00
Joseph Doherty
0687bb2e2d Phase 7 Stream F — Admin UI for scripts + test harness + historian diagnostics
Adds the draft-editor tab + page surface for authoring Phase 7 virtual tags and
scripted alarms, plus the /alarms/historian operator diagnostics page. Monaco loads
from CDN via a progressive-enhancement JS shim — the textarea works immediately so
the page is functional even if the CDN is unreachable.

## New services (Admin)

- ScriptService — CRUD for Script entity. SHA-256 SourceHash recomputed on save so
  Core.Scripting's CompiledScriptCache hits on re-publish of unchanged source + misses
  when the source actually changes.
- VirtualTagService — CRUD for VirtualTag, with Enabled toggle.
- ScriptedAlarmService — CRUD for ScriptedAlarm + lookup of persistent ScriptedAlarmState
  (logical-id-keyed per plan decision #14).
- ScriptTestHarnessService — pre-publish dry-run. Enforces plan decision #22: only
  inputs the DependencyExtractor identifies can be supplied. Missing / extra synthetic
  inputs surface as dedicated outcomes. Captures SetVirtualTag writes + Serilog events
  from the script so the operator can see both the output + the log output before
  publishing.
- HistorianDiagnosticsService — surfaces the local-process IAlarmHistorianSink state
  on /alarms/historian. Null sink reports Disabled + swallows retry. Live
  SqliteStoreAndForwardSink reports real queue depth + last-error + drain state and
  routes the Retry-dead-lettered button through.

## New UI

- ScriptsTab.razor (inside DraftEditor tabs) — list + create/edit/delete scripts with
  Monaco editor + dependency preview + test-harness run panel showing output + writes
  + log emissions.
- ScriptEditor.razor — reusable Monaco-backed textarea. Loads editor from CDN via
  wwwroot/js/monaco-loader.js. Textarea stays authoritative for Blazor binding; Monaco
  mirrors into it on every keystroke.
- AlarmsHistorian.razor (/alarms/historian) — queue depth + dead-letter depth + drain
  state badge + last-error banner + Retry-dead-lettered button.
- DraftEditor.razor — new "Scripts" tab.

## DI wiring

All five services registered in Program.cs. Null historian sink bound at Admin
composition time (real SqliteStoreAndForwardSink lives in the Server process).

## Tests — 13/13

Phase7ServicesTests covers:
- ScriptService: Add generates logical id + hash, Update recomputes hash on source
  change, Update same-source keeps hash (cache-hit preservation), Delete is idempotent
- VirtualTagService: round-trips trigger flags, Enabled toggle works
- ScriptedAlarmService: HistorizeToAveva defaults true per plan decision #15
- ScriptTestHarness: successful run captures output + writes, rejects missing /
  extra inputs, rejects non-literal paths, compile errors surface as Threw
- HistorianDiagnosticsService: null sink reports Disabled + retry returns 0
2026-04-20 19:59:18 -04:00
4d4f08af0d Merge pull request 'Phase 7 Stream G — Address-space integration (NodeSourceKind + walker emits VirtualTag/ScriptedAlarm)' (#184) from phase-7-stream-g-addressspace-integration into v2 2026-04-20 19:43:08 -04:00
Joseph Doherty
f1f53e1789 Phase 7 Stream G — Address-space integration (NodeSourceKind + walker emits VirtualTag/ScriptedAlarm)
Per ADR-002, adds the Driver/Virtual/ScriptedAlarm discriminator to DriverAttributeInfo
so the DriverNodeManager's dispatch layer can route Read/Write/Subscribe to the right
runtime subsystem — drivers (unchanged), VirtualTagEngine (Phase 7 Stream B), or
ScriptedAlarmEngine (Phase 7 Stream C).

## Changes
- NodeSourceKind enum added to Core.Abstractions (Driver=0/Virtual=1/ScriptedAlarm=2).
- DriverAttributeInfo gains Source / VirtualTagId / ScriptedAlarmId parameters — all
  default so existing call sites (every driver) compile unchanged.
- EquipmentNamespaceContent gains VirtualTags + ScriptedAlarms optional collections.
- EquipmentNodeWalker emits:
  - Virtual-tag variables — Source=Virtual, VirtualTagId set, Historize flag honored
  - Scripted-alarm variables — Source=ScriptedAlarm, ScriptedAlarmId set, IsAlarm=true
    (triggers node-manager AlarmConditionState materialization)
  - Skips disabled virtual tags + scripted alarms

## Tests — 13/13 in EquipmentNodeWalkerTests (5 new)
- Virtual-tag variables carry Source=Virtual + VirtualTagId + Historize flag
- Scripted-alarm variables carry Source=ScriptedAlarm + IsAlarm=true + Boolean type
- Disabled rows skipped
- Null VirtualTags/ScriptedAlarms collections safe (back-compat for non-Phase-7 callers)
- Driver tags default Source=Driver (ensures no discriminator regression)

## Next
Stream G follow-up: DriverNodeManager dispatch (Read/Write/Subscribe routing by
NodeSourceKind), SealedBootstrap wiring of VirtualTagEngine + ScriptedAlarmEngine,
end-to-end integration test.
2026-04-20 19:41:01 -04:00
e97db2d108 Merge pull request 'Phase 7 Stream E — Config DB schema for scripts, virtual tags, scripted alarms, and alarm state' (#183) from phase-7-stream-e-config-db into v2 2026-04-20 19:24:53 -04:00
Joseph Doherty
be1003c53e Phase 7 Stream E — Config DB schema for scripts, virtual tags, scripted alarms, and alarm state
Adds the four tables Streams B/C/F consume — Script (generation-scoped source code),
VirtualTag (generation-scoped calculated-tag config), ScriptedAlarm (generation-scoped
alarm config), and ScriptedAlarmState (logical-id-keyed persistent runtime state).

## New entities (net10, EF Core)

- Script — stable logical ScriptId carries across generations; SourceHash is the
  compile-cache key (matches Core.Scripting's CompiledScriptCache).
- VirtualTag — mandatory EquipmentId FK (plan decision #2, unified Equipment tree);
  ChangeTriggered/TimerIntervalMs + Historize flags; check constraints enforce
  "at least one trigger" + "timer >= 50ms".
- ScriptedAlarm — required AlarmType ('AlarmCondition'/'LimitAlarm'/'OffNormalAlarm'/
  'DiscreteAlarm'); Severity 1..1000 range check; HistorizeToAveva default true per
  plan decision #15.
- ScriptedAlarmState — keyed ONLY on ScriptedAlarmId (NOT generation-scoped) per plan
  decision #14 — ack state + audit trail must follow alarm identity across Modified
  generations. CommentsJson has ISJSON check for GxP audit.

## Migration

EF-generated 20260420231641_AddPhase7ScriptingTables covers all 4 tables + indexes +
check constraints + FKs to ConfigGeneration. sp_PublishGeneration required no changes —
it only flips Draft->Published status; the new entities already carry GenerationId so
they publish atomically with the rest of the config.

## Tests — 12/12 (design-time model introspection)

Phase7ScriptingEntitiesTests covers: table registration, column maxlength + column
types, unique indexes (Generation+LogicalId, Generation+EquipmentPath for VirtualTag
and ScriptedAlarm), secondary indexes (SourceHash for cache lookup), check constraints
(trigger-required, timer-min, severity-range, alarm-type-enum, CommentsJson-IsJson),
ScriptedAlarmState PK is alarm-id not generation-scoped, ScriptedAlarm defaults
(HistorizeToAveva=true, Retain=true, Severity=500, Enabled=true), DbSets wired, and
the generated migration type exists for rollforward.
2026-04-20 19:22:45 -04:00
dccaa11510 Merge pull request 'Phase 7 Stream D — Historian alarm sink (SQLite store-and-forward + Galaxy.Host IPC contracts)' (#182) from phase-7-stream-d-alarm-historian into v2 2026-04-20 19:14:01 -04:00
Joseph Doherty
25ad4b1929 Phase 7 Stream D — Historian alarm sink (SQLite store-and-forward + Galaxy.Host IPC contracts)
Phase 7 plan decisions #16, #17, #19, #21 implementation. Durable local SQLite queue
absorbs every qualifying alarm event; drain worker forwards batches to Galaxy.Host
(reusing the already-loaded 32-bit aahClientManaged DLLs) on an exponential-backoff
cadence; operator acks never block on the historian being reachable.

## New project Core.AlarmHistorian (net10)

- AlarmHistorianEvent — source-agnostic event shape (scripted alarms + Galaxy-native +
  AB CIP ALMD + any future IAlarmSource)
- IAlarmHistorianSink / NullAlarmHistorianSink — interface + disabled default
- IAlarmHistorianWriter — per-event outcome (Ack / RetryPlease / PermanentFail); Stream G
  wires the Galaxy.Host IPC client implementation
- SqliteStoreAndForwardSink — full implementation:
  - Queue table with AttemptCount / LastError / DeadLettered columns
  - DrainOnceAsync serialised via SemaphoreSlim
  - BackoffLadder 1s → 2s → 5s → 15s → 60s (cap)
  - DefaultCapacity 1,000,000 rows — overflow evicts oldest non-dead-lettered
  - DefaultDeadLetterRetention 30 days — sweeper purges on every drain tick
  - RetryDeadLettered operator action reattaches dead-letters to the regular queue
  - Writer-side exceptions treated as whole-batch RetryPlease (no data loss)

## New IPC contracts in Driver.Galaxy.Shared

- HistorianAlarmEventRequest — batched up to 100 events/request per plan Stream D.5
- HistorianAlarmEventResponse — per-event outcome (1:1 with request order)
- HistorianAlarmEventOutcomeDto enum (byte on the wire — Ack/RetryPlease/PermanentFail)
- HistorianAlarmEventDto — mirrors Core.AlarmHistorian.AlarmHistorianEvent
- HistorianConnectivityStatusNotification — Host pushes proactively when the SDK
  session drops so /alarms/historian flips red without waiting for the next drain
- MessageKind additions: 0x80 HistorianAlarmEventRequest / 0x81 HistorianAlarmEventResponse
  / 0x82 HistorianConnectivityStatus

## Tests — 14/14

SqliteStoreAndForwardSinkTests covers: enqueue→drain→Ack round-trip, empty-queue no-op,
RetryPlease bumps backoff + keeps row, Ack after Retry resets backoff, PermanentFail
dead-letters one row without blocking neighbors, writer exception treated as whole-batch
retry with error surfaced in status, capacity eviction drops oldest non-dead-lettered,
dead-letters purged past retention window, RetryDeadLettered requeues, ladder caps at
60s after 10 retries, Null sink reports Disabled status, null sink swallows enqueue,
ctor argument validation, disposed sink rejects enqueue.

## Totals
Full Phase 7 tests: 160 green (63 Scripting + 36 VirtualTags + 47 ScriptedAlarms +
14 AlarmHistorian). Stream G wires this into the real Galaxy.Host IPC pipe.
2026-04-20 19:11:17 -04:00
51d0b27bfd Merge pull request 'Phase 7 Stream C — Core.ScriptedAlarms (Part 9 state machine + predicate engine + IAlarmSource)' (#181) from phase-7-stream-c-scripted-alarms into v2 2026-04-20 18:52:11 -04:00
Joseph Doherty
df39809526 Phase 7 Stream C — Core.ScriptedAlarms project (Part 9 state machine + predicate engine + IAlarmSource adapter)
Ships the Part 9 alarm fidelity layer Phase 7 committed to in plan decision #5. Every scripted alarm gets a full OPC UA AlarmConditionType state machine — EnabledState, ActiveState, AckedState, ConfirmedState, ShelvingState — with persistent operator-supplied state across server restarts per Phase 7 plan decision #14. Runtime shape matches the Galaxy-native + AB CIP ALMD alarm sources: scripted alarms fan out through the existing IAlarmSource surface so Phase 6.1 AlarmTracker composition consumes them without per-source branching.

Part9StateMachine is a pure-functions module — no instance state, no I/O, no mutation. Every transition (ApplyPredicate, ApplyAcknowledge, ApplyConfirm, ApplyOneShotShelve, ApplyTimedShelve, ApplyUnshelve, ApplyEnable, ApplyDisable, ApplyAddComment, ApplyShelvingCheck) takes the current AlarmConditionState record plus the event and returns a fresh state + EmissionKind hint. Two structural invariants enforced: disabled alarms never transition ActiveState / AckedState / ConfirmedState; shelved alarms still advance state (so startup recovery reflects reality) but emit a Suppressed hint so subscribers do not see the transition. OneShot shelving expires on clear; Timed shelving expires via ApplyShelvingCheck against the UnshelveAtUtc timestamp. Comments are append-only — every acknowledge, confirm, shelve, unshelve, enable, disable, explicit add-comment, and auto-unshelve appends an AlarmComment record with user identity + timestamp + kind + text for the GxP / 21 CFR Part 11 audit surface.

AlarmConditionState is the persistent record the store saves. Fields: AlarmId, Enabled, Active, Acked, Confirmed, Shelving (kind + UnshelveAtUtc), LastTransitionUtc, LastActiveUtc, LastClearedUtc, LastAckUtc + LastAckUser + LastAckComment, LastConfirmUtc + LastConfirmUser + LastConfirmComment, Comments. Fresh factory initializes everything to the no-event position.

IAlarmStateStore is the persistence abstraction — LoadAsync, LoadAllAsync, SaveAsync, RemoveAsync. Stream E wires this to a SQL-backed store with IAuditLogger hooks; tests use InMemoryAlarmStateStore. Startup recovery per Phase 7 plan decision #14: LoadAsync runs every configured alarm predicate against current tag values to rederive ActiveState, but EnabledState / AckedState / ConfirmedState / ShelvingState + audit history are loaded verbatim from the store so operators do not re-ack after an outage and shelved alarms stay shelved through maintenance windows.

MessageTemplate implements Phase 7 plan decision #13 — static-with-substitution. {TagPath} tokens resolved at event emission time from the engine value cache. Missing paths, non-Good quality, or null values all resolve to {?} so the event still fires but the operator sees where the reference broke. ExtractTokenPaths enumerates tokens at publish time so the engine knows to subscribe to every template-referenced tag in addition to predicate-referenced tags.

AlarmPredicateContext is the ScriptContext subclass alarm scripts see. GetTag reads from the engine shared cache; SetVirtualTag is explicitly rejected at runtime with a pointed error message — alarm predicates must be pure so their output does not couple to virtual-tag state in ways that become impossible to reason about. If cross-tag side effects are needed, the operator authors a virtual tag and the alarm predicate reads it.

ScriptedAlarmEngine orchestrates. LoadAsync compiles every predicate through Stream A ScriptSandbox + ForbiddenTypeAnalyzer, runs DependencyExtractor to find the read set, adds template token paths to the input set, reports every compile failure as one aggregated InvalidOperationException (not one-at-a-time), subscribes to each unique referenced upstream path, seeds the value cache, loads persisted state for each alarm (falling back to Fresh for first-load), re-evaluates the predicate, and saves the recovered state. ChangeTrigger — when an upstream tag changes, look up every alarm referencing that path in a per-path inverse index, enqueue all of them for re-evaluation via a SemaphoreSlim-gated path. Unlike the virtual-tag engine, scripted alarms are leaves in the evaluation DAG (no alarm drives another alarm), so no topological sort is needed. Operator actions (AcknowledgeAsync, ConfirmAsync, OneShotShelveAsync, TimedShelveAsync, UnshelveAsync, EnableAsync, DisableAsync, AddCommentAsync) route through the state machine, persist, and emit if there is an emission. A 5-second shelving-check timer auto-expires Timed shelving and emits Unshelved events at the right moment. Predicate evaluation errors (script throws, timeout, compile-time reads bad tag) leave the state unchanged — the engine does NOT invent a clear transition on predicate failure. Logged as scripts-*.log Error; companion WARN in main log.

ScriptedAlarmSource implements IAlarmSource. SubscribeAlarmsAsync filter is a set of equipment-path prefixes; empty means all. AcknowledgeAsync from the base interface routes to the engine with user identity "opcua-client" — Stream G will replace this with the authenticated principal from the OPC UA dispatch layer. The adapter implements only the base IAlarmSource methods; richer Part 9 methods (Confirm, Shelve, Unshelve, AddComment) remain on the engine and will bind to OPC UA method nodes in Stream G.

47 unit tests across 5 files. Part9StateMachineTests (16) — every transition + noop edge cases: predicate true/false, same-state noop, disabled ignores predicate, acknowledge records user/comment/adds audit, idempotent acknowledge, reject no-user ack, full activate-ack-clear-confirm walk, one-shot shelve suppresses next activation, one-shot expires on clear, timed shelve requires future unshelve time, timed shelve expires via shelving-check, explicit unshelve emits, add-comment appends to audit, comments append-only through multiple operations, full lifecycle walk emits every expected EmissionKind. MessageTemplateTests (11) — no-token passthrough, single+multiple token substitution, bad quality becomes {?}, unknown path becomes {?}, null value becomes {?}, tokens with slashes+dots, empty + null template, ExtractTokenPaths returns every distinct path, whitespace inside tokens trimmed. ScriptedAlarmEngineTests (13) — load compiles+subscribes, compile failures aggregated, upstream change emits Activated, clearing emits Cleared, message template resolves at emission, ack persists to store, startup recovery preserves ack but rederives active, shelved activation state-advances but suppresses emission, runtime exception isolates to owning alarm, disable prevents activation until re-enable, AddComment appends audit without state change, SetVirtualTag from predicate rejected (state unchanged), Dispose releases upstream subscriptions. ScriptedAlarmSourceTests (5) — empty filter matches all, equipment-prefix filter, Unsubscribe stops events, AcknowledgeAsync routes with default user, null arguments rejected. FakeUpstream fixture gives tests an in-memory driver mock with subscription count tracking.

Full Phase 7 test count after Stream C: 146 green (63 Scripting + 36 VirtualTags + 47 ScriptedAlarms). Stream D (historian alarm sink with SQLite store-and-forward + Galaxy.Host IPC) consumes ScriptedAlarmEvent + similar Galaxy / AB CIP emissions to produce the unified alarm timeline. Stream G wires the OPC UA method calls and AlarmSource into DriverNodeManager dispatch.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 18:49:48 -04:00
2a8bcc8f60 Merge pull request 'Phase 7 Stream B — Core.VirtualTags engine + dep graph + timer + source' (#180) from phase-7-stream-b-virtual-tag-engine into v2 2026-04-20 17:05:13 -04:00
Joseph Doherty
479af166ab Phase 7 Stream B — Core.VirtualTags project (engine + dep graph + timer + source)
Ships the evaluation engine that consumes compiled scripts from Stream A, subscribes to upstream driver tags, runs on change + on timer, cascades evaluations through dependent virtual tags in topological order, and emits changes through a driver-capability-shaped adapter the DriverNodeManager can dispatch to per ADR-002.

DependencyGraph owns the directed dep-graph where nodes are tag paths (driver tags implicit leaves, virtual tags registered internal nodes) and edges run from a virtual tag to each tag it reads. Kahn algorithm produces the topological sort. Tarjan iterative SCC detects every cycle in one pass so publish-time rejection surfaces all offending cycles together. Both iterative so 10k-deep chains do not StackOverflow. Re-adding a node overwrites prior dependency set cleanly (supports config-publish reloads).

VirtualTagDefinition is the operator-authored config row (Path, DataType, ScriptSource, ChangeTriggered, TimerInterval, Historize). Stream E config DB materializes these on publish.

ITagUpstreamSource is the abstraction the engine pulls driver tag values from. Stream G bridges this to IReadable + ISubscribable on live drivers; tests use FakeUpstream that tracks subscription count for leak-test assertions.

IHistoryWriter is the per-tag Historize sink. NullHistoryWriter default when caller does not pass one.

VirtualTagContext is the per-evaluation ScriptContext. Reads from engine last-known-value cache, writes route through SetVirtualTag callback so cross-tag side effects participate in change cascades. Injectable Now clock for deterministic tests.

VirtualTagEngine orchestrates. Load compiles every script via ScriptSandbox, builds the dep graph via DependencyExtractor, checks for cycles, reports every compile failure in one error, subscribes to each referenced upstream path, seeds the value cache. EvaluateAllAsync runs topological order. EvaluateOneAsync is timer path. Read returns cached value. Subscribe registers observer. OnUpstreamChange updates cache, fans out, schedules transitive dependents (change-driven=false tags skipped). EvaluateInternalAsync holds a SemaphoreSlim so cascades do not interleave. Script exceptions and timeouts map per-tag to BadInternalError. Coercion from script double to config Int32 uses Convert.ToInt32.

TimerTriggerScheduler groups tags by interval into shared Timers. Tags without TimerInterval not scheduled.

VirtualTagSource implements IReadable + ISubscribable per ADR-002. ReadAsync returns cache. SubscribeAsync fires initial-data callback per OPC UA convention. IWritable deliberately not implemented — OPC UA writes to virtual tags rejected in DriverNodeManager per Phase 7 decision 6.

36 unit tests across 4 files: DependencyGraphTests 12, VirtualTagEngineTests 13, VirtualTagSourceTests 6, TimerTriggerSchedulerTests 4. Coverage includes cycle detection (self-loop, 2-node, 3-node, multiple disjoint), 2-level change cascade, per-tag error isolation (one tag throws, others keep working), timeout isolation, Historize toggle, ChangeTriggered=false ignore, reload cleans subscriptions, Dispose releases resources, SetVirtualTag fires observers, type coercion, 10k deep graph no stack overflow, initial-data callback, Unsubscribe stops events.

Fixed two bugs during implementation. Monitor.Enter/Exit cannot be held across await (Monitor ownership is thread-local and lost across suspension) — switched to SemaphoreSlim. Kahn edge-direction was inverted — for dependency ordering (X depends on Y means Y comes before X) in-degree should be count of a node own deps, not count of nodes pointing to it; was incrementing inDegree[dep] instead of inDegree[nodeId], causing false cycle detection on valid DAGs.

Full Phase 7 test count after Stream B: 99 green (63 Scripting + 36 VirtualTags). Streams C and G will plug engine + source into live OPC UA dispatch path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 17:02:50 -04:00
00724e9784 Merge pull request 'Phase 7 Stream A.3 — ScriptLoggerFactory + ScriptLogCompanionSink (closes Stream A)' (#179) from phase-7-stream-a3-script-logger into v2 2026-04-20 16:45:09 -04:00
Joseph Doherty
36774842cf Phase 7 Stream A.3 — ScriptLoggerFactory + ScriptLogCompanionSink. Third of 3 increments closing out Stream A. Adds the Serilog plumbing that ties script-emitted log events to the dedicated scripts-*.log rolling sink with structured-property filtering AND forwards script Error+ events to the main opcua-*.log at Warning level so operators see script failures in the primary log without drowning it in Debug/Info script chatter. Both pieces are library-level building blocks — the actual file-sink + logger composition at server startup happens in Stream F (Admin UI) / Stream G (address-space wiring). This PR ships the reusable factory + sink + tests so any consumer can wire them up without rediscovering the structured-property contract.
ScriptLoggerFactory wraps a Serilog root logger (the scripts-*.log pipeline) and .Create(scriptName) returns a per-script ILogger with the ScriptName structured property pre-bound via ForContext. The structured property name is a public const (ScriptNameProperty = "ScriptName") because the Admin UI's log-viewer filter references this exact string — changing it breaks the filter silently, so it's stable by contract. Factory constructor rejects a null root logger; Create rejects null/empty/whitespace script names. No per-evaluation allocation in the hot path — engines (Stream B virtual-tag / Stream C scripted-alarm) create one factory per engine instance then cache per-script loggers beside the ScriptContext instances they already build.

ScriptLogCompanionSink is a Serilog ILogEventSink that forwards Error+ events from the script-logger pipeline to a separate "main" logger (the opcua-*.log pipeline in production) at Warning level. Rationale: operators usually watch the main server log, not scripts-*.log. Script authors log Info/Debug liberally during development — those stay in the scripts file. When a script actually fails (Error or Fatal), the operator needs to see it in the primary log so it can't be missed. Downgrading to Warning in the main log marks these as "needs attention but not a core server issue" since the server itself is healthy; the script author fixes the script. Forwarded event includes the ScriptName property (so operators can tell which script failed at a glance), the OriginalLevel (Error vs Fatal, preserved), the rendered message, and the original exception (preserved so the main log keeps the full stack trace — critical for diagnosis). Missing ScriptName property falls back to "unknown" without throwing; bypassing the factory is defensive but shouldn't happen in practice. Mirror threshold is configurable via constructor (defaults to LogEventLevel.Error) so deployments with stricter signal/noise requirements can raise it to Fatal.

15 new unit tests across two files. ScriptLoggerFactoryTests (6): Create sets the ScriptName structured property, each script gets its own property value across fan-out, Error-level event preserves level and exception, null root rejected, empty/whitespace/null name rejected, ScriptNameProperty const is stable at "ScriptName" (external-contract guard). ScriptLogCompanionSinkTests (9): Info/Warning events land in scripts sink only (not mirrored), Error event mirrored to main at Warning level (level-downgrade behavior), mirrored event includes ScriptName + OriginalLevel properties, mirrored event preserves exception for main-log stack-trace diagnosis, Fatal mirrored identically to Error, missing ScriptName falls back to "unknown" without throwing (defensive), null main logger rejected, custom mirror threshold (raised to Fatal) applied correctly.

Full Core.Scripting test suite after Stream A: 63/63 green (29 A.1 + 19 A.2 + 15 A.3). Stream A is complete — the scripting engine foundation, sandbox, sandbox-defense-in-depth, AST-inferred dependency extraction, compile cache, per-evaluation timeout, per-script logger with structured-property filtering, and companion-warn forwarding are all shipped and tested. Streams B through G build on this; Stream H closes out the phase with the compliance script + test baseline + merge to v2.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 16:42:48 -04:00
cb5d7b2d58 Merge pull request 'Phase 7 Stream A.2 — compile cache + per-evaluation timeout wrapper' (#178) from phase-7-stream-a2-cache-timeout into v2 2026-04-20 16:41:07 -04:00
Joseph Doherty
0ae715cca4 Phase 7 Stream A.2 — compile cache + per-evaluation timeout wrapper. Second of 3 increments within Stream A. Adds two independent resilience primitives that the virtual-tag engine (Stream B) and scripted-alarm engine (Stream C) will compose with the base ScriptEvaluator. Both are generic on (TContext, TResult) so different engines get their own instances without cross-contamination.
CompiledScriptCache<TContext, TResult> — source-hash-keyed cache of compiled evaluators. Roslyn compilation is the most expensive step in the evaluator pipeline (5-20ms per script depending on size); re-compiling on every value-change event would starve the engine. ConcurrentDictionary of Lazy<ScriptEvaluator> with ExecutionAndPublication mode ensures concurrent callers never double-compile even on a cold cache race. Failed compiles evict the cache entry so an Admin UI retry with corrected source actually recompiles (otherwise the cached exception would persist). Whitespace-sensitive hash — reformatting a script misses the cache on purpose, simpler than AST-canonicalize and happens rarely. No capacity bound because virtual-tag + alarm scripts are config-DB bounded (thousands, not millions); if scale pushes past that in v3 an LRU eviction slots in behind the same API.

TimedScriptEvaluator<TContext, TResult> — wraps a ScriptEvaluator with a per-evaluation wall-clock timeout (default 250ms per Phase 7 plan Stream A.4, configurable per tag so slower backends can widen). Critical implementation detail: the underlying Roslyn ScriptRunner executes synchronously on the calling thread for CPU-bound user scripts, returning an already-completed Task before the caller can register a timeout. Naive `Task.WaitAsync(timeout)` would see the completed task and never fire. Fix: push evaluation to a thread-pool thread via Task.Run, so the caller's thread is free to wait and the timeout reliably fires after the configured budget. Known trade-off (documented in the class summary): when a script times out, the underlying evaluation task continues running on the thread-pool thread until Roslyn returns; in the CPU-bound-infinite-loop case it's effectively leaked until the runtime decides to unwind. Tighter CPU budgeting would require an out-of-process script runner (v3 concern). In practice the timeout + structured warning log surfaces the offending script so the operator fixes it, and the orphan thread is rare. Caller-supplied CancellationToken is honored and takes precedence over the timeout, so driver-shutdown paths see a clean OperationCanceledException rather than a misclassified ScriptTimeoutException.

ScriptTimeoutException carries the configured Timeout and a diagnostic message pointing the operator at ctx.Logger output around the failure plus suggesting widening the timeout, simplifying the script, or moving heavy work out of the evaluation path. The virtual-tag engine (Stream B) will catch this and map the owning tag's quality to BadInternalError per Phase 7 decision #11, logging a structured warning with the offending script name.

Tests: CompiledScriptCacheTests (10) — first-call compile, identical-source dedupe to same instance, different-source produces different evaluator, whitespace-sensitivity documented, cached evaluator still runs correctly, failed compile evicted for retry, Clear drops entries, concurrent GetOrCompile of the same source deduplicates to one instance, different TContext/TResult use separate cache instances, null source rejected. TimedScriptEvaluatorTests (9) — fast script completes under timeout, CPU-bound script throws ScriptTimeoutException, caller cancellation takes precedence over timeout (shutdown path correctness), default 250ms per plan, zero/negative timeout rejected at construction, null inner rejected, null context rejected, user-thrown exceptions propagate unwrapped (not conflated with timeout), timeout exception message contains diagnostic guidance. Full suite: 48/48 green (29 from A.1 + 19 new).

Next: Stream A.3 wires the dedicated scripts-*.log Serilog rolling sink + structured-property filtering + companion-WARN enricher to the main log, closing out Stream A.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 16:38:43 -04:00
d2bfcd9f1e Merge pull request 'Phase 7 Stream A.1 — Core.Scripting project scaffold + ScriptContext + sandbox + AST dependency extractor' (#177) from phase-7-stream-a1-core-scripting into v2 2026-04-20 16:29:44 -04:00
Joseph Doherty
e4dae01bac Phase 7 Stream A.1 — Core.Scripting project scaffold + ScriptContext + sandbox + AST dependency extractor. First of 3 increments within Stream A. Ships the Roslyn-based script engine's foundation: user C# snippets compile against a constrained ScriptOptions allow-list + get a post-compile sandbox guard, the static tag-dependency set is extracted from the AST at publish time, and the script sees a stable ctx.GetTag/SetVirtualTag/Now/Logger/Deadband API that later streams plug into concrete backends.
ScriptContext abstract base defines the API user scripts see as ctx — GetTag(string) returns DataValueSnapshot so scripts branch on quality naturally, SetVirtualTag(string, object?) is the only write path virtual tags have (OPC UA client writes to virtual nodes rejected separately in DriverNodeManager per ADR-002), Now + Logger + Deadband static helper round out the surface. Concrete subclasses in Streams B + C plug in actual tag backends + per-script Serilog loggers.

ScriptSandbox.Build(contextType) produces the ScriptOptions for every compile — explicit allow-list of six assemblies (System.Private.CoreLib / System.Linq / Core.Abstractions / Core.Scripting / Serilog / the context type's own assembly), with a matching import list so scripts don't need using clauses. Allow-list is plan-level — expanding it is not a casual change.

DependencyExtractor uses CSharpSyntaxWalker to find every ctx.GetTag("literal") and ctx.SetVirtualTag("literal", ...) call, rejects every non-literal path (variable, concatenation, interpolation, method-returned). Rejections carry the exact TextSpan so the Admin UI can point at the offending token. Reads + writes are returned as two separate sets so the virtual-tag engine (Stream B) knows both the subscription targets and the write targets.

Sandbox enforcement turned out needing a second-pass semantic analyzer because .NET 10's type forwarding makes assembly-level restriction leaky — System.Net.Http.HttpClient resolves even with WithReferences limited to six assemblies. ForbiddenTypeAnalyzer runs after Roslyn's Compile() against the SemanticModel, walks every ObjectCreationExpression / InvocationExpression / MemberAccessExpression / IdentifierName, resolves to the containing type's namespace, and rejects any prefix-match against the deny-list (System.IO, System.Net, System.Diagnostics, System.Reflection, System.Threading.Thread, System.Runtime.InteropServices, Microsoft.Win32). Rejections throw ScriptSandboxViolationException with the aggregated list + source spans so the Admin UI surfaces every violation in one round-trip instead of whack-a-mole. System.Environment explicitly stays allowed (read-only process state, doesn't persist or leak outside) and that compromise is pinned by a dedicated test.

ScriptGlobals<TContext> wraps the context as a named field so scripts see ctx instead of the bare globalsType-member-access convention Roslyn defaults to — keeps script ergonomics (ctx.GetTag) consistent with the AST walker's parse shape and the Admin UI's hand-written type stub (coming in Stream F). Generic on TContext so Stream C's alarm-predicate context with an Alarm property inherits cleanly.

ScriptEvaluator<TContext, TResult>.Compile is the three-step gate: (1) Roslyn compile — throws CompilationErrorException on syntax/type errors with Location-carrying diagnostics; (2) ForbiddenTypeAnalyzer semantic pass — catches type-forwarding sandbox escapes; (3) delegate creation. Runtime exceptions from user code propagate unwrapped — the virtual-tag engine in Stream B catches + maps per-tag to BadInternalError quality per Phase 7 decision #11.

29 unit tests covering every surface: DependencyExtractorTests has 14 theories — single/multiple/deduplicated reads, separate write tracking, rejection of variable/concatenated/interpolated/method-returned/empty/whitespace paths, ignoring non-ctx methods named GetTag, empty-source no-op, source span carried in rejections, multiple bad paths reported in one pass, nested literal extraction. ScriptSandboxTests has 15 — happy-path compile + run, SetVirtualTag round-trip, rejection of File.IO + HttpClient + Process.Start + Reflection.Assembly.Load via ScriptSandboxViolationException, Environment.GetEnvironmentVariable explicitly allowed (pinned compromise), script-exception propagation, ctx.Now reachable, Deadband static reachable, LINQ Where/Sum reachable, DataValueSnapshot usable in scripts including quality branches, compile error carries source location.

Next two PRs within Stream A: A.2 adds the compile cache (source-hash keyed) + per-evaluation timeout wrapper; A.3 wires the dedicated scripts-*.log Serilog rolling sink with structured-property filtering + the companion-warning enricher to the main log.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 16:27:07 -04:00
6ae638a6de Merge pull request 'ADR-002 — driver-vs-virtual dispatch for Phase 7 scripting' (#176) from adr-002-driver-vs-virtual-dispatch into v2 2026-04-20 16:10:30 -04:00
Joseph Doherty
2a74daf228 ADR-002 — driver-vs-virtual dispatch: DriverNodeManager routes reads/writes/subscriptions across driver tags and virtual (scripted) tags via a single NodeManager with a NodeSource tag on NodeScopeResolver's output. Locks the architecture decision Phase 7 Stream G was going to have to make anyway — documenting it up front so the stream implementation can reference the chosen shape instead of rediscovering it. Option A (separate VirtualTagNodeManager sibling) rejected because shared Equipment folders owning both driver and virtual children would force two NodeManagers to fight for ownership on every Equipment node — the common case, not the exception — defeating the separation. Option C (virtual engine registers as a synthetic IDriver through DriverTypeRegistry) rejected because DriverInstance shape is wrong for scripting config (no DriverType, no HostAddress, no connectivity probe, no NSSM wrapper), IDriver.InitializeAsync semantics don't match script compilation, Polly resilience wrappers calibrated for network calls would either passthrough pointlessly or tune wrong, and Admin UI would need special-casing everywhere to hide fields that don't apply. Option B (single DriverNodeManager, NodeScopeResolver returns NodeSource enum alongside ScopeId, dispatch branches on source) accepted because it preserves one address-space tree with one walker, ACL binding works identically for both kinds, Phase 6.1 resilience + Phase 6.2 audit apply uniformly to the driver branch without needing Roslyn analyzer exemptions, and adding future source kinds is a single-enum-case addition. NodeScopeResolver.Resolve returns NodeScope(ScopeId, NodeSource, DriverInstanceId?, VirtualTagId?); DriverNodeManager pattern-matches on scope.Source and routes to either the driver dictionary or IVirtualTagEngine. OPC UA client writes to a virtual node return BadUserAccessDenied before the dispatch branch because Phase 7 decision #6 restricts virtual-tag writes to scripts via ctx.SetVirtualTag. Dispatch test coverage specified for Stream G.4: mixed Equipment folders browsing correctly, read routing per source kind, subscription fan-out across both kinds, the BadUserAccessDenied guard on virtual writes, and script-driven writes firing subscription notifications. ADR-001's walker gains the VirtualTag config-DB table as an additional input channel alongside Tag; NodeScopeResolver's ScopeId return stays unchanged so Phase 6.2's ACL trie needs no modification. Consequences flagged: whether IVirtualTagEngine lives in Core.Abstractions vs Phase 7's Core.VirtualTags project, and whether future server-side methods on virtual nodes would route through this dispatch, both marked out-of-scope for ADR-002.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 16:08:01 -04:00
3eb5f1d9da Merge pull request 'Phase 7 plan doc — scripting runtime + virtual tags + scripted alarms + historian alarm sink' (#175) from phase-7-plan-doc into v2 2026-04-20 16:07:34 -04:00
Joseph Doherty
f2c1cc84e9 Phase 7 plan doc — scripting runtime + virtual tags + scripted alarms + historian alarm sink. Draft output from the 2026-04-20 interactive planning session. Phase 7 is the last phase before v2 release readiness; adds two additive runtime capabilities on top of the existing driver + Equipment address-space foundation: (1) virtual (calculated) tags — OPC UA variables whose values are computed by user-authored C# scripts against other tags, evaluated on change and/or timer, living in the existing Equipment tree alongside driver tags, behaving identically to clients; (2) Part 9 scripted alarms — full state machine (EnabledState/ActiveState/AckedState/ConfirmedState/ShelvingState) with persistent operator-supplied state across restarts, complementing (not replacing) the existing Galaxy-native and AB CIP ALMD alarm sources. A third tie-in capability — Aveva Historian as alarm system of record — routes every qualifying alarm transition from any IAlarmSource (scripted + Galaxy + ALMD) through a local SQLite store-and-forward queue to Galaxy.Host, which uses its already-loaded aahClientManaged DLLs to write to the Historian alarm schema; per-alarm HistorizeToAveva toggle gates which sources flow (default off for Galaxy-native to avoid duplicating the direct Galaxy historian path, default on for scripted).
Locks in 22 design decisions from the planning conversation: C# via Roslyn scripting; virtual tags in the Equipment tree (not a separate /Virtual/ namespace); change-driven + timer-driven triggers operator-configurable per tag; Shape A one-script-per-tag-or-alarm (no predicate/action split); full OPC UA Part 9 alarm fidelity; read-only sandbox (scripts read any tag, write only to virtual tags, no File/HttpClient/Process/reflection); AST-inferred dependencies via CSharpSyntaxWalker (non-literal tag paths rejected at publish); config DB storage with generation-sealed cache; ctx.GetTag returns a full DataValue {Value, StatusCode, Timestamp}; per-tag Historize checkbox; per-tag error isolation (throwing script sets tag quality BadInternalError, engine unaffected); dedicated scripts-*.log Serilog sink bound to ctx.Logger; alarm message as template with {TagPath} substitution resolved at event emission; ActiveState recomputed from tags on startup while EnabledState/AckedState/ConfirmedState/ShelvingState + audit persist to config DB; historian sink scope = all IAlarmSource impls with per-alarm toggle; SQLite store-and-forward on the node so operators are never blocked by Historian downtime; IPC to Galaxy.Host for ingestion reusing the already-loaded aahClientManaged DLLs; Monaco editor for Admin code editing; serial cascade evaluation for v1 (parallel as follow-up); shelving UX via OPC UA method calls only with no custom Admin controls (operator drives state transitions from plant HMIs or Client.CLI); 30-day dead-letter retention with manual retry button; test harness accepts only declared-input paths so the harness enforces dependency declaration.

Eight streams totaling ~10-12 weeks, scope-comparable to Phase 6: A - Core.Scripting (Roslyn engine + sandbox + AST inference + logger); B - virtual tag engine (dependency graph + change/timer schedulers + historize); C - scripted alarm engine (Part 9 state machine + template messages + startup recovery + OPC UA method binding); D - historian alarm sink (SQLite store-and-forward + Galaxy.Host IPC contract extension); E - config DB schema (four new tables under sp_PublishGeneration); F - Admin UI scripting tab (Monaco + test harness + dependency preview + script-log viewer + historian diagnostics); G - address-space integration (extend EquipmentNodeWalker for virtual source kind + extend DriverNodeManager dispatch); H - exit gate.

Compliance-check surface covers sandbox escape (typeof/Assembly.Load/File/HttpClient attempts must fail at compile), dependency inference (literal-only paths), change cascade (topological ordering), cycle rejection at publish, startup recovery (ack/confirm/shelve survive restart but ActiveState recomputed), ack audit trail persistence, historian queue durability (Galaxy.Host offline → online drains in-order), per-alarm historian toggle gating, script timeout isolation, log sink isolation, ACL binding (virtual tags inherit Equipment scope grants).

Follow-up artifacts tracked as tasks #231-#238 (stream placeholders). Supporting doc updates (plan.md §6 Migration Strategy, config-db-schema.md §§ for the four new tables, driver-specs.md §Alarm semantics clarification, new ADR-002 for driver-vs-virtual dispatch) will land alongside the streams that touch them, not in this doc.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 16:05:12 -04:00
8384e58655 Merge pull request 'Modbus exception-injection profile — wire-level coverage for codes 0x01/0x03/0x04/0x05/0x06/0x0A/0x0B' (#174) from modbus-exception-injection-profile into v2 2026-04-20 15:14:00 -04:00
Joseph Doherty
96940aeb24 Modbus exception-injection profile — closes the end-to-end test gap for exception codes 0x01/0x03/0x04/0x05/0x06/0x0A/0x0B. pymodbus simulator naturally emits only 0x02 (Illegal Data Address on reads outside configured ranges) + 0x03 (Illegal Data Value on over-length); the driver's MapModbusExceptionToStatus table translates eight codes, but only 0x02 had integration-level coverage (via DL205's unmapped-register test). Unit tests lock the translation function in isolation but an integration test was missing for everything else. This PR lands wire-level coverage for the remaining seven codes without depending on device-specific quirks to naturally produce them.
New exception_injector.py — standalone pure-Python-stdlib Modbus/TCP server shipped alongside the pymodbus image. Speaks the wire protocol directly (MBAP header parse + FC 01/02/03/04/05/06/15/16 dispatch + store-backed happy-path reads/writes + spec-enforced length caps) and looks up each (fc, starting-address) against a rules list loaded from JSON; a matching rule makes the server respond [fc|0x80, exception_code] instead of the normal response. Zero runtime dependencies outside the stdlib — the Dockerfile just COPY's the script into /fixtures/ alongside the pymodbus profile JSONs, no new pip install needed. ~200 lines. New exception_injection.json profile carries rules for every exception code on FC03 (addresses 1000-1007, one per code), FC06 (2000-2001 for CPU-PROGRAM-mode and busy), and FC16 (3000 for server failure). New exception_injection compose profile binds :5020 like every other service + runs python /fixtures/exception_injector.py --config /fixtures/exception_injection.json.

New ExceptionInjectionTests.cs in Modbus.IntegrationTests — 11 tests. Eight FC03-read theories exercise every exception code 0x01/0x02/0x03/0x04/0x05/0x06/0x0A/0x0B asserting the driver's expected OPC UA StatusCode mapping (BadNotSupported/BadOutOfRange/BadOutOfRange/BadDeviceFailure/BadDeviceFailure/BadDeviceFailure/BadCommunicationError/BadCommunicationError). Two FC06-write theories cover the write path for 0x04 (Server Failure, CPU in PROGRAM mode) + 0x06 (Server Busy). One sanity-check read at address 5 confirms the injector isn't globally broken + non-injected reads round-trip cleanly with Value=5/StatusCode=Good. All tests follow the MODBUS_SIM_PROFILE=exception_injection skip guard so they no-op on a fresh clone without Docker running.

Docker/README.md gains an §Exception injection section explaining what pymodbus can and cannot emit, what the injector does, where the rules live, and how to append new ones. docs/drivers/Modbus-Test-Fixture.md follow-up item #2 (extend pymodbus profiles to inject exceptions) gets a shipped strikethrough with the new coverage inventory; the unit-level section adds ExceptionInjectionTests next to DL205ExceptionCodeTests so the split-of-responsibilities is explicit (DL205 test = natural out-of-range via dl205 profile, ExceptionInjectionTests = every other code via the injector).

Test baselines: Modbus unit 182/182 green (unchanged); Modbus integration with exception_injection profile live 11/11 new tests green. Existing DL205/S7/Mitsubishi integration tests unaffected since they skip on MODBUS_SIM_PROFILE mismatch.

Found + fixed during validation: a stale native pymodbus simulator from April 18 was still listening on port 5020 on IPv6 localhost (Windows was load-balancing between it + the Docker IPv4 forward, making injected exceptions intermittently come back as pymodbus's default 0x02). Killed the leftover. Documented the debugging path in the commit as a note for anyone who hits the same "my tests see exception 0x02 but the injector log has no request" symptom.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 15:11:32 -04:00
340f580be0 Merge pull request 'FOCAS Tier-C PR E — ops glue: ProcessHostLauncher + post-mortem MMF + NSSM scripts' (#173) from focas-tier-c-pr-e-ops-glue into v2 2026-04-20 14:26:35 -04:00
Joseph Doherty
8d88ffa14d FOCAS Tier-C PR E — ops glue: ProcessHostLauncher + post-mortem MMF + NSSM install scripts + doc close-out. Final of the 5 PRs for #220. With this landing, the Tier-C architecture is fully shipped; the only remaining FOCAS work is the hardware-dependent FwlibHostedBackend (real Fwlib32.dll P/Invoke, gated on #222 lab rig).
Production IHostProcessLauncher (ProcessHostLauncher.cs): Process.Start spawns OtOpcUa.Driver.FOCAS.Host.exe with OTOPCUA_FOCAS_PIPE / OTOPCUA_ALLOWED_SID / OTOPCUA_FOCAS_SECRET / OTOPCUA_FOCAS_BACKEND in the environment (supervisor-owned, never disk), polls FocasIpcClient.ConnectAsync at 250ms cadence until the pipe is up or the Host exits or the ConnectTimeout deadline passes, then wraps the connected client in an IpcFocasClient. TerminateAsync kills the entire process tree + disposes the IPC stream. ProcessHostLauncherOptions carries HostExePath + PipeName + AllowedSid plus optional SharedSecret (auto-generated from a GUID when omitted so install scripts don't have to), Arguments, Backend (fwlib32/fake/unconfigured default-unconfigured), ConnectTimeout (15s), and Series for CNC pre-flight.

Post-mortem MMF (Host/Stability/PostMortemMmf.cs + Proxy/Supervisor/PostMortemReader.cs): ring-buffer of the last ~1000 IPC operations written by the Host into a memory-mapped file. On a Host crash the supervisor reads the MMF — which survives process death — to see what was in flight. File format: 16-byte header [magic 'OFPC' (0x4F465043) | version | capacity | writeIndex] + N × 256-byte entries [8-byte UTC unix ms | 8-byte opKind | 240-byte UTF-8 message + null terminator]. Magic distinguishes FOCAS MMFs from the Galaxy MMFs that ship the same format shape. Writer is single-producer (Host) with a lock_writeGate; reader is multi-consumer (Proxy + any diagnostic tool) using a separate MemoryMappedFile handle.

NSSM install wrappers (scripts/install/Install-FocasHost.ps1 + Uninstall-FocasHost.ps1): idempotent service registration for OtOpcUaFocasHost. Resolves SID from the ServiceAccount, generates a fresh shared secret per install if not supplied, stages OTOPCUA_FOCAS_PIPE/SID/SECRET/BACKEND in AppEnvironmentExtra so they never hit disk, rotates 10MB stdout/stderr logs under %ProgramData%\OtOpcUa, DependOnService=OtOpcUa so startup order is deterministic. Backend selector defaults to unconfigured so a fresh install doesn't accidentally load a half-configured Fwlib32.dll on first start.

Tests (7 new, 2 files): PostMortemMmfTests.cs in FOCAS.Host.Tests — round-trip write+read preserves order + content, ring-buffer wraps at capacity (writes 10 entries to a 3-slot buffer, asserts only op-7/8/9 survive in FIFO order), message truncation at the 240-byte cap is null-terminated + non-overflowing, reopening an existing file preserves entries. PostMortemReaderCompatibilityTests.cs in FOCAS.Tests — hand-writes a file in the exact host format (magic/entry layout) + asserts the Proxy reader decodes with correct ring-walk ordering when writeIndex != 0, empty-return on missing file + magic mismatch. Keeps the two codebases in format-lockstep without the net10 test project referencing the net48 Host assembly.

Docs updated: docs/v2/implementation/focas-isolation-plan.md promoted from DRAFT to PRs A-E shipped status with per-PR citations + post-ship test counts (189 + 24 + 13 = 226 FOCAS-family tests green). docs/drivers/FOCAS-Test-Fixture.md §5 updated from "architecture scoped but not implemented" to listing the shipped components with the FwlibHostedBackend gap explicitly labeled as hardware-gated. Install-FocasHost.ps1 documents the OTOPCUA_FOCAS_BACKEND selector + points at docs/v2/focas-deployment.md for Fwlib32.dll licensing.

What ISN'T in this PR: (1) the real FwlibHostedBackend implementing IFocasBackend with the P/Invoke — requires either a CNC on the bench or a licensed FANUC developer kit to validate, tracked under #220 as a single follow-up task; (2) Admin /hosts surface integration for FOCAS runtime status — Galaxy Tier-C already has the shape, FOCAS can slot in when someone wires ObservedCrashes/StickyAlertActive/BackoffAttempt to the FleetStatusHub; (3) a full integration test that actually spawns a real FOCAS Host process — ProcessHostLauncher is tested via its contract + the MMF is tested via round-trip, but no test spins up the real exe (the Galaxy Tier-C tests do this, but the FOCAS equivalent adds no new coverage over what's already in place).

Total FOCAS-family tests green after this PR: 189 driver + 24 Shared + 13 Host = 226.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 14:24:13 -04:00
446a5c022c Merge pull request 'FOCAS Tier-C PR D — supervisor + backoff + crash-loop breaker' (#172) from focas-tier-c-pr-d-supervisor into v2 2026-04-20 14:19:32 -04:00
Joseph Doherty
5033609944 FOCAS Tier-C PR D — supervisor + backoff + crash-loop breaker + heartbeat monitor. Fourth of 5 PRs for #220. Ships the resilience harness that sits between the driver's IFocasClient requests and the Tier-C Host process, so a crashing Fwlib32.dll takes down only the Host (not the main server), gets respawned on a backoff ladder, and opens a circuit with a sticky operator alert when the crash rate is pathological. Same shape as Galaxy Tier-C so the Admin /hosts surface has a single mental model. New Supervisor/ namespace in Driver.FOCAS (.NET 10, Proxy-side): Backoff with the 5s→15s→60s default ladder + StableRunThreshold that resets the index after a 2-min clean run (so a one-off crash after hours of steady-state doesn't restart from the top); CircuitBreaker with 3-crashes-in-5-min threshold + escalating 1h→4h→manual-reset cooldown ladder + StickyAlertActive flag that persists across cooldowns until AcknowledgeAndReset is called; HeartbeatMonitor tracking ConsecutiveMisses against the 3-misses-kill threshold + LastAckUtc for telemetry; IHostProcessLauncher abstraction over "spawn Host process + produce an IFocasClient connected to it" so the supervisor stays I/O-free and fully testable with a fake launcher that can be told to throw on specific attempts (production wiring over Process.Start + FocasIpcClient.ConnectAsync is the PR E ops-glue concern); FocasHostSupervisor orchestrating them — GetOrLaunchAsync cycles through backoff until either a client is returned or the breaker opens (surfaced as InvalidOperationException so the driver maps to BadDeviceFailure), NotifyHostDeadAsync fans out the unavailable event + terminates the current launcher + records the crash without blocking (so heartbeat-loss detection can short-circuit subscriber fan-out and let the next GetOrLaunchAsync handle the respawn), AcknowledgeAndReset is the operator-clear path, OnUnavailable event for Admin /hosts wiring + ObservedCrashes + BackoffAttempt + StickyAlertActive for telemetry. 14 new unit tests across SupervisorTests.cs: Backoff (default sequence, clamping, RecordStableRun resets), CircuitBreaker (below threshold allowed, opens at threshold, escalates cooldown after second open, ManualReset clears state), HeartbeatMonitor (3 consecutive misses declares dead, ack resets counter), FocasHostSupervisor (first-launch success, retry-with-backoff after transient failure, repeated failures open breaker + surface InvalidOperationException, NotifyHostDeadAsync terminates + fan-outs + increments crash count, AcknowledgeAndReset clears sticky, Dispose terminates). Full FOCAS driver tests now 186/186 green (172 + 14 new). No changes to IFocasClient DI contract; existing FakeFocasClient-based tests unaffected. PR E wires the real Process-based IHostProcessLauncher + NSSM install scripts + MMF post-mortem + docs.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 14:17:23 -04:00
9034294b77 Merge pull request 'FOCAS Tier-C PR C — IPC path end-to-end' (#171) from focas-tier-c-pr-c-ipc-proxy into v2 2026-04-20 14:13:33 -04:00
Joseph Doherty
3892555631 FOCAS Tier-C PR C — IPC path end-to-end: Proxy IpcFocasClient + Host FwlibFrameHandler + IFocasBackend abstraction. Third of 5 PRs for #220. Ships the wire path from IFocasClient calls in the .NET 10 driver, over a named-pipe (or in-memory stream) to the .NET 4.8 Host's FwlibFrameHandler, dispatched to an IFocasBackend. Keeps the existing IFocasClient DI seam intact so existing unit tests are unaffected (172/172 still pass). Proxy side adds Ipc/FocasIpcClient (owns one pipe stream + call gate so concurrent callers don't interleave frames, supports both real NamedPipeClientStream and arbitrary Stream for in-memory test loopback) and Ipc/IpcFocasClient (implements IFocasClient by forwarding every call as an IPC frame — Connect sends OpenSessionRequest and caches the SessionId; Read sends ReadRequest and decodes the typed value via FocasDataTypeCode; Write sends WriteRequest for non-bit data or PmcBitWriteRequest when it's a PMC bit so the RMW critical section stays on the Host; Probe sends ProbeRequest; Dispose best-effort sends CloseSessionRequest); plus FocasIpcException surfacing Host-side ErrorResponse frames as typed exceptions. Host side adds Backend/IFocasBackend (the Host's view of one FOCAS session — Open/Close/Read/Write/PmcBitWrite/Probe) with two implementations: FakeFocasBackend (in-memory, per-address value store, honors bit-write RMW semantics against the containing byte — used by tests and as an OTOPCUA_FOCAS_BACKEND=fake operational stub) and UnconfiguredFocasBackend (structured failure pointing at docs/v2/focas-deployment.md — the safe default when OTOPCUA_FOCAS_BACKEND is unset or hardware isn't configured). Ipc/FwlibFrameHandler replaces StubFrameHandler: deserializes each request DTO, delegates to the IFocasBackend, re-serializes into the matching response kind. Catches backend exceptions and surfaces them as ErrorResponse{backend-exception} rather than tearing down the pipe. Program.cs now picks the backend from OTOPCUA_FOCAS_BACKEND env var (fake/unconfigured/fwlib32; fwlib32 still maps to Unconfigured because the real Fwlib32 P/Invoke integration is a hardware-dependent follow-up — #220 captures it). Tests: 7 new IPC round-trip tests on the Proxy side (IpcFocasClient vs. an IpcLoopback fake server: connect happy path, connect rejection, read decode, write round-trip, PMC bit write routes to first-class RMW frame, probe, ErrorResponse surfaces as typed exception) + 6 new Host-side tests on FwlibFrameHandler (OpenSession allocates id, read-without-session fails, full open/write/read round-trip preserves value, PmcBitWrite sets the specified bit, Probe reports healthy with open session, UnconfiguredBackend returns pointed-at-docs error with ErrorCode=NoFwlibBackend). Existing 165 FOCAS unit tests + 24 Shared tests + 3 Host handshake tests all unchanged. Total post-PR: 172+24+9 = 205 FOCAS-family tests green. What's NOT in this PR: the actual Fwlib32.dll P/Invoke integration inside the Host (FwlibHostedBackend) lands as a hardware-dependent follow-up since no CNC is available for validation; supervisor + respawn + crash-loop breaker comes in PR D; MMF + NSSM install scripts in PR E.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 14:10:52 -04:00
3609a5c676 Merge pull request 'FOCAS Tier-C PR B � Driver.FOCAS.Host net48 x86 skeleton' (#170) from focas-tier-c-pr-b-host into v2 2026-04-20 14:02:56 -04:00
Joseph Doherty
a6f53e5b22 FOCAS Tier-C PR B — Driver.FOCAS.Host net48 x86 skeleton + pipe server. Second PR of the 5-PR #220 split. Stands up the Windows Service entry point + named-pipe scaffolding so PR C has a place to move the Fwlib32 calls into. New net48 x86 project (Fwlib32.dll is 32-bit-only, same bitness constraint as Galaxy.Host/MXAccess); references Driver.FOCAS.Shared for framing + DTOs so the wire format Proxy<->Host stays a single type system. Ships four files: PipeAcl creates a PipeSecurity where only the configured server principal SID gets ReadWrite+Synchronize + LocalSystem/BuiltinAdministrators are explicitly denied (so a compromised service account on the same host can't escalate via the pipe); IFrameHandler abstracts the dispatch surface with HandleAsync + AttachConnection for server-push event sinks; PipeServer accepts one connection at a time, verifies the peer SID via RunAsClient, reads the first Hello frame + matches the shared-secret and protocol major version, sends HelloAck, then hands off to the handler until EOF or cancel; StubFrameHandler fully handles Heartbeat/HeartbeatAck so a future supervisor's liveness detector stays happy, and returns ErrorResponse{Code=not-implemented,Message="Kind X is stubbed - Fwlib32 lift lands in PR C"} for every data-plane request. Program.cs mirrors the Galaxy.Host startup exactly: reads OTOPCUA_FOCAS_PIPE / OTOPCUA_ALLOWED_SID / OTOPCUA_FOCAS_SECRET from the env (supervisor passes these at spawn time), builds Serilog rolling-file logger into %ProgramData%\OtOpcUa\focas-host-*.log, constructs the pipe server with StubFrameHandler, loops on RunAsync until SIGINT. Log messages mark the backend as "stub" so it's visible in logs that Fwlib32 isn't actually connected yet. Driver.FOCAS.Host.Tests (net48 x86) ships three integration tests mirroring IpcHandshakeIntegrationTests from Galaxy.Host: correct-secret handshake + heartbeat round-trip, wrong-secret rejection with UnauthorizedAccessException, and a new test that sends a ReadRequest and asserts the StubFrameHandler returns ErrorResponse{not-implemented} mentioning PR C in the message so the wiring between frame dispatch + kind → error mapping is locked. Tests follow the same is-Administrator skip guard as Galaxy because PipeAcl denies BuiltinAdministrators. No changes to existing driver code; FOCAS unit tests still at 165/165 + Shared tests at 24/24. PR C wires the real Fwlib32 backend.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 14:00:56 -04:00
b968496471 Merge pull request 'FOCAS Tier-C PR A � Driver.FOCAS.Shared MessagePack contracts' (#169) from focas-tier-c-pr-a-shared into v2 2026-04-20 13:57:45 -04:00
Joseph Doherty
e6ff39148b FOCAS Tier-C PR A — Driver.FOCAS.Shared MessagePack IPC contracts. First PR of the 5-PR #220 split (isolation plan at docs/v2/implementation/focas-isolation-plan.md). Adds a new netstandard2.0 project consumable by both the .NET 10 Proxy and the future .NET 4.8 x86 Host, carrying every wire DTO the Proxy <-> Host pair will exchange: Hello/HelloAck + Heartbeat/HeartbeatAck + ErrorResponse for session negotiation (shared-secret + protocol major/minor mirroring Galaxy.Shared); OpenSessionRequest/Response + CloseSessionRequest carrying the declared FocasCncSeries so the Host picks up the pre-flight matrix; FocasAddressDto + FocasDataTypeCode for wire-compatible serialization of parsed addresses (0=Pmc/1=Param/2=Macro matches FocasAreaKind enum order so both sides cast (int)); ReadRequest/Response + WriteRequest/Response with MessagePack-serialized boxed values tagged by FocasDataTypeCode; PmcBitWriteRequest/Response as a first-class RMW operation so the critical section stays Host-side; Subscribe/Unsubscribe/OnDataChangeNotification for poll-loop-pushes-deltas model (FOCAS has no CNC-initiated callbacks); Probe + RuntimeStatusChange + Recycle surface for Tier-C supervision. Framing is [4-byte BE length][1-byte kind][body] with 16 MiB body cap matching Galaxy; FocasMessageKind byte values align with Galaxy ranges so an operator reading a hex dump doesn't have to context-switch. FrameReader/FrameWriter ported from Galaxy.Shared with thread-safe concurrent-write serialization. 24 new unit tests: 18 per-DTO round-trip tests covering every field + 6 framing tests (single-frame round-trip, clean-EOF returns null, oversized-length rejection, mid-frame EOF throws, 20-way concurrent-write ordering preserved, MessageKind byte values locked as wire-stable). No driver changes; existing 165 FOCAS unit tests still pass unchanged. PR B (Host skeleton) goes next.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 13:55:35 -04:00
4a6fe7fa7e Merge pull request 'FOCAS version-matrix stabilization (PR 1 of #220 split)' (#168) from focas-version-matrix-stabilization into v2 2026-04-20 13:46:53 -04:00
Joseph Doherty
a6be2f77b5 FOCAS version-matrix stabilization (PR 1 of #220 split) — ship the cheap half of the hardware-free stability gap ahead of the Tier-C out-of-process split. Without any CNC or simulator on the bench, the highest-leverage move is to catch operator config errors at init time instead of at steady-state per-read. Adds FocasCncSeries enum (Unknown/16i/0i-D/0i-F family/30i family/PowerMotion-i) + FocasCapabilityMatrix static class that encodes the per-series documented ranges for macro variables (cnc_rdmacro/wrmacro), parameters (cnc_rdparam/wrparam), and PMC letters + byte ceilings (pmc_rdpmcrng/wrpmcrng) straight from the Fanuc FOCAS Developer Kit. FocasDeviceOptions gains a Series knob (defaults Unknown = permissive so pre-matrix configs don't break on upgrade). FocasDriver.InitializeAsync now calls FocasAddress.TryParse on every tag + runs FocasCapabilityMatrix.Validate against the owning device's declared series, throwing InvalidOperationException with a reason string that names both the series and the documented limit ("Parameter #30000 is outside the documented range [0, 29999] for Thirty_i") so an operator can tell whether the mismatch is in the config or in their declared CNC model. Unknown series skips validation entirely. Ships 46 new theory cases in FocasCapabilityMatrixTests.cs — covering every boundary in the matrix (widen 16i->0i-F: macro ceiling 999->9999, param 9999->14999; widen 0i-F->30i: PMC letters +K+T; PMC-number 16i=999/0i-D=1999/0i-F=9999/30i=59999), permissive Unknown-series behavior, rejection-message content, and case-insensitive PMC-letter matching. Widening a range without updating docs/v2/focas-version-matrix.md fails a test because every InlineData cites the row it reflects. Full FOCAS test suite stays at 165/165 passing (119 existing + 46 new). Also authors docs/v2/focas-version-matrix.md as the authoritative range reference with per-function citations, CNC-series era context, error-surface shape, and the link back to the matrix code; docs/v2/implementation/focas-isolation-plan.md as the multi-PR plan for #220 Tier-C isolation (Shared contracts -> Host skeleton -> move Fwlib32 calls -> Supervisor+respawn -> MMF+ops glue, 2200-3200 LOC across 5 PRs mirroring the Galaxy Tier-C topology); and promotes docs/drivers/FOCAS-Test-Fixture.md from "version-matrix coverage = no" to explicit coverage via the new test file + cross-links to the matrix and isolation-plan docs. Leaves task #220 open since isolation itself (the expensive half) is still ahead.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 13:44:37 -04:00
64bbc12e8e Merge pull request 'AB Legacy ab_server PCCC Docker fixture scaffold (#224)' (#167) from ablegacy-docker-fixture into v2 2026-04-20 13:28:16 -04:00
Joseph Doherty
a0cf7c5860 AB Legacy ab_server PCCC Docker fixture scaffold (#224) — Docker infrastructure + test-class shape in place; wire-level round-trip currently blocked by an ab_server-side PCCC coverage gap documented honestly in the fixture + coverage docs. Closes the Docker-infrastructure piece of #224; the remaining work is upstream (patch ab_server's PCCC server opcodes) or sideways (RSEmulate 500 golden-box tier, lab rig).
New project tests/ZB.MOM.WW.OtOpcUa.Driver.AbLegacy.IntegrationTests/ with four pieces. AbLegacyServerFixture — TCP probe against localhost:44818 (or AB_LEGACY_ENDPOINT override), distinct from AB_SERVER_ENDPOINT so both CIP + PCCC containers can run simultaneously. Single-public-ctor to satisfy xunit collection-fixture constraint. AbLegacyServerProfile + KnownProfiles carry the per-family (SLC500 / MicroLogix / PLC-5) ComposeProfile + Notes; drives per-theory parameterisation. AbLegacyFactAttribute / AbLegacyTheoryAttribute match the AB CIP skip-attribute pattern.

Docker/docker-compose.yml reuses the AB CIP otopcua-ab-server:libplctag-release image — `build:` block points at ../../AbCip.IntegrationTests/Docker context so `docker compose build` from here produces / reuses the same multi-stage build. Three compose profiles (slc500 / micrologix / plc5) with per-family `--plc` + `--tag=<file>[<size>]` flags matching the PCCC tag syntax (different from CIP's `Name:Type[size]`).

AbLegacyReadSmokeTests — one parametric theory reading N7:0 across all three families + one SLC500 write-then-read on N7:5. Targets the shape the driver would use against real hardware. Verified 2026-04-20 against a live SLC500 container: TCP probe passes + container accepts connections + libplctag negotiates session, but read/write returns BadCommunicationError (libplctag status 0x80050000). Root-caused to ab_server's PCCC server-side opcode coverage being narrower than libplctag's PCCC client expects — not a driver-side bug, not a scaffold bug, just an ab_server upstream limitation. Documented honestly in Docker/README.md + AbLegacy-Test-Fixture.md rather than skipping the tests or weakening assertions; tests now skip cleanly when container is absent, fail with clear message when container is up but the protocol gap surfaces. Operator resolves by filing an ab_server upstream patch, pointing AB_LEGACY_ENDPOINT at real hardware, or scaffolding an RSEmulate 500 golden-box tier.

Docker/README.md — Known limitations section leads with the PCCC round-trip gap (test date, failure signature, possible root causes, three resolution paths) before the pre-existing limitations (T/C file decomposition, ST file quirks, indirect addressing, DF1 serial). Reader can't miss the "scaffolded but blocked on upstream" framing.

docs/drivers/AbLegacy-Test-Fixture.md — TL;DR flipped from "no integration fixture" to "Docker scaffold in place; wire-level round-trip currently blocked by ab_server PCCC gap". What-the-fixture-is gains an Integration section. Follow-up candidates rewritten: #1 is now "fix ab_server PCCC upstream", #2 is RSEmulate 500 golden-box (with cost callouts matching our existing Logix Emulate + TwinCAT XAR scaffolds — license + Hyper-V conflict + binary project format), #3 is lab rig. Key-files list adds the four new files. docs/drivers/README.md coverage-map row updated from "no integration fixture" to "Docker scaffold via ab_server PCCC; wire-level round-trip currently blocked, docs call out resolution paths".

Solution file picks up the new tests/.../AbLegacy.IntegrationTests entry. AbLegacyDataType.Int used throughout (not Int16 — the enum uses SLC file-type naming). Build 0 errors; 2 smoke tests skip cleanly without container + fail with clear errors when container up (proving the infrastructure works end-to-end + the gap is specifically the ab_server protocol coverage, not the scaffold).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 13:26:19 -04:00
2fe1a326dc Merge pull request 'TwinCAT XAR integration fixture scaffold (#221)' (#166) from twincat-xar-fixture-scaffold into v2 2026-04-20 13:11:53 -04:00
Joseph Doherty
7b49ea13c7 TwinCAT XAR integration fixture — scaffold the code + docs so the Hyper-V VM + .tsproj drop in without fixture-code changes. Mirrors the AB CIP Logix Emulate scaffold shipped in PR #165: tier-gated smoke tests that skip cleanly when the VM isn't reachable, a project README documenting exactly what the XAR needs to run, fixture-coverage doc promoting TwinCAT from "no integration fixture" to "scaffolded + needs operational setup". The actual Beckhoff-side work (provision VM, install XAR, author tsproj, rotate 7-day trial) lives in #221 + the new TwinCatProject/README.md walkthrough.
New project tests/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT.IntegrationTests/ with four pieces. TwinCATXarFixture — TCP probe against the ADS-over-TCP port 48898 on the host from TWINCAT_TARGET_HOST env var, requires TWINCAT_TARGET_NETID for the target AmsNetId, optional TWINCAT_TARGET_PORT for runtime 2+ (default 851 = PLC runtime 1). Doesn't own a lifecycle — XAR can't run in Docker because it bypasses the Windows kernel scheduler to hit real-time cycles, so the VM stays operator-managed. Explicit skip reasons surface the setup steps (start VM, set env vars, reactivate trial license) instead of a confusing hang. TwinCATFactAttribute + TwinCATTheoryAttribute — xunit skip gate matching AbServerFactAttribute / OpcPlcCollection patterns.

TwinCAT3SmokeTests — three smoke tests through the real AdsTwinCATClient + real ADS over TCP. Driver_reads_seeded_DINT_through_real_ADS reads GVL_Fixture.nCounter, asserts >= 1234 (MAIN increments every cycle so an exact match would race). Driver_write_then_read_round_trip_on_scratch_REAL writes 42.5 to GVL_Fixture.rSetpoint + reads back, catches the ADS write path regression that unit tests can't see. Driver_subscribe_receives_native_ADS_notifications_on_counter_changes validates the #189 native-notification path end-to-end — AddDeviceNotification fires OnDataChange at the PLC cycle boundary, the test observes one firing within 3 s. All three gated on TWINCAT_TARGET_HOST + NETID; skip via TwinCATFactAttribute when unset, verified in this commit with 3 clean [SKIP] results.

TwinCatProject/README.md — the tsproj state the smoke tests depend on. GVL_Fixture with nCounter:DINT:=1234 + rSetpoint:REAL:=0.0 + bFlag:BOOL:=TRUE; MAIN program with the single-line ladder `GVL_Fixture.nCounter := GVL_Fixture.nCounter + 1;`; PlcTask cyclic @ 10 ms priority 20; PLC runtime 1 (AMS port 851). Explains why tsproj over the compiled bootproject (text-diffable, rebuildable, no per-install state). Full XAR VM setup walkthrough — Hyper-V Gen 2 VM, TC3 XAE+XAR install, noting the AmsNetId from the tray icon, bilateral route configuration (VM System Manager → Routes + dev box StaticRoutes.xml), project import, Activate Configuration + Run Mode. License-rotation section walks through two options — scheduled TcActivate.exe /reactivate via Task Scheduler (not officially Beckhoff-supported, reportedly works on current builds) or paid runtime license (~$1k one-time per runtime per CPU). Final section shows the exact env-var recipe + dotnet test command on the dev box.

docs/drivers/TwinCAT-Test-Fixture.md — flipped TL;DR from "there is no integration fixture" to "scaffolding lives at tests/..., remaining operational work is VM + tsproj + license rotation". "What the fixture is" gains an Integration section describing the XAR VM target. "What it actually covers" gains an Integration subsection listing the three named smoke tests. Follow-up candidates rewritten — the #1 item used to be "TwinCAT 3 runtime on CI" as a speculative option; now it's concrete "XAR VM live-population" with a link to #221 + the project README for the operational walkthrough. License rotation becomes #2 with both automation paths. Key fixture / config files list adds the three new files + the project README. docs/drivers/README.md coverage-map row updated from "no integration fixture" to "XAR-VM integration scaffolding".

Solution file picks up the new tests/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT.IntegrationTests entry alongside the existing TwinCAT.Tests. xunit CollectionDefinition added to TwinCATXarFixture after the first build revealed the [Collection("TwinCATXar")] reference on TwinCAT3SmokeTests had no matching registration. Build 0 errors; 3 skip-clean test outcomes verified. #221 stays open as in_progress until the VM + tsproj land.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 13:11:17 -04:00
b820b9a05f Merge pull request 'AB CIP Logix Emulate golden-box tier � scaffolding' (#165) from abcip-emulate-tier-scaffold into v2 2026-04-20 12:56:38 -04:00
Joseph Doherty
58a0cccc67 AB CIP Logix Emulate golden-box tier — scaffold the code + docs so the L5X + Emulate PC drop in without fixture-code changes. Closes the initial design question the user raised; the actual Emulate-side work (author project, commit L5X, install Emulate on the dev box) is tracked as #223. Scaffolding ships everything that doesn't need the live Emulate instance: tier-gated test classes that skip cleanly when AB_SERVER_PROFILE is unset, the profile gate helper, the LogixProject/README.md documenting the exact project state the tests expect, the fixture coverage doc's new §Logix Emulate tier section with the when-to-trust table extended from 3 columns to 4, and the dev-environment.md integration-host row.
AbServerProfileGate — static helper that reads `AB_SERVER_PROFILE` env var (defaults to "abserver") + exposes `SkipUnless(params string[] requiredProfiles)` matching the MODBUS_SIM_PROFILE pattern the DL205StringQuirkTests uses one directory over. Emulate-only tests call `AbServerProfileGate.SkipUnless("emulate")` at the top of each fact body; ab_server-default runs see them skip with a clear message pointing at the Emulate setup steps.

AbCipEmulateUdtReadTests — one test proving the #194 whole-UDT read optimization works against the real Logix Template Object, not just the golden byte buffers the unit suite uses. Builds an `AbCipDriverOptions` with a Structure tag `Motor1 : Motor_UDT` that has three declared members (Speed : DINT, Torque : REAL, Status : DINT), reads them via the `.Speed / .Torque / .Status` dotted-tag syntax, asserts the driver gets the grouped whole-UDT path + decodes each at the right offset. Required seed values documented inline + in LogixProject/README.md: Speed=1800, Torque=42.5f, Status=0x0001.

AbCipEmulateAlmdTests — one test proving the #177 ALMD projection fires `OnAlarmEvent` when a real ALMD instruction's `In` edge rises, not just the fake `InFaulted` timer edges the unit suite drives. Needs a `SimulateAlarm : BOOL` tag routed through `MainRoutine` ladder (`XIC SimulateAlarm OTE HighTempAlarm.In`) so the test case can pulse the input via the existing `IWritable.WriteAsync` path instead of scripting Emulate via its own socket. Alarm-projection options carry `EnableAlarmProjection = true` + 200 ms poll interval; a `TaskCompletionSource` gates the raise-event assertion with a 5 s deadline. Cleanup writes SimulateAlarm=false so consecutive runs start from known state.

LogixProject/README.md — the Studio 5000 project state the Emulate-tier tests depend on. Explains why L5X over ACD (text diff, reproducible import, no per-install state), the UDT + tag + routine structure, how to bring it up on the Emulate PC. Ships as a stub pending actual author + L5X export + commit; the README itself keeps the requirements visible so the L5X author has a checklist.

docs/drivers/AbServer-Test-Fixture.md — new §Logix Emulate golden-box tier section with the coverage-promotion table (ab_server / Emulate / hardware per gap), the setup-env-var recipe, the costs to accept (license, Hyper-V conflict, manual lifecycle). "When to trust" table extended from 3 columns (ab_server / unit / rig) to 4 (ab_server / unit / Logix Emulate / rig); two new rows for EtherNet/IP embedded-switch + redundant-chassis failover that even Emulate can't help with. Follow-up candidates list gets Logix Emulate as option 1 ahead of the pre-existing "extend ab_server upstream" + "stand up a lab rig". See-also file list gains AbServerProfileGate.cs + Docker/ + Emulate/ + LogixProject/README.md entries.

docs/v2/dev-environment.md — §C Integration host gains a Rockwell Studio 5000 Logix Emulate row: purpose (AB CIP golden-box tier closing UDT/ALMD/AOI/safety/ConnectionSize gaps), type (Windows-only, Hyper-V conflict matching TwinCAT XAR's constraint), port 44818, credentials note, owner split between integration-host admin for license+install and developer for per-session runtime start.

Verified: Emulate tests skip cleanly when AB_SERVER_PROFILE is unset — both `[SKIP]` with the operator-facing message pointing at the env-var setup. Whole-solution build 0 errors. Tests will transition from skip → pass once the L5X + Emulate PC land per #223.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 12:54:39 -04:00
231148d7f0 Merge pull request 'Doc + code-comment sweep � finish the native-fallback removal' (#164) from docs-native-fallback-cleanup into v2 2026-04-20 12:38:11 -04:00
Joseph Doherty
fdb268cee0 Docs + code-comment sweep — remove stale Pymodbus/ + PythonSnap7/ + LocateBinary references left behind by the native-fallback removal PR. Answer to "is the dev inventory + documentation updated": it was partial; this PR finishes the job.
Files touched — docs/drivers/Modbus-Test-Fixture.md dropped the key-files pointer at deleted Pymodbus/ + flipped "primary launcher is Docker, native fallback retained" framing to "Docker is the only supported launch path" (matching the code). docs/v2/dev-environment.md dropped the "skips both Docker + native-binary paths" parenthetical from AB_SERVER_ENDPOINT + flipped the "Native fallbacks" subsection to a one-liner that says Docker is the only supported path. docs/v2/modbus-test-plan.md rewrote §Harness from "pip install pymodbus + serve.ps1" setup pattern to "docker compose --profile <…> up" + updated the §PR 43 status bullet to point at Docker/profiles/. docs/v2/test-data-sources.md §"CI fixture (task #180)" rewrote the AB CIP section from "LocateBinary() picks binary off PATH" + GitHub Actions zip-download step to "Docker is the only supported reproducible build path" + docker compose GitHub Actions step; dropped the pinned-version SHA256 table + lock-file reference because the Dockerfile's LIBPLCTAG_TAG build-arg is the new pin.

Code docstrings + error messages — these are developer-facing operational text too. ModbusSimulatorFixture SkipReason strings (both branches) now point at `docker compose -f Docker/docker-compose.yml --profile standard up -d` instead of the deleted `Pymodbus\serve.ps1`; doc-comment at the top references Docker/docker-compose.yml. Snap7ServerFixture SkipReason strings + doc-comment point at Docker/docker-compose.yml instead of PythonSnap7/serve.ps1. S7_1500Profile.cs docstring updated. Modbus Dockerfile comment pointing at deleted tests/.../Pymodbus/README.md redirected to docs/drivers/Modbus-Test-Fixture.md. DL205Profile.cs + DL205StringQuirkTests.cs + S7_1500Profile.cs (in Modbus project) docstrings flipped from Pymodbus/*.json references to Docker/profiles/*.json.

Left untouched deliberately: docs/v2/implementation/exit-gate-phase-2-closed.md — that's a historical as-of-2026-04-18 snapshot documenting what was skipped at Phase 2 closure; rewriting would lose the date-stamped context. Its "oitc/modbus-server Docker container not started" + "ab_server binary not on PATH" lines describe the fixture landscape that existed at close time, not current operational guidance.

Final sweep confirms zero remaining `Pymodbus/` / `PythonSnap7/` / `LocateBinary` / `AbServerSeedTag` / `BuildCliArgs` / `AbServerPlcArg` mentions anywhere in tracked files outside that historical exit-gate doc. Whole-solution build still 0 errors.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 12:36:19 -04:00
4473197cf5 Merge pull request 'Remove native-launcher fallbacks; Docker is the only path for Modbus / S7 / AB CIP / OpcUaClient' (#163) from remove-native-fallbacks into v2 2026-04-20 12:29:46 -04:00
Joseph Doherty
0e1dcc119e Remove native-launcher fallbacks for the four Dockerized fixtures — Docker is the only supported path for Modbus / S7 / AB CIP / OpcUaClient integration. Native paths stay in place only where Docker isn't compatible (Galaxy: MXAccess COM + Windows-only; TwinCAT: Beckhoff runtime vs Hyper-V; FOCAS: closed-source Fanuc Fwlib32.dll; AB Legacy: PCCC has no OSS simulator). Simplifies the fixture landscape + removes the "which path do I run" ambiguity; removes two full native-launcher directories + the AB CIP native-spawn path; removes the parallel profile-as-CLI-arg-builder code from AbServerFixture.
Modbus — deletes tests/.../Modbus.IntegrationTests/Pymodbus/ (serve.ps1, standard.json, dl205.json, mitsubishi.json, s7_1500.json, README.md). Profile JSONs live only under Docker/profiles/ now. Docker/README.md loses its "Native-Python fallback" section; docs/drivers/Modbus-Test-Fixture.md "What the fixture is" bullet flipped from "primary launcher is Docker, native fallback under Pymodbus/" to "Docker is the only supported launch path".

S7 — deletes tests/.../S7.IntegrationTests/PythonSnap7/ (server.py, s7_1500.json, serve.ps1, README.md). Docker/README.md loses "Native-Python fallback"; docs/drivers/S7-Test-Fixture.md updated to match.

AB CIP — the biggest simplification because the native-binary spawn had the most code. AbServerFixture.cs rewrites: drops Process management (no more Process _proc + Kill/WaitForExit), drops LocateBinary() PATH lookup, drops the IAsyncLifetime initialize-spawns-server behavior. Fixture is now a thin TCP probe against localhost:44818 (or AB_SERVER_ENDPOINT override) — same shape as Snap7ServerFixture / ModbusSimulatorFixture / OpcPlcFixture. IsServerAvailable() simplifies to a single 500 ms probe. AbServerProfile.cs drops AbServerPlcArg + SeedTags + BuildCliArgs + ToCliSpec + the entire AbServerSeedTag record — the compose file is the canonical source of truth for which tags + which --plc mode each family gets; the profile record now carries just Family + ComposeProfile (matches the docker-compose service key) + Notes. KnownProfiles.ForFamily + .All stay for tests that iterate families. AbServerProfileTests.cs rewrites to match: drops BuildCliArgs_* + ToCliSpec_* + SeedTags_* tests; keeps the family-coverage contract tests + verifies the ComposeProfile strings match compose-file service names (a typo in either surfaces as a unit-test failure, not a silent "wrong family booted" at runtime). Docker/README.md loses "Native-binary fallback" section; docs/drivers/AbServer-Test-Fixture.md "What the fixture is" flipped to Docker-only with clearer skip rules.

dev-environment.md §Docker fixtures — the "Native fallbacks" subsection goes away; replaced with a one-line note that Docker is the only supported path for these four fixtures + a fresh clone needs Docker Desktop and nothing else.

Verified: whole-solution build 0 errors, AB CIP profile unit tests 6/6, AB CIP Docker smoke 4/4 (all family theory rows), S7 Docker smoke 3/3. Container lifecycle clean. The deleted native code surface was already redundant — every fixture the native paths served is now covered by Docker; keeping them invited drift between the two paths (the original AB CIP native profile had three undetected bugs per the #162 commit message: case-sensitive --plc, bracket tag notation, --path=1,0 requirement — noise the Docker path now avoids by never running the buggy code).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 12:27:44 -04:00
27d135bd59 Merge pull request 'Dockerize Modbus + AB CIP + S7 test fixtures for reproducibility' (#162) from fixtures-all-docker into v2 2026-04-20 12:11:45 -04:00
Joseph Doherty
6609141493 Dockerize Modbus + AB CIP + S7 test fixtures for reproducibility. Every driver integration simulator now has a pinned Docker image alongside the existing native launcher — Docker is the primary path, native fallbacks kept for contributors who prefer them. Matches the already-Dockerized OpcUaClient/opc-plc pattern from #215 so every fixture in the fleet presents the same compose-up/test/compose-down loop. Reproducibility gain: what used to require a local pip/Python install (Modbus pymodbus, S7 python-snap7) or a per-OS C build from source (AB CIP ab_server from libplctag) now collapses to a Dockerfile + docker compose up. Modbus — new tests/ZB.MOM.WW.OtOpcUa.Driver.Modbus.IntegrationTests/Docker/ with Dockerfile (python:3.12-slim-bookworm + pymodbus[simulator]==3.13.0) + docker-compose.yml with four compose profiles (standard / dl205 / mitsubishi / s7_1500) backed by the existing profile JSONs copied under Docker/profiles/ as canonical; native fallback in Pymodbus/ retained with the same JSON set (symlink-equivalent — manual re-sync when profiles change, noted in both READMEs). Port 5020 unchanged so MODBUS_SIM_ENDPOINT + ModbusSimulatorFixture work without code change. Dropped the --no_http CLI arg the old serve.ps1 + compose draft passed — pymodbus 3.13 doesn't recognize it; the simulator's http ui just binds inside the container where nothing maps it out and costs nothing. S7 — new tests/ZB.MOM.WW.OtOpcUa.Driver.S7.IntegrationTests/Docker/ with Dockerfile (python:3.12-slim-bookworm + python-snap7>=2.0) + docker-compose.yml with one s7_1500 compose profile; copies the existing server.py shim + s7_1500.json seed profile; runs python -u server.py ... --port 1102. Native fallback in PythonSnap7/ retained. Port 1102 unchanged. AB CIP — hardest because ab_server is a source-only C tool in libplctag's src/tools/ab_server/. New tests/ZB.MOM.WW.OtOpcUa.Driver.AbCip.IntegrationTests/Docker/ Dockerfile is multi-stage: build stage (debian:bookworm-slim + build-essential + cmake) clones libplctag at a pinned tag + cmake --build build --target ab_server; runtime stage (debian:bookworm-slim) copies just the binary from /src/build/bin_dist/ab_server. docker-compose.yml ships four compose profiles (controllogix / compactlogix / micro800 / guardlogix) with per-family ab_server CLI args matching AbServerProfile.cs. AbServerFixture updated: tries TCP probe on 127.0.0.1:44818 first (Docker path) + spawns the native binary only as fallback when no listener is there. AB_SERVER_ENDPOINT env var supported for pointing at a real PLC. AbServerFact/Theory attributes updated to IsServerAvailable() which accepts any of: live listener on 44818, AB_SERVER_ENDPOINT set, or binary on PATH. Required two CLI-compat fixes to ab_server's argument expectations that the existing native profile never caught because it was never actually run at CI: --plc is case-sensitive (ControlLogix not controllogix), CIP tags need [size] bracket notation (DINT[1] not bare DINT), ControlLogix also requires --path=1,0. Compose files carry the corrected flags; the existing native-path AbServerProfile.cs was never invoked in practice so we don't rewrite it here. Micro800 now uses the --plc=Micro800 mode rather than falling back to ControlLogix emulation — ab_server does have the dedicated mode, the old Notes saying otherwise were wrong. Updated docs: three fixture coverage docs (Modbus-Test-Fixture.md, S7-Test-Fixture.md, AbServer-Test-Fixture.md) flip their "What the fixture is" section from native-only to Docker-primary-with-native-fallback; dev-environment.md §Resource Inventory replaces the old ambiguous "Docker Desktop + ab_server native" mix with four per-driver rows (each listing the image, compose file, compose profiles, port, credentials) + a new Docker fixtures — quick reference subsection giving the one-line docker compose -f <…> --profile <…> up for each driver + the env-var override names + the native fallback install recipes. drivers/README.md coverage map table updated — Modbus/AB CIP/S7 entries now read "Dockerized …" consistent with OpcUaClient's line. Verified end-to-end against live containers: Modbus DL205 smoke 1/1, S7 3/3, AB CIP ControlLogix 4/4 (all family theory rows). Container lifecycle clean (up/test/down, no leaked state). Every fixture keeps its skip-when-absent probe + env-var endpoint override so dotnet test on a fresh clone without Docker running still gets a green run.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 12:09:44 -04:00