Commit Graph

251 Commits

Author SHA1 Message Date
Joseph Doherty
2b66cec582 Auto: s7-a3 — DTL/DT/S5TIME/TIME/TOD/DATE codecs
Adds S7DateTimeCodec static class implementing the six Siemens S7 date/time
wire formats:

  - DTL (12 bytes): UInt16 BE year + month/day/dow/h/m/s + UInt32 BE nanos
  - DATE_AND_TIME (8 bytes BCD): yy/mm/dd/hh/mm/ss + 3-digit BCD ms + dow
  - S5TIME (16 bits): 2-bit timebase + 3-digit BCD count → TimeSpan
  - TIME (Int32 ms BE, signed) → TimeSpan, allows negative durations
  - TOD (UInt32 ms BE, 0..86399999) → TimeSpan since midnight
  - DATE (UInt16 BE days since 1990-01-01) → DateTime

Mirrors the S7StringCodec pattern from PR-S7-A2 — codecs operate on raw byte
spans so each format can be locked with golden-byte unit tests without a
live PLC. New S7DataType members (Dtl, DateAndTime, S5Time, Time, TimeOfDay,
Date) are wired into S7Driver.ReadOneAsync/WriteOneAsync via byte-level
ReadBytesAsync/WriteBytesAsync calls — S7.Net's string-keyed Read/Write
overloads have no syntax for these widths.

Uninitialized PLC buffers (all-zero year+month for DTL/DT) reject as
InvalidDataException → BadOutOfRange to operators, rather than decoding as
year-0001 garbage.

S5TIME / TIME / TOD surface as Int32 ms (DriverDataType has no Duration);
DTL / DT / DATE surface as DriverDataType.DateTime.

Test coverage: 30 new golden-vector + round-trip + rejection tests,
including the all-zero buffer rejection paths and BCD-nibble validation.
Build clean, 115/115 S7 tests pass.

Closes #289

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-25 16:37:39 -04:00
Joseph Doherty
316f820eff Auto: s7-a2 — STRING/WSTRING/CHAR/WCHAR
Closes the NotSupportedException cliff for S7 string-shaped types.

- S7DataType gains WString, Char, WChar members alongside the existing
  String entry.
- New S7StringCodec encodes/decodes the four wire formats:
    STRING  : 2-byte header (max-len + actual-len bytes) + N ASCII bytes
              -> total 2 + max_len.
    WSTRING : 4-byte header (max-len + actual-len UInt16 BE) + N×2
              UTF-16BE bytes -> total 4 + 2 × max_len.
    CHAR    : 1 ASCII byte (rejects non-ASCII on encode).
    WCHAR   : 2 UTF-16BE bytes.
  Header-bug clamp: actualLen > maxLen is silently clamped on read so
  firmware quirks don't walk past the wire buffer; rejected on write
  to avoid silent truncation.
- S7Driver.ReadOneAsync / WriteOneAsync issue ReadBytesAsync /
  WriteBytesAsync against the parsed Area / DbNumber / ByteOffset and
  honour S7TagDefinition.StringLength (default 254 = S7 STRING max).
- MapDataType returns DriverDataType.String for the three new enum
  members so OPC UA discovery surfaces them as scalar strings.

Tests: 21 new cases on S7StringCodec covering golden-byte vectors,
encode/decode round-trips, the firmware-bug header-clamp, ASCII-only
guard on CHAR, and the StringLength default. 85/85 passing.

Closes #288
2026-04-25 16:26:05 -04:00
Joseph Doherty
d1699af609 Auto: s7-a1 — 64-bit scalar types
Closes the NotSupportedException cliff for S7 Float64/Int64/UInt64.

- S7Size enum gains LWord (8 bytes); parser accepts DBLD/DBL on data
  blocks and LD on M/I/Q (e.g. DB1.DBLD0, DB1.DBL8, MLD0, ILD8, QLD16).
- S7Driver.ReadOneAsync / WriteOneAsync issue ReadBytesAsync /
  WriteBytesAsync for 64-bit types and convert big-endian via
  System.Buffers.Binary.BinaryPrimitives. S7's wire format is BE.
- Internal MapArea(S7Area) helper translates to S7.Net DataType.
- MapDataType now surfaces native DriverDataType for Int16/UInt16/
  UInt32/Int64/UInt64 instead of collapsing them all to Int32.

Tests: parser theories cover DBLD/DBL/MLD/ILD/QLD; discovery test
asserts the 64-bit DriverDataType mapping. 64/64 passing.

Closes #287
2026-04-25 16:16:23 -04:00
Joseph Doherty
4a3860ae92 Auto: opcuaclient-5 — CRL/revocation handling
Adds explicit revoked-vs-untrusted distinction to the OpcUaClient driver's
server-cert validation hook, plus three new knobs on a new
OpcUaCertificateValidationOptions sub-record:

  RejectSHA1SignedCertificates  (default true — SHA-1 is OPC UA spec-deprecated;
                                 this is a deliberately tighter default)
  RejectUnknownRevocationStatus (default false — keeps brownfield deployments
                                 without CRL infrastructure working)
  MinimumCertificateKeySize     (default 2048)

The validator hook now runs whether or not AutoAcceptCertificates is set:
revoked / issuer-revoked certs are always rejected with a distinct
"REVOKED" log line; SHA-1 + small-key certs are rejected per policy;
unknown-revocation gates on the new flag; untrusted still honours
AutoAccept.

Decision pipeline factored into a static EvaluateCertificateValidation
helper with a CertificateValidationDecision record so unit tests cover
all branches without needing to spin up an SDK CertificateValidator.

CRL files themselves: the OPC UA SDK reads them automatically from the
crl/ subdir of each cert store — no driver-side wiring needed.
Documented on the new options record.

Tests (12 new) cover defaults, every branch of the decision pipeline,
SHA-1 detection (custom X509SignatureGenerator since .NET 10's
CreateSelfSigned refuses SHA-1), and key-size detection. All 127
OpcUaClient unit tests still pass.

Closes #277

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-25 16:05:50 -04:00
Joseph Doherty
bb1ab47b68 Auto: opcuaclient-4 — diagnostics counters
Per-driver counters surfaced via DriverHealth.Diagnostics for the
driver-diagnostics RPC. New OpcUaClientDiagnostics tracks
PublishRequestCount, NotificationCount, NotificationsPerSecond (5s-half-life
EWMA), MissingPublishRequestCount, DroppedNotificationCount,
SessionResetCount and LastReconnectUtcTicks via Interlocked on the hot path.

DriverHealth gains an optional IReadOnlyDictionary<string,double>?
Diagnostics parameter (defaulted null for back-compat with the seven other
drivers' constructors). OpcUaClientDriver wires Session.Notification +
Session.PublishError on connect and on reconnect-complete (recording a
session-reset there); GetHealth snapshots the counters on every poll so the
RPC sees fresh values without a tick source.

Tests: 11 new OpcUaClientDiagnosticsTests cover counter increments, EWMA
convergence, snapshot shape, GetHealth integration, and DriverHealth
back-compat. Full OpcUaClient.Tests 115/115 green.

Closes #276

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-25 15:53:57 -04:00
Joseph Doherty
494fdf2358 Auto: opcuaclient-3 — honor server OperationLimits
Closes #275
2026-04-25 15:38:55 -04:00
Joseph Doherty
fae00749ca Auto: opcuaclient-2 — per-tag advanced subscription tuning
Closes #274
2026-04-25 15:25:20 -04:00
Joseph Doherty
7209364c35 Auto: opcuaclient-1 — per-subscription tuning
Closes #273
2026-04-25 15:09:08 -04:00
Joseph Doherty
1abf743a9f Auto: focas-f1f — figure scaling + diagnostics
Closes #262
2026-04-25 15:01:37 -04:00
Joseph Doherty
cc757855e6 Auto: focas-f1e — operator messages + block text
Closes #261
2026-04-25 14:49:11 -04:00
Joseph Doherty
9ec92a9082 Auto: focas-f1d — Tool number + work coordinate offsets
Closes #260
2026-04-25 14:37:51 -04:00
Joseph Doherty
3c2c4f29ea Auto: focas-f1c — Modal codes + overrides
Closes #259

Adds Modal/ + Override/ fixed-tree subfolders per FOCAS device, mirroring the
pattern established by Status/ (#257) and Production/ (#258): cached snapshots
refreshed on the probe tick, served from cache on read, no extra wire traffic
on top of user-driven tag reads.

Modal/ surfaces the four universally-present aux modal codes M/S/T/B from
cnc_modal(type=100..103) as Int16. **G-group decoding (groups 1..21) is deferred
to a follow-up** — the FWLIB ODBMDL union differs per series + group and the
issue body explicitly permits this scoping. Adds the cnc_modal P/Invoke +
ODBMDL struct + a generic int16 cnc_rdparam helper so the follow-up can add
G-groups without further wire-level scaffolding.

Override/ surfaces Feed/Rapid/Spindle/Jog from cnc_rdparam at MTB-specific
parameter numbers (FocasDeviceOptions.OverrideParameters; defaults to 30i:
6010/6011/6014/6015). Per-field nullable params let a deployment hide overrides
their MTB doesn't wire up; passing OverrideParameters=null suppresses the entire
Override/ subfolder for that device.

6 unit tests cover discovery shape, omitted Override folder when unconfigured,
partial Override field selection, cached-snapshot reads (Modal + Override),
BadCommunicationError before first refresh, and the FwlibFocasClient
disconnected short-circuit.
2026-04-25 14:26:48 -04:00
Joseph Doherty
3d9697b918 Auto: focas-f1b — parts count + cycle time
Closes #258
2026-04-25 14:14:54 -04:00
Joseph Doherty
551494d223 Auto: focas-f1a — ODBST status flags as fixed-tree nodes
Closes #257
2026-04-25 14:05:12 -04:00
Joseph Doherty
4ff4cc5899 Auto: ablegacy-4 — indirect/indexed addressing parser
Closes #247
2026-04-25 13:51:03 -04:00
Joseph Doherty
c89f5bb3b9 Auto: ablegacy-3 — sub-element bit semantics
Closes #246
2026-04-25 13:41:52 -04:00
Joseph Doherty
f2bc36349e Auto: ablegacy-2 — MicroLogix function-file letters
Closes #245
2026-04-25 13:32:23 -04:00
Joseph Doherty
8f7265186d Auto: ablegacy-1 — PLC-5 octal I/O addressing
Closes #244
2026-04-25 13:25:22 -04:00
Joseph Doherty
36b2929780 Auto: abcip-1.4 — CIP multi-tag write packing
Group writes by device through new AbCipMultiWritePlanner; for families that
support CIP request packing (ControlLogix / CompactLogix / GuardLogix) the
packable writes for one device are dispatched concurrently so libplctag's
native scheduler can coalesce them onto one Multi-Service Packet (0x0A).
Micro800 keeps SupportsRequestPacking=false and falls back to per-tag
sequential writes. BOOL-within-DINT writes are excluded from packing and
continue to go through the per-parent RMW semaphore so two concurrent bit
writes against the same DINT cannot lose one another's update.

The libplctag .NET wrapper does not expose a Multi-Service Packet construction
API at the per-Tag surface (each Tag is one CIP service), so this PR uses
client-side coalescing — concurrent Task.WhenAll dispatch per device — rather
than building raw CIP frames. The native libplctag scheduler does pack
concurrent same-connection writes when the family allows it, which gives the
round-trip reduction #228 calls for without ballooning the diff.

Per-tag StatusCodes preserve caller order across success, transport failure,
non-writable tags, unknown references, and unknown devices, including in
mixed concurrent batches.

Closes #228
2026-04-25 13:14:28 -04:00
Joseph Doherty
767ac4aec5 Auto: abcip-1.3 — array-slice read addressing
Closes #227
2026-04-25 13:03:45 -04:00
Joseph Doherty
d78a471e90 Auto: abcip-1.2 — STRINGnn variant decoding
Closes #226

Adds nullable StringLength to AbCipTagDefinition + AbCipStructureMember
so STRING_20 / STRING_40 / STRING_80 UDT variants decode against the
right DATA-array capacity. The configured length threads through a new
StringMaxCapacity field on AbCipTagCreateParams and lands on the
libplctag Tag.StringMaxCapacity attribute (verified property on
libplctag 1.5.2). Null leaves libplctag's default 82-byte STRING in
place for back-compat. Driver gates on DataType == String so a stray
StringLength on a DINT tag doesn't reshape that buffer. UDT member
fan-out copies StringLength from the AbCipStructureMember onto the
synthesised member tag definition.

Tests: 4 new in AbCipDriverReadTests covering threaded StringMaxCapacity,
the null back-compat path, the non-String gate, and the UDT-member fan-out.
2026-04-25 12:53:20 -04:00
Joseph Doherty
2e6228a243 Auto: abcip-1.1 — LINT/ULINT 64-bit fidelity
Closes #225
2026-04-25 12:44:43 -04:00
Joseph Doherty
95c7e0b490 Task #222 partial — unblock AB Legacy PCCC via cip-path workaround (5/5 stages)
Replaced the "ab_server PCCC upstream-broken" skip gate with the actual
root cause: libplctag's ab_server rejects empty CIP routing paths at the
unconnected-send layer before the PCCC dispatcher runs. Real SLC/
MicroLogix/PLC-5 hardware accepts empty paths (no backplane); ab_server
does not. With `/1,0` in place, N (Int16), F (Float32), and L (Int32)
file reads + writes round-trip cleanly across all three compose profiles.

## Fixture changes

- `AbLegacyServerFixture.cs`:
  - Drop `AB_LEGACY_TRUST_WIRE` env var + the reachable-but-untrusted
    skip branch. Fixture now only skips on TCP unreachability.
  - Add `AB_LEGACY_CIP_PATH` env var (default `1,0`) + expose `CipPath`
    property. Set `AB_LEGACY_CIP_PATH=` (empty) against real hardware.
  - Shorter skip messages on the `[AbLegacyFact]` / `[AbLegacyTheory]`
    attributes — one reason: endpoint not reachable.

- `AbLegacyReadSmokeTests.cs`:
  - Device URI built from `sim.CipPath` instead of hardcoded empty path.
  - New `AB_LEGACY_COMPOSE_PROFILE` env var filters the parametric
    theory to the running container's family. Only one container binds
    `:44818` at a time, so cross-family params would otherwise fail.
  - `Slc500_write_then_read_round_trip` skips cleanly when the running
    profile isn't `slc500`.

## E2E + seed + docs

- `scripts/e2e/test-ablegacy.ps1` — drop the `AB_LEGACY_TRUST_WIRE`
  skip gate; synopsis calls out the `/1,0` vs empty cip-path split
  between the Docker fixture and real hardware.
- `scripts/e2e/e2e-config.sample.json` — sample gateway flipped from
  the hardware placeholder (`192.168.1.10`) to the Docker fixture
  (`127.0.0.1/1,0`); comment rewritten.
- `scripts/e2e/README.md` — AB Legacy expected-matrix row goes from
  SKIP to PASS.
- `scripts/smoke/seed-ablegacy-smoke.sql` — default HostAddress points
  at the Docker fixture + header / footer text reflect the new state.
- `tests/.../Docker/README.md` — "Known limitations" section rewritten
  to describe the cip-path gate (not a dispatcher gap); env-var table
  picks up `AB_LEGACY_CIP_PATH` + `AB_LEGACY_COMPOSE_PROFILE`.
- `docs/drivers/AbLegacy-Test-Fixture.md` + `docs/drivers/README.md`
  + `docs/DriverClis.md` — flip status from blocked to functional;
  residual bit-file-write gap (B3:0/5 → 0x803D0000) documented.

## Residual gap

Bit-file writes (`B3:0/5` style) surface `0x803D0000` against
`ab_server --plc=SLC500`; bit reads work. Non-blocking for smoke
coverage — N/F/L round-trip is enough. Real hardware / RSEmulate 500
for bit-write fidelity. Documented in `Docker/README.md` §"Known
limitations" + the `AbLegacy-Test-Fixture.md` follow-ups list.

## Verified

- Full-solution build: 0 errors, 334 pre-existing warnings.
- Integration suite passes per-profile with
  `AB_LEGACY_COMPOSE_PROFILE=<slc500|micrologix|plc5>` + matching
  compose container up.
- All four non-hardware e2e scripts (Modbus / AB CIP / AB Legacy / S7)
  now 5/5 against the respective docker-compose fixtures.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 12:38:43 -04:00
Joseph Doherty
4dc685a365 Task #251 — S7 + TwinCAT test-client CLIs (driver CLI suite complete)
Final two of the five driver test clients. Pattern carried forward from
#249 (Modbus) + #250 (AB CIP, AB Legacy) — each CLI inherits Driver.Cli.Common
for DriverCommandBase + SnapshotFormatter and adds a protocol-specific
CommandBase + 4 commands (probe / read / write / subscribe).

New projects:
  - src/ZB.MOM.WW.OtOpcUa.Driver.S7.Cli/ — otopcua-s7-cli.
    S7CommandBase carries host/port/cpu/rack/slot/timeout. Handles all S7
    atomic types (Bool, Byte, Int16..UInt64, Float32/64, String, DateTime).
    DateTime parses via RoundtripKind so "2026-04-21T12:34:56Z" works.
  - src/ZB.MOM.WW.OtOpcUa.Driver.TwinCAT.Cli/ — otopcua-twincat-cli.
    TwinCATCommandBase carries ams-net-id + ams-port + --poll-only toggle
    (flips UseNativeNotifications=false). Covers the full IEC 61131-3
    atomic set: Bool, SInt/USInt, Int/UInt, DInt/UDInt, LInt/ULInt, Real,
    LReal, String, WString, Time/Date/DateTime/TimeOfDay. Structure writes
    refused as out-of-scope (same as AB CIP). IEC time/date variants marshal
    as UDINT on the wire per IEC spec. Subscribe banner announces "ADS
    notification" vs "polling" so the mechanism is obvious in bug reports.

Tests (49 new, 122 cumulative driver-CLI):
  - S7: 22 tests. Every S7DataType has a happy-path + bounds case. DateTime
    round-trips an ISO-8601 string. Tag-name synthesis round-trips every
    S7 address form (DB / M / I / Q, bit/word/dword, strings).
  - TwinCAT: 27 tests. Full IEC type matrix including WString UTF-8 pass-
    through + the four IEC time/date variants landing on UDINT. Structure
    rejection case. Tag-name synthesis for Program scope, GVL scope, nested
    UDT members, and array elements.

Docs:
  - docs/Driver.S7.Cli.md — address grammar cheat sheet + the PUT/GET-must-
    be-enabled gotcha every S7-1200/1500 operator hits.
  - docs/Driver.TwinCAT.Cli.md — AMS router prerequisite (XAR / standalone
    Router NuGet / remote AMS route) + per-command examples.

Wiring:
  - ZB.MOM.WW.OtOpcUa.slnx grew 4 entries (2 src + 2 tests).

Full-solution build clean. Both --help outputs verified end-to-end.

Driver CLI suite complete: 5 CLIs (otopcua-{modbus,abcip,ablegacy,s7,twincat}-cli)
sharing a common base + formatter. 122 CLI tests cumulative. Every driver family
shipped in v2 now has a shell-level ad-hoc validation tool.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 08:44:53 -04:00
Joseph Doherty
b2065f8730 Task #250 — AB CIP + AB Legacy test-client CLIs
Second + third of the four driver test clients. Both follow the same shape as
otopcua-modbus-cli (#249) and consume Driver.Cli.Common for DriverCommandBase +
SnapshotFormatter.

New projects:
  - src/ZB.MOM.WW.OtOpcUa.Driver.AbCip.Cli/ — otopcua-abcip-cli.
    AbCipCommandBase carries gateway (ab://host[:port]/cip-path) + family
    (ControlLogix/CompactLogix/Micro800/GuardLogix) + timeout.
    Commands: probe, read, write, subscribe.
    Value parser covers every AbCipDataType atomic type (Bool, SInt..LInt,
    USInt..ULInt, Real, LReal, String, Dt); Structure writes refused as
    out-of-scope for the CLI.
  - src/ZB.MOM.WW.OtOpcUa.Driver.AbLegacy.Cli/ — otopcua-ablegacy-cli.
    AbLegacyCommandBase carries gateway + plc-type (Slc500/MicroLogix/Plc5/
    LogixPccc) + timeout.
    Commands: probe (default address N7:0), read, write, subscribe.
    Value parser covers Bit, Int, Long, Float, AnalogInt, String, and the
    three sub-element types (TimerElement / CounterElement / ControlElement
    all land on int32 at the wire).

Tests (35 new, 73 cumulative across the driver CLI family):
  - AB CIP: 17 tests — ParseValue happy-paths for every Logix atomic type,
    failure cases (non-numeric / bool garbage), tag-name synthesis.
  - AB Legacy: 18 tests — ParseValue coverage (Bit / Int / AnalogInt / Long /
    Float / String / sub-elements), PCCC address round-trip in tag names
    including bit-within-word + sub-element syntax.

Docs:
  - docs/Driver.AbCip.Cli.md — family ↔ CIP-path cheat sheet + examples per
    command + typical workflows.
  - docs/Driver.AbLegacy.Cli.md — PCCC address primer (file letters → CLI
    --type) + known ab_server upstream gap cross-ref to #224 close-out.

Wiring:
  - ZB.MOM.WW.OtOpcUa.slnx grew 4 entries (2 src + 2 tests).

Full-solution build clean. `otopcua-abcip-cli --help` + `otopcua-ablegacy-cli
--help` verified end-to-end.

Next up (#251): S7 + TwinCAT CLIs, same pattern.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 08:32:43 -04:00
Joseph Doherty
5dac2e9375 Task #249 — Driver test-client CLIs: shared lib + Modbus CLI first
Mirrors the v1 otopcua-cli value prop (ad-hoc shell-level PLC validation) for
the Modbus-TCP driver, and lays down the shared scaffolding that AB CIP, AB
Legacy, S7, and TwinCAT CLIs will build on.

New projects:
  - src/ZB.MOM.WW.OtOpcUa.Driver.Cli.Common/ — DriverCommandBase (verbose
    flag + Serilog config) + SnapshotFormatter (single-tag + table +
    write-result renders with invariant-culture value formatting + OPC UA
    status-code shortnames + UTC-normalised timestamps).
  - src/ZB.MOM.WW.OtOpcUa.Driver.Modbus.Cli/ — otopcua-modbus-cli executable.
    Commands: probe, read, write, subscribe. ModbusCommandBase carries the
    host/port/unit-id flags + builds ModbusDriverOptions with Probe.Enabled
    =false (CLI runs are one-shot; driver-internal keep-alive would race).

Commands + coverage:
  - probe              single FC03 + GetHealth() + pretty-print
  - read               region × address × type synth into one driver tag
  - write              same shape + --value parsed per --type
  - subscribe          polled-subscription stream until Ctrl+C

Tests (38 total):
  - 16 SnapshotFormatterTests covering: status-code shortnames, unknown
    codes fall back to hex, null value + timestamp placeholders, bool
    lowercase, float invariant culture, string quoting, write-result shape,
    aligned table columns, mismatched-length rejection, UTC normalisation.
  - 22 Modbus CLI tests:
      · ReadCommandTests.SynthesiseTagName (5 theory cases)
      · WriteCommandParseValueTests (17 cases: bool aliases, unknown rejected,
        Int16 bounds, UInt16/Bcd16 type, Float32/64 invariant culture,
        String passthrough, BitInRegister, Int32 MinValue, non-numeric reject)

Wiring:
  - ZB.MOM.WW.OtOpcUa.slnx grew 4 entries (2 src + 2 tests).
  - docs/Driver.Modbus.Cli.md — operator-facing runbook with examples per
    command + output format + typical workflows.

Regression: full-solution build clean; shared-lib tests 16/0, Modbus CLI tests
22/0.

Next up: repeat the pattern for AB CIP (shares ~40% more with Modbus via
libplctag), then AB Legacy, S7, TwinCAT. The shared base stays as-is unless
one of those exposes a gap the Modbus-first pass missed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 08:15:14 -04:00
Joseph Doherty
012c6a4e7a Task #224 close — AB Legacy PCCC fixture: add AB_LEGACY_TRUST_WIRE opt-in for wire-level runs
The ab_server Docker simulator accepts TCP at :44818 when started with
--plc=SLC500 but its PCCC dispatcher is a confirmed upstream gap
(verified 2026-04-21 with --debug=5: zero request logs when libplctag
issues a read, every read surfaces BadCommunicationError 0x80050000).

Previous behavior — when Docker was up, the three smoke tests ran and
all failed on every integration-host run. Noise, not signal.

New behavior — AbLegacyServerFixture gates on a new env var
AB_LEGACY_TRUST_WIRE:

  Endpoint reachable? | TRUST_WIRE set? | Result
  --------------------+-----------------+------------------------------
  No                  | —               | Skip ("not reachable")
  Yes                 | No              | Skip ("ab_server PCCC gap")
  Yes                 | 1 / true        | Run

The fixture's new skip reason explicitly names the upstream gap + the
resolution paths (upstream bug / RSEmulate golden-box / real hardware
via task #222 lab rig). Operators with a real SLC 5/05 / MicroLogix
1100/1400 / PLC-5 or an Emulate box set AB_LEGACY_ENDPOINT + TRUST_WIRE
and the smoke tests round-trip cleanly.

Updated docs:
  - tests/.../Docker/README.md — new env-var table + three-case gate matrix
  - Known limitations section refreshed to "confirmed upstream gap"

Verified locally:
  - Docker down: 2 skipped.
  - Docker up + TRUST_WIRE unset: 2 skipped (upstream-gap message).
  - Docker up + TRUST_WIRE=1: 4 run, 4 fail BadCommunicationError (ab_server gap as expected).
  - Unit suite: 96 passed / 0 failed (regression-clean).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 04:17:46 -04:00
Joseph Doherty
c41831794a Task #242 finish — UnsTab drag-drop interactive Playwright E2E tests un-skip + pass
Closes the scope-out left by the #242 partial. Root cause of the blazor.web.js
zero-byte response turned out to be two co-operating harness bugs:

1) The static-asset manifest was discoverable but the runtime needs
   UseStaticWebAssets to be called so the StaticWebAssetsLoader composes a
   PhysicalFileProvider per ContentRoot declared in
   staticwebassets.development.json (Admin source wwwroot + obj/compressed +
   the framework NuGet cache). Without that call MapStaticAssets resolves the
   route but has no ContentRoot map — so every asset serves zero bytes.

2) The EF InMemory DB name was being re-generated on every DbContext
   construction (the lambda body called Guid.NewGuid() inline), so the seed
   scope, Blazor circuit scope, and test-assertion scopes all got separate
   stores. Capturing the name as a stable string per fixture instance fixes
   the "cluster not found → page stays at Loading…" symptom.

Fixes:
  - AdminWebAppFactory:
      * ApplicationName set on WebApplicationOptions so UseStaticWebAssets
        discovers the manifest.
      * builder.WebHost.UseStaticWebAssets() wired explicitly (matches what
        `dotnet run` does via MSBuild targets).
      * dbName captured once per fixture; the options lambda reads the
        captured string instead of re-rolling a Guid.
  - UnsTabDragDropE2ETests: the two [Fact(Skip=...)] tests un-skip.

Suite state: 3 passed, 0 skipped, 0 failed. Task #242 closed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 02:31:26 -04:00
Joseph Doherty
4e96f228b2 Task #242 partial — UnsTab interactive E2E test bodies + harness upgrades (tests Skip-guarded pending blazor.web.js asset plumbing)
Carries the interactive drag-drop + 409 concurrent-edit test bodies (full Playwright
flows against the real @ondragstart/@ondragover/@ondrop handlers + modal + EF state
round-trip), plus several harness upgrades that push the in-process
WebApplication-based fixture closer to a working Blazor Server circuit. The
interactive tests are marked [Fact(Skip=...)] pending resolution of one remaining
blocker documented in the class docstring.

Harness upgrades (AdminWebAppFactory):
  - Environment set to Development so 500s surface exception stacks (rather than
    the generic error page) during future diagnosis.
  - ContentRootPath pointed at the Admin assembly dir so wwwroot + manifest files
    resolve.
  - Wired SignalR hubs (/hubs/fleet, /hubs/alerts) so ClusterDetail's HubConnection
    negotiation no longer 500s at first render.
  - Services property exposed so tests can open scoped DI contexts against the
    running host (scheduled peer-edit simulation, post-commit state assertion).

Remaining blocker (reason for Skip):
  /_framework/blazor.web.js returns HTTP 200 with a zero-byte body. The asset's
  route is declared in OtOpcUa.Admin.staticwebassets.endpoints.json, but the
  underlying file is shipped by the framework NuGet package
  (Microsoft.AspNetCore.App.Internal.Assets/_framework/blazor.web.js) rather than
  copied into the Admin wwwroot. MapStaticAssets can't resolve it without wiring
  a composite FileProvider or the WebRootPath machinery. Three viable next-session
  approaches listed in the class docstring:
    (a) Composite FileProvider mapping /_framework/* → NuGet cache.
    (b) Subprocess harness spawning real dotnet run of Admin project with an
        InMemory-DB override (closest to production composition).
    (c) MSBuild ItemGroup in the test csproj that copies framework files into the
        test output + ContentRoot=test assembly dir with UseStaticFiles.

Scaffolding smoke test (Admin_host_serves_HTTP_via_Playwright_scaffolding) stays
green unchanged.

Suite state: 1 passed, 2 skipped, 0 failed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 02:09:44 -04:00
Joseph Doherty
dfe3731c73 Task #220 — Wire FOCAS into DriverFactoryRegistry bootstrap pipeline
Closes the non-hardware gap surfaced in the #220 audit: FOCAS had full Tier-C
architecture (Driver.FOCAS + Driver.FOCAS.Host + Driver.FOCAS.Shared, supervisor,
post-mortem MMF, NSSM scripts, 239 tests) but no factory registration, so config-DB
DriverInstance rows of type "FOCAS" would fail at bootstrap with "unknown driver
type". Hardware-gated FwlibHostedBackend (real Fwlib32 P/Invoke inside the Host
process) stays deferred under #222 lab-rig.

Ships:
  - FocasDriverFactoryExtensions.Register(registry) mirroring the Galaxy pattern.
    JSON schema selects backend via "Backend" field:
      "ipc" (default) — IpcFocasClientFactory → named-pipe FocasIpcClient →
                        Driver.FOCAS.Host process (Tier-C isolation)
      "fwlib"         — direct in-process FwlibFocasClientFactory (P/Invoke)
      "unimplemented" — UnimplementedFocasClientFactory (fail-fast on use —
                        useful for staging DriverInstance rows pre-Host-deploy)
  - Devices / Tags / Probe / Timeout / Series feed into FocasDriverOptions.
    Series validated eagerly at top-level so typos fail at bootstrap, not first
    read. Tag DataType + Series enum values surface clear errors listing valid
    options.
  - Program.cs adds FocasDriverFactoryExtensions.Register alongside Galaxy.
  - Driver.FOCAS.csproj references Core (for DriverFactoryRegistry).
  - Server.csproj adds Driver.FOCAS ProjectReference so the factory type is
    reachable from Program.cs.

Tests: 13 new FocasDriverFactoryExtensionsTests covering: registry entry,
case-insensitive lookup, ipc backend with full config, ipc defaults, missing
PipeName/SharedSecret errors, fwlib backend short-path, unimplemented backend,
unknown-backend error, unknown-Series error, tag missing DataType, null/ws args,
duplicate-register throws.

Regression: 202 FOCAS + 13 FOCAS.Host + 24 FOCAS.Shared + 239 Server all pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 01:08:25 -04:00
Joseph Doherty
8221fac8c1 Task #219 follow-up — close AlarmConditionState child-NodeId + event-propagation gaps
PR #197 surfaced two integration-level wiring gaps in DriverNodeManager's
MarkAsAlarmCondition path; this commit fixes both and upgrades the integration
test to assert them end-to-end.

Fix 1 — addressable child nodes: AlarmConditionState inherits ~50 typed children
(Severity / Message / ActiveState / AckedState / EnabledState / …). The stack
was leaving them with Foundation-namespace NodeIds (type-declaration defaults) or
shared ns=0 counter allocations, so client Read on a child returned
BadNodeIdUnknown. Pass assignNodeIds=true to alarm.Create, then walk the condition
subtree and rewrite each descendant's NodeId symbolically as
  {condition-full-ref}.{symbolic-path}
in the node manager's namespace. Stable, unique, and collision-free across
multiple alarm instances in the same driver.

Fix 2 — event propagation to Server.EventNotifier: OPC UA Part 9 event
propagation relies on the alarm condition being reachable from Objects/Server
via HasNotifier. Call CustomNodeManager2.AddRootNotifier(alarm) after registering
the condition so subscriptions placed on Server-object EventNotifier receive the
ReportEvent calls ConditionSink emits per-transition.

Test upgrades in AlarmSubscribeIntegrationTests:
  - Driver_alarm_transition_updates_server_side_AlarmConditionState_node — now
    asserts Severity == 700, Message text, and ActiveState.Id == true through
    the OPC UA client (previously scoped out as BadNodeIdUnknown).
  - New: Driver_alarm_event_flows_to_client_subscription_on_Server_EventNotifier
    subscribes an OPC UA event monitor on ObjectIds.Server, fires a driver
    transition, and waits for the AlarmConditionType event to be delivered,
    asserting Message + Severity fields. Previously scoped out as "Part 9 event
    propagation out of reach."

Regression checks: 239 server tests pass (+1 new event-subscription test),
195 Core tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 00:22:02 -04:00
Joseph Doherty
acf31fd943 Task #219 — Server-integration test coverage for IAlarmSource dispatch path
Adds AlarmSubscribeIntegrationTests alongside HistoryReadIntegrationTests so both
optional driver capabilities — IHistoryProvider (already covered) and IAlarmSource
(new) — have end-to-end coverage that boots the full OPC UA stack and exercises the
wiring path from driver event → GenericDriverNodeManager forwarder → DriverNodeManager
ConditionSink through a real Session.

Two tests:
  1. Driver_alarm_transition_updates_server_side_AlarmConditionState_node — a fake
     IAlarmSource declares an IsAlarm=true variable, calls MarkAsAlarmCondition in
     DiscoverAsync, and fires OnAlarmEvent for that source. Verifies the
     client can browse the alarm condition node at FullReference + ".Condition"
     and reads the DisplayName back through Session.Read.
  2. Each_IsAlarm_variable_registers_its_own_condition_node_in_the_driver_namespace —
     two IsAlarm variables each produce their own addressable AlarmConditionState,
     proving the CapturingHandle per-variable registration works.

Scoped-out (documented in the class docstring): the stack exposes AlarmConditionState's
inherited children (Severity / Message / ActiveState / …) with Foundation-namespace
NodeIds that DriverNodeManager does not add to its predefined-node index, so reading
those child attributes through a client returns BadNodeIdUnknown. OPC UA Part 9 event
propagation (subscribe-on-Server + ConditionRefresh) is likewise out of reach until
the node manager wires HasNotifier + child-node registration. The existing Core-level
GenericDriverNodeManagerTests cover the in-memory alarm-sink fan-out semantics.

Full Server.Tests suite: 238 passed, 0 failed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 23:33:45 -04:00
Joseph Doherty
3d78033ea4 Driver-instance bootstrap pipeline (#248) — DriverInstance rows materialise as live IDriver instances
Closes the gap surfaced by Phase 7 live smoke (#240): DriverInstance rows in
the central config DB had no path to materialise as live IDriver instances in
DriverHost, so virtual-tag scripts read BadNodeIdUnknown for every tag.

## DriverFactoryRegistry (Core.Hosting)
Process-singleton type-name → factory map. Each driver project's static
Register call pre-loads its factory at Program.cs startup; the bootstrapper
looks up by DriverInstance.DriverType + invokes with (DriverInstanceId,
DriverConfig JSON). Case-insensitive; duplicate-type registration throws.

## GalaxyProxyDriverFactoryExtensions.Register (Driver.Galaxy.Proxy)
Static helper — no Microsoft.Extensions.DependencyInjection dep, keeps the
driver project free of DI machinery. Parses DriverConfig JSON for PipeName +
SharedSecret + ConnectTimeoutMs. DriverInstanceId from the row wins over JSON
per the schema's UX_DriverInstance_Generation_LogicalId.

## DriverInstanceBootstrapper (Server)
After NodeBootstrap loads the published generation: queries DriverInstance
rows scoped to that generation, looks up the factory per row, constructs +
DriverHost.RegisterAsync (which calls InitializeAsync). Per plan decision
#12 (driver isolation), failure of one driver doesn't prevent others —
logs ERR + continues + returns the count actually registered. Unknown
DriverType (factory not registered) logs WRN + skips so a missing-assembly
deployment doesn't take down the whole server.

## Wired into OpcUaServerService.ExecuteAsync
After NodeBootstrap.LoadCurrentGenerationAsync, before
PopulateEquipmentContentAsync + Phase7Composer.PrepareAsync. The Phase 7
chain now sees a populated DriverHost so CachedTagUpstreamSource has an
upstream feed.

## Live evidence on the dev box
Re-ran the Phase 7 smoke from task #240. Pre-#248 vs post-#248:
  Equipment namespace snapshots loaded for 0/0 driver(s)  ← before
  Equipment namespace snapshots loaded for 1/1 driver(s)  ← after

Galaxy.Host pipe ACL denied our SID (env-config issue documented in
docs/ServiceHosting.md, NOT a code issue) — the bootstrapper logged it as
"failed to initialize, driver state will reflect Faulted" and continued past
the failure exactly per plan #12. The rest of the pipeline (Equipment walker
+ Phase 7 composer) ran to completion.

## Tests — 5 new DriverFactoryRegistryTests
Register + TryGet round-trip, case-insensitive lookup, duplicate-type throws,
null-arg guards, RegisteredTypes snapshot. Pure functions; no DI/DB needed.
The bootstrapper's DB-query path is exercised by the live smoke (#240) which
operators run before each release.
2026-04-20 22:49:25 -04:00
Joseph Doherty
bb10ba7108 Phase 7 follow-up #247 — Galaxy.Host historian writer + SQLite sink activation
Closes the historian leg of Phase 7. Scripted alarm transitions now batch-flow
through the existing Galaxy.Host pipe + queue durably in a local SQLite store-
and-forward when Galaxy is the registered driver, instead of being dropped into
NullAlarmHistorianSink.

## GalaxyHistorianWriter (Driver.Galaxy.Proxy.Ipc)

IAlarmHistorianWriter implementation. Translates AlarmHistorianEvent →
HistorianAlarmEventDto (Stream D contract), batches via the existing
GalaxyIpcClient.CallAsync round-trip on MessageKind.HistorianAlarmEventRequest /
Response, maps per-event HistorianAlarmEventOutcomeDto bytes back to
HistorianWriteOutcome (Ack/RetryPlease/PermanentFail) so the SQLite drain
worker knows what to ack vs dead-letter vs retry. Empty-batch fast path.
Pipe-level transport faults (broken pipe, host crash) bubble up as
GalaxyIpcException which the SQLite sink's drain worker translates to
whole-batch RetryPlease per its catch contract.

## GalaxyProxyDriver implements IAlarmHistorianWriter

Marker interface lets Phase7Composer discover it via type check at compose
time. WriteBatchAsync delegates to a thin GalaxyHistorianWriter wrapping the
driver's existing _client. Throws InvalidOperationException if InitializeAsync
hasn't connected yet — the SQLite drain worker treats that as a transient
batch failure and retries.

## Phase7Composer.ResolveHistorianSink

Replaces the injected sink dep when any registered driver implements
IAlarmHistorianWriter. Constructs SqliteStoreAndForwardSink at
%ProgramData%/OtOpcUa/alarm-historian-queue.db (falls back to %TEMP% when
ProgramData unavailable, e.g. dev), starts the 2s drain timer, owns the sink
disposable for clean teardown. When no driver provides the writer, keeps the
NullAlarmHistorianSink wired by Program.cs (#246).

DisposeAsync now also disposes the owned SQLite sink in the right order:
bridge → engines → owned sink → injected fallback.

## Tests — 7 new GalaxyHistorianWriterMappingTests

ToDto round-trips every field; preserves null Comment; per-byte outcome enum
mapping (Ack / RetryPlease / PermanentFail) via [Theory]; unknown byte throws;
ctor null-guard. The IPC round-trip itself is covered by the live Host suite
(task #240) which constructs a real pipe.

Server.Phase7 tests: 34/34 still pass; Galaxy.Proxy tests: 25/25 (+7 = 32 total).

## Phase 7 production wiring chain — COMPLETE
-  #243 composition kernel
-  #245 scripted-alarm IReadable adapter
-  #244 driver bridge
-  #246 Program.cs wire-in
-  #247 this — Galaxy.Host historian writer + SQLite sink activation

What unblocks now: task #240 live OPC UA E2E smoke. With a Galaxy driver
registered, scripted alarm transitions flow end-to-end through the engine →
SQLite queue → drain worker → Galaxy.Host IPC → Aveva Historian alarm schema.
Without Galaxy, NullSink keeps the engines functional and the queue dormant.
2026-04-20 22:18:39 -04:00
Joseph Doherty
7352db28a6 Phase 7 follow-up #246 — Phase7Composer + Program.cs wire-in
Activates the Phase 7 engines in production. Loads Script + VirtualTag +
ScriptedAlarm rows from the bootstrapped generation, wires the engines through
the Phase7EngineComposer kernel (#243), starts the DriverSubscriptionBridge feed
(#244), and late-binds the resulting IReadable sources to OpcUaApplicationHost
before OPC UA server start.

## Phase7Composer (Server.Phase7)

Singleton orchestrator. PrepareAsync loads the three Phase 7 row sets in one
DB scope, builds CachedTagUpstreamSource, calls Phase7EngineComposer.Compose,
constructs DriverSubscriptionBridge with one DriverFeed per registered
ISubscribable driver (path-to-fullRef map built from EquipmentNamespaceContent
via MapPathsToFullRefs), starts the bridge.

DisposeAsync tears down in the right order: bridge first (no more events fired
into the cache), then engines (cascades + timers stop), then any disposable sink.

MapPathsToFullRefs: deterministic path convention is
  /{areaName}/{lineName}/{equipmentName}/{tagName}
matching exactly what EquipmentNodeWalker emits into the OPC UA browse tree, so
script literals against the operator-visible UNS tree work without translation.
Tags missing EquipmentId or pointing at unknown Equipment are skipped silently
(Galaxy SystemPlatform-style tags + dangling references handled).

## OpcUaApplicationHost.SetPhase7Sources

New late-bind setter. Throws InvalidOperationException if called after
StartAsync because OtOpcUaServer + DriverNodeManagers capture the field values
at construction; mutation post-start would silently fail.

## OpcUaServerService

After bootstrap loads the current generation, calls phase7Composer.PrepareAsync
+ applicationHost.SetPhase7Sources before applicationHost.StartAsync. StopAsync
disposes Phase7Composer first so the bridge stops feeding the cache before the
OPC UA server tears down its node managers (avoids in-flight cascades surfacing
as noisy shutdown warnings).

## Program.cs

Registers IAlarmHistorianSink as NullAlarmHistorianSink.Instance (task #247
swaps in the real Galaxy.Host-writer-backed SqliteStoreAndForwardSink), Serilog
root logger, and Phase7Composer singleton.

## Tests — 5 new Phase7ComposerMappingTests = 34 Phase 7 tests total

Maps tag → walker UNS path, skips null EquipmentId, skips unknown Equipment
reference, multiple tags under same equipment map distinctly, empty content
yields empty map. Pure functions; no DI/DB needed.

The real PrepareAsync DB query path can't be exercised without SQL Server in
the test environment — it's exercised by the live E2E smoke (task #240) which
unblocks once #247 lands.

## Phase 7 production wiring chain status
-  #243 composition kernel
-  #245 scripted-alarm IReadable adapter
-  #244 driver bridge
-  #246 this — Program.cs wire-in
- 🟡 #247 — Galaxy.Host SqliteStoreAndForwardSink writer adapter (replaces NullSink)
- 🟡 #240 — live E2E smoke (unblocks once #247 lands)
2026-04-20 22:06:03 -04:00
Joseph Doherty
e11350cf80 Phase 7 follow-up #244 — DriverSubscriptionBridge
Pumps live driver OnDataChange notifications into CachedTagUpstreamSource so
ctx.GetTag in user scripts sees the freshest driver value. The last missing piece
between #243 (composition kernel) and #246 (Program.cs wire-in).

## DriverSubscriptionBridge

IAsyncDisposable. Per DriverFeed: groups all paths for one ISubscribable into a
single SubscribeAsync call (consolidating polled drivers' work + giving
native-subscription drivers one watch list), keeps a per-feed reverse map from
driver-opaque fullRef back to script-side UNS path, hooks OnDataChange to
translate + push into the cache. DisposeAsync awaits UnsubscribeAsync per active
subscription + unhooks every handler so events post-dispose are silent.

Empty PathToFullRef map → feed skipped (no SubscribeAsync call). Subscribe failure
on any feed unhooks that feed's handler + propagates so misconfiguration aborts
bridge start cleanly. Double-Start throws InvalidOperationException; double-Dispose
is idempotent.

OTOPCUA0001 suppressed at the two ISubscribable call sites with comments
explaining the carve-out: bridge is the lifecycle-coordinator for Phase 7
subscriptions (one Subscribe at engine compose, one Unsubscribe at shutdown),
not the per-call hot-path. Driver Read dispatch still goes through CapabilityInvoker
via DriverNodeManager.

## Tests — 9 new = 29 Phase 7 tests total

DriverSubscriptionBridgeTests covers: SubscribeAsync called with distinct fullRefs,
OnDataChange pushes to cache keyed by UNS path, unmapped fullRef ignored, empty
PathToFullRef skips Subscribe, DisposeAsync unsubscribes + unhooks (post-dispose
events don't push), StartAsync called twice throws, DisposeAsync idempotent,
Subscribe failure unhooks handler + propagates, ctor null guards.

## Phase 7 production wiring chain status
- #243  composition kernel
- #245  scripted-alarm IReadable adapter
- #244  this — driver bridge
- #246 pending — Program.cs Compose call + SqliteStoreAndForwardSink lifecycle
- #240 pending — live E2E smoke (unblocks once #246 lands)
2026-04-20 21:53:05 -04:00
Joseph Doherty
d6a8bb1064 Phase 7 follow-up #245 — ScriptedAlarmReadable adapter over engine state
Task #245 — exposes each scripted alarm's current ActiveState as IReadable so
OPC UA variable reads on Source=ScriptedAlarm nodes return the live predicate
truth instead of BadNotFound.

## ScriptedAlarmReadable

Wraps ScriptedAlarmEngine + implements IReadable:
- Known alarm + Active → DataValueSnapshot(true, Good)
- Known alarm + Inactive → DataValueSnapshot(false, Good)
- Unknown alarm id → DataValueSnapshot(null, BadNodeIdUnknown) — surfaces
  misconfiguration rather than silently reading false
- Batch reads preserve request order

Phase7EngineComposer.Compose now returns this as ScriptedAlarmReadable when
ScriptedAlarm rows are present. ScriptedAlarmSource (IAlarmSource for the event
stream) stays in place — the IReadable is a separate adapter over the same engine.

## Tests — 6 new + 1 updated composer test = 19 total Phase 7 tests

ScriptedAlarmReadableTests covers: inactive + active predicate → bool snapshot,
unknown alarm id → BadNodeIdUnknown, batch order preservation, null-engine +
null-fullReferences guards. The active-predicate test uses ctx.GetTag on a seeded
upstream value to drive a real cascade through the engine.

Updated Phase7EngineComposerTests to assert ScriptedAlarmReadable is non-null
when alarms compose, null when only virtual tags.

## Follow-ups remaining
- #244 — driver-bridge feed populating CachedTagUpstreamSource
- #246 — Program.cs Compose call + SqliteStoreAndForwardSink lifecycle
2026-04-20 21:30:56 -04:00
Joseph Doherty
f64a8049d8 Phase 7 follow-up #243 — CachedTagUpstreamSource + Phase7EngineComposer
Ships the composition kernel that maps Config DB rows (Script / VirtualTag /
ScriptedAlarm) to the runtime definitions VirtualTagEngine + ScriptedAlarmEngine
consume, builds the engine instances, and wires OnEvent → historian-sink routing.

## src/ZB.MOM.WW.OtOpcUa.Server/Phase7/

- CachedTagUpstreamSource — implements both Core.VirtualTags.ITagUpstreamSource and
  Core.ScriptedAlarms.ITagUpstreamSource (identical shape, distinct namespaces) on one
  concrete type so the composer can hand one instance to both engines. Thread-safe
  ConcurrentDictionary value cache with synchronous ReadTag + fire-on-write
  Push(path, snapshot) that fans out to every observer registered via SubscribeTag.
  Unknown-path reads return a BadNodeIdUnknown-quality snapshot (status 0x80340000)
  so scripts branch on quality naturally.
- Phase7EngineComposer.Compose(scripts, virtualTags, scriptedAlarms, upstream,
  alarmStateStore, historianSink, rootScriptLogger, loggerFactory) — single static
  entry point that:
  * Indexes scripts by ScriptId, resolves VirtualTag.ScriptId + ScriptedAlarm.PredicateScriptId
    to full SourceCode
  * Projects DB rows to VirtualTagDefinition + ScriptedAlarmDefinition (mapping
    DataType string → DriverDataType enum, AlarmType string → AlarmKind enum,
    Severity 1..1000 → AlarmSeverity bucket matching the OPC UA Part 9 bands
    that AbCipAlarmProjection + OpcUaClient MapSeverity already use)
  * Constructs VirtualTagEngine + loads definitions (throws InvalidOperationException
    with the list of scripts that failed to compile — aggregated like Streams B+C)
  * Constructs ScriptedAlarmEngine + loads definitions + wires OnEvent →
    IAlarmHistorianSink.EnqueueAsync using ScriptedAlarmEvent.Emission as the event
    kind + Condition.LastAckUser/LastAckComment for audit fields
  * Returns Phase7ComposedSources with Disposables list the caller owns

Empty Phase 7 config returns Phase7ComposedSources.Empty so deployments without
scripts / alarms behave exactly as pre-Phase-7. Non-null sources flow into
OpcUaApplicationHost's virtualReadable / scriptedAlarmReadable plumbing landed by
task #239 — DriverNodeManager then dispatches reads by NodeSourceKind per PR #186.

## Tests — 12/12

CachedTagUpstreamSourceTests (6):
- Unknown-path read returns BadNodeIdUnknown-quality snapshot
- Push-then-Read returns cached value
- Push fans out to subscribers in registration order
- Push to one path doesn't fire another path's observer
- Dispose of subscription handle stops fan-out
- Satisfies both Core.VirtualTags + Core.ScriptedAlarms ITagUpstreamSource interfaces

Phase7EngineComposerTests (6):
- Empty rows → Phase7ComposedSources.Empty (both sources null)
- VirtualTag rows → VirtualReadable non-null + Disposables populated
- Missing script reference throws InvalidOperationException with the missing ScriptId
  in the message
- Disabled VirtualTag row skipped by projection
- TimerIntervalMs → TimeSpan.FromMilliseconds round-trip
- Severity 1..1000 maps to Low/Medium/High/Critical at 250/500/750 boundaries
  (matches AbCipAlarmProjection + OpcUaClient.MapSeverity banding)

## Scope — what this PR does NOT do

The composition kernel is the tricky part; the remaining wiring is three narrower
follow-ups that each build on this PR:

- task #244 — driver-bridge feed that populates CachedTagUpstreamSource from live
  driver subscriptions. Without this, ctx.GetTag returns BadNodeIdUnknown even when
  the driver has a fresh value.
- task #245 — ScriptedAlarmReadable adapter exposing each alarm's current Active
  state as IReadable. Phase7EngineComposer.Compose currently returns
  ScriptedAlarmReadable=null so reads on Source=ScriptedAlarm variables return
  BadNotFound per the ADR-002 "misconfiguration not silent fallback" signal.
- task #246 — Program.cs call to Phase7EngineComposer.Compose with config rows
  loaded from the sealed-cache DB read, plus SqliteStoreAndForwardSink lifecycle
  wiring at %ProgramData%/OtOpcUa/alarm-historian-queue.db with the Galaxy.Host
  IPC writer from Stream D.

Task #240 (live OPC UA E2E smoke) depends on all three follow-ups landing.
2026-04-20 21:23:31 -04:00
Joseph Doherty
d78741cfdf Admin.E2ETests scaffolding — Playwright + Kestrel + InMemory DB + test auth
Ships the E2E infrastructure filed against task #199 (UnsTab drag-drop Playwright
smoke). The Blazor Server interactive-render assertion through a test-owned pipeline
needs a dedicated diagnosis pass — filed as task #242 — but the Playwright harness
lands here so that follow-up starts from a known-good scaffolding rather than
setting up the project from scratch.

## New project tests/ZB.MOM.WW.OtOpcUa.Admin.E2ETests

- AdminWebAppFactory — boots the Admin pipeline with Kestrel on a free loopback port,
  swaps the SQL DbContext for EF Core InMemory, replaces the LDAP cookie auth with
  TestAuthHandler, mirrors the Razor-components/auth/antiforgery pipeline, and seeds
  a cluster + draft generation with areas warsaw / berlin and a line-a1 in warsaw.
  Not a WebApplicationFactory<Program> because WAF's TestServer transport doesn't
  coexist cleanly with Kestrel-on-a-real-port, which Playwright needs.
- TestAuthHandler — stamps every request with a FleetAdmin claim so tests hit
  authenticated routes without the LDAP bind.
- PlaywrightFixture — one Chromium launch shared across tests; throws
  PlaywrightBrowserMissingException when the binary isn't installed so tests can
  Assert.Skip rather than fail hard.
- UnsTabDragDropE2ETests.Admin_host_serves_HTTP_via_Playwright_scaffolding — proves
  the full stack comes up: Kestrel bind, InMemory DbContext, test auth, Playwright
  navigation, Razor route pipeline responds with HTML < 500. One passing test.

## Prerequisite

Chromium must be installed locally:
  pwsh tests/ZB.MOM.WW.OtOpcUa.Admin.E2ETests/bin/Debug/net10.0/playwright.ps1 install chromium

Absent the browser, the suite Assert.Skip's cleanly — CI without the install step
still reports green. Once installed, `dotnet test` runs the scaffolding smoke in ~12s.

## Follow-up (task #242)

Diagnose why `/clusters/{id}/draft/{gen}` → UNS-tab click → drag-drop flow times out
under the test-owned Program.cs replica. Candidate causes: route-ordering difference,
missing SignalR hub mapping timing, JS interop asset differences, culture middleware.
Once the interactive circuit boots, add:
- happy-path drag-drop assertion (source row → target area → Confirm → assert re-parent)
- 409 conflict variant (preview → external DB mutation → Confirm → assert red-header modal)
2026-04-20 20:55:57 -04:00
Joseph Doherty
f0851af6b5 Phase 7 Stream G follow-up — DriverNodeManager dispatch routing by NodeSourceKind
Honors the ADR-002 discriminator at OPC UA Read/Write dispatch time. Virtual tag
reads route to the VirtualTagEngine-backed IReadable; scripted alarm reads route
to the ScriptedAlarmEngine-backed IReadable; driver reads continue to route to the
driver's own IReadable (no regression for any existing driver test).

## Changes

DriverNodeManager ctor gains optional `virtualReadable` + `scriptedAlarmReadable`
parameters. When callers omit them (every existing driver test) the manager behaves
exactly as before. SealedBootstrap wires the engines' IReadable adapters once the
Phase 7 composition root is added.

Per-variable NodeSourceKind tracked in `_sourceByFullRef` during Variable() registration
alongside the existing `_writeIdempotentByFullRef` / `_securityByFullRef` maps.

OnReadValue now picks the IReadable by source kind via the new internal
SelectReadable helper. When the engine-backed IReadable isn't wired (virtual tag
node but no engine provided), returns BadNotFound rather than silently falling
back to the driver — surfaces a misconfiguration instead of masking it.

OnWriteValue gates on IsWriteAllowedBySource which returns true only for Driver.
Plan decision #6: virtual tags + scripted alarms reject direct OPC UA writes with
BadUserAccessDenied. Scripts write virtual tags via `ctx.SetVirtualTag`; operators
ack alarms via the Part 9 method nodes.

## Tests — 7/7 (internal helpers exposed via InternalsVisibleTo)

DriverNodeManagerSourceDispatchTests covers:
- Driver source routes to driver IReadable
- Virtual source routes to virtual IReadable
- ScriptedAlarm source routes to alarm IReadable
- Virtual source with null virtual IReadable returns null (→ BadNotFound)
- ScriptedAlarm source with null alarm IReadable returns null
- Driver source with null driver IReadable returns null (preserves BadNotReadable)
- IsWriteAllowedBySource: only Driver=true (Virtual=false, ScriptedAlarm=false)

Full solution builds clean. Phase 7 test total now 197 green.
2026-04-20 20:12:17 -04:00
Joseph Doherty
0687bb2e2d Phase 7 Stream F — Admin UI for scripts + test harness + historian diagnostics
Adds the draft-editor tab + page surface for authoring Phase 7 virtual tags and
scripted alarms, plus the /alarms/historian operator diagnostics page. Monaco loads
from CDN via a progressive-enhancement JS shim — the textarea works immediately so
the page is functional even if the CDN is unreachable.

## New services (Admin)

- ScriptService — CRUD for Script entity. SHA-256 SourceHash recomputed on save so
  Core.Scripting's CompiledScriptCache hits on re-publish of unchanged source + misses
  when the source actually changes.
- VirtualTagService — CRUD for VirtualTag, with Enabled toggle.
- ScriptedAlarmService — CRUD for ScriptedAlarm + lookup of persistent ScriptedAlarmState
  (logical-id-keyed per plan decision #14).
- ScriptTestHarnessService — pre-publish dry-run. Enforces plan decision #22: only
  inputs the DependencyExtractor identifies can be supplied. Missing / extra synthetic
  inputs surface as dedicated outcomes. Captures SetVirtualTag writes + Serilog events
  from the script so the operator can see both the output + the log output before
  publishing.
- HistorianDiagnosticsService — surfaces the local-process IAlarmHistorianSink state
  on /alarms/historian. Null sink reports Disabled + swallows retry. Live
  SqliteStoreAndForwardSink reports real queue depth + last-error + drain state and
  routes the Retry-dead-lettered button through.

## New UI

- ScriptsTab.razor (inside DraftEditor tabs) — list + create/edit/delete scripts with
  Monaco editor + dependency preview + test-harness run panel showing output + writes
  + log emissions.
- ScriptEditor.razor — reusable Monaco-backed textarea. Loads editor from CDN via
  wwwroot/js/monaco-loader.js. Textarea stays authoritative for Blazor binding; Monaco
  mirrors into it on every keystroke.
- AlarmsHistorian.razor (/alarms/historian) — queue depth + dead-letter depth + drain
  state badge + last-error banner + Retry-dead-lettered button.
- DraftEditor.razor — new "Scripts" tab.

## DI wiring

All five services registered in Program.cs. Null historian sink bound at Admin
composition time (real SqliteStoreAndForwardSink lives in the Server process).

## Tests — 13/13

Phase7ServicesTests covers:
- ScriptService: Add generates logical id + hash, Update recomputes hash on source
  change, Update same-source keeps hash (cache-hit preservation), Delete is idempotent
- VirtualTagService: round-trips trigger flags, Enabled toggle works
- ScriptedAlarmService: HistorizeToAveva defaults true per plan decision #15
- ScriptTestHarness: successful run captures output + writes, rejects missing /
  extra inputs, rejects non-literal paths, compile errors surface as Threw
- HistorianDiagnosticsService: null sink reports Disabled + retry returns 0
2026-04-20 19:59:18 -04:00
Joseph Doherty
f1f53e1789 Phase 7 Stream G — Address-space integration (NodeSourceKind + walker emits VirtualTag/ScriptedAlarm)
Per ADR-002, adds the Driver/Virtual/ScriptedAlarm discriminator to DriverAttributeInfo
so the DriverNodeManager's dispatch layer can route Read/Write/Subscribe to the right
runtime subsystem — drivers (unchanged), VirtualTagEngine (Phase 7 Stream B), or
ScriptedAlarmEngine (Phase 7 Stream C).

## Changes
- NodeSourceKind enum added to Core.Abstractions (Driver=0/Virtual=1/ScriptedAlarm=2).
- DriverAttributeInfo gains Source / VirtualTagId / ScriptedAlarmId parameters — all
  default so existing call sites (every driver) compile unchanged.
- EquipmentNamespaceContent gains VirtualTags + ScriptedAlarms optional collections.
- EquipmentNodeWalker emits:
  - Virtual-tag variables — Source=Virtual, VirtualTagId set, Historize flag honored
  - Scripted-alarm variables — Source=ScriptedAlarm, ScriptedAlarmId set, IsAlarm=true
    (triggers node-manager AlarmConditionState materialization)
  - Skips disabled virtual tags + scripted alarms

## Tests — 13/13 in EquipmentNodeWalkerTests (5 new)
- Virtual-tag variables carry Source=Virtual + VirtualTagId + Historize flag
- Scripted-alarm variables carry Source=ScriptedAlarm + IsAlarm=true + Boolean type
- Disabled rows skipped
- Null VirtualTags/ScriptedAlarms collections safe (back-compat for non-Phase-7 callers)
- Driver tags default Source=Driver (ensures no discriminator regression)

## Next
Stream G follow-up: DriverNodeManager dispatch (Read/Write/Subscribe routing by
NodeSourceKind), SealedBootstrap wiring of VirtualTagEngine + ScriptedAlarmEngine,
end-to-end integration test.
2026-04-20 19:41:01 -04:00
Joseph Doherty
be1003c53e Phase 7 Stream E — Config DB schema for scripts, virtual tags, scripted alarms, and alarm state
Adds the four tables Streams B/C/F consume — Script (generation-scoped source code),
VirtualTag (generation-scoped calculated-tag config), ScriptedAlarm (generation-scoped
alarm config), and ScriptedAlarmState (logical-id-keyed persistent runtime state).

## New entities (net10, EF Core)

- Script — stable logical ScriptId carries across generations; SourceHash is the
  compile-cache key (matches Core.Scripting's CompiledScriptCache).
- VirtualTag — mandatory EquipmentId FK (plan decision #2, unified Equipment tree);
  ChangeTriggered/TimerIntervalMs + Historize flags; check constraints enforce
  "at least one trigger" + "timer >= 50ms".
- ScriptedAlarm — required AlarmType ('AlarmCondition'/'LimitAlarm'/'OffNormalAlarm'/
  'DiscreteAlarm'); Severity 1..1000 range check; HistorizeToAveva default true per
  plan decision #15.
- ScriptedAlarmState — keyed ONLY on ScriptedAlarmId (NOT generation-scoped) per plan
  decision #14 — ack state + audit trail must follow alarm identity across Modified
  generations. CommentsJson has ISJSON check for GxP audit.

## Migration

EF-generated 20260420231641_AddPhase7ScriptingTables covers all 4 tables + indexes +
check constraints + FKs to ConfigGeneration. sp_PublishGeneration required no changes —
it only flips Draft->Published status; the new entities already carry GenerationId so
they publish atomically with the rest of the config.

## Tests — 12/12 (design-time model introspection)

Phase7ScriptingEntitiesTests covers: table registration, column maxlength + column
types, unique indexes (Generation+LogicalId, Generation+EquipmentPath for VirtualTag
and ScriptedAlarm), secondary indexes (SourceHash for cache lookup), check constraints
(trigger-required, timer-min, severity-range, alarm-type-enum, CommentsJson-IsJson),
ScriptedAlarmState PK is alarm-id not generation-scoped, ScriptedAlarm defaults
(HistorizeToAveva=true, Retain=true, Severity=500, Enabled=true), DbSets wired, and
the generated migration type exists for rollforward.
2026-04-20 19:22:45 -04:00
Joseph Doherty
25ad4b1929 Phase 7 Stream D — Historian alarm sink (SQLite store-and-forward + Galaxy.Host IPC contracts)
Phase 7 plan decisions #16, #17, #19, #21 implementation. Durable local SQLite queue
absorbs every qualifying alarm event; drain worker forwards batches to Galaxy.Host
(reusing the already-loaded 32-bit aahClientManaged DLLs) on an exponential-backoff
cadence; operator acks never block on the historian being reachable.

## New project Core.AlarmHistorian (net10)

- AlarmHistorianEvent — source-agnostic event shape (scripted alarms + Galaxy-native +
  AB CIP ALMD + any future IAlarmSource)
- IAlarmHistorianSink / NullAlarmHistorianSink — interface + disabled default
- IAlarmHistorianWriter — per-event outcome (Ack / RetryPlease / PermanentFail); Stream G
  wires the Galaxy.Host IPC client implementation
- SqliteStoreAndForwardSink — full implementation:
  - Queue table with AttemptCount / LastError / DeadLettered columns
  - DrainOnceAsync serialised via SemaphoreSlim
  - BackoffLadder 1s → 2s → 5s → 15s → 60s (cap)
  - DefaultCapacity 1,000,000 rows — overflow evicts oldest non-dead-lettered
  - DefaultDeadLetterRetention 30 days — sweeper purges on every drain tick
  - RetryDeadLettered operator action reattaches dead-letters to the regular queue
  - Writer-side exceptions treated as whole-batch RetryPlease (no data loss)

## New IPC contracts in Driver.Galaxy.Shared

- HistorianAlarmEventRequest — batched up to 100 events/request per plan Stream D.5
- HistorianAlarmEventResponse — per-event outcome (1:1 with request order)
- HistorianAlarmEventOutcomeDto enum (byte on the wire — Ack/RetryPlease/PermanentFail)
- HistorianAlarmEventDto — mirrors Core.AlarmHistorian.AlarmHistorianEvent
- HistorianConnectivityStatusNotification — Host pushes proactively when the SDK
  session drops so /alarms/historian flips red without waiting for the next drain
- MessageKind additions: 0x80 HistorianAlarmEventRequest / 0x81 HistorianAlarmEventResponse
  / 0x82 HistorianConnectivityStatus

## Tests — 14/14

SqliteStoreAndForwardSinkTests covers: enqueue→drain→Ack round-trip, empty-queue no-op,
RetryPlease bumps backoff + keeps row, Ack after Retry resets backoff, PermanentFail
dead-letters one row without blocking neighbors, writer exception treated as whole-batch
retry with error surfaced in status, capacity eviction drops oldest non-dead-lettered,
dead-letters purged past retention window, RetryDeadLettered requeues, ladder caps at
60s after 10 retries, Null sink reports Disabled status, null sink swallows enqueue,
ctor argument validation, disposed sink rejects enqueue.

## Totals
Full Phase 7 tests: 160 green (63 Scripting + 36 VirtualTags + 47 ScriptedAlarms +
14 AlarmHistorian). Stream G wires this into the real Galaxy.Host IPC pipe.
2026-04-20 19:11:17 -04:00
Joseph Doherty
df39809526 Phase 7 Stream C — Core.ScriptedAlarms project (Part 9 state machine + predicate engine + IAlarmSource adapter)
Ships the Part 9 alarm fidelity layer Phase 7 committed to in plan decision #5. Every scripted alarm gets a full OPC UA AlarmConditionType state machine — EnabledState, ActiveState, AckedState, ConfirmedState, ShelvingState — with persistent operator-supplied state across server restarts per Phase 7 plan decision #14. Runtime shape matches the Galaxy-native + AB CIP ALMD alarm sources: scripted alarms fan out through the existing IAlarmSource surface so Phase 6.1 AlarmTracker composition consumes them without per-source branching.

Part9StateMachine is a pure-functions module — no instance state, no I/O, no mutation. Every transition (ApplyPredicate, ApplyAcknowledge, ApplyConfirm, ApplyOneShotShelve, ApplyTimedShelve, ApplyUnshelve, ApplyEnable, ApplyDisable, ApplyAddComment, ApplyShelvingCheck) takes the current AlarmConditionState record plus the event and returns a fresh state + EmissionKind hint. Two structural invariants enforced: disabled alarms never transition ActiveState / AckedState / ConfirmedState; shelved alarms still advance state (so startup recovery reflects reality) but emit a Suppressed hint so subscribers do not see the transition. OneShot shelving expires on clear; Timed shelving expires via ApplyShelvingCheck against the UnshelveAtUtc timestamp. Comments are append-only — every acknowledge, confirm, shelve, unshelve, enable, disable, explicit add-comment, and auto-unshelve appends an AlarmComment record with user identity + timestamp + kind + text for the GxP / 21 CFR Part 11 audit surface.

AlarmConditionState is the persistent record the store saves. Fields: AlarmId, Enabled, Active, Acked, Confirmed, Shelving (kind + UnshelveAtUtc), LastTransitionUtc, LastActiveUtc, LastClearedUtc, LastAckUtc + LastAckUser + LastAckComment, LastConfirmUtc + LastConfirmUser + LastConfirmComment, Comments. Fresh factory initializes everything to the no-event position.

IAlarmStateStore is the persistence abstraction — LoadAsync, LoadAllAsync, SaveAsync, RemoveAsync. Stream E wires this to a SQL-backed store with IAuditLogger hooks; tests use InMemoryAlarmStateStore. Startup recovery per Phase 7 plan decision #14: LoadAsync runs every configured alarm predicate against current tag values to rederive ActiveState, but EnabledState / AckedState / ConfirmedState / ShelvingState + audit history are loaded verbatim from the store so operators do not re-ack after an outage and shelved alarms stay shelved through maintenance windows.

MessageTemplate implements Phase 7 plan decision #13 — static-with-substitution. {TagPath} tokens resolved at event emission time from the engine value cache. Missing paths, non-Good quality, or null values all resolve to {?} so the event still fires but the operator sees where the reference broke. ExtractTokenPaths enumerates tokens at publish time so the engine knows to subscribe to every template-referenced tag in addition to predicate-referenced tags.

AlarmPredicateContext is the ScriptContext subclass alarm scripts see. GetTag reads from the engine shared cache; SetVirtualTag is explicitly rejected at runtime with a pointed error message — alarm predicates must be pure so their output does not couple to virtual-tag state in ways that become impossible to reason about. If cross-tag side effects are needed, the operator authors a virtual tag and the alarm predicate reads it.

ScriptedAlarmEngine orchestrates. LoadAsync compiles every predicate through Stream A ScriptSandbox + ForbiddenTypeAnalyzer, runs DependencyExtractor to find the read set, adds template token paths to the input set, reports every compile failure as one aggregated InvalidOperationException (not one-at-a-time), subscribes to each unique referenced upstream path, seeds the value cache, loads persisted state for each alarm (falling back to Fresh for first-load), re-evaluates the predicate, and saves the recovered state. ChangeTrigger — when an upstream tag changes, look up every alarm referencing that path in a per-path inverse index, enqueue all of them for re-evaluation via a SemaphoreSlim-gated path. Unlike the virtual-tag engine, scripted alarms are leaves in the evaluation DAG (no alarm drives another alarm), so no topological sort is needed. Operator actions (AcknowledgeAsync, ConfirmAsync, OneShotShelveAsync, TimedShelveAsync, UnshelveAsync, EnableAsync, DisableAsync, AddCommentAsync) route through the state machine, persist, and emit if there is an emission. A 5-second shelving-check timer auto-expires Timed shelving and emits Unshelved events at the right moment. Predicate evaluation errors (script throws, timeout, compile-time reads bad tag) leave the state unchanged — the engine does NOT invent a clear transition on predicate failure. Logged as scripts-*.log Error; companion WARN in main log.

ScriptedAlarmSource implements IAlarmSource. SubscribeAlarmsAsync filter is a set of equipment-path prefixes; empty means all. AcknowledgeAsync from the base interface routes to the engine with user identity "opcua-client" — Stream G will replace this with the authenticated principal from the OPC UA dispatch layer. The adapter implements only the base IAlarmSource methods; richer Part 9 methods (Confirm, Shelve, Unshelve, AddComment) remain on the engine and will bind to OPC UA method nodes in Stream G.

47 unit tests across 5 files. Part9StateMachineTests (16) — every transition + noop edge cases: predicate true/false, same-state noop, disabled ignores predicate, acknowledge records user/comment/adds audit, idempotent acknowledge, reject no-user ack, full activate-ack-clear-confirm walk, one-shot shelve suppresses next activation, one-shot expires on clear, timed shelve requires future unshelve time, timed shelve expires via shelving-check, explicit unshelve emits, add-comment appends to audit, comments append-only through multiple operations, full lifecycle walk emits every expected EmissionKind. MessageTemplateTests (11) — no-token passthrough, single+multiple token substitution, bad quality becomes {?}, unknown path becomes {?}, null value becomes {?}, tokens with slashes+dots, empty + null template, ExtractTokenPaths returns every distinct path, whitespace inside tokens trimmed. ScriptedAlarmEngineTests (13) — load compiles+subscribes, compile failures aggregated, upstream change emits Activated, clearing emits Cleared, message template resolves at emission, ack persists to store, startup recovery preserves ack but rederives active, shelved activation state-advances but suppresses emission, runtime exception isolates to owning alarm, disable prevents activation until re-enable, AddComment appends audit without state change, SetVirtualTag from predicate rejected (state unchanged), Dispose releases upstream subscriptions. ScriptedAlarmSourceTests (5) — empty filter matches all, equipment-prefix filter, Unsubscribe stops events, AcknowledgeAsync routes with default user, null arguments rejected. FakeUpstream fixture gives tests an in-memory driver mock with subscription count tracking.

Full Phase 7 test count after Stream C: 146 green (63 Scripting + 36 VirtualTags + 47 ScriptedAlarms). Stream D (historian alarm sink with SQLite store-and-forward + Galaxy.Host IPC) consumes ScriptedAlarmEvent + similar Galaxy / AB CIP emissions to produce the unified alarm timeline. Stream G wires the OPC UA method calls and AlarmSource into DriverNodeManager dispatch.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 18:49:48 -04:00
Joseph Doherty
479af166ab Phase 7 Stream B — Core.VirtualTags project (engine + dep graph + timer + source)
Ships the evaluation engine that consumes compiled scripts from Stream A, subscribes to upstream driver tags, runs on change + on timer, cascades evaluations through dependent virtual tags in topological order, and emits changes through a driver-capability-shaped adapter the DriverNodeManager can dispatch to per ADR-002.

DependencyGraph owns the directed dep-graph where nodes are tag paths (driver tags implicit leaves, virtual tags registered internal nodes) and edges run from a virtual tag to each tag it reads. Kahn algorithm produces the topological sort. Tarjan iterative SCC detects every cycle in one pass so publish-time rejection surfaces all offending cycles together. Both iterative so 10k-deep chains do not StackOverflow. Re-adding a node overwrites prior dependency set cleanly (supports config-publish reloads).

VirtualTagDefinition is the operator-authored config row (Path, DataType, ScriptSource, ChangeTriggered, TimerInterval, Historize). Stream E config DB materializes these on publish.

ITagUpstreamSource is the abstraction the engine pulls driver tag values from. Stream G bridges this to IReadable + ISubscribable on live drivers; tests use FakeUpstream that tracks subscription count for leak-test assertions.

IHistoryWriter is the per-tag Historize sink. NullHistoryWriter default when caller does not pass one.

VirtualTagContext is the per-evaluation ScriptContext. Reads from engine last-known-value cache, writes route through SetVirtualTag callback so cross-tag side effects participate in change cascades. Injectable Now clock for deterministic tests.

VirtualTagEngine orchestrates. Load compiles every script via ScriptSandbox, builds the dep graph via DependencyExtractor, checks for cycles, reports every compile failure in one error, subscribes to each referenced upstream path, seeds the value cache. EvaluateAllAsync runs topological order. EvaluateOneAsync is timer path. Read returns cached value. Subscribe registers observer. OnUpstreamChange updates cache, fans out, schedules transitive dependents (change-driven=false tags skipped). EvaluateInternalAsync holds a SemaphoreSlim so cascades do not interleave. Script exceptions and timeouts map per-tag to BadInternalError. Coercion from script double to config Int32 uses Convert.ToInt32.

TimerTriggerScheduler groups tags by interval into shared Timers. Tags without TimerInterval not scheduled.

VirtualTagSource implements IReadable + ISubscribable per ADR-002. ReadAsync returns cache. SubscribeAsync fires initial-data callback per OPC UA convention. IWritable deliberately not implemented — OPC UA writes to virtual tags rejected in DriverNodeManager per Phase 7 decision 6.

36 unit tests across 4 files: DependencyGraphTests 12, VirtualTagEngineTests 13, VirtualTagSourceTests 6, TimerTriggerSchedulerTests 4. Coverage includes cycle detection (self-loop, 2-node, 3-node, multiple disjoint), 2-level change cascade, per-tag error isolation (one tag throws, others keep working), timeout isolation, Historize toggle, ChangeTriggered=false ignore, reload cleans subscriptions, Dispose releases resources, SetVirtualTag fires observers, type coercion, 10k deep graph no stack overflow, initial-data callback, Unsubscribe stops events.

Fixed two bugs during implementation. Monitor.Enter/Exit cannot be held across await (Monitor ownership is thread-local and lost across suspension) — switched to SemaphoreSlim. Kahn edge-direction was inverted — for dependency ordering (X depends on Y means Y comes before X) in-degree should be count of a node own deps, not count of nodes pointing to it; was incrementing inDegree[dep] instead of inDegree[nodeId], causing false cycle detection on valid DAGs.

Full Phase 7 test count after Stream B: 99 green (63 Scripting + 36 VirtualTags). Streams C and G will plug engine + source into live OPC UA dispatch path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 17:02:50 -04:00
Joseph Doherty
36774842cf Phase 7 Stream A.3 — ScriptLoggerFactory + ScriptLogCompanionSink. Third of 3 increments closing out Stream A. Adds the Serilog plumbing that ties script-emitted log events to the dedicated scripts-*.log rolling sink with structured-property filtering AND forwards script Error+ events to the main opcua-*.log at Warning level so operators see script failures in the primary log without drowning it in Debug/Info script chatter. Both pieces are library-level building blocks — the actual file-sink + logger composition at server startup happens in Stream F (Admin UI) / Stream G (address-space wiring). This PR ships the reusable factory + sink + tests so any consumer can wire them up without rediscovering the structured-property contract.
ScriptLoggerFactory wraps a Serilog root logger (the scripts-*.log pipeline) and .Create(scriptName) returns a per-script ILogger with the ScriptName structured property pre-bound via ForContext. The structured property name is a public const (ScriptNameProperty = "ScriptName") because the Admin UI's log-viewer filter references this exact string — changing it breaks the filter silently, so it's stable by contract. Factory constructor rejects a null root logger; Create rejects null/empty/whitespace script names. No per-evaluation allocation in the hot path — engines (Stream B virtual-tag / Stream C scripted-alarm) create one factory per engine instance then cache per-script loggers beside the ScriptContext instances they already build.

ScriptLogCompanionSink is a Serilog ILogEventSink that forwards Error+ events from the script-logger pipeline to a separate "main" logger (the opcua-*.log pipeline in production) at Warning level. Rationale: operators usually watch the main server log, not scripts-*.log. Script authors log Info/Debug liberally during development — those stay in the scripts file. When a script actually fails (Error or Fatal), the operator needs to see it in the primary log so it can't be missed. Downgrading to Warning in the main log marks these as "needs attention but not a core server issue" since the server itself is healthy; the script author fixes the script. Forwarded event includes the ScriptName property (so operators can tell which script failed at a glance), the OriginalLevel (Error vs Fatal, preserved), the rendered message, and the original exception (preserved so the main log keeps the full stack trace — critical for diagnosis). Missing ScriptName property falls back to "unknown" without throwing; bypassing the factory is defensive but shouldn't happen in practice. Mirror threshold is configurable via constructor (defaults to LogEventLevel.Error) so deployments with stricter signal/noise requirements can raise it to Fatal.

15 new unit tests across two files. ScriptLoggerFactoryTests (6): Create sets the ScriptName structured property, each script gets its own property value across fan-out, Error-level event preserves level and exception, null root rejected, empty/whitespace/null name rejected, ScriptNameProperty const is stable at "ScriptName" (external-contract guard). ScriptLogCompanionSinkTests (9): Info/Warning events land in scripts sink only (not mirrored), Error event mirrored to main at Warning level (level-downgrade behavior), mirrored event includes ScriptName + OriginalLevel properties, mirrored event preserves exception for main-log stack-trace diagnosis, Fatal mirrored identically to Error, missing ScriptName falls back to "unknown" without throwing (defensive), null main logger rejected, custom mirror threshold (raised to Fatal) applied correctly.

Full Core.Scripting test suite after Stream A: 63/63 green (29 A.1 + 19 A.2 + 15 A.3). Stream A is complete — the scripting engine foundation, sandbox, sandbox-defense-in-depth, AST-inferred dependency extraction, compile cache, per-evaluation timeout, per-script logger with structured-property filtering, and companion-warn forwarding are all shipped and tested. Streams B through G build on this; Stream H closes out the phase with the compliance script + test baseline + merge to v2.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 16:42:48 -04:00
Joseph Doherty
0ae715cca4 Phase 7 Stream A.2 — compile cache + per-evaluation timeout wrapper. Second of 3 increments within Stream A. Adds two independent resilience primitives that the virtual-tag engine (Stream B) and scripted-alarm engine (Stream C) will compose with the base ScriptEvaluator. Both are generic on (TContext, TResult) so different engines get their own instances without cross-contamination.
CompiledScriptCache<TContext, TResult> — source-hash-keyed cache of compiled evaluators. Roslyn compilation is the most expensive step in the evaluator pipeline (5-20ms per script depending on size); re-compiling on every value-change event would starve the engine. ConcurrentDictionary of Lazy<ScriptEvaluator> with ExecutionAndPublication mode ensures concurrent callers never double-compile even on a cold cache race. Failed compiles evict the cache entry so an Admin UI retry with corrected source actually recompiles (otherwise the cached exception would persist). Whitespace-sensitive hash — reformatting a script misses the cache on purpose, simpler than AST-canonicalize and happens rarely. No capacity bound because virtual-tag + alarm scripts are config-DB bounded (thousands, not millions); if scale pushes past that in v3 an LRU eviction slots in behind the same API.

TimedScriptEvaluator<TContext, TResult> — wraps a ScriptEvaluator with a per-evaluation wall-clock timeout (default 250ms per Phase 7 plan Stream A.4, configurable per tag so slower backends can widen). Critical implementation detail: the underlying Roslyn ScriptRunner executes synchronously on the calling thread for CPU-bound user scripts, returning an already-completed Task before the caller can register a timeout. Naive `Task.WaitAsync(timeout)` would see the completed task and never fire. Fix: push evaluation to a thread-pool thread via Task.Run, so the caller's thread is free to wait and the timeout reliably fires after the configured budget. Known trade-off (documented in the class summary): when a script times out, the underlying evaluation task continues running on the thread-pool thread until Roslyn returns; in the CPU-bound-infinite-loop case it's effectively leaked until the runtime decides to unwind. Tighter CPU budgeting would require an out-of-process script runner (v3 concern). In practice the timeout + structured warning log surfaces the offending script so the operator fixes it, and the orphan thread is rare. Caller-supplied CancellationToken is honored and takes precedence over the timeout, so driver-shutdown paths see a clean OperationCanceledException rather than a misclassified ScriptTimeoutException.

ScriptTimeoutException carries the configured Timeout and a diagnostic message pointing the operator at ctx.Logger output around the failure plus suggesting widening the timeout, simplifying the script, or moving heavy work out of the evaluation path. The virtual-tag engine (Stream B) will catch this and map the owning tag's quality to BadInternalError per Phase 7 decision #11, logging a structured warning with the offending script name.

Tests: CompiledScriptCacheTests (10) — first-call compile, identical-source dedupe to same instance, different-source produces different evaluator, whitespace-sensitivity documented, cached evaluator still runs correctly, failed compile evicted for retry, Clear drops entries, concurrent GetOrCompile of the same source deduplicates to one instance, different TContext/TResult use separate cache instances, null source rejected. TimedScriptEvaluatorTests (9) — fast script completes under timeout, CPU-bound script throws ScriptTimeoutException, caller cancellation takes precedence over timeout (shutdown path correctness), default 250ms per plan, zero/negative timeout rejected at construction, null inner rejected, null context rejected, user-thrown exceptions propagate unwrapped (not conflated with timeout), timeout exception message contains diagnostic guidance. Full suite: 48/48 green (29 from A.1 + 19 new).

Next: Stream A.3 wires the dedicated scripts-*.log Serilog rolling sink + structured-property filtering + companion-WARN enricher to the main log, closing out Stream A.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 16:38:43 -04:00
Joseph Doherty
e4dae01bac Phase 7 Stream A.1 — Core.Scripting project scaffold + ScriptContext + sandbox + AST dependency extractor. First of 3 increments within Stream A. Ships the Roslyn-based script engine's foundation: user C# snippets compile against a constrained ScriptOptions allow-list + get a post-compile sandbox guard, the static tag-dependency set is extracted from the AST at publish time, and the script sees a stable ctx.GetTag/SetVirtualTag/Now/Logger/Deadband API that later streams plug into concrete backends.
ScriptContext abstract base defines the API user scripts see as ctx — GetTag(string) returns DataValueSnapshot so scripts branch on quality naturally, SetVirtualTag(string, object?) is the only write path virtual tags have (OPC UA client writes to virtual nodes rejected separately in DriverNodeManager per ADR-002), Now + Logger + Deadband static helper round out the surface. Concrete subclasses in Streams B + C plug in actual tag backends + per-script Serilog loggers.

ScriptSandbox.Build(contextType) produces the ScriptOptions for every compile — explicit allow-list of six assemblies (System.Private.CoreLib / System.Linq / Core.Abstractions / Core.Scripting / Serilog / the context type's own assembly), with a matching import list so scripts don't need using clauses. Allow-list is plan-level — expanding it is not a casual change.

DependencyExtractor uses CSharpSyntaxWalker to find every ctx.GetTag("literal") and ctx.SetVirtualTag("literal", ...) call, rejects every non-literal path (variable, concatenation, interpolation, method-returned). Rejections carry the exact TextSpan so the Admin UI can point at the offending token. Reads + writes are returned as two separate sets so the virtual-tag engine (Stream B) knows both the subscription targets and the write targets.

Sandbox enforcement turned out needing a second-pass semantic analyzer because .NET 10's type forwarding makes assembly-level restriction leaky — System.Net.Http.HttpClient resolves even with WithReferences limited to six assemblies. ForbiddenTypeAnalyzer runs after Roslyn's Compile() against the SemanticModel, walks every ObjectCreationExpression / InvocationExpression / MemberAccessExpression / IdentifierName, resolves to the containing type's namespace, and rejects any prefix-match against the deny-list (System.IO, System.Net, System.Diagnostics, System.Reflection, System.Threading.Thread, System.Runtime.InteropServices, Microsoft.Win32). Rejections throw ScriptSandboxViolationException with the aggregated list + source spans so the Admin UI surfaces every violation in one round-trip instead of whack-a-mole. System.Environment explicitly stays allowed (read-only process state, doesn't persist or leak outside) and that compromise is pinned by a dedicated test.

ScriptGlobals<TContext> wraps the context as a named field so scripts see ctx instead of the bare globalsType-member-access convention Roslyn defaults to — keeps script ergonomics (ctx.GetTag) consistent with the AST walker's parse shape and the Admin UI's hand-written type stub (coming in Stream F). Generic on TContext so Stream C's alarm-predicate context with an Alarm property inherits cleanly.

ScriptEvaluator<TContext, TResult>.Compile is the three-step gate: (1) Roslyn compile — throws CompilationErrorException on syntax/type errors with Location-carrying diagnostics; (2) ForbiddenTypeAnalyzer semantic pass — catches type-forwarding sandbox escapes; (3) delegate creation. Runtime exceptions from user code propagate unwrapped — the virtual-tag engine in Stream B catches + maps per-tag to BadInternalError quality per Phase 7 decision #11.

29 unit tests covering every surface: DependencyExtractorTests has 14 theories — single/multiple/deduplicated reads, separate write tracking, rejection of variable/concatenated/interpolated/method-returned/empty/whitespace paths, ignoring non-ctx methods named GetTag, empty-source no-op, source span carried in rejections, multiple bad paths reported in one pass, nested literal extraction. ScriptSandboxTests has 15 — happy-path compile + run, SetVirtualTag round-trip, rejection of File.IO + HttpClient + Process.Start + Reflection.Assembly.Load via ScriptSandboxViolationException, Environment.GetEnvironmentVariable explicitly allowed (pinned compromise), script-exception propagation, ctx.Now reachable, Deadband static reachable, LINQ Where/Sum reachable, DataValueSnapshot usable in scripts including quality branches, compile error carries source location.

Next two PRs within Stream A: A.2 adds the compile cache (source-hash keyed) + per-evaluation timeout wrapper; A.3 wires the dedicated scripts-*.log Serilog rolling sink with structured-property filtering + the companion-warning enricher to the main log.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 16:27:07 -04:00
Joseph Doherty
96940aeb24 Modbus exception-injection profile — closes the end-to-end test gap for exception codes 0x01/0x03/0x04/0x05/0x06/0x0A/0x0B. pymodbus simulator naturally emits only 0x02 (Illegal Data Address on reads outside configured ranges) + 0x03 (Illegal Data Value on over-length); the driver's MapModbusExceptionToStatus table translates eight codes, but only 0x02 had integration-level coverage (via DL205's unmapped-register test). Unit tests lock the translation function in isolation but an integration test was missing for everything else. This PR lands wire-level coverage for the remaining seven codes without depending on device-specific quirks to naturally produce them.
New exception_injector.py — standalone pure-Python-stdlib Modbus/TCP server shipped alongside the pymodbus image. Speaks the wire protocol directly (MBAP header parse + FC 01/02/03/04/05/06/15/16 dispatch + store-backed happy-path reads/writes + spec-enforced length caps) and looks up each (fc, starting-address) against a rules list loaded from JSON; a matching rule makes the server respond [fc|0x80, exception_code] instead of the normal response. Zero runtime dependencies outside the stdlib — the Dockerfile just COPY's the script into /fixtures/ alongside the pymodbus profile JSONs, no new pip install needed. ~200 lines. New exception_injection.json profile carries rules for every exception code on FC03 (addresses 1000-1007, one per code), FC06 (2000-2001 for CPU-PROGRAM-mode and busy), and FC16 (3000 for server failure). New exception_injection compose profile binds :5020 like every other service + runs python /fixtures/exception_injector.py --config /fixtures/exception_injection.json.

New ExceptionInjectionTests.cs in Modbus.IntegrationTests — 11 tests. Eight FC03-read theories exercise every exception code 0x01/0x02/0x03/0x04/0x05/0x06/0x0A/0x0B asserting the driver's expected OPC UA StatusCode mapping (BadNotSupported/BadOutOfRange/BadOutOfRange/BadDeviceFailure/BadDeviceFailure/BadDeviceFailure/BadCommunicationError/BadCommunicationError). Two FC06-write theories cover the write path for 0x04 (Server Failure, CPU in PROGRAM mode) + 0x06 (Server Busy). One sanity-check read at address 5 confirms the injector isn't globally broken + non-injected reads round-trip cleanly with Value=5/StatusCode=Good. All tests follow the MODBUS_SIM_PROFILE=exception_injection skip guard so they no-op on a fresh clone without Docker running.

Docker/README.md gains an §Exception injection section explaining what pymodbus can and cannot emit, what the injector does, where the rules live, and how to append new ones. docs/drivers/Modbus-Test-Fixture.md follow-up item #2 (extend pymodbus profiles to inject exceptions) gets a shipped strikethrough with the new coverage inventory; the unit-level section adds ExceptionInjectionTests next to DL205ExceptionCodeTests so the split-of-responsibilities is explicit (DL205 test = natural out-of-range via dl205 profile, ExceptionInjectionTests = every other code via the injector).

Test baselines: Modbus unit 182/182 green (unchanged); Modbus integration with exception_injection profile live 11/11 new tests green. Existing DL205/S7/Mitsubishi integration tests unaffected since they skip on MODBUS_SIM_PROFILE mismatch.

Found + fixed during validation: a stale native pymodbus simulator from April 18 was still listening on port 5020 on IPv6 localhost (Windows was load-balancing between it + the Docker IPv4 forward, making injected exceptions intermittently come back as pymodbus's default 0x02). Killed the leftover. Documented the debugging path in the commit as a note for anyone who hits the same "my tests see exception 0x02 but the injector log has no request" symptom.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 15:11:32 -04:00