Closes the observer half of #162 that was flagged as "persisted as 0 today"
in PR #105. The Admin /hosts column refresh + FleetStatusHub SignalR push
+ red-badge visual still belong to the visual-compliance pass.
Core.Resilience:
- DriverResilienceStatusTracker gains RecordCallStart + RecordCallComplete
+ CurrentInFlight field on the snapshot record. Concurrent-safe via the
same ConcurrentDictionary.AddOrUpdate pattern as the other recorder methods.
Clamps to zero on over-decrement so a stray Complete-without-Start can't
drive the counter negative.
- CapabilityInvoker gains an optional statusTracker ctor parameter. When
wired, every ExecuteAsync / ExecuteAsync(void) wraps the pipeline call
in try / finally that records start/complete — so the counter advances
cleanly whether the call succeeds, cancels, or throws. Null tracker keeps
the pre-Phase-6.1 Stream E.3 behaviour exactly.
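Illustrative shape of the counter + try/finally pairing above (a sketch; the real
tracker keys on (DriverInstanceId, HostName) and everything beyond the
RecordCallStart / RecordCallComplete names is an assumption):
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public sealed class InFlightCounterSketch
{
    private readonly ConcurrentDictionary<string, int> _inFlight = new();

    public void RecordCallStart(string host) =>
        _inFlight.AddOrUpdate(host, 1, (_, current) => current + 1);

    public void RecordCallComplete(string host) =>
        // Clamp to zero so a stray Complete-without-Start can't go negative.
        _inFlight.AddOrUpdate(host, 0, (_, current) => Math.Max(0, current - 1));

    // Invoker-side shape: try/finally means success, cancellation, and
    // exceptions all decrement the counter.
    public async Task<T> ExecuteAsync<T>(string host, Func<Task<T>> call)
    {
        RecordCallStart(host);
        try { return await call(); }
        finally { RecordCallComplete(host); }
    }
}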
Server.Hosting:
- ResilienceStatusPublisherHostedService persists CurrentInFlight as the
DriverInstanceResilienceStatus.CurrentBulkheadDepth column (was 0 before
this PR). One-line fix on both the insert + update branches.
The in-flight counter is a pragmatic proxy for Polly's internal bulkhead
depth — a future PR wiring Polly telemetry would replace it with the real
value. The column shape, the publisher, and the Admin /hosts query
don't change, so the follow-up is invisible to consumers.
Tests (8 new InFlightCounterTests, all pass):
- Start+Complete nets to zero.
- Nested starts sum; Complete decrements.
- Complete-without-Start clamps to zero.
- Different hosts track independently.
- Concurrent starts (500 parallel) don't lose count.
- CapabilityInvoker observed-mid-call depth == 1 during a pending call.
- CapabilityInvoker exception path still decrements (try/finally).
- CapabilityInvoker without tracker doesn't throw.
Full solution dotnet test: 1243 passing (was 1235, +8). Pre-existing
Client.CLI Subscribe flake unchanged.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes the server-side/data-layer piece of Phase 6.4 Stream B.2-B.4. The
CSV-import preview + modal UI (Stream B.3/B.5) still belongs to the Admin
UI follow-up — this PR owns the staging tables + atomic finalise alone.
Configuration:
- New EquipmentImportBatch entity (Id, ClusterId, CreatedBy, CreatedAtUtc,
RowsStaged/Accepted/Rejected, FinalisedAtUtc?). Composite index on
(CreatedBy, FinalisedAtUtc) powers the Admin preview modal's "my open
batches" query.
- New EquipmentImportRow entity — one row per CSV row, 8 required columns
from decision #117 + 9 optional from decision #139 + IsAccepted flag +
RejectReason. FK to EquipmentImportBatch with cascade delete so
DropBatch collapses the whole tree.
- EF migration 20260419_..._AddEquipmentImportBatch.
- SchemaComplianceTests expected tables list gains the two new tables.
Admin.Services.EquipmentImportBatchService:
- CreateBatchAsync — new header row, caller-supplied ClusterId + CreatedBy.
- StageRowsAsync(batchId, acceptedRows, rejectedRows) — bulk-inserts the
parsed CSV rows into staging. Rejected rows carry LineNumberInFile +
RejectReason for the preview modal. Throws when the batch is finalised.
- DropBatchAsync — removes batch + cascaded rows. Throws when the batch
was already finalised (rollback via staging is not a time machine).
- FinaliseBatchAsync(batchId, generationId, driverInstanceId, unsLineId) —
atomic apply. Opens an EF transaction when the provider supports it
(SQL Server in prod; InMemory in tests skips the tx), bulk-inserts
every accepted staging row into Equipment, stamps
EquipmentImportBatch.FinalisedAtUtc, commits. Failure rolls back so
Equipment never partially mutates. Double-finalise is guarded: a
second finalise throws ImportBatchAlreadyFinalisedException.
- ListByUserAsync(createdBy, includeFinalised) — the Admin preview modal's
backing query. OrderByDescending on CreatedAtUtc so the most-recent
batch shows first.
- Two exception types: ImportBatchNotFoundException +
ImportBatchAlreadyFinalisedException.
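Rough sketch of the finalise transaction flow (assumes the service's injected
config DbContext as _db with the usual Microsoft.EntityFrameworkCore usings,
Guid keys, a BatchId FK on the row entity, and a hypothetical MapToEquipment
projection helper; the real signatures may differ):
public async Task FinaliseBatchAsync(Guid batchId, Guid generationId,
    Guid driverInstanceId, Guid unsLineId, CancellationToken ct = default)
{
    var batch = await _db.EquipmentImportBatches.FindAsync(new object[] { batchId }, ct)
        ?? throw new ImportBatchNotFoundException(batchId);
    if (batch.FinalisedAtUtc is not null)
        throw new ImportBatchAlreadyFinalisedException(batchId);

    // SQL Server gets a real transaction; the InMemory test provider skips it.
    await using var tx = _db.Database.IsRelational()
        ? await _db.Database.BeginTransactionAsync(ct)
        : null;

    var accepted = await _db.EquipmentImportRows
        .Where(r => r.BatchId == batchId && r.IsAccepted)
        .ToListAsync(ct);
    foreach (var row in accepted)
        _db.Equipment.Add(MapToEquipment(row, generationId, driverInstanceId, unsLineId));

    batch.FinalisedAtUtc = DateTime.UtcNow;
    await _db.SaveChangesAsync(ct);            // a failure here leaves the tx uncommitted
    if (tx is not null)
        await tx.CommitAsync(ct);
}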
ExternalIdReservation merging (ZTag + SAPID fleet-wide uniqueness) is NOT
done here — a narrower follow-up wires it once the concurrent-insert test
matrix is green.
Tests (10 new EquipmentImportBatchServiceTests, all pass):
- CreateBatch populates Id + CreatedAtUtc + zero-ed counters.
- StageRows accepted + rejected both persist; counters advance.
- DropBatch cascades row delete.
- DropBatch after finalise throws.
- Finalise translates accepted staging rows → Equipment under the target
GenerationId + DriverInstanceId + UnsLineId.
- Finalise twice throws.
- Finalise of unknown batch throws.
- Stage after finalise throws.
- ListByUserAsync filters by creator + finalised flag.
- Drop of unknown batch is a no-op (idempotent rollback).
Full solution dotnet test: 1235 passing (was 1225, +10). Pre-existing
Client.CLI Subscribe flake unchanged.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes the HostedService half of Phase 6.1 Stream E.2 flagged as a follow-up
when the DriverResilienceStatusTracker shipped in PR #82. The Admin /hosts
column refresh + SignalR push + red-badge visual (Stream E.3) remain
deferred to the visual-compliance pass — this PR owns the persistence
story alone.
Server.Hosting:
- ResilienceStatusPublisherHostedService : BackgroundService. Samples the
DriverResilienceStatusTracker every TickInterval (default 5 s) and upserts
each (DriverInstanceId, HostName) counter pair into
DriverInstanceResilienceStatus via EF. New rows on first sight; in-place
updates on subsequent ticks.
- PersistOnceAsync is extracted as a public method so tests can drive one tick
directly, matching the ScheduledRecycleHostedService pattern for deterministic
timing.
- Best-effort persistence: a DB outage logs a warning + continues; the next
tick retries. Never crashes the app on sample failure. Cancellation
propagates through cleanly.
- Tracks the bulkhead depth / recycle / footprint columns the entity was
designed for. CurrentBulkheadDepth currently persisted as 0 — the tracker
doesn't yet expose live bulkhead depth; a narrower follow-up wires the
Polly bulkhead-depth observer into the tracker.
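Shape of the best-effort loop (a sketch; _logger, TickInterval and
PersistOnceAsync stand for the service's own members):
protected override async Task ExecuteAsync(CancellationToken stoppingToken)
{
    while (!stoppingToken.IsCancellationRequested)
    {
        try
        {
            await PersistOnceAsync(stoppingToken);     // one sample + upsert pass
        }
        catch (Exception ex) when (!stoppingToken.IsCancellationRequested)
        {
            // Best-effort: a DB outage logs a warning; the next tick retries.
            _logger.LogWarning(ex, "Resilience status persist failed; retrying next tick");
        }

        try { await Task.Delay(TickInterval, stoppingToken); }
        catch (OperationCanceledException) { /* shutting down */ }
    }
}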
Tests (6 new in ResilienceStatusPublisherHostedServiceTests):
- Empty tracker → tick is a no-op, zero rows written.
- Single-host counters → upsert a new row with ConsecutiveFailures + breaker
timestamp + sampled timestamp.
- Second tick updates the existing row in place (not a second insert).
- Multi-host pairs persist independently.
- Footprint counters (Baseline + Current) round-trip.
- TickCount advances on every PersistOnceAsync call.
Full solution dotnet test: 1225 passing (was 1219, +6). Pre-existing
Client.CLI Subscribe flake unchanged.
Production wiring (Program.cs) example:
builder.Services.AddSingleton<DriverResilienceStatusTracker>();
builder.Services.AddHostedService<ResilienceStatusPublisherHostedService>();
// Tracker gets wired into CapabilityInvoker via OtOpcUaServer resolution
// + the existing Phase 6.1 layer.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes the per-device isolation gap flagged at the Phase 6.1 Stream A wire-up
(PR #78 used driver.DriverInstanceId as the pipeline host for every call, so
multi-host drivers like Modbus with N PLCs shared one pipeline — one dead PLC
poisoned sibling breakers). Decision #144 requires per-device isolation; this
PR wires it without breaking single-host drivers.
Core.Abstractions:
- IPerCallHostResolver interface. Optional driver capability. Drivers with
multi-host topology (Modbus across N PLCs, AB CIP across a rack, etc.)
implement this; single-host drivers (Galaxy, S7 against one PLC, OpcUaClient
against one remote server) leave it alone. Must be fast + allocation-free
— called once per tag on the hot path. Unknown refs return empty so dispatch
falls back to single-host without throwing.
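One plausible shape for the capability interface (the member name and
nullability are assumptions):
public interface IPerCallHostResolver
{
    // Returns the per-device host key for one tag reference, or null/empty to
    // fall back to the driver-wide DriverInstanceId pipeline. Called on the
    // read/write hot path, so it must be fast and allocation-free.
    string? ResolveHost(string fullReference);
}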
Server/OpcUa/DriverNodeManager:
- Captures `driver as IPerCallHostResolver` at construction alongside the
existing capability casts.
- New `ResolveHostFor(fullReference)` helper returns either the resolver's
answer or the driver's DriverInstanceId (single-host fallback). Empty /
whitespace resolver output also falls back to DriverInstanceId.
- Every dispatch site now passes `ResolveHostFor(fullRef)` to the invoker
instead of `_driver.DriverInstanceId` — OnReadValue, OnWriteValue, all four
HistoryRead paths. The HistoryRead Events path tolerates fullRef=null and
falls back to DriverInstanceId for those cluster-wide event queries.
- Drivers without IPerCallHostResolver observe zero behavioural change:
every call still keys on DriverInstanceId, same as before.
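The fallback helper could look roughly like this (a sketch; _perCallHostResolver
and _driver are the node manager's captured fields, and ResolveHost follows the
interface sketch above):
private string ResolveHostFor(string? fullReference)
{
    if (_perCallHostResolver is null || string.IsNullOrEmpty(fullReference))
        return _driver.DriverInstanceId;

    var host = _perCallHostResolver.ResolveHost(fullReference);
    return string.IsNullOrWhiteSpace(host) ? _driver.DriverInstanceId : host;
}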
Tests (4 new PerCallHostResolverDispatchTests, all pass):
- DeadPlc_DoesNotOpenBreaker_For_HealthyPlc_With_Resolver — 2 PLCs behind
one driver; hammer the dead PLC past its breaker threshold; assert the
healthy PLC's first call succeeds on its first attempt (decision #144).
- EmptyString / unknown-ref fallback behaviour documented via test.
- WithoutResolver_SameHost_Shares_One_Pipeline — regression guard for the
single-host pre-existing behaviour.
- WithResolver_TwoHosts_Get_Two_Pipelines — uses a CachedPipelineCount
assertion to confirm the shared-builder cache is keyed per host.
Full solution dotnet test: 1219 passing (was 1215, +4). Pre-existing
Client.CLI Subscribe flake unchanged.
Adoption: Modbus driver (#120 follow-up), AB CIP / AB Legacy / TwinCAT
drivers (also #120) implement the interface and return the per-tag PLC host
string. Single-host drivers stay silent and pay zero cost.
Remaining sub-items of #160 still deferred:
- IAlarmSource.SubscribeAlarmsAsync + AcknowledgeAsync invoker wrapping.
Non-trivial because alarm subscription is push-based from driver through
IAlarmConditionSink — the wrap has to happen at the driver-to-server glue
rather than a synchronous dispatch site.
- Roslyn analyzer asserting every capability-interface call routes through
CapabilityInvoker. Substantial (separate analyzer project + test harness);
noise-value ratio favors shipping this post-v2-GA once the coverage is
known-stable.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes the Phase 6.1 Stream A.2 "per-instance overrides bound from
DriverInstance.ResilienceConfig JSON column" work flagged as a follow-up
when Stream A.1 shipped in PR #78. Every driver can now override its Polly
pipeline policy per instance instead of inheriting pure tier defaults.
Configuration:
- DriverInstance entity gains a nullable `ResilienceConfig` string column
(nvarchar(max)) + SQL check constraint `CK_DriverInstance_ResilienceConfig_IsJson`
that enforces ISJSON when not null. Null = use tier defaults (decision
#143 / unchanged from pre-Phase-6.1).
- EF migration `20260419161008_AddDriverInstanceResilienceConfig`.
- SchemaComplianceTests expected-constraint list gains the new CK name.
Core.Resilience.DriverResilienceOptionsParser:
- Pure-function parser. ParseOrDefaults(tier, json, out diag) returns the
effective DriverResilienceOptions — tier defaults with per-capability /
bulkhead overrides layered on top when the JSON payload supplies them.
Partial policies (e.g. Read { retryCount: 10 }) fill missing fields from
the tier default for that capability.
- Malformed JSON falls back to pure tier defaults + surfaces a human-readable
diagnostic via the out parameter. Callers log the diag but don't fail
startup — a misconfigured ResilienceConfig must not brick a working
driver.
- Property names + capability keys are case-insensitive; unrecognised
capability names are logged-and-skipped; unrecognised shape-level keys
are ignored so future shapes land without a migration.
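Intended call pattern (ParseOrDefaults per the text above; the surrounding
diagnostic handling is illustrative):
var effective = DriverResilienceOptionsParser.ParseOrDefaults(
    DriverTier.B,                         // tier resolved for this driver type
    driverInstance.ResilienceConfig,      // null => pure tier defaults
    out var diagnostic);

if (diagnostic is not null)
    logger.LogWarning("ResilienceConfig for {Id} partially ignored: {Diagnostic}",
        driverInstance.DriverInstanceId, diagnostic);
// Startup continues either way: a malformed payload never bricks the driver.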
Server wire-in:
- OtOpcUaServer gains two optional ctor params: `tierLookup` (driverType →
DriverTier) + `resilienceConfigLookup` (driverInstanceId → JSON string).
CreateMasterNodeManager now resolves tier + JSON for each driver, parses
via DriverResilienceOptionsParser, logs the diagnostic if any, and
constructs CapabilityInvoker with the merged options instead of pure
Tier A defaults.
- OpcUaApplicationHost threads both lookups through. Default null keeps
existing tests constructing without either Func unchanged (falls back
to Tier A + tier defaults exactly as before).
Tests (13 new DriverResilienceOptionsParserTests):
- null / whitespace / empty-object JSON returns pure tier defaults.
- Malformed JSON falls back + surfaces diagnostic.
- Read override merged into tier defaults; other capabilities untouched.
- Partial policy fills missing fields from tier default.
- Bulkhead overrides honored.
- Unknown capability skipped + surfaced in diagnostic.
- Property names + capability keys are case-insensitive.
- Every tier × every capability × empty-JSON round-trips tier defaults
exactly (theory).
Full solution dotnet test: 1215 passing (was 1202, +13). Pre-existing
Client.CLI Subscribe flake unchanged.
Production wiring (Program.cs) example:
Func<string, DriverTier> tierLookup = type => type switch
{
    "Galaxy" => DriverTier.C,
    "Modbus" or "S7" => DriverTier.B,
    "OpcUaClient" => DriverTier.A,
    _ => DriverTier.A,
};
Func<string, string?> cfgLookup = id =>
    db.DriverInstances.AsNoTracking().FirstOrDefault(x => x.DriverInstanceId == id)?.ResilienceConfig;
var host = new OpcUaApplicationHost(..., tierLookup: tierLookup, resilienceConfigLookup: cfgLookup);
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes the server-side / non-UI piece of Phase 6.4 Stream D. The Razor
`IdentificationFields.razor` component for Admin-UI editing ships separately
when the Admin UI pass lands (still tracked under #157 UI follow-up).
Core.OpcUa additions:
- IdentificationFolderBuilder — pure-function builder that materializes the
OPC 40010 Machinery companion-spec Identification sub-folder per decision
#139. Reads the nine nullable columns off an Equipment row:
Manufacturer, Model, SerialNumber, HardwareRevision, SoftwareRevision,
YearOfConstruction (short → OPC UA Int32), AssetLocation, ManufacturerUri,
DeviceManualUri. Emits one AddProperty call per non-null field; skips the
sub-folder entirely when all nine are null so browse trees don't carry
pointless empty folders.
- HasAnyFields(equipment) — cheap short-circuit so callers can decide
whether to invoke Folder() at all.
- FolderName constant ("Identification") + FieldNames list exposed so
downstream tools / tests can cross-reference without duplicating the
decision-#139 field set.
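Intended consumption shape (a sketch; this is the address-space build flow the
Production integration note below describes):
if (IdentificationFolderBuilder.HasAnyFields(equipment))
{
    // One AddProperty per non-null field under the Equipment node;
    // YearOfConstruction (short) widens to an Int32 property.
    IdentificationFolderBuilder.Build(equipmentFolder, equipment);
}
// All-nine-null equipment skips the sub-folder entirely.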
ACL binding: the sub-folder + variables live under the Equipment node so
Phase 6.2's PermissionTrie treats them as part of the Equipment ScopeId —
no new scope level. A user with Equipment-level grant reads the
Identification fields; a user without gets BadUserAccessDenied on both the
Equipment node + its Identification variables. Documented in the class
remarks; cross-reference update to acl-design.md is a follow-up.
Tests (9 new IdentificationFolderBuilderTests):
- HasAnyFields all-null false / any-non-null true.
- Build all-null returns null + doesn't emit Folder.
- Build fully-populated emits all 9 fields in decision #139 order.
- Only non-null fields are emitted (3-of-9 case).
- YearOfConstruction short widens to DriverDataType.Int32 with int value.
- String values round-trip through AddProperty.
- FieldNames constant matches decision #139 exactly.
- FolderName is "Identification".
Full solution dotnet test: 1202 passing (was 1193, +9). Pre-existing
Client.CLI Subscribe flake unchanged.
Production integration: the component that consumes this is the
address-space-build flow that walks the live Equipment table + calls
IdentificationFolderBuilder.Build(equipmentFolder, equipment) under each
Equipment node. That integration is the remaining Stream D follow-up
alongside the Razor UI component.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Turns the Phase 6.1 Stream B.4 pure-logic ScheduledRecycleScheduler (shipped
in PR #79) into a running background feature. A Tier C driver registers its
scheduler at startup; the hosted service ticks every TickInterval (default
1 min) and invokes TickAsync on each registered scheduler.
Server.Hosting:
- ScheduledRecycleHostedService : BackgroundService. AddScheduler(s) must be
called before StartAsync — registering post-start throws
InvalidOperationException to avoid "some ticks saw my scheduler, some
didn't" races. ExecuteAsync loops on Task.Delay(TickInterval, _timeProvider,
stoppingToken) + delegates to a public TickOnceAsync method for one tick.
- TickOnceAsync extracted as the unit-of-work so tests drive it directly
without needing to synchronize with FakeTimeProvider + BackgroundService
timing semantics.
- Exception isolation: if one scheduler throws, the loop logs + continues
to the next scheduler. A flaky supervisor can't take down the tick for
every other Tier C driver.
- Diagnostics: TickCount + SchedulerCount properties for tests + logs.
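One tick with the per-scheduler isolation described above (a sketch; _schedulers,
_timeProvider, _logger and TickCount stand for the service's own members):
public async Task TickOnceAsync(CancellationToken ct = default)
{
    var now = _timeProvider.GetUtcNow();
    foreach (var scheduler in _schedulers)
    {
        ct.ThrowIfCancellationRequested();
        try
        {
            await scheduler.TickAsync(now);
        }
        catch (Exception ex)
        {
            // A flaky supervisor must not take down the tick for its neighbours.
            _logger.LogError(ex, "Scheduled recycle tick failed for one scheduler");
        }
    }
    TickCount++;
}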
Tests (7 new ScheduledRecycleHostedServiceTests, all pass):
- TickOnce before interval doesn't fire; TickCount still advances.
- TickOnce at/after interval fires the underlying scheduler exactly once.
- Multiple ticks accumulate count.
- AddScheduler after StartAsync throws.
- Throwing scheduler doesn't poison its neighbours (logs + continues).
- SchedulerCount matches registrations.
- Empty scheduler list ticks cleanly (no-op + counter advances).
Full solution dotnet test: 1193 passing (was 1186, +7). Pre-existing
Client.CLI Subscribe flake unchanged.
Production wiring (Program.cs):
builder.Services.AddSingleton<ScheduledRecycleHostedService>();
builder.Services.AddHostedService(sp => sp.GetRequiredService<ScheduledRecycleHostedService>());
// During DI configuration, once Tier C drivers + their ScheduledRecycleSchedulers
// are resolved, call host.AddScheduler(scheduler) for each.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Wires the Phase 6.3 Stream B pure-logic pieces (ServiceLevelCalculator,
RecoveryStateManager, ApplyLeaseRegistry) + Stream A topology loader
(RedundancyCoordinator) into one orchestrator the runtime + OPC UA node
surface consume. The actual OPC UA variable-node plumbing (mapping
ServiceLevel Byte + ServerUriArray String[] onto the Opc.Ua.Server stack)
is a narrower follow-up on top of this — the publisher emits change events
the OPC UA layer subscribes to.
Server.Redundancy additions:
- PeerReachability record + PeerReachabilityTracker — thread-safe
per-peer-NodeId holder of the latest (HttpHealthy, UaHealthy) tuple. Probe
loops (Stream B.1/B.2 runtime follow-up) write via Update; the publisher
reads via Get. PeerReachability.FullyHealthy / Unknown sentinels for the
two most-common states.
- RedundancyStatePublisher — pure orchestrator, no background timer, no OPC
UA stack dep. ComputeAndPublish reads the inputs below + calls the calculator:
* role (from coordinator.Current.SelfRole)
* selfHealthy (caller-supplied Func<bool>)
* peerHttpHealthy + peerUaHealthy (aggregate across all peers in
coordinator.Current.Peers)
* applyInProgress (ApplyLeaseRegistry.IsApplyInProgress)
* recoveryDwellMet (RecoveryStateManager.IsDwellMet)
* topologyValid (coordinator.IsTopologyValid)
* operatorMaintenance (caller-supplied Func<bool>)
Before-coordinator-init returns NoData=1 so clients never see an
authoritative value from an un-bootstrapped server.
OnStateChanged event fires edge-triggered when the byte changes;
OnServerUriArrayChanged fires edge-triggered when the topology's self-first
peer-sorted URI array content changes.
- ServiceLevelSnapshot record — per-tick output with Value + Band +
Topology. The OPC UA layer's ServiceLevel Byte node subscribes to
OnStateChanged; the ServerUriArray node subscribes to OnServerUriArrayChanged.
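Edge-triggered publish shape (a sketch; the _last* fields, event types and
ComputeSnapshot are assumptions around the names quoted above):
public void ComputeAndPublish()
{
    var snapshot = ComputeSnapshot();             // reads the inputs listed above
    if (snapshot.Value != _lastValue)
    {
        _lastValue = snapshot.Value;
        OnStateChanged?.Invoke(snapshot);         // fires only on byte transitions
    }

    var uris = snapshot.Topology.ServerUriArray();
    if (_lastUris is null || !uris.SequenceEqual(_lastUris))
    {
        _lastUris = uris;
        OnServerUriArrayChanged?.Invoke(uris);    // fires only on content changes
    }
}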
Tests (8 new RedundancyStatePublisherTests, all pass):
- Before-init returns NoData (Value=1, Band=NoData).
- Authoritative-Primary when healthy + peer fully reachable.
- Isolated-Primary (230) retains authority when peer unreachable — matches
decision #154 non-promotion semantics.
- Mid-apply band dominates: open lease → Value=200 even with peer healthy.
- Self-unhealthy → NoData regardless of other inputs.
- OnStateChanged fires only on value transitions (edge-triggered).
- OnServerUriArrayChanged fires once per topology content change; repeat
ticks with same topology don't re-emit.
- Standalone cluster treats healthy as AuthoritativePrimary=255.
Microsoft.EntityFrameworkCore.InMemory 10.0.0 added to Server.Tests for the
coordinator-backed publisher tests.
Full solution dotnet test: 1186 passing (was 1178, +8). Pre-existing
Client.CLI Subscribe flake unchanged.
Closes the core of release blocker #3 — the pure-logic + orchestration
layer now exists + is unit-tested. Remaining Stream C surfaces: OPC UA
ServiceLevel Byte variable wiring (binds to OnStateChanged), ServerUriArray
String[] wiring (binds to OnServerUriArrayChanged), RedundancySupport
static from RedundancyMode. Those touch the OPC UA stack directly + land
as Stream C.2 follow-up.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Lands the data path that feeds the Phase 6.3 ServiceLevelCalculator shipped in
PR #89. OPC UA node wiring (ServiceLevel variable + ServerUriArray +
RedundancySupport) still deferred to task #147; peer-probe loops (Stream B.1/B.2
runtime layer beyond the calculator logic) deferred.
Server.Redundancy additions:
- RedundancyTopology record — immutable snapshot (ClusterId, SelfNodeId,
SelfRole, Mode, Peers[], SelfApplicationUri). ServerUriArray() emits the
OPC UA Part 4 §6.6.2.2 shape (self first, peers lexicographically by
NodeId). RedundancyPeer record with per-peer Host/OpcUaPort/DashboardPort/
ApplicationUri so the follow-up peer-probe loops know where to probe.
- ClusterTopologyLoader — pure fn from ServerCluster + ClusterNode[] to
RedundancyTopology. Enforces Phase 6.3 Stream A.1 invariants:
* At least one node per cluster.
* At most 2 nodes (decision #83, v2.0 cap).
* Every node belongs to the target cluster.
* Unique ApplicationUri across the cluster (OPC UA Part 4 trust pin,
decision #86).
* At most 1 Primary per cluster in Warm/Hot modes (decision #84).
* Self NodeId must be a member of the cluster.
Violations throw InvalidTopologyException with a decision-ID-tagged message
so operators know which invariant + what to fix.
- RedundancyCoordinator singleton — holds the current topology + IsTopologyValid
flag. InitializeAsync throws on invariant violation (startup fails fast).
RefreshAsync logs + flips IsTopologyValid=false (runtime won't tear down a
running server; ServiceLevelCalculator falls to InvalidTopology band = 2
which surfaces the problem to clients without crashing). Atomic snapshot swap
via Volatile.Write so readers always see a coherent snapshot.
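The self-first ordering reduces to roughly this (a sketch assuming string-typed
NodeId / ApplicationUri members on RedundancyPeer):
public string[] ServerUriArray() =>
    new[] { SelfApplicationUri }
        .Concat(Peers
            .OrderBy(p => p.NodeId, StringComparer.Ordinal)   // lexicographic peer order
            .Select(p => p.ApplicationUri))
        .ToArray();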
Tests (10 new ClusterTopologyLoaderTests):
- Single-node standalone loads + empty peer list.
- Two-node cluster loads self + peer.
- ServerUriArray puts self first + peers sort lexicographically.
- Empty-nodes throws.
- Self-not-in-cluster throws.
- Three-node cluster rejected with decision #83 message.
- Duplicate ApplicationUri rejected with decision #86 shape reference.
- Two Primaries in Warm mode rejected (decision #84 + runtime-band reference).
- Cross-cluster node rejected.
- None-mode allows any role mix (standalone clusters don't enforce Primary count).
Full solution dotnet test: 1178 passing (was 1168, +10). Pre-existing
Client.CLI Subscribe flake unchanged.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes release blocker #2 from docs/v2/v2-release-readiness.md — the
generation-sealed cache + resilient reader + stale-config flag shipped as
unit-tested primitives in PR #81, but no production path consumed them until
now. This PR wires them end-to-end.
Server additions:
- SealedBootstrap — Phase 6.1 Stream D consumption hook. Resolves the node's
current generation through ResilientConfigReader's timeout → retry →
fallback-to-sealed pipeline. On every successful central-DB fetch it seals
a fresh snapshot to <cache-root>/<cluster>/<generationId>.db so a future
cache-miss has a known-good fallback. It sits alongside the original
NodeBootstrap (which still uses the single-file ILocalConfigCache);
Program.cs can switch between them once operators are ready for the
generation-sealed semantics.
- OpcUaApplicationHost: new optional staleConfigFlag ctor parameter. When
wired, HealthEndpointsHost consumes `flag.IsStale` via the existing
usingStaleConfig Func<bool> hook. This means `/healthz` actually reports
`usingStaleConfig: true` whenever a read fell back to the sealed cache,
closing the loop between Stream D's flag + Stream C's /healthz body shape.
Tests (4 new SealedBootstrapIntegrationTests, all pass):
- Central-DB success path seals snapshot + flag stays fresh.
- Central-DB failure falls back to sealed snapshot + flag flips stale (the
SQL-kill scenario from Phase 6.1 Stream D.4.a).
- No-snapshot + central-down throws GenerationCacheUnavailableException
with a clear error (the first-boot scenario from D.4.c).
- Next successful bootstrap after a fallback clears the stale flag.
Full solution dotnet test: 1168 passing (was 1164, +4). Pre-existing
Client.CLI Subscribe flake unchanged.
Production activation: Program.cs wires SealedBootstrap (instead of
NodeBootstrap), constructs OpcUaApplicationHost with the staleConfigFlag,
and a HostedService polls sp_GetCurrentGenerationForCluster periodically so
peer-published generations land in this node's sealed cache. The poller
itself is Stream D.1.b follow-up.
The sp_PublishGeneration SQL-side hook (where the publish commit itself
could also write to a shared sealed cache) stays deferred — the per-node
seal pattern shipped here is the correct v2 GA model: each Server node
owns its own on-disk cache and refreshes from its own DB reads, matching
the Phase 6.1 scope-table description.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes the Phase 6.2 security gap the v2 release-readiness dashboard flagged:
the evaluator + trie + gate shipped as code in PRs #84-88 but no dispatch
path called them. This PR threads the gate end-to-end from
OpcUaApplicationHost → OtOpcUaServer → DriverNodeManager and calls it on
every Read / Write path and all four HistoryRead paths.
Server.Security additions:
- NodeScopeResolver — maps driver fullRef → Core.Authorization NodeScope.
Phase 1 shape: populates ClusterId + TagId; leaves NamespaceId / UnsArea /
UnsLine / Equipment null. The cluster-level ACL cascade covers this
configuration (decision #129 additive grants). Finer-grained scope
resolution (joining against the live Configuration DB for UnsArea / UnsLine
path) lands as Stream C.12 follow-up.
- WriteAuthzPolicy.ToOpcUaOperation — maps SecurityClassification → the
OpcUaOperation the gate evaluator consults (Operate/SecuredWrite →
WriteOperate; Tune → WriteTune; Configure/VerifiedWrite → WriteConfigure).
DriverNodeManager wiring:
- Ctor gains optional AuthorizationGate + NodeScopeResolver; both null means
the pre-Phase-6.2 dispatch runs unchanged (backwards-compat for every
integration test that constructs DriverNodeManager directly).
- OnReadValue: ahead of the invoker call, builds NodeScope + calls
gate.IsAllowed(identity, Read, scope). Denied reads return
BadUserAccessDenied without hitting the driver.
- OnWriteValue: preserves the existing WriteAuthzPolicy check (classification
vs session roles) + adds an additive gate check using
WriteAuthzPolicy.ToOpcUaOperation(classification) to pick the right
WriteOperate/Tune/Configure surface. Lax mode falls through for identities
without LDAP groups.
- Four HistoryRead paths (Raw / Processed / AtTime / Events): gate check
runs per-node before the invoker. Events path tolerates fullRef=null
(event-history queries can target a notifier / driver-root; those are
cluster-wide reads that need a different scope shape — deferred).
- New WriteAccessDenied helper surfaces BadUserAccessDenied in the
OpcHistoryReadResult slot + errors list, matching the shape of the
existing WriteUnsupported / WriteInternalError helpers.
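The read-path pre-check could look roughly like this (a sketch; the field names
and the identity accessor are assumptions, BadUserAccessDenied is the stack's
standard status code):
// Inside OnReadValue, ahead of the invoker dispatch:
if (_authorizationGate is not null && _nodeScopeResolver is not null)
{
    var scope = _nodeScopeResolver.Resolve(fullReference);
    if (!_authorizationGate.IsAllowed(context.UserIdentity, OpcUaOperation.Read, scope))
        return StatusCodes.BadUserAccessDenied;    // the driver is never called
}
// ...the existing CapabilityInvoker dispatch continues unchanged.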
OtOpcUaServer + OpcUaApplicationHost: gate + resolver thread through as
optional constructor parameters (same pattern as DriverResiliencePipelineBuilder
in Phase 6.1). Null defaults keep the existing 3 OpcUaApplicationHost
integration tests constructing without them unchanged.
Tests (5 new in NodeScopeResolverTests):
- Resolve populates ClusterId + TagId + Equipment Kind.
- Resolve leaves finer path null per Phase 1 shape (doc'd as follow-up).
- Empty fullReference throws.
- Empty clusterId throws at ctor.
- Resolver is stateless across calls.
The existing 9 AuthorizationGate tests (shipped in PR #86) continue to
cover the gate's allow/deny semantics under strict + lax mode.
Full solution dotnet test: 1164 passing (was 1159, +5). Pre-existing
Client.CLI Subscribe flake unchanged. Existing OpcUaApplicationHost +
HealthEndpointsHost + driver integration tests continue to pass because the
gate defaults to null → no enforcement, and the lax-mode fallback returns
true for identities without LDAP groups (the anonymous test path).
Production deployments flip the gate on by constructing it via
OpcUaApplicationHost's new authzGate parameter + setting
`Authorization:StrictMode = true` once ACL data is populated. Flipping the
switch post-seed turns the evaluator + trie from scaffolded code into
actual enforcement.
This closes release blocker #1 listed in docs/v2/v2-release-readiness.md.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Ships the non-UI piece of Stream D: a draft-aware write surface over NodeAcl
that enforces the Phase 6.2 plan's scope-uniqueness + grant-shape invariants.
Blazor UI pieces (RoleGrantsTab + AclsTab refresh + SignalR invalidation +
visual-compliance reviewer signoff) are deferred to the Phase 6.1-style
follow-up task.
Admin.Services:
- ValidatedNodeAclAuthoringService — alongside existing NodeAclService (raw
CRUD, kept for read + revoke paths). GrantAsync enforces:
* Permissions != None (decision #129 — additive only, no empty grants).
* Cluster scope has null ScopeId.
* Sub-cluster scope requires a populated ScopeId.
* No duplicate (GenerationId, ClusterId, LdapGroup, ScopeKind, ScopeId)
tuple — operator updates the row instead of inserting a duplicate.
UpdatePermissionsAsync also rejects None (operator revokes via NodeAclService).
Violations throw InvalidNodeAclGrantException.
Tests (10 new in Admin.Tests/ValidatedNodeAclAuthoringServiceTests):
- Grant rejects None permissions.
- Grant rejects Cluster-scope with ScopeId / sub-cluster without ScopeId.
- Grant succeeds on well-formed row.
- Grant rejects duplicate (group, scope) in same draft.
- Grant allows same group at different scope.
- Grant allows same (group, scope) in different draft.
- UpdatePermissions rejects None.
- UpdatePermissions round-trips new flags + notes.
- UpdatePermissions on unknown rowid throws.
Microsoft.EntityFrameworkCore.InMemory 10.0.0 added to Admin.Tests csproj.
Full solution dotnet test: 1097 passing (was 1087, +10). Phase 6.2 total is
now 1087+10 = 1097; baseline 906 → +191 net across Phase 6.1 (all streams) +
Phase 6.2 (Streams A, B, C foundation, D data layer).
Stream D follow-up task tracks: RoleGrantsTab CRUD over LdapGroupRoleMapping,
AclsTab write-through + Probe-this-permission diagnostic, draft-diff ACL
section, SignalR PermissionTrieCache invalidation push.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Lands the integration seam between the Server project's OPC UA stack and the
Core.Authorization evaluator. Actual DriverNodeManager dispatch-path wiring
(Read/Write/HistoryRead/Browse/Call/Subscribe/Alarm surfaces) lands in the
follow-up PR on this branch — covered by Task #143 below.
Server.Security additions:
- ILdapGroupsBearer — marker interface a custom IUserIdentity implements to
expose its resolved LDAP group DNs. Parallel to the existing IRoleBearer
(admin roles) — control/data-plane separation per decision #150.
- AuthorizationGate — stateless bridge between Opc.Ua.IUserIdentity and
IPermissionEvaluator. IsAllowed(identity, operation, scope) materializes a
UserAuthorizationState from the identity's LDAP groups, delegates to the
evaluator, and returns a single bool the dispatch paths use to decide
whether to surface BadUserAccessDenied.
- StrictMode knob controls fail-open-during-transition vs fail-closed:
- strict=false (default during rollout) — null identity, identity without
ILdapGroupsBearer, or NotGranted outcome all return true so older
deployments without ACL data keep working.
- strict=true (production target) — any of the above returns false.
The appsetting `Authorization:StrictMode = true` flips deployments over
once their ACL data is populated.
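Decision shape under both modes (a sketch; the ILdapGroupsBearer member name and
BuildSessionState plumbing are assumptions layered on the behaviour described
above):
public bool IsAllowed(IUserIdentity? identity, OpcUaOperation operation, NodeScope scope)
{
    // Null identity or an identity without LDAP groups: lax allows, strict denies.
    if (identity is not ILdapGroupsBearer bearer || bearer.LdapGroupDns.Count == 0)
        return !StrictMode;

    var session = BuildSessionState(bearer);
    var decision = _evaluator.Authorize(session, operation, scope);
    return decision.Verdict == AuthorizationVerdict.Allow
        || (!StrictMode && decision.Verdict == AuthorizationVerdict.NotGranted);
}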
Tests (9 new in Server.Tests/AuthorizationGateTests):
- Null identity — strict denies, lax allows.
- Identity without LDAP groups — strict denies, lax allows.
- LDAP group with matching grant allows.
- LDAP group without grant — strict denies.
- Wrong operation denied (Read-only grant, WriteOperate requested).
- BuildSessionState returns materialized state with LDAP groups + null when
identity doesn't carry them.
Full solution dotnet test: 1087 passing (Phase 6.1 = 1042, Phase 6.2 A = +9,
B = +27, C foundation = +9 = 1087). Pre-existing Client.CLI Subscribe flake
unchanged.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Ships Stream B.1-B.6 — the data-plane authorization engine Phase 6.2 runs on.
Integration into OPC UA dispatch (Stream C — Read / Write / HistoryRead /
Subscribe / Browse / Call etc.) is the next PR on this branch.
New Core.Abstractions:
- OpcUaOperation enum enumerates every OPC UA surface the evaluator gates:
Browse, Read, WriteOperate/Tune/Configure (split by SecurityClassification),
HistoryRead, HistoryUpdate, CreateMonitoredItems, TransferSubscriptions,
Call, AlarmAcknowledge/Confirm/Shelve. Stream C maps each one back to its
dispatch call site.
New Core.Authorization namespace:
- NodeScope record + NodeHierarchyKind — 6-level scope addressing for
Equipment-kind (UNS) namespaces, folder-segment walk for SystemPlatform-kind
(Galaxy). NodeScope carries a Kind selector so the evaluator knows which
hierarchy to descend.
- AuthorizationDecision { Verdict, Provenance } + AuthorizationVerdict
{Allow, NotGranted, Denied} + MatchedGrant. Tri-state per decision #149;
Phase 6.2 only produces Allow + NotGranted, Denied stays reserved for v2.1
Explicit Deny without API break.
- IPermissionEvaluator.Authorize(session, operation, scope).
- PermissionTrie + PermissionTrieNode + TrieGrant. In-memory trie keyed on
the ACL scope hierarchy. CollectMatches walks Cluster → Namespace →
UnsArea → UnsLine → Equipment → Tag (or → FolderSegment(s) → Tag on
Galaxy). Pure additive union — matches that share an LDAP group with the
session contribute flags; OR across levels.
- PermissionTrieBuilder static factory. Build(clusterId, generationId, rows,
scopePaths?) returns a trie for one generation. Cross-cluster rows are
filtered out so the trie is cluster-coherent. Stream C follow-up wires a
real scopePaths lookup from the live DB; tests supply hand-built paths.
- PermissionTrieCache — process-singleton, keyed on (ClusterId, GenerationId).
Install(trie) adds a generation + promotes to "current" when the id is
highest-known (handles out-of-order installs gracefully). Prior generations
retained so an in-flight request against a prior trie still succeeds; GC
via Prune(cluster, keepLatest).
- UserAuthorizationState — per-session cache of resolved LDAP groups +
AuthGenerationId + MembershipVersion + MembershipResolvedUtc. Bounded by
MembershipFreshnessInterval (default 15 min per decision #151) +
AuthCacheMaxStaleness (default 5 min per decision #152).
- TriePermissionEvaluator — default IPermissionEvaluator. Fails closed on
stale sessions (IsStale check short-circuits to NotGranted), on cross-
cluster requests, on empty trie cache. Maps OpcUaOperation → NodePermissions
via MapOperationToPermission (total — every enum value has a mapping; tested).
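The additive union at the heart of the evaluator, roughly (a sketch; sessionGroups
is a HashSet<string> of the session's LDAP group DNs, and the grant member names
are assumptions):
// Grants that share an LDAP group with the session OR their flags together;
// no level can subtract permission (decision #129, additive only).
var effective = NodePermissions.None;
foreach (var grant in trie.CollectMatches(scope))
    if (sessionGroups.Contains(grant.LdapGroup))
        effective |= grant.Permissions;

var required = MapOperationToPermission(operation);
var verdict = (effective & required) == required
    ? AuthorizationVerdict.Allow
    : AuthorizationVerdict.NotGranted;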
Tests (27 new, all pass):
- PermissionTrieTests (7): cluster-level grant cascades to every tag;
equipment-level grant doesn't leak to sibling equipment; multi-group union
ORs flags; no-matching-group returns empty; Galaxy folder-segment grant
doesn't leak to sibling folder; cross-cluster rows don't land in this
cluster's trie; build is idempotent (B.6 invariants).
- TriePermissionEvaluatorTests (8): allow when flag matches; NotGranted when
no matching group; NotGranted when flags insufficient; HistoryRead requires
its own bit (decision-level requirement); cross-cluster session denied;
stale session fails closed; no cached trie denied; MapOperationToPermission
is total across every OpcUaOperation.
- PermissionTrieCacheTests (8): empty cache returns null; install-then-get
round-trips; new generation becomes current; out-of-order install doesn't
downgrade current; invalidate drops one cluster; prune retains most recent;
prune no-op when fewer than keep; cluster isolation.
- UserAuthorizationStateTests (4): fresh is not stale; IsStale after 5 min
default; NeedsRefresh true between freshness + staleness windows.
Full solution dotnet test: 1078 passing (baseline 906, Phase 6.1 = 1042,
Phase 6.2 Stream A = +9, Stream B = +27 = 1078). Pre-existing Client.CLI
Subscribe flake unchanged.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Stream A.1-A.2 per docs/v2/implementation/phase-6-2-authorization-runtime.md.
Seed-data migration (A.3) is a separate follow-up once production LDAP group
DNs are finalised; until then CRUD via the Admin UI handles the fleet setup.
Configuration:
- New AdminRole enum {ConfigViewer, ConfigEditor, FleetAdmin} — string-stored.
- New LdapGroupRoleMapping entity with Id (surrogate PK), LdapGroup (512 chars),
Role (AdminRole enum), ClusterId (nullable, FK to ServerCluster), IsSystemWide,
CreatedAtUtc, Notes.
- EF config: UX_LdapGroupRoleMapping_Group_Cluster unique index on
(LdapGroup, ClusterId) + IX_LdapGroupRoleMapping_Group hot-path index on
LdapGroup for sign-in lookups. Cluster FK cascades on cluster delete.
- Migration 20260419_..._AddLdapGroupRoleMapping generated via `dotnet ef`.
Configuration.Services:
- ILdapGroupRoleMappingService — CRUD surface. Declared as control-plane only
per decision #150; the OPC UA data-path evaluator must NOT depend on this
interface (Phase 6.2 compliance check on control/data-plane separation).
GetByGroupsAsync is the hot-path sign-in lookup.
- LdapGroupRoleMappingService (EF Core impl) enforces the write-time invariant
"exactly one of (ClusterId populated, IsSystemWide=true)" and surfaces
InvalidLdapGroupRoleMappingException on violation. Create auto-populates Id
+ CreatedAtUtc when omitted.
Tests (9 new, all pass) in Configuration.Tests:
- Create sets Id + CreatedAtUtc.
- Create rejects empty LdapGroup.
- Create rejects IsSystemWide=true with populated ClusterId.
- Create rejects IsSystemWide=false with null ClusterId.
- GetByGroupsAsync returns matching rows only.
- GetByGroupsAsync with empty input returns empty (no full-table scan).
- ListAllAsync orders by group then cluster.
- Delete removes the target row.
- Delete of unknown id is a no-op.
Microsoft.EntityFrameworkCore.InMemory 10.0.0 added to Configuration.Tests for
the service-level tests (schema-compliance tests still use the live SQL
fixture).
SchemaComplianceTests updated to expect the new LdapGroupRoleMapping table.
Full solution dotnet test: 1051 passing (baseline 906, Phase 6.1 shipped at
1042, Phase 6.2 Stream A adds 9 = 1051). Pre-existing Client.CLI Subscribe
flake unchanged.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Ships the data + runtime layer of Stream E. The SignalR hub and Blazor /hosts
page refresh (E.2-E.3) are follow-up work paired with the visual-compliance
review per Phase 6.4 patterns — documented as a deferred follow-up below.
Configuration:
- New entity DriverInstanceResilienceStatus with:
DriverInstanceId, HostName (composite PK),
LastCircuitBreakerOpenUtc, ConsecutiveFailures, CurrentBulkheadDepth,
LastRecycleUtc, BaselineFootprintBytes, CurrentFootprintBytes,
LastSampledUtc.
- Separate from DriverHostStatus (per-host connectivity view) so a Running
host that has tripped its breaker or is nearing its memory ceiling shows up
distinctly on Admin /hosts. Admin page left-joins both for display.
- OtOpcUaConfigDbContext + Fluent-API config + IX_DriverResilience_LastSampled
index for the stale-sample filter query.
- EF migration: 20260419124034_AddDriverInstanceResilienceStatus.
Core.Resilience:
- DriverResilienceStatusTracker — process-singleton in-memory tracker keyed on
(DriverInstanceId, HostName). CapabilityInvoker + MemoryTracking +
MemoryRecycle callers record failure/success/breaker-open/recycle/footprint
events; a HostedService (Stream E.2 follow-up) samples this tracker every
5 s and persists to the DB. Pure in-memory keeps tests fast + the core
free of EF/SQL dependencies.
Tests:
- DriverResilienceStatusTrackerTests (9 new, all pass): TryGet-before-write
returns null; failures accumulate; success resets; breaker/recycle/footprint
fields populate; per-host isolation; snapshot returns all pairs; concurrent
writes don't lose counts.
- SchemaComplianceTests: expected-tables list updated to include the new
DriverInstanceResilienceStatus table.
Full solution dotnet test: 1042 passing (baseline 906, +136 for Phase 6.1 so
far across Streams A/B/C/D/E.1). Pre-existing Client.CLI Subscribe flake
unchanged.
Deferred to follow-up PR (E.2/E.3):
- ResilienceStatusPublisher HostedService that samples DriverResilienceStatusTracker
every 5 s + upserts DriverInstanceResilienceStatus rows.
- Admin FleetStatusHub SignalR hub pushing LastCircuitBreakerOpenUtc /
CurrentBulkheadDepth / LastRecycleUtc on change.
- Admin /hosts Blazor column additions (red badge when
ConsecutiveFailures > breakerThreshold / 2). Visual-compliance reviewer
signoff alongside Phase 6.4 admin-ui patterns.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes Stream D per docs/v2/implementation/phase-6-1-resilience-and-observability.md.
New Configuration.LocalCache types (alongside the existing single-file
LiteDbConfigCache):
- GenerationSealedCache — file-per-generation sealed snapshots per decision
#148. Each SealAsync writes <cache-root>/<clusterId>/<generationId>.db as a
read-only LiteDB file, then atomically publishes the CURRENT pointer via
temp-file + File.Replace. Prior-generation files stay on disk for audit.
Mixed-generation reads are structurally impossible: ReadCurrentAsync opens
the single file named by CURRENT. Corruption of the pointer or the sealed
file raises GenerationCacheUnavailableException — fails closed, never falls
back silently to an older generation. TryGetCurrentGenerationId returns the
pointer value or null for diagnostics.
- StaleConfigFlag — thread-safe (Volatile.Read/Write) bool. MarkStale when a
read fell back to the cache; MarkFresh when a central-DB read succeeded.
Surfaced on /healthz body and Admin /hosts (Stream C wiring already in
place).
- ResilientConfigReader — wraps a central-DB fetch function with the Stream
D.2 pipeline: timeout 2 s → retry N× jittered (skipped when retryCount=0) →
fallback to the sealed cache. Toggles StaleConfigFlag per outcome. Read path
only — the write path is expected to bypass this wrapper and fail hard on
DB outage so inconsistent writes never land. Cancellation passes through
and is NOT retried.
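Rough shape of the read path (a sketch; the shipped reader composes this via
Polly, and Jitter / _retryCount / _staleFlag stand in for its actual members):
public async Task<T> ReadAsync<T>(
    Func<CancellationToken, Task<T>> centralFetch,
    Func<Task<T>> sealedFallback,
    CancellationToken ct)
{
    for (var attempt = 0; attempt <= _retryCount; attempt++)
    {
        try
        {
            using var timeout = CancellationTokenSource.CreateLinkedTokenSource(ct);
            timeout.CancelAfter(TimeSpan.FromSeconds(2));        // Stream D.2 timeout
            var value = await centralFetch(timeout.Token);
            _staleFlag.MarkFresh();
            return value;
        }
        catch (OperationCanceledException) when (ct.IsCancellationRequested)
        {
            throw;                                 // caller cancellation is never retried
        }
        catch (Exception) when (attempt < _retryCount)
        {
            await Task.Delay(Jitter(attempt), ct); // jittered backoff between attempts
        }
        catch (Exception)
        {
            break;                                 // retries exhausted: fall back
        }
    }

    _staleFlag.MarkStale();
    return await sealedFallback();                 // throws when no sealed snapshot exists
}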
Configuration.csproj:
- Polly.Core 8.6.6 + Microsoft.Extensions.Logging.Abstractions added.
Tests (17 new, all pass):
- GenerationSealedCacheTests (10): first-boot-no-snapshot throws
GenerationCacheUnavailableException (D.4 scenario C), seal-then-read round
trip, sealed file is ReadOnly on disk, pointer advances to latest, prior
generation file preserved, corrupt sealed file fails closed, missing sealed
file fails closed, corrupt pointer fails closed (D.4 scenario B), same
generation sealed twice is idempotent, independent clusters don't
interfere.
- ResilientConfigReaderTests (4): central-DB success returns value + marks
fresh; central-DB failure exhausts retries + falls back to cache + marks
stale (D.4 scenario A); central-DB + cache both unavailable throws;
cancellation not retried.
- StaleConfigFlagTests (3): default is fresh; toggles; concurrent writes
converge.
Full solution dotnet test: 1033 passing (baseline 906, +127 net across Phase
6.1 Streams A/B/C/D). Pre-existing Client.CLI Subscribe flake unchanged.
Integration into Configuration read paths (DriverInstance enumeration,
LdapGroupRoleMapping fetches, etc.) + the sp_PublishGeneration hook that
writes sealed files lands in the Phase 6.1 Stream E / Admin-refresh PR where
the DB integration surfaces are already touched. Existing LiteDbConfigCache
continues serving its single-file role for the NodeBootstrap path.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes Stream C per docs/v2/implementation/phase-6-1-resilience-and-observability.md.
Core.Observability (new namespace):
- DriverHealthReport — pure-function aggregation over DriverHealthSnapshot list.
Empty fleet = Healthy. Any Faulted = Faulted. Any Unknown/Initializing (no
Faulted) = NotReady. Any Degraded or Reconnecting (no Faulted, no NotReady)
= Degraded. Else Healthy. HttpStatus(verdict) maps to the Stream C.1 state
matrix: Healthy/Degraded → 200, NotReady/Faulted → 503.
- LogContextEnricher — Serilog LogContext wrapper. Push(id, type, capability,
correlationId) returns an IDisposable scope; inner log calls carry
DriverInstanceId / DriverType / CapabilityName / CorrelationId structured
properties automatically. NewCorrelationId = 12-hex-char GUID slice for
cases where no OPC UA RequestHeader.RequestHandle is in flight.
CapabilityInvoker — now threads LogContextEnricher around every ExecuteAsync /
ExecuteWriteAsync call site. OtOpcUaServer passes driver.DriverType through
so logs correlate to the driver type too. Every capability call emits
structured fields per the Stream C.4 compliance check.
Server.Observability:
- HealthEndpointsHost — standalone HttpListener on http://localhost:4841/
(loopback avoids Windows URL-ACL elevation; remote probing via reverse
proxy or explicit netsh urlacl grant). Routes:
/healthz → 200 when (configDbReachable OR usingStaleConfig); 503 otherwise.
Body: status, uptimeSeconds, configDbReachable, usingStaleConfig.
/readyz → DriverHealthReport.Aggregate + HttpStatus mapping.
Body: verdict, drivers[], degradedDrivers[], uptimeSeconds.
anything else → 404.
Disposal cooperative with the HttpListener shutdown.
- OpcUaApplicationHost starts the health host after the OPC UA server comes up
and disposes it on shutdown. New OpcUaServerOptions knobs:
HealthEndpointsEnabled (default true), HealthEndpointsPrefix (default
http://localhost:4841/).
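The /healthz decision reduces to (a sketch; the hook fields and status strings
are illustrative):
// 200 when the central DB is reachable OR the node is knowingly running on
// the sealed cache; 503 only when neither holds.
bool configDbReachable = _configDbReachable();
bool usingStaleConfig  = _usingStaleConfig();
int statusCode = (configDbReachable || usingStaleConfig) ? 200 : 503;

var body = JsonSerializer.Serialize(new
{
    status = statusCode == 200 ? "healthy" : "unhealthy",
    uptimeSeconds = (long)_uptime.Elapsed.TotalSeconds,
    configDbReachable,
    usingStaleConfig,
});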
Program.cs:
- Serilog pipeline adds Enrich.FromLogContext + opt-in JSON file sink via
`Serilog:WriteJson = true` appsetting. Uses Serilog.Formatting.Compact's
CompactJsonFormatter (one JSON object per line — SIEMs like Splunk,
Datadog, Graylog ingest without a regex parser).
Server.Tests:
- Existing 3 OpcUaApplicationHost integration tests now set
HealthEndpointsEnabled=false to avoid port :4841 collisions under parallel
execution.
- New HealthEndpointsHostTests (9): /healthz healthy empty fleet; stale-config
returns 200 with flag; unreachable+no-cache returns 503; /readyz empty/
Healthy/Faulted/Degraded/Initializing drivers return correct status and
bodies; unknown path → 404. Uses ephemeral ports via Interlocked counter.
Core.Tests:
- DriverHealthReportTests (8): empty fleet, all-healthy, any-Faulted trumps,
any-NotReady without Faulted, Degraded without Faulted/NotReady, HttpStatus
per-verdict theory.
- LogContextEnricherTests (8): all 4 properties attach; scope disposes cleanly;
NewCorrelationId shape; null/whitespace driverInstanceId throws.
- CapabilityInvokerEnrichmentTests (2): inner logs carry structured
properties; no context leak outside the call site.
Full solution dotnet test: 1016 passing (baseline 906, +110 for Phase 6.1 so
far across Streams A+B+C). Pre-existing Client.CLI Subscribe flake unchanged.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes out Stream B per docs/v2/implementation/phase-6-1-resilience-and-observability.md.
Core.Abstractions:
- IDriverSupervisor — process-level supervisor contract a Tier C driver's
out-of-process topology provides (Galaxy Proxy/Supervisor implements this in
a follow-up Driver.Galaxy wiring PR). Concerns: DriverInstanceId + RecycleAsync.
Tier A/B drivers don't implement this; Stream B code asserts tier == C before
ever calling it.
Core.Stability:
- MemoryRecycle — companion to MemoryTracking. On HardBreach, invokes the
supervisor IFF tier == C AND a supervisor is wired. Tier A/B HardBreach logs
a promotion-to-Tier-C recommendation and returns false. Soft/None/Warming
never triggers a recycle at any tier.
- ScheduledRecycleScheduler — Tier C opt-in periodic recycler per decision #67.
Ctor throws for Tier A/B (structural guard — scheduled recycle on an
in-process driver would kill every OPC UA session and every co-hosted
driver). TickAsync(now) advances the schedule by one interval per fire;
RequestRecycleNowAsync drives an ad-hoc recycle without shifting the cron.
- WedgeDetector — demand-aware per decision #147. Classify(state, demand, now)
returns:
* NotApplicable when driver state != Healthy
* Idle when Healthy + no pending work (bulkhead=0 && monitored=0 && historic=0)
* Healthy when Healthy + pending work + progress within threshold
* Faulted when Healthy + pending work + no progress within threshold
Threshold clamps to min 60 s. DemandSignal.HasPendingWork ORs the three counters.
The three false-wedge cases the plan calls out all stay Healthy: idle
subscription-only, slow historian backfill making progress, write-only burst
with drained bulkhead.
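Classification order, roughly (a sketch; the verdict/state enum names and the
progress field are assumptions around the Classify signature above):
public WedgeVerdict Classify(DriverState state, DemandSignal demand, DateTimeOffset now)
{
    if (state != DriverState.Healthy)
        return WedgeVerdict.NotApplicable;        // non-Healthy drivers are out of scope

    if (!demand.HasPendingWork)                   // bulkhead=0 && monitored=0 && historic=0
        return WedgeVerdict.Idle;

    var threshold = _threshold >= MinThreshold ? _threshold : MinThreshold;   // clamp to the 60 s minimum
    return now - demand.LastProgressUtc <= threshold
        ? WedgeVerdict.Healthy                    // pending work, progress within threshold
        : WedgeVerdict.Faulted;                   // pending work, no recent progress
}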
Tests (22 new, all pass):
- MemoryRecycleTests (7): Tier C hard-breach requests recycle; Tier A/B
hard-breach never requests; Tier C without supervisor no-ops; soft-breach
at every tier never requests; None/Warming never request.
- ScheduledRecycleSchedulerTests (6): ctor throws for A/B; zero/negative
interval throws; tick before due no-ops; tick at/after due fires once and
advances; RequestRecycleNow fires immediately without shifting schedule;
multiple fires across ticks advance one interval each.
- WedgeDetectorTests (9): threshold clamp to 60 s; unhealthy driver always
NotApplicable; idle subscription stays Idle; pending+fresh progress stays
Healthy; pending+stale progress is Faulted; MonitoredItems active but no
publish is Faulted; MonitoredItems active with fresh publish stays Healthy;
historian backfill with fresh progress stays Healthy; write-only burst with
empty bulkhead is Idle; HasPendingWork theory for any non-zero counter.
Full solution dotnet test: 989 passing (baseline 906, +83 for Phase 6.1 so far).
Pre-existing Client.CLI Subscribe flake unchanged.
Stream B complete. Next up: Stream C (health endpoints + structured logging).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Stream B.1 — registry invariant:
- DriverTypeMetadata gains a required `DriverTier Tier` field. Every registered
driver type must declare its stability tier so the downstream MemoryTracking,
MemoryRecycle, and resilience-policy layers can resolve the right defaults.
Stamped-at-registration-time enforcement makes the "every driver type has a
non-null Tier" compliance check structurally impossible to fail.
- DriverTypeRegistry API unchanged; one new property on the record.
Stream B.2 — MemoryTracking (Core.Stability):
- Tier-agnostic tracker per decision #146: captures baseline as the median of
samples collected during a post-init warmup window (default 5 min), then
classifies each subsequent sample with the hybrid formula
`soft = max(multiplier × baseline, baseline + floor)`, `hard = 2 × soft`.
- Per-tier constants wired: Tier A mult=3 floor=50 MB, Tier B mult=3 floor=100 MB,
Tier C mult=2 floor=500 MB.
- Never kills. Hard-breach action returns HardBreach; the supervisor that acts
on that signal (MemoryRecycle) is Tier C only per decisions #74, #145 and
lands in the next B.3 commit on this branch.
- Two phases: WarmingUp (samples collected, Warming returned) and Steady
(baseline captured, soft/hard checks active). Transition is automatic when
the warmup window elapses.
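Worked examples of the hybrid formula with the constants above:
Tier A, 10 MB baseline: soft = max(3 × 10, 10 + 50) = 60 MB (floor wins), hard = 120 MB.
Tier A, 40 MB baseline: soft = max(3 × 40, 40 + 50) = 120 MB (multiplier wins), hard = 240 MB.
Tier C, 400 MB baseline: soft = max(2 × 400, 400 + 500) = 900 MB (floor wins), hard = 1800 MB.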
Tests (15 new, all pass):
- Warming phase returns Warming until the window elapses.
- Window-elapsed captures median baseline + transitions to Steady.
- Per-tier constants match decision #146 table exactly.
- Soft threshold uses max() — small baseline → floor wins; large baseline →
multiplier wins.
- Hard = 2 × soft.
- Sample below soft = None; at soft = SoftBreach; at/above hard = HardBreach.
- DriverTypeRegistry: theory asserts Tier round-trips for A/B/C.
Full solution dotnet test: 963 passing (baseline 906, +57 net for Phase 6.1
Stream A + Stream B.1/B.2). Pre-existing Client.CLI Subscribe flake unchanged.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Every OnReadValue / OnWriteValue now routes through the process-singleton
DriverResiliencePipelineBuilder's CapabilityInvoker. Read / Write dispatch
paths gain timeout + per-capability retry + per-(driver, host) circuit breaker
+ bulkhead without touching the individual driver implementations.
Wiring:
- OpcUaApplicationHost: new optional DriverResiliencePipelineBuilder ctor
parameter (default null → instance-owned builder). Keeps the 3 test call
sites that construct OpcUaApplicationHost directly unchanged.
- OtOpcUaServer: requires the builder in its ctor; constructs one
CapabilityInvoker per driver at CreateMasterNodeManager time with default
Tier A DriverResilienceOptions. TODO: Stream B.1 will wire real per-driver-
type tiers via DriverTypeRegistry; Phase 6.1 follow-up will read the
DriverInstance.ResilienceConfig JSON column for per-instance overrides.
- DriverNodeManager: takes a CapabilityInvoker in its ctor. OnReadValue wraps
the driver's ReadAsync through ExecuteAsync(DriverCapability.Read, hostName,
...); OnWriteValue wraps WriteAsync through ExecuteWriteAsync(hostName,
isIdempotent, ...) where isIdempotent comes from the new
_writeIdempotentByFullRef map populated at Variable() registration from
DriverAttributeInfo.WriteIdempotent.
HostName defaults to driver.DriverInstanceId for now — a single-host pipeline
per driver. Multi-host drivers (Modbus with N PLCs) will expose their own per-
call host resolution in a follow-up so failing PLCs can trip per-PLC breakers
without poisoning siblings (decision #144).
Test fixup:
- FlakeyDriverIntegrationTests.Read_SurfacesSuccess_AfterTransientFailures:
bumped TimeoutSeconds=2 → 30. 10 retries at exponential backoff with jitter
can exceed 2s under parallel-test-run CPU pressure; the test asserts retry
behavior, not timeout budget, so the longer slack keeps it deterministic.
Full solution dotnet test: 948 passing. Pre-existing Client.CLI Subscribe
flake unchanged.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
IDriver.DriverInstanceId is declared as string in Core.Abstractions; keeping
the pipeline key as Guid meant every call site would need .ToString() / Guid.Parse
at the boundary. Switching the Resilience types to string removes that friction
and lets OtOpcUaServer pass driver.DriverInstanceId directly to the builder in
the upcoming server-dispatch wiring PR.
- DriverResiliencePipelineBuilder.GetOrCreate + Invalidate + PipelineKey
- CapabilityInvoker.ctor + _driverInstanceId field
Tests: all 48 Core.Tests still pass. The Invalidate test's keepId / dropId now
use distinct "drv-keep" / "drv-drop" literals (previously both were distinct
Guid.NewGuid() values, which the sed-driven refactor had collapsed to the same
literal — caught pre-commit).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Per-tag opt-in for write-retry per docs/v2/plan.md decisions #44, #45, #143.
Default is false — writes never auto-retry unless the driver author has marked
the tag as safe to replay.
Core.Abstractions:
- DriverAttributeInfo gains `bool WriteIdempotent = false` at the end of the
positional record (back-compatible; every existing call site uses the default).
Driver.Modbus:
- ModbusTagDefinition gains `bool WriteIdempotent = false`. Safe candidates
documented in the param XML: holding-register set-points, configuration
registers. Unsafe: edge-triggered coils, counter-increment addresses.
- ModbusDriver.DiscoverAsync propagates t.WriteIdempotent into
DriverAttributeInfo.WriteIdempotent.
Driver.S7:
- S7TagDefinition gains `bool WriteIdempotent = false`. Safe candidates:
DB word/dword set-points, configuration DBs. Unsafe: M/Q bits that drive
edge-triggered program routines.
- S7Driver.DiscoverAsync propagates the flag.
Stream A.5 integration tests (FlakeyDriverIntegrationTests, 4 new) exercise
the invoker + flaky-driver contract the plan enumerates:
- Read with 5 transient failures succeeds on the 6th attempt (RetryCount=10).
- Non-idempotent write with RetryCount=5 configured still fails on the first
failure — no replay (decision #44 guard at the ExecuteWriteAsync surface).
- Idempotent write with 2 transient failures succeeds on the 3rd attempt.
- Two hosts on the same driver have independent breakers — dead-host trips
its breaker but live-host's first call still succeeds.
Propagation tests:
- ModbusDriverTests: SetPoint WriteIdempotent=true flows into
DriverAttributeInfo; PulseCoil default=false.
- S7DiscoveryAndSubscribeTests: same pattern for DBx SetPoint vs M-bit.
Full solution dotnet test: 947 passing (baseline 906, +41 net across Stream A
so far). Pre-existing Client.CLI Subscribe flake unchanged.
Stream A's remaining work (wiring CapabilityInvoker into DriverNodeManager's
OnReadValue / OnWriteValue / History / Subscribe dispatch paths) is the
server-side integration piece + needs DI wiring for the pipeline builder —
lands in the next PR on this branch.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
One invoker per (DriverInstance, IDriver) pair; callers invoke
ExecuteAsync(capability, host, callSite) and the invoker resolves the correct
pipeline from the shared
DriverResiliencePipelineBuilder. The options accessor is a Func so Admin-edit
+ pipeline-invalidate takes effect without restarting the invoker or the
driver host.
ExecuteWriteAsync(isIdempotent) is the explicit write-safety surface:
- isIdempotent=false routes through a side pipeline with RetryCount=0 regardless
of what the caller configured. The cache key carries a "::non-idempotent"
suffix so it never collides with the retry-enabled write pipeline.
- isIdempotent=true routes through the normal Write pipeline. If the user has
configured Write retries (opt-in), the idempotent tag gets them; otherwise
default-0 still wins.
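Routing sketch at the write surface (the "::non-idempotent" key suffix is per the
text above; the builder call shape and WithWriteRetriesDisabled are hypothetical
stand-ins):
// Key + policy selection for one write (a snippet, not the full method).
var options = _optionsAccessor();                    // Func: Admin edits apply without a restart
var pipelineHost = isIdempotent ? host : host + "::non-idempotent";
if (!isIdempotent)
    options = options.WithWriteRetriesDisabled();    // hypothetical helper: forces RetryCount=0

var pipeline = _builder.GetOrCreate(_driverInstanceId, pipelineHost,
    DriverCapability.Write, options);
// The write delegate then runs through this pipeline: idempotent tags pick up
// any opt-in Write retries, everything else keeps the default RetryCount=0.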
The server dispatch layer (next PR) reads WriteIdempotentAttribute on each tag
definition once at driver-init time and feeds the boolean into ExecuteWriteAsync.
Tests (6 new):
- Read retries on transient failure; returns value from call site.
- Write non-idempotent does NOT retry even when policy has 3 retries configured
(the explicit decision-#44 guard at the dispatch surface).
- Write idempotent retries when policy allows.
- Write with default tier-A policy (RetryCount=0) never retries regardless of
idempotency flag.
- Different hosts get independent pipelines.
Core.Tests now 44 passing (was 38). Invoker doc-refs completed (the XML comment
on WriteIdempotentAttribute no longer references a non-existent type).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Lands the first chunk of the Phase 6.1 Stream A resilience layer per
docs/v2/implementation/phase-6-1-resilience-and-observability.md §Stream A.
Downstream CapabilityInvoker (A.3) + driver-dispatch wiring land in follow-up
PRs on the same branch.
Core.Abstractions additions:
- WriteIdempotentAttribute — marker for tag-definition records that opt into
auto-retry on IWritable.WriteAsync. Absence = no retry per decisions #44, #45,
#143. Read once via reflection at driver-init time; no per-write cost.
- DriverCapability enum — enumerates the 8 capability surface points
(Read / Write / Discover / Subscribe / Probe / AlarmSubscribe / AlarmAcknowledge
/ HistoryRead). AlarmAcknowledge is write-shaped (no retry by default).
- DriverTier enum — A/B/C per driver-stability.md §2-4. Stream B.1 wires this
into DriverTypeMetadata; surfaced here because the resilience policy defaults
key on it.
Core.Resilience new namespace:
- DriverResilienceOptions — per-tier × per-capability policy defaults.
GetTierDefaults(tier) is the source of truth:
* Tier A: Read 2s/3 retries, Write 2s/0 retries, breaker threshold 5
* Tier B: Read 4s/3, Write 4s/0, breaker threshold 5
* Tier C: Read 10s/1, Write 10s/0, breaker threshold 0 (supervisor handles
process-level breaker per decision #68)
Resolve(capability) overlays CapabilityPolicies on top of the defaults.
- DriverResiliencePipelineBuilder — composes Timeout → Retry (capability-
permitting, never on cancellation) → CircuitBreaker (tier-permitting) →
Bulkhead. Pipelines cached in a lock-free ConcurrentDictionary keyed on
(DriverInstanceId, HostName, DriverCapability) per decision #144 — one dead
PLC behind a multi-device driver does not open the breaker for healthy
siblings. Invalidate(driverInstanceId) supports Admin-triggered reload.
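Cache shape per decision #144, roughly (a sketch; BuildPipeline stands in for the
Timeout → Retry → CircuitBreaker → Bulkhead composition):
private readonly ConcurrentDictionary<
    (string DriverInstanceId, string HostName, DriverCapability Capability),
    ResiliencePipeline> _pipelines = new();

public ResiliencePipeline GetOrCreate(string driverInstanceId, string hostName,
    DriverCapability capability, DriverResilienceOptions options) =>
    _pipelines.GetOrAdd((driverInstanceId, hostName, capability),
        key => BuildPipeline(key.Capability, options.Resolve(key.Capability)));

public void Invalidate(string driverInstanceId)
{
    // Admin-triggered reload: drop every cached pipeline for this instance.
    foreach (var key in _pipelines.Keys)
        if (key.DriverInstanceId == driverInstanceId)
            _pipelines.TryRemove(key, out _);
}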
Tests (30 new, all pass):
- DriverResilienceOptionsTests: tier-default coverage for every capability,
Write + AlarmAcknowledge never retry at any tier, Tier C disables breaker,
resolve-with-override layering.
- DriverResiliencePipelineBuilderTests: Read retries transients, Write does NOT
retry on failure (decision #44 guard), dead-host isolation from sibling hosts,
pipeline reuse for same triple, per-capability isolation, breaker opens after
threshold on Tier A, timeout fires, cancellation is not retried,
invalidation scoped to matching instance.
Polly.Core 8.6.6 added to Core.csproj. Full solution dotnet test: 936 passing
(baseline 906 + 30 new). One pre-existing Client.CLI Subscribe flake unchanged
by this PR.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>