Commit Graph

182 Commits

Author SHA1 Message Date
Joseph Doherty
2b2991c593 EquipmentNodeWalker — pure-function UNS tree materialization (ADR-001 Task A, task #210). The walker traverses the Config-DB snapshot for a single Equipment-kind namespace (Areas / Lines / Equipment / Tags) and streams IAddressSpaceBuilder.Folder + Variable + AddProperty calls to materialize the canonical 5-level Unified Namespace browse tree that decisions #116-#121 promise external consumers. Pure function: no OPC UA SDK dependency, no DB access, no state — consumes pre-loaded EF Core row collections + streams into the supplied builder. Server-side wiring (load snapshot → call walker → per-tag capability probe) is Task B's scope, alongside NodeScopeResolver's Config-DB join + the ACL integration test that closes task #195. This PR is the Core.OpcUa primitive the server will consume. Walk algorithm — content is grouped up-front (lines by area, equipment by line, tags by equipment) into OrdinalIgnoreCase dictionaries so the per-level nested foreach stays O(N+M) rather than O(N·M) at each UNS level; orderings are deterministic on Name with StringComparer.Ordinal so diffs across runs (e.g. integration-test assertions) are stable. Areas → Lines → Equipment emitted as Folder nodes with browse-name = Name per decision #120. Under each Equipment folder: five identifier properties per decision #121 (EquipmentId + EquipmentUuid always; MachineCode always — it's a required column on the entity; ZTag + SAPID skipped when null to avoid empty-string property noise); IdentificationFolderBuilder.Build materializes the OPC 40010 sub-folder when HasAnyFields(equipment) returns true, skipped otherwise to avoid a pointless empty folder; then one Variable node per Tag row bound to this Equipment (Tag.EquipmentId non-null matches Equipment.EquipmentId) emitted in Name order. Tags with null EquipmentId are walker-skipped — those are SystemPlatform-kind (Galaxy) tags that take the driver-native DiscoverAsync path per decision #120. DriverAttributeInfo construction: FullName = Tag.TagConfig (driver-specific wire-level address); DriverDataType parsed from Tag.DataType which stores the enum name string per decision #138; unparseable values fall back to DriverDataType.String so a one-off driver-specific type doesn't abort the whole walk (driver still sees the original address at runtime + can surface its own typed value via the variant). Address validation is deliberately NOT done at build time per ADR-001 Option A: unreachable addresses surface as OPC UA Bad status via the natural driver-read failure path at runtime, legible to operators through their Admin UI + OPC UA client inspection. Eight new EquipmentNodeWalkerTests: empty content emits nothing; Area/Line/Equipment folder emission order matches Name-sorted deterministic traversal; five identifier properties appear on Equipment nodes with correct values, ZTag + SAPID skipped when null + emitted when non-null; Identification sub-folder materialized when at least one OPC 40010 field is non-null + omitted when all are null; tags with matching EquipmentId emit as Variable nodes under the Equipment folder in Name order, tags with null EquipmentId walker-skipped; unparseable DataType falls back to String. RecordingBuilder test double captures Folder/Variable/Property calls into a tree structure tests can navigate. Core project builds 0 errors; Core.Tests 190/190 (was 182, +8 new walker tests). No Server/Admin changes — Task B lands the server-side wiring + consumes this walker from DriverNodeManager.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 02:39:00 -04:00
Joseph Doherty
f9bc301c33 Client rename residuals: lmxopcua-cli → otopcua-cli + LmxOpcUaClient → OtOpcUaClient with migration shim. Closes task #208 (the executable-name + LocalAppData-folder slice that was called out in Client.CLI.md / Client.UI.md as a deliberately-deferred residual of the Phase 0 rename). Six source references flipped to the canonical OtOpcUaClient spelling: Program.cs CliFx executable name + description (lmxopcua-cli → otopcua-cli), DefaultApplicationConfigurationFactory.cs ApplicationName + ApplicationUri (LmxOpcUaClient + urn:localhost:LmxOpcUaClient → OtOpcUaClient + urn:localhost:OtOpcUaClient), OpcUaClientService.CreateSessionAsync session-name arg, ConnectionSettings.CertificateStorePath default, MainWindowViewModel.CertificateStorePath default, JsonSettingsService.SettingsDir. Two consuming tests (ConnectionSettingsTests + MainWindowViewModelTests) updated to assert the new canonical name. New ClientStoragePaths static helper at src/ZB.MOM.WW.OtOpcUa.Client.Shared/ClientStoragePaths.cs is the migration shim — single entry point for the PKI root + pki subpath, runs a one-shot legacy-folder probe on first resolution: if {LocalAppData}/LmxOpcUaClient/ exists + {LocalAppData}/OtOpcUaClient/ does not, Directory.Move renames it in place (atomic on NTFS within the same volume) so trusted server certs + saved connection settings persist across the rename without operator action. Idempotent per-process via a Lock-guarded _migrationChecked flag so repeated CertificateStorePath getter calls on the hot path pay no IO cost beyond the first. Fresh-install path (neither folder exists) + already-migrated path (only canonical exists) + manual-override path (both exist — developer has set up something explicit) are all no-ops that leave state alone. IOException on the Directory.Move is swallowed + logged as a false return so a concurrent peer process losing the race doesn't crash the consumer; the losing process falls back to whatever state exists. Five new ClientStoragePathsTests assert: GetRoot ends with canonical name under LocalAppData, GetPkiPath nests pki under root, CanonicalFolderName is OtOpcUaClient, LegacyFolderName is LmxOpcUaClient (the migration contract — a typo here would leak the legacy folder past the shim), repeat invocation returns false after first-touch arms the in-process guard. Doc-side residual-explanation notes in docs/Client.CLI.md + docs/Client.UI.md are dropped now that the rename is real; replaced with a short "pre-#208 dev boxes migrate automatically on first launch" note that points at ClientStoragePaths. Sample CLI invocations in Client.CLI.md updated via sed from lmxopcua-cli to otopcua-cli across every command block (14 replacements). Pre-existing staleness in SubscribeCommandTests.Execute_PrintsSubscriptionMessage surfaced during the test run — the CLI's subscribe command has long since switched to an aggregate "Subscribed to {count}/{total} nodes (interval: ...)" output format but the test still asserted the original single-node form. Updated the assertion to match current output + added a comment explaining the change; this is unrelated to the rename but was blocking a green Client.CLI.Tests run. Full solution build 0 errors; Client.Shared.Tests 136/136 + 5 new shim tests passing; Client.UI.Tests 98/98; Client.CLI.Tests 52/52 (was 51/52 before the subscribe-test fix). No Admin/Core/Server changes — this touches only the client layer.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 01:50:40 -04:00
Joseph Doherty
5c0d3154c1 Roslyn analyzer — detect unwrapped driver-capability calls (OTOPCUA0001). Closes task #200. New netstandard2.0 analyzer project src/ZB.MOM.WW.OtOpcUa.Analyzers registered as an <Analyzer>-item ProjectReference from the Server csproj so the warning fires at every Server compile. First (and only so far) rule OTOPCUA0001 — "Driver capability call must be wrapped in CapabilityInvoker" — walks every InvocationOperation in the AST + trips when (a) the target method implements one of the seven guarded capability interfaces (IReadable / IWritable / ITagDiscovery / ISubscribable / IHostConnectivityProbe / IAlarmSource / IHistoryProvider) AND (b) the method's return type is Task, Task<T>, ValueTask, or ValueTask<T> — the async-wire-call constraint narrows the rule to the surfaces the Phase 6.1 pipeline actually wraps + sidesteps pure in-memory accessors like IHostConnectivityProbe.GetHostStatuses() which would trigger false positives AND (c) the call does NOT sit inside a lambda argument passed to CapabilityInvoker.ExecuteAsync / ExecuteWriteAsync / AlarmSurfaceInvoker.*. The wrapper detection walks up the syntax tree from the call site, finds any enclosing InvocationExpressionSyntax whose method's containing type is one of the wrapper classes, + verifies the call lives transitively inside that invocation's AnonymousFunctionExpressionSyntax argument — a sibling "result = await driver.ReadAsync(...)" followed by a separate invoker.ExecuteAsync(...) call does NOT satisfy the wrapping rule + the analyzer flags it (regression guard in the 5th test). Five xunit-v3 + Shouldly tests at tests/ZB.MOM.WW.OtOpcUa.Analyzers.Tests: direct ReadAsync in server namespace trips; wrapped ReadAsync inside CapabilityInvoker.ExecuteAsync lambda passes; direct WriteAsync trips; direct DiscoverAsync trips; sneaky pattern — read outside the lambda + ExecuteAsync with unrelated lambda nearby — still trips. Hand-rolled test harness compiles a stub-plus-user snippet via CSharpCompilation.WithAnalyzers + runs GetAnalyzerDiagnosticsAsync directly, deliberately avoiding Microsoft.CodeAnalysis.CSharp.Analyzer.Testing.XUnit because that package pins to xunit v2 + this repo is on xunit.v3 everywhere else. RS2008 release-tracking noise suppressed by adding AnalyzerReleases.Shipped.md + AnalyzerReleases.Unshipped.md as AdditionalFiles, which is the canonical Roslyn-analyzer hygiene path. Analyzer DLL referenced from Server.csproj via ProjectReference with OutputItemType=Analyzer + ReferenceOutputAssembly=false — the DLL ships as a compiler plugin, not a runtime dependency. Server build validates clean: the analyzer activates on every Server file but finds zero violations, which confirms the Phase 6.1 wrapping work done in prior PRs is complete + the analyzer is now the regression guard preventing the next new capability surface from being added raw. slnx updated with both the src + tests project entries. Full solution build clean, analyzer suite 5/5 passing.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 00:52:40 -04:00
Joseph Doherty
ef53553e9d OTel Prometheus exporter wiring — RedundancyMetrics meter now scraped at /metrics. Closes task #201. Picked Prometheus over OTLP per the earlier recommendation (pull-based means no OTel Collector deployment required for the common K8s/containers case; the endpoint ASP.NET-hosts inside the Admin app already, so one less moving part). Adds two NuGet refs to the Admin csproj: OpenTelemetry.Extensions.Hosting 1.15.2 (stable) + OpenTelemetry.Exporter.Prometheus.AspNetCore 1.15.2-beta.1 (the exporter has historically been beta-only; rest of the OTel ecosystem treats it as production-acceptable + it's what the upstream OTel docs themselves recommend for AspNetCore hosts). Program.cs gains a Metrics:Prometheus:Enabled toggle (defaults true; setting to false disables both the MeterProvider registration + the scrape endpoint entirely for locked-down deployments). When enabled, AddOpenTelemetry().WithMetrics() registers a MeterProvider that subscribes to the "ZB.MOM.WW.OtOpcUa.Redundancy" meter (the exact MeterName constant on RedundancyMetrics) + wires AddPrometheusExporter. MapPrometheusScrapingEndpoint() appends a /metrics handler producing the Prometheus text-format output; deliberately NOT authenticated because scrape jobs typically run on a trusted network + operators who need auth wrap the endpoint behind a reverse-proxy basic-auth gate per fleet-ops convention. appsettings.json declares the toggle with Enabled: true so the default deploy gets metrics automatically — turning off is the explicit action. Future meters (resilience tracker + host status + auth probe) just AddMeter("Name") alongside the existing call to start flowing through the same endpoint without more infrastructure. Admin project builds 0 errors; Admin.Tests 92/92 passing (unchanged — the OTel pipeline runs at request time, not test time). Still-pending work that was NOT part of #201's scope: an equivalent setup for the Server project (different MeterNames — the Polly pipeline builder's tracker + host-status publisher) + a metrics cheat-sheet in docs/observability.md documenting each meter's tag set + expected alerting thresholds. Those are natural follow-ups when fleet-ops starts building dashboards.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 00:41:16 -04:00
Joseph Doherty
df0d7c2d84 DiffViewer ACL section — extend sp_ComputeGenerationDiff with NodeAcl rows. Closes the final slice of task #196 (draft-diff ACL section). The DiffViewer already rendered a placeholder "NodeAcl" card from the task #156 refactor; it stayed empty because the stored proc didn't emit NodeAcl rows. This PR lights the card up by adding a fifth UNION to the proc. Logical id for NodeAcl is the composite LdapGroup + ScopeKind + ScopeId triple — format "cn=group|Cluster|scope-id" or "cn=group|Cluster|(cluster)" when ScopeId is null (Cluster-wide rows). That shape means a permission-only change (same group + same scope, PermissionFlags shifted) appears as a single Modified row with the full triple as its identifier, whereas a scope move (same group, new ScopeId) correctly surfaces as Added + Removed of two different logical ids. CHECKSUM signature covers ClusterId + PermissionFlags + Notes so both operator-visible changes (permission bitmask) and audit-tier changes (notes) round-trip through the diff. New migration 20260420000001_ExtendComputeGenerationDiffWithNodeAcl.cs ships both Up (install V2 proc) + Down (restore the exact V1 proc text shipped in 20260417215224_StoredProcedures so the migration is reversible). Row-id column widens from nvarchar(64) to nvarchar(128) in V2 since the composite key (group DN + scope + scope-id) exceeds 64 chars comfortably — narrow column would silently truncate in prod. Designer .cs cloned from the prior migration since the EF model is unchanged; DiffViewer.razor section description updated to drop the "(proc-extension pending)" note it carried since task #156 — the card will now populate live. Admin + Core full-solution build clean. No unit-test changes needed — the existing StoredProceduresTests cover the proc-exec path + would immediately catch any SQL syntax regression on next SQL Server integration run. Task #196 fully closed now — Probe-this-permission (slice 1, PR 144), SignalR invalidation (slice 2, PR 145), draft-diff ACL section (this PR).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 00:37:05 -04:00
Joseph Doherty
ac63c2cfb2 ACL + role-grant SignalR invalidation — #196 slice 2. Adds the live-push layer so an operator editing permissions in one Admin session sees the change in peer sessions without a manual reload. Covers both axes of task #196's invalidation requirement: cluster-scoped NodeAcl mutations push NodeAclChanged to that cluster's subscribers; fleet-wide LdapGroupRoleMapping CRUD pushes RoleGrantsChanged to every Admin session on the fleet group. New AclChangeNotifier service wraps IHubContext<FleetStatusHub> with two methods: NotifyNodeAclChangedAsync(clusterId, generationId) + NotifyRoleGrantsChangedAsync(). Both are fire-and-forget — a failed hub send logs a warning + returns; the authoritative DB write already committed, so worst-case peers see stale data until their next poll (AclsTab has no polling today; on-parameter-set reload + this signal covers the practical refresh cases). Catching OperationCanceledException separately so request-teardown doesn't log a false-positive hub-failure. NodeAclService constructor gains an optional AclChangeNotifier param (defaults to null so the existing unit tests that pass only a DbContext keep compiling). GrantAsync + RevokeAsync both emit NodeAclChanged after the SaveChanges completes — the Revoke path uses the loaded row's ClusterId + GenerationId for accurate routing since the caller passes only the surrogate rowId. RoleGrants.razor consumes the notifier after every Create + Delete + opens a fleet-scoped HubConnection on first render that reloads the grant list on RoleGrantsChanged. AclsTab.razor opens a cluster-scoped connection on first render and reloads only when the incoming NodeAclChanged message matches both the current ClusterId + GenerationId (so a peer editing a different draft doesn't trigger spurious reloads). Both pages IAsyncDisposable the connection on navigation away. AclChangeNotifier is DI-registered alongside PermissionProbeService. Two new message records in AclChangeNotifier.cs: NodeAclChangedMessage(ClusterId, GenerationId, ObservedAtUtc) + RoleGrantsChangedMessage(ObservedAtUtc). Admin.Tests 92/92 passing (unchanged — the notifier is fire-and-forget + tested at hub level in existing FleetStatusPoller suite). Admin builds 0 errors. One slice of #196 remains: the draft-diff ACL section (extend sp_ComputeGenerationDiff to emit NodeAcl rows + wire the DiffViewer NodeAcl card from the empty placeholder it currently shows). Next PR.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 00:32:28 -04:00
Joseph Doherty
ecc2389ca8 AclsTab Probe-this-permission — first of three #196 slices. New /clusters/{ClusterId}/draft/{GenerationId} ACLs-tab gains a probe card above the grant table so operators can ask the trie "if cn=X asks for permission Y on node Z, would it be granted, and which rows contributed?" without shell-ing into the DB. Service thinly wraps the same PermissionTrieBuilder + PermissionTrie.CollectMatches call path the Server's dispatch layer uses at request time, so a probe answer is by construction identical to what the live server would decide. New PermissionProbeService.ProbeAsync(generationId, ldapGroup, NodeScope, requiredFlags) — loads the target generation's NodeAcl rows filtered to the cluster (critical: without the cluster filter, cross-cluster grants leak into the probe which tested false-positive in the unit suite), builds a trie, CollectMatches against the supplied scope + [ldapGroup], ORs the matched-grant flags into Effective, compares to Required. Returns PermissionProbeResult(Granted, Required, Effective, Matches) — Matches carries LdapGroup + Scope + PermissionFlags per matched row so the UI can render the contribution chain. Zero side effects + no audit rows — a failing probe is a question, not a denial. AclsTab.razor gains the probe card at the top (before the New-grant form + grant table): six inputs for ldap group + every NodeScope level (NamespaceId → UnsAreaId → UnsLineId → EquipmentId → TagId — blank fields become null so the trie walks only as deep as the operator specified), a NodePermissions dropdown filtered to skip None, Probe button, green Granted / red Denied badge + Required/Effective bitmask display, and (when matches exist) a small table showing which LdapGroup matched at which level with which flags. Admin csproj adds ProjectReference to Core — the trie + NodeScope live there + were previously Server-only. Five new PermissionProbeServiceTests covering: cluster-level row grants a namespace-level read; no-group-match denies with empty Effective; matching group but insufficient flags (Browse+Read vs WriteOperate required) denies with correct Effective bitmask; cross-cluster grants stay isolated (c2's WriteOperate does NOT leak into c1's probe); generation isolation (gen1's Read-only does NOT let gen2's WriteOperate-requiring probe pass). Admin.Tests 92/92 passing (was 87, +5). Admin builds 0 errors. Remaining #196 slices — SignalR invalidation + draft-diff ACL section — ship in follow-up PRs so the review surface per PR stays tight.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 00:28:17 -04:00
Joseph Doherty
1f3343e61f OpenTelemetry redundancy metrics + RoleChanged SignalR push. Closes instrumentation + live-push slices of task #198; the exporter wiring (OTLP vs Prometheus package decision) is split to new task #201 because the collector/scrape-endpoint choice is a fleet-ops decision that deserves its own PR rather than hardcoded here. New RedundancyMetrics class (Singleton-registered in DI) owning a System.Diagnostics.Metrics.Meter("ZB.MOM.WW.OtOpcUa.Redundancy", "1.0.0"). Three ObservableGauge instruments — otopcua.redundancy.primary_count / secondary_count / stale_count — all tagged by cluster.id, populated by SetClusterCounts(clusterId, primary, secondary, stale) which the poller calls at the tail of every tick; ObservableGauge callbacks snapshot the last value set under a lock so the reader (OTel collector, dotnet-counters) sees consistent tuples. One Counter — otopcua.redundancy.role_transition — tagged cluster.id, node.id, from_role, to_role; ideal for tracking "how often does Cluster-X failover" + "which node transitions most" aggregate queries. In-box Metrics API means zero NuGet dep here — the exporter PR adds OpenTelemetry.Extensions.Hosting + OpenTelemetry.Exporter.OpenTelemetryProtocol or OpenTelemetry.Exporter.Prometheus.AspNetCore to actually ship the data somewhere. FleetStatusPoller extended with role-change detection. Its PollOnceAsync now pulls ClusterNode rows alongside the existing ClusterNodeGenerationState scan, and a new PollRolesAsync walks every node comparing RedundancyRole to the _lastRole cache. On change: records the transition to RedundancyMetrics + emits a RoleChanged SignalR message to both FleetStatusHub.GroupName(cluster) + FleetStatusHub.FleetGroup so cluster-scoped + fleet-wide subscribers both see it. First observation per node is a bootstrap (cache fill) + NOT a transition — avoids spurious churn on service startup or pod restart. UpdateClusterGauges groups nodes by cluster + sets the three gauge values, using ClusterNodeService.StaleThreshold (shared 30s convention) for staleness so the /hosts page + the gauge agree. RoleChangedMessage record lives alongside NodeStateChangedMessage in FleetStatusPoller.cs. RedundancyTab.razor subscribes to the fleet-status hub on first parameters-set, filters RoleChanged events to the current cluster, reloads the node list + paints a blue info banner ("Role changed on node-a: Primary → Secondary at HH:mm:ss UTC") so operators see the transition without needing to poll-refresh the page. IAsyncDisposable closes the connection on tab swap-away. Two new RedundancyMetricsTests covering RecordRoleTransition tag emission (cluster.id + node.id + from_role + to_role all flow through the MeterListener callback) + ObservableGauge snapshot for two clusters (assert primary_count=1 for c1, stale_count=1 for c2). Existing FleetStatusPollerTests ctor-line updated to pass a RedundancyMetrics instance; all tests still pass. Full Admin.Tests suite 87/87 passing (was 85, +2). Admin project builds 0 errors. Task #201 captures the exporter-wiring follow-up — OpenTelemetry.Extensions.Hosting + OTLP vs Prometheus + /metrics endpoint decision, driven by fleet-ops infra direction.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 23:16:09 -04:00
Joseph Doherty
404bfbe7e4 AlarmSurfaceInvoker — wraps IAlarmSource.Subscribe/Unsubscribe/Acknowledge through CapabilityInvoker with multi-host fan-out. Closes alarm-surface slice of task #161 (Phase 6.1 Stream A); the Roslyn invoker-coverage analyzer is split into new task #200 because a DiagnosticAnalyzer project is genuinely its own scaffolding PR (Microsoft.CodeAnalysis.CSharp.Workspaces dep, netstandard2.0 target, Microsoft.CodeAnalysis.Testing harness, ProjectReference OutputItemType=Analyzer wiring, and four corner-case rules I want tests for before shipping). Ship this PR as the runtime guardrail + callable API; the analyzer lands next as the compile-time guardrail. New AlarmSurfaceInvoker class in Core.Resilience. Three methods mirror IAlarmSource's three mutating surfaces: SubscribeAsync (fan-out: group sourceNodeIds by IPerCallHostResolver.ResolveHost, one CapabilityInvoker.ExecuteAsync per host with DriverCapability.AlarmSubscribe so AlarmSubscribe's retry policy kicks in + returns one IAlarmSubscriptionHandle per host); UnsubscribeAsync (single-host, defaultHost); AcknowledgeAsync (fan-out: group AlarmAcknowledgeRequests by resolver-mapped host, run each host's batch through DriverCapability.AlarmAcknowledge which does NOT retry per decision #143 — alarm-ack is a write-shaped op that's not idempotent at the plant-floor level). Drivers without IPerCallHostResolver (Galaxy single MXAccess endpoint, OpcUaClient against one remote, etc.) fall back to defaultHost = DriverInstanceId so breaker + bulkhead keying still happens; drivers with it get one-dead-PLC-doesn't-poison-siblings isolation per decision #144. Single-host single-subscribe returns [handle] with length 1; empty sourceNodeIds fast-paths to [] without a driver call. Five new AlarmSurfaceInvokerTests covering: (a) empty list short-circuits — driver method never called; (b) single-host sub routes via default host — one driver call with full id list; (c) multi-host sub fans out to 2 distinct hosts for 3 src ids mapping to 2 plcs — one driver call per host; (d) Acknowledge does not retry on failure — call count stays at 1 even with exception; (e) Subscribe retries transient failures — call count reaches 3 with a 2-failures-then-success fake. Core.Tests resilience-builder suite 19/19 passing (was 14, +5); Core.Tests whole suite still green. Core project builds 0 errors. Task #200 captures the compile-time guardrail: Roslyn DiagnosticAnalyzer at src/ZB.MOM.WW.OtOpcUa.Analyzers that flags direct invocations of the eleven capability-interface methods inside the Server namespace when the call is NOT inside a CapabilityInvoker.ExecuteAsync/ExecuteWriteAsync/AlarmSurfaceInvoker.*Async lambda. That analyzer is the reason we keep paying the wrapping-class overhead for every new capability.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 23:07:37 -04:00
Joseph Doherty
c0751fdda5 ExternalIdReservation merge inside FinaliseBatchAsync. Closes task #197. The FinaliseBatch docstring called this out as a narrower follow-up pending a concurrent-insert test matrix, and the CSV import UI PR (#163) noted that operators would see raw DbUpdate UNIQUE-constraint messages on ZTag/SAPID collision until this landed. Now every finalised-batch row reserves ZTag + SAPID in the same EF transaction as the Equipment inserts, so either both commit atomically or neither does. New MergeReservation helper handles the four outcomes per (Kind, Value) pair: (1) value empty/whitespace → skip the reservation entirely (operator left the optional identifier blank); (2) active reservation exists for same EquipmentUuid → bump LastPublishedAt + reuse (re-finalising a batch against the same equipment must be idempotent, e.g. a retry after a transient DB blip); (3) active reservation exists for a DIFFERENT EquipmentUuid → throw ExternalIdReservationConflictException with the conflicting UUID + originating cluster + first-published timestamp so operator sees exactly who owns the value + where to resolve it (release via sp_ReleaseExternalIdReservation or pick a new ZTag); (4) no active reservation → create a fresh row with FirstPublishedBy = batch.CreatedBy + FirstPublishedAt = transaction time. Pre-commit overlap scan uses one round-trip (WHERE Kind+Value IN the batch's distinct sets, filtered to ReleasedAt IS NULL so explicitly-released values can be re-issued per decision #124) + caches the results in a Dictionary keyed on (Kind, value.ToLowerInvariant()) for O(1) lookup during the row loop. Race-safety catch: if another finalise commits between our cache-load + our SaveChanges, SQL Server surfaces a 2601/2627 unique-index violation against UX_ExternalIdReservation_KindValue_Active — IsReservationUniquenessViolation walks the inner-exception chain for that specific signature + rethrows as ExternalIdReservationConflictException so the UI shows a clean message instead of a raw DbUpdateException. The index-name match means unrelated filtered-unique violations (future indices) don't get mis-classified. Test-fixture Row() helper updated to generate unique SAPID per row (sap-{ZTag}) — the prior shared SAPID="sap" worked only because reservations didn't exist; two rows sharing a SAPID under different EquipmentUuids now collide as intended by decision #124's fleet-wide uniqueness rule. Four new tests: (a) finalise creates both ZTag + SAPID reservations with expected Kind + Value; (b) re-finalising same EquipmentUuid's ZTag from a different batch does not create a duplicate (LastPublishedAt refresh only); (c) different EquipmentUuid claiming the same ZTag throws ExternalIdReservationConflictException with the ZTag value in the message + Equipment row for the second batch is NOT inserted (transaction rolled back cleanly); (d) row with empty ZTag + empty SAPID skips reservation entirely. Full Admin.Tests suite 85/85 passing (was 81 before this PR, +4). Admin project builds 0 errors. Note: the InMemory EF provider doesn't enforce filtered-unique indices, so the IsReservationUniquenessViolation catch is exercised only in the SQL Server integration path — the in-memory tests cover the cache-level conflict detection in MergeReservation instead, which is the first line of defence + catches the same-batch + published-vs-staged cases. The DbUpdate catch protects only the last-second race where two concurrent transactions both passed the cache check.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 23:02:31 -04:00
Joseph Doherty
5ee510dc1a UnsTab native HTML5 drag/drop + 409 concurrent-edit modal + optimistic-concurrency commit path. Closes UI slice of task #153 (Phase 6.4 Stream A UI follow-up). Playwright E2E smoke is split into new task #199 — Playwright install + WebApplicationFactory + seeded-DB harness is genuinely its own infra-setup PR. Native HTML5 attributes (draggable, @ondragstart, @ondragover, @ondragleave, @ondrop) deliberately over MudBlazor per the task title — no MudBlazor ever joins this project. Two new service methods on UnsService land the data layer the existing UnsImpactAnalyzer assumed but which didn't actually exist: (1) LoadSnapshotAsync(generationId) — walks UnsAreas + UnsLines + per-line equipment counts + builds a UnsTreeSnapshot including a 16-char SHA-256 revision token computed deterministically over the sorted (kind, id, parent, name, notes) tuple-set so it's stable across processes + changes whenever any row is added / modified / deleted; (2) MoveLineAsync(generationId, expectedToken, lineId, targetAreaId) — re-parents one line inside the same draft under an EF transaction, recomputes the current revision token from freshly-loaded rows, and throws DraftRevisionConflictException when the caller-supplied token no longer matches. Token mismatch means another operator mutated the draft between preview + commit + the move rolls back rather than clobbering their work. No-op same-area drop is a silent return. Cross-generation move is prevented by the generationId filter on the transaction reads. UnsTab.razor gains draggable="true" on every line row with @ondragstart capturing the LineId into _dragLineId, and every area row is a drop target (@ondragover with :preventDefault so the browser accepts drops, @ondrop kicking off OnLineDroppedAsync). Drop path loads a fresh snapshot, builds a UnsMoveOperation(Kind=LineMove, source/target cluster matching because cross-cluster is decision-#82 rejected), runs UnsImpactAnalyzer.Analyze + shows a Bootstrap modal rendered inline in the component — modal shows HumanReadableSummary + equipment/tag counts + any CascadeWarnings list. Confirm button calls MoveLineAsync with the snapshot's RevisionToken; DraftRevisionConflictException surfaces a separate red-header "Draft changed — refresh required" modal with a Reload button that re-fetches areas + lines from the DB. New DraftRevisionConflictException in UnsService.cs, co-located with the service that throws it. Five new UnsServiceMoveTests covering LoadSnapshotAsync (areas + lines + equipment counts), RevisionToken stability between two reads, RevisionToken changes on AddLineAsync, MoveLineAsync happy path reparents the line in the DB, MoveLineAsync with stale token throws DraftRevisionConflictException + leaves the DB unchanged. Admin suite 81/81 passing (was 76, +5). Admin project builds 0 errors. Task #199 captures the deferred Playwright E2E smoke — drag a line onto a different area in a real browser, assert preview modal contents, click Confirm, assert the line row shows the new area. That PR stands up a new tests/ZB.MOM.WW.OtOpcUa.Admin.E2ETests project with Playwright + WebApplicationFactory + seeded InMemory DbContext.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 22:30:48 -04:00
Joseph Doherty
c8a38bc57b DiffViewer refactor — 6-section plugin pattern + 1000-row cap. Closes task #156 (Phase 6.4 Stream C). Replaces the flat single-table rendering that mixed Namespace/DriverInstance/Equipment/Tag rows into one untyped list with a per-section-card layout that makes draft review actually scannable on non-trivial diffs. New DiffSection.razor reusable component encapsulates the per-section rendering — card header shows Title + Description + a three-badge summary (+added / −removed / ~modified plus a "no changes" grey badge when the section is empty) so operators can glance at a six-card page and see what areas of the draft actually shifted before drilling into any one table. Hard row-cap at DefaultRowCap=1000 per section lives inside the component so a pathological draft (e.g. 20k tags churned by a block rebuild) can't freeze the browser on render — excess rows are silently dropped with a yellow warning banner that surfaces "Showing the first 1000 of N rows" + a pointer to run sp_ComputeGenerationDiff directly for the full set. Body max-height: 400px + overflow-y: auto gives each section its own scroll region so one big section doesn't push the others off screen. DiffViewer.razor refactored to a static Sections table driving a single foreach that instantiates one DiffSection per known TableName. Sections listed in author-order (Namespace → DriverInstance → Equipment → Tag → UnsLine → NodeAcl) — six entries matching the task acceptance criterion. The first four correspond to what sp_ComputeGenerationDiff currently emits; the last two (UnsLine + NodeAcl) render as empty "no changes" cards today + will light up when the proc is extended (tracked in task #196 for NodeAcl; UnsLine proc extension is a natural follow-up since UnsImpactAnalyzer already tracks UNS moves). RowsFor(tableName) replaces the prior flat table — each section filters the overall DiffRow list by its TableName so the proc output format stays stable. Header-bar summary at the top of the page now reads "N rows across M of 6 sections" so operators see overall change weight at a glance before scanning. Two Razor-specific fixes landed along the way: loop variable renamed from section to sec because @section collides with the Razor section directive + trips RZ2005; helper method renamed from Group to RowsFor because the Razor generator gets confused by a parameter-flowing method whose name clashes with LINQ's Group extension (the source-gen output referenced TypeCheck<T> with no argument). Admin project builds 0 errors; Admin.Tests suite 76/76 (unchanged — the refactor is structural + no service-layer logic changed, so the existing DraftValidator + EquipmentService + AdminServicesIntegrationTests cover the consuming paths). No bUnit in this project so the cap behavior isn't unit-tested at the component level; DiffSection.OnParametersSet is small + deterministic (int counts + Take(RowCap)) + reviewed before ship.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 22:23:22 -04:00
Joseph Doherty
13d5a7968b Admin RedundancyTab — per-cluster read-only topology view. Closes the UI slice of task #149 (Phase 6.3 Stream E — Admin UI RedundancyTab + OpenTelemetry metrics + SignalR); the OpenTelemetry metrics + RoleChanged SignalR push are split into new follow-up task #198 because each is a structural add that deserves its own test matrix + NuGet-dep decision rather than riding this UI PR. New /clusters/{ClusterId} Redundancy tab slotted between ACLs and Audit in the existing ClusterDetail tab bar. Shows each ClusterNode row in the cluster with columns Node / Role (Primary green, Secondary blue, Standalone primary-blue badge) / Host / OPC UA port / ServiceLevel base / ApplicationUri (text-break so the long urn: doesn't blow out the table) / Enabled badge / Last seen (relative age via the same FormatAge helper as Hosts.razor, with a yellow "Stale" chip once LastSeenAt crosses the 30s threshold shared with HostStatusService.StaleThreshold — a missed heartbeat plus clock-skew buffer). Four summary cards above the table — total Nodes, Primary count, Secondary count, Stale count. Two guard-rail alerts: (a) red "No Primary or Standalone" when the cluster has no authoritative write target (all rows are Secondaries — read-only until one is promoted by the server-side RedundancyCoordinator apply-lease flow); (b) red "Split-brain" when >1 Primary exists — apply-lease enforcement at the coordinator level should have made this impossible, so the alert implies a hand-edited DB row + an investigation. New ClusterNodeService with ListByClusterAsync (ordered by ServiceLevelBase descending so Primary rows with higher base float to the top) + a static IsStale predicate matching HostStatusService's 30s convention. DI-registered alongside the existing scoped services in Program.cs. Writes (role swap, enable/disable) are deliberately absent from the service — they go through the RedundancyCoordinator apply-lease flow on the server side + direct DB mutation from Admin would race with it. New ClusterNodeServiceTests covering IsStale across null/recent/old LastSeenAt + ListByClusterAsync ordering + cluster filter. 4/4 new tests passing; full Admin.Tests suite 76/76 (was 72 before this PR, +4). Admin project builds 0 errors. Task #198 captures the deferred work: (1) OpenTelemetry Meter for primary/secondary/stale counts + role_transition counter with from/to/node tags + OTLP exporter config; (2) RoleChanged SignalR push — extend FleetStatusPoller to detect RedundancyRole changes on ClusterNode rows + emit a RoleChanged hub message so the RedundancyTab refreshes instantly instead of on-page-load polling.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 22:14:25 -04:00
Joseph Doherty
ac69a1c39d Equipment CSV import UI — Stream B.3/B.5 operator page + EquipmentTab "Import CSV" button. Closes the UI slice of task #163 (Phase 6.4 Stream B.3/B.5); the ExternalIdReservation merge follow-up inside FinaliseBatchAsync is split into new task #197 so it gets a proper concurrent-insert test matrix rather than riding this UI PR. New /clusters/{ClusterId}/draft/{GenerationId}/import-equipment page driving the full staged-import flow end-to-end. Operator selects a driver instance + UNS line (both scoped to the draft generation via DriverInstanceService.ListAsync + UnsService.ListLinesAsync dropdowns), pastes or uploads a CSV (InputFile with 5 MiB cap so pathological files can't OOM the server), clicks Parse — EquipmentCsvImporter.Parse runs + shows two side-by-side cards (accepted rows in green with ZTag/Machine/Name/Line columns, rejected rows in red with line-number + reason). Click Stage + Finalise and the page calls CreateBatchAsync → StageRowsAsync → FinaliseBatchAsync in sequence using the authenticated user's identity as CreatedBy; on success, 600ms banner then NavigateTo back to the draft editor so operator sees the newly-imported rows in EquipmentTab without a manual refresh. Parse errors (missing version marker, bad header, malformed CSV) surface InvalidCsvFormatException.Message inline alongside the Parse button — no page reload needed to retry. Finalise errors surface the service-layer exception message (ImportBatchNotFoundException / ImportBatchAlreadyFinalisedException / any DbUpdate* exception from the atomic transaction) so operator sees exactly why the finalise rejected before the tx rolled back. EquipmentTab gains an "Import CSV…" button next to "Add equipment" that NavigateTo's the new page; it needs a ClusterId parameter to build the URL so the @code block adds [Parameter] string ClusterId, and DraftEditor now passes ClusterId="@ClusterId" alongside the existing GenerationId. EquipmentImportBatchService was already implemented in Phase 6.4 Stream B.4 but missing from the Admin DI container — this PR adds AddScoped so the @inject resolves. The FinaliseBatch docstring explicitly defers ExternalIdReservation merge as a narrower follow-up with a concurrent-insert test matrix — task #197 captures that work. For now the finalise may surface a DB-level UNIQUE-constraint violation if a ZTag conflict exists at commit time; the UI shows the raw message + the batch + staged rows are still in the DB for re-use once the conflict is resolved. Admin project builds 0 errors; Admin.Tests 72/72 passing.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 22:00:40 -04:00
Joseph Doherty
44d4448b37 Admin RoleGrants page — LDAP-group → Admin-role mapping CRUD. Closes the RoleGrantsTab slice of task #144 (Phase 6.2 Stream D follow-up); the remaining three sub-items (Probe-this-permission on AclsTab, SignalR invalidation on role/ACL changes, draft-diff ACL section) are split into new follow-up task #196 so each can ship independently. The permission-trie evaluator + ILdapGroupRoleMappingService already exist from Phase 6.2 Streams A + B — this PR adds the consuming UI + the DI registration that was missing. New /role-grants page at Components/Pages/RoleGrants.razor registered in MainLayout's sidebar next to Certificates. Lists every LdapGroupRoleMapping row with columns LDAP group / Role / Scope (Fleet-wide or Cluster:X) / Created / Notes / Revoke. Add-grant form takes LDAP group DN + AdminRole dropdown (ConfigViewer, ConfigEditor, FleetAdmin) + Fleet-wide checkbox + Cluster dropdown (disabled when Fleet-wide checked) + optional Notes. Service-layer invariants — IsSystemWide=true + ClusterId=null, or IsSystemWide=false + ClusterId populated — enforced in ValidateInvariants; UI catches InvalidLdapGroupRoleMappingException and displays the message in a red alert. ILdapGroupRoleMappingService was present in the Configuration project from Stream A but never registered in the Admin DI container — this PR adds the AddScoped registration so the injection can resolve. Control-plane/data-plane separation note rendered in an info banner at the top of the page per decision #150 (these grants do NOT govern OPC UA data-path authorization; NodeAcl rows are read directly by the permission-trie evaluator without consulting role mappings). Admin project builds 0 errors; Admin.Tests 72/72 passing. Task #196 created to track: (1) AclsTab Probe-this-permission form that takes (ldap group, node path, permission flag) and runs it through the permission trie, showing which row granted it + the actual resolved grant; (2) SignalR invalidation — push a RoleGrantsChanged event when rows are created/deleted so connected Admin sessions reload without polling, ditto NodeAclChanged on ACL writes; (3) DiffViewer ACL section — show NodeAcl + LdapGroupRoleMapping deltas between draft + published alongside equipment/uns diffs.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 21:46:21 -04:00
Joseph Doherty
2acea08ced Admin Equipment editor — IdentificationFields component + edit mode + three missing OPC 40010 fields. Closes the UI-editor slice of task #159 (Phase 6.4 Stream D remaining); the DriverNodeManager wire-in + ACL integration test are split into a new follow-up task #195 because they're blocked on a prerequisite that hasn't shipped — the DriverNodeManager does not currently materialize Equipment nodes at all (NodeScopeResolver has an explicit "A future resolver will..." TODO in its decomposition docstring). Shipping the IdentificationFolderBuilder call before the parent walker exists would wire a call that no code path hits, so the wire-in is deferred until the Equipment node walker lands first. New IdentificationFields.razor reusable component renders the 9-field decision #139 grid in a Bootstrap 3-column layout — Manufacturer, Model, SerialNumber, HardwareRevision, SoftwareRevision, YearOfConstruction (InputNumber), AssetLocation, ManufacturerUri (placeholder https://…), DeviceManualUri (placeholder https://…). Takes a required Equipment parameter + 2-way binds every field; no state of its own. Three fields that were missing from the old inline form — AssetLocation, ManufacturerUri, DeviceManualUri — now present, matching IdentificationFolderBuilder.FieldNames exactly. EquipmentTab.razor refactored to consume the component in both create + edit flows. Each table row gains an Edit button next to Remove. StartEdit clones the row into _draft so Cancel doesn't mutate the displayed list row with in-flight edits; on Save, UpdateAsync persists through EquipmentService's existing update path which already handles all 9 Identification fields. SaveAsync branches on _editMode — create still derives EquipmentId from a fresh Uuid via DraftValidator per decision #125, edit keeps the original EquipmentId + EquipmentUuid (immutable once set). FormName renamed equipment-form (was new-equipment) to work for both flows. Admin project builds 0 errors; Admin.Tests 72/72 passing. No new tests shipped — this PR is strictly a Razor-component refactor + two new bound fields + an Edit branch; the existing EquipmentService tests cover both the create + update paths. Task #195 created to track the blocked server-side work: call IdentificationFolderBuilder.Build from DriverNodeManager once the Equipment walker exists, plus an integration test browsing Equipment/Identification as an unauthorized user asserting BadUserAccessDenied per the builder's cross-reference note in docs/v2/acl-design.md §Identification.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 21:41:13 -04:00
Joseph Doherty
d06cc01a48 Admin /hosts red-badge + resilience columns + Polly telemetry observer. Closes task #164 (the remaining slice of Phase 6.1 Stream E.3 after the earlier publisher + hub PR). Three cooperating pieces wired together so the operator-facing /hosts table actually reflects the live Polly counters that the pipeline builder is producing. DriverResiliencePipelineBuilder gains an optional DriverResilienceStatusTracker ctor param — when non-null, every built pipeline wires Polly's OnRetry/OnOpened/OnClosed strategy-options callbacks into the tracker. OnRetry → tracker.RecordFailure (so ConsecutiveFailures climbs per retry), OnOpened → tracker.RecordBreakerOpen (stamps LastCircuitBreakerOpenUtc), OnClosed → tracker.RecordSuccess (resets the failure counter once the target recovers). Absent tracker = silent, preserving the unit-test constructor path + any deployment that doesn't care about resilience observability. Cancellation stays excluded from the failure count via the existing ShouldHandle predicate. HostStatusService.HostStatusRow extends with four new fields — ConsecutiveFailures, LastCircuitBreakerOpenUtc, CurrentBulkheadDepth, LastRecycleUtc — populated via a second LEFT JOIN onto DriverInstanceResilienceStatuses keyed on (DriverInstanceId, HostName). LEFT JOIN because brand-new hosts haven't been sampled yet; a missing row means zero failures + never-opened breaker, which is the correct default. New FailureFlagThreshold constant (=3, matches plan decision #143's conservative half-of-breaker convention) + IsFlagged predicate so the UI can pre-warn before the breaker actually trips. Hosts.razor paints three new columns between State and Last-transition — Fail# (bold red when flagged), In-flight (bulkhead-depth proxy), Breaker-opened (relative age). Per-row "Flagged" red badge alongside State when IsFlagged is true. Above the first cluster table, a red alert banner summarises the flagged-host count when ≥1 host is flagged, so operators see the problem before scanning rows. Three new tests in DriverResiliencePipelineBuilderTests — Tracker_RecordsFailure_OnEveryRetry verifies ConsecutiveFailures reaches RetryCount after a transient-forever operation, Tracker_StampsBreakerOpen_WhenBreakerTrips verifies LastBreakerOpenUtc is set after threshold failures on a Write pipeline, Tracker_IsolatesCounters_PerHost verifies one dead host does not leak failure counts into a healthy sibling. Full suite — Core.Tests 14/14 resilience-builder tests passing (11 existing + 3 new), Admin.Tests 72/72 passing, Admin project builds 0 errors. SignalR live push of status changes + browser visual review are deliberately left to a follow-up — this PR keeps the structural change minimal (polling refresh already exists in the page's 10s timer; SignalR would be a structural add that touches hub registration + client subscription).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 21:35:54 -04:00
Joseph Doherty
ece530d133 AB CIP UDT Template Object shape reader. Closes the shape-reader half of task #179. CipTemplateObjectDecoder (pure-managed) parses the Read Template blob per Rockwell CIP Vol 1 + libplctag ab/cip.c handle_read_template_reply — 12-byte header (u16 member_count + u16 struct_handle + u32 instance_size + u32 member_def_size) followed by memberCount × 8-byte member blocks (u16 info with bit-15 struct flag + lower-12-bit type code matching the Symbol Object encoding, u16 array_size, u32 struct_offset) followed by semicolon-terminated strings (UDT name first, then one per member). ParseSemicolonTerminatedStrings handles the observed firmware variations — name;\0 vs name; delimiters, optional null/space padding after the semicolon, trailing-name-without-semicolon corner case. Struct-flag members decode as AbCipDataType.Structure; unknown atomic codes fall back to Structure so the shape remains valid even with unrecognised members. Zero member count + short buffer both return null; missing member names yield <member_N> placeholders. IAbCipTemplateReader + IAbCipTemplateReaderFactory abstraction — one call per template instance id returning the raw blob. LibplctagTemplateReader is the production implementation creating a libplctag Tag with name @udt/{templateId} + handing the buffer to the decoder. AbCipDriver ctor gains optional templateReaderFactory parameter (defaults to LibplctagTemplateReaderFactory) + new internal FetchUdtShapeAsync that — checks AbCipTemplateCache first, misses call the reader + decode + cache, template-read exceptions + decode failures return null so callers can fall back to declaration-driven fan-out without the whole discovery blowing up. OperationCanceledException rethrows for shutdown propagation. Unknown device host returns null without attempting a fetch. FlushOptionalCachesAsync empties the cache so a subsequent fetch re-reads. 16 new decoder tests — simple two-member UDT, struct-member flag → Structure, array member ArrayLength, 6-member mixed-type with correct offsets, unknown type code → Structure, zero member count → null, short buffer → null, missing member name → placeholder, ParseSemicolonTerminatedStrings theory across 5 shapes. 6 new AbCipFetchUdtShapeTests exercising the driver integration via reflection (method is internal) — happy-path decode + cache, different template ids get separate fetches, unknown device → null without reader creation, decode failure returns null + doesn't cache (next call retries), reader exception returns null, FlushOptionalCachesAsync clears the cache. Total AbCip unit tests now 211/211 passing (+19 from the @tags merge's 192); full solution builds 0 errors; other drivers untouched. Whole-UDT read optimization (single libplctag call returning the packed buffer + client-side member decode using the template offsets) is left as a follow-up — requires rethinking the per-tag read path + careful hardware validation; current per-member fan-out still works correctly, just with N round-trips instead of 1.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 21:21:42 -04:00
Joseph Doherty
088c4817fe AB CIP @tags walker — CIP Symbol Object decoder + LibplctagTagEnumerator. Closes task #178. CipSymbolObjectDecoder (pure-managed, no libplctag dep) parses the raw Symbol Object (class 0x6B) blob returned by reading the @tags pseudo-tag into an enumerable sequence of AbCipDiscoveredTag records. Entry layout per Rockwell CIP Vol 1 + Logix 5000 CIP Programming Manual 1756-PM019, cross-checked against libplctag's ab/cip.c handle_listed_tags_reply — u32 instance-id + u16 symbol-type + u16 element-length + 3×u32 array-dims + u16 name-length + name[len] + even-pad. Symbol-type lower 12 bits carry the CIP type code (0xC1 BOOL, 0xC2 SINT, …, 0xD0 STRING), bit 12 is the system-tag flag, bit 15 is the struct flag (when set lower 12 bits become the template instance id). Truncated tails stop decoding gracefully — caller keeps whatever parsed cleanly rather than getting an exception mid-walk. Program:-scope names (Program:MainProgram.StepIndex) are split via SplitProgramScope so the enumerator surfaces scope + simple name separately. 12 atomic type codes mapped (BOOL/SINT/INT/DINT/LINT/USINT/UINT/UDINT/ULINT/REAL/LREAL/STRING + DT/DATE_AND_TIME under Dt); unknown codes return null so the caller treats them as opaque Structure. LibplctagTagEnumerator is the real production walker — creates a libplctag Tag with name=@tags against the device's gateway/port/path, InitializeAsync + ReadAsync + GetBuffer, hands bytes to the decoder. Factory LibplctagTagEnumeratorFactory replaces EmptyAbCipTagEnumeratorFactory as the AbCipDriver default. AbCipDriverOptions gains EnableControllerBrowse (default false) matching the TwinCAT pattern — keeps the strict-config path for deployments where only declared tags should appear. When true, DiscoverAsync walks each device's @tags + emits surviving symbols under Discovered/ sub-folder. System-tag filter (AbCipSystemTagFilter shipped in PR 5) runs alongside the wire-layer system-flag hint. Tests — 18 new CipSymbolObjectDecoderTests with crafted byte arrays matching the documented layout — single-entry DInt, theory across 12 atomic type codes, unknown→null, struct flag override, system flag surface, Program:-scope split, multi-entry wire-order with even-pad, truncated-buffer graceful stop, empty buffer, SplitProgramScope theory across 6 shapes. 4 pre-existing AbCipDriverDiscoveryTests that tested controller-enumeration behavior updated with EnableControllerBrowse=true so they continue exercising the walker path (behavior unchanged from their perspective). Total AbCip unit tests now 192/192 passing (+26 from the RMW merge's 166); full solution builds 0 errors; other drivers untouched. Field validation note — the decoder layout matches published Rockwell docs + libplctag C source, but actual @tags responses vary slightly by controller firmware (some ship an older entry format with u16 array dims instead of u32). Any layout drift surfaces as gibberish names in the Discovered/ folder; field testing will flag that for a decoder patch if it occurs.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 21:13:20 -04:00
Joseph Doherty
00a428c444 RMW pass 2 — AbCip BOOL-within-DINT + AbLegacy bit-within-word. Closes task #181. AbCip — AbCipDriver.WriteAsync now detects BOOL writes with a bit index + routes them through WriteBitInDIntAsync: strip the .N suffix to form the parent DINT tag path (via AbCipTagPath with BitIndex=null + ToLibplctagName), get/create a cached parent IAbCipTagRuntime via EnsureParentRuntimeAsync (distinct from the bit-selector tag runtime so read + write target the DINT directly), acquire a per-parent-name SemaphoreSlim, Read → Convert.ToInt32 the current DINT → (current | 1<<bit) or (current & ~(1<<bit)) → Write via EncodeValue(DInt, updated). Per-parent lock prevents concurrent writers to the same DINT from losing updates — parallels Modbus + FOCAS pass 1. DeviceState gains ParentRuntimes dict + GetRmwLock helper + _rmwLocks ConcurrentDictionary. DisposeHandles now walks ParentRuntimes too. LibplctagTagRuntime.EncodeValue's BOOL-with-bitIndex branch stays as a defensive throw (message updated to point at the new driver-level dispatch) so an accidental bypass fails loudly rather than silently clobbering the whole DINT. AbLegacy — identical pattern for PCCC N-file bit writes. AbLegacyDriver.WriteAsync detects Bit with bitIndex + PMC letter not in {B, I, O} (B-file + I/O use their own bit-addressable semantics so don't RMW at N-file word level), routes through WriteBitInWordAsync which uses Int16 for the parent word, creates + caches a parent runtime with the suffix-stripped N7:0 address, acquires per-parent lock, RMW. DeviceState extended the same way as AbCip (ParentRuntimes + GetRmwLock). LibplctagLegacyTagRuntime.EncodeValue Bit-with-bitIndex branch points at the driver dispatch. Tests — 5 new AbCipBoolInDIntRmwTests (bit set ORs + preserves, bit clear ANDs + preserves, 8-way concurrent writes to same parent compose to 0xFF, different-parent writes get separate runtimes, repeat bit writes reuse the parent runtime init-count 1 + write-count 2), 4 new AbLegacyBitRmwTests (bit set preserves, bit clear preserves 0xFFF7, 8-way concurrent 0xFF, repeat writes reuse parent). Two pre-existing tests flipped — AbCipDriverWriteTests.Bit_in_dint_write_returns_BadNotSupported + AbLegacyReadWriteTests.Bit_within_word_write_rejected_as_BadNotSupported both now assert Good instead of BadNotSupported, renamed to _now_succeeds_via_RMW. Total tests — AbCip 166/166, AbLegacy 96/96, full solution builds 0 errors; Modbus + FOCAS + TwinCAT + other drivers untouched. Task #181 done across all four libplctag-backed + non-libplctag drivers (Modbus BitInRegister + AbCip BOOL-in-DINT + AbLegacy N-file bit + FOCAS PMC Bit — all with per-parent-word serialisation).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 20:34:29 -04:00
Joseph Doherty
8c309aebf3 RMW pass 1 — Modbus BitInRegister + FOCAS PMC Bit write paths. First half of task #181 — the two drivers where read-modify-write is a clean protocol-level insertion (Modbus FC03/FC06 round-trip + FOCAS pmc_rdpmcrng / pmc_wrpmcrng round-trip). Per-driver SemaphoreSlim registry keyed on the parent word address serialises concurrent bit writes so two writers targeting different bits in the same word don't lose one another's update. Modbus — ModbusDriver gains WriteBitInRegisterAsync + _rmwLocks ConcurrentDictionary. WriteOneAsync routes BitInRegister (HoldingRegisters region only) through RMW ahead of the normal encode path. Read uses FC03 Read Holding Registers for 1 register at tag.Address, bit-op on the returned ushort via (current | 1<<bit) for set / (current & ~(1<<bit)) for clear, write back via FC06 Write Single Register. Per-address lock prevents concurrent bit writes to the same register from racing. Rejects out-of-range bits (0-15) with InvalidOperationException. EncodeRegister's BitInRegister branch repurposed as a defensive guard — if a non-RMW caller ever reaches it, throw so an unintended bypass stays loud rather than silently clobbering. FOCAS — FwlibFocasClient gains WritePmcBitAsync + _rmwLocks keyed on {addrType}:{byteAddr}. Driver-layer WriteAsync routes Bit writes with a bitIndex through the new path; other Pmc writes still hit the direct pmc_wrpmcrng path. RMW uses cnc_rdpmcrng + Byte dataType to grab the parent byte, bit-op with (current | 1<<bit) or (current & ~(1<<bit)), cnc_wrpmcrng to write back. Rejects out-of-range bits (0-7, FOCAS PMC bytes are 8-bit) with InvalidOperationException. EncodePmcValue's Bit branch now treats a no-bitIndex case as whole-byte boolean (non-zero / zero); bitIndex-present writes never hit this path because they dispatch to WritePmcBitAsync upstream. Tests — 5 new ModbusBitRmwTests + 4 new FocasPmcBitRmwTests + 1 renamed pre-existing test each covering — bit set preserves other bits, bit clear preserves other bits, concurrent bit writes to same word/byte compose correctly (8-parallel stress), bit writes on different parent words proceed without contention (4-parallel), sequential bit sets compose into 0xFF after all 8. Fake PmcRmwFake in FOCAS tests simulates the PMC byte storage + surfaces it through the IFocasClient contract so the test asserts driver-level behavior without needing Fwlib32.dll. FwlibNativeHelperTests.EncodePmcValue_Bit_throws_NotSupported_for_RMW_gap replaced with EncodePmcValue_Bit_without_bit_index_writes_byte_boolean reflecting the new behavior. ModbusDataTypeTests.BitInRegister_write_is_not_supported_in_PR24 renamed to BitInRegister_EncodeRegister_still_rejects_direct_calls; the message assertion updated to match the new defensive message. Modbus tests now 182/182, FOCAS tests now 119/119; full solution builds 0 errors; AbCip/AbLegacy/TwinCAT untouched (those get their RMW pass in a follow-up since libplctag bit access may need a parallel parent-word handle). Task #181 stays pending until that second pass lands.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 20:25:27 -04:00
Joseph Doherty
c95228391d TwinCAT follow-up — Symbol browser via AdsClient + SymbolLoaderFactory. Closes task #188. Adds ITwinCATClient.BrowseSymbolsAsync — IAsyncEnumerable yielding TwinCATDiscoveredSymbol (InstancePath + mapped TwinCATDataType + ReadOnly flag) from the target's flat symbol table. AdsTwinCATClient implementation uses SymbolLoaderFactory.Create(_client, new SymbolLoaderSettings(SymbolsLoadMode.Flat)) + iterates loader.Symbols, maps IEC 61131-3 type names (BOOL/SINT/INT/DINT/LINT/REAL/LREAL/STRING/WSTRING/TIME/DATE/DT/TOD + BYTE/WORD/DWORD/LWORD unsigned-word aliases) through MapSymbolTypeName, checks SymbolAccessRights.Write bit for writable vs read-only. Unsupported types (UDTs / function blocks / arrays / pointers) surface with DataType=null so callers can skip or recurse. TwinCATDriverOptions.EnableControllerBrowse — new bool, default false to preserve the strict-config path. When true, DiscoverAsync iterates each device's BrowseSymbolsAsync, filters via TwinCATSystemSymbolFilter (rejects TwinCAT_*, Constants.*, Mc_*, __*, Global_Version* prefixes + anything empty), skips null-DataType symbols, emits surviving symbols under a per-device Discovered/ sub-folder with InstancePath as both FullName + BrowseName + ReadOnly→ViewOnly/writable→Operate. Pre-declared tags from TwinCATDriverOptions.Tags always emit regardless. Browse failure is non-fatal — exception caught + swallowed, pre-declared tags stay in the address space, operators see the failure in driver health on next read. TwinCATSystemSymbolFilter static class mirrors AbCipSystemTagFilter's shape with TwinCAT-specific prefixes. Fake client updated — BrowseResults list for test setup + FireNotification-style single-invocation on each subscribe, ThrowOnBrowse flag for failure testing. 8 new unit tests — strict path emits only pre-declared when EnableControllerBrowse=false, browse enabled adds Discovered/ folder, filter rejects system prefixes, null-DataType symbols skipped, ReadOnly symbols surface ViewOnly, browse failure leaves pre-declared intact, SystemSymbolFilter theory (10 cases). Total TwinCAT unit tests now 110/110 passing (+17 from the native-notification merge's 93); full solution builds 0 errors; other drivers untouched.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 20:13:33 -04:00
Joseph Doherty
1d6015bc87 FOCAS PR 3 — ITagDiscovery + ISubscribable + IHostConnectivityProbe + IPerCallHostResolver. Completes the FOCAS driver — 7-interface capability set matching AbCip/AbLegacy/TwinCAT (minus IAlarmSource — Fanuc CNC alarms live in a different API surface, tracked as a future-phase concern). ITagDiscovery emits pre-declared tags under a FOCAS root + per-device sub-folder keyed on the canonical focas://host:port string with DeviceName fallback. Writable → Operate, non-writable → ViewOnly. No native FOCAS symbol browsing — CNCs don't expose a tag catalogue the way Logix or TwinCAT do; operators declare addresses explicitly. ISubscribable consumes the shared PollGroupEngine — 5th consumer of the engine after Modbus + AbCip + AbLegacy + TwinCAT-poll-mode. 100ms interval floor inherited. FOCAS has no native notification/subscription protocol (unlike TwinCAT ADS), so polling is the only option — every subscribed tag round-trips through cnc_rdpmcrng / cnc_rdparam / cnc_rdmacro on each tick. IHostConnectivityProbe uses the existing IFocasClient.ProbeAsync which in the real FwlibFocasClient calls cnc_statinfo (cheap handshake returning ODBST with tmmode/aut/run/motion/alarm state). Probe loop runs when Enabled=true, catches OperationCanceledException during shutdown, falls through to Stopped on exceptions, emits Running/Stopped transitions via OnHostStatusChanged with the canonical focas://host:port as the host-name key. Same-state spurious-event guard under per-device lock. IPerCallHostResolver maps tag full-ref to DeviceHostAddress for Phase 6.1 bulkhead/breaker keying per plan decision #144 — unknown refs fall back to first device, no devices → DriverInstanceId. ShutdownAsync now disposes PollGroupEngine + cancels/disposes per-device probe CTS + disposes cached clients. DeviceState gains ProbeLock / HostState / HostStateChangedUtc / ProbeCts matching the shape used by AbCip/AbLegacy/TwinCAT. 9 new unit tests in FocasCapabilityTests — discovery tag emission with correct SecurityClassification, subscription initial poll raises OnDataChange, shutdown cancels subscriptions, GetHostStatuses entry-per-device, probe Running / Stopped transitions, ResolveHost for known / unknown / no-devices paths. FocasScaffoldingTests updated with Probe.Enabled=false where the default factory would otherwise try to load Fwlib32.dll during the probe-loop spinup. Total FOCAS unit tests now 115/115 passing (+9 from PR 2's 106); full solution builds 0 errors; Modbus / AbCip / AbLegacy / TwinCAT / other drivers untouched. FOCAS driver is real-wire-capable end-to-end — read / write / discover / subscribe / probe / host-resolve for Fanuc FS 0i/16i/18i/21i/30i/31i/32i/Series 35i/Power Mate i controllers once deployment drops Fwlib32.dll beside the server. Closes task #120 subtask FOCAS.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 19:59:37 -04:00
Joseph Doherty
a2c7fda5f5 FOCAS PR 2 — IReadable + IWritable + real FwlibFocasClient P/Invoke. Closes task #193 early now that strangesast/fwlib provides the licensed DLL references. Skips shipping with the Unimplemented stub as the default — FwlibFocasClientFactory is now the production default, UnimplementedFocasClientFactory stays as an opt-in for tests/deployments without FWLIB access. FwlibNative — narrow P/Invoke surface for the 7 calls the driver actually makes: cnc_allclibhndl3 (open Ethernet handle), cnc_freelibhndl (close), pmc_rdpmcrng + pmc_wrpmcrng (PMC range I/O), cnc_rdparam + cnc_wrparam (CNC parameters), cnc_rdmacro + cnc_wrmacro (macro variables), cnc_statinfo (probe). DllImport targets Fwlib32.dll; deployment places it next to the executable or on PATH. IODBPMC/IODBPSD/ODBM/ODBST marshaled with LayoutKind.Sequential + Pack=1 + fixed byte-array unions (avoids LayoutKind.Explicit complexity; managed-side BitConverter extracts typed values from the byte buffer). Internal helpers FocasPmcAddrType.FromLetter (G=0/F=1/Y=2/X=3/A=4/R=5/T=6/K=7/C=8/D=9/E=10 per Fanuc FOCAS/2 spec) + FocasPmcDataType.FromFocasDataType (Byte=0 / Word=1 / Long=2 / Float=4 / Double=5) exposed for testing without the DLL loaded. FwlibFocasClient is the concrete IFocasClient backed by P/Invoke. Construction is licence-safe — .NET P/Invoke is lazy so instantiating the class does NOT load Fwlib32.dll; DLL loads on first wire call (Connect/Read/Write/Probe). When missing, calls throw DllNotFoundException which the driver surfaces as BadCommunicationError via the normal exception path. Session-scoped handle from cnc_allclibhndl3; Dispose calls cnc_freelibhndl. Dispatch on FocasAreaKind — Pmc reads use pmc_rdpmcrng with the right ADR_* + data-type codes + parses the union via BinaryPrimitives LittleEndian, Parameter reads use cnc_rdparam + IODBPSD, Macro reads use cnc_rdmacro + compute scaled double as McrVal / 10^DecVal. Write paths mirror reads. PMC Bit writes throw NotSupportedException pointing at task #181 (read-modify-write gap — same as Modbus / AbCip / AbLegacy / TwinCAT). Macro writes accept int + pass decimal-point count 0 (decimal precision writes are a future enhancement). Probe calls cnc_statinfo with ODBST result. Driver wiring — FocasDriver now IDriver + IReadable + IWritable. Per-device connection caching via EnsureConnectedAsync + DeviceState.Client. ReadAsync/WriteAsync dispatch through the injected IFocasClient — ordered snapshots preserve per-tag status, OperationCanceledException rethrows, FormatException/InvalidCastException → BadTypeMismatch, OverflowException → BadOutOfRange, NotSupportedException → BadNotSupported, anything else → BadCommunicationError + Degraded health. Connect-failure disposes the half-open client. ShutdownAsync disposes every cached client. Default factory switched — constructor now defaults to FwlibFocasClientFactory (backed by real Fwlib32.dll) rather than UnimplementedFocasClientFactory. UnimplementedFocasClientFactory stays as an opt-in. 41 new tests — 14 in FocasReadWriteTests (ordered unknown-ref handling, successful PMC/Parameter/Macro reads routing through correct FocasAreaKind, repeat-read reuses connection, FOCAS error mapping, exception paths, batched order across areas, non-writable rejection, successful write logging, status mapping, batch ordering, cancellation, shutdown disposes), 27 in FwlibNativeHelperTests (12 letter-mapping cases + 3 unknown rejections + 6 data-type mapping + 4 encode helpers + Bit-write NotSupported). Total FOCAS unit tests now 106/106 passing (+41 from PR 1's 65); full solution builds 0 errors; Modbus / AbCip / AbLegacy / TwinCAT / other drivers untouched. FOCAS driver is real-wire-capable from day one — deployment drops Fwlib32.dll beside the server + driver talks to live FS 0i/16i/18i/21i/30i/31i/32i controllers.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 19:55:37 -04:00
Joseph Doherty
285799a954 FOCAS PR 1 — Scaffolding + Core (FocasDriver skeleton + address parser + stub client). New Driver.FOCAS project for Fanuc CNC controllers (FS 0i/16i/18i/21i/30i/31i/32i/Series 35i/Power Mate i) talking via the Fanuc FOCAS/2 protocol. No NuGet reference to a FOCAS library — FWLIB (Fwlib32.dll) is Fanuc-proprietary + per-customer licensed + cannot be legally redistributed, so the driver is designed from the start to accept an IFocasClient supplied by the deployment side. Default IFocasClientFactory is UnimplementedFocasClientFactory which throws with a clear deployment-docs pointer at Create time so misconfigured servers fail fast rather than mysteriously hanging. Matches the pattern other drivers use for swappable wire layers (Modbus IModbusTransport, AbCip IAbCipTagFactory, TwinCAT ITwinCATClientFactory) — but uniquely, FOCAS ships without a production factory because of licensing. FocasHostAddress parses focas://{host}[:{port}] canonical form with default port 8193 (Fanuc-reserved FOCAS Ethernet port). Default-port stripping on ToString for roundtrip stability. Case-insensitive scheme. Rejects wrong scheme, empty body, invalid port, non-numeric port. FocasAddress handles the three addressing spaces a FOCAS driver touches — PMC (letter + byte + optional bit, X/Y for IO, F/G for PMC-CNC signals, R for internal relay, D for data table, C for counter, K for keep relay, A for message display, E for extended relay, T for timer, with .N bit syntax 0-7), CNC parameters (PARAM:n for a parameter number, PARAM:n/N for bit 0-31 of a parameter), macro variables (MACRO:n). Rejects unknown PMC letters, negative numbers, out-of-range bits (PMC 0-7, parameter 0-31), non-numeric fragments. FocasDataType — Bit / Byte / Int16 / Int32 / Float32 / Float64 / String covering the atomic types PMC reads + CNC parameters + macro variables return. ToDriverDataType widens to the Int32/Float32/Float64/Boolean/String surface. FocasStatusMapper covers the FWLIB EW_* return-code family documented in the FOCAS/1 + FOCAS/2 references — EW_OK=0, EW_FUNC=1 → BadNotSupported, EW_OVRFLOW=2/EW_NUMBER=3/EW_LENGTH=4 → BadOutOfRange, EW_PROT=5/EW_PASSWD=11 → BadNotWritable, EW_NOOPT=6/EW_VERSION=-9 → BadNotSupported, EW_ATTRIB=7 → BadTypeMismatch, EW_DATA=8 → BadNodeIdUnknown, EW_PARITY=9 → BadCommunicationError, EW_BUSY=-1 → BadDeviceFailure, EW_HANDLE=-8 → BadInternalError, EW_UNEXP=-10/EW_SOCKET=-16 → BadCommunicationError. IFocasClient + IFocasClientFactory abstraction — ConnectAsync, IsConnected, ReadAsync returning (value, status) tuple, WriteAsync returning status, ProbeAsync for IHostConnectivityProbe. Deployment supplies the real factory; driver assembly stays licence-clean. FocasDriverOptions + FocasDeviceOptions + FocasTagDefinition + FocasProbeOptions — one instance supports N CNCs, tags cross-key by HostAddress + use canonical FocasAddress strings. FocasDriver implements IDriver only (PRs 2-3 add read/write/discover/subscribe/probe/resolver). InitializeAsync parses each device HostAddress + fails fast on malformed strings → Faulted health. 65 new unit tests in FocasScaffoldingTests covering — 5 valid host forms + 8 invalid + default-port-strip ToString, 12 valid PMC addresses across all 11 canonical letters + 3 parameter forms with + without bit + 2 macro forms, 10 invalid address shapes, canonical roundtrip theory, data-type mapping theory, FWLIB EW_* status mapping theory (9 codes + unknown → generic), DriverType, multi-device Initialize + address parsing, malformed-address fault, shutdown, default factory throws NotSupportedException with deployment pointer + Fwlib32.dll mention. Total project count 31 src + 20 tests; full solution builds 0 errors. Other drivers untouched.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 19:47:52 -04:00
Joseph Doherty
6c5b202910 TwinCAT follow-up — Native ADS notifications for ISubscribable. Closes task #189 — upgrades TwinCATDriver's subscription path from polling (shared PollGroupEngine) to native AdsClient.AddDeviceNotificationExAsync so the PLC pushes changes on its own cycle rather than the driver polling. Strictly better for latency + CPU — TC2 and TC3 runtimes notify on value change with sub-millisecond latency from the PLC cycle. ITwinCATClient gains AddNotificationAsync — takes symbolPath + TwinCATDataType + optional bitIndex + cycleTime + onChange callback + CancellationToken; returns an ITwinCATNotificationHandle whose Dispose tears the notification down on the wire. Bit-within-word reads supported — the parent word value arrives via the notification, driver extracts the bit before invoking the callback (same ExtractBit path as the read surface from PR 2). AdsTwinCATClient — subscribes to AdsClient.AdsNotificationEx in the ctor, maintains a ConcurrentDictionary<uint, NotificationRegistration> keyed on the server-side notification handle. AddDeviceNotificationExAsync returns Task<ResultHandle> with Handle + ErrorCode; non-NoError throws InvalidOperationException so the driver can catch + retry. Notification event args carry Handle + Value + DataType; lookup in _notifications dict routes the value through any bit-extraction + calls the consumer callback. Consumer-side exceptions are swallowed so a misbehaving callback can't crash the ADS notification thread. Dispose unsubscribes from AdsNotificationEx + clears the dict + disposes AdsClient. NotificationRegistration is ITwinCATNotificationHandle — Dispose fires DeleteDeviceNotificationAsync as fire-and-forget with CancellationToken.None (caller has already committed to teardown; blocking would slow shutdown). TwinCATDriverOptions.UseNativeNotifications — new bool, default true. When true the driver uses native notifications; when false it falls through to the shared PollGroupEngine (same semantics as other libplctag-backed drivers, also a safety valve for targets with notification limits). TwinCATDriver.SubscribeAsync dual-path — if UseNativeNotifications false delegate into _poll.Subscribe (unchanged behavior from PR 3). If true, iterate fullReferences, resolve each to its device's client via EnsureConnectedAsync (reuses PR 2's per-device connection cache), parse the SymbolPath via TwinCATSymbolPath (preserves bit-in-word support), call ITwinCATClient.AddNotificationAsync with a closure over the FullReference (not the ADS symbol — OPC UA subscribers addressed the driver-side name). Per-registration callback bridges (_, value) → OnDataChange event with a fresh DataValueSnapshot (Good status, current UtcNow timestamps). Any mid-registration failure triggers a try/catch that disposes every already-registered handle before rethrowing, keeping the driver in a clean never-existed state rather than half-registered. UnsubscribeAsync dispatches on handle type — NativeSubscriptionHandle disposes all its cached ITwinCATNotificationHandles; anything else delegates to _poll.Unsubscribe for the poll fallback. ShutdownAsync tears down native subs first (so AdsClient-level cleanup happens before the client itself disposes), then PollGroupEngine, then per-device probe CTS + client. NativeSubscriptionHandle DiagnosticId prefixes with twincat-native-sub- so Admin UI + logs can distinguish the paths. 9 new unit tests in TwinCATNativeNotificationTests — native subscribe registers one notification per tag, pushed value via FireNotification fires OnDataChange with the right FullReference (driver-side, not ADS symbol), unsubscribe disposes all notifications, unsubscribe halts future notifications, partial-failure cleanup via FailAfterNAddsFake (first succeeds, second throws → first gets torn down + Notifications count returns to 0 + AddCallCount=2 proving the test actually exercised both calls), shutdown disposes subscriptions, poll fallback works when UseNativeNotifications=false (no native handles created + initial-data push still fires), handle DiagnosticId distinguishes native vs poll. Existing poll-mode ISubscribable tests in TwinCATCapabilityTests updated with UseNativeNotifications=false so they continue testing the poll path specifically — both poll + native paths have test coverage now. TwinCATDriverTests got Probe.Enabled=false added because the default factory creates a real AdsClient which was flakily affected by parallel test execution sharing AMS router state. Total TwinCAT unit tests now 93/93 passing (+8 from PR 3's 85 counting the new native tests + 2 existing tests that got options tweaks). Full solution builds 0 errors; Modbus / AbCip / AbLegacy / other drivers untouched. TwinCAT driver is now feature-complete end-to-end — read / write / discover / native-subscribe / probe / host-resolve, with poll-mode as a safety valve. Unblocks closing task #120 for TwinCAT; remaining sub-task: FOCAS + task #188 (symbol-browsing — lower priority than FOCAS since real config flows still use pre-declared tags).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 18:49:48 -04:00
Joseph Doherty
aeb28cc8e7 TwinCAT PR 3 — ITagDiscovery + ISubscribable + IHostConnectivityProbe + IPerCallHostResolver. Completes the TwinCAT driver — 7-interface capability set matching AbCip / AbLegacy (minus IAlarmSource, same deferral). ITagDiscovery emits pre-declared tags under TwinCAT/device-host folder with DeviceName fallback to HostAddress; Writable→Operate / non-writable→ViewOnly. Symbol-browsing via AdsClient.ReadSymbolsAsync / ReadSymbolInfoAsync deferred to a follow-up (same shape as the @tags deferral for AbCip — needs careful traversal of the TwinCAT symbol table + type graph which the ReadSymbolsAsync API does expose but adds enough scope to warrant its own PR). ISubscribable consumes the shared PollGroupEngine — 4th consumer after Modbus + AbCip + AbLegacy. TwinCAT supports native ADS notifications (AddDeviceNotification) which would be strictly superior to polling, but plumbing through OPC UA semantics + the PollGroupEngine abstraction would require a parallel sampling path; poll-first matches the cross-driver pattern + gets the driver shippable. Follow-up task for native-notification upgrade tracked after merge. IHostConnectivityProbe — per-device probe loop using ITwinCATClient.ProbeAsync which wraps AdsClient.ReadStateAsync (cheap handshake that returns the target's AdsState, succeeds when router + target both respond). Success transitions to Running, any exception or probe-false to Stopped. Same lazy-connect + dispose-on-failure pattern as the read/write path — device state reconnects cleanly after a transient. IPerCallHostResolver maps tag full-ref to DeviceHostAddress for Phase 6.1 (DriverInstanceId, ResolvedHostName) bulkhead/breaker keying per plan decision #144; unknown refs fall back to first device, no devices → DriverInstanceId. ShutdownAsync disposes PollGroupEngine + cancels/disposes every probe CTS + disposes every cached client. DeviceState extended with ProbeLock / HostState / HostStateChangedUtc / ProbeCts matching AbCip/AbLegacy shape. 10 new tests in TwinCATCapabilityTests — discovery tag emission with correct SecurityClassification, subscription initial poll raises OnDataChange, shutdown cancels subscriptions, GetHostStatuses entry-per-device, probe Running transition on ProbeResult=true, probe Stopped on ProbeResult=false, probe disabled when Enabled=false, ResolveHost for known/unknown/no-devices paths. Total TwinCAT unit tests now 85/85 passing (+10 from PR 2's 75); full solution builds 0 errors; other drivers untouched. TwinCAT driver complete end-to-end — any TC2/TC3 AMS target reachable through a router is now shippable with read/write/discover/subscribe/probe/host-resolve, feature-parity with AbCip/AbLegacy.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 18:36:55 -04:00
Joseph Doherty
28e3470300 TwinCAT PR 2 — IReadable + IWritable. ITwinCATClient + ITwinCATClientFactory abstraction — one client per AMS target, reused across reads/writes/probes. Shape differs from AbCip/AbLegacy where libplctag handles are per-tag — TwinCAT's AdsClient is a single connection with symbolic reads/writes issued against it, so the abstraction is coarser. AdsTwinCATClient is the default implementation wrapping Beckhoff.TwinCAT.Ads's AdsClient — ConnectAsync calls AdsClient.Connect(AmsNetId.Parse(netId), port) after setting Timeout in ms; ReadValueAsync dispatches TwinCATDataType to the CLR Type via MapToClrType (bool/sbyte/byte/short/ushort/int/uint/long/ulong/float/double/string/uint for time types) and calls AdsClient.ReadValueAsync(symbol, type, ct) which returns ResultAnyValue; unwraps .Value + .ErrorCode and maps non-NoError codes via TwinCATStatusMapper.MapAdsError. BOOL-within-word reads extract the bit after the underlying word read using ExtractBit over short/ushort/int/uint/long/ulong. WriteValueAsync converts the boxed value via ConvertForWrite (Convert.ToXxx per type) then calls AdsClient.WriteValueAsync returning ResultWrite; checks .ErrorCode for status mapping. BOOL-within-word writes throw NotSupportedException with a pointer to task #181 — same RMW gap as Modbus BitInRegister / AbCip BOOL-in-DINT / AbLegacy bit-within-N-file. ProbeAsync calls AdsClient.ReadStateAsync + checks AdsErrorCode.NoError. TwinCATDriver implements IReadable + IWritable — per-device ITwinCATClient cached in DeviceState.Client, lazy-connected on first read/write via EnsureConnectedAsync, connect-failure path disposes + clears the client so next call re-attempts cleanly. ReadAsync ordered-snapshot pattern matching AbCip/AbLegacy: unknown ref → BadNodeIdUnknown, unknown device → BadNodeIdUnknown, OperationCanceledException rethrow, any other exception → BadCommunicationError + Degraded health. WriteAsync similar — non-Writable tag → BadNotWritable upfront, NotSupportedException → BadNotSupported, FormatException/InvalidCastException (guard pattern) → BadTypeMismatch, OverflowException → BadOutOfRange, generic → BadCommunicationError. Symbol name resolution goes through TwinCATSymbolPath.TryParse(def.SymbolPath) with fallback to the raw def.SymbolPath if the path doesn't parse — the Beckhoff AdsClient handles the final validation at wire time. ShutdownAsync disposes each device's client. 14 new unit tests in TwinCATReadWriteTests using FakeTwinCATClient + FakeTwinCATClientFactory — unknown ref → BadNodeIdUnknown, successful DInt read with Good status + captured value + IsConnected=true after EnsureConnectedAsync, repeat reads reuse the connection (one Connect + multiple reads), ADS error code mapping via FakeTwinCATClient.ReadStatuses, read exception → BadCommunicationError + Degraded health, connect exception disposes the client, batched reads preserve order across DInt/Real/String types, non-Writable rejection, successful write logs symbol+type+value+bit for test inspection, write status-code mapping, write exception → BadCommunicationError, batch preserves order across success/non-writable/unknown, cancellation propagation, ShutdownAsync disposes the client. Total TwinCAT unit tests now 75/75 passing (+14 from PR 1's 61); full solution builds 0 errors; Modbus / AbCip / AbLegacy / other drivers untouched.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 18:33:03 -04:00
Joseph Doherty
cd2c0bcadd TwinCAT PR 1 — Scaffolding + Core (TwinCATDriver + AMS address + symbolic path). New Driver.TwinCAT project referencing Beckhoff.TwinCAT.Ads 7.0.172 (the official Beckhoff .NET client — 1.6M+ downloads, actively maintained by Beckhoff + community). Package compiles without a local AMS router; wire calls need a running router (TwinCAT XAR on dev Windows, or the standalone Beckhoff.TwinCAT.Ads.TcpRouter embedded package for headless/CI). Same Core.Abstractions-only project shape as Modbus / AbCip / AbLegacy. TwinCATAmsAddress parses ads://{netId}:{port} canonical form — NetId is 6 dot-separated octets (NOT an IP; AMS router translates), port defaults to 851 (TC3 PLC runtime 1). Validates octet range 0-255 and port 1-65535. Case-insensitive scheme. Default-port stripping in canonical form for roundtrip stability. Rejects wrong scheme, missing //, 5-or-7-octet NetId, out-of-range octets/ports, non-numeric fragments. TwinCATSymbolPath handles IEC 61131-3 symbolic names — single-segment (Counter), POU.variable (MAIN.bStart), GVL.variable (GVL.Counter), structured member access (Motor1.Status.Running), array subscripts (Data[5]), multi-dim arrays (Matrix[1,2]), bit-access (Flags.3, GVL.Status.7), combined scope/member/subscript/bit (MAIN.Motors[0].Status.5). Roundtrip-safe ToAdsSymbolName produces the exact string AdsClient.ReadValue consumes. Rejects leading/trailing dots, space in idents, digit-prefix idents, empty/negative/non-numeric subscripts, unbalanced brackets. Underscore-prefix idents accepted per IEC. TwinCATDataType — BOOL / SINT / USINT / INT / UINT / DINT / UDINT / LINT / ULINT / REAL / LREAL / STRING / WSTRING (UTF-16) / TIME / DATE / DateTime (DT) / TimeOfDay (TOD) / Structure. Wider than Logix's surface — IEC adds WSTRING + TIME/DATE/DT/TOD variants. ToDriverDataType widens unsigned + 64-bit to Int32 matching the Modbus/AbCip/AbLegacy Int64-gap convention. TwinCATStatusMapper — Good / BadInternalError / BadNodeIdUnknown / BadNotWritable / BadOutOfRange / BadNotSupported / BadDeviceFailure / BadCommunicationError / BadTimeout / BadTypeMismatch. MapAdsError covers the ADS error codes a driver actually encounters — 6/7 port unreachable, 1792 service not supported, 1793/1794 invalid index group/offset, 1798 symbol not found (→ BadNodeIdUnknown), 1807 invalid state, 1808 access denied (→ BadNotWritable), 1811/1812 size mismatch (→ BadOutOfRange), 1861 sync timeout, unknown → BadCommunicationError. TwinCATDriverOptions + TwinCATDeviceOptions + TwinCATTagDefinition + TwinCATProbeOptions — one instance supports N AMS targets, Tags cross-key by HostAddress, Probe defaults to 5s interval (unlike AbLegacy there's no default probe address — ADS probe reads AmsRouterState not a user tag, so probe address is implicit). TwinCATDriver IDriver skeleton — InitializeAsync parses each device HostAddress + fails fast on malformed strings → Faulted. 61 new unit tests across 3 files — TwinCATAmsAddressTests (6 valid shapes + 12 invalid shapes + 2 ToString canonicalisation + roundtrip stability), TwinCATSymbolPathTests (9 valid shapes + 12 invalid shapes + underscore prefix + 8-case roundtrip), TwinCATDriverTests (DriverType + multi-device init + malformed-address fault + shutdown + reinit + data-type mapping theory + ADS error-code theory). Total project count 30 src + 19 tests; full solution builds 0 errors; Modbus / AbCip / AbLegacy / other drivers untouched.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 18:26:29 -04:00
Joseph Doherty
400fc6242c AB Legacy PR 3 — ITagDiscovery + ISubscribable + IHostConnectivityProbe + IPerCallHostResolver. Fills out the AbLegacy capability surface — the driver now implements the same 7-interface set as AbCip (IDriver + IReadable + IWritable + ITagDiscovery + ISubscribable + IHostConnectivityProbe + IPerCallHostResolver). ITagDiscovery emits pre-declared tags under an AbLegacy root folder with a per-device sub-folder keyed on HostAddress (DeviceName fallback to HostAddress when null). Writable tags surface as SecurityClassification.Operate, non-writable as ViewOnly. No controller-side enumeration — PCCC has no @tags equivalent on SLC / MicroLogix / PLC-5 (symbol table isn't exposed the way Logix exposes it), so the pre-declared path is the only discovery mechanism. ISubscribable consumes the shared PollGroupEngine extracted in AB CIP PR 1 — reader delegate points at ReadAsync (already handles lazy runtime init + caching), onChange bridges into the driver's OnDataChange event. 100ms interval floor. Initial-data push on first poll. Makes AbLegacy the third consumer of PollGroupEngine (after Modbus and AbCip). IHostConnectivityProbe — per-device probe loop when ProbeOptions.Enabled + ProbeAddress configured (defaults to S:0 status file word 0). Lazy-init on first tick, re-init on wire failure (destroyed native handle gets recreated rather than silently staying broken). Success transitions device to Running, exception to Stopped, same-state spurious event guard under per-device lock. GetHostStatuses returns one entry per device with current state + last-change timestamp for Admin /hosts surfacing. IPerCallHostResolver maps tag full-ref → DeviceHostAddress for the Phase 6.1 (DriverInstanceId, ResolvedHostName) bulkhead/breaker keying per plan decision #144. Unknown refs fall back to first device's address (invoker handles at capability level as BadNodeIdUnknown); no devices → DriverInstanceId. ShutdownAsync cancels + disposes each probe CTS, disposes PollGroupEngine cancelling active subscriptions, disposes every cached runtime. DeviceState gains ProbeLock / HostState / HostStateChangedUtc / ProbeCts / ProbeInitialized matching AbCip's DeviceState shape. 10 new unit tests in AbLegacyCapabilityTests covering — pre-declared tags emit under AbLegacy/device folder with correct SecurityClassification, subscription initial poll raises OnDataChange with correct value, unsubscribe halts polling (value change post-unsub produces no further events), GetHostStatuses returns one entry per device, probe Running transition on successful read, probe Stopped transition on read exception, probe disabled when ProbeAddress null, ResolveHost returns declared device for known tag, falls back to first device for unknown, falls back to DriverInstanceId when no devices. Total AbLegacy unit tests now 92/92 passing (+10 from PR 2's 82); full solution builds 0 errors; AbCip + Modbus + other drivers untouched. AB Legacy driver now complete end-to-end — SLC 500 / MicroLogix / PLC-5 / LogixPccc all shippable with read / write / discovery / subscribe / probe / host-resolve, feature-parity with AbCip minus IAlarmSource (same deferral per plan).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 18:02:52 -04:00
Joseph Doherty
b2424a0616 AB Legacy PR 2 — IReadable + IWritable. IAbLegacyTagRuntime + IAbLegacyTagFactory abstraction mirrors IAbCipTagRuntime from AbCip PR 3. LibplctagLegacyTagRuntime default implementation wraps libplctag.Tag with Protocol=ab_eip + PlcType dispatched from the profile's libplctag attribute (Slc500/MicroLogix/Plc5/LogixPccc) — libplctag routes PCCC-over-EIP internally based on PlcType, so our layer just forwards the atomic type to Get/Set calls. DecodeValue handles Bit (GetBit when bitIndex is set, else GetInt8!=0), Int/AnalogInt (GetInt16 widened to int), Long (GetInt32), Float (GetFloat32), String (GetString), TimerElement/CounterElement/ControlElement (GetInt32 — sub-element selection is in the libplctag tag name like T4:0.ACC, PLC-side decode picks the right slot). EncodeValue handles the same types; bit-within-word writes throw NotSupportedException pointing at follow-up task #181 (same read-modify-write gap as Modbus BitInRegister). AbLegacyDriver implements IReadable + IWritable with the exact same shape as AbCip PR 3-4 — per-tag lazy runtime init via EnsureTagRuntimeAsync cached in DeviceState.Runtimes dict, ordered-snapshot results, health surface updates. Exception table — OperationCanceledException rethrows, NotSupportedException → BadNotSupported, FormatException/InvalidCastException → BadTypeMismatch (guard pattern C# 11 syntax), OverflowException → BadOutOfRange, anything else → BadCommunicationError. ShutdownAsync disposes every cached runtime so the native tag handles get released. 14 new unit tests in AbLegacyReadWriteTests covering unknown ref → BadNodeIdUnknown, successful N-file read with Good status + captured value, repeat-read reuses cached runtime (init count 1 across 2 reads), libplctag non-zero status mapping (-14 → BadNodeIdUnknown), read exception → BadCommunicationError + Degraded health, batched reads preserve order across N/F/ST types, TagCreateParams composition (gateway/port/path/slc500 attribute/tag-name), non-writable tag → BadNotWritable, successful write encodes + flushes, bit-within-word → BadNotSupported (RmwThrowingFake mirrors LibplctagLegacyTagRuntime's runtime check), write exception → BadCommunicationError, batch preserves order across success+fail+unknown, cancellation propagates, ShutdownAsync disposes runtimes. Total AbLegacy unit tests now 82/82 passing (+14 from PR 1's 68). Full solution builds 0 errors; Modbus + AbCip + other drivers untouched.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 17:58:38 -04:00
Joseph Doherty
fc575e8dae AB Legacy PR 1 — Scaffolding + Core (AbLegacyDriver + PCCC address parser). New Driver.AbLegacy project with the libplctag 1.5.2 reference + the same Core.Abstractions-only project shape AbCip uses. AbLegacyHostAddress duplicates the ab://gateway[:port]/cip-path parser from AbCip since PCCC-over-EIP uses the same gateway routing convention (SLC 500 direct-wired with empty path, PLC-5 bridged through a ControlLogix chassis with full CIP path). Parser is 30 lines; copy was cheaper than introducing a shared Ab* project just to avoid duplication. AbLegacyAddress handles PCCC file addressing — file-letter + optional file-number + colon + word-number + optional sub-element (.ACC / .PRE / .EN / .DN / .CU / .CD / .LEN / .POS / .ER) + optional /bit-index. Handles the full shape variety — N7:0 (integer file 7 word 0), F8:5 (float file 8 word 5), B3:0/0 (bit file 3 word 0 bit 0), ST9:0 (string file 9 string 0), L9:3 (long file SLC 5/05+), T4:0.ACC (timer accumulator), C5:2.CU (counter count-up bit), R6:0.LEN (control length), I:0/0 (input file bit — no file number for I/O/S), O:1/2 (output file bit), S:1 (status file word), N7:0/3 (bit within integer file). Validates file letters against the canonical SLC/ML/PLC-5 set (N/F/B/L/ST/T/C/R/I/O/S/A). ToLibplctagName roundtrips so the parsed value can be handed straight to libplctag's name= attribute. AbLegacyDataType — Bit / Int (N-file, 16-bit signed) / Long (L-file, 32-bit, SLC 5/05+ only) / Float (F-file, 32-bit IEEE-754) / AnalogInt (A-file) / String (ST-file, 82-byte fixed + length word) / TimerElement / CounterElement / ControlElement. ToDriverDataType widens Long to Int32 matching the Modbus/AbCip Int64-gap convention. AbLegacyStatusMapper shares the OPC UA status constants with AbCip (same numeric values, different namespace). MapLibplctagStatus mirrors AbCip — 0 success, positive pending, negative error code families. MapPcccStatus handles PCCC STS bytes — 0x00 success, 0x10 illegal command, 0x20 bad address, 0x30 protected, 0x40/0x50 busy, 0xF0 extended status. AbLegacyDriverOptions + AbLegacyDeviceOptions + AbLegacyTagDefinition + AbLegacyProbeOptions mirror AbCip shapes — one instance supports N devices via Devices list, Tags list references devices by HostAddress cross-key, Probe uses S:0 by default as the cheap probe address. AbLegacyPlcFamilyProfile for four families — Slc500 (slc500 attribute, 1,0 default path, supports L + ST files, 240B max PCCC packet), MicroLogix (micrologix attribute, empty path for direct EIP, supports ST but not L), Plc5 (plc5 attribute, 1,0 default path, supports ST but predates L), LogixPccc (logixpccc attribute, full Logix ConnectionSize + L file support via the PCCC compatibility layer on ControlLogix). AbLegacyDriver implements IDriver only — InitializeAsync parses each device's HostAddress and selects its profile (fails fast on malformed strings → Faulted health), per-device state with parsed address + options + profile + empty placeholder for PRs 2-3. ShutdownAsync clears the device dict. 68 new unit tests across 3 files — AbLegacyAddressTests (15 valid shapes + 10 invalid shapes + 7 ToLibplctagName roundtrip), AbLegacyHostAndStatusTests (4 valid host + 5 invalid host + 8 PCCC STS + 7 libplctag status), AbLegacyDriverTests (IDriver lifecycle + multi-device init with per-family profile selection + malformed-address fault + shutdown + family profile defaults + ForFamily theory + data-type mapping). Total project count 29 src + 18 tests; full solution builds 0 errors; Modbus + AbCip + other drivers untouched.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 17:54:25 -04:00
Joseph Doherty
60b8d6f2d0 AB CIP PR 9-12 — Per-PLC-family profile tests + GuardLogix safety-tag support. Consolidates PRs 9/10/11/12 from the plan (ControlLogix / CompactLogix / Micro800 / GuardLogix integration suites) into a single PR because the per-family work that actually ships without a live ab_server binary is profile-metadata assertion + unit-level driver-option binding. Per-family integration tests that require a running simulator are deferred to the ab_server-CI follow-up already tracked from PR 3 (download prebuilt Windows binary as GitHub release asset). ControlLogix — baseline profile asserted (controllogix attribute, 4002 LFO ConnectionSize, 1,0 default path, request-packing + connected-messaging, 4000B max fragment). CompactLogix — narrower 504 ConnectionSize for 5069-L3x safety, 500B max fragment, lib attribute compactlogix which libplctag maps to the ControlLogix family internally but via our profile chain we surface it as a distinct knob so future quirk handling (5069 narrow-window regression cases) hangs off the compactlogix attribute. Micro800 — empty CIP path for no-backplane routing, 488B ConnectionSize, 484B fragment cap, request packing + connected messaging both disabled (most models reject Forward_Open), micro800 lib attribute. Test asserts the driver correctly parses an ab://192.168.1.20/ host address with empty path + forwards the empty path through AbCipTagCreateParams so libplctag sees the unconnected-only configuration. GuardLogix — wire protocol identical to ControlLogix (safety partition is a per-tag concern, not a wire-layer distinction) so profile defaults match ControlLogix. New AbCipTagDefinition.SafetyTag field — when true, the driver forces SecurityClassification.ViewOnly in discovery regardless of the Writable flag, and IWritable rejects the write upfront with BadNotWritable. Matches the Rockwell safety-partition isolation model where non-safety-task writes to safety tags would be rejected by the PLC anyway — surfacing the intent at the driver surface prevents wasted wire round-trips + gives Admin UI users a correct ViewOnly rendering. 14 new unit tests in AbCipPlcFamilyTests covering — ControlLogix profile defaults + correct profile selection at Initialize, CompactLogix narrower-than-ControlLogix ConnectionSize + fragment cap, Micro800 empty path parses + SupportsConnectedMessaging=false + SupportsRequestPacking=false + read forwards empty path + micro800 attribute through to libplctag, GuardLogix wire-protocol parity with ControlLogix, GuardLogix safety tag surfaces as ViewOnly in discovery even when Writable=true, GuardLogix safety-tag write rejected with BadNotWritable even when Writable=true, ForFamily theory (4 families → correct libplctag attribute). Total AbCip unit tests now 161/161 passing (+14 from PR 8's 147). Modbus + other drivers untouched; full solution builds 0 errors. PR 13 (IAlarmSource via tag-projected ALMA/ALMD blocks) remains deferred per the plan — feature-flagged pattern not needed before go-live.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 17:18:51 -04:00
Joseph Doherty
ac14ba9664 AB CIP PR 8 — IHostConnectivityProbe + IPerCallHostResolver. Per-device probe loop — when AbCipProbeOptions.Enabled + ProbeTagPath are configured, InitializeAsync kicks off one probe task per device that periodically reads the probe tag (lazy-init on first attempt, re-init on wire failure so destroyed native handles get recreated rather than silently staying broken), transitions Running on status==0 or Stopped on non-zero status / exception, raises OnHostStatusChanged with the device HostAddress as the host-name key. TransitionDeviceState guards against spurious same-state events under a per-device lock. ShutdownAsync cancels + disposes each probe's CTS + its captured runtime. DeviceState record gains ProbeLock / HostState / HostStateChangedUtc / ProbeCts / ProbeInitialized fields. IHostConnectivityProbe.GetHostStatuses returns one HostConnectivityStatus per device with the current state + last-change timestamp, surfaced to Admin /hosts per plan decision #144. IPerCallHostResolver.ResolveHost maps a tag full-reference to its DeviceHostAddress via the _tagsByName dict populated at Initialize time, which means UDT member full-references (Motor1.Speed synthesised by PR 6) resolve to the parent UDT's device without extra bookkeeping. Unknown references fall back to the first configured device's host address (invoker handles the actual mislookup at read time as BadNodeIdUnknown), and when no devices are configured resolver returns DriverInstanceId so the single-host fallback pipeline still works. Matches the plan decision #144 contract — Phase 6.1 resilience keys its bulkhead + breaker on (DriverInstanceId, ResolvedHostName) so a dead PLC trips only its own breaker, healthy siblings keep serving. 10 new unit tests in AbCipHostProbeTests covering GetHostStatuses returning one entry per device, probe success transitioning Unknown → Running, probe exception transitioning to Stopped, Enabled=false skipping the loop (no events + state stays Unknown), null ProbeTagPath skipping the loop, multi-device independent probe behavior (one Running + one Stopped simultaneously), ResolveHost for known tags returning the declared DeviceHostAddress, ResolveHost for unknown ref falling back to first device, ResolveHost falling back to DriverInstanceId when no devices, ResolveHost for UDT member walking to the synthesised member definition. Total AbCip unit tests now 147/147 passing (+10 from PR 7's 137). Full solution builds 0 errors; Modbus + other drivers untouched.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 17:15:10 -04:00
Joseph Doherty
33780eb64c AB CIP PR 7 — ISubscribable via shared PollGroupEngine. AbCipDriver now implements ISubscribable — Subscribe delegates into the PollGroupEngine extracted in PR 1, Unsubscribe releases the subscription, ShutdownAsync disposes the engine cancelling every active subscription. OnDataChange event wired through the engine's on-change callback so external subscribers see the driver as sender. The engine's reader delegate points at the driver's ReadAsync (already handles lazy runtime init + caching via EnsureTagRuntimeAsync) — each poll tick batch-reads every subscribed tag in one IReadable call. 100ms interval floor inherited from PollGroupEngine.DefaultMinInterval matches Modbus convention. Initial-data push on first poll preserved via forceRaise=true. Exception-tolerant loop preserved — individual read failures show up as DataValueSnapshot with non-Good StatusCode via the status-code mapping PR 3 established. 7 new unit tests in AbCipSubscriptionTests covering initial-poll raising per tag, unchanged value raising only once, value change between polls triggering a new event, Unsubscribe halting the loop, 100ms floor keeping a 5ms request from generating extra events against a stable value, ShutdownAsync cancelling active subscriptions, UDT member subscription routing through the synthesised Motor1.Speed full-reference (proving PR 6's fan-out composes correctly with PR 7's subscription path). Total AbCip unit tests now 137/137 passing (+7 from PR 6's 130). Validates that the shared PollGroupEngine from PR 1 works correctly for a second driver, closing the original motivation for the extraction. Full solution builds 0 errors; Modbus + other drivers untouched.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 17:11:51 -04:00
Joseph Doherty
b06a1ba607 AB CIP PR 6 — UDT member-declaration support. Declaration-driven UDT member fan-out — users declare a UDT-typed tag once with an explicit Members list and the driver (1) expands member-addressable tags synthetically at Initialize time so Read/Write/Subscribe hit individual native tags per member, (2) emits a folder + one Variable per member in DiscoverAsync instead of a single opaque Structure Variable. Matches the Logix 5000 addressing convention where members are reached via dotted syntax (Motor1.Speed, Motor1.Running) — AbCipTagPath already parsed this shape in PR 2, so PR 6 just had to wire config→TagPath composition. New AbCipStructureMember record — Name / DataType / Writable / WriteIdempotent — plus optional Members list on AbCipTagDefinition that's ignored for atomic types and optional for Structure types. When Structure has null or empty Members the driver falls back to emitting a single opaque Variable so downstream config can address members manually (the "black box" path documented in AbCipTagDefinition's docstring). AbCipDriver.InitializeAsync now iterates tags + for every Structure tag with non-empty Members synthesises a child AbCipTagDefinition per member (composed full-reference Parent.Member + composed TagPath parent.member passed through to libplctag as a normal symbolic read). Per-member Writable/WriteIdempotent metadata propagates so IWritable correctly rejects writes to members flagged non-writable even when the parent tag is writable — each member stands alone from the resilience + authz perspective. DiscoverAsync gains a matching branch — Structure with Members emits an intermediate folder named after the parent tag + one Variable per member under it (browse name = member.Name, FullName = Parent.Member). Members with Writable=false surface SecurityClassification.ViewOnly, WriteIdempotent flag passes through to the DriverAttributeInfo. Structure without Members falls through to the normal single-Variable path. Whole-UDT read optimization (one libplctag call returns the packed buffer + client-side member decode) is deferred — needs the CIP Template Object class 0x6C reader which is blocked on the same libplctag 1.5.2 TagInfoPlcMapper gap that deferred the real @tags walker in PR 5. AbCipTemplateCache shipped in PR 5 is the drop-in point when that reader lands. Per-member reads today are N native round-trips; whole-UDT optimisation is a perf win, not a correctness gap. 7 new unit tests in AbCipUdtMemberTests — UDT fan-out to Variable children under folder with correct SecurityClassification + WriteIdempotent propagation, member reads via synthesised full-reference with correct per-member values, member writes routing to correct TagPath, member Writable=false flag correctly blocking IWritable, Structure without Members falls back to single Variable, empty Members list treated identically to null, UDT tags coexist with flat tags in the discovery output. Total AbCip unit tests now 130/130 passing (+7 from PR 5's 123). Modbus + other drivers untouched; full solution builds 0 errors. Unblocks PR 7 (ISubscribable) — the poll engine already works with member-level full references.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 17:09:06 -04:00
Joseph Doherty
447086892e AB CIP PR 5 — ITagDiscovery (pre-declared emission + controller-enumeration scaffolding). DiscoverAsync streams tags to IAddressSpaceBuilder with the same shape the Modbus driver uses, keyed by device host address so one driver instance exposing N PLCs produces N device folders under a shared "AbCip" root. Pre-declared tags from AbCipDriverOptions.Tags emit first, filtered through AbCipSystemTagFilter so __DEFVAL_* / __DEFAULT_* / Routine: / Task: / Local:N:X / Map: / Axis: / Cam: / MotionGroup: infrastructure names never reach the address space. Writable tags map to SecurityClassification.Operate, non-writable to ViewOnly. Controller enumeration (walking the Logix Symbol Object via @tags) is wired up through a new IAbCipTagEnumerator + IAbCipTagEnumeratorFactory abstraction — default EmptyAbCipTagEnumeratorFactory returns an empty sequence so the driver stays production-safe without a real decoder. Tests inject FakeEnumeratorFactory to exercise the discovered-tag path: discovered tags land under a Discovered/ sub-folder, program-scope produces Program:P.Name full references, the IsSystemTag hint + the AbCipSystemTagFilter both act as gates, ReadOnly surfaces SecurityClassification.ViewOnly. The real @tags walker is a follow-up because libplctag 1.5.2 (latest stable on NuGet) does not expose TagInfoPlcMapper / UdtInfoMapper — the DataTypes namespace only ships IPlcMapper<T>, so enumerating the Symbol Object requires either implementing a custom IPlcMapper for the CIP byte layout or raw-buffer decoding via plc_tag_get_raw — both non-trivial enough to warrant their own PR. Code comment on EmptyAbCipTagEnumerator documents the gap + points to the follow-up. AbCipTemplateCache placeholder ships with a ConcurrentDictionary<(device, templateInstanceId), AbCipUdtShape> + Put / TryGet / Clear / Count — the Template Object reader (CIP class 0x6C) populates it in PR 6 and FlushOptionalCachesAsync now clears it. AbCipUdtShape + AbCipUdtMember records describe UDT layout — type name + total size + ordered members with offset / type / array length. AbCipDriver ctor gains optional enumeratorFactory parameter matching the tagFactory pattern from PR 3. TemplateCache exposed internally for PR 6's reader to write into. 25 new unit tests in AbCipDriverDiscoveryTests covering — pre-declared emission under device folder, DeviceName fallback to host address, system-tag filter rejecting pre-declared infrastructure names, cross-device tag filtering (tags for a device this driver does not own are ignored), controller enumeration adds tags under Discovered/, system-tag hint + filter both enforced, ReadOnly → ViewOnly, AbCipTagCreateParams composition (gateway / port / CIP path / libplctag attribute / tag name "@tags" / timeout), default enumerator factory used when not injected, 13 Theory cases covering every AbCipSystemTagFilter pattern, template cache roundtrip + clear, FlushOptionalCachesAsync clears the cache. Total AbCip unit tests now 123/123 passing (+25 from PR 4's 98). Modbus + other existing tests untouched; full solution builds 0 errors. Unblocks PR 6 (UDT structured read/write) + PR 7 (subscriptions consuming PollGroupEngine from PR 1).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 17:05:02 -04:00
Joseph Doherty
257f4fd3f5 AB CIP PR 4 — IWritable implementation. LibplctagTagRuntime.EncodeValue fills in the switch for every atomic Logix type the driver currently surfaces — Bool (standalone BOOL via SetInt8 0/1), SInt/USInt (SetInt8/SetUInt8), Int/UInt (SetInt16/SetUInt16), DInt/UDInt (SetInt32/SetUInt32), LInt/ULInt (SetInt64/SetUInt64), Real (SetFloat32), LReal (SetFloat64), String (SetString 0), Dt (epoch DINT via SetInt32). BOOL-within-DINT writes throw NotSupportedException with a code comment matching the Modbus BitInRegister pattern at ModbusDriver.cs line 640 — the read-modify-write logic + lock-per-DINT discipline is a follow-up PR rather than squeezing it into the initial wire plumbing. Structure writes throw NotSupportedException pointing at PR 6 when UDT support lands. AbCipDriver now implements IWritable. WriteAsync iterates writes preserving order, short-circuits on unknown reference → BadNodeIdUnknown, on non-writable tag definition → BadNotWritable, on unknown device → BadNodeIdUnknown. Happy path materialises the cached runtime via EnsureTagRuntimeAsync (shares PR 3's lazy-init path so read+write on the same tag hits one native handle), EncodeValue into the tag's buffer, WriteAsync flushes, GetStatus confirms the wire status, maps libplctag error codes via AbCipStatusMapper.MapLibplctagStatus, sets health Healthy on success. Per plan decisions #44, #45, #143 the driver does NOT auto-retry writes — that's a resilience-layer concern (Polly pipeline sitting above) keyed on the tag's WriteIdempotent flag. Exception-mapping table — OperationCanceledException rethrows (honors cancellation), NotSupportedException → BadNotSupported (bit-in-DINT, Structure, future unsupported types), FormatException → BadTypeMismatch (Convert.ToInt32 of a non-numeric string), InvalidCastException → BadTypeMismatch (caller passed an object incompatible with the conversion target), OverflowException → BadOutOfRange (value exceeds target type range, e.g. Int16 write of 1_000_000), any other Exception → BadCommunicationError (wire drop, libplctag-internal failure). Health surface updates Degraded on every non-Cancellation exception path, Healthy on success. Introduces AbCipStatusMapper.BadTypeMismatch (0x80730000). 10 new unit tests in AbCipDriverWriteTests covering — unknown ref → BadNodeIdUnknown, non-writable tag → BadNotWritable, successful DInt write encodes + flushes the value + marks WriteCount=1, BOOL-in-DINT rejected as BadNotSupported (separate ThrowingBoolBitFake mirrors LibplctagTagRuntime's runtime check), non-zero libplctag status after write mapped via AbCipStatusMapper (timeout -5 → BadTimeout), FormatException from non-numeric-string write → BadTypeMismatch (RealConvertFake exercises real Convert.ToInt32), OverflowException from Int16 write of 1_000_000 → BadOutOfRange, generic exception during write → BadCommunicationError + health Degraded, batch with mixed success+failure preserves order across four request types, cancellation propagates as OperationCanceledException. FakeAbCipTag's test-fake base class methods made virtual so override hooks work correctly through the IAbCipTagRuntime interface (new-shadow was silently falling through to the base implementation). Total AbCip unit tests now 98/98 passing; Modbus + other existing tests untouched; full solution builds 0 errors.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 16:57:52 -04:00
Joseph Doherty
cc35c77d64 AB CIP PR 3 — IReadable implementation against libplctag. Introduces IAbCipTagRuntime + IAbCipTagFactory abstraction matching the Modbus transport-factory pattern (ctor optional arg, default production impl injected) so the driver's read/status-mapping logic is unit-testable without a live PLC or the native libplctag binary. LibplctagTagRuntime is the default wire-backed implementation — wraps libplctag.Tag + translates our AbCipDataType enum into GetInt8/GetUInt8/GetInt16/GetUInt16/GetInt32/GetUInt32/GetInt64/GetUInt64/GetFloat32/GetFloat64/GetString/GetBit calls covering Bool (standalone + BOOL-in-DINT via .N bit selector), SInt/USInt, Int/UInt, DInt/UDInt, LInt/ULInt, Real, LReal, String, Dt (epoch DINT), with Structure deferred to PR 6. MapPlcType bridges our libplctag attribute strings (controllogix, compactlogix, micro800) to libplctag.PlcType enum; CompactLogix rolls under ControlLogix per libplctag's family grouping which matches the wire protocol reality. AbCipDriver now implements IReadable — ReadAsync iterates fullReferences preserving order, looks up each tag definition + its device, lazily materialises the tag runtime via EnsureTagRuntimeAsync on first touch (cached thereafter for the lifetime of the device), catches OperationCanceledException to honor cancellation, maps libplctag non-zero status via AbCipStatusMapper.MapLibplctagStatus, catches any other exception as BadCommunicationError. Health surface moves to Healthy on success + Degraded with the last error message on failure. Initialize-failure path disposes the half-created runtime before rethrowing so no native handles leak. DeviceState gains a Runtimes dict alongside the existing TagHandles collection; DisposeHandles walks both so ShutdownAsync + ReinitializeAsync cleanly destroy every native tag. 12 new unit tests in AbCipDriverReadTests using FakeAbCipTag / FakeAbCipTagFactory (test fake under tests/...AbCip.Tests/FakeAbCipTag.cs) covering unknown reference → BadNodeIdUnknown, unknown device → BadNodeIdUnknown, successful DInt read with correct Good status + captured value, lazy-init on first read with reuse across subsequent reads, non-zero libplctag status mapping via AbCipStatusMapper, exception during read surfacing as BadCommunicationError with health Degraded, batched reads preserving order + per-tag status, health Healthy after success, TagCreateParams composition from device + profile (gateway / port / CIP path / libplctag attribute / tag name wiring), cancellation propagation via OperationCanceledException, ShutdownAsync disposing every runtime, Initialize-failure disposing the aborted runtime. Total AbCip unit tests now 88/88 passing. Integration test project scaffolding — tests/ZB.MOM.WW.OtOpcUa.Driver.AbCip.IntegrationTests with AbServerFixture (IAsyncLifetime that starts ab_server when the binary is on PATH, otherwise marks IsAvailable=false), AbServerFact attribute (Fact-equivalent that skips when ab_server is missing), one smoke test exercising DInt read end-to-end. Project runs cleanly — the single smoke test skips on boxes without ab_server (0 failed, 0 passed, 1 skipped) + runs on boxes with it. Follow-up work captured in comments — ab_server CI fixture (download prebuilt Windows x64 binary as GitHub release asset) + per-family JSON profiles + hand-rolled CIP stub for UDT fidelity ship in the PR 6/9-12 window. Solution file updated. Full solution builds 0 errors across all 28 projects. Modbus + other existing tests untouched.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 16:38:54 -04:00
Joseph Doherty
3e0452e8a4 AB CIP PR 2 — scaffolding + Core (AbCipDriver skeleton + libplctag binding + host / tag-path / data-type / status-code parsers + per-family profiles + SafeHandle wrapper + test harness). Ships everything needed to stand up the driver project as a compiling assembly with no wire calls yet — PR 3 adds IReadable against ab_server which is the first PR that actually touches the native library. Project reference shape matches Modbus / OpcUaClient / S7 (only Core.Abstractions, no Core / Configuration / Polly) so the driver stays lean and doesn't drag EF Core into every deployment that wants AB support. libplctag 1.5.2 pinned (1.6.x only exists as alpha — stable 1.5 series covers ControlLogix / CompactLogix / Micro800 / SLC500 / PLC-5 / MicroLogix which matches plan decision #11 family coverage). libplctag.NativeImport arrives transitively. AbCipHostAddress parses ab://gateway[:port]/cip-path canonical strings end-to-end: handles hostname or IP gateway, optional explicit port (default 44818 EtherNet-IP reserved), CIP path including bridged routes (1,2,2,10.0.0.10,1,0), empty path for Micro800 / MicroLogix without backplane routing, case-insensitive scheme, default-port stripping in canonical form for round-trip stability. Opaque string survives straight into libplctag's gateway / path attributes so no translation layer at wire time. AbCipTagPath handles the full Logix symbolic tag surface — controller-scope (Motor1_Speed), program-scope (Program:MainProgram.StepIndex), structured member access (Motor1.Speed.Setpoint), multi-dim array subscripts (Matrix[1,2,3]), bit-within-DINT via .N syntax (Flags.3, Motor.Status.12) with valid range 0-31 per Logix 5000 General Instructions Reference. Structural capture so PR 6 UDT work can walk the path against a cached template without reparsing. Rejects malformed shapes (empty scopes, ident starting with digit, spaces, empty/negative/non-numeric subscripts, unbalanced brackets, leading / trailing dots). Round-trips via ToLibplctagName producing the exact string libplctag's name attribute expects. AbCipDataType mirrors ModbusDataType shape — atomic Bool / SInt / Int / DInt / LInt / USInt / UInt / UDInt / ULInt / Real / LReal / String / Dt plus a Structure marker for UDT-typed tags (resolved via CIP Template Object at discovery time in PR 5/6). ToDriverDataType adapter follows the Modbus widening convention for unsigned + 64-bit until DriverDataType picks those up. AbCipStatusMapper covers the CIP general-status values an AB PLC actually returns during normal operation (0x00/0x04/0x05/0x06/0x08/0x0A/0x0B/0x0E/0x10/0x13/0x16) + libplctag PLCTAG_STATUS_* codes (0, >0 pending, negative error families). Mirrors ModbusDriver.MapModbusExceptionToStatus so Admin UI status displays stay uniform across drivers. PlcTagHandle is a SafeHandle around the int32 native tag ID with plc_tag_destroy slot wired as a no-op for PR 2 (P/Invoke DllImport arrives with PR 3 when the wire calls land). Lifetime guaranteed by the SafeHandle finalizer — every leaked handle gets cleaned up even when the owner is GC'd without explicit Dispose. IsInvalid when native ID <= 0 so destroying a negative (error) handle never happens. Critical because driver-specs.md §3 flags libplctag native heap as invisible to GetMemoryFootprint — leaked handles directly feed the Tier-B recycle trigger. AbCipDriverOptions captures the multi-device shape — one driver instance can talk to N PLCs via Devices[] (each with HostAddress + PlcFamily + optional DeviceName); Tags[] references devices by HostAddress as the cross-key; AbCipProbeOptions + driver-wide Timeout. AbCipDriver implements IDriver only — InitializeAsync parses every device's HostAddress and selects its PlcFamilyProfile (fails fast on malformed strings via InvalidOperationException → Faulted health), per-device state cached in a DeviceState record with parsed address + profile + empty TagHandles dict for later PRs. ReinitializeAsync is the Tier-B escape hatch — shuts down every device, disposes every PlcTagHandle via SafeHandle lifetime, reinitializes from options. ShutdownAsync clears the device dict and flips health to Unknown. PlcFamilies/AbCipPlcFamilyProfile gives four baseline profiles — ControlLogix (4002 ConnectionSize, path 1,0, Large Forward Open + request packing + connected messaging, FW20+ baseline), CompactLogix (narrower 504 default for 5069-L3x safety), Micro800 (488 cap, empty path, unconnected-only, no request packing), GuardLogix (shares ControlLogix wire protocol — safety partition is tag-level, surfaced as ViewOnly in PR 12). Tests — 76 new cases across 4 test classes — AbCipHostAddressTests (10 valid shapes, 10 invalid shapes, ToString canonicalization, round-trip stability), AbCipTagPathTests (18 cases including multi-scope / multi-member / multi-subscript / bit-in-DINT / rejected shapes / underscore idents / round-trip), AbCipStatusMapperTests (12 CIP + 8 libplctag codes), AbCipDriverTests (IDriver lifecycle + multi-device init + malformed-address fault + per-family profile lookup + PlcTagHandle invalid/dispose idempotency + AbCipDataType mapping). Full solution builds 0 errors; 254 warnings are pre-existing xUnit1051 CancellationToken hints outside this PR. Solution file updated to include both new projects. Unblocks PR 3 (IReadable against ab_server) which is the first PR to exercise the native library end-to-end.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 15:58:15 -04:00
Joseph Doherty
4ab587707f AB CIP PR 1 — extract shared PollGroupEngine into Core.Abstractions so the AB CIP driver (and any other poll-based driver — S7, FOCAS, AB Legacy) can reuse the subscription loop instead of reimplementing it. Behaviour-preserving refactor of ModbusDriver: SubscriptionState + PollLoopAsync + PollOnceAsync + ModbusSubscriptionHandle lifted verbatim into a new PollGroupEngine class, ModbusDriver's ISubscribable surface now delegates Subscribe/Unsubscribe into the engine and ShutdownAsync calls engine DisposeAsync. Interval floor (100 ms default) becomes a PollGroupEngine constructor knob so per-driver tuning is possible without re-shipping the loop. Initial-data push semantics preserved via forceRaise=true on the first poll. Exception-tolerant loop preserved — reader throws are swallowed, loop continues, driver's health surface remains the single reporting path. Placement in Core.Abstractions (not Core) because driver projects only reference Core.Abstractions by convention (matches OpcUaClient / Modbus / S7 csproj shape); putting the engine in Core would drag EF Core + Serilog + Polly into every driver. Module has no new dependencies beyond System.Collections.Concurrent + System.Threading, so Core.Abstractions stays lightweight. Modbus ctor converted from primary to explicit so the engine field can capture this for the reader + on-change bridge. All 177 ModbusDriver.Tests pass unmodified (Modbus subscription suite, probe suite, cap suite, exception mapper, reconnect, TCP). 10 new direct engine tests in Core.Abstractions.Tests covering: initial force-raise, unchanged-value single-raise, change-between-polls, unsubscribe halts loop, interval-floor clamp, independent subscriptions, reader-exception tolerance, unknown-handle returns false, ActiveSubscriptionCount lifecycle, DisposeAsync cancels all. No changes to driver-specs.md nor to the server Hosting layer — engine is a pure internal building block at this stage. Unblocks AB CIP PR 7 (ISubscribable consumes the engine); also sets up S7 + FOCAS to drop their own poll loops when they re-base.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 15:34:44 -04:00
Joseph Doherty
ae8f226e45 Phase 6.1 Stream E.3 partial — in-flight counter feeds CurrentBulkheadDepth
Closes the observer half of #162 that was flagged as "persisted as 0 today"
in PR #105. The Admin /hosts column refresh + FleetStatusHub SignalR push
+ red-badge visual still belong to the visual-compliance pass.

Core.Resilience:
- DriverResilienceStatusTracker gains RecordCallStart + RecordCallComplete
  + CurrentInFlight field on the snapshot record. Concurrent-safe via the
  same ConcurrentDictionary.AddOrUpdate pattern as the other recorder methods.
  Clamps to zero on over-decrement so a stray Complete-without-Start can't
  drive the counter negative.
- CapabilityInvoker gains an optional statusTracker ctor parameter. When
  wired, every ExecuteAsync / ExecuteAsync(void) wraps the pipeline call
  in try / finally that records start/complete — so the counter advances
  cleanly whether the call succeeds, cancels, or throws. Null tracker keeps
  the pre-Phase-6.1 Stream E.3 behaviour exactly.

Server.Hosting:
- ResilienceStatusPublisherHostedService persists CurrentInFlight as the
  DriverInstanceResilienceStatus.CurrentBulkheadDepth column (was 0 before
  this PR). One-line fix on both the insert + update branches.

The in-flight counter is a pragmatic proxy for Polly's internal bulkhead
depth — a future PR wiring Polly telemetry would replace it with the real
value. The shape of the column + the publisher + the Admin /hosts query
doesn't change, so the follow-up is invisible to consumers.

Tests (8 new InFlightCounterTests, all pass):
- Start+Complete nets to zero.
- Nested starts sum; Complete decrements.
- Complete-without-Start clamps to zero.
- Different hosts track independently.
- Concurrent starts (500 parallel) don't lose count.
- CapabilityInvoker observed-mid-call depth == 1 during a pending call.
- CapabilityInvoker exception path still decrements (try/finally).
- CapabilityInvoker without tracker doesn't throw.

Full solution dotnet test: 1243 passing (was 1235, +8). Pre-existing
Client.CLI Subscribe flake unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 15:02:34 -04:00
Joseph Doherty
ad131932d3 Phase 6.4 Stream B.2-B.4 server-side — EquipmentImportBatch staging + FinaliseBatch transaction
Closes the server-side/data-layer piece of Phase 6.4 Stream B.2-B.4. The
CSV-import preview + modal UI (Stream B.3/B.5) still belongs to the Admin
UI follow-up — this PR owns the staging tables + atomic finalise alone.

Configuration:
- New EquipmentImportBatch entity (Id, ClusterId, CreatedBy, CreatedAtUtc,
  RowsStaged/Accepted/Rejected, FinalisedAtUtc?). Composite index on
  (CreatedBy, FinalisedAtUtc) powers the Admin preview modal's "my open
  batches" query.
- New EquipmentImportRow entity — one row per CSV row, 8 required columns
  from decision #117 + 9 optional from decision #139 + IsAccepted flag +
  RejectReason. FK to EquipmentImportBatch with cascade delete so
  DropBatch collapses the whole tree.
- EF migration 20260419_..._AddEquipmentImportBatch.
- SchemaComplianceTests expected tables list gains the two new tables.

Admin.Services.EquipmentImportBatchService:
- CreateBatchAsync — new header row, caller-supplied ClusterId + CreatedBy.
- StageRowsAsync(batchId, acceptedRows, rejectedRows) — bulk-inserts the
  parsed CSV rows into staging. Rejected rows carry LineNumberInFile +
  RejectReason for the preview modal. Throws when the batch is finalised.
- DropBatchAsync — removes batch + cascaded rows. Throws when the batch
  was already finalised (rollback via staging is not a time machine).
- FinaliseBatchAsync(batchId, generationId, driverInstanceId, unsLineId) —
  atomic apply. Opens an EF transaction when the provider supports it
  (SQL Server in prod; InMemory in tests skips the tx), bulk-inserts
  every accepted staging row into Equipment, stamps
  EquipmentImportBatch.FinalisedAtUtc, commits. Failure rolls back so
  Equipment never partially mutates. Idempotent-under-double-call:
  second finalise throws ImportBatchAlreadyFinalisedException.
- ListByUserAsync(createdBy, includeFinalised) — the Admin preview modal's
  backing query. OrderByDescending on CreatedAtUtc so the most-recent
  batch shows first.
- Two exception types: ImportBatchNotFoundException +
  ImportBatchAlreadyFinalisedException.

ExternalIdReservation merging (ZTag + SAPID fleet-wide uniqueness) is NOT
done here — a narrower follow-up wires it once the concurrent-insert test
matrix is green.

Tests (10 new EquipmentImportBatchServiceTests, all pass):
- CreateBatch populates Id + CreatedAtUtc + zero-ed counters.
- StageRows accepted + rejected both persist; counters advance.
- DropBatch cascades row delete.
- DropBatch after finalise throws.
- Finalise translates accepted staging rows → Equipment under the target
  GenerationId + DriverInstanceId + UnsLineId.
- Finalise twice throws.
- Finalise of unknown batch throws.
- Stage after finalise throws.
- ListByUserAsync filters by creator + finalised flag.
- Drop of unknown batch is a no-op (idempotent rollback).

Full solution dotnet test: 1235 passing (was 1225, +10). Pre-existing
Client.CLI Subscribe flake unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 14:55:39 -04:00
Joseph Doherty
016122841b Phase 6.1 Stream E.2 partial — ResilienceStatusPublisherHostedService persists tracker snapshots to DB
Closes the HostedService half of Phase 6.1 Stream E.2 flagged as a follow-up
when the DriverResilienceStatusTracker shipped in PR #82. The Admin /hosts
column refresh + SignalR push + red-badge visual (Stream E.3) remain
deferred to the visual-compliance pass — this PR owns the persistence
story alone.

Server.Hosting:
- ResilienceStatusPublisherHostedService : BackgroundService. Samples the
  DriverResilienceStatusTracker every TickInterval (default 5 s) and upserts
  each (DriverInstanceId, HostName) counter pair into
  DriverInstanceResilienceStatus via EF. New rows on first sight; in-place
  updates on subsequent ticks.
- PersistOnceAsync extracted public so tests drive one tick directly —
  matches the ScheduledRecycleHostedService pattern for deterministic
  timing.
- Best-effort persistence: a DB outage logs a warning + continues; the next
  tick retries. Never crashes the app on sample failure. Cancellation
  propagates through cleanly.
- Tracks the bulkhead depth / recycle / footprint columns the entity was
  designed for. CurrentBulkheadDepth currently persisted as 0 — the tracker
  doesn't yet expose live bulkhead depth; a narrower follow-up wires the
  Polly bulkhead-depth observer into the tracker.

Tests (6 new in ResilienceStatusPublisherHostedServiceTests):
- Empty tracker → tick is a no-op, zero rows written.
- Single-host counters → upsert a new row with ConsecutiveFailures + breaker
  timestamp + sampled timestamp.
- Second tick updates the existing row in place (not a second insert).
- Multi-host pairs persist independently.
- Footprint counters (Baseline + Current) round-trip.
- TickCount advances on every PersistOnceAsync call.

Full solution dotnet test: 1225 passing (was 1219, +6). Pre-existing
Client.CLI Subscribe flake unchanged.

Production wiring (Program.cs) example:
  builder.Services.AddSingleton<DriverResilienceStatusTracker>();
  builder.Services.AddHostedService<ResilienceStatusPublisherHostedService>();
  // Tracker gets wired into CapabilityInvoker via OtOpcUaServer resolution
  // + the existing Phase 6.1 layer.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 14:36:00 -04:00
Joseph Doherty
4de94fab0d Phase 6.1 Stream A remaining — IPerCallHostResolver + DriverNodeManager per-call host dispatch (decision #144)
Closes the per-device isolation gap flagged at the Phase 6.1 Stream A wire-up
(PR #78 used driver.DriverInstanceId as the pipeline host for every call, so
multi-host drivers like Modbus with N PLCs shared one pipeline — one dead PLC
poisoned sibling breakers). Decision #144 requires per-device isolation; this
PR wires it without breaking single-host drivers.

Core.Abstractions:
- IPerCallHostResolver interface. Optional driver capability. Drivers with
  multi-host topology (Modbus across N PLCs, AB CIP across a rack, etc.)
  implement this; single-host drivers (Galaxy, S7 against one PLC, OpcUaClient
  against one remote server) leave it alone. Must be fast + allocation-free
  — called once per tag on the hot path. Unknown refs return empty so dispatch
  falls back to single-host without throwing.

Server/OpcUa/DriverNodeManager:
- Captures `driver as IPerCallHostResolver` at construction alongside the
  existing capability casts.
- New `ResolveHostFor(fullReference)` helper returns either the resolver's
  answer or the driver's DriverInstanceId (single-host fallback). Empty /
  whitespace resolver output also falls back to DriverInstanceId.
- Every dispatch site now passes `ResolveHostFor(fullRef)` to the invoker
  instead of `_driver.DriverInstanceId` — OnReadValue, OnWriteValue, all four
  HistoryRead paths. The HistoryRead Events path tolerates fullRef=null and
  falls back to DriverInstanceId for those cluster-wide event queries.
- Drivers without IPerCallHostResolver observe zero behavioural change:
  every call still keys on DriverInstanceId, same as before.

Tests (4 new PerCallHostResolverDispatchTests, all pass):
- DeadPlc_DoesNotOpenBreaker_For_HealthyPlc_With_Resolver — 2 PLCs behind
  one driver; hammer the dead PLC past its breaker threshold; assert the
  healthy PLC's first call succeeds on its first attempt (decision #144).
- EmptyString / unknown-ref fallback behaviour documented via test.
- WithoutResolver_SameHost_Shares_One_Pipeline — regression guard for the
  single-host pre-existing behaviour.
- WithResolver_TwoHosts_Get_Two_Pipelines — builds the CachedPipelineCount
  assertion to confirm the shared-builder cache keys correctly.

Full solution dotnet test: 1219 passing (was 1215, +4). Pre-existing
Client.CLI Subscribe flake unchanged.

Adoption: Modbus driver (#120 follow-up), AB CIP / AB Legacy / TwinCAT
drivers (also #120) implement the interface and return the per-tag PLC host
string. Single-host drivers stay silent and pay zero cost.

Remaining sub-items of #160 still deferred:
- IAlarmSource.SubscribeAlarmsAsync + AcknowledgeAsync invoker wrapping.
  Non-trivial because alarm subscription is push-based from driver through
  IAlarmConditionSink — the wrap has to happen at the driver-to-server glue
  rather than a synchronous dispatch site.
- Roslyn analyzer asserting every capability-interface call routes through
  CapabilityInvoker. Substantial (separate analyzer project + test harness);
  noise-value ratio favors shipping this post-v2-GA once the coverage is
  known-stable.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 12:31:24 -04:00
Joseph Doherty
7b50118b68 Phase 6.1 Stream A follow-up — DriverInstance.ResilienceConfig JSON column + parser + OtOpcUaServer wire-in
Closes the Phase 6.1 Stream A.2 "per-instance overrides bound from
DriverInstance.ResilienceConfig JSON column" work flagged as a follow-up
when Stream A.1 shipped in PR #78. Every driver can now override its Polly
pipeline policy per instance instead of inheriting pure tier defaults.

Configuration:
- DriverInstance entity gains a nullable `ResilienceConfig` string column
  (nvarchar(max)) + SQL check constraint `CK_DriverInstance_ResilienceConfig_IsJson`
  that enforces ISJSON when not null. Null = use tier defaults (decision
  #143 / unchanged from pre-Phase-6.1).
- EF migration `20260419161008_AddDriverInstanceResilienceConfig`.
- SchemaComplianceTests expected-constraint list gains the new CK name.

Core.Resilience.DriverResilienceOptionsParser:
- Pure-function parser. ParseOrDefaults(tier, json, out diag) returns the
  effective DriverResilienceOptions — tier defaults with per-capability /
  bulkhead overrides layered on top when the JSON payload supplies them.
  Partial policies (e.g. Read { retryCount: 10 }) fill missing fields from
  the tier default for that capability.
- Malformed JSON falls back to pure tier defaults + surfaces a human-readable
  diagnostic via the out parameter. Callers log the diag but don't fail
  startup — a misconfigured ResilienceConfig must not brick a working
  driver.
- Property names + capability keys are case-insensitive; unrecognised
  capability names are logged-and-skipped; unrecognised shape-level keys
  are ignored so future shapes land without a migration.

Server wire-in:
- OtOpcUaServer gains two optional ctor params: `tierLookup` (driverType →
  DriverTier) + `resilienceConfigLookup` (driverInstanceId → JSON string).
  CreateMasterNodeManager now resolves tier + JSON for each driver, parses
  via DriverResilienceOptionsParser, logs the diagnostic if any, and
  constructs CapabilityInvoker with the merged options instead of pure
  Tier A defaults.
- OpcUaApplicationHost threads both lookups through. Default null keeps
  existing tests constructing without either Func unchanged (falls back
  to Tier A + tier defaults exactly as before).

Tests (13 new DriverResilienceOptionsParserTests):
- null / whitespace / empty-object JSON returns pure tier defaults.
- Malformed JSON falls back + surfaces diagnostic.
- Read override merged into tier defaults; other capabilities untouched.
- Partial policy fills missing fields from tier default.
- Bulkhead overrides honored.
- Unknown capability skipped + surfaced in diagnostic.
- Property names + capability keys are case-insensitive.
- Every tier × every capability × empty-JSON round-trips tier defaults
  exactly (theory).

Full solution dotnet test: 1215 passing (was 1202, +13). Pre-existing
Client.CLI Subscribe flake unchanged.

Production wiring (Program.cs) example:
  Func<string, DriverTier> tierLookup = type => type switch
  {
      "Galaxy" => DriverTier.C,
      "Modbus" or "S7" => DriverTier.B,
      "OpcUaClient" => DriverTier.A,
      _ => DriverTier.A,
  };
  Func<string, string?> cfgLookup = id =>
      db.DriverInstances.AsNoTracking().FirstOrDefault(x => x.DriverInstanceId == id)?.ResilienceConfig;
  var host = new OpcUaApplicationHost(..., tierLookup: tierLookup, resilienceConfigLookup: cfgLookup);

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 12:21:42 -04:00
Joseph Doherty
c1cab33e38 Phase 6.4 Stream D server-side — IdentificationFolderBuilder materializes OPC 40010 Machinery Identification sub-folder
Closes the server-side / non-UI piece of Phase 6.4 Stream D. The Razor
`IdentificationFields.razor` component for Admin-UI editing ships separately
when the Admin UI pass lands (still tracked under #157 UI follow-up).

Core.OpcUa additions:
- IdentificationFolderBuilder — pure-function builder that materializes the
  OPC 40010 Machinery companion-spec Identification sub-folder per decision
  #139. Reads the nine nullable columns off an Equipment row:
  Manufacturer, Model, SerialNumber, HardwareRevision, SoftwareRevision,
  YearOfConstruction (short → OPC UA Int32), AssetLocation, ManufacturerUri,
  DeviceManualUri. Emits one AddProperty call per non-null field; skips the
  sub-folder entirely when all nine are null so browse trees don't carry
  pointless empty folders.
- HasAnyFields(equipment) — cheap short-circuit so callers can decide
  whether to invoke Folder() at all.
- FolderName constant ("Identification") + FieldNames list exposed so
  downstream tools / tests can cross-reference without duplicating the
  decision-#139 field set.

ACL binding: the sub-folder + variables live under the Equipment node so
Phase 6.2's PermissionTrie treats them as part of the Equipment ScopeId —
no new scope level. A user with Equipment-level grant reads the
Identification fields; a user without gets BadUserAccessDenied on both the
Equipment node + its Identification variables. Documented in the class
remarks; cross-reference update to acl-design.md is a follow-up.

Tests (9 new IdentificationFolderBuilderTests):
- HasAnyFields all-null false / any-non-null true.
- Build all-null returns null + doesn't emit Folder.
- Build fully-populated emits all 9 fields in decision #139 order.
- Only non-null fields are emitted (3-of-9 case).
- YearOfConstruction short widens to DriverDataType.Int32 with int value.
- String values round-trip through AddProperty.
- FieldNames constant matches decision #139 exactly.
- FolderName is "Identification".

Full solution dotnet test: 1202 passing (was 1193, +9). Pre-existing
Client.CLI Subscribe flake unchanged.

Production integration: the component that consumes this is the
address-space-build flow that walks the live Equipment table + calls
IdentificationFolderBuilder.Build(equipmentFolder, equipment) under each
Equipment node. That integration is the remaining Stream D follow-up
alongside the Razor UI component.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 11:57:39 -04:00
Joseph Doherty
c4a92f424a Phase 6.1 Stream B.4 follow-up — ScheduledRecycleHostedService drives registered schedulers on a fixed tick
Turns the Phase 6.1 Stream B.4 pure-logic ScheduledRecycleScheduler (shipped
in PR #79) into a running background feature. A Tier C driver registers its
scheduler at startup; the hosted service ticks every TickInterval (default
1 min) and invokes TickAsync on each registered scheduler.

Server.Hosting:
- ScheduledRecycleHostedService : BackgroundService. AddScheduler(s) must be
  called before StartAsync — registering post-start throws
  InvalidOperationException to avoid "some ticks saw my scheduler, some
  didn't" races. ExecuteAsync loops on Task.Delay(TickInterval, _timeProvider,
  stoppingToken) + delegates to a public TickOnceAsync method for one tick.
- TickOnceAsync extracted as the unit-of-work so tests drive it directly
  without needing to synchronize with FakeTimeProvider + BackgroundService
  timing semantics.
- Exception isolation: if one scheduler throws, the loop logs + continues
  to the next scheduler. A flaky supervisor can't take down the tick for
  every other Tier C driver.
- Diagnostics: TickCount + SchedulerCount properties for tests + logs.

Tests (7 new ScheduledRecycleHostedServiceTests, all pass):
- TickOnce before interval doesn't fire; TickCount still advances.
- TickOnce at/after interval fires the underlying scheduler exactly once.
- Multiple ticks accumulate count.
- AddScheduler after StartAsync throws.
- Throwing scheduler doesn't poison its neighbours (logs + continues).
- SchedulerCount matches registrations.
- Empty scheduler list ticks cleanly (no-op + counter advances).

Full solution dotnet test: 1193 passing (was 1186, +7). Pre-existing
Client.CLI Subscribe flake unchanged.

Production wiring (Program.cs):
  builder.Services.AddSingleton<ScheduledRecycleHostedService>();
  builder.Services.AddHostedService(sp => sp.GetRequiredService<ScheduledRecycleHostedService>());
  // During DI configuration, once Tier C drivers + their ScheduledRecycleSchedulers
  // are resolved, call host.AddScheduler(scheduler) for each.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 11:42:08 -04:00
Joseph Doherty
c4824bea12 Phase 6.3 Stream C core — RedundancyStatePublisher + PeerReachability; orchestrates calculator inputs end-to-end
Wires the Phase 6.3 Stream B pure-logic pieces (ServiceLevelCalculator,
RecoveryStateManager, ApplyLeaseRegistry) + Stream A topology loader
(RedundancyCoordinator) into one orchestrator the runtime + OPC UA node
surface consume. The actual OPC UA variable-node plumbing (mapping
ServiceLevel Byte + ServerUriArray String[] onto the Opc.Ua.Server stack)
is narrower follow-up on top of this — the publisher emits change events
the OPC UA layer subscribes to.

Server.Redundancy additions:
- PeerReachability record + PeerReachabilityTracker — thread-safe
  per-peer-NodeId holder of the latest (HttpHealthy, UaHealthy) tuple. Probe
  loops (Stream B.1/B.2 runtime follow-up) write via Update; the publisher
  reads via Get. PeerReachability.FullyHealthy / Unknown sentinels for the
  two most-common states.
- RedundancyStatePublisher — pure orchestrator, no background timer, no OPC
  UA stack dep. ComputeAndPublish reads the 6 inputs + calls the calculator:
    * role (from coordinator.Current.SelfRole)
    * selfHealthy (caller-supplied Func<bool>)
    * peerHttpHealthy + peerUaHealthy (aggregate across all peers in
      coordinator.Current.Peers)
    * applyInProgress (ApplyLeaseRegistry.IsApplyInProgress)
    * recoveryDwellMet (RecoveryStateManager.IsDwellMet)
    * topologyValid (coordinator.IsTopologyValid)
    * operatorMaintenance (caller-supplied Func<bool>)
  Before-coordinator-init returns NoData=1 so clients never see an
  authoritative value from an un-bootstrapped server.
  OnStateChanged event fires edge-triggered when the byte changes;
  OnServerUriArrayChanged fires edge-triggered when the topology's self-first
  peer-sorted URI array content changes.
- ServiceLevelSnapshot record — per-tick output with Value + Band +
  Topology. The OPC UA layer's ServiceLevel Byte node subscribes to
  OnStateChanged; the ServerUriArray node subscribes to OnServerUriArrayChanged.

Tests (8 new RedundancyStatePublisherTests, all pass):
- Before-init returns NoData (Value=1, Band=NoData).
- Authoritative-Primary when healthy + peer fully reachable.
- Isolated-Primary (230) retains authority when peer unreachable — matches
  decision #154 non-promotion semantics.
- Mid-apply band dominates: open lease → Value=200 even with peer healthy.
- Self-unhealthy → NoData regardless of other inputs.
- OnStateChanged fires only on value transitions (edge-triggered).
- OnServerUriArrayChanged fires once per topology content change; repeat
  ticks with same topology don't re-emit.
- Standalone cluster treats healthy as AuthoritativePrimary=255.

Microsoft.EntityFrameworkCore.InMemory 10.0.0 added to Server.Tests for the
coordinator-backed publisher tests.

Full solution dotnet test: 1186 passing (was 1178, +8). Pre-existing
Client.CLI Subscribe flake unchanged.

Closes the core of release blocker #3 — the pure-logic + orchestration
layer now exists + is unit-tested. Remaining Stream C surfaces: OPC UA
ServiceLevel Byte variable wiring (binds to OnStateChanged), ServerUriArray
String[] wiring (binds to OnServerUriArrayChanged), RedundancySupport
static from RedundancyMode. Those touch the OPC UA stack directly + land
as Stream C.2 follow-up.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 11:31:50 -04:00
Joseph Doherty
84fe88fadb Phase 6.3 Stream A — RedundancyTopology + ClusterTopologyLoader + RedundancyCoordinator
Lands the data path that feeds the Phase 6.3 ServiceLevelCalculator shipped in
PR #89. OPC UA node wiring (ServiceLevel variable + ServerUriArray +
RedundancySupport) still deferred to task #147; peer-probe loops (Stream B.1/B.2
runtime layer beyond the calculator logic) deferred.

Server.Redundancy additions:
- RedundancyTopology record — immutable snapshot (ClusterId, SelfNodeId,
  SelfRole, Mode, Peers[], SelfApplicationUri). ServerUriArray() emits the
  OPC UA Part 4 §6.6.2.2 shape (self first, peers lexicographically by
  NodeId). RedundancyPeer record with per-peer Host/OpcUaPort/DashboardPort/
  ApplicationUri so the follow-up peer-probe loops know where to probe.
- ClusterTopologyLoader — pure fn from ServerCluster + ClusterNode[] to
  RedundancyTopology. Enforces Phase 6.3 Stream A.1 invariants:
    * At least one node per cluster.
    * At most 2 nodes (decision #83, v2.0 cap).
    * Every node belongs to the target cluster.
    * Unique ApplicationUri across the cluster (OPC UA Part 4 trust pin,
      decision #86).
    * At most 1 Primary per cluster in Warm/Hot modes (decision #84).
    * Self NodeId must be a member of the cluster.
  Violations throw InvalidTopologyException with a decision-ID-tagged message
  so operators know which invariant + what to fix.
- RedundancyCoordinator singleton — holds the current topology + IsTopologyValid
  flag. InitializeAsync throws on invariant violation (startup fails fast).
  RefreshAsync logs + flips IsTopologyValid=false (runtime won't tear down a
  running server; ServiceLevelCalculator falls to InvalidTopology band = 2
  which surfaces the problem to clients without crashing). CAS-style swap
  via Volatile.Write so readers always see a coherent snapshot.

Tests (10 new ClusterTopologyLoaderTests):
- Single-node standalone loads + empty peer list.
- Two-node cluster loads self + peer.
- ServerUriArray puts self first + peers sort lexicographically.
- Empty-nodes throws.
- Self-not-in-cluster throws.
- Three-node cluster rejected with decision #83 message.
- Duplicate ApplicationUri rejected with decision #86 shape reference.
- Two Primaries in Warm mode rejected (decision #84 + runtime-band reference).
- Cross-cluster node rejected.
- None-mode allows any role mix (standalone clusters don't enforce Primary count).

Full solution dotnet test: 1178 passing (was 1168, +10). Pre-existing
Client.CLI Subscribe flake unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 11:24:14 -04:00