feat/scripted-alarm-shelve-routing #421

Closed
dohertj2 wants to merge 0 commits from feat/scripted-alarm-shelve-routing into master
Owner
No description provided.
dohertj2 added 161 commits 2026-05-22 14:19:55 -04:00
OneShotShelve / TimedShelve / Unshelve now reach the ScriptedAlarmEngine.
Scripted-alarm condition nodes get a ShelvedStateMachine subtree created
before alarm.Create so the stack wires each shelve method's dispatch
handler; AlarmConditionState.OnShelve / OnTimedUnshelve route to the
engine and mirror the result onto the OPC UA node via SetShelvingState.

The three per-instance shelve method NodeIds are indexed so the Call gate
resolves them to OpcUaOperation.AlarmShelve instead of falling through to
generic Call. Engine dispatch is split into the node-free InvokeEngineShelve
so the routing decision is unit-testable.

Adds 9 unit tests; updates phase-7-status.md Gap 1 (only AddComment remains
unwired) and the #24 entry in looseends.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
RouteScriptedAlarmMethodCalls now handles ConditionType.AddComment
alongside Acknowledge/Confirm, dispatching to engine.AddCommentAsync.
An empty comment is rejected by the Part 9 state machine and surfaced
as BadInvalidArgument. MapCallOperation gates AddComment at the
AlarmAcknowledge tier — there is no dedicated AddComment permission bit.

Closes phase-7-status.md Gap 1: all Part 9 alarm methods now route to
the engine. Adds 3 unit tests.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
SdkAlarmHistorianWriteBackend.WriteBatchAsync replaces the RetryPlease
placeholder with the real entry point — HistorianAccess.AddStreamedValue
(HistorianEvent, out HistorianAccessError) in aahClientManaged, pinned by
decompiling the installed SDK.

The write path opens its own ReadOnly=false connection: the query-side
HistorianDataSource opens ReadOnly sessions and AddStreamedValue fails on
those with WriteToReadOnlyFile. IHistorianConnectionFactory gains a readOnly
parameter (default true, query path unchanged); BuildConnectionArgs is
extracted as a pure helper. HistorianClusterEndpointPicker is shared for
node failover; connection-class errors abort the batch as RetryPlease and
reset the connection, malformed-input codes map to PermanentFail.

Tests: connection-unavailable batch deferral, ClassifyOutcome error-code
table, BuildConnectionArgs read-vs-write shaping (80 pass, 2 rig-skipped).
Live_* round-trip tests stay Skip-gated for the D.1 rollout smoke.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The mxaccessgw updated alarms to a session-less central monitor:
AcknowledgeAlarm dropped SessionId and alarm transitions now come from
the session-less StreamAlarms feed instead of the per-session worker
StreamEvents stream. The GalaxyDriver no longer compiled against the
updated client.

- GatewayGalaxyAlarmAcknowledger: session-less rewrite — no GalaxyMxSession;
  outcome read from ProtocolStatus (throw) and Hresult (warn).
- New IGalaxyAlarmFeed seam + GatewayGalaxyAlarmFeed: background consumer
  of StreamAlarms that decodes the active-alarm snapshot plus live
  transitions into GalaxyAlarmTransition and reopens the stream on
  transport faults.
- EventPump: drop the dead per-session OnAlarmTransition path; the
  per-session stream no longer carries alarms.
- GalaxyDriver: bridge the feed onto IAlarmSource.OnAlarmEvent; the feed
  starts on SubscribeAlarmsAsync, independent of data subscriptions.
- Tests: replace EventPumpAlarmTests with GatewayGalaxyAlarmFeedTests;
  move the driver alarm-source tests onto the IGalaxyAlarmFeed seam.

Browse needed no change — GatewayGalaxyHierarchySource consumes the
unchanged DiscoverHierarchy contract.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adapts the code-review procedure, folder layout, template, and tooling
from the sibling mxaccessgw repo to lmxopcua.

- REVIEW-PROCESS.md: per-module review workflow — a module is one src/
  or tests/ project (ZB.MOM.WW.OtOpcUa. prefix stripped); 10-category
  checklist; finding IDs/severities/statuses; re-review rules.
- code-reviews/_template/findings.md: per-module findings template.
- code-reviews/regen-readme.py: generates the cross-module README.md
  index from the per-module findings.md files; --check gates staleness
  and consistency.
- code-reviews/test_regen_readme.py: dependency-free generator tests.
- code-reviews/prompt.md: orchestration prompt for clearing the backlog.
- code-reviews/README.md: generated index (no modules reviewed yet).
- scripts/check-code-reviews-readme.ps1: CI / pre-commit check wrapper.

Adapted to this repo: ZB.MOM.WW.OtOpcUa module naming, OtOpcUa
conventions checklist (in-process GalaxyDriver + mxaccessgw,
contained-name vs tag-name, ACL at DriverNodeManager), single .NET
solution build/test commands, and the lmxopcua design docs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Reviewed all 31 src/ production projects against the 10-category
checklist in REVIEW-PROCESS.md. Each module gets its own findings.md;
code-reviews/README.md is regenerated from them.

334 findings: 6 Critical, 46 High, 126 Medium, 156 Low.

Critical findings:
- Server-001: WriteNodeIdUnknown recurses unconditionally — a HistoryRead
  on an unresolvable node crashes the process (remote DoS).
- Admin-001/002: app-wide auth bypass (RouteView not AuthorizeRouteView)
  plus unauthenticated mutating routes.
- Core.Scripting-001: System.Environment reachable from operator scripts;
  Environment.Exit() terminates the server.
- Core.AlarmHistorian-001: rowIds/events parallel-list desync on a corrupt
  payload misapplies outcomes — silent alarm-event data loss.
- Driver.Galaxy-001: ReconnectSupervisor is built but never triggered, so
  a transient gateway drop permanently kills the event stream.

All findings are Status=Open; resolution is tracked per REVIEW-PROCESS.md
section 4. Review only — no source code changed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
WriteNodeIdUnknown called itself unconditionally as its first statement
— unbounded recursion with no base case → StackOverflowException, an
uncatchable process crash reachable by any client issuing a HistoryRead
on an unresolvable NodeId (remote DoS).

Replace the self-call with the result-slot assignment, mirroring
WriteUnsupported / WriteInternalError. The helper is now internal so the
regression test can pin the StatusCode without a server fixture.

Resolves code-review finding Server-001 (Critical).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Admin-001: Routes.razor used a plain RouteView, so the page-level
[Authorize] attributes on 11 pages were inert — every page, including
mutating ones, was reachable fully unauthenticated.
Admin-002: several pages (e.g. NewCluster, which writes config rows)
carried no auth attribute at all.

- Routes.razor: RouteView → AuthorizeRouteView with NotAuthorized /
  Authorizing slots; add RedirectToLogin component.
- Program.cs: SetFallbackPolicy(RequireAuthenticatedUser) — secure by
  default for new pages/endpoints.
- Login.razor: [AllowAnonymous] so login stays reachable; login page,
  /auth/* endpoints and static assets remain anonymous.
- Add [Authorize] to the previously un-gated pages; NewCluster gated to
  the CanPublish (FleetAdmin) policy.

Regression tests in PageAuthorizationTests pin that anonymous requests
to protected/mutating routes are rejected and that login + static
assets stay anonymously reachable. Admin test suite: 210/210 pass.

Resolves code-review findings Admin-001 and Admin-002 (Critical).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ForbiddenTypeAnalyzer used only a namespace-prefix deny-list. System.Environment,
System.AppDomain, System.GC and System.Activator live directly in the System
namespace, which must stay allowed for primitives (Math, String, ...), so they
were never caught — an operator-authored predicate could call
System.Environment.Exit(0) and terminate the in-process OPC UA server.

Add a type-granular deny-list (ForbiddenFullTypeNames) checked by
fully-qualified type name after the namespace-prefix check; legitimate System
types are unaffected.

Regression tests assert scripts referencing Environment/AppDomain/GC/Activator
are rejected at analysis time. Core.Scripting suite: 68/68 pass.

Resolves code-review finding Core.Scripting-001 (Critical).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ReadBatch built parallel rowIds / events lists: rowIds.Add ran for every
row but events.Add was guarded by `if (evt is not null)`. A corrupt /
null-deserializing payload desynced the lists, so DrainOnceAsync applied
each outcome to the wrong RowId — an Ack could delete an un-sent event
(silent alarm-event data loss) and the corrupt row stalled the queue
head forever.

ReadBatch now returns a single list of QueueRow(long RowId,
AlarmHistorianEvent? Event) records so a rowId can never drift from its
event; deserialization is wrapped to yield null on JsonException.
DrainOnceAsync immediately dead-letters rows whose payload is
null/un-deserializable and forwards only well-formed events to the
writer, mapping outcomes by RowId.

Regression tests cover a corrupt row mid-batch and at the queue head.
Core.AlarmHistorian suite: 16/16 pass.

Resolves code-review finding Core.AlarmHistorian-001 (Critical).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The ReconnectSupervisor was constructed but its trigger
ReportTransportFailure was never called. When the gateway StreamEvents
stream faulted, EventPump just logged and exited — the supervisor was
never notified, so a transient gateway drop permanently stopped
data-change notifications while GetHealth() still reported Healthy.

EventPump gains an optional onStreamFault callback invoked from its
stream-fault catch block (not on clean shutdown). GalaxyDriver wires it
to ReconnectSupervisor.ReportTransportFailure so a transport drop drives
reopen → replay.

This is the minimal fix for -001; the pump-restart-on-reopen gap remains
tracked as Driver.Galaxy-008. Regression tests cover the callback being
invoked on fault, the end-to-end supervisor reopen/replay, and that a
clean shutdown does not fire it. Driver.Galaxy suite: 206/206 pass.

Resolves code-review finding Driver.Galaxy-001 (Critical).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The ForbiddenTypeAnalyzer syntax walker only inspected four node kinds
(ObjectCreation, Invocation-with-member-access, MemberAccess, bare
Identifier), so a forbidden type named through typeof, a generic type
argument, a cast, an is/as type pattern, default(T), an array-creation
element type, or an explicitly-typed local declaration produced no
examined node and bypassed the sandbox check.

Analyze now runs a second pass that resolves GetTypeInfo on every
TypeSyntax node and recursively unwraps array element types and generic
type arguments, so forbidden types nested at any depth are rejected at
compile. The original member/call node-kind switch is kept deliberately
narrow (rather than resolving GetSymbolInfo on every node) to avoid
flagging harmless inherited members such as typeof(int).Name, whose Name
property is declared by System.Reflection.MemberInfo. A span+type dedupe
keeps the two passes from emitting duplicate rejections.

Regression tests added in ScriptSandboxTests cover typeof, generic type
arguments, casts, default(T), is/as patterns, array element types, and
typed local declarations with forbidden types, plus over-block guards
asserting allowed generics and typeof still compile.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Server-002 — AuthorizationGate lax-mode no longer overrides explicit deny.
IsAllowed now switches on the evaluator's AuthorizationVerdict: Allow -> true,
Denied (an authored deny rule matched) -> false in BOTH strict and lax mode,
and only the indeterminate NotGranted case falls through to !_strictMode.
Previously `if (decision.IsAllowed) return true; return !_strictMode;` let lax
mode (the default) nullify authored NodeAcl deny rules for fully-resolved
sessions. The tri-state AuthorizationVerdict.Denied member is now honoured.

Server-009 — LDAP is secure-by-default. LdapOptions.AllowInsecureLdap now
defaults to false (was true) and Program.cs's config fallback reads `?? false`
(was `?? true`), so an LDAP-enabled deployment will not bind credentials over
an unencrypted socket unless an operator explicitly opts in. Program.cs also
logs a startup warning when LDAP is enabled with UseTls=false and
AllowInsecureLdap=true, flagging the clear-text server->LDAP credential hop.

Regression tests: AuthorizationGateTests covers all four verdict x mode
combinations via a fixed-verdict evaluator stub; new LdapOptionsTests asserts
the secure defaults. Both Server and Server.Tests build clean; the 15 targeted
tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Configuration-001: wrap the EXEC dbo.sp_ValidateDraft call in
sp_PublishGeneration in a BEGIN TRY/CATCH ROLLBACK; THROW block so a
validation RAISERROR aborts the publish instead of being ignored.

Configuration-008: route caller-supplied strings interpolated into
ConfigAuditLog.DetailsJson through STRING_ESCAPE(@x, 'json') and emit
sp_RollbackToGeneration's @TargetGenerationId as a bare JSON number,
closing the JSON-injection / denial-of-operation vector.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Core-001: swap the authorization-cache defaults so
MembershipFreshnessInterval (5 min, inner re-resolve trigger) is
strictly less than AuthCacheMaxStaleness (15 min, fail-closed
ceiling), so NeedsRefresh's warm-refresh path is reachable.

Core-002: TriePermissionEvaluator.Authorize now compares the trie's
GenerationId against the session's AuthGenerationId and re-fetches the
session's bound generation on mismatch, failing closed when that
generation has been pruned.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Admin-003 — SignalR hubs were anonymously reachable: an unauthenticated
client could open /hubs/fleet, /hubs/alerts and /hubs/script-log and
stream fleet state, alert detail text and server script-log contents.
Added [Authorize] to FleetStatusHub, AlertHub and ScriptLogHub, and
chained .RequireAuthorization() onto all three MapHub() calls as a
belt-and-braces backstop.

Admin-004 — appsettings.json committed live-looking secrets (the `sa`
ConfigDb password and the LDAP ServiceAccountPassword) in plaintext.
Replaced both with empty placeholders sourced from user-secrets (dev) or
the ConnectionStrings__ConfigDb / Authentication__Ldap__ServiceAccountPassword
environment variables (prod); added a UserSecretsId to the Admin csproj
and a fail-fast guard in Program.cs when ConfigDb is empty/missing.

Admin-005 — Login.razor performed SignInAsync from an interactive Blazor
circuit, where the original HTTP response has long completed so the auth
cookie was not emitted. Rewrote it as a static-rendered plain HTML form
(data-enhance="false") posting to a new AuthEndpoints.MapAuthEndpoints()
minimal-API handler (/auth/login, /auth/logout) that does the LDAP bind,
grant resolution, cookie SignInAsync and redirect while the endpoint
still owns the response. Includes an open-redirect guard on returnUrl.

Added xUnit + Shouldly regression tests: AuthEndpointsTests (login cookie
issuance, failed-bind redirect, open-redirect rejection, logout, anonymous
hub negotiate rejection) and AppSettingsSecretHygieneTests (no committed
secrets). All 26 auth-related tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Client.Shared-005: _activeDataSubscriptions (a plain Dictionary) and the
_activeAlarmSubscription tuple were mutated from the caller thread, the
keep-alive failover path, and DisconnectAsync with no synchronization,
risking bucket corrosion / InvalidOperationException / lost entries.
Added a dedicated _subscriptionLock and wrapped every read/write of that
bookkeeping state inside it (Subscribe/Unsubscribe[Alarms]Async,
Disconnect, Dispose, and the snapshot/clear/re-record steps of
ReplaySubscriptionsAsync). Awaited adapter calls stay outside the lock so
it is never held across I/O.

Client.Shared-006: HandleKeepAliveFailureAsync had only a non-atomic
state check guarding re-entry, so two bad keep-alives could each start a
failover loop, racing to dispose/replace _session and double-replaying
subscriptions. It now claims an atomic _failoverInProgress slot via
Interlocked.CompareExchange; a re-entrant call returns immediately. The
loop body moved to RunFailoverAsync, wrapped in try/finally that resets
the flag.

Tests: added KeepAliveFailure_ReentrantWhileFailoverInFlight_RunsFailoverOnce
and SubscribeAndUnsubscribe_ConcurrentCalls_DoNotCorruptState regression
tests; made the FakeSubscriptionAdapter / FakeSessionAdapter /
FakeSessionFactory test doubles thread-safe (and added a CreateGate hook)
so the concurrency tests exercise production locking rather than fake
state. All 138 Client.Shared tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
SnapshotFormatter.FormatStatus mapped four OPC UA status names to
incorrect numeric codes, mislabelling operator-facing CLI output. The
codes were corrected to their canonical OPC Foundation
Opc.Ua.StatusCodes values:

  BadTimeout                0x80060000 -> 0x800A0000
  BadNoCommunication        0x80070000 -> 0x80310000
  BadWaitingForInitialData  0x80080000 -> 0x80320000
  BadNodeIdInvalid          0x80350000 -> 0x80330000

The Cli.Common project does not reference the Opc.Ua package (only
Core.Abstractions / CliFx / Serilog), so the hex literals were
corrected in place with a sync note rather than adding a heavy new
dependency.

SnapshotFormatterTests was updated: the [Theory] expectations now use
the correct spec codes and assert the full rendered form, plus a new
regression [Theory] confirms the pre-fix wrong names no longer apply.
All 24 tests pass.

findings.md: Driver.Cli.Common-001 set to Resolved; open count 6 -> 5.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Core.AlarmHistorian-002 — drain loop now honors exponential backoff:
StartDrainLoop arms a self-rescheduling one-shot Timer. RescheduleDrain
sets the next due-time to max(tickInterval, CurrentBackoff) while the
sink is BackingOff, so a historian outage genuinely slows the cadence
down the 1s->2s->5s->15s->60s ladder instead of hammering at the fixed
tick. Class doc-comment updated.

Core.AlarmHistorian-004 — SQLite busy handling: the connection string
is built via SqliteConnectionStringBuilder with DefaultTimeout=5, and a
new OpenConnection helper applies PRAGMA busy_timeout=5000 and
PRAGMA journal_mode=WAL on every open. A concurrent enqueue-vs-drain
file-lock collision now waits the lock out instead of failing fast with
SQLITE_BUSY. All connection open sites switched to the helper.

Core.AlarmHistorian-006 — drain-loop faults are no longer unobserved:
the timer callback (DrainTimerCallback) awaits DrainOnceAsync inside a
try/catch that logs via _logger.Error, records the message into
_lastError, and sets _drainState=BackingOff so a stalled drain is
visible on GetStatus; a finally always re-arms the timer.

Regression tests added to SqliteStoreAndForwardSinkTests:
StartDrainLoop_honors_backoff_and_slows_cadence_under_retry,
StartDrainLoop_keeps_steady_cadence_when_writer_is_healthy,
StartDrainLoop_records_drain_fault_and_keeps_running,
Concurrent_enqueue_and_drain_do_not_throw_sqlite_busy.

findings.md: 002/004/006 marked Resolved; open count 10 -> 7.

Build: clean (0 warnings). Tests: 20/20 passing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
_alarms was a plain Dictionary<string, AlarmState> mutated under the
_evalGate semaphore, but four read paths (GetState, GetAllStates, the
LoadedAlarmIds property, and RunShelvingCheck) touched it from arbitrary
threads with no synchronisation. A Dictionary read concurrent with a
writer's entry reassignment can throw InvalidOperationException or return
torn state.

Switched _alarms to ConcurrentDictionary<string, AlarmState>. The only
write shapes are indexer-set and Clear, both atomic on ConcurrentDictionary,
so all mutations stay correct without further change; reads now get safe
snapshot semantics. LoadedAlarmIds materialises the key snapshot to keep
its IReadOnlyCollection<string> return type. This matches _valueCache,
which is already a ConcurrentDictionary.

Added a regression test (Concurrent_reads_during_mutation_do_not_throw)
that hammers the engine with state mutations while four reader threads
continuously call the three unguarded read paths.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
OnScriptSetVirtualTag updated the value cache, notified observers, and
recorded history for the written path but never scheduled a cascade for
tags depending on that path. This contradicts docs/VirtualTags.md, which
states ctx.SetVirtualTag writes "still participate in change-trigger
cascades": a change-triggered virtual tag reading a script-written tag
went stale until an unrelated trigger fired.

OnScriptSetVirtualTag now launches a fire-and-forget CascadeAsync for the
written path, mirroring OnUpstreamChange. The cascade is scheduled rather
than invoked inline because the callback runs inside EvaluateInternalAsync
while the non-reentrant _evalGate semaphore is held.

Added regression test
SetVirtualTag_within_script_cascades_to_dependents_of_the_written_tag.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Driver.TwinCAT-001 — InitializeAsync/ReinitializeAsync ignored driverConfigJson.
Extracted the DTO-to-options parse into a shared TwinCATDriverFactoryExtensions.ParseOptions;
InitializeAsync now re-parses driverConfigJson into a mutable _options field, so a config
generation pushed via ReinitializeAsync (added/removed devices, tags, probe settings) is
actually applied at runtime.

Driver.TwinCAT-002 — LInt/ULInt narrowed to Int32. ToDriverDataType now maps LInt to Int64,
ULInt to UInt64, UDInt to UInt32, UInt/USInt to UInt16, Int/SInt to Int16, and the IEC
TIME/DATE/DT/TOD types to UInt32 (their raw UDINT counter). Removed the stale "Int64 gap"
comment — no truncation or sign flips at the OPC UA encode layer.

Driver.TwinCAT-007 — EnsureConnectedAsync was not thread-safe. Connect/reconnect is now
serialized per device by a SemaphoreSlim (DeviceState.ConnectGate) with a double-checked
connect, mirroring the S7 driver. Concurrent read/write/probe callers can no longer leak a
client or race a create-vs-dispose.

Driver.TwinCAT-008 — native ADS notification callbacks ran driver logic on the AMS router
thread. AdsTwinCATClient now enqueues AdsNotificationEx callbacks onto a bounded Channel
drained by a dedicated managed task; the router-thread callback only does a non-blocking
TryWrite, so a slow consumer cannot stall ADS notification delivery process-wide.

Driver.TwinCAT-013 — TwinCATDriver did not implement IRediscoverable. The driver now
implements IRediscoverable; AdsTwinCATClient detects ADS 0x0702 (symbol-version-changed) on
read/write paths and raises OnSymbolVersionChanged, which the driver forwards as
OnRediscoveryNeeded so Core rebuilds the address space after a PLC program re-download.

Adds TwinCATHighFindingsRegressionTests covering all five fixes; updates the data-type
mapping assertion in TwinCATDriverTests. TwinCAT driver builds clean; 119 tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Driver.AbCip-001 — ReinitializeAsync silently discarded its config JSON.
Extracted AbCipDriverFactoryExtensions.ParseOptions; InitializeAsync now
re-parses a content-bearing driverConfigJson and replaces _options (and
recreates the alarm projection), so a reinitialize with a changed config
(new device/tag, changed timeout) actually takes effect. A blank or
empty-object JSON keeps construction-time options for the unit-test seam.

Driver.AbCip-002 — libplctag status mapping used wrong integer constants.
MapLibplctagStatus now switches on the libplctag.NET Status enum members
(Ok/Pending/ErrorTimeout/ErrorNotFound/ErrorNotAllowed/ErrorOutOfBounds/…)
instead of hand-typed natives, so timeout/not-found/not-allowed/out-of-bounds
get their specific OPC UA codes instead of all collapsing to
BadCommunicationError. The int overload casts to Status to stay correct
against the wrapper's contiguous renumbering.

Driver.AbCip-003 — whole-UDT reads decoded members at declaration-order
offsets, which Studio 5000 does not guarantee. Added the opt-in
AbCipDriverOptions.EnableDeclarationOnlyUdtGrouping flag (default false);
AbCipUdtReadPlanner.Build forms no groups when it is off, so by default
every UDT member reads per-tag rather than at possibly-wrong offsets.

Driver.AbCip-008 — probe loops were fire-and-forget and ShutdownAsync raced
them. Each probe Task is stored on DeviceState.ProbeTask; ShutdownAsync now
cancels every CTS, awaits each probe Task (10s timeout), then disposes the
CTS and handles. DeviceState.Runtimes/ParentRuntimes are ConcurrentDictionary
and the Ensure*RuntimeAsync paths use TryAdd, disposing the losing concurrent
creator instead of leaking a native tag handle.

Adds AbCipDriverCodeReviewRegressionTests and updates existing AbCip tests
to the corrected status constants + opt-in grouping flag. AbCip driver +
test project build clean; all 244 AbCip tests pass. (The full-solution
build has pre-existing, unrelated Driver.Galaxy protobuf-generation errors
in this worktree.)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Driver.AbLegacy-001 — PCCC bit-index range. AbLegacyAddress.TryParse
accepted a bit index of 0..31 for every file type, but a 16-bit
N/B/I/O/S/A word only has bits 0..15. TryParse now range-checks the
bit index against the file's word width (0..15 for 16-bit element
files, 0..31 for the 32-bit L file, no bits on float files), so
addresses like N7:0/20 are rejected at parse time instead of silently
truncating in the (short) cast. WriteBitInWordAsync reads and writes
an L-file parent word as 32-bit Long and masks the RMW arithmetic to
the native width, so a sign-extended 16-bit decode can no longer
corrupt the high bits.

Driver.AbLegacy-006 — shared-runtime concurrency. A per-tag libplctag
Tag handle is cached and reused by both the server read path and the
poll loop, with no synchronisation around Read/GetStatus/DecodeValue.
Added a per-runtime SemaphoreSlim (DeviceState.GetRuntimeLock, keyed
by tag name); ReadAsync and WriteAsync now hold it across the whole
Read -> GetStatus -> Decode / Encode -> Write -> GetStatus sequence so
no two threads touch the same Tag handle concurrently.

Added xUnit + Shouldly regression coverage: AbLegacyBitIndexRangeTests
(per-file bit-range validation + L-file 32-bit RMW + sign-extension
safety) and AbLegacyRuntimeConcurrencyTests (overlap-detecting fake
proving concurrent read/read and read/write are serialised).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Driver.S7-001: Timer (T{n}) / Counter (C{n}) addresses parsed cleanly but
the read path had no S7DataType or decode case for them, so a Timer/Counter
tag passed fail-fast init and then threw a misleading type-mismatch on every
read. InitializeAsync now runs RejectUnsupportedTagAddresses, throwing a clear
NotSupportedException ("not yet supported", echoing tag name + address) so the
config error fails fast at init.

Driver.S7-006: ShutdownAsync cancelled the probe/poll CTSs but did not await
the fire-and-forget loop tasks before DisposeAsync disposed _gate, letting a
loop iteration mid-semaphore race a disposed object. The probe task is now
tracked in _probeTask and each poll task in SubscriptionState.PollTask;
ShutdownAsync cancels every CTS, awaits Task.WhenAll of those handles with a
bounded 5 s DrainTimeout, then disposes the CTSs and gate. Task.Run is passed
CancellationToken.None so the handle is always awaitable.

Driver.S7-007: a PUT/GET-disabled fault (permanent misconfiguration) was
mapped identically to a transient PlcException — both BadDeviceFailure +
Degraded. ReadAsync/WriteAsync now split the catch via an IsAccessDenied
filter (S7.Net exposes no typed code for AccessingObjectNotAllowed, so the
inner-exception chain is inspected for the "not allowed" marker). Access-denied
now maps to BadNotSupported and Faulted with a config-alert message pointing
at the TIA Portal PUT/GET toggle; genuine device faults stay BadDeviceFailure.

Driver.S7-011: S7Driver ignored driverConfigJson on Initialize/Reinitialize,
so a config change delivered through ReinitializeAsync (the only Core-initiated
in-process recovery path) was silently discarded. Config parsing was factored
into S7DriverFactoryExtensions.ParseOptions; InitializeAsync now re-parses
driverConfigJson and rebuilds _options whenever the document has a real body.
An empty / placeholder document keeps the constructor options.

Adds S7DriverCodeReviewFixTests covering Timer/Counter rejection, config-json
application on Initialize/Reinitialize, and shutdown-drain with active
subscriptions. All 68 S7 driver tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Driver.OpcUaClient-001 — ReadAsync/WriteAsync/DiscoverAsync captured the
session before acquiring _gate, so a reconnect that completed while the
operation was blocked on the gate left the wire call bound to a stale,
closed session. All three now re-read Session (and parse NodeIds) inside
the _gate critical section after WaitAsync returns.

Driver.OpcUaClient-002 — OnReconnectComplete ignored the give-up (null
session) case, permanently wedging the driver with no Faulted signal and
no reconnect loop. The give-up branch now transitions HostState to
Faulted, sets a Faulted DriverHealth with an explanatory message, and
re-arms a fresh SessionReconnectHandler (TryRearmReconnect) against the
last-known session so an always-on gateway self-heals.

Driver.OpcUaClient-003 — BrowseRecursiveAsync discarded browse
continuation points, silently truncating large remote folders.
It now loops on BrowseResult.ContinuationPoint calling BrowseNextAsync
and appending each page until the continuation point is empty.

Driver.OpcUaClient-004 — driver-specs.md §8 namespace handling was
absent. Added NamespaceMap (built from session.NamespaceUris at connect,
rebuilt on reconnect) which persists discovered NodeIds in the
server-stable nsu=<uri>;... form; reads/writes re-resolve that form
against the current session so a remote namespace-table reorder no
longer misaddresses nodes. Added the TargetNamespaceKind option +
UnsMappingTable and ValidateNamespaceKind startup enforcement.

Driver.OpcUaClient-005 — OnKeepAlive read/wrote _reconnectHandler
without a lock, racing the SDK keep-alive timer thread and leaking
handlers. The check-and-set in OnKeepAlive, the take-and-clear in
ShutdownAsync, and the dispose/re-arm in OnReconnectComplete now all
run inside the _probeLock critical section.

Adds OpcUaClientNamespaceTests (11 xUnit + Shouldly regression tests)
covering ValidateNamespaceKind and the NamespaceMap stable encoding.
Reconnect/browse wire paths remain fixture-gated per finding -015.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Driver.FOCAS-001: FocasDriverConfigDto exposed no FixedTree / AlarmProjection /
HandleRecycle sections, so a deployment that opted into those features per
docs/drivers/FOCAS.md had the sections silently dropped by case-insensitive
JSON parsing and the features stayed at their disabled defaults. Added
FocasFixedTreeDto / FocasAlarmProjectionDto / FocasHandleRecycleDto and Build*
mappers in CreateInstance that populate the matching FocasDriverOptions
properties; a missing section or field keeps its existing default.

Driver.FOCAS-002: the fixed-tree bootstrap probe classified ProgramInfo as
"supported" whenever GetProgramInfoAsync returned non-null, but WireFocasClient
.GetProgramInfoAsync substituted defaults instead of throwing on a FOCAS error
return, so a CNC series answering EW_FUNC/EW_NOOPT for cnc_exeprgname2 /
cnc_rdopmode still got the Program/ and OperationMode/ subtrees. The method now
throws InvalidOperationException when neither the program-name nor the op-mode
read is IsOk, so SafeTryProbe correctly suppresses the capability.

Added FocasFactoryConfigTests covering the three opt-in config sections
round-tripping through CreateInstance and the fixed-tree bootstrap classifying
ProgramInfo as unsupported when the probe throws. Added an internal
FocasDriver.Options test seam.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Driver.Galaxy-002 — DataTypeMap.Map had no Int64 arm though MxValueDecoder/
MxValueEncoder both fully support Int64. Galaxy attributes with the Int64
mx_data_type code fell through to the String default, creating a String
address-space node while runtime reads decoded a boxed long. Added
`6 => DriverDataType.Int64`, extending the contiguous 0..5 scheme so the type
map agrees with the decoder/encoder on all seven Galaxy data types.

Driver.Galaxy-008 — after a stream fault the EventPump's StreamEvents consumer
loop exited and its channel completed; EventPump.Start() is a no-op on a
completed-but-non-null loop, so a replayed subscription had no consumer and
ReplayAsync never re-registered the post-reconnect item handles. ReplayAsync
now recreates the EventPump (RestartEventPumpForReplay) and rebinds the
SubscriptionRegistry per subscription with the fresh item handles returned by
the post-reconnect SubscribeBulkAsync, via new SubscriptionRegistry.SnapshotEntries
and Rebind APIs.

Regression tests: DataTypeMapTests (every code incl. Int64), SubscriptionRegistry
Tests (Rebind/SnapshotEntries), EventPumpStreamFaultTests (faulted pump dead,
fresh pump resumes dispatch).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
_lastPublishedByRef was a plain Dictionary<string, object> mutated inside
ShouldPublish, which runs on the PollGroupEngine onChange callback. The engine
runs one background Task per subscription, so a driver with two or more
subscriptions invokes ShouldPublish concurrently on separate threads. Concurrent
TryGetValue/indexer writes on a non-thread-safe Dictionary can corrupt internal
state, drop entries, or throw, crashing the poll loop.

Switch _lastPublishedByRef to ConcurrentDictionary<string, object>; its
TryGetValue and indexer-set operations are individually thread-safe, so the
deadband cache is now correct under concurrent multi-subscription publishing,
consistent with the lock-guarded sibling cache _lastWrittenByRef.

Add an xUnit + Shouldly regression test that runs 24 deadband-configured
single-tag subscriptions concurrently and asserts the poll loop survives without
faulting.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The DL205 family-native branch routed every V-prefixed address through
DirectLogicAddress.UserVMemoryToPdu, a plain octal-to-decimal decode.
DL205/DL260 system V-memory (V40400 and up) is not a simple octal decode:
the CPU relocates the system bank to Modbus PDU 0x2100. Octal-decoding
V40400 produced 16640 (0x4100), the wrong register, so any tag addressing
a system register through the grammar string silently read/wrote the
wrong PLC memory.

- Add DirectLogicAddress.VMemoryToPdu, which decodes the octal V-address,
  detects the system bank (octal >= V40400 == SystemVMemoryOctalBase) and
  relocates it through SystemVMemoryToPdu to PDU 0x2100; user-bank
  addresses keep the plain octal decode.
- ModbusAddressParser's DL205 V branch now calls VMemoryToPdu instead of
  UserVMemoryToPdu. UserVMemoryToPdu is retained for user-bank-only callers.
- Correct the ModbusFamilyParserTests V40400 assertion (16640 -> 0x2100)
  and add system-bank regression cases plus direct helper coverage.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
WriteToReadOnlyFile was listed in MalformedErrors, so ClassifyOutcome/
MapOutcome routed it to PermanentFail and the store-and-forward sink
dead-lettered every alarm event in the batch. But WriteToReadOnlyFile is
a connection-configuration fault (the write session was opened without
ReadOnly = false), not an event-payload fault — treating it as permanent
silently and permanently discards alarm events on a misconfigured or
regressed connection, which is data loss.

Move WriteToReadOnlyFile from MalformedErrors into ConnectionErrors. The
batch loop now aborts the batch, resets the connection (so the reconnect
path re-opens a writable ReadOnly = false session), and defers the
events as RetryPlease for the next drain tick.

Updated the ClassifyOutcome theory data and added a dedicated regression
test pinning WriteToReadOnlyFile -> RetryPlease.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
WonderwareHistorianClient.ReadAtTimeAsync passed the sidecar's reply.Samples
straight through ToSnapshots, which violated the IHistorianDataSource
contract: the result MUST be the same length and order as the requested
timestampsUtc, with gaps returned as Bad-quality snapshots. If the sidecar
dropped or reordered samples, OPC UA HistoryReadAtTime would silently
misalign values with timestamps.

Add an AlignAtTimeSnapshots helper that indexes the returned samples by
timestamp ticks, builds the result array at timestampsUtc.Count in request
order, and emits a Bad-quality (0x80000000) snapshot for any requested
timestamp the sidecar did not return.

Add the ReadAtTimeAsync_PartialAndReorderedReply_AlignsByTimestamp_AndFillsGapsAsBad
regression test where the fake returns a partial, reordered sample set.

Update code-reviews/Driver.Historian.Wonderware.Client/findings.md: -001
Resolved, open-finding count 10 -> 9.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The Timeout_maps_to_BadInternalError_without_killing_the_engine test's
"Hang" script busy-looped on Environment.TickCount64. Commit cfb9ff1
(Core.Scripting-001) added System.Environment to the script-sandbox
deny-list, so the script now fails sandbox validation instead of
reaching the timeout path. Switch the busy-loop to DateTime.UtcNow
(an allowed type) to preserve the test's intent — a self-terminating
~5s hang that overruns the 30ms script timeout.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Route the synchronous IsLoading = true write through _dispatcher.Post so
both IsLoading assignments use the same dispatch path as Results.Clear()
and the final IsLoading = false, eliminating the ordering hazard.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Guard the two nullable child VM dereferences (BrowseTree at ConnectAsync
and History at ViewHistoryForSelectedNode) with != null checks, matching
the guarding style already used for Subscriptions and Alarms nearby.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Emit <AntiforgeryToken /> in the MainLayout sign-out form and remove
.DisableAntiforgery() from the /auth/logout endpoint so UseAntiforgery()
validates the token. A tokenless POST now returns 400, preventing CSRF-logout.
Regression-guarded by AuthEndpointsTests.Logout_without_antiforgery_token_is_rejected.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Call Subscriptions?.Teardown() and Alarms?.Teardown() in the Disconnected
branch of OnConnectionStateChanged so server-side session drops also
quiesce the DataChanged and AlarmEvent handlers. Add Reattach() methods
that idempotently re-hook the handlers; call them from the Connected
branch so reconnects after a server-side drop restore live updates.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
NewCluster.razor and ClusterDetail.razor now resolve ClaimTypes.Name /
NameIdentifier from the cascaded AuthenticationState instead of hardcoding
"admin-ui" as the createdBy audit field. The operator principal is now
attributed correctly on every cluster-create and draft-create write path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Remove Password from UserSettings and stop writing it to settings.json;
the operator is re-prompted on each launch. Update LoadSettings/SaveSettings
comments and adjust the affected test assertion to verify the password is
not restored from the persisted model.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Implement IDisposable on MainWindowViewModel to detach ConnectionStateChanged,
call Teardown() on the subscription/alarm VMs, and dispose _service so the OPC UA
session and SDK resources are released. Call Dispose() from MainWindow.OnClosing
alongside the existing SaveSettings() call.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Update status and resolution text for the five Medium findings resolved
in this batch; lower the Open findings count from 11 to 6.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add @ReleasedBy parameter to sp_ReleaseExternalIdReservation via a new EF
migration so the operator principal (not the shared SQL account) is recorded
in ExternalIdReservation.ReleasedBy and ConfigAuditLog.Principal.
ReservationService.ReleaseAsync gains a releasedBy parameter; Reservations.razor
resolves the signed-in user from AuthenticationState and passes it through.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add AdminAuthPipelineTests (WebApplicationFactory + RoleInjectingHandler) to
enforce that ConfigViewer is denied CanPublish-gated pages while FleetAdmin is
permitted, and that an authenticated FleetAdmin session can reach the homepage.
Existing PageAuthorizationTests (anon page rejection) and AuthEndpointsTests
(login cookie + hub auth) cover cases (a)-(c); this file adds case (d).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Admin-006 through Admin-009 (all Medium) resolved; 3 Low findings remain open.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The previous Admin-006 commit added <AntiforgeryToken /> to the logout form
and updated the comment on the endpoint, but did not update LogoutAsync to
actually call IAntiforgery.ValidateRequestAsync. Blazor's UseAntiforgery()
middleware does not automatically validate minimal-API endpoints, so a
tokenless POST still succeeded. This commit injects IAntiforgery into the
handler, wraps ValidateRequestAsync in a try/catch, and returns 400 on
AntiforgeryValidationException. The endpoint keeps .DisableAntiforgery() to
prevent the middleware from also trying to read the body (which would cause
a double-read). The regression test is updated to log in first (to get an
authenticated session) before asserting 400 on a tokenless logout POST.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Analyzers-001: IsInsideWrapperLambda now matches the wrapper method name
(ExecuteAsync/ExecuteWriteAsync) in addition to the containing type, so a
future non-callSite lambda overload cannot suppress the diagnostic.
Analyzers-006: extended StubSources and added coverage for the remaining
guarded interfaces, synchronous members, concrete-driver receivers,
ExecuteWriteAsync wrapping, and nested lambdas.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Client.CLI-001: parse --start/--end with CultureInfo.InvariantCulture and
DateTimeStyles.AssumeUniversal|AdjustToUniversal so dates are culture-stable.
Client.CLI-005: SDK notification callbacks now hand off to an unbounded
channel drained on the main thread; handlers are unsubscribed before the
summary phase so no notification interleaves with console output.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Client.Shared-001: lowered the OnAlarmEventNotification early-return guard
from <6 to <1; per-index field guards already default missing fields safely.
Client.Shared-002: GetRedundancyInfoAsync replaces unguarded unboxing casts
with StatusCode.IsGood + Convert.ToInt32/ToByte, defaulting on bad reads.
Client.Shared-007: alarm fallback Task.Run guards on ReferenceEquals(session,
_session) and drops stale alarms on ObjectDisposedException after failover.
Client.Shared-008: WriteValueAsync rejects type inference from bad/null reads;
ValueConverter wraps parse failures in a descriptive FormatException.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Configuration-002: sp_PublishGeneration is transaction-nesting aware
(BEGIN TRANSACTION vs SAVE TRANSACTION on @@TRANCOUNT) so a caller's outer
transaction survives a publish failure; sp_ValidateDraft wrapped in TRY/CATCH.
Configuration-003: ValidatePathLength uses the cluster's actual Enterprise/Site
lengths when available, falling back to the conservative approximation.
Configuration-006: ResilientConfigReader treats a command-timeout
TaskCanceledException as a fault (not caller cancellation) and falls back.
Configuration-009: removed the checked-in plaintext sa connection string;
CreateDbContext now requires OTOPCUA_CONFIG_CONNECTION.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Dispose any existing _shelvingTimer before reassigning it inside LoadAsync so
that a second LoadAsync call does not leak the old timer and leave two timers
running concurrently against the same engine state.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Split the LoadAsync seed-read + subscribe loop: ReadTag seed fills _valueCache
first, then persisted-state restore runs, then _loaded = true, then SubscribeTag
is called. Any synchronous initial push from the upstream now arrives after
_alarms is fully initialised and _loaded = true, so ReevaluateAsync will queue
correctly behind the gate rather than racing the half-built state.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add _disposed re-checks inside ReevaluateAsync and ShelvingCheckAsync after
acquiring _evalGate so callbacks in flight when Dispose() runs bail out cleanly
instead of mutating _alarms or writing to a disposed store. Drop the
_alarms.Clear() from Dispose() — clearing outside the gate races concurrent
reads and is unnecessary since the object is being discarded.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Reorder persist/update in ApplyAsync, ReevaluateAsync, and ShelvingCheckAsync:
SaveAsync is now called before the in-memory _alarms entry is advanced. A store
failure therefore leaves both the persisted and in-memory views at the prior state
rather than diverging, maintaining the invariant that startup recovery reflects
actual persisted state.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add FolderSegment member to NodeAclScopeKind; update WalkSystemPlatform
to report NodeAclScopeKind.FolderSegment (not Equipment) for each
visited Galaxy folder level, so MatchedGrant.Scope in
AuthorizationDecision.Provenance correctly distinguishes Galaxy folder
grants from UNS Equipment grants in the audit trail and Admin UI
diagnostics.  Three regression tests added to PermissionTrieTests.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Change ClusterEntry from sealed record to sealed class so TryUpdate
uses reference equality for the CAS comparison.  Prune now uses a
read-compute-TryUpdate retry loop that restarts when a concurrent
Install updates the entry between the read and the write, preventing
a race that could silently drop the just-installed newest generation.
Two regression tests added to PermissionTrieCacheTests.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
BuildAddressSpaceAsync now checks _disposed (throws ObjectDisposedException)
and tears down the previous alarm forwarder + clears the sink registry
before re-walking, so a Galaxy-redeploy rebuild does not leak the old
forwarder and double-deliver alarm transitions.  Three regression tests
added: double-build does not double-fire, sink count is correct after
rebuild, and post-dispose call throws.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
SubscribeAsync now wraps each driver handle in a private HostBoundHandle
that carries the resolved host name.  UnsubscribeAsync unwraps it and
routes through the recorded host's resilience pipeline, correctly
charging the subscription's originating host's circuit breaker/bulkhead
instead of always using the default host.  Falls back to the default
host for handles not created by this invoker.  Two regression tests
added; update findings.md Open count from 10 to 6.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add engine-level tests covering the six gaps identified in the finding:
(1) timed-shelve auto-expiry driven via injectable clock + RunShelvingCheckForTest
    hook so timer tests are deterministic;
(2) ConfirmAsync, TimedShelveAsync/UnshelveAsync round-trip, EnableAsync engine
    methods exercised end-to-end;
(3) OnEvent subscriber-throws isolation — engine state advances and stays
    operational after a subscriber throws;
(4) IAlarmStateStore.SaveAsync failure leaves in-memory state unchanged (locks in
    the persist-before-update invariant from finding-007);
(5) second LoadAsync does not leak the old timer (regression for finding-002);
(6) AreInputsReady cold-start guard correctly blocks on Bad/missing inputs and
    allows Uncertain-quality inputs through.

Expose RunShelvingCheckForTest() internal method on ScriptedAlarmEngine to
support deterministic timer tests.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Mark Core.ScriptedAlarms-002, -004, -005, -007, -012 as Resolved with
one-line descriptions. Update open-findings count from 11 to 6.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Core.Abstractions-001: PollGroupEngine compares array values with structural
equality so a driver returning a fresh T[] each poll no longer fires spuriously.
Core.Abstractions-002: PollOnceAsync guards reader result cardinality and
throws a descriptive InvalidOperationException on mismatch instead of a
swallowed ArgumentOutOfRangeException that stalled the subscription.
Core.Abstractions-003: the poll loop Task is tracked; Unsubscribe/DisposeAsync
await loop completion before disposing the CTS.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Core.VirtualTags-002: cold-start guard publishes BadWaitingForInitialData
instead of silently returning a stale value.
Core.VirtualTags-003: Load detects duplicate Path values and keys the
upstream-subscription loop off the registered tag set.
Core.VirtualTags-005: VirtualTagSource fires the initial-data callback per
path before registering the change observer, fixing an ordering race.
Core.VirtualTags-008: DependencyGraph caches topological rank, lowering
per-change-event cost from O(V+E) to O(closure).
Core.VirtualTags-012: added 9 engine tests; CoerceResult null-return now
maps to BadInternalError as the code comment intended.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
FormatStatus now matches named codes against code & 0xFFFF0000 (high-word
mask) rather than exact equality, so status codes carrying sub-code or flag
bits in the low 16 bits (e.g. 0x80050001) still resolve to their named class.
For codes not in the named shortlist a severity-class fallback using the top
2 bits always emits Good / Uncertain / Bad rather than bare hex.

Updated the stale FormatStatus_unknown_codes_fall_back_to_hex_only test (its
expectation became invalid once the severity-class fallback was added) and
added new Theory cases exercising both the high-word matching and the
severity-class fallback paths.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ConfigureLogging is now idempotent via a _loggingConfigured guard field so
repeated calls from subclasses do not abandon and leak the previous logger.
The previous Log.Logger is disposed before overwriting to release its
console-sink resources cleanly.

A new protected static FlushLogging() helper calls Log.CloseAndFlush() so
commands can guarantee buffered output is flushed in their finally blocks
before the process exits — important for the long-running subscribe verb.

XML doc updated to reflect call-once semantics and document FlushLogging().

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Added missing test coverage identified in the -005 finding:

- FormatTable_with_empty_input_returns_header_only: verifies the -004 fix
  (empty batch read returns header+separator rather than throwing).
- FormatStatus_with_sub_code_bits_resolves_to_named_class: Theory exercising
  the -002 high-word mask path (e.g. 0x80050001 → "BadCommunicationError").
- FormatStatus_unknown_sub_code_falls_back_to_severity_class: Theory for the
  -002 severity-class fallback (unknown sub-codes still emit Good/Uncertain/Bad).
- New DriverCommandBaseTests class: four tests covering verbose/non-verbose
  Serilog level selection, ConfigureLogging idempotency, and FlushLogging.

Also corrected the stale FormatStatus_unknown_codes_fall_back_to_hex_only
expectation (0xDEADBEEF now resolves to "Bad" via the severity-class fallback
introduced by -002, not bare hex) and fixed the FormatTable empty-input crash
(guard rows.Length == 0 before calling Enumerable.Max).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Driver.AbCip.Cli-001: WriteCommand.ParseValue wraps FormatException/
OverflowException as CommandException so bad --value input yields a clean
CLI error instead of a raw stack trace.
Driver.AbCip.Cli-002: probe/read/subscribe commands reject Structure types
up front (RejectStructure helper), matching the write guard.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
WriteCommand.ParseValue wraps FormatException/OverflowException as
CliFx CommandException so a bad --value yields a clean one-line CLI error
naming the value and target type instead of a raw .NET stack trace.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replace the synchronous non-cancellable _stream.ReadByte() for the kind byte
in FrameReader.ReadFrameAsync with an async ReadExactAsync(new byte[1], ct)
call so the full frame read honours the EffectiveCallTimeout-linked token
and cannot wedge the call gate when the sidecar stalls mid-frame.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`ToDriverDataType` mapped LInt/ULInt to Int32 (truncation) and UDInt
to Int32 (negative wrap for values > Int32.MaxValue). DriverDataType
already carries Int64/UInt64/UInt32, so map each Logix 64-bit and
unsigned-32-bit type to the correct member. `DecodeValueAt` in
`LibplctagTagRuntime` updated to return uint/ulong for UDInt/ULInt
so the runtime value type agrees with the declared OPC UA type.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Structure tags with declared Members no longer register the bare parent
name in `_tagsByName` — reading it would return Good/null, which is
misleading. Clients read individual member paths. Both the member
fan-out and the scalar-tag paths now perform a duplicate-key check that
throws `InvalidOperationException` naming both colliding entries (fail-
fast, consistent with the AbCipHostAddress validation pattern).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Introduce DeserializeSampleValue() helper that enforces a 64 KiB per-sample
ValueBytes size cap before calling MessagePackSerializer.Deserialize<object>,
and documents that the default StandardResolver (primitive-only, no typeless
or dynamic-type resolution) is in use. Both ToSnapshots and AlignAtTimeSnapshots
route through the new helper. Add inline XML comments to the two NuGetAuditSuppress
entries in the csproj recording the advisory title, why each does not apply to
this module's primitive-only deserialization, and when to revisit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Document explicitly that WriteBatchAsync never returns PermanentFail because
the WriteAlarmEventsReply wire contract carries only a bool-per-event (no
unrecoverable/transient distinction). Add a <remarks> XML block explaining
the structural limitation, why poison events retry rather than dead-letter,
and that a coordinated per-event status enum extension to the .NET 4.8
sidecar is a tracked follow-up. Add inline NOTE comments in both the
success and catch paths for discoverability.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Normalise req.Events to Array.Empty<AlarmHistorianEventDto>() immediately
after MessagePack deserialization in HandleWriteAlarmEventsAsync. MessagePack
deserializes an absent or explicit-nil array field as null, not Array.Empty,
so a peer that sends a null Events array would trigger a NullReferenceException
on either .Length dereference (no-writer branch or catch block), leaving the
client correlation-id wait hanging with no reply frame.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`current & widthMask` was already applied in `WriteBitInWordAsync` by
the -001 High finding fix, making the 16-bit sign-extension hazard fully
neutralised. No further code change required; mark Resolved.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`PlcTagHandle` and `DeviceState.TagHandles` were dead scaffolding: the
`ReleaseHandle` no-op never called `plc_tag_destroy` and the dict was
never populated. Removed the file, the dead dict, and its
`DisposeHandles` loop. Updated the `AbCipDriver` class doc to document
that native lifetime is owned by libplctag.NET `Tag.Dispose()` (invoked
from `DisposeHandles`) with the library's own finalizer covering any
GC-collected instances. Two test methods that only exercised the dead
`PlcTagHandle` class removed from `AbCipDriverTests`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Extract the string-vs-numeric value selection from raw and at-time read
loops into a SelectValue helper method. aahClientManaged's HistoryQueryResult
has no data-type field in the bound SDK version, so the heuristic (prefer
StringValue when non-empty and Value==0) is unavoidable; the helper now
documents the limitation explicitly in its XML doc so the known edge case
(numeric tag at exactly zero with a formatted StringValue) is self-evident.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add System.Threading.Tasks to ForbiddenNamespacePrefixes so scripts
cannot use Task.Run / Parallel to spawn background work that outlives
the per-evaluation timeout. Document the unbounded-memory accepted
trade-off and the Task denial rationale in docs/VirtualTags.md (new
"Known resource limits" subsection) and cross-reference from
docs/ScriptedAlarms.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The ConcurrentDictionary + TryAdd/dispose-loser pattern for Runtimes
and ParentRuntimes was already applied as part of the Driver.AbCip-008
fix. Recording resolution with evidence rather than applying a
duplicate change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
DependencyExtractor.VisitInvocationExpression now additionally checks
that the member-access receiver is the identifier "ctx" before treating
a GetTag / SetVirtualTag call as a ScriptContext dependency. This
prevents spurious dependencies when a script defines a local helper type
with a matching method name and calls it as other.GetTag("X"). Test
Ignores_member_access_GetTag_on_non_ctx_receiver added.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
EnqueueAsync used synchronous SQLite I/O (conn.Open / ExecuteNonQuery /
COUNT(*)) on the caller's thread, blocking the alarm-emitting thread under
write contention with the drain worker. The cancellationToken parameter was
silently ignored.

- EnqueueAsync converted to genuine async: OpenAsync / ExecuteNonQueryAsync /
  ExecuteScalarAsync used throughout; ct threaded to every await.
- ApplyPragmasAsync added alongside the existing ApplyPragmas helper so
  the WAL + busy_timeout PRAGMAs are applied on the async open path too.
- EnforceCapacityAsync added to handle capacity eviction on the async path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
In TimedScriptEvaluator.RunAsync, the catch (TimeoutException) block
now checks ct.IsCancellationRequested before throwing
ScriptTimeoutException, so a caller cancellation that races a timeout
deterministically surfaces as OperationCanceledException regardless of
which WaitAsync observes first. Regression test
Caller_cancellation_wins_even_when_timeout_fires_first added.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add ScriptSandboxTests cases for all forbidden-namespace deny-list
vectors that lacked test coverage: System.Threading.Thread,
System.Threading.Tasks.Task.Run (newly denied per Core.Scripting-003),
System.Runtime.InteropServices.Marshal, and Microsoft.Win32.Registry.
The 001/002 type-granular and node-form vectors were already covered by
the -001/-002 resolution commits. All 79 tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
TryParse now rejects three classes of malformed PCCC address:
- Sub-element + bit-index together (e.g. T4:0.ACC/2) — never valid in PCCC
- File number on I/O/S system files (e.g. I3:0, S2:1) — single-letter only
- Sub-element on non-T/C/R files (e.g. B3:0.DN, N7:0.FOO) — only Timer,
  Counter, and Control files carry structured elements

New helper predicates IsNoFileNumberLetter / IsSubElementFileLetter
keep the parser's intent clear. Regression tests added in AbLegacyAddressTests.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add exponential backoff (250 ms → 500 ms → 1 s → 2 s → 4 s → 8 s cap) to
PipeServer.RunAsync after each connection-loop exception, replacing the spin
loop that previously pegged a CPU core and flooded the log on persistent errors
such as a duplicate pipe name or a failing PipeAcl.Create. After 20 consecutive
failures the method re-throws so the SCM / NSSM supervisor can restart the
sidecar cleanly. A clean connection (even a short-lived one) resets the counter.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
DecodeValue for Bit with no bitIndex now reads the full 16-bit word via
GetInt16(0) and tests bit 0 instead of GetInt8(0), which only covered the
low byte and silently misread any bit in positions 8..15. The comment
explains the two decode paths (suffix-present vs suffix-absent).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Apply _config.MaxValuesPerRead as a bucket cap in ReadAggregateAsync,
mirroring the existing cap in ReadRawAsync. Without this guard a processed
read over a wide time range with a small IntervalMs could accumulate an
unbounded HistorianAggregateSample list; if the serialised reply exceeded
the 16 MiB FrameWriter frame cap WriteAsync would throw and the client
correlation-id wait would hang. Truncation now logs a Warning with a hint
to widen IntervalMs or reduce the time range.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add `EvictRuntime` helper that removes + disposes a stale
`ConcurrentDictionary` entry. Call it from `ReadSingleAsync`,
`ReadGroupAsync`, and `WriteAsync` on non-zero libplctag status and
transport exceptions so the next call for the same tag re-creates a
fresh handle — mirroring the probe loop's recreate-on-failure pattern.
Value-conversion exceptions (NotSupportedException, FormatException,
InvalidCastException, OverflowException) are not transport faults and
do not evict the handle.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Runtimes and ParentRuntimes changed from Dictionary to ConcurrentDictionary.
EnsureTagRuntimeAsync and EnsureParentRuntimeAsync now use a per-key
GetCreationLock semaphore with a double-checked pattern: fast-path read
requires no lock; slow-path create+initialize+store is serialised per key
so a concurrent caller waits rather than creating a duplicate runtime that
would be leaked when DisposeRuntimes runs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Throw InvalidOperationException at InitializeAsync when a tag's
DeviceHostAddress does not match any entry in the Devices list, naming
both the tag and the unresolved host.  Previously the missing-device
check was guarded by a TryGetValue so a typo silently bypassed
capability-matrix validation and deferred the error to per-read
BadNodeIdUnknown — the opposite of the documented "fail at load" goal.

Also resolves findings 004, 005, and 006 in the same file:
- 004: DiscoverAsync now unconditionally emits ViewOnly for all user
  tags; the Writable config field no longer influences security class
  because the wire backend always returns BadNotWritable.
- 005: All _health reads use Volatile.Read and all writes use
  Volatile.Write so concurrent readers observe a consistent reference
  and read-modify-write sequences capture a stable snapshot.
- 006: EnsureConnectedAsync disposes and nulls any existing
  non-connected client before creating a fresh one, preventing
  ObjectDisposedException loops after a HandleRecycle race or teardown.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Mark _health volatile. The record-reference assignment is atomic, but
without an acquire/release memory barrier GetHealth() on another thread
can observe a stale snapshot indefinitely. volatile enforces the barrier
at read and write sites without a lock.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
DiscoverAsync now unconditionally emits SecurityClassification.ViewOnly
for every user-authored FOCAS tag.  Previously the SecurityClass was
tag.Writable ? Operate : ViewOnly, but WireFocasClient.WriteAsync always
returns BadNotWritable — advertising Operate misleads OPC UA clients
and the DriverNodeManager ACL layer into granting write permission on
nodes that can never be written.

Updated FocasCapabilityTests.DiscoverAsync_emits_pre_declared_tags to
assert ViewOnly for the writable-by-config tag so it matches the
corrected behaviour.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add regression tests for the Medium findings resolved in this series:
- AbCipDataType_maps_large_integer_types (theory: LInt→Int64, ULInt→UInt64,
  UDInt→UInt32) and Read_UDInt_tag_returns_uint_value_not_negative_wrapped_int
  cover Driver.AbCip-004.
- Structure_parent_tag_is_not_readable_after_member_fan_out,
  InitializeAsync_throws_on_duplicate_tag_name, and
  InitializeAsync_throws_when_member_name_collides_with_independent_tag
  cover Driver.AbCip-005.
- Read_failure_evicts_runtime_so_next_read_creates_fresh_handle covers
  Driver.AbCip-010.
AbCipDriverTests.AbCipDataType_maps_atomics_to_driver_types extended with
LInt/ULInt/UDInt assertions.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
InitializeAsync catch block now mirrors ShutdownAsync teardown: cancels
and disposes probe CancellationTokenSources, calls DisposeRuntimes, and
clears _devices/_tagsByName before rethrowing. A caller that catches and
abandons (rather than retrying via ReinitializeAsync) no longer leaves
orphaned probe tasks or libplctag handles alive.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
6 Medium findings resolved (004, 005, 006, 009, 010, 014); open count
updated from 11 to 5 (all remaining Low severity: 007, 011, 012, 013, 015).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Guard all _health field accesses with Volatile.Read / Volatile.Write.
ReadAsync, WriteAsync, and ProbeLoopAsync run on different threads and
several updates are read-modify-write (new DriverHealth(_, _health.X, _)).
Without volatile semantics a concurrent update can be lost or a stale
LastSuccessfulRead timestamp propagated.  DriverHealth is an immutable
record so Volatile is sufficient — no lock needed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add six previously-missing edge-case tests to WonderwareHistorianClientTests:
(2) WriteBatchAsync transport-drop catch path returns RetryPlease for all events;
(3) InvokeAsync second-attempt-also-fails propagates the exception;
(4) stalled sidecar fires OperationCanceledException within CallTimeout;
(5) HistoryAggregateType.Total throws NotSupportedException via ReadProcessedAsync;
(6) sidecar wrong-MessageKind reply throws InvalidDataException.

Extend FakeSidecarServer with DisconnectBeforeReply, ReplyWithWrongKind, and
StallAfterRequest test knobs to support these scenarios.

Add ContractsWireParityTests.cs (11 tests) to pin the MessagePack byte layout,
round-trip correctness, MessageKind enum values, and Framing constants — catching
silent [Key] index drift between the client and sidecar mirror copies without
requiring a cross-TFM (net10 vs net48) project reference.

Test count grew from 11 to 27; all 27 pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
EnsureConnectedAsync now disposes and nulls any existing non-connected
client before creating a fresh one via _clientFactory.Create().

Previously the method reused a cached client via ConnectAsync, but a
client disposed by a HandleRecycle race or prior teardown would hit
FocasWireClient.ThrowIfDisposed on every subsequent call, leaving the
device permanently wedged with BadCommunicationError and no recovery
path until ReinitializeAsync.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
System.Threading.Thread is in the System.Threading namespace (not
System.Threading.Thread), so the existing ForbiddenNamespacePrefixes
entry "System.Threading.Thread" never matched — the namespace prefix
check compared against the type's containing namespace, which is
System.Threading. Move Thread into ForbiddenFullTypeNames (alongside
Environment / AppDomain / GC / Activator) where it is matched by exact
fully-qualified type name, which actually fires. Remove the dead
namespace-prefix entry and document why. The Rejects_Thread_new_at_compile
test now passes. (Core.Scripting-010.)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Status fields (_lastDrainUtc, _lastSuccessUtc, _lastError, _drainState,
_evictedCount) were written by the drain timer thread and read by
GetStatus() / health-check threads with no memory barrier, risking torn
DateTime? reads and stale DrainState observations.

- Added _statusLock object; all writes to status fields now happen inside
  lock(_statusLock) blocks in DrainOnceAsync and DrainTimerCallback.
- GetStatus() snapshots all fields atomically under the same lock so the
  Admin UI / /healthz endpoint always sees a consistent view.
- Regression test GetStatus_snapshot_is_consistent_under_concurrent_drain
  drives status writes and reads from concurrent threads; asserts no throws.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add FocasDriverMediumFindingsTests.cs with regression coverage for the
five Medium findings:

- 003: InitializeAsync throws when tag's DeviceHostAddress is absent
  from Devices (two variants: typo host, wrong port; also happy path)
- 004: DiscoverAsync emits ViewOnly for tags with Writable:true
- 005: GetHealth() is consistent after ten concurrent ReadAsync calls
- 006: Read recovers after the client is externally disposed, creating
  a fresh client rather than wedging with BadCommunicationError
- 012: Factory full-round-trip with all three opt-in config sections
  (FixedTree + AlarmProjection + HandleRecycle) with all subfields

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
When WriteBatchAsync returned a wrong-cardinality outcome list, DrainOnceAsync
threw InvalidOperationException after potential delivery — causing duplicate
events on re-drain or permanent queue stall on a deterministic writer bug.

- The throw replaced with log + backoff: mismatch is recorded into _lastError,
  _drainState set to BackingOff, backoff bumped, method returns without applying
  any outcomes, mirroring the writer-exception path.
- Regression test Writer_returning_wrong_cardinality_outcomes_sets_backing_off_and_keeps_rows
  asserts rows stay queued, DrainState = BackingOff, LastError populated, and
  that a fixed writer subsequently drains cleanly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add #pragma warning disable xUnit1051 at the top of ContractsWireParityTests.cs.
The xUnit1051 analyser fires on MessagePack's Serialize/Deserialize overloads that
have an optional CancellationToken parameter; these are synchronous parity tests
where the token is not meaningful — the suppression is scoped to this file only.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
MapLibplctagStatus now casts the int to libplctag.Status and switches on
named enum members (mirroring AbCipStatusMapper) instead of unverified
magic integers. A strongly-typed Status overload is the canonical path;
the int overload delegates to it. MapPcccStatus is retained with a comment
marking it as the reference mapping for future PCCC-STS inspection.
Tests updated to use Status enum members rather than raw integers.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The test suite lacked coverage for four critical paths: corrupt/null-
deserializing PayloadJson rows, StartDrainLoop timer behavior and backoff
honoring, concurrent EnqueueAsync+DrainOnceAsync stress, and the
outcomes.Count != events.Count cardinality-mismatch branch.

Added tests covering all four gaps (committed across companion findings):
- Drain_with_corrupt_payload_row_deadletters_it_and_keeps_good_rows_aligned
- Drain_with_corrupt_head_row_does_not_stall_queue
- StartDrainLoop_honors_backoff_and_slows_cadence_under_retry
- StartDrainLoop_keeps_steady_cadence_when_writer_is_healthy
- StartDrainLoop_records_drain_fault_and_keeps_running
- Concurrent_enqueue_and_drain_do_not_throw_sqlite_busy
- Writer_returning_wrong_cardinality_outcomes_sets_backing_off_and_keeps_rows
- Capacity_eviction_increments_evicted_count_on_status
- GetStatus_snapshot_is_consistent_under_concurrent_drain

Updated Open findings count to 2 (Core.AlarmHistorian-008 + -011, both Low).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Consume previously-dead AbLegacyPlcFamilyProfile fields:

- DeviceState.EffectiveCipPath applies DefaultCipPath when the parsed host
  address has an empty CIP path (SLC 500 / PLC-5 misconfigured without /1,0
  now gets the profile-supplied default route). All three tag/parent/probe
  Create() callers updated.
- InitializeAsync validates each tag's DataType against SupportsLongFile /
  SupportsStringFile and throws InvalidOperationException at init time so a
  MicroLogix Long tag or similar fails early rather than at runtime with an
  opaque comms error.
- MaxTagBytes tracked as a follow-up (string/array chunking requires broader
  design work).

Tests added for CipPath fallback and Long/String type validation.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Finding 005 revised approach: keep the parent Structure tag in
`_tagsByName` so the whole-UDT grouping planner can find it (required
for Driver.AbCip-003 opt-in path + alarm projection). Instead, detect a
direct read of a Structure-with-Members in `ReadSingleAsync` and return
`BadNotSupported` rather than Good/null — explicitly documenting the
contract that callers must address member paths. Duplicate-key checks
(scalar and member fan-out) remain.

Finding 014 test corrections: `Structure_parent_tag_read_returns_BadNotSupported`
now asserts the new contract. `Read_UDInt_tag_returns_uint_value_not_negative_wrapped_int`
assertion fixed to use `ShouldBeOfType<uint>()` instead of
`ShouldNotBe(-1)` (Shouldly overflows comparing uint.MaxValue with int).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
8 Medium findings resolved (-002 through -012); 3 Low findings remain
open (-005, -011, -013).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Clarify Driver.AbCip-005 resolution: parent Structure tag stays in
_tagsByName (needed by whole-UDT planner + alarm projection); the fix
is in ReadSingleAsync returning BadNotSupported for direct reads.
Update Driver.AbCip-014 resolution text to match the actual test names.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The file was physically deleted and unstaged in the Driver.AbCip-006
commit but the git rm was not included. Committed separately.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
EffectiveCipPath now references ParsedAddress/Profile properties instead
of the captured primary-constructor parameters to avoid CS9124 (param
captured into enclosing type AND used to init a member).

NonZero_libplctag_status_maps_via_AbLegacyStatusMapper updated to pass
(int)Status.ErrorNotFound rather than the stale magic integer -14 that
the old mapper happened to handle but the new enum-based mapper does not.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
StatusCodeMap.FromMxStatus checked `success != 0` to determine success, but the
mxaccessgw proto contract explicitly documents that `success` is not a boolean and
that clients must branch on `category` (MX_STATUS_CATEGORY_OK), not on `success`
alone. Replace the raw field check with `status.IsSuccess()` from
MxStatusProxyExtensions, which requires both `success != 0` AND `category == Ok`.
A worker reporting success=1 with a non-OK category was previously misreported as
Good. Updated StatusCodeMapTests with a regression case covering the inverted scenario.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add StatusCodeMap.ToQualityCategoryByte(uint) so the StatusCode → quality-byte
mapping lives in one place next to its inverse (FromQualityByte). GalaxyDriver
OnPumpDataChange now delegates to the helper instead of duplicating the shift+switch
inline; a future edit to the OPC UA bit layout cannot silently desync the probe-health
decode. Unit tests in StatusCodeMapTests pin all three category buckets and the
round-trip invariant.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
HashSet<T>.First() enumeration order is unspecified and unstable across mutations, so
the "owner" handle attached to alarm events was non-deterministic when multiple alarm
subscriptions were active. Change _alarmSubscriptions from HashSet to List (preserving
insertion order) and pick [0] — the earliest-registered handle — as the deterministic
owner. The server routes transitions by SourceNodeId, not by handle, so the choice of
handle does not affect delivery to active subscribers.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Implement IAsyncDisposable on GalaxyDriver so async sub-component disposals
(EventPump, AlarmFeed, MxSession, MxClient, RepositoryClient) are awaited rather
than blocked on GetAwaiter().GetResult(). DisposeAsync is now the primary path;
Dispose() delegates to it for using-statement compatibility. Each async component's
shutdown is awaited individually with a best-effort catch so a single slow shutdown
cannot prevent the rest of the cleanup sequence from running.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Fix two resource-management bugs in StartDeployWatcher / BuildDefaultHierarchySource:
(a) Replace the discarded `_ = StartAsync(...)` with an explicit task variable that
    surfaces any synchronous InvalidOperationException (called-twice guard) rather than
    silently swallowing it.
(b) Change both StartDeployWatcher and BuildDefaultHierarchySource to use ??= on
    _ownedRepositoryClient so the first client created (by whichever path runs first)
    is reused by the second path, preventing a second GalaxyRepositoryClient from being
    created and the first from leaking past the driver's lifetime.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Clear _tagsByName, _lastPublishedByRef, and _lastWrittenByRef in ShutdownAsync
(via the new shared TeardownAsync helper) so a ReinitializeAsync cycle starts
from a clean state, consistent with the existing _autoProhibited.Clear().

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
GetMemoryFootprint() returned a constant 0 with a stale "PR 4.4 sets this" comment
even though PR 4.4 shipped the SubscriptionRegistry. Replace with a live estimate:
64 bytes × TrackedItemHandleCount + 256 bytes × TrackedSubscriptionCount. A 50k-tag
set now registers ~3 MB with the server's cache-flush heuristic instead of being
invisible. Returns 0 when no subscriptions are active.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add GalaxyDriverInfrastructureTests covering the two gaps identified in this finding
that are not yet tracked by a dedicated test file: GetMemoryFootprint returns a live
registry-derived estimate (Driver.Galaxy-011) and DisposeAsync completes without
deadlock (Driver.Galaxy-007). The remaining items listed in the finding are covered
by earlier resolution commits: stream-fault → recovery → OnDataChange resumes
(EventPumpStreamFaultTests), post-reconnect Rebind (SubscriptionRegistryTests),
StatusCodeMap.FromMxStatus success/failure semantics (StatusCodeMapTests), and
DataTypeMap all seven codes (DataTypeMapTests). Update findings.md header to 4 open.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
StatusCode is not a .NET type reference in this assembly — replace the unresolvable
<see cref="StatusCode"/> with prose text so TreatWarningsAsErrors does not fail the
build on the CS1574 unresolved-cref warning.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Reject an empty 3rd field in the address parser by checking parts[2].Length > 0
before the All(char.IsDigit) guard, so a trailing-colon typo like "40001:F:"
produces a diagnostic instead of silently parsing as a scalar.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add ModbusAddressEdgeCaseTests.cs covering the overflow/boundary gaps: empty
trailing parser field (finding -002 regression), multi-dot input, UserVMemoryToPdu
and AddOctalOffset overflow, SystemVMemoryToPdu base+overflow, MelsecAddress
ParseHex overflow, and DRegisterToHolding/MRelayToCoil bank-base overflow.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add --bit-index, --string-length, and --string-byte-order options to
SubscribeCommand, mirroring ReadCommand, and pass them into ModbusTagDefinition
so that BitInRegister and String type subscriptions use the correct bit index and
string length rather than silently defaulting to bit-0 / zero-length.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Reject --region Coils combined with any non-boolean --type with a CommandException
that names the constraint: coils carry a single bit, so only --type Bool is valid.
Without this check a write like "--region Coils --type UInt16 --value 42" would
silently coerce to a coil ON with no diagnostic.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Complete the incomplete Addressing-003 fix: TryParseByteOrder now produces a
diagnostic mentioning "field 2" when a known type-code token (e.g. BOOL) is
supplied in the byte-order slot, so the user is guided to the correct field.
The previous fix only wired the message in the else-branch, which was unreachable
because LooksLikeByteOrderToken(BOOL) returned true first.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Remove the dead ProbeAddress config surface from S7ProbeOptions and the factory
DTO. ProbeLoopAsync uses Plc.ReadStatusAsync (CPU-status PDU), not a tag-address
read — ProbeAddress was never consumed. The XML doc on Probe is corrected to
describe the ReadStatusAsync-based probe. Existing configs that set probeAddress
are silently ignored by the JSON deserializer.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
S7-002: add inline comment documenting the UInt32→Int32 lossiness in MapDataType,
consistent with the Int64/UInt64 note. Tracked for a follow-up that adds unsigned
DriverDataType members.

S7-004: inject ILogger<S7Driver> (optional, defaults to NullLogger); add structured
log calls for connect success/failure, probe Running/Stopped transitions, and
swallowed poll-loop exceptions, so operators have an event trail via Serilog.

S7-008: restructure WriteAsync catch ladder to mirror ReadAsync — OperationCanceledException
re-throws, NotSupportedException → BadNotSupported, PUT/GET-disabled PlcException →
BadNotSupported/Faulted, genuine PlcException → BadDeviceFailure/Degraded, all
others → BadCommunicationError/Degraded. Health is now updated on every write failure.
Also factor ReadOneAsync reinterpret into internal ReinterpretRawValue and
WriteOneAsync boxing into internal BoxValueForWrite for testability (Driver.S7-014).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add S7TypeMappingTests.cs covering ReinterpretRawValue and BoxValueForWrite —
26 tests verifying every implemented type round-trip (Bool/Byte/UInt16/Int16/
UInt32/Int32/Float32), two's-complement reinterpret semantics (ushort→short,
uint→int), unsupported-type NotSupportedException, and overflow edge cases.
These methods were factored out as internal static in the S7-002/S7-008 commit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Wrap all numeric/DateTime BCL parses in ParseValue with try/catch(FormatException)
and try/catch(OverflowException) that re-throw as CommandException, matching the
existing Bool path. Update ParseValue_non_numeric_for_numeric_types_throws to assert
CommandException (not FormatException), and add an overflow-edge test (Byte value 256).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Trim the --type help text on read and subscribe to the implemented set
(Bool/Byte/Int16/UInt16/Int32/UInt32/Float32) and append a one-line caveat that
Int64, UInt64, Float64, String, and DateTime are not yet implemented and will
return BadNotSupported — so the CLI does not advertise options that cannot succeed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Wrap the InitializeAsync + ReadAsync body in a try/catch so an unreachable PLC
(refused TCP connect, wrong slot) still prints the structured Host:/CPU:/Health:/
Last error: report from driver.GetHealth() instead of crashing with a stack trace.
OperationCanceledException re-throws so Ctrl+C during connect exits cleanly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Mark Driver.S7-002, -004, -008, -012, -014 and Driver.S7.Cli-001, -002, -003
as Resolved; update Open findings counts (Driver.S7: 10→5, Driver.S7.Cli: 7→4).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Route all Session mutations through _probeLock so OnReconnectComplete, ShutdownAsync,
and OnKeepAlive cannot race each other when swapping or clearing the active session.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Map DataTypeIds.Byte to DriverDataType.UInt16 (unsigned family) rather than Int16
(signed family). Update attribute mapping test to assert the correct unsigned mapping
and add Byte/UInt16 to the standard-types theory.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add OpcUaClientMediumFindingsRegressionTests covering write-timeout status code
(009), Byte->UInt16 mapping (010), AutoAccept warning (012), GetMemoryFootprint/
FlushOptionalCachesAsync contract (013), and pre-init lifecycle guards (015).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Reject Structure-typed pre-declared tags in BuildTag at config-parse time
with a clear InvalidOperationException; replaces the previous silent
garbage read (MapToClrType fell through to typeof(int)) and late
NotSupportedException on writes.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Inject optional ILogger<TwinCATDriver> (NullLogger default) and log
connect success/failure, ADS read errors, symbol-browse fallback,
native-notification registration failures, and host-state transitions.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Swap _devices and _tagsByName to ConcurrentDictionary so ShutdownAsync
Clear() no longer races concurrent TryGetValue calls; store ProbeTask
on DeviceState and await it in ShutdownAsync before disposing the client
and gate, eliminating the probe-disposal race.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace yield break with cancellationToken.ThrowIfCancellationRequested()
in BrowseSymbolsAsync so a cancelled browse propagates as
OperationCanceledException instead of silently completing with a partial
symbol set.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Confirm AdsErrorCode values from Beckhoff.TwinCAT.Ads 7.0.172 and rewrite
MapAdsError with 20 explicit cases. Fix critical bug: AdsSymbolVersionChanged
was 0x0702 (DeviceInvalidGroup) but DeviceSymbolVersionInvalid is 1809
(0x0711); correct constant and all comments. Add BadOutOfService for
DeviceNotReady and BadInvalidState for DeviceInvalidState/PLC-in-Config.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
GetMemoryFootprint now returns tagsByName * 256 + nativeSubs * 512 bytes
instead of a hard-coded 0; document that the stream-and-discard symbol
browse leaves no flushable cache so FlushOptionalCachesAsync is a
deliberate no-op.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Mark findings 003, 009, 010, 011, 012 Status: Resolved (status fields
were missing the update in earlier commits); reduce Open findings
count from 11 to 5 (Low findings 004, 006, 014, 015, 016 remain open).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Fix ReadRawAsync: correct XML doc from newest-first to oldest-first
(ascending source timestamp per OPC UA Part 11); move maxValuesPerNode
cap inside the time-window filter loop so paging limits apply to
in-window results only, not the whole buffer snapshot.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add _nodeManagerDisposed field; set it under Lock in Dispose before
detaching the alarm-service handler; check it in OnAlarmServiceTransition
under the same Lock so an in-flight transition cannot dispatch to a
ConditionSink whose DriverNodeManager is being concurrently disposed.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add configDbHealthy parameter to OpcUaApplicationHost; wire a
DbHealthCache (CanConnectAsync cached 10 s) in Program.cs so /healthz
reflects real config-DB reachability instead of the previous always-true
default; /healthz now returns 503 on a DB outage unless stale-config
cache is warm.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Default AutoAcceptUntrustedClientCertificates to false in both
OpcUaServerOptions and Program.cs config fallback, aligning with
docs/security.md; auto-accept is now explicitly opt-in for dev use only.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Advertise UserName token policy on any non-None security profile when
Ldap.Enabled; emit a startup LogWarning when Ldap.Enabled=true but
SecurityProfile=None so the misconfiguration is surfaced before clients
connect rather than silently producing no credential path.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace silent Enum.TryParse fallback to None with a ParseSecurityProfile
helper that emits a startup Log.Warning naming the unsupported value and
listing recognised profiles; operators now see the misconfiguration
before any client connects rather than getting an unexplained None posture.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Resolved Server-003, -005, -007, -010, -011, -013 in this batch;
Server-004, -006, -008, -012, -014, -015 remain open (all Low severity).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Driver.OpcUaClient-006, -007, -008, -009, -010, -012, -013, -015 were
resolved in earlier commits; only -011 (Low) and -014 (Low) remain open.
Header was left at 3 after the Medium batch; correct to 2.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The Driver.TwinCAT-011 fix rewrote TwinCATStatusMapper with correct
numeric values from Beckhoff.TwinCAT.Ads 7.0.172 (e.g. DeviceSymbol-
VersionInvalid = 1809 / 0x0711, not 1794 / 0x0702). Pre-existing
StatusMapper_covers_known_ads_error_codes InlineData cases were written
against the old wrong mappings and now fail; StatusMapper_recognises_
symbol_version_changed_code asserted the legacy 0x0702 constant. Update
both test files to match the corrected mapper and add a comment
documenting the correction.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Core-002 fixed TriePermissionEvaluator to evaluate each request against
the session's bound AuthGenerationId rather than whatever the cache
currently holds. AuthorizationGate.BuildSessionState was not updated at
the same time: it hardcoded AuthGenerationId = 0, so the evaluator's
GetTrie(cluster, 0) call returned null for any generation != 0, causing
every gated operation to silently fail with NotGranted regardless of
actual grants. The 42 gate/matrix/deferred-hardening tests all started
failing as a result.

Fix: add an optional PermissionTrieCache parameter to AuthorizationGate;
BuildSessionState now stamps AuthGenerationId from the cache's current
generation for the session's cluster. AuthorizationBootstrap.BuildGateAsync
passes the cache it creates. All 7 test MakeGate helpers updated to pass
the cache so tests produce a valid AuthGenerationId. 433/433 server tests
now pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
All Medium-severity code-review findings across the 29 reviewed modules
are now Resolved. The Pending findings table holds only Low-severity items.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The Admin-003 fix gated every SignalR hub with [Authorize]/RequireAuthorization,
but the server-side HubConnection clients on ClusterDetail, AclsTab, RedundancyTab
and RoleGrants cannot forward the browser's HttpOnly auth cookie — so the hub
negotiate returns 401. Those four pages called HubConnection.StartAsync()
unguarded, so the 401 surfaced as an unhandled exception (a 500 page for the
prerendered ClusterDetail, a broken circuit for the others).

Wrap StartAsync/SendAsync in try/catch on all four, matching the established
best-effort pattern already used in Hosts.razor and ScriptLog.razor: the live
banner / live refresh degrades but the page renders. Restoring functional hub
live-updates needs a token-based hub auth scheme (cookie forwarding is not
viable across the prerender/interactive boundary) and is left as follow-up.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The Admin-003 fix gated every SignalR hub with [Authorize], but the server-side
Blazor HubConnection clients had no way to authenticate: the browser's HttpOnly
auth cookie is not reachable from the interactive circuit, so every hub negotiate
returned 401 and the Admin live-update feature was non-functional app-wide
(silently degraded on Hosts/ScriptLog, fatal on the cluster pages).

Introduce a token-based hub auth path:
- HubTokenService mints/validates short-lived tokens using ASP.NET Core Data
  Protection (the same primitive that protects the auth cookie — no signing-key
  management, no new packages). Tokens carry the user's name + roles.
- HubTokenAuthenticationHandler is a custom "HubToken" auth scheme that reads the
  token from the Authorization: Bearer header (negotiate) or the access_token
  query parameter (WebSocket upgrade).
- The "HubClients" authorization policy runs both the cookie and HubToken
  schemes; the hub endpoints use RequireAuthorization("HubClients").
- AdminHubConnectionFactory builds hub connections with an AccessTokenProvider
  that mints a fresh token for the circuit's authenticated user on every
  (re)connect. All six hub-consuming pages now resolve connections through it.

Hub negotiate now returns 200 and the WebSocket upgrades (101); live updates
work. The best-effort try/catch guards added previously are kept as defence.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Records the post-review finding discovered during browser smoke-testing: the
Admin-003 hub hardening was incomplete — the server-side Blazor HubConnection
clients had no way to authenticate, so hub negotiate 401'd and four cluster
pages threw unhandled 500s. Logged as Admin-013 (High, Error handling &
resilience), Status Resolved, fixed by commits f254539 + 8d5dbb4.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
dohertj2 added 6 commits 2026-05-23 05:42:10 -04:00
- Core.Abstractions-004: guard DriverTypeRegistry.Register with a Lock so
  concurrent registrations are atomic.
- Core.Abstractions-005: narrow PollGroupEngine catch blocks to non-fatal
  exceptions, add optional onError callback, tolerate disposed-CTS races.
- Core.Abstractions-006: document the deliberate int-vs-uint asymmetry on
  IHistoryProvider.ReadEventsAsync / IHistorianDataSource.ReadEventsAsync.
- Core.Abstractions-007: pin the gaps with PollGroupEngine + DriverHealth
  contract tests.
- Core.Abstractions-008: correct XML docs on DriverHealth.LastError and
  the optional / required asymmetry on the history-read surfaces.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Core-004: add ConfigureAwait(false) to DriverHost.RegisterAsync /
  UnregisterAsync / DisposeAsync.
- Core-008: rewrite the BuildAddressSpaceAsync XML doc to correctly name
  the caller (OpcUaApplicationHost.PopulateAddressSpaces) that owns the
  per-driver isolation.
- Core-009: snapshot DriverResilienceOptions once per non-idempotent write
  in CapabilityInvoker.ExecuteWriteAsync.
- Core-010: switch DriverResilienceOptions.Resolve to TryGetValue with a
  diagnostic error message when a tier table is missing a capability.
- Core-011: add an optional diagnostic callback to PermissionTrieBuilder
  so production callers can surface scope-path mismatches.
- Core-012: correct the stale WedgeDetector ctor summary and add the
  Reconnecting row to DriverHealthReport's state matrix.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Configuration-004: NodePermissions stored as int to match the EF
  HasConversion<int>() in OtOpcUaConfigDbContext.ConfigureNodeAcl.
- Configuration-005: serialise LiteDbConfigCache.PutAsync so concurrent
  Put for the same (ClusterId, GenerationId) cannot duplicate rows.
- Configuration-007: rethrow OperationCanceledException from
  GenerationApplier.ApplyPass when the caller's token is cancelled.
- Configuration-010: scrub secrets and drop the full exception object
  from the ResilientConfigReader fallback warning log.
- Configuration-011: pin the previously-uncovered GenerationApplier
  cancellation and path-length / publish-validation paths.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Core.AlarmHistorian-008: cache queue depth in an Interlocked counter so
  EnqueueAsync no longer runs COUNT(*) on every alarm; consolidate
  DrainOnceAsync onto a single SqliteConnection per tick (purge, batch
  read, dead-letter, and outcome transaction all share it).
- Core.AlarmHistorian-011: confirm the stale Galaxy.Host XML doc
  references were already fixed under earlier commits; flip to Resolved.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Analyzers-002: drop the three dead AlarmSurfaceInvoker entries from
  the wrapper-method allow-list and from the diagnostic message.
- Analyzers-003: bail out of AnalyzeInvocation when the semantic model
  is null (was previously emitting a false positive).
- Analyzers-004: resolve guarded-interface + wrapper-method symbols
  once via CompilationStartAction and compare with SymbolEqualityComparer
  instead of formatting fully-qualified names on every invocation.
- Analyzers-005: add regression tests for default-interface-method
  reads (ReadAtTimeAsync / ReadEventsAsync on a concrete driver), with
  + without an override, and inside a CapabilityInvoker.ExecuteAsync
  lambda.
- Analyzers-007: rewrite the analyzer remarks to accurately describe
  the symbol-identity guarded-call detection, DIM handling, and the
  wrapper-lambda match heuristic.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Batch 1 cleared Open findings in Core, Core.Abstractions, Core.AlarmHistorian,
Configuration, and Analyzers.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
dohertj2 added 6 commits 2026-05-23 07:27:17 -04:00
- Core.ScriptedAlarms-003: emit OnEvent OUTSIDE _evalGate by collecting
  pending emissions during the gate-held section and flushing them after
  release; eliminates re-entrancy deadlock the docs already promised.
- Core.ScriptedAlarms-006: track every fire-and-forget Reevaluate /
  ShelvingCheck task in _inFlight; Dispose drains the set so the engine
  no longer races store writes against teardown.
- Core.ScriptedAlarms-008: store comments as ImmutableList<AlarmComment>
  so AppendComment is O(log n) instead of O(n).
- Core.ScriptedAlarms-010: document the deliberate input-quality
  asymmetry (Uncertain drives the predicate, renders {?} in the message)
  in docs/ScriptedAlarms.md and on MessageTemplate.Resolve remarks.
- Core.ScriptedAlarms-011: propagate the no-op reason through
  TransitionResult.NoOp(state, reason) and log it from
  ScriptedAlarmEngine.ApplyAsync.
- Core.ScriptedAlarms-009 (Won't Fix per recommendation): documented the
  per-evaluation dictionary allocation in docs/v2/Galaxy.Performance.md
  with a mitigation path if a future soak surfaces pressure.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Core.Scripting-005: DependencyExtractor.HandleTagCall now recognises
  raw-string literal paths by checking the StringLiteralExpression node
  kind instead of the legacy StringLiteralToken kind.
- Core.Scripting-006: scope CompiledScriptCache failed-compile eviction
  with TryRemove(KeyValuePair) so a racing retry entry is not evicted.
- Core.Scripting-008: document the per-publish assembly accretion as an
  accepted limitation in docs/VirtualTags.md.
- Core.Scripting-009: enumerate the authoritative deny-list (namespace
  prefixes + type-granular denies) in the Phase 7 decision-#6 entry to
  match ForbiddenTypeAnalyzer.
- Core.Scripting-011: pin ScriptSandbox.Build, ScriptContext.Deadband
  boundary semantics, and end-to-end factory + companion-sink
  integration.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Core.VirtualTags-004: CoerceResult now covers every scalar
  DriverDataType and throws on the default arm; Load rejects unsupported
  declared types.
- Core.VirtualTags-006: Subscribe/Unsub prune empty observer-list
  entries from _observers under the same lock with a reconfirm-on-add
  race guard.
- Core.VirtualTags-007: rewrote TimerTriggerScheduler so each TickGroup
  tracks an InFlight flag (Interlocked CAS); ticks that overlap a still-
  running tick for the same group are skipped + counted.
- Core.VirtualTags-009: DirectDependencies / DirectDependents return a
  shared static empty set on miss instead of allocating per call.
- Core.VirtualTags-010: corrected XML docs to reference the real engine
  symbols (OnUpstreamChange, CascadeAsync, etc.) instead of phantom types.
- Core.VirtualTags-011: Load now rejects scripts whose declared Writes
  target a non-registered virtual-tag path.
- Core.VirtualTags-013: DependencyCycleException renders SCC members as
  a set rather than a fabricated arrow-traversal edge path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Admin-010: vendor Bootstrap 5.3.3 (CSS + JS bundle + maps + provenance
  README) under wwwroot/lib/bootstrap and reference local paths from
  App.razor — Admin no longer pulls Bootstrap from jsDelivr.
- Admin-011: swap FleetStatusPoller's three plain dictionaries for
  ConcurrentDictionary so ResetCache can't race a poll tick.
- Admin-012: drop the EquipmentId column from EquipmentCsvImporter (per
  admin-ui.md — equipment id is system-derived from EquipmentUuid);
  EquipmentImportBatchService and the textarea placeholder updated to
  match.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Server-004: pass the role-derived display name to UserIdentity's base
  ctor (the SDK's DisplayName has no public setter) and drop the dead
  Display property; make RoleBasedIdentity internal sealed.
- Server-006: derive a bounded CancellationToken from the SDK's
  OperationContext.OperationDeadline in OnReadValue / OnWriteValue so a
  stalled driver call can no longer pin the request thread.
- Server-008: mark handled slots via CallMethodRequest.Processed = true
  in RouteScriptedAlarmMethodCalls (the SDK skips on Processed, not on a
  Good error slot).
- Server-012: PeerHttpProbeLoop.ProbeAsync stops mutating client.Timeout
  per call; uses a per-request CancellationTokenSource linked to the
  shutdown token instead.
- Server-014: wire SealedBootstrap into Program.cs via AddSealedBootstrap
  + OpcUaServerService so the generation-sealed cache + stale-config flag
  + resilient reader actually run; /healthz now reflects cache-fallback
  state.
- Server-015: replace the stale 'PR 16 / PR 17 minimum-viable scope'
  class summaries on OtOpcUaServer and OpcUaServerOptions with the
  shipped LDAP + anonymous-role + configurable security-profile prose.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Batch 2 cleared Open findings in Core.ScriptedAlarms, Core.Scripting,
Core.VirtualTags, Admin, and Server (Core.ScriptedAlarms-009 documented
under Won't Fix per the recommendation).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
dohertj2 added 1 commit 2026-05-23 07:27:55 -04:00
The Resolution prose was already recorded under Core.Scripting commit
(0454822); status was left as Open. Flip to Won't Fix to match.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
dohertj2 added 6 commits 2026-05-23 07:48:42 -04:00
- Driver.Galaxy-005: rewrite the EventPump BoundedChannelOptions comment
  to honestly describe the Wait+TryWrite pattern.
- Driver.Galaxy-010: ResolveApiKey now warns when a literal API key is
  used in production wiring; added an explicit dev: prefix for known
  cleartext-in-dev cases and rewrote the GalaxyGatewayOptions doc.
- Driver.Galaxy-012: O(1) reverse-lookup for SubscriptionRegistry
  dispatch via per-entry FullRefByItemHandle map; immutable hash-set for
  the cross-binding reverse map; SubscribeAsync / ReadViaSubscribeOnce
  use BuildResultIndex for per-reference correlation.
- Driver.Galaxy-013: ReinitializeAsync now validates the incoming JSON
  against the running options; ReplayOnSessionLost honoured by the
  Replay path; class summary rewritten to describe the shipped surface.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Driver.AbCip-007: inject an optional ILogger<AbCipDriver> /
  ILogger<AbCipAlarmProjection> (default NullLogger) and log around
  every read / write / template-fetch / probe / alarm-poll failure path.
- Driver.AbCip-011: LogWarning when InitializeAsync is configured with
  Probe.Enabled=true but ProbeTagPath is blank — operators now see why
  GetHostStatuses keeps reporting Unknown.
- Driver.AbCip-012: documented the LibplctagTemplateReader per-call
  Tag cost as accepted given libplctag's own connection pool and the
  low-frequency discovery use-case.
- Driver.AbCip-013: per-device AllowPacking + ConnectionSize overrides
  on AbCipDeviceOptions, threaded through AbCipTagCreateParams; central
  BuildCreateParams helper replaces five ad-hoc clones; AllowPacking
  now reaches Tag.AllowPacking at runtime.
- Driver.AbCip-015: stale-comment sweep — every PR-N forward-reference
  is rewritten to describe present behaviour.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Driver.AbLegacy-005: optional ILogger<AbLegacyDriver> ctor parameter,
  logged init failure / probe transitions / first non-zero libplctag
  status per device.
- Driver.AbLegacy-011: Dispose() runs the synchronous teardown directly
  instead of bridging via DisposeAsync().AsTask().GetAwaiter().GetResult()
  to remove the documented sync-over-async deadlock pattern.
- Driver.AbLegacy-013: documented the ResolveHost three-tier fallback
  chain in XML and pointed DiscoverAsync's IsArray=false comment at the
  Modbus ArrayCount pattern for the eventual multi-element follow-up.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Driver.FOCAS-007: optional ILogger<FocasDriver> + alarm-projection
  logger; log Debug around every formerly-empty catch (probe / shutdown
  / fixed-tree / recycle / alarms-read / projection).
- Driver.FOCAS-008: cache the parsed FocasAddress per tag at
  InitializeAsync; Read/WriteAsync look it up instead of re-parsing on
  every call.
- Driver.FOCAS-009: ProbeLoopAsync now wraps client.ProbeAsync in a
  linked CTS honouring Probe.Timeout so a hung CNC socket can't block
  past the configured limit.
- Driver.FOCAS-010: FocasOperationModeExtensions.ToText delegates to
  FocasOpMode.ToText — single canonical op-mode label surface.
- Driver.FOCAS-011: FocasAlarmType constants are typed short to match
  the cnc_rdalmmsg2 wire field and the projection switch arms.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Driver.S7-003: ArgumentNullException.ThrowIfNull on the references
  argument at the top of ReadAsync / WriteAsync (was reaching .Count
  before any null check).
- Driver.S7-005: drop the redundant global::S7.Net.Plc qualifiers in
  ReadOneAsync / WriteOneAsync — using S7.Net already covers Plc.
- Driver.S7-009: PollLoopAsync degrades _health to Degraded after
  sustained failure and backs off exponentially up to PollBackoffCap;
  resets on a healthy tick so an operator can see the loop wedge.
- Driver.S7-010: Dispose runs the synchronous teardown directly with a
  bounded WhenAll Wait drain instead of bridging via DisposeAsync().
- Driver.S7-013: reject unsupported S7DataType values (Int64 / UInt64 /
  Float64 / String / DateTime) at InitializeAsync so half-implemented
  types no longer leak BadNotSupported live nodes into the address space.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Batch 3 cleared Open findings in Driver.Galaxy, Driver.AbCip,
Driver.AbLegacy, Driver.FOCAS, and Driver.S7.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
dohertj2 added 6 commits 2026-05-23 08:21:10 -04:00
- Driver.TwinCAT-004: corrected the IEC time-type inline comments;
  documented that the driver currently surfaces them as raw UInt32
  counters.
- Driver.TwinCAT-006: ResolveHost returns a documented UnresolvedHost
  sentinel when no devices are configured instead of returning the
  logical DriverInstanceId (which never matches GetHostStatuses).
- Driver.TwinCAT-014: wired Probe.Timeout into the probe-loop call and
  added a NotificationMaxDelayMs config knob threaded through
  AddNotificationAsync.
- Driver.TwinCAT-015: Dispose() runs a genuinely synchronous teardown
  with bounded waits (no sync-over-async deadlock pattern).
- Driver.TwinCAT-016: pinned the Structure-tag rejection and the
  probe-loop vs read disposal race with regression tests.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Driver.Modbus-003: route every _health access through ReadHealth /
  WriteHealth helpers backed by Volatile.Read / Volatile.Write so a
  burst of concurrent ReadAsync callers always sees a complete snapshot.
- Driver.Modbus-007: promoted the Int64 / UInt64 → Int32 surfacing
  caveat to a full <remarks> block; rewrote DisableFC23's doc to flag it
  as reserved / no-op.
- Driver.Modbus-008: deleted stale duplicate doc, rewrote the
  prohibition-block summaries to credit the shipped re-probe loop, and
  removed the unused 'status' local in the ModbusException catch arm.
- Driver.Modbus-009: bind-time validation rejects StringLength < 1 for
  String tags; ModbusTcpTransport clamps keep-alive intervals to whole
  seconds (>=1).
- Driver.Modbus-010: documented WriteOnChangeOnly's cache-invalidation
  policy (reads-only) and the write-only-tag caveat.
- Driver.Modbus-011: collected the scattered instance fields into a
  single contiguous block at the top of ModbusDriver.
- Driver.Modbus-012: covered the previously-uncovered Reinitialize
  state-hygiene, malformed/truncated/empty-bitmap response, and
  DisposeAsync teardown paths.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Driver.OpcUaClient-011: rewrote the ValueRank comment with the OPC UA
  Part 3 constants and an explicit scalar/array boundary at
  valueRank >= 0.
- Driver.OpcUaClient-014: track every MonitoredItem.Notification handler
  in a MonitoredItemNotificationHandle record; UnsubscribeAsync /
  UnsubscribeAlarmsAsync / ShutdownAsync detach the handler before
  Subscription.DeleteAsync so the SDK's invocation list no longer keeps
  the driver alive.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Driver.Historian.Wonderware-004: ToHistorianEvent synthesises a fresh
  Guid when the upstream EventId is unparseable and logs the substitution
  instead of writing the historian with Guid.Empty.
- Driver.Historian.Wonderware-005: GetHealthSnapshot derives the
  connection-open booleans from the active-node fields so the snapshot
  is self-consistent without depending on the secondary lock.
- Driver.Historian.Wonderware-007: SID-mismatch branch in PipeServer now
  sends a HelloAck { Accepted=false, RejectReason } so the client sees a
  symmetric rejection.
- Driver.Historian.Wonderware-008: classify StartQuery failures —
  connection-class codes drop the connection, query-class codes throw
  QueryClassStartQueryException so the IPC layer surfaces Success=false.
- Driver.Historian.Wonderware-010: RequestTimeoutSeconds now enforced
  via BuildRequestCts linked to the caller's CancellationToken.
- Driver.Historian.Wonderware-011: refreshed XML docs to describe the
  current sidecar / named-pipe architecture (Galaxy.Host / Proxy
  references reframed as historical context).
- Driver.Historian.Wonderware-012: pinned the previously-uncovered
  HistorianDataSource behaviours with five new test files; also removed
  the stale empty tests/ZB.MOM.WW.OtOpcUa.Driver.Historian.Wonderware.Tests
  directory.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Driver.Modbus.Addressing-006: broaden the catch in TryParseFamilyNative
  so a future helper throwing a non-Argument/Overflow type still satisfies
  the try-parse contract.
- Driver.Modbus.Addressing-007: document that the address grammar does
  not carry ModbusStringByteOrder (the structured-tag path does);
  add a 'Grammar scope' bullet to docs/v2/dl205.md.
- Driver.Modbus.Addressing-009: reword the ModbusModiconAddress comments
  so they don't imply a leading-digit invariant the parser doesn't
  enforce.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Batch 4 cleared Open findings in Driver.TwinCAT, Driver.Modbus,
Driver.OpcUaClient, Driver.Historian.Wonderware, and Driver.Modbus.Addressing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
dohertj2 added 6 commits 2026-05-23 08:38:02 -04:00
- Driver.AbCip.Cli-003: SubscribeCommand prints the 'Subscribed' banner
  BEFORE wiring OnDataChange so the main thread can't interleave its
  write with the poll-thread handler.
- Driver.AbCip.Cli-004: AbCipCommandBase.Timeout and SubscribeCommand
  validate TimeoutMs / IntervalMs and throw CommandException on
  non-positive values.
- Driver.AbCip.Cli-005: every command now calls FlushLogging() in its
  finally block.
- Driver.AbCip.Cli-006: Timeout init throws NotSupportedException with a
  pointer at TimeoutMs instead of silently swallowing assignments.
- Driver.AbCip.Cli-007: added AbCipCommandBaseTests covering BuildOptions
  shape, probe / controller-browse / alarm toggles, host address, family
  selection, tag list passthrough.
- Driver.AbCip.Cli-008: rewrote the opening paragraph in
  docs/Driver.AbCip.Cli.md to credit the six-CLI roster with a pointer
  at docs/DriverClis.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Driver.AbLegacy.Cli-002: WriteCommand.Value description lists the full
  true/false, 1/0, on/off, yes/no alias set.
- Driver.AbLegacy.Cli-003: SubscribeCommand serialises every WriteLine
  via a per-execution consoleGate lock so the poll-thread OnDataChange
  handler can't interleave with the banner.
- Driver.AbLegacy.Cli-004: dropped 'await using var driver' in favour of
  a plain 'var driver' + explicit await ShutdownAsync in finally; the
  driver is no longer shut down twice.
- Driver.AbLegacy.Cli-005: SubscribeCommand.IntervalMs description
  carries the PollGroupEngine 250ms-floor caveat; docs/Driver.AbLegacy.Cli.md
  spells out the same.
- Driver.AbLegacy.Cli-006: ProbeCommand --type now carries the short
  alias 't' to match the other commands.
- Driver.AbLegacy.Cli-007: BuildOptionsTests cover the probe-disabled,
  device-shape, tag-passthrough, timeout-propagation, and empty-tag-list
  paths.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Driver.S7.Cli-004: 'await using var driver' is the sole driver
  disposal path; dropped the redundant explicit await ShutdownAsync from
  each command's finally.
- Driver.S7.Cli-005: deleted the stale empty
  tests/ZB.MOM.WW.OtOpcUa.Driver.S7.Cli.Tests/ directory (the real test
  project lives under tests/Drivers/Cli/).
- Driver.S7.Cli-006: S7CommandBaseBuildOptionsTests cover the probe
  toggle, timeout mapping, host/port/CPU/rack/slot wiring, and tag list
  passthrough.
- Driver.S7.Cli-007: re-added the SubscribeCommand handler comment
  explaining the CliFx IConsole.Output usage and that the poll-thread
  raises events.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Driver.TwinCAT.Cli-001: TwinCATCommandBase.Validate rejects
  non-positive TimeoutMs / IntervalMs and AmsPort outside 1..65535;
  ExecuteAsync calls it first.
- Driver.TwinCAT.Cli-002: SubscribeCommand serialises every WriteLine
  through a writeLock to remove the notification-callback vs banner
  interleave risk.
- Driver.TwinCAT.Cli-003: SubscribeCommand.DescribeMechanism derives
  the banner label from the returned ISubscriptionHandle.DiagnosticId
  so it can't disagree with what the driver actually did.
- Driver.TwinCAT.Cli-004: introduced TwinCATTagCommandBase carrying
  --poll-only + BuildOptions; BrowseCommand stays on the slimmer
  TwinCATCommandBase so --poll-only no longer surfaces in browse --help.
- Driver.TwinCAT.Cli-005: ProbeCommand --type now carries the 't' short
  alias to match the other commands.
- Driver.TwinCAT.Cli-006: 35 new tests covering Gateway / AmsAddress
  parse / BuildOptions / PollOnly / browse-helpers / probe-alias /
  mechanism derivation.
- Driver.TwinCAT.Cli-007: replaced the empty-init <inheritdoc/> with an
  explicit summary warning future maintainers about the no-op init.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Driver.Modbus.Cli-003: ModbusCommandBase.ValidateEndpoint rejects
  --port outside 1..65535, non-positive --timeout-ms, and --unit-id
  outside 1..247.
- Driver.Modbus.Cli-004: wrapped SubscribeCommand's OnDataChange handler
  body in a try/catch (warn-and-swallow) and serialised the console
  write through a lock.
- Driver.Modbus.Cli-005: Probe / Read / Write now catch the
  cancellation-during-init OperationCanceledException and print
  'Cancelled.' instead of dumping a stack trace.
- Driver.Modbus.Cli-006: ProbeCommand.ComputeVerdict derives the headline
  from BOTH the driver state and the probe snapshot's OPC UA quality
  class so the headline can't disagree with the wire result.
- Driver.Modbus.Cli-007: docs/Driver.Modbus.Cli.md carries an explicit
  'CLI scope' callout — the address-string grammar is a DriverConfig
  JSON feature; the CLI takes the structured triple only.
- Driver.Modbus.Cli-008: pinned BuildOptions, ValidateEndpoint, the
  region-validation guards, ComputeVerdict, and the cancellation-during-
  initialize paths.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Batch 5 cleared Open findings in Driver.AbCip.Cli, Driver.AbLegacy.Cli,
Driver.S7.Cli, Driver.TwinCAT.Cli, and Driver.Modbus.Cli.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
dohertj2 added 6 commits 2026-05-23 11:16:18 -04:00
- Driver.FOCAS.Cli-001: WriteCommand.ParseValue now wraps numeric
  FormatException / OverflowException as CliFx CommandException with
  the offending value.
- Driver.FOCAS.Cli-002: SubscribeCommand's OnDataChange handler and the
  banner both take a writeLock so notification-callback and main-thread
  writes can't interleave; handler exceptions are warn-and-swallow.
- Driver.FOCAS.Cli-003: FocasCommandBase.ValidateOptions rejects
  --cnc-port outside 1..65535, non-positive --timeout-ms, and
  non-positive --interval-ms; ExecuteAsync calls it first.
- Driver.FOCAS.Cli-004: 'await using var driver' is the sole driver
  disposal path; dropped the redundant explicit await ShutdownAsync.
- Driver.FOCAS.Cli-005 (Deferred): the fix lives in
  Driver.Cli.Common.SnapshotFormatter — explicitly naming the
  status-code shortlist there benefits every driver CLI. Left as a
  Driver.Cli.Common follow-up.
- Registered the new tests/Drivers/Cli/ZB.MOM.WW.OtOpcUa.Driver.FOCAS.Cli.Tests
  project in ZB.MOM.WW.OtOpcUa.slnx.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Driver.Cli.Common-004: confirm the FormatTable empty-input guard
  landed earlier (commit 1433a1c); flip status to Resolved with a
  cross-reference.
- Driver.Cli.Common-006: reword the SnapshotFormatter source-time
  column comment to describe the actual behaviour (right-most column,
  unmeasured, '-' for null timestamps) and confirm the
  DriverCommandBase summary now enumerates FOCAS.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Driver.Historian.Wonderware.Client-003: replaced the mixed Interlocked
  + healthLock counters with RecordOutcome that touches _totalQueries
  and exactly one of _totalSuccesses / _totalFailures under one
  acquisition.
- Driver.Historian.Wonderware.Client-004: InvokeAndClassifyAsync routes
  transport + sidecar classification through a single RecordOutcome
  call; the legacy ReclassifySuccessAsFailure two-step is gone.
- Driver.Historian.Wonderware.Client-006: removed the dead
  ReconnectInitialBackoff / ReconnectMaxBackoff options and added a
  doc <remarks> stating the channel performs a single in-place
  reconnect; retry/backoff stays with the caller.
- Driver.Historian.Wonderware.Client-008: the audit-suppression comment
  block now records advisory titles, why neither applies, and the
  revisit trigger.
- Driver.Historian.Wonderware.Client-010: reworded Dispose() to claim
  deadlock-safety and added a GetHealthSnapshot summary documenting the
  single-channel collapse + counter invariant.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Client.CLI-002: SubscribeCommand's neverWentBad list now requires the
  node to be present in lastStatus (i.e. received at least one update)
  so the 'suspect' bucket only contains observed nodes.
- Client.CLI-003: every long-running command validates numeric option
  ranges (Interval / Depth / MaxDepth / Duration / Max) and throws
  CliFx CommandException on out-of-range values.
- Client.CLI-004: SubscribeCommand carries XML summary docs on the
  type, ctor, every [CommandOption] property, and ExecuteAsync —
  matching the sibling commands' style.
- Client.CLI-006: HistoryReadCommand parses --start / --end with
  InvariantCulture+UTC and surfaces FormatException as CommandException;
  every NodeIdParser.ParseRequired call wraps FormatException /
  ArgumentException as CommandException.
- Client.CLI-007: CommandBase.ConfigureLogging calls Log.CloseAndFlush()
  before assigning a new Log.Logger so prior sinks are disposed.
- Client.CLI-008: rewrote the subscribe and historyread sections of
  docs/Client.CLI.md (every flag documented, summary-bucket vocabulary,
  StandardDeviation aggregate, UTC --start/--end convention).
- Client.CLI-009: SubscribeCommand / AlarmsCommand use named local
  handlers and detach them via -= after UnsubscribeAsync so no
  notification reaches the console after the command's output phase
  ends.
- Client.CLI-010: added CommandRangeValidationTests,
  EventHandlerLifecycleTests, InputValidationErrorsTests,
  LoggerLifecycleTests, and SubscribeCommandSummaryTests pinning every
  Low fix; FakeOpcUaClientService gained AddDiscoveredVariable +
  RaiseDataChanged + BrowseResultsByParent helpers.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Client.Shared-003: DefaultSessionAdapter.WriteValueAsync / CallMethodAsync
  guard against null/empty Results and throw ServiceResultException with
  the response's ServiceResult code instead of indexing into a missing
  list.
- Client.Shared-004: DefaultSessionAdapter.CloseAsync / HistoryReadRawAsync
  / HistoryReadAggregateAsync use the Session.*Async overloads and honour
  the caller's CancellationToken.
- Client.Shared-009: AcknowledgeAlarmAsync returns the underlying
  ServiceResultException.StatusCode on failure instead of always Good;
  IOpcUaClientService doc updated to describe the new contract.
- Client.Shared-010: ConnectionSettings.CertificateStorePath defaults to
  empty; DefaultApplicationConfigurationFactory resolves the canonical
  PKI path lazily, so per-failover ConnectionSettings copies don't hit
  the filesystem.
- Client.Shared-011: added the alarm-fallback regression test, extracted
  EndpointSelector as a pure static, and added EndpointSelectorTests
  covering security-mode match, Basic256Sha256 preference, fallback,
  diagnostics, hostname rewrite, and null/empty guards.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Batch 6 cleared Open findings in Driver.FOCAS.Cli (1 deferred to
Driver.Cli.Common), Driver.Cli.Common, Driver.Historian.Wonderware.Client,
Client.CLI, and Client.Shared.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
dohertj2 added 2 commits 2026-05-23 11:28:17 -04:00
- Client.UI-003: wire Serilog properly per CLAUDE.md — console sink +
  rolling daily file sink in Program.Main, Log.CloseAndFlush in finally,
  per-VM Log.ForContext<> loggers.
- Client.UI-004: migrate the cert-store folder picker from the obsolete
  OpenFolderDialog to StorageProvider.OpenFolderPickerAsync (with
  TryGetFolderFromPathAsync seed + TryGetLocalPath extraction).
- Client.UI-006: surface formerly silent catch blocks via an observable
  StatusMessage on the Subscriptions / Alarms VMs that bubbles up into
  the shell's status bar; soft fallbacks log at Information level so
  hard failures stay distinguishable.
- Client.UI-009: docs/Client.UI.md now lists Standard Deviation in the
  Aggregate row of the Query Options table.
- Client.UI-010: removed the unused MinDateTimeProperty /
  MaxDateTimeProperty styled properties from DateTimeRangePicker.
- Client.UI-011: updated the cert-store TextBox watermark from the
  legacy AppData/LmxOpcUaClient/pki to the canonical
  AppData/OtOpcUaClient/pki.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Batch 7 closed the last Open findings in Client.UI. The review backlog
is now empty: 0 Open findings across all 31 modules.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
dohertj2 closed this pull request 2026-05-23 15:10:09 -04:00
dohertj2 deleted branch feat/scripted-alarm-shelve-routing 2026-05-23 15:10:09 -04:00

Pull request closed

Sign in to join this conversation.