- Core.VirtualTags-004: CoerceResult now covers every scalar
DriverDataType and throws on the default arm; Load rejects unsupported
declared types.
- Core.VirtualTags-006: Subscribe/Unsub prune empty observer-list
entries from _observers under the same lock with a reconfirm-on-add
race guard.
- Core.VirtualTags-007: rewrote TimerTriggerScheduler so each TickGroup
tracks an InFlight flag (Interlocked CAS); ticks that overlap a still-
running tick for the same group are skipped + counted.
- Core.VirtualTags-009: DirectDependencies / DirectDependents return a
shared static empty set on miss instead of allocating per call.
- Core.VirtualTags-010: corrected XML docs to reference the real engine
symbols (OnUpstreamChange, CascadeAsync, etc.) instead of phantom types.
- Core.VirtualTags-011: Load now rejects scripts whose declared Writes
target a non-registered virtual-tag path.
- Core.VirtualTags-013: DependencyCycleException renders SCC members as
a set rather than a fabricated arrow-traversal edge path.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Core.Scripting-005: DependencyExtractor.HandleTagCall now recognises
raw-string literal paths by checking the StringLiteralExpression node
kind instead of the legacy StringLiteralToken kind.
- Core.Scripting-006: scope CompiledScriptCache failed-compile eviction
with TryRemove(KeyValuePair) so a racing retry entry is not evicted.
- Core.Scripting-008: document the per-publish assembly accretion as an
accepted limitation in docs/VirtualTags.md.
- Core.Scripting-009: enumerate the authoritative deny-list (namespace
prefixes + type-granular denies) in the Phase 7 decision-#6 entry to
match ForbiddenTypeAnalyzer.
- Core.Scripting-011: pin ScriptSandbox.Build, ScriptContext.Deadband
boundary semantics, and end-to-end factory + companion-sink
integration.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Core.ScriptedAlarms-003: emit OnEvent OUTSIDE _evalGate by collecting
pending emissions during the gate-held section and flushing them after
release; eliminates re-entrancy deadlock the docs already promised.
- Core.ScriptedAlarms-006: track every fire-and-forget Reevaluate /
ShelvingCheck task in _inFlight; Dispose drains the set so the engine
no longer races store writes against teardown.
- Core.ScriptedAlarms-008: store comments as ImmutableList<AlarmComment>
so AppendComment is O(log n) instead of O(n).
- Core.ScriptedAlarms-010: document the deliberate input-quality
asymmetry (Uncertain drives the predicate, renders {?} in the message)
in docs/ScriptedAlarms.md and on MessageTemplate.Resolve remarks.
- Core.ScriptedAlarms-011: propagate the no-op reason through
TransitionResult.NoOp(state, reason) and log it from
ScriptedAlarmEngine.ApplyAsync.
- Core.ScriptedAlarms-009 (Won't Fix per recommendation): documented the
per-evaluation dictionary allocation in docs/v2/Galaxy.Performance.md
with a mitigation path if a future soak surfaces pressure.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Core.AlarmHistorian-008: cache queue depth in an Interlocked counter so
EnqueueAsync no longer runs COUNT(*) on every alarm; consolidate
DrainOnceAsync onto a single SqliteConnection per tick (purge, batch
read, dead-letter, and outcome transaction all share it).
- Core.AlarmHistorian-011: confirm the stale Galaxy.Host XML doc
references were already fixed under earlier commits; flip to Resolved.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Configuration-004: NodePermissions stored as int to match the EF
HasConversion<int>() in OtOpcUaConfigDbContext.ConfigureNodeAcl.
- Configuration-005: serialise LiteDbConfigCache.PutAsync so concurrent
Put for the same (ClusterId, GenerationId) cannot duplicate rows.
- Configuration-007: rethrow OperationCanceledException from
GenerationApplier.ApplyPass when the caller's token is cancelled.
- Configuration-010: scrub secrets and drop the full exception object
from the ResilientConfigReader fallback warning log.
- Configuration-011: pin the previously-uncovered GenerationApplier
cancellation and path-length / publish-validation paths.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Core-004: add ConfigureAwait(false) to DriverHost.RegisterAsync /
UnregisterAsync / DisposeAsync.
- Core-008: rewrite the BuildAddressSpaceAsync XML doc to correctly name
the caller (OpcUaApplicationHost.PopulateAddressSpaces) that owns the
per-driver isolation.
- Core-009: snapshot DriverResilienceOptions once per non-idempotent write
in CapabilityInvoker.ExecuteWriteAsync.
- Core-010: switch DriverResilienceOptions.Resolve to TryGetValue with a
diagnostic error message when a tier table is missing a capability.
- Core-011: add an optional diagnostic callback to PermissionTrieBuilder
so production callers can surface scope-path mismatches.
- Core-012: correct the stale WedgeDetector ctor summary and add the
Reconnecting row to DriverHealthReport's state matrix.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Core.Abstractions-004: guard DriverTypeRegistry.Register with a Lock so
concurrent registrations are atomic.
- Core.Abstractions-005: narrow PollGroupEngine catch blocks to non-fatal
exceptions, add optional onError callback, tolerate disposed-CTS races.
- Core.Abstractions-006: document the deliberate int-vs-uint asymmetry on
IHistoryProvider.ReadEventsAsync / IHistorianDataSource.ReadEventsAsync.
- Core.Abstractions-007: pin the gaps with PollGroupEngine + DriverHealth
contract tests.
- Core.Abstractions-008: correct XML docs on DriverHealth.LastError and
the optional / required asymmetry on the history-read surfaces.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add #pragma warning disable xUnit1051 at the top of ContractsWireParityTests.cs.
The xUnit1051 analyser fires on MessagePack's Serialize/Deserialize overloads that
have an optional CancellationToken parameter; these are synchronous parity tests
where the token is not meaningful — the suppression is scoped to this file only.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
System.Threading.Thread is in the System.Threading namespace (not
System.Threading.Thread), so the existing ForbiddenNamespacePrefixes
entry "System.Threading.Thread" never matched — the namespace prefix
check compared against the type's containing namespace, which is
System.Threading. Move Thread into ForbiddenFullTypeNames (alongside
Environment / AppDomain / GC / Activator) where it is matched by exact
fully-qualified type name, which actually fires. Remove the dead
namespace-prefix entry and document why. The Rejects_Thread_new_at_compile
test now passes. (Core.Scripting-010.)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
In TimedScriptEvaluator.RunAsync, the catch (TimeoutException) block
now checks ct.IsCancellationRequested before throwing
ScriptTimeoutException, so a caller cancellation that races a timeout
deterministically surfaces as OperationCanceledException regardless of
which WaitAsync observes first. Regression test
Caller_cancellation_wins_even_when_timeout_fires_first added.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
DependencyExtractor.VisitInvocationExpression now additionally checks
that the member-access receiver is the identifier "ctx" before treating
a GetTag / SetVirtualTag call as a ScriptContext dependency. This
prevents spurious dependencies when a script defines a local helper type
with a matching method name and calls it as other.GetTag("X"). Test
Ignores_member_access_GetTag_on_non_ctx_receiver added.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add System.Threading.Tasks to ForbiddenNamespacePrefixes so scripts
cannot use Task.Run / Parallel to spawn background work that outlives
the per-evaluation timeout. Document the unbounded-memory accepted
trade-off and the Task denial rationale in docs/VirtualTags.md (new
"Known resource limits" subsection) and cross-reference from
docs/ScriptedAlarms.md.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`PlcTagHandle` and `DeviceState.TagHandles` were dead scaffolding: the
`ReleaseHandle` no-op never called `plc_tag_destroy` and the dict was
never populated. Removed the file, the dead dict, and its
`DisposeHandles` loop. Updated the `AbCipDriver` class doc to document
that native lifetime is owned by libplctag.NET `Tag.Dispose()` (invoked
from `DisposeHandles`) with the library's own finalizer covering any
GC-collected instances. Two test methods that only exercised the dead
`PlcTagHandle` class removed from `AbCipDriverTests`.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Core.VirtualTags-002: cold-start guard publishes BadWaitingForInitialData
instead of silently returning a stale value.
Core.VirtualTags-003: Load detects duplicate Path values and keys the
upstream-subscription loop off the registered tag set.
Core.VirtualTags-005: VirtualTagSource fires the initial-data callback per
path before registering the change observer, fixing an ordering race.
Core.VirtualTags-008: DependencyGraph caches topological rank, lowering
per-change-event cost from O(V+E) to O(closure).
Core.VirtualTags-012: added 9 engine tests; CoerceResult null-return now
maps to BadInternalError as the code comment intended.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Core.Abstractions-001: PollGroupEngine compares array values with structural
equality so a driver returning a fresh T[] each poll no longer fires spuriously.
Core.Abstractions-002: PollOnceAsync guards reader result cardinality and
throws a descriptive InvalidOperationException on mismatch instead of a
swallowed ArgumentOutOfRangeException that stalled the subscription.
Core.Abstractions-003: the poll loop Task is tracked; Unsubscribe/DisposeAsync
await loop completion before disposing the CTS.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add engine-level tests covering the six gaps identified in the finding:
(1) timed-shelve auto-expiry driven via injectable clock + RunShelvingCheckForTest
hook so timer tests are deterministic;
(2) ConfirmAsync, TimedShelveAsync/UnshelveAsync round-trip, EnableAsync engine
methods exercised end-to-end;
(3) OnEvent subscriber-throws isolation — engine state advances and stays
operational after a subscriber throws;
(4) IAlarmStateStore.SaveAsync failure leaves in-memory state unchanged (locks in
the persist-before-update invariant from finding-007);
(5) second LoadAsync does not leak the old timer (regression for finding-002);
(6) AreInputsReady cold-start guard correctly blocks on Bad/missing inputs and
allows Uncertain-quality inputs through.
Expose RunShelvingCheckForTest() internal method on ScriptedAlarmEngine to
support deterministic timer tests.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
SubscribeAsync now wraps each driver handle in a private HostBoundHandle
that carries the resolved host name. UnsubscribeAsync unwraps it and
routes through the recorded host's resilience pipeline, correctly
charging the subscription's originating host's circuit breaker/bulkhead
instead of always using the default host. Falls back to the default
host for handles not created by this invoker. Two regression tests
added; update findings.md Open count from 10 to 6.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
BuildAddressSpaceAsync now checks _disposed (throws ObjectDisposedException)
and tears down the previous alarm forwarder + clears the sink registry
before re-walking, so a Galaxy-redeploy rebuild does not leak the old
forwarder and double-deliver alarm transitions. Three regression tests
added: double-build does not double-fire, sink count is correct after
rebuild, and post-dispose call throws.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Change ClusterEntry from sealed record to sealed class so TryUpdate
uses reference equality for the CAS comparison. Prune now uses a
read-compute-TryUpdate retry loop that restarts when a concurrent
Install updates the entry between the read and the write, preventing
a race that could silently drop the just-installed newest generation.
Two regression tests added to PermissionTrieCacheTests.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add FolderSegment member to NodeAclScopeKind; update WalkSystemPlatform
to report NodeAclScopeKind.FolderSegment (not Equipment) for each
visited Galaxy folder level, so MatchedGrant.Scope in
AuthorizationDecision.Provenance correctly distinguishes Galaxy folder
grants from UNS Equipment grants in the audit trail and Admin UI
diagnostics. Three regression tests added to PermissionTrieTests.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Reorder persist/update in ApplyAsync, ReevaluateAsync, and ShelvingCheckAsync:
SaveAsync is now called before the in-memory _alarms entry is advanced. A store
failure therefore leaves both the persisted and in-memory views at the prior state
rather than diverging, maintaining the invariant that startup recovery reflects
actual persisted state.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add _disposed re-checks inside ReevaluateAsync and ShelvingCheckAsync after
acquiring _evalGate so callbacks in flight when Dispose() runs bail out cleanly
instead of mutating _alarms or writing to a disposed store. Drop the
_alarms.Clear() from Dispose() — clearing outside the gate races concurrent
reads and is unnecessary since the object is being discarded.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Split the LoadAsync seed-read + subscribe loop: ReadTag seed fills _valueCache
first, then persisted-state restore runs, then _loaded = true, then SubscribeTag
is called. Any synchronous initial push from the upstream now arrives after
_alarms is fully initialised and _loaded = true, so ReevaluateAsync will queue
correctly behind the gate rather than racing the half-built state.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Dispose any existing _shelvingTimer before reassigning it inside LoadAsync so
that a second LoadAsync call does not leak the old timer and leave two timers
running concurrently against the same engine state.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Configuration-002: sp_PublishGeneration is transaction-nesting aware
(BEGIN TRANSACTION vs SAVE TRANSACTION on @@TRANCOUNT) so a caller's outer
transaction survives a publish failure; sp_ValidateDraft wrapped in TRY/CATCH.
Configuration-003: ValidatePathLength uses the cluster's actual Enterprise/Site
lengths when available, falling back to the conservative approximation.
Configuration-006: ResilientConfigReader treats a command-timeout
TaskCanceledException as a fault (not caller cancellation) and falls back.
Configuration-009: removed the checked-in plaintext sa connection string;
CreateDbContext now requires OTOPCUA_CONFIG_CONNECTION.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add @ReleasedBy parameter to sp_ReleaseExternalIdReservation via a new EF
migration so the operator principal (not the shared SQL account) is recorded
in ExternalIdReservation.ReleasedBy and ConfigAuditLog.Principal.
ReservationService.ReleaseAsync gains a releasedBy parameter; Reservations.razor
resolves the signed-in user from AuthenticationState and passes it through.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
OnScriptSetVirtualTag updated the value cache, notified observers, and
recorded history for the written path but never scheduled a cascade for
tags depending on that path. This contradicts docs/VirtualTags.md, which
states ctx.SetVirtualTag writes "still participate in change-trigger
cascades": a change-triggered virtual tag reading a script-written tag
went stale until an unrelated trigger fired.
OnScriptSetVirtualTag now launches a fire-and-forget CascadeAsync for the
written path, mirroring OnUpstreamChange. The cascade is scheduled rather
than invoked inline because the callback runs inside EvaluateInternalAsync
while the non-reentrant _evalGate semaphore is held.
Added regression test
SetVirtualTag_within_script_cascades_to_dependents_of_the_written_tag.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
_alarms was a plain Dictionary<string, AlarmState> mutated under the
_evalGate semaphore, but four read paths (GetState, GetAllStates, the
LoadedAlarmIds property, and RunShelvingCheck) touched it from arbitrary
threads with no synchronisation. A Dictionary read concurrent with a
writer's entry reassignment can throw InvalidOperationException or return
torn state.
Switched _alarms to ConcurrentDictionary<string, AlarmState>. The only
write shapes are indexer-set and Clear, both atomic on ConcurrentDictionary,
so all mutations stay correct without further change; reads now get safe
snapshot semantics. LoadedAlarmIds materialises the key snapshot to keep
its IReadOnlyCollection<string> return type. This matches _valueCache,
which is already a ConcurrentDictionary.
Added a regression test (Concurrent_reads_during_mutation_do_not_throw)
that hammers the engine with state mutations while four reader threads
continuously call the three unguarded read paths.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Core.AlarmHistorian-002 — drain loop now honors exponential backoff:
StartDrainLoop arms a self-rescheduling one-shot Timer. RescheduleDrain
sets the next due-time to max(tickInterval, CurrentBackoff) while the
sink is BackingOff, so a historian outage genuinely slows the cadence
down the 1s->2s->5s->15s->60s ladder instead of hammering at the fixed
tick. Class doc-comment updated.
Core.AlarmHistorian-004 — SQLite busy handling: the connection string
is built via SqliteConnectionStringBuilder with DefaultTimeout=5, and a
new OpenConnection helper applies PRAGMA busy_timeout=5000 and
PRAGMA journal_mode=WAL on every open. A concurrent enqueue-vs-drain
file-lock collision now waits the lock out instead of failing fast with
SQLITE_BUSY. All connection open sites switched to the helper.
Core.AlarmHistorian-006 — drain-loop faults are no longer unobserved:
the timer callback (DrainTimerCallback) awaits DrainOnceAsync inside a
try/catch that logs via _logger.Error, records the message into
_lastError, and sets _drainState=BackingOff so a stalled drain is
visible on GetStatus; a finally always re-arms the timer.
Regression tests added to SqliteStoreAndForwardSinkTests:
StartDrainLoop_honors_backoff_and_slows_cadence_under_retry,
StartDrainLoop_keeps_steady_cadence_when_writer_is_healthy,
StartDrainLoop_records_drain_fault_and_keeps_running,
Concurrent_enqueue_and_drain_do_not_throw_sqlite_busy.
findings.md: 002/004/006 marked Resolved; open count 10 -> 7.
Build: clean (0 warnings). Tests: 20/20 passing.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Core-001: swap the authorization-cache defaults so
MembershipFreshnessInterval (5 min, inner re-resolve trigger) is
strictly less than AuthCacheMaxStaleness (15 min, fail-closed
ceiling), so NeedsRefresh's warm-refresh path is reachable.
Core-002: TriePermissionEvaluator.Authorize now compares the trie's
GenerationId against the session's AuthGenerationId and re-fetches the
session's bound generation on mismatch, failing closed when that
generation has been pruned.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Configuration-001: wrap the EXEC dbo.sp_ValidateDraft call in
sp_PublishGeneration in a BEGIN TRY/CATCH ROLLBACK; THROW block so a
validation RAISERROR aborts the publish instead of being ignored.
Configuration-008: route caller-supplied strings interpolated into
ConfigAuditLog.DetailsJson through STRING_ESCAPE(@x, 'json') and emit
sp_RollbackToGeneration's @TargetGenerationId as a bare JSON number,
closing the JSON-injection / denial-of-operation vector.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The ForbiddenTypeAnalyzer syntax walker only inspected four node kinds
(ObjectCreation, Invocation-with-member-access, MemberAccess, bare
Identifier), so a forbidden type named through typeof, a generic type
argument, a cast, an is/as type pattern, default(T), an array-creation
element type, or an explicitly-typed local declaration produced no
examined node and bypassed the sandbox check.
Analyze now runs a second pass that resolves GetTypeInfo on every
TypeSyntax node and recursively unwraps array element types and generic
type arguments, so forbidden types nested at any depth are rejected at
compile. The original member/call node-kind switch is kept deliberately
narrow (rather than resolving GetSymbolInfo on every node) to avoid
flagging harmless inherited members such as typeof(int).Name, whose Name
property is declared by System.Reflection.MemberInfo. A span+type dedupe
keeps the two passes from emitting duplicate rejections.
Regression tests added in ScriptSandboxTests cover typeof, generic type
arguments, casts, default(T), is/as patterns, array element types, and
typed local declarations with forbidden types, plus over-block guards
asserting allowed generics and typeof still compile.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ReadBatch built parallel rowIds / events lists: rowIds.Add ran for every
row but events.Add was guarded by `if (evt is not null)`. A corrupt /
null-deserializing payload desynced the lists, so DrainOnceAsync applied
each outcome to the wrong RowId — an Ack could delete an un-sent event
(silent alarm-event data loss) and the corrupt row stalled the queue
head forever.
ReadBatch now returns a single list of QueueRow(long RowId,
AlarmHistorianEvent? Event) records so a rowId can never drift from its
event; deserialization is wrapped to yield null on JsonException.
DrainOnceAsync immediately dead-letters rows whose payload is
null/un-deserializable and forwards only well-formed events to the
writer, mapping outcomes by RowId.
Regression tests cover a corrupt row mid-batch and at the queue head.
Core.AlarmHistorian suite: 16/16 pass.
Resolves code-review finding Core.AlarmHistorian-001 (Critical).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ForbiddenTypeAnalyzer used only a namespace-prefix deny-list. System.Environment,
System.AppDomain, System.GC and System.Activator live directly in the System
namespace, which must stay allowed for primitives (Math, String, ...), so they
were never caught — an operator-authored predicate could call
System.Environment.Exit(0) and terminate the in-process OPC UA server.
Add a type-granular deny-list (ForbiddenFullTypeNames) checked by
fully-qualified type name after the namespace-prefix check; legitimate System
types are unaffected.
Regression tests assert scripts referencing Environment/AppDomain/GC/Activator
are rejected at analysis time. Core.Scripting suite: 68/68 pass.
Resolves code-review finding Core.Scripting-001 (Critical).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Group all 69 projects into category subfolders under src/ and tests/ so the
Rider Solution Explorer mirrors the module structure. Folders: Core, Server,
Drivers (with a nested Driver CLIs subfolder), Client, Tooling.
- Move every project folder on disk with git mv (history preserved as renames).
- Recompute relative paths in 57 .csproj files: cross-category ProjectReferences,
the lib/ HintPath+None refs in Driver.Historian.Wonderware, and the external
mxaccessgw refs in Driver.Galaxy and its test project.
- Rebuild ZB.MOM.WW.OtOpcUa.slnx with nested solution folders.
- Re-prefix project paths in functional scripts (e2e, compliance, smoke SQL,
integration, install).
Build green (0 errors); unit tests pass. Docs left for a separate pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>