Commit Graph

1450 Commits

Author SHA1 Message Date
Joseph Doherty 4765706e94 feat(dcl): coerce OPC UA array reads to typed list attributes; Bad quality on element mismatch 2026-06-16 15:39:19 -04:00
Joseph Doherty 872ce2b565 feat(validation): semantic checks for List attributes (element type, default value, trigger operands) 2026-06-16 15:38:18 -04:00
Joseph Doherty a1d464b50d fix(siteruntime): encode list attribute writes via AttributeValueCodec (was .ToString())
Replace value?.ToString() with AttributeValueCodec.Encode(value) in
AttributeAccessor indexer set and SetAsync, so a List<string>{"a","b"}
encodes to ["a","b"] instead of the garbage ToString representation.
Add using ZB.MOM.WW.ScadaBridge.Commons.Types. Tests verify the codec
contract (list→JSON array, scalar passthrough, null); full round-trip
through the accessor is not viable without a live Akka ActorSystem —
noted in-test with explanation.
2026-06-16 15:38:00 -04:00
Joseph Doherty ba414cbb68 feat(comm): stream List attribute values as canonical JSON
Replace ValueFormatter.FormatDisplayValue with AttributeValueCodec.Encode
in StreamRelayActor so List<T> attribute values cross the gRPC wire as a
JSON array (e.g. ["a","b"]) rather than a comma-joined display string.
Scalars and null values are unaffected. Tests cover List→JSON, scalar
string pass-through, and null→empty-string.
2026-06-16 15:37:33 -04:00
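The encoding contract described in the two commits above (list to JSON array, scalar pass-through, null to empty string) can be illustrated with a minimal sketch. This is Python rather than the project's C#, and `encode_attribute_value` is a hypothetical stand-in, not the real `AttributeValueCodec.Encode`; the handling of non-string scalars is an assumption.

```python
import json


def encode_attribute_value(value):
    """Hypothetical stand-in for the codec's encode step (not the real C# API)."""
    if value is None:
        # null -> empty string, as the stream-relay tests describe
        return ""
    if isinstance(value, (list, tuple)):
        # Compact separators produce ["a","b"] with no spaces after commas.
        return json.dumps(list(value), separators=(",", ":"))
    if isinstance(value, str):
        # Scalar strings pass through unchanged.
        return value
    # Assumption: other scalars use their plain string form.
    return str(value)


if __name__ == "__main__":
    print(encode_attribute_value(["a", "b"]))
    print(encode_attribute_value("plain"))
```

The point of a canonical JSON form is that the receiving side can decode the list without guessing at a delimiter, which a comma-joined display string does not allow when an element itself contains a comma.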
Joseph Doherty 492b41f0fd fix(multivalue): Wave-2 review fixes (MV-2/MV-4/MV-12)
- MV-2: guard unsupported element type before parse (no misleading re-wrap); add Float round-trip test
- MV-4: carry ElementDataType through the two validation-flatten ResolvedAttribute sites (ManagementActor.HandleValidateTemplate, BundleImporter.BuildFlattenedConfigForValidation) so MV-5 validation sees element type via every entry point
- MV-12: include ElementDataType in TemplateAttribute add/update audit payloads + fix stale docstring
2026-06-16 15:33:27 -04:00
Joseph Doherty 02aff2436e feat(template): carry ElementDataType through flatten/override 2026-06-16 15:24:31 -04:00
Joseph Doherty e7e34b26f1 feat(transport): round-trip ElementDataType for List attributes
Add DataType? ElementDataType to TemplateAttributeDto (optional, default null
for backward-compat with old bundles). Map it in both directions in
EntitySerializer (export + FromBundleContent) and in all three
TemplateAttribute construction sites in BundleImporter (BuildTemplate,
SyncTemplateAttributesAsync add-path, and SyncTemplateAttributesAsync
update-path including change-detection). Two new round-trip tests in
EntitySerializerTests confirm List attributes survive export→import and that
old DTOs with null ElementDataType import cleanly.
2026-06-16 15:23:39 -04:00
Joseph Doherty 4a4b3d677d feat(db): migration for ElementDataType + widen attribute Value to nvarchar(max) (idempotent) 2026-06-16 15:23:13 -04:00
Joseph Doherty 8bd8079a7f feat(commons): AttributeValueCodec for canonical list value encode/decode 2026-06-16 15:21:56 -04:00
Joseph Doherty 70fa0e7397 feat(commons): add DataType.List + ElementDataType companion for multi-value attributes 2026-06-16 15:18:12 -04:00
Joseph Doherty 09d7319958 docs: implementation plan for structured multi-value (List) attributes
15 tasks (MV-1..MV-15) with classifications, dependencies, and TDD steps:
type model, AttributeValueCodec, idempotent migration, flatten, validation,
runtime encode/decode, DCL array coercion, stream encode, management, CLI,
transport, two UI editors, and integration verification.
2026-06-16 15:16:06 -04:00
Joseph Doherty b238228d8b docs: design for structured multi-value (List) attributes
Add a first-class DataType.List + ElementDataType companion so object
attributes can store homogeneous scalar lists (e.g. MoveInWorkOrderNumbers,
MoveInPartNumbers) across all four lifecycle paths: script write/read,
static authored default, OPC UA array read, OPC UA array write.

Canonical JSON value codec; whole-list override; element type fixed by base;
idempotent migration widening Value to nvarchar(max) + adding ElementDataType.
Approved via brainstorming.
2026-06-16 15:10:32 -04:00
Joseph Doherty 2c8706ca67 Merge feature/inbound-xapikey: accept X-API-Key as inbound-API credential transport
Inbound API now accepts the credential from either Authorization: Bearer sbk_... OR
X-API-Key: sbk_... (raw token), via the SAME peppered-HMAC verifier (Authorization
precedence preserved; failure path / scope checks unchanged). 16/16 inbound-auth tests.
2026-06-16 14:07:20 -04:00
Joseph Doherty 1392fd144a test(inbound-api): X-API-Key review nits — whitespace-auth fallthrough test + dedupe + comment wording
- Add WhitespaceAuthorization_ValidXApiKey_Returns200: pins the IsNullOrWhiteSpace
  fall-through — a present-but-blank Authorization header is treated as absent so a
  valid X-API-Key still authenticates (200).
- Remove MissingBearer_Returns401 (added in 510559e): identical path to
  NeitherHeader_Returns401 (no Authorization + no X-API-Key → 401); keep the
  descriptively-named NeitherHeader variant.
- Change "legacy 'X-API-Key'" -> "alternate 'X-API-Key'" in EndpointExtensions.cs and
  the BuildPostWithApiKeyHeader/HappyPath doc comments to avoid implying Bearer is
  the older transport (Bearer was itself introduced by the prior auth re-arch).
2026-06-16 14:06:03 -04:00
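The header-selection behaviour described in the X-API-Key commits (Authorization takes precedence, a blank Authorization header counts as absent, X-API-Key carries the raw token) can be sketched as follows. This is an illustrative Python sketch, not the project's C# code; `extract_credential` is a hypothetical name, the plain-dict header lookup ignores case-insensitivity, and the treatment of a non-blank, non-Bearer Authorization header is an assumption.

```python
def extract_credential(headers):
    """Return the raw token to hand to the verifier, or None if no credential is present."""
    auth = headers.get("Authorization")
    if auth is not None and auth.strip():
        # A non-blank Authorization header wins over X-API-Key.
        scheme, _, token = auth.strip().partition(" ")
        if scheme.lower() == "bearer" and token.strip():
            return token.strip()
        # Assumption: a non-Bearer Authorization header does not fall through.
        return None
    # Missing or whitespace-only Authorization: fall through to X-API-Key.
    api_key = headers.get("X-API-Key")
    if api_key is not None and api_key.strip():
        return api_key.strip()
    return None


if __name__ == "__main__":
    print(extract_credential({"Authorization": "   ", "X-API-Key": "sbk_example"}))
```

Whichever header supplies the token, the same verifier runs on it afterwards, so the transport choice does not change the failure path or scope checks.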
Joseph Doherty 510559e1be feat(inbound-api): accept X-API-Key header as credential transport alongside Authorization: Bearer 2026-06-16 14:01:01 -04:00
Joseph Doherty 7362598b69 Merge feature/disable-login: dev auto-login flag (ScadaBridge:Security:Auth:DisableLogin)
Faithful port of OtOpcUa: AutoLoginAuthenticationHandler under the cookie scheme when
the flag is true → all-roles system-wide multi-role principal; loud warning; no env guard.
Full-solution build green; Security suite 136/136.
2026-06-16 08:56:46 -04:00
Joseph Doherty 13cd53ad1c docs(plan): mark disable-login DL-1..DL-4 complete 2026-06-16 08:55:09 -04:00
Joseph Doherty 57302500ac docs(security): document dev disable-login flag + ship default-false config key
Adds a "Dev Disable-Login Flag" subsection to Component-Security.md covering
ScadaBridge:Security:Auth:DisableLogin / User, the AutoLoginAuthenticationHandler
mechanism, and the no-environment-guard / startup-warning production risk.

Ships DisableLogin: false under ScadaBridge → Security → Auth in:
  - src/.../Host/appsettings.json (canonical default)
  - docker/central-node-a/appsettings.Central.json
  - docker/central-node-b/appsettings.Central.json

Also records DL-3 commit SHAs in the plan tasks file.
2026-06-16 08:54:11 -04:00
Joseph Doherty 75919cec31 test(security): DL-3 review nits — assert OnValidatePrincipal on prod path + warning/doc polish 2026-06-16 08:52:28 -04:00
Joseph Doherty e89604298d feat(security): wire DisableLogin flag — auto-login scheme + startup warning 2026-06-16 08:47:19 -04:00
Joseph Doherty 0926ce4dda test(security): DL-2 review nits — assert IsAuthenticated + clarify handler flag gating 2026-06-16 08:44:06 -04:00
Joseph Doherty dcd445a380 feat(security): AutoLoginAuthenticationHandler — all-roles system-wide dev auto-login 2026-06-16 08:40:30 -04:00
Joseph Doherty 72691e5577 feat(security): AuthDisableLoginOptions + Roles.All for dev auto-login 2026-06-16 08:36:48 -04:00
Joseph Doherty 56d6508a5b docs(plan): implementation plan for dev disable-login flag (4 tasks) 2026-06-16 08:35:05 -04:00
Joseph Doherty 5cf2d1cb99 docs(plan): design for dev disable-login auto-login flag (port from OtOpcUa)
Faithful copy (warning only, no env guard); custom AuthenticationHandler under the
cookie scheme; reuses M2.19 SessionClaimBuilder for an all-roles system-wide principal.
2026-06-16 08:31:19 -04:00
Joseph Doherty b2d8fd8a0a Merge M2: stillpending.md Tier-2 correctness & behavioral gaps (#7,#8,#9,#10,#13,#15,#17,#18,#20,#21,#22,#23,#24,#25,#26,#27,#28,#29,#30,#31,#32)
20 tasks (M2.0-M2.19), each through its classification-driven review chain.
Full-solution build green (0 warnings, TreatWarningsAsErrors). Per-task targeted
suites all passed. Known pre-existing: 2 partition-purge E2E failures (follow-up #52).
2026-06-16 08:27:59 -04:00
Joseph Doherty 077770fe35 docs(plan): record M2.19 review-fix SHA; M2 (Tier-2) complete — all 20 tasks done 2026-06-16 08:12:49 -04:00
Joseph Doherty fddc69545f fix(security): M2.19 review nits — idle/refresh config guard + adapter tests + dead-var/doc cleanup (#15)
- Add SecurityOptionsValidator (IValidateOptions<SecurityOptions>) enforcing
  RoleRefreshThresholdMinutes < IdleTimeoutMinutes; registered with ValidateOnStart in
  AddSecurity — startup FAILS if threshold >= idle, so the invariant cannot be silently
  misconfigured away.
- Update SecurityOptions XML-docs: class-level summary distinguishes JWT Bearer path
  (JwtSigningKey/JwtExpiryMinutes) from Blazor cookie session path (IdleTimeoutMinutes/
  RoleRefreshThresholdMinutes); both time fields document the ~45-min effective idle window
  and the new cross-field constraint.
- Remove dead jwtService variable from /auth/login lambda in AuthEndpoints.cs (resolved
  but never used since login moved to SessionClaimBuilder).
- Extract ApplyValidationResultAsync helper from OnValidatePrincipalAsync (pure
  decision-application step); add 3 adapter tests covering Reject → RejectPrincipal +
  SignOutAsync; Replace → ReplacePrincipal + ShouldRenew; Keep → no-op.
- Fix inaccurate TryRefreshAsync comment (dropped "OR last-activity needs advancing" —
  the code only returns non-null when roleRefreshDue).
- Add InternalsVisibleTo for Security.Tests in Security.csproj.
- Add IsRoleRefreshDue tests: missing claim → due; unparsable claim → due; plus integration
  test covering the full ValidateAsync path for a principal missing zb:lastrolerefresh
  (triggers refresh + re-stamps anchor rather than keeping stale principal forever).
- Add SecurityOptionsValidatorConfigGuardTests: default succeeds; equal fails; greater fails;
  boundary (idle-1) succeeds; wiring confirmed via AddSecurity container.
2026-06-16 08:12:11 -04:00
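The cross-field guard added in this commit (role-refresh threshold strictly below the idle timeout, checked at startup) can be illustrated with a small sketch. It is written in Python for brevity and does not reproduce the real `IValidateOptions<SecurityOptions>` implementation; the field names and error text are hypothetical, and the defaults of 30 and 15 minutes are taken from the M2.19 commit message.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SecurityOptions:
    idle_timeout_minutes: int = 30
    role_refresh_threshold_minutes: int = 15


def validate(options):
    """Return a list of error strings; an empty list means the options are valid."""
    errors = []
    if options.role_refresh_threshold_minutes >= options.idle_timeout_minutes:
        errors.append(
            "RoleRefreshThresholdMinutes must be less than IdleTimeoutMinutes"
        )
    return errors


if __name__ == "__main__":
    print(validate(SecurityOptions()))                                   # default
    print(validate(SecurityOptions(role_refresh_threshold_minutes=30)))  # equal
```

Failing at startup rather than at first use means a threshold that could never fire before the session idles out is caught before any user logs in.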
Joseph Doherty c7916d79a8 chore(tasks): record M2.19 implementation commit SHA (8fe7f46) 2026-06-16 07:54:49 -04:00
Joseph Doherty 8fe7f46df6 feat(security): cookie session idle-timeout + LDAP-free role-mapping refresh (#15, M2.19)
Spike outcome: the shared ILdapAuthService (ZB.MOM.WW.Auth.Abstractions, an external
NuGet package) exposes ONLY AuthenticateAsync(username, password, ct) — no passwordless
service-account group-search. A live LDAP group re-query for an active session therefore
requires a new lib method and is OUT OF SCOPE (cannot modify the external package).
Implemented the always-achievable layers (cookie-only; no embedded JWT for cookie principals):

- /auth/login now stores the user's raw LDAP groups (one zb:group claim each) plus a
  zb:lastrolerefresh anchor (login time, UTC), seeding the LastActivity idle anchor too.
- SessionClaimBuilder: single shared DRY claim-builder used by BOTH /auth/login AND the
  refresh path, so the two claim shapes cannot drift (canonical identity/role/scope claims
  with nameType/roleType pinned, plus the M2.19 group + refresh-anchor additions).
- CookieSessionValidator (TimeProvider-injected, unit-testable) + a thin
  CookieAuthenticationEvents.OnValidatePrincipal adapter:
    * idle-timeout: a session past IdleTimeoutMinutes (default 30) is RejectPrincipal+SignOut;
      consistent with the cookie ExpireTimeSpan+SlidingExpiration window (same value).
    * role refresh WITHOUT LDAP: when older than RoleRefreshThresholdMinutes (new option,
      default 15) the DB-backed RoleMapper re-runs on the STORED groups, claims are rebuilt
      via the shared builder, the anchor advances, principal is replaced + cookie renewed.
      Revoked DB mappings drop the user's roles mid-session.
    * fail-soft: any refresh error KEEPS the existing principal (no sign-out, never throws)
      — mirrors the documented "LDAP failure: active sessions continue with current roles".
- Documented residual limitation in Component-Security.md: central role-mapping/scope
  changes apply within ~15 min without LDAP; live directory group-membership changes are
  picked up only at next login (needs a passwordless group-search on the external
  ZB.MOM.WW.Auth.Ldap lib — tracked follow-up).

Tests (Security.Tests, all green): CookieSessionValidatorTests + SessionClaimBuilderParityTests
— idle reject/keep, LDAP-free remap-from-stored-groups, revoked-roles loss, sub-threshold
no-refresh, refresh-throws-keeps-session, and login/refresh claim-parity.
2026-06-16 07:54:31 -04:00
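The three outcomes of the session check described above (reject on idle, replace on a due role refresh, keep otherwise or on any refresh error) can be sketched as a pure decision function. This is a Python illustration under stated assumptions, not the project's `CookieSessionValidator`: the function and parameter names are hypothetical, and `remap_roles` stands in for re-running the role mapper on the stored groups.

```python
from datetime import datetime, timedelta, timezone
from enum import Enum


class Decision(Enum):
    REJECT = "reject"    # idle too long: reject principal and sign out
    REPLACE = "replace"  # roles re-mapped: replace principal and renew cookie
    KEEP = "keep"        # nothing to do, or refresh failed softly


def validate_session(now, last_activity, last_role_refresh, remap_roles,
                     idle=timedelta(minutes=30), threshold=timedelta(minutes=15)):
    """Return (decision, roles); roles is only set when the decision is REPLACE."""
    if now - last_activity > idle:
        return Decision.REJECT, None
    # A missing refresh anchor counts as due, as the review-fix tests describe.
    refresh_due = last_role_refresh is None or now - last_role_refresh > threshold
    if not refresh_due:
        return Decision.KEEP, None
    try:
        return Decision.REPLACE, remap_roles()
    except Exception:
        # Fail-soft: any refresh error keeps the existing principal.
        return Decision.KEEP, None


if __name__ == "__main__":
    now = datetime(2026, 6, 16, 12, 0, tzinfo=timezone.utc)
    print(validate_session(now, now - timedelta(minutes=5),
                           now - timedelta(minutes=20), lambda: ["Viewer"]))
```

Passing `now` in explicitly plays the role the injected TimeProvider plays in the commit: the decision can be unit-tested without waiting on a real clock.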
Joseph Doherty a0d9379a4f fix(debug-stream): M2.18 review nits — thread-safe test mock + AlarmKey null-guard + rename stale test (#26)
- MockSiteStreamGrpcClient.SubscribeCalls and UnsubscribedCorrelationIds
  switched from bare List<T> to lock-guarded backing fields with snapshot
  accessors, eliminating the actor-thread/test-thread data race (matches
  the existing lock(events) pattern for ReceivedEvents)
- AttributeKey and AlarmKey null-guard each component with ?? string.Empty
  so a null SourceReference/AlarmName/etc. cannot silently collide with an
  empty-string component in the dedup dictionary
- On_Snapshot_Opens_GrpcStream renamed to
  On_Snapshot_Does_Not_Open_Additional_GrpcStream; assertion updated to
  confirm exactly one subscribe (the PreStart stream-first open) with no
  second subscribe after snapshot delivery
- _stopped ordering in InstanceNotFound path moved after CleanupGrpc()
  for consistency with DebugStreamTerminated and ReceiveTimeout handlers
2026-06-16 07:41:41 -04:00
Joseph Doherty 7210cdbcb5 docs: record M2.18 (#26) implementation commit SHA in M2 task tracker 2026-06-16 07:34:06 -04:00
Joseph Doherty d8519cb464 fix(debug-stream): stream-first lifecycle with replay/dedup (#26, M2.18)
Re-architect DebugStreamBridgeActor from snapshot-first to stream-first so no
attribute/alarm event occurring during the snapshot-build + network-transit
window is lost (#26).

Lifecycle change:
- PreStart now opens the gRPC subscription FIRST (alongside sending the
  SubscribeDebugViewRequest), so live events start flowing immediately.
- Phase model via a single _snapshotDelivered flag (mutated only on the actor
  thread). While buffering (snapshot not yet delivered), AttributeValueChanged/
  AlarmStateChanged are appended to an ordered _preSnapshotBuffer instead of
  being delivered. After snapshot+flush, the same handlers pass through directly.
- On DebugViewSnapshot: deliver snapshot, then flush the buffer in arrival order
  with per-entity dedup, then set _snapshotDelivered=true (pass-through).

Dedup rule (exactly-once):
- Identity: attributes by (InstanceUniqueName, AttributePath, AttributeName);
  alarms by (InstanceUniqueName, AlarmName, SourceReference) so native
  per-condition alarms are not conflated. Keys joined with a NUL delimiter
  (declared as an escaped char constant; no raw NUL in source) so distinct
  identities never collide on a space within a name.
- Boundary: a buffered event whose timestamp is <= the snapshot's timestamp for
  the same entity is already reflected -> DROP; strictly-newer (>) -> DELIVER;
  entity absent from the snapshot -> DELIVER (genuine gap-window event).

Preserved paths:
- M2.11 InstanceNotFound: with stream-first the gRPC stream is already open, so
  the not-found path now tears it down (CleanupGrpc) + clears the buffer, does
  NOT enter pass-through, delivers the not-found snapshot, and stops cleanly.
- Reconnect (ReconnectGrpcStream -> OpenGrpcStream) does not touch the phase
  flag: a mid-session reconnect resumes pass-through; a reconnect during the
  buffering phase stays buffering until the snapshot arrives.
- Communication-008 retry/stability/stop/terminate + ReceiveTimeout orphan net
  unchanged. Duplicate/late snapshot after delivery is ignored defensively.

Tests: 10 new M2.18 tests (stream-first ordering, gap-window buffering, dedup
drop/deliver for attrs + alarms, ordering, pass-through, InstanceNotFound
teardown, reconnect-during-buffering, reconnect-after-snapshot) + revised the
M2.11 not-found test to assert stream teardown. Full DebugStreamBridgeActor
class green: 23/23.
2026-06-16 07:33:51 -04:00
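The buffer-flush rule in this commit (drop a buffered event whose timestamp is at or before the snapshot's for the same entity, deliver strictly newer ones, deliver events for entities the snapshot does not contain) can be sketched as below. This is a Python illustration, not the actor's C# code; `entity_key` and `flush_buffer` are hypothetical names, timestamps are plain integers for simplicity, and the empty-string null-guard follows the later M2.18 review-fix commit.

```python
NUL = "\x00"  # escaped NUL delimiter; no raw NUL byte appears in the source


def entity_key(*parts):
    """Join identity components with NUL so names containing spaces cannot collide."""
    return NUL.join(p if p is not None else "" for p in parts)


def flush_buffer(snapshot_timestamps, buffered):
    """Return the buffered events to deliver after the snapshot, in arrival order.

    snapshot_timestamps maps entity key -> timestamp already reflected in the snapshot.
    buffered is a list of (key, timestamp, payload) tuples in arrival order.
    """
    delivered = []
    for key, timestamp, payload in buffered:
        snapshot_ts = snapshot_timestamps.get(key)
        if snapshot_ts is not None and timestamp <= snapshot_ts:
            continue  # already reflected in the snapshot: drop
        delivered.append((key, timestamp, payload))  # newer, or a gap-window event
    return delivered


if __name__ == "__main__":
    temp = entity_key("Pump1", "Status", "Temp")
    flow = entity_key("Pump1", "Status", "Flow")
    snapshot = {temp: 100}
    events = [(temp, 100, 20.5), (temp, 101, 20.7), (flow, 99, 3.2)]
    print(flush_buffer(snapshot, events))
```

In the example the first event is dropped because the snapshot already carries it, while the newer `Temp` value and the `Flow` event for an entity absent from the snapshot are both delivered.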
Joseph Doherty c1043569f6 docs(deployment): reconcile delete-from-NotDeployed — spec matrix now matches deliberate code (#31, M2.17)
git blame shows commit 1d5465f3 deliberately added NotDeployed to CanDelete so an
undeployed instance can have its orphan record fully removed. Code + tests already
permit it; the spec matrix said 'No'. Per M2.17, reconcile doc→code (not the reverse):
matrix now reads 'Delete from Not deployed = Yes (removes the orphan record)' with a
note, and CanDelete carries a remark citing the rationale + origin commit.
2026-06-16 07:24:57 -04:00
Joseph Doherty c9244d8bda fix(health): M2.16 review nit — real idempotency guard for SiteEventLog health bridge (#30)
AddSiteEventLogHealthMetricsBridge registered via AddHostedService(factory-lambda),
which sets ImplementationFactory and leaves ImplementationType null. The prior
ImplementationType == guard was therefore silently dead — a second call would spin
up a second SiteEventLogFailureCountReporter. Fix: add a private
SiteEventLogHealthMetricsBridgeMarker singleton and guard on its ServiceType instead.

Also corrects the cycle-path comment in both ServiceCollectionExtensions.cs and
SiteEventLogFailureCountReporter.cs: StoreAndForward.csproj does reference
SiteEventLogging.csproj, so the transitive path HealthMonitoring → StoreAndForward →
SiteEventLogging is real, but adding a direct HealthMonitoring → SiteEventLogging
reference would NOT create a cycle (SiteEventLogging has no back-edge to HealthMonitoring).
The Func<long> seam is a coupling-avoidance measure, not a cycle-breaker.

Adds AddSiteEventLogHealthMetricsBridgeTests.AddSiteEventLogHealthMetricsBridge_IsIdempotent_DoesNotDoubleRegister_HostedService
as a regression test (builds provider and asserts exactly one reporter via GetServices<IHostedService>().OfType<T>()).
2026-06-16 07:22:35 -04:00
Joseph Doherty d81f747434 feat(health): wire ISiteEventLogger.FailedWriteCount into SiteHealthReport (#30, M2.16)
Add SiteHealthReport.SiteEventLogWriteFailures (trailing optional long = 0,
additive-only), ISiteHealthCollector.SetSiteEventLogWriteFailures (default
no-op so existing fakes compile), and SiteEventLogFailureCountReporter
(hosted service in HealthMonitoring, Func<long> delegate to avoid the
HealthMonitoring → StoreAndForward → SiteEventLogging cycle).

Registration helper AddSiteEventLogHealthMetricsBridge added to
HealthMonitoring.ServiceCollectionExtensions; wired in
SiteServiceRegistration after AddSiteEventLogging.

Tests: SiteEventLogWriteFailuresMetricTests (4 collector tests) +
SiteEventLogFailureCountReporterTests (2 poller tests) in
HealthMonitoring.Tests. 79/79 HealthMonitoring.Tests green,
59/59 SiteEventLogging.Tests green, 0 warnings.
2026-06-16 07:14:54 -04:00
Joseph Doherty e1ee37e508 fix(siteeventlog): gate EventLogPurge to active node via IClusterNodeProvider.SelfIsPrimary (#29, M2.15) 2026-06-16 07:02:26 -04:00
Joseph Doherty 6b1cb9e0e6 refactor(host)/test: M2.14 review nits — simplify probe cancellation + pre-cancelled-token test (#28)
- Remove redundant linked CancellationTokenSource in ProbeAsync; pass the
  framework cancellationToken and ProbeTimeout directly to Ask (the two-CTS
  pattern was redundant — Ask already honours both the timeout and the token).
- Add EchoActor XML <remarks> explaining why no Receive<Identify> handler is
  needed (ActorBase answers Identify automatically).
- Add PreCancelledToken_ReportsUnhealthy_DoesNotThrow test: verifies the
  never-throws guarantee on the shutdown-race path (token already cancelled
  before CheckHealthAsync is invoked).
2026-06-16 06:54:28 -04:00
Joseph Doherty 473429a202 docs: record M2.14 (#28) commit SHA in M2 task tracker 2026-06-16 06:49:28 -04:00
Joseph Doherty 253bec5a52 feat(host): readiness gates on required cluster singletons (#28, M2.14)
REQ-HOST-4a lists "required cluster singletons running (if applicable)" as a
readiness criterion, but /health/ready only checked database + akka-cluster.
Add a third Ready-tagged check, RequiredSingletonsHealthCheck, registered in the
Central-role AddHealthChecks() chain (so it is naturally role-scoped — site nodes
never run it).

Probe: for each required central singleton, Ask its local ClusterSingletonProxy
an Identify with a short bounded per-singleton timeout (~2s, probes run
concurrently via Task.WhenAll). A non-null ActorIdentity.Subject within the
timeout means the singleton is running and reachable through the proxy; a null
subject or a timeout means unreachable → Unhealthy, naming the unreachable
singleton(s). The check never throws (catch-all → Unhealthy) and resolves
ActorSystem lazily from DI per probe (Unhealthy if Akka not yet up).

Required-always set = the five singleton proxies created unconditionally in
AkkaHostedService.RegisterCentralActors: notification-outbox, audit-log-ingest,
site-call-audit, audit-log-purge, site-audit-reconciliation. There are no
feature/config-gated central singletons today; any future gated singleton is the
"if applicable" case and must NOT be added to the required set.

Leadership-agnostic: the proxy reaches the singleton from either central node, so
a ready standby still reports ready (readiness must not require cluster
leadership — that is the Active tier's job). During a brief singleton handover the
probe may time out and the node flaps to not-ready, which is correct (a node
mid-handover is legitimately not fully ready); no retries, to keep the probe fast.

Tests (TDD): RequiredSingletonsHealthCheckTests exercises the probe against a
TestKit ActorSystem — all proxies present+reachable → Healthy; one missing →
Unhealthy naming it; ActorSystem absent → Unhealthy, no throw. HealthCheckTests
regression-guards the Ready tag + absence of the Active tag on the new check.
2026-06-16 06:49:18 -04:00
Joseph Doherty 3945789970 docs(dcl): M2.13 review nits — OriginalRaiseTime ConditionRefresh/UTC caveats + Description-vs-Message note (#27) 2026-06-16 06:40:40 -04:00
Joseph Doherty 722b8663c1 feat(dcl): populate obtainable NativeAlarmTransition fields from OPC UA and MxGateway (#27, M2.13)
OPC UA (RealOpcUaClient):
- Append 5 new SelectClauses at indices 13–17 (never renumber 0–12):
  - 13: AlarmConditionType/ActiveState/TransitionTime → OriginalRaiseTime
  - 14–17: LimitAlarmType HighHighLimit/HighLimit/LowLimit/LowLowLimit → LimitValue
- New OpcUaAlarmMapper.PickLimitValue helper: first non-null in HiHi→Hi→Lo→LoLo
  priority order, InvariantCulture-formatted; empty string for non-limit alarm types.
- HandleAlarmEvent reads new indices with fields.Count > N guards; hard minimum (6)
  unchanged so base ConditionType events still process without the limit fields.
- Document unavailable-by-protocol fields (Category, Description, OperatorUser,
  CurrentValue) inline in BuildAlarmEventFilter and HandleAlarmEvent.

MxGateway (MxGatewayAlarmMapper):
- MapTransition: CurrentValue and LimitValue now populated via MxValueToString
  (uses MxValueExtensions.ToClrValue + InvariantCulture) from OnAlarmTransitionEvent
  proto fields current_value/limit_value.
- MapSnapshot: same — populated from ActiveAlarmSnapshot.current_value/limit_value.
- MxValueToString helper (internal): null-safe MxValue → string conversion.

Tests (17 new, 40 total pass):
- OpcUaAlarmMapperTests: PickLimitValue priority, InvariantCulture, all-null case.
- MxGatewayAlarmMapperTests: CurrentValue/LimitValue populate from double/string
  MxValue; absent fields yield empty strings.
- RealOpcUaClientAlarmFilterTests: index alignment assertions (count=18, per-index
  TypeDefinitionId+BrowsePath), regression guard on existing indices 0–12.
2026-06-16 06:37:19 -04:00
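The limit-selection rule described for `PickLimitValue` (first non-null value in HiHi, Hi, Lo, LoLo order, empty string when no limit is present) can be illustrated with a short sketch. This is Python, not the project's C# helper; the function name and keyword parameters are hypothetical, and Python's locale-independent `str()` stands in for InvariantCulture formatting, so the exact digits can differ from the .NET output.

```python
def pick_limit_value(high_high=None, high=None, low=None, low_low=None):
    """Return the first non-null limit as a string, or "" for non-limit alarm types."""
    for candidate in (high_high, high, low, low_low):
        if candidate is not None:
            # str() on a float never uses the process locale (no decimal comma).
            return str(candidate)
    return ""


if __name__ == "__main__":
    print(pick_limit_value(high=80.5, low=20.0))
    print(repr(pick_limit_value()))
```

Testing against `None` rather than truthiness matters here: a limit of `0.0` is a real value and must not be skipped in favour of a lower-priority field.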
Joseph Doherty e2b31a9fd2 fix(siteruntime): M2.12 review nits — observe logger fault + meaningful source fallback (#25)
Replace bare task-discard with ContinueWith(OnlyOnFaulted|ExecuteSynchronously) so a
faulted ISiteEventLogger is logged and swallowed rather than going to the unobserved-task
firehose. Replace the "ScriptRuntimeContext" class-name fallback with the meaningful
"InstanceScript:{instanceName}" identifier (matching the site-event-log source convention).
Update the method doc-comment to state the best-effort contract explicitly. Pin the new
fallback value in the shape-precision test.
2026-06-16 06:26:00 -04:00
Joseph Doherty f08038db23 feat(siteruntime): M2.12 (#25) — emit script Error site event on recursion-limit violation
Inject ISiteEventLogger into ScriptRuntimeContext (additive optional ctor
param, defaulted null, all existing callers source-compatible). Add a single
private EmitRecursionLimitEventAsync helper that fires-and-forgets a
"script"/Error site event; called at both recursion guard sites (CallScript
at ~:332 and ScriptCallHelper.CallShared at ~:499). ScriptExecutionActor
threads the already-resolved siteEventLogger singleton into the context;
AlarmExecutionActor leaves it null (no siteEventLogger wired there).

Existing _logger.LogError + throw behaviour unchanged.

Tests: RecursionLimitSiteEventTests — 5 tests covering both CallScript and
CallShared (ISiteEventLogger.LogEventAsync called once with category "script",
severity "Error"; null logger path does not throw).
2026-06-16 06:20:58 -04:00
Joseph Doherty d160c7f694 test(communication): M2.11 review nits — bridge-actor not-found test + dead-letter comment + toast wording (#24)
- Add DebugStreamBridgeActorTests: On_InstanceNotFound_Snapshot_Forwards_To_OnEvent_Does_Not_Open_Stream_And_Terminates — asserts _onEvent receives the not-found snapshot, SubscribeCalls remains empty, and the actor terminates cleanly via Watch/ExpectTerminated.
- Add comment in DebugStreamBridgeActor near Context.Stop(Self) explaining that the subsequent StopDebugStream Tell from DebugStreamService.StopStream produces a benign expected dead-letter.
- Reword not-found toast in DebugView.razor to "Instance not found on the selected site — check the deployment target." (accurate when the instance may be deployed to a different site).
2026-06-16 06:15:26 -04:00
Joseph Doherty dbf44b9e10 fix(siteruntime): M2.11 — unknown-instance debug snapshot returns InstanceNotFound=true (#24)
RouteDebugSnapshot and RouteDebugViewSubscribe on DeploymentManagerActor
previously returned an empty DebugViewSnapshot for unknown instances,
indistinguishable from a deployed-but-empty instance. Callers had no way
to differentiate "not deployed here" from "deployed, no data yet."

Approach — additive field on existing message contract:
  Added `bool InstanceNotFound = false` as an optional trailing parameter
  to DebugViewSnapshot (Commons). All existing positional constructor calls
  and serialized wire frames are unaffected (default = false). A dedicated
  new message type was considered but rejected: the ClusterClient channel
  and DebugStreamService TCS are already typed on DebugViewSnapshot, and a
  second reply union would require wider changes for zero additive-safety
  gain.

Changes:
  - Commons/DebugViewSnapshot: add InstanceNotFound = false (additive)
  - DeploymentManagerActor: set InstanceNotFound=true in both unknown-
    instance branches (RouteDebugViewSubscribe, RouteDebugSnapshot)
  - DebugStreamBridgeActor: when snapshot.InstanceNotFound, forward it to
    _onEvent (resolves the TCS) then stop cleanly; no gRPC stream opened
  - DebugView.razor: check session.InitialSnapshot.InstanceNotFound after
    connect and show a clear "not deployed on this site" error toast
  - 3 new tests in DeploymentManagerActorTests covering: unknown→snapshot,
    unknown→subscribe, known-empty→InstanceNotFound stays false
2026-06-16 06:08:21 -04:00
Joseph Doherty 9cd62aa5b4 test(configdb): M2.10 review fix — catch bracketed AuditLog identifiers; document EF/multi-line scan limits (#18)
Extends ContainsAuditLogMutation regex to match T-SQL bracketed forms
([AuditLog], [dbo].[AuditLog]) that SSMS-generated SQL produces; the
prior optional-schema pattern only matched bare/dbo-prefixed names,
silently missing these real violation forms.

Changes:
- Schema sub-pattern (?:dbo\.)? → (?:\[?dbo\]?\.)? (matches dbo. and [dbo].)
- Table sub-pattern AuditLog\b → \[?AuditLog\]?\b (matches AuditLog and [AuditLog])
- Pattern compiled as static readonly Regex field for clarity/performance
- Adds 4 new planted-positive cases: UPDATE [dbo].[AuditLog], UPDATE [AuditLog],
  DELETE FROM [dbo].[AuditLog], DELETE FROM [AuditLog]
- Retains all existing negatives; adds DELETE FROM [dbo].[Notifications] negative
- Fixes misleading "reverse order" comment on the comment-prefix positive case
- Documents scan limitations (EF Core bulk methods; multi-line DML) in class XML doc
2026-06-16 05:55:27 -04:00
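The extended scan pattern can be tried out with a small sketch that assembles the schema and table sub-patterns quoted in this commit into one expression. This is a Python illustration; the combined pattern and the `contains_audit_log_mutation` name are this sketch's own assembly, not the project's `ContainsAuditLogMutation` source, and case-insensitive matching is assumed from the "lower-case variants" noted in the original guard-test commit.

```python
import re

# UPDATE or DELETE [FROM], optional dbo. / [dbo]. schema, then AuditLog / [AuditLog].
_AUDIT_LOG_MUTATION = re.compile(
    r"(?:UPDATE\s+|DELETE\s+(?:FROM\s+)?)(?:\[?dbo\]?\.)?\[?AuditLog\]?\b",
    re.IGNORECASE,
)


def contains_audit_log_mutation(line):
    """True if a single source line contains UPDATE/DELETE DML against AuditLog."""
    return _AUDIT_LOG_MUTATION.search(line) is not None


if __name__ == "__main__":
    for sample in (
        "UPDATE [dbo].[AuditLog] SET Detail = NULL",
        "DENY UPDATE ON dbo.AuditLog TO scadabridge_app",
        "DELETE FROM [dbo].[Notifications]",
    ):
        print(contains_audit_log_mutation(sample), sample)
```

Because the table name must directly follow the DML keyword (and optional schema), the `DENY UPDATE ON ...` form is not flagged: `UPDATE` is followed by `ON`, not by the table. As the commit notes, a line-by-line scan like this still cannot see multi-line DML or EF Core bulk methods.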
Joseph Doherty e7b6fe33a4 test(configdb): guard test for AuditLog append-only invariant (M2.10, #18)
Adds AuditLogAppendOnlyGuardTests.cs to
tests/ZB.MOM.WW.ScadaBridge.ConfigurationDatabase.Tests/ — a code-level backstop
for the DB-role DENY UPDATE / DENY DELETE control established in migration
20260602174346_CollapseAuditLogToCanonical.

The guard scans every non-Designer, non-Snapshot *.cs file in the
ConfigurationDatabase source tree and fails the test run if any line matches the
DML-syntax pattern:

    UPDATE\s+(?:dbo\.)?AuditLog\b
    DELETE\s+(?:FROM\s+)?(?:dbo\.)?AuditLog\b

The tight DML-syntax pattern naturally excludes false positives without extra
exclusion checks: DENY UPDATE ON dbo.AuditLog is not matched (UPDATE is followed
by ON, not the table name); ALTER TABLE … SWITCH and TRUNCATE contain no UPDATE/
DELETE keyword; comments with UPDATE/AuditLog in separate clauses are not matched.

Self-verifying unit tests (ContainsAuditLogMutation_*) prove the helper:
- returns false on clean-source lines (INSERT, SELECT, DENY DDL, ALTER SWITCH,
  TRUNCATE, DELETE FROM Notifications);
- returns TRUE on planted violations (UPDATE AuditLog SET …, DELETE FROM
  dbo.AuditLog WHERE …, lower-case variants);
- returns false on the exact DENY/GRANT/partition-switch strings from the
  production migration files.

All 256 ConfigurationDatabase.Tests pass; solution builds 0 W / 0 E.
2026-06-16 05:49:51 -04:00
Joseph Doherty 76198b36e3 fix(host): add MachineDataDb startup validation for Central (reverts Host-008, M2.9 #17)
REQ-HOST-3/REQ-HOST-4 require a MachineDataDb connection string for Central nodes.
The shipped docker appsettings (docker/central-node-a/appsettings.Central.json and
central-node-b) already carry the key. Host-008 had removed the fail-fast Require
because MachineDataDb had no consumer yet; this commit reverses that decision so a
misconfigured or missing connection string is caught at startup with a clear error.

Changes:
- DatabaseOptions: add MachineDataDb property with XML doc comment
- StartupValidator: add .Require for ScadaBridge:Database:MachineDataDb inside the
  existing Central .When block, immediately after the ConfigurationDb Require
- StartupValidatorTests: rename Central_MissingMachineDataDb_PassesValidation ->
  FailsValidation and flip to Assert.Throws; update comment to cite REQ-HOST-3/4,
  shipped docker appsettings, and the Host-008 reversal; add MachineDataDb to
  ValidCentralConfig() so all other Central tests remain green
- CentralDbTestEnvironment: supply ScadaBridge__Database__MachineDataDb env var
  (mirrors ConfigurationDb pattern) so HostStartupTests, HealthCheckTests, and
  MetricsEndpointTests pass through the new Require
- CompositionRootTests, AkkaHostedServiceAuditWiringTests, ActorPathTests: set
  ScadaBridge__Database__MachineDataDb env var alongside the pepper env var and
  clear it in Dispose, matching the existing pepper handling pattern

Build: 0 warnings, 0 errors. dotnet test Host.Tests: 233/233 passed.
2026-06-16 05:41:25 -04:00
Joseph Doherty 21b801b71f test(template): M2.8 review nits — stale-binding comment + stale-ID & inert-check tests (#23)
Add code comments in ValidateConnectionBindingCompleteness explaining
that the unbound-attribute branch also covers the silently-dropped
stale-binding case (cross-reference FlatteningService.ApplyConnectionBindings),
and that the `continue` skips the exists-at-site check for unbound attrs.

Add two new tests:
- FlatteningPipelineConnectionBindingTests: stale DataConnectionId (999)
  not present in site connections → flattener drops it silently →
  validator reports ConnectionBinding Error, IsValid false.
- ValidationServiceTests: enforce:true + siteConnectionNames:null on a
  properly-bound attribute → no ConnectionBinding error (exists-at-site
  check stays inert when site set is not supplied).
2026-06-16 05:34:56 -04:00