Commit Graph

757 Commits

Author SHA1 Message Date
Joseph Doherty f4b101b532 feat(db): idempotent startup normalizer rewriting List values to native JSON 2026-06-16 17:50:19 -04:00
Joseph Doherty e3d804a1a6 feat(transport): normalize List attribute values to native JSON on import 2026-06-16 17:50:05 -04:00
Joseph Doherty 5841cec958 feat(siteruntime): normalize old-form List static overrides to native JSON on load 2026-06-16 17:49:21 -04:00
Joseph Doherty bf80ca1388 test(commons): NJ-1 review — backward-compat decode tests for old-form Float/DateTime + assert DateTime list is quoted-string array 2026-06-16 17:38:57 -04:00
Joseph Doherty abe8832e9e feat(template): stamp ElementDataType on instance attribute overrides
Set existingOverride.ElementDataType and newOverride.ElementDataType from
templateAttr.ElementDataType in both the update and create branches of
SetAttributeOverrideAsync, so the persisted InstanceAttributeOverride row
always carries the element type for later central normalizer use (#93/M3).
2026-06-16 17:33:15 -04:00
Joseph Doherty 180d55482b feat(commons): native-typed JSON for List values; Decode reads both forms 2026-06-16 17:32:40 -04:00
Joseph Doherty 94be5e813b fix(siteruntime): decode List value to typed array before DCL write (OPC UA array write path) 2026-06-16 16:48:28 -04:00
Joseph Doherty ae2e1efb1c feat(ui): List attribute override editor in InstanceConfigure
When overriding a List attribute, render the shared AttributeListEditor
(whole-list replacement; element type fixed by the base, shown read-only via
ShowElementType=false) instead of the single-line input. Loading an existing
override decodes its JSON into rows (malformed -> empty); saving encodes rows to
canonical JSON with a pre-submit Decode round-trip guard surfacing element
errors inline. Clearing removes the InstanceAttributeOverride row
(repository-direct, mirroring native-alarm-source overrides). Non-List override
UX unchanged.
2026-06-16 16:25:58 -04:00
Joseph Doherty ba7331e67c feat(ui): List attribute editor in TemplateEdit 2026-06-16 16:20:08 -04:00
Joseph Doherty 85db4571b2 feat(cli): --element-type and JSON --value for List attributes 2026-06-16 16:18:08 -04:00
Joseph Doherty 0164f8a0d6 fix(mgmt): MV-10 review fixes (ElementDataType fixed-field in LockEnforcer; graceful bad-DataType error; message consistency) 2026-06-16 16:13:38 -04:00
Joseph Doherty 1525670fe7 feat(mgmt): accept + validate ElementDataType on attribute add/update 2026-06-16 16:05:05 -04:00
Joseph Doherty ad6bfc8af9 fix(siteruntime): reject SetStaticAttribute with malformed list value (no silent poison persist) 2026-06-16 15:59:30 -04:00
Joseph Doherty 7f97780bb3 feat(siteruntime): decode static List attributes to typed lists in InstanceActor (load/override/set) 2026-06-16 15:52:29 -04:00
Joseph Doherty 96e817a7e1 fix(siteruntime): MV-8 review fixes (construct list inside try; dictionary attr lookup; test hygiene) 2026-06-16 15:48:25 -04:00
Joseph Doherty 4765706e94 feat(dcl): coerce OPC UA array reads to typed list attributes; Bad quality on element mismatch 2026-06-16 15:39:19 -04:00
Joseph Doherty 872ce2b565 feat(validation): semantic checks for List attributes (element type, default value, trigger operands) 2026-06-16 15:38:18 -04:00
Joseph Doherty a1d464b50d fix(siteruntime): encode list attribute writes via AttributeValueCodec (was .ToString())
Replace value?.ToString() with AttributeValueCodec.Encode(value) in
AttributeAccessor indexer set and SetAsync, so a List<string>{"a","b"}
encodes to ["a","b"] instead of the garbage ToString representation.
Add using ZB.MOM.WW.ScadaBridge.Commons.Types. Tests verify the codec
contract (list→JSON array, scalar passthrough, null); full round-trip
through the accessor is not viable without a live Akka ActorSystem —
noted in-test with explanation.
2026-06-16 15:38:00 -04:00
Joseph Doherty ba414cbb68 feat(comm): stream List attribute values as canonical JSON
Replace ValueFormatter.FormatDisplayValue with AttributeValueCodec.Encode
in StreamRelayActor so List<T> attribute values cross the gRPC wire as a
JSON array (e.g. ["a","b"]) rather than a comma-joined display string.
Scalars and null values are unaffected. Tests cover List→JSON, scalar
string pass-through, and null→empty-string.
2026-06-16 15:37:33 -04:00
Joseph Doherty 492b41f0fd fix(multivalue): Wave-2 review fixes (MV-2/MV-4/MV-12)
- MV-2: guard unsupported element type before parse (no misleading re-wrap); add Float round-trip test
- MV-4: carry ElementDataType through the two validation-flatten ResolvedAttribute sites (ManagementActor.HandleValidateTemplate, BundleImporter.BuildFlattenedConfigForValidation) so MV-5 validation sees element type via every entry point
- MV-12: include ElementDataType in TemplateAttribute add/update audit payloads + fix stale docstring
2026-06-16 15:33:27 -04:00
Joseph Doherty 02aff2436e feat(template): carry ElementDataType through flatten/override 2026-06-16 15:24:31 -04:00
Joseph Doherty e7e34b26f1 feat(transport): round-trip ElementDataType for List attributes
Add DataType? ElementDataType to TemplateAttributeDto (optional, default null
for backward-compat with old bundles). Map it in both directions in
EntitySerializer (export + FromBundleContent) and in all three
TemplateAttribute construction sites in BundleImporter (BuildTemplate,
SyncTemplateAttributesAsync add-path, and SyncTemplateAttributesAsync
update-path including change-detection). Two new round-trip tests in
EntitySerializerTests confirm List attributes survive export→import and that
old DTOs with null ElementDataType import cleanly.
2026-06-16 15:23:39 -04:00
Joseph Doherty 8bd8079a7f feat(commons): AttributeValueCodec for canonical list value encode/decode 2026-06-16 15:21:56 -04:00
Joseph Doherty 1392fd144a test(inbound-api): X-API-Key review nits — whitespace-auth fallthrough test + dedupe + comment wording
- Add WhitespaceAuthorization_ValidXApiKey_Returns200: pins the IsNullOrWhiteSpace
  fall-through — a present-but-blank Authorization header is treated as absent so a
  valid X-API-Key still authenticates (200).
- Remove MissingBearer_Returns401 (added in 510559e): identical path to
  NeitherHeader_Returns401 (no Authorization + no X-API-Key → 401); keep the
  descriptively-named NeitherHeader variant.
- Change "legacy 'X-API-Key'" -> "alternate 'X-API-Key'" in EndpointExtensions.cs and
  the BuildPostWithApiKeyHeader/HappyPath doc comments to avoid implying Bearer is
  the older transport (Bearer was itself introduced by the prior auth re-arch).
2026-06-16 14:06:03 -04:00
Joseph Doherty 510559e1be feat(inbound-api): accept X-API-Key header as credential transport alongside Authorization: Bearer 2026-06-16 14:01:01 -04:00
Joseph Doherty 75919cec31 test(security): DL-3 review nits — assert OnValidatePrincipal on prod path + warning/doc polish 2026-06-16 08:52:28 -04:00
Joseph Doherty e89604298d feat(security): wire DisableLogin flag — auto-login scheme + startup warning 2026-06-16 08:47:19 -04:00
Joseph Doherty 0926ce4dda test(security): DL-2 review nits — assert IsAuthenticated + clarify handler flag gating 2026-06-16 08:44:06 -04:00
Joseph Doherty dcd445a380 feat(security): AutoLoginAuthenticationHandler — all-roles system-wide dev auto-login 2026-06-16 08:40:30 -04:00
Joseph Doherty 72691e5577 feat(security): AuthDisableLoginOptions + Roles.All for dev auto-login 2026-06-16 08:36:48 -04:00
Joseph Doherty fddc69545f fix(security): M2.19 review nits — idle/refresh config guard + adapter tests + dead-var/doc cleanup (#15)
- Add SecurityOptionsValidator (IValidateOptions<SecurityOptions>) enforcing
  RoleRefreshThresholdMinutes < IdleTimeoutMinutes; registered with ValidateOnStart in
  AddSecurity — startup FAILS if threshold >= idle, so the invariant cannot be silently
  misconfigured away.
- Update SecurityOptions XML-docs: class-level summary distinguishes JWT Bearer path
  (JwtSigningKey/JwtExpiryMinutes) from Blazor cookie session path (IdleTimeoutMinutes/
  RoleRefreshThresholdMinutes); both time fields document the ~45-min effective idle window
  and the new cross-field constraint.
- Remove dead jwtService variable from /auth/login lambda in AuthEndpoints.cs (resolved
  but never used since login moved to SessionClaimBuilder).
- Extract ApplyValidationResultAsync helper from OnValidatePrincipalAsync (pure
  decision-application step); add 3 adapter tests covering Reject → RejectPrincipal +
  SignOutAsync; Replace → ReplacePrincipal + ShouldRenew; Keep → no-op.
- Fix inaccurate TryRefreshAsync comment (dropped "OR last-activity needs advancing" —
  the code only returns non-null when roleRefreshDue).
- Add InternalsVisibleTo for Security.Tests in Security.csproj.
- Add IsRoleRefreshDue tests: missing claim → due; unparsable claim → due; plus integration
  test covering the full ValidateAsync path for a principal missing zb:lastrolerefresh
  (triggers refresh + re-stamps anchor rather than keeping stale principal forever).
- Add SecurityOptionsValidatorConfigGuardTests: default succeeds; equal fails; greater fails;
  boundary (idle-1) succeeds; wiring confirmed via AddSecurity container.
2026-06-16 08:12:11 -04:00
Joseph Doherty 8fe7f46df6 feat(security): cookie session idle-timeout + LDAP-free role-mapping refresh (#15, M2.19)
Spike outcome: the shared ILdapAuthService (ZB.MOM.WW.Auth.Abstractions, an external
NuGet package) exposes ONLY AuthenticateAsync(username, password, ct) — no passwordless
service-account group-search. A live LDAP group re-query for an active session therefore
requires a new lib method and is OUT OF SCOPE (cannot modify the external package).
Implemented the always-achievable layers (cookie-only; no embedded JWT for cookie principals):

- /auth/login now stores the user's raw LDAP groups (one zb:group claim each) plus a
  zb:lastrolerefresh anchor (login time, UTC), seeding the LastActivity idle anchor too.
- SessionClaimBuilder: single shared DRY claim-builder used by BOTH /auth/login AND the
  refresh path, so the two claim shapes cannot drift (canonical identity/role/scope claims
  with nameType/roleType pinned, plus the M2.19 group + refresh-anchor additions).
- CookieSessionValidator (TimeProvider-injected, unit-testable) + a thin
  CookieAuthenticationEvents.OnValidatePrincipal adapter:
    * idle-timeout: a session past IdleTimeoutMinutes (default 30) is RejectPrincipal+SignOut;
      consistent with the cookie ExpireTimeSpan+SlidingExpiration window (same value).
    * role refresh WITHOUT LDAP: when older than RoleRefreshThresholdMinutes (new option,
      default 15) the DB-backed RoleMapper re-runs on the STORED groups, claims are rebuilt
      via the shared builder, the anchor advances, principal is replaced + cookie renewed.
      Revoked DB mappings drop the user's roles mid-session.
    * fail-soft: any refresh error KEEPS the existing principal (no sign-out, never throws)
      — mirrors the documented "LDAP failure: active sessions continue with current roles".
- Documented residual limitation in Component-Security.md: central role-mapping/scope
  changes apply within ~15 min without LDAP; live directory group-membership changes are
  picked up only at next login (needs a passwordless group-search on the external
  ZB.MOM.WW.Auth.Ldap lib — tracked follow-up).

Tests (Security.Tests, all green): CookieSessionValidatorTests + SessionClaimBuilderParityTests
— idle reject/keep, LDAP-free remap-from-stored-groups, revoked-roles loss, sub-threshold
no-refresh, refresh-throws-keeps-session, and login/refresh claim-parity.
2026-06-16 07:54:31 -04:00
Joseph Doherty a0d9379a4f fix(debug-stream): M2.18 review nits — thread-safe test mock + AlarmKey null-guard + rename stale test (#26)
- MockSiteStreamGrpcClient.SubscribeCalls and UnsubscribedCorrelationIds
  switched from bare List<T> to lock-guarded backing fields with snapshot
  accessors, eliminating the actor-thread/test-thread data race (matches
  the existing lock(events) pattern for ReceivedEvents)
- AttributeKey and AlarmKey null-guard each component with ?? string.Empty
  so a null SourceReference/AlarmName/etc. cannot silently collide with an
  empty-string component in the dedup dictionary
- On_Snapshot_Opens_GrpcStream renamed to
  On_Snapshot_Does_Not_Open_Additional_GrpcStream; assertion updated to
  confirm exactly one subscribe (the PreStart stream-first open) with no
  second subscribe after snapshot delivery
- _stopped ordering in InstanceNotFound path moved after CleanupGrpc()
  for consistency with DebugStreamTerminated and ReceiveTimeout handlers
2026-06-16 07:41:41 -04:00
Joseph Doherty d8519cb464 fix(debug-stream): stream-first lifecycle with replay/dedup (#26, M2.18)
Re-architect DebugStreamBridgeActor from snapshot-first to stream-first so no
attribute/alarm event occurring during the snapshot-build + network-transit
window is lost (#26).

Lifecycle change:
- PreStart now opens the gRPC subscription FIRST (alongside sending the
  SubscribeDebugViewRequest), so live events start flowing immediately.
- Phase model via a single _snapshotDelivered flag (mutated only on the actor
  thread). While buffering (snapshot not yet delivered), AttributeValueChanged/
  AlarmStateChanged are appended to an ordered _preSnapshotBuffer instead of
  being delivered. After snapshot+flush, the same handlers pass through directly.
- On DebugViewSnapshot: deliver snapshot, then flush the buffer in arrival order
  with per-entity dedup, then set _snapshotDelivered=true (pass-through).

Dedup rule (exactly-once):
- Identity: attributes by (InstanceUniqueName, AttributePath, AttributeName);
  alarms by (InstanceUniqueName, AlarmName, SourceReference) so native
  per-condition alarms are not conflated. Keys joined with a NUL delimiter
  (declared as an escaped char constant; no raw NUL in source) so distinct
  identities never collide on a space within a name.
- Boundary: a buffered event whose timestamp is <= the snapshot's timestamp for
  the same entity is already reflected -> DROP; strictly-newer (>) -> DELIVER;
  entity absent from the snapshot -> DELIVER (genuine gap-window event).

Preserved paths:
- M2.11 InstanceNotFound: with stream-first the gRPC stream is already open, so
  the not-found path now tears it down (CleanupGrpc) + clears the buffer, does
  NOT enter pass-through, delivers the not-found snapshot, and stops cleanly.
- Reconnect (ReconnectGrpcStream -> OpenGrpcStream) does not touch the phase
  flag: a mid-session reconnect resumes pass-through; a reconnect during the
  buffering phase stays buffering until the snapshot arrives.
- Communication-008 retry/stability/stop/terminate + ReceiveTimeout orphan net
  unchanged. Duplicate/late snapshot after delivery is ignored defensively.

Tests: 10 new M2.18 tests (stream-first ordering, gap-window buffering, dedup
drop/deliver for attrs + alarms, ordering, pass-through, InstanceNotFound
teardown, reconnect-during-buffering, reconnect-after-snapshot) + revised the
M2.11 not-found test to assert stream teardown. Full DebugStreamBridgeActor
class green: 23/23.
2026-06-16 07:33:51 -04:00
Joseph Doherty c9244d8bda fix(health): M2.16 review nit — real idempotency guard for SiteEventLog health bridge (#30)
AddSiteEventLogHealthMetricsBridge registered via AddHostedService(factory-lambda),
which sets ImplementationFactory and leaves ImplementationType null. The prior
ImplementationType == guard was therefore silently dead — a second call would spin
up a second SiteEventLogFailureCountReporter. Fix: add a private
SiteEventLogHealthMetricsBridgeMarker singleton and guard on its ServiceType instead.

Also corrects the cycle-path comment in both ServiceCollectionExtensions.cs and
SiteEventLogFailureCountReporter.cs: StoreAndForward.csproj does reference
SiteEventLogging.csproj, so the transitive path HealthMonitoring → StoreAndForward →
SiteEventLogging is real, but adding a direct HealthMonitoring → SiteEventLogging
reference would NOT create a cycle (SiteEventLogging has no back-edge to HealthMonitoring).
The Func<long> seam is a coupling-avoidance measure, not a cycle-breaker.

Adds AddSiteEventLogHealthMetricsBridgeTests.AddSiteEventLogHealthMetricsBridge_IsIdempotent_DoesNotDoubleRegister_HostedService
as a regression test (builds provider and asserts exactly one reporter via GetServices<IHostedService>().OfType<T>()).
2026-06-16 07:22:35 -04:00
Joseph Doherty d81f747434 feat(health): wire ISiteEventLogger.FailedWriteCount into SiteHealthReport (#30, M2.16)
Add SiteHealthReport.SiteEventLogWriteFailures (trailing optional long = 0,
additive-only), ISiteHealthCollector.SetSiteEventLogWriteFailures (default
no-op so existing fakes compile), and SiteEventLogFailureCountReporter
(hosted service in HealthMonitoring, Func<long> delegate to avoid the
HealthMonitoring → StoreAndForward → SiteEventLogging cycle).

Registration helper AddSiteEventLogHealthMetricsBridge added to
HealthMonitoring.ServiceCollectionExtensions; wired in
SiteServiceRegistration after AddSiteEventLogging.

Tests: SiteEventLogWriteFailuresMetricTests (4 collector tests) +
SiteEventLogFailureCountReporterTests (2 poller tests) in
HealthMonitoring.Tests. 79/79 HealthMonitoring.Tests green,
59/59 SiteEventLogging.Tests green, 0 warnings.
2026-06-16 07:14:54 -04:00
Joseph Doherty e1ee37e508 fix(siteeventlog): gate EventLogPurge to active node via IClusterNodeProvider.SelfIsPrimary (#29, M2.15) 2026-06-16 07:02:26 -04:00
Joseph Doherty 6b1cb9e0e6 refactor(host)/test: M2.14 review nits — simplify probe cancellation + pre-cancelled-token test (#28)
- Remove redundant linked CancellationTokenSource in ProbeAsync; pass the
  framework cancellationToken and ProbeTimeout directly to Ask (the two-CTS
  pattern was redundant — Ask already honours both the timeout and the token).
- Add EchoActor XML <remarks> explaining why no Receive<Identify> handler is
  needed (ActorBase answers Identify automatically).
- Add PreCancelledToken_ReportsUnhealthy_DoesNotThrow test: verifies the
  never-throws guarantee on the shutdown-race path (token already cancelled
  before CheckHealthAsync is invoked).
2026-06-16 06:54:28 -04:00
Joseph Doherty 253bec5a52 feat(host): readiness gates on required cluster singletons (#28, M2.14)
REQ-HOST-4a lists "required cluster singletons running (if applicable)" as a
readiness criterion, but /health/ready only checked database + akka-cluster.
Add a third Ready-tagged check, RequiredSingletonsHealthCheck, registered in the
Central-role AddHealthChecks() chain (so it is naturally role-scoped — site nodes
never run it).

Probe: for each required central singleton, Ask its local ClusterSingletonProxy
an Identify with a short bounded per-singleton timeout (~2s, probes run
concurrently via Task.WhenAll). A non-null ActorIdentity.Subject within the
timeout means the singleton is running and reachable through the proxy; a null
subject or a timeout means unreachable → Unhealthy, naming the unreachable
singleton(s). The check never throws (catch-all → Unhealthy) and resolves
ActorSystem lazily from DI per probe (Unhealthy if Akka not yet up).

Required-always set = the five singleton proxies created unconditionally in
AkkaHostedService.RegisterCentralActors: notification-outbox, audit-log-ingest,
site-call-audit, audit-log-purge, site-audit-reconciliation. There are no
feature/config-gated central singletons today; any future gated singleton is the
"if applicable" case and must NOT be added to the required set.

Leadership-agnostic: the proxy reaches the singleton from either central node, so
a ready standby still reports ready (readiness must not require cluster
leadership — that is the Active tier's job). During a brief singleton handover the
probe may time out and the node flaps to not-ready, which is correct (a node
mid-handover is legitimately not fully ready); no retries, to keep the probe fast.

Tests (TDD): RequiredSingletonsHealthCheckTests exercises the probe against a
TestKit ActorSystem — all proxies present+reachable → Healthy; one missing →
Unhealthy naming it; ActorSystem absent → Unhealthy, no throw. HealthCheckTests
regression-guards the Ready tag + absence of the Active tag on the new check.
2026-06-16 06:49:18 -04:00
Joseph Doherty 722b8663c1 feat(dcl): populate obtainable NativeAlarmTransition fields from OPC UA and MxGateway (#27, M2.13)
OPC UA (RealOpcUaClient):
- Append 5 new SelectClauses at indices 13–17 (never renumber 0–12):
  - 13: AlarmConditionType/ActiveState/TransitionTime → OriginalRaiseTime
  - 14–17: LimitAlarmType HighHighLimit/HighLimit/LowLimit/LowLowLimit → LimitValue
- New OpcUaAlarmMapper.PickLimitValue helper: first non-null in HiHi→Hi→Lo→LoLo
  priority order, InvariantCulture-formatted; empty string for non-limit alarm types.
- HandleAlarmEvent reads new indices with fields.Count > N guards; hard minimum (6)
  unchanged so base ConditionType events still process without the limit fields.
- Document unavailable-by-protocol fields (Category, Description, OperatorUser,
  CurrentValue) inline in BuildAlarmEventFilter and HandleAlarmEvent.

MxGateway (MxGatewayAlarmMapper):
- MapTransition: CurrentValue and LimitValue now populated via MxValueToString
  (uses MxValueExtensions.ToClrValue + InvariantCulture) from OnAlarmTransitionEvent
  proto fields current_value/limit_value.
- MapSnapshot: same — populated from ActiveAlarmSnapshot.current_value/limit_value.
- MxValueToString helper (internal): null-safe MxValue → string conversion.

Tests (17 new, 40 total pass):
- OpcUaAlarmMapperTests: PickLimitValue priority, InvariantCulture, all-null case.
- MxGatewayAlarmMapperTests: CurrentValue/LimitValue populate from double/string
  MxValue; absent fields yield empty strings.
- RealOpcUaClientAlarmFilterTests: index alignment assertions (count=18, per-index
  TypeDefinitionId+BrowsePath), regression guard on existing indices 0–12.
2026-06-16 06:37:19 -04:00
Joseph Doherty e2b31a9fd2 fix(siteruntime): M2.12 review nits — observe logger fault + meaningful source fallback (#25)
Replace bare task-discard with ContinueWith(OnlyOnFaulted|ExecuteSynchronously) so a
faulted ISiteEventLogger is logged and swallowed rather than going to the unobserved-task
firehose. Replace the "ScriptRuntimeContext" class-name fallback with the meaningful
"InstanceScript:{instanceName}" identifier (matching the site-event-log source convention).
Update the method doc-comment to state the best-effort contract explicitly. Pin the new
fallback value in the shape-precision test.
2026-06-16 06:26:00 -04:00
Joseph Doherty f08038db23 feat(siteruntime): M2.12 (#25) — emit script Error site event on recursion-limit violation
Inject ISiteEventLogger into ScriptRuntimeContext (additive optional ctor
param, defaulted null, all existing callers source-compatible). Add a single
private EmitRecursionLimitEventAsync helper that fires-and-forgets a
"script"/Error site event; called at both recursion guard sites (CallScript
at ~:332 and ScriptCallHelper.CallShared at ~:499). ScriptExecutionActor
threads the already-resolved siteEventLogger singleton into the context;
AlarmExecutionActor leaves it null (no siteEventLogger wired there).

Existing _logger.LogError + throw behaviour unchanged.

Tests: RecursionLimitSiteEventTests — 5 tests covering both CallScript and
CallShared (ISiteEventLogger.LogEventAsync called once with category "script",
severity "Error"; null logger path does not throw).
2026-06-16 06:20:58 -04:00
Joseph Doherty d160c7f694 test(communication): M2.11 review nits — bridge-actor not-found test + dead-letter comment + toast wording (#24)
- Add DebugStreamBridgeActorTests: On_InstanceNotFound_Snapshot_Forwards_To_OnEvent_Does_Not_Open_Stream_And_Terminates — asserts _onEvent receives the not-found snapshot, SubscribeCalls remains empty, and the actor terminates cleanly via Watch/ExpectTerminated.
- Add comment in DebugStreamBridgeActor near Context.Stop(Self) explaining that the subsequent StopDebugStream Tell from DebugStreamService.StopStream produces a benign expected dead-letter.
- Reword not-found toast in DebugView.razor to "Instance not found on the selected site — check the deployment target." (accurate when the instance may be deployed to a different site).
2026-06-16 06:15:26 -04:00
Joseph Doherty dbf44b9e10 fix(siteruntime): M2.11 — unknown-instance debug snapshot returns InstanceNotFound=true (#24)
RouteDebugSnapshot and RouteDebugViewSubscribe on DeploymentManagerActor
previously returned an empty DebugViewSnapshot for unknown instances,
indistinguishable from a deployed-but-empty instance. Callers had no way
to differentiate "not deployed here" from "deployed, no data yet."

Approach — additive field on existing message contract:
  Added `bool InstanceNotFound = false` as an optional trailing parameter
  to DebugViewSnapshot (Commons). All existing positional constructor calls
  and serialized wire frames are unaffected (default = false). A dedicated
  new message type was considered but rejected: the ClusterClient channel
  and DebugStreamService TCS are already typed on DebugViewSnapshot, and a
  second reply union would require wider changes for zero additive-safety
  gain.

Changes:
  - Commons/DebugViewSnapshot: add InstanceNotFound = false (additive)
  - DeploymentManagerActor: set InstanceNotFound=true in both unknown-
    instance branches (RouteDebugViewSubscribe, RouteDebugSnapshot)
  - DebugStreamBridgeActor: when snapshot.InstanceNotFound, forward it to
    _onEvent (resolves the TCS) then stop cleanly; no gRPC stream opened
  - DebugView.razor: check session.InitialSnapshot.InstanceNotFound after
    connect and show a clear "not deployed on this site" error toast
  - 3 new tests in DeploymentManagerActorTests covering: unknown→snapshot,
    unknown→subscribe, known-empty→InstanceNotFound stays false
2026-06-16 06:08:21 -04:00
Joseph Doherty 9cd62aa5b4 test(configdb): M2.10 review fix — catch bracketed AuditLog identifiers; document EF/multi-line scan limits (#18)
Extends ContainsAuditLogMutation regex to match T-SQL bracketed forms
([AuditLog], [dbo].[AuditLog]) that SSMS-generated SQL produces; the
prior optional-schema pattern only matched bare/dbo-prefixed names,
silently missing these real violation forms.

Changes:
- Schema sub-pattern (?:dbo\.)? → (?:\[?dbo\]?\.)? (matches dbo. and [dbo].)
- Table sub-pattern AuditLog\b → \[?AuditLog\]?\b (matches AuditLog and [AuditLog])
- Pattern compiled as static readonly Regex field for clarity/performance
- Adds 4 new planted-positive cases: UPDATE [dbo].[AuditLog], UPDATE [AuditLog],
  DELETE FROM [dbo].[AuditLog], DELETE FROM [AuditLog]
- Retains all existing negatives; adds DELETE FROM [dbo].[Notifications] negative
- Fixes misleading "reverse order" comment on the comment-prefix positive case
- Documents scan limitations (EF Core bulk methods; multi-line DML) in class XML doc
2026-06-16 05:55:27 -04:00
Joseph Doherty e7b6fe33a4 test(configdb): guard test for AuditLog append-only invariant (M2.10, #18)
Adds AuditLogAppendOnlyGuardTests.cs to
tests/ZB.MOM.WW.ScadaBridge.ConfigurationDatabase.Tests/ — a code-level backstop
for the DB-role DENY UPDATE / DENY DELETE control established in migration
20260602174346_CollapseAuditLogToCanonical.

The guard scans every non-Designer, non-Snapshot *.cs file in the
ConfigurationDatabase source tree and fails the test run if any line matches the
DML-syntax pattern:

    UPDATE\s+(?:dbo\.)?AuditLog\b
    DELETE\s+(?:FROM\s+)?(?:dbo\.)?AuditLog\b

The tight DML-syntax pattern naturally excludes false positives without extra
exclusion checks: DENY UPDATE ON dbo.AuditLog is not matched (UPDATE is followed
by ON, not the table name); ALTER TABLE … SWITCH and TRUNCATE contain no UPDATE/
DELETE keyword; comments with UPDATE/AuditLog in separate clauses are not matched.

Self-verifying unit tests (ContainsAuditLogMutation_*) prove the helper:
- returns false on clean-source lines (INSERT, SELECT, DENY DDL, ALTER SWITCH,
  TRUNCATE, DELETE FROM Notifications);
- returns TRUE on planted violations (UPDATE AuditLog SET …, DELETE FROM
  dbo.AuditLog WHERE …, lower-case variants);
- returns false on the exact DENY/GRANT/partition-switch strings from the
  production migration files.

All 256 ConfigurationDatabase.Tests pass; solution builds 0 W / 0 E.
2026-06-16 05:49:51 -04:00
Joseph Doherty 76198b36e3 fix(host): add MachineDataDb startup validation for Central (reverts Host-008, M2.9 #17)
REQ-HOST-3/REQ-HOST-4 require a MachineDataDb connection string for Central nodes.
The shipped docker appsettings (docker/central-node-a/appsettings.Central.json and
central-node-b) already carry the key. Host-008 had removed the fail-fast Require
because MachineDataDb had no consumer yet; this commit reverses that decision so a
misconfigured or missing connection string is caught at startup with a clear error.

Changes:
- DatabaseOptions: add MachineDataDb property with XML doc comment
- StartupValidator: add .Require for ScadaBridge:Database:MachineDataDb inside the
  existing Central .When block, immediately after the ConfigurationDb Require
- StartupValidatorTests: rename Central_MissingMachineDataDb_PassesValidation ->
  FailsValidation and flip to Assert.Throws; update comment to cite REQ-HOST-3/4,
  shipped docker appsettings, and the Host-008 reversal; add MachineDataDb to
  ValidCentralConfig() so all other Central tests remain green
- CentralDbTestEnvironment: supply ScadaBridge__Database__MachineDataDb env var
  (mirrors ConfigurationDb pattern) so HostStartupTests, HealthCheckTests, and
  MetricsEndpointTests pass through the new Require
- CompositionRootTests, AkkaHostedServiceAuditWiringTests, ActorPathTests: set
  ScadaBridge__Database__MachineDataDb env var alongside the pepper env var and
  clear it in Dispose, matching the existing pepper handling pattern

Build: 0 warnings, 0 errors. dotnet test Host.Tests: 233/233 passed.
2026-06-16 05:41:25 -04:00
Joseph Doherty 21b801b71f test(template): M2.8 review nits — stale-binding comment + stale-ID & inert-check tests (#23)
Add code comments in ValidateConnectionBindingCompleteness explaining
that the unbound-attribute branch also covers the silently-dropped
stale-binding case (cross-reference FlatteningService.ApplyConnectionBindings),
and that the `continue` skips the exists-at-site check for unbound attrs.

Add two new tests:
- FlatteningPipelineConnectionBindingTests: stale DataConnectionId (999)
  not present in site connections → flattener drops it silently →
  validator reports ConnectionBinding Error, IsValid false.
- ValidationServiceTests: enforce:true + siteConnectionNames:null on a
  properly-bound attribute → no ConnectionBinding error (exists-at-site
  check stays inert when site set is not supplied).
2026-06-16 05:34:56 -04:00
Joseph Doherty 7c14a69091 feat(#23): elevate connection-binding completeness to a deploy-gating Error (M2.8)
Pre-deployment validation only WARNED when a data-sourced attribute had no
connection binding, so an instance with unresolved bindings still passed IsValid
and could deploy. There was also no check that a binding resolves to a connection
that actually exists at the target site.

- ValidationService.Validate gains an opt-in `enforceConnectionBindings` flag
  (default false) plus a `siteConnectionNames` set. Default-false keeps the
  template DESIGN-TIME path (ManagementActor.HandleValidateTemplate) non-blocking,
  since bindings are legitimately set later at instance/deploy time. The DEPLOY
  path (FlatteningPipeline) opts in (true) so:
    * a data-sourced attribute with no binding is now a deploy-gating Error;
    * a binding to a connection that does not exist on the target site is an Error.
  Static (non-data-sourced) attributes are never flagged.
- FlatteningPipeline computes the site-connection-names set from the loaded site
  data connections (mirroring M2.1's alarmCapableConnectionNames) and threads it in.
- Tests: TemplateEngine.Tests covers design-time warning / deploy-time error /
  static-ok / exists-at-site / non-existent-connection. New
  FlatteningPipelineConnectionBindingTests proves the deploy path enforces it.

Mark M2.7 + M2.8 completed in the plan task tracker.
2026-06-16 05:28:06 -04:00
Joseph Doherty a8e9e9952d fix(template): M2.7 review nits — comment-aware arg tokenizer + stricter numeric-literal inference (#20/#21)
SplitCallArguments now skips C# line (`//`) and block (`/* */`) comments when
tokenizing the argument list, so a comma inside a comment no longer produces a
spurious arg-count mismatch.  IsNumericLiteral now explicitly rejects tokens
whose first non-sign character is `_` or a letter (e.g. `_2`), and restricts
underscore digit-separators to positions after at least one digit, preventing
identifier-shaped tokens from being inferred as Integer/Float.
2026-06-16 05:21:23 -04:00