Commit Graph

186 Commits

Author SHA1 Message Date
Joseph Doherty 0cc8642cfa docs(m3): implementation plan + tasks for shared ScriptAnalysis consolidation 2026-06-16 19:09:12 -04:00
Joseph Doherty 8e99f22b24 docs(m3): design — shared ScriptAnalysis project consolidating the 4 trust-model analyzers 2026-06-16 19:07:32 -04:00
Joseph Doherty dc9f31537a docs: record final-review follow-ups (deployed-snapshot normalization gap I-1; CLI native-form help example) 2026-06-16 18:34:34 -04:00
Joseph Doherty c53b621b85 docs: mark native-typed JSON feature complete; update Component-Commons codec note
NJ-6: full solution builds 0/0; feature-targeted tests green (Commons codec 38,
TemplateEngine InstanceService 17, ConfigDB normalizer 8, Transport serializer 12,
SiteRuntime InstanceActor 47). Component-Commons now describes the native-typed
List encoding + read-both decode + the three normalization paths. #93/M3 folded in.
2026-06-16 18:27:10 -04:00
Joseph Doherty 69f7c526d0 docs: implementation plan for native-typed JSON List values + normalization
6 tasks (NJ-1..NJ-6): native codec + read-both decode; stamp override
ElementDataType (#93/M3); idempotent central startup normalizer; site
override-load normalization; normalize-on-import; integration + docs.
2026-06-16 17:13:14 -04:00
Joseph Doherty 91b1aa1275 docs: design for native-typed JSON List attribute values + data normalization
Encode emits native-typed JSON ([10,20], [true,false], ISO dates); Decode reads
both old (array-of-strings) and new forms. Existing data normalized via an
idempotent central MS SQL startup normalizer, active site SQLite normalization in
the InstanceActor override-load path, and normalize-on-import for bundles.
Approved via brainstorming (Approach B, thorough).
2026-06-16 17:08:38 -04:00
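The read-both decode and idempotent normalization described in this commit can be sketched as follows. This is a minimal Python illustration, not the project's C# codec: the names `encode_list`, `decode_list`, `normalize`, and the element-type tags are hypothetical, and ISO date handling is left out.

```python
import json

# Hypothetical element-type tags; the real ElementDataType values may differ.
_PARSERS = {
    "Int32": int,
    "Double": float,
    "Boolean": lambda s: s.strip().lower() == "true",
    "String": str,
}

def encode_list(values):
    """Emit the native-typed form, e.g. [10,20] or [true,false]."""
    return json.dumps(values, separators=(",", ":"))

def decode_list(text, element_type):
    """Read both forms: legacy array-of-strings and native-typed JSON."""
    parse = _PARSERS[element_type]
    out = []
    for item in json.loads(text):
        # Legacy rows stored every element as a string; convert those.
        if isinstance(item, str) and element_type != "String":
            out.append(parse(item))
        else:
            out.append(item)
    return out

def normalize(text, element_type):
    """Decode either form, then re-encode in the native-typed form."""
    return encode_list(decode_list(text, element_type))
```

Because `normalize` always re-encodes through the native form, running it on an already-normalized value returns the same text, which is the property a startup normalizer needs in order to be safe to re-run.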
Joseph Doherty 734c161383 docs: mark multi-value (List) attribute feature complete; document DataType.List + ElementDataType in Component-Commons
MV-15 integration checkpoint: full solution builds 0/0; feature-targeted tests
green across Commons, TemplateEngine, SiteRuntime, DataConnectionLayer,
Communication, Transport, ManagementService, CLI, CentralUI (255 tests).
2026-06-16 16:34:56 -04:00
Joseph Doherty 09d7319958 docs: implementation plan for structured multi-value (List) attributes
15 tasks (MV-1..MV-15) with classifications, dependencies, and TDD steps:
type model, AttributeValueCodec, idempotent migration, flatten, validation,
runtime encode/decode, DCL array coercion, stream encode, management, CLI,
transport, two UI editors, and integration verification.
2026-06-16 15:16:06 -04:00
Joseph Doherty b238228d8b docs: design for structured multi-value (List) attributes
Add a first-class DataType.List + ElementDataType companion so object
attributes can store homogeneous scalar lists (e.g. MoveInWorkOrderNumbers,
MoveInPartNumbers) across all four lifecycle paths: script write/read,
static authored default, OPC UA array read, OPC UA array write.

Canonical JSON value codec; whole-list override; element type fixed by base;
idempotent migration widening Value to nvarchar(max) + adding ElementDataType.
Approved via brainstorming.
2026-06-16 15:10:32 -04:00
Joseph Doherty 13cd53ad1c docs(plan): mark disable-login DL-1..DL-4 complete 2026-06-16 08:55:09 -04:00
Joseph Doherty 57302500ac docs(security): document dev disable-login flag + ship default-false config key
Adds a "Dev Disable-Login Flag" subsection to Component-Security.md covering
ScadaBridge:Security:Auth:DisableLogin / User, the AutoLoginAuthenticationHandler
mechanism, and the no-environment-guard / startup-warning production risk.

Ships DisableLogin: false under ScadaBridge → Security → Auth in:
  - src/.../Host/appsettings.json (canonical default)
  - docker/central-node-a/appsettings.Central.json
  - docker/central-node-b/appsettings.Central.json

Also records DL-3 commit SHAs in the plan tasks file.
2026-06-16 08:54:11 -04:00
Joseph Doherty e89604298d feat(security): wire DisableLogin flag — auto-login scheme + startup warning 2026-06-16 08:47:19 -04:00
Joseph Doherty dcd445a380 feat(security): AutoLoginAuthenticationHandler — all-roles system-wide dev auto-login 2026-06-16 08:40:30 -04:00
Joseph Doherty 56d6508a5b docs(plan): implementation plan for dev disable-login flag (4 tasks) 2026-06-16 08:35:05 -04:00
Joseph Doherty 5cf2d1cb99 docs(plan): design for dev disable-login auto-login flag (port from OtOpcUa)
Faithful copy (warning only, no env guard); custom AuthenticationHandler under the
cookie scheme; reuses M2.19 SessionClaimBuilder for an all-roles system-wide principal.
2026-06-16 08:31:19 -04:00
Joseph Doherty 077770fe35 docs(plan): record M2.19 review-fix SHA; M2 (Tier-2) complete — all 20 tasks done 2026-06-16 08:12:49 -04:00
Joseph Doherty c7916d79a8 chore(tasks): record M2.19 implementation commit SHA (8fe7f46) 2026-06-16 07:54:49 -04:00
Joseph Doherty 8fe7f46df6 feat(security): cookie session idle-timeout + LDAP-free role-mapping refresh (#15, M2.19)
Spike outcome: the shared ILdapAuthService (ZB.MOM.WW.Auth.Abstractions, an external
NuGet package) exposes ONLY AuthenticateAsync(username, password, ct) — no passwordless
service-account group-search. A live LDAP group re-query for an active session therefore
requires a new lib method and is OUT OF SCOPE (cannot modify the external package).
Implemented the always-achievable layers (cookie-only; no embedded JWT for cookie principals):

- /auth/login now stores the user's raw LDAP groups (one zb:group claim each) plus a
  zb:lastrolerefresh anchor (login time, UTC), seeding the LastActivity idle anchor too.
- SessionClaimBuilder: single shared DRY claim-builder used by BOTH /auth/login AND the
  refresh path, so the two claim shapes cannot drift (canonical identity/role/scope claims
  with nameType/roleType pinned, plus the M2.19 group + refresh-anchor additions).
- CookieSessionValidator (TimeProvider-injected, unit-testable) + a thin
  CookieAuthenticationEvents.OnValidatePrincipal adapter:
    * idle-timeout: a session past IdleTimeoutMinutes (default 30) is RejectPrincipal+SignOut;
      consistent with the cookie ExpireTimeSpan+SlidingExpiration window (same value).
    * role refresh WITHOUT LDAP: when older than RoleRefreshThresholdMinutes (new option,
      default 15) the DB-backed RoleMapper re-runs on the STORED groups, claims are rebuilt
      via the shared builder, the anchor advances, principal is replaced + cookie renewed.
      Revoked DB mappings drop the user's roles mid-session.
    * fail-soft: any refresh error KEEPS the existing principal (no sign-out, never throws)
      — mirrors the documented "LDAP failure: active sessions continue with current roles".
- Documented residual limitation in Component-Security.md: central role-mapping/scope
  changes apply within ~15 min without LDAP; live directory group-membership changes are
  picked up only at next login (needs a passwordless group-search on the external
  ZB.MOM.WW.Auth.Ldap lib — tracked follow-up).

Tests (Security.Tests, all green): CookieSessionValidatorTests + SessionClaimBuilderParityTests
— idle reject/keep, LDAP-free remap-from-stored-groups, revoked-roles loss, sub-threshold
no-refresh, refresh-throws-keeps-session, and login/refresh claim-parity.
2026-06-16 07:54:31 -04:00
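The three validator outcomes in this commit (idle reject, LDAP-free role refresh, fail-soft keep) can be sketched as a single decision function. This is an illustrative Python model, not the C# CookieSessionValidator: the `Session` shape, `validate`, and the `map_roles` callback are hypothetical, and whether the thresholds are inclusive is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

@dataclass
class Session:
    groups: list                 # raw directory groups stored at login
    roles: list                  # roles derived from the groups
    last_activity: datetime      # idle anchor
    last_role_refresh: datetime  # refresh anchor

def validate(session, now, map_roles, idle_minutes=30, refresh_minutes=15):
    """Return 'reject', 'refreshed', or 'keep' for one validation pass."""
    if now - session.last_activity > timedelta(minutes=idle_minutes):
        return "reject"  # idle timeout: caller rejects the principal and signs out
    if now - session.last_role_refresh > timedelta(minutes=refresh_minutes):
        try:
            roles = map_roles(session.groups)  # re-run the mapping on STORED groups
        except Exception:
            return "keep"  # fail-soft: any refresh error keeps the existing session
        session.roles = roles
        session.last_role_refresh = now  # advance the anchor
        return "refreshed"
    return "keep"
```

Taking `now` as a parameter plays the same role as the injected TimeProvider mentioned above: tests can move the clock without sleeping.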
Joseph Doherty 7210cdbcb5 docs: record M2.18 (#26) implementation commit SHA in M2 task tracker 2026-06-16 07:34:06 -04:00
Joseph Doherty d8519cb464 fix(debug-stream): stream-first lifecycle with replay/dedup (#26, M2.18)
Re-architect DebugStreamBridgeActor from snapshot-first to stream-first so no
attribute/alarm event occurring during the snapshot-build + network-transit
window is lost (#26).

Lifecycle change:
- PreStart now opens the gRPC subscription FIRST (alongside sending the
  SubscribeDebugViewRequest), so live events start flowing immediately.
- Phase model via a single _snapshotDelivered flag (mutated only on the actor
  thread). While buffering (snapshot not yet delivered), AttributeValueChanged/
  AlarmStateChanged are appended to an ordered _preSnapshotBuffer instead of
  being delivered. After snapshot+flush, the same handlers pass through directly.
- On DebugViewSnapshot: deliver snapshot, then flush the buffer in arrival order
  with per-entity dedup, then set _snapshotDelivered=true (pass-through).

Dedup rule (exactly-once):
- Identity: attributes by (InstanceUniqueName, AttributePath, AttributeName);
  alarms by (InstanceUniqueName, AlarmName, SourceReference) so native
  per-condition alarms are not conflated. Keys joined with a NUL delimiter
  (declared as an escaped char constant; no raw NUL in source) so distinct
  identities never collide on a space within a name.
- Boundary: a buffered event whose timestamp is <= the snapshot's timestamp for
  the same entity is already reflected -> DROP; strictly-newer (>) -> DELIVER;
  entity absent from the snapshot -> DELIVER (genuine gap-window event).

Preserved paths:
- M2.11 InstanceNotFound: with stream-first the gRPC stream is already open, so
  the not-found path now tears it down (CleanupGrpc) + clears the buffer, does
  NOT enter pass-through, delivers the not-found snapshot, and stops cleanly.
- Reconnect (ReconnectGrpcStream -> OpenGrpcStream) does not touch the phase
  flag: a mid-session reconnect resumes pass-through; a reconnect during the
  buffering phase stays buffering until the snapshot arrives.
- Communication-008 retry/stability/stop/terminate + ReceiveTimeout orphan net
  unchanged. Duplicate/late snapshot after delivery is ignored defensively.

Tests: 10 new M2.18 tests (stream-first ordering, gap-window buffering, dedup
drop/deliver for attrs + alarms, ordering, pass-through, InstanceNotFound
teardown, reconnect-during-buffering, reconnect-after-snapshot) + revised the
M2.11 not-found test to assert stream teardown. Full DebugStreamBridgeActor
class green: 23/23.
2026-06-16 07:33:51 -04:00
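The dedup boundary rule from this commit can be sketched in a few lines. This is a Python illustration under simplifying assumptions: the snapshot is reduced to a map from entity key to timestamp, buffered events are `(key, timestamp, payload)` tuples, and the names `attribute_key` and `flush_buffer` are hypothetical rather than taken from DebugStreamBridgeActor.

```python
def attribute_key(instance, path, name):
    """Join identity parts with NUL so a space inside a name cannot cause a collision."""
    return "\0".join((instance, path, name))

def flush_buffer(snapshot_ts, buffered):
    """Return the buffered events that must still be delivered, in arrival order.

    snapshot_ts: dict mapping entity key -> timestamp carried by the snapshot
    buffered:    list of (key, timestamp, payload) collected before the snapshot
    """
    delivered = []
    for key, ts, payload in buffered:
        snap = snapshot_ts.get(key)
        if snap is not None and ts <= snap:
            continue  # already reflected in the snapshot -> DROP
        # strictly newer, or entity absent from the snapshot -> DELIVER
        delivered.append((key, ts, payload))
    return delivered
```

Comparing per entity rather than against one global snapshot time is what lets a gap-window event for an entity missing from the snapshot still get through.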
Joseph Doherty 473429a202 docs: record M2.14 (#28) commit SHA in M2 task tracker 2026-06-16 06:49:28 -04:00
Joseph Doherty 253bec5a52 feat(host): readiness gates on required cluster singletons (#28, M2.14)
REQ-HOST-4a lists "required cluster singletons running (if applicable)" as a
readiness criterion, but /health/ready only checked database + akka-cluster.
Add a third Ready-tagged check, RequiredSingletonsHealthCheck, registered in the
Central-role AddHealthChecks() chain (so it is naturally role-scoped — site nodes
never run it).

Probe: for each required central singleton, Ask its local ClusterSingletonProxy
an Identify with a short bounded per-singleton timeout (~2s, probes run
concurrently via Task.WhenAll). A non-null ActorIdentity.Subject within the
timeout means the singleton is running and reachable through the proxy; a null
subject or a timeout means unreachable → Unhealthy, naming the unreachable
singleton(s). The check never throws (catch-all → Unhealthy) and resolves
ActorSystem lazily from DI per probe (Unhealthy if Akka not yet up).

Required-always set = the five singleton proxies created unconditionally in
AkkaHostedService.RegisterCentralActors: notification-outbox, audit-log-ingest,
site-call-audit, audit-log-purge, site-audit-reconciliation. There are no
feature/config-gated central singletons today; any future gated singleton is the
"if applicable" case and must NOT be added to the required set.

Leadership-agnostic: the proxy reaches the singleton from either central node, so
a ready standby still reports ready (readiness must not require cluster
leadership — that is the Active tier's job). During a brief singleton handover the
probe may time out and the node flaps to not-ready, which is correct (a node
mid-handover is legitimately not fully ready); no retries, to keep the probe fast.

Tests (TDD): RequiredSingletonsHealthCheckTests exercises the probe against a
TestKit ActorSystem — all proxies present+reachable → Healthy; one missing →
Unhealthy naming it; ActorSystem absent → Unhealthy, no throw. HealthCheckTests
regression-guards the Ready tag + absence of the Active tag on the new check.
2026-06-16 06:49:18 -04:00
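The probe shape described here (concurrent, bounded per-singleton timeout, never throws, names the unreachable singletons) can be modelled with asyncio. This is a sketch only: the `identify` callback stands in for asking a ClusterSingletonProxy for its identity and is hypothetical, as are `check_ready` and `_probe`.

```python
import asyncio

# The five required central singletons named in the commit message.
REQUIRED = [
    "notification-outbox",
    "audit-log-ingest",
    "site-call-audit",
    "audit-log-purge",
    "site-audit-reconciliation",
]

async def _probe(name, identify, timeout):
    """Probe one singleton; a None subject, a timeout, or any error means unreachable."""
    try:
        subject = await asyncio.wait_for(identify(name), timeout)
        return name, subject is not None
    except Exception:
        return name, False  # catch-all: the check itself never raises

async def check_ready(identify, timeout=2.0):
    """Run all probes concurrently and report (status, unreachable names)."""
    results = await asyncio.gather(*(_probe(n, identify, timeout) for n in REQUIRED))
    unreachable = [name for name, ok in results if not ok]
    if unreachable:
        return "Unhealthy", unreachable
    return "Healthy", []
```

Since the probes run concurrently, the worst-case duration of the whole check is roughly one timeout rather than five, which keeps the readiness endpoint fast even when several singletons are mid-handover.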
Joseph Doherty 722b8663c1 feat(dcl): populate obtainable NativeAlarmTransition fields from OPC UA and MxGateway (#27, M2.13)
OPC UA (RealOpcUaClient):
- Append 5 new SelectClauses at indices 13–17 (never renumber 0–12):
  - 13: AlarmConditionType/ActiveState/TransitionTime → OriginalRaiseTime
  - 14–17: LimitAlarmType HighHighLimit/HighLimit/LowLimit/LowLowLimit → LimitValue
- New OpcUaAlarmMapper.PickLimitValue helper: first non-null in HiHi→Hi→Lo→LoLo
  priority order, InvariantCulture-formatted; empty string for non-limit alarm types.
- HandleAlarmEvent reads new indices with fields.Count > N guards; hard minimum (6)
  unchanged so base ConditionType events still process without the limit fields.
- Document unavailable-by-protocol fields (Category, Description, OperatorUser,
  CurrentValue) inline in BuildAlarmEventFilter and HandleAlarmEvent.

MxGateway (MxGatewayAlarmMapper):
- MapTransition: CurrentValue and LimitValue now populated via MxValueToString
  (uses MxValueExtensions.ToClrValue + InvariantCulture) from OnAlarmTransitionEvent
  proto fields current_value/limit_value.
- MapSnapshot: same — populated from ActiveAlarmSnapshot.current_value/limit_value.
- MxValueToString helper (internal): null-safe MxValue → string conversion.

Tests (17 new, 40 total pass):
- OpcUaAlarmMapperTests: PickLimitValue priority, InvariantCulture, all-null case.
- MxGatewayAlarmMapperTests: CurrentValue/LimitValue populate from double/string
  MxValue; absent fields yield empty strings.
- RealOpcUaClientAlarmFilterTests: index alignment assertions (count=18, per-index
  TypeDefinitionId+BrowsePath), regression guard on existing indices 0–12.
2026-06-16 06:37:19 -04:00
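The priority rule behind PickLimitValue (first non-null in HiHi, Hi, Lo, LoLo order, empty string otherwise) is small enough to show directly. This is a hypothetical Python rendering: `pick_limit_value` is an illustrative name, and Python's locale-independent `str()` stands in for InvariantCulture formatting without reproducing the exact output format of the C# helper.

```python
def pick_limit_value(high_high, high, low, low_low):
    """Return the first non-null limit as a string, or '' for non-limit alarms."""
    for value in (high_high, high, low, low_low):
        if value is not None:
            # str() on a float never uses the process locale, so the decimal
            # separator is always '.', mirroring the culture-invariant intent.
            return str(value)
    return ""
```

Checking `is not None` rather than truthiness matters here: a limit of `0.0` is a legitimate value and must not be skipped.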
Joseph Doherty e7b6fe33a4 test(configdb): guard test for AuditLog append-only invariant (M2.10, #18)
Adds AuditLogAppendOnlyGuardTests.cs to
tests/ZB.MOM.WW.ScadaBridge.ConfigurationDatabase.Tests/ — a code-level backstop
for the DB-role DENY UPDATE / DENY DELETE control established in migration
20260602174346_CollapseAuditLogToCanonical.

The guard scans every non-Designer, non-Snapshot *.cs file in the
ConfigurationDatabase source tree and fails the test run if any line matches the
DML-syntax pattern:

    UPDATE\s+(?:dbo\.)?AuditLog\b
    DELETE\s+(?:FROM\s+)?(?:dbo\.)?AuditLog\b

The tight DML-syntax pattern naturally excludes false positives without extra
exclusion checks: DENY UPDATE ON dbo.AuditLog is not matched (UPDATE is followed
by ON, not the table name); ALTER TABLE … SWITCH and TRUNCATE contain no UPDATE/
DELETE keyword; comments with UPDATE/AuditLog in separate clauses are not matched.

Self-verifying unit tests (ContainsAuditLogMutation_*) prove the helper:
- returns false on clean-source lines (INSERT, SELECT, DENY DDL, ALTER SWITCH,
  TRUNCATE, DELETE FROM Notifications);
- returns TRUE on planted violations (UPDATE AuditLog SET …, DELETE FROM
  dbo.AuditLog WHERE …, lower-case variants);
- returns false on the exact DENY/GRANT/partition-switch strings from the
  production migration files.

All 256 ConfigurationDatabase.Tests pass; solution builds 0 W / 0 E.
2026-06-16 05:49:51 -04:00
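The two DML patterns quoted in this commit can be exercised with a short Python check. The helper name `contains_audit_log_mutation` mirrors the test names above but is otherwise a hypothetical stand-in; case-insensitive matching is assumed because the message says lower-case variants are detected.

```python
import re

# The two patterns from the commit message, combined into one alternation.
_MUTATION = re.compile(
    r"UPDATE\s+(?:dbo\.)?AuditLog\b"
    r"|DELETE\s+(?:FROM\s+)?(?:dbo\.)?AuditLog\b",
    re.IGNORECASE,
)

def contains_audit_log_mutation(line):
    """True when the line contains UPDATE/DELETE DML aimed at the AuditLog table."""
    return _MUTATION.search(line) is not None
```

`DENY UPDATE ON dbo.AuditLog` is not flagged because the token after `UPDATE` is `ON`, not the table name, and the trailing `\b` keeps a table such as `AuditLogArchive` from matching.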
Joseph Doherty 3b79b896cf chore: record M2.8 commit SHA in plan task tracker 2026-06-16 05:28:19 -04:00
Joseph Doherty 7c14a69091 feat(#23): elevate connection-binding completeness to a deploy-gating Error (M2.8)
Pre-deployment validation only WARNED when a data-sourced attribute had no
connection binding, so an instance with unresolved bindings still passed IsValid
and could deploy. There was also no check that a binding resolves to a connection
that actually exists at the target site.

- ValidationService.Validate gains an opt-in `enforceConnectionBindings` flag
  (default false) plus a `siteConnectionNames` set. Default-false keeps the
  template DESIGN-TIME path (ManagementActor.HandleValidateTemplate) non-blocking,
  since bindings are legitimately set later at instance/deploy time. The DEPLOY
  path (FlatteningPipeline) opts in (true) so:
    * a data-sourced attribute with no binding is now a deploy-gating Error;
    * a binding to a connection that does not exist on the target site is an Error.
  Static (non-data-sourced) attributes are never flagged.
- FlatteningPipeline computes the site-connection-names set from the loaded site
  data connections (mirroring M2.1's alarmCapableConnectionNames) and threads it in.
- Tests: TemplateEngine.Tests covers design-time warning / deploy-time error /
  static-ok / exists-at-site / non-existent-connection. New
  FlatteningPipelineConnectionBindingTests proves the deploy path enforces it.

Mark M2.7 + M2.8 completed in the plan task tracker.
2026-06-16 05:28:06 -04:00
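The design-time versus deploy-time split introduced here can be sketched as one function with an opt-in flag. This is an illustrative Python model: the attribute dictionaries and the name `validate_bindings` are hypothetical, and only the flag semantics follow the commit message.

```python
def validate_bindings(attributes, site_connection_names=frozenset(), enforce=False):
    """Return (errors, warnings) for connection-binding completeness.

    attributes: list of dicts with 'name', 'data_sourced', and optional 'binding'
    enforce:    False on the design-time path, True on the deploy path
    """
    errors, warnings = [], []
    for attr in attributes:
        if not attr["data_sourced"]:
            continue  # static attributes are never flagged
        binding = attr.get("binding")
        if binding is None:
            message = f"{attr['name']}: no connection binding"
            # Missing binding: deploy-gating Error when enforced, else a warning.
            (errors if enforce else warnings).append(message)
        elif enforce and binding not in site_connection_names:
            errors.append(f"{attr['name']}: connection '{binding}' not found at site")
    return errors, warnings
```

Defaulting `enforce` to false keeps existing callers non-blocking, so only the deploy path has to change to get the stricter behaviour.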
Joseph Doherty 42d22766c7 docs(plan): mark M2.0-M2.6 complete in tasks.json; record commits + follow-ups 2026-06-15 15:20:04 -04:00
Joseph Doherty 411d0c043b fix(inbound-api): M2.6 review nits — legacy required default, recursion depth guard, return-validator comment (#13)
- legacy flat-array "required":"false" (string) now treated as optional (matches migration)
- depth ceiling (32) on InboundApiSchema Parse/Validate recursion — guards against
  stack-overflow from a deeply-nested stored schema (Parse throws->400, Validate adds error)
- DocOptions.MaxDepth=128 so the application-level structural guard fires before the
  System.Text.Json reader ceiling (each schema level = ~3 JSON reader levels)
- comment the intentional ParameterValidator/ReturnValueValidator early-return asymmetry
- note intentional datetime->string legacy collapse in NormalizeType
- tests: legacy string-false optional, parse/validate depth ceiling, scalar return schema
2026-06-15 15:18:44 -04:00
Joseph Doherty 4b6187c853 feat(inbound-api): nested Object/List extended-type validation (#13)
Object/List parameters and return values were shape-validated only (object vs
array), with no field-level/nested type checks — type-wrong nested data passed
inbound validation and failed only at script runtime. Add recursive type
validation (declared Object field types, List element type, scalars at any depth)
with path-qualified errors, symmetric across ParameterValidator and ReturnValueValidator.

Both validators now parse the canonical JSON Schema definition format (the
Central UI / MigrateParametersToJsonSchema output) via a shared recursive engine,
Commons.Types.InboundApi.InboundApiSchema, instead of the legacy flat
[{name,type}] array which they could not even deserialize from migrated rows.
The legacy flat-array form is still accepted on read for transition safety.
Undeclared fields are rejected at every level (consistent with the existing
top-level unexpected-parameter rejection); a present-but-null value satisfies
any type, only absence of a required field is an error.
2026-06-15 15:04:28 -04:00
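The recursive rules in this commit (nested type checks, path-qualified errors, undeclared fields rejected at every level, null satisfying any type) can be sketched as follows. This is a simplified Python model over a JSON-Schema-like dictionary; the schema keys and the `validate` function are assumptions for illustration and do not reproduce InboundApiSchema.

```python
_SCALARS = {"string": str, "integer": int, "number": (int, float), "boolean": bool}

def validate(value, schema, path="$"):
    """Return a list of path-qualified error strings (empty when valid)."""
    if value is None:
        return []  # a present-but-null value satisfies any type
    kind = schema["type"]
    errors = []
    if kind == "object":
        if not isinstance(value, dict):
            return [f"{path}: expected object"]
        props = schema.get("properties", {})
        for name in value:
            if name not in props:
                errors.append(f"{path}.{name}: undeclared field")
        for name, sub in props.items():
            if name in value:
                errors.extend(validate(value[name], sub, f"{path}.{name}"))
            elif name in schema.get("required", []):
                errors.append(f"{path}.{name}: required field missing")
    elif kind == "array":
        if not isinstance(value, list):
            return [f"{path}: expected array"]
        for i, item in enumerate(value):
            errors.extend(validate(item, schema["items"], f"{path}[{i}]"))
    else:
        # bool is a subclass of int in Python, so reject it for numeric kinds.
        wrong_bool = isinstance(value, bool) and kind != "boolean"
        if wrong_bool or not isinstance(value, _SCALARS[kind]):
            errors.append(f"{path}: expected {kind}")
    return errors
```

Using one engine for both parameters and return values is what keeps the two validators symmetric: each simply calls the same function with a different root schema.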
Joseph Doherty 28bc639786 docs(plan): M2 implementation plan — Tier-2 correctness/behavioral gaps
19 tasks (M2.0-M2.19) covering stillpending.md Tier-2 items #7,#8,#9,#10,
#13,#17,#18,#20-#31, plus pre-existing EF model/snapshot drift (#32, lead item).
Risk-first ordering; migration tasks serialized. Scope decisions recorded:
#19 done in M1.8; #16 deferred to M8; #17 reverts Host-008 per design doc;
#8 filter semantics defined; #15 LDAP re-query spike-gated.
2026-06-15 13:08:37 -04:00
Joseph Doherty 9aa1259504 docs(plans): Phase 1 (M1-M4) implementation plan for stillpending.md
Bite-sized TDD plan. M1 (runtime wiring) fully detailed across 10 tasks
after verifying the purge/reconciliation actors already exist and only
need Host wiring + a gRPC pull client + event-logger injection. M2/M3/M4
as right-sized task inventories with files, classification, and AC.
Co-located .tasks.json for executing-plans resume.
2026-06-15 09:32:14 -04:00
Joseph Doherty f4707745bf docs(plans): completion roadmap for stillpending.md audit
Add the system-completion design doc (risk-first milestones M1-M10):
Phase 1 Stabilize (M1 runtime wiring, M2 correctness, M3 script trust
boundary, M4 doc reconciliation) then Phase 2 Expand (M5-M10 feature
epics). Scope = all Tier 1/2/4 + in-scope Tier 3 features; T12/T19
deferred to own brainstorm; deliberate anti-goals excluded. Also commit
the source audit (stillpending.md).
2026-06-15 09:27:00 -04:00
Joseph Doherty 68f911e634 docs: note alarmOverrides in GetInstanceDocumentAsync; mark template-alarm/override plan complete 2026-06-07 10:31:02 -04:00
Joseph Doherty 9d7e69056a docs(plans): add template-alarm CLI + alarm-override coverage implementation plan 2026-06-07 10:00:48 -04:00
Joseph Doherty 475bfadacd docs(plans): design for template-alarm CLI ergonomics + alarm-override coverage 2026-06-07 09:53:34 -04:00
Joseph Doherty fdea9e0bde docs(plans): mark Wave 4 tasks complete 2026-06-07 04:33:16 -04:00
Joseph Doherty 7fda67be9e docs(plans): add Wave 4 Playwright edge-sweep plan (cross-cutting edge cases) 2026-06-07 03:26:25 -04:00
Joseph Doherty 1eece71c76 docs(plans): mark Wave 3 tasks complete 2026-06-06 16:10:07 -04:00
Joseph Doherty e5bd8d9707 docs(plans): add Wave 3 Playwright coverage-fill plan (Tier 3 config CRUD breadth) 2026-06-06 14:31:18 -04:00
Joseph Doherty 4a993d76da docs: mark Playwright coverage-fill Wave 2 tasks complete 2026-06-06 13:59:21 -04:00
Joseph Doherty 46bc2288bf docs: correct Wave 2 Task 7 Test-D description to the shipped move-under-area scenario 2026-06-06 13:55:19 -04:00
Joseph Doherty b4b38fe52a docs: add Playwright coverage-fill Wave 2 implementation plan (Tier 2 real-time/relay) 2026-06-06 13:18:32 -04:00
Joseph Doherty efb3efe6dc docs: mark Playwright coverage-fill Wave 1 tasks complete 2026-06-06 12:37:47 -04:00
Joseph Doherty 8bd7656110 docs: sync Wave 1 plan with Task 0 review fixes (GetInstanceDocumentAsync, CreateApiKeyAsync) 2026-06-06 11:44:56 -04:00
Joseph Doherty 8e8bf44a29 docs: add Playwright coverage-fill Wave 1 plan (InstanceConfigure, API keys, Transport export) + tasks 2026-06-06 11:32:18 -04:00
Joseph Doherty 58bf59a42d docs: add Playwright coverage-fill design (Tier 1-3 + edge sweep, 4 waves) 2026-06-06 11:23:59 -04:00
Joseph Doherty b540015fbd docs(tests): implementation plan for Playwright coverage expansion
16 task-by-task steps: shared CliRunner + ClusterAvailability skip infra,
DeploymentFixture + deploy/enable/disable/delete suites, notification
retry/discard + parked-messages query, Transport Import round-trip, Site/
Template/LDAP CRUD round-trips, nav render hardening, Health KPI guard, and a
no-residue verification pass. Co-located .tasks.json for resumable execution.
2026-06-05 09:52:12 -04:00
Joseph Doherty cb3b3bf373 docs(tests): design for Playwright coverage expansion (7 audit recs)
Captures the 2026-06-05 coverage audit's gaps and the approved approach for
closing them: ephemeral CLI-provisioned fixtures with outcome-tolerant asserts
for the mutating suites (deploy lifecycle, retry/discard, transport import),
UI CRUD round-trips, nav render hardening, a Health KPI load test, and a
standardized skip-and-log policy. Next: writing-plans turns this into tasks.
2026-06-05 09:39:35 -04:00
Joseph Doherty 5e106df9e6 docs(plans): implementation plan for per-component reference docs
28-task plan: scaffold, AuditLog pilot (approval gate), 24-doc parallel
fan-out, index+README, verification pass. Co-located .tasks.json for resume.
2026-06-03 15:24:05 -04:00
Joseph Doherty e89cf2b278 docs(plans): design for per-component reference docs in docs/components/
Brainstormed design: generate 25 StyleGuide-conformant developer-reference
docs derived from src/ code (pilot AuditLog, then parallel fan-out, then
accuracy/conformance verification). Complements the requirements specs;
leaves src/, XML docs, and specs untouched.
2026-06-03 13:58:14 -04:00