Commit Graph

177 Commits

Author SHA1 Message Date
Joseph Doherty 13cd53ad1c docs(plan): mark disable-login DL-1..DL-4 complete 2026-06-16 08:55:09 -04:00
Joseph Doherty 57302500ac docs(security): document dev disable-login flag + ship default-false config key
Adds a "Dev Disable-Login Flag" subsection to Component-Security.md covering
ScadaBridge:Security:Auth:DisableLogin / User, the AutoLoginAuthenticationHandler
mechanism, and the no-environment-guard / startup-warning production risk.

Ships DisableLogin: false under ScadaBridge → Security → Auth in:
  - src/.../Host/appsettings.json (canonical default)
  - docker/central-node-a/appsettings.Central.json
  - docker/central-node-b/appsettings.Central.json

Also records DL-3 commit SHAs in the plan tasks file.
2026-06-16 08:54:11 -04:00
Joseph Doherty e89604298d feat(security): wire DisableLogin flag — auto-login scheme + startup warning 2026-06-16 08:47:19 -04:00
Joseph Doherty dcd445a380 feat(security): AutoLoginAuthenticationHandler — all-roles system-wide dev auto-login 2026-06-16 08:40:30 -04:00
Joseph Doherty 56d6508a5b docs(plan): implementation plan for dev disable-login flag (4 tasks) 2026-06-16 08:35:05 -04:00
Joseph Doherty 5cf2d1cb99 docs(plan): design for dev disable-login auto-login flag (port from OtOpcUa)
Faithful copy (warning only, no env guard); custom AuthenticationHandler under the
cookie scheme; reuses M2.19 SessionClaimBuilder for an all-roles system-wide principal.
2026-06-16 08:31:19 -04:00
Joseph Doherty 077770fe35 docs(plan): record M2.19 review-fix SHA; M2 (Tier-2) complete — all 20 tasks done 2026-06-16 08:12:49 -04:00
Joseph Doherty c7916d79a8 chore(tasks): record M2.19 implementation commit SHA (8fe7f46) 2026-06-16 07:54:49 -04:00
Joseph Doherty 8fe7f46df6 feat(security): cookie session idle-timeout + LDAP-free role-mapping refresh (#15, M2.19)
Spike outcome: the shared ILdapAuthService (ZB.MOM.WW.Auth.Abstractions, an external
NuGet package) exposes ONLY AuthenticateAsync(username, password, ct) — no passwordless
service-account group-search. A live LDAP group re-query for an active session therefore
requires a new lib method and is OUT OF SCOPE (cannot modify the external package).
Implemented the always-achievable layers (cookie-only; no embedded JWT for cookie principals):

- /auth/login now stores the user's raw LDAP groups (one zb:group claim each) plus a
  zb:lastrolerefresh anchor (login time, UTC), seeding the LastActivity idle anchor too.
- SessionClaimBuilder: single shared DRY claim-builder used by BOTH /auth/login AND the
  refresh path, so the two claim shapes cannot drift (canonical identity/role/scope claims
  with nameType/roleType pinned, plus the M2.19 group + refresh-anchor additions).
- CookieSessionValidator (TimeProvider-injected, unit-testable) + a thin
  CookieAuthenticationEvents.OnValidatePrincipal adapter:
    * idle-timeout: a session past IdleTimeoutMinutes (default 30) is RejectPrincipal+SignOut;
      consistent with the cookie ExpireTimeSpan+SlidingExpiration window (same value).
    * role refresh WITHOUT LDAP: when older than RoleRefreshThresholdMinutes (new option,
      default 15) the DB-backed RoleMapper re-runs on the STORED groups, claims are rebuilt
      via the shared builder, the anchor advances, principal is replaced + cookie renewed.
      Revoked DB mappings drop the user's roles mid-session.
    * fail-soft: any refresh error KEEPS the existing principal (no sign-out, never throws)
      — mirrors the documented "LDAP failure: active sessions continue with current roles".
- Documented residual limitation in Component-Security.md: central role-mapping/scope
  changes apply within ~15 min without LDAP; live directory group-membership changes are
  picked up only at next login (needs a passwordless group-search on the external
  ZB.MOM.WW.Auth.Ldap lib — tracked follow-up).

Tests (Security.Tests, all green): CookieSessionValidatorTests + SessionClaimBuilderParityTests
— idle reject/keep, LDAP-free remap-from-stored-groups, revoked-roles loss, sub-threshold
no-refresh, refresh-throws-keeps-session, and login/refresh claim-parity.
2026-06-16 07:54:31 -04:00
Joseph Doherty 7210cdbcb5 docs: record M2.18 (#26) implementation commit SHA in M2 task tracker 2026-06-16 07:34:06 -04:00
Joseph Doherty d8519cb464 fix(debug-stream): stream-first lifecycle with replay/dedup (#26, M2.18)
Re-architect DebugStreamBridgeActor from snapshot-first to stream-first so no
attribute/alarm event occurring during the snapshot-build + network-transit
window is lost (#26).

Lifecycle change:
- PreStart now opens the gRPC subscription FIRST (alongside sending the
  SubscribeDebugViewRequest), so live events start flowing immediately.
- Phase model via a single _snapshotDelivered flag (mutated only on the actor
  thread). While buffering (snapshot not yet delivered), AttributeValueChanged/
  AlarmStateChanged are appended to an ordered _preSnapshotBuffer instead of
  being delivered. After snapshot+flush, the same handlers pass through directly.
- On DebugViewSnapshot: deliver snapshot, then flush the buffer in arrival order
  with per-entity dedup, then set _snapshotDelivered=true (pass-through).

Dedup rule (exactly-once):
- Identity: attributes by (InstanceUniqueName, AttributePath, AttributeName);
  alarms by (InstanceUniqueName, AlarmName, SourceReference) so native
  per-condition alarms are not conflated. Keys joined with a NUL delimiter
  (declared as an escaped char constant; no raw NUL in source) so distinct
  identities never collide on a space within a name.
- Boundary: a buffered event whose timestamp is <= the snapshot's timestamp for
  the same entity is already reflected -> DROP; strictly-newer (>) -> DELIVER;
  entity absent from the snapshot -> DELIVER (genuine gap-window event).

Preserved paths:
- M2.11 InstanceNotFound: with stream-first the gRPC stream is already open, so
  the not-found path now tears it down (CleanupGrpc) + clears the buffer, does
  NOT enter pass-through, delivers the not-found snapshot, and stops cleanly.
- Reconnect (ReconnectGrpcStream -> OpenGrpcStream) does not touch the phase
  flag: a mid-session reconnect resumes pass-through; a reconnect during the
  buffering phase stays buffering until the snapshot arrives.
- Communication-008 retry/stability/stop/terminate + ReceiveTimeout orphan net
  unchanged. Duplicate/late snapshot after delivery is ignored defensively.

Tests: 10 new M2.18 tests (stream-first ordering, gap-window buffering, dedup
drop/deliver for attrs + alarms, ordering, pass-through, InstanceNotFound
teardown, reconnect-during-buffering, reconnect-after-snapshot) + revised the
M2.11 not-found test to assert stream teardown. Full DebugStreamBridgeActor
class green: 23/23.
2026-06-16 07:33:51 -04:00
Joseph Doherty 473429a202 docs: record M2.14 (#28) commit SHA in M2 task tracker 2026-06-16 06:49:28 -04:00
Joseph Doherty 253bec5a52 feat(host): readiness gates on required cluster singletons (#28, M2.14)
REQ-HOST-4a lists "required cluster singletons running (if applicable)" as a
readiness criterion, but /health/ready only checked database + akka-cluster.
Add a third Ready-tagged check, RequiredSingletonsHealthCheck, registered in the
Central-role AddHealthChecks() chain (so it is naturally role-scoped — site nodes
never run it).

Probe: for each required central singleton, Ask its local ClusterSingletonProxy
an Identify with a short bounded per-singleton timeout (~2s, probes run
concurrently via Task.WhenAll). A non-null ActorIdentity.Subject within the
timeout means the singleton is running and reachable through the proxy; a null
subject or a timeout means unreachable → Unhealthy, naming the unreachable
singleton(s). The check never throws (catch-all → Unhealthy) and resolves
ActorSystem lazily from DI per probe (Unhealthy if Akka not yet up).

Required-always set = the five singleton proxies created unconditionally in
AkkaHostedService.RegisterCentralActors: notification-outbox, audit-log-ingest,
site-call-audit, audit-log-purge, site-audit-reconciliation. There are no
feature/config-gated central singletons today; any future gated singleton is the
"if applicable" case and must NOT be added to the required set.

Leadership-agnostic: the proxy reaches the singleton from either central node, so
a ready standby still reports ready (readiness must not require cluster
leadership — that is the Active tier's job). During a brief singleton handover the
probe may time out and the node flaps to not-ready, which is correct (a node
mid-handover is legitimately not fully ready); no retries, to keep the probe fast.

Tests (TDD): RequiredSingletonsHealthCheckTests exercises the probe against a
TestKit ActorSystem — all proxies present+reachable → Healthy; one missing →
Unhealthy naming it; ActorSystem absent → Unhealthy, no throw. HealthCheckTests
regression-guards the Ready tag + absence of the Active tag on the new check.
2026-06-16 06:49:18 -04:00
Joseph Doherty 722b8663c1 feat(dcl): populate obtainable NativeAlarmTransition fields from OPC UA and MxGateway (#27, M2.13)
OPC UA (RealOpcUaClient):
- Append 5 new SelectClauses at indices 13–17 (never renumber 0–12):
  - 13: AlarmConditionType/ActiveState/TransitionTime → OriginalRaiseTime
  - 14–17: LimitAlarmType HighHighLimit/HighLimit/LowLimit/LowLowLimit → LimitValue
- New OpcUaAlarmMapper.PickLimitValue helper: first non-null in HiHi→Hi→Lo→LoLo
  priority order, InvariantCulture-formatted; empty string for non-limit alarm types.
- HandleAlarmEvent reads new indices with fields.Count > N guards; hard minimum (6)
  unchanged so base ConditionType events still process without the limit fields.
- Document unavailable-by-protocol fields (Category, Description, OperatorUser,
  CurrentValue) inline in BuildAlarmEventFilter and HandleAlarmEvent.

MxGateway (MxGatewayAlarmMapper):
- MapTransition: CurrentValue and LimitValue now populated via MxValueToString
  (uses MxValueExtensions.ToClrValue + InvariantCulture) from OnAlarmTransitionEvent
  proto fields current_value/limit_value.
- MapSnapshot: same — populated from ActiveAlarmSnapshot.current_value/limit_value.
- MxValueToString helper (internal): null-safe MxValue → string conversion.

Tests (17 new, 40 total pass):
- OpcUaAlarmMapperTests: PickLimitValue priority, InvariantCulture, all-null case.
- MxGatewayAlarmMapperTests: CurrentValue/LimitValue populate from double/string
  MxValue; absent fields yield empty strings.
- RealOpcUaClientAlarmFilterTests: index alignment assertions (count=18, per-index
  TypeDefinitionId+BrowsePath), regression guard on existing indices 0–12.
2026-06-16 06:37:19 -04:00
Joseph Doherty e7b6fe33a4 test(configdb): guard test for AuditLog append-only invariant (M2.10, #18)
Adds AuditLogAppendOnlyGuardTests.cs to
tests/ZB.MOM.WW.ScadaBridge.ConfigurationDatabase.Tests/ — a code-level backstop
for the DB-role DENY UPDATE / DENY DELETE control established in migration
20260602174346_CollapseAuditLogToCanonical.

The guard scans every non-Designer, non-Snapshot *.cs file in the
ConfigurationDatabase source tree and fails the test run if any line matches the
DML-syntax pattern:

    UPDATE\s+(?:dbo\.)?AuditLog\b
    DELETE\s+(?:FROM\s+)?(?:dbo\.)?AuditLog\b

The tight DML-syntax pattern naturally excludes false positives without extra
exclusion checks: DENY UPDATE ON dbo.AuditLog is not matched (UPDATE is followed
by ON, not the table name); ALTER TABLE … SWITCH and TRUNCATE contain no UPDATE/
DELETE keyword; comments with UPDATE/AuditLog in separate clauses are not matched.

Self-verifying unit tests (ContainsAuditLogMutation_*) prove the helper:
- returns false on clean-source lines (INSERT, SELECT, DENY DDL, ALTER SWITCH,
  TRUNCATE, DELETE FROM Notifications);
- returns TRUE on planted violations (UPDATE AuditLog SET …, DELETE FROM
  dbo.AuditLog WHERE …, lower-case variants);
- returns false on the exact DENY/GRANT/partition-switch strings from the
  production migration files.

All 256 ConfigurationDatabase.Tests pass; solution builds 0 W / 0 E.
2026-06-16 05:49:51 -04:00
Joseph Doherty 3b79b896cf chore: record M2.8 commit SHA in plan task tracker 2026-06-16 05:28:19 -04:00
Joseph Doherty 7c14a69091 feat(#23): elevate connection-binding completeness to a deploy-gating Error (M2.8)
Pre-deployment validation only WARNED when a data-sourced attribute had no
connection binding, so an instance with unresolved bindings still passed IsValid
and could deploy. There was also no check that a binding resolves to a connection
that actually exists at the target site.

- ValidationService.Validate gains an opt-in `enforceConnectionBindings` flag
  (default false) plus a `siteConnectionNames` set. Default-false keeps the
  template DESIGN-TIME path (ManagementActor.HandleValidateTemplate) non-blocking,
  since bindings are legitimately set later at instance/deploy time. The DEPLOY
  path (FlatteningPipeline) opts in (true) so:
    * a data-sourced attribute with no binding is now a deploy-gating Error;
    * a binding to a connection that does not exist on the target site is an Error.
  Static (non-data-sourced) attributes are never flagged.
- FlatteningPipeline computes the site-connection-names set from the loaded site
  data connections (mirroring M2.1's alarmCapableConnectionNames) and threads it in.
- Tests: TemplateEngine.Tests covers design-time warning / deploy-time error /
  static-ok / exists-at-site / non-existent-connection. New
  FlatteningPipelineConnectionBindingTests proves the deploy path enforces it.

Mark M2.7 + M2.8 completed in the plan task tracker.
2026-06-16 05:28:06 -04:00
Joseph Doherty 42d22766c7 docs(plan): mark M2.0-M2.6 complete in tasks.json; record commits + follow-ups 2026-06-15 15:20:04 -04:00
Joseph Doherty 411d0c043b fix(inbound-api): M2.6 review nits — legacy required default, recursion depth guard, return-validator comment (#13)
- legacy flat-array "required":"false" (string) now treated as optional (matches migration)
- depth ceiling (32) on InboundApiSchema Parse/Validate recursion — guards against
  stack-overflow from a deeply-nested stored schema (Parse throws->400, Validate adds error)
- DocOptions.MaxDepth=128 so the application-level structural guard fires before the
  System.Text.Json reader ceiling (each schema level = ~3 JSON reader levels)
- comment the intentional ParameterValidator/ReturnValueValidator early-return asymmetry
- note intentional datetime->string legacy collapse in NormalizeType
- tests: legacy string-false optional, parse/validate depth ceiling, scalar return schema
2026-06-15 15:18:44 -04:00
Joseph Doherty 4b6187c853 feat(inbound-api): nested Object/List extended-type validation (#13)
Object/List parameters and return values were shape-validated only (object vs
array), with no field-level/nested type checks — type-wrong nested data passed
inbound validation and failed only at script runtime. Add recursive type
validation (declared Object field types, List element type, scalars at any depth)
with path-qualified errors, symmetric across ParameterValidator and ReturnValueValidator.

Both validators now parse the canonical JSON Schema definition format (the
Central UI / MigrateParametersToJsonSchema output) via a shared recursive engine,
Commons.Types.InboundApi.InboundApiSchema, instead of the legacy flat
[{name,type}] array which they could not even deserialize from migrated rows.
The legacy flat-array form is still accepted on read for transition safety.
Undeclared fields are rejected at every level (consistent with the existing
top-level unexpected-parameter rejection); a present-but-null value satisfies
any type, only absence of a required field is an error.
2026-06-15 15:04:28 -04:00
Joseph Doherty 28bc639786 docs(plan): M2 implementation plan — Tier-2 correctness/behavioral gaps
19 tasks (M2.0-M2.19) covering stillpending.md Tier-2 items #7,#8,#9,#10,
#13,#17,#18,#20-#31, plus pre-existing EF model/snapshot drift (#32, lead item).
Risk-first ordering; migration tasks serialized. Scope decisions recorded:
#19 done in M1.8; #16 deferred to M8; #17 reverts Host-008 per design doc;
#8 filter semantics defined; #15 LDAP re-query spike-gated.
2026-06-15 13:08:37 -04:00
Joseph Doherty 9aa1259504 docs(plans): Phase 1 (M1-M4) implementation plan for stillpending.md
Bite-sized TDD plan. M1 (runtime wiring) fully detailed across 10 tasks
after verifying the purge/reconciliation actors already exist and only
need Host wiring + a gRPC pull client + event-logger injection. M2/M3/M4
as right-sized task inventories with files, classification, and AC.
Co-located .tasks.json for executing-plans resume.
2026-06-15 09:32:14 -04:00
Joseph Doherty f4707745bf docs(plans): completion roadmap for stillpending.md audit
Add the system-completion design doc (risk-first milestones M1-M10):
Phase 1 Stabilize (M1 runtime wiring, M2 correctness, M3 script trust
boundary, M4 doc reconciliation) then Phase 2 Expand (M5-M10 feature
epics). Scope = all Tier 1/2/4 + in-scope Tier 3 features; T12/T19
deferred to own brainstorm; deliberate anti-goals excluded. Also commit
the source audit (stillpending.md).
2026-06-15 09:27:00 -04:00
Joseph Doherty 68f911e634 docs: note alarmOverrides in GetInstanceDocumentAsync; mark template-alarm/override plan complete 2026-06-07 10:31:02 -04:00
Joseph Doherty 9d7e69056a docs(plans): add template-alarm CLI + alarm-override coverage implementation plan 2026-06-07 10:00:48 -04:00
Joseph Doherty 475bfadacd docs(plans): design for template-alarm CLI ergonomics + alarm-override coverage 2026-06-07 09:53:34 -04:00
Joseph Doherty fdea9e0bde docs(plans): mark Wave 4 tasks complete 2026-06-07 04:33:16 -04:00
Joseph Doherty 7fda67be9e docs(plans): add Wave 4 Playwright edge-sweep plan (cross-cutting edge cases) 2026-06-07 03:26:25 -04:00
Joseph Doherty 1eece71c76 docs(plans): mark Wave 3 tasks complete 2026-06-06 16:10:07 -04:00
Joseph Doherty e5bd8d9707 docs(plans): add Wave 3 Playwright coverage-fill plan (Tier 3 config CRUD breadth) 2026-06-06 14:31:18 -04:00
Joseph Doherty 4a993d76da docs: mark Playwright coverage-fill Wave 2 tasks complete 2026-06-06 13:59:21 -04:00
Joseph Doherty 46bc2288bf docs: correct Wave 2 Task 7 Test-D description to the shipped move-under-area scenario 2026-06-06 13:55:19 -04:00
Joseph Doherty b4b38fe52a docs: add Playwright coverage-fill Wave 2 implementation plan (Tier 2 real-time/relay) 2026-06-06 13:18:32 -04:00
Joseph Doherty efb3efe6dc docs: mark Playwright coverage-fill Wave 1 tasks complete 2026-06-06 12:37:47 -04:00
Joseph Doherty 8bd7656110 docs: sync Wave 1 plan with Task 0 review fixes (GetInstanceDocumentAsync, CreateApiKeyAsync) 2026-06-06 11:44:56 -04:00
Joseph Doherty 8e8bf44a29 docs: add Playwright coverage-fill Wave 1 plan (InstanceConfigure, API keys, Transport export) + tasks 2026-06-06 11:32:18 -04:00
Joseph Doherty 58bf59a42d docs: add Playwright coverage-fill design (Tier 1-3 + edge sweep, 4 waves) 2026-06-06 11:23:59 -04:00
Joseph Doherty b540015fbd docs(tests): implementation plan for Playwright coverage expansion
16 task-by-task steps: shared CliRunner + ClusterAvailability skip infra,
DeploymentFixture + deploy/enable/disable/delete suites, notification
retry/discard + parked-messages query, Transport Import round-trip, Site/
Template/LDAP CRUD round-trips, nav render hardening, Health KPI guard, and a
no-residue verification pass. Co-located .tasks.json for resumable execution.
2026-06-05 09:52:12 -04:00
Joseph Doherty cb3b3bf373 docs(tests): design for Playwright coverage expansion (7 audit recs)
Captures the 2026-06-05 coverage audit's gaps and the approved approach for
closing them: ephemeral CLI-provisioned fixtures with outcome-tolerant asserts
for the mutating suites (deploy lifecycle, retry/discard, transport import),
UI CRUD round-trips, nav render hardening, a Health KPI load test, and a
standardized skip-and-log policy. Next: writing-plans turns this into tasks.
2026-06-05 09:39:35 -04:00
Joseph Doherty 5e106df9e6 docs(plans): implementation plan for per-component reference docs
28-task plan: scaffold, AuditLog pilot (approval gate), 24-doc parallel
fan-out, index+README, verification pass. Co-located .tasks.json for resume.
2026-06-03 15:24:05 -04:00
Joseph Doherty e89cf2b278 docs(plans): design for per-component reference docs in docs/components/
Brainstormed design: generate 25 StyleGuide-conformant developer-reference
docs derived from src/ code (pilot AuditLog, then parallel fan-out, then
accuracy/conformance verification). Complements the requirements specs;
leaves src/, XML docs, and specs untouched.
2026-06-03 13:58:14 -04:00
Joseph Doherty 4db8c373af fix(auth): ScadaBridge 1.2 review fixes — secret-test repoint, checklist, Scope guard, 0.1.1 pin 2026-06-02 01:23:52 -04:00
Joseph Doherty 43228185b4 docs: convert standard diagrams from draw.io PNGs to inline Mermaid
Gitea renders mermaid inline, so the flow/state/hierarchy/DAG diagrams
move to text-in-markdown: auto-layout (removes the manual overlap-prone
draw.io step), diffable source, no committed binaries, and a dark-text
theme so labels stay legible. Keep draw.io PNGs only for the two complex
bespoke diagrams (logical architecture, env2 topology) where pixel
control still wins. All 24 mermaid blocks validated by rendering.
2026-06-01 00:23:00 -04:00
Joseph Doherty bdee12f4e9 docs: render architecture & flow diagrams as draw.io charts
Replace ASCII-art diagrams across the README and docs/ with editable
.drawio sources plus exported PNGs, so the diagrams render clearly in
rendered markdown and can be maintained/regenerated instead of being
hand-edited as fragile text art. Non-diagram blocks (code, folder
trees, UI wireframes) were left as text.
2026-05-31 23:32:53 -04:00
Joseph Doherty 300841b205 chore: mark rename plan complete (all 7 tasks done) 2026-05-31 22:05:13 -04:00
Joseph Doherty 3797af7f0f chore: mark rename plan tasks 0-4 complete 2026-05-31 21:59:57 -04:00
Joseph Doherty a47317d010 docs: record git-ignored deploy/ scrub gap + resolution (incl. LDAP directory rename) 2026-05-31 21:58:49 -04:00
Joseph Doherty c899cb162c refactor: scrub residual ScadaLink refs → ScadaBridge (env vars, config keys, assembly name, SQL login)
Renames the 13 SCADALINK_* runtime env vars → SCADABRIDGE_*, the ScadaLink__
.NET config keys → ScadaBridge__, the stale ScadaLink.Host.exe assembly name
→ ZB.MOM.WW.ScadaBridge.Host.exe, the scadalink_app SQL login → scadabridge_app,
and residual identifiers/comments/docs. Migration records (prior rename
tooling/design, DB-rename helper, this scrub script) carved out.

Adds tools/scrub-scadalink-refs.sh.
2026-05-31 21:50:38 -04:00
Joseph Doherty d317c07ea5 docs: add folder/repo rename implementation plan + task file 2026-05-31 21:39:23 -04:00
Joseph Doherty e01f3bdabe docs: add folder/repo rename design doc (scadalink-design → ScadaBridge) 2026-05-31 21:32:07 -04:00