36 Commits

Author SHA1 Message Date
Joseph Doherty 57302500ac docs(security): document dev disable-login flag + ship default-false config key
Adds a "Dev Disable-Login Flag" subsection to Component-Security.md covering
ScadaBridge:Security:Auth:DisableLogin / User, the AutoLoginAuthenticationHandler
mechanism, and the no-environment-guard / startup-warning production risk.

Ships DisableLogin: false under ScadaBridge → Security → Auth in:
  - src/.../Host/appsettings.json (canonical default)
  - docker/central-node-a/appsettings.Central.json
  - docker/central-node-b/appsettings.Central.json

Also records DL-3 commit SHAs in the plan tasks file.
2026-06-16 08:54:11 -04:00
Joseph Doherty e89604298d feat(security): wire DisableLogin flag — auto-login scheme + startup warning 2026-06-16 08:47:19 -04:00
Joseph Doherty d81f747434 feat(health): wire ISiteEventLogger.FailedWriteCount into SiteHealthReport (#30, M2.16)
Add SiteHealthReport.SiteEventLogWriteFailures (trailing optional long = 0,
additive-only), ISiteHealthCollector.SetSiteEventLogWriteFailures (default
no-op so existing fakes compile), and SiteEventLogFailureCountReporter
(hosted service in HealthMonitoring, Func<long> delegate to avoid the
HealthMonitoring → StoreAndForward → SiteEventLogging cycle).

Registration helper AddSiteEventLogHealthMetricsBridge added to
HealthMonitoring.ServiceCollectionExtensions; wired in
SiteServiceRegistration after AddSiteEventLogging.

Tests: SiteEventLogWriteFailuresMetricTests (4 collector tests) +
SiteEventLogFailureCountReporterTests (2 poller tests) in
HealthMonitoring.Tests. 79/79 HealthMonitoring.Tests green,
59/59 SiteEventLogging.Tests green, 0 warnings.
2026-06-16 07:14:54 -04:00
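
The delegate indirection in d81f747434 can be sketched as follows. This is a minimal Python stand-in for the C# design, not the repository's code: the class and method names are hypothetical, and a plain callable plays the role of the Func<long> delegate. The point it illustrates is that the reporter only needs "something that returns a count", so the health-monitoring side never has to reference the event-logging side.

```python
from typing import Callable


class SiteHealthCollector:
    """Hypothetical stand-in for ISiteHealthCollector."""

    def __init__(self) -> None:
        self.site_event_log_write_failures = 0

    def set_site_event_log_write_failures(self, count: int) -> None:
        self.site_event_log_write_failures = count


class FailureCountReporter:
    """Copies a counter into the collector without knowing who owns the counter."""

    def __init__(self, read_count: Callable[[], int],
                 collector: SiteHealthCollector) -> None:
        self._read_count = read_count  # plays the role of the Func<long> delegate
        self._collector = collector

    def poll_once(self) -> None:
        self._collector.set_site_event_log_write_failures(self._read_count())


class FakeEventLogger:
    """Hypothetical stand-in for ISiteEventLogger with a FailedWriteCount."""

    def __init__(self) -> None:
        self.failed_write_count = 0


# The composition root is the only place that knows about both sides.
event_logger = FakeEventLogger()
collector = SiteHealthCollector()
reporter = FailureCountReporter(lambda: event_logger.failed_write_count, collector)

event_logger.failed_write_count = 3
reporter.poll_once()
```

In the real service the poll would run on a timer inside the hosted service; the single poll_once call here only shows the data flow.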
Joseph Doherty e1ee37e508 fix(siteeventlog): gate EventLogPurge to active node via IClusterNodeProvider.SelfIsPrimary (#29, M2.15) 2026-06-16 07:02:26 -04:00
Joseph Doherty 6b1cb9e0e6 refactor(host)/test: M2.14 review nits — simplify probe cancellation + pre-cancelled-token test (#28)
- Remove the linked CancellationTokenSource in ProbeAsync; pass the
  framework cancellationToken and ProbeTimeout directly to Ask, which already
  honours both the timeout and the token, so the two-CTS pattern was redundant.
- Add EchoActor XML <remarks> explaining why no Receive<Identify> handler is
  needed (ActorBase answers Identify automatically).
- Add PreCancelledToken_ReportsUnhealthy_DoesNotThrow test: verifies the
  never-throws guarantee on the shutdown-race path (token already cancelled
  before CheckHealthAsync is invoked).
2026-06-16 06:54:28 -04:00
Joseph Doherty 253bec5a52 feat(host): readiness gates on required cluster singletons (#28, M2.14)
REQ-HOST-4a lists "required cluster singletons running (if applicable)" as a
readiness criterion, but /health/ready only checked database + akka-cluster.
Add a third Ready-tagged check, RequiredSingletonsHealthCheck, registered in the
Central-role AddHealthChecks() chain (so it is naturally role-scoped — site nodes
never run it).

Probe: for each required central singleton, Ask its local ClusterSingletonProxy
an Identify with a short bounded per-singleton timeout (~2s, probes run
concurrently via Task.WhenAll). A non-null ActorIdentity.Subject within the
timeout means the singleton is running and reachable through the proxy; a null
subject or a timeout means unreachable → Unhealthy, naming the unreachable
singleton(s). The check never throws (catch-all → Unhealthy) and resolves
ActorSystem lazily from DI per probe (Unhealthy if Akka not yet up).

Required-always set = the five singleton proxies created unconditionally in
AkkaHostedService.RegisterCentralActors: notification-outbox, audit-log-ingest,
site-call-audit, audit-log-purge, site-audit-reconciliation. There are no
feature/config-gated central singletons today; any future gated singleton is the
"if applicable" case and must NOT be added to the required set.

Leadership-agnostic: the proxy reaches the singleton from either central node, so
a ready standby still reports ready (readiness must not require cluster
leadership — that is the Active tier's job). During a brief singleton handover the
probe may time out and the node flaps to not-ready, which is correct (a node
mid-handover is legitimately not fully ready); no retries, to keep the probe fast.

Tests (TDD): RequiredSingletonsHealthCheckTests exercises the probe against a
TestKit ActorSystem — all proxies present+reachable → Healthy; one missing →
Unhealthy naming it; ActorSystem absent → Unhealthy, no throw. HealthCheckTests
regression-guards the Ready tag + absence of the Active tag on the new check.
2026-06-16 06:49:18 -04:00
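
The probe shape described in 253bec5a52 (concurrent per-singleton probes, a bounded timeout each, no retries, never throwing) can be sketched as below. This is an illustrative Python/asyncio analogue, not the Akka.NET implementation: the probe callables stand in for "Ask the proxy an Identify", a None result stands in for a null ActorIdentity.Subject, and the timeout is shortened from the commit's ~2s so the demo runs quickly.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List, Optional, Tuple

PROBE_TIMEOUT_SECONDS = 0.2  # assumption for the demo; the commit uses ~2s

Probe = Callable[[], Awaitable[Optional[str]]]


async def probe_one(name: str, probe: Probe, timeout: float) -> Tuple[str, bool]:
    """Reachable only if the probe answers with a non-None subject in time."""
    try:
        subject = await asyncio.wait_for(probe(), timeout)
        return name, subject is not None
    except Exception:
        # A timeout or any other fault counts as unreachable; nothing propagates.
        return name, False


async def check_required_singletons(
    probes: Dict[str, Probe], timeout: float = PROBE_TIMEOUT_SECONDS
) -> Tuple[bool, List[str]]:
    """Run all probes concurrently and name the singletons that did not answer."""
    results = await asyncio.gather(
        *(probe_one(name, probe, timeout) for name, probe in probes.items())
    )
    unreachable = [name for name, ok in results if not ok]
    return (not unreachable, unreachable)


async def reachable() -> Optional[str]:
    return "actor-ref"


async def null_subject() -> Optional[str]:
    return None


async def hangs() -> Optional[str]:
    await asyncio.sleep(10)
    return "too-late"


healthy, unreachable = asyncio.run(check_required_singletons({
    "notification-outbox": reachable,
    "audit-log-ingest": null_subject,
    "site-call-audit": hangs,
}))
```

Because the probes run concurrently, the whole check takes roughly one timeout rather than one per singleton, which is what keeps the readiness endpoint fast.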
Joseph Doherty 76198b36e3 fix(host): add MachineDataDb startup validation for Central (reverts Host-008, M2.9 #17)
REQ-HOST-3/REQ-HOST-4 require a MachineDataDb connection string for Central nodes.
The shipped docker appsettings (docker/central-node-a/appsettings.Central.json and
central-node-b) already carry the key. Host-008 had removed the fail-fast Require
because MachineDataDb had no consumer yet; this commit reverses that decision so a
misconfigured or missing connection string is caught at startup with a clear error.

Changes:
- DatabaseOptions: add MachineDataDb property with XML doc comment
- StartupValidator: add .Require for ScadaBridge:Database:MachineDataDb inside the
  existing Central .When block, immediately after the ConfigurationDb Require
- StartupValidatorTests: rename Central_MissingMachineDataDb_PassesValidation ->
  FailsValidation and flip to Assert.Throws; update comment to cite REQ-HOST-3/4,
  shipped docker appsettings, and the Host-008 reversal; add MachineDataDb to
  ValidCentralConfig() so all other Central tests remain green
- CentralDbTestEnvironment: supply ScadaBridge__Database__MachineDataDb env var
  (mirrors ConfigurationDb pattern) so HostStartupTests, HealthCheckTests, and
  MetricsEndpointTests pass through the new Require
- CompositionRootTests, AkkaHostedServiceAuditWiringTests, ActorPathTests: set
  ScadaBridge__Database__MachineDataDb env var alongside the pepper env var and
  clear it in Dispose, matching the existing pepper handling pattern

Build: 0 warnings, 0 errors. dotnet test Host.Tests: 233/233 passed.
2026-06-16 05:41:25 -04:00
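
The role-scoped fail-fast check in 76198b36e3 amounts to "require these keys only when the node is Central". A minimal Python sketch of that idea follows. The function name, the "Site" role label, the ValueError, and the full ScadaBridge:Database:ConfigurationDb key path are assumptions for illustration; only the MachineDataDb key path and the Central role come from the commit message.

```python
from typing import List, Mapping

# Keys required for Central nodes only. The ConfigurationDb path is assumed
# to mirror the MachineDataDb path named in the commit.
CENTRAL_REQUIRED_KEYS = (
    "ScadaBridge:Database:ConfigurationDb",
    "ScadaBridge:Database:MachineDataDb",
)


def validate_startup(config: Mapping[str, str], role: str) -> None:
    """Raise at startup if a Central node lacks a required connection string."""
    missing: List[str] = []
    if role == "Central":
        for key in CENTRAL_REQUIRED_KEYS:
            if not config.get(key, "").strip():
                missing.append(key)
    if missing:
        raise ValueError("Missing required configuration: " + ", ".join(missing))


central_config = {
    "ScadaBridge:Database:ConfigurationDb": "Server=db;Database=config",
    "ScadaBridge:Database:MachineDataDb": "Server=db;Database=machine",
}
validate_startup(central_config, role="Central")  # passes silently
validate_startup({}, role="Site")                 # Central-only keys not required
```

Dropping MachineDataDb from central_config would make the first call raise, which mirrors the renamed Central_MissingMachineDataDb test flipping from passing to failing.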
Joseph Doherty 963e3427da feat(sitecallaudit): PullSiteCalls reconciliation plumbing (store read + RPC + site handler + central client)
Site Call Audit (#22): build the documented periodic reconciliation PULL
self-heal path for the eventually-consistent central SiteCalls mirror, as a
dedicated PullSiteCalls gRPC RPC kept separate from the audit pull. This is the
pull PLUMBING only; the central reconciliation tick is a separate follow-up.

- IOperationTrackingStore.ReadChangedSinceAsync(sinceUtc, batchSize): inclusive
  UpdatedAtUtc cursor, oldest-first, batch-capped; SQLite impl projects tracking
  rows onto SiteCallOperational (Kind->Channel, TargetSummary->Target, SourceSite
  left empty - the store has no site-id column).
- sitestream.proto: rpc PullSiteCalls + PullSiteCallsRequest/Response, mirroring
  PullAuditEvents; regenerated checked-in SiteStreamGrpc/*.cs.
- SiteCallDtoMapper.ToDto(SiteCallOperational): inverse of FromDto for the handler.
- SiteStreamGrpcServer.PullSiteCalls handler + SetOperationTrackingStore seam;
  Host wires the seam alongside SetSiteAuditQueue (site roles only).
- Central IPullSiteCallsClient + GrpcPullSiteCallsClient (home: AuditLog/Central to
  reuse ISiteEnumerator; SiteCallAudit does not reference AuditLog). Re-stamps
  SourceSite from the dialed siteId; no-throw on tolerable transport faults;
  SpecifyKind (not ToUniversalTime) cursor handling. Central-only DI registration.

Tests: ReadChangedSinceAsync (4), PullSiteCalls handler (6), GrpcPullSiteCallsClient
(8). Full solution build 0 warnings/0 errors (TreatWarningsAsErrors).
2026-06-15 10:39:06 -04:00
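
The ReadChangedSinceAsync contract in 963e3427da (inclusive UpdatedAtUtc cursor, oldest-first, batch-capped) can be illustrated with a small SQLite query. This is a sketch under assumptions: the table and column names are hypothetical, timestamps are stored as fixed-format ISO-8601 UTC strings so that text ordering matches time ordering, and the id tie-breaker is added here only to make the demo deterministic.

```python
import sqlite3
from typing import List, Tuple

conn = sqlite3.connect(":memory:")
# Hypothetical schema; the real tracking store has more columns.
conn.execute(
    "CREATE TABLE tracking (id TEXT PRIMARY KEY, updated_at_utc TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO tracking (id, updated_at_utc) VALUES (?, ?)",
    [
        ("a", "2026-06-15T10:00:00Z"),
        ("b", "2026-06-15T10:00:05Z"),
        ("c", "2026-06-15T10:00:05Z"),
        ("d", "2026-06-15T10:00:09Z"),
    ],
)


def read_changed_since(
    db: sqlite3.Connection, since_utc: str, batch_size: int
) -> List[Tuple[str, str]]:
    """Return rows with updated_at_utc >= since_utc, oldest first, capped."""
    return db.execute(
        "SELECT id, updated_at_utc FROM tracking "
        "WHERE updated_at_utc >= ? "           # inclusive cursor
        "ORDER BY updated_at_utc ASC, id ASC "  # oldest first
        "LIMIT ?",                              # batch cap
        (since_utc, batch_size),
    ).fetchall()


first_page = read_changed_since(conn, "2026-06-15T10:00:00Z", 2)
from_boundary = read_changed_since(conn, "2026-06-15T10:00:05Z", 10)
```

An inclusive cursor re-reads rows that share the boundary timestamp (here both b and c), so a consumer of such a pull generally needs to apply rows idempotently.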
Joseph Doherty 36a08a4145 feat(audit): start purge + reconciliation singletons; production ISiteEnumerator 2026-06-15 10:00:44 -04:00
Joseph Doherty d33617d65d fix(host): register ActorSystem as DI singleton so health-probe scopes don't dispose it (HOST-021)
Per-probe health-check child scopes were disposing the AddTransient-bridged
ActorSystem (IDisposable), terminating the live cluster node ~4s after boot and
leaving every singleton-proxy Ask to hang the full 30s QueryTimeout — the central
report pages (/notifications, /site-calls, /monitoring/health) loaded in ~30s.
Bridge it as a singleton via a new lazy AkkaHostedService.GetOrCreateActorSystem()
so child-scope disposal never touches it. Verified: 0 post-startup terminates,
healthy active/standby, report pages ~0.05s, Playwright 68 passed / 0 failed.
2026-06-05 08:26:09 -04:00
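
The failure mode fixed in d33617d65d can be modelled in a few lines. The sketch below is a toy Python model, not Microsoft.Extensions.DependencyInjection itself, and all names in it are hypothetical. It shows the one behaviour that matters here: a scope disposes the disposable instances it handed out from a transient registration, even when the factory returned a shared, long-lived object, whereas a singleton is not owned by the child scope.

```python
from typing import Callable, List


class FakeActorSystem:
    """Hypothetical stand-in for a disposable ActorSystem."""

    def __init__(self) -> None:
        self.terminated = False

    def dispose(self) -> None:
        self.terminated = True


class Scope:
    """Toy child scope: disposes what it owns when the scope ends."""

    def __init__(self, factory: Callable[[], FakeActorSystem],
                 transient: bool) -> None:
        self._factory = factory
        self._transient = transient
        self._owned: List[FakeActorSystem] = []

    def resolve(self) -> FakeActorSystem:
        instance = self._factory()
        if self._transient:
            # Transient disposables are tracked by the resolving scope.
            self._owned.append(instance)
        return instance

    def __enter__(self) -> "Scope":
        return self

    def __exit__(self, *exc_info: object) -> None:
        for instance in self._owned:
            instance.dispose()


# Transient bridge: one health probe ends up terminating the live system.
live_system = FakeActorSystem()
with Scope(lambda: live_system, transient=True) as probe_scope:
    probe_scope.resolve()
terminated_via_transient = live_system.terminated

# Singleton bridge: the child scope never owns the instance.
other_system = FakeActorSystem()
with Scope(lambda: other_system, transient=False) as probe_scope:
    probe_scope.resolve()
terminated_via_singleton = other_system.terminated
```

Under this model the first health probe after boot is enough to terminate the shared system, which is consistent with the early termination the commit reports.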
Joseph Doherty eabf270d71 docs: complete XML doc coverage (returns, summaries, inheritdoc)
Resolve all 622 issues flagged by the enhanced CommentChecker: add missing
<returns> tags (incl. the standard phrasing on non-generic Task methods),
add missing <summary> tags, and replace misused/redundant <inheritdoc/> on
members that override or implement nothing with real documentation.
Documentation-only — no behavior change; solution builds clean.
2026-06-03 11:39:32 -04:00
Joseph Doherty 9f18badf02 build(host): declare ZB.MOM.WW.Theme directly (not transitively via CentralUI)
Host/App.razor uses the kit's <ThemeHead/>/<ThemeScripts/>, but Host had no direct
PackageReference — it relied on CentralUI re-exporting the package transitively.
Add a versionless <PackageReference Include="ZB.MOM.WW.Theme"/> (version pinned by
Central Package Management in Directory.Packages.props) so the declared dependency
matches actual usage and survives any future PrivateAssets change or refactor on
CentralUI. Additive only; Host builds clean (0/0).
2026-06-03 04:52:00 -04:00
Joseph Doherty 6d75bdb372 feat(host): use ZB.MOM.WW.Theme ThemeHead + ThemeScripts 2026-06-03 03:23:03 -04:00
Joseph Doherty e1589497f1 build(centralui): reference ZB.MOM.WW.Theme 0.2.0 2026-06-03 03:21:44 -04:00
Joseph Doherty 6ae605160c chore(auth): ScadaBridge unify dev LDAP base DN to dc=zb,dc=local (Task 1.6)
Replace dc=scadabridge,dc=local with dc=zb,dc=local in all dev/test LDAP
references — app config, docker test-cluster node configs (docker/ and
docker-env2/), GLAuth fixture, dev tooling, Host.Tests fixtures,
IntegrationTests factory, and operational test_infra docs. OU structure
(ou=SCADA-Admins,ou=users,etc.) preserved throughout. Email domains
(@scadabridge.local), hostnames, and container names are untouched.
Historical plan docs (2026-05-24-second-environment.md,
2026-05-31-folder-repo-rename-scadabridge-design.md) excluded as
point-in-time records. No synthetic dc=example,dc=com placeholders touched.
2026-06-02 06:54:14 -04:00
Joseph Doherty afa55981d5 feat(auth)!: ScadaBridge retire SQL Server ApiKey entity + ApprovedApiKeyIds + legacy hashing; EF migration RetireInboundApiKeyStore; re-issue runbook + CHANGELOG (re-arch C5/E) — BREAKING: X-API-Key -> Bearer sbk_, keys re-issued 2026-06-02 05:39:59 -04:00
Joseph Doherty 55099b19f6 fix(auth): move AddZbLdapAuth to Host composition root so component-lib AddSecurity() drops IConfiguration param (satisfy OptionsTests arch rule; fix pre-existing ac34dac red); behaviour-preserving 2026-06-02 03:50:16 -04:00
Joseph Doherty d09def2be0 feat(auth): ScadaBridge re-pin Auth 0.1.3 + add IInboundApiKeyAdmin seam over library admin facade (re-arch C1, additive) 2026-06-02 03:32:25 -04:00
Joseph Doherty 1fcc4f5c2b fix(auth): ScadaBridge inbound auth review fixes — scope-before-DB, pinned 403 body, pepper fail-fast, log category 2026-06-02 02:50:10 -04:00
Joseph Doherty a94558c289 feat(auth): ScadaBridge inbound API — adopt ZB.MOM.WW.Auth.ApiKeys verifier + Bearer + scope=method (re-arch A+B); additive, old path retired later 2026-06-02 02:40:18 -04:00
Joseph Doherty ac34dac479 feat(auth): cut ScadaBridge over to ZB.MOM.WW.Auth.Ldap; nest+rename Ldap config; roles+sitescope via IGroupRoleMapper (Task 1.2/1.4) 2026-06-02 01:04:34 -04:00
Joseph Doherty 145d2668e2 fix: wire ValidateOnStart for ScadaBridge HealthMonitoring + Cluster options (fail-fast at startup) 2026-06-01 23:07:46 -04:00
Joseph Doherty 6dbbc7ad04 refactor: ScadaBridge StartupValidator -> ConfigPreflight (byte-compatible) 2026-06-01 19:04:13 -04:00
Joseph Doherty ccf43312e8 feat(scadabridge): config-driven OTLP exporter opt-in (default Prometheus) 2026-06-01 17:14:35 -04:00
Joseph Doherty c41cb41c7b fix(scadabridge): default MetricsPort to 8084 (avoid site RemotingPort collision) + validate port distinctness 2026-06-01 16:46:59 -04:00
Joseph Doherty fe25ac3e51 feat(scadabridge): add ScadaBridgeTelemetry meter + 4 instruments; register with OTel 2026-06-01 16:41:52 -04:00
Joseph Doherty bbc9f09268 feat(scadabridge): add HTTP/1.1 metrics listener on site nodes (NodeOptions.MetricsPort=8082) 2026-06-01 16:36:59 -04:00
Joseph Doherty f743ffaad2 feat(scadabridge): add shared TraceContextEnricher to log pipeline (trace correlation) 2026-06-01 15:40:42 -04:00
Joseph Doherty b3070c0bda feat(scadabridge): wire AddZbTelemetry + /metrics in both composition roots 2026-06-01 15:36:55 -04:00
Joseph Doherty 20a31835cf build(scadabridge): reference ZB.MOM.WW.Telemetry packages from Gitea feed 2026-06-01 15:30:00 -04:00
Joseph Doherty adf1bd2693 build: drop orphaned AspNetCore.HealthChecks.UI.Client ref (UIResponseWriter removed) 2026-06-01 13:56:12 -04:00
Joseph Doherty bbff1d19b5 feat: adopt shared ZB.MOM.WW.Health probes; add /healthz; canonical writer 2026-06-01 13:46:49 -04:00
Joseph Doherty 2a7ff03718 feat: bridge ActorSystem into DI (transient) for shared health checks 2026-06-01 13:37:21 -04:00
Joseph Doherty 38e48299a4 build: reference ZB.MOM.WW.Health packages from the Gitea feed 2026-06-01 13:30:33 -04:00
Joseph Doherty c899cb162c refactor: scrub residual ScadaLink refs → ScadaBridge (env vars, config keys, assembly name, SQL login)
Renames the 13 SCADALINK_* runtime env vars → SCADABRIDGE_*, the ScadaLink__
.NET config keys → ScadaBridge__, the stale ScadaLink.Host.exe assembly name
→ ZB.MOM.WW.ScadaBridge.Host.exe, the scadalink_app SQL login → scadabridge_app,
and residual identifiers/comments/docs. Migration records (prior rename
tooling/design, DB-rename helper, this scrub script) carved out.

Adds tools/scrub-scadalink-refs.sh.
2026-05-31 21:50:38 -04:00
Joseph Doherty 7b0b9c7365 refactor: rename ScadaLink → ZB.MOM.WW.ScadaBridge (code + projects + namespaces)
Solution + 23 src projects + 26 test projects renamed; folders, csproj,
namespaces, and the ScadaLinkDbContext → ScadaBridgeDbContext class updated.
ActorSystem "scadalink" → "scadabridge", Akka seed-node URLs migrated.
SQL roles/logins, LDAP domains, CLI command name, and CLI config dir
(~/.scadalink → ~/.scadabridge) also renamed.

Build green; 5 Host.Tests fail awaiting SQL login rename in next commit.
Pre-existing StaleTagMonitor timing flakes unchanged.

Rename script committed at tools/rename-to-scadabridge.sh.
2026-05-28 09:37:45 -04:00