Commit Graph

36 Commits

Author SHA1 Message Date
Joseph Doherty 6ae0fea558 fix(error-handling): close Theme 4 — 18 cancellation / fire-and-forget findings
Async cancellation hygiene, fire-and-forget observability, retry/shutdown
semantics, and audit-row coverage across 9 modules. Highlights:

Cancellation & lifecycle:
- AuditLog-006: SqliteAuditWriter.Dispose hops to thread pool, escaping the
  captured SyncContext that risked sync-over-async deadlock.
- AuditLog-010: SiteAuditTelemetryActor owns a private lifecycle CTS,
  threaded through drain paths instead of CancellationToken.None.
- Comm-019: CentralCommunicationActor adds lifecycle CTS for repo calls.
- Host-019: Migration StartupRetry forwards ApplicationStopping so SIGTERM
  during the bounded-retry window aborts cleanly.

Cursor / retry / counter correctness:
- AuditLog-004: SiteAuditReconciliationActor's cursor now holds at `since`
  when any row's idempotent insert is still being retried (per-EventId
  retry counter, MaxPermanentInsertAttempts=5 escape valve with LogCritical
  abandon). No more silent abandonment of permanently-failing rows.
- ConfigDB-019: Dropped the catch-and-continue on EnsureLookaheadAsync's
  SPLIT loop — by class-doc construction the catch could only mask real
  failures and let the next iteration create permanent partition holes.
- HM-017/018: HealthReportSender + CentralHealthReportLoop snapshot
  per-interval counters before sending, restore via new
  ISiteHealthCollector.AddIntervalCounters on transport failure so counts
  aren't silently lost.

Fire-and-forget / shutdown waits:
- InboundAPI-018: AuditWriteMiddleware observes faulted audit-write tasks
  via OnlyOnFaulted continuation (Warning log; response unchanged).
- SnF-024: StoreAndForwardService.StopAsync awaits in-flight retry sweep
  with a bounded SweepShutdownWaitTimeout (10s).

Leak / refactor:
- Comm-021: SiteStreamGrpcServer.SubscribeInstance wraps Subscribe in its
  own try/catch so a throw doesn't leak the relay actor or _activeStreams
  entry.
- Comm-022: VERIFIED already-closed by Comm-016's dead-code purge.
- CLI-017: BundleCommands' three subcommands delegate to ExecuteCommandAsync
  (auth-failure exit-code contract unified).

Defensive / validation:
- CLI-021: CliConfig.Load wraps file-read/JSON parse so malformed config
  prints a warning and returns defaults instead of crashing the CLI.
- Host-022: ParseLevel emits stderr one-shot warning for unrecognised
  MinimumLevel instead of silently coercing to Information.
- ESG-019: ExternalSystemClient sets HttpClient.Timeout=Infinite so the
  per-call CTS is the sole timeout source (was clipped to 100s by .NET).
- Security-020: New SecurityOptionsValidator (IValidateOptions) rejects
  empty LdapServer/LdapSearchBase with ValidateOnStart.
- DM-019: Lifecycle command timeouts now emit DisableTimedOut/EnableTimedOut/
  DeleteTimedOut audit entries (mirrors DeployFailed pattern).

Plus reconciled stale per-module Open-findings counters that had drifted
from prior sessions.

20+ new regression tests across 11 test projects; build clean; affected
suites all green. README regenerated: 75 open (was 93).
2026-05-28 07:13:28 -04:00
Joseph Doherty 487859bff0 docs+code: close Theme 1 — 24 design-doc / XML-doc drift findings
Doc/XML-comment drift + small adherence fixes across 17 modules. Highlights:
- Host-017: site CoordinatedShutdown ordering — SiteStreamGrpcServer gains
  CancelAllStreams() (refuse new streams, cancel active), wired into
  Program.cs site branch via ApplicationStopping.
- InboundAPI-021: ParentExecutionId now travels on RouteToGet/SetAttributes
  symmetric with RouteToCallRequest; RouteHelper stamps from _parentExecutionId.
- ClusterInfra-012: ClusterOptionsValidator now requires both seed nodes.
- Comm-018: SiteCommunicationActor.HeartbeatMessage.IsActive derived from
  cluster leader check (was hardcoded true).
- DM-020: reconciliation audit row attributes the current user, not prior deployer.
- SEL-019: EventLogPurgeService early-exits on standby via active-node check.
- Plus comment/XML-doc accuracy fixes across AuditLog, ConfigurationDatabase,
  NotificationOutbox, SiteRuntime, SiteCallAudit; doc refreshes for Component-
  Commons / -ManagementService / -CLI / -ExternalSystemGateway / -HealthMonitoring
  / -Transport / -ConfigurationDatabase; CD-023 index-name doc alignment.

11 new regression tests (RouteHelper x4, SiteStreamGrpcServer x2,
ClusterOptionsValidator x1, SiteCommunicationActor x1, DeploymentService x1,
EventLogPurgeService x3). Build clean (0 warnings); InboundAPI/Communication/
Host suites all green. README regenerated: 112 open (was 136).
2026-05-28 06:28:31 -04:00
Joseph Doherty ac96b83b08 fix(high-severity): close 9 of 10 open High findings across 8 modules
Comm-016: delete dead HandleConnectionStateChanged + _debugSubscriptions /
_inProgressDeployments tracking + ConnectionStateChanged message record.
Disconnect detection is owned by the transport layers (gRPC keepalive PING
~25s; Ask-timeout at CommunicationService). Updates the
Component-Communication.md design doc to make that explicit.

SnF-018: NotificationForwarder.DeliverAsync now discards a corrupt buffered
payload (Warning log + return true) instead of returning false and parking
the row — honoring the design's "notifications do not park" invariant.

DM-018: reconciliation no longer force-sets Enabled, preserving an
intentional Disabled state after central failover.

ESG-018: DeliverBufferedAsync (both ExternalSystemClient + DatabaseGateway)
catches JsonException and returns false, turning a corrupt buffered row
into a parked operation instead of a retry-forever poison message.

InboundAPI-022: register ActiveNodeGate as IActiveNodeGate in the Central
DI branch so standby-node gating is actually wired up in production.

NS-019: remove orphaned NotificationDeliveryService /
INotificationDeliveryService / NotificationResult; central notification
delivery now lives entirely in NotificationOutbox.

SEL-016: normalise From/To filters to UTC before ISO-string compare so
non-UTC DateTimeOffset clients no longer get spuriously excluded events.

TE-017: include Description on attributes/alarms and a HashableConnections
projection (protocol, endpoint JSON, failover count) in the revision hash
and DiffService; staleness detection now catches description-only and
connection-endpoint edits.

Transport-001 and Transport-002 (also High) remain Open — they're being
handled in a follow-up batch because both touch BundleImporter.cs and
must serialise.
2026-05-28 05:40:15 -04:00
Joseph Doherty 9a3f5231db feat(transport): register AddTransport() on central nodes 2026-05-24 05:09:51 -04:00
Joseph Doherty a1bdd94d4c feat(mgmt): /api/audit/{query,export} endpoints with permission gates (#23 M8) 2026-05-20 21:49:14 -04:00
Joseph Doherty 75b060e0a8 feat(auditlog): AuditLogPartitionMaintenanceService monthly roll-forward (#23 M6) 2026-05-20 18:51:43 -04:00
Joseph Doherty 1c862989b4 feat(inbound): register AuditWriteMiddleware in pipeline (#23 M4) 2026-05-20 16:35:13 -04:00
Joseph Doherty 6fe23a4d9b feat(host): register SiteCallAuditActor + CachedCallTelemetry forwarder/bridge (#22, #23 M3)
M3 Bundle F (Task F1) wires the cached-call audit pipeline through the
composition roots:

- Central: register SiteCallAuditActor as a cluster singleton + proxy
  (mirrors AuditLogIngestActor and NotificationOutboxActor). Program.cs
  calls .AddSiteCallAudit() on the central role.
- Site: register ICachedCallTelemetryForwarder + CachedCallLifecycleBridge
  in AddAuditLog (lazy factory — Central nodes degrade to audit-only
  emission because IOperationTrackingStore is site-only).
- Site: bind CachedCallLifecycleBridge to ICachedCallLifecycleObserver so
  StoreAndForwardService picks it up via DI.
- Site: introduce IStoreAndForwardSiteContext + Host adapter to surface the
  site id to StoreAndForwardService without creating a
  StoreAndForward -> HealthMonitoring project-reference cycle.
- ScriptExecutionActor resolves ICachedCallTelemetryForwarder per script
  scope and threads it into ScriptRuntimeContext.

CachedCallTelemetryForwarder's IOperationTrackingStore dependency is now
nullable so Central DI validation succeeds with the lazy registration; the
forwarder's tracking-half emission is a no-op when the store is absent.

Tests:
- AkkaHostedServiceAuditWiringTests: Central host builds with
  AddSiteCallAudit and resolves ICachedCallTelemetryForwarder; Site
  resolves the forwarder + bridge + observer + IStoreAndForwardSiteContext.
- Full solution: 194 Host tests green, 241 SiteRuntime tests green, every
  other suite unchanged.
2026-05-20 15:10:47 -04:00
Joseph Doherty 9bf1497f03 feat(host): register Audit Log #23 singletons with dedicated dispatcher (#23)
Wires Bundle E of the M2 site-sync pipeline:

- AddAuditLog extended to register the site writer chain (SqliteAuditWriter
  singleton + ISiteAuditQueue forward + RingBufferFallback + FallbackAuditWriter
  composing them) and the telemetry collaborators (SiteAuditTelemetryOptions,
  SqliteAuditWriterOptions, IAuditWriteFailureCounter NoOp default,
  ISiteStreamAuditClient NoOp default).
- AkkaHostedService central role: AuditLogIngestActor as ClusterSingletonManager
  (singleton name 'audit-log-ingest') + ClusterSingletonProxy, mirroring the
  Notification Outbox pattern. Proxy is offered to SiteStreamGrpcServer if it
  resolves (Site path only today; M6 reconciliation will host gRPC on central).
- AkkaHostedService site role: SiteAuditTelemetryActor (per-site, NOT a
  singleton because each site is its own cluster), bound to a dedicated
  audit-telemetry-dispatcher (ForkJoinDispatcher, 2 dedicated threads).
- Program.cs + SiteServiceRegistration.Configure call AddAuditLog on both roles.
- AuditLogIngestActor gains a second constructor that takes IServiceProvider so
  the cluster singleton can create a fresh scope per message — IAuditLogRepository
  is a scoped EF Core service and cannot be pre-resolved from the root. The
  IAuditLogRepository constructor remains for Bundle D's MSSQL-fixture tests.

NoOp ISiteStreamAuditClient is deliberate: no site→central gRPC channel exists
in M2 (sites talk to central via Akka ClusterClient; gRPC SiteStreamService is
hosted on sites for central→site streaming). M6 reconciliation introduces the
real gRPC site→central client + central-hosted gRPC server. Bundle H's
integration test substitutes a stub client directly via the actor's Props.

Tests:
- tests/ScadaLink.AuditLog.Tests/AddAuditLogTests.cs — 11 tests (was 3): writer
  singleton, IAuditWriter as FallbackAuditWriter, ISiteAuditQueue same-instance
  as SqliteAuditWriter, options bind round-trip, NoOp default assertions.
- tests/ScadaLink.Host.Tests/AkkaHostedServiceAuditWiringTests.cs (new) — 13
  tests: BuildHocon emits audit-telemetry-dispatcher block with the expected
  type/throughput/thread-count; Central composition root resolves the writer
  chain + options; Site composition root resolves the writer chain + options +
  NoOp client.

Verified: dotnet build clean, 23 test suites green (Host 194 + AuditLog 54).
2026-05-20 13:04:05 -04:00
Joseph Doherty 1d495d1a87 feat(notification-outbox): register NotificationOutbox singleton in Host
Wire the Notification Outbox into the Host central role:
- Program.cs: call AddNotificationOutbox() on the central path (binds
  NotificationOutboxOptions via BindConfiguration; no explicit Configure).
- AkkaHostedService.RegisterCentralActors(): create the NotificationOutboxActor
  as a non-role-scoped central cluster singleton + proxy, then send
  RegisterNotificationOutbox(proxy) to the CentralCommunicationActor.
- appsettings.Central.json: add the ScadaLink:NotificationOutbox section with
  the NotificationOutboxOptions defaults.
- SiteServiceRegistration: remove the now-dead AddNotificationService() call -
  sites forward notifications to central rather than delivering over SMTP, and
  no site component consumes the SMTP machinery.
- Host.csproj: add the ScadaLink.NotificationOutbox project reference.
- Tests: add central outbox singleton/proxy actor-path assertions, drop the
  site OAuth2TokenService/INotificationDeliveryService resolution assertions,
  and add NotificationOutbox to the component-library IConfiguration check.
2026-05-19 02:44:32 -04:00
Joseph Doherty aca65e85bb fix(host): resolve Host-012..015 — consume DownIfAlone in HOCON, sub-second timing precision, config-driven Serilog sinks, transient-only startup retry 2026-05-17 03:18:33 -04:00
Joseph Doherty 8664cdf940 fix(host): resolve Host-005..011 — async startup, HOCON escaping, port-conflict check, dead-config cleanup, migration retry, log-level wiring; Host-002 flagged 2026-05-16 22:24:03 -04:00
Joseph Doherty 632d44f38c fix(host,deployment-manager,communication): repair cross-module DI regressions from batch 1-2
- DeploymentManager-008: revert IConfiguration overload (violated OptionsTests
  component-convention); Host now binds the ScadaLink:DeploymentManager section
- SiteStreamGrpcServer: make test-only int ctor internal so DI sees one public
  ctor (resolves ambiguous-constructor failure in SiteCompositionRootTests)
- Host site composition-root test config: supply Cluster:SeedNodes for the new
  ClusterOptionsValidator
2026-05-16 21:28:50 -04:00
Joseph Doherty a0e6a36e79 fix(host): resolve Host-001 — exclude leader-only active-node check from /health/ready 2026-05-16 19:40:40 -04:00
Joseph Doherty 295150751f feat(scripts): realign Test Run with runtime API, add anonymous-object calls and instance binding
The Test Run sandbox and Monaco analysis modelled a script API that had
drifted from the site runtime's ScriptGlobals, so real scripts failed to
compile in Test Run. Realign both to the runtime surface
(Instance/Scripts/ExternalSystem/Attributes/Children/Parent) and drop the
duplicate ScriptHost stub so the two cannot diverge again.

- Script calls (Scripts.CallShared, Instance.CallScript, Route.To().Call)
  accept an anonymous object instead of a hand-built dictionary, via a
  shared ScriptArgs normalizer; existing dictionary calls still compile.
- Test Run can optionally bind to a deployed instance, so Instance/
  Attributes/CallScript route to it cross-site; adds site-side
  RouteToGetAttributes/RouteToSetAttributes handlers.
- Adds Test Run panels to the API method and template script editors.
- Fixes the TestDatabaseQuery seed script, which queried a table that
  never existed.

Also commits unrelated in-progress work already in the tree: the health
monitoring report loop, site streaming changes, and the Admin/Design
data-connection and SMTP page reorganization.
2026-05-16 03:37:56 -04:00
Joseph Doherty d7b05b40e9 fix(host): drop UseStaticFiles so MapStaticAssets controls caching
UseStaticFiles middleware ran before the MapStaticAssets endpoints and
served static assets (monaco-init.js, site.css, etc.) with no
Cache-Control header. Browsers then heuristically cached them and kept
serving stale copies across deploys — e.g. the Monaco editor ran an old
monaco-init.js that did not send the script kind, so inbound API method
scripts were analysed against the wrong globals and 'Route' was flagged
as undefined.

MapStaticAssets alone now serves every static asset, tagging
non-fingerprinted files with Cache-Control: no-cache so the browser
always revalidates via ETag.
2026-05-15 12:29:14 -04:00
Joseph Doherty 5da779db17 fix(host): wait for configuration database before applying migrations
Central nodes crashed at startup with `CREATE DATABASE permission denied`
when MSSQL accepted connections before recovering user databases —
DB_ID(@db) returned null, so EF Core's MigrateAsync fell through to
SqlServerDatabaseCreator.CreateAsync. The non-privileged app login then
failed CREATE DATABASE and the host terminated with FTL, leaving Traefik's
/health/active probe unable to find an upstream ("no available server" at
localhost:9000).

Add MigrationHelper.WaitForDatabaseReadyAsync that polls
Database.CanConnectAsync() for up to 60s before invoking MigrateAsync, and
thread an ILogger through so retry attempts surface in normal logs. This
removes the startup race without requiring depends_on across compose stacks
or granting dbcreator to the app login.
2026-05-08 09:33:59 -04:00
Joseph Doherty 416a03b782 feat: complete gRPC streaming channel — site host, docker config, docs, integration tests
Switch site host to WebApplicationBuilder with Kestrel HTTP/2 gRPC server,
add GrpcPort/keepalive config, wire SiteStreamManager as ISiteStreamSubscriber,
expose gRPC ports in docker-compose, add site seed script, update all 10
requirement docs + CLAUDE.md + README.md for the new dual-transport architecture.
2026-03-21 12:38:33 -04:00
Joseph Doherty fd2e96fea2 feat: replace debug view polling with real-time SignalR streaming
The debug view polled every 2s by re-subscribing for full snapshots. Now a
persistent DebugStreamBridgeActor on central subscribes once and receives
incremental Akka stream events from the site, forwarding them to the Blazor
component via callbacks and to the CLI via a new SignalR hub at
/hubs/debug-stream. Adds `debug stream` CLI command with auto-reconnect.
2026-03-21 01:34:53 -04:00
Joseph Doherty 0a85a839a2 feat(infra): add Traefik load balancer with active node health check for central cluster failover
Add ActiveNodeHealthCheck that returns 200 only on the Akka.NET cluster
leader, enabling Traefik to route traffic to the active central node and
automatically fail over when the leader changes. Also fixes AkkaClusterHealthCheck
to resolve ActorSystem from AkkaHostedService (was always null via DI).
2026-03-21 00:44:37 -04:00
Joseph Doherty 1a540f4f0a feat: add HTTP Management API, migrate CLI from Akka ClusterClient to HTTP
Replace the CLI's Akka.NET ClusterClient transport with a simple HTTP client
targeting a new POST /management endpoint on the Central Host. The endpoint
handles Basic Auth, LDAP authentication, role resolution, and ManagementActor
dispatch in a single round-trip — eliminating the CLI's Akka, LDAP, and
Security dependencies.

Also fixes DCL ReSubscribeAll losing subscriptions on repeated reconnect by
deriving the tag list from _subscriptionsByInstance instead of _subscriptionIds.
2026-03-20 23:55:31 -04:00
Joseph Doherty 78fbb13df7 feat: wire Inbound API Route.To().Call() to site instance scripts and add Roslyn compilation
Completes the Inbound API → site script call chain by adding RouteToCallRequest
handlers in SiteCommunicationActor and DeploymentManagerActor. Also replaces the
placeholder dispatch table in InboundScriptExecutor with Roslyn compilation of
API method scripts at startup, enabling user-defined inbound API methods to call
instance scripts across the cluster.
2026-03-18 08:43:13 -04:00
Joseph Doherty 899dec6b6f feat: wire ExternalSystem, Database, and Notify APIs into script runtime
IServiceProvider now flows through the actor chain (DeploymentManagerActor
→ InstanceActor → ScriptActor → ScriptExecutionActor) so scripts can
resolve IExternalSystemClient, IDatabaseGateway, and
INotificationDeliveryService from DI. ScriptGlobals exposes ExternalSystem,
Database, Notify, and Scripts as top-level properties so scripts can use
them without the Instance. prefix.
2026-03-18 02:41:18 -04:00
Joseph Doherty e5eb871961 fix: wire up health report pipeline between sites and central aggregator
Sites now send SiteHealthReport via AkkaHealthReportTransport →
SiteCommunicationActor → CentralCommunicationActor → CentralHealthAggregator.
Added IHealthReportTransport impl, ISiteIdentityProvider impl, registered
HealthReportSender on site nodes, and added SiteHealthReport handler in
CentralCommunicationActor. Health Dashboard now shows all 3 sites online.
2026-03-17 23:46:17 -04:00
Joseph Doherty 9e97c1acd2 feat: replace site registration with database-driven site addressing
Central now resolves site Akka remoting addresses from the Sites DB table
(NodeAAddress/NodeBAddress) instead of relying on runtime RegisterSite
messages. Eliminates the race condition where sites starting before central
had their registration dead-lettered. Addresses are cached in
CentralCommunicationActor with 60s periodic refresh and on-demand refresh
when sites are added/edited/deleted via UI or CLI.
2026-03-17 23:13:10 -04:00
Joseph Doherty 1942544769 feat: register ManagementActor on Central with ClusterClientReceptionist 2026-03-17 14:49:35 -04:00
Joseph Doherty 3b22a8f0da feat: wire site-local repos, remove config DB from Site, update artifact service
- SiteExternalSystemRepository and SiteNotificationRepository registered in Site DI
- Removed AddConfigurationDatabase from Site role in Program.cs
- Removed ConfigurationDb from appsettings.Site.json
- ArtifactDeploymentService collects all 6 artifact types including data connections and SMTP
2026-03-17 13:54:37 -04:00
Joseph Doherty 4879c4e01e Fix auth, Bootstrap, Blazor nav, LDAP, and deployment pipeline for working Central UI
Bootstrap served locally with absolute paths and <base href="/">.
LDAP auth uses search-then-bind with service account for GLAuth compatibility.
CookieAuthenticationStateProvider reads HttpContext.User instead of parsing JWT.
Login/logout forms opt out of Blazor enhanced nav (data-enhance="false").
Nav links use absolute paths; seed data includes Design/Deployment group mappings.
DataConnections page loads all connections (not just site-assigned).
Site appsettings configured for Test Plant A; Site registers with Central on startup.
DeploymentService resolves string site identifier for Akka routing.
Instances page gains Create Instance form.
2026-03-17 10:03:06 -04:00
Joseph Doherty 6fa4c101ab Fix blazor.web.js 404: move App.razor and Routes.razor to Host project
Root cause: App.razor was in CentralUI (Microsoft.NET.Sdk.Razor RCL) but
MapRazorComponents<App>().AddInteractiveServerRenderMode() serves
_framework/blazor.web.js from the host assembly (Microsoft.NET.Sdk.Web).
Per MS docs, the root component must be in the server (host) project.

- Move App.razor and Routes.razor from CentralUI to Host/Components/
- Add Host/_Imports.razor for Razor component usings
- Make MapCentralUI generic: MapCentralUI<TApp>() accepts root component
- Add AdditionalAssemblies to discover CentralUI's pages/layouts
- Routes.razor references both Host and CentralUI assemblies
- Fix all DI registrations (TemplateEngine services)
- SCADALINK_CONFIG env var for role-specific config loading

Verified: blazor.web.js HTTP 200 (200KB), login page renders with Blazor
interactive server mode, SignalR circuit establishes, LDAP auth works.
2026-03-17 04:01:12 -04:00
Joseph Doherty 0b10747bd2 Fix Central launch profile: auth middleware, cookie auth, antiforgery, static files
- Add UseAuthentication/UseAuthorization/UseAntiforgery/UseStaticFiles middleware
- Register ASP.NET Core cookie authentication scheme in AddSecurity()
- Update auth endpoints to use SignInAsync/SignOutAsync (proper cookie auth)
- Add [AllowAnonymous] to login page
- Create wwwroot for static file serving
- Regenerate clean EF migration after model changes

Verified with launch profile "ScadaLink Central":
- Host starts, connects to SQL Server, applies EF migrations
- Akka.NET cluster forms (remoting on 8081, node joins self as leader)
- /health/ready returns Healthy (DB + Akka checks)
- LDAP auth works (admin/password via GLAuth → 302 + auth cookie set)
- Login page renders (HTTP 200)
- Unauthenticated requests redirect to /login
2026-03-17 03:43:11 -04:00
Joseph Doherty 121983fd66 Add Rider launch profiles, fix DI and migrations for dev startup
- Rider launch profiles: "ScadaLink Central" and "ScadaLink Site"
- appsettings.Central.json: correct test_infra credentials (ScadaLink_Dev1#,
  scadalink_app user, GLAuth on 3893, Mailpit on 1025)
- Fix HealthMonitoring DI: split site vs central registration to avoid
  missing IHealthReportTransport on central
- Regenerate single clean EF migration (InitialSchema) covering all entities
- Suppress PendingModelChangesWarning in dev mode
- Fix isDevelopment check for ASPNETCORE_ENVIRONMENT propagation

Verified: Host starts, connects to SQL Server, applies migrations, boots
Akka.NET cluster, LDAP auth works (admin/password via GLAuth), health
endpoint returns Healthy.
2026-03-17 03:01:21 -04:00
Joseph Doherty e9e6165914 Phase 3A: Site runtime foundation — Akka cluster, SQLite persistence, Deployment Manager singleton, Instance Actor
- WP-1: Site cluster config (keep-oldest SBR, down-if-alone, 2s/10s failure detection)
- WP-2: Site-role host bootstrap (no Kestrel, SQLite paths)
- WP-3: SiteStorageService with deployed_configurations + static_attribute_overrides tables
- WP-4: DeploymentManagerActor as cluster singleton with staggered Instance Actor creation,
  OneForOneStrategy/Resume supervision, deploy/disable/enable/delete lifecycle
- WP-5: InstanceActor with attribute state, GetAttribute/SetAttribute, SQLite override persistence
- WP-6: CoordinatedShutdown verified for graceful singleton handover
- WP-7: Dual-node recovery (both seed nodes, min-nr-of-members=1)
- WP-8: 31 tests (storage CRUD, actor lifecycle, supervision, negative checks)
389 total tests pass, zero warnings.
2026-03-16 20:34:56 -04:00
Joseph Doherty d38356efdb Phase 1 WP-11–22: Host infrastructure, Blazor Server UI, and integration tests
Host infrastructure (WP-11–17):
- StartupValidator with 19 validation rules
- /health/ready endpoint with DB + Akka health checks
- Akka.NET bootstrap via AkkaHostedService (HOCON config, cluster, remoting, SBR)
- Serilog with SiteId/NodeHostname/NodeRole enrichment
- DeadLetterMonitorActor with count tracking
- CoordinatedShutdown wiring (no Environment.Exit)
- Windows Service support (UseWindowsService)

Central UI (WP-18–21):
- Blazor Server shell with Bootstrap 5, role-aware NavMenu
- Login/logout flow (LDAP auth → JWT → HTTP-only cookie)
- CookieAuthenticationStateProvider with idle timeout
- LDAP group mapping CRUD page (Admin role)
- Route guards with Authorize attributes per role
- SignalR reconnection overlay for failover

Integration tests (WP-22):
- Startup validation, auth flow, audit transactions, readiness gating
186 tests pass (1 skipped: LDAP integration), zero warnings.
2026-03-16 19:50:59 -04:00
Joseph Doherty 1996b21961 Phase 1 WP-1: EF Core DbContext with Fluent API mappings for all 26 entities
ScadaLinkDbContext with 10 configuration classes (Fluent API only), initial
migration creating 25 tables, environment-aware migration helper (auto-apply
dev, validate-only prod), DesignTimeDbContextFactory, optimistic concurrency
on DeploymentRecord. 20 tests verify schema, CRUD, relationships, cascades.
2026-03-16 19:15:50 -04:00
Joseph Doherty 8c2091dc0a Phase 0 WP-0.10–0.12: Host skeleton, options classes, sample configs, and execution framework
- WP-0.10: Role-based Host startup (Central=WebApplication, Site=generic Host),
  15 component AddXxx() extension methods, MapCentralUI/MapInboundAPI stubs
- WP-0.11: 12 per-component options classes with config binding
- WP-0.12: Sample appsettings for central and site topologies
- Add execution procedure and checklist template to generate_plans.md
- Add phase-0-checklist.md for execution tracking
- Resolve all 21 open questions from plan generation
- Update IDataConnection with batch ops and IAsyncDisposable
57 tests pass, zero warnings.
2026-03-16 18:59:07 -04:00
Joseph Doherty 34190e1347 Phase 0 WP-0.1: Create .NET 10 solution structure with all 17 component projects
17 source projects (Commons + Host + 15 components) and 17 xUnit test projects.
SLNX format, net10.0, nullable enabled, warnings as errors. All components
reference Commons; Host references all components. Builds and tests clean.
2026-03-16 18:37:36 -04:00