Commit Graph

18 Commits

Author SHA1 Message Date
Joseph Doherty
e8df71ea64 feat(cli): add --primary-config, --backup-config, --failover-retry-count to data connection commands
Thread backup data connection fields through management command messages,
ManagementActor handlers, SiteService, site-side SQLite storage, and
deployment/replication actors. The old --configuration CLI flag is kept
as a hidden alias for backwards compatibility.
2026-03-22 08:41:57 -04:00
Joseph Doherty
46304678da feat(dcl): extend CreateConnectionCommand with backup config and failover retry count
Update CreateConnectionCommand to carry PrimaryConnectionDetails,
BackupConnectionDetails, and FailoverRetryCount. Update all callers:
DataConnectionManagerActor, DataConnectionActor, DeploymentManagerActor,
FlatteningService, and ConnectionConfig. The actor stores both configs
but continues using primary only — failover logic comes in Task 3.
2026-03-22 08:24:39 -04:00
Joseph Doherty
04af03980e feat(dcl): rename Configuration to PrimaryConfiguration, add BackupConfiguration and FailoverRetryCount 2026-03-22 08:18:31 -04:00
Joseph Doherty
db387c6613 fix: include recipients in artifact deployment and load shared scripts on startup
NotificationRepository.GetAllNotificationListsAsync() was missing
.Include(Recipients), causing artifact deployments to push empty recipient
lists to sites. Also load shared scripts from SQLite on DeploymentManager
startup so they're available before Instance Actors compile their scripts.
2026-03-18 09:13:10 -04:00
Joseph Doherty
78fbb13df7 feat: wire Inbound API Route.To().Call() to site instance scripts and add Roslyn compilation
Completes the Inbound API → site script call chain by adding RouteToCallRequest
handlers in SiteCommunicationActor and DeploymentManagerActor. Also replaces the
placeholder dispatch table in InboundScriptExecutor with Roslyn compilation of
API method scripts at startup, enabling user-defined inbound API methods to call
instance scripts across the cluster.
2026-03-18 08:43:13 -04:00
Joseph Doherty
eb8ead58d2 feat: wire SQLite replication between site nodes and fix ConfigurationDatabase tests
Add SiteReplicationActor (runs on every site node) to replicate deployed
configs and store-and-forward buffer operations to the standby peer via
cluster member discovery and fire-and-forget Tell. Wire ReplicationService
handler and pass replication actor to DeploymentManagerActor singleton.

Fix 5 pre-existing ConfigurationDatabase test failures: RowVersion NOT NULL
on SQLite, stale migration name assertion, and seed data count mismatch.
2026-03-18 08:28:02 -04:00
Joseph Doherty
9c6e3c2e56 feat: add CLI debug snapshot command for one-shot instance state inspection
Adds `debug snapshot --id <int>` to query a running instance's current
attribute values and alarm states without the subscribe/stream overhead
of the debug view. Routes through ManagementActor → CommunicationService
→ site DeploymentManager → InstanceActor using the existing remote query
pattern.
2026-03-18 07:16:22 -04:00
Joseph Doherty
899dec6b6f feat: wire ExternalSystem, Database, and Notify APIs into script runtime
IServiceProvider now flows through the actor chain (DeploymentManagerActor
→ InstanceActor → ScriptActor → ScriptExecutionActor) so scripts can
resolve IExternalSystemClient, IDatabaseGateway, and
INotificationDeliveryService from DI. ScriptGlobals exposes ExternalSystem,
Database, Notify, and Scripts as top-level properties so scripts can use
them without the Instance. prefix.
2026-03-18 02:41:18 -04:00
Joseph Doherty
8095c8efbe fix: only active singleton node sends health reports
Both nodes of a site cluster were sending health reports. The standby
node (without the DeploymentManager singleton) reported 0 instances and
no connections, overwriting the active node's data in the aggregator.

Added IsActiveNode flag to ISiteHealthCollector, set by
DeploymentManagerActor on PreStart/PostStop. HealthReportSender skips
sending when the node is not active. Also ensured EnsureDclConnections
is called during startup batch creation so data connections survive
container restarts.
2026-03-18 01:44:57 -04:00
Joseph Doherty
213ca2698a fix: update instance counts after startup batch creation completes
UpdateInstanceCounts() was only called before Instance Actors were
created (in HandleStartupConfigsLoaded), showing 0 enabled on the
health dashboard. Now also called after each batch in
HandleStartNextBatch to reflect actual running actor count.
2026-03-18 01:28:46 -04:00
Joseph Doherty
f165ca2774 feat: wire all health metrics and add instance counts to dashboard
Wired ISiteHealthCollector calls for script errors (ScriptExecutionActor),
alarm eval errors (AlarmActor), dead letters (DeadLetterMonitorActor), and
S&F buffer depth placeholder. Added instance count tracking (deployed/
enabled/disabled) to SiteHealthReport via DeploymentManagerActor. Updated
Health Dashboard UI to show instance counts per site. All metrics flow
through the existing health report pipeline via ClusterClient.
2026-03-18 00:57:49 -04:00
Joseph Doherty
88b5f6cb54 fix: handle mixed JSON types in data connection config deserialization
DeploymentManagerActor deserialized connection config JSON as
Dictionary<string, string>, which silently failed on non-string values
like {"publishInterval":1000}. The OPC UA adapter then fell back to
localhost:4840 (unreachable in Docker). Now uses JsonDocument to handle
any JSON value type. OPC PLC Simulator connects successfully.
2026-03-18 00:39:01 -04:00
Joseph Doherty
2f3e0ceecb feat: include data connections and SMTP in artifact deployment 2026-03-17 13:48:52 -04:00
Joseph Doherty
dfb809a909 Wire DCL to Instance Actors for OPC UA tag value flow
- Add TagValueUpdate/ConnectionQualityChanged handlers to InstanceActor
- InstanceActor subscribes to DCL on PreStart based on DataSourceReference
- DeploymentManagerActor creates DCL connections on deploy and passes DCL ref
- AkkaHostedService creates DCL Manager Actor for tag subscriptions
- Move CreateConnectionCommand to Commons for cross-project access
- Add ConnectionConfig to FlattenedConfiguration for deployment packaging
2026-03-17 11:21:11 -04:00
Joseph Doherty
2798b91fe1 Wire up debug view: route subscribe/unsubscribe through DeploymentManagerActor
DeploymentManagerActor now handles SubscribeDebugViewRequest and
UnsubscribeDebugViewRequest by forwarding to the appropriate Instance Actor.
This completes the debug view data flow from Central UI through to the site's
Instance Actor snapshot. Reduced refresh interval to 2s for responsiveness.
2026-03-17 10:55:47 -04:00
Joseph Doherty
60243ad619 Add Deploy/Redeploy button and fix actor replacement on redeployment
Instances page gains Deploy button that triggers flattening pipeline and sends
config to site. Button shows "Redeploy" when instance is stale. Fixed actor name
collision on redeployment by scheduling deferred recreation after Context.Stop.
2026-03-17 10:28:44 -04:00
Joseph Doherty
389f5a0378 Phase 3B: Site I/O & Observability — Communication, DCL, Script/Alarm actors, Health, Event Logging
Communication Layer (WP-1–5):
- 8 message patterns with correlation IDs, per-pattern timeouts
- Central/Site communication actors, transport heartbeat config
- Connection failure handling (no central buffering, debug streams killed)

Data Connection Layer (WP-6–14, WP-34):
- Connection actor with Become/Stash lifecycle (Connecting/Connected/Reconnecting)
- OPC UA + LmxProxy adapters behind IDataConnection
- Auto-reconnect, bad quality propagation, transparent re-subscribe
- Write-back, tag path resolution with retry, health reporting
- Protocol extensibility via DataConnectionFactory

Site Runtime (WP-15–25, WP-32–33):
- ScriptActor/ScriptExecutionActor (triggers, concurrent execution, blocking I/O dispatcher)
- AlarmActor/AlarmExecutionActor (ValueMatch/RangeViolation/RateOfChange, in-memory state)
- SharedScriptLibrary (inline execution), ScriptRuntimeContext (API)
- ScriptCompilationService (Roslyn, forbidden API enforcement, execution timeout)
- Recursion limit (default 10), call direction enforcement
- SiteStreamManager (per-subscriber bounded buffers, fire-and-forget)
- Debug view backend (snapshot + stream), concurrency serialization
- Local artifact storage (4 SQLite tables)

Health Monitoring (WP-26–28):
- SiteHealthCollector (thread-safe counters, connection state)
- HealthReportSender (30s interval, monotonic sequence numbers)
- CentralHealthAggregator (offline detection 60s, online recovery)

Site Event Logging (WP-29–31):
- SiteEventLogger (SQLite, 6 event categories, ISO 8601 UTC)
- EventLogPurgeService (30-day retention, 1GB cap)
- EventLogQueryService (filters, keyword search, keyset pagination)

541 tests pass, zero warnings.
2026-03-16 20:57:25 -04:00
Joseph Doherty
e9e6165914 Phase 3A: Site runtime foundation — Akka cluster, SQLite persistence, Deployment Manager singleton, Instance Actor
- WP-1: Site cluster config (keep-oldest SBR, down-if-alone, 2s/10s failure detection)
- WP-2: Site-role host bootstrap (no Kestrel, SQLite paths)
- WP-3: SiteStorageService with deployed_configurations + static_attribute_overrides tables
- WP-4: DeploymentManagerActor as cluster singleton with staggered Instance Actor creation,
  OneForOneStrategy/Resume supervision, deploy/disable/enable/delete lifecycle
- WP-5: InstanceActor with attribute state, GetAttribute/SetAttribute, SQLite override persistence
- WP-6: CoordinatedShutdown verified for graceful singleton handover
- WP-7: Dual-node recovery (both seed nodes, min-nr-of-members=1)
- WP-8: 31 tests (storage CRUD, actor lifecycle, supervision, negative checks)
389 total tests pass, zero warnings.
2026-03-16 20:34:56 -04:00