Commit Graph

45 Commits

Author SHA1 Message Date
Joseph Doherty
0f28d13da7 feat(siteruntime): Tracking.Status(id) script API (#23 M3) 2026-05-20 13:56:59 -04:00
Joseph Doherty
82a8bbf225 feat(siteruntime): ExternalSystem.Call emits Audit Log #23 event on every sync call
Wraps IExternalSystemClient.CallAsync inside ScriptRuntimeContext's
ExternalSystemHelper so every script-initiated ExternalSystem.Call
produces exactly one ApiOutbound/ApiCall AuditEvent via IAuditWriter.

- Captures duration with Stopwatch.GetTimestamp() around the call.
- Builds the audit event with full provenance (SiteId, InstanceId,
  SourceScript) and a fresh EventId; ForwardState=Pending.
- Maps Success → AuditStatus.Delivered, Failure (or thrown) → Failed;
  parses HTTP {code} out of the ExternalSystemClient's error message
  to populate HttpStatus.
- Audit emission is fully best-effort: event-build failures, sync
  WriteAsync throws, AND async WriteAsync faults are all logged at
  Warning and swallowed so the script's call path is never aborted
  by an audit-write failure (alog.md §7).
- Original ExternalCallResult or original exception flows back to the
  caller unchanged.

ScriptExecutionActor resolves IAuditWriter from DI and threads it
into ScriptRuntimeContext alongside the existing site identity.

Adds ExternalSystemCallAuditEmissionTests covering: success →
Delivered, HTTP 500 → Failed+httpStatus, HTTP 400 → Failed+httpStatus,
client-thrown network exception → Failed with original exception
re-thrown, audit-writer throw → original result returned, provenance
populated from context, DurationMs recorded.

Refs Audit Log #23 M2 Bundle F.
2026-05-20 13:11:19 -04:00
Joseph Doherty
558f9ceb39 feat(notification-outbox): populate SourceScript on outbound notifications
FU3: thread the executing script identifier from the script-execution
context down to the Notify outbox API so NotifyTarget.Send stamps
NotificationSubmit.SourceScript instead of leaving it null.

- ScriptRuntimeContext / NotifyHelper / NotifyTarget take an optional
  sourceScript value, carried through to NotificationSubmit.SourceScript.
- ScriptExecutionActor supplies "ScriptActor:<scriptName>", matching the
  Site Event Logging "Source" convention used for script error events.
- AlarmExecutionActor builds the context without the S&F engine, so its
  Notify API is inert; sourceScript defaults to null there.
2026-05-19 03:54:09 -04:00
Joseph Doherty
3326bddeb0 feat(notification-outbox): async Notify.Send with status handle
Notify.To(list).Send(subject,body) now generates a NotificationId GUID,
enqueues a Notification-category message into the site Store-and-Forward
Engine, and returns the NotificationId immediately (Task<string>). The
NotificationId is the single idempotency key end-to-end: it is the S&F
message Id, it is carried inside the buffered NotificationSubmit payload,
and it is the id the forwarder submits to central.

NotificationForwarder now deserializes the buffered payload as a
NotificationSubmit and reads NotificationId from it (re-stamping only the
site-owned SourceSiteId / SourceInstanceId), instead of deriving the id
from StoreAndForwardMessage.Id.

Adds NotifyHelper.Status(id): queries central via the site communication
actor; reports the site-local Forwarding state while the notification is
still buffered at the site, maps central's response when found, and
Unknown otherwise. Adds a NotificationDeliveryStatus record.

SiteCommunicationActor gains a NotificationStatusQuery forwarding handler
mirroring NotificationSubmit. StoreAndForwardService.EnqueueAsync gains an
optional messageId parameter and exposes GetMessageByIdAsync.
2026-05-19 02:30:51 -04:00
Joseph Doherty
437fe154e7 feat(triggers): add WhileTrue fire mode for Conditional/Expression script triggers
Conditional and Expression script triggers gain an optional `mode` field
in their TriggerConfiguration JSON:

- OnTrue (default): unchanged edge/per-change firing. An absent mode field
  parses as OnTrue, so every existing trigger config behaves identically.
- WhileTrue: fires on the false->true edge, then re-fires on a periodic
  timer while the condition holds; stops on the true->false edge. The
  re-fire cadence is the script's MinTimeBetweenRuns; with none configured
  the trigger degrades to a single edge fire and logs a warning.

ScriptActor tracks condition truth state and manages a dedicated
"whiletrue-trigger" timer. ScriptTriggerConfigCodec and ScriptTriggerEditor
round-trip the mode and expose an OnTrue/WhileTrue selector for the two
trigger kinds. Design: docs/plans/2026-05-18-whiletrue-trigger-mode-design.md

Tests: 7 ScriptActor runtime tests (edge fire, timer re-fire, stop,
re-arm, no-MinTimeBetweenRuns degrade, OnTrue regressions) + 14 codec /
editor tests. SiteRuntime suite 206 green, CentralUI suite 295 green.
2026-05-18 10:44:11 -04:00
Joseph Doherty
6139a65a7b fix(site-runtime): fan tag updates out to every attribute sharing a tag path
InstanceActor._tagPathToAttribute was a Dictionary<string,string> — one tag
path mapped to a single attribute. When two attributes reference the same PLC
node (e.g. two composed cooling-tank modules both reading ns=3;s=Tank.Level,
or a pump's TempSensor and AlarmSensor both reading ns=3;s=Sensor.Reading),
SubscribeToDcl's map assignment overwrote, so only the last-registered
attribute ever received values — the rest stayed permanently Uncertain.

The map is now Dictionary<string,List<string>>; HandleTagValueUpdate fans each
update out to every attribute referencing the tag path, and each distinct tag
path is still subscribed only once per connection.
2026-05-18 04:21:26 -04:00
Joseph Doherty
be274212f0 fix(site-runtime): resolve SiteRuntime-017..019 — isolated attribute snapshot for child actors, corrected dispatcher doc, remove dead lifecycle handlers 2026-05-17 03:18:41 -04:00
Joseph Doherty
dd7626da63 fix(site-runtime): resolve SiteRuntime-012,013,015,016 — doc accuracy, shared LoggerFactory, execution-actor coverage; SiteRuntime-014 deferred 2026-05-16 22:32:30 -04:00
Joseph Doherty
a88bec9376 fix(site-runtime): resolve SiteRuntime-004..011 — deploy-after-persist, remove reflection, deterministic IDs, non-blocking startup, dedicated script scheduler, config-change detection, semantic trust-model check 2026-05-16 21:44:10 -04:00
Joseph Doherty
bc548e1447 feat(deployment-manager): resolve DeploymentManager-006 — query site deployment state before redeploy and reconcile
Adds DeploymentStateQuery request/response contracts (Commons), a site-side
handler (SiteRuntime), a CommunicationService query method (Communication), and
reconciliation in DeploymentService: when a prior record is InProgress or
Failed-on-timeout, query the site; if it already holds the target revision hash
mark the record Success without re-sending; on query failure fall through to a
normal deploy (site-side stale-rejection is the safety net).
2026-05-16 20:12:24 -04:00
Joseph Doherty
09b4bd5dfa fix(site-runtime): resolve SiteRuntime-001/002/003 — route data-sourced writes to DCL, real per-attribute API results, race-free redeploy 2026-05-16 19:57:28 -04:00
Joseph Doherty
78b10d00d8 fix(triggers): bound expression evaluation, align AlarmActor error handling, dedupe config parsing 2026-05-16 05:43:18 -04:00
Joseph Doherty
41c3fa3d84 fix(triggers): seed the trigger-expression attribute snapshot at actor startup 2026-05-16 05:38:50 -04:00
Joseph Doherty
9e21b47080 feat(triggers): runtime expression trigger evaluation for scripts and alarms 2026-05-16 05:35:02 -04:00
Joseph Doherty
295150751f feat(scripts): realign Test Run with runtime API, add anonymous-object calls and instance binding
The Test Run sandbox and Monaco analysis modelled a script API that had
drifted from the site runtime's ScriptGlobals, so real scripts failed to
compile in Test Run. Realign both to the runtime surface
(Instance/Scripts/ExternalSystem/Attributes/Children/Parent) and drop the
duplicate ScriptHost stub so the two cannot diverge again.

- Script calls (Scripts.CallShared, Instance.CallScript, Route.To().Call)
  accept an anonymous object instead of a hand-built dictionary, via a
  shared ScriptArgs normalizer; existing dictionary calls still compile.
- Test Run can optionally bind to a deployed instance, so Instance/
  Attributes/CallScript route to it cross-site; adds site-side
  RouteToGetAttributes/RouteToSetAttributes handlers.
- Adds Test Run panels to the API method and template script editors.
- Fixes the TestDatabaseQuery seed script, which queried a table that
  never existed.

Also commits unrelated in-progress work already in the tree: the health
monitoring report loop, site streaming changes, and the Admin/Design
data-connection and SMTP page reorganization.
2026-05-16 03:37:56 -04:00
Joseph Doherty
17e24ddd20 fix(site-event-log): record script errors and route queries to the active node
Script execution failures were only written to Serilog, never to the
site event log — SiteRuntime did not reference the SiteEventLogging
project. ScriptExecutionActor now resolves ISiteEventLogger and emits a
'script'/'Error' event on timeout and exception.

The event-log query handler was a per-node actor bound to that node's
local SQLite. A ClusterClient query could land on the standby (which
records no events) and return nothing. The handler is now a cluster
singleton with a proxy, so queries always reach the active node.
2026-05-15 12:04:59 -04:00
Joseph Doherty
751248feb6 feat(alarms): HiLo trigger type with per-band level, hysteresis, messages, overrides
Adds a new HiLo alarm trigger type with four configurable setpoints
(LoLo / Lo / Hi / HiHi). Each setpoint carries an optional priority,
deadband (for hysteresis), and operator message. The site runtime emits
AlarmStateChanged with an AlarmLevel field so consumers can differentiate
warning vs critical bands.

Plumbing:
  - new AlarmLevel enum + AlarmStateChanged.Level/Message init properties
  - AlarmTriggerEditor (Blazor) gets a HiLo render with severity tinting
  - AlarmTriggerConfigCodec extracted from the editor for testability
  - sitestream.proto carries level + message over gRPC
  - SemanticValidator enforces numeric attribute, setpoint ordering,
    non-negative deadband
  - on-trigger scripts get an Alarm global (Name/Level/Priority/Message)
    so notification routing can branch by severity
  - per-instance InstanceAlarmOverride entity + EF migration + flattening
    step + CLI commands; HiLo overrides merge setpoint-by-setpoint, binary
    types whole-replace
  - DebugView shows a Level badge + per-band message tooltip
  - App.razor auto-reloads on permanent Blazor circuit failure
  - docker/regen-proto.sh automates the proto regen workflow (the linux/arm64
    protoc segfault means generated files are checked in for now)
2026-05-13 03:23:32 -04:00
Joseph Doherty
783da8e21a feat(ui): structured editors for script schemas and alarm triggers
Replace raw-JSON text inputs with rich UI: script parameter/return types use
a JSON Schema builder (SchemaBuilder + JsonSchemaShapeParser, with a migration
to convert existing definitions); alarm trigger config uses a type-aware
editor with a flattened attribute picker (AlarmTriggerEditor). AlarmActor
gains optional direction (rising/falling/either) on RateOfChange triggers.
2026-05-13 00:33:00 -04:00
Joseph Doherty
efba01d10a feat(scripts): self/child/parent attribute and script accessors
Phases 1+2 of the design at
docs/plans/2026-05-12-script-scope-access-design.md.

Adds ergonomic scope-aware accessors to compiled scripts. A script
on a composed TempSensor reads its own attribute via
Attributes["Temperature"]; reaches up to the parent via
Parent.Attributes["SpeedRPM"]; invokes a child script via
Children["TempSensor"].CallScript("Sample"). All resolve to the
existing flat Instance.GetAttribute / SetAttribute / CallScript
delegates by prepending the script's canonical path prefix.

Runtime types (SiteRuntime.Scripts.ScopeAccessors):
  AttributeAccessor   sync indexer + GetAsync / SetAsync
  CompositionAccessor Attributes + CallScript
  ChildrenAccessor    Children["name"] => CompositionAccessor

ScriptGlobals gains Scope, Attributes, Children, Parent properties.
Sync indexer blocks on the Instance Actor Ask; explicit GetAsync /
SetAsync are also available for callers that want to await.

Plumbing:
  - Commons.Types.Scripts.ScriptScope record (SelfPath / ParentPath).
  - ResolvedScript.Scope (defaults to ScriptScope.Root for back-compat).
  - FlatteningService emits new ScriptScope(prefix, "") for each
    composed script so a script defined on TempSensor composed under
    a parent gets SelfPath = "TempSensor".
  - ScriptActor reads the Scope from its ResolvedScript and forwards
    it through ScriptExecutionActor into ScriptGlobals on each call.

RevisionHashService not touched: the per-script canonical name
already encodes the composition path, so any structural change
already flips the hash.

10 new unit tests on the path arithmetic. Site/Template engine
suites stay green (129 + 199).

Editor surface (Phase 3: metadata fetch, Phase 4: completion +
SCADA006 / SCADA007 diagnostics) follows in the next commits.
2026-05-12 05:45:24 -04:00
Joseph Doherty
496d2a68e3 refactor(site-runtime): route OPC UA connection JSON through serializer 2026-05-12 00:59:25 -04:00
Joseph Doherty
8423915ba1 fix(site-runtime): publish quality changes to site stream for real-time debug view updates
HandleConnectionQualityChanged now publishes AttributeValueChanged events
to the SiteStreamManager for all affected attributes. This ensures the
central UI debug view updates in real-time when a data connection
disconnects and attributes go bad quality.

Only publishes to the stream — does NOT notify script or alarm actors,
since the value hasn't changed and firing scripts/alarms on quality-only
changes would cause spurious evaluations.
2026-03-24 16:32:00 -04:00
Joseph Doherty
161dc406ed feat(scripts): add typed Parameters.Get<T>() helpers for script API
Replace raw dictionary casting with ScriptParameters wrapper that provides
Get<T>, Get<T?>, Get<T[]>, and Get<List<T>> with clear error messages,
numeric conversion, and JsonElement support for Inbound API parameters.
2026-03-22 15:47:18 -04:00
Joseph Doherty
e8df71ea64 feat(cli): add --primary-config, --backup-config, --failover-retry-count to data connection commands
Thread backup data connection fields through management command messages,
ManagementActor handlers, SiteService, site-side SQLite storage, and
deployment/replication actors. The old --configuration CLI flag is kept
as a hidden alias for backwards compatibility.
2026-03-22 08:41:57 -04:00
Joseph Doherty
46304678da feat(dcl): extend CreateConnectionCommand with backup config and failover retry count
Update CreateConnectionCommand to carry PrimaryConnectionDetails,
BackupConnectionDetails, and FailoverRetryCount. Update all callers:
DataConnectionManagerActor, DataConnectionActor, DeploymentManagerActor,
FlatteningService, and ConnectionConfig. The actor stores both configs
but continues using primary only — failover logic comes in Task 3.
2026-03-22 08:24:39 -04:00
Joseph Doherty
04af03980e feat(dcl): rename Configuration to PrimaryConfiguration, add BackupConfiguration and FailoverRetryCount 2026-03-22 08:18:31 -04:00
Joseph Doherty
49f042a937 refactor: remove ClusterClient streaming path (DebugStreamEvent), events flow via gRPC 2026-03-21 12:18:52 -04:00
Joseph Doherty
3efec91386 fix: route debug stream events through ClusterClient site→central path
ClusterClient Sender refs are temporary proxies — valid for immediate reply
but not durable for future Tells. Events now flow as DebugStreamEvent through
SiteCommunicationActor → ClusterClient → CentralCommunicationActor → bridge
actor (same pattern as health reports). Also fix DebugStreamHub to use
IHubContext for long-lived callbacks instead of transient hub instance.
2026-03-21 11:32:17 -04:00
Joseph Doherty
7740a3bcf9 feat: add JoeAppEngine OPC UA nodes, fix DCL auto-reconnect and quality push
- Add JoeAppEngine folder to OPC UA nodes.json (BTCS, AlarmCntsBySeverity, Scheduler/ScanTime)
- Fix DataConnectionActor: capture Self in PreStart for use from non-actor threads,
  preventing Self.Tell failure in Disconnected event handler
- Implement InstanceActor.HandleConnectionQualityChanged to mark attributes Bad on disconnect
- Fix LmxFakeProxy TagMapper to serialize arrays as JSON instead of "System.Int32[]"
- Allow DataType and DataSourceReference updates in TemplateService.UpdateAttributeAsync
- Update test_infra_opcua.md with JoeAppEngine documentation
2026-03-19 13:27:54 -04:00
Joseph Doherty
db387c6613 fix: include recipients in artifact deployment and load shared scripts on startup
NotificationRepository.GetAllNotificationListsAsync() was missing
.Include(Recipients), causing artifact deployments to push empty recipient
lists to sites. Also load shared scripts from SQLite on DeploymentManager
startup so they're available before Instance Actors compile their scripts.
2026-03-18 09:13:10 -04:00
Joseph Doherty
78fbb13df7 feat: wire Inbound API Route.To().Call() to site instance scripts and add Roslyn compilation
Completes the Inbound API → site script call chain by adding RouteToCallRequest
handlers in SiteCommunicationActor and DeploymentManagerActor. Also replaces the
placeholder dispatch table in InboundScriptExecutor with Roslyn compilation of
API method scripts at startup, enabling user-defined inbound API methods to call
instance scripts across the cluster.
2026-03-18 08:43:13 -04:00
Joseph Doherty
eb8ead58d2 feat: wire SQLite replication between site nodes and fix ConfigurationDatabase tests
Add SiteReplicationActor (runs on every site node) to replicate deployed
configs and store-and-forward buffer operations to the standby peer via
cluster member discovery and fire-and-forget Tell. Wire ReplicationService
handler and pass replication actor to DeploymentManagerActor singleton.

Fix 5 pre-existing ConfigurationDatabase test failures: RowVersion NOT NULL
on SQLite, stale migration name assertion, and seed data count mismatch.
2026-03-18 08:28:02 -04:00
Joseph Doherty
f063fb1ca3 fix: wire DCL tag value delivery, alarm evaluation, and snapshot timestamps
Three runtime bugs fixed:
- DataConnectionActor: TagValueReceived/TagResolutionSucceeded/Failed not
  handled in any Become state — OPC UA values went to dead letters. Added
  initial read after subscribe to seed current values immediately.
- AlarmActor: ParseEvalConfig expected "attributeName"/"matchValue"/"min"/
  "max" keys but seed data uses "attribute"/"value"/"high"/"low". Added
  support for both conventions and !=prefix for not-equal matching.
- InstanceActor: snapshots reported all alarms (including unevaluated) with
  correct priorities and source timestamps instead of current UTC. Removed
  bogus Vibration template attribute that shadowed Speed's tag mapping.
2026-03-18 07:36:48 -04:00
Joseph Doherty
9c6e3c2e56 feat: add CLI debug snapshot command for one-shot instance state inspection
Adds `debug snapshot --id <int>` to query a running instance's current
attribute values and alarm states without the subscribe/stream overhead
of the debug view. Routes through ManagementActor → CommunicationService
→ site DeploymentManager → InstanceActor using the existing remote query
pattern.
2026-03-18 07:16:22 -04:00
Joseph Doherty
899dec6b6f feat: wire ExternalSystem, Database, and Notify APIs into script runtime
IServiceProvider now flows through the actor chain (DeploymentManagerActor
→ InstanceActor → ScriptActor → ScriptExecutionActor) so scripts can
resolve IExternalSystemClient, IDatabaseGateway, and
INotificationDeliveryService from DI. ScriptGlobals exposes ExternalSystem,
Database, Notify, and Scripts as top-level properties so scripts can use
them without the Instance. prefix.
2026-03-18 02:41:18 -04:00
Joseph Doherty
8095c8efbe fix: only active singleton node sends health reports
Both nodes of a site cluster were sending health reports. The standby
node (without the DeploymentManager singleton) reported 0 instances and
no connections, overwriting the active node's data in the aggregator.

Added IsActiveNode flag to ISiteHealthCollector, set by
DeploymentManagerActor on PreStart/PostStop. HealthReportSender skips
sending when the node is not active. Also ensured EnsureDclConnections
is called during startup batch creation so data connections survive
container restarts.
2026-03-18 01:44:57 -04:00
Joseph Doherty
213ca2698a fix: update instance counts after startup batch creation completes
UpdateInstanceCounts() was only called before Instance Actors were
created (in HandleStartupConfigsLoaded), showing 0 enabled on the
health dashboard. Now also called after each batch in
HandleStartNextBatch to reflect actual running actor count.
2026-03-18 01:28:46 -04:00
Joseph Doherty
f165ca2774 feat: wire all health metrics and add instance counts to dashboard
Wired ISiteHealthCollector calls for script errors (ScriptExecutionActor),
alarm eval errors (AlarmActor), dead letters (DeadLetterMonitorActor), and
S&F buffer depth placeholder. Added instance count tracking (deployed/
enabled/disabled) to SiteHealthReport via DeploymentManagerActor. Updated
Health Dashboard UI to show instance counts per site. All metrics flow
through the existing health report pipeline via ClusterClient.
2026-03-18 00:57:49 -04:00
Joseph Doherty
88b5f6cb54 fix: handle mixed JSON types in data connection config deserialization
DeploymentManagerActor deserialized connection config JSON as
Dictionary<string, string>, which silently failed on non-string values
like {"publishInterval":1000}. The OPC UA adapter then fell back to
localhost:4840 (unreachable in Docker). Now uses JsonDocument to handle
any JSON value type. OPC PLC Simulator connects successfully.
2026-03-18 00:39:01 -04:00
Joseph Doherty
775cb8084f feat: data-sourced attributes start with uncertain quality before first DCL value
Attributes bound to data connections now initialize with "Uncertain" quality,
distinguishing "never received a value" from "known good" or "connection lost."
Quality is tracked per attribute and included in GetAttributeResponse.
2026-03-17 18:25:39 -04:00
Joseph Doherty
2f3e0ceecb feat: include data connections and SMTP in artifact deployment 2026-03-17 13:48:52 -04:00
Joseph Doherty
dfb809a909 Wire DCL to Instance Actors for OPC UA tag value flow
- Add TagValueUpdate/ConnectionQualityChanged handlers to InstanceActor
- InstanceActor subscribes to DCL on PreStart based on DataSourceReference
- DeploymentManagerActor creates DCL connections on deploy and passes DCL ref
- AkkaHostedService creates DCL Manager Actor for tag subscriptions
- Move CreateConnectionCommand to Commons for cross-project access
- Add ConnectionConfig to FlattenedConfiguration for deployment packaging
2026-03-17 11:21:11 -04:00
Joseph Doherty
2798b91fe1 Wire up debug view: route subscribe/unsubscribe through DeploymentManagerActor
DeploymentManagerActor now handles SubscribeDebugViewRequest and
UnsubscribeDebugViewRequest by forwarding to the appropriate Instance Actor.
This completes the debug view data flow from Central UI through to the site's
Instance Actor snapshot. Reduced refresh interval to 2s for responsiveness.
2026-03-17 10:55:47 -04:00
Joseph Doherty
60243ad619 Add Deploy/Redeploy button and fix actor replacement on redeployment
Instances page gains Deploy button that triggers flattening pipeline and sends
config to site. Button shows "Redeploy" when instance is stale. Fixed actor name
collision on redeployment by scheduling deferred recreation after Context.Stop.
2026-03-17 10:28:44 -04:00
Joseph Doherty
389f5a0378 Phase 3B: Site I/O & Observability — Communication, DCL, Script/Alarm actors, Health, Event Logging
Communication Layer (WP-1–5):
- 8 message patterns with correlation IDs, per-pattern timeouts
- Central/Site communication actors, transport heartbeat config
- Connection failure handling (no central buffering, debug streams killed)

Data Connection Layer (WP-6–14, WP-34):
- Connection actor with Become/Stash lifecycle (Connecting/Connected/Reconnecting)
- OPC UA + LmxProxy adapters behind IDataConnection
- Auto-reconnect, bad quality propagation, transparent re-subscribe
- Write-back, tag path resolution with retry, health reporting
- Protocol extensibility via DataConnectionFactory

Site Runtime (WP-15–25, WP-32–33):
- ScriptActor/ScriptExecutionActor (triggers, concurrent execution, blocking I/O dispatcher)
- AlarmActor/AlarmExecutionActor (ValueMatch/RangeViolation/RateOfChange, in-memory state)
- SharedScriptLibrary (inline execution), ScriptRuntimeContext (API)
- ScriptCompilationService (Roslyn, forbidden API enforcement, execution timeout)
- Recursion limit (default 10), call direction enforcement
- SiteStreamManager (per-subscriber bounded buffers, fire-and-forget)
- Debug view backend (snapshot + stream), concurrency serialization
- Local artifact storage (4 SQLite tables)

Health Monitoring (WP-26–28):
- SiteHealthCollector (thread-safe counters, connection state)
- HealthReportSender (30s interval, monotonic sequence numbers)
- CentralHealthAggregator (offline detection 60s, online recovery)

Site Event Logging (WP-29–31):
- SiteEventLogger (SQLite, 6 event categories, ISO 8601 UTC)
- EventLogPurgeService (30-day retention, 1GB cap)
- EventLogQueryService (filters, keyword search, keyset pagination)

541 tests pass, zero warnings.
2026-03-16 20:57:25 -04:00
Joseph Doherty
e9e6165914 Phase 3A: Site runtime foundation — Akka cluster, SQLite persistence, Deployment Manager singleton, Instance Actor
- WP-1: Site cluster config (keep-oldest SBR, down-if-alone, 2s/10s failure detection)
- WP-2: Site-role host bootstrap (no Kestrel, SQLite paths)
- WP-3: SiteStorageService with deployed_configurations + static_attribute_overrides tables
- WP-4: DeploymentManagerActor as cluster singleton with staggered Instance Actor creation,
  OneForOneStrategy/Resume supervision, deploy/disable/enable/delete lifecycle
- WP-5: InstanceActor with attribute state, GetAttribute/SetAttribute, SQLite override persistence
- WP-6: CoordinatedShutdown verified for graceful singleton handover
- WP-7: Dual-node recovery (both seed nodes, min-nr-of-members=1)
- WP-8: 31 tests (storage CRUD, actor lifecycle, supervision, negative checks)
389 total tests pass, zero warnings.
2026-03-16 20:34:56 -04:00