Commit Graph

12 Commits

Author SHA1 Message Date
Joseph Doherty 7740a3bcf9 feat: add JoeAppEngine OPC UA nodes, fix DCL auto-reconnect and quality push
- Add JoeAppEngine folder to OPC UA nodes.json (BTCS, AlarmCntsBySeverity, Scheduler/ScanTime)
- Fix DataConnectionActor: capture Self in PreStart for use from non-actor threads,
  preventing Self.Tell failure in Disconnected event handler
- Implement InstanceActor.HandleConnectionQualityChanged to mark attributes Bad on disconnect
- Fix LmxFakeProxy TagMapper to serialize arrays as JSON instead of "System.Int32[]"
- Allow DataType and DataSourceReference updates in TemplateService.UpdateAttributeAsync
- Update test_infra_opcua.md with JoeAppEngine documentation
2026-03-19 13:27:54 -04:00
Joseph Doherty 78fbb13df7 feat: wire Inbound API Route.To().Call() to site instance scripts and add Roslyn compilation
Completes the Inbound API → site script call chain by adding RouteToCallRequest
handlers in SiteCommunicationActor and DeploymentManagerActor. Also replaces the
placeholder dispatch table in InboundScriptExecutor with Roslyn compilation of
API method scripts at startup, enabling user-defined inbound API methods to call
instance scripts across the cluster.
2026-03-18 08:43:13 -04:00
Joseph Doherty 9c6e3c2e56 feat: add CLI debug snapshot command for one-shot instance state inspection
Adds `debug snapshot --id <int>` to query a running instance's current
attribute values and alarm states without the subscribe/stream overhead
of the debug view. Routes through ManagementActor → CommunicationService
→ site DeploymentManager → InstanceActor using the existing remote query
pattern.
2026-03-18 07:16:22 -04:00
Joseph Doherty 4f22ca2b1f feat: replace ActorSelection with ClusterClient for inter-cluster communication
Central and site clusters now communicate via ClusterClient/
ClusterClientReceptionist instead of direct ActorSelection. Both
CentralCommunicationActor and SiteCommunicationActor are registered
with their cluster's receptionist. Central creates one ClusterClient
per site using NodeA/NodeB contact points from the DB. Sites configure
multiple CentralContactPoints for automatic failover between central
nodes. ISiteClientFactory enables test injection.
2026-03-18 00:08:47 -04:00
Joseph Doherty e5eb871961 fix: wire up health report pipeline between sites and central aggregator
Sites now send SiteHealthReport via AkkaHealthReportTransport →
SiteCommunicationActor → CentralCommunicationActor → CentralHealthAggregator.
Added IHealthReportTransport impl, ISiteIdentityProvider impl, registered
HealthReportSender on site nodes, and added SiteHealthReport handler in
CentralCommunicationActor. Health Dashboard now shows all 3 sites online.
2026-03-17 23:46:17 -04:00
Joseph Doherty 9e97c1acd2 feat: replace site registration with database-driven site addressing
Central now resolves site Akka remoting addresses from the Sites DB table
(NodeAAddress/NodeBAddress) instead of relying on runtime RegisterSite
messages. Eliminates the race condition where sites starting before central
had their registration dead-lettered. Addresses are cached in
CentralCommunicationActor with 60s periodic refresh and on-demand refresh
when sites are added/edited/deleted via UI or CLI.
2026-03-17 23:13:10 -04:00
Joseph Doherty 4879c4e01e Fix auth, Bootstrap, Blazor nav, LDAP, and deployment pipeline for working Central UI
Bootstrap served locally with absolute paths and <base href="/">.
LDAP auth uses search-then-bind with service account for GLAuth compatibility.
CookieAuthenticationStateProvider reads HttpContext.User instead of parsing JWT.
Login/logout forms opt out of Blazor enhanced nav (data-enhance="false").
Nav links use absolute paths; seed data includes Design/Deployment group mappings.
DataConnections page loads all connections (not just site-assigned).
Site appsettings configured for Test Plant A; Site registers with Central on startup.
DeploymentService resolves string site identifier for Akka routing.
Instances page gains Create Instance form.
2026-03-17 10:03:06 -04:00
Joseph Doherty b659978764 Phase 8: Production readiness — failover tests, security hardening, sandboxing, deployment docs
- WP-1-3: Central/site failover + dual-node recovery tests (17 tests)
- WP-4: Performance testing framework for target scale (7 tests)
- WP-5: Security hardening (LDAPS, JWT key length, no secrets in logs) (11 tests)
- WP-6: Script sandboxing adversarial tests (28 tests, all forbidden APIs)
- WP-7: Recovery drill test scaffolds (5 tests)
- WP-8: Observability validation (structured logs, correlation IDs, metrics) (6 tests)
- WP-9: Message contract compatibility (forward/backward compat) (18 tests)
- WP-10: Deployment packaging (installation guide, production checklist, topology)
- WP-11: Operational runbooks (failover, troubleshooting, maintenance)
92 new tests, all passing. Zero warnings.
2026-03-16 22:12:31 -04:00
Joseph Doherty 389f5a0378 Phase 3B: Site I/O & Observability — Communication, DCL, Script/Alarm actors, Health, Event Logging
Communication Layer (WP-1–5):
- 8 message patterns with correlation IDs, per-pattern timeouts
- Central/Site communication actors, transport heartbeat config
- Connection failure handling (no central buffering, debug streams killed)

Data Connection Layer (WP-6–14, WP-34):
- Connection actor with Become/Stash lifecycle (Connecting/Connected/Reconnecting)
- OPC UA + LmxProxy adapters behind IDataConnection
- Auto-reconnect, bad quality propagation, transparent re-subscribe
- Write-back, tag path resolution with retry, health reporting
- Protocol extensibility via DataConnectionFactory

Site Runtime (WP-15–25, WP-32–33):
- ScriptActor/ScriptExecutionActor (triggers, concurrent execution, blocking I/O dispatcher)
- AlarmActor/AlarmExecutionActor (ValueMatch/RangeViolation/RateOfChange, in-memory state)
- SharedScriptLibrary (inline execution), ScriptRuntimeContext (API)
- ScriptCompilationService (Roslyn, forbidden API enforcement, execution timeout)
- Recursion limit (default 10), call direction enforcement
- SiteStreamManager (per-subscriber bounded buffers, fire-and-forget)
- Debug view backend (snapshot + stream), concurrency serialization
- Local artifact storage (4 SQLite tables)

Health Monitoring (WP-26–28):
- SiteHealthCollector (thread-safe counters, connection state)
- HealthReportSender (30s interval, monotonic sequence numbers)
- CentralHealthAggregator (offline detection 60s, online recovery)

Site Event Logging (WP-29–31):
- SiteEventLogger (SQLite, 6 event categories, ISO 8601 UTC)
- EventLogPurgeService (30-day retention, 1GB cap)
- EventLogQueryService (filters, keyword search, keyset pagination)

541 tests pass, zero warnings.
2026-03-16 20:57:25 -04:00
Joseph Doherty 8c2091dc0a Phase 0 WP-0.10–0.12: Host skeleton, options classes, sample configs, and execution framework
- WP-0.10: Role-based Host startup (Central=WebApplication, Site=generic Host),
  15 component AddXxx() extension methods, MapCentralUI/MapInboundAPI stubs
- WP-0.11: 12 per-component options classes with config binding
- WP-0.12: Sample appsettings for central and site topologies
- Add execution procedure and checklist template to generate_plans.md
- Add phase-0-checklist.md for execution tracking
- Resolve all 21 open questions from plan generation
- Update IDataConnection with batch ops and IAsyncDisposable
57 tests pass, zero warnings.
2026-03-16 18:59:07 -04:00
Joseph Doherty fed5f5a82c Add .gitignore and remove tracked build artifacts (bin/obj) 2026-03-16 18:38:00 -04:00
Joseph Doherty 34190e1347 Phase 0 WP-0.1: Create .NET 10 solution structure with all 17 component projects
17 source projects (Commons + Host + 15 components) and 17 xUnit test projects.
SLNX format, net10.0, nullable enabled, warnings as errors. All components
reference Commons; Host references all components. Builds and tests clean.
2026-03-16 18:37:36 -04:00