Commit Graph

126 Commits

Author SHA1 Message Date
Joseph Doherty
d70bbbe739 feat: add StreamRelayActor bridging Akka events to gRPC proto channel 2026-03-21 11:48:04 -04:00
Joseph Doherty
9b0a80dcbd feat: add GrpcNodeAAddress/GrpcNodeBAddress to Site entity, CLI, and UI 2026-03-21 11:45:22 -04:00
Joseph Doherty
64ee316609 feat: add GrpcPort config to NodeOptions with startup validation 2026-03-21 11:42:41 -04:00
Joseph Doherty
deb58e1f17 feat: add sitestream.proto definition and generated gRPC stubs
Proto3 definition with SiteStreamService (server streaming), Quality and
AlarmStateEnum enums with UNSPECIFIED=0, google.protobuf.Timestamp for
cross-platform timestamps. Pre-generated C# stubs checked in (no protoc
at build time). 10 roundtrip tests covering serialization, oneof
discrimination, and Timestamp<->DateTimeOffset conversion.
2026-03-21 11:41:01 -04:00
Joseph Doherty
826cfbee31 feat: add sitestream.proto definition and generated gRPC stubs
Define the SiteStreamService proto for real-time instance event
streaming (attribute value changes, alarm state changes) from site
nodes to central. Add pre-generated C# stubs following the existing
LmxProxy pattern, gRPC NuGet packages with FrameworkReference for
ASP.NET Core server types, and proto roundtrip tests.
2026-03-21 11:37:39 -04:00
Joseph Doherty
b76ce09221 docs: add gRPC streaming channel implementation plan with task tracking 2026-03-21 11:32:24 -04:00
Joseph Doherty
3efec91386 fix: route debug stream events through ClusterClient site→central path
ClusterClient Sender refs are temporary proxies — valid for immediate reply
but not durable for future Tells. Events now flow as DebugStreamEvent through
SiteCommunicationActor → ClusterClient → CentralCommunicationActor → bridge
actor (same pattern as health reports). Also fix DebugStreamHub to use
IHubContext for long-lived callbacks instead of transient hub instance.
2026-03-21 11:32:17 -04:00
Joseph Doherty
41aff339b2 docs: add gRPC streaming channel design plan for site→central real-time data
Replaces ClusterClient-based event streaming with dedicated gRPC server-streaming
channels. Covers proto definition, server/client patterns, Channel<T> bridging,
keepalive/orphan prevention, failover scenarios, port/address configuration,
extensibility guide for new event types, testing strategy, and implementation guardrails.
2026-03-21 11:26:09 -04:00
Joseph Doherty
fd2e96fea2 feat: replace debug view polling with real-time SignalR streaming
The debug view polled every 2s by re-subscribing for full snapshots. Now a
persistent DebugStreamBridgeActor on central subscribes once and receives
incremental Akka stream events from the site, forwarding them to the Blazor
component via callbacks and to the CLI via a new SignalR hub at
/hubs/debug-stream. Adds `debug stream` CLI command with auto-reconnect.
2026-03-21 01:34:53 -04:00
Joseph Doherty
d91aa83665 refactor(docs): move requirements and test infra docs into docs/ subdirectories
Organize documentation by moving requirements (HighLevelReqs, Component-*,
lmxproxy_protocol) to docs/requirements/ and test infrastructure docs to
docs/test_infra/. Updates all cross-references in README, CLAUDE.md,
infra/README, component docs, and 23 plan files.
2026-03-21 01:11:35 -04:00
Joseph Doherty
0a85a839a2 feat(infra): add Traefik load balancer with active node health check for central cluster failover
Add ActiveNodeHealthCheck that returns 200 only on the Akka.NET cluster
leader, enabling Traefik to route traffic to the active central node and
automatically fail over when the leader changes. Also fixes AkkaClusterHealthCheck
to resolve ActorSystem from AkkaHostedService (was always null via DI).
2026-03-21 00:44:37 -04:00
Joseph Doherty
1a540f4f0a feat: add HTTP Management API, migrate CLI from Akka ClusterClient to HTTP
Replace the CLI's Akka.NET ClusterClient transport with a simple HTTP client
targeting a new POST /management endpoint on the Central Host. The endpoint
handles Basic Auth, LDAP authentication, role resolution, and ManagementActor
dispatch in a single round-trip — eliminating the CLI's Akka, LDAP, and
Security dependencies.

Also fixes DCL ReSubscribeAll losing subscriptions on repeated reconnect by
deriving the tag list from _subscriptionsByInstance instead of _subscriptionIds.
2026-03-20 23:55:31 -04:00
Joseph Doherty
7740a3bcf9 feat: add JoeAppEngine OPC UA nodes, fix DCL auto-reconnect and quality push
- Add JoeAppEngine folder to OPC UA nodes.json (BTCS, AlarmCntsBySeverity, Scheduler/ScanTime)
- Fix DataConnectionActor: capture Self in PreStart for use from non-actor threads,
  preventing Self.Tell failure in Disconnected event handler
- Implement InstanceActor.HandleConnectionQualityChanged to mark attributes Bad on disconnect
- Fix LmxFakeProxy TagMapper to serialize arrays as JSON instead of "System.Int32[]"
- Allow DataType and DataSourceReference updates in TemplateService.UpdateAttributeAsync
- Update test_infra_opcua.md with JoeAppEngine documentation
2026-03-19 13:27:54 -04:00
Joseph Doherty
ffdda51990 fix(infra): use appConfig.Validate instead of CheckApplicationInstanceCertificate
Replace with Validate() which validates config without requiring a cert,
matching the RealOpcUaClient pattern. Fixes OPC UA connection failure.
2026-03-19 11:30:58 -04:00
Joseph Doherty
8f2700f11e test(infra): add integration smoke test for RealLmxProxyClient against LmxFakeProxy 2026-03-19 11:29:00 -04:00
Joseph Doherty
2cb592ad00 docs: add LmxFakeProxy to test infrastructure documentation 2026-03-19 11:27:30 -04:00
Joseph Doherty
edb2ab98cb feat(infra): add LmxFakeProxy Dockerfile and docker-compose service 2026-03-19 11:26:19 -04:00
Joseph Doherty
aef70bec7f feat(infra): wire up Program.cs with CLI args, env vars, and OPC UA bridge startup 2026-03-19 11:25:36 -04:00
Joseph Doherty
6852250497 feat(infra): add ScadaServiceImpl with full proto parity for all RPCs 2026-03-19 11:24:26 -04:00
Joseph Doherty
9cc8a1ae80 feat(infra): add TagMapper with address mapping, value parsing, and quality mapping 2026-03-19 11:20:57 -04:00
Joseph Doherty
efbedc60a8 feat(infra): add SessionManager with full session tracking and API key validation 2026-03-19 11:20:44 -04:00
Joseph Doherty
1d498b94b4 feat(infra): add IOpcUaBridge interface and OpcUaBridge with OPC UA reconnection 2026-03-19 11:20:25 -04:00
Joseph Doherty
1b27b89ca0 feat(infra): scaffold LmxFakeProxy project with proto and test project 2026-03-19 11:15:54 -04:00
Joseph Doherty
3e93a0d8c3 docs: add LmxFakeProxy implementation plan with 10 tasks
Detailed task-by-task plan covering scaffolding, TagMapper, SessionManager,
OpcUaBridge, ScadaServiceImpl, Program.cs, Docker, docs, and integration test.
2026-03-19 11:13:51 -04:00
Joseph Doherty
e19a568b9b docs: add LmxFakeProxy design — OPC UA-backed test proxy for LmxProxy protocol
Defines a gRPC server implementing the scada.ScadaService proto that bridges
to the existing OPC UA test server. Enables end-to-end testing of
RealLmxProxyClient without a Windows LmxProxy deployment.
2026-03-19 11:08:47 -04:00
Joseph Doherty
e837eae2cc feat: wire real LmxProxy gRPC client into Data Connection Layer
Replace stub ILmxProxyClient with production proto-generated gRPC client
(RealLmxProxyClient) that connects to LmxProxy servers with x-api-key
metadata header authentication. Includes pre-generated proto stubs for
ARM64 Docker compatibility, updated adapter with proper quality mapping
(Good/Uncertain/Bad), subscription via server-streaming RPC, and 20 unit
tests covering all operations. Updated Component-DataConnectionLayer.md
to reflect the actual implementation.
2026-03-18 11:57:18 -04:00
Joseph Doherty
da683d4fe9 fix: lazy-compile API method scripts and prefix composed alarm trigger attributes
- InboundScriptExecutor lazy-compiles scripts on first request, solving
  the multi-node problem where methods created via CLI/UI were only compiled
  on the ManagementActor's node, not the node handling the HTTP request.
- ManagementActor hot-registers API method scripts on create/update/delete
  for the local node.
- FlatteningService prefixes the "attribute" field in composed alarm trigger
  configs with the composition instance name so alarms evaluate against the
  correct path-qualified attribute (e.g. CoolingTank.Level not Level).
2026-03-18 09:30:12 -04:00
Joseph Doherty
db387c6613 fix: include recipients in artifact deployment and load shared scripts on startup
NotificationRepository.GetAllNotificationListsAsync() was missing
.Include(Recipients), causing artifact deployments to push empty recipient
lists to sites. Also load shared scripts from SQLite on DeploymentManager
startup so they're available before Instance Actors compile their scripts.
2026-03-18 09:13:10 -04:00
Joseph Doherty
78fbb13df7 feat: wire Inbound API Route.To().Call() to site instance scripts and add Roslyn compilation
Completes the Inbound API → site script call chain by adding RouteToCallRequest
handlers in SiteCommunicationActor and DeploymentManagerActor. Also replaces the
placeholder dispatch table in InboundScriptExecutor with Roslyn compilation of
API method scripts at startup, enabling user-defined inbound API methods to call
instance scripts across the cluster.
2026-03-18 08:43:13 -04:00
Joseph Doherty
eb8ead58d2 feat: wire SQLite replication between site nodes and fix ConfigurationDatabase tests
Add SiteReplicationActor (runs on every site node) to replicate deployed
configs and store-and-forward buffer operations to the standby peer via
cluster member discovery and fire-and-forget Tell. Wire ReplicationService
handler and pass replication actor to DeploymentManagerActor singleton.

Fix 5 pre-existing ConfigurationDatabase test failures: RowVersion NOT NULL
on SQLite, stale migration name assertion, and seed data count mismatch.
2026-03-18 08:28:02 -04:00
Joseph Doherty
f063fb1ca3 fix: wire DCL tag value delivery, alarm evaluation, and snapshot timestamps
Three runtime bugs fixed:
- DataConnectionActor: TagValueReceived/TagResolutionSucceeded/Failed not
  handled in any Become state — OPC UA values went to dead letters. Added
  initial read after subscribe to seed current values immediately.
- AlarmActor: ParseEvalConfig expected "attributeName"/"matchValue"/"min"/
  "max" keys but seed data uses "attribute"/"value"/"high"/"low". Added
  support for both conventions and !=prefix for not-equal matching.
- InstanceActor: snapshots reported all alarms (including unevaluated) with
  correct priorities and source timestamps instead of current UTC. Removed
  bogus Vibration template attribute that shadowed Speed's tag mapping.
2026-03-18 07:36:48 -04:00
Joseph Doherty
9c6e3c2e56 feat: add CLI debug snapshot command for one-shot instance state inspection
Adds `debug snapshot --id <int>` to query a running instance's current
attribute values and alarm states without the subscribe/stream overhead
of the debug view. Routes through ManagementActor → CommunicationService
→ site DeploymentManager → InstanceActor using the existing remote query
pattern.
2026-03-18 07:16:22 -04:00
Joseph Doherty
6ee820b0f0 docs: add Docker/OrbStack CLI connection note to CLI README 2026-03-18 07:04:53 -04:00
Joseph Doherty
899dec6b6f feat: wire ExternalSystem, Database, and Notify APIs into script runtime
IServiceProvider now flows through the actor chain (DeploymentManagerActor
→ InstanceActor → ScriptActor → ScriptExecutionActor) so scripts can
resolve IExternalSystemClient, IDatabaseGateway, and
INotificationDeliveryService from DI. ScriptGlobals exposes ExternalSystem,
Database, Notify, and Scripts as top-level properties so scripts can use
them without the Instance. prefix.
2026-03-18 02:41:18 -04:00
Joseph Doherty
8095c8efbe fix: only active singleton node sends health reports
Both nodes of a site cluster were sending health reports. The standby
node (without the DeploymentManager singleton) reported 0 instances and
no connections, overwriting the active node's data in the aggregator.

Added IsActiveNode flag to ISiteHealthCollector, set by
DeploymentManagerActor on PreStart/PostStop. HealthReportSender skips
sending when the node is not active. Also ensured EnsureDclConnections
is called during startup batch creation so data connections survive
container restarts.
2026-03-18 01:44:57 -04:00
Joseph Doherty
213ca2698a fix: update instance counts after startup batch creation completes
UpdateInstanceCounts() was only called before Instance Actors were
created (in HandleStartupConfigsLoaded), showing 0 enabled on the
health dashboard. Now also called after each batch in
HandleStartNextBatch to reflect actual running actor count.
2026-03-18 01:28:46 -04:00
Joseph Doherty
c63fb1c4a6 feat: achieve CLI parity with Central UI
Add 33 new management message records, ManagementActor handlers, and CLI
commands to close all functionality gaps between the Central UI and the
Management CLI. New capabilities include:

- Template member CRUD (attributes, alarms, scripts, compositions)
- Shared script CRUD
- Database connection definition CRUD
- Inbound API method CRUD
- LDAP scope rule management
- API key enable/disable
- Area update
- Remote event log and parked message queries
- Missing get/update commands for templates, sites, instances, data
  connections, external systems, notifications, and SMTP config

Includes 12 new ManagementActor unit tests covering authorization,
happy-path queries, and error handling. Updates CLI README and component
design documents (Component-CLI.md, Component-ManagementService.md).
2026-03-18 01:21:20 -04:00
Joseph Doherty
b2385709f8 fix: raise health report sender log level to INFO for observability
Changed "Sent health report" from DEBUG to INFO and failure log from
WARNING to ERROR so health report activity is visible in default logging.
2026-03-18 01:08:44 -04:00
Joseph Doherty
f165ca2774 feat: wire all health metrics and add instance counts to dashboard
Wired ISiteHealthCollector calls for script errors (ScriptExecutionActor),
alarm eval errors (AlarmActor), dead letters (DeadLetterMonitorActor), and
S&F buffer depth placeholder. Added instance count tracking (deployed/
enabled/disabled) to SiteHealthReport via DeploymentManagerActor. Updated
Health Dashboard UI to show instance counts per site. All metrics flow
through the existing health report pipeline via ClusterClient.
2026-03-18 00:57:49 -04:00
Joseph Doherty
88b5f6cb54 fix: handle mixed JSON types in data connection config deserialization
DeploymentManagerActor deserialized connection config JSON as
Dictionary<string, string>, which silently failed on non-string values
like {"publishInterval":1000}. The OPC UA adapter then fell back to
localhost:4840 (unreachable in Docker). Now uses JsonDocument to handle
any JSON value type. OPC PLC Simulator connects successfully.
2026-03-18 00:39:01 -04:00
Joseph Doherty
68115e7e38 feat: move Areas to Design role, fix logout, add Sign Out button
Areas management is a design concern, not admin. Moved Areas page
authorization from RequireAdmin to RequireDesign, moved nav link from
Admin to Design section, updated ManagementActor role check. Added
GET /logout endpoint (was 404, now redirects to login). Improved Sign
Out button visibility in sidebar next to username.
2026-03-18 00:28:35 -04:00
Joseph Doherty
75a6636a2c fix: wire DCL connection state changes into ISiteHealthCollector
DataConnectionActor now calls UpdateConnectionHealth() on state
transitions (Connecting/Connected/Reconnecting) and UpdateTagResolution()
on connection establishment. DataConnectionManagerActor calls
RemoveConnection() on actor removal. Health reports now include
data connection statuses when instances are deployed with bindings.
2026-03-18 00:20:02 -04:00
Joseph Doherty
4f22ca2b1f feat: replace ActorSelection with ClusterClient for inter-cluster communication
Central and site clusters now communicate via ClusterClient/
ClusterClientReceptionist instead of direct ActorSelection. Both
CentralCommunicationActor and SiteCommunicationActor are registered
with their cluster's receptionist. Central creates one ClusterClient
per site using NodeA/NodeB contact points from the DB. Sites configure
multiple CentralContactPoints for automatic failover between central
nodes. ISiteClientFactory enables test injection.
2026-03-18 00:08:47 -04:00
Joseph Doherty
e5eb871961 fix: wire up health report pipeline between sites and central aggregator
Sites now send SiteHealthReport via AkkaHealthReportTransport →
SiteCommunicationActor → CentralCommunicationActor → CentralHealthAggregator.
Added IHealthReportTransport impl, ISiteIdentityProvider impl, registered
HealthReportSender on site nodes, and added SiteHealthReport handler in
CentralCommunicationActor. Health Dashboard now shows all 3 sites online.
2026-03-17 23:46:17 -04:00
Joseph Doherty
9e97c1acd2 feat: replace site registration with database-driven site addressing
Central now resolves site Akka remoting addresses from the Sites DB table
(NodeAAddress/NodeBAddress) instead of relying on runtime RegisterSite
messages. Eliminates the race condition where sites starting before central
had their registration dead-lettered. Addresses are cached in
CentralCommunicationActor with 60s periodic refresh and on-demand refresh
when sites are added/edited/deleted via UI or CLI.
2026-03-17 23:13:10 -04:00
Joseph Doherty
eb8d5ca2c0 feat: add Docker infrastructure for 8-node cluster topology (2 central + 3 sites)
Multi-stage Dockerfile with NuGet restore layer caching, per-node appsettings
with Docker hostnames, shared bridge network with infra services, and
build/deploy/teardown scripts. Ports use 90xx block to avoid conflicts.
2026-03-17 22:12:50 -04:00
Joseph Doherty
775cb8084f feat: data-sourced attributes start with uncertain quality before first DCL value
Attributes bound to data connections now initialize with "Uncertain" quality,
distinguishing "never received a value" from "known good" or "connection lost."
Quality is tracked per attribute and included in GetAttributeResponse.
2026-03-17 18:25:39 -04:00
Joseph Doherty
adc1af9f16 docs: note CLI as preferred tool for non-UI system setup and control 2026-03-17 18:19:31 -04:00
Joseph Doherty
85d351d729 chore: ignore all logs/ directories at any depth 2026-03-17 18:18:28 -04:00
Joseph Doherty
eea50014de fix: resolve CLI serialization failures and add README
Two Akka.NET deserialization bugs prevented CLI commands from reaching ManagementActor: IReadOnlyList<string> in AuthenticatedUser serialized as a compiler-generated internal type unknown to the server, and ManagementSuccess.Data carried server-side assembly types the CLI couldn't resolve on receipt. Fixed by using string[] for roles and pre-serializing response data to JSON in ManagementActor before sending. Adds full CLI reference documentation covering all 10 command groups.
2026-03-17 18:17:47 -04:00