Commit Graph

494 Commits

Author SHA1 Message Date
Joseph Doherty
ff23f64cf8 feat(template-folder): add TemplateFolderService.CreateFolderAsync with validation 2026-05-11 10:50:28 -04:00
Joseph Doherty
44c6e4a553 refactor(repo): align TemplateFolder methods with sibling repository conventions 2026-05-11 10:48:18 -04:00
Joseph Doherty
4b1077d686 feat(repo): add TemplateFolder repository methods 2026-05-11 10:45:20 -04:00
Joseph Doherty
978ac79ad8 feat(db): EF migration AddTemplateFolders 2026-05-11 10:43:04 -04:00
Joseph Doherty
e0b098d200 feat(db): map TemplateFolder entity and Template.FolderId 2026-05-11 10:42:19 -04:00
Joseph Doherty
1d27ec3b85 feat(templates): add TemplateFolder entity and Template.FolderId 2026-05-11 10:42:19 -04:00
Joseph Doherty
80f407ae0d fix(db): EF migration AddPrimaryBackupDataConnections
Captures the pre-existing entity drift from commit 04af039 (rename
Configuration to PrimaryConfiguration, add BackupConfiguration and
FailoverRetryCount), which was committed without a corresponding
migration. Generating this here unblocks the upcoming AddTemplateFolders
migration on the templates-folder-hierarchy branch.
2026-05-11 10:42:12 -04:00
Joseph Doherty
18387df8cb plan(templates-page): use ScadaLink.slnx (repo uses slnx, not sln) 2026-05-11 10:30:15 -04:00
Joseph Doherty
892204ea3a plan(templates-page): implementation plan for folder hierarchy 2026-05-11 10:27:39 -04:00
Joseph Doherty
daa01261f3 design(templates-page): folder hierarchy and split-pane tree layout
Replaces the current /design/templates list view with a Wonderware-style
template toolbox: nested TemplateFolder entity, FolderId on Template,
composition children as inline tree leaves, persistent split-pane with
editor on the right, context menus + drag-drop reorg.
2026-05-11 10:20:50 -04:00
Joseph Doherty
872d358ad3 chore(docker): drop stale lmxproxy paths from Dockerfile
The lmxproxy workspace was relocated to deprecated/ in 9dccf8e but the
Dockerfile still tried to COPY lmxproxy/src/ZB.MOM.WW.LmxProxy.Client/,
breaking docker build. Remove the two stale COPY lines.
2026-05-08 09:34:23 -04:00
Joseph Doherty
ec1d8f1393 chore(deps): bump packages flagged by NU190x advisories
Restore inside the docker build was failing because TreatWarningsAsErrors
promotes NU1902/NU1903/NU1904 (vulnerable package warnings) to errors.
Bump the flagged packages to advisory-free versions:

- MailKit                                          4.15.1 -> 4.16.0    (GHSA-9j88-vvj5-vhgr)
- Microsoft.AspNetCore.DataProtection.EFCore       10.0.5 -> 10.0.7    (GHSA-9mv3-2cwr-p262, transitively pulls fixed System.Security.Cryptography.Xml — GHSA-37gx-xxp4-5rgx, GHSA-w3x6-4m5h-cxqf)
- OpenTelemetry.Api  (transitive via Akka.Hosting) 1.9.0  -> 1.15.3    (GHSA-g94r-2vxg-569j, GHSA-8785-wc3w-h8q6) — added as a direct PackageReference in ScadaLink.Host to override the Akka.Hosting pin

To resolve the NU1605 downgrade chain triggered by DataProtection.EFCore
10.0.7 (which transitively requires Microsoft.EntityFrameworkCore >= 10.0.7
and friends), bump every Microsoft.* 10.0.5 reference across src/ and
tests/ to 10.0.7 in lockstep.
2026-05-08 09:34:17 -04:00
Joseph Doherty
5da779db17 fix(host): wait for configuration database before applying migrations
Central nodes crashed at startup with `CREATE DATABASE permission denied`
when MSSQL accepted connections before recovering user databases —
DB_ID(@db) returned null, so EF Core's MigrateAsync fell through to
SqlServerDatabaseCreator.CreateAsync. The non-privileged app login then
failed CREATE DATABASE and the host terminated with FTL, leaving Traefik's
/health/active probe unable to find an upstream ("no available server" at
localhost:9000).

Add MigrationHelper.WaitForDatabaseReadyAsync that polls
Database.CanConnectAsync() for up to 60s before invoking MigrateAsync, and
thread an ILogger through so retry attempts surface in normal logs. This
removes the startup race without requiring depends_on across compose stacks
or granting dbcreator to the app login.
2026-05-08 09:33:59 -04:00
Joseph Doherty
9dccf8e72f deprecate(lmxproxy): move all LmxProxy code, tests, and docs to deprecated/
LmxProxy is no longer needed. Moved the entire lmxproxy/ workspace, DCL
adapter files, and related docs to deprecated/. Removed LmxProxy registration
from DataConnectionFactory, project reference from DCL, protocol option from
UI, and cleaned up all requirement docs.
2026-04-08 15:56:23 -04:00
Joseph Doherty
8423915ba1 fix(site-runtime): publish quality changes to site stream for real-time debug view updates
HandleConnectionQualityChanged now publishes AttributeValueChanged events
to the SiteStreamManager for all affected attributes. This ensures the
central UI debug view updates in real-time when a data connection
disconnects and attributes go bad quality.

Only publishes to the stream — does NOT notify script or alarm actors,
since the value hasn't changed and firing scripts/alarms on quality-only
changes would cause spurious evaluations.
2026-03-24 16:32:00 -04:00
Joseph Doherty
6df2cbdf90 fix(lmxproxy): support multiple subscriptions per session
Key subscriptions by unique subscriptionId instead of sessionId to prevent
overwrites when the same session calls Subscribe multiple times (e.g. DCL
StaleTagMonitor). Add session-to-subscription reverse lookup for cleanup.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-24 16:30:06 -04:00
Joseph Doherty
b3076e18db docs(lmxproxy): add stale session subscription fix plan 2026-03-24 16:19:39 -04:00
Joseph Doherty
de7c4067e4 feat(dcl): add debug-level logging for heartbeat subscription callbacks 2026-03-24 16:19:39 -04:00
Joseph Doherty
5fdeaf613f feat(dcl): failover on repeated unstable connections (connect-then-stale pattern)
Previously, failover only triggered when ConnectAsync failed consecutively.
If a connection succeeded but went stale quickly (e.g., heartbeat timeout),
the failure counter reset on each successful connect and failover never
triggered.

Added a separate _consecutiveUnstableDisconnects counter that increments
when a connection lasts less than StableConnectionThreshold (60s) before
disconnecting. When this counter reaches failoverRetryCount, the actor
fails over to the backup endpoint. Stable connections (lasting >60s)
reset this counter.

The original connection-failure failover path is unchanged.
2026-03-24 16:19:39 -04:00
Joseph Doherty
ff2784b862 fix(site-runtime): add SQLite schema migration for backup_configuration column
Existing site databases created before the primary/backup data connections
feature lack the backup_configuration and failover_retry_count columns.
Added TryAddColumnAsync migration that runs on startup after table creation.
2026-03-24 16:19:39 -04:00
Joseph Doherty
0d03aec4f2 feat(dcl): log connection disconnect events to site event log 2026-03-24 16:19:39 -04:00
Joseph Doherty
d4397910f0 feat(dcl): add StaleTagMonitor for heartbeat-based disconnect detection
Composable StaleTagMonitor class in Commons fires a Stale event when no
value is received within a configurable max silence period. Integrated
into both LmxProxyDataConnection and OpcUaDataConnection adapters via
optional HeartbeatTagPath/HeartbeatMaxSilence connection config keys.
When stale, the adapter fires Disconnected triggering the standard
reconnect cycle. 10 unit tests cover timer behavior.
2026-03-24 16:19:39 -04:00
Joseph Doherty
02a7e8abc6 feat(health): show all cluster nodes (online/offline, primary/standby) in health dashboard
Add NodeStatus record, IClusterNodeProvider interface, and AkkaClusterNodeProvider
that queries Akka cluster membership for all site-role nodes. HealthReportSender
populates ClusterNodes before each report. UI shows a row per node with
hostname, Online/Offline badge, and Primary/Standby badge. Falls back to
single-node display if ClusterNodes is not populated.
2026-03-24 16:19:39 -04:00
Joseph Doherty
65cc7b69cd feat(health): wire up NodeHostname, ConnectionEndpoint, TagQuality, ParkedMessageCount collectors
- AkkaHostedService: SetNodeHostname from NodeOptions
- DataConnectionActor: UpdateConnectionEndpoint on state transitions,
  track per-tag quality counts and UpdateTagQuality on value changes
- HealthReportSender: query StoreAndForwardStorage for parked message count
- StoreAndForwardStorage: add GetParkedMessageCountAsync()
2026-03-24 16:19:39 -04:00
Joseph Doherty
e84a831a02 feat(health): redesign health dashboard with 4-column layout and new metrics
New fields in SiteHealthReport: NodeHostname, DataConnectionEndpoints
(primary/secondary), DataConnectionTagQuality (good/bad/uncertain),
ParkedMessageCount. New collector methods to populate them.

Health dashboard redesigned to match mockup: Nodes | Data Connections
(with per-connection tag quality) | Instances + S&F Buffers | Error
Counts + Parked Messages. Site names resolved from repository.
2026-03-24 16:19:39 -04:00
Joseph Doherty
5e2a4c9080 fix(ui): align TreeView node text by giving toggle and spacer equal fixed width 2026-03-24 16:19:39 -04:00
Joseph Doherty
0abaa47de2 fix(ui): normalize TreeView expanded keys to strings for sessionStorage compatibility
Keys from KeySelector (e.g. boxed int) were compared against string keys
restored from sessionStorage, causing expansion state to be lost on
navigation. All keys are now normalized to strings internally.
2026-03-24 16:19:39 -04:00
Joseph Doherty
a0a6bb4986 refactor(ui): replace manual template inheritance tree with TreeView component 2026-03-24 16:19:39 -04:00
Joseph Doherty
2b5dabb336 refactor(ui): redesign Areas page with TreeView and dedicated Add/Edit/Delete pages
Areas page now shows a single TreeView with sites as roots and areas as
children. Context menus: sites get "Add Area", areas get "Add Child Area",
"Edit Area", "Delete Area" — each navigating to a dedicated page.

The Delete Area page shows a TreeView of the area and all recursive children
with assigned instances. Deletion is blocked if any instances are assigned
to the area or its descendants.
2026-03-24 16:19:39 -04:00
Joseph Doherty
968fc4adc7 fix(ui): disable site and instance dropdowns while debug view is connected 2026-03-24 16:19:39 -04:00
Joseph Doherty
4c7fa03c07 fix(ui): remove default list-style bullets from TreeView ul elements 2026-03-24 16:19:39 -04:00
Joseph Doherty
addbb6ffeb fix(ui): move treeview-storage.js to Host wwwroot where static files are served 2026-03-24 16:19:39 -04:00
Joseph Doherty
f1537b62ca refactor(ui): replace instances table with hierarchical TreeView (Site → Area → Instance) 2026-03-24 16:19:39 -04:00
Joseph Doherty
71894f4ba9 refactor(ui): replace manual area tree rendering with TreeView component 2026-03-24 16:19:39 -04:00
Joseph Doherty
4426f3e928 refactor(ui): replace data connections table with TreeView grouped by site 2026-03-24 16:19:39 -04:00
Joseph Doherty
08d511f609 test(ui): add external filtering tests for TreeView (R8) 2026-03-24 16:19:39 -04:00
Joseph Doherty
4e5b5facec feat(ui): add right-click context menu to TreeView (R15) 2026-03-24 16:19:39 -04:00
Joseph Doherty
f127efe6ea feat(ui): add ExpandAll, CollapseAll, RevealNode to TreeView (R12, R13) 2026-03-24 16:19:39 -04:00
Joseph Doherty
d3a6ed5f68 feat(ui): add sessionStorage persistence for TreeView expansion state (R11) 2026-03-24 16:19:39 -04:00
Joseph Doherty
da4f29f6ee feat(ui): add selection support to TreeView (R5) 2026-03-24 16:19:39 -04:00
Joseph Doherty
75648c0c76 feat(ui): add TreeView<TItem> component with core rendering, expand/collapse, ARIA (R1-R4, R14) 2026-03-24 16:19:39 -04:00
Joseph Doherty
4db93cae2b fix(lmxproxy): fix orphaned tag subscriptions when client subscribes per-tag
When a client calls Subscribe multiple times with the same session ID
(one tag per RPC), each call overwrites the ClientSubscription entry.
UnsubscribeClient only cleaned up tags from the last entry, leaving
earlier tags orphaned in _tagSubscriptions. Now scans all tag
subscriptions for client references during cleanup.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-24 15:43:29 -04:00
Joseph Doherty
eecd82b787 fix(lmxproxy): clean up stale session subscriptions on scavenge and add stream timeout
Grpc.Core doesn't reliably fire CancellationToken on client disconnect,
so Subscribe RPCs can hang forever and leak session subscriptions. Bridge
SessionManager scavenging to SubscriptionManager cleanup, and add a
30-second periodic session validity check in the Subscribe loop so stale
streams exit within 30s of session scavenge rather than hanging until
process restart.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-24 15:21:06 -04:00
Joseph Doherty
b74e139a85 fix(lmxproxy): reset probe timer after reconnect to prevent false stale triggers
Without this, the staleness check could fire immediately after reconnect
before the first OnDataChange callback arrives, causing a reconnect loop.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-24 15:06:42 -04:00
Joseph Doherty
488a7b534b feat(lmxproxy): add Connected Since and Reconnect Count to status page
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-24 13:32:46 -04:00
Joseph Doherty
73fe618953 fix(lmxproxy): protect probe subscription from ReadAsync teardown, add instance configs
ReadAsync internally subscribes/unsubscribes the same ScanTime tag used
by the persistent probe, which was tearing down the probe subscription
and triggering false reconnects every ~5s. Guard UnsubscribeInternal and
stored subscription state so the probe tag is never removed by other
callers. Also removes DetailedHealthCheckService (redundant with the
persistent probe), adds per-instance config files (appsettings.v2.json,
appsettings.v2b.json) loaded via LMXPROXY_INSTANCE env var so deploys
no longer overwrite port settings.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-24 12:20:05 -04:00
Joseph Doherty
95168253fc feat(lmxproxy): replace subscribe/unsubscribe health probe with persistent subscription
The old probe did a subscribe-read-unsubscribe cycle every 5 seconds to
check connection health. This created unnecessary churn and didn't detect
the failure mode where long-lived subscriptions silently stop receiving
COM callbacks (e.g. stalled STA message pump). The new approach keeps a
persistent subscription on the health check tag and forces reconnect if
no value update arrives within a configurable threshold (ProbeStaleThresholdMs,
default 5s). Also adds STA message pump debug logging (5-min heartbeat with
message counters) and fixes log file path resolution for Windows services.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-24 11:57:35 -04:00
Joseph Doherty
b3222cf30b fix(site-runtime): wire EventLogHandlerActor so site event log queries work
The SiteCommunicationActor expected an event log handler but none was
registered, causing "Event log handler not available" on the Event Logs
page and CLI. Bridge IEventLogQueryService to Akka via a simple actor.
2026-03-23 00:37:33 -04:00
Joseph Doherty
64c914019d feat(lmxproxy): always show RPC Operations table, rename from 'Operations'
Table now displays all 5 RPC types (Read, ReadBatch, Write, WriteBatch,
Subscribe) with dashes for zero-count operations instead of hiding the
table entirely.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-23 00:12:19 -04:00
Joseph Doherty
7f74b660b3 feat(lmxproxy): add delivered/dropped message counts to subscription stats
Subscription metrics (totalDelivered, totalDropped) now visible in
/api/status JSON and HTML dashboard. Card turns yellow if drops > 0.
Aggregated from per-client counters in SubscriptionManager.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-23 00:07:58 -04:00