Commit Graph

384 Commits

Author SHA1 Message Date
Joseph Doherty
d8f99ba781 docs(code-reviews): add regen-readme.py to generate the review index
README.md is now generated from the per-module findings.md files by
code-reviews/regen-readme.py (discovers modules, parses each finding's
severity/status, rebuilds the Pending Findings and Module Status tables).
Run with --check to fail when README.md is stale (CI-friendly).

REVIEW-PROCESS.md section 5 now points to the script instead of describing
a manual edit, and README.md carries a generated-file banner.
2026-05-16 19:18:18 -04:00
Joseph Doherty
91438dcc1b fix(store-and-forward): create the SQLite database directory on init (StoreAndForward-014)
StoreAndForwardStorage.InitializeAsync opened a SqliteConnection against the
configured SqliteDbPath (default ./data/store-and-forward.db) without ensuring
the parent directory exists. SQLite creates the database file but not its
directory, so when data/ was absent the connection failed with
"SQLite Error 14: unable to open database file" — aborting the site host's
RegisterSiteActors at StoreAndForwardService.StartAsync.

This was the root cause of the six failing SiteActorPathTests. Production
masked it because the Docker image / deployment creates data/.

InitializeAsync now calls EnsureDatabaseDirectoryExists, which parses the
connection string and creates the parent directory of a file-backed database
(in-memory databases and bare filenames are skipped).

Regression test InitializeAsync_FileInMissingDirectory_CreatesDirectory fails
against the pre-fix code. Host suite now 155/155 green (was 149/155).
2026-05-16 19:13:00 -04:00
Joseph Doherty
61253e3269 fix(store-and-forward): resolve S&F delivery + replication wiring (3 Critical findings)
Resolves StoreAndForward-001, ExternalSystemGateway-001, NotificationService-001
— one systemic gap where buffered messages were persisted but never delivered,
and the active node never replicated its buffer to the standby.

Delivery handlers (ExternalSystemGateway-001 / NotificationService-001):
- AkkaHostedService registers delivery handlers for the ExternalSystem,
  CachedDbWrite and Notification categories after StoreAndForwardService starts;
  each resolves its scoped consumer in a fresh DI scope.
- ExternalSystemClient, DatabaseGateway and NotificationDeliveryService each
  gain a DeliverBufferedAsync method: re-resolve the target and re-attempt
  delivery, returning true/false/throwing per the transient-vs-permanent contract.
- EnqueueAsync gains an attemptImmediateDelivery flag; CachedCallAsync and
  NotificationDeliveryService.SendAsync pass false (they already attempted
  delivery themselves) so registering a handler does not dispatch twice.

Replication (StoreAndForward-001):
- ReplicationService is injected into StoreAndForwardService; a new BufferAsync
  helper replicates every enqueue, and successful-retry removes and parks are
  replicated too. Fire-and-forget, no-op when replication is disabled.

Tests: StoreAndForwardReplicationTests (Add/Remove/Park observed),
attemptImmediateDelivery behaviour, and DeliverBufferedAsync paths for each
consumer. Full solution builds; StoreAndForward/ExternalSystemGateway/
NotificationService suites green.
2026-05-16 18:58:11 -04:00
Joseph Doherty
a9bd7ee37c fix(central-ui): resolve CentralUI-001 — enforce script trust model before sandbox execution
ScriptAnalysisService.RunInSandboxAsync compiled and executed arbitrary
user C# in the central host process with no trust-model enforcement — the
forbidden-API set was only a Monaco editor diagnostic. A Design-role user
could run System.IO/Process/Reflection/network code on the central node.

Added a Roslyn semantic gate (EnforceTrustModel) invoked after compilation
and before script.RunAsync, and on nested shared scripts in callSharedFunc;
a script referencing any forbidden API is rejected before it runs.

Reworked FindForbiddenApiUsages: it now resolves every identifier against
the semantic model and checks types and members, so a fully-qualified call
(System.IO.File.WriteAllText) is caught — the pre-fix check only inspected
the leftmost identifier and missed that shape. This is a static semantic
gate, not a process sandbox.

Adds gate regression tests that fail against the pre-fix code, plus a
clean-script test guarding against over-blocking.
2026-05-16 18:41:12 -04:00
Joseph Doherty
a9ceba00d0 fix(communication): resolve Communication-001 — early stream termination handling
DebugStreamService.StartStreamAsync awaited the initial debug snapshot inside
a try whose only handler was catch (OperationCanceledException). When the
stream terminated before the snapshot arrived, onTerminatedWrapper completed
the await with an InvalidOperationException that escaped the catch — the
caller got a raw, untranslated exception and the service did no teardown of
its own on that path.

Replaced with catch (Exception): it removes the session entry, sends
StopDebugStream to the bridge actor via the local reference (deterministic
teardown, idempotent), and throws a descriptive exception — TimeoutException
for the 30s timeout, otherwise an InvalidOperationException naming the
instance/site and wrapping the cause.

Re-triaged Critical -> Medium: the originally-claimed multi-minute site-side
resource leak does not occur (the bridge actor self-terminates on every
onTerminated path). Adds the first DebugStreamService test, which fails
against the pre-fix code.
2026-05-16 18:32:52 -04:00
Joseph Doherty
239bee3bc4 fix(data-connection): resolve DataConnectionLayer-001 — off-thread actor state mutation
HandleSubscribe spawned a Task.Run that mutated DataConnectionActor private
state (_subscriptionIds, _subscriptionsByInstance, _totalSubscribed,
_resolvedTags, _unresolvedTags) from a thread-pool thread, racing the actor's
own message loop — a data race on non-thread-safe Dictionary/HashSet and
non-atomic counters.

Restructured HandleSubscribe to follow the actor's existing PipeTo(Self)
pattern: the background task now performs only adapter I/O and pipes a
SubscribeCompleted message to Self; all subscription-state mutation happens
in the new HandleSubscribeCompleted handler on the actor thread (wired into
the Connected, Connecting and Reconnecting states).

Adds DCL001_ConcurrentSubscribes_DoNotCorruptSubscriptionCounters (30x30
concurrent subscribes) which fails against the pre-fix code and passes after.
2026-05-16 18:26:43 -04:00
Joseph Doherty
977d7369a7 docs: add code review process and baseline review of all 19 modules
Establishes a per-module code review workflow under code-reviews/ and
records the 2026-05-16 baseline review (commit 9c60592): 241 findings
across all src/ modules (6 Critical, 46 High, 100 Medium, 89 Low).
This is the clean starting point for remediation work.
2026-05-16 18:09:09 -04:00
Joseph Doherty
9c60592632 build: adopt NuGet Central Package Management
Move all package versions into Directory.Packages.props so every project
resolves a single consistent version. Consolidates the Roslyn packages
(Microsoft.CodeAnalysis.CSharp.Scripting/Workspaces) onto 5.0.0, which
resolves the pre-existing NU1608 version-skew error in the test projects.
2026-05-16 15:56:30 -04:00
Joseph Doherty
fd1518f4f4 test(central-ui): remove vacuous tests for removed analyzer diagnostics
Six tests asserted DoesNotContain(SCADA004/SCADA005) or an empty InlayHints
result — all pass for the wrong reason now that those diagnostics and the
positional InlayHints were removed in the analyzer realignment. They also
used the obsolete top-level CallScript syntax. Removed.
2026-05-16 15:06:30 -04:00
Joseph Doherty
b949dc4183 test(central-ui): realign analyzer tests with the reworked script-call API 2026-05-16 15:04:06 -04:00
Joseph Doherty
3cc174c3cd test(central-ui): fix the CentralUI.Tests build
Two stale references blocked compilation: the DataConnection page tests
still pointed at Components.Pages.Admin (the pages moved to .Design), and
ScriptAnalysisServiceTests constructed ScriptAnalysisService without the
IServiceProvider parameter. The project now compiles.
2026-05-16 14:44:30 -04:00
Joseph Doherty
d030153378 test(site-runtime): fix stale SetStaticAttribute tests
HandleSetStaticAttribute was made fire-and-forget (commit 2951507) — it no
longer replies with SetStaticAttributeResponse — but three InstanceActor
tests still ExpectMsg<SetStaticAttributeResponse> and timed out. Verify the
mutation via the GetAttributeRequest round-trip instead, which the FIFO
mailbox makes a sound sync point. Test intent (in-memory update, SQLite
persistence, serialized ordering) is unchanged.
2026-05-16 14:33:09 -04:00
Joseph Doherty
d63d412461 test(triggers): expect AlarmTriggerType.Expression in the enum membership test 2026-05-16 06:42:17 -04:00
Joseph Doherty
0a535cd4a5 fix(triggers): don't false-flag Children/Parent attribute refs in expression validation 2026-05-16 06:08:06 -04:00
Joseph Doherty
5065384305 fix(triggers): use explicit ValidationCategory + tighten expression syntax validation 2026-05-16 05:57:39 -04:00
Joseph Doherty
bf3f572ad9 feat(triggers): validate expression triggers pre-deployment 2026-05-16 05:52:25 -04:00
Joseph Doherty
3499d76f14 feat(ui/triggers): expression trigger panel in the script & alarm editors 2026-05-16 05:46:27 -04:00
Joseph Doherty
78b10d00d8 fix(triggers): bound expression evaluation, align AlarmActor error handling, dedupe config parsing 2026-05-16 05:43:18 -04:00
Joseph Doherty
41c3fa3d84 fix(triggers): seed the trigger-expression attribute snapshot at actor startup 2026-05-16 05:38:50 -04:00
Joseph Doherty
9e21b47080 feat(triggers): runtime expression trigger evaluation for scripts and alarms 2026-05-16 05:35:02 -04:00
Joseph Doherty
f789ab4a91 docs(triggers): list the Expression config shape in the codec summaries 2026-05-16 05:30:12 -04:00
Joseph Doherty
199cdbe798 feat(triggers): add Expression to the script & alarm trigger codecs 2026-05-16 05:27:33 -04:00
Joseph Doherty
8050a1996f docs(plans): implementation plan for expression triggers 2026-05-16 05:25:10 -04:00
Joseph Doherty
c94d3b7570 docs(plans): design for expression-based script & alarm triggers
Captures the brainstormed design for a new Expression trigger: a read-only
boolean C# expression evaluated on attribute updates, edge-triggered for
scripts and level-based for alarms, compiled against a restricted read-only
globals type.
2026-05-16 05:21:57 -04:00
Joseph Doherty
6fb313cf58 feat(ui/templates): structured trigger editor for template scripts
The script add/edit modal exposed a script's trigger as two raw free-text
inputs — a type string and hand-written config JSON — with no validation
and no parity with the alarm trigger UI.

Replace them with a ScriptTriggerEditor component (mirroring
AlarmTriggerEditor): a trigger-type dropdown plus type-specific panels for
Interval, ValueChange, Conditional, and Call, a grouped attribute picker,
and an auto-generated hint. A ScriptTriggerConfigCodec round-trips the
TriggerConfiguration JSON the site runtime's ScriptActor consumes, tolerant
of legacy keys; an unrecognized stored type is preserved untouched in a
read-only panel.
2026-05-16 04:03:42 -04:00
Joseph Doherty
295150751f feat(scripts): realign Test Run with runtime API, add anonymous-object calls and instance binding
The Test Run sandbox and Monaco analysis modelled a script API that had
drifted from the site runtime's ScriptGlobals, so real scripts failed to
compile in Test Run. Realign both to the runtime surface
(Instance/Scripts/ExternalSystem/Attributes/Children/Parent) and drop the
duplicate ScriptHost stub so the two cannot diverge again.

- Script calls (Scripts.CallShared, Instance.CallScript, Route.To().Call)
  accept an anonymous object instead of a hand-built dictionary, via a
  shared ScriptArgs normalizer; existing dictionary calls still compile.
- Test Run can optionally bind to a deployed instance, so Instance/
  Attributes/CallScript route to it cross-site; adds site-side
  RouteToGetAttributes/RouteToSetAttributes handlers.
- Adds Test Run panels to the API method and template script editors.
- Fixes the TestDatabaseQuery seed script, which queried a table that
  never existed.

Also commits unrelated in-progress work already in the tree: the health
monitoring report loop, site streaming changes, and the Admin/Design
data-connection and SMTP page reorganization.
2026-05-16 03:37:56 -04:00
Joseph Doherty
d7b05b40e9 fix(host): drop UseStaticFiles so MapStaticAssets controls caching
UseStaticFiles middleware ran before the MapStaticAssets endpoints and
served static assets (monaco-init.js, site.css, etc.) with no
Cache-Control header. Browsers then heuristically cached them and kept
serving stale copies across deploys — e.g. the Monaco editor ran an old
monaco-init.js that did not send the script kind, so inbound API method
scripts were analysed against the wrong globals and 'Route' was flagged
as undefined.

MapStaticAssets alone now serves every static asset, tagging
non-fingerprinted files with Cache-Control: no-cache so the browser
always revalidates via ETag.
2026-05-15 12:29:14 -04:00
Joseph Doherty
e54c4a6c2e feat(ui/auth): use a minimal layout for the login page
The login page previously rendered inside MainLayout, showing the full
nav sidebar and the authenticated-user footer. It now uses a bare
LoginLayout (no nav, no session-expiry watchdog, no dialog host) and
just renders its own centred card.
2026-05-15 12:16:36 -04:00
Joseph Doherty
fc18239b97 fix(ui/auth): stop /login redirect loop when the session is expired
SessionExpiry renders inside MainLayout, which also wraps the login
page. For a user with a still-present auth cookie but an expired
expires_at claim, it redirected /login back to /login indefinitely.
It now skips the redirect when already on the login page.
2026-05-15 12:14:57 -04:00
Joseph Doherty
1d5465f31c fix(deployment): instance delete fully removes the record
Deleting an instance only undeployed it from the site and set the state
to NotDeployed, leaving an orphan record that could never be removed —
the state-transition matrix rejected delete from NotDeployed.

Delete now removes the instance record entirely (deployment history,
snapshot, attribute/alarm overrides, and connection bindings go with
it), and is permitted from any state.
2026-05-15 12:05:13 -04:00
Joseph Doherty
17e24ddd20 fix(site-event-log): record script errors and route queries to the active node
Script execution failures were only written to Serilog, never to the
site event log — SiteRuntime did not reference the SiteEventLogging
project. ScriptExecutionActor now resolves ISiteEventLogger and emits a
'script'/'Error' event on timeout and exception.

The event-log query handler was a per-node actor bound to that node's
local SQLite. A ClusterClient query could land on the standby (which
records no events) and return nothing. The handler is now a cluster
singleton with a proxy, so queries always reach the active node.
2026-05-15 12:04:59 -04:00
Joseph Doherty
80ec16a6d0 feat(ui/auth): redirect to /login when the session times out
Previously a user idling past the 30-minute cookie expiry stayed parked
on a stale page until they tried to navigate. The auth cookie's UTC
expiry is now also stamped onto an expires_at claim at sign-in, and a
SessionExpiry component mounted in MainLayout schedules a delay until
expiry + 2s grace, then force-loads /login — at which point the standard
cookie middleware confirms the session is gone and serves the login page.
2026-05-13 16:13:53 -04:00
Joseph Doherty
3f37584728 feat(ui/topology): open instance in Debug View from context menu
Adds a Debug View item to the instance context menu on /deployment/topology
that navigates to /deployment/debug-view with siteId and instanceId query
parameters; the page now auto-connects when those are present (falling
back to the existing localStorage auto-reconnect otherwise). Disabled for
non-Enabled instances since debug streaming only targets enabled ones.

Also fixes a latent NRE in DebugView.OnInitializedAsync: the toast ref
isn't bound yet during init, so transient load failures are now stashed
and surfaced from OnAfterRenderAsync where the toast is ready.
2026-05-13 13:41:20 -04:00
Joseph Doherty
733679a376 feat(ui/api-keys): grant API method access on edit page
Admins can now check/uncheck which API methods this key is approved to
invoke directly on /admin/api-keys/{id}/edit, instead of having to bounce
through the Design role's API method editor. Membership is diffed against
the initial state and applied by mutating ApprovedApiKeyIds on each
affected ApiMethod in the same SaveChangesAsync.
2026-05-13 13:41:13 -04:00
Joseph Doherty
7044791a55 docs(plans): scrub LmxProxy references from design plans
Remove the LmxProxy work package (WP-8) from phase-3b, the CD-DCL-1..6
protocol details, Q9/Q-P3B-2 from the questions log, the LmxProxy
component-design rows in requirements-traceability, and the inline
mentions across phase-0, phase-4, the gRPC streaming plans, and the
primary/backup data-connection plans.
2026-05-13 13:30:07 -04:00
Joseph Doherty
72e7bbe968 chore: remove deprecated LmxProxy reference implementation
Delete the standalone ZB.MOM.WW.LmxProxy solution and loose adapter
stubs under deprecated/, plus the lone LmxProxy mention in
deprecated/windev.md. The protocol was never wired into the active
codebase and the runtime artifact has been removed from the cluster.
2026-05-13 13:29:58 -04:00
Joseph Doherty
f66dc031a4 fix(health): route site heartbeats into the aggregator
CentralCommunicationActor.HandleHeartbeat was forwarding each incoming
HeartbeatMessage to Context.Parent, which resolves to the /user
guardian — a non-actor. Every site heartbeat went straight to dead
letters (~1026 per central node per 30 minutes at the default ~2s
interval across three sites).

The aggregator now exposes MarkHeartbeat(siteId, receivedAt) which
bumps LastReportReceivedAt on already-known sites (and clears IsOnline
if it had flipped) without touching LatestReport. Heartbeats from
unregistered sites are dropped — first registration still happens on
the first full report. CentralCommunicationActor calls this in place
of the no-op Tell.

The result: heartbeats now serve their stated health-monitoring
purpose (per CLAUDE.md) by keeping a site marked online between the
30s full reports if a single report is briefly delayed, and the dead
letter noise disappears entirely.
2026-05-13 08:11:43 -04:00
Joseph Doherty
7bba48a14a feat(ui/monitoring): redesign Parked Messages page with filters, drawer, and bulk actions
Triage was painful on the old layout: a lone Site dropdown sat on a sparse
row, errors were truncated mid-sentence with a per-row View/Hide toggle
that on expand pushed an unwrapped <pre> through the table and shoved the
Actions column off-screen, all rows looked the same regardless of age or
attempt count, and OriginInstance — which tells you which instance
produced the failure — wasn't displayed at all even though the data was
on the entity.

This pass:

- Adds a real filter bar: Site, Category, Target system, Origin instance,
  Age window, free-text search. Category/Target/Origin/Age/Search filter
  the loaded page client-side; Site still drives the server query (and
  changing site now auto-queries — one fewer click).
- Replaces the in-table expansion with an Offcanvas detail drawer.
  Clicking a row slides in a side panel with full message ID + copy,
  category label, origin, attempts, both timestamps in relative + absolute
  form, the complete error (pre-wrap, scrollable), and big Retry / Discard
  buttons. The table never overflows.
- Stacks Target + Method into one column (target in semibold, method
  small/muted below) and surfaces Origin as a code-styled chip in a new
  column ("—" muted when null).
- Severity left-border on each row, derived client-side from
  AttemptCount/MaxAttempts and age of the last attempt: red when retries
  are exhausted and last attempt was in the past hour, amber when
  exhausted but stale, muted grey otherwise.
- Mini attempt progress bar under the n/max count, red when fully
  exhausted and amber while partial.
- Relative timestamps ("5m ago", "1h ago", "2d ago") with absolute UTC on
  hover via the title attribute — applies in both the table and the drawer.
- Bulk select: header checkbox selects the filtered set, per-row
  checkboxes. When ≥1 selected, a sticky action strip slides in below the
  filter bar offering Retry selected / Discard selected with the usual
  confirm dialog. Toast reports per-item success/failure counts.
- Summary line next to the title: "N parked · K target systems · oldest
  Xh ago" (and "(showing M of N)" when filters are active).
- ParkedMessageEntry contract extended additively with MaxAttempts,
  Category, and OriginInstance so the UI has the data it needs for
  severity, the category filter, and the new column.
- Bumped page size from 25 to 50 to better match the dense layout.
2026-05-13 08:05:22 -04:00
Joseph Doherty
1c2dc45803 feat(ui/api-methods): pick approved API keys when editing a method
The ApiMethod entity had an ApprovedApiKeyIds column and ApiKeyValidator
read it, but no UI/CLI/seed code ever wrote to it. Result: any inbound
POST /api/{method} was rejected with 403 "API key not approved for this
method" regardless of which key was sent.

Add an "Approved API Keys" subsection to the method form, between
Timeout and Parameters: vertical list of checkboxes, one per ApiKey
row (with a "Disabled" badge for disabled keys, and a link to
/admin/api-keys when none exist). OnInitializedAsync loads all keys and
parses the existing comma-separated IDs; Save() serializes the selected
set back to the entity on both create and edit paths.

Re-uses IInboundApiRepository.GetAllApiKeysAsync — no repo or migration
changes needed.
2026-05-13 07:12:44 -04:00
Joseph Doherty
1822e3c76f fix(store-and-forward): wire up parked-message handler and start S&F service on sites
The Parked Messages page returned "Parked message handler not available"
because no actor was ever registered for ParkedMessages, and Retry/Discard
requests had no Receive at all (would have hit deadletters). On top of
that, StoreAndForwardService.StartAsync() was never called anywhere, so
the sf_messages SQLite table was never created and the retry timer never
ran — silently breaking all of S&F.

- New ParkedMessageHandlerActor bridges StoreAndForwardService.{Get,Retry,Discard}
  using the Sender→Task→PipeTo pattern already used in DeploymentManagerActor.
- SiteCommunicationActor now routes ParkedMessageRetryRequest and
  ParkedMessageDiscardRequest the same way as the existing Query handler.
- AkkaHostedService.RegisterSiteActors() resolves StoreAndForwardService,
  calls StartAsync() to create the schema and start the timer, then
  creates and registers the handler actor.
2026-05-13 07:12:37 -04:00
Joseph Doherty
6f1f6b8467 fix(health): replicate site health reports between central nodes
CentralHealthAggregator is a per-node hosted singleton, but site health
reports flow through ClusterClient which round-robins each report to one
central node only. The other node's aggregator never saw those reports
and marked sites offline at the 60s threshold — sites constantly flapped
between online and offline on the monitoring page.

On receive, the active CentralCommunicationActor now republishes a
SiteHealthReportReplica wrapper on a DistributedPubSub topic. Both
central nodes subscribe to the topic and process replicas through a
dedicated path that updates the local aggregator without re-broadcasting
(avoids fan-out loops). The aggregator's existing sequence-number
idempotency makes self-delivery a cheap no-op.

DistributedPubSubExtensionProvider is now listed in the HOCON
`akka.extensions` block so the mediator is initialised at cluster
start, eliminating a race where the first Subscribe arrived before the
extension was loaded.
2026-05-13 06:20:07 -04:00
Joseph Doherty
d9caa3dd7e fix(ui/shared-scripts): show real param count and return type on cards
The card badges were stuck on the pre-migration data shape: the param
counter only handled flat arrays (now JSON Schema objects), and the
return badge said "returns" regardless of the actual type. Count
`properties` for object schemas with array fallback, and label the
return badge with the schema's `type` (or `T[]` for arrays).
2026-05-13 05:52:53 -04:00
Joseph Doherty
352c93d5a2 fix(alarms): surface composed-member attributes across flatten/validate/UI
Three layers were each blind to nested composition in different ways:

- FlatteningPipeline only loaded compositions for templates in the parent's
  inheritance chain, so depth-2 composed attributes (e.g.
  Pump.AlarmSensor.SensorReading) never materialized. Walk composed chains
  breadth-first so the flattener's nested step has the data it needs.

- InstanceConfigure's alarm trigger picker was fed only direct, non-locked
  attributes, hiding inherited and composed-member paths. Feed it the full
  flattened attribute list via FlatteningPipeline.

- ValidationService.ExtractAttributeNameFromTriggerConfig only recognized
  "attributeName", silently passing alarms still using the legacy
  "attribute" key. Accept both keys, matching FlatteningService,
  AlarmActor, and AlarmTriggerConfigCodec.
2026-05-13 05:33:32 -04:00
Joseph Doherty
164d914ba8 feat(ui): rich AlarmTriggerEditor in instance override modal
Replaces the per-row JSON textbox with an Edit button that opens a modal
hosting the full AlarmTriggerEditor. The editor pre-populates with the
merged inherited + override config so the operator sees the effective
state, not the override delta.

On Save:
  - HiLo: diff against inherited, store only changed keys
  - Binary trigger types: whole-replace if the edited config differs

Value comparison in the diff is type-aware (decoded strings, numeric
GetDouble) so JSON-escape differences (e.g., literal em-dash vs —)
don't produce false-positive diffs that pollute the override JSON.

FlatteningService.MergeHiLoConfig is now public so the UI can pre-merge
the editor seed; new public DiffHiLoConfig handles the symmetric
direction. +2 encoding tests cover the new equivalence behavior.

The override row's summary column shows the diff'd keys + priority chip
so operators see what's overridden at a glance.
2026-05-13 04:05:08 -04:00
Joseph Doherty
4e446a7170 feat(ui): instance alarm override editor in InstanceConfigure
Adds an Alarm Overrides card to the per-instance Configure page (next to
the existing Attribute Overrides and Connection Bindings cards). Each
non-locked template alarm gets a row showing its trigger type, inherited
config, and inputs for an override JSON + priority override. A Clear
button removes the override; the Save Alarm Overrides button upserts
all dirty rows.

The HiLo merge / binary whole-replace semantics are surfaced via the JSON
placeholder hint per trigger type. Wired to the existing
InstanceService.SetAlarmOverrideAsync / DeleteAlarmOverrideAsync flow.
2026-05-13 03:28:39 -04:00
Joseph Doherty
751248feb6 feat(alarms): HiLo trigger type with per-band level, hysteresis, messages, overrides
Adds a new HiLo alarm trigger type with four configurable setpoints
(LoLo / Lo / Hi / HiHi). Each setpoint carries an optional priority,
deadband (for hysteresis), and operator message. The site runtime emits
AlarmStateChanged with an AlarmLevel field so consumers can differentiate
warning vs critical bands.

Plumbing:
  - new AlarmLevel enum + AlarmStateChanged.Level/Message init properties
  - AlarmTriggerEditor (Blazor) gets a HiLo render with severity tinting
  - AlarmTriggerConfigCodec extracted from the editor for testability
  - sitestream.proto carries level + message over gRPC
  - SemanticValidator enforces numeric attribute, setpoint ordering,
    non-negative deadband
  - on-trigger scripts get an Alarm global (Name/Level/Priority/Message)
    so notification routing can branch by severity
  - per-instance InstanceAlarmOverride entity + EF migration + flattening
    step + CLI commands; HiLo overrides merge setpoint-by-setpoint, binary
    types whole-replace
  - DebugView shows a Level badge + per-band message tooltip
  - App.razor auto-reloads on permanent Blazor circuit failure
  - docker/regen-proto.sh automates the proto regen workflow (the linux/arm64
    protoc segfault means generated files are checked in for now)
2026-05-13 03:23:32 -04:00
Joseph Doherty
783da8e21a feat(ui): structured editors for script schemas and alarm triggers
Replace raw-JSON text inputs with rich UI: script parameter/return types use
a JSON Schema builder (SchemaBuilder + JsonSchemaShapeParser, with a migration
to convert existing definitions); alarm trigger config uses a type-aware
editor with a flattened attribute picker (AlarmTriggerEditor). AlarmActor
gains optional direction (rising/falling/either) on RateOfChange triggers.
2026-05-13 00:33:00 -04:00
Joseph Doherty
57f477fd28 fix(templates): cascade delete through nested derived templates
DeleteCompositionAsync only dropped the top-level derived template — the
cascaded inner derived rows (created when composing a composite source)
were left orphaned with dangling OwnerCompositionId references. Any
subsequent attempt to recompose the same source hit the name-collision
guard ('Motor Controller.Pump.TempSensor' already exists).

New CascadeDeleteDerivedAsync walks each composition on the derived
template, recursively removes the slot-owned child derived first, then
the composition row, then the derived itself. Mirrors the recursive
shape of CreateCascadedCompositionAsync.
2026-05-12 10:34:55 -04:00
Joseph Doherty
85769486df fix(ui/templates): expand composition leaves to show cascaded slots
Composition leaves were rendered flat — the cascaded inner derived
templates existed in the DB but the tree only showed the outer slot
name (e.g. "Tank Monitor > DrivePump") with no way to see DrivePump's
own TempSensor + AlarmSensor slots.

BuildCompositionLeaves now recurses: for each composition under a
template, look up the composed template (which after derive-on-compose
is a derived row carrying its own Compositions) and build its slot
leaves as children. HasChildrenSelector loses the
"not a composition" guard so nested leaves render with the expand
chevron.
2026-05-12 10:29:52 -04:00
Joseph Doherty
4f90f952d0 fix(templates): cascade child compositions when composing a composite
When the user composes a template that already has compositions of its
own (e.g. \$Sensor → Probe1 slot), only the outer derived was created
— the source's children weren't replicated. AddCompositionAsync now
walks the source's composition graph and creates a parallel derived for
every slot it encounters, each linked back through ParentTemplateId so
override chains stay intact (\$Probe → \$Sensor.Probe1 → \$Pump.TempSensor.Probe1).

The cascade pre-flights every name it would create — a deep collision
aborts before any rows mutate. Internal helper
CreateCascadedCompositionAsync skips the "base templates only" check
since it operates on the source side which may legitimately reference
derived rows.
2026-05-12 09:57:07 -04:00