Wire the M6 KPI History recorder into the central composition path:
- Program.cs: call services.AddKpiHistory(configuration) on the central-only
branch alongside AddNotificationOutbox/AddAuditLog/AddSiteCallAudit.
- AkkaHostedService.cs: register KpiHistoryRecorderActor as a central,
non-role-scoped ClusterSingletonManager + ClusterSingletonProxy + a
PhaseClusterLeave CoordinatedShutdown graceful-stop drain (singleton name
'kpi-history-recorder'), copied/adapted from the audit-log-purge block.
- appsettings.Central.json (Host + docker + docker-env2 central nodes): add a
ScadaBridge:KpiHistory section (SampleInterval 00:01:00, RetentionDays 90,
PurgeInterval 1.00:00:00, DefaultMaxSeriesPoints 200).
KPI history is observability/best-effort and MUST NOT gate readiness: the
recorder is deliberately NOT added to RequiredSingletonsHealthCheck or any
other readiness gate.
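
A minimal sketch of how the ScadaBridge:KpiHistory values above could bind to an options type. KpiHistoryOptions, its property names, and ParseInterval are assumptions derived from the config keys, not the actual class; the sketch mainly shows how the "d.hh:mm:ss" interval strings parse.

```csharp
using System;
using System.Globalization;

// Hypothetical options type; property names mirror the config keys above.
public sealed class KpiHistoryOptions
{
    public TimeSpan SampleInterval { get; set; } = TimeSpan.FromMinutes(1);
    public int RetentionDays { get; set; } = 90;
    public TimeSpan PurgeInterval { get; set; } = TimeSpan.FromDays(1);
    public int DefaultMaxSeriesPoints { get; set; } = 200;

    // Parses the constant ("c") TimeSpan format: [d.]hh:mm:ss.
    public static TimeSpan ParseInterval(string text) =>
        TimeSpan.ParseExact(text, "c", CultureInfo.InvariantCulture);
}
```

Under this sketch, "1.00:00:00" parses to one day and "00:01:00" to one minute, matching the defaults shown.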
Placeholder AlarmStateChanged rows are a DebugView snapshot-only concept emitted
by InstanceActor.BuildAlarmStatesSnapshot; they are never a real alarm transition.
Their timestamp may be DateTimeOffset.MinValue (the Protobuf Timestamp lower boundary),
which can throw when packed via Timestamp.FromDateTimeOffset.
Added an early-return guard at the top of HandleAlarmStateChanged before any timestamp
pack or channel write. Updated the existing NativeBindingLinkage round-trip test to
use a real (non-placeholder) native alarm; added DropsAlarmStateChanged_WhenIsConfiguredPlaceholder
to assert placeholders are silently dropped (15/15 pass).
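
The guard described above can be sketched as follows. AlarmEvent and AlarmRelaySketch are hypothetical stand-ins (the real AlarmStateChanged and relay actor carry more state); only the ordering matters here: the placeholder check runs before anything touches the timestamp or the outbound channel.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, trimmed-down event; the real AlarmStateChanged has more fields.
public sealed record AlarmEvent(string AlarmName, DateTimeOffset Timestamp)
{
    public bool IsConfiguredPlaceholder { get; init; }
}

public sealed class AlarmRelaySketch
{
    // Stand-in for the channel the real relay writes to.
    public List<AlarmEvent> Forwarded { get; } = new();

    public void HandleAlarmStateChanged(AlarmEvent evt)
    {
        // Guard first: placeholders are snapshot-only and must never reach
        // timestamp packing or the outbound channel.
        if (evt.IsConfiguredPlaceholder)
            return;

        Forwarded.Add(evt);
    }
}
```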
Replace the two flat capped tables with a Bootstrap nav-tabs layout, each
tab hosting a TreeView<DebugTreeNode> built from the live latest-per-name
dictionaries via DebugTreeBuilder. Drop the MaxRows cap, auto-scroll locks,
and Clear buttons (change-feed affordances that don't fit a current-status
tree); HandleStreamEvent now does a plain dictionary upsert. Each per-tab filter
calls ExpandAll on change so matches stay visible. Branch nodes surface roll-up
badges (active-count for alarms, bad-quality for attributes); native binding
nodes show active-count or 'no active conditions'. All existing badge helpers
and ValueFormatter reused. Marshalling/dispose/reconnect contract preserved
(SafeInvokeAsync/_disposed/Dispose unchanged; FilteredAttributeValues kept as
the render-thread dict reader the CentralUI-021 race test exercises).
Rework DebugViewAlarmTableTests for the tabbed-tree DOM: tab presence+default,
computed alarm grouped under its Motor1 branch with the active roll-up badge,
and a native condition nested under its source-binding node with the enriched
kind/severity/Unacked/Shelved badge set.
Computed alarms are placed as leaves at their path-qualified AlarmName; native
conditions group under a deduped IsNativeBinding branch keyed by
NativeSourceCanonicalName, with condition children keyed canonical::sourceRef.
Configured-placeholder events materialise a childless binding node. Alarm
roll-up (WorstState/ActiveCount) excludes placeholders. The filter matches
AlarmName/SourceReference/NativeSourceCanonicalName (OrdinalIgnoreCase) and
retains ancestor + binding branches. 20 new TDD cases; 18 attribute cases stay
green. No DebugTreeNode model changes.
Fix 1 (Important): RollUp_FourLevelDeepBadQuality_ReachesRoot — proves bad quality at a
4-segment-deep leaf propagates HasBadQuality up every ancestor to the root.
Fix 2 (Important): Filter_DeepLeafMatch_RetainsAllAncestorBranches — proves filtering on
a terminal segment of a 3-level path retains all ancestor branches.
Fix 3 (Minor): BuildAttributeTree now returns roots.AsReadOnly() so the returned
IReadOnlyList<DebugTreeNode> reference is not a mutable list.
Fix 4 (Minor): Added <remarks> XML doc to BuildAttributeTree noting the caller-contract
that at most one AttributeValueChanged per AttributeName should be passed.
All 18 DebugTreeBuilder tests pass.
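
To illustrate why Fix 3 matters: returning a List<T> typed as IReadOnlyList<T> still lets a caller cast back and mutate it, whereas AsReadOnly() hands out a ReadOnlyCollection<T> wrapper. ReadOnlySketch and its method names are illustrative only.

```csharp
using System.Collections.Generic;

public static class ReadOnlySketch
{
    // Returns the same List<T> instance, merely typed as read-only.
    public static IReadOnlyList<string> Leaky(List<string> roots) => roots;

    // Returns a ReadOnlyCollection<T> wrapper around the list.
    public static IReadOnlyList<string> Wrapped(List<string> roots) => roots.AsReadOnly();

    // True when a caller could cast the result back to a mutable List<T>.
    public static bool CanDowncast(IReadOnlyList<string> list) => list is List<string>;
}
```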
- Replace placeholder-loop comment with the double-render guard explanation
- Use _alarmTimestamps.GetValueOrDefault(binding, DateTimeOffset.MinValue) so the
placeholder timestamp is stable/idempotent across snapshot calls (was UtcNow)
- Add dcl.ExpectMsg<SubscribeAlarmsRequest>() drain in Snapshot_QuietNativeBinding_EmitsPlaceholder
and Snapshot_NativeBindingWithLiveCondition_NoPlaceholder to consume the DCL message
the NativeAlarmActor sends at startup
Pure path-split composition forest built from streamed AttributeValueChanged
events: branch dedupe by accumulated prefix, ordinal child sort, post-order
bad-quality roll-up, and a case-insensitive name-contains filter that keeps
ancestors. BuildAlarmTree is left as a NotImplementedException stub for DV-4.
16 unit tests cover structure, roll-up, and filter.
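
A rough sketch of the path-split build with prefix dedupe, ordinal sort, and post-order roll-up. TreeNodeSketch and TreeBuilderSketch are hypothetical; the input is simplified to a path -> is-bad-quality map rather than AttributeValueChanged events, a '.' separator is assumed, and the filter is omitted.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical node shape; the real DebugTreeNode is not shown in this log.
public sealed class TreeNodeSketch
{
    public TreeNodeSketch(string name, string path) { Name = name; Path = path; }
    public string Name { get; }
    public string Path { get; }            // accumulated prefix, e.g. "Motor1.Drive"
    public bool HasBadQuality { get; set; }
    public List<TreeNodeSketch> Children { get; } = new();
}

public static class TreeBuilderSketch
{
    // leaves: attribute path -> true when that attribute's quality is bad.
    public static IReadOnlyList<TreeNodeSketch> Build(IReadOnlyDictionary<string, bool> leaves)
    {
        var byPath = new Dictionary<string, TreeNodeSketch>(StringComparer.Ordinal);
        var roots = new List<TreeNodeSketch>();

        foreach (var (fullPath, isBad) in leaves)
        {
            TreeNodeSketch? parent = null;
            var prefix = "";
            foreach (var segment in fullPath.Split('.'))
            {
                prefix = prefix.Length == 0 ? segment : prefix + "." + segment;
                // Dedupe: one node per accumulated prefix.
                if (!byPath.TryGetValue(prefix, out var node))
                {
                    node = new TreeNodeSketch(segment, prefix);
                    byPath[prefix] = node;
                    (parent?.Children ?? roots).Add(node);
                }
                parent = node;
            }
            parent!.HasBadQuality = isBad;
        }

        Sort(roots);
        foreach (var root in roots) RollUp(root);
        return roots.AsReadOnly();
    }

    private static void Sort(List<TreeNodeSketch> nodes)
    {
        nodes.Sort((a, b) => string.CompareOrdinal(a.Name, b.Name));
        foreach (var n in nodes) Sort(n.Children);
    }

    // Post-order: every child is resolved before the parent's flag is final.
    private static bool RollUp(TreeNodeSketch node)
    {
        foreach (var child in node.Children)
            node.HasBadQuality |= RollUp(child);
        return node.HasBadQuality;
    }
}
```

Keying the dedupe on the accumulated prefix rather than the segment name keeps same-named segments under different parents as separate nodes.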
InstanceActor.BuildAlarmStatesSnapshot now adds an IsConfiguredPlaceholder
row per configured native source binding that currently has no live
condition, so the Debug View tree can show the binding node even when
quiet. A binding is "quiet" when no retained AlarmStateChanged carries its
NativeSourceCanonicalName (DV-1).
Kind derivation: reuses the exact nativeKind value already computed via
ResolveNativeKind(nativeSource.ConnectionName) at the NativeAlarmActor
creation site and stored in a new _nativeAlarmKinds dictionary -- the
accurate per-binding kind (NativeOpcUa vs NativeMxAccess), not the
NativeOpcUa default.
Tests: Snapshot_QuietNativeBinding_EmitsPlaceholder,
Snapshot_NativeBindingWithLiveCondition_NoPlaceholder.
Add two additive init-only fields to AlarmStateChanged so the Debug View can
nest live native conditions under their configured source-binding node:
- NativeSourceCanonicalName (binding canonical name, e.g. "Motor1.MotorAlarms")
- IsConfiguredPlaceholder (quiet-binding placeholder flag; default false)
Flow on BOTH cross-process paths:
- Live: proto AlarmStateUpdate fields 22/23 -> StreamRelayActor packs ->
SiteStreamGrpcClient unpacks (regenerated SiteStreamGrpc/Sitestream.cs).
- Snapshot (Newtonsoft): record defaults carry through; no special handling.
NativeAlarmActor.Emit now stamps NativeSourceCanonicalName = _source.CanonicalName.
Additive-only: no existing positional constructor or wire frame changed.
Tests: StreamRelayActorTests round-trips both fields pack->unpack;
NativeAlarmActorTests asserts the emitted event carries the binding canonical name.
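
A minimal sketch of the additive-only pattern, using a hypothetical trimmed-down record (the real AlarmStateChanged has more positional members, and the nullability of NativeSourceCanonicalName is an assumption). Init-only members declared in the record body leave the positional constructor untouched.

```csharp
#nullable enable

// Hypothetical, trimmed-down record for illustration only.
public sealed record AlarmStateChangedSketch(string AlarmName, bool IsActive)
{
    // Additive init-only members: existing positional call sites keep compiling,
    // and payloads that omit these fields leave the defaults in place.
    public string? NativeSourceCanonicalName { get; init; }
    public bool IsConfiguredPlaceholder { get; init; }
}
```

Existing callers keep using `new AlarmStateChangedSketch("HighTemp", true)`, while the native emitter can opt in with an object initializer or a `with` expression.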
Add WaitForAttribute(attributeName, targetValue, timeout, cancellationToken)
to InboundScriptHost.RouteTarget and SandboxInboundScriptHost.RouteTarget,
mirroring the shipped runtime signature in RouteHelper. Eliminates the false
CS error the editor raised against valid Route.To("X").WaitForAttribute(...)
calls in inbound API method scripts. Test asserts the call diagnoses clean
under ScriptKind.InboundApi.
Adds the four missing overloads (value + predicate × WaitAsync + WaitForAsync)
to CompileAttributeAccessor so template/call scripts that use Attributes.WaitAsync
or Attributes.WaitForAsync pass design-time Roslyn validation. Covers both root
scope and composed/child scope (Children["x"].Attributes.WaitAsync) automatically
since CompileCompositionAccessor.Attributes already returns CompileAttributeAccessor.
Connection strings carry credentials; the Database Connections tab rendered the
full string (text + title tooltip) for any Design/Admin user. Replace with a
non-sensitive 'hidden — edit to view' hint so it never reaches the browser DOM.
Connection strings remain editable on the create/edit form. Adds a bUnit
regression guard asserting the seeded secret is absent from the rendered list.
A routed inbound-API call (Route.To(inst).Call(script)) runs the script on
the Site and returns its value to Central inside RouteToCallResponse, which
crosses the Central<->Site PROCESS boundary. A script's natural
'return new { ... }' is a compiler-generated anonymous type that Akka's
cross-process serializer cannot reconstruct on the receiving node, so the
reply was silently dropped and the caller's Route.To().Call() Ask timed out
at 30s with 'Script execution timed out' -- even though the script completed
and all device writes committed.
DeploymentManagerActor.RouteInboundApiCall now projects the routed return
value to a plain CLR graph (Dictionary/List/string/long/double/bool/null)
via a JSON round-trip before placing it in RouteToCallResponse. The graph
round-trips the wire and re-serializes to the same JSON shape the inbound
API expects for the HTTP body / ReturnDefinition validation.
Diagnosed live: IpsenMESMoveIn writes committed + site_events showed the
IpsenMoveIn script completed in ~0.6s, yet the inbound POST returned 500 at
30s; Central's Akka serializer logged 'Writing value of type
<>f__AnonymousType0`1 as Json' at the timeout moment.
379/379 SiteRuntime tests green.
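
The projection described above could look roughly like this. It is a sketch assuming System.Text.Json; the commit does not say which JSON library RouteInboundApiCall uses, and PlainGraphSketch and its method names are hypothetical.

```csharp
#nullable enable
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;

public static class PlainGraphSketch
{
    // Serializes any object (including anonymous types) to JSON, then rebuilds
    // it from Dictionary/List/string/long/double/bool/null only.
    public static object? ToPlainGraph(object? value)
    {
        if (value is null) return null;
        using var doc = JsonDocument.Parse(JsonSerializer.Serialize(value));
        return FromElement(doc.RootElement);
    }

    private static object? FromElement(JsonElement e) => e.ValueKind switch
    {
        JsonValueKind.Object => e.EnumerateObject()
            .ToDictionary(p => p.Name, p => FromElement(p.Value)),
        JsonValueKind.Array => e.EnumerateArray().Select(FromElement).ToList(),
        JsonValueKind.String => e.GetString(),
        // The cast keeps integers as long instead of widening them to double.
        JsonValueKind.Number => e.TryGetInt64(out var l) ? (object)l : e.GetDouble(),
        JsonValueKind.True => true,
        JsonValueKind.False => false,
        _ => null, // Null and Undefined
    };
}
```

For example, `ToPlainGraph(new { Ok = true, Count = 2 })` yields a Dictionary<string, object?> holding a bool and a long, with no compiler-generated type left in the graph.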
MXAccess silently no-ops a whole-array write unless the item reference
ends in "[]" (e.g. "<object>.MoveInWorkOrderNumbers[]") — the COM Write
returns success but the value never commits. Reads work either way, so
the bug surfaced only on writes. Mirror the AVEVA MES Camstar API, which
registers array tags as "<object>.<attr>[]" (scalars have no brackets).
WriteAsync now resolves/advises/writes array values against tag + "[]"
(scalars unchanged), keeping the original tag for result mapping. Adds
IsArrayValue matching the ToMxValue/PadArrayToDeclaredSizeAsync array set.
Verified live via mxwrtest against the deployed gateway: a bare-ref write
returns ok but the read-back is unchanged; a "[]"-ref write commits (read-back
changes, fresh source timestamp). No RealMxGatewayClient unit harness exists (the
gRPC session is concrete) — consistent with how the sibling supervisory/
pad/encode fixes are verified.
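
A simplified sketch of the reference selection described above. ArrayTagSketch is hypothetical: the real IsArrayValue matches a specific set of list types, and the check for an already-present "[]" suffix is an assumption added here, not something the commit states.

```csharp
#nullable enable
using System;
using System.Collections;

public static class ArrayTagSketch
{
    // Simplification: any non-string enumerable counts as an array value.
    public static bool IsArrayValue(object? value) =>
        value is IEnumerable && value is not string;

    // Returns the item reference to resolve/advise/write against; the caller
    // keeps the original tag for result mapping.
    public static string ResolveWriteReference(string tag, object? value) =>
        IsArrayValue(value) && !tag.EndsWith("[]", StringComparison.Ordinal)
            ? tag + "[]"
            : tag;
}
```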
An inbound /api array parameter was materialized as List<object?> whose
elements were raw System.Text.Json.JsonElement. When such a value is routed
Central->Site and a template script assigns it to a List-typed Galaxy
attribute (recv.Attributes[name] = Parameters[name]), the script-side encode
stalls while the attribute codec JSON-serializes the JsonElement items, and the
array write never reaches the DCL. This is why the Ipsen MoveIn array writes
hung for 30s while scalars succeeded.
ParameterValidator.MaterializeArray now builds a strongly-typed list per the
declared element schema (List<string>, List<long>, List<double>, or
List<bool>); arrays with no declared scalar element type materialize each
element to its CLR value
(MaterializeJsonValue) so no raw JsonElement survives. Typed lists serialize
cleanly across nodes and encode to a canonical JSON array, which the
InstanceActor decodes back to the typed list for the device write.
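
One way the typed materialization could be sketched, assuming System.Text.Json input. MaterializeSketch and the elementType hint strings ("string", "integer", "number", "boolean") are hypothetical; the real ParameterValidator schema model is not shown in this log.

```csharp
#nullable enable
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;

public static class MaterializeSketch
{
    // elementType is a hypothetical schema hint for the declared element type.
    public static object MaterializeArray(JsonElement array, string? elementType)
    {
        var items = array.EnumerateArray();
        return elementType switch
        {
            "string"  => (object)items.Select(e => e.GetString() ?? "").ToList(),
            "integer" => items.Select(e => e.GetInt64()).ToList(),
            "number"  => items.Select(e => e.GetDouble()).ToList(),
            "boolean" => items.Select(e => e.GetBoolean()).ToList(),
            // No declared scalar type: convert each element so that no raw
            // JsonElement survives in the list.
            _ => items.Select(ToClr).ToList(),
        };
    }

    private static object? ToClr(JsonElement e) => e.ValueKind switch
    {
        JsonValueKind.String => e.GetString(),
        JsonValueKind.Number => e.TryGetInt64(out var l) ? (object)l : e.GetDouble(),
        JsonValueKind.True => true,
        JsonValueKind.False => false,
        JsonValueKind.Null => null,
        _ => e.GetRawText(), // nested values kept as raw JSON text in this sketch
    };
}
```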
Even with correct array encoding (30d07b9), Ipsen MoveIn array writes still
hung: the Galaxy MES-receiver arrays are fixed-size SAFEARRAYs (e.g.
MoveInWorkOrderNumbers = SAFEARRAY(VT_BSTR) dimensions:[50]) and MXAccess only
accepts a write that supplies ALL slots. ScadaBridge sent just the N elements
the MES provided (1-2), so the COM write blocked. Verified on the live gateway:
a full-size (50) constructed array writes via WriteBulk in ~34ms; a short one
does not.
RealMxGatewayClient.WriteAsync now, for a list value, reads the tag's current
array to learn its slot count and pads the value to that length with
element-type defaults (empty string / 0 / false / default) — the caller's
values fill slots 0..N-1, the rest are cleared. The PLC reads the valid count
from a separate scalar (MoveInNumberWorkOrders). If the size can't be
determined (read fails / not an array) the value is written unpadded and a
warning is logged. Scalars are unaffected.
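
The padding step can be sketched as below. PadSketch is hypothetical; the filler is passed explicitly because default(string) is null rather than the empty string the commit describes, and the behaviour for a value longer than the slot count is not covered by the commit, so the sketch returns it unchanged.

```csharp
using System.Collections.Generic;

public static class PadSketch
{
    // Pads values up to slotCount with the given filler; the caller's values
    // occupy slots 0..N-1 and the remainder are cleared to the filler.
    public static IReadOnlyList<T> PadToSize<T>(IReadOnlyList<T> values, int slotCount, T filler)
    {
        if (slotCount <= values.Count)
            return values; // nothing to pad; truncation is out of scope here

        var padded = new List<T>(slotCount);
        padded.AddRange(values);
        for (var i = values.Count; i < slotCount; i++)
            padded.Add(filler);
        return padded;
    }
}
```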
The Ipsen MoveIn e2e (after the supervisory-advise fix landed scalar writes)
exposed a second blocker: writes to List-typed attributes
(MoveInWorkOrderNumbers / MoveInPartNumbers, List<string>) hung at the 30s
device-write timeout while scalar writes succeeded.
InstanceActor.HandleSetDataAttribute already decodes a List attribute's
canonical JSON into a typed List<T> before the write (so the DCL can push a
real array), but RealMxGatewayClient.ToMxValue only had scalar cases — a
List<T> fell through to Convert.ToString and wrote the garbage string
"System.Collections.Generic.List`1[System.String]" to the array Galaxy
node, which the gateway's COM write rejected/blocked.
Add IReadOnlyList<bool|int|long|float|double|string|DateTimeOffset|DateTime>
cases that call the client package's typed array encoders
(VT_ARRAY|VT_BSTR etc.); List<DateTime> is mapped to DateTimeOffset. Covers
every element type AttributeValueCodec produces.
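
The fall-through can be reproduced in isolation. In this sketch EncodeSketch.Describe is a hypothetical stand-in that returns a label instead of calling the client package's typed array encoders (whose signatures are not shown in this log), and only three element types are listed.

```csharp
using System;
using System.Collections.Generic;

public static class EncodeSketch
{
    // Stand-in for the typed encoders; returns a label instead of a COM variant.
    public static string Describe(object value) => value switch
    {
        IReadOnlyList<string> s => $"string[{s.Count}]",
        IReadOnlyList<long> l   => $"long[{l.Count}]",
        IReadOnlyList<bool> b   => $"bool[{b.Count}]",
        // Without the list cases above, a List<T> lands here and is
        // stringified to its type name.
        _ => Convert.ToString(value) ?? "",
    };
}
```

Calling Convert.ToString on a List<string> directly returns the type name quoted above, which is what the scalar-only ToMxValue wrote to the array node.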
Writes through the MxGateway data connection (e.g. the Ipsen MoveIn flow
writing MES-receiver attributes) hung ~30s and changed nothing, while reads
of the same attributes worked. Root cause: MXAccess only accepts a write on
an item that holds a SUPERVISORY advise; the write path did AddItem +
WriteBulk with no advise and the monitoring subscription used a plain Advise,
so the worker's synchronous COM Write blocked until the gateway command
timeout. (Plain, non-secured writes need no user/login.) Verified live: with a
supervisory advise the write returns ok in ~22ms; without one it does not.
When the connection has no MXAccess write-user context (WriteUserId == 0) it
now behaves as a supervisory client: every advise defaults to
AdviseSupervisory — both the monitoring subscription (SubscribeAsync) and the
write path — so one connection can read and write. A supervisory advise still
delivers OnDataChange (the worker treats either advice kind as sufficient for
updates) so monitoring is unaffected, and the worker's UnAdvise tears down
either kind, so unsubscribe is unchanged. AdviseSupervisory is issued as a raw
MxCommandKind.AdviseSupervisory via the session's Invoke (the client package
exposes only plain Advise). The advise runs at most once per handle via a
Lazy<Task>, so a concurrent first-time subscribe and write on the same new
handle both await the same advise (neither proceeds before it completes); a
faulted advise is evicted so the next write retries. The cached advise is
dropped on unsubscribe. A configured non-zero WriteUserId keeps the prior
plain-advise behaviour.
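
A minimal sketch of the once-per-handle advise with fault eviction, assuming a ConcurrentDictionary keyed by item handle. AdviseOnceSketch and the Func<int, Task> delegate are hypothetical stand-ins for the real session invoke.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

public sealed class AdviseOnceSketch
{
    private readonly ConcurrentDictionary<int, Lazy<Task>> _advises = new();
    private readonly Func<int, Task> _advise; // stand-in for the raw AdviseSupervisory invoke

    public AdviseOnceSketch(Func<int, Task> advise) => _advise = advise;

    public async Task EnsureAdvisedAsync(int handle)
    {
        // GetOrAdd may construct several Lazy instances under a race, but only
        // the stored one is ever evaluated, so the advise starts at most once.
        var lazy = _advises.GetOrAdd(handle, h => new Lazy<Task>(() => _advise(h)));
        try
        {
            await lazy.Value.ConfigureAwait(false);
        }
        catch
        {
            // Evict only this faulted entry so the next caller retries.
            _advises.TryRemove(new KeyValuePair<int, Lazy<Task>>(handle, lazy));
            throw;
        }
    }

    // Called on unsubscribe: forget the cached advise for the handle.
    public void Forget(int handle) => _advises.TryRemove(handle, out _);
}
```

Removing by key/value pair rather than by key alone avoids evicting a fresh advise that another caller stored after the fault.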