Adds an OPC UA | MxGateway protocol dropdown (create-time; locked read-only on
edit), branches the primary/backup endpoint editors, serializer, and validator
by protocol, and persists DataConnection.Protocol accordingly. Updates form
tests: protocol dropdown present on create + MxGateway save round-trips typed
JSON with Protocol=MxGateway.
DataConnectionFactory registers 'MxGateway' -> MxGatewayDataConnection over the
real client; AddDataConnectionLayer binds MxGatewayGlobalOptions; DeploymentManager
FlattenConnectionConfig gains an MxGateway arm using the typed serializer. Factory
test confirms Create("MxGateway") returns the adapter.
Seam implementation wrapping the gateway client + GalaxyRepositoryClient:
OpenSession/Register, AddItem+Advise subscribe, ReadBulk/WriteBulk with handle
tracking, StreamEvents loop with worker_sequence resume + OPC-style quality
mapping, and Galaxy BrowseChildren mapping (objects keyed by gobject id,
attributes by full tag reference). Only type touching the generated contracts.
Implements IDataConnection + IBrowsableDataConnection over the IMxGatewayClient
seam: connect/disconnect with once-only Disconnected guard + background event
loop, subscribe/unsubscribe with tag routing, read/write batch with per-tag
error classification, WriteBatchAndWait, and Galaxy browse mapping. Covers plan
Tasks 6-10. Full unit coverage via FakeMxGatewayClient (12 tests).
Central package management requires package-source mapping with >1 feed
(NU1507 as error), so nuget.config scopes ZB.MOM.WW.MxGateway.* to the Gitea
feed and everything else to nuget.org. Credentials are not committed.
Add design doc for a second data-connection protocol, MxGateway, alongside
the OPC UA client. New IDataConnection adapter behind the existing
DataConnectionFactory extension point; tag pipe (read/subscribe/write) plus
Galaxy hierarchy browse, optional 2nd endpoint for failover. Generalizes the
OPC-UA-named browse plumbing to protocol-agnostic browse via
IBrowsableDataConnection. No entity/schema changes.
- Per-instance DataSourceReferenceOverride on InstanceConnectionBinding
- IBrowsableDataConnection capability + RealOpcUaClient.BrowseChildrenAsync
- BrowseOpcUaNodeCommand routed via DeploymentManagerActor singleton (HA-safe)
- <OpcUaBrowserDialog/> on InstanceConfigure with lazy-loaded tree + manual paste
- Test Bindings popup: one-shot live read of all bound tags via ReadTagValuesCommand
- ConfigurationDatabase migration AddInstanceConnectionBindingOverride
- Doc updates: Component-DataConnectionLayer, Component-TemplateEngine, Component-CentralUI
Adds a Test Bindings button to the Connection Bindings table on the Configure
Instance page that opens a modal showing the live current value of every bound
attribute. Reuses the routing path that the OPC UA tag browser landed on:
Central: TestBindingsDialog → IBindingTester → CommunicationService
→ ReadTagValuesCommand → SiteEnvelope (Ask)
Site: SiteCommunicationActor → DeploymentManagerActor singleton
→ DataConnectionManagerActor → child DataConnectionActor
→ _adapter.ReadBatchAsync
Split mirrors the browse handler:
• Manager owns ConnectionNotFound (only it sees the per-site connection set).
• Child owns ConnectionNotConnected (pre-call status check, never stash —
read is interactive design-time), Timeout (OperationCanceledException),
ServerError (any other exception). Per-tag failures from ReadBatchAsync
become failure TagReadOutcomes without aborting the batch.
CentralUI:
• IBindingTester / BindingTester — Design-role guard via HasClaim against
JwtTokenService.RoleClaimType (not IsInRole — see c1e16cf), typed
transport-failure translation.
• TestBindingsDialog — ShowAsync(siteId, rows, instanceLabel) method-arg
pattern (no Razor parameter race; see 2c138b6), groups rows by connection
and issues one ReadAsync per connection in parallel, per-row error subline
+ per-connection banner, Refresh button re-issues the reads.
• InstanceConfigure.razor — Test Bindings button next to Save Bindings,
disabled when no testable rows. OPC UA only today (other protocols have
no ReadTagValuesCommand wiring yet).
Tests:
• Commons: ReadTagValuesCommand discovered by ManagementCommandRegistry.
• DataConnectionLayer: unknown connection → ConnectionNotFound,
not-connected adapter → ConnectionNotConnected (ReadBatchAsync NOT called),
success-path mapping (Good/Bad + per-tag error), cancellation → Timeout.
• CentralUI: register IBindingTester (and the previously-missing
IOpcUaBrowseService) on the existing InstanceConfigureAuditDrillinTests
Bunit container so the page renders cleanly with the new dialog.
Razor parameter binding propagates on the next render, so reading SiteId
inside LoadRootAsync raced against the parent's "set field, then call
ShowAsync()" pattern — central received an empty siteId and rejected
with "No ClusterClient for site ,". Take the values as args instead.
ClaimsIdentity is built without an explicit roleType, so IsInRole("Design")
checks ClaimTypes.Role while actual claims use "Role" — the guard always
returned not-authorized. Switch to HasClaim(RoleClaimType, "Design").
Wires the OPC UA Tag Browser cross-cluster path: central UI Asks via
CommunicationService.BrowseOpcUaNodeAsync -> ClusterClient -> site
SiteCommunicationActor -> /user/dcl-manager (Task 10 handler).
Uses ActorSelection.Tell(msg, Sender) since DataConnectionManagerActor
is not a child of DeploymentManagerActor and ActorSelection has no
Forward() helper; preserving Sender keeps the BrowseOpcUaNodeResult
routing back to the original Ask.
Integration test deferred: tests/ZB.MOM.WW.ScadaBridge.IntegrationTests
has no ClusterFixture (only ScadaBridgeWebApplicationFactory, which
does not expose a Communication service nor a seeded site OPC UA
connection). Round-trip will be exercised manually under Task 19.
- BrowseOpcUaNodeCommand: int DataConnectionId -> string ConnectionName
(site DataConnectionManagerActor indexes children by name; CentralUI
already has the connection name in scope via the dropdown — no extra
plumbing across the trust boundary).
- IOpcUaBrowseService / OpcUaBrowseService: parameter renamed accordingly.
- OpcUaBrowserDialog: collapse the duplicate ConnectionName parameters
(display label and routing key are the same string).
- Task 10: DataConnectionManagerActor forwards BrowseOpcUaNodeCommand to
its child by name (owns ConnectionNotFound); DataConnectionActor adds
the receive across all three lifecycle states (Connecting / Connected
/ Reconnecting) and maps adapter outcomes to BrowseFailureKind
(NotBrowsable / ConnectionNotConnected / Timeout / ServerError).
- Task 17: SetFailure in OpcUaBrowserDialog implements the full
BrowseFailureKind switch with friendly UI messages.
- Tests: DataConnectionManagerBrowseHandlerTests covers ConnectionNotFound,
NotBrowsable, success, and ConnectionNotConnectedException paths.
Five phases, PR-shippable per phase: schema/contracts, DCL browse capability,
flattening uses override, Central UI popup + integration, docs. Per-task
classification, time estimates, and parallelism declared.
Per-instance address override + live ClusterClient-based browse via a new
IBrowsableDataConnection capability on RealOpcUaClient. Lazy-loaded tree
with manual-paste fallback; offline-safe.
Test instances persistently emit Notify.To("Engineering Alerts").Send;
without the list at central a fresh cutover parks every notification
(observed 42k+ parked in a 3.5-min S&F drain after the rename).
Mirror the seed across docker/seed-sites.sh and docker-env2/seed-sites.sh.
A fresh ScadaBridgeConfig has only the Admin LdapGroupMappings row
(InitialSchema migration ships one row, SecurityConfiguration.HasData
declares four). docker-env2/seed-sites.sh already inserts the missing
three idempotently; docker/seed-sites.sh did not, so multi-role got
Admin only on a primary cutover. Mirror the env2 insert block.
Renames ScadaLink* databases + scadalink_app login on an existing
running scadabridge-mssql container. For users preserving seeded
test data through the rename. Fresh deployments use the already-
updated infra/mssql/setup.sql directly.
Decisions: full prefix in csproj names + namespaces, full runtime
artifact rename (containers/network/DBs), staged commits on main,
in-place MS SQL DB rename, wipe site SQLite on cutover.
Closes the last open code-review finding. The unreachable
IngestCachedTelemetryAsync path now carries production cached-call
lifecycle traffic, delivering the design's "AuditLog + SiteCalls in one
MS SQL transaction" guarantee. Before this commit, the SiteCalls
operational half had NO production transport at all — central's
SiteCallAuditActor.OnUpsertAsync had zero producers, so cached-call
operational state never reached the central mirror.
Site-side partition (so neither path double-emits):
- ISiteAuditQueue.ReadPendingCachedTelemetryAsync — new method returning
rows where Kind ∈ {CachedSubmit, ApiCallCached, DbWriteCached,
CachedResolve} AND ForwardState = Pending.
- ISiteAuditQueue.ReadPendingAsync — XML doc updated, SQLite impl now
filters Kind NOT IN the cached set so cached rows no longer ride the
audit-only drain.
New cached-drain in SiteAuditTelemetryActor:
- Optional IOperationTrackingStore? ctor param (null on central
composition roots — the cached scheduler is never armed there).
- Independent CachedDrain message + scheduler tick parallel to the
existing Drain — a stall on one path can't block the other; shared
lifecycle CTS gates both.
- OnCachedDrainAsync: reads cached audit rows, joins each with its
matching SiteCallOperational snapshot via CorrelationId →
TrackedOperationId from the tracking store, builds CachedTelemetryBatch,
pushes via IngestCachedTelemetryAsync, marks ack'd rows Forwarded.
- Orphan rows (no tracking snapshot, thrown tracking-store call,
missing CorrelationId) logged at Warning + skipped — they stay
Pending so reconciliation/retry picks them up later. Best-effort
contract preserved.
Central side: AuditLogIngestActor.OnCachedTelemetryAsync was already
implemented (M3 Bundle G dead code today, alive after this commit) —
performs InsertIfNotExists for AuditLog + UpsertAsync for SiteCalls
inside a BeginTransactionAsync. The handler is idempotent on EventId,
so any duplicate arrivals from concurrent push + reconciliation are
silent no-ops.
Composition root: AkkaHostedService now resolves IOperationTrackingStore
via GetService<>() (site-only) and threads it through the actor's
Props.Create.
Tests added (+3 in SiteAuditTelemetryActorTests):
- Cached rows route through the new transport, not the audit-only drain.
- Orphan cached row (no tracking match) is logged + skipped, drain
doesn't crash.
- Ordinary audit rows still flow through the audit-only drain unchanged.
- ParentExecutionIdCorrelationTests now unions both queues to assert
all expected Kinds remain covered after the partition.
Build clean; AuditLog.Tests 250/251 (the 1 fail is the pre-existing
date-sensitive PartitionPurgeTests integration flake explicitly accepted
across the session); SiteRuntime.Tests 302/302.
README regenerated: 0 pending of 481 total.
Session-final totals: 136 of 136 originally-open Theme findings closed
across 11 commits (10 themed batches + this architectural close).
Final themed batch. 5 well-localised correctness fixes.
Serialisation precision:
- ESG-020: DatabaseGateway.JsonElementToParameterValue probes
TryGetInt64 → TryGetDecimal → GetDouble, so a script's high-precision
decimal SQL parameter survives the cached-write retry round-trip
without silent precision loss. 3 new regression tests.
Template engine correctness:
- TE-018: DiffService gains ComputeConnectionsDiff over
FlattenedConfiguration.Connections, mirroring the existing entity-diff
shape and pairing with the Theme 1 TE-017 hash-coverage fix. A
ConfigurationDiff record extension in Commons is flagged as a follow-up.
- TE-019: TemplateResolver.BuildInheritanceChain now walks via the
int? ParentTemplateId directly — only null means "no parent". A real
Id of 0 (the prior special-cased sentinel) now walks the chain like
any other node, matching the TemplateEngine-013 CycleDetector fix.
Regression of TE-013 closed.
- TE-020: All 5 Create* paths in TemplateService + SharedScriptService
re-ordered to save-first → log-with-real-Id → save-audit (matching
the InstanceService pattern). Create* audit rows no longer carry a
literal "0" EntityId.
Doc deferral:
- Transport-012: Component-Transport.md §Audit Trail now spells out that
the BundleImportId repository filter IS wired (in CentralUiRepository),
but the Audit-Log-Viewer UI dropdown + summary-row hyperlink are a
deferred CentralUI follow-up. CLI workaround documented
(audit query --bundle-import-id).
11+ new regression tests (3 ESG, 4 DiffService, 3 TemplateResolver, 4
TemplateService, 1 SharedScriptService). Build clean; ESG 72/72,
TemplateEngine 324/324. README regenerated: 1 pending of 481 total.
Session-to-date: 135 of 136 originally-open Theme findings closed
across 10 themes in 10 commits.
The largest themed batch — small mechanical fixes across 11 modules.
API / message hygiene:
- Comm-020: SiteAddressCacheLoaded now carries IReadOnlyDictionary /
IReadOnlyList — Akka messages must be immutable.
- Commons-016: BundleSession.MaxUnlockAttempts named constant replaces
magic 3.
- Commons-018: IOperationTrackingStore + IPartitionMaintenance moved from
Interfaces/ root to Interfaces/Services/ (namespace preserved — 9
consumers exceeded the in-prompt move threshold).
- Commons-023: TrackingStatusSnapshot.SourceNode now consistent with the
trailing-optional-with-default pattern used elsewhere.
- SR-022: AuditingDbCommand.DbConnection.set no longer uses reflection —
exposes AuditingDbConnection.Inner via internal API surface.
Dead code / config cleanup:
- ClusterInfra-011: decorative SectionName constant deleted.
- ClusterInfra-014: dead AddClusterInfrastructureActors method + its
"throws-when-called" test deleted.
- Host-021: Microsoft Logging:LogLevel block deleted from appsettings.json
(dead under Serilog).
Fail-loud over fail-silent:
- DM-021: ResolveSiteIdentifierAsync throws on missing site (was silently
substituting a DB id).
- DM-022: dropped transient Pending write — record now lands directly in
InProgress (no UI flicker, one fewer DB write).
- Host-020: LoggerConfigurationFactory emits a Console.Error warning when
both Serilog:MinimumLevel and ScadaLink:Logging:MinimumLevel are set
(ScadaLink remains truth per Host-011).
- SnF-022: NotifyCachedCallObserverAsync logs Warning on unparseable
TrackedOperationId (was silently dropping).
- SnF-023: empty siteId default replaced with $unknown-site sentinel
+ constructor normalisation.
Correctness:
- SCA-001: SupervisorStrategy XML rewritten to match actual
DefaultDecider/Restart semantics (was claiming Resume).
- SCA-003: OnUpsertAsync now restamps IngestedAtUtc on every upsert.
- SR-021: HandleDeployArtifacts now dispatches an internal
ApplyArtifactDataConnectionsToDcl message after the SQLite write so
system-wide artifact-deploy data-connection changes go live
immediately (was requiring a site restart).
- SnF-020: RetryParkedMessageAsync captures the parked row BEFORE the
local write so a concurrent delete can't skip standby replication.
Sentinels / naming collisions:
- HM-021: CentralSiteId changed from "central" to "$central"
(uncollideable — leading $ is forbidden in real SiteIdentifiers).
Doc / surface cleanups:
- SEL-018: FailedWriteCount promoted to ISiteEventLogger; XML softened
to "Available for future Health Monitoring integration".
- SnF-019: VERIFY outcome — documented parking-after-DefaultMaxRetries
in Component-StoreAndForward.md + DefaultMaxRetries XML (uniform
cap; maxRetries:0 is the unbounded escape hatch).
- SnF-021: Component-StoreAndForward.md no longer claims the tracking
table lives in SnF — it's in SiteRuntime, the interface is in Commons.
- CLI-020: bundle export response parse guarded with try/catch on
JsonException / KeyNotFoundException / FormatException — emits a
clean INVALID_RESPONSE exit instead of a stack trace.
Config:
- ClusterInfra-013: intent comment added to "catastrophic config" test.
- Host-016: appsettings.Site.json second CentralContactPoints entry
removed (was pointing at the SITE's own port); doc-key explains how
to extend.
- Host-018: NodeName added to both shipped per-role configs (was
causing SourceNode to be null on audit rows).
UI:
- CentralUI-029: replaced JS.InvokeAsync<int>("eval", …) with an ES
module import (new wwwroot/js/browser-time.js).
- CentralUI-032: AuditResultsGrid gains a Previous button backed by a
cursor stack.
10+ new regression tests across the affected projects. Build clean;
all suites green. README regenerated: 6 open (was 33).
Session-to-date: 130 of 136 originally-open Theme findings closed.