Files
ScadaBridge/docs/plans/2026-06-18-m8-transport-design.md
T

20 KiB
Raw Blame History

Design: M8 — Transport (T18, T20)

Date: 2026-06-18 Status: Approved (brainstorming session) — ready for implementation planning Milestone: M8 of docs/plans/2026-06-15-stillpending-completion-design.md (Phase 2 — Expand) Component: #24 Transport (extends; component count stays 26) Source items: stillpending.md Transport (#24): T18 (site/instance-scoped artifact transport + name-mapping), T20 (per-line/Myers diff for Modified artifacts); plus Tier-2 defect #16 (stale-instance enumeration stub).

Goal

Extend the file-based, encrypted Transport bundle subsystem — today central-config-only (templates, shared scripts, external systems, central DB connections, notification lists, SMTP configs, API methods) — to also promote site-scoped and instance-scoped artifacts across environments, via a name-mapping subsystem that reconciles source-environment site/connection identities to the target environment. Replace the coarse line-count delta in the Modified-artifact diff with a real per-line Myers diff. Make the stale-instance enumeration real.

Scope (locked decisions)

  • D-Scope — Full M8: T18 + T20 + fix #16.
  • D1 (T18 granularity)Sites + individual instances. An operator can select a whole Site (pulling its DataConnections + all its Instances) and/or cherry-pick individual Instances. Instances carry their InstanceAttributeOverrides, InstanceAlarmOverrides, InstanceNativeAlarmSourceOverrides, InstanceConnectionBindings, and optional Area membership (by name).
  • D2 (name-mapping UX)Interactive wizard step + CLI flags. Auto-match source→target by SiteIdentifier / connection Name; operator overrides to an existing target or "create new". Unmapped references become blocker rows. CLI mirrors via --map-site src=dst and --map-connection site/src=dst.
  • D3 (environment-specific data)Carry full config (encrypted secrets). Site NodeA/B + GrpcNodeA/B addresses travel; new DataConnection PrimaryConfiguration/BackupConfiguration (endpoint + credentials) travel inside the encrypted SecretsBlock; operator overwrites target values post-import as needed.

Out of scope (deferred to their own brainstorm, per completion design)

  • T19 — direct cluster-to-cluster pull; asymmetric bundle signing; differential/incremental bundles.

Background — current Transport surface (as audited)

  • Content container: BundleContentDto (Transport/Serialization/EntityDtos.cs): TemplateFolders, Templates, SharedScripts, ExternalSystems, DatabaseConnections, NotificationLists, SmtpConfigs, ApiMethods (+ legacy read-only ApiKeys). Persistence-shaped twin EntityAggregate.
  • Selection: ExportSelection (Commons/Types/Transport) carries ID lists per type + IncludeDependencies. Across the wire (CLI/ManagementActor) selection is name-based; resolved to IDs in ManagementActor.
  • Flow: BundleExporter.ExportAsyncDependencyResolver.ResolveAsyncEntitySerializer.ToBundleContentManifestBuilder.BuildBundleSerializer.Pack. Import: BundleImporter.LoadAsync (envelope/zip-bomb guards, manifest+hash validation, decrypt) → PreviewAsync (per-type ArtifactDiff.Compare* + DetectBlockersAsync) → ApplyAsync (single EF transaction; per-type Apply*Async; intermediate flush; second-pass rewire ResolveAlarmScriptLinksAsync/ResolveCompositionEdgesAsync; BundleImported audit; commit).
  • Conflict model: ImportPreviewItem(EntityType, Name, ExistingVersion?, IncomingVersion?, ConflictKind{Identical,Modified,New,Blocker}, FieldDiffJson?, BlockerReason?); ImportResolution(EntityType, Name, ResolutionAction{Add,Overwrite,Skip,Rename}, RenameTo?); ImportResult(BundleImportId, Added, Overwritten, Skipped, Renamed, StaleInstanceIds, AuditEventCorrelation, ApiKeysIgnored).
  • Diff: ArtifactDiff emits coarse field changes + <N lines> markers for code (AddCodeChangeIfDifferent). Per-line diff explicitly deferred (T20).
  • #16: ImportResult.StaleInstanceIds = Array.Empty<int>() (BundleImporter.cs ~L736) — known stub.
  • Plumbing: TransportCommands.cs (ExportBundleCommand/PreviewBundleCommand/ImportBundleCommand, base64 envelope); ManagementActor handlers ~L23012468; CLI BundleCommands.cs.

Site-scoped entity model (as audited)

  • Site (Commons/Entities/Sites/Site.cs): Id, Name, SiteIdentifier (unique routing key), Description?, NodeAAddress?, NodeBAddress?, GrpcNodeAAddress?, GrpcNodeBAddress?. Repo ISiteRepository (GetSiteByIdentifierAsync, GetAllSitesAsync, GetDataConnectionsBySiteIdAsync, …).
  • DataConnection (Commons/Entities/Sites/DataConnection.cs): Id, SiteId (FK), Name (unique within site), Protocol, PrimaryConfiguration? (JSON: endpoint+creds), BackupConfiguration?, FailoverRetryCount. Repo ISiteRepository.
  • Instance (Commons/Entities/Instances/Instance.cs): Id, TemplateId (FK), SiteId (FK), AreaId?, UniqueName (globally unique), State{NotDeployed,Enabled,Disabled}, + collections AttributeOverrides, AlarmOverrides, ConnectionBindings, NativeAlarmSourceOverrides. Repos ITemplateEngineRepository (CRUD) + IDeploymentManagerRepository.
  • Overrides/bindings:
    • InstanceAttributeOverride(Id, InstanceId, AttributeName, OverrideValue?, ElementDataType?).
    • InstanceAlarmOverride(Id, InstanceId, AlarmCanonicalName, TriggerConfigurationOverride?, PriorityLevelOverride?).
    • InstanceNativeAlarmSourceOverride(Id, InstanceId, SourceCanonicalName, ConnectionNameOverride?, SourceReferenceOverride?, ConditionFilterOverride?)name-based connection ref.
    • InstanceConnectionBinding(Id, InstanceId, AttributeName, DataConnectionId, DataSourceReferenceOverride?)numeric FK connection ref.
  • Staleness: DeploymentService.GetDeploymentComparisonAsync compares DeployedConfigSnapshot.RevisionHash to the freshly-flattened hash. Deployments page surfaces drift; redeploy is the standard flow (Transport never bypasses it).
  • Audited handlers: ManagementActor HandleCreate/Update/DeleteSite, …DataConnection, HandleCreateInstance (InstanceService), HandleSetConnectionBindings, HandleSetInstanceOverrides. Site-scope enforced via EnforceSiteScope / EnforceSiteScopeForInstance.

Architecture

A. Bundle format extension (additive)

Extend the existing format — no second bundle type (rejected: duplicates machinery, splits UX).

  • BundleContentDto gains additive arrays:
    • Sites: IReadOnlyList<SiteDto>SiteDto(SiteIdentifier, Name, Description?, NodeAAddress?, NodeBAddress?, GrpcNodeAAddress?, GrpcNodeBAddress?).
    • DataConnections: IReadOnlyList<DataConnectionDto>DataConnectionDto(SiteIdentifier, Name, Protocol, int FailoverRetryCount, SecretsBlock? Secrets). PrimaryConfiguration/BackupConfiguration (endpoint+creds) ride Secrets (presence-only diff, encrypted-at-rest in the bundle). Named to avoid collision with the existing DatabaseConnectionDto (External-System DB defs).
    • Instances: IReadOnlyList<InstanceDto>InstanceDto(UniqueName, TemplateName, SiteIdentifier, AreaName?, InstanceState State, AttributeOverrides[], AlarmOverrides[], NativeAlarmSourceOverrides[], ConnectionBindings[]), with child DTOs:
      • InstanceAttributeOverrideDto(AttributeName, OverrideValue?, DataType? ElementDataType).
      • InstanceAlarmOverrideDto(AlarmCanonicalName, TriggerConfigurationOverride?, PriorityLevelOverride?).
      • InstanceNativeAlarmSourceOverrideDto(SourceCanonicalName, ConnectionNameOverride?, SourceReferenceOverride?, ConditionFilterOverride?).
      • InstanceConnectionBindingDto(AttributeName, ConnectionName, DataSourceReferenceOverride?) — note: ConnectionName (resolved from the source FK at export) replaces the numeric FK, so it is portable.
  • BundleSummary + manifest contents[] extended with Sites, DataConnections, Instances (counts + dependsOn edges, e.g. Instance:Tank01 dependsOn ["Template:Tank","Site:plant-a","DataConnection:plant-a/opc-main"]).
  • Versioning: bundleFormatVersion stays 1; schemaVersion 1.0 → 1.1 (additive-only; older importers accept minor increments and list the new entity entries as "skipped — unsupported in this version" per the existing forward-compat rule).

B. Name-mapping subsystem (core of T18)

New Commons type:

BundleNameMap(
    IReadOnlyList<SiteMapping> Sites,
    IReadOnlyList<ConnectionMapping> Connections)

SiteMapping(string SourceSiteIdentifier, MappingAction Action, string? TargetSiteIdentifier)
ConnectionMapping(string SourceSiteIdentifier, string SourceConnectionName, MappingAction Action, string? TargetConnectionName)

enum MappingAction { MapToExisting, CreateNew }
  • Detection (PreviewAsync): after deserialize, collect the distinct source Sites and DataConnections referenced by the bundle's instances (via ConnectionBindings.ConnectionName, NativeAlarmSourceOverrides.ConnectionNameOverride, and Instance.SiteIdentifier) and the sites/connections carried in the bundle directly. Return them as RequiredSiteMappings / RequiredConnectionMappings on the preview, each pre-populated with an auto-match: MapToExisting when a target SiteIdentifier (or (site,name) connection) already exists; otherwise CreateNew.
  • Wizard Map step: renders the required mappings as a table; per row a dropdown { <existing target> | Create new }. Connections are grouped under their (mapped) target site.
  • CLI: --map-site srcIdentifier=dstIdentifier (repeatable), --map-connection srcSiteIdentifier/srcName=dstName (repeatable); a --create-missing-sites/--create-missing-connections convenience defaults unmatched rows to CreateNew (otherwise unmatched → blocker, preserving fail-safe behaviour for automation).
  • Blockers: any referenced site/connection that resolves to neither a mapping target nor an in-bundle artifact → blocker row (BundleImportConnectionUnresolved / unresolved-site reason). Apply disabled until resolved.

C. Identity vs conflict (two orthogonal axes)

  1. Identity resolution — name-map for Site/DataConnection; UniqueName for Instance; Name for all existing central types. This decides which target row an artifact corresponds to.
  2. Conflict resolutionResolutionAction{Add,Overwrite,Skip,Rename} then applies per artifact exactly as today:
    • MapToExisting + Skip → leave target untouched.
    • MapToExisting + Overwrite → apply bundle config to the existing target.
    • CreateNewAdd.
    • Rename for instances uses a new UniqueName; for sites a new SiteIdentifier (rare).

D. Export flow

  • ExportSelection gains SiteIds: IReadOnlyList<int> + InstanceIds: IReadOnlyList<int>.
  • DependencyResolver expansion (when IncludeDependencies):
    • selected Site → its DataConnections + all its Instances;
    • selected Instance → its Site, its DataConnections (referenced by bindings + native-alarm overrides), and its Template (existing template-closure expansion);
    • InstanceTemplate → existing shared-script / external-system expansion already covers transitive deps.
  • EntitySerializer.ToBundleContent resolves each instance's TemplateId→name, SiteIdSiteIdentifier, AreaId→name, and each ConnectionBinding.DataConnectionId(siteIdentifier, connectionName); carries connection config into Secrets.
  • Export wizard Step 1 adds a Sites group (with nested instances) + a flat Instances selector; Step 2 dependency review shows the new edges. RequireDesign unchanged.

E. Import flow + wizard

  • Wizard: Upload → Passphrase → Map (new; shown only when the bundle contains site/instance-scoped artifacts) → Diff & resolve → Confirm → Result.
  • IBundleImporter.PreviewAsync returns the diff plus required mappings. ApplyAsync(sessionId, resolutions, nameMap, user, ct) (signature extended with nameMap) runs in the existing single EF transaction:
    1. Resolve/create target Sites per the site-map (CreateNew → full SiteDto config; MapToExisting → use target, honour conflict action).
    2. Resolve/create target DataConnections per the connection-map (config from Secrets).
    3. Existing central-config apply (ApplyTemplatesAsync, …) — unchanged.
    4. Upsert Instances: resolve TemplateName→target template id (blocker if unresolved), SiteIdentifier→target site id (from map), AreaName→area (create-if-missing within target site), State forced to NotDeployed on import; write overrides; write ConnectionBindings with ConnectionName→target DataConnectionId (from map) in the second-pass rewire; rewrite NativeAlarmSourceOverride.ConnectionNameOverride to the mapped target name.
    5. Semantic validation (existing two-tier) — unchanged.
    6. BundleImported audit + BundleImportId correlation (existing) — now also covers the per-entity Site/DataConnection/Instance rows because we route through the audited repository methods.
    7. Commit.
  • Reuses audited repository paths so per-entity audit rows + BundleImportId correlation flow automatically; site-scope (EnforceSiteScope) honoured for each target site at import. RequireAdmin unchanged.
  • Blockers: instance whose TemplateName resolves in neither bundle nor target; unresolved connection/site references (see B).

F. T20 — per-line Myers diff

  • New pure helper LineDiffer (Transport project or Commons/Types/Transport) implementing Myers O(ND) LCS over \n-split lines (custom, no third-party library — matches the custom-KpiTrendChart precedent; central package management discourages new deps).
  • Output: structured hunks of { Op: Context|Add|Remove, Line, OldLineNo?, NewLineNo? }, size-capped by TransportOptions (MaxDiffLines default ~400, MaxDiffHunks); beyond the cap, emit a truncated:true marker + a final <+N / -M more lines> summary so very large scripts don't bloat FieldDiffJson.
  • ArtifactDiff.AddCodeChangeIfDifferent (and the script-children path) replace the <N lines> marker with the hunk diff embedded in FieldDiffJson for TemplateScript.Code, SharedScript.Code, ApiMethod.Script (and ExternalSystemMethod body if present). Non-code fields keep coarse value diffs.
  • Import wizard Modified rows render the +/- line view (fulfils the spec's Step-3 "shows +/-/~ line diff" promise that the implementation never met). bUnit + Playwright coverage for the rendered diff.

G. #16 — real stale-instance enumeration

  • ImportResult.StaleInstanceIds becomes real. Inside ApplyAsync, before commit: for every Template Overwritten by the import, enumerate existing deployed target instances of that template and compute hash drift (reuse the flatten + DeployedConfigSnapshot.RevisionHash comparison the Deployments page already uses; called inside the open transaction against staged changes, or post-flush pre-commit). Return the drifting instance ids.
  • Directly-imported instances land NotDeployed → surfaced as new in the result (not "stale"; stale = deployed-and-drifted).
  • Step 4 "N instances will become stale" shows the real count; Step 5's existing Deployments deep-link gets real ids. UI already wired — this fills it with data.

Error handling (additions to the existing table)

Where Failure Surfaced as
Map Source site referenced, no auto-match, not mapped, CreateNew not chosen Blocker row: "Site '{id}' not mapped to a target"
Map Connection referenced by an instance binding, unmapped & not in bundle Blocker row (BundleImportConnectionUnresolved)
Apply Instance's TemplateName resolves nowhere Blocker row before Apply enabled
Apply Create-new site/connection but config absent (unencrypted bundle stripped) Created with available fields; warning event; operator completes config post-import

All-or-nothing per bundle (unchanged): the EF transaction wraps site/connection/instance writes + audit; rollback removes everything.

Security

  • DataConnection endpoints + credentials ride the encrypted SecretsBlock; unencrypted export still trips UnencryptedBundleExport. Diffs are presence-only for secrets (never echo endpoints/creds).
  • Import remains RequireAdmin; site-scope enforced per target site.
  • No new HTTP channel; bundle bytes ride the existing base64 /management envelope (already raised to 200 MB).

Authorization (unchanged)

Operation Role
Export page / bundle export (incl. sites/instances) RequireDesign
Import page / bundle preview+import (incl. sites/instances) RequireAdmin

Audit

  • Reuse existing per-entity audited repository methods for Site/DataConnection/Instance create/update so rows carry BundleImportId.
  • New warning events (mirroring existing BundleImportAlarmScriptUnresolved / …CompositionUnresolved): BundleImportConnectionUnresolved, BundleImportSiteUnresolved (when a non-blocking soft-reference can't bind).

Testing strategy

  • Unit: LineDiffer (Myers correctness + size-cap + edge cases: empty, identical, all-add, all-remove, CRLF); BundleNameMap auto-match + CLI flag parsing; EntitySerializer round-trip for SiteDto/DataConnectionDto/InstanceDto (incl. connection-name resolution); ArtifactDiff Site/Connection/Instance compare.
  • Integration: export a site (+connections+instances) → import to a fresh target with a name-map (create-new) and to a populated target (map-to-existing + Overwrite/Skip); FK remapping of ConnectionBinding/NativeAlarmSourceOverride; blocker on unresolved template/connection; #16 stale enumeration (overwrite a template that has deployed instances → assert returned ids drift); rollback leaves nothing partial.
  • bUnit: Map step rendering + auto-match; Modified row +/- diff render.
  • Playwright: export-with-site → import-with-mapping happy path; diff line view.
  • Targeted builds/tests per task; full-solution build + bash docker/deploy.sh rebuild + live smoke at integration.

Forward compatibility

  • Older importers (schemaVersion 1.0) reading a 1.1 bundle: unknown Site/DataConnection/Instance entries listed as "skipped — unsupported in this version" (existing rule). Central-config artifacts still import.
  • bundleFormatVersion unchanged (1) — no hard refusal.

Execution plan (waves — for writing-plans)

  • Wave A (format + name-map foundation): Commons DTOs/types (SiteDto/DataConnectionDto/InstanceDto + children, BundleNameMap, extended ExportSelection/BundleSummary/ManifestContentEntry usage, extended ImportResult/preview required-mappings), LineDiffer pure helper. (high-risk: data contracts; standard: LineDiffer)
  • Wave B (export): DependencyResolver site/instance expansion; EntitySerializer site/connection/instance mapping (FK→name); ManifestBuilder/summary counts; ExportSelection wiring + ManagementActor.HandleExportBundle name resolution; CLI bundle export site/instance flags. (high-risk)
  • Wave C (diff + preview): ArtifactDiff Myers integration (T20) + Site/Connection/Instance compare; BundleImporter.PreviewAsync required-mapping detection + blocker scan extension. (standard/high-risk)
  • Wave D (apply + name-map + #16): BundleImporter.ApplyAsync nameMap param + site/connection resolve-or-create + instance upsert + FK rewire; #16 stale enumeration; ManagementActor.HandleImportBundle/PreviewBundle plumbing; CLI bundle import --map-* flags. (high-risk; serial reviews)
  • Wave E (UI): Export wizard Sites/Instances selection; Import wizard Map step + Modified +/- diff render. (standard)
  • Integration: docs (Component-Transport.md, Component-CLI.md, README, stillpending.md, completion-design.md), full build, docker rebuild, Playwright, live smoke, end-to-end cross-cluster trace.

Open items / risks

  • The name-map detection must be exhaustive: every place a transported instance points at a connection (ConnectionBinding, NativeAlarmSourceOverride) and at a site must be covered, or an import silently drops a binding. Integration tests assert FK remap explicitly.
  • #16 hash-drift computation inside the open transaction must use the post-staged template state; verify it doesn't double-count directly-imported (NotDeployed) instances.
  • Area create-if-missing is the lightest reasonable choice; if areas grow fields later, revisit.

Next step

Hand off to writing-plans to produce the bite-sized, task-metadata'd implementation plan + .tasks.json, then execute subagent-driven in this worktree.