diff --git a/docs/plans/2026-06-18-m8-transport-design.md b/docs/plans/2026-06-18-m8-transport-design.md new file mode 100644 index 00000000..92c83e69 --- /dev/null +++ b/docs/plans/2026-06-18-m8-transport-design.md @@ -0,0 +1,189 @@ +# Design: M8 — Transport (T18, T20) + +**Date:** 2026-06-18 +**Status:** Approved (brainstorming session) — ready for implementation planning +**Milestone:** M8 of `docs/plans/2026-06-15-stillpending-completion-design.md` (Phase 2 — Expand) +**Component:** #24 Transport (extends; component count stays 26) +**Source items:** `stillpending.md` Transport (#24): T18 (site/instance-scoped artifact transport + name-mapping), T20 (per-line/Myers diff for Modified artifacts); plus Tier-2 defect #16 (stale-instance enumeration stub). + +## Goal + +Extend the file-based, encrypted Transport bundle subsystem — today **central-config-only** (templates, shared scripts, external systems, central DB connections, notification lists, SMTP configs, API methods) — to also promote **site-scoped** and **instance-scoped** artifacts across environments, via a **name-mapping subsystem** that reconciles source-environment site/connection identities to the target environment. Replace the coarse line-count delta in the Modified-artifact diff with a real **per-line Myers diff**. Make the stale-instance enumeration real. + +## Scope (locked decisions) + +- **D-Scope** — Full M8: T18 + T20 + fix #16. +- **D1 (T18 granularity)** — **Sites + individual instances.** An operator can select a whole `Site` (pulling its `DataConnection`s + all its `Instance`s) and/or cherry-pick individual `Instance`s. Instances carry their `InstanceAttributeOverride`s, `InstanceAlarmOverride`s, `InstanceNativeAlarmSourceOverride`s, `InstanceConnectionBinding`s, and optional `Area` membership (by name). +- **D2 (name-mapping UX)** — **Interactive wizard step + CLI flags.** Auto-match source→target by `SiteIdentifier` / connection `Name`; operator overrides to an existing target or "create new". Unmapped references become blocker rows. CLI mirrors via `--map-site src=dst` and `--map-connection site/src=dst`. +- **D3 (environment-specific data)** — **Carry full config (encrypted secrets).** Site `NodeA/B` + `GrpcNodeA/B` addresses travel; new `DataConnection` `PrimaryConfiguration`/`BackupConfiguration` (endpoint + credentials) travel inside the encrypted `SecretsBlock`; operator overwrites target values post-import as needed. + +### Out of scope (deferred to their own brainstorm, per completion design) + +- **T19** — direct cluster-to-cluster pull; asymmetric bundle signing; differential/incremental bundles. + +## Background — current Transport surface (as audited) + +- **Content container:** `BundleContentDto` (`Transport/Serialization/EntityDtos.cs`): `TemplateFolders, Templates, SharedScripts, ExternalSystems, DatabaseConnections, NotificationLists, SmtpConfigs, ApiMethods` (+ legacy read-only `ApiKeys`). Persistence-shaped twin `EntityAggregate`. +- **Selection:** `ExportSelection` (Commons/Types/Transport) carries ID lists per type + `IncludeDependencies`. Across the wire (CLI/ManagementActor) selection is **name**-based; resolved to IDs in `ManagementActor`. +- **Flow:** `BundleExporter.ExportAsync` → `DependencyResolver.ResolveAsync` → `EntitySerializer.ToBundleContent` → `ManifestBuilder.Build` → `BundleSerializer.Pack`. Import: `BundleImporter.LoadAsync` (envelope/zip-bomb guards, manifest+hash validation, decrypt) → `PreviewAsync` (per-type `ArtifactDiff.Compare*` + `DetectBlockersAsync`) → `ApplyAsync` (single EF transaction; per-type `Apply*Async`; intermediate flush; second-pass rewire `ResolveAlarmScriptLinksAsync`/`ResolveCompositionEdgesAsync`; `BundleImported` audit; commit). +- **Conflict model:** `ImportPreviewItem(EntityType, Name, ExistingVersion?, IncomingVersion?, ConflictKind{Identical,Modified,New,Blocker}, FieldDiffJson?, BlockerReason?)`; `ImportResolution(EntityType, Name, ResolutionAction{Add,Overwrite,Skip,Rename}, RenameTo?)`; `ImportResult(BundleImportId, Added, Overwritten, Skipped, Renamed, StaleInstanceIds, AuditEventCorrelation, ApiKeysIgnored)`. +- **Diff:** `ArtifactDiff` emits coarse field changes + `` markers for code (`AddCodeChangeIfDifferent`). Per-line diff explicitly deferred (T20). +- **`#16`:** `ImportResult.StaleInstanceIds = Array.Empty()` (BundleImporter.cs ~L736) — known stub. +- **Plumbing:** `TransportCommands.cs` (`ExportBundleCommand`/`PreviewBundleCommand`/`ImportBundleCommand`, base64 envelope); `ManagementActor` handlers ~L2301–2468; CLI `BundleCommands.cs`. + +## Site-scoped entity model (as audited) + +- **`Site`** (`Commons/Entities/Sites/Site.cs`): `Id`, `Name`, `SiteIdentifier` (unique routing key), `Description?`, `NodeAAddress?`, `NodeBAddress?`, `GrpcNodeAAddress?`, `GrpcNodeBAddress?`. Repo `ISiteRepository` (`GetSiteByIdentifierAsync`, `GetAllSitesAsync`, `GetDataConnectionsBySiteIdAsync`, …). +- **`DataConnection`** (`Commons/Entities/Sites/DataConnection.cs`): `Id`, `SiteId` (FK), `Name` (unique within site), `Protocol`, `PrimaryConfiguration?` (JSON: endpoint+creds), `BackupConfiguration?`, `FailoverRetryCount`. Repo `ISiteRepository`. +- **`Instance`** (`Commons/Entities/Instances/Instance.cs`): `Id`, `TemplateId` (FK), `SiteId` (FK), `AreaId?`, `UniqueName` (globally unique), `State{NotDeployed,Enabled,Disabled}`, + collections `AttributeOverrides`, `AlarmOverrides`, `ConnectionBindings`, `NativeAlarmSourceOverrides`. Repos `ITemplateEngineRepository` (CRUD) + `IDeploymentManagerRepository`. +- **Overrides/bindings:** + - `InstanceAttributeOverride(Id, InstanceId, AttributeName, OverrideValue?, ElementDataType?)`. + - `InstanceAlarmOverride(Id, InstanceId, AlarmCanonicalName, TriggerConfigurationOverride?, PriorityLevelOverride?)`. + - `InstanceNativeAlarmSourceOverride(Id, InstanceId, SourceCanonicalName, ConnectionNameOverride?, SourceReferenceOverride?, ConditionFilterOverride?)` — **name-based** connection ref. + - `InstanceConnectionBinding(Id, InstanceId, AttributeName, DataConnectionId, DataSourceReferenceOverride?)` — **numeric FK** connection ref. +- **Staleness:** `DeploymentService.GetDeploymentComparisonAsync` compares `DeployedConfigSnapshot.RevisionHash` to the freshly-flattened hash. Deployments page surfaces drift; redeploy is the standard flow (Transport never bypasses it). +- **Audited handlers:** `ManagementActor` `HandleCreate/Update/DeleteSite`, `…DataConnection`, `HandleCreateInstance` (`InstanceService`), `HandleSetConnectionBindings`, `HandleSetInstanceOverrides`. Site-scope enforced via `EnforceSiteScope` / `EnforceSiteScopeForInstance`. + +## Architecture + +### A. Bundle format extension (additive) + +Extend the existing format — **no second bundle type** (rejected: duplicates machinery, splits UX). + +- `BundleContentDto` gains additive arrays: + - `Sites: IReadOnlyList` — `SiteDto(SiteIdentifier, Name, Description?, NodeAAddress?, NodeBAddress?, GrpcNodeAAddress?, GrpcNodeBAddress?)`. + - `DataConnections: IReadOnlyList` — `DataConnectionDto(SiteIdentifier, Name, Protocol, int FailoverRetryCount, SecretsBlock? Secrets)`. `PrimaryConfiguration`/`BackupConfiguration` (endpoint+creds) ride **`Secrets`** (presence-only diff, encrypted-at-rest in the bundle). Named to avoid collision with the existing `DatabaseConnectionDto` (External-System DB defs). + - `Instances: IReadOnlyList` — `InstanceDto(UniqueName, TemplateName, SiteIdentifier, AreaName?, InstanceState State, AttributeOverrides[], AlarmOverrides[], NativeAlarmSourceOverrides[], ConnectionBindings[])`, with child DTOs: + - `InstanceAttributeOverrideDto(AttributeName, OverrideValue?, DataType? ElementDataType)`. + - `InstanceAlarmOverrideDto(AlarmCanonicalName, TriggerConfigurationOverride?, PriorityLevelOverride?)`. + - `InstanceNativeAlarmSourceOverrideDto(SourceCanonicalName, ConnectionNameOverride?, SourceReferenceOverride?, ConditionFilterOverride?)`. + - `InstanceConnectionBindingDto(AttributeName, ConnectionName, DataSourceReferenceOverride?)` — note: **`ConnectionName`** (resolved from the source FK at export) replaces the numeric FK, so it is portable. +- `BundleSummary` + manifest `contents[]` extended with `Sites`, `DataConnections`, `Instances` (counts + `dependsOn` edges, e.g. `Instance:Tank01 dependsOn ["Template:Tank","Site:plant-a","DataConnection:plant-a/opc-main"]`). +- **Versioning:** `bundleFormatVersion` stays **1**; `schemaVersion` **1.0 → 1.1** (additive-only; older importers accept minor increments and list the new entity entries as "skipped — unsupported in this version" per the existing forward-compat rule). + +### B. Name-mapping subsystem (core of T18) + +New Commons type: + +``` +BundleNameMap( + IReadOnlyList Sites, + IReadOnlyList Connections) + +SiteMapping(string SourceSiteIdentifier, MappingAction Action, string? TargetSiteIdentifier) +ConnectionMapping(string SourceSiteIdentifier, string SourceConnectionName, MappingAction Action, string? TargetConnectionName) + +enum MappingAction { MapToExisting, CreateNew } +``` + +- **Detection (PreviewAsync):** after deserialize, collect the distinct source `Site`s and `DataConnection`s referenced by the bundle's instances (via `ConnectionBindings.ConnectionName`, `NativeAlarmSourceOverrides.ConnectionNameOverride`, and `Instance.SiteIdentifier`) **and** the sites/connections carried in the bundle directly. Return them as `RequiredSiteMappings` / `RequiredConnectionMappings` on the preview, each pre-populated with an auto-match: `MapToExisting` when a target `SiteIdentifier` (or `(site,name)` connection) already exists; otherwise `CreateNew`. +- **Wizard Map step:** renders the required mappings as a table; per row a dropdown `{ | Create new }`. Connections are grouped under their (mapped) target site. +- **CLI:** `--map-site srcIdentifier=dstIdentifier` (repeatable), `--map-connection srcSiteIdentifier/srcName=dstName` (repeatable); a `--create-missing-sites`/`--create-missing-connections` convenience defaults unmatched rows to `CreateNew` (otherwise unmatched → blocker, preserving fail-safe behaviour for automation). +- **Blockers:** any referenced site/connection that resolves to neither a mapping target nor an in-bundle artifact → blocker row (`BundleImportConnectionUnresolved` / unresolved-site reason). Apply disabled until resolved. + +### C. Identity vs conflict (two orthogonal axes) + +1. **Identity resolution** — name-map for `Site`/`DataConnection`; `UniqueName` for `Instance`; `Name` for all existing central types. This decides *which* target row an artifact corresponds to. +2. **Conflict resolution** — `ResolutionAction{Add,Overwrite,Skip,Rename}` then applies per artifact exactly as today: + - `MapToExisting` + `Skip` → leave target untouched. + - `MapToExisting` + `Overwrite` → apply bundle config to the existing target. + - `CreateNew` → `Add`. + - `Rename` for instances uses a new `UniqueName`; for sites a new `SiteIdentifier` (rare). + +### D. Export flow + +- `ExportSelection` gains `SiteIds: IReadOnlyList` + `InstanceIds: IReadOnlyList`. +- `DependencyResolver` expansion (when `IncludeDependencies`): + - selected **Site** → its `DataConnection`s + all its `Instance`s; + - selected **Instance** → its `Site`, its `DataConnection`s (referenced by bindings + native-alarm overrides), and its `Template` (existing template-closure expansion); + - `Instance` → `Template` → existing shared-script / external-system expansion already covers transitive deps. +- `EntitySerializer.ToBundleContent` resolves each instance's `TemplateId`→name, `SiteId`→`SiteIdentifier`, `AreaId`→name, and each `ConnectionBinding.DataConnectionId`→`(siteIdentifier, connectionName)`; carries connection config into `Secrets`. +- Export wizard Step 1 adds a **Sites** group (with nested instances) + a flat **Instances** selector; Step 2 dependency review shows the new edges. `RequireDesign` unchanged. + +### E. Import flow + wizard + +- Wizard: Upload → Passphrase → **Map** *(new; shown only when the bundle contains site/instance-scoped artifacts)* → Diff & resolve → Confirm → Result. +- `IBundleImporter.PreviewAsync` returns the diff **plus** required mappings. `ApplyAsync(sessionId, resolutions, nameMap, user, ct)` (signature extended with `nameMap`) runs in the existing single EF transaction: + 1. Resolve/create target `Site`s per the site-map (CreateNew → full `SiteDto` config; MapToExisting → use target, honour conflict action). + 2. Resolve/create target `DataConnection`s per the connection-map (config from `Secrets`). + 3. Existing central-config apply (`ApplyTemplatesAsync`, …) — unchanged. + 4. Upsert `Instance`s: resolve `TemplateName`→target template id (blocker if unresolved), `SiteIdentifier`→target site id (from map), `AreaName`→area (create-if-missing within target site), `State` forced to `NotDeployed` on import; write overrides; write `ConnectionBinding`s with `ConnectionName`→target `DataConnectionId` (from map) in the second-pass rewire; rewrite `NativeAlarmSourceOverride.ConnectionNameOverride` to the mapped target name. + 5. Semantic validation (existing two-tier) — unchanged. + 6. `BundleImported` audit + `BundleImportId` correlation (existing) — now also covers the per-entity Site/DataConnection/Instance rows because we route through the audited repository methods. + 7. Commit. +- Reuses audited repository paths so per-entity audit rows + `BundleImportId` correlation flow automatically; site-scope (`EnforceSiteScope`) honoured for each target site at import. `RequireAdmin` unchanged. +- **Blockers:** instance whose `TemplateName` resolves in neither bundle nor target; unresolved connection/site references (see B). + +### F. T20 — per-line Myers diff + +- New pure helper `LineDiffer` (Transport project or Commons/Types/Transport) implementing **Myers O(ND)** LCS over `\n`-split lines (custom, no third-party library — matches the custom-`KpiTrendChart` precedent; central package management discourages new deps). +- Output: structured hunks of `{ Op: Context|Add|Remove, Line, OldLineNo?, NewLineNo? }`, **size-capped** by `TransportOptions` (`MaxDiffLines` default ~400, `MaxDiffHunks`); beyond the cap, emit a `truncated:true` marker + a final `<+N / -M more lines>` summary so very large scripts don't bloat `FieldDiffJson`. +- `ArtifactDiff.AddCodeChangeIfDifferent` (and the script-children path) replace the `` marker with the hunk diff embedded in `FieldDiffJson` for `TemplateScript.Code`, `SharedScript.Code`, `ApiMethod.Script` (and `ExternalSystemMethod` body if present). Non-code fields keep coarse value diffs. +- Import wizard Modified rows render the `+/-` line view (fulfils the spec's Step-3 "shows `+/-/~` line diff" promise that the implementation never met). bUnit + Playwright coverage for the rendered diff. + +### G. #16 — real stale-instance enumeration + +- `ImportResult.StaleInstanceIds` becomes real. Inside `ApplyAsync`, before commit: for every **Template Overwritten** by the import, enumerate existing **deployed** target instances of that template and compute hash drift (reuse the flatten + `DeployedConfigSnapshot.RevisionHash` comparison the Deployments page already uses; called inside the open transaction against staged changes, or post-flush pre-commit). Return the drifting instance ids. +- Directly-imported instances land `NotDeployed` → surfaced as **new** in the result (not "stale"; stale = deployed-and-drifted). +- Step 4 "N instances will become stale" shows the real count; Step 5's existing Deployments deep-link gets real ids. UI already wired — this fills it with data. + +## Error handling (additions to the existing table) + +| Where | Failure | Surfaced as | +|---|---|---| +| Map | Source site referenced, no auto-match, not mapped, `CreateNew` not chosen | Blocker row: "Site '{id}' not mapped to a target" | +| Map | Connection referenced by an instance binding, unmapped & not in bundle | Blocker row (`BundleImportConnectionUnresolved`) | +| Apply | Instance's `TemplateName` resolves nowhere | Blocker row before Apply enabled | +| Apply | Create-new site/connection but config absent (unencrypted bundle stripped) | Created with available fields; warning event; operator completes config post-import | + +All-or-nothing per bundle (unchanged): the EF transaction wraps site/connection/instance writes + audit; rollback removes everything. + +## Security + +- DataConnection endpoints + credentials ride the **encrypted** `SecretsBlock`; unencrypted export still trips `UnencryptedBundleExport`. Diffs are presence-only for secrets (never echo endpoints/creds). +- Import remains `RequireAdmin`; site-scope enforced per target site. +- No new HTTP channel; bundle bytes ride the existing base64 `/management` envelope (already raised to 200 MB). + +## Authorization (unchanged) + +| Operation | Role | +|---|---| +| Export page / `bundle export` (incl. sites/instances) | `RequireDesign` | +| Import page / `bundle preview`+`import` (incl. sites/instances) | `RequireAdmin` | + +## Audit + +- Reuse existing per-entity audited repository methods for Site/DataConnection/Instance create/update so rows carry `BundleImportId`. +- New warning events (mirroring existing `BundleImportAlarmScriptUnresolved` / `…CompositionUnresolved`): `BundleImportConnectionUnresolved`, `BundleImportSiteUnresolved` (when a non-blocking soft-reference can't bind). + +## Testing strategy + +- **Unit:** `LineDiffer` (Myers correctness + size-cap + edge cases: empty, identical, all-add, all-remove, CRLF); `BundleNameMap` auto-match + CLI flag parsing; `EntitySerializer` round-trip for `SiteDto`/`DataConnectionDto`/`InstanceDto` (incl. connection-name resolution); `ArtifactDiff` Site/Connection/Instance compare. +- **Integration:** export a site (+connections+instances) → import to a fresh target with a name-map (create-new) and to a populated target (map-to-existing + Overwrite/Skip); FK remapping of `ConnectionBinding`/`NativeAlarmSourceOverride`; blocker on unresolved template/connection; `#16` stale enumeration (overwrite a template that has deployed instances → assert returned ids drift); rollback leaves nothing partial. +- **bUnit:** Map step rendering + auto-match; Modified row `+/-` diff render. +- **Playwright:** export-with-site → import-with-mapping happy path; diff line view. +- **Targeted builds/tests per task; full-solution build + `bash docker/deploy.sh` rebuild + live smoke at integration.** + +## Forward compatibility + +- Older importers (schemaVersion 1.0) reading a 1.1 bundle: unknown `Site`/`DataConnection`/`Instance` entries listed as "skipped — unsupported in this version" (existing rule). Central-config artifacts still import. +- `bundleFormatVersion` unchanged (1) — no hard refusal. + +## Execution plan (waves — for writing-plans) + +- **Wave A (format + name-map foundation):** Commons DTOs/types (`SiteDto`/`DataConnectionDto`/`InstanceDto` + children, `BundleNameMap`, extended `ExportSelection`/`BundleSummary`/`ManifestContentEntry` usage, extended `ImportResult`/preview required-mappings), `LineDiffer` pure helper. *(high-risk: data contracts; standard: LineDiffer)* +- **Wave B (export):** `DependencyResolver` site/instance expansion; `EntitySerializer` site/connection/instance mapping (FK→name); `ManifestBuilder`/summary counts; `ExportSelection` wiring + `ManagementActor.HandleExportBundle` name resolution; CLI `bundle export` site/instance flags. *(high-risk)* +- **Wave C (diff + preview):** `ArtifactDiff` Myers integration (T20) + Site/Connection/Instance compare; `BundleImporter.PreviewAsync` required-mapping detection + blocker scan extension. *(standard/high-risk)* +- **Wave D (apply + name-map + #16):** `BundleImporter.ApplyAsync` `nameMap` param + site/connection resolve-or-create + instance upsert + FK rewire; `#16` stale enumeration; `ManagementActor.HandleImportBundle`/`PreviewBundle` plumbing; CLI `bundle import` `--map-*` flags. *(high-risk; serial reviews)* +- **Wave E (UI):** Export wizard Sites/Instances selection; Import wizard Map step + Modified `+/-` diff render. *(standard)* +- **Integration:** docs (Component-Transport.md, Component-CLI.md, README, stillpending.md, completion-design.md), full build, docker rebuild, Playwright, live smoke, end-to-end cross-cluster trace. + +## Open items / risks + +- The name-map detection must be exhaustive: every place a transported instance points at a connection (`ConnectionBinding`, `NativeAlarmSourceOverride`) and at a site must be covered, or an import silently drops a binding. Integration tests assert FK remap explicitly. +- `#16` hash-drift computation inside the open transaction must use the post-staged template state; verify it doesn't double-count directly-imported (NotDeployed) instances. +- `Area` create-if-missing is the lightest reasonable choice; if areas grow fields later, revisit. + +## Next step + +Hand off to writing-plans to produce the bite-sized, task-metadata'd implementation plan + `.tasks.json`, then execute subagent-driven in this worktree.