fix(docker-dev): self-bootstrap schema via one-shot migrator (fixes fresh-volume quirks)

Adds a 'migrator' Dockerfile stage + Compose service that runs 'dotnet ef
database update' once on bring-up, so a fresh SQL volume gets the schema with no
operator step (quirk 1). cluster-seed + every host node depend on it via
service_completed_successfully, so the seed never races an in-progress migration
(quirk 2). Host build pinned to target: runtime (the migrator is now the last
stage). entrypoint + README updated; the manual 'dotnet ef' first-time step is
gone. Verified: down -v + up --build self-bootstraps (migrator+seed exit 0,
6 nodes up), deploy Sealed 6/6.
This commit is contained in:
Joseph Doherty
2026-06-07 08:17:09 -04:00
parent 1f76eac97a
commit b0a62a9f3b
4 changed files with 61 additions and 45 deletions
+4 -23
View File
@@ -42,7 +42,7 @@ Site nodes serve no UI and authenticate no users. The central cluster manages an
All six host nodes write to the same `OtOpcUa` ConfigDb. The `ServerCluster` table differentiates the three logical clusters: each maps to one row, and each `ClusterNode` row's `ClusterId` ties the runtime node back to its owning cluster scope.
A one-shot `cluster-seed` Compose service (image `mcr.microsoft.com/mssql-tools`) waits for the `OtOpcUa` ConfigDb schema to exist (the host nodes do **not** auto-migrate — you apply EF migrations once; see [First-time setup](#first-time-setup-or-after-down--v)) and then INSERTs the rows below. The seed is **idempotent**`IF NOT EXISTS` guards every insert — so re-runs on `docker compose up` are no-ops:
Two one-shot Compose services bootstrap the DB on bring-up: `migrator` applies the EF Core migrations (so a fresh SQL volume gets the schema with no operator step — the host nodes deliberately do **not** auto-migrate, since production owns schema changes), then `cluster-seed` (image `mcr.microsoft.com/mssql-tools`) INSERTs the rows below. `cluster-seed` and every host node `depend_on` the `migrator` completing (`service_completed_successfully`), so the seed never races an in-progress migration. The seed is **idempotent**`IF NOT EXISTS` guards every insert — so re-runs on `docker compose up` are no-ops:
| Logical cluster | `ServerCluster.ClusterId` | `ClusterNode.NodeId` rows |
|---|---|---|
@@ -70,33 +70,14 @@ The DriverHost actor doesn't spawn drivers from raw DriverInstance rows on its o
# from the repo root
docker compose -f docker-dev/docker-compose.yml up -d --build
# wait ~20 seconds for SQL to come up + the mesh to form
# the one-shot migrator + cluster-seed bootstrap the DB; watch the seed finish:
docker compose -f docker-dev/docker-compose.yml logs -f cluster-seed # ^C once it prints "[cluster-seed] done."
open http://localhost:9200 # Admin UI (Traefik → central-1 or central-2)
open http://localhost:8089 # Traefik dashboard
```
The first build takes a few minutes (.NET SDK image + restore + publish). Subsequent rebuilds are faster with Docker's layer cache.
### First-time setup (or after `down -v`)
The host nodes do **not** auto-create the ConfigDb schema — on a brand-new SQL volume you must apply the EF migrations once, then (re)run the seed. (The auto-started `cluster-seed` polls for `dbo.ServerCluster`, which the *first* migration creates, so if it runs mid-migration it can fail against an intermediate schema — just re-run it after migrations finish.)
```bash
# 1. bring the stack up (SQL + nodes; nodes retry the DB until the schema exists)
docker compose -f docker-dev/docker-compose.yml up -d --build
# 2. create + migrate the OtOpcUa ConfigDb (one time; the design-time factory reads OTOPCUA_CONFIG_CONNECTION)
OTOPCUA_CONFIG_CONNECTION="Server=localhost,14330;Database=OtOpcUa;User Id=sa;Password=OtOpcUa!Dev123;TrustServerCertificate=True;" \
dotnet ef database update \
--project src/Core/ZB.MOM.WW.OtOpcUa.Configuration \
--startup-project src/Core/ZB.MOM.WW.OtOpcUa.Configuration
# 3. apply the cluster/namespace/driver seed against the now-complete schema (idempotent)
docker compose -f docker-dev/docker-compose.yml run --rm cluster-seed
```
After the schema + seed exist, a plain `docker compose ... up -d` is enough — the named SQL volume keeps both across restarts (only `down -v` wipes them, which is when you repeat the steps above).
The first build takes a few minutes (.NET SDK image + restore + publish). **No manual schema step is needed** — on a fresh SQL volume the one-shot `migrator` service applies the EF migrations (the host nodes deliberately don't auto-migrate, since production owns schema changes), then `cluster-seed` populates the cluster/namespace/driver rows. `cluster-seed` and the host nodes wait for the migrator via `service_completed_successfully`, so nothing races an in-progress migration. A plain `docker compose ... up -d` on an existing volume is a fast no-op for both — the named SQL volume keeps the schema + rows across restarts; only `down -v` wipes them, after which the next `up` re-migrates + re-seeds automatically.
## Auth (dev only)