# Plan — Working Session Status

**Saved:** 2026-04-24 (second session of the day)
**Previous session:** Opus 4.6 (1M context)
**Resume with:** start a new Claude Code session in this directory — `CLAUDE.md` and this file provide full context. No session ID needed; the plan is self-contained in the repo.

> This file is a **bookmark**, not a replacement for the plan. The authoritative content lives in `CLAUDE.md`, `current-state.md`, `goal-state.md`, `roadmap.md`, and the component detail files under `current-state/` and `outputs/`. Read this file only to find out where we left off.

## Where we are

The plan is **substantially complete**. All core documents are populated, architectural decisions are captured with rationale, the canonical model + UNS hierarchy standard are declared, the digital-twin scope is narrowed to two access-control patterns (both delivered by already-committed architecture), the OtOpcUa v2 implementation corrections (19 items + addendum) are integrated, and the first PPTX has been generated. The `schemas/` repo seed exists with the FANUC CNC pilot class and JSON Schema format definitions.

**What happened since the original session (2026-04-15 through 2026-04-24):**
- Integrated OtOpcUa v2 implementation corrections (19 corrections + hardening addendum: ACL model committed, stability tiers, multi-identifier equipment model, driver list confirmed, cutover ownership assigned outside OtOpcUa)
- Schemas repo seed contributed by OtOpcUa team at `schemas/` (temporary location)
- Enterprise shortname resolved to `zb`; Warsaw West buildings confirmed as 5 and 19
- Equipment protocol survey removed (driver list confirmed directly by v2 team)
- First PPTX generated (18 slides, mixed-stakeholder deck)
- 7 component diagrams created (OtOpcUa, Redpanda, SnowBridge, ScadaBridge dataflow + topology, Snowflake/dbt — dbt diagram is now stale; next regen will replace)
- ScadaBridge accuracy corrections from design repo review (email only, not Teams; EventHub not yet implemented)
- ScadaBridge topology corrected (no site-to-site routing; direct API access; inbound Web API as input)
- **Digital-twin scope finalized (2026-04-24).** Plan's digital-twin scope is definitively **two access-control patterns**: (1) environment-lifecycle promotion without reconfiguration (ACL flip on write authority against stable equipment UUIDs); (2) safe read-only consumption for KPI / monitoring systems (structurally guaranteed by single-connection-through-OtOpcUa). Both delivered by architecture already committed — no new component, no new workstream. The earlier management-conversation brief (`goal-state/digital-twin-management-brief.md`) and the `goal-state/` subdirectory have been removed; the plan uses only the two patterns above. Write-authority arbitration mechanism is out of scope for this plan (OtOpcUa team's concern). Physics simulation / FAT / commissioning emulation is not a plan item; if it ever materializes as a funded adjacent initiative, that will be a separate scoping conversation.
- **SnowBridge scope expanded; dbt workstream removed (2026-04-24).** SnowBridge now owns **ingest + in-process .NET transform + curated-table write**, collapsing the previous ingest/transform split. **No dbt, no external orchestrator, no Snowflake landing tier.** The "Snowflake dbt Transform Layer" roadmap workstream is removed; its Year 2 canonical-state-based OEE commitment moves into the SnowBridge workstream. Workstream count drops from **7 to 6**. The canonical model still has three surfaces — the third is renamed "SnowBridge curated layer in Snowflake" (was "dbt curated layer"); mechanics are identical. Rationale: keep the in-house .NET pattern consistent (ScadaBridge / OtOpcUa / SnowBridge), collapse two tools into one, drop the separate Python/SQL transform skillset. Trade-offs captured in `goal-state.md` → SnowBridge → Trade-offs.
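The first digital-twin pattern in the list above (environment-lifecycle promotion as an ACL flip on write authority against stable equipment UUIDs) and the second (safe read-only consumption) can be pictured with a minimal sketch. It is written in Python for brevity even though the plan's services are .NET, and every name in it (`EquipmentAcl`, `promote`, the environment and consumer labels) is hypothetical. The real ACL model and the write-authority arbitration mechanism belong to the OtOpcUa team and are out of scope for this plan.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class EquipmentAcl:
    """Hypothetical per-equipment ACL keyed by a stable equipment UUID."""
    equipment_id: uuid.UUID
    write_authority: str                            # the one client allowed to write
    readers: set[str] = field(default_factory=set)  # read-only consumers

    def can_write(self, client: str) -> bool:
        return client == self.write_authority

    def can_read(self, client: str) -> bool:
        return client == self.write_authority or client in self.readers

    def promote(self, new_authority: str) -> None:
        """Flip write authority; the equipment UUID is left untouched."""
        self.write_authority = new_authority


# Illustrative values only.
acl = EquipmentAcl(uuid.uuid4(), write_authority="staging", readers={"kpi-dashboard"})
id_before = acl.equipment_id
acl.promote("production")
```

After `promote`, the production client holds write authority, the staging client no longer does, the KPI consumer can still read but never write, and the equipment identity is the same object it was before the flip, which is what "promotion without reconfiguration" amounts to in this sketch.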

### Files

**Core plan content:**

- [`CLAUDE.md`](CLAUDE.md) — plan purpose, document index (now including the component detail files and outputs pipeline), markdown-first conventions, component breakout rules.
- [`current-state.md`](current-state.md) — snapshot of today's estate (enterprise layout, clusters, systems, integrations, equipment access patterns).
- [`goal-state.md`](goal-state.md) — target end-state with Vision, layered architecture, **Unified Namespace posture + naming hierarchy standard**, component designs (OtOpcUa, SnowBridge with ingest+transform+curated-layer ownership, Redpanda EventHub with **Canonical Equipment/Production/Event Model + canonical state vocabulary**, ScadaBridge extensions), success criteria, observability, Strategic Considerations (Digital twin — two access-control patterns; Power BI), and Non-Goals.
- [`roadmap.md`](roadmap.md) — 3-year workstreams × years grid with 6 workstreams and cross-workstream dependencies; Year 1 Redpanda and SnowBridge cells deliver canonical model v1.

**Component detail files:**

- [`current-state/legacy-integrations.md`](current-state/legacy-integrations.md) — authoritative inventory for pillar 3 retirement. **Closed as denominator = 3**: LEG-001 Delmia DNC, LEG-002 Camstar MES, LEG-003 custom email notification service. Historian MSSQL reporting surface explicitly carved out as *not* legacy.
- ~~`current-state/equipment-protocol-survey.md`~~ — **Removed.** Protocol survey no longer needed; the OtOpcUa v2 implementation team committed the 8-driver core library from internal knowledge. The UNS hierarchy snapshot (equipment-instance walk) is now a standalone Year 1 deliverable tracked separately.

**Output generation pipeline (specs only — no outputs generated yet):**

- [`outputs/README.md`](outputs/README.md) — trigger phrases (`regenerate outputs` / `regenerate presentation` / `regenerate longform`), regeneration procedure, edit-this-not-that rules.
- [`outputs/DESIGN.md`](outputs/DESIGN.md) — design for the generation pipeline.
- [`outputs/IMPLEMENTATION-PLAN.md`](outputs/IMPLEMENTATION-PLAN.md) — scaffolding plan (partially executed — specs written, generation not yet run).
- [`outputs/presentation-spec.md`](outputs/presentation-spec.md) — 18-slide mixed-stakeholder deck structure anchor.
- [`outputs/longform-spec.md`](outputs/longform-spec.md) — faithful-typeset PDF structure anchor.
- `outputs/diagrams/` and `outputs/generated/` — empty, waiting for first regeneration run.

### Major decisions captured (pointers, not restatements)

- **Vision theme:** *stable, single point of integration between shopfloor OT and enterprise IT* — used as the tiebreaker for ambiguous decisions.
- **Three in-scope pillars:** unification (100% of sites on standardized stack), analytics/AI enablement (≤15m analytics SLO, one "not possible before" use case in production), legacy middleware retirement (inventory to zero). Binary at end of plan.
- **UX split:** Ignition owns KPI UX long-term; Aveva System Platform HMI owns validated-data UX long-term. Not a primary goal of this plan.
- **IT↔OT boundary:** single crossing at ScadaBridge central. OT = machine data (System Platform, equipment OPC UA, OtOpcUa, ScadaBridge, Aveva Historian, Ignition). IT = enterprise apps (Camstar, Delmia, Snowflake, SnowBridge, Power BI/BOBJ).
- **Layered architecture:** Layer 1 Equipment → Layer 2 OtOpcUa → Layer 3 SCADA (System Platform + Ignition) → Layer 4 ScadaBridge → Enterprise IT.
- **OtOpcUa** (layer 2): custom-built, clustered, co-located on System Platform nodes, hybrid driver strategy (proactive core library + on-demand long-tail), OPC UA-native auth, **absorbs LmxOpcUa** as its System Platform namespace. Tiered cutover: ScadaBridge first, Ignition second, System Platform IO last. **Namespace architecture supports a future `simulated` namespace** for the pre-install case (dev work before equipment is on the floor) and as foundation for a possible future funded physics-simulation initiative — architecturally supported, not committed for build. **ACL model + single-connection-per-equipment also delivers the plan's two digital-twin patterns** (environment-lifecycle promotion via write-authority flip; safe read-only KPI / monitoring exposure) — see `goal-state.md` → OtOpcUa → Consumer access patterns enabled by the ACL model.
- **Redpanda EventHub:** self-hosted, central cluster in South Bend (single-cluster HA, VM-level DR out of scope), per-topic tiered retention (operational 7d / analytics 30d / compliance 90d), bundled Schema Registry, Protobuf via central `schemas` repo with `buf` CI, `BACKWARD_TRANSITIVE` compatibility, `TopicNameStrategy` subjects, `{domain}.{entity}.{event-type}` naming, site identity in message (not topic), SASL/OAUTHBEARER + prefix ACLs. Store-and-forward at ScadaBridge handles site resilience. **Analytics-tier retention is also a replay surface** for integration testing (and for a possible future funded physics-simulation initiative, should one materialize).
- **Canonical Equipment, Production, and Event Model:** the plan commits to declaring the composition of OtOpcUa equipment namespace + Redpanda canonical topics + `schemas` repo + SnowBridge curated layer in Snowflake as **the** canonical model. Three surfaces, one source of truth (`schemas` repo). Includes a **canonical machine state vocabulary** (`Running / Idle / Faulted / Starved / Blocked` + TBD additions like `Changeover`, `Maintenance`, `Setup`). Year 1 Redpanda and SnowBridge cells deliver v1. Load-bearing for pillar 2.
- **Unified Namespace (UNS) posture:** the canonical model above is also declared as the plan's UNS, framed for stakeholders using UNS vocabulary. **Deliberate deviations from classic MQTT/Sparkplug UNS:** Kafka instead of MQTT (for analytics/replay), flat `{domain}.{entity}.{event-type}` topics with site in message (for bounded topic count), stateless events instead of Sparkplug state machine. Optional future **UNS projection service** (MQTT/Sparkplug and/or enterprise OPC UA aggregator) is architecturally supported but not committed for build; decision trigger documented.
- **UNS naming hierarchy standard:** 5 levels always present — Enterprise → Site → Area → Line → Equipment, with `_default` placeholder where a level doesn't apply. Text form `zb.warsaw-west.bldg-5.line-2.cnc-mill-05` / OPC UA form `zb/warsaw-west/bldg-5/line-2/cnc-mill-05`. Stable **equipment UUIDv4** alongside the path (path is navigation, UUID is lineage). Authority lives in `schemas` repo; OtOpcUa / Redpanda / SnowBridge consume the authoritative definition.
- **SnowBridge:** custom-built machine-data-to-Snowflake upload service; Year 1 starting source is Aveva Historian SQL; UI + API with blast-radius-based approval workflow; selection state in internal datastore (not git).
- **Snowflake transform tooling:** none separate — **SnowBridge owns transformation in-process (.NET)**. No dbt, no Snowflake Dynamic Tables / Streams+Tasks, no external orchestrator (Airflow / Dagster / Prefect).
- **Aggregation boundary:** aggregation lives in **SnowBridge** (writing curated rows to Snowflake). ScadaBridge does deadband/exception-based filtering (global default ~1% of span) plus tag opt-in via SnowBridge — not source-side summarization.
- **Observability:** commit to signals (Redpanda, ScadaBridge, SnowBridge ingest + SnowBridge transforms + SnowBridge validation checks), tool is out of scope.
- **Digital-twin scope (finalized 2026-04-24):** the plan's digital-twin scope is definitively **two access-control patterns** — (1) environment-lifecycle promotion without reconfiguration (ACL flip on write authority against stable equipment UUIDs); (2) safe read-only consumption for KPI / monitoring systems (structurally guaranteed by single-connection-through-OtOpcUa). Both delivered by architecture already committed in the **OtOpcUa** and **Canonical Equipment, Production, and Event Model** subsections — no new component, no new workstream, no pillar dependency. Write-authority arbitration mechanism is out of scope (OtOpcUa team's concern). Physics simulation / FAT / commissioning emulation is not a plan item; any future funded adjacent initiative would be a separate scoping conversation.
- **Enterprise reporting coordination (BOBJ → Power BI migration, in-flight adjacent initiative):** three consumption paths analyzed (SnowBridge curated layer in Snowflake / Historian direct / both). Recommended position: **Path C with Path A as strategic direction** — most machine-data and cross-domain reports move to Snowflake over Years 2–3, compliance reports stay on Historian indefinitely. Conversation with reporting team still to be scheduled.
- **Output generation pipeline:** PPTX + PDF generation from plan markdown, repeatability anchored by spec files (`presentation-spec.md`, `longform-spec.md`) rather than prompts. Spec files written; diagrams and generation run deferred until the source plan is stable.
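The **Aggregation boundary** decision above relies on deadband (exception-based) filtering at ScadaBridge with a global default of roughly 1% of span. A minimal sketch of that idea follows, in Python for brevity. The class name, its parameters, and the choice to compare each sample against the last *forwarded* value (rather than the last observed one) are assumptions for illustration, not ScadaBridge's actual implementation.

```python
from typing import Optional


class DeadbandFilter:
    """Forward a sample only when it moves more than a fraction of the tag's span."""

    def __init__(self, span_low: float, span_high: float, fraction: float = 0.01):
        if span_high <= span_low:
            raise ValueError("span_high must be greater than span_low")
        # Absolute threshold in engineering units, e.g. 1% of a 0..200 span = 2.0
        self.threshold = (span_high - span_low) * fraction
        self.last_forwarded: Optional[float] = None

    def should_forward(self, value: float) -> bool:
        # The first sample always passes; later ones must exceed the deadband.
        if self.last_forwarded is None or abs(value - self.last_forwarded) > self.threshold:
            self.last_forwarded = value
            return True
        return False


# Illustrative tag with a 0..200 span, so the deadband is 2.0 units.
tag_filter = DeadbandFilter(span_low=0.0, span_high=200.0)
samples = [100.0, 100.5, 101.9, 102.5, 102.6, 99.0]
forwarded = [s for s in samples if tag_filter.should_forward(s)]
```

With these samples only `100.0`, `102.5`, and `99.0` pass the filter. Nothing is averaged or summarized at the source, which matches the decision that aggregation itself lives in SnowBridge.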

## Top pending items (from most recent status check)

All four items from the previous status check have been **advanced to the point where the next move is a real-world action** (management meeting, reporting-team conversation, field survey, or — for legacy — closed outright). The in-room plan work that could be done without external input has been done. The remaining open items are **external dependencies**, not plan-authoring gaps.

### External-dependency items — waiting on real-world action

1. **BOBJ → Power BI coordination with reporting team.** Plan position documented in `goal-state.md` → Strategic Considerations → **Enterprise reporting: BOBJ → Power BI migration (adjacent initiative)** — three consumption paths analyzed, recommended position stated (Path C with Path A as strategic direction), eight questions and a four-bucket decision rubric included. **Action needed:** schedule the coordination conversation with the reporting team; bring back a bucket assignment. Once a bucket is assigned, update `goal-state.md` → Enterprise reporting and, if the outcome is Bucket A or B, update `roadmap.md` → SnowBridge to include reporting-shaped curated tables.
2. **UNS hierarchy snapshot walk.** The protocol survey has been **removed** — the OtOpcUa v2 implementation team committed the core driver list (8 drivers) based on internal knowledge, making a formal protocol survey unnecessary for driver scoping. What remains is the **UNS hierarchy snapshot**: a per-site equipment-instance walk capturing site / area / line / equipment assignments and stable UUIDs, which feeds the initial `schemas` repo hierarchy definition and canonical model. See `goal-state.md` → **Unified Namespace (UNS) posture → UNS naming hierarchy standard**. **Action needed:** assign a walk owner; walk System Platform IO config, Ignition OPC UA connections, and ScadaBridge templates across integrated sites within Q1–Q2 of Year 1; capture equipment instances at site/area/line/equipment granularity (not protocol — that's already resolved). The canonical model v1 cannot be published without the initial hierarchy snapshot. **Sub-blocker (resolved):** the enterprise-level shortname was a placeholder (`ent`) and has since been resolved to `zb`, so it no longer blocks committing the initial hierarchy snapshot to the `schemas` repo.
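Each row captured by the hierarchy walk in item 2 has to conform to the UNS naming hierarchy standard summarized in this file: five levels always present, `_default` where a level does not apply, `[a-z0-9-]` segments, a dotted text form and a slashed OPC UA form, and a stable UUIDv4 alongside the path. A minimal Python sketch of such a row follows. The `EquipmentRef` type and its method names are hypothetical; the authoritative definition lives in the `schemas` repo.

```python
import re
import uuid
from dataclasses import dataclass, field

SEGMENT = re.compile(r"[a-z0-9-]+")   # charset stated in the naming rules
PLACEHOLDER = "_default"              # used where a level does not apply
LEVELS = ("enterprise", "site", "area", "line", "equipment")


@dataclass(frozen=True)
class EquipmentRef:
    """Hypothetical snapshot row: five-level path plus a stable UUIDv4."""
    enterprise: str
    site: str
    area: str
    line: str
    equipment: str
    equipment_id: uuid.UUID = field(default_factory=uuid.uuid4)

    def __post_init__(self) -> None:
        # Every level must be present; each is either the placeholder or a valid segment.
        for level in LEVELS:
            segment = getattr(self, level)
            if segment != PLACEHOLDER and not SEGMENT.fullmatch(segment):
                raise ValueError(f"invalid {level} segment: {segment!r}")

    def segments(self) -> tuple:
        return tuple(getattr(self, level) for level in LEVELS)

    def text_form(self) -> str:
        return ".".join(self.segments())

    def opcua_form(self) -> str:
        return "/".join(self.segments())


mill = EquipmentRef("zb", "warsaw-west", "bldg-5", "line-2", "cnc-mill-05")
no_line = EquipmentRef("zb", "warsaw-west", "bldg-19", PLACEHOLDER, "cnc-mill-07")
```

Here `mill.text_form()` gives `zb.warsaw-west.bldg-5.line-2.cnc-mill-05` and `mill.opcua_form()` gives `zb/warsaw-west/bldg-5/line-2/cnc-mill-05`, matching the worked example in the standard. The second row (an invented equipment name) shows the placeholder keeping the path at five levels. The path is for navigation and may change; `equipment_id` is what lineage should key on.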
### Closed since last status check

Most closed items below were worked through in the same 2026-04-15 session; the digital-twin scope item closed on 2026-04-24. Items are grouped roughly chronologically.

**Denominators and discovery templates:**

- ~~**Legacy integration inventory population.**~~ **Closed 2026-04-15.** The inventory in `current-state/legacy-integrations.md` is complete as the pillar 3 denominator: **3 rows** — LEG-001 Delmia DNC, LEG-002 Camstar MES (Camstar-initiated, confirmed this session), LEG-003 custom email notification service (added this session). Historian's MSSQL reporting surface (BOBJ / Power BI) was explicitly carved out as **not legacy** and documented under "Deliberately not tracked" in the inventory file — the rationale is that Historian's SQL interface is its native consumption surface, not a bespoke integration. Detail fields on the three rows (sites, owners, volumes, exact transports) remain `_TBD_` and will get filled in during migration planning.
- ~~**Equipment protocol survey template.**~~ **Advanced 2026-04-15.** The survey was listed as a Year 1 prerequisite but had no template; a full template with schema, classification rule, rollup views, and discovery approach was written at `current-state/equipment-protocol-survey.md`, **then further advanced** to carry a dual mandate (see the UNS naming hierarchy item below). **Superseded by 2026-04-24:** the survey file has since been removed because the OtOpcUa v2 team confirmed the driver list directly; only the UNS hierarchy snapshot walk remains open (tracked above as item #2).

**Digital twin — scope resolution:**

- ~~**Digital twin scope iteration.**~~ **Closed 2026-04-24.** Plan's digital-twin scope is definitively **two access-control patterns** — environment-lifecycle promotion without reconfiguration (ACL flip on write authority) and safe read-only consumption for KPI / monitoring systems. Both delivered by architecture already committed in the OtOpcUa and Canonical Model subsections — no new component, no new workstream. Earlier working artifacts (the meeting-prep brief under `goal-state/`) were removed once scope was finalized. See `goal-state.md` → Strategic Considerations → Digital twin for the authoritative scope.

**Canonical model and UNS work:**

- ~~**Canonical Equipment, Production, and Event Model declaration.**~~ **Closed 2026-04-15 (third surface renamed 2026-04-24).** New subsection under `goal-state.md` → Async Event Backbone declares the canonical model: three surfaces (OtOpcUa equipment namespace, Redpanda topics + Protobuf schemas, SnowBridge curated layer in Snowflake) with `schemas` repo as single source of truth. Committed **canonical machine state vocabulary** (`Running / Idle / Faulted / Starved / Blocked` + TBD additions) with explicit semantics, rules, and governance. OEE computed on the canonical state stream named as a candidate for pillar 2's "not possible before" use case. Year 1 Redpanda cell in `roadmap.md` commits to publishing v1.
- ~~**Unified Namespace (UNS) posture declaration.**~~ **Closed 2026-04-15.** New subsection under `goal-state.md` → Target IT/OT Integration declares the canonical model as **the plan's UNS**, with three deliberate deviations from classic MQTT/Sparkplug UNS (Kafka instead of MQTT, flat topics with site-in-message, stateless events instead of Sparkplug state). Optional future **UNS projection service** (MQTT/Sparkplug and/or enterprise OPC UA aggregator) documented as architecturally supported but not committed for build. Cross-references added from Canonical Model subsection and Digital Twin section.
- ~~**UNS naming hierarchy standard.**~~ **Closed 2026-04-15.** Five-level hierarchy committed: Enterprise → Site → Area → Line → Equipment, always present, `_default` placeholder where a level doesn't apply. Naming rules align with Redpanda topic convention (`[a-z0-9-]`, dots/slashes for segments, hyphens within). Stable **equipment UUIDv4** alongside the path. Authority in `schemas` repo. Evolution governance, worked examples, out-of-scope list (no product/job hierarchy — that's Camstar MES), and TBDs all captured. `current-state/equipment-protocol-survey.md` updated to note the dual mandate — same discovery walk produces the initial hierarchy snapshot at equipment-instance granularity.

**Adjacent initiatives:**

- ~~**BOBJ → Power BI coordination framing.**~~ **Advanced 2026-04-15.** The coordination question was flagged but no plan position existed; now documented as a new Strategic Considerations subsection in `goal-state.md` with three paths, recommended position, and eight questions for the reporting team. Still open: actually having the coordination conversation (tracked above as item #1).

**Output generation pipeline:**

- ~~**Repeatable PPTX + PDF generation pipeline.**~~ **Advanced 2026-04-15.** Design brainstormed (A+D pattern — Claude in full control, spec-file anchors, no templates yet). Directory scaffold created at `outputs/`. README, DESIGN, IMPLEMENTATION-PLAN, presentation-spec (18 slides, mixed-stakeholder), and longform-spec (3 chapters + 2 appendices, faithful typeset) all written. **Deferred:** Mermaid diagram source files, first PNG rendering, first PPTX and PDF generation, inaugural run-log entry — all wait on source data being stable. Trigger phrases (`regenerate outputs` / `regenerate presentation` / `regenerate longform`) documented in `outputs/README.md` for any future session.

Items that can wait, design details that close during implementation, and deliberately deferred / out-of-scope items are listed in the working conversation — no need to re-enumerate here; they're all captured as `_TBD_` markers in the authoritative files.

## Recommended resume flow

1. Start a new Claude Code session in this directory. `CLAUDE.md` and this file provide full context.
2. Skim this file to re-orient (~2 minutes).
3. Pick one of the two external-dependency items above — or whatever has become most pressing.
4. If you've had the Power BI coordination conversation with the reporting team, bring the answers and I'll fold them into the plan.
5. If a funded physics-simulation / FAT initiative has materialized (out of current plan scope), say so and I'll set up a separate scoping conversation (the earlier management-conversation brief has been removed, so it cannot be reused).
6. If the UNS hierarchy walk has been run, bring the data and I'll populate the initial hierarchy snapshot in the `schemas` repo.
7. To regenerate outputs: `regenerate presentation` (PPTX), `regenerate longform` (PDF, not yet run), or `regenerate outputs` (both). See `outputs/README.md` for the full checklist.
8. To hand off a component to an implementation agent, check `handoffs/` for existing handoff docs or ask me to create one.

## What not to do on resume

- Don't re-open settled decisions without a reason. The plan's decisions are load-bearing and have explicit rationale captured inline; reversing one should require new information, not re-litigation.
- Don't add new workstreams to `roadmap.md` without a matching commitment to one of the three pillars. That's how plans quietly bloat.
- Don't let Digital Twin reappear as a new committed workstream or widen beyond the finalized scope. Plan's digital-twin scope is exactly two access-control patterns (environment-lifecycle promotion; safe read-only KPI / monitoring exposure), both delivered by already-committed architecture. Physics simulation / FAT / commissioning emulation is out of plan scope; it does not reappear unless a separately funded initiative with a sponsor is stood up, and even then it is an adjacent initiative, not this plan's work.
- Don't let Copilot 365 reappear. It was deliberately removed earlier — it's handled implicitly by the SnowBridge curated layer + canonical model path.
- Don't build a parallel MQTT UNS broker just because "UNS" means MQTT to many vendors. The plan's UNS posture is deliberate: Redpanda IS the UNS backbone, and a projection service is a small optional addition when a specific consumer requires it — not the default path.
- Don't hand-edit files under `outputs/generated/` — they're disposable, regenerated from the spec files on every run. Edit specs or source plan files instead.