diff --git a/CLAUDE.md b/CLAUDE.md index ae313aa..779619e 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -16,7 +16,6 @@ Plan content lives in markdown files at the repo root to keep it easy to read an ### Component detail files - [`current-state/legacy-integrations.md`](current-state/legacy-integrations.md) — authoritative inventory of **bespoke IT↔OT integrations** that cross ScadaBridge-central outside ScadaBridge. Denominator for pillar 3 retirement. -- [`goal-state/digital-twin-management-brief.md`](goal-state/digital-twin-management-brief.md) — meeting-prep artifact for the management conversation that turns "we want digital twins" into a scoped response. Parallel structure to `goal-state.md` → Strategic Considerations → Digital twin. - [`schemas/`](schemas/) — **Canonical OT equipment definitions seed** (temporary location — see `schemas/README.md`). JSON Schema format definitions, FANUC CNC pilot equipment class, UNS subtree example, and documentation. Contributed by the OtOpcUa team; ownership TBD. To be migrated to a dedicated `schemas` repo once created. ### Output generation pipeline diff --git a/README.md b/README.md index f162b99..031ccc8 100644 --- a/README.md +++ b/README.md @@ -40,7 +40,6 @@ The plan also declares a **Unified Namespace (UNS)** composed of OtOpcUa + Redpa |---|---| | [`current-state/legacy-integrations.md`](current-state/legacy-integrations.md) | Pillar 3 denominator: 3 legacy IT/OT integrations to retire | | ~~`current-state/equipment-protocol-survey.md`~~ | Removed — protocol survey no longer needed; OtOpcUa v2 team committed driver list directly | -| [`goal-state/digital-twin-management-brief.md`](goal-state/digital-twin-management-brief.md) | Digital twin management conversation brief (completed) | | [`schemas/`](schemas/) | Canonical OT equipment definitions (DRAFT seed contributed by OtOpcUa team — UNS hierarchy + equipment-class templates + format JSON Schemas + worked FANUC CNC pilot). 
Temporary location until a dedicated `schemas` repo is created and an owner team is named — see `schemas/README.md` | ### Output Generation diff --git a/STATUS.md b/STATUS.md index 2599114..083a892 100644 --- a/STATUS.md +++ b/STATUS.md @@ -1,16 +1,16 @@ # Plan — Working Session Status -**Saved:** 2026-04-23 +**Saved:** 2026-04-24 **Previous session:** Opus 4.6 (1M context) **Resume with:** start a new Claude Code session in this directory — `CLAUDE.md` and this file provide full context. No session ID needed; the plan is self-contained in the repo. -> This file is a **bookmark**, not a replacement for the plan. The authoritative content lives in `CLAUDE.md`, `current-state.md`, `goal-state.md`, `roadmap.md`, and the component detail files under `current-state/`, `goal-state/`, and `outputs/`. Read this file only to find out where we left off. +> This file is a **bookmark**, not a replacement for the plan. The authoritative content lives in `CLAUDE.md`, `current-state.md`, `goal-state.md`, `roadmap.md`, and the component detail files under `current-state/` and `outputs/`. Read this file only to find out where we left off. ## Where we are -The plan is **substantially complete**. All core documents are populated, architectural decisions are captured with rationale, the canonical model + UNS hierarchy standard are declared, the digital twin use cases are absorbed, the OtOpcUa v2 implementation corrections (19 items + addendum) are integrated, and the first PPTX has been generated. The `schemas/` repo seed exists with the FANUC CNC pilot class and JSON Schema format definitions. +The plan is **substantially complete**. 
All core documents are populated, architectural decisions are captured with rationale, the canonical model + UNS hierarchy standard are declared, the digital-twin scope is narrowed to two access-control patterns (both delivered by already-committed architecture), the OtOpcUa v2 implementation corrections (19 items + addendum) are integrated, and the first PPTX has been generated. The `schemas/` repo seed exists with the FANUC CNC pilot class and JSON Schema format definitions. -**What happened since the original session (2026-04-15 through 2026-04-23):** +**What happened since the original session (2026-04-15 through 2026-04-24):** - Integrated OtOpcUa v2 implementation corrections (19 corrections + hardening addendum: ACL model committed, stability tiers, multi-identifier equipment model, driver list confirmed, cutover ownership assigned outside OtOpcUa) - Schemas repo seed contributed by OtOpcUa team at `schemas/` (temporary location) - Enterprise shortname resolved to `zb`; Warsaw West buildings confirmed as 5 and 19 @@ -19,6 +19,7 @@ The plan is **substantially complete**. All core documents are populated, archit - 7 component diagrams created (OtOpcUa, Redpanda, SnowBridge, ScadaBridge dataflow + topology, Snowflake/dbt) - ScadaBridge accuracy corrections from design repo review (email only, not Teams; EventHub not yet implemented) - ScadaBridge topology corrected (no site-to-site routing; direct API access; inbound Web API as input) +- **Digital-twin scope finalized (2026-04-24).** Plan's digital-twin scope is definitively **two access-control patterns**: (1) environment-lifecycle promotion without reconfiguration (ACL flip on write authority against stable equipment UUIDs); (2) safe read-only consumption for KPI / monitoring systems (structurally guaranteed by single-connection-through-OtOpcUa). Both delivered by architecture already committed — no new component, no new workstream. 
The earlier management-conversation brief (`goal-state/digital-twin-management-brief.md`) and the `goal-state/` subdirectory have been removed; the plan uses only the two patterns above. Write-authority arbitration mechanism is out of scope for this plan (OtOpcUa team's concern). Physics simulation / FAT / commissioning emulation is not a plan item; if it ever materializes as a funded adjacent initiative, that will be a separate scoping conversation. ### Files @@ -26,19 +27,13 @@ The plan is **substantially complete**. All core documents are populated, archit - [`CLAUDE.md`](CLAUDE.md) — plan purpose, document index (now including the component detail files and outputs pipeline), markdown-first conventions, component breakout rules. - [`current-state.md`](current-state.md) — snapshot of today's estate (enterprise layout, clusters, systems, integrations, equipment access patterns). -- [`goal-state.md`](goal-state.md) — target end-state with Vision, layered architecture, **Unified Namespace posture + naming hierarchy standard**, component designs (OtOpcUa, SnowBridge, Redpanda EventHub with **Canonical Equipment/Production/Event Model + canonical state vocabulary**, dbt layer, ScadaBridge extensions), success criteria, observability, Strategic Considerations (Digital Twin with use cases received + Power BI), and Non-Goals. +- [`goal-state.md`](goal-state.md) — target end-state with Vision, layered architecture, **Unified Namespace posture + naming hierarchy standard**, component designs (OtOpcUa, SnowBridge, Redpanda EventHub with **Canonical Equipment/Production/Event Model + canonical state vocabulary**, dbt layer, ScadaBridge extensions), success criteria, observability, Strategic Considerations (Digital twin — two access-control patterns; Power BI), and Non-Goals. - [`roadmap.md`](roadmap.md) — 3-year workstreams × years grid with 7 workstreams and cross-workstream dependencies; Year 1 Redpanda and dbt cells updated for canonical model delivery. 
**Component detail files:** - [`current-state/legacy-integrations.md`](current-state/legacy-integrations.md) — authoritative inventory for pillar 3 retirement. **Closed as denominator = 3**: LEG-001 Delmia DNC, LEG-002 Camstar MES, LEG-003 custom email notification service. Historian MSSQL reporting surface explicitly carved out as *not* legacy. - ~~`current-state/equipment-protocol-survey.md`~~ — **Removed.** Protocol survey no longer needed; the OtOpcUa v2 implementation team committed the 8-driver core library from internal knowledge. The UNS hierarchy snapshot (equipment-instance walk) is now a standalone Year 1 deliverable tracked separately. -- [`goal-state/digital-twin-management-brief.md`](goal-state/digital-twin-management-brief.md) — meeting-prep artifact for the (now completed) digital twin management conversation; "Outcome" section at top captures the resolution. - -**Input / reference files:** - -- [`digital_twin_usecases.md.txt`](digital_twin_usecases.md.txt) — management's delivered requirements for digital twin (three use cases: standardized state model, virtual testing/simulation, cross-system canonical model). Source for the plan's digital twin response. - **Output generation pipeline (specs only — no outputs generated yet):** - [`outputs/README.md`](outputs/README.md) — trigger phrases (`regenerate outputs` / `regenerate presentation` / `regenerate longform`), regeneration procedure, edit-this-not-that rules. @@ -55,16 +50,16 @@ The plan is **substantially complete**. All core documents are populated, archit - **UX split:** Ignition owns KPI UX long-term; Aveva System Platform HMI owns validated-data UX long-term. Not a primary goal of this plan. - **IT↔OT boundary:** single crossing at ScadaBridge central. OT = machine data (System Platform, equipment OPC UA, OtOpcUa, ScadaBridge, Aveva Historian, Ignition). IT = enterprise apps (Camstar, Delmia, Snowflake, SnowBridge, Power BI/BOBJ). 
- **Layered architecture:** Layer 1 Equipment → Layer 2 OtOpcUa → Layer 3 SCADA (System Platform + Ignition) → Layer 4 ScadaBridge → Enterprise IT. -- **OtOpcUa** (layer 2): custom-built, clustered, co-located on System Platform nodes, hybrid driver strategy (proactive core library + on-demand long-tail), OPC UA-native auth, **absorbs LmxOpcUa** as its System Platform namespace. Tiered cutover: ScadaBridge first, Ignition second, System Platform IO last. **Namespace architecture supports a future `simulated` namespace** for integration testing (digital twin use case 2) — architecturally supported, not committed for build. -- **Redpanda EventHub:** self-hosted, central cluster in South Bend (single-cluster HA, VM-level DR out of scope), per-topic tiered retention (operational 7d / analytics 30d / compliance 90d), bundled Schema Registry, Protobuf via central `schemas` repo with `buf` CI, `BACKWARD_TRANSITIVE` compatibility, `TopicNameStrategy` subjects, `{domain}.{entity}.{event-type}` naming, site identity in message (not topic), SASL/OAUTHBEARER + prefix ACLs. Store-and-forward at ScadaBridge handles site resilience. **Analytics-tier retention is also a replay surface** for integration testing / simulation-lite (digital twin use case 2). -- **Canonical Equipment, Production, and Event Model** (added via digital twin use cases 1 and 3): the plan commits to declaring the composition of OtOpcUa equipment namespace + Redpanda canonical topics + `schemas` repo + dbt curated layer as **the** canonical model. Three surfaces, one source of truth (`schemas` repo). Includes a **canonical machine state vocabulary** (`Running / Idle / Faulted / Starved / Blocked` + TBD additions like `Changeover`, `Maintenance`, `Setup`). Year 1 Redpanda and dbt cells are updated to deliver v1. 
+- **OtOpcUa** (layer 2): custom-built, clustered, co-located on System Platform nodes, hybrid driver strategy (proactive core library + on-demand long-tail), OPC UA-native auth, **absorbs LmxOpcUa** as its System Platform namespace. Tiered cutover: ScadaBridge first, Ignition second, System Platform IO last. **Namespace architecture supports a future `simulated` namespace** for the pre-install case (dev work before equipment is on the floor) and as foundation for a possible future funded physics-simulation initiative — architecturally supported, not committed for build. **ACL model + single-connection-per-equipment also delivers the plan's two digital-twin patterns** (environment-lifecycle promotion via write-authority flip; safe read-only KPI / monitoring exposure) — see `goal-state.md` → OtOpcUa → Consumer access patterns enabled by the ACL model. +- **Redpanda EventHub:** self-hosted, central cluster in South Bend (single-cluster HA, VM-level DR out of scope), per-topic tiered retention (operational 7d / analytics 30d / compliance 90d), bundled Schema Registry, Protobuf via central `schemas` repo with `buf` CI, `BACKWARD_TRANSITIVE` compatibility, `TopicNameStrategy` subjects, `{domain}.{entity}.{event-type}` naming, site identity in message (not topic), SASL/OAUTHBEARER + prefix ACLs. Store-and-forward at ScadaBridge handles site resilience. **Analytics-tier retention is also a replay surface** for integration testing (and for a possible future funded physics-simulation initiative, should one materialize). +- **Canonical Equipment, Production, and Event Model:** the plan commits to declaring the composition of OtOpcUa equipment namespace + Redpanda canonical topics + `schemas` repo + dbt curated layer as **the** canonical model. Three surfaces, one source of truth (`schemas` repo). Includes a **canonical machine state vocabulary** (`Running / Idle / Faulted / Starved / Blocked` + TBD additions like `Changeover`, `Maintenance`, `Setup`). 
Year 1 Redpanda and dbt cells deliver v1. Load-bearing for pillar 2. - **Unified Namespace (UNS) posture:** the canonical model above is also declared as the plan's UNS, framed for stakeholders using UNS vocabulary. **Deliberate deviations from classic MQTT/Sparkplug UNS:** Kafka instead of MQTT (for analytics/replay), flat `{domain}.{entity}.{event-type}` topics with site in message (for bounded topic count), stateless events instead of Sparkplug state machine. Optional future **UNS projection service** (MQTT/Sparkplug and/or enterprise OPC UA aggregator) is architecturally supported but not committed for build; decision trigger documented. - **UNS naming hierarchy standard:** 5 levels always present — Enterprise → Site → Area → Line → Equipment, with `_default` placeholder where a level doesn't apply. Text form `ent.warsaw-west.bldg-3.line-2.cnc-mill-05` / OPC UA form `ent/warsaw-west/bldg-3/line-2/cnc-mill-05`. Stable **equipment UUIDv4** alongside the path (path is navigation, UUID is lineage). Authority lives in `schemas` repo; OtOpcUa / Redpanda / dbt consume the authoritative definition. **Enterprise shortname is currently `ent` placeholder — needs assignment.** - **SnowBridge:** custom-built machine-data-to-Snowflake upload service; Year 1 starting source is Aveva Historian SQL; UI + API with blast-radius-based approval workflow; selection state in internal datastore (not git). - **Snowflake transform tooling:** dbt only, run by a self-hosted orchestrator (specific orchestrator out of scope). - **Aggregation boundary:** aggregation lives in Snowflake (dbt). ScadaBridge does deadband/exception-based filtering (global default ~1% of span) plus tag opt-in via SnowBridge — not source-side summarization. - **Observability:** commit to signals (Redpanda, ScadaBridge, SnowBridge, dbt), tool is out of scope. 
-- **Digital Twin (management-delivered use cases, 2026-04-15):** three use cases received — (1) standardized equipment state model, (2) virtual testing / simulation, (3) cross-system canonical model. **Use cases 1 and 3 absorbed into the plan** as the canonical state vocabulary + canonical model declaration (see above). **Use case 2 served minimally** by Redpanda historical replay + future OtOpcUa `simulated` namespace; full commissioning-grade simulation stays out of scope pending a separately funded initiative. +- **Digital-twin scope (finalized 2026-04-24):** the plan's digital-twin scope is definitively **two access-control patterns** — (1) environment-lifecycle promotion without reconfiguration (ACL flip on write authority against stable equipment UUIDs); (2) safe read-only consumption for KPI / monitoring systems (structurally guaranteed by single-connection-through-OtOpcUa). Both delivered by architecture already committed in the **OtOpcUa** and **Canonical Equipment, Production, and Event Model** subsections — no new component, no new workstream, no pillar dependency. Write-authority arbitration mechanism is out of scope (OtOpcUa team's concern). Physics simulation / FAT / commissioning emulation is not a plan item; any future funded adjacent initiative would be a separate scoping conversation. - **Enterprise reporting coordination (BOBJ → Power BI migration, in-flight adjacent initiative):** three consumption paths analyzed (Snowflake dbt / Historian direct / both). Recommended position: **Path C with Path A as strategic direction** — most machine-data and cross-domain reports move to Snowflake over Years 2–3, compliance reports stay on Historian indefinitely. Conversation with reporting team still to be scheduled. - **Output generation pipeline:** PPTX + PDF generation from plan markdown, repeatability anchored by spec files (`presentation-spec.md`, `longform-spec.md`) rather than prompts. 
Spec files written; diagrams and generation run deferred until the source plan is stable. @@ -76,8 +71,6 @@ All four items from the previous status check have been **advanced to the point 1. **BOBJ → Power BI coordination with reporting team.** Plan position documented in `goal-state.md` → Strategic Considerations → **Enterprise reporting: BOBJ → Power BI migration (adjacent initiative)** — three consumption paths analyzed, recommended position stated (Path C with Path A as strategic direction), eight questions and a four-bucket decision rubric included. **Action needed:** schedule the coordination conversation with the reporting team; bring back a bucket assignment. Once a bucket is assigned, update `goal-state.md` → Enterprise reporting and, if the outcome is Bucket A or B, update `roadmap.md` → Snowflake dbt Transform Layer to include reporting-shaped views. 2. **UNS hierarchy snapshot walk.** The protocol survey has been **removed** — the OtOpcUa v2 implementation team committed the core driver list (8 drivers) based on internal knowledge, making a formal protocol survey unnecessary for driver scoping. What remains is the **UNS hierarchy snapshot**: a per-site equipment-instance walk capturing site / area / line / equipment assignments and stable UUIDs, which feeds the initial `schemas` repo hierarchy definition and canonical model. See `goal-state.md` → **Unified Namespace (UNS) posture → UNS naming hierarchy standard**. **Action needed:** assign a walk owner; walk System Platform IO config, Ignition OPC UA connections, and ScadaBridge templates across integrated sites within Q1–Q2 of Year 1; capture equipment instances at site/area/line/equipment granularity (not protocol — that's already resolved). The canonical model v1 cannot be published without the initial hierarchy snapshot. 
**Sub-blocker:** the UNS hierarchy's enterprise-level shortname is currently a placeholder (`ent` in goal-state.md); the real shortname needs to be assigned before the initial hierarchy snapshot can be committed to the `schemas` repo. -3. **Digital twin use case 2 — funded simulation initiative (exploratory).** The digital twin management conversation is complete; management provided three use cases and the plan absorbs two of them (canonical state vocabulary + canonical model declaration — see closed items). **Use case 2 (Virtual Testing / Simulation)** is served minimally by Redpanda historical replay + OtOpcUa's architectural support for a future `simulated` namespace, but **full commissioning-grade simulation stays out of scope** for this plan. **No action needed** unless and until a funded simulation initiative materializes with a sponsor, scope, and timeline; at that point the meeting-prep brief at [`goal-state/digital-twin-management-brief.md`](goal-state/digital-twin-management-brief.md) can be reused for a use-case-2-specific scoping conversation. Keep on the radar, don't actively work on it. - ### Closed since last status check All closed items below were worked through the same 2026-04-15 session. Grouped roughly chronologically. @@ -87,12 +80,11 @@ All closed items below were worked through the same 2026-04-15 session. Grouped - ~~**Legacy integration inventory population.**~~ **Closed 2026-04-15.** The inventory in `current-state/legacy-integrations.md` is complete as the pillar 3 denominator: **3 rows** — LEG-001 Delmia DNC, LEG-002 Camstar MES (Camstar-initiated, confirmed this session), LEG-003 custom email notification service (added this session). Historian's MSSQL reporting surface (BOBJ / Power BI) was explicitly carved out as **not legacy** and documented under "Deliberately not tracked" in the inventory file — the rationale is that Historian's SQL interface is its native consumption surface, not a bespoke integration. 
Detail fields on the three rows (sites, owners, volumes, exact transports) remain `_TBD_` and will get filled in during migration planning. - ~~**Equipment protocol survey template.**~~ **Advanced 2026-04-15.** The survey was listed as a Year 1 prerequisite but had no template; now a full template with schema, classification rule, rollup views, and discovery approach lives at `current-state/equipment-protocol-survey.md`. **Then further advanced** to carry a dual mandate (see below). Still open: actually running the survey (tracked above as item #2). -**Digital twin — full lifecycle in one session:** +**Digital twin — scope resolution:** -- ~~**Digital twin conversation preparation.**~~ **Advanced 2026-04-15.** The 8 clarification questions existed but lacked framing; now wrapped with a full meeting-prep brief at `goal-state/digital-twin-management-brief.md`. Superseded by the conversation outcome below. -- ~~**Digital twin management conversation.**~~ **Closed 2026-04-15.** Conversation happened. Management delivered three use cases (source: `digital_twin_usecases.md.txt`): (1) standardized equipment state model, (2) virtual testing / simulation, (3) cross-system canonical model. Plan response: (a) use cases 1 and 3 absorbed as plan additions (see Canonical Model + UNS work below); (b) use case 2 served minimally by Redpanda historical replay + OtOpcUa architectural support for a future `simulated` namespace — full commissioning simulation stays out of scope. The `goal-state.md` Digital Twin section, the `digital-twin-management-brief.md` outcome, and the Year 1 Redpanda/dbt roadmap cells are all updated. Narrower open item carried forward as external-dependency item #3 above. +- ~~**Digital twin scope iteration.**~~ **Closed 2026-04-24.** Plan's digital-twin scope is definitively **two access-control patterns** — environment-lifecycle promotion without reconfiguration (ACL flip on write authority) and safe read-only consumption for KPI / monitoring systems. 
Both delivered by architecture already committed in the OtOpcUa and Canonical Model subsections — no new component, no new workstream. Earlier working artifacts (the meeting-prep brief under `goal-state/`) were removed once scope was finalized. See `goal-state.md` → Strategic Considerations → Digital twin for the authoritative scope. -**Canonical model and UNS work (follows from digital twin use cases 1 and 3):** +**Canonical model and UNS work:** - ~~**Canonical Equipment, Production, and Event Model declaration.**~~ **Closed 2026-04-15.** New subsection under `goal-state.md` → Async Event Backbone declares the canonical model: three surfaces (OtOpcUa equipment namespace, Redpanda topics + Protobuf schemas, dbt curated layer) with `schemas` repo as single source of truth. Committed **canonical machine state vocabulary** (`Running / Idle / Faulted / Starved / Blocked` + TBD additions) with explicit semantics, rules, and governance. OEE computed on the canonical state stream named as a candidate for pillar 2's "not possible before" use case. Year 1 Redpanda cell in `roadmap.md` commits to publishing v1. - ~~**Unified Namespace (UNS) posture declaration.**~~ **Closed 2026-04-15.** New subsection under `goal-state.md` → Target IT/OT Integration declares the canonical model as **the plan's UNS**, with three deliberate deviations from classic MQTT/Sparkplug UNS (Kafka instead of MQTT, flat topics with site-in-message, stateless events instead of Sparkplug state). Optional future **UNS projection service** (MQTT/Sparkplug and/or enterprise OPC UA aggregator) documented as architecturally supported but not committed for build. Cross-references added from Canonical Model subsection and Digital Twin section. @@ -114,7 +106,7 @@ Items that can wait, design details that close during implementation, and delibe 2. Skim this file to re-orient (~2 minutes). 3. Pick one of the three external-dependency items above — or whatever has become most pressing. 4. 
If you've had the Power BI coordination conversation with the reporting team, bring the answers and I'll fold them into the plan. -5. If a funded simulation initiative has materialized (digital twin use case 2), say so and I'll reuse the meeting brief for a scoping conversation. +5. If a funded physics-simulation / FAT initiative has materialized (out of current plan scope), say so and I'll reuse the meeting brief for a scoping conversation. 6. If the UNS hierarchy walk has been run, bring the data and I'll populate the initial hierarchy snapshot in the `schemas` repo. 7. To regenerate outputs: `regenerate presentation` (PPTX), `regenerate longform` (PDF, not yet run), or `regenerate outputs` (both). See `outputs/README.md` for the full checklist. 8. To hand off a component to an implementation agent, check `handoffs/` for existing handoff docs or ask me to create one. @@ -123,7 +115,7 @@ Items that can wait, design details that close during implementation, and delibe - Don't re-open settled decisions without a reason. The plan's decisions are load-bearing and have explicit rationale captured inline; reversing one should require new information, not re-litigation. - Don't add new workstreams to `roadmap.md` without a matching commitment to one of the three pillars. That's how plans quietly bloat. -- Don't let Digital Twin reappear as a new committed workstream. Management's three use cases have been resolved: uses 1 and 3 absorbed into the canonical model + UNS work; use 2 stays out of scope unless a separately funded simulation initiative materializes. Full commissioning-grade simulation is not a stealth pillar. +- Don't let Digital Twin reappear as a new committed workstream or widen beyond the finalized scope. Plan's digital-twin scope is exactly two access-control patterns (environment-lifecycle promotion; safe read-only KPI / monitoring exposure), both delivered by already-committed architecture. 
Physics simulation / FAT / commissioning emulation is out of plan scope; it does not reappear unless a separately funded initiative with a sponsor is stood up, and even then it is an adjacent initiative, not this plan's work. - Don't let Copilot 365 reappear. It was deliberately removed earlier — it's handled implicitly by the Snowflake/dbt + canonical model path. - Don't build a parallel MQTT UNS broker just because "UNS" means MQTT to many vendors. The plan's UNS posture is deliberate: Redpanda IS the UNS backbone, and a projection service is a small optional addition when a specific consumer requires it — not the default path. - Don't hand-edit files under `outputs/generated/` — they're disposable, regenerated from the spec files on every run. Edit specs or source plan files instead. diff --git a/digital_twin_usecases.md.txt b/digital_twin_usecases.md.txt deleted file mode 100644 index 5638fe6..0000000 --- a/digital_twin_usecases.md.txt +++ /dev/null @@ -1,74 +0,0 @@ -1) Standardized Equipment State / Metadata Model - -Use case: -Create a consistent, high-level representation of machine state derived from raw signals. - -What it does: - • Converts low-level sensor/PLC data into meaningful states (e.g., Running, Idle, Faulted, Starved, Blocked) - • Normalizes differences across equipment types - • Aggregates multiple signals into a single, authoritative “machine state” - -Examples: - • Deriving true run state from multiple interlocks and status bits - • Calculating actual cycle time vs. theoretical - • Identifying top fault instead of exposing dozens of raw alarms - -Value: - • Provides a single, consistent view of equipment behavior - • Reduces complexity for downstream systems and users - • Improves accuracy of KPIs like OEE and downtime tracking - -⸻ - -2) Virtual Testing / Simulation (FAT, Integration, Validation) - -Use case: -Use a digital representation of equipment to simulate behavior for testing without requiring physical machines. 
- -What it does: - • Emulates machine signals, states, and sequences - • Allows testing of automation logic, workflows, and integrations - • Supports replay of historical scenarios or generation of synthetic ones - -Examples: - • Simulating startup, shutdown, and fault conditions - • Testing alarm handling and recovery workflows - • Validating system behavior under edge cases (missing data, delays, abnormal sequences) - -Value: - • Enables earlier testing before equipment is available - • Reduces commissioning time and risk - • Improves quality and stability of deployed systems - -⸻ - -3) Cross-System Data Normalization / Canonical Model - -Use case: -Act as a common semantic layer between multiple systems interacting with manufacturing data. - -What it does: - • Defines standardized data structures for equipment, production, and events - • Translates system-specific formats into a unified model - • Provides a consistent interface for all consumers - -Examples: - • Mapping different machine tag structures into a common equipment model - • Standardizing production counts, states, and identifiers - • Providing uniform event definitions (e.g., “machine fault,” “job complete”) - -Value: - • Simplifies integration between disparate systems - • Reduces duplication of transformation logic - • Improves data consistency and interoperability across the enterprise - -⸻ - -Combined Outcome - -Together, these three use cases position a digital twin as: - • A translator (raw signals → meaningful state) - • A simulator (test without physical dependency) - • A standard interface (consistent data across systems) - -This approach focuses on practical operational value rather than high-fidelity modeling, aligning well with discrete manufacturing environments. 
\ No newline at end of file diff --git a/goal-state.md b/goal-state.md index 99a4876..191086c 100644 --- a/goal-state.md +++ b/goal-state.md @@ -234,14 +234,14 @@ Two projection flavors are possible, not mutually exclusive: **Decision trigger for building a projection service:** when a specific consumer (vendor tool, COTS HMI, analytics product, new initiative) requires a classic UNS surface and the cost of writing a Kafka client for that consumer exceeds the cost of operating the projection layer for the rest of the consumer's lifetime. Until that trigger is hit, the canonical model + Redpanda **is** the UNS and consumers reach it directly. -This mirrors the treatment of OtOpcUa's future `simulated` namespace and the Digital Twin Use Case 2 simulation-lite foundation: the architecture supports the addition; the plan does not commit the build until a specific need justifies it. +This mirrors the treatment of OtOpcUa's future `simulated` namespace: the architecture supports the addition; the plan does not commit the build until a specific need justifies it. #### What the UNS framing does and does not change **Changes:** - Stakeholders who ask "do we have a UNS?" get a direct "yes — composed of OtOpcUa + Redpanda + `schemas` repo + dbt" answer instead of "we have a canonical model but we didn't use that word." -- Digital Twin Use Cases 1 and 3 (see **Strategic Considerations → Digital twin**) — which are functionally UNS use cases in another vocabulary — now have a second name and a second stakeholder audience. +- The **canonical machine state vocabulary** and **canonical equipment/production/event model declaration** (see **Async Event Backbone → Canonical Equipment, Production, and Event Model**) — which are functionally UNS deliverables in another vocabulary — now have a second name and a second stakeholder audience. - A future projection service is pre-legitimized as a small optional addition, not a parallel or competing initiative. 
- Vendor conversations that assume "UNS" means a specific MQTT broker purchase can be reframed: the plan delivers the UNS value proposition via different transport; the vendor's MQTT expectations become a projection-layer concern, not a core-architecture concern. @@ -411,7 +411,7 @@ _TBD — service name (working title only); hosting (South Bend, alongside Redpa 1. **Equipment namespace (raw data).** Live values read from equipment via native OPC UA or native device protocols (Modbus, Ethernet/IP, Siemens S7, etc.) translated to OPC UA. This is the new capability the plan introduces — what the "layer 2 — raw data" role in the layered architecture describes. 2. **System Platform namespace (processed data tap).** The former **LmxOpcUa** functionality, folded in. Exposes Aveva System Platform objects (via the local App Server's LMX API) as OPC UA so that OPC UA-native consumers can read processed data through the same endpoint they use for raw equipment data. -**Namespace model is extensible — future "simulated" namespace supported architecturally, not committed for build.** The two-namespace design is not a hard cap. A future **`simulated` namespace** could expose synthetic or replayed equipment data to consumers, letting tier-1 / tier-2 consumers (ScadaBridge, Ignition, System Platform IO) be exercised against real-shaped-but-offline data streams without physical equipment. This is the **OtOpcUa-side foundation for Digital Twin Use Case 2** (Virtual Testing / Simulation — see **Strategic Considerations → Digital twin**). The plan **does not commit to building** a simulated namespace in the 3-year scope; it commits that the namespace architecture can accommodate one when a specific testing need justifies it, without reshaping OtOpcUa. The complementary foundation (historical event replay) lives in the Redpanda layer — see **Async Event Backbone → Usage patterns → Historical replay**. 
+**Namespace model is extensible — future "simulated" namespace supported architecturally, not committed for build.** The two-namespace design is not a hard cap. A future **`simulated` namespace** could expose synthetic or replayed equipment data to consumers, letting tier-1 / tier-2 consumers (ScadaBridge, Ignition, System Platform IO) be exercised against real-shaped-but-offline data streams without physical equipment — primarily useful for the **pre-install case** (dev work against a piece of equipment that is not yet physically on the floor). The plan **does not commit to building** a simulated namespace in the 3-year scope; it commits that the namespace architecture can accommodate one when a specific testing need justifies it, without reshaping OtOpcUa. The complementary foundation (historical event replay) lives in the Redpanda layer — see **Async Event Backbone → Usage patterns → Historical replay**. Note: once equipment is physically present, the two access-control patterns that are this plan's digital-twin scope (environment-lifecycle promotion without reconfiguration, and safe read-only KPI/monitoring exposure) are delivered by the **ACL model below**, not by the `simulated` namespace. See **Consumer access patterns enabled by the ACL model** further down, and **Strategic Considerations → Digital twin**. **LmxOpcUa is absorbed into OtOpcUa, not replaced by a separate component.** The existing LmxOpcUa software and deployment pattern (per-node service on every System Platform node) evolves into OtOpcUa. Consumers that previously pointed at LmxOpcUa for System Platform data and at "nothing yet" for equipment data now point at OtOpcUa and see both in its namespace. There is not a second OPC UA server running alongside. @@ -475,6 +475,13 @@ _TBD — service name (working title only); hosting (South Bend, alongside Redpa - **Phasing:** Phase 1 ships the schema + Admin UI + evaluator unit tests; per-driver enforcement lands in each driver's phase (Phase 2+). 
**Phase 1 completes before any driver phase**, so the ACL model exists in the central config DB before any driver consumes it — satisfying the "must be working before Tier 1 cutover" timing constraint. - _TBD — specific OPC UA security mode + profile combinations required vs allowed; where UserName credentials/certs are sourced from (local site directory, a per-site credential vault, AD/LDAP); rotation cadence; audit trail of authz decisions._ +**Consumer access patterns enabled by the ACL model (plan's digital-twin scope).** Two consumer patterns fall directly out of the ACL model + single-connection-per-equipment design and are worth naming explicitly, because they are this plan's full digital-twin scope (see **Strategic Considerations → Digital twin**): + +- **Environment-lifecycle promotion without reconfiguration.** Dev / QA / Prod System Platform instances each authenticate to OtOpcUa as a distinct identity (e.g., `sp-dev`, `sp-qa`, `sp-prod`). Promotion of a piece of equipment from dev → qa → prod is an ACL change that moves the **single write-holder grant** for that equipment UUID from one identity to the next; the OPC UA session itself — which OtOpcUa owns — is configured once and never torn down or rebuilt. Read grants can stay broad (all three environment identities observe continuously); only write authority is single-assignee and mobile. This replaces today's pattern of disabling the direct System Platform connection on the dev box, then re-creating the same connection on the qa box, then again on the prod box. +- **Safe read-only consumption for KPI / monitoring systems.** Ignition KPI views, Power BI dashboards, observability / monitoring consumers, and any future read-only analytics consumer authenticate to OtOpcUa as identities with read-only grants. 
Because OtOpcUa owns the single OPC UA session to each piece of equipment, there is **no write path available to these consumers at all** — the guarantee is structural (there is no equipment-side session for a read-only consumer to misuse) rather than procedural (relying on the consumer's own code to not issue writes). This materially reduces the risk of adding new KPI / monitoring consumers to the estate. + +**Out of scope for this plan:** how OtOpcUa arbitrates write-authority moves between environments — e.g., an Admin UI switch, a PR-merge on the `schemas` repo, a release-pipeline step, or some combination. That mechanism is the OtOpcUa team's implementation decision. What the plan commits to is the architectural substrate (stable equipment UUID, single OPC UA session per equipment owned by OtOpcUa, read-vs-write-distinguishing ACL model) that makes both patterns above possible. + **Open questions (TBD).** - **Driver coverage.** Which equipment protocols need to be bridged to OPC UA beyond native OPC UA equipment — this is where product-driven decisions matter most. - **Rollout posture: build and deploy the cluster software to every site ASAP.** The cluster software (server + core driver library) is built and rolled out to **every site's System Platform nodes as fast as practical** — deployment to all sites is treated as a prerequisite for the rest of the OT plan, not a gradual per-site effort. "Deployment" here means installing and configuring the cluster software at each site so the node is ready to front equipment; it does **not** mean immediately migrating consumers (that follows the tiered cutover below). A deployed but inactive cluster is cheap; what's expensive is delaying deployment and then trying to do it site-by-site on the critical path of every other workstream. 
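The write-holder invariant described under **Consumer access patterns enabled by the ACL model** can be illustrated with a minimal sketch. It is not the OtOpcUa implementation, and the plan explicitly leaves the arbitration mechanism to the OtOpcUa team. The class `EquipmentAcl`, its methods, the placeholder UUID string, and the `kpi-dashboard` identity are all hypothetical names introduced for illustration; only the `sp-dev` / `sp-qa` / `sp-prod` identities come from the text above.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class EquipmentAcl:
    """Hypothetical per-equipment ACL: broad read grants, at most one write holder."""

    equipment_uuid: str
    readers: Set[str] = field(default_factory=set)
    write_holder: Optional[str] = None

    def can_read(self, identity: str) -> bool:
        # Read grants stay broad; the write holder can always read as well.
        return identity in self.readers or identity == self.write_holder

    def can_write(self, identity: str) -> bool:
        # Write authority is single-assignee: only the current holder may write.
        return self.write_holder is not None and identity == self.write_holder

    def promote(self, from_identity: str, to_identity: str) -> None:
        # Move the single write grant; the equipment session itself is untouched.
        if self.write_holder != from_identity:
            raise PermissionError(
                f"{from_identity} does not hold the write grant for {self.equipment_uuid}"
            )
        self.write_holder = to_identity


# Assumed setup: three environment identities plus one read-only KPI consumer.
acl = EquipmentAcl(
    equipment_uuid="equipment-uuid-placeholder",
    readers={"sp-dev", "sp-qa", "sp-prod", "kpi-dashboard"},
    write_holder="sp-dev",
)

# Promotion dev -> qa is an ACL change only; read grants are unchanged.
acl.promote("sp-dev", "sp-qa")
```

After the promotion, `sp-qa` is the only identity for which `can_write` returns true, `sp-dev` keeps read access, and `kpi-dashboard` never gains a write path. In this sketch a second `promote("sp-dev", ...)` call raises `PermissionError`, which is one possible way to keep a stale environment from reclaiming write authority.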
@@ -564,14 +571,14 @@ _TBD — service name (working title only); hosting (South Bend, alongside Redpa - **Async event notifications** — shopfloor events (state changes, alarms, lifecycle events, etc.) published to EventHub for any interested consumer to subscribe to, without producers needing to know who's listening. - **Async processing for KPI** — KPI calculations (currently handled on Ignition SCADA) can consume event streams from EventHub, enabling decoupled, replayable KPI pipelines instead of tightly coupled point queries. - **System integrations** — other enterprise systems (Camstar, Snowflake, future consumers) integrate by subscribing to EventHub topics rather than opening point-to-point connections into OT. - - **Historical replay for integration testing and simulation-lite.** The `analytics`-tier retention (30 days) is explicitly also a **replay surface** for testing and simulation-lite: downstream consumers (ScadaBridge scripts, KPI pipelines, dbt models, a future digital twin layer) can be exercised against real historical event streams instead of synthetic data. This is the minimal answer to **Digital Twin Use Case 2 (Virtual Testing / Simulation)** — see **Strategic Considerations → Digital twin** → use case 2 — and does not require any new component. When longer horizons are needed, extend to the `compliance` tier (90 days). Replay windows beyond 90 days are served by the dbt curated layer in Snowflake, not by Redpanda. + - **Historical replay for integration testing.** The `analytics`-tier retention (30 days) is explicitly also a **replay surface** for testing: downstream consumers (ScadaBridge scripts, KPI pipelines, dbt models, or any future consumer that needs to re-run historical windows) can be exercised against real historical event streams instead of synthetic data. Does not require any new component. When longer horizons are needed, extend to the `compliance` tier (90 days). 
Replay windows beyond 90 days are served by the dbt curated layer in Snowflake, not by Redpanda. **Note:** if a funded physics-simulation / FAT initiative ever materializes, this replay surface is one of the foundations it can consume — but such an initiative is out of the 3-year scope of this plan. - _Remaining open items are tracked inline in the subsections above — sizing, read-path implications, long-outage planning, IdP selection, schema subject/versioning details, etc. Support staffing and on-call ownership are out of scope for this plan._ #### Canonical Equipment, Production, and Event Model The plan already delivers the infrastructure for a cross-system canonical model — OtOpcUa's equipment namespace, Redpanda's `{domain}.{entity}.{event-type}` topic taxonomy, Protobuf schemas in the central `schemas` repo, and the dbt curated layer in Snowflake. What it had not, until now, explicitly committed to is **declaring** that these pieces together constitute the enterprise's canonical equipment / production / event model, and that consumers are entitled to treat them as an integration interface. -This subsection makes that declaration. It is the plan's answer to **Digital Twin Use Cases 1 and 3** (see **Strategic Considerations → Digital twin**) and — independent of digital twin framing — is load-bearing for pillar 2 (analytics/AI enablement) because a canonical model is what makes "not possible before" cross-domain analytics possible at all. +This subsection makes that declaration. It is load-bearing for pillar 2 (analytics/AI enablement) because a canonical model is what makes "not possible before" cross-domain analytics possible at all. > **Schemas-repo dependency — partially resolved.** The OtOpcUa team has contributed an initial seed at [`schemas/`](../schemas/) (temporary location in the 3-year-plan repo until the dedicated `schemas` repo is created — Gitea push-to-create is disabled). 
The seed includes: > - JSON Schema format definitions (`format/equipment-class.schema.json` with an `extends` field for class inheritance, `format/tag-definition.schema.json`, `format/uns-subtree.schema.json`) @@ -613,7 +620,7 @@ Consumers that need to know "what does a `Faulted` state mean" or "what are all ##### Canonical machine state vocabulary -The plan commits to a **single authoritative set of machine state values** used consistently across layer-3 state derivations, Redpanda event payloads, and dbt curated views. This is the answer to Digital Twin Use Case 1. +The plan commits to a **single authoritative set of machine state values** used consistently across layer-3 state derivations, Redpanda event payloads, and dbt curated views. Starting set (subject to refinement during implementation, but the names and semantics below are committed as the baseline): @@ -741,58 +748,42 @@ _TBD — named owners for each pillar's criterion; quarterly progress metrics (e External strategic asks that are **not** part of this plan's three pillars but that the plan should be *shaped to serve* when they materialize. None of these commit the plan to deliver anything — they are constraints on how components are built so that future adjacent initiatives can consume them. -### Digital twin (management ask — use cases received 2026-04-15) +### Digital twin (scope: two access-control patterns) -**Status: management has delivered the requirements; the plan absorbs two of the three use cases and treats the third as exploratory.** The plan does not add a new "digital twin workstream" to `roadmap.md`, and no pillar criterion depends on a digital twin deliverable. What the plan does is **commit to the pieces** that management's three use cases actually require, as additions to existing components rather than as a parallel initiative. See [`goal-state/digital-twin-management-brief.md`](goal-state/digital-twin-management-brief.md) → "Outcome" for the meeting resolution. 
+**Status: scope is definitive as of 2026-04-24.** This plan's digital-twin scope is exactly **two access-control patterns**, both delivered by architecture already committed elsewhere in this document. No new component, no new workstream, no pillar criterion depends on a digital-twin deliverable. Anything else stakeholders may call "digital twin" (physics simulation, FAT / commissioning emulation, 3D visualization, genealogy tracking, predictive-maintenance AI) is explicitly **not** in the plan's digital-twin scope — it either belongs to an adjacent initiative, a different pillar, or a separately funded future effort. -#### Management-provided use cases +#### The two patterns -These are the **only requirements** management can provide — high-level framing, no product selection, no sponsor, no timeline beyond "directionally, this is what we want." Captured here verbatim in intent; the source document lives at [`../digital_twin_usecases.md.txt`](../digital_twin_usecases.md.txt) in its original form. +1. **Environment-lifecycle promotion without reconfiguration.** Dev / QA / Prod System Platform instances all consume equipment through OtOpcUa; promotion of a piece of equipment from dev → qa → prod is an ACL flip that moves the single write-holder grant from `sp-dev` → `sp-qa` → `sp-prod` against the same equipment UUID. The connection is configured once; only write authority moves. Replaces today's disable-dev / enable-qa / re-create-connection pattern, and eliminates the stomping-on-each-other risk when multiple environments would otherwise each need write access to the single physical equipment. +2. **Safe read-only consumption for KPI / monitoring systems.** Ignition KPI views, Power BI dashboards, observability / monitoring consumers, and any future read-only analytics consumer get read-only access to canonical equipment streams with **zero write path to physical equipment**. 
The guarantee is structural (single OPC UA session per piece of equipment, owned by OtOpcUa — no equipment-side session exists for a read-only consumer to misuse) rather than procedural (relying on the consumer's own code to not issue writes). Materially reduces the risk of adding new KPI / monitoring consumers to the estate. -1. **Standardized Equipment State / Metadata Model.** A consistent, high-level representation of machine state derived from raw signals: Running / Idle / Faulted / Starved / Blocked. Normalized across equipment types. Single authoritative machine state, derived from multiple interlocks and status bits. Actual-vs-theoretical cycle time. Top-fault instead of dozens of raw alarms. Value: single consistent view of equipment behavior, reduced downstream complexity, improved KPI accuracy (OEE, downtime). -2. **Virtual Testing / Simulation (FAT, Integration, Validation).** A digital representation of equipment that emulates signals, states, and sequences, so automation logic / workflows / integrations can be tested without physical machines. Replay of historical scenarios, synthetic scenarios, edge-case coverage. Value: earlier testing, reduced commissioning time and risk, improved deployed-system stability. -3. **Cross-System Data Normalization / Canonical Model.** A common semantic layer between systems: standardized data structures for equipment, production, and events. Translates system-specific formats into a unified model. Consistent interface for all consumers. Uniform event definitions (`machine fault`, `job complete`). Value: simplified integration, reduced duplication of transformation logic, improved consistency across the enterprise. +**Implementation surface:** both patterns live in **OtOpcUa → Consumer access patterns enabled by the ACL model**. 
The structural substrate that makes them possible is: (a) the canonical model's stable equipment UUIDs, (b) the single OPC UA session per piece of equipment owned by OtOpcUa, (c) the read-vs-write-distinguishing ACL model committed in OtOpcUa v2. -Management's own framing of the combined outcome: "a translator (raw signals → meaningful state), a simulator (test without physical dependency), and a standard interface (consistent data across systems)." +#### Design constraints for future adjacent initiatives -#### Plan mapping — what each use case costs this plan +If a later adjacent initiative builds something stakeholders want to call "digital twin" on top of this plan's foundation (physics simulation, 3D visualization, a twin product surface), these constraints apply — they are already committed plan decisions, restated here so adjacent initiatives consume this plan cleanly: -| # | Use case | Maps to existing plan components | Delta this plan commits to | -|---|---|---|---| -| 1 | Standardized equipment state model | Layer 3 (Aveva System Platform + Ignition state derivation) for real-time; dbt curated layer for historical; Redpanda event schemas for event-level state transitions | **Canonical machine state vocabulary.** Adopt `Running / Idle / Faulted / Starved / Blocked` (plus any additions agreed during implementation) as the **authoritative state set** across layer-3 derivations, Redpanda event payloads, and dbt curated views. No new component — commitment is that every surface uses the same state values, and the vocabulary is published in the central `schemas` repo. See **Async Event Backbone → Canonical Equipment, Production, and Event Model.** | -| 2 | Virtual testing / simulation | Not served today by the plan, and not going to be served by a full simulation stack. | **Simulation-lite via replay.** Redpanda's analytics-tier retention (30 days) already enables historical event replay to exercise downstream consumers. 
OtOpcUa's namespace architecture can in principle host a future "simulated" namespace that replays historical equipment data to exercise tier-1 and tier-2 consumers — architecturally supported, not committed for build in this plan. **Full commissioning-grade simulation stays out of scope** pending a separate funded initiative. | -| 3 | Cross-system canonical model | OtOpcUa equipment namespace (canonical OPC UA surface); Redpanda topic taxonomy (`{domain}.{entity}.{event-type}`) + Protobuf schemas; dbt curated layer (canonical analytics model) — all three already committed. | **Canonical model declaration.** The plan already builds the pieces; what it did not do is **declare** that these pieces together constitute a canonical equipment/production/event model that consumers are entitled to use as an integration interface. This declaration lives in the central `schemas` repo as first-class content and is referenced from every surface that exposes the model. See **Async Event Backbone → Canonical Equipment, Production, and Event Model.** | +- **Must consume equipment data through OtOpcUa.** No direct equipment OPC UA sessions. +- **Must consume historical and analytical data through Snowflake + dbt** — not Historian directly, not a bespoke pipeline. The `≤15-minute analytics` SLO is the freshness budget available. +- **Must consume event streams through Redpanda** — not a parallel bus. The same schemas-in-git and `{domain}.{entity}.{event-type}` topic naming apply. The canonical state vocabulary and canonical model declaration (see **Async Event Backbone → Canonical Equipment, Production, and Event Model**) are how consistent state semantics are delivered. +- **Must stay within the IT↔OT boundary.** Enterprise-hosted twin capabilities cross through ScadaBridge central and the SnowBridge like every other enterprise consumer. 
-#### Resolution against the meeting brief's four buckets +#### What this commits / does not commit -The meeting brief framed four outcome buckets (#1 already-delivered, #2 adjacent-funded, #3 future-plan-cycle, #4 exploratory). Management's actual answer does not land in a single bucket — it **splits per use case:** +**Commits** (the two patterns — all substrate already committed elsewhere): -- **Use cases 1 and 3 → Bucket #1 with small plan additions.** The plan already delivers the substrate; it now also commits to the canonical state vocabulary (use case 1) and the canonical model declaration (use case 3), both captured below under **Async Event Backbone → Canonical Equipment, Production, and Event Model**. No new workstream, no new component, no pillar impact. -- **Use case 2 → Bucket #4, served minimally.** Replay-based "simulation-lite" is architecturally enabled by Redpanda's retention tiers and OtOpcUa's namespace model. Full FAT / commissioning / integration-test simulation remains out of scope for this plan. If a funded simulation initiative materializes later, this plan's foundation supports it; until then, the narrow answer to use case 2 is "replay what Redpanda already holds, and build a simulated OtOpcUa namespace when a specific testing need justifies it." - -#### Design constraints this imposes (unchanged) - -- **Any digital twin capability must consume equipment data through OtOpcUa.** No direct equipment OPC UA sessions. -- **Any digital twin capability must consume historical and analytical data through Snowflake + dbt** — not from Historian directly, not through a bespoke pipeline. The `≤15-minute analytics` SLO is the freshness budget available to it. -- **Any digital twin capability must consume event streams through Redpanda** — not a parallel bus. The same schemas-in-git and `{domain}.{entity}.{event-type}` topic naming apply. 
The canonical state vocabulary and canonical model declaration (see below) are how "consistent state semantics" is delivered. -- **Any digital twin capability must stay within the IT↔OT boundary.** Enterprise-hosted twins cross through ScadaBridge central and the SnowBridge like every other enterprise consumer. - -> **Unified Namespace vocabulary:** stakeholders framing the digital twin ask in "Unified Namespace" terms are asking for the same thing Use Cases 1 and 3 describe, just in UNS language. See **Target IT/OT Integration → Unified Namespace (UNS) posture** for the plan's explicit UNS framing and the decision trigger for a future MQTT/Sparkplug projection service. In short: the plan **already** delivers the UNS value proposition; an MQTT-native projection can be added later if a consumer specifically requires it. - -#### What this does and does not commit - -**Commits:** -- A canonical machine state vocabulary (`Running / Idle / Faulted / Starved / Blocked` + any additions), published in the `schemas` repo and used consistently across layer-3 derivations, Redpanda event schemas, and dbt curated views. -- A canonical equipment / production / event model declaration in the `schemas` repo, referencing the three surfaces (OtOpcUa, Redpanda, dbt) where it is exposed. -- Retention-tier replay of Redpanda analytics topics as a documented capability usable for integration testing and simulation-lite. +- A single OPC UA session per piece of equipment owned by OtOpcUa, keyed to stable equipment UUIDs. (See **OtOpcUa**.) +- An ACL model on OtOpcUa that distinguishes read from write and is scoped per equipment UUID. (See **OtOpcUa → Authorization model**.) +- The canonical model and stable UUID identity that make both patterns portable across environments and consumers. (See **Unified Namespace (UNS) posture** and **Async Event Backbone → Canonical Equipment, Production, and Event Model**.) 
**Does not commit:** -- Building or buying a full commissioning-grade simulation product (Aveva Digital Twin, Siemens NX, DELMIA, Azure Digital Twins, etc.). -- A digital twin UI, dashboard, 3D visualization, or product surface. -- Predictive / AI models specific to digital twin use cases — those are captured under pillar 2 as general analytics/AI enablement, not as digital-twin-specific deliverables. -- Any new workstream, pillar, or end-of-plan criterion tied to digital twin delivery. -_TBD — whether any equipment state additions beyond the five names above are needed (e.g., `Changeover`, `Maintenance`, `Setup`); ownership of the canonical state vocabulary in the `schemas` repo (likely a domain-specific team rather than the ScadaBridge team); whether a use-case-2 funded simulation initiative is on anyone's horizon._ +- The mechanism by which OtOpcUa arbitrates write-authority moves between environments (Admin UI switch, PR-merge on the `schemas` repo, release-pipeline step, or any combination). That is the OtOpcUa team's implementation decision and lives outside this plan. +- Any form of physics simulation, FAT / commissioning-grade integration emulation, 3D visualization, predictive-maintenance AI, or genealogy tracking branded as "digital twin." Adjacent initiatives and other pillars may build such things on this plan's foundation; this plan does not. +- Purchase or build of a commercial digital-twin product (Aveva Digital Twin, Siemens NX, DELMIA, Azure Digital Twins, etc.). +- Any new workstream in `roadmap.md`, any pillar, or any end-of-plan criterion tied to digital-twin delivery. + +_TBD — none remaining for this section. 
Canonical state vocabulary ownership and possible additions (`Changeover`, `Maintenance`, `Setup`) are tracked under **Async Event Backbone → Canonical machine state vocabulary**, where that work now lives._ ### Enterprise reporting: BOBJ → Power BI migration (adjacent initiative) diff --git a/goal-state/digital-twin-management-brief.md b/goal-state/digital-twin-management-brief.md deleted file mode 100644 index d3dc7ab..0000000 --- a/goal-state/digital-twin-management-brief.md +++ /dev/null @@ -1,150 +0,0 @@ -# Digital Twin — Management Conversation Brief - -A walk-into-the-meeting artifact for the **management conversation** that turns the ask ("we want digital twins") into a scoped response. - -> This brief is a **meeting prep document**, not plan content. The authoritative plan position on digital twin lives in [`../goal-state.md`](../goal-state.md) → **Strategic Considerations (Adjacent Asks)** → **Digital twin** — this file exists to prepare for the clarification conversation referenced there. - -## Outcome — conversation complete (2026-04-15) - -**Status: the conversation has happened.** Management delivered three concrete high-level use cases as their complete answer — that is all the requirements framing they can provide. Source document: [`../digital_twin_usecases.md.txt`](../digital_twin_usecases.md.txt). - -**The three use cases management delivered:** - -1. **Standardized Equipment State / Metadata Model** — raw signals → meaningful canonical state (`Running` / `Idle` / `Faulted` / `Starved` / `Blocked`), cycle-time accuracy, top-fault derivation. -2. **Virtual Testing / Simulation** — emulate equipment signals/states for automation-logic testing, FAT, integration validation, replay of historical and synthetic scenarios. -3. **Cross-System Data Normalization / Canonical Model** — common semantic layer with standardized equipment/production/event structures and uniform event definitions across systems. 
- -**Bucket resolution — splits across use cases, does not land in a single bucket:** - -| Use case | Bucket | Plan response | -|---|---|---| -| 1 — Standardized state model | **#1 with a small addition** — plan absorbs it. | Commit to a canonical machine state vocabulary (`Running / Idle / Faulted / Starved / Blocked` + TBD additions like `Changeover`, `Maintenance`). Derived at layer 3, published as an enum in the central `schemas` repo, consumed uniformly across Redpanda events and dbt curated views. See [`../goal-state.md`](../goal-state.md) → Async Event Backbone → **Canonical Equipment, Production, and Event Model** → **Canonical machine state vocabulary**. | -| 2 — Virtual testing / simulation | **#4 — served minimally, full scope exploratory.** | Replay-based simulation-lite enabled by Redpanda's `analytics`-tier retention (30 days); OtOpcUa's namespace architecture can accommodate a future `simulated` namespace without reshaping the component. Full commissioning-grade FAT / integration simulation stays **out of scope** for this plan. If a funded simulation initiative materializes, this plan's foundation supports it — no new workstream until then. | -| 3 — Cross-system canonical model | **#1 with a framing commitment** — plan absorbs it. | The plan already builds the pieces (OtOpcUa equipment namespace, Redpanda topic taxonomy, Protobuf schemas in central `schemas` repo, dbt curated layer). Commit to declaring these pieces as **the** canonical equipment/production/event model that consumers are entitled to treat as an integration interface. See [`../goal-state.md`](../goal-state.md) → Async Event Backbone → **Canonical Equipment, Production, and Event Model**. | - -**What this meeting did NOT produce** (deliberately, because management could not provide these details and the plan does not require them to move forward): - -- A named sponsor for a separately funded digital twin initiative. -- A budget or timeline for use case 2 (simulation). 
-- A specific vendor product selection. -- A "kind of twin" framing (equipment twin vs line twin vs genealogy twin vs simulation twin) — the three use cases above cut across multiple categories from the brief's Q2, which is fine given how the plan absorbs them. -- Any decision that would add a workstream to [`../roadmap.md`](../roadmap.md). - -**What comes next:** - -- **Use cases 1 and 3 are now plan commitments** and get implemented under existing workstreams (Redpanda EventHub for the schemas/vocabulary, Snowflake dbt Transform Layer for the curated-view side). See [`../roadmap.md`](../roadmap.md) → Year 1 updates. -- **Use case 2 remains open as an exploratory item.** The narrower open question carried forward is tracked in [`../status.md`](../status.md) → Top pending items: "Simulation initiative (digital twin use case 2) — exploratory; no plan action until/unless a funded initiative materializes with a sponsor." - -**This brief is retained for reference.** The pre-meeting framing (question priority, interpretation table, decision tree, four-bucket framework) remains useful if a follow-up conversation is needed — especially around use case 2 (simulation scoping) or if management surfaces additional use cases beyond the three above. The rest of the document continues below unchanged for that purpose. - ---- - -## Goal of the meeting - -Come out with enough information to place the ask into **one of four buckets**: - -1. **Already delivered by this plan.** The "real" need is a Snowflake-backed historical / predictive view of equipment health and performance. Recommendation: no new workstream; the first twin use case lands in Year 2 or Year 3 as one of pillar 2's "not possible before" analytics use cases. This is the predicted outcome (see `goal-state.md` → Digital twin → "Likely outcome of this conversation"). -2. 
**Adjacent initiative, consumes this plan's foundation.** A funded, sponsored, separately-scoped twin effort runs alongside this plan and consumes OtOpcUa, Redpanda, Snowflake, and the SnowBridge as its data substrate. Recommendation: no changes to this plan's pillars; digital twin team owns delivery; this plan commits to keeping the foundation consumable. -3. **Folded into a future version of this plan.** A twin capability becomes a new pillar in a v2 of this plan — not today. Recommendation: document the agreement, park until the next planning cycle. -4. **Genuinely undefined — exploratory ask.** Management wants us to "look at it" but has no problem statement, sponsor, or timeline. Recommendation: run a scoped proof-of-concept (one equipment class, one site) on OtOpcUa's new equipment namespace as an inexpensive, low-commitment response; defer the bigger question. - -Any outcome other than these four means the conversation did not converge; schedule a follow-up rather than try to commit on the spot. - -## Suggested opener - -> "Thanks for raising digital twin as something you want us to look at. Before we commit anything into the 3-year plan, we want to make sure what we build actually lands against what you're after — 'digital twin' covers enough different things that it's worth an hour to sharpen the ask. We've come with a short list of clarifying questions. Good news up front: most of the likely shapes of this ask are already served by the foundation we're building for analytics and AI enablement, so this conversation is more likely to end with 'here's how you already get it' than 'we need a new workstream.'" - -This framing is deliberately **not** defensive. The plan already shapes its components for a prospective digital twin layer; we're not pushing back, we're helping the ask land in a form we can execute against. 
- -## Question priority grouping - -The 8 questions in `goal-state.md` are all useful, but they are not equally diagnostic for placing the ask into one of the four buckets. Use this order: - -### Must-answer (drive the bucket decision) - -These three typically resolve the entire conversation: - -- **Q1. What problem are you trying to solve?** — The single most diagnostic question. If the answer is framed in terms of downtime, predictive maintenance, quality yields, or compliance evidence, the likely bucket is #1 (Snowflake-backed) or #2 (adjacent initiative on this foundation). If it is framed in terms of operator training or line simulation, the likely bucket is #2 (adjacent, probably vendor product) or #4 (exploratory). If there is no problem — "we just need to be doing digital twin" — the bucket is #4. -- **Q7. Is there a named sponsor and funding?** — Hard gate between buckets. Sponsor + funding → bucket #2. No sponsor, no funding → bucket #4. Future plan cycle → bucket #3. This question also controls how much time it's worth spending on the other seven. -- **Q8. Is this connected to an initiative already underway?** — If yes (operational excellence, predictive maintenance pilot, AI/ML platform, sustainability dashboards), the "real" ask is that parent initiative and we should talk to it directly. Finding the parent is often the fastest path to bucket #1. - -### Nice-to-have (sharpen the scope once the bucket is known) - -Once the bucket is known, these refine the response: - -- **Q2. Which *kind* of digital twin?** — Pins the architectural fit. Equipment/asset twin → OtOpcUa + real-time layer. Line/cell twin → Snowflake + dbt. Product/genealogy twin → Camstar MES, **not** this plan. Simulation twin → vendor product. Predictive/AI twin → Snowflake + dbt + an ML layer. -- **Q4. Real-time, historical, predictive, or simulation?** — Overlaps with Q2 but is useful as a sanity-check if the answer to Q2 is "a bit of everything" (which usually means "undefined"). 
-- **Q5. Scope and timing?** — Converts an abstract ask into something you can actually say yes or no to. Also the easiest question to get a "someday" answer on, which is itself informative. - -### Skip if time is short - -- **Q3. Who uses it?** — Helpful if answered crisply, usually vague if not. Can be deferred to a follow-up. -- **Q6. Assumed product?** — Only relevant if the bucket is #2 and build-vs-buy is on the table. Irrelevant if we're in bucket #1, #3, or #4. - -## Interpretation table — likely answer patterns and what they mean - -| If the answer sounds like... | The real ask is probably... | Bucket | Response | -|---|---|---|---| -| "Reduce unplanned downtime on our critical equipment" | Predictive maintenance on historical equipment data | #1 | "This is a pillar 2 use case. Year 2–3 delivery on the dbt curated layer." | -| "See equipment state in real time from anywhere" | Real-time equipment dashboard | #1 or #2 | Year 2+ on Ignition + Snowflake (pillar 2) if enterprise-read-only; separate initiative if interactive/bidirectional. | -| "Train operators without touching real equipment" | Simulation / process twin | #2 | Vendor product (Aveva Digital Twin, DELMIA, Siemens NX). Separate initiative — this plan provides the data substrate only. | -| "Track every part through the factory with its full history" | Product / genealogy twin | Not this plan | Camstar MES territory — direct management to the Camstar owner. | -| "Forecast future equipment failures from sensor data" | Predictive / AI twin | #1 | Pillar 2 use case. Year 2–3 on the curated layer + an ML layer. | -| "We saw a demo of \ and want to evaluate it" | Vendor-driven exploration | #4 or #2 | Proof-of-concept, scoped to one equipment class on OtOpcUa's equipment namespace. 
| -| "The board wants to hear about our digital transformation" | No concrete ask; political positioning | #4 | Reframe as "here's what we're already doing that counts as digital transformation" rather than building something new. | -| "\ needs a digital twin component" | The parent initiative is the real ask | Depends on parent | Route the conversation to the parent initiative's sponsor. | - -## Decision tree - -Use this in the moment to place the ask: - -``` -Is there a named sponsor and funding? (Q7) -├── No → Is there a concrete problem? (Q1) -│ ├── No → Bucket #4 (exploratory). Offer: PoC on one equipment class, deferred bigger decision. -│ └── Yes → Does it fit pillar 2? (Q1, Q4) -│ ├── Yes → Bucket #1. Already delivered; Year 2–3 use case. -│ └── No → Bucket #3. Park for next planning cycle. -└── Yes → Is there a parent initiative? (Q8) - ├── Yes → Route to parent initiative owner. Out of this plan's hands. - └── No → Does it fit the foundation this plan delivers? (Q2, Q4) - ├── Yes → Bucket #2. Adjacent, consumes this plan's foundation. - └── No → Bucket #2 anyway, but flag that the foundation gap may need to be filled. -``` - -## Non-negotiables to hold in the conversation - -Whatever the bucket turns out to be, these are already committed positions of the plan and should not be renegotiated in the meeting: - -- **Any twin must consume equipment data through OtOpcUa.** No direct equipment OPC UA sessions. -- **Any twin must consume historical/analytical data through Snowflake + dbt.** No direct Historian pulls, no bespoke pipelines. -- **Any twin must consume event streams through Redpanda.** No parallel messaging bus. -- **Any twin must stay within the IT↔OT boundary** — enterprise-hosted twins cross through ScadaBridge central and the SnowBridge like every other enterprise consumer. - -These are listed in `goal-state.md` → Digital twin → "Design constraints this imposes." Restate them if the conversation drifts toward a parallel integration path.
- -## Outputs of the meeting - -Bring back: - -1. The **bucket assignment** (or a reason the conversation did not converge and needs a follow-up). -2. The **sponsor and funding** status, if known. -3. Any **parent initiative** identified. -4. A **one-line summary** of the actual problem the ask exists to solve, in management's own words — this is the quotable thing you'll use to explain the decision later. -5. Agreement on the **next action**: file the use case into pillar 2, stand up a PoC, park until next planning cycle, or route to a parent initiative owner. - -If you come back without (1) and (5), the meeting did not do its job — schedule the follow-up before leaving the room. - -## What to do after the meeting - -- If **bucket #1**: update `goal-state.md` → Digital twin section with a one-line pointer noting "resolved to pillar 2 analytics use case" and a date. Add the use case to the pillar 2 candidate list. Remove the top-pending-item entry from `../status.md`. -- If **bucket #2**: update `goal-state.md` with the sponsor, scope, and foundation touchpoints. No changes to pillars. Keep this brief on file for the adjacent initiative's kickoff. -- If **bucket #3**: note the agreement in `goal-state.md` and move on. Surface in the next planning cycle. -- If **bucket #4**: document the PoC scope in `goal-state.md` (one equipment class, one site, one quarter) and kick it off as a Year 1 side activity on OtOpcUa. Do **not** add a workstream to `roadmap.md` — PoCs don't belong on the grid. - ---- - -**Related:** -- [`../goal-state.md`](../goal-state.md) → Strategic Considerations → Digital twin — plan position and design constraints. -- [`../goal-state.md`](../goal-state.md) → OtOpcUa — "any future consumers such as a prospective digital twin layer." -- [`../status.md`](../status.md) → Top pending items — where this meeting sits in the open-work queue. 
diff --git a/outputs/DESIGN.md b/outputs/DESIGN.md index 9450d97..e9b4987 100644 --- a/outputs/DESIGN.md +++ b/outputs/DESIGN.md @@ -31,7 +31,7 @@ Together these reduce "what Claude has to decide" to **text phrasing inside a fi ``` plan/ ├── current-state.md, goal-state.md, roadmap.md, status.md (existing, unchanged) -├── current-state/, goal-state/ (existing, unchanged) +├── current-state/ (existing, unchanged) └── outputs/ (NEW) ├── README.md — trigger phrases + numbered regeneration checklist ├── DESIGN.md — this document @@ -112,7 +112,7 @@ Slide-by-slide mapping (full detail lives in `outputs/presentation-spec.md`): **Page setup:** Letter, 1" margins, chapter-name running header, page-number + as-of-date footer. `document-skills:theme-factory` default serif. -**Excluded from PDF:** `status.md`, `CLAUDE.md`, `goal-state/digital-twin-management-brief.md`, `outputs/*` (all meta, working, or prep content — not plan content). +**Excluded from PDF:** `status.md`, `CLAUDE.md`, `outputs/*` (all meta, working, or prep content — not plan content). 
## Section 5 — Diagrams diff --git a/outputs/IMPLEMENTATION-PLAN.md b/outputs/IMPLEMENTATION-PLAN.md index 1e4b0de..c6c66ae 100644 --- a/outputs/IMPLEMENTATION-PLAN.md +++ b/outputs/IMPLEMENTATION-PLAN.md @@ -88,7 +88,7 @@ Expected: `DESIGN.md IMPLEMENTATION-PLAN.md diagrams generated` (README, spec - Appendix A: `current-state/legacy-integrations.md` - Appendix B: `current-state/equipment-protocol-survey.md` - Transformation rules (numbered heading, link normalization, `_TBD_` highlight, ASCII diagram preservation, table handling) -- Exclusion list (`status.md`, `CLAUDE.md`, `goal-state/digital-twin-management-brief.md`, `outputs/*`) +- Exclusion list (`status.md`, `CLAUDE.md`, `outputs/*`) **Verification:** - File exists diff --git a/outputs/longform-spec.md b/outputs/longform-spec.md index 643779b..4a98371 100644 --- a/outputs/longform-spec.md +++ b/outputs/longform-spec.md @@ -47,7 +47,6 @@ These files are **not** part of the PDF — do not include them even if they see |---|---| | [`../CLAUDE.md`](../CLAUDE.md) | Repo meta — instructions for Claude, not plan content. | | [`../status.md`](../status.md) | Working bookmark — a session-state artifact, not authoritative plan content. | -| [`../goal-state/digital-twin-management-brief.md`](../goal-state/digital-twin-management-brief.md) | Meeting prep artifact. Its own header explicitly says it is not plan content. | | [`./README.md`](README.md), [`./DESIGN.md`](DESIGN.md), [`./presentation-spec.md`](presentation-spec.md), [`./longform-spec.md`](longform-spec.md), [`./IMPLEMENTATION-PLAN.md`](IMPLEMENTATION-PLAN.md), [`./run-log.md`](run-log.md) | Output pipeline files — the pipeline does not document itself inside its own output. | | [`./diagrams/*`](diagrams/), [`./generated/*`](generated/) | Output pipeline artifacts. 
| @@ -85,7 +84,7 @@ Markdown links between plan files are resolved to section references in the rend | `[legacy-integrations.md](current-state/legacy-integrations.md)` | "see Appendix A — Legacy Integrations Inventory" | | `[equipment-protocol-survey.md](current-state/equipment-protocol-survey.md)` | Render as **plain text** — file removed. Log as warning. | | Intra-file anchor links like `[X](#section-name)` | Rendered as internal PDF cross-reference to the numbered section (e.g., "see §1.2") | -| Links to excluded files (e.g., `status.md`, `digital-twin-management-brief.md`) | Rendered as **plain text** — the link target is dropped, the link text stays. Logged as a warning in the run log. | +| Links to excluded files (e.g., `status.md`) | Rendered as **plain text** — the link target is dropped, the link text stays. Logged as a warning in the run log. | | External links (http://, https://) | Rendered as clickable external links, unchanged. | | Unresolvable links (file not found) | Rendered as plain text, logged as a warning in the run log. **Do not silently drop.** | diff --git a/outputs/presentation-spec.md b/outputs/presentation-spec.md index 2abe145..3af6498 100644 --- a/outputs/presentation-spec.md +++ b/outputs/presentation-spec.md @@ -172,7 +172,7 @@ If `document-skills:pptx` cannot render a requested layout: |---|---| | **Layout** | 2-column content (fallback: single column with horizontal rule) | | **Source** | [`../goal-state.md`](../goal-state.md) → **Strategic Considerations (Adjacent Asks)** | -| **Population** | **Left column — Digital Twin:** 4 bullets: (1) Management ask, not a committed workstream; (2) Plan shaped to serve if it materializes (OtOpcUa, Redpanda, Snowflake); (3) 8 clarification questions + 4-bucket decision framework ready; (4) Next: schedule management conversation — brief at `goal-state/digital-twin-management-brief.md`. 
**Right column — BOBJ → Power BI:** 4 bullets: (1) In-flight reporting initiative, not owned by this plan; (2) Three consumption paths analyzed (Snowflake dbt / Historian direct / both); (3) Recommended position: Path C — hybrid, with Path A as strategic direction; (4) Next: schedule coordination conversation with reporting team — 8 questions ready in `goal-state.md`. | +| **Population** | **Left column — Digital twin (scope: two access-control patterns):** 4 bullets: (1) Scope is definitive — not a committed workstream, not a new component; (2) Pattern 1 — environment-lifecycle promotion without reconfiguration (ACL flip on write authority); (3) Pattern 2 — safe read-only consumption for KPI / monitoring systems (structural zero-write-path guarantee); (4) Both patterns are delivered by already-committed architecture (OtOpcUa ACL model + canonical model + single-connection-per-equipment). **Right column — BOBJ → Power BI:** 4 bullets: (1) In-flight reporting initiative, not owned by this plan; (2) Three consumption paths analyzed (Snowflake dbt / Historian direct / both); (3) Recommended position: Path C — hybrid, with Path A as strategic direction; (4) Next: schedule coordination conversation with reporting team — 8 questions ready in `goal-state.md`. | ## Slide 17 — Non-Goals @@ -189,7 +189,7 @@ If `document-skills:pptx` cannot render a requested layout: |---|---| | **Layout** | Content (bulleted) | | **Source** | [`../status.md`](../status.md) → **Top pending items** + inferred from [`../roadmap.md`](../roadmap.md) → Year 1 | -| **Population** | 5 bullets: (1) Sponsor confirmation + Year 1 funding commitment; (2) Named owners for each of the 7 workstreams (build team alignment); (3) Digital Twin management conversation — schedule (see brief); (4) Power BI coordination conversation with reporting team — schedule; (5) Equipment protocol survey owner named (Q1 Year 1 prerequisite for OtOpcUa core driver library). 
| +| **Population** | 4 bullets: (1) Sponsor confirmation + Year 1 funding commitment; (2) Named owners for each of the 7 workstreams (build team alignment); (3) Power BI coordination conversation with reporting team — schedule; (4) UNS hierarchy snapshot walk owner named (Q1–Q2 Year 1 prerequisite for canonical model v1 publication). | | **Notes** | This is the closer slide. Each bullet should be a discrete ask with a clear "who needs to do what" so the audience leaves with action. | --- diff --git a/roadmap.md b/roadmap.md index 922abad..a0027d1 100644 --- a/roadmap.md +++ b/roadmap.md @@ -64,9 +64,9 @@ The roadmap is laid out as a 2D grid — **workstreams** (rows) crossed with **y | Workstream | **Year 1 — Foundation** | **Year 2 — Scale** | **Year 3 — Completion** | |---|---|---|---| | **OtOpcUa** | **Evolve LmxOpcUa into OtOpcUa** — extend the existing in-house OPC UA server to add (a) a new equipment namespace with single session per equipment via native protocols translated to OPC UA (committed core drivers: OPC UA Client, Modbus TCP, AB CIP, AB Legacy, S7, TwinCAT, FOCAS, plus Galaxy carried forward), and (b) clustering (non-transparent redundancy, 2-node per site) on top of the existing per-node deployment. **Driver stability tiers:** Tier A in-process (Modbus, OPC UA Client), Tier B in-process with guards (S7, AB CIP, AB Legacy, TwinCAT), Tier C out-of-process (Galaxy — bitness constraint, FOCAS — uncatchable AVE). Core driver list confirmed by v2 implementation team (protocol survey no longer needed for driver scoping). **UNS hierarchy snapshot walk** — per-site equipment-instance discovery (site/area/line/equipment + UUID assignment) to feed the initial schemas-repo hierarchy definition and canonical model; target done Q1–Q2. **ACL model designed and committed** (decisions #129–132): 6-level scope hierarchy, `NodePermissions` bitmask, generation-versioned `NodeAcl` table, Admin UI + permission simulator. Phase 1 ships before any driver phase. 
**Deploy OtOpcUa to every site** as fast as practical. **Begin tier 1 cutover (ScadaBridge)** at large sites. **Prerequisite: certificate-distribution** to consumer trust stores before each cutover. **Aveva System Platform IO pattern validation** — Year 1 or early Year 2 research to confirm Aveva supports upstream OPC UA data sources, well ahead of Year 3 tier 3. _TBD — first-cutover site selection; **cutover plan owner** (not OtOpcUa — a separate integration/operations team, per decision #136, not yet named); enterprise shortname for UNS hierarchy root; schemas-repo owner team and dedicated repo creation._ | **Complete tier 1 (ScadaBridge)** across all sites. **Begin tier 2 (Ignition)** — Ignition consumers redirected from direct-equipment OPC UA to each site's OtOpcUa, collapsing WAN session counts from *N per equipment* to *one per site*. **Build long-tail drivers** on demand as sites require them. Resolve Warsaw per-building multi-cluster consumer-addressing pattern (consumer-side stitching vs site-aggregator OtOpcUa instance). _TBD — per-site tier-2 rollout sequence._ | **Complete tier 2 (Ignition)** across all sites. **Execute tier 3 (Aveva System Platform IO)** with compliance stakeholder validation — the hardest cutover because System Platform IO feeds validated data collection. Reach steady state: every equipment session is held by OtOpcUa, every downstream consumer reads OT data through it. _TBD — per-equipment-class criteria for System Platform IO re-validation._ | -| **Redpanda EventHub** | Stand up central Redpanda cluster in South Bend (single-cluster HA). Stand up bundled Schema Registry. Wire SASL/OAUTHBEARER to enterprise IdP. Create initial topic set (prefix-based ACLs). Hook up observability minimum signal set. Define the three retention tiers (`operational`/`analytics`/`compliance`). **Stand up the central `schemas` repo** with `buf` CI, CODEOWNERS, and the NuGet publishing pipeline. 
**Publish the canonical equipment/production/event model v1** — including the canonical machine state vocabulary (`Running / Idle / Faulted / Starved / Blocked` + any agreed additions) as a Protobuf enum, the `equipment.state.transitioned` event schema, and initial equipment-class definitions for pilot equipment. This is the foundation for Digital Twin Use Cases 1 and 3 (see `goal-state.md` → Strategic Considerations → Digital twin) and is load-bearing for pillar 2. **Pilot equipment class for canonical definition: FANUC CNC** (pre-defined FOCAS2 hierarchy already exists in OtOpcUa v2 driver design). Land the FANUC CNC class template in the schemas repo before Tier 1 cutover begins. **Universal `_base` equipment-class template** seeded by the OtOpcUa team — every other class extends it via the `extends` field on the equipment-class JSON Schema. `_base` aligns to **OPC UA Companion Spec OPC 40010 (Machinery)** for the Identification component (Manufacturer, Model, ProductInstanceUri, SerialNumber, HardwareRevision, SoftwareRevision, YearOfConstruction, ManufacturerUri, DeviceManual, AssetLocation) and MachineryOperationMode enum, **OPC UA Part 9** for alarm-summary fields, and **ISO 22400** for lifetime counters that feed Availability + Performance KPIs. Avoids per-class drift in identity / state / alarm field naming and ensures every machine in the estate exposes the same baseline metadata regardless of vendor. _TBD — sizing decisions, initial topic list, canonical vocabulary ownership (domain SME group)._ | Expand topic coverage as additional domains onboard. Enforce tiered retention and ACLs at scale. Prove backlog replay after a WAN-outage drill (also exercises the Digital Twin Use Case 2 simulation-lite replay path). Exercise long-outage planning (ScadaBridge queue capacity vs. outage duration). Iterate the canonical model as additional equipment classes and domains onboard. _TBD — concrete drill cadence._ | Steady-state operation. 
Harden alerting and runbooks against the observed failure modes from Years 1–2. Canonical model is mature and covers every in-scope equipment class; schema changes are routine rather than foundational. | +| **Redpanda EventHub** | Stand up central Redpanda cluster in South Bend (single-cluster HA). Stand up bundled Schema Registry. Wire SASL/OAUTHBEARER to enterprise IdP. Create initial topic set (prefix-based ACLs). Hook up observability minimum signal set. Define the three retention tiers (`operational`/`analytics`/`compliance`). **Stand up the central `schemas` repo** with `buf` CI, CODEOWNERS, and the NuGet publishing pipeline. **Publish the canonical equipment/production/event model v1** — including the canonical machine state vocabulary (`Running / Idle / Faulted / Starved / Blocked` + any agreed additions) as a Protobuf enum, the `equipment.state.transitioned` event schema, and initial equipment-class definitions for pilot equipment. This is load-bearing for pillar 2 (canonical model is what makes cross-domain "not possible before" analytics possible at all). **Pilot equipment class for canonical definition: FANUC CNC** (pre-defined FOCAS2 hierarchy already exists in OtOpcUa v2 driver design). Land the FANUC CNC class template in the schemas repo before Tier 1 cutover begins. **Universal `_base` equipment-class template** seeded by the OtOpcUa team — every other class extends it via the `extends` field on the equipment-class JSON Schema. `_base` aligns to **OPC UA Companion Spec OPC 40010 (Machinery)** for the Identification component (Manufacturer, Model, ProductInstanceUri, SerialNumber, HardwareRevision, SoftwareRevision, YearOfConstruction, ManufacturerUri, DeviceManual, AssetLocation) and MachineryOperationMode enum, **OPC UA Part 9** for alarm-summary fields, and **ISO 22400** for lifetime counters that feed Availability + Performance KPIs. 
Avoids per-class drift in identity / state / alarm field naming and ensures every machine in the estate exposes the same baseline metadata regardless of vendor. _TBD — sizing decisions, initial topic list, canonical vocabulary ownership (domain SME group)._ | Expand topic coverage as additional domains onboard. Enforce tiered retention and ACLs at scale. Prove backlog replay after a WAN-outage drill (the replay surface is also the foundation for any future funded physics-simulation / FAT initiative, should one materialize). Exercise long-outage planning (ScadaBridge queue capacity vs. outage duration). Iterate the canonical model as additional equipment classes and domains onboard. _TBD — concrete drill cadence._ | Steady-state operation. Harden alerting and runbooks against the observed failure modes from Years 1–2. Canonical model is mature and covers every in-scope equipment class; schema changes are routine rather than foundational. | | **SnowBridge** | Design and begin custom build in .NET. **Filtered, governed upload to Snowflake is the Year 1 purpose** — the service is the component that decides which topics/tags flow to Snowflake, applies the governed selection model, and writes into Snowflake. Ship an initial version with **one working source adapter** — starting with **Aveva Historian (SQL interface)** because it's central-only, exists today, and lets the workstream progress in parallel with Redpanda rather than waiting on it. First end-to-end **filtered** flow to Snowflake landing tables on a handful of priority tags. Selection model in place even if the operator UI isn't yet (config-driven is acceptable for Year 1). _TBD — team, credential management, datastore for selection state._ | Add the **ScadaBridge/Redpanda source adapter** alongside Historian. Build and ship the operator **web UI + API** on top of the Year 1 selection model, including the blast-radius-based approval workflow, audit trail, RBAC, and exportable state. 
Onboard priority tags per domain under the UI-driven governance path. _TBD — UI framework._ | All planned source adapters live behind the unified interface. Approval workflow tuned based on Year 2 operational experience. Feature freeze; focus on hardening. | -| **Snowflake dbt Transform Layer** | Scaffold a dbt project in git, wired to the self-hosted orchestrator (per `goal-state.md`; specific orchestrator chosen outside this plan). Build first **landing → curated** model for priority tags. **Align curated views with the canonical model v1** published in the `schemas` repo — equipment, production, and event entities in the curated layer use the canonical state vocabulary and the same event-type enum values, so downstream consumers (Power BI, ad-hoc analysts, future AI/ML) see the same shape of data Redpanda publishes. This is the dbt-side delivery for Digital Twin Use Cases 1 and 3. Establish `dbt test` discipline from day one — including tests that catch divergence between curated views and the canonical enums. _TBD — project layout (single vs per-domain); reconciliation rule if derived state in curated views disagrees with the layer-3 derivation (should not happen, but the rule needs to exist)._ | Build curated layers for all in-scope domains. **Ship a canonical-state-based OEE model** as a strong candidate for the pillar-2 "not possible before" use case — accurate cross-equipment, cross-site OEE computed once in dbt from the canonical state stream, rather than re-derived in every reporting surface. Source-freshness SLAs tied to the **≤15-minute analytics** budget. Begin development of the first **"not possible before" AI/analytics use case** (pillar 2). | The "not possible before" use case is **in production**, consuming the curated layer, meeting its own SLO. Pillar 2 check passes. | +| **Snowflake dbt Transform Layer** | Scaffold a dbt project in git, wired to the self-hosted orchestrator (per `goal-state.md`; specific orchestrator chosen outside this plan). 
Build first **landing → curated** model for priority tags. **Align curated views with the canonical model v1** published in the `schemas` repo — equipment, production, and event entities in the curated layer use the canonical state vocabulary and the same event-type enum values, so downstream consumers (Power BI, ad-hoc analysts, future AI/ML) see the same shape of data Redpanda publishes. This is the dbt-side delivery of the canonical model (load-bearing for pillar 2). Establish `dbt test` discipline from day one — including tests that catch divergence between curated views and the canonical enums. _TBD — project layout (single vs per-domain); reconciliation rule if derived state in curated views disagrees with the layer-3 derivation (should not happen, but the rule needs to exist)._ | Build curated layers for all in-scope domains. **Ship a canonical-state-based OEE model** as a strong candidate for the pillar-2 "not possible before" use case — accurate cross-equipment, cross-site OEE computed once in dbt from the canonical state stream, rather than re-derived in every reporting surface. Source-freshness SLAs tied to the **≤15-minute analytics** budget. Begin development of the first **"not possible before" AI/analytics use case** (pillar 2). | The "not possible before" use case is **in production**, consuming the curated layer, meeting its own SLO. Pillar 2 check passes. | | **ScadaBridge Extensions** | Implement **deadband / exception-based publishing** with the global-default model (+ override mechanism). Add **EventHub producer** capability with per-call **store-and-forward** to Redpanda. Verify co-located footprint doesn't degrade System Platform. _TBD — global deadband value, override mechanism location._ | Roll deadband + EventHub producer to **all currently-integrated sites**. Tune deadband and overrides based on observed Snowflake cost. Support early legacy-retirement work with outbound Web API / DB write patterns as needed. | Steady state. 
Any remaining Extensions work is residual cleanup or support for the tail end of Site Onboarding / Legacy Retirement. | | **Site Onboarding** | **No new site onboardings in Year 1.** Use the year to define and document the **lightweight onboarding pattern** for smaller sites — equipment types, network requirements, standard ScadaBridge template set, standard topic/tag set. Keep the existing integrated sites stable. | **Pilot the onboarding pattern** on one smaller site end-to-end (Berlin, Winterthur, or Jacksonville — choice TBD). Use learnings to refine the pattern, then **begin scaling** onboarding to additional smaller sites. _TBD — pilot site selection criteria, per-site effort estimate._ | **Complete onboarding of all remaining smaller sites.** Every site on the authoritative list is on the standardized stack. Pillar 1 check passes. | | **Legacy Retirement** | **Populate the legacy inventory** (`current-state/legacy-integrations.md`) — this is the prerequisite for sequencing. Identify **early-retirement candidates** where the replacement path already exists (e.g., **LEG-002 Camstar**, since ScadaBridge already has a native Camstar path). Retire at least one integration end-to-end as a pattern-proving exercise (including dual-run + decommission). _TBD — inventory ownership, discovery approach._ | **Bulk migration.** Execute retirements in sequence against the inventory, prioritized by a mix of risk and ease. Each retirement follows: plan → build replacement (often in ScadaBridge Extensions) → dual-run → cutover → decommission. Inventory burn-down tracked quarterly. _TBD — prioritization rubric, dual-run duration per integration class._ | **Drive inventory to zero.** Any remaining integrations are in dual-run or decommission phase at start of year; the inventory reaches zero by end of year. Pillar 3 check passes. |