From 98bf2d0da4173e86f03444f9bde7949cc8f69f23 Mon Sep 17 00:00:00 2001 From: Joseph Doherty Date: Fri, 24 Apr 2026 14:53:16 -0400 Subject: [PATCH] Expand SnowBridge to own ingest + in-process transform; drop dbt MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit SnowBridge now owns machine-data ingest, in-process .NET transformation, and direct writes to curated tables in Snowflake. Collapses the previous ingest/transform split into a single service; no dbt, no external orchestrator, no Snowflake landing tier. Keeps the in-house .NET pattern consistent with ScadaBridge and OtOpcUa. The "Snowflake dbt Transform Layer" roadmap workstream merges into SnowBridge (7 → 6 workstreams); Year 2 canonical-state-based OEE moves with it. Canonical model still has three surfaces — the third is renamed from "dbt curated layer" to "SnowBridge curated layer in Snowflake"; mechanics unchanged. --- README.md | 6 +- STATUS.md | 27 ++--- current-state.md | 2 +- current-state/legacy-integrations.md | 2 +- goal-state.md | 154 ++++++++++++++------------- outputs/DESIGN.md | 2 +- outputs/IMPLEMENTATION-PLAN.md | 2 +- outputs/README.md | 2 +- outputs/longform-spec.md | 2 +- outputs/presentation-spec.md | 16 +-- roadmap.md | 19 ++-- schemas/CONTRIBUTING.md | 4 +- schemas/README.md | 2 +- schemas/docs/consumer-integration.md | 12 +-- schemas/docs/overview.md | 2 +- 15 files changed, 130 insertions(+), 124 deletions(-) diff --git a/README.md b/README.md index 031ccc8..1269982 100644 --- a/README.md +++ b/README.md @@ -8,7 +8,7 @@ A **stable, single point of integration between shopfloor OT and enterprise IT** ## Three Pillars (binary at end of plan) -1. **Unification** — 100% of sites on the standardized stack (OtOpcUa + ScadaBridge + Redpanda + SnowBridge + Snowflake/dbt). +1. **Unification** — 100% of sites on the standardized stack (OtOpcUa + ScadaBridge + Redpanda + SnowBridge + Snowflake). 2. 
**Analytics / AI Enablement** — machine data in Snowflake with a ≤15-minute analytics SLO; at least one "not possible before" use case in production. 3. **Legacy Retirement** — zero remaining bespoke IT/OT integration paths outside ScadaBridge. @@ -23,7 +23,7 @@ Layer 4 ScadaBridge (sole IT/OT crossing point) Enterprise IT (Camstar, Delmia, Snowflake, Power BI, SnowBridge) ``` -The plan also declares a **Unified Namespace (UNS)** composed of OtOpcUa + Redpanda + canonical model in `schemas` repo + dbt curated layer, with a 5-level naming hierarchy standard (Enterprise → Site → Area → Line → Equipment). +The plan also declares a **Unified Namespace (UNS)** composed of OtOpcUa + Redpanda + canonical model in `schemas` repo + SnowBridge curated layer in Snowflake, with a 5-level naming hierarchy standard (Enterprise → Site → Area → Line → Equipment). ## Plan Documents @@ -31,7 +31,7 @@ The plan also declares a **Unified Namespace (UNS)** composed of OtOpcUa + Redpa |---|---| | [`current-state.md`](current-state.md) | Snapshot of today's systems, integrations, and pain points | | [`goal-state.md`](goal-state.md) | Target end-state: architecture, components, success criteria, UNS, canonical model | -| [`roadmap.md`](roadmap.md) | 7 workstreams x 3 years migration grid | +| [`roadmap.md`](roadmap.md) | 6 workstreams x 3 years migration grid | | [`STATUS.md`](STATUS.md) | Working-session bookmark — where we left off, pending items | ### Component Detail Files diff --git a/STATUS.md b/STATUS.md index 083a892..71d807e 100644 --- a/STATUS.md +++ b/STATUS.md @@ -1,6 +1,6 @@ # Plan — Working Session Status -**Saved:** 2026-04-24 +**Saved:** 2026-04-24 (second session of the day) **Previous session:** Opus 4.6 (1M context) **Resume with:** start a new Claude Code session in this directory — `CLAUDE.md` and this file provide full context. No session ID needed; the plan is self-contained in the repo. @@ -16,10 +16,11 @@ The plan is **substantially complete**. 
All core documents are populated, archit - Enterprise shortname resolved to `zb`; Warsaw West buildings confirmed as 5 and 19 - Equipment protocol survey removed (driver list confirmed directly by v2 team) - First PPTX generated (18 slides, mixed-stakeholder deck) -- 7 component diagrams created (OtOpcUa, Redpanda, SnowBridge, ScadaBridge dataflow + topology, Snowflake/dbt) +- 7 component diagrams created (OtOpcUa, Redpanda, SnowBridge, ScadaBridge dataflow + topology, Snowflake/dbt — dbt diagram is now stale; next regen will replace) - ScadaBridge accuracy corrections from design repo review (email only, not Teams; EventHub not yet implemented) - ScadaBridge topology corrected (no site-to-site routing; direct API access; inbound Web API as input) - **Digital-twin scope finalized (2026-04-24).** Plan's digital-twin scope is definitively **two access-control patterns**: (1) environment-lifecycle promotion without reconfiguration (ACL flip on write authority against stable equipment UUIDs); (2) safe read-only consumption for KPI / monitoring systems (structurally guaranteed by single-connection-through-OtOpcUa). Both delivered by architecture already committed — no new component, no new workstream. The earlier management-conversation brief (`goal-state/digital-twin-management-brief.md`) and the `goal-state/` subdirectory have been removed; the plan uses only the two patterns above. Write-authority arbitration mechanism is out of scope for this plan (OtOpcUa team's concern). Physics simulation / FAT / commissioning emulation is not a plan item; if it ever materializes as a funded adjacent initiative, that will be a separate scoping conversation. +- **SnowBridge scope expanded; dbt workstream removed (2026-04-24).** SnowBridge now owns **ingest + in-process .NET transform + curated-table write**, collapsing the previous ingest/transform split. 
**No dbt, no external orchestrator, no Snowflake landing tier.** The "Snowflake dbt Transform Layer" roadmap workstream is removed; its Year 2 canonical-state-based OEE commitment moves into the SnowBridge workstream. Workstream count drops from **7 to 6**. The canonical model still has three surfaces — the third is renamed "SnowBridge curated layer in Snowflake" (was "dbt curated layer"); mechanics are identical. Rationale: keep the in-house .NET pattern consistent (ScadaBridge / OtOpcUa / SnowBridge), collapse two tools into one, drop the separate Python/SQL transform skillset. Trade-offs captured in `goal-state.md` → SnowBridge → Trade-offs. ### Files @@ -27,8 +28,8 @@ The plan is **substantially complete**. All core documents are populated, archit - [`CLAUDE.md`](CLAUDE.md) — plan purpose, document index (now including the component detail files and outputs pipeline), markdown-first conventions, component breakout rules. - [`current-state.md`](current-state.md) — snapshot of today's estate (enterprise layout, clusters, systems, integrations, equipment access patterns). -- [`goal-state.md`](goal-state.md) — target end-state with Vision, layered architecture, **Unified Namespace posture + naming hierarchy standard**, component designs (OtOpcUa, SnowBridge, Redpanda EventHub with **Canonical Equipment/Production/Event Model + canonical state vocabulary**, dbt layer, ScadaBridge extensions), success criteria, observability, Strategic Considerations (Digital twin — two access-control patterns; Power BI), and Non-Goals. -- [`roadmap.md`](roadmap.md) — 3-year workstreams × years grid with 7 workstreams and cross-workstream dependencies; Year 1 Redpanda and dbt cells updated for canonical model delivery. 
+- [`goal-state.md`](goal-state.md) — target end-state with Vision, layered architecture, **Unified Namespace posture + naming hierarchy standard**, component designs (OtOpcUa, SnowBridge with ingest+transform+curated-layer ownership, Redpanda EventHub with **Canonical Equipment/Production/Event Model + canonical state vocabulary**, ScadaBridge extensions), success criteria, observability, Strategic Considerations (Digital twin — two access-control patterns; Power BI), and Non-Goals. +- [`roadmap.md`](roadmap.md) — 3-year workstreams × years grid with 6 workstreams and cross-workstream dependencies; Year 1 Redpanda and SnowBridge cells deliver canonical model v1. **Component detail files:** @@ -52,15 +53,15 @@ The plan is **substantially complete**. All core documents are populated, archit - **Layered architecture:** Layer 1 Equipment → Layer 2 OtOpcUa → Layer 3 SCADA (System Platform + Ignition) → Layer 4 ScadaBridge → Enterprise IT. - **OtOpcUa** (layer 2): custom-built, clustered, co-located on System Platform nodes, hybrid driver strategy (proactive core library + on-demand long-tail), OPC UA-native auth, **absorbs LmxOpcUa** as its System Platform namespace. Tiered cutover: ScadaBridge first, Ignition second, System Platform IO last. **Namespace architecture supports a future `simulated` namespace** for the pre-install case (dev work before equipment is on the floor) and as foundation for a possible future funded physics-simulation initiative — architecturally supported, not committed for build. **ACL model + single-connection-per-equipment also delivers the plan's two digital-twin patterns** (environment-lifecycle promotion via write-authority flip; safe read-only KPI / monitoring exposure) — see `goal-state.md` → OtOpcUa → Consumer access patterns enabled by the ACL model. 
- **Redpanda EventHub:** self-hosted, central cluster in South Bend (single-cluster HA, VM-level DR out of scope), per-topic tiered retention (operational 7d / analytics 30d / compliance 90d), bundled Schema Registry, Protobuf via central `schemas` repo with `buf` CI, `BACKWARD_TRANSITIVE` compatibility, `TopicNameStrategy` subjects, `{domain}.{entity}.{event-type}` naming, site identity in message (not topic), SASL/OAUTHBEARER + prefix ACLs. Store-and-forward at ScadaBridge handles site resilience. **Analytics-tier retention is also a replay surface** for integration testing (and for a possible future funded physics-simulation initiative, should one materialize). -- **Canonical Equipment, Production, and Event Model:** the plan commits to declaring the composition of OtOpcUa equipment namespace + Redpanda canonical topics + `schemas` repo + dbt curated layer as **the** canonical model. Three surfaces, one source of truth (`schemas` repo). Includes a **canonical machine state vocabulary** (`Running / Idle / Faulted / Starved / Blocked` + TBD additions like `Changeover`, `Maintenance`, `Setup`). Year 1 Redpanda and dbt cells deliver v1. Load-bearing for pillar 2. +- **Canonical Equipment, Production, and Event Model:** the plan commits to declaring the composition of OtOpcUa equipment namespace + Redpanda canonical topics + `schemas` repo + SnowBridge curated layer in Snowflake as **the** canonical model. Three surfaces, one source of truth (`schemas` repo). Includes a **canonical machine state vocabulary** (`Running / Idle / Faulted / Starved / Blocked` + TBD additions like `Changeover`, `Maintenance`, `Setup`). Year 1 Redpanda and SnowBridge cells deliver v1. Load-bearing for pillar 2. - **Unified Namespace (UNS) posture:** the canonical model above is also declared as the plan's UNS, framed for stakeholders using UNS vocabulary. 
**Deliberate deviations from classic MQTT/Sparkplug UNS:** Kafka instead of MQTT (for analytics/replay), flat `{domain}.{entity}.{event-type}` topics with site in message (for bounded topic count), stateless events instead of Sparkplug state machine. Optional future **UNS projection service** (MQTT/Sparkplug and/or enterprise OPC UA aggregator) is architecturally supported but not committed for build; decision trigger documented. -- **UNS naming hierarchy standard:** 5 levels always present — Enterprise → Site → Area → Line → Equipment, with `_default` placeholder where a level doesn't apply. Text form `ent.warsaw-west.bldg-3.line-2.cnc-mill-05` / OPC UA form `ent/warsaw-west/bldg-3/line-2/cnc-mill-05`. Stable **equipment UUIDv4** alongside the path (path is navigation, UUID is lineage). Authority lives in `schemas` repo; OtOpcUa / Redpanda / dbt consume the authoritative definition. **Enterprise shortname is currently `ent` placeholder — needs assignment.** +- **UNS naming hierarchy standard:** 5 levels always present — Enterprise → Site → Area → Line → Equipment, with `_default` placeholder where a level doesn't apply. Text form `zb.warsaw-west.bldg-5.line-2.cnc-mill-05` / OPC UA form `zb/warsaw-west/bldg-5/line-2/cnc-mill-05`. Stable **equipment UUIDv4** alongside the path (path is navigation, UUID is lineage). Authority lives in `schemas` repo; OtOpcUa / Redpanda / SnowBridge consume the authoritative definition. - **SnowBridge:** custom-built machine-data-to-Snowflake upload service; Year 1 starting source is Aveva Historian SQL; UI + API with blast-radius-based approval workflow; selection state in internal datastore (not git). -- **Snowflake transform tooling:** dbt only, run by a self-hosted orchestrator (specific orchestrator out of scope). -- **Aggregation boundary:** aggregation lives in Snowflake (dbt). ScadaBridge does deadband/exception-based filtering (global default ~1% of span) plus tag opt-in via SnowBridge — not source-side summarization. 
-- **Observability:** commit to signals (Redpanda, ScadaBridge, SnowBridge, dbt), tool is out of scope. +- **Snowflake transform tooling:** none separate — **SnowBridge owns transformation in-process (.NET)**. No dbt, no Snowflake Dynamic Tables / Streams+Tasks, no external orchestrator (Airflow / Dagster / Prefect). +- **Aggregation boundary:** aggregation lives in **SnowBridge** (writing curated rows to Snowflake). ScadaBridge does deadband/exception-based filtering (global default ~1% of span) plus tag opt-in via SnowBridge — not source-side summarization. +- **Observability:** commit to signals (Redpanda, ScadaBridge, SnowBridge ingest + SnowBridge transforms + SnowBridge validation checks), tool is out of scope. - **Digital-twin scope (finalized 2026-04-24):** the plan's digital-twin scope is definitively **two access-control patterns** — (1) environment-lifecycle promotion without reconfiguration (ACL flip on write authority against stable equipment UUIDs); (2) safe read-only consumption for KPI / monitoring systems (structurally guaranteed by single-connection-through-OtOpcUa). Both delivered by architecture already committed in the **OtOpcUa** and **Canonical Equipment, Production, and Event Model** subsections — no new component, no new workstream, no pillar dependency. Write-authority arbitration mechanism is out of scope (OtOpcUa team's concern). Physics simulation / FAT / commissioning emulation is not a plan item; any future funded adjacent initiative would be a separate scoping conversation. -- **Enterprise reporting coordination (BOBJ → Power BI migration, in-flight adjacent initiative):** three consumption paths analyzed (Snowflake dbt / Historian direct / both). Recommended position: **Path C with Path A as strategic direction** — most machine-data and cross-domain reports move to Snowflake over Years 2–3, compliance reports stay on Historian indefinitely. Conversation with reporting team still to be scheduled. 
+- **Enterprise reporting coordination (BOBJ → Power BI migration, in-flight adjacent initiative):** three consumption paths analyzed (SnowBridge curated layer in Snowflake / Historian direct / both). Recommended position: **Path C with Path A as strategic direction** — most machine-data and cross-domain reports move to Snowflake over Years 2–3, compliance reports stay on Historian indefinitely. Conversation with reporting team still to be scheduled. - **Output generation pipeline:** PPTX + PDF generation from plan markdown, repeatability anchored by spec files (`presentation-spec.md`, `longform-spec.md`) rather than prompts. Spec files written; diagrams and generation run deferred until the source plan is stable. ## Top pending items (from most recent status check) @@ -69,7 +70,7 @@ All four items from the previous status check have been **advanced to the point ### External-dependency items — waiting on real-world action -1. **BOBJ → Power BI coordination with reporting team.** Plan position documented in `goal-state.md` → Strategic Considerations → **Enterprise reporting: BOBJ → Power BI migration (adjacent initiative)** — three consumption paths analyzed, recommended position stated (Path C with Path A as strategic direction), eight questions and a four-bucket decision rubric included. **Action needed:** schedule the coordination conversation with the reporting team; bring back a bucket assignment. Once a bucket is assigned, update `goal-state.md` → Enterprise reporting and, if the outcome is Bucket A or B, update `roadmap.md` → Snowflake dbt Transform Layer to include reporting-shaped views. +1. **BOBJ → Power BI coordination with reporting team.** Plan position documented in `goal-state.md` → Strategic Considerations → **Enterprise reporting: BOBJ → Power BI migration (adjacent initiative)** — three consumption paths analyzed, recommended position stated (Path C with Path A as strategic direction), eight questions and a four-bucket decision rubric included. 
**Action needed:** schedule the coordination conversation with the reporting team; bring back a bucket assignment. Once a bucket is assigned, update `goal-state.md` → Enterprise reporting and, if the outcome is Bucket A or B, update `roadmap.md` → SnowBridge to include reporting-shaped curated tables. 2. **UNS hierarchy snapshot walk.** The protocol survey has been **removed** — the OtOpcUa v2 implementation team committed the core driver list (8 drivers) based on internal knowledge, making a formal protocol survey unnecessary for driver scoping. What remains is the **UNS hierarchy snapshot**: a per-site equipment-instance walk capturing site / area / line / equipment assignments and stable UUIDs, which feeds the initial `schemas` repo hierarchy definition and canonical model. See `goal-state.md` → **Unified Namespace (UNS) posture → UNS naming hierarchy standard**. **Action needed:** assign a walk owner; walk System Platform IO config, Ignition OPC UA connections, and ScadaBridge templates across integrated sites within Q1–Q2 of Year 1; capture equipment instances at site/area/line/equipment granularity (not protocol — that's already resolved). The canonical model v1 cannot be published without the initial hierarchy snapshot. **Sub-blocker (resolved):** the enterprise-level shortname is now assigned (`zb`; the `ent` placeholders in goal-state.md are replaced by this change), so the initial hierarchy snapshot can be committed to the `schemas` repo once the walk completes. ### Closed since last status check @@ -86,7 +87,7 @@ All closed items below were worked through the same 2026-04-15 session. Grouped **Canonical model and UNS work:** - ~~**Canonical Equipment, Production, and Event Model declaration.**~~ **Closed 2026-04-15.** New subsection under `goal-state.md` → Async Event Backbone declares the canonical model: three surfaces (OtOpcUa equipment namespace, Redpanda topics + Protobuf schemas, dbt curated layer) with `schemas` repo as single source of truth.
Committed **canonical machine state vocabulary** (`Running / Idle / Faulted / Starved / Blocked` + TBD additions) with explicit semantics, rules, and governance. OEE computed on the canonical state stream named as a candidate for pillar 2's "not possible before" use case. Year 1 Redpanda cell in `roadmap.md` commits to publishing v1. +- ~~**Canonical Equipment, Production, and Event Model declaration.**~~ **Closed 2026-04-15 (third surface renamed 2026-04-24).** New subsection under `goal-state.md` → Async Event Backbone declares the canonical model: three surfaces (OtOpcUa equipment namespace, Redpanda topics + Protobuf schemas, SnowBridge curated layer in Snowflake) with `schemas` repo as single source of truth. Committed **canonical machine state vocabulary** (`Running / Idle / Faulted / Starved / Blocked` + TBD additions) with explicit semantics, rules, and governance. OEE computed on the canonical state stream named as a candidate for pillar 2's "not possible before" use case. Year 1 Redpanda cell in `roadmap.md` commits to publishing v1. - ~~**Unified Namespace (UNS) posture declaration.**~~ **Closed 2026-04-15.** New subsection under `goal-state.md` → Target IT/OT Integration declares the canonical model as **the plan's UNS**, with three deliberate deviations from classic MQTT/Sparkplug UNS (Kafka instead of MQTT, flat topics with site-in-message, stateless events instead of Sparkplug state). Optional future **UNS projection service** (MQTT/Sparkplug and/or enterprise OPC UA aggregator) documented as architecturally supported but not committed for build. Cross-references added from Canonical Model subsection and Digital Twin section. - ~~**UNS naming hierarchy standard.**~~ **Closed 2026-04-15.** Five-level hierarchy committed: Enterprise → Site → Area → Line → Equipment, always present, `_default` placeholder where a level doesn't apply. Naming rules align with Redpanda topic convention (`[a-z0-9-]`, dots/slashes for segments, hyphens within). 
Stable **equipment UUIDv4** alongside the path. Authority in `schemas` repo. Evolution governance, worked examples, out-of-scope list (no product/job hierarchy — that's Camstar MES), and TBDs all captured. `current-state/equipment-protocol-survey.md` updated to note the dual mandate — same discovery walk produces the initial hierarchy snapshot at equipment-instance granularity. @@ -116,6 +117,6 @@ Items that can wait, design details that close during implementation, and delibe - Don't re-open settled decisions without a reason. The plan's decisions are load-bearing and have explicit rationale captured inline; reversing one should require new information, not re-litigation. - Don't add new workstreams to `roadmap.md` without a matching commitment to one of the three pillars. That's how plans quietly bloat. - Don't let Digital Twin reappear as a new committed workstream or widen beyond the finalized scope. Plan's digital-twin scope is exactly two access-control patterns (environment-lifecycle promotion; safe read-only KPI / monitoring exposure), both delivered by already-committed architecture. Physics simulation / FAT / commissioning emulation is out of plan scope; it does not reappear unless a separately funded initiative with a sponsor is stood up, and even then it is an adjacent initiative, not this plan's work. -- Don't let Copilot 365 reappear. It was deliberately removed earlier — it's handled implicitly by the Snowflake/dbt + canonical model path. +- Don't let Copilot 365 reappear. It was deliberately removed earlier — it's handled implicitly by the SnowBridge curated layer + canonical model path. - Don't build a parallel MQTT UNS broker just because "UNS" means MQTT to many vendors. The plan's UNS posture is deliberate: Redpanda IS the UNS backbone, and a projection service is a small optional addition when a specific consumer requires it — not the default path. 
- Don't hand-edit files under `outputs/generated/` — they're disposable, regenerated from the spec files on every run. Edit specs or source plan files instead. diff --git a/current-state.md b/current-state.md index 7c73c65..7876aa3 100644 --- a/current-state.md +++ b/current-state.md @@ -145,7 +145,7 @@ SCADA responsibilities are split across two platforms by purpose: - **Retention policy: permanent.** No TTL or rollup is applied — historian data is retained **forever** as a matter of policy. This means the "drill-down to Historian for raw data" pattern in `goal-state.md` works at any historical horizon, and the historian is the authoritative long-term system of record for validated tag data regardless of how much Snowflake chooses to store. - **Integration role:** serves as the system of record for validated/compliance-grade tag data collected via Aveva System Platform, and exposes a **SQL interface** (OPENQUERY and history views) for read access. Downstream use of that SQL interface for Snowflake ingestion is discussed in `goal-state.md` under Aveva Historian → Snowflake. - **Current consumers (reporting):** the primary consumer of Historian data today is **enterprise reporting**, currently on **SAP BusinessObjects (BOBJ)**. Reporting is actively **migrating from SAP BOBJ to Power BI** — this is an in-flight transition that this plan should be aware of but does not own. - - **Implication for pillar 2:** the "enterprise analytics/AI enablement" target in `goal-state.md` sits alongside this Power BI migration, not in competition with it. Whether Power BI consumes from Snowflake (via the dbt curated layer), from Historian directly, or from both is a TBD that coordinates between the two initiatives. + - **Implication for pillar 2:** the "enterprise analytics/AI enablement" target in `goal-state.md` sits alongside this Power BI migration, not in competition with it. 
Whether Power BI consumes from Snowflake (via the SnowBridge curated layer), from Historian directly, or from both is a TBD that coordinates between the two initiatives. - _TBD — current storage footprint and growth rate, other consumers beyond reporting (e.g., Aveva Historian Client / Insight / Trend tools, ad-hoc analyst SQL, regulatory/audit exports), and how the BOBJ→Power BI migration coordinates with the Snowflake path for machine data._ _TBD — additional shopfloor systems and HMIs not covered above (if any)._ diff --git a/current-state/legacy-integrations.md b/current-state/legacy-integrations.md index 90a1b40..4357fa7 100644 --- a/current-state/legacy-integrations.md +++ b/current-state/legacy-integrations.md @@ -110,7 +110,7 @@ Categories explicitly carved out of the "legacy integration" definition. These c Pillar 3's "legacy integration" target covers **bespoke IT↔OT crossings** — Web API interfaces exposed by the System Platform primary cluster, custom services, file drops, direct DB links into internal stores that weren't designed for external reads. Consuming a system's own first-class SQL interface is categorically different and does not fit that definition. -The BOBJ → Power BI migration currently in flight (see `../current-state.md` → Aveva Historian → Current consumers) will reshape this surface independently of pillar 3. Whether Power BI ultimately reads from Historian's SQL interface, from Snowflake's dbt curated layer, or from both is a coordination question between the reporting team and this plan (tracked in `../status.md` as a top pending item) — but whichever way it lands, the resulting path is **not** tracked here as a retirement target. +The BOBJ → Power BI migration currently in flight (see `../current-state.md` → Aveva Historian → Current consumers) will reshape this surface independently of pillar 3. 
Whether Power BI ultimately reads from Historian's SQL interface, from the SnowBridge curated layer in Snowflake, or from both is a coordination question between the reporting team and this plan (tracked in `../STATUS.md` as a top pending item) — but whichever way it lands, the resulting path is **not** tracked here as a retirement target. > **Implication:** if at any point a reporting consumer stops using Historian's SQL views and instead starts talking to Historian via a bespoke side-channel (custom extract job, scheduled export, direct file read of the historian store, etc.), **that** side-channel **would** be legacy and would need a row in the inventory. The carve-out applies specifically to the native MSSQL surface. diff --git a/goal-state.md b/goal-state.md index 191086c..a6fcb69 100644 --- a/goal-state.md +++ b/goal-state.md @@ -73,8 +73,8 @@ The target architecture has **four layers** on the OT side plus the enterprise I - **Aveva Historian** is a **store adjacent to layer 3**, not a layer of its own. System Platform (layer 3) writes into it; consumers read historical validated data out of it (either through its SQL interface or through Snowflake after SnowBridge has pulled from it). It is the long-term system of record for validated data regardless of what Snowflake chooses to store. - **Redpanda (EventHub)** is **infrastructure used between layer 4 and enterprise IT**, not a layer of its own. ScadaBridge publishes events into it; enterprise consumers (SnowBridge, KPI processors, Camstar integration) read from it. It decouples layer-4 producers from enterprise consumers without introducing a fifth layer. - **SnowBridge** is an **enterprise-side consumer** that happens to read from both the adjacent-to-layer-3 store (Aveva Historian) and the layer-4-to-enterprise backbone (Redpanda). Its job is the governed, filtered upload to Snowflake — it does not fit inside the layered data path itself.
-- **dbt** runs **inside Snowflake**, so it is enterprise-side infrastructure that transforms landed data. It has no layer-1-through-4 position. -- **OtOpcUa's raw data goes "up" through this stack and back out the top on the IT side.** A tag read from a machine in Warsaw West flows: equipment → OtOpcUa (layer 2) → System Platform or Ignition (layer 3) → ScadaBridge (layer 4) → Redpanda → SnowBridge → Snowflake → dbt → Power BI or downstream consumer. +- **SnowBridge** owns both **ingest and transform** into Snowflake (in-process .NET, no separate transform tool). It is enterprise-side (IT) infrastructure and has no layer-1-through-4 position. +- **OtOpcUa's raw data goes "up" through this stack and back out the top on the IT side.** A tag read from a machine in Warsaw West flows: equipment → OtOpcUa (layer 2) → System Platform or Ignition (layer 3) → ScadaBridge (layer 4) → Redpanda → SnowBridge (ingest + transform) → Snowflake → Power BI or downstream consumer. **What the layering rules out.** Cross-layer shortcuts that bypass the layer in between: @@ -101,13 +101,13 @@ Four existing commitments, together, constitute the unified namespace: | **OtOpcUa equipment namespace** (per site) | Hierarchical real-time OT data surface. Equipment, tags, and derived state exposed as a canonical OPC UA tree at each site. This is the "classic UNS hierarchy" at the site level — equipment-class templates in the `schemas` repo define the node layout. | | **Redpanda topics + canonical Protobuf schemas** | Enterprise-wide pub-sub backbone carrying canonical equipment / production / event messages. The `{domain}.{entity}.{event-type}` taxonomy + schema registry define what "speaking UNS" means on the wire. Retention tiers give consumers a bounded replay window against the UNS. 
| | **`schemas` repo + canonical model declaration** | The shared **context layer** — equipment classes, machine states (`Running / Idle / Faulted / Starved / Blocked`), event types — that makes every surface's data semantically consistent. See **Async Event Backbone → Canonical Equipment, Production, and Event Model** for the full declaration. This is where the ISA-95 hierarchy conceptually lives, even though it is not the topic path. | -| **dbt curated layer in Snowflake** | Canonical historical / analytical surface. Consumers that need "what has this equipment done over time" read the UNS via the curated layer, with the same vocabulary as the real-time surfaces. Same canonical model, different access pattern. | +| **SnowBridge curated layer in Snowflake** | Canonical historical / analytical surface. Consumers that need "what has this equipment done over time" read the UNS via the curated layer (written by SnowBridge), with the same vocabulary as the real-time surfaces. Same canonical model, different access pattern. | -Together: a single canonical data model (the `schemas` repo), a single real-time backbone (Redpanda), a canonical OT-side hierarchy at each site (OtOpcUa), and a canonical analytical surface (dbt). That is the UNS. +Together: a single canonical data model (the `schemas` repo), a single real-time backbone (Redpanda), a canonical OT-side hierarchy at each site (OtOpcUa), and a canonical analytical surface (SnowBridge curated layer in Snowflake). That is the UNS. #### UNS naming hierarchy standard -The plan commits to a **single canonical naming hierarchy** for addressing equipment across every UNS surface (OtOpcUa, Redpanda, dbt, `schemas` repo). Without this, each surface would re-derive its own naming and drift apart; the whole point of "a single canonical model" evaporates. +The plan commits to a **single canonical naming hierarchy** for addressing equipment across every UNS surface (OtOpcUa, Redpanda, SnowBridge curated layer, `schemas` repo). 
Without this, each surface would re-derive its own naming and drift apart; the whole point of "a single canonical model" evaporates. ##### Hierarchy — five levels, always present @@ -125,13 +125,13 @@ Five levels is a **hard commitment**. Consumers can assume every equipment insta ##### Why "always present with placeholders" rather than "variable depth" -- **Uniform depth makes consumers simpler.** Subscribers and dbt models assume a fixed schema for the equipment identifier; variable-depth paths require special-casing. +- **Uniform depth makes consumers simpler.** Subscribers and SnowBridge transforms assume a fixed schema for the equipment identifier; variable-depth paths require special-casing. - **Adding a building later doesn't shift paths.** If a small site adds a second production building and needs an Area level, the existing equipment at that site keeps its path (now pointing at a named area instead of `_default`), and the new building gets a new area segment — no rewrites, no breaking changes for historical consumers. - **Explicit placeholder is more discoverable than an implicit skip.** A reader looking at `zb.shannon._default.line-1.cnc-mill-03` immediately sees that Shannon has no area distinction today; a variable-depth alternative like `zb.shannon.line-1.cnc-mill-03` leaves the reader wondering whether a level is missing. ##### Naming rules -Identical conventions to the existing Redpanda topic naming — one vocabulary, two serializations (text form for messages / docs / dbt keys; OPC UA browse-path form for OtOpcUa): +Identical conventions to the existing Redpanda topic naming — one vocabulary, two serializations (text form for messages / docs / curated-table keys; OPC UA browse-path form for OtOpcUa): - **Character set:** `[a-z0-9-]` only. Lowercase enforced. No underscores (except the literal placeholder `_default`), no camelCase, no spaces, no unicode. - **Segment separator:** `.` (dot) in text form; `/` (slash) in OPC UA browse paths. 
The two forms are **mechanically interchangeable** — same segments, different delimiter. @@ -145,7 +145,7 @@ Identical conventions to the existing Redpanda topic naming — one vocabulary, | Form | Example | |---|---| -| Text (messages, docs, dbt keys) | `zb.warsaw-west.bldg-5.line-2.cnc-mill-05.spindle-speed` | +| Text (messages, docs, curated-table keys) | `zb.warsaw-west.bldg-5.line-2.cnc-mill-05.spindle-speed` | | OPC UA browse path | `zb/warsaw-west/bldg-5/line-2/cnc-mill-05/spindle-speed` | | Same machine at a small site (area placeholder) | `zb.shannon._default.line-1.cnc-mill-03` | @@ -178,7 +178,7 @@ Equipment also carries **OPC UA Companion Spec OPC 40010 (Machinery) Identificat - Use **SAPID** when correlating with maintenance/PM systems. - Use the **path** for dashboards, filters, browsing, human search, and anywhere a reader needs to know *where* the equipment is right now. -Canonical events on Redpanda carry `equipment_uuid` (stable) and `equipment_path` (current at event time) as fields. A dbt dimension table (`dim_equipment`) carries all five identifiers plus current and historical paths, and is the authoritative join point for analytical consumers. OtOpcUa's equipment namespace exposes all five as properties on each equipment node. +Canonical events on Redpanda carry `equipment_uuid` (stable) and `equipment_path` (current at event time) as fields. A SnowBridge-populated dimension table (`dim_equipment`) in Snowflake carries all five identifiers plus current and historical paths, and is the authoritative join point for analytical consumers. OtOpcUa's equipment namespace exposes all five as properties on each equipment node. _TBD — **UUID generation authority** (E1 from v2 corrections): OtOpcUa Admin UI currently auto-generates UUIDv4 on equipment creation. If ERP or SAP PM systems take authoritative ownership of equipment registration in the future, the UUID-generation policy should be configurable per cluster (generate locally vs. 
look up from external system). For now, OtOpcUa generates by default._ @@ -191,7 +191,7 @@ _TBD — **UUID generation authority** (E1 from v2 corrections): OtOpcUa Admin U | `schemas` repo | Canonical hierarchy definition — the full tree with UUIDs, current paths, equipment-class assignments, and evolution history. Stored as **JSON Schema (.json files)** — idiomatic for .NET (System.Text.Json), CI-friendly (any runner can validate with `jq`), human-authorable, and merge-friendly in git. Protobuf is derived (code-generated) from the JSON Schema for wire serialization where needed (Redpanda events). One-way derivation: JSON Schema → Protobuf, not bidirectional. | **Authoritative.** Changes go through the same CODEOWNERS + `buf`-CI governance as other schema changes. | | OtOpcUa equipment namespace | Browse-path structure matching the hierarchy; equipment nodes carry the UUID as a property. Built per-site from the relevant subtree (each site's OtOpcUa only exposes equipment at that site). | **Consumer.** Generated from the `schemas` repo definition at deploy/config time. Drift between OtOpcUa and `schemas` repo is a defect. | | Redpanda canonical event payloads | Every event payload carries `equipment_uuid` (stable) and `equipment_path` (current at event time) as fields. Enables filtering without topic explosion. | **Consumer.** Protobuf schemas reference the hierarchy definition in the same `schemas` repo. | -| dbt curated layer in Snowflake | `dim_equipment` dimension table with UUID, current path, historical path versions, equipment class, site, area, line. Used as the join key by every analytical consumer. | **Consumer.** Populated by a dbt model that reads from an upstream reference table synced from the `schemas` repo — not hand-maintained in Snowflake. | +| SnowBridge curated layer in Snowflake | `dim_equipment` dimension table with UUID, current path, historical path versions, equipment class, site, area, line. Used as the join key by every analytical consumer. 
| **Consumer.** Populated by SnowBridge from a reference source synced from the `schemas` repo — not hand-maintained in Snowflake. | ##### Evolution and change management @@ -208,12 +208,12 @@ The hierarchy will change. Sites get added (smaller sites onboarding in Year 2). - **Product / job / traveler hierarchy.** Products flow through equipment orthogonally to the equipment tree and are tracked in Camstar's MES genealogy, not in the UNS equipment hierarchy. A product's current equipment is joined in via MES events (`mes.workorder.*`) referencing equipment UUIDs — not by putting products into the equipment path. - **Operator / crew / shift hierarchy.** Same reason — orthogonal to equipment; lives elsewhere. -- **Logical vs physical equipment.** The plan's hierarchy addresses **physical equipment instances**. Logical groupings (e.g., "all CNC mills," "all equipment on the shop floor") are queryable via equipment class + attributes in dbt or via OPC UA browse filters — not via the path hierarchy. +- **Logical vs physical equipment.** The plan's hierarchy addresses **physical equipment instances**. Logical groupings (e.g., "all CNC mills," "all equipment on the shop floor") are queryable via equipment class + attributes in the SnowBridge curated layer or via OPC UA browse filters — not via the path hierarchy. - **Real-time UNS browsing UI.** If stakeholders want a tree-browse experience against the UNS (an HMI, an engineering tool), that is a consumer surface, not a hierarchy definition. The projection service discussed below is the likely delivery path if this is ever funded. **Resolved:** storage format for the hierarchy in the `schemas` repo is **JSON Schema** (see "Where the authoritative hierarchy lives" above). 
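The path contract above (five always-present levels plus an optional tag segment, `[a-z0-9-]` characters, the literal `_default` placeholder, and the dot vs. slash serialization) is mechanical enough to sketch as a validator. A minimal illustration in Python, not plan tooling; the function names are hypothetical:

```python
import re

# Sketch of the five-level path contract described above:
# enterprise.site.area.line.equipment[.tag], segments limited to [a-z0-9-],
# with the literal placeholder `_default` allowed where a level is absent.
SEGMENT = re.compile(r"^(_default|[a-z0-9-]+)$")

def validate_equipment_path(path: str) -> list:
    """Return the segments if `path` is a valid five-level equipment path
    (optionally followed by one tag segment); raise ValueError otherwise."""
    segments = path.split(".")
    if len(segments) not in (5, 6):  # 5 levels, optional trailing tag
        raise ValueError(f"expected 5 levels (+ optional tag), got {len(segments)}")
    for seg in segments:
        if not SEGMENT.match(seg):
            raise ValueError(f"invalid segment: {seg!r}")
    return segments

def to_browse_path(text_path: str) -> str:
    """Text form to OPC UA browse-path form: same segments, '/' delimiter."""
    return "/".join(validate_equipment_path(text_path))

print(to_browse_path("zb.warsaw-west.bldg-5.line-2.cnc-mill-05.spindle-speed"))
# zb/warsaw-west/bldg-5/line-2/cnc-mill-05/spindle-speed
```

Note how the fixed depth pays off: the validator needs no special cases for small sites, because `zb.shannon._default.line-1.cnc-mill-03` is just a normal five-segment path.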
-_TBD — authoritative initial **UNS hierarchy snapshot** for the currently-integrated sites — requires a per-site area/line/equipment walk to capture equipment instances, their UNS path assignments, and stable UUIDs (the protocol survey has been removed since the OtOpcUa v2 design committed the driver list directly; the hierarchy walk is now a standalone Year 1 deliverable); whether the dbt `dim_equipment` historical-path tracking needs a slowly-changing-dimension type-2 pattern or a simpler current+history list; ownership of hierarchy change PRs (likely a domain SME group, not the ScadaBridge team)._ +_TBD — authoritative initial **UNS hierarchy snapshot** for the currently-integrated sites — requires a per-site area/line/equipment walk to capture equipment instances, their UNS path assignments, and stable UUIDs (the protocol survey has been removed since the OtOpcUa v2 design committed the driver list directly; the hierarchy walk is now a standalone Year 1 deliverable); whether the `dim_equipment` historical-path tracking (populated by SnowBridge) needs a slowly-changing-dimension type-2 pattern or a simpler current+history list; ownership of hierarchy change PRs (likely a domain SME group, not the ScadaBridge team)._ #### How this differs from a classic MQTT-based UNS @@ -240,7 +240,7 @@ This mirrors the treatment of OtOpcUa's future `simulated` namespace: the archit **Changes:** -- Stakeholders who ask "do we have a UNS?" get a direct "yes — composed of OtOpcUa + Redpanda + `schemas` repo + dbt" answer instead of "we have a canonical model but we didn't use that word." +- Stakeholders who ask "do we have a UNS?" get a direct "yes — composed of OtOpcUa + Redpanda + `schemas` repo + SnowBridge curated layer" answer instead of "we have a canonical model but we didn't use that word." 
- The **canonical machine state vocabulary** and **canonical equipment/production/event model declaration** (see **Async Event Backbone → Canonical Equipment, Production, and Event Model**) — which are functionally UNS deliverables in another vocabulary — now have a second name and a second stakeholder audience. - A future projection service is pre-legitimized as a small optional addition, not a parallel or competing initiative. - Vendor conversations that assume "UNS" means a specific MQTT broker purchase can be reframed: the plan delivers the UNS value proposition via different transport; the vendor's MQTT expectations become a projection-layer concern, not a core-architecture concern. @@ -259,7 +259,7 @@ _TBD — whether any stakeholder has specifically asked for UNS vocabulary, or w - **ScadaBridge** is the **global integration network** providing controlled access between **IT and OT**. - **The IT↔OT boundary sits at ScadaBridge central.** In the target architecture: - **OT side = machine data.** Everything that collects, transforms, or stores machine data lives on the OT side. Concretely this includes: **Aveva System Platform** (primary and site clusters, Global Galaxy federation, hot-warm redundancy), **equipment OPC UA and native device protocols** (PLCs, controllers, instruments), **OtOpcUa** (the unified per-site OPC UA layer that exposes raw equipment data and the System Platform namespace — the evolution of LmxOpcUa), **ScadaBridge** (site clusters and central), **Aveva Historian**, and **Ignition SCADA** (as the KPI SCADA UX layer per the UX split). - - **IT side = enterprise applications.** Everything business-facing lives on the IT side. 
Concretely this includes: **Camstar** (MES), **Delmia** (DNC / digital manufacturing), **enterprise reporting and analytics** (Snowflake, dbt, SAP BusinessObjects today / Power BI tomorrow), **the SnowBridge** (it's a Snowflake-facing enterprise consumer, not an OT component — it happens to read from OT sources but its identity, hosting, and governance are IT), and **any other enterprise app** that needs shopfloor data or has to drive shopfloor behavior. + - **IT side = enterprise applications.** Everything business-facing lives on the IT side. Concretely this includes: **Camstar** (MES), **Delmia** (DNC / digital manufacturing), **enterprise reporting and analytics** (Snowflake, SAP BusinessObjects today / Power BI tomorrow), **SnowBridge** (it's a Snowflake-facing enterprise consumer that owns machine-data ingest and transform — not an OT component — it reads from OT sources but its identity, hosting, and governance are IT), and **any other enterprise app** that needs shopfloor data or has to drive shopfloor behavior. - **Long-term posture.** System Platform traffic (Global Galaxy, site↔site cluster federation, site System Platform clusters ↔ central System Platform cluster, site-level ScadaBridge ↔ local equipment) **stays on the OT side** and is **not** subject to "retire to ScadaBridge." Global Galaxy is how System Platform is supposed to federate and stays the authorized mechanism for OT-internal integration. - **The crossing point.** **ScadaBridge central ↔ enterprise integrations is the single IT↔OT bridge.** Any traffic that crosses between the two zones must cross *through* ScadaBridge central; nothing else is permitted as a long-term path. That includes reads (enterprise app wanting machine data) and writes (enterprise app driving shopfloor state). 
- **Implication for the Global Galaxy Web API** on the Aveva System Platform primary cluster: its two existing interfaces (Delmia DNC, Camstar MES) are IT↔OT crossings that currently run *outside* ScadaBridge and are therefore in-scope for retirement under pillar 3. @@ -275,35 +275,48 @@ _TBD — whether any stakeholder has specifically asked for UNS vocabulary, or w - **Snowflake** — first-class integration with the enterprise data platform so shopfloor data lands in Snowflake for analytics, reporting, and downstream consumers. See **Aveva Historian → Snowflake** below for the time-series ingestion pattern. _TBD — non-historian data flows (MES, ScadaBridge events, metadata), schema ownership, latency targets, governance._ - _TBD — other enterprise systems (ERP, PLM, quality, etc.) that need integration._ -### SnowBridge — the Machine Data to Snowflake upload service +### SnowBridge — the Machine Data ingest + transform service -**SnowBridge** is a **dedicated integration service** that owns all **machine data** flows into Snowflake. It is a new, purpose-built component — **not** the Snowflake Kafka Connector directly, and **not** configuration living inside ScadaBridge or the central `schemas` repo. +**SnowBridge** is a **dedicated integration service** that owns all **machine data** flows into Snowflake — ingest, transform, and write. It reads from multiple machine-data sources, applies canonical-model-aligned transformations **in-process** (.NET), and writes **curated rows directly** to Snowflake. There is no separate landing tier, no separate transform tool, and no external orchestrator. SnowBridge is a new, purpose-built component — not the Snowflake Kafka Connector directly, not dbt, and not configuration living inside ScadaBridge or the central `schemas` repo. 
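The collapsed read, transform, write shape can be sketched as follows. This is an illustrative sketch only: the plan commits to an in-process .NET implementation, and every name here (`RawEvent`, `CuratedRow`, `run_pipeline`, the toy transform) is hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Iterable

@dataclass
class RawEvent:
    equipment_uuid: str   # stable identity
    equipment_path: str   # current at event time
    payload: dict

@dataclass
class CuratedRow:
    table: str            # curated target table in Snowflake
    values: dict

# A transform maps one raw event to zero or more curated rows, in-process.
Transform = Callable[[RawEvent], Iterable[CuratedRow]]

def run_pipeline(source: Iterable[RawEvent], transform: Transform,
                 sink: Callable[[CuratedRow], None]) -> int:
    """Single-service shape: read, transform, write curated rows directly.
    No landing tier between source and curated table, and no hand-off
    to a separate transform tool."""
    written = 0
    for event in source:
        for row in transform(event):
            sink(row)     # stand-in for Snowpipe Streaming / batch COPY
            written += 1
    return written

# Toy transform: project a state-transition event onto a curated state table.
def state_transform(event: RawEvent) -> Iterable[CuratedRow]:
    yield CuratedRow(
        table="curated.equipment_state",
        values={"equipment_uuid": event.equipment_uuid,
                "equipment_path": event.equipment_path,
                "state": event.payload["state"]},
    )
```

The point of the sketch is the coupling argued for below: source knowledge and transform knowledge meet in one function call, with nothing to orchestrate between them.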
**Responsibilities.** - **Source abstraction.** Reads from multiple machine-data sources behind a common interface: **Aveva Historian** (via its SQL interface), **ScadaBridge / EventHub (Redpanda topics)**, and any future source (e.g., Ignition, Aveva Data Hub, direct OPC UA collectors) without each source needing its own bespoke pipeline. - **Selection.** Operators configure **which topics** (for Redpanda-backed sources) and **which tags/streams** (for Historian and other sources) flow to Snowflake. Selection is a first-class, governed action inside this service — not a side effect of deploying a schema or a ScadaBridge template. -- **Sink to Snowflake.** Writes into Snowflake via the appropriate native mechanism per source (Snowpipe Streaming for event/topic sources, bulk or COPY-based for Historian backfills, etc.) while presenting one unified operational surface. -- **Governance home.** Topic/source/tag opt-in, mapping to Snowflake target tables, schema bindings, and freshness expectations all live in this service's configuration — one place to ask "is this tag going to Snowflake, and why?" +- **Transformation (in-process, .NET).** Applies canonical-model-aligned transforms before writing to Snowflake: type coercion, canonical state vocabulary handling, `dim_equipment` resolution from the `schemas` repo, aggregation, windowing, enrichment with reference data. Transform logic lives in SnowBridge's own codebase — versioned in git, reviewed via PR, unit-tested alongside ingestion code. **No dbt**, no SQL transform layer in Snowflake owned by a separate tool. +- **Sink to Snowflake — curated rows direct.** Writes **curated rows** into Snowflake curated tables via Snowpipe Streaming (streaming sources) or batch COPY (Historian backfills). **There is no separate landing tier** — curated tables are the single destination. Raw historical replay, when needed, comes from Redpanda (analytics/compliance tier retention) or Historian SQL, not a Snowflake landing copy. 
+- **Scheduling / cadence.** Streaming sources transform continuously as events arrive; batch sources run on SnowBridge's own internal cadence. **No external orchestrator** (Airflow / Dagster / Prefect is not required; dbt Cloud is not used). +- **Data-quality validation.** Runs SQL-based validation checks against Snowflake (the equivalent of `dbt test`) on a defined cadence. A first-class validation framework with primitives for `not_null`, `foreign_key`, `canonical_enum_coverage`, `row_count_in_range`, and `source_freshness`. Validation failures surface in observability alongside ingestion metrics and route through the same alerting path. +- **Governance home.** Topic/source/tag opt-in, mapping to Snowflake target tables, schema bindings, transform logic ownership, freshness expectations, and validation rules all live in this service's configuration — one place to ask "is this tag going to Snowflake, what transform is applied, and why?" **Rationale.** -- **Separation of concerns.** ScadaBridge's job is IT/OT integration at the edge (site-local, OPC UA, store-and-forward, scripting). Shoveling curated data into Snowflake is a different job — long-lived connector state, Snowflake credentials, per-source backfill logic, schema mapping — and does not belong on every ScadaBridge cluster. -- **Source-agnostic.** Not all machine data is going to flow through Redpanda. Aveva Historian in particular has its own SQL interface that is better read directly for bulk/historical work than replayed through EventHub. This service handles that heterogeneity in one place. -- **Governance visibility.** A single operator-facing system answers the "what machine data is in Snowflake?" question, which matters for compliance, cost attribution, and incident response. -- **Decouples schema evolution from data flow.** Adding a Protobuf schema to the central repo no longer implicitly adds data to Snowflake — that requires an explicit action in this service. Prevents accidental volume. 
+- **Collapse ingest + transform into one tool.** Source knowledge (which tag lives where in Historian, which Redpanda topic carries which event) and transform knowledge (how a raw tag becomes a curated field) are tightly coupled — keeping them in one codebase removes the hand-off between ingestion and transformation that would otherwise need coordination, versioning alignment, and an orchestrator between them. +- **No dbt.** Keeps the plan's in-house .NET pattern consistent (ScadaBridge, OtOpcUa, SnowBridge all .NET). Avoids a separate Python/SQL transform stack, a dbt Cloud line item or self-hosted orchestrator to stand up, and a skills split between .NET engineers (who build ingestion) and analytics engineers (who build dbt). Trade-offs named below. +- **No landing tier.** Snowflake storage cost is not doubled by keeping raw landed rows alongside curated rows. Lineage and replay are recoverable from upstream sources — Redpanda for recent history (30/90 days), Historian SQL for long-term — neither of which requires duplicating raw data in Snowflake. +- **Separation of concerns from ScadaBridge.** ScadaBridge's job is IT/OT integration at the edge (site-local, OPC UA, store-and-forward, scripting). Pulling machine data into Snowflake and transforming it into canonical curated views is a different job — long-lived connector state, Snowflake credentials, per-source backfill logic, transform semantics — and does not belong on every ScadaBridge cluster. +- **Source-agnostic.** Not all machine data flows through Redpanda. Aveva Historian has its own SQL interface that is better read directly for bulk/historical work than replayed through EventHub. SnowBridge handles that heterogeneity in one place. +- **Governance visibility.** A single operator-facing system answers "what machine data is in Snowflake, with what transform, at what freshness?" — which matters for compliance, cost attribution, and incident response. 
+- **Decouples schema evolution from data flow.** Adding a Protobuf schema to the central repo no longer implicitly adds data to Snowflake — that requires an explicit action in SnowBridge. Prevents accidental volume. + +**Trade-offs accepted.** +- **Transform logic is .NET code, not SQL models.** Analysts familiar with reading dbt models can't directly read or propose changes to SnowBridge's transform logic. **Mitigation:** keep transform declarations as readable, declarative .NET (e.g., expression trees or a small internal DSL — not procedural spaghetti); document source → curated field mappings in the `schemas` repo alongside the equipment-class definitions so downstream consumers have a canonical spec without needing to read SnowBridge's code; publish the curated-table schema as first-class content. +- **No `dbt test` ecosystem.** No packages like `dbt-utils`, `dbt-expectations`, no Jinja macros, no lineage viewer out of the box. **Mitigation:** SnowBridge owns its validation framework end-to-end (see Responsibilities → Data-quality validation). The trade is a smaller but more focused test surface tailored to the canonical model, rather than the full dbt ecosystem. +- **Re-running historical transforms requires re-pulling from source.** With no landing tier in Snowflake, changing transform logic and re-materializing historical curated rows requires replay from Redpanda (up to analytics-tier / compliance-tier retention) or a re-pull from Historian SQL. **Mitigation:** Redpanda retention tiers (30/90 days) already provide a replay surface; Historian is the long-term system of record for raw values; curated-view transform changes are expected to be rare (canonical model evolves slowly). +- **Cross-domain joins happen at query time.** Pre-materialized dbt models that join MES/ERP with machine data don't exist by default. 
**Mitigation:** canonical keys (`equipment_uuid`, `workorder_id`, consistent `site` / `area` / `line` fields) make query-time joins cheap in Snowflake; Power BI and ad-hoc analysts compose joins at the semantic-layer tier rather than relying on pre-joined dbt models. SnowBridge can still materialize specific cross-domain joins as curated views when a concrete use case justifies the storage cost. **Implications for other decisions in this plan.** -- The **Aveva Historian → Snowflake** recommendation (below) is updated: this service is the component that actually implements the path, rather than a direct ScadaBridge→EventHub→Snowflake Kafka Connector pipeline. -- The **tag opt-in governance** question for the Snowflake-bound stream is resolved here: the opt-in list lives in **this service**, not in the central `schemas` repo and not in ScadaBridge configuration. -- The **Snowflake Kafka Connector** is no longer presumed to be the primary path. It may still be used internally by this service for Redpanda-backed flows, or this service may implement its own consumer — an implementation choice inside the service, not a plan-level commitment. -- The **central EventHub cluster** does not change — machine data still flows through Redpanda for event/topic sources; this service is just one of several consumers (alongside KPI processors, Camstar integration, etc.). +- The **Aveva Historian → Snowflake** recommendation (below) is updated: SnowBridge is the component that implements the full path (ingest → transform → curated). +- The **tag opt-in governance** question for the Snowflake-bound stream is resolved here: the opt-in list lives in **SnowBridge**, not in the central `schemas` repo and not in ScadaBridge configuration. +- The **Snowflake Kafka Connector** is no longer presumed to be the primary path. 
It may still be used internally by SnowBridge for Redpanda-backed flows, or SnowBridge may implement its own consumer — an implementation choice inside the service, not a plan-level commitment. +- The **central EventHub cluster** does not change — machine data still flows through Redpanda for event/topic sources; SnowBridge is just one of several consumers (alongside KPI processors, Camstar integration, etc.). +- **Pillar 2 "not possible before" OEE use case** is a SnowBridge transform, not a dbt model. Implementation is a canonical-state-based OEE calculation consuming the `equipment.state.transitioned` event stream from Redpanda and writing to a curated OEE table in Snowflake. Mechanics identical; tool different. -**Build-vs-buy: custom build, in-house.** The service is built in-house rather than adopting Aveva Data Hub or a third-party ETL tool (Fivetran, Airbyte, StreamSets, Precog, etc.). -- **Rationale:** bespoke fit for the exact source mix (Aveva Historian SQL + Redpanda/ScadaBridge + future sources), full control over the selection/governance model, alignment with the existing .NET ecosystem that ScadaBridge and OtOpcUa already run on, no commercial license dependency, and no vendor roadmap risk for a component this central. -- **Trade-off accepted:** commits the organization to building and operating another service over the lifetime of the plan. Justified because the requirements (multi-source abstraction, topic/tag selection as a governed first-class action, Snowflake as a targeted sink) don't map cleanly onto any off-the-shelf tool, and the cost of a bad fit would be paid forever. +**Build-vs-buy: custom build, in-house.** SnowBridge is built in-house rather than adopting Aveva Data Hub, dbt + orchestrator, or a third-party ETL tool (Fivetran, Airbyte, StreamSets, Precog, etc.). 
+- **Rationale:** bespoke fit for the exact source mix (Aveva Historian SQL + Redpanda/ScadaBridge + future sources), full control over the selection/governance model, full control over transform logic, alignment with the existing .NET ecosystem that ScadaBridge and OtOpcUa already run on, no commercial license dependency, no vendor roadmap risk for a component this central, **and no separate transform tool to operate**. +- **Trade-off accepted:** commits the organization to building and operating a single but more complex service (ingest + transform + validation + governance UI) over the lifetime of the plan. Justified because the surface area is tightly scoped (one team, one codebase) and collapses two tools (ingest + transform) into one with a shared operational model. - **Implementation hint (not a commitment):** the most natural starting point is a .NET service — possibly an Akka.NET application to share infrastructure patterns with ScadaBridge — but the specific runtime/framework is an implementation detail for the build team. **Operator interface: web UI backed by an API.** Operators manage source/topic/tag selection through a **dedicated web UI** that sits on top of the service's own API. Selection state lives in an **internal datastore owned by the service**, not in a git repo. -- **Rationale:** lowest friction for the operators who actually run the machine-data estate — non-engineers can onboard a tag or disable a topic without opening a PR. Makes the "what's flowing to Snowflake right now?" question answerable from one screen instead of correlating git state with running state. The underlying API lets ScadaBridge, dbt, or future tooling drive selection changes programmatically when needed. +- **Rationale:** lowest friction for the operators who actually run the machine-data estate — non-engineers can onboard a tag or disable a topic without opening a PR. Makes the "what's flowing to Snowflake right now?" 
question answerable from one screen instead of correlating git state with running state. The underlying API lets ScadaBridge or future tooling drive selection changes programmatically when needed. - **Trade-off accepted:** git is **not** the source of truth for selection state. Audit, change review, and rollback all have to be built into the service itself — they do not come for free from `git log` and PR review. - **Non-negotiable requirements on the UI/API (to offset the trade-off):** - **Full audit trail** — every selection change records who, what, when, and why (with a required change-reason field). Audit entries are queryable and exportable. @@ -362,26 +375,21 @@ _TBD — service name (working title only); hosting (South Bend, alongside Redpa **Recommended direction.** - **Primary path (updated):** all machine-data ingestion into Snowflake is owned by the **SnowBridge** (see its dedicated section above). That service reads from ScadaBridge/EventHub for event-driven flows and directly from Aveva Historian's SQL interface for historian-native flows, then writes to Snowflake. ScadaBridge remains the producer for event-driven machine data into Redpanda; the integration service consumes from Redpanda rather than Snowflake pulling directly. -- **Aggregation boundary: aggregation lives in Snowflake.** Heavy transforms — summary statistics, time-window rollups, state derivations, cross-site joins, enrichment with MES/other data, business-level KPIs — are **all done in Snowflake** using Snowflake-native transform tooling (dbt, Dynamic Tables, Streams + Tasks — exact tool selection TBD). - - **Rationale:** keeps transform logic where the data analysts and platform owners already work; avoids scattering business logic across Historian Tier-2, ScadaBridge scripts, and Snowflake. One place to version, review, and lineage transforms. Accepts higher Snowflake compute/storage cost as the explicit trade-off. 
+- **Aggregation boundary: aggregation lives in SnowBridge (writing into Snowflake).** Heavy transforms — summary statistics, time-window rollups, state derivations, cross-site joins, enrichment with MES/other data, business-level KPIs — are **all done inside SnowBridge** (in-process .NET), with curated rows written directly to Snowflake curated tables. See **SnowBridge → Responsibilities → Transformation** above.
+  - **Rationale:** collapses ingest + transform into one service; transform logic lives where the source knowledge lives; avoids scattering business logic across Historian Tier-2, ScadaBridge scripts, and a separate dbt layer. One codebase to version, review, and lineage transforms. Accepts higher Snowflake compute cost for query-time cross-domain joins as the explicit trade-off (see SnowBridge trade-offs).
   - **Not used:** Aveva Historian **Tier-2 summary replication** is **not** used as the aggregation layer for the Snowflake path. (Tier-2 may still be used for its original historian purpose, but it's not part of the Snowflake ingestion pipeline.)
   - **Not used:** ScadaBridge **scripting** is **not** the aggregation layer either — aggregation logic does not live in ScadaBridge scripts.
+  - **Not used:** **dbt**, Snowflake Dynamic Tables, or Streams + Tasks. SnowBridge owns transformation in-process; no separate Snowflake-side transform tool.
-- **Volume control — two layers.** Since aggregation happens in Snowflake, both ScadaBridge and the SnowBridge share responsibility for preventing a full-fidelity firehose from reaching Snowflake:
+- **Volume control — two layers.** Since aggregation happens in SnowBridge, both ScadaBridge and the SnowBridge share responsibility for preventing a full-fidelity firehose from reaching Snowflake:
   - **At ScadaBridge (producing to EventHub):** **deadband / exception-based publishing** — only publish when a tag value changes by a configured threshold or a state changes, not on every OPC UA sample. **Rate limiting** per tag / per site where needed. This controls how much machine data reaches Redpanda in the first place.
- **At the SnowBridge (selecting what reaches Snowflake):** **topic and tag selection** — only topics and tags explicitly opted in within the service are forwarded to Snowflake. Not every Redpanda topic or every Historian tag automatically flows. Adding a tag or topic to Snowflake is a governed action in this service. - - Keep **raw full-resolution data in Aveva Historian** as the system of record — Snowflake stores the selected, deadband-filtered stream plus whatever aggregations dbt builds on top, not a mirror of Historian. + - Keep **raw full-resolution data in Aveva Historian** as the system of record — Snowflake stores the selected, deadband-filtered, SnowBridge-transformed curated rows, not a mirror of Historian's raw resolution. - **Drill-down:** for rare raw-data investigations, query the Historian SQL interface directly from analyst tooling rather than copying raw data into Snowflake. - **Historical backfill:** one-off file-based exports (option 5) to seed Snowflake history when a new tag set comes online. -**Snowflake-side transform tooling: dbt.** All Snowflake transforms are built in **dbt**, versioned in git alongside the other integration source (schemas, topic config, etc.), and run on a schedule (or via CI) — not as real-time streaming transforms. -- **Rationale:** dbt is the mature, portable standard for SQL transforms. Strong lineage, testing (`dbt test`), environment separation, and documentation generation. Fits the "everything load-bearing lives in git and is reviewed before it ships" discipline already established for schemas and topic definitions. -- **Explicit trade-off — no real-time transforms.** dbt is batch/micro-batch, not streaming. Transforms land in Snowflake tables on whatever cadence dbt runs (likely minutes-to-hours depending on the model), **not** sub-second. This is acceptable because operational/real-time KPIs continue to run on **Ignition SCADA**, not on Snowflake (see Target Operator / User Experience — Ignition owns KPI UX). 
Snowflake's job is analytics and enterprise rollups, which tolerate minute-plus latency. -- **Not used:** Dynamic Tables and Streams+Tasks are **not** adopted as part of the primary transform toolchain. If a specific future use case genuinely needs sub-minute latency from Snowflake itself (not Ignition), re-open this decision — don't quietly add a second tool. -- **Orchestration: dbt Core on a self-hosted scheduler.** dbt runs are driven by a self-hosted orchestrator (Airflow / Dagster / Prefect — specific tool TBD), **not** dbt Cloud and **not** CI-only. The orchestrator schedules `dbt build`/`dbt run`/`dbt test`, manages freshness SLAs, backfills, and coordinates dbt alongside the rest of the data pipeline (Snowflake Kafka Connector health checks, ad-hoc backfill jobs, downstream notifications). - - **Rationale:** gives full control over scheduling, dependencies, and alerting; avoids recurring dbt Cloud SaaS spend; keeps dbt runs decoupled from CI so a long-running transform isn't sitting inside a CI build minute. Accepts the operational cost of running one more platform service. - - **Not used:** dbt Cloud (avoided recurring SaaS cost and vendor coupling), pure CI-driven runs (too coupled to PR merge cadence), and Snowflake Tasks as the primary scheduler (too limited). - - **Out of scope for this plan:** specific orchestrator selection (Airflow vs Dagster vs Prefect), whether to stand up a new one or reuse one the enterprise data platform already runs, hosting, and credential management. This plan commits to the *pattern* (dbt Core run by a self-hosted orchestrator) but leaves the concrete orchestrator choice to the team that owns the Snowflake-side data platform. 
-- _TBD — dbt project layout (one project vs per-domain), model cadence targets, test coverage expectations, source freshness thresholds, CI/CD pipeline for dbt changes._ +**Snowflake-side transform tooling: none separate — SnowBridge owns transformation.** All machine-data transforms into Snowflake curated tables are built inside **SnowBridge** (in-process .NET), not in a Snowflake-side tool like dbt / Dynamic Tables / Streams + Tasks. See **SnowBridge → Responsibilities → Transformation** for the committed model. +- **Batch/micro-batch cadence, not streaming (for transform timing).** SnowBridge transforms land curated rows in Snowflake on whatever cadence the service runs each source — continuous for streaming Redpanda sources, periodic for Historian SQL sources. Not sub-second. Acceptable because operational / real-time KPIs run on **Ignition SCADA**, not Snowflake (see Target Operator / User Experience — Ignition owns KPI UX). Snowflake's job is analytics and enterprise rollups, which tolerate minute-plus latency (see **≤15-minute analytics** SLO below). +- **Not used:** dbt (Core or Cloud), Airflow / Dagster / Prefect as an external orchestrator, Snowflake Dynamic Tables, Snowflake Streams + Tasks. If a specific future use case genuinely needs sub-minute materialization in Snowflake itself (not Ignition), re-open this decision — don't quietly add a second tool. **Deadband / filtering model: global default with explicit per-tag overrides.** ScadaBridge applies **one global deadband** to every tag opted in to the Snowflake stream, and specific tags can be **explicitly overridden** when the global default is too loose or too tight. No per-tag-class templating for deadband — the global default is the floor, overrides are the only fine-tuning mechanism. - **Rationale:** simplest model to reason about and operate — one number to understand across the whole fleet, plus an explicit list of exceptions. 
Makes it immediately obvious which tags have bespoke tuning (any tag *not* on the override list uses the global default). Avoids per-class template proliferation as a governance surface. @@ -395,10 +403,10 @@ _TBD — service name (working title only); hosting (South Bend, alongside Redpa **Latency SLOs per data class.** End-to-end latency is measured from the moment ScadaBridge (or the Historian source) emits a value to the moment it is queryable in its target consumer. - **Operational / real-time KPI UX — out of scope for the Snowflake path.** Real-time KPI runs on **Ignition SCADA** per the UX split (see Target Operator / User Experience). Snowflake has no sub-minute SLO obligation because no operational UI depends on it. -- **Analytics feeds (Snowflake): ≤ 15 minutes end-to-end.** Covers ScadaBridge emit → Redpanda → SnowBridge → Snowflake landing table → dbt model refresh → queryable in the curated layer. Tight enough to feel alive for analysts and dashboards, loose enough to be reachable with dbt on a self-hosted scheduler and no streaming transforms. +- **Analytics feeds (Snowflake): ≤ 15 minutes end-to-end.** Covers ScadaBridge emit → Redpanda → SnowBridge (in-process transform) → queryable in the curated table in Snowflake. Tight enough to feel alive for analysts and dashboards, loose enough to be reachable with micro-batch transforms inside SnowBridge and no streaming-SQL tooling in Snowflake. - **Compliance / validated data feeds (Snowflake): ≤ 60 minutes end-to-end.** Snowflake is an investigation/reporting tier for validated data; the system of record remains **Aveva Historian**. A 60-minute SLO is sufficient because no compliance control depends on Snowflake freshness — if an investigation needs sub-hour data, it queries Historian directly. - **Ad-hoc raw drill-down — no SLO.** Analysts query the Historian SQL interface directly for rare raw-resolution investigations; this path is not budgeted against any latency target. 
-- _TBD — which layer is responsible for each segment of the budget (e.g., how much of the 15 minutes is Redpanda vs integration service vs dbt), and how the SLOs are monitored and alerted on in practice._
+- _TBD — which layer is responsible for each segment of the budget (e.g., how much of the 15 minutes is Redpanda vs SnowBridge ingest vs SnowBridge transform vs Snowflake write commit), and how the SLOs are monitored and alerted on in practice._
-- Cost model: EventHub throughput + ingestion (Snowpipe Streaming or whatever the integration service uses) + Snowflake **compute for transforms** + Snowflake storage for the target tag volume. Compute cost matters more under this choice than it would have under a source-aggregated model — worth pricing early.
+- Cost model: EventHub throughput + ingestion (Snowpipe Streaming or whatever the integration service uses) + SnowBridge hosting compute for in-process transforms + Snowflake **compute for query-time cross-domain joins** + Snowflake storage for the target tag volume. Query-time compute matters more under this choice than it would have under a source-aggregated model — worth pricing early.
- Whether Aveva Data Hub (option 4) should still be piloted as a **reference point** for the custom build — useful for comparison on specific capabilities (Historian connector depth, store-and-forward behavior) even though it is not the target implementation.

@@ -571,12 +579,12 @@ _TBD — service name (working title only); hosting (South Bend, alongside Redpa
- **Async event notifications** — shopfloor events (state changes, alarms, lifecycle events, etc.) published to EventHub for any interested consumer to subscribe to, without producers needing to know who's listening.
- **Async processing for KPI** — KPI calculations (currently handled on Ignition SCADA) can consume event streams from EventHub, enabling decoupled, replayable KPI pipelines instead of tightly coupled point queries.
- **System integrations** — other enterprise systems (Camstar, Snowflake, future consumers) integrate by subscribing to EventHub topics rather than opening point-to-point connections into OT.
- - **Historical replay for integration testing.** The `analytics`-tier retention (30 days) is explicitly also a **replay surface** for testing: downstream consumers (ScadaBridge scripts, KPI pipelines, dbt models, or any future consumer that needs to re-run historical windows) can be exercised against real historical event streams instead of synthetic data. Does not require any new component. When longer horizons are needed, extend to the `compliance` tier (90 days). Replay windows beyond 90 days are served by the dbt curated layer in Snowflake, not by Redpanda. **Note:** if a funded physics-simulation / FAT initiative ever materializes, this replay surface is one of the foundations it can consume — but such an initiative is out of the 3-year scope of this plan. + - **Historical replay for integration testing.** The `analytics`-tier retention (30 days) is explicitly also a **replay surface** for testing: downstream consumers (ScadaBridge scripts, KPI pipelines, SnowBridge transforms, or any future consumer that needs to re-run historical windows) can be exercised against real historical event streams instead of synthetic data. Does not require any new component. When longer horizons are needed, extend to the `compliance` tier (90 days). Replay windows beyond 90 days are served by the SnowBridge curated layer in Snowflake, not by Redpanda. **Note:** if a funded physics-simulation / FAT initiative ever materializes, this replay surface is one of the foundations it can consume — but such an initiative is out of the 3-year scope of this plan. - _Remaining open items are tracked inline in the subsections above — sizing, read-path implications, long-outage planning, IdP selection, schema subject/versioning details, etc. 
Support staffing and on-call ownership are out of scope for this plan._ #### Canonical Equipment, Production, and Event Model -The plan already delivers the infrastructure for a cross-system canonical model — OtOpcUa's equipment namespace, Redpanda's `{domain}.{entity}.{event-type}` topic taxonomy, Protobuf schemas in the central `schemas` repo, and the dbt curated layer in Snowflake. What it had not, until now, explicitly committed to is **declaring** that these pieces together constitute the enterprise's canonical equipment / production / event model, and that consumers are entitled to treat them as an integration interface. +The plan already delivers the infrastructure for a cross-system canonical model — OtOpcUa's equipment namespace, Redpanda's `{domain}.{entity}.{event-type}` topic taxonomy, Protobuf schemas in the central `schemas` repo, and the SnowBridge curated layer in Snowflake. What it had not, until now, explicitly committed to is **declaring** that these pieces together constitute the enterprise's canonical equipment / production / event model, and that consumers are entitled to treat them as an integration interface. This subsection makes that declaration. It is load-bearing for pillar 2 (analytics/AI enablement) because a canonical model is what makes "not possible before" cross-domain analytics possible at all. @@ -592,11 +600,11 @@ This subsection makes that declaration. It is load-bearing for pillar 2 (analyti > **OtOpcUa central config DB extended** (per lmxopcua decisions #138 + #139): the Equipment table gains 9 nullable columns for the OPC 40010 Identification fields (Manufacturer, Model, SerialNumber, HardwareRevision, SoftwareRevision, YearOfConstruction, AssetLocation, ManufacturerUri, DeviceManualUri) so operator-set static identity has a first-class home; drivers that can read these dynamically (FANUC `cnc_sysinfo()`, Beckhoff `TwinCAT.SystemInfo`, etc.) override the static value at runtime. 
Exposed on the OPC UA equipment node under the OPC 40010-standard `Identification` sub-folder per the `category` → folder mapping in `schemas/docs/format-decisions.md` D10. > > **Still needs cross-team ownership:** -> - Name an owner team for the schemas content (it's consumed by OT and IT systems alike — OtOpcUa, Redpanda, dbt) +> - Name an owner team for the schemas content (it's consumed by OT and IT systems alike — OtOpcUa, Redpanda, SnowBridge) > - Decide whether to move to a dedicated `gitea.dohertylan.com/dohertj2/schemas` repo (proposed) or keep as a 3-year-plan sub-tree > - Ratify or revise the 10 format decisions in `schemas/docs/format-decisions.md` > - Establish the CI gate for JSON Schema validation -> - Decide on consumer-integration plumbing for Redpanda Protobuf code-gen and dbt macro generation per `schemas/docs/consumer-integration.md` +> - Decide on consumer-integration plumbing for Redpanda Protobuf code-gen and SnowBridge transform code-gen per `schemas/docs/consumer-integration.md` > **Unified Namespace framing:** this canonical model is also the plan's **Unified Namespace** (UNS) — see **Target IT/OT Integration → Unified Namespace (UNS) posture**. The UNS posture is a higher-level framing of the same mechanics described here: this section specifies the canonical model mechanically; the UNS posture explains what stakeholders asking about UNS should understand about how the plan delivers the UNS value proposition without an MQTT/Sparkplug broker. @@ -606,21 +614,21 @@ The canonical model is exposed on three surfaces, one per layer: | Layer | Surface | What it canonicalizes | |---|---|---| -| Layer 2 — Equipment | **OtOpcUa equipment namespace** | Canonical per-equipment OPC UA node structure. A consumer reading tag `X` from equipment `Y` at any site gets the same node path, the same data type, and the same unit. 
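As a concrete illustration of the two naming standards the canonical model and UNS framing lean on — the `{domain}.{entity}.{event-type}` topic taxonomy and the 5-level hierarchy (Enterprise → Site → Area → Line → Equipment) — a minimal validation sketch. The helper names and the exact character rules (lowercase, hyphen-separated words) are assumptions inferred from the example topic names in this document, not committed tooling:

```python
import re

# Hypothetical validators; the taxonomy and 5-level hierarchy are from the
# plan, but the lowercase/hyphen character rules are inferred from examples
# like `equipment.state.transitioned` and `mes.workorder.started`.
SEG = r"[a-z0-9]+(?:-[a-z0-9]+)*"
TOPIC_RE = re.compile(rf"^{SEG}\.{SEG}\.{SEG}$")  # {domain}.{entity}.{event-type}

def is_canonical_topic(name: str) -> bool:
    """True if a topic name matches {domain}.{entity}.{event-type}."""
    return TOPIC_RE.fullmatch(name) is not None

def split_equipment_path(path: str) -> dict:
    """Split a 5-level UNS path into named levels; raise if malformed."""
    levels = ("enterprise", "site", "area", "line", "equipment")
    parts = path.strip("/").split("/")
    if len(parts) != len(levels) or not all(parts):
        raise ValueError(f"expected 5 non-empty levels, got {parts!r}")
    return dict(zip(levels, parts))

print(is_canonical_topic("equipment.state.transitioned"))  # True
print(is_canonical_topic("Equipment.State"))               # False
print(split_equipment_path("acme/south-bend/machining/line-3/cnc-07")["line"])
```

A check like this would belong in the `schemas` repo CI gate alongside the JSON Schema validation, so malformed topic names fail review rather than reaching Redpanda.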
Equipment-class templates (e.g., "3-axis CNC," "injection molding cell") live here and are referenced from the Redpanda and dbt surfaces. | +| Layer 2 — Equipment | **OtOpcUa equipment namespace** | Canonical per-equipment OPC UA node structure. A consumer reading tag `X` from equipment `Y` at any site gets the same node path, the same data type, and the same unit. Equipment-class templates (e.g., "3-axis CNC," "injection molding cell") live here and are referenced from the Redpanda and SnowBridge surfaces. | | Layer 4 → IT — Events | **Redpanda topics + Protobuf schemas** (`schemas` repo) | Canonical event shape. Every shopfloor event — `equipment.tag.value-changed`, `equipment.state.transitioned`, `mes.workorder.started`, `scada.alarm.raised`, etc. — has exactly one Protobuf message type, registered once, consumed everywhere. **This is where the canonical model is source-of-truth.** | -| IT — Analytics | **dbt curated layer in Snowflake** | Canonical analytics model. Curated views expose equipment, production runs, events, and aggregates with the same vocabulary, dimensions, and state values as the OtOpcUa and Redpanda surfaces. Downstream reporting (Power BI, ad-hoc SQL) and AI/ML consume from here. | +| IT — Analytics | **SnowBridge curated layer in Snowflake** | Canonical analytics model. Curated tables (written by SnowBridge in-process) expose equipment, production runs, events, and aggregates with the same vocabulary, dimensions, and state values as the OtOpcUa and Redpanda surfaces. Downstream reporting (Power BI, ad-hoc SQL) and AI/ML consume from here. | **Single source of truth: the `schemas` repo.** The three surfaces reference a shared canonical definition — they do not each carry their own. Specifically: - **Protobuf message definitions** in the `schemas` repo define the wire format for every canonical event. 
- **Shared enum types** in the `schemas` repo define the canonical **machine state vocabulary** (see below), canonical event-type values, and any other closed sets of values.
-- **Equipment-class definitions** in the `schemas` repo (format TBD — could be a Protobuf message set, could be a YAML document set referenced from Protobuf) describe the canonical node layout that OtOpcUa templates instantiate and that dbt curated views flatten into fact/dim tables.
+- **Equipment-class definitions** in the `schemas` repo (format TBD — could be a Protobuf message set, could be a YAML document set referenced from Protobuf) describe the canonical node layout that OtOpcUa templates instantiate and that SnowBridge flattens into curated fact/dim tables.

Consumers that need to know "what does a `Faulted` state mean" or "what are all the event types in the `mes` domain" look at the `schemas` repo. Any divergence between a surface and the `schemas` repo is a defect in the surface, not in the schema.

##### Canonical machine state vocabulary

-The plan commits to a **single authoritative set of machine state values** used consistently across layer-3 state derivations, Redpanda event payloads, and dbt curated views.
+The plan commits to a **single authoritative set of machine state values** used consistently across layer-3 state derivations, Redpanda event payloads, and SnowBridge curated tables.

Starting set (subject to refinement during implementation, but the names and semantics below are committed as the baseline):
-- **Derivation lives at layer 3.** Deriving "true state" from raw signals (interlocks, status bits, PLC words, alarm registers) is a **Layer 3** responsibility — Aveva System Platform for validated derivations, Ignition for KPI-facing derivations. The dbt curated layer consumes the already-derived state; it does not re-derive. +- **Derivation lives at layer 3.** Deriving "true state" from raw signals (interlocks, status bits, PLC words, alarm registers) is a **Layer 3** responsibility — Aveva System Platform for validated derivations, Ignition for KPI-facing derivations. The SnowBridge curated layer consumes the already-derived state; it does not re-derive. - **Events carry state transitions, not state polls.** Redpanda publishes a canonical `equipment.state.transitioned` event every time an equipment instance changes state, with the previous state, the new state, a reason code when available, and the underlying derivation inputs referenced by ID where possible. Current state is reconstructable from the transition stream. - **State values are an enum in the `schemas` repo.** Adding a state value is a schema change reviewed through the normal `schemas` repo governance (CODEOWNERS, `buf` CI, compatibility checks). Removing a state value is effectively impossible without a long-tail consumer migration — treat the starting set as durable. - **Top-fault derivation.** When `Faulted`, the canonical event carries a `top_fault` field (single fault code or string, per the `schemas` repo enum) rather than exposing the full alarm vector. The derivation of "top" from the underlying alarm set lives at layer 3 and is documented per-equipment-class in the `schemas` repo alongside the equipment-class definition. 
@@ -650,7 +658,7 @@ These are strong candidates but not committed in the starting set; the implement ##### Relationship to OEE and KPI -The canonical state vocabulary directly enables accurate OEE computation in the dbt curated layer without each consumer having to re-derive availability / performance / quality from scratch. This is one of the most immediate answers to pillar 2's "not possible before" use case criterion: cross-equipment, cross-site OEE computed once in dbt from a canonical state stream is meaningfully harder today because the state-derivation logic is fragmented across System Platform and Ignition scripts. Once the canonical state vocabulary is in place, OEE becomes a dbt model, not a bespoke script per site. +The canonical state vocabulary directly enables accurate OEE computation in the SnowBridge curated layer without each consumer having to re-derive availability / performance / quality from scratch. This is one of the most immediate answers to pillar 2's "not possible before" use case criterion: cross-equipment, cross-site OEE computed once in SnowBridge from a canonical state stream is meaningfully harder today because the state-derivation logic is fragmented across System Platform and Ignition scripts. Once the canonical state vocabulary is in place, OEE becomes a SnowBridge transform, not a bespoke script per site. ##### Not in scope for this subsection @@ -690,11 +698,11 @@ The plan commits to **what must be observable**, not to **which tool** emits/sto - **Selection-change audit events** — every approved change is observable as an event, not just a DB row (so alerting on unusual change patterns is possible). - End-to-end latency (source emit → Snowflake queryable), measured against the **≤15-minute analytics** and **≤60-minute compliance** SLOs. -- **dbt (on the self-hosted orchestrator).** - - Per-model run duration, success/failure, and last-successful-run timestamp. 
- - **Source freshness** — how stale the landing-table sources are that dbt reads from. - - Failed test count (not just failed model count — `dbt test` results are a first-class signal). - - Queued/stalled job counts on the orchestrator. +- **SnowBridge transforms.** + - Per-transform run duration, success/failure, and last-successful-run timestamp. + - **Source freshness** — how stale the upstream source (Historian SQL, Redpanda topic) is when a transform fires. + - Failed validation-check count — SnowBridge's SQL-based data-quality validations are a first-class signal (the equivalent of `dbt test`). + - Queued/stalled transform counts inside SnowBridge's internal scheduler. **Alerting floor (what must page someone, whatever tool is chosen).** - Any component above breaching its SLO for sustained periods (definition of "sustained" is a per-signal TBD). @@ -732,7 +740,7 @@ Success is measured against the three in-scope pillars from the **Vision**. Each 2. **Analytics / AI enablement — machine data queryable in Snowflake at the committed SLOs.** - **Aveva Historian** machine data (and event-driven data from ScadaBridge) is queryable in **Snowflake** at the committed latencies: **analytics ≤ 15 minutes end-to-end**, **compliance ≤ 60 minutes end-to-end**. - - A defined set of **priority tags** (list TBD as part of onboarding) is flowing end-to-end through the SnowBridge, landing in Snowflake, and transformed into **dbt curated layers**. + - A defined set of **priority tags** (list TBD as part of onboarding) is flowing end-to-end through SnowBridge, transformed in-process, and written to the **SnowBridge curated layer** in Snowflake. - At least **one production enterprise analytics or AI use case that was not possible before this pipeline existed** is consuming the curated layer in production. The test is enablement, not throughput: the use case must depend on data, freshness, or cross-site reach that the pre-plan stack could not deliver. 
Re-platforming an existing report onto Snowflake does not count; a new AI/ML model trained on cross-site machine data, a net-new cross-site OEE view, or an alert that depends on the ≤15-minute SLO would count. 3. **Legacy middleware retirement — zero remaining legacy point-to-point integrations.** @@ -764,7 +772,7 @@ External strategic asks that are **not** part of this plan's three pillars but t If a later adjacent initiative builds something stakeholders want to call "digital twin" on top of this plan's foundation (physics simulation, 3D visualization, a twin product surface), these constraints apply — they are already committed plan decisions, restated here so adjacent initiatives consume this plan cleanly: - **Must consume equipment data through OtOpcUa.** No direct equipment OPC UA sessions. -- **Must consume historical and analytical data through Snowflake + dbt** — not Historian directly, not a bespoke pipeline. The `≤15-minute analytics` SLO is the freshness budget available. +- **Must consume historical and analytical data through the Snowflake curated layer written by SnowBridge** — not Historian directly, not a bespoke pipeline. The `≤15-minute analytics` SLO is the freshness budget available. - **Must consume event streams through Redpanda** — not a parallel bus. The same schemas-in-git and `{domain}.{entity}.{event-type}` topic naming apply. The canonical state vocabulary and canonical model declaration (see **Async Event Backbone → Canonical Equipment, Production, and Event Model**) are how consistent state semantics are delivered. - **Must stay within the IT↔OT boundary.** Enterprise-hosted twin capabilities cross through ScadaBridge central and the SnowBridge like every other enterprise consumer. @@ -789,16 +797,16 @@ _TBD — none remaining for this section. 
Canonical state vocabulary ownership a **Status: in-flight, not owned by this plan.** Enterprise reporting is actively migrating from **SAP BusinessObjects** to **Microsoft Power BI** (see [`current-state.md`](current-state.md) → Aveva Historian → Current consumers). This is a reporting-team initiative, not a workstream of this 3-year plan — but it **overlaps with pillar 2** (analytics/AI enablement) in a way that requires explicit coordination, because both initiatives ultimately consume machine data and both ultimately present analytics to business users. -**This plan's posture:** no workstream is added to `roadmap.md`, and no pillar criterion depends on the Power BI migration landing on any particular schedule. However, the plan's Snowflake-side components (SnowBridge, dbt curated layer) are shaped so that Power BI can consume them cleanly **if and when** the reporting team decides to point there. Whether Power BI actually does so, on what timeline, and for which reports is **not this plan's decision** — it is a coordination question between this plan and the reporting team. +**This plan's posture:** no workstream is added to `roadmap.md`, and no pillar criterion depends on the Power BI migration landing on any particular schedule. However, the plan's Snowflake-side component (**SnowBridge**, which owns both ingest and transform into curated tables in Snowflake) is shaped so that Power BI can consume it cleanly **if and when** the reporting team decides to point there. Whether Power BI actually does so, on what timeline, and for which reports is **not this plan's decision** — it is a coordination question between this plan and the reporting team. #### Three consumption paths for Power BI The reporting team's Power BI migration can land on any of three paths. 
Each has different implications for this plan: -**Path A — Power BI reads from the Snowflake dbt curated layer.** -- *Fit with this plan's architecture:* **best.** Machine data flows through the planned pipeline (equipment → OtOpcUa → layer 3 → ScadaBridge → Redpanda → SnowBridge → Snowflake → dbt → Power BI). The architectural diagram in `## Layered Architecture` above already shows this as the intended shape. -- *What it requires from this plan:* the dbt curated layer must be built to serve **reporting**, not only AI/ML. Likely adds a **reporting-shaped view or semantic layer** on top of the curated layer, tuned for Power BI's query patterns and cross-domain joins. SnowBridge's tag selection must include tags that feed reporting, not only tags that feed the pillar-2 AI use case. -- *What it requires from the reporting team:* capacity and willingness to consume Snowflake as a data source (Power BI has a native Snowflake connector; the learning curve is in the semantic layer, not the connection). Commitment to defer at least the machine-data portion of the BOBJ migration until the dbt curated layer is live — which ties the reporting migration's machine-data cutover to this plan's Year 2+ delivery. +**Path A — Power BI reads from the SnowBridge curated layer in Snowflake.** +- *Fit with this plan's architecture:* **best.** Machine data flows through the planned pipeline (equipment → OtOpcUa → layer 3 → ScadaBridge → Redpanda → SnowBridge → Snowflake → Power BI). The architectural diagram in `## Layered Architecture` above already shows this as the intended shape. +- *What it requires from this plan:* the SnowBridge curated layer must be built to serve **reporting**, not only AI/ML. Likely adds **reporting-shaped curated tables or views** tuned for Power BI's query patterns and cross-domain joins. SnowBridge's tag selection must include tags that feed reporting, not only tags that feed the pillar-2 AI use case. 
+- *What it requires from the reporting team:* capacity and willingness to consume Snowflake as a data source (Power BI has a native Snowflake connector; the learning curve is in the semantic layer, not the connection). Commitment to defer at least the machine-data portion of the BOBJ migration until the SnowBridge curated layer is live — which ties the reporting migration's machine-data cutover to this plan's Year 2+ delivery. - *Risk:* **timing coupling.** If the reporting team wants to finish their migration inside Year 1, this path doesn't work for machine-data reports. They'd need to hold on machine-data reports and migrate the rest first — which is tenable (reports migrate in waves anyway) but needs agreement. - *"Not possible before" hook:* Path A opens the door to **cross-domain reports** (machine data joined with MES/ERP data in one query) that BOBJ couldn't easily deliver. This is a strong candidate for pillar 2's "not possible before" use case. @@ -810,15 +818,15 @@ The reporting team's Power BI migration can land on any of three paths. Each has - *"Not possible before" hook:* none beyond what Historian SQL already offers. **Path C — Both, partitioned by report category.** -- *Shape:* compliance/validation reports read Historian directly (because Historian is the authoritative system of record and auditors typically want reports against it); machine-data analytics and cross-domain reports read from Snowflake dbt; reports sourced from Camstar/Delmia/ERP stay on their native connectors. Reports migrate per category. +- *Shape:* compliance/validation reports read Historian directly (because Historian is the authoritative system of record and auditors typically want reports against it); machine-data analytics and cross-domain reports read from the SnowBridge curated layer in Snowflake; reports sourced from Camstar/Delmia/ERP stay on their native connectors. Reports migrate per category. 
- *Fit with this plan's architecture:* **pragmatic.** Acknowledges that enterprise reporting is heterogeneous and that one path doesn't fit everything. -- *What it requires from this plan:* Path-A requirements (reporting-shaped dbt layer, tag selection in SnowBridge) for the Snowflake portion. No new requirements for the Historian portion. +- *What it requires from this plan:* Path-A requirements (reporting-shaped SnowBridge curated tables, tag selection in SnowBridge) for the Snowflake portion. No new requirements for the Historian portion. - *What it requires from the reporting team:* a published **report-category → data-source** rubric that dev teams can use to place new reports on the right path. Needs governance; otherwise new reports land wherever feels easiest at the time. - *Risk:* **complexity.** Two semantic layers, two connection paths, two mental models for report authors. Worth it only if the volume of cross-domain / AI-adjacent reporting is high enough to justify Path A alongside Path B. #### Recommended position -**Path C (with Path A as the strategic direction).** Expect most machine-data-heavy reports and all cross-domain reports to move to Snowflake (Path A) over Years 2–3 as the dbt curated layer matures; expect compliance reports to stay on Historian's SQL surface (Path B) indefinitely because Historian is the authoritative regulatory system of record and moving compliance reporting off it introduces chain-of-custody questions we don't want to open. Path B is **explicitly** not a retirement target (see the carve-out in the legacy inventory), so "staying" is a valid end state for compliance reporting. 
+**Path C (with Path A as the strategic direction).** Expect most machine-data-heavy reports and all cross-domain reports to move to Snowflake (Path A) over Years 2–3 as the SnowBridge curated layer matures; expect compliance reports to stay on Historian's SQL surface (Path B) indefinitely because Historian is the authoritative regulatory system of record and moving compliance reporting off it introduces chain-of-custody questions we don't want to open. Path B is **explicitly** not a retirement target (see the carve-out in the legacy inventory), so "staying" is a valid end state for compliance reporting. **Why not pure Path A:** forces a needless fight over compliance reports that have no business case for leaving Historian. **Why not pure Path B:** gives up the cross-domain reporting upside that is one of the most compelling answers to "what does pillar 2 get us that we couldn't do before?" @@ -830,10 +838,10 @@ Use these to land the coordination conversation. Priority order — the first fo 1. **What's your timeline for completing BOBJ → Power BI?** Specifically, when do you expect to have migrated (a) all non-machine-data reports, (b) machine-data reports that read Historian, and (c) cross-domain reports? This tells us whether holding machine-data reports for Path A is even tenable on your side. 2. **Have you made an architectural decision on Power BI's connection to Historian?** Direct MSSQL link, Power BI gateway + on-prem data source, Azure Analysis Services in front of Historian, dataflows, something else? A decision already baked in may be hard to unwind. -3. **Has Snowflake been evaluated as a Power BI data source?** If yes, what were the findings (cost, performance, semantic modeling effort)? If no, would you be open to an evaluation once the first dbt curated layer is live in Year 2? +3. **Has Snowflake been evaluated as a Power BI data source?** If yes, what were the findings (cost, performance, semantic modeling effort)? 
If no, would you be open to an evaluation once the first SnowBridge curated layer is live in Year 2? 4. **Is there a business stakeholder asking for cross-domain reports** (machine data joined with MES/ERP/Camstar data in one report) that BOBJ can't deliver today? A named stakeholder here is the strongest signal that Path A is worth the coordination cost. 5. **What's the rough split of your report inventory** between machine-data-heavy reports, compliance reports, cross-domain reports, and pure-enterprise reports? A rough count is enough — we're not looking for a census, just the shape of the portfolio. -6. **Does the reporting team have capacity to learn Snowflake + dbt semantic modeling?** If that's a deal-breaker, Path A is off the table and we should plan for Path B + a parallel Snowflake analytics stack that non-reporting users consume. +6. **Does the reporting team have capacity to learn Snowflake-side semantic modeling against the curated tables SnowBridge writes?** (No dbt on their side either — they'd consume curated tables directly via the Power BI Snowflake connector.) If that's a deal-breaker, Path A is off the table and we should plan for Path B + a parallel Snowflake analytics stack that non-reporting users consume. 7. **Who owns the decision on Power BI's data sources?** Your team, a BI governance body, IT architecture, the CIO? We need to know who to bring into the Path-A discussion if it progresses. 8. **Would you be willing to pilot one cross-domain report on Snowflake (Path A) during Year 2** as a proof point, independent of the rest of the migration? This is a low-commitment way to validate Path A before betting more reports on it. @@ -841,7 +849,7 @@ Use these to land the coordination conversation. Priority order — the first fo After the conversation, place the outcome into one of these buckets: -- **Bucket A — Full Path A commitment.** Reporting team commits to migrating all non-compliance reports to Snowflake over Years 2–3. 
→ Update `roadmap.md` (Snowflake dbt Transform Layer workstream) to include reporting-shaped views in Year 2. Update `goal-state.md` to name cross-domain reporting as a pillar 2 "not possible before" candidate. +- **Bucket A — Full Path A commitment.** Reporting team commits to migrating all non-compliance reports to Snowflake over Years 2–3. → Update `roadmap.md` (SnowBridge workstream) to include reporting-shaped curated tables in Year 2. Update `goal-state.md` to name cross-domain reporting as a pillar 2 "not possible before" candidate. - **Bucket B — Path C commitment.** Reporting team commits to the hybrid path with a published report-category rubric. → Same roadmap updates as A, plus document the rubric as a link from this subsection. - **Bucket C — Path B lock-in.** Reporting team declines Path A for cost, capacity, or timing reasons. → Update `goal-state.md` here to record the decision. No roadmap changes. Pillar 2's "not possible before" use case must come from a different source (e.g., predictive maintenance, OEE anomaly detection) because cross-domain reporting is off the table. - **Bucket D — Conversation inconclusive.** Reporting team needs more time, or the decision is above their level. → Schedule follow-up. Note which questions were answered and which are still open. @@ -850,7 +858,7 @@ After the conversation, place the outcome into one of these buckets: - Whether the reporting team completes their Power BI migration (their decision). - Whether Historian's SQL surface is ever retired (no — it's the compliance system of record). -- Whether this plan's Snowflake dbt layer supports Power BI (yes, it can — the question is only whether the reporting team will consume it). +- Whether this plan's SnowBridge curated layer in Snowflake supports Power BI (yes, it can — the question is only whether the reporting team will consume it). 
-- Whether the SnowBridge's tag selection is driven by reporting requirements (partly — SnowBridge's selection is governed by blast-radius approval, so reporting-team requests are handled through the same workflow as any other).
+- Whether SnowBridge's tag selection is driven by reporting requirements (partly — SnowBridge's selection is governed by blast-radius approval, so reporting-team requests are handled through the same workflow as any other).

_TBD — name and sponsor of the Power BI migration initiative; named owner on the reporting team for this coordination; whether a joint session between this plan's build team and the reporting team has been scheduled; whether a Power BI + Snowflake proof-of-concept can fit into Year 1 as a forward-looking test, independent of the rest of Year 1's scope._
diff --git a/outputs/DESIGN.md b/outputs/DESIGN.md
index e9b4987..8f71606 100644
--- a/outputs/DESIGN.md
+++ b/outputs/DESIGN.md
@@ -133,7 +133,7 @@ These get resolved during implementation, not during design:
1. **Mermaid rendering in the Claude Code environment.** Unknown until I try. Fallback is manual rendering at mermaid.live; neither path breaks the design.
2. **Whether `document-skills:pptx` can produce a 3-column layout** (needed for slide 5: Three Pillars) and a 2-column layout (needed for slide 16: Open Coordination Items). If not, the spec falls back to single-column with visual separation.
-3. **Table overflow behavior on slide 13.** The 7×3 roadmap grid with truncated cell content should fit one slide, but if it overflows, the spec needs a fallback: either shrink text or split across two slides.
+3. **Table overflow behavior on slide 13.** The 6×3 roadmap grid with truncated cell content should fit one slide, but if it overflows, the spec needs a fallback: either shrink text or split across two slides.
4. **First-pass theme quality.** I'll use the theme-factory default; the first output becomes the visual baseline. If it looks wrong, section 3's "visual style" line is where the override goes.
## Task list diff --git a/outputs/IMPLEMENTATION-PLAN.md b/outputs/IMPLEMENTATION-PLAN.md index c6c66ae..953090f 100644 --- a/outputs/IMPLEMENTATION-PLAN.md +++ b/outputs/IMPLEMENTATION-PLAN.md @@ -123,7 +123,7 @@ Expected: `DESIGN.md IMPLEMENTATION-PLAN.md diagrams generated` (README, spec **Content requirements:** - Mermaid `flowchart LR` (left-to-right) -- Nodes match `goal-state.md` line 77 exactly: Equipment → OtOpcUa → System Platform/Ignition → ScadaBridge → Redpanda → SnowBridge → Snowflake → dbt → Power BI +- Nodes: Equipment → OtOpcUa → System Platform/Ignition → ScadaBridge → Redpanda → SnowBridge (ingest + in-process transform) → Snowflake (curated layer) → Power BI - IT↔OT boundary marker between ScadaBridge and Redpanda (Redpanda is IT-adjacent from ScadaBridge's central crossing) **Verification:** diff --git a/outputs/README.md b/outputs/README.md index 9b73dbe..9d946ae 100644 --- a/outputs/README.md +++ b/outputs/README.md @@ -86,7 +86,7 @@ Diagrams are **hand-drawn source files** committed to [`diagrams/`](diagrams/), The spec currently references two diagrams: - `diagrams/architecture-layers.png` — the 4-layer goal-state architecture stack (Equipment → OtOpcUa → SCADA → ScadaBridge → Enterprise IT), with the IT↔OT boundary marked. -- `diagrams/end-to-end-flow.png` — the left-to-right data flow for one tag, matching `goal-state.md` line 77 exactly (Equipment → OtOpcUa → System Platform/Ignition → ScadaBridge → Redpanda → SnowBridge → Snowflake → dbt → Power BI). +- `diagrams/end-to-end-flow.png` — the left-to-right data flow for one tag (Equipment → OtOpcUa → System Platform/Ignition → ScadaBridge → Redpanda → SnowBridge → Snowflake → Power BI). SnowBridge owns ingest + in-process transform; no separate dbt hop. Both are **not yet authored.** On first regeneration, Claude will either author the `.mmd` sources and attempt to render them, or flag this as a manual step in the run log. 
Until the PNGs exist, the corresponding slides (slides 8 and 9 in the deck) will have placeholder boxes. diff --git a/outputs/longform-spec.md b/outputs/longform-spec.md index 4a98371..03b147b 100644 --- a/outputs/longform-spec.md +++ b/outputs/longform-spec.md @@ -107,7 +107,7 @@ The Layered Architecture text diagram in `goal-state.md` (and any similar text d Markdown tables render as PDF tables with row-level borders and header-row emphasis. Tables that exceed one page split cleanly at row boundaries, with the header row repeated at the top of each continuation page. `document-skills:pdf` handles this natively. Specific large tables to expect: -- [`../roadmap.md`](../roadmap.md) → **The grid** (7 workstreams × 3 years) — likely spans 2–3 pages +- [`../roadmap.md`](../roadmap.md) → **The grid** (6 workstreams × 3 years) — likely spans 2–3 pages - [`../current-state/legacy-integrations.md`](../current-state/legacy-integrations.md) → per-row integration detail tables (one per integration) - [`../current-state/equipment-protocol-survey.md`](../current-state/equipment-protocol-survey.md) → field schema table, classification table, rollup views diff --git a/outputs/presentation-spec.md b/outputs/presentation-spec.md index 3af6498..a323099 100644 --- a/outputs/presentation-spec.md +++ b/outputs/presentation-spec.md @@ -124,13 +124,13 @@ If `document-skills:pptx` cannot render a requested layout: | **Source** | [`../goal-state.md`](../goal-state.md) → **OtOpcUa — the unified site-level OPC UA layer** | | **Population** | 6 bullets: (1) single sanctioned OPC UA access point per site, one session per equipment; (2) two namespaces — equipment + System Platform (absorbs LmxOpcUa); (3) clustered, co-located on existing System Platform nodes; (4) hybrid driver strategy — proactive core library + on-demand long-tail; (5) OPC UA-native auth (UserName + standard security modes, inherited from LmxOpcUa); (6) tiered cutover — ScadaBridge → Ignition → System Platform IO across Years 
1–3. |

-## Slide 11 — Analytics Stack: SnowBridge, Snowflake, dbt
+## Slide 11 — Analytics Stack: SnowBridge + Snowflake

| Property | Value |
|---|---|
| **Layout** | Content (bulleted) |
| **Source** | [`../goal-state.md`](../goal-state.md) → **SnowBridge** + **Aveva Historian → Snowflake** + **Snowflake-side transform tooling** |
-| **Population** | 6 bullets: (1) SnowBridge — custom-built machine-data-to-Snowflake upload service; (2) source abstraction — Aveva Historian SQL in Year 1, Redpanda/ScadaBridge in Year 2; (3) governed selection with blast-radius approval workflow; (4) dbt curated layers, orchestrator out of scope; (5) ≤15-minute analytics SLO; (6) one "not possible before" AI/analytics use case in production by end of plan (pillar 2 gate). |
+| **Population** | 6 bullets: (1) SnowBridge — custom-built .NET service owning **ingest + in-process transform + curated-table write** into Snowflake; (2) source abstraction — Aveva Historian SQL in Year 1, Redpanda/ScadaBridge in Year 2; (3) governed selection with blast-radius approval workflow; (4) **no dbt, no external orchestrator, no Snowflake landing tier** — transforms live in the service; (5) ≤15-minute analytics SLO; (6) one "not possible before" AI/analytics use case in production by end of plan (pillar 2 gate). |

## Slide 12 — Redpanda EventHub: the async backbone

@@ -146,16 +146,16 @@
|---|---|
-| **Layout** | Table — 7 rows × 3 columns (+ workstream name column = 4 columns total) |
+| **Layout** | Table — 6 rows × 3 columns (+ workstream name column = 4 columns total) |
| **Source** | [`../roadmap.md`](../roadmap.md) → **The grid** |
-| **Population** | Render the 7×3 roadmap grid as a PPTX table. **Truncate** each cell to the **single most important commitment** (not the full cell text, which would overflow). Workstream column: full name. Year columns: ~10-word headline per cell. Color-code cells by pillar if the theme supports it.
| -| **Fallback** | If the 7×3 table doesn't fit one slide at readable type size, split across two slides: workstreams 1–4 on slide 13a (OtOpcUa, Redpanda, SnowBridge, dbt), workstreams 5–7 on slide 13b (ScadaBridge Extensions, Site Onboarding, Legacy Retirement). Label slides 13 and 14, renumber subsequent slides. | +| **Population** | Render the 6×3 roadmap grid as a PPTX table. **Truncate** each cell to the **single most important commitment** (not the full cell text, which would overflow). Workstream column: full name. Year columns: ~10-word headline per cell. Color-code cells by pillar if the theme supports it. | +| **Fallback** | If the 6×3 table doesn't fit one slide at readable type size, split across two slides: workstreams 1–3 on slide 13a (OtOpcUa, Redpanda, SnowBridge), workstreams 4–6 on slide 13b (ScadaBridge Extensions, Site Onboarding, Legacy Retirement). Label slides 13 and 14, renumber subsequent slides. | ## Slide 14 — Year 1 Focus | Property | Value | |---|---| | **Layout** | Content (bulleted) | -| **Source** | [`../roadmap.md`](../roadmap.md) → the Year 1 column across all 7 workstreams | -| **Population** | 7 bullets, one per workstream, ordered by prerequisite position: (1) OtOpcUa — evolve LmxOpcUa, protocol survey, deploy to every site, begin tier-1 cutover; (2) Redpanda — stand up central cluster, schema registry, initial topics; (3) SnowBridge — design + first source adapter (Historian SQL) with filtered flow; (4) dbt — scaffold project, first curated model; (5) ScadaBridge Extensions — deadband publishing + EventHub producer; (6) Site Onboarding — document lightweight onboarding pattern (no new sites Year 1); (7) Legacy Retirement — populate inventory (done), retire first integration as pattern-proving exercise. 
|
+| **Source** | [`../roadmap.md`](../roadmap.md) → the Year 1 column across all 6 workstreams |
+| **Population** | 6 bullets, one per workstream, ordered by prerequisite position: (1) OtOpcUa — evolve LmxOpcUa, deploy to every site, begin tier-1 cutover, UNS hierarchy snapshot walk; (2) Redpanda — stand up central cluster, schema registry, initial topics, publish canonical model v1; (3) SnowBridge — design + first source adapter (Historian SQL) with filtered ingest + in-process transform + first curated tables aligned to canonical model; (4) ScadaBridge Extensions — deadband publishing + EventHub producer; (5) Site Onboarding — document lightweight onboarding pattern (no new sites Year 1); (6) Legacy Retirement — populate inventory (done), retire first integration as pattern-proving exercise. |
-| **Rules** | **Exceeds the 6-bullet truncation rule.** 7 bullets here is intentional because each bullet represents one workstream's Year 1 commitment — dropping one would misrepresent the plan. Keep all 7, tighten wording to ≤10 words per bullet. |
+| **Rules** | **Meets the 6-bullet truncation rule exactly.** One bullet per workstream's Year 1 commitment; dropping any would misrepresent the plan. Keep all 6, tighten wording to ≤10 words per bullet. |

## Slide 15 — Pillar 3: Legacy Retirement (3 → 0)

@@ -172,7 +172,7 @@ If `document-skills:pptx` cannot render a requested layout:
|---|---|
| **Layout** | 2-column content (fallback: single column with horizontal rule) |
| **Source** | [`../goal-state.md`](../goal-state.md) → **Strategic Considerations (Adjacent Asks)** |
-| **Population** | **Left column — Digital twin (scope: two access-control patterns):** 4 bullets: (1) Scope is definitive — not a committed workstream, not a new component; (2) Pattern 1 — environment-lifecycle promotion without reconfiguration (ACL flip on write authority); (3) Pattern 2 — safe read-only consumption for KPI / monitoring systems (structural zero-write-path guarantee); (4) Both patterns are delivered by already-committed architecture (OtOpcUa ACL model + canonical model + single-connection-per-equipment).
**Right column — BOBJ → Power BI:** 4 bullets: (1) In-flight reporting initiative, not owned by this plan; (2) Three consumption paths analyzed (Snowflake dbt / Historian direct / both); (3) Recommended position: Path C — hybrid, with Path A as strategic direction; (4) Next: schedule coordination conversation with reporting team — 8 questions ready in `goal-state.md`. | +| **Population** | **Left column — Digital twin (scope: two access-control patterns):** 4 bullets: (1) Scope is definitive — not a committed workstream, not a new component; (2) Pattern 1 — environment-lifecycle promotion without reconfiguration (ACL flip on write authority); (3) Pattern 2 — safe read-only consumption for KPI / monitoring systems (structural zero-write-path guarantee); (4) Both patterns are delivered by already-committed architecture (OtOpcUa ACL model + canonical model + single-connection-per-equipment). **Right column — BOBJ → Power BI:** 4 bullets: (1) In-flight reporting initiative, not owned by this plan; (2) Three consumption paths analyzed (SnowBridge curated layer in Snowflake / Historian direct / both); (3) Recommended position: Path C — hybrid, with Path A as strategic direction; (4) Next: schedule coordination conversation with reporting team — 8 questions ready in `goal-state.md`. | ## Slide 17 — Non-Goals @@ -189,7 +189,7 @@ If `document-skills:pptx` cannot render a requested layout: |---|---| | **Layout** | Content (bulleted) | | **Source** | [`../status.md`](../status.md) → **Top pending items** + inferred from [`../roadmap.md`](../roadmap.md) → Year 1 | -| **Population** | 4 bullets: (1) Sponsor confirmation + Year 1 funding commitment; (2) Named owners for each of the 7 workstreams (build team alignment); (3) Power BI coordination conversation with reporting team — schedule; (4) UNS hierarchy snapshot walk owner named (Q1–Q2 Year 1 prerequisite for canonical model v1 publication). 
| +| **Population** | 4 bullets: (1) Sponsor confirmation + Year 1 funding commitment; (2) Named owners for each of the 6 workstreams (build team alignment); (3) Power BI coordination conversation with reporting team — schedule; (4) UNS hierarchy snapshot walk owner named (Q1–Q2 Year 1 prerequisite for canonical model v1 publication). | | **Notes** | This is the closer slide. Each bullet should be a discrete ask with a clear "who needs to do what" so the audience leaves with action. | --- diff --git a/roadmap.md b/roadmap.md index a0027d1..95e39a3 100644 --- a/roadmap.md +++ b/roadmap.md @@ -31,11 +31,10 @@ The roadmap is laid out as a 2D grid — **workstreams** (rows) crossed with **y 1. **OtOpcUa** — evolve the existing in-house `LmxOpcUa` into a unified clustered OPC UA server (**OtOpcUa**) with two namespaces: the existing System Platform namespace plus a new equipment namespace that holds the single session to each piece of equipment. Ship it to every site and execute the tiered cutover of downstream consumers (see `goal-state.md` → **OtOpcUa — the unified site-level OPC UA layer (absorbs LmxOpcUa)**). Prioritized first because it is **foundational** for the rest of the OT plan. 2. **Redpanda EventHub** — stand up and operate the central Kafka-compatible backbone (see `goal-state.md` → Async Event Backbone). -3. **SnowBridge** — custom-build the dedicated service that owns all machine-data flows into Snowflake (see `goal-state.md` → SnowBridge). -4. **Snowflake dbt Transform Layer** — build and evolve the dbt curated layers that Snowflake consumers read from (see `goal-state.md` → Aveva Historian → Snowflake → Snowflake-side transform tooling). -5. **ScadaBridge Extensions** — add and tune the capabilities ScadaBridge needs to serve the new architecture (deadband publishing, EventHub producer configuration, auth alignment). -6. 
**Site Onboarding** — bring the currently unintegrated smaller sites onto the standardized stack, and keep the already-integrated sites aligned with the evolving pattern.
-7. **Legacy Retirement** — discover, sequence, migrate, dual-run, and retire every legacy IT↔OT path tracked in [`current-state/legacy-integrations.md`](current-state/legacy-integrations.md).
+3. **SnowBridge** — custom-build the dedicated service that owns machine-data ingest, in-process transform, and direct writes to curated tables in Snowflake (see `goal-state.md` → SnowBridge). Includes the canonical-model-aligned transforms and curated-layer delivery — no separate dbt workstream.
+4. **ScadaBridge Extensions** — add and tune the capabilities ScadaBridge needs to serve the new architecture (deadband publishing, EventHub producer configuration, auth alignment).
+5. **Site Onboarding** — bring the currently unintegrated smaller sites onto the standardized stack, and keep the already-integrated sites aligned with the evolving pattern.
+6. **Legacy Retirement** — discover, sequence, migrate, dual-run, and retire every legacy IT↔OT path tracked in [`current-state/legacy-integrations.md`](current-state/legacy-integrations.md).
### Workstream → pillar mapping

@@ -43,8 +42,7 @@ The roadmap is laid out as a 2D grid — **workstreams** (rows) crossed with **y
|---|---|
| OtOpcUa | Pillars 1, 2 — foundational (unblocks consistent equipment access for both unification and analytics paths) |
| Redpanda EventHub | Pillar 2 (analytics/AI enablement) — foundational |
-| SnowBridge | Pillar 2 |
-| Snowflake dbt Transform Layer | Pillar 2 |
+| SnowBridge | Pillar 2 (ingest + transform + curated layer) |
| ScadaBridge Extensions | Pillars 1, 2, 3 — touches all three |
| Site Onboarding | Pillar 1 (unification) |
| Legacy Retirement | Pillar 3 (legacy retirement) |
@@ -53,7 +51,7 @@ The roadmap is laid out as a 2D grid — **workstreams** (rows) crossed with **y
- **OtOpcUa is foundational** and its *deployment* (software installed and ready at every site) is a Year 1 prerequisite for everything else. Its *cutover* (consumers redirected to it) follows the tiered order and extends across all three years, but the software must be present at every site before other workstreams take hard dependencies on equipment-data consistency. LmxOpcUa is already deployed per-node; Year 1 grows it into OtOpcUa in place, which keeps the rollout a low-risk evolution rather than a parallel install.
- **Redpanda** must be in place before the **SnowBridge** can consume Redpanda-backed flows, and before **ScadaBridge Extensions** can test the EventHub producer path end-to-end.
-The **SnowBridge** must be in place before **dbt** curated layers can be built on real machine-data landing tables.
+- **SnowBridge** owns both ingest and transform; its Year 1 transforms land the first curated tables directly (no separate dbt layer).
- The **Legacy inventory** (in `current-state/legacy-integrations.md`) must be populated before **Legacy Retirement** can be sequenced; inventory discovery is a Year 1 prerequisite.
- **ScadaBridge tier-1 cutover** (ScadaBridge reading from OtOpcUa instead of equipment directly) must be completed at a site before **ScadaBridge Extensions** at that site can rely on consistent equipment-data semantics for downstream Redpanda publishing. - **Site Onboarding** for the smaller sites depends on having the **standardized stack** (OtOpcUa + ScadaBridge + Redpanda + SnowBridge) reasonably proven at the large sites — so heavy onboarding is Year 2+, not Year 1. @@ -65,8 +63,7 @@ The roadmap is laid out as a 2D grid — **workstreams** (rows) crossed with **y |---|---|---|---| | **OtOpcUa** | **Evolve LmxOpcUa into OtOpcUa** — extend the existing in-house OPC UA server to add (a) a new equipment namespace with single session per equipment via native protocols translated to OPC UA (committed core drivers: OPC UA Client, Modbus TCP, AB CIP, AB Legacy, S7, TwinCAT, FOCAS, plus Galaxy carried forward), and (b) clustering (non-transparent redundancy, 2-node per site) on top of the existing per-node deployment. **Driver stability tiers:** Tier A in-process (Modbus, OPC UA Client), Tier B in-process with guards (S7, AB CIP, AB Legacy, TwinCAT), Tier C out-of-process (Galaxy — bitness constraint, FOCAS — uncatchable AVE). Core driver list confirmed by v2 implementation team (protocol survey no longer needed for driver scoping). **UNS hierarchy snapshot walk** — per-site equipment-instance discovery (site/area/line/equipment + UUID assignment) to feed the initial schemas-repo hierarchy definition and canonical model; target done Q1–Q2. **ACL model designed and committed** (decisions #129–132): 6-level scope hierarchy, `NodePermissions` bitmask, generation-versioned `NodeAcl` table, Admin UI + permission simulator. Phase 1 ships before any driver phase. **Deploy OtOpcUa to every site** as fast as practical. **Begin tier 1 cutover (ScadaBridge)** at large sites. **Prerequisite: certificate-distribution** to consumer trust stores before each cutover. 
**Aveva System Platform IO pattern validation** — Year 1 or early Year 2 research to confirm Aveva supports upstream OPC UA data sources, well ahead of Year 3 tier 3. _TBD — first-cutover site selection; **cutover plan owner** (not OtOpcUa — a separate integration/operations team, per decision #136, not yet named); enterprise shortname for UNS hierarchy root; schemas-repo owner team and dedicated repo creation._ | **Complete tier 1 (ScadaBridge)** across all sites. **Begin tier 2 (Ignition)** — Ignition consumers redirected from direct-equipment OPC UA to each site's OtOpcUa, collapsing WAN session counts from *N per equipment* to *one per site*. **Build long-tail drivers** on demand as sites require them. Resolve Warsaw per-building multi-cluster consumer-addressing pattern (consumer-side stitching vs site-aggregator OtOpcUa instance). _TBD — per-site tier-2 rollout sequence._ | **Complete tier 2 (Ignition)** across all sites. **Execute tier 3 (Aveva System Platform IO)** with compliance stakeholder validation — the hardest cutover because System Platform IO feeds validated data collection. Reach steady state: every equipment session is held by OtOpcUa, every downstream consumer reads OT data through it. _TBD — per-equipment-class criteria for System Platform IO re-validation._ | | **Redpanda EventHub** | Stand up central Redpanda cluster in South Bend (single-cluster HA). Stand up bundled Schema Registry. Wire SASL/OAUTHBEARER to enterprise IdP. Create initial topic set (prefix-based ACLs). Hook up observability minimum signal set. Define the three retention tiers (`operational`/`analytics`/`compliance`). **Stand up the central `schemas` repo** with `buf` CI, CODEOWNERS, and the NuGet publishing pipeline. 
**Publish the canonical equipment/production/event model v1** — including the canonical machine state vocabulary (`Running / Idle / Faulted / Starved / Blocked` + any agreed additions) as a Protobuf enum, the `equipment.state.transitioned` event schema, and initial equipment-class definitions for pilot equipment. This is load-bearing for pillar 2 (canonical model is what makes cross-domain "not possible before" analytics possible at all). **Pilot equipment class for canonical definition: FANUC CNC** (pre-defined FOCAS2 hierarchy already exists in OtOpcUa v2 driver design). Land the FANUC CNC class template in the schemas repo before Tier 1 cutover begins. **Universal `_base` equipment-class template** seeded by the OtOpcUa team — every other class extends it via the `extends` field on the equipment-class JSON Schema. `_base` aligns to **OPC UA Companion Spec OPC 40010 (Machinery)** for the Identification component (Manufacturer, Model, ProductInstanceUri, SerialNumber, HardwareRevision, SoftwareRevision, YearOfConstruction, ManufacturerUri, DeviceManual, AssetLocation) and MachineryOperationMode enum, **OPC UA Part 9** for alarm-summary fields, and **ISO 22400** for lifetime counters that feed Availability + Performance KPIs. Avoids per-class drift in identity / state / alarm field naming and ensures every machine in the estate exposes the same baseline metadata regardless of vendor. _TBD — sizing decisions, initial topic list, canonical vocabulary ownership (domain SME group)._ | Expand topic coverage as additional domains onboard. Enforce tiered retention and ACLs at scale. Prove backlog replay after a WAN-outage drill (the replay surface is also the foundation for any future funded physics-simulation / FAT initiative, should one materialize). Exercise long-outage planning (ScadaBridge queue capacity vs. outage duration). Iterate the canonical model as additional equipment classes and domains onboard. _TBD — concrete drill cadence._ | Steady-state operation. 
Harden alerting and runbooks against the observed failure modes from Years 1–2. Canonical model is mature and covers every in-scope equipment class; schema changes are routine rather than foundational. | -| **SnowBridge** | Design and begin custom build in .NET. **Filtered, governed upload to Snowflake is the Year 1 purpose** — the service is the component that decides which topics/tags flow to Snowflake, applies the governed selection model, and writes into Snowflake. Ship an initial version with **one working source adapter** — starting with **Aveva Historian (SQL interface)** because it's central-only, exists today, and lets the workstream progress in parallel with Redpanda rather than waiting on it. First end-to-end **filtered** flow to Snowflake landing tables on a handful of priority tags. Selection model in place even if the operator UI isn't yet (config-driven is acceptable for Year 1). _TBD — team, credential management, datastore for selection state._ | Add the **ScadaBridge/Redpanda source adapter** alongside Historian. Build and ship the operator **web UI + API** on top of the Year 1 selection model, including the blast-radius-based approval workflow, audit trail, RBAC, and exportable state. Onboard priority tags per domain under the UI-driven governance path. _TBD — UI framework._ | All planned source adapters live behind the unified interface. Approval workflow tuned based on Year 2 operational experience. Feature freeze; focus on hardening. | -| **Snowflake dbt Transform Layer** | Scaffold a dbt project in git, wired to the self-hosted orchestrator (per `goal-state.md`; specific orchestrator chosen outside this plan). Build first **landing → curated** model for priority tags. 
**Align curated views with the canonical model v1** published in the `schemas` repo — equipment, production, and event entities in the curated layer use the canonical state vocabulary and the same event-type enum values, so downstream consumers (Power BI, ad-hoc analysts, future AI/ML) see the same shape of data Redpanda publishes. This is the dbt-side delivery of the canonical model (load-bearing for pillar 2). Establish `dbt test` discipline from day one — including tests that catch divergence between curated views and the canonical enums. _TBD — project layout (single vs per-domain); reconciliation rule if derived state in curated views disagrees with the layer-3 derivation (should not happen, but the rule needs to exist)._ | Build curated layers for all in-scope domains. **Ship a canonical-state-based OEE model** as a strong candidate for the pillar-2 "not possible before" use case — accurate cross-equipment, cross-site OEE computed once in dbt from the canonical state stream, rather than re-derived in every reporting surface. Source-freshness SLAs tied to the **≤15-minute analytics** budget. Begin development of the first **"not possible before" AI/analytics use case** (pillar 2). | The "not possible before" use case is **in production**, consuming the curated layer, meeting its own SLO. Pillar 2 check passes. | +| **SnowBridge** | Design and begin custom build in .NET. **Filtered, governed ingest + in-process transform + curated-table write** is the Year 1 purpose — SnowBridge decides which topics/tags flow to Snowflake, applies the governed selection model, transforms in-process (.NET), and writes curated rows directly. Ship an initial version with **one working source adapter** — starting with **Aveva Historian (SQL interface)** because it's central-only, exists today, and lets the workstream progress in parallel with Redpanda rather than waiting on it. 
First end-to-end filtered + transformed flow to Snowflake **curated tables** on a handful of priority tags. Selection model in place even if the operator UI isn't yet (config-driven is acceptable for Year 1). **Align curated-table schemas with the canonical model v1** published in the `schemas` repo — equipment, production, and event entities in the curated layer use the canonical state vocabulary and the same event-type enum values, so downstream consumers (Power BI, ad-hoc analysts, future AI/ML) see the same shape of data Redpanda publishes. Establish SnowBridge's SQL-based **data-quality validation framework** from day one — including checks that catch divergence between curated tables and the canonical enums. _TBD — team, credential management, datastore for selection state; format for transform declarations within the codebase; validation framework specifics; reconciliation rule if derived state in curated tables disagrees with the layer-3 derivation._ | Add the **ScadaBridge/Redpanda source adapter** alongside Historian. Build and ship the operator **web UI + API** on top of the Year 1 selection model, including the blast-radius-based approval workflow, audit trail, RBAC, and exportable state. Build curated layers for all in-scope domains. **Ship a canonical-state-based OEE model** as a strong candidate for the pillar-2 "not possible before" use case — accurate cross-equipment, cross-site OEE computed once in SnowBridge from the canonical state stream, rather than re-derived in every reporting surface. Source-freshness SLAs tied to the **≤15-minute analytics** budget. Onboard priority tags per domain under the UI-driven governance path. Begin development of the first **"not possible before" AI/analytics use case** (pillar 2). _TBD — UI framework._ | All planned source adapters live behind the unified interface. Approval workflow tuned based on Year 2 operational experience. Feature freeze; focus on hardening. 
The "not possible before" use case is **in production**, consuming the curated layer, meeting its own SLO. Pillar 2 check passes. | | **ScadaBridge Extensions** | Implement **deadband / exception-based publishing** with the global-default model (+ override mechanism). Add **EventHub producer** capability with per-call **store-and-forward** to Redpanda. Verify co-located footprint doesn't degrade System Platform. _TBD — global deadband value, override mechanism location._ | Roll deadband + EventHub producer to **all currently-integrated sites**. Tune deadband and overrides based on observed Snowflake cost. Support early legacy-retirement work with outbound Web API / DB write patterns as needed. | Steady state. Any remaining Extensions work is residual cleanup or support for the tail end of Site Onboarding / Legacy Retirement. | | **Site Onboarding** | **No new site onboardings in Year 1.** Use the year to define and document the **lightweight onboarding pattern** for smaller sites — equipment types, network requirements, standard ScadaBridge template set, standard topic/tag set. Keep the existing integrated sites stable. | **Pilot the onboarding pattern** on one smaller site end-to-end (Berlin, Winterthur, or Jacksonville — choice TBD). Use learnings to refine the pattern, then **begin scaling** onboarding to additional smaller sites. _TBD — pilot site selection criteria, per-site effort estimate._ | **Complete onboarding of all remaining smaller sites.** Every site on the authoritative list is on the standardized stack. Pillar 1 check passes. | | **Legacy Retirement** | **Populate the legacy inventory** (`current-state/legacy-integrations.md`) — this is the prerequisite for sequencing. Identify **early-retirement candidates** where the replacement path already exists (e.g., **LEG-002 Camstar**, since ScadaBridge already has a native Camstar path). Retire at least one integration end-to-end as a pattern-proving exercise (including dual-run + decommission). 
_TBD — inventory ownership, discovery approach._ | **Bulk migration.** Execute retirements in sequence against the inventory, prioritized by a mix of risk and ease. Each retirement follows: plan → build replacement (often in ScadaBridge Extensions) → dual-run → cutover → decommission. Inventory burn-down tracked quarterly. _TBD — prioritization rubric, dual-run duration per integration class._ | **Drive inventory to zero.** Any remaining integrations are in dual-run or decommission phase at start of year; the inventory reaches zero by end of year. Pillar 3 check passes. | @@ -82,7 +79,7 @@ The roadmap is laid out as a 2D grid — **workstreams** (rows) crossed with **y At the end of Year 3, the three pillar criteria from `goal-state.md` → Success Criteria are each **binary**. The cells above are structured so that the relevant workstream ends Year 3 having either satisfied its share of the check or not. - **Pillar 1 — Unification:** Site Onboarding ends Year 3 with all sites on the standardized stack. -- **Pillar 2 — Analytics/AI Enablement:** dbt + SnowBridge + Redpanda end Year 3 with the "not possible before" use case in production against the ≤15-minute analytics SLO. +- **Pillar 2 — Analytics/AI Enablement:** SnowBridge + Redpanda end Year 3 with the "not possible before" use case in production against the ≤15-minute analytics SLO. - **Pillar 3 — Legacy Retirement:** Legacy Retirement ends Year 3 with the inventory at zero. If a workstream appears to be falling behind its Year 3 cell, the response is **never** to soften the end-state criterion. It is either to accelerate the workstream, reallocate from a lower-risk workstream, or formally accept slippage and adjust the plan — but the success criteria are not moved. diff --git a/schemas/CONTRIBUTING.md b/schemas/CONTRIBUTING.md index 5c1e883..4bfcc98 100644 --- a/schemas/CONTRIBUTING.md +++ b/schemas/CONTRIBUTING.md @@ -7,7 +7,7 @@ 1. 
**Open an issue first** for any new equipment class, UNS subtree, or format change. Describe the use case + the consumer(s) that need it. 2. **Branch + PR** — work on a feature branch, open a PR against `main`. 3. **CI gate** validates every JSON file against the schema in `format/`. -4. **Review** — at least one schemas-repo maintainer + one consumer-team representative (OtOpcUa, Redpanda, or dbt depending on what changed). +4. **Review** — at least one schemas-repo maintainer + one consumer-team representative (OtOpcUa, Redpanda, or SnowBridge depending on what changed). 5. **Merge + tag** — merge to `main` and create a semver tag. Consumers pin to tags. ## Adding a new equipment class @@ -30,7 +30,7 @@ Editing files in `format/` is a breaking change for downstream consumers. Process: 1. Open an issue with the proposed change + rationale. -2. Notify all consumer teams (OtOpcUa, Redpanda, dbt, anyone else listed in `docs/consumer-integration.md`). +2. Notify all consumer teams (OtOpcUa, Redpanda, SnowBridge, anyone else listed in `docs/consumer-integration.md`). 3. Get explicit signoff from each before merging. 4. Bump the major version of every affected class file simultaneously (consumers use this to detect breaking changes). diff --git a/schemas/README.md b/schemas/README.md index bf73d8c..bf3328a 100644 --- a/schemas/README.md +++ b/schemas/README.md @@ -20,7 +20,7 @@ Three surfaces per the 3-year-plan handoff §"Canonical Model Integration": |----------|-----| | **OtOpcUa equipment namespace** | At deploy/config time, OtOpcUa nodes fetch the equipment-class template referenced by `Equipment.EquipmentClassRef` and use it to validate the operator-configured tag set. Drift = config validation error | | **Redpanda topics + Protobuf schemas** | Equipment-class templates derive Protobuf message definitions for canonical events (`equipment.state.transitioned`, etc.) 
| -| **dbt curated layer in Snowflake** | Same templates derive column definitions and dimension tables for the curated analytics model | +| **SnowBridge curated layer in Snowflake** | Same templates derive column definitions and dimension tables for the curated analytics model. SnowBridge owns ingest + in-process .NET transform + write; no separate dbt layer. | OtOpcUa is one consumer of three. Decisions about format, structure, and naming live with the schemas-repo owner team (TBD), not with any one consumer. diff --git a/schemas/docs/consumer-integration.md b/schemas/docs/consumer-integration.md index 9827a4a..c4d9b15 100644 --- a/schemas/docs/consumer-integration.md +++ b/schemas/docs/consumer-integration.md @@ -27,16 +27,16 @@ How each of the three canonical-model consumers integrates with this repo. **Status (2026-04-17)**: not wired. Redpanda team to design the codegen step when the schemas repo has a stable initial class set. -## dbt curated layer in Snowflake +## SnowBridge curated layer in Snowflake -**What it pulls**: equipment-class templates derive column definitions for the curated equipment-state and equipment-signal models in dbt. +**What it pulls**: equipment-class templates derive column definitions for the curated equipment-state and equipment-signal tables SnowBridge writes into Snowflake. SnowBridge owns ingest + in-process transform + write — there is no separate dbt layer. **Integration points**: -- A dbt macro reads `classes/*.json` and generates per-class staging models with the canonical signal columns. -- UNS subtree definitions (`uns/*.json`) drive the dim_site / dim_area / dim_line dimension tables. -- Versioning: dbt project pins to a specific schemas-repo tag; updates require an explicit dbt deploy. +- A SnowBridge codegen step reads `classes/*.json` and generates per-class transform definitions + curated-table DDL with the canonical signal columns. 
+- UNS subtree definitions (`uns/*.json`) drive the dim_site / dim_area / dim_line dimension tables written by SnowBridge.
+- Versioning: the SnowBridge build pins to a specific schemas-repo tag; updates require an explicit SnowBridge release.

-**Status (2026-04-17)**: not wired. dbt team to design the macro when the schemas repo has a stable initial class set.
+**Status (2026-04-24)**: not wired. SnowBridge team to design the codegen step when the schemas repo has a stable initial class set.

## Cross-consumer compatibility

diff --git a/schemas/docs/overview.md b/schemas/docs/overview.md
index 9c49950..8e6556b 100644
--- a/schemas/docs/overview.md
+++ b/schemas/docs/overview.md
@@ -12,7 +12,7 @@ Three OT/IT systems consume the same canonical model:

- **OtOpcUa** equipment namespace — exposes raw signals over OPC UA
- **Redpanda + Protobuf** event topics — canonical event shape on the wire
-**dbt curated layer in Snowflake** — analytics model
+- **SnowBridge curated layer in Snowflake** — analytics model (SnowBridge writes curated rows directly; no separate dbt layer)

Without a central source, they would drift. With one repo, every consumer pulls a versioned snapshot and validates against it. Drift becomes a CI failure, not a production incident.
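To make the consumer-integration codegen bullet concrete: a minimal sketch of deriving per-class curated-table DDL from an equipment-class template. The template field names (`class`, `signals`, `states`), column types, and `curated_equipment_*` table naming are illustrative assumptions, not the schemas repo's actual format — and the production step would live in the SnowBridge .NET codebase; Python is used here only for brevity.

```python
import json

# Hypothetical equipment-class template shaped like a classes/*.json file.
# Field names ("class", "signals", "states") are illustrative assumptions,
# not the schemas repo's actual format.
TEMPLATE = json.loads("""
{
  "class": "filler",
  "version": "1.0.0",
  "signals": [
    {"name": "line_speed", "type": "FLOAT"},
    {"name": "reject_count", "type": "INTEGER"}
  ],
  "states": ["RUNNING", "IDLE", "BLOCKED", "DOWN"]
}
""")


def curated_table_ddl(tpl: dict) -> str:
    """Derive per-class curated-table DDL with the canonical signal columns."""
    columns = [
        "equipment_id STRING",
        "event_ts TIMESTAMP_NTZ",
        "state STRING",  # values constrained to tpl["states"] by validation
    ]
    columns += [f"{sig['name']} {sig['type']}" for sig in tpl["signals"]]
    body = ",\n  ".join(columns)
    return f"CREATE TABLE IF NOT EXISTS curated_equipment_{tpl['class']} (\n  {body}\n);"


print(curated_table_ddl(TEMPLATE))
```

Because the DDL is derived rather than hand-written, a schemas-repo tag bump is the only way a curated table's shape changes — which is what makes "updates require an explicit SnowBridge release" enforceable.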
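The Year 1 SnowBridge cell calls for SQL-based data-quality checks that catch divergence between curated tables and the canonical enums. One way such a check could look — the table name, column name, and state values are assumptions for illustration, and the generator is sketched in Python rather than the service's actual .NET:

```python
def enum_divergence_check(table: str, column: str, allowed: list[str]) -> str:
    """Build a SQL probe returning curated rows whose value falls outside
    the canonical enum; a non-empty result fails the validation run."""
    quoted = ", ".join(f"'{v}'" for v in allowed)
    return (
        f"SELECT {column}, COUNT(*) AS bad_rows\n"
        f"FROM {table}\n"
        f"WHERE {column} NOT IN ({quoted})\n"
        f"GROUP BY {column};"
    )


# Assumed table/column names and a canonical state enum pinned from the
# schemas repo; none of these identifiers come from the plan itself.
CANONICAL_STATES = ["RUNNING", "IDLE", "BLOCKED", "DOWN"]
print(enum_divergence_check("curated_equipment_state", "state", CANONICAL_STATES))
```

Deriving the `allowed` list from the same pinned schemas-repo tag that drives the transforms is what turns enum drift into a failed validation run instead of a silently wrong dashboard.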