Seed the canonical OT schemas content under 3yearplan/schemas/ as a temporary location until a dedicated schemas repo is created (Gitea push-to-create is disabled, so the dedicated repo needs a manual UI step). The initial seed is contributed by the OtOpcUa team to unblock the EquipmentClassRef integration timeline (lmxopcua decision #112) and to give the future cross-team owner a concrete starting point rather than a blank slate. Everything is marked DRAFT, with prominent "ownership TBD" framing in README and CONTRIBUTING: the future owner team should treat this seed as a starting point and revise format / structure / naming as the questions in the README "Open Questions" section get resolved.

Includes:
- README explaining purpose / scope / temporary-location framing / format decision
- CONTRIBUTING.md with proposed workflow + per-class semver versioning policy + validation commands
- format/equipment-class.schema.json defining the shape of a class template (classId, version, displayName, applicability, signals, alarms, optional stateModel)
- format/tag-definition.schema.json defining the shape of a single canonical signal (name, dataType, category, unit, isArray, accessLevel, writeIdempotent, isHistorized, scaling)
- format/uns-subtree.schema.json defining the shape of a per-site UNS subtree (enterprise + site + areas + lines)
- classes/fanuc-cnc.json as the worked pilot class with 16 signals + 3 alarms + suggested state-derivation notes (per OtOpcUa corrections doc D1)
- uns/example-warsaw-west.json as a worked UNS subtree example
- docs/overview.md (what / why / lifecycle / what's NOT in this repo)
- docs/format-decisions.md (8 numbered decisions covering JSON Schema choice per corrections D2, per-class semver, additive-only minor bumps, _default placeholder reservation, signal-name vs UNS-segment regex distinction, stateModel-as-informational, no per-equipment overrides at this layer, applicability.drivers as OtOpcUa driver enumeration)
- docs/consumer-integration.md (how OtOpcUa / Redpanda / dbt each integrate)

$id URLs in the JSON schemas resolve at the actual current path so validators don't 404. Top-level README adds a row to the Component Detail Files table pointing to schemas/. Corrections doc B2 (schemas-repo dependencies) marked partially RESOLVED with the seed location and a list of what still needs the plan team or cross-team owner to decide (owner team naming, dedicated repo migration, format-decision ratification, FANUC CNC pilot confirmation, CI gate setup, Redpanda + dbt consumer integration plumbing).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 12:35:27 -04:00
parent dee56a6846
commit 5953685ffb
12 changed files with 651 additions and 1 deletions
--- a/schemas/docs/consumer-integration.md
+++ b/schemas/docs/consumer-integration.md
@@ -0,0 +1,50 @@
+# Consumer Integration
+
+How each of the three canonical-model consumers integrates with this repo.
+
+## OtOpcUa equipment namespace
+
+**Reference**: `lmxopcua/docs/v2/plan.md` decisions #112, #115; `lmxopcua/docs/v2/config-db-schema.md` Equipment table.
+
+**What it pulls**: equipment-class templates from `classes/` keyed by `Equipment.EquipmentClassRef` in the central config DB.
+
+**Integration points**:
+- At Admin UI draft-publish time: `sp_ValidateDraft` checks that every Equipment row's tag set satisfies the referenced class's required-signal list. Missing required signals = draft validation error.
+- At cluster runtime: each OtOpcUa node fetches the relevant class templates at startup (cached locally per the LiteDB cache pattern) and uses them to validate the applied generation.
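The draft-publish check described in the first integration point amounts to a set difference between the signals a class requires and the tags an Equipment row actually carries. A minimal Python sketch follows. It assumes a hypothetical `required` flag on each signal entry (the tag-definition fields listed for this commit do not include one) and a hypothetical function name; the real check belongs to `sp_ValidateDraft`, which is not shown in this diff.

```python
def missing_required_signals(class_template: dict, equipment_tags: set[str]) -> list[str]:
    """Return the required signal names that are absent from an equipment's tag set.

    Assumption: each entry in class_template["signals"] has a "name" and an
    optional boolean "required" flag (a hypothetical field, for illustration only).
    """
    required = [
        signal["name"]
        for signal in class_template.get("signals", [])
        if signal.get("required", False)
    ]
    return sorted(name for name in required if name not in equipment_tags)


# Illustrative template; "SpindleLoad" is a made-up signal name.
template = {
    "classId": "fanuc-cnc",
    "version": "0.1.0",
    "signals": [
        {"name": "RunState", "required": True},
        {"name": "actual_feedrate", "required": True},
        {"name": "SpindleLoad"},  # treated as optional because the flag is absent
    ],
}

# An empty list means the draft passes; any names returned are validation errors.
errors = missing_required_signals(template, {"RunState"})
```

Returning the missing names rather than a boolean lets an Admin UI list exactly which signals the operator still has to map.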
+
+**Versioning**: each `EquipmentClassRef` value takes the form `classId@version` (e.g. `fanuc-cnc@0.1.0`). Pinning to a specific version protects a consumer against breaking changes in the schemas repo.
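A pinned reference such as `fanuc-cnc@0.1.0` can be split into its class id and semver parts before lookup. The Python sketch below is illustrative only: the function and type names are hypothetical, and it assumes a lowercase-slug `classId` and a plain `MAJOR.MINOR.PATCH` version with no pre-release suffix, neither of which this diff confirms.

```python
import re
from typing import NamedTuple

# Assumption: classId is a lowercase slug and version is plain MAJOR.MINOR.PATCH.
_CLASS_REF = re.compile(
    r"(?P<class_id>[a-z0-9-]+)@(?P<major>\d+)\.(?P<minor>\d+)\.(?P<patch>\d+)"
)


class ClassRef(NamedTuple):
    class_id: str
    major: int
    minor: int
    patch: int


def parse_class_ref(ref: str) -> ClassRef:
    """Split a 'classId@version' string into its id and numeric version parts."""
    match = _CLASS_REF.fullmatch(ref)
    if match is None:
        raise ValueError(f"not a classId@version reference: {ref!r}")
    return ClassRef(
        match["class_id"],
        int(match["major"]),
        int(match["minor"]),
        int(match["patch"]),
    )


ref = parse_class_ref("fanuc-cnc@0.1.0")
```

Rejecting unversioned references outright keeps an accidental "latest" lookup from slipping through where a pin was intended.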
+
+**Status (2026-04-17)**: `Equipment.EquipmentClassRef` ships as a nullable hook column in OtOpcUa Phase 1 with no validation enforcement. Enforcement lands when the schemas repo is publishable and the OtOpcUa team wires the validator into `sp_ValidateDraft`.
+
+## Redpanda topics + Protobuf schemas
+
+**What it pulls**: equipment-class templates, from which Protobuf message definitions for canonical equipment events (`equipment.state.transitioned`, `equipment.signal.changed`, etc.) are derived.
+
+**Integration points**:
+- A code-generation step in the Redpanda team's CI reads `classes/*.json` and emits `.proto` files for each class.
+- The generated `.proto` files are published to a Schema Registry (or equivalent) for runtime consumers.
+- Versioning: Protobuf schema versions track this repo's tag versions one-for-one.
+
+**Status (2026-04-17)**: not wired. Redpanda team to design the codegen step when the schemas repo has a stable initial class set.
+
+## dbt curated layer in Snowflake
+
+**What it pulls**: equipment-class templates, from which column definitions for the curated equipment-state and equipment-signal models in dbt are derived.
+
+**Integration points**:
+- A dbt macro reads `classes/*.json` and generates per-class staging models with the canonical signal columns.
+- UNS subtree definitions (`uns/*.json`) drive the dim_site / dim_area / dim_line dimension tables.
+- Versioning: dbt project pins to a specific schemas-repo tag; updates require an explicit dbt deploy.
+
+**Status (2026-04-17)**: not wired. dbt team to design the macro when the schemas repo has a stable initial class set.
+
+## Cross-consumer compatibility
+
+A breaking change in a class template (major version bump per CONTRIBUTING.md) requires:
+
+1. Schemas repo PR with the change + rationale
+2. Each consumer team confirms the impact on their integration
+3. All three consumers either upgrade simultaneously OR pin to the prior major version until they can upgrade
+4. OtOpcUa specifically: a breaking change to a class with existing equipment in production requires a config-generation publish that updates the equipment's `EquipmentClassRef` to the new version + reconciles tag changes
+
+Breaking changes should be rare. Preferred pattern: add new optional signals (minor bump) + deprecate old ones across multiple minor releases before removing in a major.
--- a/schemas/docs/format-decisions.md
+++ b/schemas/docs/format-decisions.md
@@ -0,0 +1,56 @@
+# Format Decisions
+
+Why the schemas repo looks the way it does. Each decision is open for the schemas-repo owner team to revisit.
+
+## D1 — JSON Schema (Draft 2020-12) as the authoring format
+
+**Alternatives considered**: Protobuf (`.proto`), YAML, custom DSL.
+
+**Choice**: JSON Schema.
+
+**Why**:
+- Idiomatic for .NET 10: OtOpcUa reads templates with the built-in System.Text.Json and validates them with JsonSchema.Net, with no dependencies beyond that package
+- Idiomatic for CI tooling: any CI runner can inspect the files with `jq` and validate them with an off-the-shelf JSON Schema validator (`ajv`, `jsonschema`, etc.), with no bespoke toolchain
+- Best authoring experience: text format, mergeable in git, structured diffing, IDE autocomplete via the `$schema` reference
+- Validation at multiple layers: operator-visible Admin UI errors in OtOpcUa, schemas-repo CI gates, downstream consumer runtime validation
+- Protobuf is better for *wire* serialization (size, speed, generated code) but worse for *authoring*: it requires the `protoc` compiler, and although `.proto` sources are text, the encoded messages are binary and have no useful diff or merge story in git
+- Where wire-format efficiency matters (Redpanda events), we code-generate Protobuf from the JSON Schema source. One-way derivation is simpler than bidirectional sync.
+
+## D2 — Per-class versioning, semver
+
+**Alternatives considered**: whole-repo versioning, no versioning.
+
+**Choice**: each class file has its own `version` field (semver); the repo also tags overall releases.
+
+**Why**:
+- Different classes evolve at different rates (FANUC CNC may stabilize while Modbus PLC catalog grows)
+- Consumers can pin per-class for fine-grained compatibility (e.g. `fanuc-cnc@0.1.0` + `modbus-plc@0.3.2`)
+- Repo-level tags exist to bundle a known-good combination for consumers that want one anchor
+
+## D3 — Strict additive policy on minor bumps
+
+**Why**: removes ambiguity. A consumer that sees `class@1.3.0` knows its signal set is a superset of `class@1.0.0` (and of every earlier `class@1.x.y`). Breaking changes only happen at major-version boundaries.
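The additive policy is mechanical enough to check in CI. The Python sketch below compares two versions of a class file and reports whether the newer one stays within the same major version without dropping any signal. The function name is hypothetical, and the check looks only at signal names; a change to an existing signal's `dataType` or other fields would need a stricter comparison that is not attempted here.

```python
def is_additive_within_major(old: dict, new: dict) -> bool:
    """True if `new` keeps the major version of `old` and retains every old signal name."""
    old_major, old_minor, old_patch = (int(part) for part in old["version"].split("."))
    new_major, new_minor, new_patch = (int(part) for part in new["version"].split("."))

    same_major = old_major == new_major
    # Tuple comparison orders (minor, patch) pairs the way semver does within one major.
    not_older = (new_minor, new_patch) >= (old_minor, old_patch)

    old_names = {signal["name"] for signal in old["signals"]}
    new_names = {signal["name"] for signal in new["signals"]}
    return same_major and not_older and old_names <= new_names


v1_0 = {"version": "1.0.0", "signals": [{"name": "RunState"}]}
v1_3 = {"version": "1.3.0", "signals": [{"name": "RunState"}, {"name": "actual_feedrate"}]}
v1_4_bad = {"version": "1.4.0", "signals": [{"name": "actual_feedrate"}]}  # drops RunState

additive_ok = is_additive_within_major(v1_0, v1_3)
additive_broken = is_additive_within_major(v1_3, v1_4_bad)
```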
+
+## D4 — `_default` reserved as placeholder for unused UNS levels
+
+Imported from `lmxopcua/docs/v2/plan.md` decision #108.
+
+**Why**: some sites have no Area-level distinction (single-building sites). Rather than letting the UNS path have inconsistent depth across sites, we mandate 5 levels always with `_default` as the placeholder. Downstream consumers can rely on path depth.
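A small Python sketch of the fixed-depth rule: any level a site does not use is filled with `_default`, so every path has exactly five segments. The `/` separator, the function name, and the sample values `acme`, `line-1`, and `cnc-01` are assumptions for illustration; the sketch also does not enforce which levels may legitimately be left out.

```python
UNS_LEVELS = ("enterprise", "site", "area", "line", "equipment")
PLACEHOLDER = "_default"


def uns_path(**segments: str) -> str:
    """Build a five-level UNS path, substituting _default for any missing level."""
    unknown = set(segments) - set(UNS_LEVELS)
    if unknown:
        raise ValueError(f"unknown UNS levels: {sorted(unknown)}")
    parts = [segments.get(level) or PLACEHOLDER for level in UNS_LEVELS]
    return "/".join(parts)


# A single-building site with no Area-level distinction.
path = uns_path(enterprise="acme", site="warsaw-west", line="line-1", equipment="cnc-01")
# path == "acme/warsaw-west/_default/line-1/cnc-01"
```

Because the depth never varies, a downstream consumer can split on the separator and index positions directly instead of guessing which level a segment belongs to.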
+
+## D5 — Tag names use PascalCase or snake_case (class's choice), NOT UNS-segment regex
+
+**Why**: UNS path segments (Enterprise/Site/Area/Line/Equipment) are infrastructure-level identifiers — they go on the wire of every browse, every URI, every dashboard filter. The regex (`^[a-z0-9-]{1,32}$`) reflects that constraint.
+
+Signal names (level 6) are vocabulary-level identifiers — they live inside an equipment node. Keeping them in PascalCase or snake_case (e.g. `RunState`, `actual_feedrate`) is more readable for operators looking at OPC UA browse output, and matches OPC UA SDK conventions which expect identifier-style names rather than URL-safe slugs.
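The distinction can be seen by running the quoted UNS-segment regex against both kinds of identifier. This Python sketch uses only the pattern stated above; the helper name is hypothetical. As quoted, the pattern also rejects the reserved `_default` placeholder because of its underscore, so the format schema presumably handles that value separately, which this diff does not show.

```python
import re

# The UNS-segment pattern exactly as quoted in D5.
UNS_SEGMENT = re.compile(r"^[a-z0-9-]{1,32}$")


def is_uns_segment(value: str) -> bool:
    """True if the value is a valid UNS path segment under the quoted regex."""
    return UNS_SEGMENT.fullmatch(value) is not None


results = {
    name: is_uns_segment(name)
    for name in ("warsaw-west", "RunState", "actual_feedrate", "_default")
}
# results == {"warsaw-west": True, "RunState": False,
#             "actual_feedrate": False, "_default": False}
```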
+
+## D6 — `stateModel` is informational, not authoritative
+
+**Why**: state derivation lives at Layer 3 (System Platform / Ignition). Placing the derivation rules in the schemas repo would create dual sources of truth (and the schemas-repo version would inevitably drift). Instead, the class template lists which states the class supports + an informational note about what the rough mapping looks like; Layer 3 owns the actual derivation logic.
+
+## D7 — No per-equipment overrides at this layer
+
+**Why**: per-equipment config (which specific CNC has which program, etc.) is OtOpcUa's central config DB concern. Mixing per-instance config with per-class definitions in this repo would muddy the separation and cause the repo to grow with deployment-specific data instead of staying small + reusable.
+
+## D8 — `applicability.drivers` lists OtOpcUa drivers explicitly
+
+**Why**: the schemas repo is OT-side-focused. The OtOpcUa driver enumeration is the closest thing to a canonical "how do you get raw data from this equipment" vocabulary that exists across the org. If a future class is populated by a non-OtOpcUa source, the field becomes optional or extends. For now, listing OtOpcUa driver IDs makes the consumer-side validation (per `lmxopcua/docs/v2/plan.md` decision #111 — driver type ↔ namespace kind) trivial.
--- a/schemas/docs/overview.md
+++ b/schemas/docs/overview.md
@@ -0,0 +1,42 @@
+# Overview
+
+The `schemas` repo is the org's single source of truth for OT equipment definitions. Three things live here:
+
+1. **UNS hierarchy** — per-site declarations of which Areas and Lines exist (`uns/`).
+2. **Equipment-class templates** — per-class declarations of which raw signals each equipment type exposes (`classes/`).
+3. **Format definitions** — JSON Schemas defining what UNS and class files must look like (`format/`).
+
+## Why a central repo
+
+Three OT/IT systems consume the same canonical model:
+
+- **OtOpcUa** equipment namespace — exposes raw signals over OPC UA
+- **Redpanda + Protobuf** event topics — canonical event shape on the wire
+- **dbt curated layer in Snowflake** — analytics model
+
+Without a central source, they would drift. With one repo, every consumer pulls a versioned snapshot and validates against it. Drift becomes a CI failure, not a production incident.
+
+## Lifecycle
+
+```
+[author edits JSON in this repo]
+          │
+          ▼
+[CI validates against format/*.schema.json]
+          │
+          ▼
+[PR review by maintainer + consumer reps]
+          │
+          ▼
+[merge to main → tag a semver release]
+          │
+          ▼
+[each consumer pulls the tag and integrates per docs/consumer-integration.md]
+```
+
+## What's NOT in this repo
+
+- **Per-equipment configuration** — which specific CNC at Warsaw West runs which program. That's per-instance config in OtOpcUa's central config DB (`lmxopcua/docs/v2/config-db-schema.md` Equipment table), not template-level material.
+- **State derivation rules** — how raw signals derive into `Running` / `Idle` / `Faulted` / `Starved` / `Blocked`. That's Layer 3 logic in Aveva System Platform / Ignition. Equipment-class templates can declare which states the class supports (`stateModel.states`) but not the derivation itself.
+- **Wire-format Protobuf schemas** — those are code-generated from this repo into a separate output artifact for Redpanda. The authoring source is here; the binary wire format is derived.
+- **Authoritative LDAP groups, ACL grants, OPC UA security policies** — those live in the consuming systems (OtOpcUa central config DB, identity provider).