diff --git a/docs/v2/acl-design.md b/docs/v2/acl-design.md new file mode 100644 index 0000000..f214131 --- /dev/null +++ b/docs/v2/acl-design.md @@ -0,0 +1,379 @@ +# OPC UA Client Authorization (ACL Design) — OtOpcUa v2 + +> **Status**: DRAFT — closes corrections-doc finding B1 (namespace / equipment-subtree ACLs not yet modeled in the data path). +> +> **Branch**: `v2` +> **Created**: 2026-04-17 + +## Scope + +This document defines the **OPC UA client data-path authorization model** — who can read, write, subscribe, browse, ack alarms, etc. on which nodes when connecting to the OtOpcUa server endpoint. It is distinct from: + +- **Admin UI authorization** (`admin-ui.md`) — who can edit configuration. That layer has FleetAdmin / ConfigEditor / ReadOnly roles + cluster-scoped grants per decisions #88, #105. +- **DB principal authorization** (`config-db-schema.md` §"Authorization Model") — who can call which stored procedures on the central config DB. That layer is per-NodeId for cluster nodes and per-Admin for Admin app users. + +The data-path ACL layer covers OPC UA clients (ScadaBridge, Ignition, System Platform IO, third-party tools) that connect to the OPC UA endpoint to read or modify equipment data. + +## Permission Model + +Every node operation requires an explicit permission. Permissions are bitmask flags on the v2 schema; the OPC UA NodeManager checks them on every browse, read, write, subscribe, history read, alarm event, and method call. 
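The bitmask check itself is small; as an illustration of the default-deny semantics, here is a minimal Python model (flag names and values mirror a subset of the C# `NodePermissions` enum defined in the next section; this is a sketch, not the production check):

```python
from enum import IntFlag

class NodePermissions(IntFlag):
    """Illustrative subset of the C# NodePermissions flags."""
    NONE = 0
    BROWSE = 1 << 0
    READ = 1 << 1
    SUBSCRIBE = 1 << 2
    WRITE_OPERATE = 1 << 4

def authorize(effective: NodePermissions, required: NodePermissions) -> bool:
    # Allowed only when every required bit is present in the session's
    # effective-permission union. Default-deny falls out for free: an
    # empty union (NONE) satisfies no non-trivial requirement.
    return (effective & required) == required

granted = NodePermissions.BROWSE | NodePermissions.READ
assert authorize(granted, NodePermissions.READ)                    # read allowed
assert not authorize(granted, NodePermissions.WRITE_OPERATE)       # write denied
assert not authorize(NodePermissions.NONE, NodePermissions.BROWSE) # default-deny
```

The same `(effective & required) == required` shape is what the NodeManager evaluates per operation.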
+ +### Permission flags + +```csharp +[Flags] +public enum NodePermissions : uint +{ + None = 0, + + // Read-side + Browse = 1 << 0, // See node in BrowseRequest results + Read = 1 << 1, // ReadRequest current value + Subscribe = 1 << 2, // CreateMonitoredItems + HistoryRead = 1 << 3, // HistoryReadRaw / HistoryReadProcessed + + // Write-side (mirrors v1 SecurityClassification model — see config-db-schema.md Equipment ACL) + WriteOperate = 1 << 4, // Write attrs with FreeAccess/Operate classification + WriteTune = 1 << 5, // Write attrs with Tune classification + WriteConfigure = 1 << 6, // Write attrs with Configure classification + + // Alarm-side + AlarmRead = 1 << 7, // Receive alarm events for this node + AlarmAcknowledge = 1 << 8, // Ack alarms (separate from Confirm — OPC UA Part 9 distinction) + AlarmConfirm = 1 << 9, // Confirm alarms + AlarmShelve = 1 << 10, // Shelve / unshelve alarms + + // Method invocation (OPC UA Part 4 §5.11) + MethodCall = 1 << 11, // Invoke methods on the node + + // Common bundles (also exposed in Admin UI as one-click selections) + ReadOnly = Browse | Read | Subscribe | HistoryRead | AlarmRead, + Operator = ReadOnly | WriteOperate | AlarmAcknowledge | AlarmConfirm, + Engineer = Operator | WriteTune | AlarmShelve, + Admin = Engineer | WriteConfigure | MethodCall, +} +``` + +The bundles (`ReadOnly` / `Operator` / `Engineer` / `Admin`) are derived from production patterns at sites running v1 LmxOpcUa — they're the common grant shapes operators reach for. Granular per-flag grants stay supported for unusual cases. + +### Why three Write tiers (Operate / Tune / Configure) + +Mirrors v1's `SecurityClassification` mapping (`docs/DataTypeMapping.md`). Galaxy attributes carry a security classification; v1 maps `FreeAccess`/`Operate` to writable, `SecuredWrite`/`VerifiedWrite`/`ViewOnly` to read-only. 
The v2 model preserves this for Galaxy and extends it to all drivers via `Tag.SecurityClassification`: + +| Classification | Permission needed to write | +|----------------|---------------------------| +| FreeAccess | `WriteOperate` | +| Operate | `WriteOperate` | +| Tune | `WriteTune` | +| Configure | `WriteConfigure` | +| SecuredWrite / VerifiedWrite / ViewOnly | (not writable from OPC UA — v1 behavior preserved) | + +A user with `WriteTune` can write Operate-classified attrs too (Tune is more privileged). The check is `requestedClassification ≤ grantedTier`. + +### Why AlarmRead is separate from Read + +In OPC UA Part 9 alarm subscriptions are a distinct subscription type — a client can subscribe to events on a node without reading its value. Granting Read alone does not let a client see alarm events; AlarmRead is required separately. The `ReadOnly` bundle includes both. + +### Why MethodCall is separate + +OPC UA methods (Part 4 §5.11) are arbitrary procedure invocations on a node. v1 LmxOpcUa exposes very few; future drivers (especially OPC UA Client gateway) will surface more. MethodCall is gated explicitly because side-effects can be unbounded — analogous to executing a stored procedure rather than reading a column. + +## Scope Hierarchy + +ACL grants attach to one of six scope levels. Granting at higher level cascades to lower (with browse implication for ancestors); explicit Deny at lower level is **deferred to v2.1** (decision below). 
+ +``` +Cluster ← cluster-wide grant (highest scope) + └── Namespace ← per-namespace grant (Equipment vs SystemPlatform vs Simulated) + └── UnsArea ← per-area grant (Equipment-namespace only) + └── UnsLine ← per-line grant + └── Equipment ← per-equipment grant + └── Tag ← per-tag grant (lowest scope; rarely used) +``` + +For SystemPlatform-namespace tags (no Equipment row, no UNS structure), the chain shortens to: + +``` +Cluster + └── Namespace + └── (Tag's FolderPath segments — treated as opaque hierarchy) + └── Tag +``` + +### Inheritance and evaluation + +For each operation on a node: + +1. Walk the node's scope chain from leaf to root (`Tag → Equipment → UnsLine → UnsArea → Namespace → Cluster`) +2. At each level, look up `NodeAcl` rows where `LdapGroup ∈ user.Groups` and `(ScopeKind, ScopeId)` matches +3. Union the `PermissionFlags` from every matching row +4. Required permission must be set in the union → allow; else → deny +5. Browse is implied at every ancestor of any node where the user has any non-Browse permission — otherwise the user can't navigate to it + +### Default-deny + +If the union is empty (no group of the user's has any grant matching the node's chain), the operation is **denied**: +- Browse → node hidden from results +- Read / Subscribe / HistoryRead → `BadUserAccessDenied` +- Write → `BadUserAccessDenied` +- AlarmAck / AlarmConfirm / AlarmShelve → `BadUserAccessDenied` +- MethodCall → `BadUserAccessDenied` + +### Why no explicit Deny in v2.0 + +Two patterns can express "X group can write everywhere except production line 3": + +- **(a)** Verbose: grant Engineering on every line except line 3 — many rows but unambiguous +- **(b)** Explicit Deny that overrides Grant — fewer rows but evaluation logic must distinguish "no grant" from "explicit deny" + +For v2.0 fleets (≤50 clusters, ≤20 lines per cluster typical) approach (a) is workable — operators use the bulk-grant Admin UI flow to apply grants across many lines minus exceptions. 
Explicit Deny adds non-trivial complexity to the evaluator and the Admin UI; defer to v2.1 unless a deployment demonstrates a real need. + +## Schema — `NodeAcl` Table + +Generation-versioned (decision #105 pattern — ACLs are content, travel through draft → diff → publish like every other consumer-visible config): + +```sql +CREATE TABLE dbo.NodeAcl ( + NodeAclRowId uniqueidentifier NOT NULL PRIMARY KEY DEFAULT NEWSEQUENTIALID(), + GenerationId bigint NOT NULL FOREIGN KEY REFERENCES dbo.ConfigGeneration(GenerationId), + NodeAclId nvarchar(64) NOT NULL, -- stable logical ID across generations + ClusterId nvarchar(64) NOT NULL FOREIGN KEY REFERENCES dbo.ServerCluster(ClusterId), + LdapGroup nvarchar(256) NOT NULL, -- LDAP group name (e.g. "OtOpcUaOperators-LINE3") + ScopeKind nvarchar(16) NOT NULL CHECK (ScopeKind IN ('Cluster', 'Namespace', 'UnsArea', 'UnsLine', 'Equipment', 'Tag')), + ScopeId nvarchar(64) NULL, -- NULL when ScopeKind='Cluster'; otherwise the logical ID of the scoped entity + PermissionFlags int NOT NULL, -- bitmask of NodePermissions + Notes nvarchar(512) NULL +); + +CREATE INDEX IX_NodeAcl_Generation_Cluster + ON dbo.NodeAcl (GenerationId, ClusterId); +CREATE INDEX IX_NodeAcl_Generation_Group + ON dbo.NodeAcl (GenerationId, LdapGroup); +CREATE INDEX IX_NodeAcl_Generation_Scope + ON dbo.NodeAcl (GenerationId, ScopeKind, ScopeId) WHERE ScopeId IS NOT NULL; +CREATE UNIQUE INDEX UX_NodeAcl_Generation_LogicalId + ON dbo.NodeAcl (GenerationId, NodeAclId); +-- Within a generation, a (Group, Scope) pair has at most one row (additive grants would be confusing +-- in the audit trail; use a single row with the union of intended permissions instead) +CREATE UNIQUE INDEX UX_NodeAcl_Generation_GroupScope + ON dbo.NodeAcl (GenerationId, ClusterId, LdapGroup, ScopeKind, ScopeId); +``` + +### Cross-generation invariant + +Same pattern as Equipment / Namespace: `NodeAclId` is append-only per cluster — once published, the logical ID stays bound to its `(LdapGroup, 
ScopeKind, ScopeId)` triple. Renaming an LDAP group is forbidden — disable the old grant and create a new one. This protects the audit trail. + +### Validation in `sp_ValidateDraft` + +Adds these checks beyond the existing schema rules: + +- **ScopeId resolution**: when `ScopeKind ∈ {Namespace, UnsArea, UnsLine, Equipment, Tag}`, `ScopeId` must resolve to the corresponding entity in the same generation +- **Cluster cohesion**: the resolved scope must belong to the same `ClusterId` as the ACL row +- **PermissionFlags validity**: bitmask must only contain bits defined in `NodePermissions` enum (no future-bit speculation) +- **LdapGroup format**: non-empty, ≤256 chars, no characters that would break LDAP DN escaping (allowlist) +- **No identity drift**: `NodeAclId` once published with `(LdapGroup, ScopeKind, ScopeId)` cannot have any of those four columns change in a future generation + +## Evaluation Algorithm + +### At session establishment + +``` +on AcceptSession(user): + user.Groups = LdapAuth.ResolveGroups(user.Token) + user.PermissionMap = BuildEffectivePermissionMap(currentGeneration, user.Groups) + cache user.PermissionMap on the session +``` + +`BuildEffectivePermissionMap` produces a sparse trie keyed by node-path-prefix: + +``` +PermissionMap structure (per session): + / → grant union from Cluster + Namespace-level rows + /Equipment-NS/UnsArea-A/ → adds UnsArea-level grants + /Equipment-NS/UnsArea-A/UnsLine-1/ → adds UnsLine-level grants + /Equipment-NS/UnsArea-A/UnsLine-1/Equipment-X/ → adds Equipment-level grants + /Equipment-NS/UnsArea-A/UnsLine-1/Equipment-X/Tag-Y → adds Tag-level grants (rare) +``` + +Lookup for a node at path P: walk the trie from `/` to P, OR-ing PermissionFlags at each visited level. Result = effective permissions for P. O(depth) — typically 6 or fewer hops. 
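The prefix-OR walk can be sketched as follows (Python for illustration; a plain dict keyed by path prefix stands in for the production trie, and the flag values are stand-ins for `NodePermissions` bits):

```python
# Grants stored per path prefix; lookup ORs the flags at every prefix
# of the node path, i.e. the root-to-leaf union described above.
BROWSE, READ, WRITE_OPERATE = 1 << 0, 1 << 1, 1 << 4

grants = {
    "/": BROWSE | READ,                                   # Cluster/Namespace rows
    "/Equipment-NS/UnsArea-A/UnsLine-1/": WRITE_OPERATE,  # UnsLine-level grant
}

def lookup(path: str) -> int:
    """OR PermissionFlags at every prefix of `path` (O(depth) hops)."""
    effective = 0
    segments = [s for s in path.strip("/").split("/") if s]
    prefixes = ["/"] + [
        "/" + "/".join(segments[:i]) + "/" for i in range(1, len(segments) + 1)
    ]
    for prefix in prefixes:
        effective |= grants.get(prefix, 0)
    return effective

# A tag under UnsLine-1 unions the cluster-level read with the line-level write:
assert lookup("/Equipment-NS/UnsArea-A/UnsLine-1/Equipment-X/Tag-Y") == BROWSE | READ | WRITE_OPERATE
# A sibling line sees only the cluster-level grants:
assert lookup("/Equipment-NS/UnsArea-A/UnsLine-2/Equipment-Z/Tag-Q") == BROWSE | READ
```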
+ +### Per-operation check + +```csharp +bool Authorize(SessionContext ctx, NodePath path, NodePermissions required) +{ + var effective = ctx.PermissionMap.Lookup(path); + return (effective & required) == required; +} +``` + +- Browse: `Authorize(ctx, path, Browse)` — false → omit from results +- Read: `Authorize(ctx, path, Read)` — false → `BadUserAccessDenied` +- Write: `Authorize(ctx, path, requiredWriteFlag)` where `requiredWriteFlag` is derived from the target attribute's `SecurityClassification` +- Subscribe: `Authorize(ctx, path, Subscribe)` — also implies Browse on the path +- HistoryRead: `Authorize(ctx, path, HistoryRead)` +- Alarm event: `Authorize(ctx, path, AlarmRead)` — events for unauthorized nodes are filtered out before delivery +- AlarmAck/Confirm/Shelve: corresponding flag check +- MethodCall: `Authorize(ctx, methodNode.path, MethodCall)` + +### Cache invalidation + +The session's `PermissionMap` is rebuilt when: +- A new config generation is applied locally (the path-trie may have changed structure due to UNS reorg or new equipment) +- The LDAP group cache for the user expires (default: 15 min — driven by the LDAP layer, separate from this design) +- The user's session is re-established + +For unattended consumer connections (ScadaBridge, Ignition) that hold long sessions, the per-generation rebuild keeps permissions current without forcing reconnects. + +## Performance + +Worst-case per-operation cost: O(depth × group-count). For typical fleet sizes (10 LDAP groups per user, 6-deep UNS path), that's ~60 trie lookups per operation — sub-microsecond on modern hardware. The session-scoped cache means the per-operation hot path is array indexing, not DB queries. + +Build cost (at session establish or generation reapply): O(N_acl × M_groups) for N_acl rows and M_groups in user's claim set. For 1000 ACL rows × 10 groups = 10k row × group matches; sub-second on a sane DB. + +Memory cost: per-session trie ~4 KB for typical scopes; bounded by O(N_acl) worst case.
Sessions hold their own trie — no shared state to invalidate. + +## Default Permissions for Existing v1 LDAP Groups + +To preserve v1 LmxOpcUa behavior on first migration, the v2 default ACL set on cluster creation maps the existing v1 LDAP-role-to-permission grants: + +| v1 LDAP role (per `Security.md`) | v2 NodePermissions bundle | Scope | +|----------------------------------|---------------------------|-------| +| `ReadOnly` (group: `OtOpcUaReadOnly`) | `ReadOnly` bundle | Cluster | +| `WriteOperate` (group: `OtOpcUaWriteOperate`) | `Operator` bundle | Cluster | +| `WriteTune` (group: `OtOpcUaWriteTune`) | `Engineer` bundle | Cluster | +| `WriteConfigure` (group: `OtOpcUaWriteConfigure`) | `Admin` bundle | Cluster | +| `AlarmAck` (group: `OtOpcUaAlarmAck`) | adds `AlarmAcknowledge \| AlarmConfirm` to user's existing grants | Cluster | + +These are seeded by the cluster-create workflow into the initial draft generation (per decision #123 — namespaces and ACLs both travel through publish boundary). Operators can then refine to per-Equipment scopes as needed. + +## Admin UI + +### New tab: ACLs (under Cluster Detail) + +``` +/clusters/{ClusterId} Cluster detail (tabs: Overview / Namespaces / UNS Structure / Drivers / Devices / Equipment / Tags / **ACLs** / Generations / Audit) +``` + +Two views, toggle at the top: + +#### View 1 — By LDAP group + +| LDAP Group | Scopes | Permissions | Notes | +|------------|--------|-------------|-------| +| `OtOpcUaOperators` | Cluster | Operator bundle | Default operators (seeded) | +| `OtOpcUaOperators-LINE3` | UnsArea bldg-3 | Engineer bundle | Line 3 supervisors | +| `OtOpcUaScadaBridge` | Cluster | ReadOnly | Tier 1 consumer (added before cutover) | + +Click a row → edit grant: change scope, change permission set (one-click bundles or per-flag), edit notes. 
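The v1-compatibility seed described above (one Cluster-scope grant per v1 LDAP role) amounts to emitting a handful of `NodeAcl` rows at cluster creation. A hedged sketch in Python (group and bundle names from the mapping table; row shape mirrors the `NodeAcl` columns; the additive `AlarmAck` row is omitted for brevity):

```python
# Illustrative sketch of the v1-compatibility seed. Not the production
# cluster-create workflow, just the mapping it applies.
V1_SEED = {
    "OtOpcUaReadOnly": "ReadOnly",
    "OtOpcUaWriteOperate": "Operator",
    "OtOpcUaWriteTune": "Engineer",
    "OtOpcUaWriteConfigure": "Admin",
}

def seed_rows(cluster_id: str) -> list[dict]:
    """One Cluster-scope NodeAcl row per v1 LDAP group."""
    return [
        {
            "ClusterId": cluster_id,
            "LdapGroup": group,
            "ScopeKind": "Cluster",
            "ScopeId": None,            # NULL for Cluster scope, per the schema
            "PermissionFlags": bundle,  # bundle name; expands to flag bits at evaluation
            "Notes": "v1-compatibility seed",
        }
        for group, bundle in V1_SEED.items()
    ]

rows = seed_rows("cluster-01")
assert len(rows) == 4
assert all(r["ScopeKind"] == "Cluster" and r["ScopeId"] is None for r in rows)
```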
+ +#### View 2 — By scope (UNS tree) + +Tree view of UnsArea → UnsLine → Equipment with permission badges per node showing which groups have what: + +``` +bldg-3/ [Operators: Operator, ScadaBridge: ReadOnly] + ├── line-2/ [+ LINE3-Supervisors: Engineer] + │ ├── cnc-mill-05 [+ CNC-Maintenance: WriteTune] + │ ├── cnc-mill-06 + │ └── injection-molder-02 + └── line-3/ +``` + +Click a node → see effective permissions per group, edit grants at that scope. + +### Bulk grant flow + +"Bulk grant" button on either view: +1. Pick LDAP group(s) +2. Pick permission bundle or per-flag +3. Pick scope set: pattern (e.g. all UnsArea matching `bldg-*`), or multi-select from tree +4. Preview: list of `NodeAcl` rows that will be created +5. Confirm → adds rows to current draft + +### Permission simulator + +"Simulate as user" panel: enter username + LDAP groups → UI shows the effective permission map across the cluster's UNS tree. Useful before publishing — operators verify "after this change, ScadaBridge can still read everything it needs" without actually deploying. + +### Operator workflows added to admin-ui.md + +Three new workflows: +1. **Grant ACL** — usual draft → diff → publish, scoped to ACLs tab +2. **Bulk grant** — multi-select scope + group + permission, preview, publish +3. **Simulate as user** — preview-only, no publish required + +### v1 deviation log + +For each cluster, the Admin UI shows a banner if its NodeAcl set diverges from the v1-default seed (per the table above). This makes intentional tightening or loosening visible at a glance — important for compliance review during the long v1 → v2 coexistence period. + +## Audit + +Every NodeAcl change is in `ConfigAuditLog` automatically (per the publish boundary — same as any other content edit). 
Plus the OPC UA NodeManager logs every **denied** operation: + +``` +EventType = 'OpcUaAccessDenied' +DetailsJson = { user, groups, requestedOperation, nodePath, requiredPermission, effectivePermissions } +``` + +Allowed operations are NOT logged at this layer (would dwarf the audit log; OPC UA SDK has its own session/operation diagnostics for high-frequency telemetry). The choice to log denials only mirrors typical authorization-audit practice and can be tightened per-deployment if a customer requires full positive-action logging. + +## Test Strategy + +Unit tests for the evaluator: +- Empty ACL set → all operations denied (default-deny invariant) +- Single Cluster-scope grant → operation allowed at every node in the cluster +- Single Equipment-scope grant → allowed at the equipment + its tags; denied at sibling equipment +- Multiple grants for same group → union (additive) +- Multiple groups for same user → union of all groups' grants +- Browse implication: granting Read on a deep equipment auto-allows Browse at every ancestor +- Permission bundle expansion: granting `Operator` bundle = granting `Browse | Read | Subscribe | HistoryRead | AlarmRead | WriteOperate | AlarmAcknowledge | AlarmConfirm` +- v1-compatibility seed: a fresh cluster with the default ACL set behaves identically to v1 LmxOpcUa for users in the v1 LDAP groups + +Integration test (Phase 1+): +- Create cluster + equipment + tags + ACL grants +- Connect OPC UA client as a `ReadOnly`-mapped user → browse and read succeed; write fails +- Re-publish with a tighter ACL → existing session's permission map rebuilds; subsequent writes that were allowed are now denied +- Verify `OpcUaAccessDenied` audit log entries for the denied operations + +Adversarial review checks (run during exit gate): +- Can a client connect with no LDAP group at all and read anything? (must be no — default deny) +- Can a client see a node in browse but not read its value? 
(yes, if Browse granted but not Read — unusual but valid) +- Does a UnsArea rename cascade ACL grants correctly? (the grant references UnsAreaId not name, so rename is transparent) +- Does an Equipment merge (Admin operator flow) preserve ACL grants on the surviving equipment? (must be yes; the merge flow updates references) +- Does generation rollback restore the prior ACL state? (must be yes; ACLs are generation-versioned) + +## Implementation Plan + +ACL design enters the implementation pipeline as follows: + +### Phase 1 (Configuration + Admin scaffold) +- Schema: add `NodeAcl` table to the Phase 1 migration +- Validation: add NodeAcl rules to `sp_ValidateDraft` +- Admin UI: scaffold the ACLs tab with view + edit + bulk grant + simulator +- Default seed: cluster-create workflow seeds the v1-compatibility ACL set +- Generation diff: include NodeAcl in `sp_ComputeGenerationDiff` + +### Phase 2+ (every driver phase) +- Wire the ACL evaluator into `GenericDriverNodeManager` so every Browse / Read / Write / Subscribe / HistoryRead / AlarmRead / AlarmAck / MethodCall consults the per-session permission map +- Per-driver tests: assert that a default-deny user cannot read or subscribe to that driver's namespace; assert that a `ReadOnly`-bundled user can; assert that the appropriate Write tier is needed for each `SecurityClassification` + +### Pre-tier-1-cutover (before Phase 6 / consumer cutover) +- Verify ScadaBridge's effective permissions in the Admin UI simulator before any cutover +- Adversarial review of the per-cluster ACL set with a fresh pair of eyes + +## Decisions to Add to plan.md + +(Will be appended to the decision log on the next plan.md edit.) + +| # | Decision | Rationale | +|---|----------|-----------| +| 129 | OPC UA client data-path authorization model = bitmask `NodePermissions` flags + per-LDAP-group grants on a 6-level scope hierarchy (Cluster / Namespace / UnsArea / UnsLine / Equipment / Tag) | Closes corrections-doc finding B1.
Mirrors v1 SecurityClassification model for Write tiers; adds explicit AlarmRead/Ack/Confirm/Shelve and MethodCall flags. Default-deny; additive grants; explicit Deny deferred to v2.1. See `acl-design.md` | +| 130 | `NodeAcl` table generation-versioned, edited via draft → diff → publish like every other content table | Same pattern as Namespace (decision #123) and Equipment (decision #109). ACL changes are content, not topology — they affect what consumers see at the OPC UA endpoint. Rollback restores the prior ACL state | +| 131 | Cluster-create workflow seeds default ACL set matching v1 LmxOpcUa LDAP-role-to-permission map | Preserves behavioral parity for v1 → v2 consumer migration. Operators tighten or loosen from there. Admin UI flags any cluster whose ACL set diverges from the seed | +| 132 | OPC UA NodeManager logs denied operations only; allowed operations rely on SDK session/operation diagnostics | Logging every allowed op would dwarf the audit log. Denied-only mirrors typical authorization audit practice. Per-deployment policy can tighten if compliance requires positive-action logging | + +## Open Questions + +- **OPC UA Method support scope**: how many methods does v1 expose? Need to enumerate before tier 3 cutover (System Platform IO is the most likely consumer of methods). The MethodCall permission is defined defensively but may not be exercised in v2.0. +- **Group claim source latency**: LDAP group cache TTL (default 15 min above) is taken from the v1 LDAP layer. If the OPC UA session's group claims need to be refreshed faster (e.g. for emergency revoke), we need a shorter TTL or an explicit revoke channel. Decide per operational risk appetite. +- **AlarmConfirm vs AlarmAcknowledge** semantics: OPC UA Part 9 distinguishes them (Ack = "I've seen this"; Confirm = "I've taken action"). Some sites only use Ack; the v2.0 model exposes both but a deployment-level policy can collapse them in practice. 
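If a deployment does choose to collapse Ack and Confirm, one workable shape is to widen the granted flags at permission-map build time. A sketch (Python for illustration; `collapse_confirm` is a hypothetical deployment-level setting, not a column in the `NodeAcl` schema):

```python
# Illustrative: for sites that treat Ack and Confirm as one action, a
# grant of AlarmAcknowledge also satisfies AlarmConfirm checks.
# `collapse_confirm` is a hypothetical deployment policy flag.
ALARM_ACKNOWLEDGE = 1 << 8
ALARM_CONFIRM = 1 << 9

def widen(effective: int, collapse_confirm: bool) -> int:
    """Widen the effective-permission union per deployment policy."""
    if collapse_confirm and (effective & ALARM_ACKNOWLEDGE):
        effective |= ALARM_CONFIRM
    return effective

# Policy on: an Ack-only grant passes a Confirm check.
assert widen(ALARM_ACKNOWLEDGE, collapse_confirm=True) & ALARM_CONFIRM
# Policy off: Confirm still requires its own flag.
assert not (widen(ALARM_ACKNOWLEDGE, collapse_confirm=False) & ALARM_CONFIRM)
```

Doing the widening at map-build time keeps the per-operation check unchanged, which is why it is compatible with the evaluator described earlier.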
diff --git a/docs/v2/admin-ui.md b/docs/v2/admin-ui.md index c39add4..8cfb1aa 100644 --- a/docs/v2/admin-ui.md +++ b/docs/v2/admin-ui.md @@ -251,8 +251,9 @@ Tabbed view for one cluster. Search bar supports any of the five identifiers (ZTag, MachineCode, SAPID, EquipmentId, EquipmentUuid) — operator types and the search dispatches across all five with a typeahead that disambiguates ("Found in ZTag" / "Found in MachineCode" labels on each suggestion). Per-row click opens the Equipment Detail page. 7. **Tags** — paged, filterable table of all tags. Filters: namespace kind, equipment (by ZTag/MachineCode/SAPID), driver, device, folder path, name pattern, data type. For Equipment-ns tags the path is shown as the full UNS path; for SystemPlatform-ns tags the v1-style `FolderPath/Name` is shown. Bulk operations toolbar: export to CSV, import from CSV (validated against active draft). -8. **Generations** — generation history list (see Generation History page) -9. **Audit** — filtered audit log +8. **ACLs** — OPC UA client data-path authorization grants. Two views (toggle at top): "By LDAP group" (rows) and "By scope" (UNS tree with permission badges per node). Bulk-grant flow: pick group + permission bundle (`ReadOnly` / `Operator` / `Engineer` / `Admin`) or per-flag selection + scope (multi-select from tree or pattern), preview, confirm via draft. Permission simulator panel: enter username + LDAP groups → effective permission map across the cluster's UNS tree. Default seed on cluster creation maps v1 LmxOpcUa LDAP roles. Banner shows when this cluster's ACL set diverges from the seed. See `acl-design.md` for full design. +9. **Generations** — generation history list (see Generation History page) +10. **Audit** — filtered audit log The Drivers/Devices/Equipment/Tags tabs are **read-only views** of the published generation; editing is done in the dedicated draft editor to make the publish boundary explicit. 
The Namespaces tab and the UNS Structure tab follow the same hybrid pattern: navigation is read-only over the published generation, click-to-edit on any node opens the draft editor scoped to that node. **No table in v2.0 is edited outside the publish boundary** (revised after adversarial review finding #2). diff --git a/docs/v2/config-db-schema.md b/docs/v2/config-db-schema.md index 1505f18..82b70b3 100644 --- a/docs/v2/config-db-schema.md +++ b/docs/v2/config-db-schema.md @@ -410,6 +410,47 @@ All five are exposed as **OPC UA properties** on the equipment node so external `EquipmentClassRef` is a nullable string hook; v2.0 ships with no validation. When the central `schemas` repo lands, this becomes a foreign key into the schemas-repo equipment-class catalog, validated at draft-publish time. +### `NodeAcl` + +```sql +CREATE TABLE dbo.NodeAcl ( + NodeAclRowId uniqueidentifier NOT NULL PRIMARY KEY DEFAULT NEWSEQUENTIALID(), + GenerationId bigint NOT NULL FOREIGN KEY REFERENCES dbo.ConfigGeneration(GenerationId), + NodeAclId nvarchar(64) NOT NULL, -- stable logical ID across generations + ClusterId nvarchar(64) NOT NULL FOREIGN KEY REFERENCES dbo.ServerCluster(ClusterId), + LdapGroup nvarchar(256) NOT NULL, + ScopeKind nvarchar(16) NOT NULL CHECK (ScopeKind IN ('Cluster', 'Namespace', 'UnsArea', 'UnsLine', 'Equipment', 'Tag')), + ScopeId nvarchar(64) NULL, -- NULL when ScopeKind='Cluster'; logical ID otherwise + PermissionFlags int NOT NULL, -- bitmask of NodePermissions enum + Notes nvarchar(512) NULL +); + +CREATE INDEX IX_NodeAcl_Generation_Cluster + ON dbo.NodeAcl (GenerationId, ClusterId); +CREATE INDEX IX_NodeAcl_Generation_Group + ON dbo.NodeAcl (GenerationId, LdapGroup); +CREATE INDEX IX_NodeAcl_Generation_Scope + ON dbo.NodeAcl (GenerationId, ScopeKind, ScopeId) WHERE ScopeId IS NOT NULL; +CREATE UNIQUE INDEX UX_NodeAcl_Generation_LogicalId + ON dbo.NodeAcl (GenerationId, NodeAclId); +-- Within a generation, a (Group, Scope) pair has at most one row +CREATE 
UNIQUE INDEX UX_NodeAcl_Generation_GroupScope + ON dbo.NodeAcl (GenerationId, ClusterId, LdapGroup, ScopeKind, ScopeId); +``` + +`NodeAcl` is **generation-versioned** (decision #130). ACL changes go through draft → diff → publish → rollback like every other content table. Cross-generation invariant: `NodeAclId` once published with `(LdapGroup, ScopeKind, ScopeId)` cannot have any of those columns change in a future generation; rename an LDAP group by disabling the old grant and creating a new one. + +`PermissionFlags` is a bitmask of the `NodePermissions` enum defined in `acl-design.md` (Browse, Read, Subscribe, HistoryRead, WriteOperate, WriteTune, WriteConfigure, AlarmRead, AlarmAcknowledge, AlarmConfirm, AlarmShelve, MethodCall). Common bundles (`ReadOnly`, `Operator`, `Engineer`, `Admin`) expand to specific flag combinations at evaluation time. + +Validation in `sp_ValidateDraft`: +- `ScopeId` must resolve in the same generation when `ScopeKind ≠ 'Cluster'` +- Resolved scope must belong to the same `ClusterId` as the ACL row (cross-cluster bindings rejected, same pattern as decision #122) +- `PermissionFlags` must contain only bits defined in `NodePermissions` +- `LdapGroup` non-empty, ≤256 chars, allowlisted character set (no LDAP-DN-breaking chars) +- Cross-generation identity stability per the invariant above + +Full evaluation algorithm + Admin UI design + v1-compatibility seed in `acl-design.md`. + ### `ExternalIdReservation` ```sql diff --git a/docs/v2/dev-environment.md b/docs/v2/dev-environment.md new file mode 100644 index 0000000..aa82789 --- /dev/null +++ b/docs/v2/dev-environment.md @@ -0,0 +1,276 @@ +# Development Environment — OtOpcUa v2 + +> **Status**: DRAFT — concrete inventory + setup plan for every external resource the v2 build needs. Companion to `test-data-sources.md` (which catalogues the simulator/stub strategy per driver) and `implementation/overview.md` (which references the dev environment in entry-gate checklists). 
+> +> **Branch**: `v2` +> **Created**: 2026-04-17 + +## Scope + +Every external resource a developer needs on their machine, plus the dedicated integration host that runs the heavier simulators per CI tiering decision #99. Includes Docker container images, ports, default credentials (dev only — production overrides documented), and ownership. + +**Not in scope here**: production deployment topology (separate doc when v2 ships), CI pipeline configuration (separate ops concern), individual developer's IDE / editor preferences. + +## Two Environment Tiers + +Per decision #99: + +| Tier | Purpose | Where it runs | Resources | +|------|---------|---------------|-----------| +| **PR-CI / inner-loop dev** | Fast, runs on minimal Windows + Linux build agents and developer laptops | Each developer's machine; CI runners | Pure-managed in-process simulators (NModbus, OPC Foundation reference server, FOCAS TCP stub from test project). No Docker, no VMs. | +| **Nightly / integration CI** | Full driver-stack validation against real wire protocols | One dedicated Windows host with Docker Desktop + Hyper-V + a TwinCAT XAR VM | All Docker simulators (`oitc/modbus-server`, `ab_server`, Snap7), TwinCAT XAR VM, Galaxy.Host installer + dev Galaxy access, FOCAS TCP stub binary, FOCAS FaultShim assembly | + +The tier split keeps developer onboarding fast (no Docker required for first build) while concentrating the heavy simulator setup on one machine the team maintains. + +## Resource Inventory + +### A. 
Always-required (every developer + integration host) + +| Resource | Purpose | Type | Default port | Default credentials | Owner | +|----------|---------|------|--------------|---------------------|-------| +| **.NET 10 SDK** | Build all .NET 10 x64 projects | OS install | n/a | n/a | Developer | +| **.NET Framework 4.8 SDK + targeting pack** | Build `Driver.Galaxy.Host` (Phase 2+) | Windows install | n/a | n/a | Developer | +| **Visual Studio 2022 17.8+ or Rider 2024+** | IDE (any C# IDE works; these are the supported configs) | OS install | n/a | n/a | Developer | +| **Git** | Source control | OS install | n/a | n/a | Developer | +| **PowerShell 7.4+** | Compliance scripts (`phase-N-compliance.ps1`) | OS install | n/a | n/a | Developer | +| **Repo clones** | `lmxopcua` (this repo), `scadalink-design` (UI/auth reference per memory file `scadalink_reference.md`), `3yearplan` (handoff + corrections) | Git clone | n/a | n/a | Developer | + +### B. Inner-loop dev (developer machines + PR-CI) + +| Resource | Purpose | Type | Default port | Default credentials | Owner | +|----------|---------|------|--------------|---------------------|-------| +| **SQL Server 2022 dev edition** | Central config DB; integration tests against `Configuration` project | Local install OR Docker container `mcr.microsoft.com/mssql/server:2022-latest` | 1433 | `sa` / `OtOpcUaDev_2026!` (dev only — production uses Integrated Security or gMSA per decision #46) | Developer (per machine) | +| **GLAuth (LDAP server)** | Admin UI authentication tests; data-path ACL evaluation tests | Local binary at `C:\publish\glauth\` per existing CLAUDE.md | 3893 (LDAP) / 3894 (LDAPS) | Service principal: `cn=admin,dc=otopcua,dc=local` / `OtOpcUaDev_2026!`; test users defined in GLAuth config | Developer (per machine) | +| **Local dev Galaxy** (Aveva System Platform) | Galaxy driver tests; v1 IntegrationTests parity | Existing on dev box per CLAUDE.md | n/a (local COM) | Windows Auth | Developer (already present 
per project setup) | + +### C. Integration host (one dedicated Windows machine the team shares) + +| Resource | Purpose | Type | Default port | Default credentials | Owner | +|----------|---------|------|--------------|---------------------|-------| +| **Docker Desktop for Windows** | Host for containerized simulators | Install | (Hyper-V required; not compatible with TwinCAT runtime — see TwinCAT row below for the workaround) | n/a | Integration host admin | +| **`oitc/modbus-server`** | Modbus TCP simulator (per `test-data-sources.md` §1) | Docker container | 502 (Modbus TCP) | n/a (no auth in protocol) | Integration host admin | +| **`ab_server`** (libplctag binary) | AB CIP + AB Legacy simulator (per `test-data-sources.md` §2 + §3) | Native binary built from libplctag source; runs in a separate VM or host since it conflicts with Docker Desktop's Hyper-V if run on bare metal | 44818 (CIP) | n/a | Integration host admin | +| **Snap7 Server** | S7 simulator (per `test-data-sources.md` §4) | Native binary; runs in a separate VM or in WSL2 to avoid Hyper-V conflict | 102 (ISO-TCP) | n/a | Integration host admin | +| **TwinCAT XAR runtime VM** | TwinCAT ADS testing (per `test-data-sources.md` §5; Beckhoff XAR cannot coexist with Hyper-V on the same OS) | Hyper-V VM with Windows + TwinCAT XAR installed under 7-day renewable trial | 48898 (ADS over TCP) | TwinCAT default route credentials configured per Beckhoff docs | Integration host admin | +| **OPC Foundation reference server** | OPC UA Client driver test source (per `test-data-sources.md` §"OPC UA Client") | Built from `OPCFoundation/UA-.NETStandard` `ConsoleReferenceServer` project | 62541 (default for the reference server) | Anonymous + Username (`user1` / `password1`) per the reference server's built-in user list | Integration host admin | +| **FOCAS TCP stub** (`Driver.Focas.TestStub`) | FOCAS functional testing (per `test-data-sources.md` §6) | Local .NET 10 console app from this repo | 8193 (FOCAS) | n/a | 
Developer / integration host (run on demand) | +| **FOCAS FaultShim** (`Driver.Focas.FaultShim`) | FOCAS native-fault injection (per `test-data-sources.md` §6) | Test-only native DLL named `Fwlib64.dll`, loaded via DLL search path in the test fixture | n/a (in-process) | n/a | Developer / integration host (test-only) | + +### D. Cloud / external services + +| Resource | Purpose | Type | Access | Owner | +|----------|---------|------|--------|-------| +| **Gitea** at `gitea.dohertylan.com` | Hosts `lmxopcua`, `3yearplan`, `scadalink-design` repos | HTTPS git | Existing org credentials | Org IT | +| **Anthropic API** (for Codex adversarial reviews) | `/codex:adversarial-review` invocations during exit gates | HTTPS via Codex companion script | API key in developer's `~/.claude/...` config | Developer (per `codex:setup` skill) | + +## Network Topology (integration host) + +``` + ┌────────────────────────────────────────┐ + │ Integration Host (Windows + Docker) │ + │ │ + │ Docker Desktop (Linux containers): │ + │ ┌───────────────────────────────┐ │ + │ │ oitc/modbus-server :502/tcp │ │ + │ └───────────────────────────────┘ │ + │ │ + │ WSL2 (Snap7 + ab_server, separate │ + │ from Docker Desktop's HyperV): │ + │ ┌───────────────────────────────┐ │ + │ │ snap7-server :102/tcp │ │ + │ │ ab_server :44818/tcp │ │ + │ └───────────────────────────────┘ │ + │ │ + │ Hyper-V VM (Windows + TwinCAT XAR): │ + │ ┌───────────────────────────────┐ │ + │ │ TwinCAT XAR :48898 │ │ + │ └───────────────────────────────┘ │ + │ │ + │ Native processes: │ + │ ┌───────────────────────────────┐ │ + │ │ ConsoleReferenceServer :62541│ │ + │ │ FOCAS TestStub :8193│ │ + │ └───────────────────────────────┘ │ + │ │ + │ SQL Server 2022 (local install): │ + │ ┌───────────────────────────────┐ │ + │ │ OtOpcUaConfig_Test :1433 │ │ + │ └───────────────────────────────┘ │ + └────────────────────────────────────────┘ + ▲ + │ tests connect via the host's hostname or 127.0.0.1 + │ + 
┌────────────────────────────────────────┐ + │ Developer / CI machine running │ + │ `dotnet test --filter Category=...` │ + └────────────────────────────────────────┘ +``` + +## Bootstrap Order — Inner-loop Developer Machine + +Order matters because some installs have prerequisites. ~30–60 min total on a fresh machine. + +1. **Install .NET 10 SDK** (https://dotnet.microsoft.com/) — required to build anything +2. **Install .NET Framework 4.8 SDK + targeting pack** — only needed when starting Phase 2 (Galaxy.Host); skip for Phase 0–1 if not yet there +3. **Install Git + PowerShell 7.4+** +4. **Clone repos**: + ```powershell + git clone https://gitea.dohertylan.com/dohertj2/lmxopcua.git + git clone https://gitea.dohertylan.com/dohertj2/scadalink-design.git + git clone https://gitea.dohertylan.com/dohertj2/3yearplan.git + ``` +5. **Install SQL Server 2022 dev edition** (local install) OR start the Docker container (see Resource B): + ```powershell + docker run --name otopcua-mssql -e "ACCEPT_EULA=Y" -e "MSSQL_SA_PASSWORD=OtOpcUaDev_2026!" ` + -p 1433:1433 -d mcr.microsoft.com/mssql/server:2022-latest + ``` +6. **Install GLAuth** at `C:\publish\glauth\` per existing CLAUDE.md instructions; populate `glauth-otopcua.cfg` with the test users + groups (template in `docs/v2/dev-environment-glauth-config.md` — to be added in the setup task) +7. **Run `dotnet restore`** in the `lmxopcua` repo +8. **Run `dotnet build ZB.MOM.WW.OtOpcUa.slnx`** (post-Phase-0) or `ZB.MOM.WW.LmxOpcUa.slnx` (pre-Phase-0) — verifies the toolchain +9. **Run `dotnet test`** with the inner-loop filter — should pass on a fresh machine + +## Bootstrap Order — Integration Host + +Order matters more here because of Hyper-V conflicts. ~half-day on a fresh machine. + +1. **Install Windows Server 2022 or Windows 11 Pro** (Hyper-V capable) +2. **Enable Hyper-V** + WSL2 +3. **Install Docker Desktop for Windows**, configure to use WSL2 backend (NOT Hyper-V backend — leaves Hyper-V free for the TwinCAT XAR VM) +4. 
**Set up WSL2 distro** (Ubuntu 22.04 LTS) for native Linux binaries that conflict with Docker Desktop +5. **Pull / start Modbus simulator**: + ```powershell + docker run -d --name modbus-sim -p 502:502 -v ${PWD}/modbus-config.yaml:/server_config.yaml oitc/modbus-server + ``` +6. **Build + start ab_server** (in WSL2): + ```bash + git clone https://github.com/libplctag/libplctag + cd libplctag/src/tests + make ab_server + ./ab_server --plc=ControlLogix --port=44818 # default tags loaded from a config file + ``` +7. **Build + start Snap7 Server** (in WSL2): + - Download Snap7 from https://snap7.sourceforge.net/ + - Build the example server; run on port 102 with the test DB layout from `test-data-sources.md` §4 +8. **Set up TwinCAT XAR VM**: + - Create a Hyper-V VM (Gen 2, Windows 11) + - Install TwinCAT 3 XAE + XAR (download from Beckhoff, free for dev/test) + - Activate the 7-day trial; document the rotation schedule + - Configure ADS routes for the integration host to reach the VM + - Deploy the test PLC project from `test-data-sources.md` §5 ("a tiny test project — `MAIN` (PLC code) + `GVL`") +9. **Build + start OPC Foundation reference server**: + ```bash + git clone https://github.com/OPCFoundation/UA-.NETStandard + cd UA-.NETStandard/Applications/ConsoleReferenceServer + dotnet run --port 62541 + ``` +10. **Install SQL Server 2022 dev edition** (or run the Docker container as on developer machines) +11. **Build + run FOCAS TestStub** (from this repo, post-Phase-5): + ```powershell + dotnet run --project src/ZB.MOM.WW.OtOpcUa.Driver.Focas.TestStub -- --port 8193 + ``` +12. **Verify** by running `dotnet test --filter Category=Integration` from a developer machine pointed at the integration host + +## Credential Management + +### Dev environment defaults + +The defaults in this doc are **for dev environments only**. They're documented here so a developer can stand up a working setup without hunting; they're not secret. 
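+
+For the SQL Server and GLAuth defaults above, a developer's gitignored `appsettings.Development.json` might look like the fragment below. This is an illustrative sketch only — the section and key names (`ConnectionStrings:OtOpcUaConfig`, `Ldap:*`) are assumptions, not a committed configuration contract; only the values come from this doc:
+
+```jsonc
+// appsettings.Development.json (gitignored) — dev defaults only, never production.
+// Section/key names are hypothetical placeholders.
+{
+  "ConnectionStrings": {
+    "OtOpcUaConfig": "Server=127.0.0.1,1433;Database=OtOpcUaConfig_Test;User Id=sa;Password=OtOpcUaDev_2026!;TrustServerCertificate=True"
+  },
+  "Ldap": {
+    "Host": "127.0.0.1",
+    "Port": 3893,
+    "BindDn": "cn=admin,dc=otopcua,dc=local",
+    "BindPassword": "OtOpcUaDev_2026!"
+  }
+}
+```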
+ +### Production overrides + +For any production deployment: +- SQL Server: Integrated Security with gMSA (decision #46) — never SQL login with shared password +- LDAP: production GLAuth or AD instance with proper service principal +- TwinCAT: paid license (per-runtime), not the 7-day trial +- All other services: deployment-team's credential management process; documented in deployment-guide.md (separate doc, post-v2.0) + +### Storage + +For dev defaults: +- SQL Server SA password: stored in each developer's local `appsettings.Development.json` (gitignored) +- GLAuth bind DN/password: stored in `glauth-otopcua.cfg` (gitignored) +- Docker secrets / volumes: developer-local + +For production: +- gMSA / cert-mapped principals — no passwords stored anywhere +- Per-NodeId credentials in `ClusterNodeCredential` table (per decision #83) +- Admin app uses LDAP (no SQL credential at all on the user-facing side) + +## Test Data Seed + +Each environment needs a baseline data set so cross-developer tests are reproducible. Lives in `tests/ZB.MOM.WW.OtOpcUa.IntegrationTests/SeedData/`: + +- **GLAuth** users: `test-readonly@otopcua.local` (in `OtOpcUaReadOnly`), `test-operator@otopcua.local` (`OtOpcUaWriteOperate` + `OtOpcUaAlarmAck`), `test-fleetadmin@otopcua.local` (`OtOpcUaAdmins`) +- **Central config DB**: a seed cluster `TEST-CLUSTER-01` with 1 node + 1 namespace + 0 drivers (other tests add drivers) +- **Modbus sim**: YAML config preloading the addresses from `test-data-sources.md` §1 (HR 0–9 constants, ramp at HR 100, etc.) +- **TwinCAT XAR**: the test PLC project deployed; symbols match `test-data-sources.md` §5 +- **OPC Foundation reference server**: starts with built-in test address space; tests don't modify it + +Seeds are idempotent (re-runnable) and gitignored where they contain credentials. 
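+
+The GLAuth side of this seed can be sketched as a config fragment. It is illustrative only — the committed template is the `docs/v2/dev-environment-glauth-config.md` task; the UID/GID numbers and hash placeholders below are assumptions, not final values:
+
+```toml
+# Illustrative GLAuth seed fragment (dev only). Numeric IDs are placeholders.
+[[groups]]
+  name = "OtOpcUaReadOnly"
+  gidnumber = 5501
+
+[[groups]]
+  name = "OtOpcUaWriteOperate"
+  gidnumber = 5502
+
+[[groups]]
+  name = "OtOpcUaAlarmAck"
+  gidnumber = 5503
+
+[[groups]]
+  name = "OtOpcUaAdmins"
+  gidnumber = 5504
+
+[[users]]
+  name = "test-readonly"
+  mail = "test-readonly@otopcua.local"
+  uidnumber = 5001
+  primarygroup = 5501                  # OtOpcUaReadOnly
+  passsha256 = "<sha256 of the dev default password>"
+
+[[users]]
+  name = "test-operator"
+  mail = "test-operator@otopcua.local"
+  uidnumber = 5002
+  primarygroup = 5502                  # OtOpcUaWriteOperate
+  otherGroups = [5503]                 # + OtOpcUaAlarmAck
+  passsha256 = "<sha256 of the dev default password>"
+
+[[users]]
+  name = "test-fleetadmin"
+  mail = "test-fleetadmin@otopcua.local"
+  uidnumber = 5003
+  primarygroup = 5504                  # OtOpcUaAdmins
+  passsha256 = "<sha256 of the dev default password>"
+```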
+ +## Setup Plan (executable) + +### Step 1 — Inner-loop dev environment (each developer, ~1 day with documentation) + +**Owner**: developer +**Prerequisite**: Bootstrap order steps 1–9 above +**Acceptance**: +- `dotnet test ZB.MOM.WW.OtOpcUa.slnx` passes +- A test that touches the central config DB succeeds (proves SQL Server reachable) +- A test that authenticates against GLAuth succeeds (proves LDAP reachable) + +### Step 2 — Integration host (one-time, ~1 week) + +**Owner**: DevOps lead +**Prerequisite**: dedicated Windows machine, hardware specs ≥ 8 cores / 32 GB RAM / 500 GB SSD +**Acceptance**: +- Each simulator (Modbus, AB, S7, TwinCAT, OPC UA reference) responds to a probe from a developer machine +- A nightly CI job runs `dotnet test --filter Category=Integration` against the integration host and passes +- Service-account permissions reviewed by security lead + +### Step 3 — TwinCAT XAR VM trial rotation automation (one-time, half-day) + +**Owner**: DevOps lead +**Prerequisite**: Step 2 complete +**Acceptance**: +- A scheduled task on the integration host either re-activates the 7-day trial automatically OR alerts the team 24h before expiry; cycle tested + +### Step 4 — Per-developer GLAuth config sync (recurring, when test users change) + +**Owner**: developer (each) +**Acceptance**: +- A script in the repo (`scripts/sync-glauth-dev-config.ps1`) updates the local GLAuth config from a template; documented in CLAUDE.md +- Test users defined in the template work on every developer machine + +### Step 5 — Docker simulator config (per-developer, ~30 min) + +**Owner**: developer (each) +**Acceptance**: +- The Modbus simulator container is reachable from `127.0.0.1:502` from the developer's test runner (only needed if the developer is debugging Modbus driver work; not required for Phase 0/1) + +### Step 6 — Codex companion setup (per-developer, ~5 min) + +**Owner**: developer (each) +**Acceptance**: +- `/codex:setup` skill confirms readiness; 
`/codex:adversarial-review` works against a small test diff
+
+## Operational Risks
+
+| Risk | Mitigation |
+|------|------------|
+| TwinCAT 7-day trial expires mid-CI run | Step 3 automation; alert before expiry; license budget approved as fallback for production-grade pre-release validation |
+| Docker Desktop license terms change for org use | Track Docker pricing; budget approved or fall back to Podman if license becomes blocking |
+| Integration host single point of failure | Document the setup so a second host can be provisioned in <2 days; test fixtures pin to a hostname so failover changes one DNS entry |
+| GLAuth dev config drifts between developers | Sync script + template (Step 4) keep configs aligned; periodic review |
+| Galaxy / MXAccess licensing on non-dev machines | Galaxy stays on the dev machines that already have Aveva licenses; integration host does NOT run Galaxy (Galaxy.Host integration tests run on the dev box, not the shared host) |
+| Long-lived dev credentials in `appsettings.Development.json` | Gitignored; documented as dev-only; production never uses these |
+
+## Decisions to Add to plan.md
+
+| # | Decision | Rationale |
+|---|----------|-----------|
+| 133 | Two-tier dev environment: inner-loop (in-process simulators on developer machines) + integration (Docker / VM / native simulators on a single dedicated Windows host) | Per decision #99. Concrete inventory + setup plan in `dev-environment.md` |
+| 134 | Docker Desktop with WSL2 backend (not Hyper-V backend) on integration host so TwinCAT XAR VM can run in Hyper-V alongside Docker | TwinCAT runtime cannot coexist with Hyper-V-mode Docker Desktop; WSL2 backend leaves Hyper-V free for the XAR VM. 
Documented operational constraint | +| 135 | TwinCAT XAR runs only in a dedicated VM on the integration host; developer machines do NOT run XAR locally | The 7-day trial reactivation needs centralized management; the VM is shared infrastructure | +| 136 | Galaxy / MXAccess testing happens on developer machines that have local Aveva installs, NOT on the shared integration host | Aveva licensing scoped to dev workstations; integration host doesn't carry the license. v1 IntegrationTests parity (Phase 2) runs on developer boxes. | +| 137 | Dev env credentials are documented openly in `dev-environment.md`; production credentials use Integrated Security / gMSA per decision #46 | Dev defaults are not secrets; they're convenience. Production never uses these values | diff --git a/docs/v2/implementation/overview.md b/docs/v2/implementation/overview.md index 67ad97d..27113a2 100644 --- a/docs/v2/implementation/overview.md +++ b/docs/v2/implementation/overview.md @@ -177,4 +177,5 @@ The implementation **deviates from the plan** when any of those conditions fails | 3 | (Phase 3: Modbus TCP driver — TBD) | NOT STARTED | | 4 | (Phase 4: PLC drivers AB CIP / AB Legacy / S7 / TwinCAT — TBD) | NOT STARTED | | 5 | (Phase 5: Specialty drivers FOCAS / OPC UA Client — TBD) | NOT STARTED | -| 6+ | (Phases 6–8: tier 1/2/3 consumer cutover — separate planning track per corrections doc C5) | NOT SCOPED | + +**Consumer cutover (ScadaBridge / Ignition / System Platform IO) is OUT of v2 scope.** It is a separate work track owned by the integration / operations team, tracked in the 3-year-plan handoff (`handoffs/otopcua-handoff.md` §"Rollout Posture") and the corrections doc (§C5). The OtOpcUa team's responsibility ends at Phase 5 (all drivers built, all stability protections in place, full Admin UI shipped). Cutover sequencing, validation methodology, rollback procedures, and Aveva-pattern validation for tier 3 are the integration team's deliverables. 
diff --git a/docs/v2/implementation/phase-1-configuration-and-admin-scaffold.md b/docs/v2/implementation/phase-1-configuration-and-admin-scaffold.md index d332805..a6c214c 100644 --- a/docs/v2/implementation/phase-1-configuration-and-admin-scaffold.md +++ b/docs/v2/implementation/phase-1-configuration-and-admin-scaffold.md @@ -40,7 +40,7 @@ Stand up the **central configuration substrate** for the v2 fleet: | OPC UA wire behavior | Galaxy address space still served exactly as v1; the Configuration substrate is read but not yet driving everything | | Equipment-class template integration with future schemas repo | `EquipmentClassRef` is a nullable hook column; no validation yet (decisions #112, #115) | | Per-driver custom config editors in Admin | Generic JSON editor only in v2.0 (decision #27); driver-specific editors land in their respective phases | -| Consumer cutover (ScadaBridge / Ignition / SystemPlatform IO) | Phases 6–8 | +| Consumer cutover (ScadaBridge / Ignition / SystemPlatform IO) | OUT of v2 scope — separate integration-team track per `implementation/overview.md` | ## Entry Gate Checklist @@ -139,6 +139,7 @@ Implement DbContext with entities matching `config-db-schema.md` exactly: - `UnsArea`, `UnsLine` - `ConfigGeneration` - `DriverInstance`, `Device`, `Equipment`, `Tag`, `PollGroup` +- `NodeAcl` (generation-versioned per decision #130; data-path authorization grants per `acl-design.md`) - `ClusterNodeGenerationState`, `ConfigAuditLog` - `ExternalIdReservation` (NOT generation-versioned per decision #124) @@ -443,6 +444,24 @@ Per `admin-ui.md` §"Release an external-ID reservation" and §"Merge or rebind - After release: same `(Kind, Value)` can be reserved by a different EquipmentUuid in a future publish - Merge equipment A → B: draft preview shows tag re-pointing + ID re-reservation; publish executes atomically; A is disabled with `EquipmentMergedAway` audit entry +#### Task E.9 — ACLs tab + bulk-grant + permission simulator + +Per `admin-ui.md` 
Cluster Detail tab #8 ("ACLs") and `acl-design.md` §"Admin UI": +- ACLs tab on Cluster Detail with two views ("By LDAP group" + "By scope") +- Edit grant flow: pick scope, group, permission bundle or per-flag, save to draft +- Bulk-grant flow: multi-select scope, group, permissions, preview rows that will be created, publish via draft +- Permission simulator: enter username + LDAP groups → live trie of effective permissions across the cluster's UNS tree +- Cluster-create workflow seeds the v1-compatibility default ACL set (per decision #131) +- Banner on Cluster Detail when the cluster's ACL set diverges from the seed + +**Acceptance**: +- Add an ACL grant via draft → publishes → row in `NodeAcl` table; appears in both Admin views +- Bulk grant 10 LDAP groups × 1 permission set across 5 UnsAreas → preview shows 50 rows; publish creates them atomically +- Simulator: a user in `OtOpcUaReadOnly` group sees `ReadOnly` bundle effective at every node in the cluster +- Simulator: a user in `OtOpcUaWriteTune` sees `Engineer` bundle effective; `WriteConfigure` is denied +- Cluster-create workflow seeds 5 default ACL grants matching v1 LDAP roles (table in `acl-design.md` §"Default Permissions") +- Divergence banner appears when an operator removes any of the seeded grants + ## Compliance Checks (run at exit gate) A `phase-1-compliance.ps1` script that exits non-zero on any failure: @@ -599,8 +618,8 @@ The exit gate signs off only when **every** item below is checked. 
Each item lin - Any Modbus / AB / S7 / TwinCAT / FOCAS driver code (Phases 3–5) - Per-driver custom config editors in Admin (each driver's phase) - Equipment-class template integration with the schemas repo -- Consumer cutover (Phases 6–8, separate planning track) -- ACL / namespace-level authorization for OPC UA clients (corrections doc B1 — needs scoping before Phase 6, parallel work track) +- Consumer cutover (out of v2 scope, separate integration-team track per `implementation/overview.md`) +- Wiring the OPC UA NodeManager to enforce ACLs at runtime (Phase 2+ in each driver phase). Phase 1 ships the `NodeAcl` table + Admin UI ACL editing + evaluator unit tests; per-driver enforcement lands in each driver's phase per `acl-design.md` §"Implementation Plan" - Push-from-DB notification (decision #96 — v2.1) - Generation pruning operator UI (decision #93 — v2.1) - Cluster-scoped admin grant editor in UI (admin-ui.md "Deferred / Out of Scope" — v2.1) diff --git a/docs/v2/implementation/phase-2-galaxy-out-of-process.md b/docs/v2/implementation/phase-2-galaxy-out-of-process.md index 8b1db11..47fc707 100644 --- a/docs/v2/implementation/phase-2-galaxy-out-of-process.md +++ b/docs/v2/implementation/phase-2-galaxy-out-of-process.md @@ -501,5 +501,5 @@ foreach ($p in $protections) { - Equipment-class template integration with the schemas repo (Galaxy doesn't use `EquipmentClassRef`) - Push-from-DB notification (decision #96 — v2.1) - Any change to OPC UA wire behavior visible to clients (parity is the gate) -- ScadaBridge cutover (Phase 6 — separate planning track) +- Consumer cutover (ScadaBridge, Ignition, System Platform IO) — out of v2 scope, separate integration-team track per `implementation/overview.md` - Removing the v1 deployment from production (a v2 release decision, not Phase 2) diff --git a/docs/v2/plan.md b/docs/v2/plan.md index e964448..23df94d 100644 --- a/docs/v2/plan.md +++ b/docs/v2/plan.md @@ -893,6 +893,15 @@ Each step leaves the system runnable. 
The generic extraction is effectively free | 126 | Three-gate model (entry / mid / exit) for every implementation phase, with explicit compliance-check categories | Specified in `implementation/overview.md`. Categories: schema compliance (DB matches the doc), decision compliance (every decision number has a code/test citation), visual compliance (Admin UI parity with ScadaLink), behavioral compliance (per-phase smoke test), stability compliance (cross-cutting protections wired up for Tier C drivers), documentation compliance (any deviation reflected back in v2 docs). Exit gate requires two-reviewer signoff; silent deviation is the failure mode the gates exist to prevent | 2026-04-17 | | 127 | Per-phase implementation docs live under `docs/v2/implementation/` with structured task / acceptance / compliance / completion sections | Each phase doc enumerates: scope (in / out), entry gate checklist, task breakdown with per-task acceptance criteria, compliance checks (script-runnable), behavioral smoke test, completion checklist. Phase 0 + Phase 1 docs are committed; Phases 2–8 land as their predecessors clear exit gates | 2026-04-17 | | 128 | Driver list is fixed for v2.0 — Equipment Protocol Survey is NOT a prerequisite | The seven committed drivers (Modbus TCP including DL205, AB CIP, AB Legacy, S7, TwinCAT, FOCAS, OPC UA Client) plus the existing Galaxy/MXAccess driver are confirmed by direct knowledge of the equipment estate, not pending the formal survey. Supersedes the corrections-doc concern (C1) that the v2 commitment was made pre-survey. The survey may still produce useful inventory data for downstream planning (capacity, prioritization), but adding or removing drivers from the v2 implementation list is out of scope. 
Closes corrections-doc C1 | 2026-04-17 | +| 129 | OPC UA client data-path authorization model = `NodePermissions` bitmask flags + per-LDAP-group grants on a 6-level scope hierarchy (Cluster / Namespace / UnsArea / UnsLine / Equipment / Tag) with default-deny + additive grants; explicit Deny deferred to v2.1 | Mirrors v1 SecurityClassification model for Write tiers (WriteOperate / WriteTune / WriteConfigure); adds explicit AlarmRead / AlarmAcknowledge / AlarmConfirm / AlarmShelve / MethodCall flags; bundles (`ReadOnly` / `Operator` / `Engineer` / `Admin`) for one-click grants. Per-session permission-trie evaluator with O(depth × group-count) cost; cache invalidated on generation-apply or LDAP group cache expiry. Closes corrections-doc B1. See `acl-design.md` | 2026-04-17 | +| 130 | `NodeAcl` table generation-versioned, edited via draft → diff → publish | Same pattern as Namespace (#123) and Equipment (#109). ACL changes are content, not topology — they affect what consumers see at the OPC UA endpoint. Rollback restores the prior ACL state. Cross-generation invariant: `NodeAclId` once published with `(LdapGroup, ScopeKind, ScopeId)` cannot have any of those columns change | 2026-04-17 | +| 131 | Cluster-create workflow seeds default ACL set matching v1 LmxOpcUa LDAP-role-to-permission map | Preserves behavioral parity for v1 → v2 consumer migration. Operators tighten or loosen from there. Admin UI flags any cluster whose ACL set diverges from the seed | 2026-04-17 | +| 132 | OPC UA NodeManager logs denied operations only; allowed operations rely on SDK session/operation diagnostics | Logging every allowed op would dwarf the audit log. Denied-only mirrors typical authorization audit practice. 
Per-deployment policy can tighten if compliance requires positive-action logging | 2026-04-17 | +| 133 | Two-tier dev environment: inner-loop (in-process simulators on developer machines) + integration (Docker / VM / native simulators on a single dedicated Windows host) | Per decision #99. Concrete inventory + setup plan in `dev-environment.md` | 2026-04-17 | +| 134 | Docker Desktop with WSL2 backend (not Hyper-V backend) on integration host so TwinCAT XAR VM can run in Hyper-V alongside Docker | TwinCAT runtime cannot coexist with Hyper-V-mode Docker Desktop; WSL2 backend leaves Hyper-V free for the XAR VM. Documented operational constraint | 2026-04-17 | +| 135 | TwinCAT XAR runs only in a dedicated VM on the integration host; developer machines do NOT run XAR locally | The 7-day trial reactivation needs centralized management; the VM is shared infrastructure. Galaxy is the inverse — runs only on developer machines (Aveva license scoping), not on integration host | 2026-04-17 | +| 136 | Consumer cutover (ScadaBridge / Ignition / System Platform IO) is OUT of v2 scope | Owned by a separate integration / operations team. OtOpcUa team's scope ends at Phase 5 (all drivers built, all stability protections in place, full Admin UI shipped including ACL editor). Cutover sequencing, validation methodology, rollback procedures, and Aveva-pattern validation for tier 3 are the integration team's deliverables, tracked in 3-year-plan handoff §"Rollout Posture" and corrections doc §C5 | 2026-04-17 | +| 137 | Dev env credentials documented openly in `dev-environment.md`; production uses Integrated Security / gMSA per decision #46 | Dev defaults are not secrets — they're convenience. Production never uses these values; documented separation prevents leakage | 2026-04-17 | ## Reference Documents @@ -901,6 +910,8 @@ Each step leaves the system runnable. 
The generic extraction is effectively free - **[Driver Stability & Isolation](driver-stability.md)** — stability tier model (A/B/C), per-driver hosting decisions, cross-cutting protections, FOCAS and Galaxy deep dives - **[Central Config DB Schema](config-db-schema.md)** — concrete table definitions, indexes, stored procedures, authorization model, JSON conventions, EF Core migrations approach - **[Admin Web UI](admin-ui.md)** — Blazor Server admin app: information architecture, page-by-page workflows, per-driver config screen extensibility, real-time updates, UX rules +- **[OPC UA Client Authorization (ACL Design)](acl-design.md)** — data-path authz model: `NodePermissions` bitmask flags (Browse / Read / Subscribe / HistoryRead / WriteOperate / WriteTune / WriteConfigure / AlarmRead / AlarmAcknowledge / AlarmConfirm / AlarmShelve / MethodCall + bundles), 6-level scope hierarchy (Cluster / Namespace / UnsArea / UnsLine / Equipment / Tag) with inheritance, default-deny + additive grants, per-session permission-trie evaluator with O(depth × group-count) cost, default cluster-seed mapping v1 LmxOpcUa LDAP roles, Admin UI ACL tab + bulk grant + simulator. Closes corrections-doc finding B1. 
+- **[Development Environment](dev-environment.md)** — every external resource the v2 build needs (SQL Server, GLAuth, Galaxy, Docker simulators, TwinCAT XAR VM, OPC Foundation reference server, FOCAS stub + FaultShim) with default ports / credentials / owners; two-tier model (inner-loop on developer machines, integration on a single dedicated Windows host with WSL2-backed Docker + Hyper-V VM for TwinCAT); concrete bootstrap order for both tiers - **[Implementation Plan Overview](implementation/overview.md)** — phase gate structure (entry / mid / exit), compliance check categories (schema / decision / visual / behavioral / stability / documentation), deliverable conventions, "what counts as following the plan" - **[Phase 0 — Rename + .NET 10 cleanup](implementation/phase-0-rename-and-net10.md)** — mechanical LmxOpcUa → OtOpcUa rename with full task breakdown, compliance checks, completion checklist - **[Phase 1 — Configuration + Core.Abstractions + Admin scaffold](implementation/phase-1-configuration-and-admin-scaffold.md)** — central MSSQL schema, EF Core migrations, stored procs, LDAP-authenticated Blazor Server admin app with ScadaLink visual parity, LiteDB local cache, generation-diff applier; 5 work streams (A–E), full task breakdown, compliance checks, 14-step end-to-end smoke test