Five operational docs rewritten for v2 (multi-process, multi-driver, Config-DB authoritative): - docs/Configuration.md — replaced appsettings-only story with the two-layer model. appsettings.json is bootstrap only (Node identity, Config DB connection string, transport security, LDAP bind, logging). Authoritative config (clusters, namespaces, UNS, equipment, tags, driver instances, ACLs, role grants, poll groups) lives in the Config DB accessed via OtOpcUaConfigDbContext and edited through the Admin UI draft/publish workflow. Added v1-to-v2 migration index so operators can locate where each old section moved. Cross-links to docs/v2/config-db-schema.md + docs/v2/admin-ui.md. - docs/Redundancy.md — Phase 6.3 rewrite. Named every class under src/ZB.MOM.WW.OtOpcUa.Server/Redundancy/: RedundancyCoordinator, RedundancyTopology, ApplyLeaseRegistry (publish fencing), PeerReachabilityTracker, RecoveryStateManager, ServiceLevelCalculator (pure function), RedundancyStatePublisher. Documented the full 11-band ServiceLevel matrix (Maintenance=0 through AuthoritativePrimary=255) from ServiceLevelCalculator.cs and the per-ClusterNode fields (RedundancyRole, ServiceLevelBase, ApplicationUri). Covered metrics (otopcua.redundancy.role_transition counter + primary/secondary/stale_count gauges on meter ZB.MOM.WW.OtOpcUa.Redundancy) and SignalR RoleChanged push from FleetStatusPoller to RedundancyTab.razor. - docs/security.md — preserved the transport-security section (still accurate) and added Phase 6.2 authorization. Four concerns now documented in one place: (1) transport security profiles, (2) OPC UA auth via LdapUserAuthenticator (note: task spec called this LdapAuthenticationProvider — actual class name is LdapUserAuthenticator in Server/Security/), (3) data-plane authorization via NodeAcl + PermissionTrie + AuthorizationGate — additive-only model per decision #129, ClusterId → Namespace → UnsArea → UnsLine → Equipment → Tag hierarchy, NodePermissions bundle, PermissionProbeService in Admin for "probe this permission", (4) control-plane authorization via LdapGroupRoleMapping + AdminRole (ConfigViewer / ConfigEditor / FleetAdmin, CanEdit / CanPublish policies) — deliberately independent of data-plane ACLs per decision #150. Documented the OTOPCUA0001 Roslyn analyzer (UnwrappedCapabilityCallAnalyzer) as the compile-time guard ensuring every driver-capability async call is wrapped by CapabilityInvoker. - docs/ServiceHosting.md — three-process rewrite: OtOpcUa Server (net10 x64, BackgroundService + AddWindowsService, hosts OPC UA endpoint + all non-Galaxy drivers), OtOpcUa Admin (net10 x64, Blazor Server + SignalR + /metrics via OpenTelemetry Prometheus exporter), OtOpcUa Galaxy.Host (.NET Framework 4.8 x86, NSSM-wrapped, env-variable driven, STA thread + MXAccess COM). Pipe ACL denies-Admins detail + non-elevated shell requirement captured from feedback memory. Divergence from CLAUDE.md: task spec said "TopShelf is still the service-installer wrapper per CLAUDE.md note" but no csproj in the repo references TopShelf — decision #30 replaced it with the generic host's AddWindowsService wrapper (per the doc comment on OpcUaServerService). Reflected the actual state + flagged this divergence here so someone can update CLAUDE.md separately. - docs/StatusDashboard.md — replaced the full v1 reference (dashboard endpoints, health check rules, StatusData DTO, etc.) with a short "superseded by Admin UI" pointer that preserves git-blame continuity + avoids broken links from other docs that reference it. Class references verified by reading: src/ZB.MOM.WW.OtOpcUa.Server/Redundancy/{RedundancyCoordinator, ServiceLevelCalculator, ApplyLeaseRegistry, RedundancyStatePublisher}.cs src/ZB.MOM.WW.OtOpcUa.Core/Authorization/{PermissionTrie, PermissionTrieBuilder, PermissionTrieCache, TriePermissionEvaluator, AuthorizationGate}.cs src/ZB.MOM.WW.OtOpcUa.Server/Security/{AuthorizationGate, LdapUserAuthenticator}.cs src/ZB.MOM.WW.OtOpcUa.Admin/{Program.cs, Services/AdminRoles.cs, Services/RedundancyMetrics.cs, Hubs/FleetStatusPoller.cs} src/ZB.MOM.WW.OtOpcUa.Server/Program.cs + appsettings.json src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host/{Program.cs, Ipc/PipeServer.cs} src/ZB.MOM.WW.OtOpcUa.Configuration/Entities/{ClusterNode, NodeAcl, LdapGroupRoleMapping}.cs src/ZB.MOM.WW.OtOpcUa.Analyzers/UnwrappedCapabilityCallAnalyzer.cs Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
142 lines
9.5 KiB
Markdown
142 lines
9.5 KiB
Markdown
# Configuration
|
|
|
|
## Two-layer model
|
|
|
|
OtOpcUa configuration is split into two layers:
|
|
|
|
| Layer | Where | Scope | Edited by |
|
|
|---|---|---|---|
|
|
| **Bootstrap** | `appsettings.json` per process | Enough to start the process and reach the Config DB | Local file edit + process restart |
|
|
| **Authoritative config** | Config DB (SQL Server) via `OtOpcUaConfigDbContext` | Clusters, namespaces, UNS hierarchy, equipment, tags, driver instances, ACLs, role grants, poll groups | Admin UI draft/publish workflow |
|
|
|
|
The rule: if the setting describes *how the process connects to the rest of the world* (Config DB connection string, LDAP bind, transport security profile, node identity, logging), it lives in `appsettings.json`. If it describes *what the fleet does* (clusters, drivers, tags, UNS, ACLs), it lives in the Config DB and is edited through the Admin UI.
|
|
|
|
---
|
|
|
|
## Bootstrap configuration (`appsettings.json`)
|
|
|
|
Each of the three processes (Server, Admin, Galaxy.Host) reads its own `appsettings.json` plus environment overrides.
|
|
|
|
### OtOpcUa Server — `src/ZB.MOM.WW.OtOpcUa.Server/appsettings.json`
|
|
|
|
Bootstrap-only. `Program.cs` reads four top-level sections:
|
|
|
|
| Section | Keys | Purpose |
|
|
|---|---|---|
|
|
| `Node` | `NodeId`, `ClusterId`, `ConfigDbConnectionString`, `LocalCachePath` | Identity + path to the Config DB + LiteDB offline cache path. |
|
|
| `OpcUaServer` | `EndpointUrl`, `ApplicationName`, `ApplicationUri`, `PkiStoreRoot`, `AutoAcceptUntrustedClientCertificates`, `SecurityProfile` | OPC UA endpoint + transport security. See [`security.md`](security.md). |
|
|
| `OpcUaServer:Ldap` | `Enabled`, `Server`, `Port`, `UseTls`, `AllowInsecureLdap`, `SearchBase`, `ServiceAccountDn`, `ServiceAccountPassword`, `GroupToRole`, `UserNameAttribute`, `GroupAttribute` | LDAP auth for OPC UA UserName tokens. See [`security.md`](security.md). |
|
|
| `Serilog` | Standard Serilog keys + `WriteJson` bool | Logging verbosity + optional JSON file sink for SIEM ingest. |
|
|
| `Authorization` | `StrictMode` (bool) | Flip `true` to fail-closed on sessions lacking LDAP group metadata. Default false during ACL rollouts. |
|
|
| `Metrics:Prometheus:Enabled` | bool | Toggles the `/metrics` endpoint. |
|
|
|
|
Minimal example:
|
|
|
|
```json
|
|
{
|
|
"Serilog": { "MinimumLevel": "Information" },
|
|
"Node": {
|
|
"NodeId": "node-dev-a",
|
|
"ClusterId": "cluster-dev",
|
|
"ConfigDbConnectionString": "Server=localhost,14330;Database=OtOpcUaConfig;Integrated Security=True;TrustServerCertificate=True;Encrypt=False;",
|
|
"LocalCachePath": "config_cache.db"
|
|
},
|
|
"OpcUaServer": {
|
|
"EndpointUrl": "opc.tcp://0.0.0.0:4840/OtOpcUa",
|
|
"ApplicationUri": "urn:node-dev-a:OtOpcUa",
|
|
"SecurityProfile": "None",
|
|
"AutoAcceptUntrustedClientCertificates": true,
|
|
"Ldap": { "Enabled": false }
|
|
}
|
|
}
|
|
```
|
|
|
|
### OtOpcUa Admin — `src/ZB.MOM.WW.OtOpcUa.Admin/appsettings.json`
|
|
|
|
| Section | Purpose |
|
|
|---|---|
|
|
| `ConnectionStrings:ConfigDb` | SQL connection string — must point at the same Config DB every Server reaches. |
|
|
| `Authentication:Ldap` | LDAP bind for the Admin login form (same options shape as the Server's `OpcUaServer:Ldap`). |
|
|
| `CertTrust` | `CertTrustOptions` — file-system path under the Server's `PkiStoreRoot` so the Admin Certificates page can promote rejected client certs. |
|
|
| `Metrics:Prometheus:Enabled` | Toggles the `/metrics` scrape endpoint (default true). |
|
|
| `Serilog` | Logging. |
|
|
|
|
### Galaxy.Host
|
|
|
|
Environment-variable driven (`OTOPCUA_GALAXY_PIPE`, `OTOPCUA_ALLOWED_SID`, `OTOPCUA_GALAXY_SECRET`, `OTOPCUA_GALAXY_BACKEND`, `OTOPCUA_GALAXY_ZB_CONN`, `OTOPCUA_HISTORIAN_*`). No `appsettings.json` — the supervisor owns the launch environment. See [`ServiceHosting.md`](ServiceHosting.md#galaxyhost-process).
|
|
|
|
### Environment overrides
|
|
|
|
Standard .NET config layering applies: `appsettings.{Environment}.json`, then environment variables with `Section__Property` naming. `DOTNET_ENVIRONMENT` (or `ASPNETCORE_ENVIRONMENT` for Admin) selects the overlay.
|
|
|
|
---
|
|
|
|
## Authoritative configuration (Config DB)
|
|
|
|
The Config DB is the single source of truth for every setting that a v1 deployment used to carry in `appsettings.json` as driver-specific state. `OtOpcUaConfigDbContext` (`src/ZB.MOM.WW.OtOpcUa.Configuration/OtOpcUaConfigDbContext.cs`) is the EF Core context used by both the Admin writer and every Server reader.
|
|
|
|
### Top-level sections operators touch
|
|
|
|
| Concept | Entity | Admin UI surface | Purpose |
|
|
|---|---|---|---|
|
|
| Cluster | `ServerCluster` | Clusters pages | Fleet unit; owns nodes, generations, UNS, ACLs. |
|
|
| Cluster node | `ClusterNode` + `ClusterNodeCredential` | RedundancyTab, Hosts page | Per-node identity, `RedundancyRole`, `ServiceLevelBase`, ApplicationUri, service-account credentials. |
|
|
| Generation | `ConfigGeneration` + `ClusterNodeGenerationState` | Generations / DiffViewer | Append-only; draft → publish workflow (`sp_PublishGeneration`). |
|
|
| Namespace | `Namespace` | Namespaces tab | Per-cluster OPC UA namespace; `Kind` = Equipment / SystemPlatform / Simulated. |
|
|
| Driver instance | `DriverInstance` | Drivers tab | Configured driver (Modbus, S7, OpcUaClient, Galaxy, …) + `DriverConfig` JSON + resilience profile. |
|
|
| Device | `Device` | Under each driver instance | Per-host settings inside a driver instance (IP, port, unit-id…). |
|
|
| UNS hierarchy | `UnsArea` + `UnsLine` | UnsTab (drag/drop) | L3 / L4 of the unified namespace. |
|
|
| Equipment | `Equipment` | Equipment pages, CSV import | L5; carries `MachineCode`, `ZTag`, `SAPID`, `EquipmentUuid`, reservation-backed external ids. |
|
|
| Tag | `Tag` | Under each equipment | Driver-specific tag address + `SecurityClassification` + poll-group assignment. |
|
|
| Poll group | `PollGroup` | Driver-scoped | Poll cadence buckets; `PollGroupEngine` in Core.Abstractions uses this at runtime. |
|
|
| ACL | `NodeAcl` | AclsTab + Probe dialog | Per-level permission grants, additive only. See [`security.md`](security.md#data-plane-authorization). |
|
|
| Role grant | `LdapGroupRoleMapping` | RoleGrants page | Maps LDAP groups → Admin roles (`ConfigViewer` / `ConfigEditor` / `FleetAdmin`). |
|
|
| External id reservation | `ExternalIdReservation` | Reservations page | Reservation-backed `ZTag` and `SAPID` uniqueness. |
|
|
| Equipment import batch | `EquipmentImportBatch` | CSV import flow | Staged bulk-add with validation preview. |
|
|
| Audit log | `ConfigAuditLog` | Audit page | Append-only record of every publish, rollback, credential rotation, role-grant change. |
|
|
|
|
### Draft → publish generation model
|
|
|
|
All edits go into a **draft** generation scoped to one cluster. `DraftValidationService` checks invariants (same-cluster FKs, reservation collisions, UNS path consistency, ACL scope validity). When the operator clicks Publish, `sp_PublishGeneration` atomically promotes the draft, records the audit event, and causes every `RedundancyCoordinator.RefreshAsync` in the affected cluster to pick up the new topology + ACL set. The Admin UI `DiffViewer` shows exactly what's changing before publish.
|
|
|
|
Old generations are retained; rollback is "publish older generation as new". `ConfigAuditLog` makes every change auditable by principal + timestamp.
|
|
|
|
### Offline cache
|
|
|
|
Each Server process caches the last-seen published generation in `Node:LocalCachePath` via LiteDB (`LiteDbConfigCache` in `src/ZB.MOM.WW.OtOpcUa.Configuration/LocalCache/`). The cache lets a node start without the central DB reachable; once the DB comes back, `NodeBootstrap` syncs to the current generation.
|
|
|
|
### Full schema reference
|
|
|
|
For table columns, indexes, stored procedures, the publish-transaction semantics, and the SQL authorization model (per-node SQL principals + `SESSION_CONTEXT` cluster binding), see [`docs/v2/config-db-schema.md`](v2/config-db-schema.md).
|
|
|
|
### Admin UI flow
|
|
|
|
For the draft editor, DiffViewer, CSV import, IdentificationFields, RedundancyTab, AclsTab + Probe-this-permission, RoleGrants, and the SignalR real-time surface, see [`docs/v2/admin-ui.md`](v2/admin-ui.md).
|
|
|
|
---
|
|
|
|
## Where did v1 appsettings sections go?
|
|
|
|
Quick index for operators coming from v1 LmxOpcUa:
|
|
|
|
| v1 appsettings section | v2 home |
|
|
|---|---|
|
|
| `OpcUa.Port` / `BindAddress` / `EndpointPath` / `ServerName` | Bootstrap `OpcUaServer:EndpointUrl` + `ApplicationName`. |
|
|
| `OpcUa.ApplicationUri` | Config DB `ClusterNode.ApplicationUri`. |
|
|
| `OpcUa.MaxSessions` / `SessionTimeoutMinutes` | Bootstrap `OpcUaServer:*` (if exposed) or stack defaults. |
|
|
| `OpcUa.AlarmTrackingEnabled` / `AlarmFilter` | Per driver instance in Config DB (alarm surface is capability-driven per `IAlarmSource`). |
|
|
| `MxAccess.*` | Galaxy driver instance `DriverConfig` JSON + Galaxy.Host env vars (see [`ServiceHosting.md`](ServiceHosting.md#galaxyhost-process)). |
|
|
| `GalaxyRepository.*` | Galaxy driver instance `DriverConfig` JSON + `OTOPCUA_GALAXY_ZB_CONN` env var. |
|
|
| `Dashboard.*` | Retired — Admin UI replaces the dashboard. See [`StatusDashboard.md`](StatusDashboard.md). |
|
|
| `Historian.*` | Galaxy driver instance `DriverConfig` JSON + `OTOPCUA_HISTORIAN_*` env vars. |
|
|
| `Authentication.Ldap.*` | Bootstrap `OpcUaServer:Ldap` (same shape) + Admin `Authentication:Ldap` for the UI login. |
|
|
| `Security.*` | Bootstrap `OpcUaServer:SecurityProfile` + `PkiStoreRoot` + `AutoAcceptUntrustedClientCertificates`. |
|
|
| `Redundancy.*` | Config DB `ClusterNode.RedundancyRole` + `ServiceLevelBase`. |
|
|
|
|
---
|
|
|
|
## Validation
|
|
|
|
- **Bootstrap**: the process fails fast on missing required keys in `Program.cs` (e.g. `Node:NodeId`, `Node:ClusterId`, `Node:ConfigDbConnectionString` all throw `InvalidOperationException` if unset).
|
|
- **Authoritative**: `DraftValidationService` runs on every save; `sp_ValidateDraft` runs as part of `sp_PublishGeneration` so an invalid draft cannot reach any node.
|