lmxopcua/docs/Configuration.md
Joseph Doherty 5506b43ddc Doc refresh (task #204) — operational docs for multi-process multi-driver OtOpcUa
Five operational docs rewritten for v2 (multi-process, multi-driver, Config-DB authoritative):

- docs/Configuration.md — replaced appsettings-only story with the two-layer model.
  appsettings.json is bootstrap only (Node identity, Config DB connection string,
  transport security, LDAP bind, logging). Authoritative config (clusters, namespaces,
  UNS, equipment, tags, driver instances, ACLs, role grants, poll groups) lives in
  the Config DB accessed via OtOpcUaConfigDbContext and edited through the Admin UI
  draft/publish workflow. Added v1-to-v2 migration index so operators can locate where
  each old section moved. Cross-links to docs/v2/config-db-schema.md + docs/v2/admin-ui.md.

- docs/Redundancy.md — Phase 6.3 rewrite. Named every class under
  src/ZB.MOM.WW.OtOpcUa.Server/Redundancy/: RedundancyCoordinator, RedundancyTopology,
  ApplyLeaseRegistry (publish fencing), PeerReachabilityTracker, RecoveryStateManager,
  ServiceLevelCalculator (pure function), RedundancyStatePublisher. Documented the
  full 11-band ServiceLevel matrix (Maintenance=0 through AuthoritativePrimary=255)
  from ServiceLevelCalculator.cs and the per-ClusterNode fields (RedundancyRole,
  ServiceLevelBase, ApplicationUri). Covered metrics
  (otopcua.redundancy.role_transition counter + primary/secondary/stale_count gauges
  on meter ZB.MOM.WW.OtOpcUa.Redundancy) and SignalR RoleChanged push from
  FleetStatusPoller to RedundancyTab.razor.

- docs/security.md — preserved the transport-security section (still accurate) and
  added Phase 6.2 authorization. Four concerns now documented in one place:
  (1) transport security profiles, (2) OPC UA auth via LdapUserAuthenticator
  (note: task spec called this LdapAuthenticationProvider — actual class name is
  LdapUserAuthenticator in Server/Security/), (3) data-plane authorization via
  NodeAcl + PermissionTrie + AuthorizationGate — additive-only model per decision
  #129, ClusterId → Namespace → UnsArea → UnsLine → Equipment → Tag hierarchy,
  NodePermissions bundle, PermissionProbeService in Admin for "probe this permission",
  (4) control-plane authorization via LdapGroupRoleMapping + AdminRole
  (ConfigViewer / ConfigEditor / FleetAdmin, CanEdit / CanPublish policies) —
  deliberately independent of data-plane ACLs per decision #150. Documented the
  OTOPCUA0001 Roslyn analyzer (UnwrappedCapabilityCallAnalyzer) as the compile-time
  guard ensuring every driver-capability async call is wrapped by CapabilityInvoker.

- docs/ServiceHosting.md — three-process rewrite: OtOpcUa Server (net10 x64,
  BackgroundService + AddWindowsService, hosts OPC UA endpoint + all non-Galaxy
  drivers), OtOpcUa Admin (net10 x64, Blazor Server + SignalR + /metrics via
  OpenTelemetry Prometheus exporter), OtOpcUa Galaxy.Host (.NET Framework 4.8 x86,
  NSSM-wrapped, env-variable driven, STA thread + MXAccess COM). Pipe ACL
  denies-Admins detail + non-elevated shell requirement captured from feedback memory.
  Divergence from CLAUDE.md: task spec said "TopShelf is still the service-installer
  wrapper per CLAUDE.md note" but no csproj in the repo references TopShelf — decision
  #30 replaced it with the generic host's AddWindowsService wrapper (per the doc
  comment on OpcUaServerService). Reflected the actual state + flagged this divergence
  here so someone can update CLAUDE.md separately.

- docs/StatusDashboard.md — replaced the full v1 reference (dashboard endpoints,
  health check rules, StatusData DTO, etc.) with a short "superseded by Admin UI"
  pointer that preserves git-blame continuity + avoids broken links from other docs
  that reference it.

Class references verified by reading:
  src/ZB.MOM.WW.OtOpcUa.Server/Redundancy/{RedundancyCoordinator, ServiceLevelCalculator,
      ApplyLeaseRegistry, RedundancyStatePublisher}.cs
  src/ZB.MOM.WW.OtOpcUa.Core/Authorization/{PermissionTrie, PermissionTrieBuilder,
      PermissionTrieCache, TriePermissionEvaluator, AuthorizationGate}.cs
  src/ZB.MOM.WW.OtOpcUa.Server/Security/{AuthorizationGate, LdapUserAuthenticator}.cs
  src/ZB.MOM.WW.OtOpcUa.Admin/{Program.cs, Services/AdminRoles.cs,
      Services/RedundancyMetrics.cs, Hubs/FleetStatusPoller.cs}
  src/ZB.MOM.WW.OtOpcUa.Server/Program.cs + appsettings.json
  src/ZB.MOM.WW.OtOpcUa.Driver.Galaxy.Host/{Program.cs, Ipc/PipeServer.cs}
  src/ZB.MOM.WW.OtOpcUa.Configuration/Entities/{ClusterNode, NodeAcl,
      LdapGroupRoleMapping}.cs
  src/ZB.MOM.WW.OtOpcUa.Analyzers/UnwrappedCapabilityCallAnalyzer.cs

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 01:34:25 -04:00


Configuration

Two-layer model

OtOpcUa configuration is split into two layers:

| Layer | Where | Scope | Edited by |
|---|---|---|---|
| Bootstrap | appsettings.json per process | Enough to start the process and reach the Config DB | Local file edit + process restart |
| Authoritative config | Config DB (SQL Server) via OtOpcUaConfigDbContext | Clusters, namespaces, UNS hierarchy, equipment, tags, driver instances, ACLs, role grants, poll groups | Admin UI draft/publish workflow |

The rule: if the setting describes how the process connects to the rest of the world (Config DB connection string, LDAP bind, transport security profile, node identity, logging), it lives in appsettings.json. If it describes what the fleet does (clusters, drivers, tags, UNS, ACLs), it lives in the Config DB and is edited through the Admin UI.


Bootstrap configuration (appsettings.json)

The Server and Admin processes each read their own appsettings.json plus environment overrides. Galaxy.Host has no appsettings.json and is configured through environment variables only.

OtOpcUa Server — src/ZB.MOM.WW.OtOpcUa.Server/appsettings.json

Bootstrap-only. Program.cs reads the following sections:

| Section | Keys | Purpose |
|---|---|---|
| Node | NodeId, ClusterId, ConfigDbConnectionString, LocalCachePath | Identity + path to the Config DB + LiteDB offline cache path. |
| OpcUaServer | EndpointUrl, ApplicationName, ApplicationUri, PkiStoreRoot, AutoAcceptUntrustedClientCertificates, SecurityProfile | OPC UA endpoint + transport security. See security.md. |
| OpcUaServer:Ldap | Enabled, Server, Port, UseTls, AllowInsecureLdap, SearchBase, ServiceAccountDn, ServiceAccountPassword, GroupToRole, UserNameAttribute, GroupAttribute | LDAP auth for OPC UA UserName tokens. See security.md. |
| Serilog | Standard Serilog keys + WriteJson bool | Logging verbosity + optional JSON file sink for SIEM ingest. |
| Authorization | StrictMode (bool) | Flip true to fail-closed on sessions lacking LDAP group metadata. Default false during ACL rollouts. |
| Metrics:Prometheus:Enabled | bool | Toggles the /metrics endpoint. |

Minimal example:

{
  "Serilog": { "MinimumLevel": "Information" },
  "Node": {
    "NodeId": "node-dev-a",
    "ClusterId": "cluster-dev",
    "ConfigDbConnectionString": "Server=localhost,14330;Database=OtOpcUaConfig;Integrated Security=True;TrustServerCertificate=True;Encrypt=False;",
    "LocalCachePath": "config_cache.db"
  },
  "OpcUaServer": {
    "EndpointUrl": "opc.tcp://0.0.0.0:4840/OtOpcUa",
    "ApplicationUri": "urn:node-dev-a:OtOpcUa",
    "SecurityProfile": "None",
    "AutoAcceptUntrustedClientCertificates": true,
    "Ldap": { "Enabled": false }
  }
}

OtOpcUa Admin — src/ZB.MOM.WW.OtOpcUa.Admin/appsettings.json

| Section | Purpose |
|---|---|
| ConnectionStrings:ConfigDb | SQL connection string; must point at the same Config DB every Server reaches. |
| Authentication:Ldap | LDAP bind for the Admin login form (same options shape as the Server's OpcUaServer:Ldap). |
| CertTrust | CertTrustOptions: file-system path under the Server's PkiStoreRoot so the Admin Certificates page can promote rejected client certs. |
| Metrics:Prometheus:Enabled | Toggles the /metrics scrape endpoint (default true). |
| Serilog | Logging. |

Galaxy.Host

Environment-variable driven (OTOPCUA_GALAXY_PIPE, OTOPCUA_ALLOWED_SID, OTOPCUA_GALAXY_SECRET, OTOPCUA_GALAXY_BACKEND, OTOPCUA_GALAXY_ZB_CONN, OTOPCUA_HISTORIAN_*). No appsettings.json — the supervisor owns the launch environment. See ServiceHosting.md.

Environment overrides

Standard .NET config layering applies: appsettings.{Environment}.json, then environment variables with Section__Property naming. DOTNET_ENVIRONMENT (or ASPNETCORE_ENVIRONMENT for Admin) selects the overlay.
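
The precedence order described above can be illustrated with a small sketch. It is written in Python purely for illustration and is not the project's code (the processes rely on the standard .NET configuration providers). The function names `flatten` and `layered_config` and the sample values are hypothetical. One simplification to be aware of: .NET configuration keys are case-insensitive, while this sketch compares them case-sensitively.

```python
import json


def flatten(tree, prefix=""):
    """Flatten nested JSON objects into 'Section:Property' style keys."""
    flat = {}
    for key, value in tree.items():
        path = f"{prefix}:{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def layered_config(base_json, overlay_json, environ):
    """Merge sources in precedence order: base file, overlay file, env vars.

    Later sources win. Environment variable names use a double underscore
    where the JSON key path uses a colon (Node__NodeId -> Node:NodeId).
    """
    merged = {}
    merged.update(flatten(json.loads(base_json)))     # appsettings.json
    merged.update(flatten(json.loads(overlay_json)))  # appsettings.{Environment}.json
    for name, value in environ.items():
        merged[name.replace("__", ":")] = value       # environment variables
    return merged


if __name__ == "__main__":
    base = '{"Node": {"NodeId": "node-dev-a", "ClusterId": "cluster-dev"}}'
    overlay = '{"Node": {"NodeId": "node-prod-a"}}'
    env = {"Node__ClusterId": "cluster-prod"}
    for key, value in sorted(layered_config(base, overlay, env).items()):
        print(key, "=", value)
```

In this sketch the overlay file replaces Node:NodeId and the environment variable replaces Node:ClusterId, mirroring the order in which the layers are applied.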


Authoritative configuration (Config DB)

The Config DB is the single source of truth for every setting that a v1 deployment used to carry in appsettings.json as driver-specific state. OtOpcUaConfigDbContext (src/ZB.MOM.WW.OtOpcUa.Configuration/OtOpcUaConfigDbContext.cs) is the EF Core context used by both the Admin writer and every Server reader.

Top-level sections operators touch

| Concept | Entity | Admin UI surface | Purpose |
|---|---|---|---|
| Cluster | ServerCluster | Clusters pages | Fleet unit; owns nodes, generations, UNS, ACLs. |
| Cluster node | ClusterNode + ClusterNodeCredential | RedundancyTab, Hosts page | Per-node identity, RedundancyRole, ServiceLevelBase, ApplicationUri, service-account credentials. |
| Generation | ConfigGeneration + ClusterNodeGenerationState | Generations / DiffViewer | Append-only; draft → publish workflow (sp_PublishGeneration). |
| Namespace | Namespace | Namespaces tab | Per-cluster OPC UA namespace; Kind = Equipment / SystemPlatform / Simulated. |
| Driver instance | DriverInstance | Drivers tab | Configured driver (Modbus, S7, OpcUaClient, Galaxy, …) + DriverConfig JSON + resilience profile. |
| Device | Device | Under each driver instance | Per-host settings inside a driver instance (IP, port, unit-id…). |
| UNS hierarchy | UnsArea + UnsLine | UnsTab (drag/drop) | L3 / L4 of the unified namespace. |
| Equipment | Equipment | Equipment pages, CSV import | L5; carries MachineCode, ZTag, SAPID, EquipmentUuid, reservation-backed external ids. |
| Tag | Tag | Under each equipment | Driver-specific tag address + SecurityClassification + poll-group assignment. |
| Poll group | PollGroup | Driver-scoped | Poll cadence buckets; PollGroupEngine in Core.Abstractions uses this at runtime. |
| ACL | NodeAcl | AclsTab + Probe dialog | Per-level permission grants, additive only. See security.md. |
| Role grant | LdapGroupRoleMapping | RoleGrants page | Maps LDAP groups → Admin roles (ConfigViewer / ConfigEditor / FleetAdmin). |
| External id reservation | ExternalIdReservation | Reservations page | Reservation-backed ZTag and SAPID uniqueness. |
| Equipment import batch | EquipmentImportBatch | CSV import flow | Staged bulk-add with validation preview. |
| Audit log | ConfigAuditLog | Audit page | Append-only record of every publish, rollback, credential rotation, role-grant change. |

Draft → publish generation model

All edits go into a draft generation scoped to one cluster. DraftValidationService checks invariants (same-cluster FKs, reservation collisions, UNS path consistency, ACL scope validity). When the operator clicks Publish, sp_PublishGeneration atomically promotes the draft and records the audit event; each RedundancyCoordinator in the affected cluster then picks up the new topology + ACL set through RefreshAsync. The Admin UI DiffViewer shows exactly what will change before publish.

Old generations are retained; rollback is "publish older generation as new". ConfigAuditLog makes every change auditable by principal + timestamp.
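
The append-only behaviour described above, including rollback as "publish older generation as new", can be sketched with an in-memory model. This is an illustration in Python, not the stored procedure or the EF Core entities: `Generation`, `ClusterHistory`, and the `validate` callback are hypothetical stand-ins (the callback plays the role of the draft validation step), and the real publish runs as a database transaction.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class Generation:
    number: int
    content: dict


@dataclass
class ClusterHistory:
    generations: list = field(default_factory=list)  # append-only
    audit_log: list = field(default_factory=list)    # append-only

    def _append(self, content, principal, action):
        # Every change, publish or rollback, becomes a brand-new generation.
        gen = Generation(len(self.generations) + 1, dict(content))
        self.generations.append(gen)
        stamp = datetime.now(timezone.utc)
        self.audit_log.append((stamp, principal, action, gen.number))
        return gen

    def publish(self, draft, principal, validate):
        # An invalid draft is rejected before anything is appended.
        errors = validate(draft)
        if errors:
            raise ValueError("draft rejected: " + "; ".join(errors))
        return self._append(draft, principal, "publish")

    def rollback(self, to_number, principal):
        # Rollback re-publishes an older generation's content as the newest one.
        if not 1 <= to_number <= len(self.generations):
            raise ValueError(f"no generation {to_number}")
        old = self.generations[to_number - 1]
        return self._append(old.content, principal, f"rollback to {to_number}")

    def current(self):
        return self.generations[-1] if self.generations else None
```

Because rollback appends rather than rewinds, the history and the audit trail only ever grow, which is what keeps each change attributable to a principal and a timestamp.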

Offline cache

Each Server process caches the last-seen published generation in Node:LocalCachePath via LiteDB (LiteDbConfigCache in src/ZB.MOM.WW.OtOpcUa.Configuration/LocalCache/). The cache lets a node start without the central DB reachable; once the DB comes back, NodeBootstrap syncs to the current generation.
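
The startup fallback described above can be sketched as follows. This is a Python illustration under stated assumptions, not the actual LiteDbConfigCache or NodeBootstrap code: the function name `load_generation`, the `fetch_from_db` callback, and the use of a plain JSON file in place of LiteDB are all hypothetical.

```python
import json
from pathlib import Path


def load_generation(fetch_from_db, cache_path):
    """Return (generation, source), preferring the central DB over the cache.

    fetch_from_db is a callable that returns the current published
    generation as a dict, or raises ConnectionError when the DB is down.
    """
    cache = Path(cache_path)
    try:
        generation = fetch_from_db()
    except ConnectionError:
        # DB unreachable: fall back to the last generation seen by this node.
        if not cache.exists():
            raise RuntimeError("Config DB unreachable and no cached generation")
        return json.loads(cache.read_text(encoding="utf-8")), "cache"
    # DB reachable: refresh the cache so the next offline start is current.
    cache.write_text(json.dumps(generation), encoding="utf-8")
    return generation, "db"
```

Calling the function again once the database is reachable returns the current generation and overwrites the cached copy, which corresponds to the sync step mentioned above.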

Full schema reference

For table columns, indexes, stored procedures, the publish-transaction semantics, and the SQL authorization model (per-node SQL principals + SESSION_CONTEXT cluster binding), see docs/v2/config-db-schema.md.

Admin UI flow

For the draft editor, DiffViewer, CSV import, IdentificationFields, RedundancyTab, AclsTab + Probe-this-permission, RoleGrants, and the SignalR real-time surface, see docs/v2/admin-ui.md.


Where did v1 appsettings sections go?

Quick index for operators coming from v1 LmxOpcUa:

| v1 appsettings section | v2 home |
|---|---|
| OpcUa.Port / BindAddress / EndpointPath / ServerName | Bootstrap OpcUaServer:EndpointUrl + ApplicationName. |
| OpcUa.ApplicationUri | Config DB ClusterNode.ApplicationUri. |
| OpcUa.MaxSessions / SessionTimeoutMinutes | Bootstrap OpcUaServer:* (if exposed) or stack defaults. |
| OpcUa.AlarmTrackingEnabled / AlarmFilter | Per driver instance in Config DB (alarm surface is capability-driven per IAlarmSource). |
| MxAccess.* | Galaxy driver instance DriverConfig JSON + Galaxy.Host env vars (see ServiceHosting.md). |
| GalaxyRepository.* | Galaxy driver instance DriverConfig JSON + OTOPCUA_GALAXY_ZB_CONN env var. |
| Dashboard.* | Retired; the Admin UI replaces the dashboard. See StatusDashboard.md. |
| Historian.* | Galaxy driver instance DriverConfig JSON + OTOPCUA_HISTORIAN_* env vars. |
| Authentication.Ldap.* | Bootstrap OpcUaServer:Ldap (same shape) + Admin Authentication:Ldap for the UI login. |
| Security.* | Bootstrap OpcUaServer:SecurityProfile + PkiStoreRoot + AutoAcceptUntrustedClientCertificates. |
| Redundancy.* | Config DB ClusterNode.RedundancyRole + ServiceLevelBase. |
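
As a rough illustration of the first row, the four v1 OpcUa keys might be folded into the two v2 bootstrap keys like this. The sketch is Python and entirely hypothetical: `migrate_endpoint` is not a tool shipped with the project, and the way the URL is assembled (opc.tcp scheme, bind address, port, path) is an assumption inferred from the shape of the EndpointUrl in the minimal example, not a documented migration rule.

```python
def migrate_endpoint(v1_opcua):
    """Fold v1 OpcUa.* endpoint keys into v2 OpcUaServer keys (assumed mapping)."""
    path = v1_opcua["EndpointPath"]
    if not path.startswith("/"):
        path = "/" + path  # tolerate a path written without its leading slash
    url = f"opc.tcp://{v1_opcua['BindAddress']}:{v1_opcua['Port']}{path}"
    return {
        "EndpointUrl": url,
        "ApplicationName": v1_opcua["ServerName"],
    }


if __name__ == "__main__":
    # Sample v1 values are made up for the example.
    v1 = {
        "BindAddress": "0.0.0.0",
        "Port": 4840,
        "EndpointPath": "/OtOpcUa",
        "ServerName": "OtOpcUa",
    }
    print(migrate_endpoint(v1))
```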

Validation

  • Bootstrap: the process fails fast on missing required keys. For example, Program.cs throws InvalidOperationException if any of Node:NodeId, Node:ClusterId, or Node:ConfigDbConnectionString is unset.
  • Authoritative: DraftValidationService runs on every save; sp_ValidateDraft runs as part of sp_PublishGeneration so an invalid draft cannot reach any node.
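
The bootstrap fail-fast check can be sketched as below. This is a Python illustration, not the code in Program.cs: the helper names `lookup` and `require_bootstrap_keys` are hypothetical, the key list contains only the three keys named above, and RuntimeError stands in for the InvalidOperationException the .NET process throws.

```python
REQUIRED_KEYS = (
    "Node:NodeId",
    "Node:ClusterId",
    "Node:ConfigDbConnectionString",
)


def lookup(config, key):
    """Resolve a 'Section:Property' key against a nested dict, or return None."""
    node = config
    for part in key.split(":"):
        if not isinstance(node, dict) or part not in node:
            return None
        node = node[part]
    return node


def require_bootstrap_keys(config):
    """Raise before startup continues if any required key is missing or empty."""
    missing = [key for key in REQUIRED_KEYS if not lookup(config, key)]
    if missing:
        raise RuntimeError("Missing required bootstrap keys: " + ", ".join(missing))


if __name__ == "__main__":
    incomplete = {"Node": {"NodeId": "node-dev-a"}}
    try:
        require_bootstrap_keys(incomplete)
    except RuntimeError as exc:
        print(exc)
```

Checking all required keys in one pass and reporting them together saves an operator from fixing one key per restart.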